Step 1 of 3
Get your free API key
Sign in with Google from the dashboard. Your key is tied to your account.
Amply is an API that returns real-time, machine-readable data on the best service for a given task, including price, speed, and reliability.
Three steps: get a key, copy the request, run it.
Step 2 of 3
Same payload as the API console. Replace YOUR_API_KEY.
curl -s -X POST "https://www.useamply.com/api/v1/route" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{"task":"store 100k 1536 dimensional vectors with metadata filters and run 50 similarity queries","dimension":1536,"workload_type":"hybrid","filter_complexity":"high"}'
Step 3 of 3
You get JSON with recommended, rankings, economics, and why, then wire it into your agent.
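For agents that do not shell out to curl, the same three steps can be sketched in Python with only the standard library. This is a minimal sketch, not an official client: the URL, headers, and payload are copied from the curl command above, while the function names `build_route_request` and `route` and the 10-second timeout are assumptions made for illustration.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
ROUTE_URL = "https://www.useamply.com/api/v1/route"

# Same payload as the curl example.
PAYLOAD = {
    "task": "store 100k 1536 dimensional vectors with metadata filters "
            "and run 50 similarity queries",
    "dimension": 1536,
    "workload_type": "hybrid",
    "filter_complexity": "high",
}


def build_route_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request from Step 2."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        ROUTE_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )


def route(api_key: str, payload: dict, timeout: float = 10.0) -> dict:
    """Send the request and return the parsed JSON body (hypothetical helper)."""
    request = build_route_request(api_key, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `route("YOUR_API_KEY", PAYLOAD)` with a real key should return a dictionary whose `recommended`, `rankings`, `economics`, and `why` keys can be read directly in the agent loop.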
The problem
Most agents either pick an API on a whim, or spend 10 to 30 seconds reasoning it out. That costs roughly $0.10 to $0.50 per decision, adds serious latency, and still produces inconsistent results.
Multiply that across millions of agent actions, and it becomes a massive inefficiency in the system.
We're moving from humans choosing tools to agents choosing tools. Agents don't browse, compare, or read docs. They optimize for cost, speed, and reliability. Right now, that decision layer is effectively broken.
The solution
One request returns the best option with real price, real latency, and real success rate. Server-side routing takes milliseconds, often under 200 ms of handler work (reported as compute_ms / X-Amply-Compute-Ms); the full HTTP round-trip also depends on your network path to the edge. The response is machine readable and ready to wire into your agent loop.
POST /api/v1/route on production requires Authorization: Bearer <your key> (omit Bearer only if the server has no AMPLY_API_KEYS). Replace YOUR_API_KEY in the snippet.
Production routing economics — use a user-bound key (dashboard or POST /api/v1/signup/agent) so wallet debit, quota, and scoped audit apply. Server-only keys match AMPLY_API_KEYS and return X-Amply-Principal-Bound: 0. See docs → production keys.
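A client can check the response headers named above before trusting wallet and quota behavior. The sketch below is illustrative only: the header names come from this page, but the helper names are hypothetical, the value returned for user-bound keys is not stated here (so only the documented `0` case is tested), and it is an assumption that `X-Amply-Compute-Ms` carries a plain number.

```python
from typing import Mapping, Optional


def _lowercased(headers: Mapping[str, str]) -> dict:
    """HTTP header names are case-insensitive, so compare in lowercase."""
    return {name.lower(): value for name, value in headers.items()}


def is_server_only_key(headers: Mapping[str, str]) -> bool:
    """True when the response reports X-Amply-Principal-Bound: 0."""
    value = _lowercased(headers).get("x-amply-principal-bound", "")
    return value.strip() == "0"


def compute_ms(headers: Mapping[str, str]) -> Optional[float]:
    """Read X-Amply-Compute-Ms as a number, or None if absent or non-numeric.

    Assumption: the header value is a plain numeric string.
    """
    raw = _lowercased(headers).get("x-amply-compute-ms")
    if raw is None:
        return None
    try:
        return float(raw)
    except ValueError:
        return None
```

Logging a warning when `is_server_only_key` returns true is one way to catch a deployment that is accidentally using a server-only key where wallet debit and scoped audit are expected.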
Documentation · OpenAPI JSON · Status & diagnostics
curl -s -X POST "https://www.useamply.com/api/v1/route" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{"task":"store 100k 1536 dimensional vectors with metadata filters and run 50 similarity queries","dimension":1536,"workload_type":"hybrid","filter_complexity":"high"}'
{
"amply": {
"api_version": "v1",
"route_id": "rt_8f2a9c1e4b3d",
"decision_ms": 186,
"observation_window": "rolling_7d",
"refreshed_at": "2026-03-24T12:00:00Z"
},
"recommended": {
"provider_id": "pinecone_serverless",
"region_hint": "us_east_1",
"confidence": 0.91,
"score": 0.91
},
"rankings": [
{ "provider_id": "pinecone_serverless", "rank": 1, "score": 0.91 },
{ "provider_id": "qdrant_cloud", "rank": 2, "score": 0.84 },
{ "provider_id": "weaviate_cloud", "rank": 3, "score": 0.79 }
],
"economics": {
"currency": "USD",
"estimated_cost_usd_per_1m_vector_ops": 12.4,
"billing_model": "usage_metered"
},
"performance": {
"p50_latency_ms": 41,
"p95_latency_ms": 78,
"p99_latency_ms": 94,
"success_rate_7d": 0.997,
"error_rate_7d": 0.003,
"samples_7d": 1842000
},
"why": "Strongest fit for hybrid search with high filter complexity: best p99 under your implied SLO, lowest blended cost in live traffic, and stable success rate week over week.",
"decision_factors": [
{ "factor": "latency_p99_vs_slo", "weight": 0.34, "outcome": "pass" },
{ "factor": "cost_per_million_ops", "weight": 0.28, "outcome": "best_in_class" },
{ "factor": "success_rate_7d", "weight": 0.24, "outcome": "high" },
{ "factor": "workload_match", "weight": 0.14, "outcome": "hybrid_plus_filters" }
],
"agent_hints": {
"log_fields": ["amply.route_id", "recommended.provider_id", "amply.decision_ms"],
"idempotent": true,
"retry_safe": true,
"fallback_policy": "use_rankings_2_then_3_if_provider_errors"
}
}
POST /api/v1/route with a task string. Amply returns the recommended service, numeric signals (cost, latency, reliability), and a short why, all as JSON rather than a prose essay. Wire it into LangGraph, CrewAI, or plain HTTP.
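The sample response carries `agent_hints.fallback_policy` set to `use_rankings_2_then_3_if_provider_errors`. One plausible reading of that hint is sketched below: try providers in `rank` order and move to the next one on error. The helper names and the `call_provider` callback are hypothetical, and treating any exception as a provider error is an assumption, not documented Amply behavior.

```python
from typing import Any, Callable, List, Tuple


def providers_in_rank_order(response: dict) -> List[str]:
    """Return provider ids from the `rankings` array, best rank first."""
    rankings = sorted(response.get("rankings", []), key=lambda row: row["rank"])
    return [row["provider_id"] for row in rankings]


def call_with_fallback(
    response: dict,
    call_provider: Callable[[str], Any],
) -> Tuple[str, Any]:
    """Try each ranked provider in turn; return (provider_id, result).

    `call_provider` is a caller-supplied function (hypothetical) that performs
    the real work against one provider and raises on failure.
    """
    errors = []
    for provider_id in providers_in_rank_order(response):
        try:
            return provider_id, call_provider(provider_id)
        except Exception as exc:  # assumption: any exception counts as a provider error
            errors.append(f"{provider_id}: {exc}")
    raise RuntimeError("all ranked providers failed: " + "; ".join(errors))
```

Keeping the fallback order in data rather than in the prompt means the agent never has to reason about which provider to try next; it only reports which one succeeded.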
Pick the best hosted index for latency, cost, and filters.
Match model dimension and throughput to workload.
Trade off price vs SLOs with transparent economics.
Favor providers that excel at keyword + vector blends.
Route tasks that span vectors, text, and structured filters.
“We used to let the model debate Pinecone vs. Qdrant and eat tokens. Now the loop does one POST, we stash the JSON beside the trace, and `request_id` is enough when something looks off.”
“Our tests assert on keys, not paragraphs. `recommended`, `compute_ms`, and the cost fields are stable enough that we snapshot them in CI for regressions.”
“Leadership wanted a single slide, not another benchmark PDF. Everyone stared at the same quoted p99 and $/1M dims in the payload—we locked the vendor that afternoon.”
The routing API is built for agents that call it thousands of times a day—so pricing is usage-based, not a flat subscription for a single endpoint. Billing follows a prepaid-wallet model: deposit balance (for example $10), then requests draw it down until topped up. The public catalog is curated by Amply; we do not sell placement or visibility to providers.
| Agents & developers | |
|---|---|
| Routing API | Prepaid metered usage — deposit funds, then a small charge is deducted per routed request or usage unit. When balance is low/empty, top up and continue. API key required when auth is enabled. |
| Catalog | Same public GET /api/v1/providers snapshot for everyone. Rows are added and maintained editorially—no pay-to-list. |
| Docs & OpenAPI | Included with the same operational targets as the public API. |
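To budget against the prepaid wallet, the drawdown arithmetic can be sketched as follows. The $10 deposit is the example given above; the per-request price of $0.002 is purely a placeholder, since this page does not state the actual charge, and the function names are hypothetical.

```python
from decimal import Decimal


def requests_covered(balance_usd: Decimal, price_per_request_usd: Decimal) -> int:
    """How many whole requests a prepaid balance covers at a flat price."""
    if price_per_request_usd <= 0:
        raise ValueError("price per request must be positive")
    return int(balance_usd // price_per_request_usd)


def debit(balance_usd: Decimal, price_per_request_usd: Decimal) -> Decimal:
    """Return the balance after one request, or raise when a top-up is needed."""
    if price_per_request_usd > balance_usd:
        raise ValueError("insufficient balance: top up to continue")
    return balance_usd - price_per_request_usd


# Placeholder price: NOT a published Amply rate.
ASSUMED_PRICE = Decimal("0.002")
DEPOSIT = Decimal("10")

covered = requests_covered(DEPOSIT, ASSUMED_PRICE)
```

`Decimal` is used instead of floats so that repeated small debits do not accumulate rounding error in the tracked balance.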