Don't make your agent think when it can know.

Amply is an API that returns real-time, machine-readable data on the best service for a given task, including price, speed, and reliability.

Live in 60 Seconds

Three steps: get a key, copy the request, run it.

Step 1 of 3

Get your free API key

Sign in with Google from the dashboard. Your key is tied to your account.

Step 2 of 3

Copy this curl

Same payload as the API console. Replace YOUR_API_KEY.

curl -s -X POST "https://www.useamply.com/api/v1/route" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{"task":"store 100k 1536 dimensional vectors with metadata filters and run 50 similarity queries","dimension":1536,"workload_type":"hybrid","filter_complexity":"high"}'

Step 3 of 3

Run it

You get JSON with recommended, rankings, economics, and why. Wire it into your agent.
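
For agents that do not shell out to curl, the same request can be sketched in Python with only the standard library. The URL, headers, and payload are copied from the curl above; the function names, the `PAYLOAD` constant, and the 10 second timeout are illustrative choices, not part of Amply's API.

```python
import json
import urllib.request

AMPLY_ROUTE_URL = "https://www.useamply.com/api/v1/route"

# Same payload as the curl example.
PAYLOAD = {
    "task": "store 100k 1536 dimensional vectors with metadata filters "
            "and run 50 similarity queries",
    "dimension": 1536,
    "workload_type": "hybrid",
    "filter_complexity": "high",
}


def build_route_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build the POST request without sending it (hypothetical helper)."""
    return urllib.request.Request(
        AMPLY_ROUTE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def route(api_key: str, payload: dict, timeout: float = 10.0) -> dict:
    """Send the request and return the decoded JSON body."""
    request = build_route_request(api_key, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `route("YOUR_API_KEY", PAYLOAD)` should return a dict whose `recommended`, `rankings`, `economics`, and `why` keys can be read directly.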

The problem

Every agent today guesses, or burns time reasoning

Most agents either pick an API on a whim, or spend 10 to 30 seconds reasoning it out. That costs roughly $0.10 to $0.50 per decision, adds serious latency, and still produces inconsistent results.

Multiply that across millions of agent actions, and it becomes a massive inefficiency in the system.

The shift is simple, and huge

We're moving from humans choosing tools to agents choosing tools. Agents don't browse, compare, or read docs. They optimize for cost, speed, and reliability. Right now, that decision layer is effectively broken.

The solution

Amply: one request, the best service

One request returns the best option with real price, real latency, and real success rate. Server-side routing takes milliseconds, often under 200 ms of handler work (compute_ms / X-Amply-Compute-Ms). The full HTTP round-trip time depends on your network path to the edge. The response is machine-readable and ready to wire into your agent loop.
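
To see how much of a measured round trip is handler work and how much is network, a client could subtract the reported compute time from its own timing. This is a minimal sketch that assumes the X-Amply-Compute-Ms header carries a plain number of milliseconds; `split_latency` is a hypothetical helper name.

```python
from typing import Mapping, Optional, Tuple


def split_latency(
    round_trip_ms: float, headers: Mapping[str, str]
) -> Tuple[Optional[float], Optional[float]]:
    """Return (compute_ms, network_ms), or (None, None) if the header is unusable."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get("x-amply-compute-ms")
    if raw is None:
        return None, None
    try:
        # ASSUMPTION: the header value is a number of milliseconds.
        compute_ms = float(raw)
    except ValueError:
        return None, None
    # Clamp at zero in case client and server clocks disagree slightly.
    return compute_ms, max(round_trip_ms - compute_ms, 0.0)
```

With a measured round trip of 250 ms and a header value of 186, the remaining 64 ms is time spent outside the handler.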

Copy a request

In production, POST /api/v1/route requires an Authorization: Bearer <your key> header (omit it only if the server has no AMPLY_API_KEYS configured). Replace YOUR_API_KEY in the snippet.

For production routing economics, use a user-bound key (from the dashboard or POST /api/v1/signup/agent) so that wallet debit, quota, and scoped audit apply. Server-only keys match AMPLY_API_KEYS and return X-Amply-Principal-Bound: 0. See docs → production keys.
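
A client can check which kind of key it is using by inspecting the response headers. The sketch below relies on the one documented value (X-Amply-Principal-Bound: 0 for server-only keys); treating "1" as the user-bound value is an assumption, and `is_user_bound` is a hypothetical helper name.

```python
from typing import Mapping, Optional


def is_user_bound(headers: Mapping[str, str]) -> Optional[bool]:
    """Classify the key from response headers; None means unknown."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value.strip() for name, value in headers.items()}
    value = lowered.get("x-amply-principal-bound")
    if value == "0":
        return False  # documented above: server-only key
    if value == "1":
        return True  # ASSUMPTION: "1" marks a user-bound key
    return None  # header missing or unrecognised value
```

An agent could log a warning when this returns False, since wallet debit, quota, and scoped audit would not apply to that request.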

Documentation · OpenAPI JSON · Status & diagnostics

curl -s -X POST "https://www.useamply.com/api/v1/route" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{"task":"store 100k 1536 dimensional vectors with metadata filters and run 50 similarity queries","dimension":1536,"workload_type":"hybrid","filter_complexity":"high"}'

What your agent actually receives

{
  "amply": {
    "api_version": "v1",
    "route_id": "rt_8f2a9c1e4b3d",
    "decision_ms": 186,
    "observation_window": "rolling_7d",
    "refreshed_at": "2026-03-24T12:00:00Z"
  },
  "recommended": {
    "provider_id": "pinecone_serverless",
    "region_hint": "us_east_1",
    "confidence": 0.91,
    "score": 0.91
  },
  "rankings": [
    { "provider_id": "pinecone_serverless", "rank": 1, "score": 0.91 },
    { "provider_id": "qdrant_cloud", "rank": 2, "score": 0.84 },
    { "provider_id": "weaviate_cloud", "rank": 3, "score": 0.79 }
  ],
  "economics": {
    "currency": "USD",
    "estimated_cost_usd_per_1m_vector_ops": 12.4,
    "billing_model": "usage_metered"
  },
  "performance": {
    "p50_latency_ms": 41,
    "p95_latency_ms": 78,
    "p99_latency_ms": 94,
    "success_rate_7d": 0.997,
    "error_rate_7d": 0.003,
    "samples_7d": 1842000
  },
  "why": "Strongest fit for hybrid search with high filter complexity: best p99 under your implied SLO, lowest blended cost in live traffic, and stable success rate week over week.",
  "decision_factors": [
    { "factor": "latency_p99_vs_slo", "weight": 0.34, "outcome": "pass" },
    { "factor": "cost_per_million_ops", "weight": 0.28, "outcome": "best_in_class" },
    { "factor": "success_rate_7d", "weight": 0.24, "outcome": "high" },
    { "factor": "workload_match", "weight": 0.14, "outcome": "hybrid_plus_filters" }
  ],
  "agent_hints": {
    "log_fields": ["amply.route_id", "recommended.provider_id", "amply.decision_ms"],
    "idempotent": true,
    "retry_safe": true,
    "fallback_policy": "use_rankings_2_then_3_if_provider_errors"
  }
}
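
One way to act on this payload is to try the recommended provider first and fall back through the rankings when a call fails. The sketch below reads fallback_policy "use_rankings_2_then_3_if_provider_errors" as "try rank 2, then rank 3"; that reading, and the `fallback_order`, `call_with_fallback`, and `call_provider` names, are assumptions rather than documented behavior.

```python
from typing import Any, Callable, List, Tuple


def fallback_order(decision: dict) -> List[str]:
    """Recommended provider first, then the remaining providers by rank."""
    ranked = sorted(decision["rankings"], key=lambda row: row["rank"])
    first = decision["recommended"]["provider_id"]
    rest = [row["provider_id"] for row in ranked if row["provider_id"] != first]
    return [first] + rest


def call_with_fallback(
    decision: dict, call_provider: Callable[[str], Any]
) -> Tuple[str, Any]:
    """Call providers in fallback order until one succeeds.

    call_provider is a hypothetical callable supplied by the agent; it takes
    a provider_id and raises an exception if that provider errors.
    """
    last_error = None
    for provider_id in fallback_order(decision):
        try:
            return provider_id, call_provider(provider_id)
        except Exception as exc:  # broad on purpose: any provider error triggers fallback
            last_error = exc
    raise RuntimeError("all ranked providers failed") from last_error
```

Logging the fields listed under agent_hints.log_fields alongside the provider that finally served the call keeps the trace aligned with the routing decision.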

How it works

One call before the tool call

POST /api/v1/route with a task string. Amply returns the recommended service, numeric signals (cost, latency, reliability), and a short why. The response is all JSON, with no prose essay. Wire it into LangGraph, CrewAI, or plain HTTP.
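
In plain Python, the "one call before the tool call" pattern might look like the sketch below. The routing call is injected as `route_fn` so the step is easy to test; `route_then_call`, `route_fn`, and the `tools` registry are hypothetical names, and the response keys are taken from the sample payload on this page.

```python
from typing import Any, Callable, Dict


def route_then_call(
    task: str,
    route_fn: Callable[[dict], dict],
    tools: Dict[str, Callable[[str], Any]],
) -> dict:
    """Ask the router which provider to use, then run the matching tool."""
    # The one call before the tool call.
    decision = route_fn({"task": task})
    provider_id = decision["recommended"]["provider_id"]

    # tools maps provider_id to a callable owned by the agent.
    tool = tools.get(provider_id)
    if tool is None:
        raise LookupError(f"no tool registered for {provider_id!r}")

    return {
        "provider_id": provider_id,
        "why": decision.get("why", ""),
        "result": tool(task),
    }
```

Keeping the router behind a callable means the same step can be dropped into a graph node, a crew task, or a bare loop without changing its logic.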

Without Amply

  • Model picks API (Slow)
  • Bad cost / latency info
  • No real benchmarks
  • Wrong provider (Slower & pricier)

With Amply

  • POST to Amply
  • Benchmark scores
  • Pick + metrics + why
  • Right service (Fast path)

Real Use Cases

Vector DB Routing

Pick the best hosted index for latency, cost, and filters.

Embedding Model Selection

Match model dimension and throughput to workload.

Cost-Optimized Inference

Trade off price vs SLOs with transparent economics.

Hybrid Search & RAG

Favor providers that excel at keyword + vector blends.

Multi-modal Agents

Route tasks that span vectors, text, and structured filters.

Teams shipping agents

We used to let the model debate Pinecone vs. Qdrant and eat tokens. Now the loop does one POST, we stash the JSON beside the trace, and `request_id` is enough when something looks off.

Staff engineer, Series B devtools · ~45 people

Our tests assert on keys, not paragraphs. `recommended`, `compute_ms`, and the cost fields are stable enough that we snapshot them in CI for regressions.

Tech lead, B2B workflow SaaS · ~120 people

Leadership wanted a single slide, not another benchmark PDF. Everyone stared at the same quoted p99 and $/1M dims in the payload—we locked the vendor that afternoon.

Platform PM, Enterprise data team · regulated sector

Pricing

The routing API is built for agents that call it thousands of times a day—so pricing is usage-based, not a flat subscription for a single endpoint. Billing follows a prepaid-wallet model: deposit balance (for example $10), then requests draw it down until topped up. The public catalog is curated by Amply; we do not sell placement or visibility to providers.

Agents & developers

Routing API

Prepaid metered usage: deposit funds, then a small charge is deducted per routed request or usage unit. When the balance is low or empty, top up and continue. An API key is required when auth is enabled.

Catalog

Same public GET /api/v1/providers snapshot for everyone. Rows are added and maintained editorially, with no pay-to-list.

Docs & OpenAPI

Included with the same operational targets as the public API.
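
The prepaid-wallet arithmetic can be sketched client-side to decide when to top up. No per-request price is published on this page, so the price used in the usage note below is a placeholder, and both function names are hypothetical.

```python
from decimal import Decimal


def requests_covered(balance_usd: str, price_per_request_usd: str) -> int:
    """How many whole requests the current balance can pay for."""
    balance = Decimal(balance_usd)
    price = Decimal(price_per_request_usd)
    if price <= 0:
        raise ValueError("price must be positive")
    # Floor division: a partial request cannot be paid for.
    return int(balance // price)


def needs_top_up(
    balance_usd: str, price_per_request_usd: str, min_requests: int
) -> bool:
    """True when the balance covers fewer than min_requests further calls."""
    return requests_covered(balance_usd, price_per_request_usd) < min_requests
```

For instance, with the $10 deposit mentioned above and a placeholder price of $0.002 per request, `requests_covered("10", "0.002")` gives 5000. Decimal is used instead of float so that small per-request charges do not accumulate rounding error.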