BYOK gateway for OpenAI + Anthropic

The AI gateway for production.

Bring your OpenAI and Anthropic keys. Route through one endpoint, set fallbacks, swap models without redeploys. We never resell tokens.

  • Bring your own keys
  • No markup, ever
  • Metadata-only logs
POST api.usellm.io/v1/chat/completions

curl https://api.usellm.io/v1/chat/completions \
  -H "Authorization: Bearer $USELLM_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }'
  • 99.99% gateway uptime
  • <5ms routing overhead
  • 0% token markup
Powering AI in production

Teams ship on useLLM every day.

Vercel
Replicate
Modal
LobeChat
Langfuse

Everything you need, built for developers

Bring your own keys

Connect your OpenAI and Anthropic accounts. We route through them — never resell tokens.

OpenAI-compatible endpoint

Point any OpenAI SDK at api.usellm.io/v1. Same request shape, same streaming, same tools.
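To illustrate the drop-in compatibility, here is a minimal stdlib-only sketch of how a chat request to the gateway is assembled. The only assumption beyond the source is the helper name `build_chat_request`; the URL, headers, and body match the curl example above.

```python
import json

# The only change from calling OpenAI directly is the base URL
# (and the gateway API key). The request shape is unchanged.
GATEWAY_BASE = "https://api.usellm.io/v1"  # instead of https://api.openai.com/v1

def build_chat_request(api_key: str, model: str, messages: list) -> tuple:
    """Return (url, headers, body) for a chat completion via the gateway."""
    url = f"{GATEWAY_BASE}/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({"model": model, "messages": messages})
    return url, headers, body

url, headers, body = build_chat_request(
    "sk-test", "gpt-4o", [{"role": "user", "content": "Hello!"}]
)
```

An official OpenAI SDK achieves the same thing by overriding its base URL to `https://api.usellm.io/v1`.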

Model aliases

Define names like “smart” or “cheap” and remap them in the dashboard. No redeploys.
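A hypothetical alias definition might look like the following. The field names here are illustrative assumptions, not the dashboard's actual schema:

```json
{
  "alias": "smart",
  "primary": "gpt-4o",
  "fallbacks": ["claude-3-5-sonnet", "gpt-4-turbo"],
  "retry": {"max_attempts": 2, "timeout_ms": 30000}
}
```

Your app only ever says "smart"; remapping `primary` in the dashboard changes every request instantly.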

Fallbacks & retries

Auto-fail over to a backup model when a provider is degraded. Configurable per alias.
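The failover behavior can be sketched in a few lines of Python. This is an illustration of the logic, not the gateway's implementation; the stub `fake_call` stands in for real provider requests.

```python
class ProviderError(Exception):
    """Stand-in for a provider 5xx or timeout."""

def complete_with_fallbacks(chain, call):
    """Try each model in `chain` in order; return the first success."""
    last_error = None
    for model in chain:
        try:
            return model, call(model)
        except ProviderError as exc:
            last_error = exc  # provider degraded; try the next model
    raise last_error  # whole chain failed; surface the error to the app

# Stub: pretend the primary provider is down.
def fake_call(model):
    if model == "gpt-4o":
        raise ProviderError("503 from provider")
    return f"response from {model}"

used, text = complete_with_fallbacks(["gpt-4o", "claude-3-5-sonnet"], fake_call)
```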

Deep observability

Requests, tokens, latency, failures, and estimated provider cost across every key.

<5ms overhead

A thin proxy. The gateway adds milliseconds, not seconds.

Metadata-only logs

Prompts and responses are not stored. Usage history keeps only routing, token, latency, and cost metadata.
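A usage-history record might therefore look like this (field names are illustrative; note there is no prompt or response field at all):

```json
{
  "alias": "smart",
  "model": "gpt-4o",
  "tokens_in": 812,
  "tokens_out": 1341,
  "latency_ms": 642,
  "status": 200,
  "estimated_cost_usd": 0.0321
}
```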

Normalized errors

Provider-specific quirks abstracted to one consistent error shape and status code.
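As a sketch of what normalization means in practice: OpenAI nests failures under an `error` object with a `code`, while Anthropic returns `{"type": "error", "error": {"type": ..., "message": ...}}`. A gateway can flatten both into one shape. The output field names below are assumptions for illustration:

```python
def normalize_error(provider: str, raw: dict) -> dict:
    """Map a provider-specific error body onto one consistent shape."""
    if provider == "openai":
        # OpenAI: {"error": {"message": ..., "type": ..., "code": ...}}
        err = raw.get("error", {})
        code, message = err.get("code"), err.get("message")
    elif provider == "anthropic":
        # Anthropic: {"type": "error", "error": {"type": ..., "message": ...}}
        err = raw.get("error", {})
        code, message = err.get("type"), err.get("message")
    else:
        code, message = "unknown", str(raw)
    return {"provider": provider, "code": code, "message": message}

e = normalize_error("anthropic", {"type": "error",
                                  "error": {"type": "overloaded_error",
                                            "message": "Overloaded"}})
```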

Dashboard

See every request.

Centralized visibility into routing decisions, fallbacks, latency, and estimated provider cost.

  • Requests, tokens, p95 latency
  • Fallback activations & error rate
  • Estimated cost per alias and provider
  • Per-request logs with full trace
See the dashboard

Overview (last 7 days)

Total spend: $1,238.75 (+12.5%)
Total tokens: 128.6M (+8.1%)
Requests: 318,265 (+9.7%)
Cache hit rate: 37.4% (+6.2%)

Spend over time (daily)

Spend by model

  • gpt-4o: 45.6%
  • claude-3-5-sonnet: 32.1%
  • gpt-4-turbo: 16.7%
  • other: 5.6%
Recent requests

Time | Model | Tokens | Cost | Status
May 18, 10:21:36 | gpt-4o | 2,153 | $0.0321 | 200
Pricing

One flat fee for the gateway

You pay providers directly for tokens. useLLM charges only for the gateway — routing, fallbacks, observability.

Free

$0/mo
  • 10K routed requests / mo
  • 1 alias, 2 providers
  • 7-day metadata retention

Starter

$19/mo
  • 100K routed requests / mo
  • 3 aliases, 2 providers
  • 14-day metadata retention
Most popular

Pro

$49/mo
  • 1M routed requests / mo
  • Unlimited aliases & fallbacks
  • Usage filters & model catalog

Business

$149/mo
  • 5M routed requests / mo
  • 90-day metadata retention
  • Priority support

Enterprise

Custom
  • 10M+ routed requests / mo
  • Custom contract placeholder
  • Contact us

Provider token costs are billed by OpenAI / Anthropic on your own accounts. useLLM never resells tokens.

FAQ

Questions, answered

What is BYOK and why does useLLM require it?

Bring Your Own Key. You connect your OpenAI and Anthropic accounts and we proxy through them — you pay providers directly for tokens, and useLLM charges only for the gateway. No resold credits, no markup, no grey-market keys.

Which models can I route to?

Anything OpenAI or Anthropic ships, the moment they ship it. You bring the key; useLLM speaks the OpenAI-compatible API on top.

How do model aliases work?

Instead of hardcoding gpt-4o or claude-sonnet-4.5 into your app, you call “smart” or “coding”. The alias maps to a primary model plus a fallback chain, with retry and timeout rules, and is editable in the dashboard without redeploys.

What happens during a provider outage?

Each alias has a fallback chain. If the primary model returns an error or times out, useLLM retries against the next model on the chain before surfacing the failure to your app.

Do you store my prompts?

No. The gateway stores metadata such as alias, model, tokens, latency, status, and estimated provider cost. Prompt and response bodies are not persisted.

How does useLLM bill?

Flat monthly plans based on routed request volume. Provider token costs go to your provider directly via your key — useLLM never sees the bill.

Can I cancel anytime?

Yes. Cancel from the dashboard; your keys keep working at OpenAI and Anthropic, and you just lose the gateway features.

Your gateway to OpenAI and Claude. Ready in minutes.

Bring your provider keys, swap your base URL, and ship. Define aliases and fallbacks without redeploying.