The AI gateway for production.
Bring your OpenAI and Anthropic keys. Route through one endpoint, set fallbacks, swap models without redeploys. We never resell tokens.
curl https://api.usellm.io/v1/chat/completions \
  -H "Authorization: Bearer $USELLM_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }'

Teams ship on useLLM every day.
Everything you need, built for developers
Bring your own keys
Connect your OpenAI and Anthropic accounts. We route through them — never resell tokens.
OpenAI-compatible endpoint
Point any OpenAI SDK at api.usellm.io/v1. Same request shape, same streaming, same tools.
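As a sketch of what "same request shape" means, here is the hero's curl request rebuilt with nothing but the Python standard library, with only the base URL pointing at the gateway (the request is constructed but not sent, so it runs without credentials; the official OpenAI SDKs work the same way via their `base_url` option):

```python
import json
import os
import urllib.request

# Standard OpenAI-style chat completion request, pointed at the useLLM
# gateway instead of api.openai.com. Only the host changes.
BASE_URL = "https://api.usellm.io/v1"

payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello!"}],
}

req = urllib.request.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": f"Bearer {os.environ.get('USELLM_API_KEY', '')}",
        "Content-Type": "application/json",
    },
    method="POST",
)

# urllib.request.urlopen(req) would send it; omitted here so the sketch
# stays runnable without a key or network access.
```

Because the request body and headers are unchanged, swapping an existing integration over is a one-line base-URL edit, not a rewrite.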
Model aliases
Define names like “smart” or “cheap” and remap them in the dashboard. No redeploys.
Fallbacks & retries
Auto-fail over to a backup model when a provider is degraded. Configurable per alias.
Deep observability
Requests, tokens, latency, failures, and estimated provider cost across every key.
<5ms overhead
A thin proxy. The gateway adds milliseconds, not seconds.
Metadata-only logs
Prompts and responses are not stored. Usage history keeps only routing, token, latency, and cost metadata.
Normalized errors
Provider-specific quirks abstracted to one consistent error shape and status code.
See every request.
Centralized visibility into routing decisions, fallbacks, latency, and estimated provider cost.
- Requests, tokens, p95 latency
- Fallback activations & error rate
- Estimated cost per alias and provider
- Per-request logs with full trace
Overview (last 7 days)

- Total spend: $1,238.75 (+12.5%)
- Total tokens: 128.6M (+8.1%)
- Requests: 318,265 (+9.7%)
- Cache hit rate: 37.4% (+6.2%)

Spend by model
- gpt-4o: 45.6%
- claude-3-5-sonnet: 32.1%
- gpt-4-turbo: 16.7%
- other: 5.6%
One flat fee for the gateway
You pay providers directly for tokens. useLLM charges only for the gateway — routing, fallbacks, observability.
Free
- 10K routed requests / mo
- 1 alias, 2 providers
- 7-day metadata retention
Starter
- 100K routed requests / mo
- 3 aliases, 2 providers
- 14-day metadata retention
Pro
- 1M routed requests / mo
- Unlimited aliases & fallbacks
- Usage filters & model catalog
Business
- 5M routed requests / mo
- 90-day metadata retention
- Priority support
Enterprise
- 10M+ routed requests / mo
- Custom contracts
- Contact us
Provider token costs are billed by OpenAI / Anthropic on your own accounts. useLLM never resells tokens.
Questions, answered
What is BYOK and why does useLLM require it?
Bring Your Own Key. You connect your OpenAI and Anthropic accounts and we proxy through them — you pay providers directly for tokens, useLLM charges only for the gateway. No resold credits, no markup, no grey-market.
Which models can I route to?
Anything OpenAI or Anthropic ships, the moment they ship it. You bring the key; useLLM speaks the OpenAI-compatible API on top.
How do model aliases work?
Instead of hardcoding gpt-4o or claude-sonnet-4.5 into your app, you call “smart” or “coding”. The alias maps to a primary model plus a fallback chain, with retry and timeout rules, and is editable in the dashboard without redeploys.
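Conceptually, an alias is just a named mapping from a friendly label to an ordered model chain. A minimal sketch of that idea (the field names here are illustrative, not useLLM's actual dashboard schema):

```python
# Hypothetical shape of an alias definition. Field names ("primary",
# "fallbacks", "retry") are illustrative assumptions, not useLLM's schema.
aliases = {
    "smart": {
        "primary": "gpt-4o",
        "fallbacks": ["claude-sonnet-4.5", "gpt-4-turbo"],
        "retry": {"max_attempts": 2, "timeout_s": 30},
    },
}

def resolve(alias: str) -> list[str]:
    """Return the ordered model chain an alias maps to."""
    cfg = aliases[alias]
    return [cfg["primary"], *cfg["fallbacks"]]
```

Your app keeps sending `"model": "smart"`; editing the mapping in the dashboard changes what every request resolves to, with no deploy.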
What happens during a provider outage?
Each alias has a fallback chain. If the primary model returns an error or times out, useLLM retries against the next model on the chain before surfacing the failure to your app.
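The failover behavior described above can be sketched in a few lines. This is an illustration of the semantics (try each model in order, surface an error only after the whole chain fails), not the gateway's implementation:

```python
from typing import Callable

def route_with_fallbacks(chain: list[str],
                         call_model: Callable[[str], str]) -> str:
    """Try each model in the chain in order; raise only after all fail.

    `call_model` stands in for a provider request and may raise on
    errors or timeouts, as a degraded provider would.
    """
    last_error: Exception | None = None
    for model in chain:
        try:
            return call_model(model)
        except Exception as err:
            last_error = err  # remember the failure, move down the chain
    raise RuntimeError("all models in the fallback chain failed") from last_error
```

For example, if `call_model` times out on `gpt-4o` but succeeds on the backup, the caller simply receives the backup's response; only when every model in the chain errors does the failure reach your app.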
Do you store my prompts?
No. The gateway stores metadata such as alias, model, tokens, latency, status, and estimated provider cost. Prompt and response bodies are not persisted.
How does useLLM bill?
Flat monthly plans based on routed request volume. Provider token costs go to your provider directly via your key — useLLM never sees the bill.
Can I cancel anytime?
Yes. Cancel from the dashboard. Your keys keep working at OpenAI and Anthropic; you just lose the gateway features.
Your gateway to OpenAI and Claude. Ready in minutes.
Bring your provider keys, swap your base URL, and ship. Define aliases and fallbacks without redeploying.