BYOK gateway for OpenAI + Anthropic

The AI gateway for production.

Bring your OpenAI and Anthropic keys. Route through one endpoint, set fallbacks, swap models without redeploys. We never resell tokens.

  • Bring your own keys
  • No markup, ever
  • Metadata-only logs
POST api.usellm.io/v1/chat/completions

curl https://api.usellm.io/v1/chat/completions \
  -H "Authorization: Bearer $USELLM_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }'
  • 99.99% gateway uptime
  • <5ms routing overhead
  • 0% token markup
Powering AI in production

Teams ship on useLLM every day.

Vercel
Replicate
Modal
LobeChat
Langfuse

Everything you need, built for developers

Bring your own keys

Connect your OpenAI and Anthropic accounts. We route through them — never resell tokens.

OpenAI-compatible endpoint

Point any OpenAI SDK at api.usellm.io/v1. Same request shape, same streaming, same tools.
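To illustrate the drop-in compatibility, here is a minimal stdlib-only sketch of how a chat request to the gateway is assembled. The only assumption beyond the source is the helper name `build_chat_request`; the URL, headers, and body match the curl example above.

```python
import json

# The only change from calling OpenAI directly is the base URL
# (and the gateway API key). The request shape is unchanged.
GATEWAY_BASE = "https://api.usellm.io/v1"  # instead of https://api.openai.com/v1

def build_chat_request(api_key: str, model: str, messages: list) -> tuple:
    """Return (url, headers, body) for a chat completion via the gateway."""
    url = f"{GATEWAY_BASE}/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({"model": model, "messages": messages})
    return url, headers, body

url, headers, body = build_chat_request(
    "sk-test", "gpt-4o", [{"role": "user", "content": "Hello!"}]
)
```

An official OpenAI SDK achieves the same thing by overriding its base URL to `https://api.usellm.io/v1`.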

Model aliases

Define names like “smart” or “cheap” and remap them in the dashboard. No redeploys.
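A hypothetical alias definition might look like the following. The field names here are illustrative assumptions, not the dashboard's actual schema:

```json
{
  "alias": "smart",
  "primary": "gpt-4o",
  "fallbacks": ["claude-3-5-sonnet", "gpt-4-turbo"],
  "retry": {"max_attempts": 2, "timeout_ms": 30000}
}
```

Your app only ever says "smart"; remapping `primary` in the dashboard changes every request instantly.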

Fallbacks & retries

Auto-fail over to a backup model when a provider is degraded. Configurable per alias.
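The failover behavior can be sketched in a few lines of Python. This is an illustration of the logic, not the gateway's implementation; the stub `fake_call` stands in for real provider requests.

```python
class ProviderError(Exception):
    """Stand-in for a provider 5xx or timeout."""

def complete_with_fallbacks(chain, call):
    """Try each model in `chain` in order; return the first success."""
    last_error = None
    for model in chain:
        try:
            return model, call(model)
        except ProviderError as exc:
            last_error = exc  # provider degraded; try the next model
    raise last_error  # whole chain failed; surface the error to the app

# Stub: pretend the primary provider is down.
def fake_call(model):
    if model == "gpt-4o":
        raise ProviderError("503 from provider")
    return f"response from {model}"

used, text = complete_with_fallbacks(["gpt-4o", "claude-3-5-sonnet"], fake_call)
```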

Deep observability

Requests, tokens, latency, failures, and estimated provider cost across every key.

<5ms overhead

A thin proxy. The gateway adds milliseconds, not seconds.

Metadata-only logs

Prompts and responses are not stored. Usage history keeps only routing, token, latency, and cost metadata.
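A usage-history record might therefore look like this (field names are illustrative; note there is no prompt or response field at all):

```json
{
  "alias": "smart",
  "model": "gpt-4o",
  "tokens_in": 812,
  "tokens_out": 1341,
  "latency_ms": 642,
  "status": 200,
  "estimated_cost_usd": 0.0321
}
```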

Normalized errors

Provider-specific quirks abstracted to one consistent error shape and status code.
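As a sketch of what normalization means in practice: OpenAI nests failures under an `error` object with a `code`, while Anthropic returns `{"type": "error", "error": {"type": ..., "message": ...}}`. A gateway can flatten both into one shape. The output field names below are assumptions for illustration:

```python
def normalize_error(provider: str, raw: dict) -> dict:
    """Map a provider-specific error body onto one consistent shape."""
    if provider == "openai":
        # OpenAI: {"error": {"message": ..., "type": ..., "code": ...}}
        err = raw.get("error", {})
        code, message = err.get("code"), err.get("message")
    elif provider == "anthropic":
        # Anthropic: {"type": "error", "error": {"type": ..., "message": ...}}
        err = raw.get("error", {})
        code, message = err.get("type"), err.get("message")
    else:
        code, message = "unknown", str(raw)
    return {"provider": provider, "code": code, "message": message}

e = normalize_error("anthropic", {"type": "error",
                                  "error": {"type": "overloaded_error",
                                            "message": "Overloaded"}})
```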

Dashboard

See every request.

Centralized visibility into routing decisions, fallbacks, latency, and estimated provider cost.

  • Requests, tokens, p95 latency
  • Fallback activations & error rate
  • Estimated cost per alias and provider
  • Per-request logs with full trace
See the dashboard

Overview (last 7 days)

Total spend: $1,238.75 (+12.5%)
Total tokens: 128.6M (+8.1%)
Requests: 318,265 (+9.7%)
Cache hit rate: 37.4% (+6.2%)

Spend over time (daily)

Spend by model

  • gpt-4o: 45.6%
  • claude-3-5-sonnet: 32.1%
  • gpt-4-turbo: 16.7%
  • other: 5.6%
Recent requests

Time | Model | Tokens | Cost | Status
May 18, 10:21:36 | gpt-4o | 2,153 | $0.0321 | 200
Pricing

One flat fee for the gateway

You pay providers directly for tokens. useLLM charges only for the gateway — routing, fallbacks, observability.

Free

$0/mo
  • 10K routed requests / mo
  • 1 alias, 2 providers
  • 7-day metadata retention

Starter

$19/mo
  • 100K routed requests / mo
  • 3 aliases, 2 providers
  • 14-day metadata retention
Most popular

Pro

$49/mo
  • 1M routed requests / mo
  • Unlimited aliases & fallbacks
  • Usage filters & model catalog

Business

$149/mo
  • 5M routed requests / mo
  • 90-day metadata retention
  • Priority support

Enterprise

Custom
  • 10M+ routed requests / mo
  • Custom contract placeholder
  • Contact us

Provider token costs are billed by OpenAI / Anthropic on your own accounts. useLLM never resells tokens.

FAQ

Questions, answered

What is BYOK and why does useLLM require it?

Bring Your Own Key. You connect your OpenAI and Anthropic accounts and we proxy through them — you pay providers directly for tokens, and useLLM charges only for the gateway. No resold credits, no markup, no grey-market keys.

Which models can I route to?

Anything OpenAI or Anthropic ships, the moment they ship it. You bring the key; useLLM speaks the OpenAI-compatible API on top.

How do model aliases work?

Instead of hardcoding gpt-4o or claude-sonnet-4.5 into your app, you call “smart” or “coding”. The alias maps to a primary model plus a fallback chain, with retry and timeout rules, and is editable in the dashboard without redeploys.

What happens during a provider outage?

Each alias has a fallback chain. If the primary model returns an error or times out, useLLM retries against the next model on the chain before surfacing the failure to your app.

Do you store my prompts?

No. The gateway stores metadata such as alias, model, tokens, latency, status, and estimated provider cost. Prompt and response bodies are not persisted.

How does useLLM bill?

Flat monthly plans based on routed request volume. Provider token costs go to your provider directly via your key — useLLM never sees the bill.

Can I cancel anytime?

Yes. Cancel from the dashboard; your keys keep working at OpenAI and Anthropic, and you just lose the gateway features.

Your gateway to OpenAI and Claude. Ready in minutes.

Bring your provider keys, swap your base URL, and ship. Define aliases and fallbacks without redeploying.