Let your agents run indefinitely.
Never hit rate limits again.
Keep OpenClaw, Hermes, Pi, Conductor, and custom workers running through long tasks without token anxiety. OpenLimits gives your agents unlimited token usage, no cooldowns, and no rate-limit lockouts — across GLM, MiniMax, DeepSeek, Claude, and Codex model groups from one endpoint.
Hi! I'm Claude. Try asking me anything — this is a live demo running on OpenLimits infrastructure.
5 free messages — no signup required
Your agents need more than one model family.
Agent workflows are uneven: fast loops, tool calls, planning passes, and expensive final answers should not all use the same model. OpenLimits lets you assign the right model group to the right workload.
Open, agent, and premium model groups. One key.
Use GLM, MiniMax, and DeepSeek for efficient agent loops. Move harder reasoning to Claude or Codex when the task needs it. Keep the same endpoint, dashboard, and usage history.
Effort Levels
Effort parameter — low, medium, high — works out of the box. Dial quality vs speed per request.
Your Own Dashboard
Real-time analytics, live request feed, per-model breakdowns, token tracking. Know exactly where every token goes.
Works With Everything
One API key works everywhere: OpenClaw, Hermes, Pi, Conductor, OpenCode, Claude Code CLI, Codex Desktop, Cursor, direct API, and compatible clients.
No More 4-Hour Cooldowns
Other users report getting locked out for hours after just a few messages. With OpenLimits, you never see a cooldown. Your workflow stays unbroken.
How we give you
unlimited usage
It's not a different model. It's not magic. It's infrastructure.
Bulk enterprise capacity
We purchase our own enterprise-tier API access directly from Anthropic and OpenAI — no stolen keys, no scraped credentials, no gray-market tokens. Every request hits the real Claude or Codex API through our legitimately provisioned accounts. You get the benefit of that capacity at a fraction of the cost.
Multiple providers, one key
Your requests are spread across a pool of provider accounts. No single account gets overloaded, so you never see a rate limit or cooldown.
Smart routing
Every request is routed to the provider with the lowest current utilization. If one gets throttled, we instantly fail over to another. You never notice — your request just goes through.
Full models, zero censorship
You get access to the model groups your agents need — GLM, MiniMax, DeepSeek, Claude, and Codex — with no watered-down system prompts, no refusal layers on top, and no conversation logging. We don't store your prompts or responses. Your data stays yours.
The result: your agents can run cheap fast loops, premium reasoning passes, and fallback routes through the same OpenLimits key.
1,800 requests per minute.
No concurrency limit. Period.
Our only limit is 30 requests per second (1,800/min) — and zero concurrency restrictions. Compare that to everyone else.
| Provider | Requests / min | Concurrency | Cooldowns |
|---|---|---|---|
| OpenLimits | 1,800 | Unlimited | None |
| z.ai | 60 | Token-limited | 429 errors |
| Anthropic API Tier 1 | 50 | Token-limited | 429 errors |
| Anthropic API Tier 4 | 4,000 | Token-limited | 429 errors |
| Claude Pro / Max | N/A | ~5 messages | 5h & 7d lockouts |
| OpenAI API | ~3,500 | Token-limited | 429 errors |
| xAI / Grok | Varies | Token-limited | 429 errors |
Anthropic's Tier 4 requires $400+ in deposits and still enforces strict per-model token-per-minute caps. Claude Pro/Max subscriptions lock you out after a handful of messages with multi-hour cooldowns. OpenLimits has no token-per-minute limits, no concurrency cap, and no cooldown periods — just a simple 30 req/s throughput limit that normal usage never hits.
Three steps. Thirty seconds.
Seriously, that's all it takes. No infra, no config files, no 40-page docs. See detailed setup instructions.
Sign up & get your key
Create an account and get your API key instantly. One key unlocks the model groups on your plan.
~10 secondsSet one environment variable
Point your tool to our endpoint. That's literally one line in your shell config.
~10 secondsCode like normal
Open OpenClaw, Hermes, Pi, Conductor, OpenCode, Claude Code CLI, Codex Desktop, Cursor — whatever you use. It just works. No changes to your workflow.
~10 secondsPlans by model group.
Give each agent only the access it needs, then upgrade when workloads need premium Claude and Codex models.
| Model access | CORE$39/moOpen model access for lightweight agents and background jobs | AGENTS$79/moThe default plan for day-to-day coding agents | WEEK PASS$45/weekFull Max access for one week of agent-heavy work | DAY PASS$10/24hFull Max access for one day of model-heavy work | MAX$120/moFull access for heavy agent workloads and premium models |
|---|---|---|---|---|---|
| Model groups | |||||
| Unlimited GLM 5.1 & GLM V5 Turbo | ✓ | ✓ | ✓ | ✓ | ✓ |
| Unlimited MiniMax M3 | ✓ | ✓ | ✓ | ✓ | ✓ |
| Unlimited DeepSeek V4 Family | ✓ | ✓ | ✓ | ✓ | ✓ |
| Unlimited Claude Sonnet + Haiku | — | ✓ | ✓ | ✓ | ✓ |
| Unlimited Codex mini | — | ✓ | ✓ | ✓ | ✓ |
| Unlimited Claude Opus | — | — | ✓ | ✓ | ✓ |
| Unlimited full Codex/GPT group | — | — | ✓ | ✓ | ✓ |
| Limits and routing | |||||
| No cooldowns | — | ✓ | ✓ | ✓ | ✓ |
| No usage caps | — | — | ✓ | ✓ | ✓ |
| Automatic failover | ✓ | ✓ | ✓ | ✓ | ✓ |
| Highest model-group access | — | — | — | — | ✓ |
| API features | |||||
| Streaming responses | ✓ | ✓ | ✓ | ✓ | ✓ |
| OpenAI-compatible endpoint | ✓ | ✓ | ✓ | ✓ | ✓ |
| Works with OpenClaw, Hermes, and Pi | ✓ | ✓ | ✓ | ✓ | ✓ |
| Works with Claude-compatible clients | ✓ | ✓ | ✓ | ✓ | ✓ |
| Effort levels + extended thinking | — | ✓ | ✓ | ✓ | ✓ |
| Image support | ✓ | ✓ | ✓ | ✓ | ✓ |
| Dashboard & analytics | ✓ | ✓ | ✓ | ✓ | ✓ |
| Billing | |||||
| Cancel anytime | ✓ | ✓ | ✓ | ✓ | ✓ |
| Start with | Start Core → | Start Agents → | Start Week Pass → | Start Day Pass → | Start Max → |
Built for devs who ship
Not a toy. Real infrastructure with real observability.
Full Streaming
Real-time SSE streaming, fully native. No wrappers, no latency overhead.
Token Analytics
Input, output, cache reads, cache writes — per request. See where your tokens go.
Model Breakdown
Usage by model, daily trends, cost estimates. All in your dashboard.
Live Feed
Watch requests stream in real-time. Filter by model. See tokens flow as they happen.
Zero Downtime
No cooldowns, no waiting, no “please try again later.” Your requests always go through. Period.
Native Agent Model Access
GLM, MiniMax, DeepSeek, Claude, and Codex model groups are exposed through one OpenLimits model list and one API key.
Questions you probably have
Is there a catch?
No hidden fees. Plans are recurring subscriptions that renew automatically until you cancel — cancel anytime from the billing portal. Full refund within 14 days if you haven't used the service at all. After any usage, all sales are final. See our Refund Policy.
Is this built for agents?
Yes. OpenLimits is positioned for OpenClaw, Hermes, Pi, Conductor, OpenCode, Cursor, Claude Code CLI, Codex Desktop, and custom workers that need stable API-compatible model access.
What models do I get?
Plans are grouped by access: Core includes unlimited GLM, MiniMax, and DeepSeek; Agents adds unlimited Claude Sonnet/Haiku and Codex mini; Max adds unlimited Claude Opus and Codex/GPT model groups.
Can this get me banned?
No. You're not using anyone else's account or violating any terms. We use our own enterprise accounts purchased directly from Anthropic and OpenAI. Your usage goes through our infrastructure — your personal accounts are never involved or at risk.
Does it work with Claude Code CLI?
Yep. Set one environment variable and it works exactly as you'd expect — extended thinking, streaming, everything.
What about rate limits?
No 5-hour rolling limits, no 7-day caps, no concurrency limits, no token-per-minute caps. Our only limit is 30 requests per second (1,800/min) — well above what any human or coding tool needs. Compare that to z.ai (60 RPM), Anthropic's API (50 RPM on Tier 1), or Claude Pro/Max (locked out after a few messages). See the full comparison.
Is setup actually 30 seconds?
Yes. Get a key, paste one env variable, done. We timed it. Multiple times. It's 30 seconds.
Can I track my usage?
Your own dashboard with real-time analytics, request history, model breakdowns, token tracking, and a live feed. It's pretty nice.
Stop hitting limits. Start shipping.
From $39/month. GLM, MiniMax, DeepSeek, Claude, and Codex model groups for your agents. Your own dashboard.
Choose Your Plan →