[release] 5 min · May 11, 2026

Anthropic–SpaceX Colossus — The Opus Rate Limit Story

Anthropic leased SpaceX's entire Colossus 1 data center. Claude Code limits doubled, but the 1,500% Opus API rate limit jump is the real story for production teams.

#anthropic #claude #ai-infrastructure #rate-limits #opus

Anthropic announced on May 6 at its Code with Claude conference in San Francisco that it has leased the entire compute capacity of SpaceX’s Colossus 1 data center in Memphis — over 300 megawatts, more than 220,000 NVIDIA GPUs spanning H100, H200, and GB200 accelerators, coming online within the month. The headline will be the SpaceX partnership. The number I care about is buried further down: a 1,500% increase in Opus API input tokens per minute for Tier 1 users.

TL;DR

  • Infrastructure: Anthropic leased all of SpaceX’s Colossus 1 — 300+ MW, 220,000+ GPUs, live within weeks
  • Claude Code: 5-hour rate limits doubled for Pro, Max, Team, and seat-based Enterprise; peak-hour throttling removed for Pro and Max
  • Opus API: Input tokens per minute up 1,500%, output tokens per minute up 900% for Tier 1
  • The catch: Weekly caps did not move — multi-agent workloads still hit the same ceiling

Anthropic Leases SpaceX’s Flagship AI Facility — What Happened

Three things changed for developers on May 6. First, Claude Code’s five-hour rate limit doubled across every paid plan — Pro, Max, Team, and seat-based Enterprise. Second, peak-hour throttling was removed entirely for Pro and Max Claude Code accounts. Third, and most consequentially for API-heavy teams, Opus model rate limits jumped dramatically: Tier 1 input tokens per minute climbed 1,500%, output tokens per minute rose 900%.

The compute backing these changes is Colossus 1, originally built by xAI in Memphis before Musk merged xAI with SpaceX earlier this year. Anthropic is now the tenant of what was, until recently, the data center meant to power Grok. The irony is notable — Musk called Anthropic “misanthropic and evil” on X on February 12, just 84 days before this deal closed. He then spent time with senior Anthropic team members the week before the announcement and came away saying he was “impressed” and that Claude will “probably” be good. Compute economics override personal grudges faster than anyone expected.

This deal slots into a broader multi-vendor compute strategy Anthropic has been assembling throughout 2026. Amazon committed up to 5 gigawatts, with roughly 1 GW online by year-end. Google and Broadcom signed for 5 GW starting in 2027. Microsoft and NVIDIA are providing $30 billion in Azure capacity. Fluidstack brings $50 billion in US infrastructure investment. Colossus 1 is not Anthropic betting on SpaceX — it is Anthropic hedging against every single one of these partners simultaneously.

Anthropic also “expressed interest” in partnering with SpaceX on multi-gigawatt orbital compute capacity. This is roadmap positioning, not deployable infrastructure. The engineering timeline for orbital data centers is measured in years at minimum. Do not factor this into any near-term planning.

Why This Matters

The Claude Code doubling will get the most attention in developer forums, and it is the least interesting change. If you were hitting the five-hour limit before, you were likely also hitting the weekly cap — and that cap did not move. Anthropic’s announcement was explicit about this. For the Claude Code power user running multi-agent coding sessions that burn through quota in sustained bursts, the practical improvement is marginal. You hit the same weekly wall, just earlier in the week, because the same weekly budget now fits into fewer sessions. That is not nothing, but it is not the unlock people will mistake it for.
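The arithmetic behind that claim can be sketched in a few lines. All numbers here are hypothetical placeholders for illustration — Anthropic has not published these quotas as token counts.

```python
# Back-of-envelope: doubling the 5-hour session quota while the weekly
# cap stays flat adds no total capacity; it only compresses when you
# hit the wall. WEEKLY_CAP and OLD_SESSION_QUOTA are hypothetical.

WEEKLY_CAP = 8_000_000        # hypothetical weekly token budget
OLD_SESSION_QUOTA = 500_000   # hypothetical pre-change 5-hour quota
NEW_SESSION_QUOTA = OLD_SESSION_QUOTA * 2  # the announced doubling

def sessions_until_weekly_wall(session_quota: int, weekly_cap: int = WEEKLY_CAP) -> int:
    """How many fully maxed 5-hour sessions fit under the weekly cap."""
    return weekly_cap // session_quota

before = sessions_until_weekly_wall(OLD_SESSION_QUOTA)
after = sessions_until_weekly_wall(NEW_SESSION_QUOTA)

# Same total tokens per week -- you just burn through them in half as
# many sessions, hitting the weekly wall earlier in the calendar week.
assert before == 2 * after
```

Whatever the real quota values are, the ratio is what matters: total weekly throughput is unchanged.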

The removal of peak-hour throttling for Pro and Max is more meaningful. Before this change, Claude Code would degrade during high-demand periods — precisely the moments when developers are most likely to be working. If you have ever had a Claude Code session slow to a crawl at 2 PM Pacific on a Tuesday, that specific problem is now gone for paid users. Consistent throughput matters more than burst capacity for anyone building workflows around the tool.

But the real story is the Opus API rate limit jump. A 1,500% increase in input tokens per minute changes what you can do architecturally with Opus. The pattern I have been watching — and that teams building agentic infrastructure have been blocked on — is Opus-as-orchestrator: you run Opus at the top of an agent hierarchy handling reasoning, planning, and delegation, while Sonnet handles the execution-layer worker tasks. This configuration was previously rate-limited into near-impracticality beyond prototype scale. If you had five worker agents reporting back to an Opus coordinator, each returning context that needed to be processed, you would slam into the input token ceiling within minutes of sustained operation.
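A minimal sketch of that fan-in budget problem, with hypothetical numbers (`TOKENS_PER_REPORT` and `OLD_OPUS_ITPM` are illustrative, not Anthropic’s published limits) and no real model calls:

```python
from dataclasses import dataclass

# Hypothetical figures: the actual per-tier limits live in your
# Anthropic account settings, not in this file.
TOKENS_PER_REPORT = 12_000          # context each worker returns, assumed
OLD_OPUS_ITPM = 20_000              # assumed old Tier 1 input tokens/min
NEW_OPUS_ITPM = OLD_OPUS_ITPM * 16  # a 1,500% increase: old + 15x old

@dataclass
class OrchestratorBudget:
    """Tracks the coordinator's input-token spend against a per-minute ceiling."""
    itpm_limit: int
    spent_this_minute: int = 0

    def ingest_worker_report(self, tokens: int) -> bool:
        """Accept a worker's report, or refuse if it would exceed the ceiling."""
        if self.spent_this_minute + tokens > self.itpm_limit:
            return False  # would be throttled; caller must back off
        self.spent_this_minute += tokens
        return True

def reports_per_minute(itpm_limit: int) -> int:
    """How many worker reports the coordinator can absorb in one minute."""
    budget = OrchestratorBudget(itpm_limit)
    count = 0
    while budget.ingest_worker_report(TOKENS_PER_REPORT):
        count += 1
    return count

# Under the assumed old ceiling, a single worker report per minute gets
# through -- five workers reporting back means immediate throttling.
# The raised ceiling absorbs the whole fan-in with room to spare.
assert reports_per_minute(OLD_OPUS_ITPM) == 1
assert reports_per_minute(NEW_OPUS_ITPM) == 26
```

The point of the sketch is the shape of the constraint, not the exact counts: fan-in to an orchestrator scales with worker count times report size, and the per-minute input ceiling is the denominator.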

With a 15x increase on that ceiling, the math changes. Not “we can run experiments” math — “we can run this in staging and evaluate whether it works in production” math. That is a meaningful threshold to cross. Combined with the 900% output token increase, Opus can now both consume and generate at rates that support real multi-agent orchestration patterns rather than synthetic benchmarks. The Advisor Tool pattern, where a cheap Sonnet executor escalates hard decisions to Opus, also benefits directly — advisor calls that previously risked queueing behind rate limits now have 15x more headroom to complete without throttling.
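The Advisor Tool pattern reduces to one decision point, sketched below with stubbed model calls — the confidence heuristic, threshold, and both step functions are hypothetical stand-ins, not real Sonnet or Opus invocations:

```python
# Sketch of the advisor-escalation pattern: a cheap executor handles
# routine steps and escalates only hard decisions to an expensive
# advisor. Everything here is a stub for illustration.

ESCALATION_THRESHOLD = 0.6  # hypothetical confidence cutoff

def executor_step(task: str) -> tuple[str, float]:
    """Stub for a Sonnet-class executor: returns (answer, confidence)."""
    # Pretend short, well-specified tasks are easy and long ones are hard.
    confidence = 0.9 if len(task) < 40 else 0.3
    return f"executor:{task}", confidence

def advisor_step(task: str) -> str:
    """Stub for an Opus-class advisor handling escalated decisions."""
    return f"advisor:{task}"

def run(task: str) -> str:
    answer, confidence = executor_step(task)
    if confidence < ESCALATION_THRESHOLD:
        # These escalations are the calls that previously risked queueing
        # behind the orchestrator's own Opus traffic at the old ceiling.
        return advisor_step(task)
    return answer

assert run("rename a variable").startswith("executor:")
assert run("decide whether to split this service into three deployables").startswith("advisor:")
```

In a real deployment the escalation signal would come from the executor model itself (a tool call, a self-reported uncertainty score, or a validation failure) rather than a string-length heuristic.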

Weekly caps are still unchanged. If your multi-agent pipeline runs Opus continuously, you will hit the weekly limit well before Friday. The per-minute increase lets you burst harder, not run longer. Design your architecture for batch windows, not continuous operation, until Anthropic moves the weekly number.
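The burst-versus-endurance tradeoff is easy to quantify. With hypothetical figures (neither the per-minute ceiling nor the weekly cap below is a published Anthropic number), continuous full-rate operation exhausts a flat weekly cap in hours:

```python
# The per-minute increase buys burst capacity, not endurance.
# Both figures are hypothetical, for illustration only.

NEW_ITPM = 320_000              # assumed post-increase input tokens/min
WEEKLY_INPUT_CAP = 150_000_000  # assumed flat weekly input-token cap

minutes_to_cap = WEEKLY_INPUT_CAP / NEW_ITPM
hours_to_cap = minutes_to_cap / 60

# Under eight hours of truly continuous full-rate operation before the
# weekly cap bites -- hence the advice to design for batch windows
# rather than always-on pipelines.
assert hours_to_cap < 8
```

Plug in your own tier’s numbers and the conclusion is the same in shape: the higher the burst rate, the shorter the runway to the weekly ceiling.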

The competitive context matters here. Anthropic is not doing this out of generosity — it is doing this because OpenAI’s o3 and Google’s Gemini Ultra are both targeting the “orchestrator model” tier. If Opus cannot physically serve requests fast enough to function as an agent coordinator, teams will evaluate alternatives regardless of quality. The rate limit increase is a retention play for the highest-value API customers, funded by capacity that was literally built for a competitor’s model.

The Take

I care about one number from this entire announcement, and it is the 1,500% Opus input token rate limit increase. If that figure holds under real production load — not launch-day promotional capacity, but sustained throughput when Colossus 1 is fully allocated — it changes the economics of Opus-as-orchestrator from “interesting demo pattern” to “viable production architecture.” That is worth paying attention to.

The SpaceX headline is designed to dominate the news cycle, and it will. Musk going from “misanthropic and evil” to leasing Anthropic his flagship AI facility in under three months is a striking reversal that says more about the state of compute markets than it does about either company’s principles. But for anyone building on Claude’s API, the political theater is irrelevant. What matters is whether the capacity translates to sustained, reliable throughput — not just a better number on the rate limit page.

My concrete advice: do not rebuild your agent architecture around these limits today. Anthropic moved the per-minute ceiling but kept the weekly cap flat. That is a controlled release valve, not an open floodgate. They are managing costs carefully while signaling capacity. The weekly cap will move eventually — probably when Colossus 1 is fully online and the Amazon gigawatt comes through later this year. Until then, design your multi-agent workloads for burst patterns within weekly budgets.

If you are running the Opus orchestrator pattern at prototype scale and have been waiting for rate limits to make it viable, now is the time to run serious load tests. Not to ship to production — to validate whether the 15x headroom is enough for your specific agent topology. The window between “limits just increased” and “everyone discovered they increased” is when you will get the best real-world throughput data. Use it.

Don’t build your architecture around limits that will shift again in 90 days.