The Decoder

The AI industry is running out of compute, with outages, rationing, and rising GPU prices

What Happened

Surging demand for AI agents is colliding with limited compute capacity. Anthropic is struggling with outages, OpenAI announced the end of Sora, and GPU prices have jumped nearly 50 percent, according to market data.

Our Take

GPU spot prices jumped ~50% and Anthropic's API has hit visible capacity limits. Inference demand from agentic workloads is outpacing hardware supply — not as a future projection, but right now.

If your pipeline routes every step through Claude Opus, you're competing for the same constrained pool that is causing the current outages. Downtiering RAG retrieval and classification steps to Haiku or Sonnet isn't a quality tradeoff; it's the only reliable path to uptime. Most teams treat model selection as a capability decision; it's now a capacity decision.

High-volume agent pipelines need a model-tier audit today. Teams running low-frequency, latency-tolerant workloads can ignore this for now.

What To Do

Route classification and RAG retrieval to claude-haiku instead of claude-opus. Opus capacity is constrained, and Haiku handles these tasks at ~10x lower cost with no meaningful quality loss.
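
A minimal sketch of what this routing could look like, assuming each pipeline step is tagged with a task type. The tier names `claude-haiku` and `claude-opus` are the shorthand used above, not exact API model identifiers. `TASK_TIERS`, `pick_model`, `CapacityError`, and `call_with_fallback` are hypothetical names, and `send` stands in for whatever client call your stack already uses.

```python
from typing import Callable

# Shorthand tier names from the text above. These are placeholders, not real
# model IDs; swap in the identifiers your provider documents.
SMALL_TIER = "claude-haiku"
LARGE_TIER = "claude-opus"

# Hypothetical task labels mapped to the tier each step should use.
TASK_TIERS = {
    "classification": SMALL_TIER,
    "rag_retrieval": SMALL_TIER,
    "planning": LARGE_TIER,
}


def pick_model(task_type: str) -> str:
    """Return the tier for a task; unknown tasks default to the large tier."""
    return TASK_TIERS.get(task_type, LARGE_TIER)


class CapacityError(Exception):
    """Hypothetical error a client wrapper raises when a tier is overloaded."""


def call_with_fallback(task_type: str, prompt: str,
                       send: Callable[[str, str], str]) -> str:
    """Try the preferred tier first; on a capacity error, try the other tier."""
    preferred = pick_model(task_type)
    fallback = LARGE_TIER if preferred == SMALL_TIER else SMALL_TIER
    try:
        return send(preferred, prompt)
    except CapacityError:
        return send(fallback, prompt)
```

Defaulting unknown tasks to the large tier is a conservative choice: nothing is downtiered until someone has explicitly listed it. Falling back from the large tier to the small one trades quality for availability, so it may not suit every step.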

Builder's Brief

Who

Teams running inference workloads in production with SLA commitments

What changes

Reserved instance pricing and multi-cloud inference routing become operational necessities, not optimizations

When

Now

Watch for

H100/H200 spot prices on Lambda Labs and CoreWeave week-over-week — leading indicator of rationing severity
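
One way to track this signal is a week-over-week percent change over a price log you keep yourself. The sketch below assumes one recorded price per week. `week_over_week` is a hypothetical helper, and the sample values are made-up placeholders, not market data.

```python
def week_over_week(prices: list[float]) -> list[float]:
    """Percent change between each consecutive pair of weekly prices."""
    changes = []
    for prev, curr in zip(prices, prices[1:]):
        if prev <= 0:
            raise ValueError("prices must be positive")
        changes.append((curr - prev) / prev * 100.0)
    return changes


# Made-up illustrative values in USD per GPU-hour, NOT real quotes.
sample = [2.00, 2.20, 2.75]
print([round(change, 1) for change in week_over_week(sample)])
```

A run of consecutive positive changes would be the pattern to watch; a single spike could be ordinary spot-market noise.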

What Skeptics Say

GPU price spikes are cyclical and new fab capacity coming online in late 2026 will correct shortages; the 'compute crisis' narrative conveniently benefits incumbent cloud providers who can absorb costs that squeeze out smaller competitors. Reported 50% price jumps may reflect spot-market volatility, not structural scarcity.

2 comments

Priya Mehta

the largest AI companies on earth are rationing GPUs. this is not a vibe shift, this is a real problem

Tobias Engström

OpenAI killed Sora because of compute? they killed it because nobody was using it. don't let them blame infrastructure
