2026-04-06 19:03:22
Cheaper AI was supposed to ease the compute crunch. Instead, it made it worse. The Jevons paradox, applied to intelligence, means that every time the price of a token falls, demand rises faster than supply can scale.
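A toy way to see the mechanism – a minimal sketch with illustrative numbers, not any lab's actual figures: model token demand with a constant price elasticity above one, so each price cut raises consumption by more than it lowers the unit price.

```python
# Toy Jevons-paradox model: constant-elasticity demand, q = k * p**(-eps).
# All numbers are illustrative, not real market data.

def tokens_demanded(price: float, k: float = 1.0, eps: float = 1.5) -> float:
    """Tokens demanded at a given price; eps > 1 means demand is elastic."""
    return k * price ** (-eps)

p0, p1 = 1.0, 0.5  # price per token halves
q0, q1 = tokens_demanded(p0), tokens_demanded(p1)

print(f"tokens demanded: {q1 / q0:.2f}x")                # ~2.83x
print(f"total spend:     {(p1 * q1) / (p0 * q0):.2f}x")  # ~1.41x
```

With elastic demand, halving the price nearly triples consumption, so total compute demanded rises even as each token gets cheaper.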
But the labs can’t clear the queue by raising prices – their customers would defect to the next best alternative. So the compute crunch shows up where economists would least expect it. As we wrote yesterday, OpenAI is passing on opportunities, Anthropic is tightening session limits for its Pro users and open-weight models are being withdrawn.
Let’s start with demand. OpenAI’s APIs processed 6 billion tokens per minute in October 2025. By April this year, that had risen to 15 billion, a 2.5x increase in five months. Both OpenAI and Anthropic are racing to secure as much compute as they can to meet it.
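That works out to roughly 20% compound growth per month – a quick check:

```python
# Implied monthly growth: 6B to 15B tokens/minute over five months.
start, end, months = 6e9, 15e9, 5
monthly = (end / start) ** (1 / months)
print(f"{monthly:.3f}x per month")  # ~1.201, i.e. ~20% compounded monthly
```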
That relentless load is why Google’s TPUs across all seven generations, some now seven or eight years old, are running at full utilization. Hardware that was expected to have depreciated years ago is still earning its keep.
The pressure shows up in revenue. For instance, Anthropic’s total revenue is growing, but the price per token is falling – and it’s falling faster than overall revenue is rising. This means the business is becoming increasingly dependent on volume to sustain growth.
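To see why volume has to do all the work – hypothetical figures, not Anthropic's actual numbers – note that revenue is just price per token times tokens served:

```python
# Revenue = price per token x tokens served. Hypothetical figures only.
revenue_growth = 2.0    # suppose revenue doubles over a year
price_multiplier = 0.4  # while the price per token falls 60%

volume_growth = revenue_growth / price_multiplier
print(f"tokens served must grow {volume_growth:.1f}x")  # 5.0x
```

Every point of price decline raises the volume bar the business has to clear.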
Users are feeling the squeeze, too. Across every major AI platform, usage allowances tightened last year – more tiers, stricter limits, changes that often arrived without notice.
2026-04-05 12:21:48
Hi all,
Welcome to the Sunday edition, where we make sense of the week behind us. Before you dive in, catch my latest podcast on how I adapted Andrej Karpathy’s autoresearch for knowledge work (the essay on this is here, plus a GitHub repo for paying members of Exponential View).
A couple of months ago, I wrote that AI capex looked more like a stampede than a bubble. I argued that by fixating on the bubble question, markets are worrying about the wrong thing. The real issue is the compute crunch. What I wrote then still stands:
The Big Tech companies have massive demand from generative AI workloads and the growth of cloud, which means they are having to turn business away. Last year, AWS lost a $10 million contract to host Fortnite because it couldn’t guarantee compute capacity. Something similar happened to Microsoft.
This week, OpenAI’s CFO said they’re passing on opportunities because there’s not enough compute. Codex went from 100,000 to 2 million developers in three months. Killing Sora will help. Anthropic has tightened its limits – some 7% of users will hit session caps they wouldn’t have hit before. H100 rental prices hit an 18-month high. Meanwhile, Alibaba closed-sourced Qwen, its open-weight leader. Tomorrow’s data edition will be dedicated to this trend.
See also:
The decade-long feud between OpenAI and Anthropic: A deeply reported piece on the history of the split and the competing visions of AI that emerged from it.
The same day OpenAI killed Sora, Kuaishou’s Kling AI reported $300 million in annualized video revenue (Q4: $47 million), with revenue expected to more than double in 2026.
GitHub Copilot started injecting promotional content into code reviews. The developer community was not happy and the feature was pulled back.
Economists forecast that AI will add 1-1.5 percentage points to annual US growth by 2050 under a rapid-progress scenario. They also expect 10 million fewer jobs and inequality at its highest since 1939.
My friend believes the consensus underweights AI-driven GDP growth. Erik and others point out that economists are anchored to historical precedent. I think it’ll be complex. We could reasonably see a phase change, much as the 1820s saw a phase change in long-range GDP growth. But measured GDP growth might look disappointing if there is price deflation in AI-affected sectors or an expansion in non-market value. The revolution might be real, but the measuring tape might miss it.
2026-04-03 01:01:09
Science is the most reliable method humanity has found for producing knowledge. It has also, for most of history, been expensive to run.
Andrej Karpathy released 600 lines of Python code a few weeks ago that started to change that. His autoresearch (see EV#565) runs an autonomous experimental loop in which a human sets a strategic direction and defines what good looks like, and the agent iterates towards success within those guardrails. In Andrej’s initial experiment, running over two days, it trained a GPT-2-level model 11% faster and found 20 genuine improvements.
Shortly after the release, Shopify’s CEO Tobi Lütke used autoresearch on his company’s internal model, qmd; it ran 37 experiments overnight, and Tobi woke up to a 0.8-billion-parameter model outscoring his previous 1.6-billion-parameter version by 19%. Tobi is not a machine learning engineer.
Autoresearch is powerful because it solves two problems at once. One, it automates part of the knowledge-production process. And two, it solves the agent control problem: it keeps agents on task. AI often drifts if you give it an open-ended brief or let it optimize for the wrong thing. To my great joy, autoresearch prevents this by design. The human decides where the car is going; autoresearch keeps its hands on the wheel.
I spent the last month adapting autoresearch for knowledge work beyond machine learning with the goal of spinning up a system that can run structured, low-cost experiments on the kinds of decisions most teams make every week. I’m calling this version AutoBeta, and I’m making the full playbook/skill available to paying members below.
Let’s go!
When I first saw autoresearch, my immediate reaction was that it didn’t have to be just about machine learning. The loop is generic – hypothesize, test, score, iterate. So I cloned it and started running it on other bits of my work.
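Here is the shape of that loop as I think of it – a minimal, runnable sketch with a toy objective standing in for a real experiment, not Karpathy's actual code:

```python
import random

def score(x: float) -> float:
    """Toy objective with a peak at x = 3; stands in for any evaluation
    you can automate, from a training run to a rubric-graded draft."""
    return -(x - 3.0) ** 2

def propose(best: float) -> float:
    """Hypothesize: perturb the current best candidate."""
    return best + random.gauss(0, 0.5)

def autoresearch_loop(baseline: float, budget: int = 200) -> tuple[float, float]:
    """Hypothesize -> test -> score -> iterate, keeping only improvements."""
    best, best_score = baseline, score(baseline)
    for _ in range(budget):
        candidate = propose(best)  # hypothesize
        s = score(candidate)       # test and score
        if s > best_score:         # iterate: keep only what improves
            best, best_score = candidate, s
    return best, best_score

print(autoresearch_loop(baseline=0.0))  # converges near x = 3
```

The human contribution lives entirely in `score` and the constraints; everything inside the loop is mechanical.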
It didn’t go exactly as I expected. The outputs looked fine, but I couldn’t tell whether they were getting better. Unlike ML, where the agent gets a built-in feedback signal from every training run, knowledge work has no such signal. A pricing decision doesn’t validate itself in five minutes, and most of the time the paragraphs I write don’t tell me whether the argument is getting better or just changing.
This is what makes applying autoresearch to knowledge work genuinely hard. The loop needs something to optimize against, and in knowledge work, that something doesn’t exist naturally.
So I constructed a version of autoresearch, called AutoBeta, that works across a wide range of business problems. It is not as technically robust as Karpathy’s, but it follows the same design principles: I set the objective and the constraints, and the experimentation takes place within the loop.
The one thing I changed was the score. I created an “oracle”: a panel of synthetic judges that scores each output against pre-defined criteria and collapses the verdicts into a single number the loop can optimize for.
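A minimal sketch of what I mean, with a placeholder heuristic where the real system would prompt an LLM judge, and illustrative criteria and weights:

```python
# Sketch of the oracle: each judge scores one criterion from 0-10 and a
# weighted mean collapses the panel into the single number the loop
# optimizes. judge() is a placeholder for an actual LLM call.

CRITERIA = {  # criterion -> weight (illustrative)
    "clarity": 0.4,
    "evidence": 0.4,
    "actionability": 0.2,
}

def judge(output: str, criterion: str) -> float:
    """Placeholder: in practice, prompt an LLM to grade `output` against
    `criterion` on a 0-10 scale. Here, a crude word-count heuristic."""
    return min(10.0, len(output.split()) / 20)

def oracle(output: str) -> float:
    """Weighted mean of per-criterion scores: one scalar for the loop."""
    return sum(weight * judge(output, criterion)
               for criterion, weight in CRITERIA.items())

print(round(oracle("One candidate draft " * 30), 2))  # 4.5
```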
2026-03-30 23:19:10
Editor’s note (31 March 2026): A previous version of this piece misstated the impact of data centres’ waste heat on local land temperatures, based on an over-interpretation of an early preprint. We’ve updated the post with the best current evidence and point readers who want to go deeper to a detailed critique of the preprint.
Hi all,
As regular readers of Exponential View know, the AI economy is scaling exponentially. Its physical dependencies – power, water, land, permits – do not. The gap between those curves defines the next phase of the build-out.
We dig into the numbers behind this gap in today’s Data edition.
Demand is locked in. Data center capacity is not hypothetical – 89% of the capacity under construction in North America is pre-leased, and barely a tenth of new supply is uncommitted. We are in a stampede after all (not a bubble).
The grid is the bottleneck. America’s data center pipeline has exploded to 241 GW – up 159% in a single year – but two-thirds of it is stuck. Grid connection queues and labor shortages mean most of that capacity exists only on paper.
The mainstream has long picked the wrong fight over water. Golf courses in the US consume more than 30x as much water as the entire country’s data center industry uses for cooling.1
Pledges all round. Anthropic promised to pay the full cost of grid upgrades for its data centers. Meanwhile, Microsoft committed to cutting data center water-use intensity by 40% by 2030, and Google pledged to replenish 120% of the freshwater it consumes.
The pledges will help, but they address a resource that data centers use far less of than is commonly assumed. The externalities that communities actually live with are harder to offset with a corporate commitment: heat and jobs.
Land heat footprint. A recent preprint using satellite data reports that land surface temperature around data centres rises by 2°C on average after construction. The likely cause: replacing grassland with buildings and tarmac, which reads as hotter from space regardless of what’s inside them. The study doesn’t separate this from operational waste heat.
Hard hat economy. A typical data center creates 1,000-10,000 construction jobs but only 50-300 permanent roles. The gains are real but temporary.
These concerns, concentrated in the communities hosting the infrastructure, have sparked something unusual in American politics right now – organized, bipartisan resistance that is making local democracy relevant again.
$100 billion pushback. Locals blocked or delayed projects representing nearly $100 billion in combined investment in Q2 last year.
Bipartisan resistance. 55% of politicians who publicly opposed data centers were Republicans, and 45% Democrats. A local organizer in one of the affected counties said it well:
This is an issue that’s bringing us all together and that, I think, gives me hope.
Thanks for reading!
1. US golf courses used 1.63 million acre-feet of water in 2024. One acre-foot equals 325,851 gallons, giving a total of approximately 531 billion gallons. US data centers consumed an estimated 17 billion gallons of water directly for cooling in 2023. Both values use the most recent figures for direct water usage.
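For anyone who wants to check the arithmetic behind the 30x claim:

```python
# Checking the footnote's arithmetic on golf vs data center water use.
acre_feet = 1.63e6                        # US golf course use, 2024
gallons_per_acre_foot = 325_851
golf = acre_feet * gallons_per_acre_foot  # ~5.31e11 gallons

datacenters = 17e9                        # direct cooling use, 2023

print(f"golf courses: {golf / 1e9:.0f} billion gallons")  # ~531
print(f"ratio: {golf / datacenters:.0f}x")                # ~31x
```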
2026-03-29 10:00:42
Hi,
Welcome to the Sunday edition, in which we make sense of the week behind us.
If you prefer listening, here’s my latest podcast episode, where I unpack NVIDIA’s bet on Groq and OpenClaw – and what it means for the organizations you run. Watch on YouTube or listen here.
2026-03-27 22:38:16