2026-05-25 15:00:52
Hi all,
Last week, we explored what tokenmaxxing means for CFOs and how firms can buffer unexpected AI costs.
Today we go further: why AI bills are so hard to forecast, and what will happen as we crack that problem.
We estimate that the number of tokens processed per quarter has grown by around 17,000x over four years.1
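To get a feel for that number, here is a rough sketch of the steady growth rate it implies. It assumes growth was constant and that four years means 16 quarterly steps; both are simplifications of ours, not part of the estimate. On those assumptions it works out to a bit over 1.8x per quarter, or roughly 11x per year.

```python
def implied_growth(total_multiple: float, periods: int) -> float:
    """Constant per-period growth factor that compounds to total_multiple."""
    return total_multiple ** (1 / periods)


TOTAL_MULTIPLE = 17_000  # the estimate quoted above
QUARTERS = 16            # assumption: four years treated as 16 quarterly steps
YEARS = 4

quarterly = implied_growth(TOTAL_MULTIPLE, QUARTERS)
annual = implied_growth(TOTAL_MULTIPLE, YEARS)

print(f"Implied growth per quarter: {quarterly:.2f}x")
print(f"Implied growth per year:    {annual:.1f}x")
```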
Token prices have collapsed during this time. Demand for machine intelligence is highly elastic, meaning that as prices fall, consumption increases by more than the decline in price.
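A minimal sketch of what elastic demand does to a bill, using a textbook constant-elasticity demand curve. The base price, base volume and the elasticity of 1.5 are hypothetical values picked for illustration, not estimates. The point is only that when elasticity is above 1, total spend rises as the unit price falls.

```python
BASE_PRICE = 10.0    # hypothetical: dollars per million tokens
BASE_TOKENS = 100.0  # hypothetical: million tokens per month at the base price
ELASTICITY = 1.5     # hypothetical: any value above 1 counts as elastic


def tokens_demanded(price: float, elasticity: float = ELASTICITY) -> float:
    """Constant-elasticity demand: volume scales with (price ratio) ** -elasticity."""
    return BASE_TOKENS * (price / BASE_PRICE) ** (-elasticity)


def monthly_spend(price: float, elasticity: float = ELASTICITY) -> float:
    """Total bill is unit price times the volume demanded at that price."""
    return price * tokens_demanded(price, elasticity)


for price in (10.0, 5.0, 1.0):
    print(
        f"price ${price:>5.2f}/M  "
        f"tokens {tokens_demanded(price):>8.0f}M  "
        f"spend ${monthly_spend(price):>8.0f}"
    )
```

With an elasticity of exactly 1 the bill would stay flat as prices fall, and below 1 it would shrink.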
One reason is that cheaper tokens have made agents economically viable. At the same time, agents use tokens at rates that are orders of magnitude higher than those of chatbots for single-turn queries. That shows up as the total tokens processed per output token—advanced models do a lot of processing below the surface that a user doesn’t see.
A lot of this growth is driven by China’s domestic demand and its model providers, especially ByteDance and Alibaba.
When you use an AI agent, the final result you get is really just a summary of all the work the agent has undertaken. There may be dozens of tool calls to browse the web or load a file to check and validate the work it has done. All of these steps consume tokens: they become hidden multipliers.
The first of these is token amplification.2 A coding agent that operates over 10 turns might need to re-read its full context every turn. That repetitive reading of context could use as many as 55x more tokens than a single-turn query for the same task.
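The mechanics can be sketched in a few lines. This toy model assumes the agent re-reads its entire context on every turn and that each turn appends a fixed amount of new material; the token counts are hypothetical. With each turn adding as much context as the original prompt, ten turns come out at 55x the input tokens of a single turn. That happens to match the figure above, though the original estimate may rest on different assumptions.

```python
def agent_input_tokens(base_context: int, added_per_turn: int, turns: int) -> int:
    """Total input tokens read when the full context is re-read every turn."""
    total = 0
    context = base_context
    for _ in range(turns):
        total += context           # the whole context is read again this turn
        context += added_per_turn  # outputs and tool results get appended
    return total


BASE_CONTEXT = 2_000    # hypothetical: prompt plus task description
ADDED_PER_TURN = 2_000  # hypothetical: new context appended each turn
TURNS = 10

single_turn = agent_input_tokens(BASE_CONTEXT, ADDED_PER_TURN, 1)
multi_turn = agent_input_tokens(BASE_CONTEXT, ADDED_PER_TURN, TURNS)

print(f"single turn: {single_turn:,} input tokens")
print(f"{TURNS} turns:    {multi_turn:,} input tokens")
print(f"amplification: {multi_turn / single_turn:.0f}x")
```

Because the context grows every turn, the total grows roughly with the square of the number of turns, which is why long agent runs get expensive so quickly.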
Actual active inference is probably only 15-20% of the total token consumption. The rest is invisible work that you, as a user, and possibly the company paying for it all, haven’t modelled.
The long tail of tool calls
Agents make anywhere between five and twenty-five tool calls per task. And each call adds more context, tokens and API costs. It also increases the likelihood that the model will need to retry the task to get it right.
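Here is a rough way to see how tool calls and retries compound. The sketch assumes every tool call succeeds independently with the same probability, that one failed call forces the whole task to be retried, and that the agent retries until it succeeds. Those assumptions, and all the token counts, are ours and purely illustrative.

```python
def expected_tokens_per_task(
    tool_calls: int,
    tokens_per_call: int = 1_500,    # hypothetical tokens added by each tool call
    base_tokens: int = 1_000,        # hypothetical tokens for the task itself
    per_call_success: float = 0.98,  # hypothetical chance a single call goes right
) -> float:
    """Expected tokens per task, counting whole-task retries until success."""
    tokens_per_attempt = base_tokens + tool_calls * tokens_per_call
    task_success = per_call_success ** tool_calls  # every call must succeed
    expected_attempts = 1 / task_success           # mean of a geometric distribution
    return tokens_per_attempt * expected_attempts


for calls in (5, 15, 25):
    print(f"{calls:>2} tool calls: ~{expected_tokens_per_task(calls):,.0f} tokens expected")
```

In this model the cost rises faster than the number of calls, because each extra call both adds tokens and lowers the odds of finishing in one attempt.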
2026-05-24 10:44:54
American college graduates are angry. In my Saturday column, I write:
The narrative around AI has been about promises for tomorrow, but sacrifices today. […] Engaging with this resistance won’t be effective if AI leaders throw distant fantasies at people dealing with muddy water coming from their kitchen tap today.
An OpenAI reasoning model solved an 80-year-old open problem in discrete geometry. The interesting point here is that the reasoning model found an unexpected bridge between two different fields of math. Discrete geometry and algebraic number theory are meaningfully separate cultures. An expert in one field might know something about the other, but not at the depth required for breakthrough research.
The human mathematicians who validated the proof reckon that the connection the AI found was both unexpected and non-obvious. In a sense, it’s reminiscent of AlphaGo’s Move 37.
Because AI systems can span different domains of knowledge, my hunch is that one of the biggest dividends we will see from them in these fields will be connecting domains that live in isolation because of the structure and specialization of modern science.
But of course, another way AI will impact research is by speeding up the scientific method: the full hypothesis-experiment-analysis loop.
Robin, a multi-agent system, completed the full cycle of hypothesis generation, experimental selection, data analysis and refinement. Human scientists ran specified experiments in a wet lab. The outcome was the identification of an existing drug that could be repurposed to treat macular degeneration.
2026-05-23 18:56:28
Anthropic will turn a profit this quarter, two years ahead of schedule. This is off the back of extraordinary revenue growth. In the second quarter this year, the company is likely to gross $10.9bn in revenue, more than its entire lifetime revenue to date. Operating profits should weigh in at $559 million. It’s an extraordinary achievement and one set to feature in the annals of business history.
But the backlash against AI is growing ever stronger.
Earlier this week, Google’s former boss, Eric Schmidt, was solidly booed at a college commencement address.
At another college commencement, boos shocked the commencement speaker who proselytized AI. Joanna Stern interviewed one of the college students, Houda Eletr:
Joanna: Do you use AI?
Houda: Throw it away….if you create this big thing that is supposedly going to save a human’s life, then you should give it to me in a human way.
A reader from New York sent me this photo:
There will be more booing.
Consider data centers and their local impacts.
At a hearing, American politician Alexandria Ocasio-Cortez showed a jar of mud-brown water collected from a tap in Georgia, shortly after Meta started building a data center.
The narrative around AI has been about promises for tomorrow, but sacrifices today. AI leaders have warned that the world could be destroyed by this technology.
They warned that large numbers of people, particularly graduates, would lose their jobs. But everyone knows the founders are making more money than Croesus. And now, there is construction, a physical manifestation, a real inconvenience, something you can point at.
And for what? So we can explore the stars using intelligent spaceships and ‘colonize the galaxy’ according to Demis Hassabis. (Demis is by far the most sympathetic of all the AI leaders, but even his messages feel out of touch.)
Engaging with this resistance won’t be effective if AI leaders throw distant fantasies at people dealing with muddy water coming from their kitchen tap today.
Neither will facts. The US energy system is getting a much-needed boost in investment to upgrade energy generation and the grid to more modern technologies. I am certain that in a decade we’ll look back on this as the time when the US grid prepared for the 21st century, becoming more resilient, efficient and greener. But that well-informed argument is completely irrelevant in the face of today’s resistance.
Fantasies or facts, these are all fodder for the technocrat. Globalization or NAFTA made a graph on some economists’ slide deck go up. Grid renewal and ‘productivity’ excite the same neurons, but don’t speak to the heart.
2026-05-18 18:53:34
Hi all, happy Monday.
This week’s Monday edition looks at one of the fastest-growing, least predictable costs in the AI stack: tokens.
Let’s go!
Vibe-coded for this edition: Play our token budget word quiz and win a prize.
Uber CTO Praveen Neppalli Naga shared last month that his 5,000 engineers had depleted their entire 2026 token budget in just four months. So has ServiceNow.
Agentic adoption was bound to drive this kind of demand, and finance has to respond. CTOs are increasing their tech budgets this year – nearly 50% say their budgets are up by 10%. (As a side note, we believe that 10% is marginal given the token explosion we are experiencing.)
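A back-of-the-envelope sketch shows why a 10% uplift looks small. If an annual budget is gone in four months and spending simply holds at that pace, the full-year bill is three times the plan. The optional monthly growth rate is a hypothetical knob of ours to show how much worse it gets if usage keeps climbing.

```python
def full_year_multiple(months_to_deplete: int, monthly_growth: float = 0.0) -> float:
    """Full-year spend as a multiple of a budget that ran out after N months."""
    # Monthly spend in relative units, starting at 1.0 and compounding each month.
    monthly_spend = [(1 + monthly_growth) ** month for month in range(12)]
    budget = sum(monthly_spend[:months_to_deplete])  # what the budget covered
    return sum(monthly_spend) / budget


flat = full_year_multiple(4)                          # spending holds steady
growing = full_year_multiple(4, monthly_growth=0.10)  # hypothetical 10% monthly growth

print(f"flat usage:    {flat:.1f}x the annual budget")
print(f"growing usage: {growing:.1f}x the annual budget")
print("a 10% budget increase covers: 1.1x")
```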
A token explosion is great news for engineering teams, but a real headache for CFOs who signed off on modest pilots months ago. 71% of companies exceeded their AI budgets in 2025 and over half of the surveyed finance bosses say cost management is their greatest concern.
That is partly because AI costs can be highly variable and tricky to manage. The cost is not fixed, as it was with good old fixed-seat SaaS.
Exponential token use is diffusion in action. In the US, the average monthly spend on AI by large enterprises grew 36% to $85,000 between 2024 and 2025.1
Labs in China told us that coding is where most labs are throwing their resources right now, their P0 in engineering terms.
2026-05-17 11:32:49
Hi all,
Suffocated by export controls, Chinese AI labs have developed an efficiency moat that may define the AI market’s development over the coming years. My report from a trip to China.
In this Sunday’s edition:
Anthropic powers into Main Street as Microsoft and OpenAI consciously uncouple
Who should really control your AI?
Voltaire, the Enlightenment entrepreneur
Anthropic’s CFO Krishna Rao gave a fascinating interview on Patrick O’Shaughnessy’s podcast. It’s a cornucopia of insights: for one, the firm’s enterprise customers increased their spending by a factor of five over the past year. That 90% of Anthropic’s code is written by Claude Code is well known, but Krishna says that 90% of finance reporting is now AI-driven as well. Cowork is growing faster than Claude Code (which grew from zero to $1 billion in just six months). Krishna’s been on quite the ride himself – he joined Anthropic two years ago when revenues hit $250 million and now presides over annualized revenues approaching $50 billion – a fivefold increase in as many months. Recommend a listen.
Elsewhere: Anthropic leads OpenAI in business adoption, according to Ramp.
OpenAI and Microsoft are no longer exclusive. OpenAI now gets to play with other resellers, while Microsoft gets better access to the startup’s AI and drops complicated revenue-sharing arrangements.
It’s been a profitable marriage, if not an entirely happy one.
Microsoft’s $13 billion OpenAI investment has yielded over $30 billion in revenue in the two years since ChatGPT’s launch. OpenAI was the biggest customer of Microsoft Azure’s AI business, ploughing $23 billion into the hyperscaler. (Our estimates suggest up to 60% of Azure’s AI revenue came from OpenAI). Microsoft’s operating margin expanded by 3.9% over the last three years.
2026-05-15 17:17:16
Cerebras Systems, which makes massive, beautiful chips for AI workloads, hit the Nasdaq yesterday. At one point, the stock was up 157%, before settling for a more muted 107% return.
First-day IPO pops bring out the bears and cynics, for good reason. The dotcom heyday was full of them. Calico Commerce, VA Linux and TheGlobe.com rose 300%, 697.5% and 605% on day one, respectively. Remarkable performances, long-term disasters. VA Linux, the “best” of them, lost about 98% of its value. TheGlobe.com and Calico were effectively wiped out.
What did they have in common? Low revenues, fast growth rates and a booming, booming, booming market.
So when Cerebras opened 75% above the expected IPO price, it was tempting to file it away as another overhyped stock riding the frenzy around a new technology, itself bundled in layers of hype.
What I saw instead was a sign that the market is finally starting to grasp the demand for AI inference.
I’ve been following Cerebras since 2018, when I first spoke with Andrew Feldman, the founder. Last year, I visited their headquarters in Sunnyvale.