2026-02-14 04:31:44
In this conversation with Rohit, we talk about our experience with frontier agents and the systems we’re building around them.
My token usage jumped from 1 million to 100 million tokens a day in recent months because persistent agents on my machine are handling work that would have taken weeks. Rohit’s agents went into our research backplane and wrote a market report better than GPT-5.2 Pro. We also dig into what an agent economy might look like; what happens when there are trillions of these systems, and what coordination infrastructure they’ll need. We think it starts this year.
Enjoy!
Azeem
2026-02-13 23:31:33
AI companies are being valued in the hundreds of billions. $650 billion in capital expenditure commitments are being made by big tech for 2026. Yet one question remains unanswered: does it make economic sense?
We recently partnered with Epoch AI to analyze GPT-5’s unit economics, and figure out whether frontier models can be profitable (full breakdown here).
To dig deeper into what our results tell us about the wider industry, we hosted a live conversation last week between myself, Jaime and Hannah, moderated by Matt.
We cover:
The research findings
Possible paths to profitability
The OpenAI vs Anthropic playbook
Winning the enterprise
Why this research made some bulls more pessimistic
What the market gets wrong
Watch here:
Listen here:
Or read our notes:
Matt: For someone just getting into the research, what’s the big takeaway — and how did you even think about building a framework to analyse a business like this?
Jaime: To our understanding, no one had really taken on this task of piecing together all the public information about the finances of OpenAI — or any large AI company — and trying to paint a picture of what their margins actually look like. So we did this hermeneutic exercise of hunting for every data point we could find and trying to make sense of it.
The two most important takeaways: first, it seems likely that OpenAI during the past year, especially while operating GPT-5, was making more money than the cost of the compute — which is the primary expense of operating their product. But they appear to have made a very thin margin, or even lost money, after accounting for all other operating expenses: staff, sales and marketing, administrative costs, and the revenue-sharing agreement with Microsoft.
Second — and this is the part I found quite shocking — if you look at how much they spent on R&D in the four months before they released GPT-5, that quantity was likely larger than what they made in gross profits during the entire tenure of GPT-5 and GPT-5.2.
Hannah: A lot of our methodology was based on numbers we could find historically, then trying to project what would happen through the rest of 2025. For example, we had data showing 2024 was $1 billion in sales and marketing, and H1 of 2025 was $2 billion. So we built the picture using constraints like this, breaking each category down into its separate components so we could assess whether each was a realistic approximation.
Azeem: This is a complicated exercise, and one of the things that comes out of it is the question of that short model life. The family we looked at was only really the preeminent family for a few months. Enterprises don’t change the API they’re using the day a new one comes out — there’s always a lag. But consumers do, because that’s what you get access to on ChatGPT.
You may remember that when GPT-4 was set aside, it was an emotional support tool for many users, and they were very upset with how methodical and mechanical GPT-5 felt. The uncertainty is: to what extent do you actually learn and prepare for your next model based on the short life of the existing model?
There are two elements. One is more nebulous — by having a really good model, even if it lasts for a short period, you maintain your forward momentum in the market. The second is harder to unpick: what do you learn about running better and better models from actually having run a better model, even if it only lasts four months? That learning might be down in the weeds of R&D — training data choices, reinforcement learning — or in operations and just running a model at that scale. I suspect it’s hard for even OpenAI to know the contribution of that second part.
Matt: It reminds me of GPUs. I’ve been talking to finance folks about what the value of H100 chips will be in a few years, and everyone’s shrugging their shoulders. It’s a parallel to these models: what is GPT-4 worth now, when three years ago it was the frontier?
Azeem: I think the two challenges around this model are: first, is the OpenAI approach the only way to do this? We’ve seen Anthropic do something completely different. And second, is there a path to positive unit economics? Are they producing something for X dollars that they can sell for 1.3X? Or are they producing something for X that they sell for half X, which was the story of a lot of the dot-com era?
We got partway to answering that second question: yes, it’s expensive, but yes, there is some kind of gross profit margin. The level we estimate — Hannah can speak more accurately to this — is lower than a traditional software business. So we’re learning that perhaps foundation labs don’t look like software businesses. They look like something different.
Jaime: Think about the game OpenAI is playing. It’s not about becoming profitable right away. What they’re trying to do is convince investors that they have a business and research product worth scaling as much as possible, driven by the conviction that through scale, they’ll unlock new capabilities which in turn will unlock new markets.
2026-02-12 03:58:07
Five months ago, we offered the only evidence-based framework to answer the question that was taking up way too much space: is AI a bubble? To get to the bottom of it through evidence rather than vibes, we tracked the five areas we believe are crucial to understanding the AI investment cycle. Our indicators are: economic strain,1 industry strain,2 revenue momentum,3 valuation heat,4 and funding quality.5
Our analysis at the time – contrary to many alarmists – concluded that generative AI is a boom, not a bubble. But at the core of our approach is evidence. If evidence changes, we change our minds.
The Financial Times has published over a hundred articles invoking the “AI bubble.” One famed hedge fund investor disclosed shorts on Nvidia and Palantir, hardening his view earlier this year: “almost all AI companies will go bankrupt, and much of the AI spending will be written off.”
Fund managers surveyed by Bank of America cite AI overexposure as their top tail risk. In my discussions with people representing hundreds of billions of dollars of capital, there was some nervousness. It tended to be more nuanced than mainstream journalism portrayed – a concern about low-quality data center projects being built and funded on spec, without the guarantee of a blue-chip Big Tech tenant.
The strongest version of the bear case goes like this: capex is growing faster than revenue, model costs are falling (the DeepSeek moment proved dramatic efficiency gains are possible), and most enterprise AI is still chatbot-level stuff. In other words: enterprises aren’t getting results, efficiency gains mean you’ll need less infrastructure, and the capex overhang will simply collapse.
But while the bubble narrative gained momentum, reality has moved the other way. And the evidence now points not just to a boom, but to something the bears haven’t considered: scarcity. The real risk isn’t that we’ve invested too much in AI. It’s that we haven’t invested nearly enough.
Today I want to close the bubble question, for now, and show why the markets should be bracing for a stampede.
The ratio of investment to revenue – what we call Industry Strain – has dropped from 6.1x to 4.7x in five months since we published our analysis. If Industry Strain remains high for sustained periods of time, it means that companies are not recouping their investments, and they are building speculatively.
For context, the telecoms bubble peaked at an Industry Strain of just over 4x. In the case of generative AI, strain is still at historically high levels. On our dashboard, it sits in the amber zone.6 But the trajectory matters: if it holds, the ratio drops below our 3x threshold by Q2 this year. It would signal that revenues are beginning to “carry” the installed base and that the balance sheets and external financing can stop doing the heavy lifting.
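A back-of-the-envelope version of that projection, assuming the decline stays linear. Our dashboard model is more involved, and revenue acceleration or slowdown would move the crossing date:

```python
# Back-of-the-envelope projection of Industry Strain (investment / revenue).
# Assumes a simple linear decline, which it need not stay; this is a sketch,
# not the dashboard model itself.

start_strain = 6.1    # ratio when we published, roughly five months earlier
latest_strain = 4.7   # latest reading
months_elapsed = 5

decline_per_month = (start_strain - latest_strain) / months_elapsed  # ~0.28x

threshold = 3.0
months_to_threshold = (latest_strain - threshold) / decline_per_month

print(f"{decline_per_month:.2f}x decline per month")
print(f"~{months_to_threshold:.0f} months until strain falls below {threshold}x")
# ~6 months from the latest reading, consistent with a Q2 crossing if that
# reading dates from late 2025.
```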
Follow the revenue to understand why. According to our own proprietary model, monthly AI revenue grew from $772 million in January 2024 to $13.8 billion by December 2025, roughly an eighteen-fold increase in two years.7
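A quick sanity check on the growth rate implied by those two endpoints:

```python
# Growth rate implied by our revenue model's endpoints:
# $772M in January 2024 to $13.8B in December 2025 is 23 monthly steps.

start, end, months = 772e6, 13.8e9, 23

total_multiple = end / start                       # ~17.9x, the "eighteen-fold"
monthly_growth = total_multiple ** (1 / months) - 1

print(f"{total_multiple:.1f}x over two years")
print(f"~{monthly_growth:.1%} compound growth per month")  # ~13.4%
```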
The hyperscalers are the main engine. Google Cloud grew 48% year-over-year to $17.7 billion. AWS expanded 24% to $35.6 billion. Azure grew 39%, with its contracted backlog expanding 110% to $625 billion (though it’s worth remembering that 45% of this backlog is tied to OpenAI). Our revenue model estimates that AI now accounts for 23% of Google Cloud’s business, 10% of Azure’s (the biggest chunk of it from OpenAI) and 5% of AWS’s as of the latest quarter.
Sundar Pichai, Satya Nadella and Andy Jassy all said on their earnings calls that AI is the main driver of growth in their cloud businesses. When the CEOs of the three largest cloud companies all tell you the same story – that AI is what’s driving their growth – the attribution question starts to answer itself.
The model providers sit a tier below, and their economics tell a more complicated story. In the analysis we did with Epoch AI, we found that OpenAI’s GPT-5 bundle achieved roughly 48% gross margins on $6.1 billion in revenue. This is decent, but well below the 70-80% typical of mature software. Worse, model lifespans are too short to recoup R&D. GPT-5’s four-month window generated some $3 billion in gross profit against ~$5 billion in development costs.8 Frontier models function as rapidly depreciating infrastructure, their value eroded by competition before their costs are recovered.
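Putting the cited estimates together shows the shape of the problem; a rough reconstruction, using the rounded figures above:

```python
# The GPT-5 unit economics cited above, reassembled. Figures are the rounded
# estimates from the Epoch AI analysis, not audited numbers.

revenue = 6.1e9       # GPT-5 bundle revenue over its tenure
gross_margin = 0.48   # estimated inference gross margin
rnd_cost = 5.0e9      # approximate development spend before launch

gross_profit = revenue * gross_margin   # ~$2.9B
net_of_rnd = gross_profit - rnd_cost    # ~-$2.1B

print(f"Gross profit: ${gross_profit / 1e9:.1f}B")
print(f"Net of R&D:   ${net_of_rnd / 1e9:.1f}B")
# Positive unit economics at the gross-margin level, but the model generation
# did not recoup its development cost before being superseded.
```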
At this stage, with Anthropic chasing OpenAI on enterprise usage and OpenAI’s growth plateauing, these companies should be optimizing for growth, not profit. Positive unit economics at the gross-margin level is enough in the phase we’re in.
And growth is popping up everywhere, not just at OpenAI and Anthropic. The Paris-based foundation model company Mistral disclosed that its annualized revenue run rate exceeded $400 million, a 20-fold increase in just one year.
Revenue growth is necessary but not sufficient. There’s a meaningful difference between a million users asking chatbot questions and ten thousand enterprises embedding AI into production workflows.
To find out which we’re seeing, we analysed more than 6,000 S&P 500 earnings calls from Q4 2022 through Q4 2025. We extracted nearly 30,000 AI-related statements. Many were corporate pabulum, but there were also specific claims about results achieved from AI projects. The share of companies making quantified AI claims (specific numbers attached to specific outcomes) jumped from 1.9% to 13.2% in that time.
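Our pipeline is more involved than this, but as a minimal sketch of the classification idea: a quantified claim is a statement that mentions AI and attaches a concrete figure to an outcome. The patterns and examples below are illustrative, not our production rules:

```python
import re

# Minimal sketch of the idea behind classifying "quantified AI claims":
# a statement that mentions AI and attaches a specific figure to an outcome.
# Illustrative patterns only; the real pipeline is more involved.

AI_TERMS = re.compile(
    r"\b(AI|artificial intelligence|machine learning|LLM|copilot|agent)\b", re.I
)
FIGURE = re.compile(
    r"\$[\d,.]+\s*(million|billion)?|\d+(\.\d+)?\s*%|\d{3,}\s+(hours|engineers|FTEs)",
    re.I,
)

def is_quantified_ai_claim(statement: str) -> bool:
    """True if the statement mentions AI and contains a concrete figure."""
    return bool(AI_TERMS.search(statement)) and bool(FIGURE.search(statement))

examples = [
    "AI coding tools cut our development time by 30%.",                # quantified
    "We are excited about the potential of artificial intelligence.",  # pabulum
]
for s in examples:
    print(is_quantified_ai_claim(s), "-", s)
```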
When Bank of America – not a startup, not a research lab, but a 120-year-old bank – tells you AI coding tools cut their development time by 30%, saving the equivalent of 2,000 full-time engineers, the bubble debate starts to look quaint. Norway’s $2 trillion sovereign wealth fund automated portfolio monitoring with Claude, saving roughly $17-32 million per year in labor costs.9
Meta reported a 30% increase in engineering output since January 2025, most of it from agentic coding assistants. The power users have had an 80% output increase. It’s not just coding either. Western Digital, one of the world’s largest manufacturers of hard disks, reports AI tools “improving yield, detecting defect patterns through intelligent diagnostics and optimizing test processes” with productivity gains of up to 10%.
We’re looking at a technology that has crossed from experiment to infrastructure.
Most of the earnings-call claims are, honestly, boring. Percentage efficiency gains. Customer service deflection rates. Operational savings. But that’s exactly the point. Boring adoption is real adoption.
The survey data reinforces this picture. Deloitte’s January 2026 State of AI in the Enterprise reports that while 25% of organisations currently have 40% or more of AI projects in production, 54% expect to reach that level within six months. Morgan Stanley’s 4Q 2025 CIO Survey shows that while IT budget growth is slowing, AI funding is increasingly coming from outside the IT department, indicating ownership by operating units rather than experimentation confined to technology teams. And in KPMG’s 2025 Global CEO Outlook, 67% of CEOs expect AI investments to deliver returns within one to three years – an acceleration from the longer three-to-five-year horizons expected just a year earlier.
There was a different timbre to the discussions I had with more than a dozen C-suite executives at Davos this year. Scaling implementation was tough, but not so tough that they weren’t already thinking about workforce and training questions.
In other words, there is increasing evidence from different sources that enterprise adoption is climbing and that, after a couple of years of tricky and sticky learning, bosses are doubling down. They seem to be growing in confidence about how and when they can see a return on their investment.
Something changed right at the end of 2025, and we know what it was. Models passed a threshold of coherence: they can work very reliably on tasks of an hour or two, and somewhat less reliably on longer tasks. Something clicked. And Claude Code, Anthropic’s tool for running software agents that write software, was the first beneficiary.
We will spend some time explaining what is going on here. If you’re not at least knee-deep in long-running workflows, it’s quite hard to understand the implications of these systems for revenues in the genAI ecosystem.
Claude Code is a really good software engineer. The developers building it use Claude Code to code itself. Elsewhere at Anthropic, an engineer used it to build a C compiler for $20,000 in API costs.10 A comparable project built by humans would typically require five to ten engineers over 18-24 months, around $2-3 million in fully loaded labor costs.
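Taken at face value, that is a gap of roughly two orders of magnitude:

```python
# The cost gap in the compiler anecdote, taken at face value.
ai_build_cost = 20_000                      # reported API spend
human_build_cost = (2_000_000, 3_000_000)   # typical fully loaded team estimate

low, high = (c / ai_build_cost for c in human_build_cost)
print(f"Roughly {low:.0f}x to {high:.0f}x cheaper")   # ~100x to 150x
```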
At Exponential View, we’ve committed (written) several hundred thousand lines of code this year alone. Many apps I use daily were written by me (with Claude Code) in the past month or so. We’ve got software running that might have cost a million bucks to write, but has only cost perhaps $500 using AI agents. These systems free up at least an hour a day for me.
2026-02-10 01:23:11
Hi all,
Here’s your Monday round-up of the data driving conversations this week, in less than 250 words.
Let’s go!
Silicon’s AI premium ↑ AI chips are expected to generate half of all chip revenue in 2026, despite making up just 0.2% of volume.
Driverless at scale ↑ Waymo raised $16 billion at a $126 billion valuation — more than most listed carmakers.
Apps, not models ↑ ~70% of AI builders are focused on vertical applications, not foundation models. Value is shifting up the stack.
2026-02-08 11:14:16
Hi all,
Welcome to the Sunday edition #560. This was a week of overreactions. Wall Street panicked about $650 billion in AI spending. Sam Altman and Dario Amodei traded jabs.
But this is exactly what Exponential View is for… We go beneath the noise to the forces that move markets, technologies, and societies.
Today, I’ll unpack what investors got wrong in their panic, the model upgrades that matter more than the benchmarks suggest, and what I’ve learned from living with agents that never sleep (including an unveiling of my favorite thinking tool… which I built for myself.)
With more than $1 trillion wiped off the combined valuations of big tech companies this week (and Anthropic’s Claude Cowork plugin triggering a separate $285 billion rout), we are watching markets overreact to a new paradigm they don’t really understand.
I’ve long been calling out the linear investment thinking now on full display: capital markets simply were not built to fund general-purpose, exponential technologies like AI. I previously wrote:
For capital markets, this uncertainty isn’t just about who might win in a well-defined game; it’s about the type of game that’s being played. Markets use tools that assume relatively stable competitive structures and roughly linear growth in order to price company-level cashflows over three- to ten-year horizons.
The hyperscalers are not spending into a void. They are supply-constrained, not demand-constrained. Microsoft’s CFO Amy Hood admitted she had to choose between allocating compute to Azure customers and to Microsoft’s own first-party products. That’s what scarcity looks like in the age of AI.
As we showed in our research with Epoch AI, there is a viable gross margin in running frontier models at inference; the economics at the model level can work. What’s expensive is the relentless cycle of R&D, where each new model depreciates in months.
But the market hasn’t yet internalized how demand explodes once models cross what has been called the “threshold of coherence” for agents. I know this first-hand: my agent, Mini Arnold 💪, chews through $20-30 of tokens a day, roughly $5,000 a year. And that’s with it pushed down to the cheapest model available. The moment models can reliably work for 10 or 20 hours on a task, I’ll be running hundreds of them. That’s where we’re heading within months.
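The demand-side arithmetic is simple to run. The only assumptions are the per-agent daily spend and a working-day count, which is one way the ~$5,000 figure pencils out:

```python
# What "hundreds of agents" implies for spend, under simple assumptions:
# $20-30 a day per agent and ~200 working days a year.

working_days = 200
agents = 200   # illustrative fleet size for "hundreds"

for per_day in (20, 30):
    annual = per_day * working_days
    fleet = annual * agents
    print(f"${per_day}/day -> ${annual:,}/agent/year -> ${fleet:,} for {agents} agents")
# $20/day -> $4,000/agent/year -> $800,000 for 200 agents
# $30/day -> $6,000/agent/year -> $1,200,000 for 200 agents
```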
So when I look at this week’s sell-off, I see investors who haven’t yet experienced the breakthrough moment of watching an AI agent compress 10 hours of tedious work into 40 minutes. Once that realisation moves from early adopters to Main Street, the $650 billion won’t look like reckless spending. It will look like they didn’t spend enough.
See also: I spoke about this live on Friday:
First impressions matter. With Framer, early-stage founders can launch a beautiful, production-ready site in hours — no dev team, no hassle.
Pre-seed and seed-stage startups new to Framer will get:
One year free: Save $360 with a full year of Framer Pro, free for early-stage startups.
No code, no delays: Launch a polished site in hours, not weeks, without technical hiring.
Built to grow: Scale your site from MVP to full product with CMS, analytics, and AI localization.
Join YC-backed founders: Hundreds of top startups are already building on Framer.
This was the week Anthropic and OpenAI went full contact. First, my take on the model upgrades – and then what the Super Bowl quip is really about.
This week we got two +0.1 upgrades: Claude Opus 4.6 and GPT-5.3-Codex. A decimal upgrade might suggest only incremental improvement, and in many ways that’s what these are. The benchmarks for better planning, longer context and fewer errors all improved – some only by a small amount. But AI is now at a stage where only one variable really matters economically: how long an AI can work on a task, and with how much autonomy. Here, both of these incremental models show exponential dynamics.
A benchmark update on GPT-5.2 (note: not the new Codex) shows model performance nearly doubled. The main question to ask of Opus 4.6 and GPT-5.3-Codex: can they perform for longer?
Anthropic’s Nicholas Carlini set 16 Claude agents to build a C compiler from scratch and mostly walked away1. Two weeks and $20,000 in API costs later, those agents had written 100,000 lines of Rust code producing a compiler that can build the Linux kernel on x86, ARM, and RISC-V. This is not a toy2.
A C compiler capable of building Linux is a genuinely hard engineering problem. Carlini had tried the same experiment with earlier Opus models. Opus 4.5, released just months ago, could pass test suites but choked on real projects. Models before that could barely produce a functional compiler at all. Opus 4.6 completed a two-week engineering task. Every increment in autonomous execution is a step change in real-world outcomes.
OK, now let’s turn to the Super Bowl…
Even Sam Altman got a giggle out of Anthropic’s Super Bowl ads, which position Claude as an ad-free alternative to ChatGPT; he then hit back:
Anthropic serves an expensive product to rich people. … [We] are committed to free access, because we believe access creates agency.
Sam’s remark revolves around a dichotomy as old as the commercial internet: how do you pay for your services – with your cash or with your self? For the past ten-plus years, the price was our attention. As we highlighted in EV#509 a year ago, AI product use will move the economics from attention towards intention:
LLMs and predictive AI can go beyond this landscape of attention, to shape our intention – guiding what we want or plan to do, which some refer to as the “intention economy”. AI systems can infer and influence users’ motivations, glean signals of intent from seemingly benign interactions and personalise persuasive content at scale.
Security expert Bruce Schneier argues that when AI talks like a person, we start to trust it like a person. We treat it as if it were a friend, when in reality it’s a corporate product, built to serve a company’s goals, not ours. The chatty, “helpful” interface creates a feeling of intimacy exactly where we should be on guard, and that gap between feeling and reality is what worries him and many others.
Mustafa Suleyman, the CEO of Microsoft AI, went further in our conversation. His position is that AI’s emotional intelligence is genuinely useful. It makes us calmer, more productive, more willing to delegate. But he draws a hard line: models should never simulate suffering. That, he says, is where “our empathy circuits are being hacked.” Market dynamics may push directly toward that line, because the companies whose models feel most human will win the most engagement.
See also: Google DeepMind researchers distinguish between rational persuasion and harmful manipulation in AI. One is based on appealing to reason with facts, justifications and trustworthy evidence; the other on tricking someone by hiding facts, distorting their importance or applying pressure.
I’ve long suspected that living with powerful AI, and agents in particular, will provoke human experiences our ancestors didn’t have. Last April, I wrote about time compression as the new emotional challenge of working alongside AI:
with a new AI-driven workflow in place, I ran the steps through a series of modular prompts and automation scripts. The system parsed, filtered and structured the inputs with minimal human intervention. A light edit at the end and it was done in 15 minutes.
And yet, instead of feeling triumphant, I felt… unsettled. Had I missed something? Skipped a crucial step? Was the result too thin?
It wasn’t. The work was complete and it was good. But I hadn’t emotionally recalibrated to the new pace.
I empathize with those who call agentic AI “a vampire”: multiple agents running constantly, demanding your oversight and input, draining energy and making it hard to return to human rhythms, including sleep. In the first days of setting up my multi-agent systems, I found myself waking up at 4am to check my agents, unblocking them and context-switching across multiple projects before I hit my human limit and went back to bed.
We’re still working out where agents genuinely help, and a key question is how many to use at once. Multi‑agent systems shine when work can be split into parallel streams, but they break down on tightly sequential tasks.
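The distinction is easy to see in code. A minimal sketch, with hypothetical task names and run_agent standing in for a real agent call:

```python
import asyncio

# Illustrative scheduling intuition only, not any particular agent framework.
# run_agent stands in for a real agent call; sleeping simulates work time.

async def run_agent(task: str, hours: float) -> str:
    await asyncio.sleep(hours)  # seconds stand in for hours here
    return f"{task}: done"

async def parallel(tasks: list[str]) -> list[str]:
    # Independent streams: wall-clock time ~= the slowest single task.
    return await asyncio.gather(*(run_agent(t, 1.0) for t in tasks))

async def sequential(tasks: list[str]) -> list[str]:
    # Tightly sequential work: each step waits on the last, so wall-clock
    # time ~= the sum of all tasks, and extra agents buy you nothing.
    results = []
    for t in tasks:
        results.append(await run_agent(t, 1.0))
    return results

tasks = ["market sizing", "competitor scan", "pricing data", "expert interviews"]
print(asyncio.run(parallel(tasks)))  # finishes in ~1 unit of time, not ~4
```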
For high-consequence work – investment due diligence, thorny analysis, or divergent exploration – I now turn to Clade, a multi-agent system I built where AIs argue with themselves until better answers emerge3. Here’s how…
2026-02-07 02:29:09
In this live session, I’m joined by Jaime, founder of Epoch AI, and Hannah from my team, with financial journalist Matt moderating.
We dig into our recent research partnership examining OpenAI's actual operating margins, R&D costs, and whether the economics of frontier AI actually work. We explore the surprisingly short lifespan of AI models, infrastructure constraints, the shift toward agentic workflows, and what all of this means for the trillion-dollar question: is this sustainable or a bubble?
Enjoy!
Azeem