Exponential View

By Azeem Azhar, an expert on artificial intelligence and exponential technologies.

🔮 Davos 2026 and the end of the rules-based order

2026-01-30 02:03:28

At Davos 2026, the mood was unlike any previous World Economic Forum gathering. With Donald Trump arriving amid escalating geopolitical tensions and European leaders sounding alarms about sovereignty, I recorded live dispatches from the ground.

In this special episode, I bring together observations from four days at the annual meeting, tracking the seismic shifts in the global order alongside the practical realities of AI adoption in the enterprise.

I speak about:

  • What Trump’s two-hour Davos speech revealed about the new geopolitical reality

  • Why technological sovereignty suddenly became urgent for European leaders

  • The real state of AI adoption in the enterprise, from executives who are actually doing it

  • The startup building AI agents that have completed 115 million patient interactions…

Skip to the best part:

(05:28) Mark Carney’s speech

(06:13) Why European leaders are sounding the alarm

(07:13) Why technological sovereignty is urgent

(14:24) What leaders really have to say on AI adoption

Last week, I set out the underlying argument in an essay on how the breakdown of old geopolitical assumptions is part of a broader upgrade to civilisation’s operating system.

Exponential View
🔮 The end of the fictions
I just got back to London after a week at the Annual Meeting at Davos. For the past few years, the World Economic Forum had become a kind of parody of itself, a place where billionaires flew in on private jets to discuss climate change and “stakeholder capitalism” while nothing much seemed to happen. But this year was different…

Enjoy!

Azeem


How “95%” escaped into the world – and why so many believed it

2026-01-30 00:42:53

Hi all, today’s post is open to all in the service of public discourse and anti‑slop thinking.


One number still keeps turning up in speeches, board meetings, my conversations and inbox:

“95 percent”

Do I need to say more than that? OK, here’s another clue: this number traveled on borrowed authority in 2025, rarely with a footnote, and it started to shape decisions.

The claim is this: “95 percent” of organizations see no measurable profit-and-loss impact from generative AI. Of course, you know what I’m talking about. It has ricocheted through Fortune, the FT and The Economist, amongst others.

Often presented as “MIT / MIT Media Lab research”, the “95 percent” is treated as a settled measurement of the AI economy. It’s invading my conversations and moving the world. I’ve heard it cited by executives as they decide how to approach AI deployments, and by investors who use it to calibrate risk.

This number basks in the glow of MIT, the world’s best technology university. And I started to wonder if this evidence had truly earned that halo. Turns out, I’m not the only one – Toby Stuart at the Haas School of Business wrote about it as an example of how prestige and authority can turn a weak claim into an accepted truth.

Late last year, I tried to trace the claim back to its foundations. Who studied whom, when and what counted as “impact”? What makes the “95 percent” a number you can rely on, rather than a sop for clickbait? I also reached out to the authors and MIT for comment. I’ll share today what I found.

“95 percent” has become an orphaned statistic. Adding to the list of: we only use 10% of our brains, it takes seven years to digest swallowed gum and Napoleon was short. Image generated using Midjourney.


The paper trail

The original report was produced in collaboration with the MIT NANDA Project. The NANDA project was, at the time the report was published, connected to the Camera Culture research group of the MIT Media Lab. MIT’s logo is the only logo that appears in the document. The phrase “MIT Media Lab” doesn’t appear in the document.

The paper’s two academic authors, Associate Professor Ramesh Raskar and Postdoctoral Associate Pradyumna Chari, are both affiliated with the same Camera Culture research group at the MIT Media Lab. There were two other authors: Chris Pease, an entrepreneur, and tech exec Aditya Challapally.

The first sentence of the Executive Summary is the one that set the newswires crackling:

Despite $30-40 billion in enterprise investment into GenAI, this report uncovers a surprising result in that 95% of organizations are getting zero return.

It suggests that they sampled a certain number of organizations. And this is where my issues with this paper start.

Issue one: No confidence intervals

To demonstrate the arithmetic, let’s use the sample size of 52. This was the number of interviews in the research. If 95% said they got no return, the math is simple: 0.95 × 52 = 49.4 organizations.

Obviously, you can’t have 49.4 organisations reply “we had a failure”; it must be either 49 or 50.

This in itself is problematic. If it were 50, the real success rate is 3.8% (2 of 52); if 49, it is 5.8% (3 of 52).

But even this is naive on two further counts: it’s a sample and it might not be representative. The sample size in question here is possibly as low as 52 (the number of interviews conducted), although it might potentially be higher (see issue two below). The universe of organizations is much larger.

Ergo, we have a sample, not a census of the whole universe. Academic convention is to provide confidence intervals when you report from a sample. For a sample of 52-ish from an unspecified large-ish population, the confidence intervals are roughly +/- 6%; they would drop towards +/- 3% as the sample size rises to the low hundreds.1 The paper provided no confidence intervals around the 95% number, breaching academic convention.

The paper’s research appendix says that confidence intervals were calculated using “bootstrap resampling methods where applicable.” The bootstrap method is valuable in that it doesn’t make prior assumptions about the data distribution; however, small samples produce wide intervals, reflecting greater uncertainty. If you ran a bootstrap calculation on a sample of 52 with 49 or 50 failures, your confidence interval runs from 86.5% to 100% failures (using the 49/52 figure; for 50/52 failures it sits at 90.4% to 100%). This interval means the true value is probably somewhere within a margin covering 13.5 percentage points. Reporting this figure as a single 95% value completely hides the underlying uncertainty – that failure rates in their sample are likely between the high-80s and 100%. That is a highly volatile range which shakes my confidence despite the methodological signalling.
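
To make the arithmetic concrete, here is a minimal sketch of both calculations – the normal-approximation margin of error from footnote 1 and a percentile bootstrap on a hypothetical sample of 52 responses with 49 failures. The inputs (52, 49) are taken from the discussion above; everything else is a standard textbook computation, not anything drawn from the NANDA paper itself.

```python
import math
import random

n, failures = 52, 49          # hypothetical sample: 49 of 52 report "no return"
p_hat = failures / n          # observed failure proportion, about 94.2%

# Normal-approximation margin of error (the rule of thumb in footnote 1)
moe = 1.96 * math.sqrt(p_hat * (1 - p_hat) / n)
print(f"point estimate {p_hat:.1%}, margin of error ±{moe:.1%}")

# Percentile bootstrap: resample the 52 responses with replacement many times
random.seed(0)
sample = [1] * failures + [0] * (n - failures)   # 1 = failure, 0 = success
boot = sorted(
    sum(random.choices(sample, k=n)) / n
    for _ in range(20_000)
)
lo, hi = boot[int(0.025 * len(boot))], boot[int(0.975 * len(boot))]
print(f"bootstrap 95% interval: {lo:.1%} to {hi:.1%}")
```

Run as-is, the bootstrap interval comes out at roughly 86.5% to 100%, matching the range quoted above.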

Issue two: The sample is not representative

But given this is clearly a sample, is the sample representative? A number that blends interviews, conference responses, and public case studies, as the MIT NANDA study did, can be useful as a temperature check. But it is not, on its own, a reliable portrait of “organizations” in general. The researchers mixed 52 structured interviews, analysis of 300 public AI initiatives (out of how many? which 300 were chosen?) and 153 surveys of non-specified “senior leaders.”

Page 24 of the report, section 8.2, does in fact acknowledge the limitations of the sampling technique: “Our sample may not fully represent all enterprise segments or geographic regions” and “Organizations willing to discuss AI implementation challenges may systematically differ from those declining participation, potentially creating bias toward either more experimental or more cautious adopters.”

The sample period itself is fundamentally flawed. It spans January to June 2025, a full six months. Consider this timeline. If the earliest enterprise genAI projects launched in late 2023 or early 2024, respondents reported on efforts anywhere from twelve to eighteen months old. Yet a January 2025 interviewee might have started their project just three months prior, in late September or early October 2024. How can we meaningfully compare a three-month pilot to an eighteen-month rollout?

This survey window conflates projects at radically different stages of maturity and ambition, a serious methodological problem in any context, but especially damaging here. In 2024, enterprise AI spending grew by at least a factor of three (~10.6% compound monthly growth rate); in the first half of 2025, growth ran at 1.66x (equivalent to 2.75x annualized) according to our forecasts. Menlo Ventures estimates a 3.2x increase in enterprise AI spend in 2025. That pace of change, in usage patterns, in the composition of adopters, in the very definition of “enterprise AI,” renders a six-month survey window almost meaningless.
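
As a quick sanity check on those growth figures, the conversions between monthly, half-yearly and annualized rates are simple compounding. A minimal sketch, using the ~10.6% and 1.66x inputs quoted above:

```python
# Convert the growth figures quoted above between time horizons.
monthly_growth = 0.106                      # ~10.6% compound monthly growth in 2024
annual_factor_2024 = (1 + monthly_growth) ** 12
print(f"2024: {annual_factor_2024:.1f}x over the year")   # ≈ 3.3x, "at least a factor of three"

half_year_factor_2025 = 1.66                # growth over the first half of 2025
annualized_2025 = half_year_factor_2025 ** 2
print(f"H1 2025: {annualized_2025:.2f}x annualized")      # ≈ 2.75x
```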

So, in fact, unless the researchers share their raw data, we have no idea whether the sample is full of early adopters, or early adopters with high expectations, or of people who consider themselves leaders but aren’t, or of people in fringe organizations that don’t represent the typical American firm. Nor do we have any sense whether people who answered “no” in January would have answered “yes” in June, had they been asked then.


Issue three: The changing denominator is bamboozling

The study mixes public data (which may or may not be representative) with two different types of interviews and surveys conducted over an extended period of time. It’s mush.

But then, let’s ask a question about what we are taking a proportion of. Imagine you have a high school year of 500: 250 boys and 250 girls. If you said 60% of boys and 70% of girls passed the math exam, it wouldn’t be reasonable to say 60% of students passed the math exam (you’ve ignored the girls). Nor would it be reasonable to say that 70% of the girls passed the physics exam (you only have data for the math exam).

Yet this is roughly what the NANDA paper did. The denominator is not calculated consistently. On page 6, a chart shows 5% of firms sampled successfully implemented an embedded or task-specific genAI tool. It also shows that 60% of firms investigated a task-specific tool.

The denominator in this case includes the 40% of firms that never even investigated the task-specific tool. In other words, if your firm never investigated using a task-specific AI tool, your non-investigation counts as a failure for these purposes.

The real maths – if these were consistent definitions and using “tools investigated” as the denominator – is 5/60, a success rate of roughly 8%.

The trouble is, the report contradicts itself. On page 3, the report states, “Just 5% of integrated AI pilots are extracting millions in value”. But this implies the denominator is pilots launched – the roughly 20% of firms in the same chart that got as far as a pilot. If that is the case, the real success rate is 5/20, or 25%.

A quarter is an incredibly high proportion – if true, we’d want to do more research to assuage our sceptical tingles.

But remember the introduction to the report. It states that “95% of organizations are getting zero return” – not organizations which ran AI pilots, but organizations, period.
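
A minimal sketch of how much the headline moves with the choice of denominator. The 5%, 60% and 20% figures are the chart values as read above; the three denominators are the candidate bases discussed in this section:

```python
# All figures are percentages of firms sampled, as read from the report's chart.
succeeded = 5        # successfully implemented an embedded or task-specific genAI tool
investigated = 60    # investigated such a tool
piloted = 20         # got as far as launching a pilot
all_firms = 100      # every organization, whether or not it tried anything

for label, denominator in [("all organizations", all_firms),
                           ("organizations that investigated", investigated),
                           ("organizations that ran a pilot", piloted)]:
    print(f"success rate vs {label}: {succeeded / denominator:.0%}")
# -> 5%, 8%, 25%: the same survey, three very different headlines
```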

Issue four: The method of defining “success” is not clear

What of the measurable P&L impact? Profit-and-loss isn’t like rainfall. Anyone who has run a business knows that. In large organizations, only a few people can ever really know, and it’s unclear how many of them would be the ones answering conference surveys or interviews. It’s also unclear that you could actually measure a P&L impact with good attribution in less than several months, even if there was such an impact.

The headline number smuggles in a claim about pace: “Top performers reported average timelines of 90 days from pilot to full implementation.” Does that mean to the P&L impact or to a technical roll-out? It is unclear.

Generative AI remains young as a corporate technology. The starting gun fired in late 2023. Enterprises typically need 18-24 months to move large-scale IT deployments from pilot to production – building systems, workflows, and governance along the way.

The report claims “only 5 percent” succeeded. But recall the fieldwork window: January to June 2025. Organisations surveyed in January might well have gone on to see results before the fieldwork even closed in June – yet they are still counted as failures. The picture this leaves you with: an impossibly slow path from pilot to impact.

It seems like the study defines “success” so narrowly that “not yet” becomes “never.”

That is why the right way to treat “95 percent” is not as a fact. It is a crude signal being stochastically parroted in the peanut gallery.

What to make of these four glaring issues?

I wrote to the two MIT-affiliated academic authors, Prof. Ramesh Raskar and Dr Pradyumna Chari, to clarify.

I asked specific questions about the academic process. In summary:

  • Did the “95 percent” figure represent a statistical finding from the sample, or is it a rough directional estimate – and either way, what specific population of companies and projects does it describe?

  • What exactly counts as “measurable P&L impact” – how long after implementation did they measure it, how large does the impact need to be, and are they measuring individual projects or whole company performance?2

I did not receive a response to my enquiries.

So I had to escalate to MIT Media Lab’s leadership. Raskar and Chari are attached to The Media Lab, which operates as a research laboratory within MIT’s School of Architecture and Planning. The School of Architecture and Planning rolls up to the top level of the institute.


What MIT told me

Faculty Director of the MIT Media Lab, Tod Machover, did reply. MIT’s counsel, Jason Baletsa, was copied3.

In that correspondence, Professor Machover described the document as “a preliminary, non-peer-reviewed piece” created by individual researchers involved in Project NANDA. Machover told me that the NANDA research group now operates within an independent non-profit foundation. (Although the MIT Media Lab still maintains a NANDA research page.)

The Internet Archive record of the report.

The report, I was told, was posted briefly to invite feedback. Indeed, it was posted on MIT’s website from 18 August to 16 September, according to the Internet Archive.

That framing – early, exploratory, posted for comment – is plausible. Academia must be allowed to think in public. Drafts and half-formed ideas are a reasonable part of the process of scholarship.

During the feedback period, one of the four authors of the report was quoted in the media. Much of the subsequent media coverage presented it as a finding, not as an early draft posted for comment.

But this is a problem, as much for MIT as it is for us. Treating casual, informal work as equivalent to scholarship doesn’t just mislead the market; it erodes the trust scaffolding that industry participants, investors, founders, and the public rely on to make decisions.

The report no longer appears on an MIT.edu domain. However, the PDF circulating online still carries MIT branding and is hosted on a third-party research aggregator. There is no canonical MIT-hosted version, no note explaining how it should be cited, and no institutional clarification that would stop the statistic from being treated as an MIT finding.

When I pushed further on the question of whether it is correct or incorrect to call this “a piece of research from MIT / MIT Media Lab”, I didn’t initially get a clear answer.

Eventually, in an email dated 12 December 2025, Kimberly Allen, MIT’s Executive Director of Media Relations, wrote on behalf of the Provost and VP for Research4: “There were MIT investigators involved. It was unpublished, non-peer-reviewed work. It has not been submitted for peer review as far as we know.”

That’s somewhat helpful. The number had been acting as an institutional fact. “MIT says” has, with very good reason, a great deal of weight. The authors themselves say: “The views expressed in this report are solely those of the authors and reviewers and do not reflect the positions of any affiliated employers.”

But the paper still carries an MIT logo, so people stumbling across it might read more into it than it deserves. That brand continues to do the work for something that appears incomplete.

It is incredibly hard to get a role at a top university. “Insanely hard” is what one academic friend told me, precisely because the high standards of those organisations have produced a very costly external signal about quality.


Breaking the number

So what’s a more reliable reading? If you triangulate across what firms are actually reporting, and account for adoption lags and sampling bias, can we narrow the plausible space for ‘no enterprise-level, clearly measurable P&L impact yet’?

The strongest part of the re-analysis lies with the denominator. The 5% number emerges, with huge confidence intervals, from the research only if the denominator includes every organization that didn’t even try to implement a task-specific AI. This is like stating that I failed to board the American Airlines flight from JFK to Heathrow last Thursday. True, but I wasn’t booked on the flight. I wasn’t even in New York that day.

And if you consider a more reasonable denominator – those who tried to implement such an AI through a pilot – you end up with a success rate of 5 out of 20, or 25%. The error margins? Possibly in the range of +/- 15%, so a 10 to 40% success rate, based on running this through a handful of reasoning LLMs at the highest thinking mode. If you make more aggressive assumptions about sampling flaws and shaky definitions, the dispersion widens further.
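
For comparison, a standard Wilson score interval on 5 successes out of 20 – computed directly rather than via an LLM – gives roughly 11% to 47%, a similar order of dispersion. A minimal sketch; the 5-of-20 input is the figure above, the rest is textbook statistics:

```python
import math

def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Two-sided 95% Wilson score interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return centre - half_width, centre + half_width

lo, hi = wilson_interval(5, 20)
print(f"25% point estimate, 95% interval ≈ {lo:.0%} to {hi:.0%}")   # ≈ 11% to 47%
```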

If the underlying data were shared by the researchers, we could do a much better job of this. As it stands, it isn’t clear why anyone should make any kind of decision based on it.

A defensible inference band for the share of organizations that started pilots on task-specific AI and were not yet seeing results at some point between January and June 2025 sits in the low-80s, not the mid-90s. But it is plausible that number could be much lower.

It certainly isn’t the nihilistic “95 percent” that markets have begun to quote as if it were an MIT-certified measurement. By this standard, SAP in 1997 or cloud in 2012 would also have failed.

Put plainly, if around 85% of organizations had no clearly measurable P&L impact, then roughly one in seven were already seeing measurable gains by early 2025. That is a remarkably fast path to success given how recently generative AI entered the corporate bloodstream. And I’m being conservative: it could easily have been much higher, one in five or more.5


This is a far more nuanced picture than “95 percent failure” – a headline that invites either complacency (“AI is hype”) or fatalism (“it’s impossible”).


Orphaned stats, and the fix

Go one level deeper, and the real issue isn’t whether the number is 95, 86, 51, or 27 – it’s that the “95 percent” has become an orphaned statistic. We’re familiar with orphaned numbers. Like we only use 10% of our brains. That it takes seven years to digest swallowed gum. That Napoleon was short. The goldfish has a 3-second memory. That there are three states of matter. It’ll be cited… just because.

But, unlike the goldfish that forgot Napoleon was average for his time, this statistic has a halo. And this number has moved capital. It has travelled faster than its caveats. It will end up embedded in investment memos long after anyone remembers to ask what they actually measured. Worst of all, it forces the reader to do basic provenance work that much of the commentary ecosystem waved through.

So here is the standard that should apply. If a statistic is going to be cited as “MIT research,” it deserves, at minimum, a stable home and enough transparency that a sceptic can try to break it. And if it shouldn’t be cited as such, its provenance should be clearly explained by the institution and by its authors.

Until that happens, the “95 percent” figure should be treated for what it is: not reliable. It is viral, vibey, methodologically weak and it buries its caveats. The report served some purpose, but what that purpose is, I’m not sure.

If – big if – 15-20% of large organisations were already translating generative AI into measurable gains at the start of last year, that is not a failure story. It is the rapid climb up the right tail of a diffusion curve. So rapid in fact that we’d want to sceptically unpick any data that suggested that.

So how certain am I that this 95% statistic, so beloved by journalists and those in a hurry, is unreliable?

One hundred percent.6


1

Using the rule of thumb that for a binomial proportion, an approximate two-sided 95% margin of error is 1.96 × √(p(1 − p)/n), where p is the proportion observed in the sample and n is the sample size.

2

These are condensed versions of longer questions I asked.

3

Baletsa is a counsel of MIT as a whole.

4

To help understand the structure here: The Provost is effectively the Chief Academic Officer. The VP of Research is responsible for academic quality. The Deans of a School (such as the School of Architecture and Planning) would report to the Provost. Faculty Directors (of particular labs within a School) would usually report into a Dean of a School.

5

But we can’t know, given all the shortcomings in the data as released.

6

I’m 95% sure that I’m 100% sure.

👀 Inside OpenAI's unit economics

2026-01-29 07:35:01

AI companies are being priced into the hundreds of billions. That forces one awkward question to the front: do the unit economics actually work?

Jevons’ paradox suggests that as tokens get cheaper, demand explodes. You’ve likely felt some version of this in the last year. But as usage grows, are these models actually profitable to run?

In our collaboration with Epoch AI, we tackle that question using OpenAI’s GPT-5 as the case study. What looks like a simple margin calculation is closer to a forensic exercise: we triangulate reported details, leaks, and Sam Altman’s own words to bracket plausible revenues and costs.

Here’s the breakdown.

— Azeem



Can AI companies become profitable?

Lessons from GPT-5’s economics

Originally published on Epoch AI’s blog, as a collaboration between Epoch AI and Exponential View.

Are AI models profitable? If you ask Sam Altman and Dario Amodei, the answer seems to be yes — it just doesn’t appear that way on the surface.

Here’s the idea: running each AI model generates enough revenue to cover its own R&D costs. But that surplus gets outweighed by the costs of developing the next big model. So, despite making money on each model, companies can lose money each year.

This is big if true. In fast-growing tech sectors, investors typically accept losses today in exchange for big profits down the line. So if AI models are already covering their own costs, that would paint a healthy financial outlook for AI companies.

But we can’t take Altman and Amodei at their word — you’d expect CEOs to paint a rosy picture of their company’s finances. And even if they’re right, we don’t know just how profitable models are.

To shed light on this, we looked into a notable case study: using public reporting on OpenAI’s finances,1 we made an educated guess on the profits from running GPT-5, and whether that was enough to recoup its R&D costs. Here’s what we found:

  • Whether OpenAI was profitable to run depends on which profit margin you’re talking about. If we subtract the cost of compute from revenue to calculate the gross margin (on an accounting basis),2 it seems to be about 50% — lower than the norm for software companies (where 60-80% is typical) but still higher than many industries.

  • But if you also subtract other operating costs, including salaries and marketing, then OpenAI most likely made a loss, even without including R&D.

  • Moreover, OpenAI likely failed to recoup the costs of developing GPT-5 during its 4-month lifetime. Even using gross profit, GPT-5’s tenure was too short to bring in enough revenue to offset its own R&D costs. So if GPT-5 is at all representative, then at least for now, developing and running AI models is loss-making.

This doesn’t necessarily mean that models like GPT-5 are a bad investment. Even an unprofitable model demonstrates progress, which attracts customers and helps labs raise money to train future models — and that next generation may earn far more. What’s more, the R&D that went into GPT-5 likely informs future models like GPT-6. So these labs might have a much better financial outlook than it might initially seem.

Let’s dig into the details.

Part I: How profitable is running AI models?

To answer this question, we consider a case study which we call the “GPT-5 bundle”.3 This includes all of OpenAI’s offerings available during GPT-5’s lifetime as the flagship model — GPT-5 and GPT-5.1, GPT-4o, ChatGPT, the API, and so on.4 We then estimate the revenue and costs of running the bundle.5

Revenue is relatively straightforward: since the bundle includes all of OpenAI’s models, this is just their total revenue over GPT-5’s lifetime, from August to December last year.6 This works out to $6.1 billion.7

At first glance, $6.1 billion sounds healthy, until you juxtapose it with the costs of running the GPT-5 bundle. These costs come from four main sources:

  1. Inference compute: $3.2 billion. This is based on public estimates of OpenAI’s total inference compute spend in 2025, and assuming that the allocation of compute during GPT-5’s tenure was proportional to the fraction of the year’s revenue raised in that period.8

  2. Staff compensation: $1.2 billion, which we can back out from OpenAI staff counts, reports on stock compensation, and things like H1B filings. One big uncertainty with this: how much of the stock compensation goes toward running models, rather than R&D? We assume 40%, matching the fraction of compute that goes to inference. Whether staffing follows the same split is uncertain, but it’s our best guess.910

  3. Sales and marketing (S&M): $2.2 billion, assuming OpenAI’s spending on this grew between the first and second halves of last year.1112

  4. Legal, office, and administrative costs: $0.2 billion, assuming this grew between 1.6× and 2× relative to their 2024 expenses. This accounts for office expansions, new office setups, and rising administrative costs with their growing workforce.

So what are the profits? One option is to look at gross profits. This only counts the direct cost of running a model, which in this case is just the inference compute cost of $3.2 billion. Since the revenue was $6.1 billion, this leads to a profit of $2.9 billion, or a gross profit margin of 48%, in line with other estimates.13 This is lower than other software businesses (typically 70-80%) but high enough to eventually build a business on.

On the other hand, if we add up all four cost types, we get close to $6.8 billion. That’s somewhat higher than the revenue, so on these terms the GPT-5 bundle made an operating loss of $0.7 billion, with an operating margin of -11%.14
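
The headline arithmetic in this section condenses to a few lines. A minimal sketch using the figures above (revenue and the four cost buckets are this post’s estimates, in billions of dollars):

```python
# All figures in $ billions, taken from the estimates above for GPT-5's Aug-Dec tenure.
revenue = 6.1
costs = {
    "inference compute": 3.2,
    "staff compensation": 1.2,
    "sales and marketing": 2.2,
    "legal, office, admin": 0.2,
}

gross_profit = revenue - costs["inference compute"]
operating_profit = revenue - sum(costs.values())

print(f"gross profit:     ${gross_profit:.1f}B ({gross_profit / revenue:.0%} margin)")
print(f"operating profit: ${operating_profit:+.1f}B ({operating_profit / revenue:.0%} margin)")
# -> roughly $2.9B (48%) gross, and about -$0.7B (-11%) operating
```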

Stress-testing the analysis with more aggressive or conservative assumptions doesn’t change the picture much:15

Confidence intervals are obtained from a Monte Carlo analysis.

And there’s one more hiccup: OpenAI signed a deal with Microsoft to hand over about 20% of their $6.1 billion revenue,16 making their losses even larger still.17 This doesn’t mean that the revenue deal is entirely harmful to OpenAI — for example, Microsoft also shares revenue back to OpenAI.18 And the deal probably shouldn’t significantly affect how we see model profitability — it seems more to do with OpenAI’s economic structure rather than something fundamental to AI models. But the fact that OpenAI and Microsoft have been renegotiating this deal suggests it’s a real drag on OpenAI’s path to profitability.

In short, running AI models is likely profitable in the sense of having decent gross margins. But OpenAI’s operating margin, which includes marketing and staffing, is likely negative. For a fast-growing company, though, operating margins can be misleading — S&M costs typically grow sublinearly with revenue, so gross margins are arguably a better proxy for long-run profitability.

So our numbers don’t necessarily contradict Altman and Amodei yet. But so far we’ve only seen half the story — we still need to account for R&D costs, which we’ll turn to now.

Part II: Are models profitable over their lifecycle?

Let’s say we buy the argument that we should look at gross margins. On those terms, it was profitable to run the GPT-5 bundle. But was it profitable enough to recoup the costs of developing it?

In theory, yes — you just have to keep running them, and sooner or later you’ll earn enough revenue to recoup these costs. But in practice, models might have too short a lifetime to make enough revenue. For example, they could be outcompeted by products from rival labs, forcing them to be replaced.

So to figure out the answer, let’s go back to the GPT-5 bundle. We’ve already figured out its gross profits to be around $3 billion. So how do these compare to its R&D costs?

Estimating this turns out to be a finicky business. We estimate that OpenAI spent $16 billion on R&D in 2025,19 but there’s no conceptually clean way to attribute some fraction of this to the GPT-5 bundle. We’d need to make several arbitrary choices: should we count the R&D effort that went into earlier reasoning models, like o1 and o3? Or what if experiments failed, and didn’t directly change how GPT-5 was trained? Depending on how you answer these questions, the development cost could vary significantly.

But we can still do an illustrative calculation: let’s conservatively assume that OpenAI started R&D on GPT-5 after o3’s release last April. Then there’d still be four months between then and GPT-5’s release in August,20 during which OpenAI spent around $5 billion on R&D.21 But that’s still higher than the $3 billion of gross profits. In other words, OpenAI spent more on R&D in the four months preceding GPT-5, than it made in gross profits during GPT-5’s four-month tenure.
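
To see how far short the four-month window falls, compare the monthly gross profit during GPT-5’s lifetime with the R&D bill it would need to pay back. A rough sketch using the estimates in this section (the ~$2.9 billion gross profit, the 127-day tenure and the ~$5 billion R&D figure):

```python
# Rough payback arithmetic, in $ billions, using the estimates in this section.
gross_profit = 2.9            # gross profit over GPT-5's tenure (Aug 7 - Dec 11)
tenure_months = 127 / 30.4    # about 4.2 months
rd_cost = 5.0                 # R&D attributed to GPT-5 (April to August, conservative)

monthly_gross = gross_profit / tenure_months
payback_months = rd_cost / monthly_gross
print(f"about ${monthly_gross:.2f}B gross profit per month; "
      f"roughly {payback_months:.0f} months needed to recoup R&D vs a {tenure_months:.0f}-month tenure")
```

On these numbers, GPT-5 would have needed roughly seven months of gross profit to pay back the R&D attributed to it, against a four-month run as the flagship.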

So in practice, it seems like model tenures might indeed be too short to recoup R&D costs. Indeed, GPT-5’s short tenure was driven by external competition — Gemini 3 Pro had arguably surpassed the GPT-5 base model within three months.

One way to think about this is to treat frontier models like rapidly-depreciating infrastructure: their value must be extracted before competitors or successors render them obsolete. So to evaluate AI products, we need to look at both profit margins in inference as well as the time it takes for users to migrate to something better. In the case of the GPT-5 bundle, we find that it’s decidedly unprofitable over its full lifecycle, even from a gross margin perspective.

Part III: Will AI models become profitable?

So the finances of the GPT-5 bundle are less rosy than Altman and Amodei suggest. And while we don’t have as much direct evidence on other models from other labs, they’re plausibly in a similar boat — for instance, Anthropic has reported similar gross margins to OpenAI. So it’s worth thinking about what it means if the GPT-5 bundle is at all representative of other models.

The most crucial point is that these model lifecycle losses aren’t necessarily cause for alarm. AI models don’t need to be profitable today, as long as companies can convince investors that they will be in the future. That’s standard for fast-growing tech companies.

Early on, investors value growth over profit, believing that once a company has captured the market, they’ll eventually figure out how to make it profitable. The archetypal example of this is Uber — they accumulated a $32.5 billion deficit over 14 years of net losses, before their first profitable year in 2023. By that measure, OpenAI is thriving: revenues are tripling annually, and projections show continued growth. If that trajectory holds, profitability looks very likely.

And there are reasons to even be really bullish about AI’s long-run profitability — most notably, the sheer scale of value that AI could create. Many higher-ups at AI companies expect AI systems to outcompete humans across virtually all economically valuable tasks. If you truly believe that in your heart of hearts, that means potentially capturing trillions of dollars from labor automation. The resulting revenue growth could dwarf development costs even with thin margins and short model lifespans.

That’s a big leap, and some investors won’t buy the vision. Or they might doubt that massive revenue growth automatically means huge profits — what if R&D costs scale up like revenue? These investors might pay special attention to the profit margins of current AI, and want a more concrete picture of how AI companies could be profitable in the near term.

There’s an answer for these investors, too. Even if you doubt that AI will become good enough to spark the intelligence explosion or double human lifespans, there are still ways that AI companies could turn a profit. For example, OpenAI is now rolling out ads to some ChatGPT users, which could add between $2 billion and $15 billion in yearly revenue even without any user growth.22 They’re moving beyond individual consumers and increasingly leaning on enterprise adoption. Algorithmic innovations mean that running models could get many times cheaper each year, and possibly much faster. And there’s still a lot of room to grow their user base and usage intensity — for example, ChatGPT has close to a billion users, compared to around six billion internet users. Combined, these could add many tens of billions of revenue.

It won’t necessarily be easy for AI companies to do this, especially because individual labs will need to come face-to-face with AI’s “depreciating infrastructure” problem. In practice, the “state-of-the-art” is often challenged within months of a model’s release, and it’s hard to make a profit from the latest GPT if Claude and Gemini keep drawing users away.

But this inter-lab competition doesn’t stop all AI models from being profitable. Profits are often high in oligopolies because consumers have limited alternatives to switch to. One lab could also pull ahead because they have some kind of algorithmic “secret sauce”, or they have more compute.23 Or they develop continual learning techniques that make it harder for consumers to switch between model providers.

These competitive barriers can also be circumvented. Companies could form their own niches, and we’ve already seen that to some degree: Anthropic is pursuing something akin to a “code is all you need” mission, Google DeepMind wants to “solve intelligence” and use that to solve everything from cancer to climate change, and Meta strives to make AI friends too cheap to meter. This lets individual companies gain revenue for longer.

So will AI models (and hence AI companies) become profitable? We think it’s very possible. While our analysis of the GPT-5 bundle is more conservative than Altman and Amodei hint at, what matters more is the trend: compute costs are falling, enterprise deals are stickier, and models can stay relevant longer than the GPT-5 cycle suggests.


Authors’ note: We’d like to thank Josh You, David Owen, Ricardo Pimentel, Lynette Bye, Jay Tate, Juan García, Charles Dillon, Brendan Halstead, Isabel Johnson, Markov Gray and others for their feedback and support on this post. Special thanks to those who initiated this collaboration and provided vital input, in-depth feedback and discussion.

1

Our main sources of information include claims by OpenAI and their staff, and reporting by The Information, CNBC and the Wall Street Journal. We’ve linked our primary sources through the document.

2

Technically, gross margins should also include staff costs that were essential to delivering the product, such as customer service. But these are likely a small fraction of salaries, which are in turn dwarfed by compute costs — so it won’t affect our analysis much, as we’ll see.

3

We focus on OpenAI models because we have the most financial data available on them.

4

Should we include Sora 2 in this bundle? You could argue that we shouldn’t, because it runs on its own platform and is heavily subsidized to kickstart a new social network, making its economics quite different. However, we find that it’s likely a rounding error for revenues, since people don’t use it much. In particular, the Sora app had close to 9 million downloads by December, compared to around 900 million weekly active users of ChatGPT.

Now, while it likely didn’t make much revenue, it might have been costly to serve — apparently making TikTok-esque AI short-form videos using Sora 2 cost OpenAI several hundred million dollars. Here’s a rough estimate: In November (when app downloads peaked), Sora 2 had “almost seven million generations happening a day”. Assuming generations were proportional to weekly active users over time, this would mean 330 million videos in total. The API cost is $0.1/s, so if the average video was 10s long, and assuming the API compute profit margin was 20%, this adds up to 330 million × $0.1 × 10 / 1.2 ≈ $275 million. This is significant, but it’s minor compared to OpenAI’s overall inference compute spend.

5

Ideally we’d have only looked at a single model, but we only have data on costs and revenues at the company-level, not at the release-level, so we do the next best thing.

6

For the purposes of this post, we assume that GPT-5’s lifetime started when GPT-5 was released (Aug 7th) and ended when GPT-5.2 was released (Dec 11th). That might seem a bit odd — after all, isn’t GPT-5.2 based on GPT-5? We thought so too, but GPT-5.2 has a new knowledge cutoff, and is apparently “built on a new architecture”, so it might have a different base model from the other models under the GPT-5 moniker.

Admittedly, we don’t know for sure that GPT-5.2 uses a different base model, but it’s a convenient way to bound the timeframe of our analysis. And it shouldn’t matter much for our estimates of profit margins, because we’re simply comparing revenues and costs over the same time period.

Also note that GPT-5 and GPT-5.1 are still available through ChatGPT and OpenAI’s API, so their useful life hasn’t strictly ended. We assume, for simplicity, that usage has been largely displaced by GPT-5.2.

7

In July, OpenAI had its first month with over $1 billion in revenue, and it closed the year with an annualized revenue of over $20 billion ($1.7 billion per month). If this grew exponentially, the average revenue over the four months of GPT-5’s tenure would’ve been close to $1.5 billion, giving a total of $6 billion during the period.
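
A sketch of that interpolation. One reading of the anchors in this footnote – “over $1 billion” in July as roughly $1.1 billion per month, and “over $20 billion” annualized by year end as roughly $1.8 billion per month – reproduces the figures quoted; the exact endpoints are our assumption:

```python
import math

# Back-of-envelope: interpolate monthly revenue exponentially across GPT-5's tenure.
start_rate, end_rate = 1.1, 1.8       # $B per month, around mid-July and mid-December
months_span = 5.0                     # mid-July -> mid-December
growth = math.log(end_rate / start_rate) / months_span

def monthly_rate(t: float) -> float:  # t = months after mid-July
    return start_rate * math.exp(growth * t)

# GPT-5 tenure: Aug 7 (~0.75 months after mid-July) to Dec 11 (~4.9 months after)
t0, t1 = 0.75, 4.9
total = (monthly_rate(t1) - monthly_rate(t0)) / growth   # integral of the monthly rate
print(f"average ≈ ${total / (t1 - t0):.2f}B/month, total ≈ ${total:.1f}B")
# -> roughly $1.5B per month on average, and about $6B over the tenure
```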

8

Last year, OpenAI earned about $13 billion in full-year revenue, compared to $6.1 billion for the GPT-5 bundle. At the same time, they spent around $7 billion running all models last year, so if we assume revenue and inference compute are proportional throughout the year, they spent 6.1 billion / 13 billion × 7 billion ≈ $3.3 billion. But in practice, these likely didn’t grow proportionally, because the compute profit margin for paid users increased from 56% in January to 68% in October. This means that inference grew cheaper relative to revenue, saving about 10% in costs, which is $300 million-ish (importantly, both free and paying users grew around 2.6× over this period from January to October).

This is then offset by an additional $200 million from other sources of IT spending, including e.g., servers and networking equipment. The total is then still around $3.3 billion - $0.3 billion + $0.2 billion = $3.2 billion.

9

H1B filings suggest an average base salary of $310,000 in 2025, ranging from $150,000 to $685,000. This seems broadly consistent with data from levels.fyi, which reports salaries ranging from $144,275 to $1,274,139 as we’re writing this. Overall, let’s go with an average of $310,000 plus around 40% in benefits. We also know that OpenAI’s staff count surged from 3,000 in mid-2025 to 4,000 by the end of 2025. We smoothly interpolate between these to get an average staff count of 3,500 employees during GPT-5’s lifetime.

Then the base salary comes to: 3,500 employees × $310,000 base salary × 1.4 benefits × 40% share of employees working on serving GPT-5 × 127 / 365 period serving ≈ $0.2 billion (the 127 comes from the number of days in GPT-5’s lifetime).

We then need to account for stock compensation. In 2025, OpenAI awarded $6 billion to employees in stock compensation. Assuming they awarded them proportionally to staff count over the year, and given the exponential increase of staff counts, that would indicate that over 42% of the stock was awarded during GPT-5’s lifetime. Assuming 40% goes to operations as before, that results in $6 billion x 42% x 40% = $1 billion stock expense for operating the GPT-5 bundle. The total staff compensation would then be around $1.2 billion.
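
The staff-compensation arithmetic in this footnote condenses to a short sketch; every input below is one of the footnote’s own assumptions:

```python
# Staff compensation attributed to running the GPT-5 bundle, per this footnote.
headcount = 3_500             # average employees during GPT-5's lifetime
base_salary = 310_000         # average base salary ($), from H1B filings
benefits_multiplier = 1.4     # ~40% benefits on top of base
inference_share = 0.40        # share of staff effort assumed to go to serving models
tenure_fraction = 127 / 365   # GPT-5's lifetime as a fraction of the year

salaries = headcount * base_salary * benefits_multiplier * inference_share * tenure_fraction

stock_pool = 6e9              # 2025 stock compensation ($)
awarded_in_tenure = 0.42      # share of the stock awarded during GPT-5's lifetime
stock = stock_pool * awarded_in_tenure * inference_share

print(f"salaries ≈ ${salaries / 1e9:.1f}B, stock ≈ ${stock / 1e9:.1f}B, "
      f"total ≈ ${(salaries + stock) / 1e9:.1f}B")
# -> about $0.2B in salaries plus $1.0B in stock, i.e. roughly $1.2B in total
```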

10

It’s debatable whether the very high compensation packages for technical staff will continue as the industry matures.

11

In the first half of 2025, OpenAI spent $2 billion on S&M, which we can convert into a daily rate of $11 million per day. This grew over time (S&M spending doubled from 2024 to H1 2025), so the average pace over GPT-5’s lifetime is higher (we estimate about $17 million a day). If we multiply this by the 127 days in the window, we get a rough total of $2.2 billion.

12

This corresponds to around 30% of revenue during the period, which isn’t unusual compared to other large software companies. For example, Adobe, Intuit, Salesforce and ServiceNow all spent around 27% to 35% of their 2024-2025 revenue in S&M. That said, there are certainly examples with lower spends — for example, Microsoft and Oracle spend 9 to 15% of their revenue on marketing, though note that these are relatively mature firms — younger firms may spend higher fractions on S&M.

13

Last year, OpenAI reported a gross profit margin of 48%, which is consistent with our estimates. From the same article, Anthropic expects a similar gross profit margin, suggesting this might be representative of the industry.

14

How does this compare to previous years? The Information reported that in 2024 OpenAI made $4 billion in revenue, and spent $2.4 billion in inference compute and hosting, $700 million in employee salaries, $600 million in G&A, and $300 million in S&M. This implies a gross margin of 40% and an operating margin of 0% (excluding stock compensation).

15

In broad strokes, we perform a sensitivity analysis by considering a range of possible values for each cost component, then sampling from each to consider a range of plausible scenarios (a Monte Carlo analysis). The largest uncertainties that feed into this analysis are how much staff compensation goes to inference instead of R&D, S&M spending in the second half of 2025, and revenue during GPT-5’s tenure.

16

Two more caveats to add: first, this 20% rate isn’t publicly confirmed by OpenAI or Microsoft, at least to our knowledge. Second, the revenue sharing agreement is also more complex than just this number. Microsoft put a lot of money and compute into OpenAI, and in return it gets a significant ownership stake, special rights to use OpenAI’s technology, and some of OpenAI’s revenue. There also isn’t a single well-defined “end date”: some rights are set to last into the early 2030s, while other parts (including revenue sharing) continue until an independent panel confirms OpenAI has reached “AGI”.

17

Strictly speaking, a revenue share agreement is often seen as an expense that would impact gross margins. But we’re more interested in the unit economics that generalize across models, rather than those that are unique to OpenAI’s financial situation.

18

The deal was signed in 2019, a year before GPT-3 was released, and at this time it may have been an effective way to access compute resources and get commercial distribution. This could’ve been important for OpenAI to develop GPT-5 in the first place.

19

OpenAI’s main R&D spending is on compute, salaries and data. In 2025, they spent $9 billion on R&D AI compute, and about $1 billion on data (which includes paying for human experts and RL environments). We can estimate salary payouts in the same way we did in the previous section on inference, except we consider 60% of staff compensation rather than 40%, resulting in an expense of $4.6 billion. Finally, we add about $400 million in offices and administrative expenses, and $600 million in other compute costs (including e.g. networking costs). This adds up to about $16 billion.

20

In fact, we could be substantially lowballing the R&D costs. GPT-5 has been in the works for a long time — for example, early reasoning models like o1 probably helped develop GPT-5’s reasoning abilities. GPT-5.1 was probably being developed between August and November, covering a good chunk of the GPT-5 bundle’s tenure. But there’s a countervailing consideration: some of the R&D costs for GPT-5 probably help develop future models like “GPT-6”. So it’s hard to say what the exact numbers are, but we’re pretty confident that our overall point still stands.

21

Because OpenAI’s expenses are growing exponentially, we can’t just estimate the share of R&D spending in this period as one-third of the annual total. Assuming a 2.3× annual growth rate in R&D expenses — comparable to the increase in OpenAI’s R&D compute spending from 2024 to 2025 — the costs incurred between April 16 and August 7 would account for approximately 35% of total yearly R&D expenses.

22

OpenAI was approaching 900 million weekly active users in December last year. For ads, they project a revenue of $2 per free user for 2026, and up to $15 per free user for 2030. Combining these numbers gives our estimate of around $2 billion to $15 billion.

23

For the investors who are willing to entertain more extreme scenarios, an even stronger effect is when “intelligence explosion” dynamics kick in — if OpenAI pulls ahead at the right time, they could use their better AIs to accelerate their own research, amplifying a small edge into a huge lead. This might sound like science fiction to a lot of readers, but representatives from some AI companies have publicly set these as goals. For instance, Sam Altman claims that one of OpenAI’s goals is to have a “true automated AI researcher” by March 2028.

📈 Data to start your week

2026-01-26 23:48:36

Hi all,

Here’s your Monday round-up of data driving conversations this week in less than 250 words.

Let’s go!



  1. Enterprise shift ↑ OpenAI’s enterprise customers now account for 40% of the business, up from 25% last year. We estimate that their enterprise revenue has grown roughly 5.8x year-on-year.

  2. Compute squeeze ↑ Following the Claude 4.5 Opus launch, H100 rental prices hit a new 8-month high, up ~14% since November lows.

  3. Humans reading documentation ↓ Almost half of traffic (48%) to developer documentation is now from AI, not humans.


🔮 Exponential View #558: Davos & reinventing the world; OpenAI's funk; markets love safety; books are cool, robots & AI Squid Game++

2026-01-25 11:28:19

Hi all,

I just got back from Davos, and this year was different. The AI discussion was practical – CEOs asking each other what’s actually happening with their workforces, which skills matter now. At the same time, I saw leaders struggling to name the deeper shifts reshaping our societies. Mark Carney came closest, and in this week’s essay I pick up his argument and extend it through the Exponential View lens.

Enjoy!

Davos and civilizational OS

Mark Carney delivered a speech that will echo for a long time, about “the end of a pleasant fiction and the beginning of a harsh reality.” Carney was talking about treaties and trade, but the fictions unravelling go much deeper.

Between 2010 and 2017, three fundamental inputs to human progress – energy, intelligence, and biology – crossed a threshold. Each moved from extraction to learning, from “find it and control it” to “build it and improve it.” This is not a small shift. It is an upgrade to the operating system of civilization. For most of history, humanity ran on what I call the Scarcity OS – resources are limited, so the game is about finding them, controlling them, defending your share. This changed with the three crossings. As I write in my essay this weekend:

In each of the three crossings, a fundamental input to human flourishing moved from a regime of extraction, where the resource is fixed, contested, and depleting, to a regime of learning curves, where the resource improves with investment and scales with production.

At Davos, I saw three responses: the Hoarder who concludes the game is zero-sum (guess who), the Manager who tries to patch the system (Carney), and the Builder who sees that the pie is growing and the game is not about dividing but creating more. The loudest voices in public right now are hoarders, the most respectable are managers, and the builders are too busy building to fight the political battle. The invitation of this moment? Not to mourn the fictions, but to ask: what was I actually doing that mattered, and how much more of it can I do now?

Full reflections in this week’s essay:



Finding new alpha

OpenAI was the dominant player in the chatbot economy, but we’re in the agent economy now. This economy will be huge, arguably thousands of times bigger,1 but it’s an area OpenAI is currently not winning: Anthropic is. Claude Code reached a $1 billion run rate within six months, and is likely even higher after its Christmas social media storm.

OpenAI is still looking for other revenue pathways. In February, ChatGPT will start showing ads to its 900 million users – betting more on network effects than pure token volume. This could backfire, though. At Davos, Demis Hassabis said he was “surprised” by the decision and that Google had “no plans” to run ads in Gemini. In his view, AI assistants act on behalf of the user; but when your agent has third-party interests, it’s not your agent anymore.

OpenAI reported that its revenue scales in line with compute; they need a lot of energy and funding.

Sarah Friar, OpenAI’s CFO, wants maximum optionality and one of the bets will be taking profit-sharing stakes in discoveries made using their technology. In drug discovery, for example, OpenAI could take a “license to the drug that is discovered,” essentially claiming royalties on customer breakthroughs. Both Anthropic and Google2 are already there and have arguably shown more for it. Google’s Isomorphic Labs, built on Nobel Prize-winning AlphaFold technology, already has ~$3 billion in pharma partnerships with Eli Lilly and Novartis, and is entering human clinical trials for AI-designed drugs this year3. Then, there are OpenAI’s hardware ambitions.

OpenAI needs a new alpha. Their main advantage was being the first mover. But the alpha has shifted from models to agents and there, Anthropic moved first properly with Claude Code. It’s hard to see how OpenAI can sustain its projection of $110 billion in free cash outflow through 2028 in a market it isn’t clearly winning. Anthropic, meanwhile, projects burning only a tenth of what OpenAI will before turning cashflow positive in 2027 (although their cloud costs for running models ended up 23% higher in 2025 than forecast).

Perhaps this is why Dario Amodei, CEO of Anthropic, told me at Davos that research-led AI companies like Anthropic and Google will succeed going forward. Researchers generate the alpha, and research requires time, patience and not a lot of pressure from your product team. OpenAI has built itself around product timelines and pressure. This has an impact on the culture and talent. Jerry Tworek, the reasoning architect behind o1, departed recently to do research he felt he couldn’t do at OpenAI (more in this great conversation).

None of this means that OpenAI is out for the count. They still have 900 million users, $20 billion in revenue, and Stargate. But they’re currently in a more perilous position than the competition.

See also:

  • Apple v OpenAI… The iPhone maker is developing a wearable AI pin, sized like an AirTag, with release expected in 2027. They also plan to replace Siri later this year with a genAI chatbot, code-named Campos, in partnership with Google.

  • A new piece highlights how AI agents could transform “matching markets” – hiring, dating, specialized services – by helping people articulate what they actually want.


A MESSAGE FROM OUR SPONSOR

Startups move faster on Framer

First impressions matter. With Framer, early-stage founders can launch a beautiful, production-ready site in hours — no dev team, no hassle.

Pre-seed and seed-stage startups new to Framer will get:

  • One year free: Save $360 with a full year of Framer Pro, free for early-stage startups.

  • No code, no delays: Launch a polished site in hours, not weeks, without technical hiring.

  • Built to grow: Scale your site from MVP to full product with CMS, analytics, and AI localization.

  • Join YC-backed founders: Hundreds of top startups are already building on Framer.

Claim your free year


Ethics is economics

The conventional story treats alignment as a tax on capability. Labs face a prisoner’s dilemma: race fast or slow down for safety while someone else beats you to market. At Davos, Dario Amodei said if it were only Demis and him, they could agree to move slowly. But there are other players. Demis told me the same after dinner.

This framing might suggest to some that we’re in a race toward misaligned superintelligence. But I’ve noticed something in recent dynamics that makes me more hopeful. A coordination mechanism exists and, paradoxically, it runs through the market.

When users deploy an agent with file system access and code execution, they cede control. An agent with full permissions can corrupt your computer and exfiltrate secrets. But to use agents to their full potential, you have to grant such permissions. You have to let them rip4.

A Senior Fellow at the Foundation for American Innovation noticed that the only lab that lets AI agents take over your entire computer is the “safety-focused” lab, Anthropic. OpenAI’s Codex and Gemini CLI seek permission more often. Why would the safety-focused lab allow models to do the most dangerous thing they’re currently capable of? Because their investment in alignment produced a model that can be trusted with autonomy5.

Meanwhile, the one company whose models have become more misaligned over time, xAI, has encountered deepfake scandals, regulatory attention, and enterprise users unwilling to deploy for consequential work.

Alignment generates trust, trust enables autonomy, and autonomy unlocks market value. The most aligned model becomes the most productive model because of the safety investment.

See also:

  • Anthropic researchers have discovered the “Assistant Axis,” the area of an LLM that represents the default helpful persona, and introduced a method to prevent the AI from drifting into harmful personas.

  • Signal Foundation President Meredith Whittaker warns that root-level access required by autonomous AI agents compromises the security integrity of encrypted applications. The deep system integration creates a single point of failure.



The robotics flywheel

Robotics has two of Exponential View’s favourite forces working for it: scaling laws and Wright’s law. In this beautiful essay worth your time, software engineer Jacob Rintamaki shows how those dynamics push robotics toward becoming general-purpose – and doing so much faster than most people expect.

Robotics needs a lot of data. Vision-language-action models are expected to benefit from scaling laws similar to LLMs.6 The problem is data scarcity: language has a lot of data, but vision-language-action data is scarce. Robotics is roughly at the GPT-2 stage of development. But each robot that starts working in the real world becomes a data generator for the specific actions it performs – this creates a flywheel. More deployed robots generate more varied action data. The next generation of models absorbs this variety and becomes more capable, unlocking larger markets worth serving. That’s scaling laws. And Wright’s law compounds the effect: each doubling of cumulative production drives down costs. The cheapest humanoid robots today already cost only $5,000 per unit. Rintamaki argues they’ll eventually cost “closer to an iPhone than a car”; they require fewer raw materials than vehicles and need no safety certifications for 100mph travel.
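
Wright’s law is worth making concrete: each doubling of cumulative production cuts unit cost by a fixed percentage. A toy sketch below, with the $5,000 starting price taken from the paragraph above and the ~20% learning rate purely an illustrative assumption, not a figure from Rintamaki’s essay:

```python
import math

# Wright's law: each doubling of cumulative production cuts unit cost by a fixed share.
learning_rate = 0.20           # illustrative assumption: 20% cost decline per doubling
current_cost = 5_000           # $: roughly today's cheapest humanoid robots

def cost_after_growth(factor: float) -> float:
    """Unit cost once cumulative production has grown by `factor`."""
    doublings = math.log2(factor)
    return current_cost * (1 - learning_rate) ** doublings

for factor in [10, 100, 1_000, 10_000]:
    print(f"{factor:>6}x more cumulative production -> ≈ ${cost_after_growth(factor):,.0f} per unit")
```

At a 20% learning rate, three orders of magnitude more cumulative production takes the unit cost from $5,000 to roughly the price of a phone – the “closer to an iPhone than a car” intuition.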

AI datacenter construction will kick off the flywheels. Post-shell work (installing HVAC systems and running cables) is 30-40% of construction costs and is repetitive enough for current robotics capabilities. The buyers are sophisticated, the environments standardised, and the labour genuinely scarce: electricians and construction crews are in short supply.

See also: World Labs launched the World API, which programmatically generates explorable 3D worlds from text and images. A potential training environment for robots.


Elsewhere:

1

For instance, based on Simon P. Couch’s analysis, his median Claude Code session consumes 41 Wh, 138x more than a “typical query” of 0.3 Wh. On a median day, he estimates consuming 1,300 Wh through Claude Code, equivalent to roughly 4,400 typical queries. Even if you run 100 queries a day, that is over 40 times more usage. And this is probably still not the most you can get out of agents in a day.
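Restating those figures as a quick sanity check (all inputs are Couch’s estimates):

```python
# Back-of-envelope check of Couch's estimates; all inputs are his figures restated.
typical_query_wh = 0.3     # energy per "typical query"
median_session_wh = 41     # median Claude Code session
median_day_wh = 1_300      # estimated median daily Claude Code usage

print(median_session_wh / typical_query_wh)       # ≈ 137x a typical query
print(median_day_wh / typical_query_wh)           # ≈ 4,333 typical-query equivalents
print(median_day_wh / (100 * typical_query_wh))   # ≈ 43x a 100-query day
```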

2

Google can afford to play the game more patiently. They have the money and the data-crawling advantage from their dominant position in online advertising – publishers want Google’s bots to crawl their sites to send search traffic. This advantage has concerned competition authorities around the world, most recently the UK CMA.

3

Although ambitions were for this to happen in 2025.

4

This is by no means a recommendation – current systems should not be fully trusted yet. There are ways to give agents more permissive environments while limiting damage (e.g. sandboxing).

5

You can read Claude’s constitution to see the ethics framework it operates under.

6

Extracting clear scaling laws is harder for robotics than for LLMs: different embodiments, environments, and tasks make a single “log-linear curve of destiny” elusive.

🔮 The end of the fictions

2026-01-24 20:04:42

I just got back to London after a week at the Annual Meeting at Davos. For the past few years, the World Economic Forum had become a kind of parody of itself, a place where billionaires flew in on private jets to discuss climate change and “stakeholder capitalism” while nothing much seemed to happen. But this year was different.

The AI discussion at the Forum alone was proof of change. It was practical, with CEOs asking each other: what’s actually happening with your workforce? Which skills matter now? Why is that company pulling ahead while everyone else flounders?

And on politics, things moved to the heart of the matter: the fragmentation, the end of the old world order. But neither Davos woman nor Davos man seemed deracinated in the face of the crumbling rules-based order. Rather, they made use of the simple fact that so many of those who hold the world’s power were gathered in one place: to speak and, in many cases, to listen.

The gathering met, nay exceeded, its purpose. Davos showed why it matters, why it is necessary in a world that is fraying rather than cohering.

Canada’s prime minister, Mark Carney, gave a speech that will echo for a long time. He spoke of “the end of a pleasant fiction and the beginning of a harsh reality.” He was referring to the unraveling of the post-war geopolitical settlement, the fading authority of the rules-based order, the growing irrelevance of multilateral institutions designed for a slower, more stable world. If you haven’t seen it yet, I really recommend watching.

Carney was talking about treaties, trade, and power. But these aren’t the only norms that are unravelling.

Today’s reflection originates from the research I’m doing for my second book. There’s a long way to go before it lands on your shelf, but that work is already tracing a similar unraveling – in domains much closer to Exponential View’s home. So let’s get to it.

Subscribe now

The three crossings

We have crossed the threshold from extraction to learning. Image generated using Midjourney

Between 2010 and 2017, three fundamental inputs to human progress – energy, intelligence and biology – crossed a threshold. Each moved from extraction to engineering and learning, from “find it and control it” to “build it and improve it.”

Energy became a technology. For most of human history, energy meant finding something in the ground and burning it: coal seams, oil fields, natural gas deposits. The logic was geological: reserves deplete, access is contested, and the nation that controls the supply controls the game. Wars were fought over this. Empires rose and fell on it. Then solar costs fell below the threshold at which photovoltaics could compete with new fossil generation in sunny regions. Wright’s law in action: every doubling of cumulative production cuts costs by roughly 20%. The learning curve, once it takes hold, is relentless, as Exponential View readers know.

Source: Exponential View, Casey Handmer

Intelligence became engineerable. For decades, artificial intelligence was a research curiosity plagued by “winters,” periods of hype followed by disappointment. Neural networks worked, sort of, but scaling them was brutal; progress was uncertain and capability gains were unpredictable. Then, in 2017, a team at Google published “Attention Is All You Need”, the paper that introduced the transformer architecture. The insight was technical – a new way to process sequences in parallel using self-attention – but the consequence was civilizational. For the first time, there was a reliable scaling law for intelligence: more compute and more data yielded more capability, predictably. AI became an engineering problem. We crossed from uncertainty to a learning curve.
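For a concrete sense of what a “reliable scaling law” means, here is a minimal sketch in the shape of the Chinchilla compute-optimal loss curve (Hoffmann et al., 2022). The constants below are illustrative placeholders rather than the paper’s fitted values; the point is only that loss falls predictably as parameters and data grow.

```python
# Chinchilla-style scaling law: loss falls as a power law in model parameters N
# and training tokens D. Constants here are illustrative placeholders, not the
# fitted values from Hoffmann et al. (2022).
def loss(n_params: float, n_tokens: float,
         e: float = 1.7, a: float = 400.0, b: float = 400.0,
         alpha: float = 0.34, beta: float = 0.28) -> float:
    return e + a / n_params**alpha + b / n_tokens**beta

# Scaling both inputs by 10x moves you a predictable distance down the curve.
for n, d in [(1e9, 2e10), (1e10, 2e11), (1e11, 2e12)]:
    print(f"N={n:.0e}, D={d:.0e} -> loss ≈ {loss(n, d):.3f}")
```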

Biology became readable. The human genome was first sequenced in 2003, after roughly $3 billion and thirteen years of work. By the mid-2010s, sequencing costs had fallen to a few thousand dollars and were dropping faster than Moore’s Law.

Source: NIH/NHGRI

The technologies involved (next-generation sequencing, computational genomics) followed their own improvement curves, and they were steeper than anyone predicted. For the first time in the history of life on Earth, a species could read and begin to edit its own source code. Biology moved from evolutionary timescales to engineering timescales, and as a result we got mRNA vaccines and CRISPR.

Civilizational OS is upgrading

In each of the three crossings, a fundamental input to human flourishing moved from a regime of extraction, where the resource is fixed, contested, and depleting, to a regime of learning curves, where the resource improves with investment and scales with production.

This is not a small shift. It is an upgrade to the operating system of civilization.

For most of history, humanity ran on what I call the Scarcity OS. Resources are limited in this system so the game is about finding them, controlling them, and defending your share. This logic shaped everything – our institutions, our economics, our social structures, our sense of what’s possible.

Under Scarcity OS, certain “fictions” emerged. And I use this word carefully. These fictions weren’t lies; they were social technologies, coordination mechanisms that worked brilliantly in a world of genuine constraint.

Take jobs… Jobs were a fiction. Not in the sense that work wasn’t real, but in the sense that bundling tasks, identity, healthcare, social status, and income into a single institution called “employment” was a specific solution to a specific problem: how do you distribute resources and organize production when information is expensive and coordination is hard? The job was an answer to that question. It was a brilliant answer. But it was an answer to a question that is now changing.

Likewise, credentials were a fiction. When evaluating someone’s capability was expensive, we outsourced the judgment to institutions. A degree from a prestigious university wasn’t proof that you could do anything in particular – it was proof that you had survived a sorting mechanism. The credential was a proxy, a compression algorithm for trust. It worked when the cost of direct evaluation was prohibitive. That cost is collapsing.

Expertise was a fiction. Not the knowledge itself, but the social construct of the “expert” – the person whose authority derived from scarcity of information and difficulty of access. When knowledge was locked in libraries, accumulated through years of study, and distributed through gatekept institutions, expertise was a genuine bottleneck. The expert was a bridge between the uninformed and the truth. That bridge is being bypassed.

These fictions were functional adaptations to real constraints. The job, the credential, the expert, each solved a genuine problem in a high-friction world. But the constraints changed. And now the adaptations are decaying.

Subscribe now

Dealing with the decay of the old

At Davos, I saw three different responses to this decay playing out in real time.

The Hoarder sees the old fictions crumbling and concludes that the game is zero-sum. If the pie is fixed, the only strategy is to take more of it. Build walls. Impose tariffs. Retreat to the nation-state. Punish the outgroup. This is Trump’s instinct, and it resonates precisely because it matches the Scarcity OS that most people still run internally. The hoarder isn’t stupid; he’s applying legacy software to a changed environment.

The Manager sees the same decay and tries to patch the system. Redistribute more fairly. Strengthen institutions. Negotiate better deals within the existing framework. This is Mark Carney’s instinct. It’s more sophisticated than hoarding but it shares an assumption that the pie is still fixed, just poorly divided. The manager wants to optimize the Scarcity OS, not replace it.

The Builder would see something different. If the fundamental inputs are now on learning curves – if energy, biology, and intelligence are becoming cheaper and more abundant – then the pie is not fixed, it’s growing. The game is then about accelerating abundance. The builder’s question will not be “how do I get my share?” but “how do I help make more?”

The tragedy of this moment is that the loudest voices are the hoarders, the most respectable voices are the managers, and the builders are too busy building to fight the political battle.

The terror and the invitation

If you’ve built your identity on the old fictions, this transition is terrifying.

Read more