
Exponential View

By Azeem Azhar, an expert on artificial intelligence and exponential technologies.

🔮 Exponential View #553: The story of 100 trillion tokens; China’s chip win; superhuman persuasion, Waymo ethics, Polymarket & hydrology++

2025-12-07 10:59:49

"I haven't read anyone as thorough as you on the AI bubble and related topics." – Miguel O., a paying member



Hi all,

In this edition:

  • Superhuman persuasion. New data shows that AI is more effective than TV ads at changing minds, even when it lies.

  • The chip sanctions failed. How “stacking” old tech is neutralizing the US blockade (and validating my 2020 thesis).

  • Solving the unsolved. Terence Tao calls AI “routine” and autonomous agents are finally cracking the backlog of neglected science.

  • The end of Hollywood. Financialization, not AI, may be the root cause.

In my latest video, I break down how I think about the overlapping technology S-curves that are driving the market upheaval:

There are more exponentials in this AI wave doing their work than just ChatGPT and large language models. These new technologies make new things possible: we start to do things we didn’t do before, either because they weren’t possible or because they were too expensive.

Listen on Apple Podcasts or Spotify


RIP chatbot, hello glass slipper

OpenRouter, an aggregator routing traffic across 300+ AI models for five million developers, released an analysis of 100 trillion tokens of usage. Their data offers a unique, cross-platform look at the market. It's worth a deep read, but for now I'd highlight two findings:

First, the “glass slipper” effect – retention is driven by “first-to-solve,” not “first-to-market.” When a model is the first to unlock a specific, high-friction workload (like a complex reasoning chain), the cohort that adopts it shows effectively zero churn, even when cheaper or faster competitors emerge. This confirms my long-held view: customers don’t buy benchmarks; they buy solutions. Once a model fits the problem, like a glass slipper, switching costs become irrelevant.

Second, the shift to agentic inference is undeniable. In less than 12 months, reasoning-optimised models have surged from negligible to over 50% of all token volume. Consequently, average prompt lengths have quadrupled to over 6,000 tokens, while completion lengths have tripled. The insight here is that users aren’t becoming more verbose; the systems are. We are seeing the mechanical footprint of agentic loops iterating in the background.
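To make that mechanical footprint concrete, here is a minimal sketch of how an agentic loop inflates average prompt length: each turn re-sends the growing transcript plus the latest tool output. The token counts are my own illustrative assumptions, not figures from the OpenRouter report.

```python
# Illustrative sketch: an agent loop re-sends an ever-growing transcript,
# so the average prompt balloons even though the user typed one question.
# All token counts below are assumptions, not OpenRouter data.

initial_prompt = 1_500   # user question + system instructions (assumed)
tool_output = 2_250      # tokens appended after each tool call (assumed)
iterations = 5

prompt_lengths = []
prompt = initial_prompt
for _ in range(iterations):
    prompt_lengths.append(prompt)   # what gets sent to the model this turn
    prompt += tool_output           # transcript grows before the next turn

print(prompt_lengths)                              # [1500, 3750, 6000, 8250, 10500]
print(sum(prompt_lengths) / len(prompt_lengths))   # 6000.0 average tokens per prompt
```

With these made-up numbers, the average prompt across the loop lands around 6,000 tokens even though the user typed only the first 1,500 – the system, not the user, is doing the talking.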


China’s chip tipping point

China's drive for semiconductor independence is accelerating faster than predicted. The recent Shanghai IPO of Moore Threads, a leading AI chipmaker, surged 425% on debut, signaling voracious appetite from domestic capital for "China's Nvidia" alternatives. This aligns with a bold forecast from Bernstein that China is on track to produce more AI chips than it consumes by 2026, effectively neutralizing the intended chokehold of US export controls.

Read more

🔮 The real bottlenecks in AI + Q&A

2025-12-06 01:22:57

In today’s session I reflected on why AI’s bottleneck is no longer the models but the systems expected to absorb them.

I followed with a Q&A that touched on:

  • OpenAI’s competitive position

  • Where value will accrue in the stack

  • The role of energy and grid limits

  • The impact of cybersecurity risks

Enjoy!

Azeem


🔮 The S-curves rewriting our economy

2025-12-05 21:37:25

Yesterday, I unpacked the commercial reality behind Gemini's release and OpenAI's "Code Red." Recent moves look defensive and could narrow OpenAI's route to $100 billion in revenues.

But past the immediate competition, we are seeing overlapping S-curves of technology rewriting the rules of our economy. In this video, I step back from the leaderboard to look at the transition from scale to reasoning, the “TAM fallacy” that is blinding investors, and the emergence of entirely new behaviours.

Skip to the best part:

(00:09) How ChatGPT became synonymous with AI

(11:46) The iPhone calculation that breaks everything

(16:38) The challenge of evaluating new markets

🔔 Subscribe on YouTube for every video – including the ones we don’t publish here.

👀 Did OpenAI’s $100 billion path just get narrower?

2025-12-04 23:10:10

In February 2023, when ChatGPT had just hit 100 million users and launched its $20 premium tier, I ran the numbers on revenue potential. My napkin math pointed to monthly revenues of at least $60 million by year-end. Readers thought I was optimistic. OpenAI closed 2023 at $1.6 billion in annualized revenue and then tripled it the following year.
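For anyone who wants the napkin math spelled out, here is the shape of that calculation; the paid-conversion rate is a hypothetical assumption I'm using for illustration, not a disclosed figure.

```python
# Back-of-envelope reconstruction of the February 2023 estimate.
# The conversion rate is an illustrative assumption, not OpenAI data.

users = 100_000_000        # ChatGPT users at the time
paid_conversion = 0.03     # assumed share paying for the $20/month tier
price_per_month = 20       # USD

monthly_revenue = users * paid_conversion * price_per_month
print(f"${monthly_revenue / 1e6:.0f}M per month")   # ≈ $60M
```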

A month ago, I asked whether they could reach $100 billion in full year revenues by 2027. The math showed a plausible route: layered subscription momentum, enterprise API growth, international expansion, advertising and agentic systems. I concluded:

Is it possible? The mathematics says yes – barely, but yes.

Sam Altman’s declaration of a “code red” makes that ‘barely possible’ even more unlikely.

OpenAI remains the fastest-growing technology company in history, with revenues approaching $13 billion this year and 800 million weekly active users. But now Google, Anthropic and DeepSeek are pressuring OpenAI to choose between defending its core product and pursuing new revenue streams.

Altman seems to have chosen defense. That decision makes sense for product integrity. It also suggests the firm will delay several revenue streams underpinning our $100 billion scenario.

In our previous work, we showed you a scenario. The latest news changes our assumptions – here is a live update.

Exponential View readers get evidence-led analysis and straight updates when assumptions change. Don't miss a beat.

Retreating to defend

The Information reported that Altman declared a ‘code red,’ telling staff: “We are at a critical time for ChatGPT.” Google’s AI resurgence, he warned, could bring ‘temporary economic headwinds.’

Altman redirected engineering toward five priorities: personalization, image generation, model behavior, speed, and reliability. Advertising, AI agents and Pulse – its automatic-feed experiment – now take a lower priority. The concern is that running ads while users doubt ChatGPT's edge would push them toward "good enough" rivals (see my analysis of how Google came to own "good enough"). And "good enough" may understate the threat – on several benchmarks, Google's and Anthropic's models already outperform OpenAI's.

Gemini 3 Pro and Claude 4.5 have started to eat more and more into my AI conversations. ChatGPT is still dominant, but less so than it was just three weeks ago. ChatGPT does retain those queries where years of conversational history have made its context indispensable – a narrowing niche built on accumulated exchanges I can't yet abandon.

I genuinely admired ChatGPT Pulse; its premise was exactly what I needed: telling ChatGPT my tracking priorities, then receiving nightly intelligence briefs tailored to those domains, with offers to drill deeper. Yet execution stumbled on fundamentals and it didn't hold me. After several weeks, the experience became monotonous and small friction points became dealbreakers. Its fatal flaw was that Pulse had no sense of time: it recycled stale findings day after day, like a news feed for amnesiacs. I no longer open it regularly.

Meanwhile, our API and programmatic workflows rarely relied on OpenAI's offering anyway. Gemini 2.5 Flash handles bulk processing. Claude's API powers judgment calls: editing passes, analysis and code, although Gemini 3 Pro is coming in increasingly handy.

I doubt I’m an outlier. Last quarter’s numbers show my shift mirrors a broader migration across three competitive fronts:

  1. Google has woken up. ChatGPT's web traffic declined 6% in late November, correlating directly with the release of Gemini 3 Pro and competing models. Google's Gemini grew from 350 million to 650 million monthly active users between March and October, with daily requests tripling quarter-over-quarter. Gemini added 100 million users in four months, then 200 million in three. A large part of this may be due to the memetic temptations of images generated by Nano Banana. Nano Banana's virality hasn't reached the scale of OpenAI's "ghiblification" craze, yet Google's spike feels more sustainable. Those user numbers will keep growing. But it's not just Google…

  2. Open-source alternatives are eating into OpenAI's cost advantage. DeepSeek's V3.2 matches GPT-5 performance on some benchmarks at a tenth of the cost, and parity at ~6x lower cost is no longer unusual, as we wrote in EV#551.

  3. Focused competitors, best exemplified by Anthropic. Claude Opus 4.5 now leads in coding, agentic workflows and computer use. For enterprise buyers, those are the capabilities that win contracts.

Facing stiffer competition on both enterprise and consumer sides, OpenAI has made a trade-off to delay expansion into ads and agentic systems to protect the core products under siege.


Does $100 billion still hold?

Last month, I projected five revenue streams that could combine to reach $100 billion by 2027. This month’s code red announcement forces us to reassess each one:

Remove or halve the ad and agent revenue, and the math shifts. Our earlier scenario totaled $100 billion across five streams. Without ads and agents firing on schedule, what does that ceiling fall to? Perhaps $55-60 billion by 2027.
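To show the mechanics of that haircut, here is a toy version of the arithmetic. The stream-by-stream split below is entirely hypothetical – it is not the breakdown from last month's scenario – and exists only to illustrate how removing ads and halving agents pulls the total down.

```python
# Hypothetical five-stream split summing to $100bn (illustrative only).
scenario_usd_bn = {
    "subscriptions": 28,
    "enterprise_api": 14,
    "international": 8,
    "advertising": 32,
    "agentic_systems": 18,
}

# The trade-off implied by "code red": drop ads, halve agents (one possible reading).
adjusted = dict(scenario_usd_bn,
                advertising=0,
                agentic_systems=scenario_usd_bn["agentic_systems"] / 2)

print(sum(scenario_usd_bn.values()), "->", sum(adjusted.values()))   # 100 -> 59.0
```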

That remains extraordinary growth by any historical standard. It is not, however, $100 billion.

Of course, there is an astonishing paucity of information in this market right now. Is the “code red” just an internal kick in the pants to an already overworked AI team? Is the decline in ChatGPT’s usage actually linked to the launch of Gemini?

Or are we just seeing typical seasonality? ChatGPT's traffic has traditionally dipped around Black Friday, according to Coatue, an investor in both Anthropic and OpenAI.

If history is any guide, usage could pick up again in two to three weeks, and we’ll update our models once more.


OpenAI has rallied before

OpenAI has rallied before – after Claude 3, after Gemini 1.5’s million-token context window, after DeepSeek’s efficiency leap. Each time, it answered. But now all three forces are surging at once.

OpenAI is already mounting its response. Chief research officer Mark Chen hinted at a model codenamed Garlic:

We have models internally that perform at the level of Gemini 3, and we’re pretty confident that we will release them soon and we can release successor models that are even better.

The upshot, he explained, was that OpenAI can now pack the same level of knowledge into a smaller model that previously required a much larger one. Smaller means faster to train, cheaper to serve and quicker to iterate, exactly the advantages you need when the incumbent is cutting prices and moving the goalposts.

But it would take a lot to undercut Google. The incumbent has woken up and is pulling everything into its gravity well. Its vertical integration allows it to better control inference and training costs; its deep balance sheet is fed by the $300 billion cash spigot that is its ad business. The search giant could cut prices longer than most rivals can bear.

When escaping an object as massive as Google, you need to find an angle – one that really distinguishes you from the competition and is perhaps orthogonal to its gravitational field. Again, I go into detail on this in my analysis of Google.

I’m wondering whether OpenAI’s broad approach still makes sense – or if it needs a sharper differentiation from Google.

There are some signals emerging: the firm recently signed a deal with LSEG, a financial information group that includes the London Stock Exchange and Refinitiv’s market data business. This puts professional-level financial data directly into the ChatGPT workflow, particularly for LSEG’s existing customers. Deals with airlines like Emirates and Virgin Australia might presage deeper integration of ChatGPT into both inward-facing operations and, ultimately, the consumer-passenger experience. These tactics might yield the right kind of differentiation that would deepen user engagement and, with it, greater opportunities to monetize.

Will this be enough to turn off the "code red" and put the company within shouting distance of the $100 billion scenario for 2027?

We’ll have a clear answer soon enough. I await Garlic with bad (bated) breath.

🔮 The next 24 months in AI

2025-12-03 22:48:39

Over the past week, I published a four-part series examining the forces shaping AI’s trajectory over the next two years. I’ve now brought those pieces together into a single, unified document – partly for convenience, but mainly because these threads belong together.

This is my map of the AI landscape as we head into 2026.

I’ve spent the past decade tracking exponential technologies, and I’ve had the privilege of spending time with the people building these systems, deploying them at scale, and grappling with their consequences — as a peer, investor and advisor.

That vantage point shapes this synthesis of where I believe the critical pressure points are, from physical constraints on compute to the widening gap between AI’s utility and public trust.

A map, of course, is not the territory. The landscape will shift – perhaps dramatically – as new data emerges, as companies report earnings, as grids strain or expand, as the productivity numbers finally come in. I offer this not as prediction but as a framework for paying attention.

Use it to orient yourself as the news unfolds. And when you spot something I’ve missed or gotten wrong, I want to hear about it.

How GPT-5.1 sees this series, a seascape in transition: “dawn breaking, storms gathering, and human endeavour straining to find a navigable course”.

For our members: drop your questions on this piece in the comments or in Slack, and I’ll answer a selection in Friday’s live session at 4.30pm UK time / 11.30am ET, hosted on Substack.




Here’s what I cover:

The Firm

  1. Enterprise adoption – hard but accelerating

  2. The revenue rocket

Physical limitations

  1. Energy constraints limiting scaling

  2. The inference-training trade-off

The economic engine

  1. Capital markets struggle with exponentials

  2. Perhaps GPUs do last six years

  3. Compute capacity is existential for companies

  4. The “productivity clock” rings around 2026

The macro view

  1. Sovereign AI fragments the stack

  2. Utility-trust gap dangerously widens


1. Enterprise adoption is hard and may be accelerating

There is a clear disconnect between the accelerating spend on AI infrastructure and the relatively few enterprises reporting transformative results.

By historical standards, the impact is arriving faster than in previous technology waves, like cloud computing, SaaS or electricity. Close to 90% of surveyed organizations now say they use AI in at least one business function.

But organizational integration is hard because it requires more than just API access. AI is a general-purpose technology that will ultimately transform every knowledge-intensive activity, but only after companies pair the technology with the institutional rewiring needed to metabolise it. This requires significant organizational change, process re-engineering and data governance.

McKinsey shared that some 20% of organizations already report a tangible impact on value creation from genAI. Those companies have done the hard yards – fixing processes, tightening data, building skills – and should find it easier to scale next year. One such company is BNY Mellon. The bank's current efficiency gains follow a multi-year restructuring around a "platforms operating model". Before a single model could be deployed at scale, they had to create an "AI Hub" to standardize data access and digitize core custody workflows. The ROI appeared only after this architectural heavy lifting was completed. The bank now operates over 100 "digital employees" and has 117 AI solutions in production. They've cut unit costs per custody trade by about 5% and per net asset value by 15%. The next 1,000 "digital employees" should be less of a headache.

The best example, though, is JP Morgan, whose boss Jamie Dimon said: “We have shown that for $2 billion of expense, we have about $2 billion of benefit.” This is exactly what we would expect from a productivity J‑curve. With any general‑purpose technology, a small set of early adopters captures gains first, while everyone else is reorienting their processes around the technology. Electricity and information technology followed that pattern; AI is no exception. The difference now is the speed at which the leading edge is moving.

I don’t think this will be a multi‑decadal affair for AI. The rate of successful implementation is higher, and organizations are moving up the learning curve. As we go from “hard” to “less hard” over the next 12-18 months, we should expect an inflection point where value creation rapidly broadens. The plus is that the technology itself will only get better.

Crucially, companies are already spending as if that future value is real. A 2025 survey by Deloitte shows that 85% of organizations increased their AI investment in the past 12 months, and 91% plan to increase it again in the coming year.

One complicating factor in assessing the impact of genAI on firms is the humble mobile phone. Even if their bosses are slow to implement new workflows, employees have already turned to AI – often informally, on personal devices and outside official workflows – which introduces a latent layer of traction inside organisations. This is a confounding factor, and it’s not clear whether this speeds up or slows down enterprise adoption.

On balance, I'd expect it to speed adoption up. In diffusion models inspired by Everett Rogers and popularised by Geoffrey Moore, analysts often treat roughly 15-20% adoption as the point at which a technology begins to cross from early adopters into the early majority. Once a technology reaches that threshold, adoption typically accelerates as the mainstream follows. We could reasonably expect this share to rise towards 50% over the coming years.
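To see what that threshold dynamic looks like, here is a minimal Bass-style diffusion sketch. The coefficients p and q are generic illustrative values, not estimates fitted to enterprise AI adoption.

```python
# Bass diffusion: new adopters per period = (p + q * F) * (1 - F),
# where F is cumulative adoption. p and q are illustrative assumptions.

def bass_adoption(p=0.03, q=0.38, periods=12):
    share, path = 0.0, []
    for _ in range(periods):
        new_adopters = (p + q * share) * (1 - share)
        share += new_adopters
        path.append(round(share, 2))
    return path

print(bass_adoption())
# Cumulative share creeps at first, then the per-period gains keep growing
# well past the ~0.15-0.20 band – the "crossing the chasm" region – before
# tailing off as the market saturates.
```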

However, 2026 will be a critical check-in. If the industry is still relying on the case studies of JP Morgan, ServiceNow and BNY Mellon rather than a slew of positive productivity celebrations from large American companies, diffusion is taking longer than expected. AI would be well off the pace.

2. Revenue is already growing like software

We estimate that the generative AI sector experienced roughly 230% annual revenue growth in 2025, reaching around $60 billion.

That puts this wave on par with the commercialization of cloud, which took only two years to reach $60 billion in revenue. The PC took nine years; the internet, 13 years.

More strikingly, the growth rate is not yet slowing. In our estimates, the last quarter’s annualized revenue growth was about 214%, close to the overall rate for the year. The sources are familiar – cloud, enterprise/API usage and consumer apps – but the fastest‑growing segment by far is API, which we expect to have grown nearly 300% in 2025 (compared to ~140% for apps and ~50% for cloud). Coding tools are already a $3 billion business against $157 billion in developer salaries, a massive efficiency gap. Cursor reportedly hit $1 billion ARR by late 2025, the fastest SaaS scale-up ever, while GitHub Copilot generates hundreds of millions in recurring revenue (see my conversation with GitHub CEO Thomas Dohmke). These tools are converting labor costs into high-margin software revenue as they evolve from autocomplete to autonomous agents. The current market size is just the beginning.

Consumer revenues, meanwhile, are expanding as the user base compounds. Monthly active users of frontier chatbots are driving a classic “ARPU ratchet”: modest price increases, higher attach rates for add-ons, and a growing share of users paying for premium tiers. There are structural reasons to expect this to continue, even before AI feels ubiquitous inside firms.

First, the base of adoption is widening. If 2026 brings a wave of verified productivity wins, this trajectory will steepen. More firms should enjoy meaningful results and the surveys should show unambiguously that 25-30% of firms that started pilots are scaling them. As the remaining majority shift from pilots to production, they will push a far greater workload onto a small number of model providers. Revenue can rise even while most firms are still doing the unglamorous integration work; pilots “chew” tokens, but scaling up chews more.

Second, the workloads themselves are getting heavier. A basic chatbot turn might involve a few hundred tokens, but agentic workflows that plan, load tools and spawn sub‑agents can consume tens of thousands. To the user it still feels like “one question,” but under the surface, the token bill – and therefore the revenues – is often 10-40x higher.
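A back-of-envelope version of that multiplier, using my own assumed numbers rather than anything from a provider:

```python
# Rough arithmetic on why an agentic task burns far more tokens than a chat turn.
# Every number here is an assumption for illustration.

chat_turn_tokens = 500      # a typical single prompt + completion (assumed)
plan_steps = 4              # planning / reflection iterations (assumed)
tool_calls = 6              # each call re-sends accumulated context (assumed)
tokens_per_step = 1_500     # average context + output per step (assumed)

agent_tokens = (plan_steps + tool_calls) * tokens_per_step
print(agent_tokens, agent_tokens / chat_turn_tokens)   # 15000 tokens, 30.0x a chat turn
```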

Of course, this growth in usage will see token bills rise. And companies may increasingly use model-routers to flip workloads to cheaper models (or cheaper hosts) to manage their bills.
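Here is a minimal sketch of what such a router might look like. The tier names, model labels and prices are placeholders, not real models or quotes from any provider.

```python
# A toy cost-aware model router: bulk workloads go to a cheap tier,
# judgment-heavy ones to a frontier tier. Names and prices are placeholders.

ROUTES = {
    "bulk":     {"model": "cheap-flash-model", "usd_per_1m_tokens": 0.30},
    "judgment": {"model": "frontier-model",    "usd_per_1m_tokens": 15.00},
}

def route(task_type: str, est_tokens: int) -> dict:
    """Pick a tier for a workload and estimate its token cost."""
    tier = ROUTES.get(task_type, ROUTES["bulk"])
    cost = est_tokens / 1_000_000 * tier["usd_per_1m_tokens"]
    return {"model": tier["model"], "est_cost_usd": round(cost, 4)}

print(route("bulk", 200_000))      # bulk processing on the cheap tier
print(route("judgment", 20_000))   # editing passes / analysis on the frontier tier
```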

But ultimately, what matters here is the amount consumers and firms are willing to spend on genAI products.

3. The real scaling wall is energy

Energy is the most significant physical constraint on the AI build-out in the US, as I argued in the New York Times back in December. The lead time for new power generation and grid upgrades, often measured in decades, far exceeds the 18-24 months needed to build a data center. The US interconnection queue has a median wait of four to five years for renewable and storage projects to connect to the grid. Some markets report average waiting times as long as 9.2 years.

This is also a problem in Europe. Grid connections face a backlog of seven to ten years in data center hotspots.

For the Chinese, the calculus is different.

One analysis points out that "current forecasts through 2030 suggest that China will only need AI-related power equal to 1-5% of the power it added over the past five years, while for the US that figure is 50-70%."

Because the American grid can't keep up, data center builders are increasingly bypassing it and looking for behind-the-meter solutions, such as gas turbines or off-grid solar. Solar is particularly attractive – some Virginia projects can move from land-use approval to commercial operation in only 18 to 24 months. Compute will increasingly be dictated by the availability of stranded energy and the resilience of local grids rather than by proximity to the end user.

These grid limitations cast doubt on the industry's most ambitious timelines. Last year, some forecasts anticipated 10 GW clusters by 2027. This now appears improbable. In fact, one estimate suggests that, on historic trends, we would be on track for a one-trillion-dollar cluster by 2030. Assuming such a cluster had a decent business case and there was cash available to fund it, energy and other physical constraints would make 2035 a more realistic timeline. If progress remains dependent exclusively on the scaling paradigm and that timeline slips from 2030 to 2035, capability progress will slow. It's worth noting that "slow" is relative here: even if such a delay materialized, progress might still feel fast.

Read more

📈 Data to start your week

2025-12-01 22:59:00

Hi all,

Here's your Monday round-up of the data driving conversations this week, in less than 250 words.

Let’s go!



  1. AI S&P boost ↑ AI-related stocks have accounted for 75% of S&P 500 returns and 79% of earnings growth since the launch of ChatGPT three years ago.

  2. Coding tool revenues ↑ Revenues from AI coding tools such as Cursor and Claude Code have collectively passed $3.1 billion, approximately 2% of all software engineer pay globally.

  3. SWE crown returns ↑ Claude regained first place on the SWE-bench Verified benchmark with Opus 4.5, becoming the first model to score higher than 80%.

  4. Renewables > fossils. Between January and September this year, growth in solar and wind generation outpaced the growth in global electricity demand.

Read more