Tomasz Tunguz

I’ve been a venture capitalist since 2008. Before that, I was a PM on the Ads team at Google and worked at Appian.

The Golden Age of AI Applications

2026-06-15 08:00:00

We’re entering the golden age of AI applications. Three recent developments confirm it.

The Fable retraction shows regulatory risk. Nadella’s thesis shows strategic consensus. Salesforce’s acquisition shows market validation.

First, the US government shut down Fable access[1] & the software ecosystem roared with many responses : Bring it back! Open-source & local models have become essential! Don’t rely on a single model!

Second, Satya Nadella published an AI ecosystem thesis.[2] He argued that for a healthy ecosystem, the moat can’t be the model. Instead, human expertise & the system around the model (the harness[3]) must be the moat.

Third, Salesforce announced the acquisition of Fin, formerly Intercom, for $3.6b.[4] The founders & management team repositioned the company through the AI upheaval. Fin used open-source models to maximize price/performance.

Building AI applications is hard for different reasons than SaaS. It’s not a lack of engineers, or the challenges of uptime, or the demands of faster releases.

AI applications present three new disciplines to master : picking the right models, developing the hill-climbing loop, & evaluating the performance of the system for each company. All three answer one question : how much intelligence can I squeeze out of my token budget?

Models are tricky. Budgets prevent defaulting everyone to state-of-the-art. Each of the legion of other models has a personality. Kimi K2.6 is fast & a great creative writer but less precise. Qwen 3.6 27b is a small model with legendary performance, but it’s a bit of a donkey. It stops suddenly in the middle of a toolchain call & requires a good prodding to push on. GLM 5.1 is an excellent coding model, but a plodder.

Loops, the critical problem-definition exercise of this era, are hard to design. Systems design is an entire discipline (see Donella Meadows’ excellent work on it[5]). What is the best way to define a loop so an agentic system improves? This field is novel & challenging because the models & infrastructure move quickly.
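One common reading of a hill-climbing loop is : propose a small change, run the eval, keep the change only if the score improves. The sketch below is a generic version of that idea, not a description of any particular vendor's loop. The names `hill_climb`, `toy_score`, & `toy_mutate` are hypothetical, & the toy "configuration" is a single integer standing in for a prompt or harness setting.

```python
import random

def hill_climb(initial, mutate, score, steps=200, seed=0):
    """Keep a candidate only when it scores strictly higher than the current best."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    best, best_score = initial, score(initial)
    history = [best_score]
    for _ in range(steps):
        candidate = mutate(best, rng)        # propose a small change
        candidate_score = score(candidate)   # run the eval on the candidate
        if candidate_score > best_score:     # accept improvements only
            best, best_score = candidate, candidate_score
        history.append(best_score)
    return best, best_score, history

# Toy stand-ins (hypothetical): the eval peaks when the setting equals 7.
def toy_score(x):
    return -(x - 7) ** 2

def toy_mutate(x, rng):
    return x + rng.choice([-1, 1])

best, best_score, history = hill_climb(0, toy_mutate, toy_score)
print(best, best_score)
```

In a real system the expensive part is `score`, which is why the eval discipline & the loop discipline are hard to separate.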

Evaluating the performance of model + loop is ongoing labor. Most companies won’t want to staff a team for each piece of workflow software they run. AI systems are complex, finicky engines.

The nuances of tuning the carburetors & the timing belts of these complex beasts are tasks better assigned to a few vendors, who can deliver maximum intelligence per dollar[6] & amortize the costs across a broader population.

The companies that master these three disciplines will own the golden age.



  1. Anthropic Pulls Fable 5 After U.S. Government Directive — Fortune, June 13, 2026. ↩︎

  2. A Frontier Without an Ecosystem Is Not Stable — Satya Nadella, June 14, 2026. ↩︎

  3. Harnessing AI — tomtunguz.com. ↩︎

  4. Salesforce Signs Definitive Agreement to Acquire Fin — Salesforce, June 15, 2026. ↩︎

  5. 10 Best Books of 2025 — Donella Meadows’ Thinking in Systems. ↩︎

  6. Tokens Per Result — tomtunguz.com. ↩︎

A CEO's Cost of Capital Advantage

2026-06-12 08:00:00

SpaceX IPOs today. One hallmark of the largest IPO in history : Elon Musk’s astoundingly low cost of capital. Despite raising 25x more than the typical founder, Musk retained ownership in the top decile.

Musk has raised 25x more than most & kept top decile ownership

Some founders raise $2m for an idea. Others raise $15m. Yet others raise hundreds of millions.

Inverting[1], we can say some people have a high cost of capital & others a low cost of capital.

At inception, cost of capital is purely personal. Founders & an idea. No business exists yet to evaluate. Over time, the combination of the team & the business’s performance dictates cost of capital.

Early wins lower the cost of the next raise. Cheaper capital funds bigger bets. Bigger bets produce bigger wins. Musk’s trajectory from Zip2 to PayPal to Tesla to SpaceX is the flywheel in motion.

The flywheel attracts capital from everywhere. Tesla’s retail ownership is 7x higher than the S&P 500 average[2]. SpaceX allocated a large portion of the offering to retail as well.

Musk raised more capital than nearly any founder in history & retained more ownership than most. His personal cost of capital made that possible.


  1. “Invert, always invert” is a mental model popularized by Charlie Munger, originating from 19th-century mathematician Carl Gustav Jacob Jacobi. ↩︎

  2. https://www.webull.com/news/13249038316839936 ↩︎

The AI Glass Ceiling

2026-06-10 08:00:00

We’ve reached the upper bound of AI.

Not in the sense that performance won’t improve. On the contrary, AI will improve AI.

But Anthropic’s Fable release has imposed a glass ceiling. How do you release the most powerful model in the world to everyone without destroying kingdoms?

Strong guardrails. It’s easy to trigger a gentle reminder of verboten topics : ask for a description of a plant cell, ask for a detailed description of a modern large language model, or ask a question about software security.

But if we remain within the playground, Fable is the most powerful AI yet. Stripe compressed months of engineering into days : a 50-million-line Ruby codebase migrated in a single day, a refactor across tens of thousands of lines completed in 45 minutes.[1]

In my testing, Fable doubled inference performance on local models, besting the efforts of other state-of-the-art systems. It added 10-15 percentage points on key benchmarks, compared to typical improvements of 2 percentage points. Fable represents a genuine leap.[2]

We’re still understanding the best ways of using AI : techniques change every day. RAG, Plan/Act, Ralph Wiggum loops, /goals, structured prompting, MCP. How many fashions have we seen when the seasons of AI trends are measured in days?

Systems this powerful need to be phased in to allow the backbones of technology, banking, & energy to harden themselves in anticipation of increasingly powerful attacks.

The glass ceiling exists. It was inevitable for stability. It will rise over time, but for now there’s a vast area underneath it.

The Substitution Wave in AI

2026-06-07 08:00:00

Three forces are reshaping the AI cost structure :

  1. Foundation labs are moving up the stack into applications,[1][2]
  2. Frontier model prices keep rising for the smartest models,[3]
  3. Open-source models have crossed the good enough threshold for most use cases.[4][5]

The natural response from AI buyers is substitution.

Coinbase[6] :

At Coinbase we’re working hot on routing prompts to cheaper models where appropriate, & in some cases have been able to keep costs roughly flat, while token usage continues to grow exponentially.

Lindy[7] :

Pulled the trigger today & switched 100% of Lindy traffic to DeepSeek v4, churning from Anthropic models. Saves us millions of $ & we’re actually seeing an increase in performance on many core use cases. Transformative for the business.

Harvey[8] :

On a 100-task slice of our Legal Agent Benchmark (LAB), SFT moved Kimi 2.6’s all-pass rate from 11% to 15%, beating Opus’ 14%. But the cost gap was even more striking : $84 vs $954 across the same 100 tasks, or ~11x cheaper.

Cursor went further. They post-trained Kimi K2.5 into their own production model, Composer.[9]

Composer 2.5 is exceptionally intelligent & up to 10x more efficient than similarly capable models.

Coinbase’s quote shows where the savings go : costs flat, tokens exponential. Buyers don’t pocket the discount — they spend it on more intelligence.

Closed models are getting more expensive at the frontier; open models are getting cheaper at parity. The choice is which slope you want under your unit economics.
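Harvey's figures quoted above can be restated as cost per passing task, which is closer to how a buyer thinks. The sketch below assumes the all-pass rates on the 100-task slice mean 15 & 14 passing tasks, & that $84 & $954 are totals across those 100 tasks. The function & variable names are illustrative only.

```python
# Figures from the Harvey quote; pass counts assume a 100-task slice.
kimi_sft = {"passed": 15, "total_cost": 84}
opus = {"passed": 14, "total_cost": 954}

def cost_per_pass(model):
    """Dollars spent per task that fully passed."""
    return model["total_cost"] / model["passed"]

total_cost_ratio = opus["total_cost"] / kimi_sft["total_cost"]
per_pass_ratio = cost_per_pass(opus) / cost_per_pass(kimi_sft)

print(round(cost_per_pass(kimi_sft), 2))  # 5.6
print(round(cost_per_pass(opus), 2))      # 68.14
print(round(total_cost_ratio, 1))         # 11.4
print(round(per_pass_ratio, 1))           # 12.2
```

Measured per passing task rather than per run, the gap is slightly wider than the quoted ~11x, because the cheaper model also passed one more task.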

Ramp cost curve framing for AI buyers and app purveyors

The Minimill of AI

2026-06-05 08:00:00

A laptop on my desk now handles 78% of my AI work, with the rest sent to the cloud. The shift came out of my skill distillation work.

Here’s how it works.

I create tasks in Asana. An agent sees the task (scheduling, email triage, research, a CRM update) & classifies it as easy or hard. If it’s straightforward, a local model on my Mac handles it in seconds. If it’s complex, the same model routes it to a cloud model.

Local router replacing a single queue with a two-lane scheduler
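A minimal sketch of the two-lane routing decision described above. In the post a local model makes the easy/hard call; here a keyword heuristic stands in for it, so `keyword_classifier`, `HARD_HINTS`, & the sample tasks are all assumptions for illustration, not the actual setup.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Task:
    title: str
    notes: str = ""

# Hypothetical heuristic: stands in for the local model's own easy/hard judgment.
HARD_HINTS = ("research", "analysis", "compare")

def keyword_classifier(task: Task) -> str:
    text = f"{task.title} {task.notes}".lower()
    return "hard" if any(hint in text for hint in HARD_HINTS) else "easy"

def route(task: Task, classify: Callable[[Task], str] = keyword_classifier) -> str:
    """Return the lane for a task: 'local' for easy work, 'cloud' for hard work."""
    return "local" if classify(task) == "easy" else "cloud"

def local_share(tasks, classify=keyword_classifier) -> float:
    """Fraction of routing decisions that stayed on the local model."""
    lanes = [route(t, classify) for t in tasks]
    return lanes.count("local") / len(lanes)

tasks = [
    Task("Schedule intro call"),
    Task("Triage inbox"),
    Task("Update CRM record"),
    Task("Research competitor pricing"),
]
print(local_share(tasks))  # 0.75
```

Passing the classifier in as a function keeps the router unchanged when the heuristic is swapped for a real model call.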

Across the last seven days, the daily share handled locally peaked at 88%.

Daily share of model route decisions handled locally, May 29 to June 4

As the workload grew, the two-lane design paid off. Throughput jumped about 25%, average task duration fell from 47 seconds to 19, & queue age dropped from 73 seconds to four. Nothing about the work changed. Small, fast tasks simply stopped waiting behind big, slow ones.

The task factory that uses distilled skills is now humming along with 25% more throughput, queue age down 94%, & a much more responsive system. For now, the cloud handles the hard fifth. The Mac handles the rest.
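The effect of small tasks no longer waiting behind big ones can be shown with a toy queue. The workload & the 10-second cutoff below are made up for illustration. Each lane is assumed to have its own worker, as with a local model & a cloud model, so part of the gain comes from the second worker & part from the separation itself. The last two lines check the percentage drops against the figures reported above.

```python
def queue_waits(durations):
    """Wait before start for each task in one FIFO lane with a single worker."""
    waits, clock = [], 0
    for d in durations:
        waits.append(clock)  # a task waits for everything queued ahead of it
        clock += d
    return waits

def mean(values):
    return sum(values) / len(values)

def pct_drop(before, after):
    return (before - after) / before * 100

# Hypothetical workload, in seconds: two slow tasks mixed in with four fast ones.
durations = [60, 5, 5, 5, 60, 5]
THRESHOLD = 10  # assumed cutoff between the fast lane and the slow lane

single_lane = mean(queue_waits(durations))
fast = [d for d in durations if d <= THRESHOLD]
slow = [d for d in durations if d > THRESHOLD]
two_lane = mean(queue_waits(fast) + queue_waits(slow))

print(single_lane, two_lane)       # 67.5 15.0
print(round(pct_drop(73, 4), 1))   # 94.5, queue age from 73 seconds to 4
print(round(pct_drop(47, 19), 1))  # 59.6, average task duration from 47 seconds to 19
```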

It’s the minimill of agentic work. Nucor’s minimills[1] started small, capital-light, & close to demand; within a generation they outflanked the integrated steel giants.

Every laptop, phone, & edge device with enough memory to host a distilled model becomes its own minimill : routing locally, paying cloud rates only for the hard fifth. Tens of millions of these will proliferate inside companies in the next few years, each one quietly absorbing much of the work that today shows up on a hyperscaler invoice.


  1. Nucor began in the 1960s by melting scrap steel in electric-arc furnaces rather than smelting iron ore in giant integrated blast-furnace mills. Each minimill was a fraction of the size & cost of an integrated plant, sited near regional demand, & ran on flexible, lower-cost labor. The integrated mills dismissed minimills as fit only for low-grade products like rebar. Over the next thirty years Nucor moved up-market into sheet steel & structural beams, & by 2014 had become the largest steel producer in the United States, while most of the integrated giants (Bethlehem, LTV, National) had gone bankrupt. Clayton Christensen used the story as the canonical example of disruptive innovation in The Innovator’s Dilemma. ↩︎

Intelligence Per Dollar

2026-06-03 08:00:00

[Screenshot]

Yesterday Microsoft added a new metric to a model release card, one that will likely become a standard.[1]

Average token usage.

In the first row, the Microsoft model hits 71.6 on SWE-Bench Verified using about a third of the tokens Claude Haiku 4.5 burns.

Benchmarks are now measured on two dimensions : overall performance & the cost to achieve that intelligence.

This is yet another sign that the era of subsidies[2], tokenmaxxing[3], & all-out performance for many use cases is over.

Even the most valuable companies in the world cannot afford state-of-the-art intelligence for every conceivable use case.[4] Uber capped employee AI spending after blowing through its budget in four months.[5] Salesforce is spending $300M on Anthropic tokens & has frozen engineering hires.[6]

This new dual benchmark answers the buyer’s only question : what is my intelligence per dollar?

[Screenshot]

Artificial Analysis already benchmarks this.[7] GPT 5.5 & Claude Opus 4.8 land within a point of each other on the Intelligence Index, around 60. Running the index costs $3,357 on GPT 5.5 & $4,685 on Opus 4.8. Same answer, 40% more expensive.
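The arithmetic behind that comparison, as a short sketch. The two costs are the figures quoted above. The score of 60 for both models is an assumption taken from "around 60" & "within a point of each other", so the points-per-dollar values are approximate. Function names are illustrative.

```python
def cost_premium_pct(cheaper, pricier):
    """How much more the pricier run costs, as a percentage of the cheaper one."""
    return (pricier / cheaper - 1) * 100

def points_per_1k_dollars(score, cost):
    """Index points bought per $1,000 spent running the benchmark."""
    return score / cost * 1000

ASSUMED_SCORE = 60                # assumption: both models land around 60
gpt_cost, opus_cost = 3357, 4685  # dollars to run the index, as quoted above

print(round(cost_premium_pct(gpt_cost, opus_cost), 1))            # 39.6
print(round(points_per_1k_dollars(ASSUMED_SCORE, gpt_cost), 1))   # 17.9
print(round(points_per_1k_dollars(ASSUMED_SCORE, opus_cost), 1))  # 12.8
```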

Model companies must now compete on both dimensions. The application layer will compete one level up, on dollars per outcome : what a closed ticket, a shipped PR, or a resolved support case actually costs.

Every layer in the stack now has to price the same way the customer thinks : per result, not per token.



  1. Introducing MAI-Code-1-Flash — Microsoft announces a new coding model with average token usage on the release card. ↩︎

  2. The Unsustainable Subsidy — The era of AI subsidies is ending. ↩︎

  3. Tokenmaxxing — Models that game benchmarks with extra tokens are losing their edge. ↩︎

  4. Microsoft cancels Claude Code licenses, shifting developers to GitHub Copilot CLI — Microsoft cancelled Claude Code licenses across its Experiences and Devices division (Windows, Microsoft 365, Outlook, Teams, Surface) after engineering usage outran budgets. ↩︎

  5. Uber caps employee AI spending after blowing through budget in 4 months. ↩︎

  6. Salesforce Spends $300M on AI, Freezes Engineering Hires. ↩︎

  7. AI Model & API Providers Analysis — Independent analysis of AI model costs. ↩︎