I’ve been a venture capitalist since 2008. Before that, I was a PM on the Ads team at Google and worked at Appian.

Blog of Tomasz Tunguz

The Org Chart Math Behind AI-Native Speed

2026-03-10 08:00:00

“Since last November, 100% of my code has been written by Claude Code. I have not manually edited a single line, shipping 10 to 30 PRs per day.”

Boris Cherny, creator of Claude Code, ships 20-30 pull requests per day. Major code changes, not typo fixes. He runs five parallel AI instances, each on a separate branch.1

Compare that to a traditional engineer: 3 PRs per week.2 Cherny isn’t 10% more productive. He’s 30x more productive.
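The arithmetic behind that multiple is simple. A quick sketch, using illustrative numbers taken from the figures above:

```python
# Rough productivity multiple: AI-native vs traditional PR throughput.
TRADITIONAL_PRS_PER_WEEK = 3  # median developer (footnote 2)
AI_NATIVE_PRS_PER_DAY = 20    # low end of the 20-30 range
WORKDAYS_PER_WEEK = 5

ai_prs_per_week = AI_NATIVE_PRS_PER_DAY * WORKDAYS_PER_WEEK
multiple = ai_prs_per_week / TRADITIONAL_PRS_PER_WEEK
print(f"~{multiple:.0f}x")  # ~33x at the low end of the range
```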

That productivity gap compounds at the company level. Anthropic generates ~$5 million per employee.3 Cursor, $3.3 million. Midjourney, $2 million.4 Traditional SaaS considers $200-300k strong. A 10-20x difference.

One explanation: communication overhead. The math follows Metcalfe’s Law.5 Each new team member adds n-1 new connections. Coordination drag doesn’t grow linearly. It explodes.

Now consider what AI does to this equation.

A traditional 150-person organization runs four layers deep. The org chart creates 11,175 potential communication channels. Meetings multiply. Alignment decays.

An AI-enabled team producing equivalent output might need 30 people. Communication channels drop to 435. A 96% reduction.
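The channel counts follow from the handshake formula, n(n-1)/2. A quick check of the figures above:

```python
def channels(n: int) -> int:
    """Potential pairwise communication channels in an n-person org."""
    return n * (n - 1) // 2

traditional = channels(150)              # 11,175 channels
ai_enabled = channels(30)                # 435 channels
reduction = 1 - ai_enabled / traditional
print(traditional, ai_enabled, f"{reduction:.0%}")  # 11175 435 96%
```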

[Chart: Communication channels in a 150-person org vs a 30-person AI-enabled org]

This is one reason AI-native startups are pulling ahead, and why building AI companies feels fun. The advantage comes from organizational structure. Fewer humans, fewer channels, faster iteration, compounding speed.6

R&D adopts this fastest. AI writes the code. Human communication becomes the bottleneck. The span-of-control debate shifts from “how many people can one manager oversee?” to “how many AI agents can one human orchestrate?”

Small teams have always paid less coordination tax. AI cuts it further.


  1. Cherny, Boris. Claude Code creator landed 259 PRs in 30 days, Hacker News, 2025. ↩︎

  2. Seporaitis, Julius. What Can 75,000 Pull Requests Tell?, 2021. Median developer opens 3 PRs per week; consistent with Google’s internal data. ↩︎

  3. Estimated from Anthropic’s ~$20B revenue run rate (Bloomberg, March 2026) divided by ~4,300 employees (LinkedIn). ↩︎

  4. Dealroom estimates. AI startup revenue per employee: Cursor $3.3M, Midjourney $2M, OpenAI $1.5M. ↩︎

  5. Metcalfe’s Law, Wikipedia. ↩︎

  6. How to start a Lean, AI-Native Startup in 2025, Henry the 9th, 2025. ↩︎

Is This Tomasz's Agent?

2026-03-08 08:00:00

“Hi, Tomasz or Tomasz’s agent.”

I’ve started receiving emails that begin this way. A byproduct, I suppose, of having written so much about AI. People now assume my inbox is monitored by robots.

Which raises an odd question: what does it mean to write to someone when you expect a machine to answer?

Gmail suggests my reply before I’ve thought it. “Sounds good!” “Thanks for sending!” “Let’s circle back next week.” The machine knows what I’d say. Sometimes I click it. Sometimes I wonder if the person on the other end can tell.

Every customer support call is now with an AI agent. The voice sounds real. The agent is infinitely knowledgeable. The responses are fast. Does it matter that it’s not a person?

A friend sends voice memos instead of texts now. “So you know it’s actually me,” he said. But how do you know? ElevenLabs can clone a voice from thirty seconds of audio. The ums, the pauses, the little laugh—all reproducible.

Does it matter?

But maybe the people writing “Hi, Tomasz or Tomasz’s agent” have it right. They’re not being rude. They’re being realistic. They’ve adapted to a world where the answer might come from either side of the curtain, & they’ve decided not to care which.

The polite thing now is to assume the robot. The intimate thing is to be surprised when it’s not.

The Sword of Damocles in Software

2026-03-07 08:00:00

GitHub Copilot pioneered AI coding assistance. First to market. 20 million users. Then Claude Code & OpenAI Codex launched in mid-2025. Within six months, Copilot’s daily installs peaked & declined while competitors surged past 100,000 combined.1

[Chart: Daily installs of AI coding assistants in VS Code, showing GitHub Copilot declining as Claude Code and OpenAI Codex rise]

The sword didn’t fall on a laggard. It cut the early leader. If Microsoft can lose share in six months, no one is safe.

I analyzed 374 quarterly NDR observations from 25 public software companies. For years, the decline looked gradual. Net dollar retention fell from 125% in 2022 to 112% in 2025. Quarter by quarter. No cliff when ChatGPT launched. No acceleration when enterprises adopted Copilot.

Then came 2026.

[Chart: Software industry NDR, 25th/50th/75th percentiles, 2022-2026, with a sharp 2026 decline]

The 25th percentile fell from 106% to 101% in a single quarter, now touching the breakeven line.2 The weakest companies are bleeding first. Zoom sits at 98%. Asana at 96%. The bottom quartile is now contracting.

The companies in the bottom quartile face different threats, but they share one trait: products simple enough to replace. Bill.com (94% NDR) serves SMBs depressed by macro conditions.3 Zoom (98%) faces near-free alternatives in Teams & Google Meet. Asana (96%) offers task workflows that competitors & AI agents can replicate. With 96% NDR, they lose 4% of existing revenue annually. Growing 9% requires 13%+ new customer acquisition just to tread water.4
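The tread-water math works like this, a sketch using Asana’s reported figures:

```python
# How much new-customer revenue a sub-100% NDR company needs just to grow.
ndr = 0.96            # 96% net dollar retention: existing base shrinks 4%/year
growth_target = 0.09  # 9% reported revenue growth

# Starting from a base of 1.0, retained revenue is `ndr`; new bookings
# must cover both the 4% churned and the 9% growth target.
new_bookings_needed = (1 + growth_target) - ndr
print(f"{new_bookings_needed:.0%}")  # 13% of the base in new revenue
```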

Macro pressure. Commoditization. Competition. AI. Each blade cuts differently. The bottom quartile will see accelerating losses. Some may tip into outright contraction. The sword of Damocles hangs by a single horsehair. For simpler products in competitive categories, that horsehair is fraying.


  1. Source: Bloomberry.com, VS Code extension install data tracked by AznHisoka. ↩︎

  2. The decline is statistically significant (p < 0.0001, R² = 0.74). ↩︎

  3. SMB bankruptcies hit a 15-year high in 2025, driven by tariffs and high interest rates. Sources: PYMNTS, Bloomberg. ↩︎

  4. Asana Q4 FY2026 Results : 9% revenue growth, 96% NDR. ↩︎

Data Center Intelligence at the Price of a Laptop

2026-03-05 08:00:00

I burned 84 million tokens on February 28th. Researching companies, drafting memos, running agents.

[Chart: Token usage dashboard showing 84.42M tokens consumed on Feb 28, 2026]

That’s running Kimi K2.5, a serverless model via API. At Claude1 or OpenAI2 rates — roughly $9 per million tokens blended — equivalent usage would cost $756 for a single day’s work. My peak days hit 80 million tokens. My average days run 20 million. Cloud inference at frontier-model pricing adds up fast.

This week, Alibaba released Qwen3.5-9B3, an open-source model that matches Claude Opus 4.1 from December 2025. It runs locally on 12GB of RAM. Three months ago, this capability required a data center. Now it requires a power outlet.

[Chart: GPQA Diamond high-water mark, frontier models vs Qwen3.5-9B]

A $5,000 laptop — a MacBook Pro with enough memory to run Qwen locally — pays for itself after 556 million tokens. At my average of 20 million tokens per day, that’s about four weeks.

After payback, the marginal cost drops to electricity.
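The payback arithmetic, sketched with the blended $9-per-million rate and the usage figures stated above:

```python
# Buy-vs-rent breakeven for local inference.
LAPTOP_COST = 5_000.0            # USD
API_RATE_PER_TOKEN = 9.0 / 1e6   # blended ~$9 per million tokens
AVG_DAILY_TOKENS = 20_000_000    # average day from the post

breakeven_tokens = LAPTOP_COST / API_RATE_PER_TOKEN
payback_days = breakeven_tokens / AVG_DAILY_TOKENS
print(f"{breakeven_tokens / 1e6:.0f}M tokens, ~{payback_days:.0f} days")
```

On an 80-million-token peak day, the same math compresses payback to roughly a week.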

It isn’t an intelligence compromise. Reasoning, coding, agentic workflows, document processing, instruction following: the 9B model matches December’s frontier across the board.

[Chart: Aggregate benchmark comparison, Qwen3.5-9B vs GPT-5 and Claude Opus 4.1 across enterprise benchmarks]

What changes when frontier intelligence runs locally? Everything I send to cloud APIs today — drafting emails, researching companies, writing code, analyzing documents — stays on my machine. No API logs. No third-party retention. No outages. No rate limits.

The tradeoff is parallelization. Cloud APIs handle thousands of concurrent requests. A laptop runs one inference at a time. For simple tasks — summarization, drafting, Q&A — that’s fine.

Queue them up. Let them run overnight. For complex agentic workflows that spawn dozens of parallel threads, local inference may not be worth the wait. The economics favor depth over breadth: fewer tasks, run longer, run cheaper.

Three months from data center to laptop. The buy-vs-rent math just changed.

Not Prompts, Blueprints

2026-03-04 08:00:00

I hate to micromanage & I’ve been micromanaging AI.

A few months ago, I’d use Claude for a familiar workflow: capturing notes from a meeting, drafting a follow-up email, updating the CRM, writing the investment memo. Micromanagement at 10x speed. The agent would finish a step, then wait. I’d scan the output, type the next instruction, wait again. Prompt, response, prompt, response. I was the bottleneck in my own system.

A year ago, this was necessary. The models couldn’t hold a complex task in their heads. Now they can.

But this leverage requires planning. Now I sketch the workflow before I touch the machine. I anticipate the decision branches: what if the company isn’t in the CRM? What if the website is down or the call transcript isn’t available? I flag the gaps before the agent encounters them.

This morning’s notebook page:

[Photo: Handwritten workflow blueprint on graph paper showing parallel agent tasks with decision branches]

I took a photo & shared it with Claude & walked away. Workflows as images work beautifully.

The agents ran in the background. When I returned, the memo sat in my inbox, formatted, sourced, ready to send.

Not prompts. Blueprints.

Would You Buy Generic AI?

2026-03-02 08:00:00

Kirkland ibuprofen is the same molecule as Advil. Same dosage, same FDA requirements, same therapeutic effect.1 It costs 80% less.

AI is having its generic drug moment. DeepSeek V3 matches GPT-5.2 on most benchmarks.2 It costs 90% less. OpenAI & Anthropic generated $22 billion in 2025.3 Chinese AI labs generated $1.8 billion.4 The ratio: 12:1.

[Chart: AI lab revenue 2025, US vs China]

Pricing explains the gap. Chinese AI API prices collapsed 90% in 2024.5 US frontier models average $3.38 per million input tokens. Chinese models average $0.48.

| Company | Model | Input ($/1M tokens) | Output ($/1M tokens) |
| --- | --- | --- | --- |
| Anthropic | Claude Opus 4.6 | 5.00 | 25.00 |
| OpenAI | GPT-5.2 | 1.75 | 14.00 |
| Zhipu | GLM-5 | 1.00 | 3.20 |
| Minimax | M2.5 | 0.30 | 1.20 |
| DeepSeek | V3 | 0.14 | 0.28 |

OpenAI processes roughly 8.6 trillion tokens per day.6 Chinese labs likely match or exceed this volume. The 12:1 revenue gap isn’t usage. It’s price.

Three forces drive Chinese prices down.

First, distillation commoditizes capability. Anthropic accused DeepSeek, Minimax & Moonshot AI of conducting “industrial-scale campaigns” to extract knowledge from Claude.7 OpenAI made similar accusations to Congress.8

Second, hyperscalers subsidize AI to win cloud customers. Alibaba Cloud cut LLM pricing by up to 97%.9 Baidu, ByteDance & Tencent spent $1.1B on AI subsidies during Chinese New Year 2026 alone.10

Third, DeepSeek set the floor. They trained V3 for $6 million versus OpenAI’s $100 million+ for GPT-4,11 priced it at $0.14 per million input tokens, & hit $220 million ARR with 122 employees.

In the US, Chinese models also price at a discount. Together AI charges $1.25 per million input tokens for DeepSeek V3.12 DeepInfra offers $0.21 per million.13 DeepSeek’s own API charges $0.14, 12x less than GPT-5.2.14

[Chart: DeepSeek V3 pricing by provider vs OpenAI GPT-5.2]

Pharma companies spend billions developing a molecule, then enjoy 20 years of patent protection to recoup R&D costs before generics flood the market. AI follows the same pattern - massive R&D costs upfront, then commoditization. But the timeline is compressed.

In pharma, the generic window opens after two decades. In AI, it opens in weeks. DeepSeek V3 costs $0.14 per million tokens. GPT-5.2 costs $1.75. Same capability. Different label. The 90% discount isn’t coming. It’s here.
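The discount arithmetic, using the input prices from the table above (the exact figure rounds to the post’s ~90%):

```python
# DeepSeek V3 vs GPT-5.2 input-token pricing.
GPT_5_2 = 1.75      # $ per 1M input tokens
DEEPSEEK_V3 = 0.14  # $ per 1M input tokens

discount = 1 - DEEPSEEK_V3 / GPT_5_2
multiple = GPT_5_2 / DEEPSEEK_V3
print(f"{discount:.0%} cheaper, {multiple:.1f}x")  # 92% cheaper, 12.5x
```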

The question : how to protect an asset that takes hundreds of millions to develop when it can be copied in a month?


Sources