2026-03-30 08:00:00
Jevons & Veblen walk into a data center.
The dominant motif around AI has been Jevons Paradox1: the cheaper a product becomes, the more it is consumed.
Token prices dropped 10-20x over the past 18 months & demand exploded in response.
Anthropic surged past $19 billion in run-rate last month, up from $9 billion at the end of 2025.2 OpenAI topped $25 billion in annualized revenue in February, a 17% increase in two months.3
We know GPUs, CPUs, & memory are already in short supply.4 Rumors of next-generation models, including Claude Mythos, suggest pricing that moves in the opposite direction.
| Model | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|
| Claude Opus 4.6 | $5 | $25 |
| GPT-4.5 | $2 | $8 |
| Claude Mythos (rumored) | $15-25 | $75-150 |
This weekend, an accidental data leak revealed Anthropic’s secretive Mythos model.5 A leaked blog post described it as:
“A step change” in capability, “dramatically higher scores on tests of software coding, academic reasoning, and cybersecurity.”6
Anthropic stated the model is “very expensive to serve & will be very expensive for customers.”7 Some speculate inference pricing will run 5-6x higher than existing models.
If these rumors hold, the most powerful intelligence would trade at a stiff premium. Jevons Paradox would give way to Veblen goods.8
Veblen goods are those whose demand increases with price: front-row concert tickets that cost 10x more despite worse acoustics. Nike Jordans that retail for $110 and resell for $500+. Ivy League tuition where selectivity is the value proposition.
Could AI follow this dynamic for competitive advantage? The company with capital to access the most powerful model wins. How much is that worth?
Consider a Series A founder building an AI coding assistant. Today, she pays $25 per million output tokens for Opus 4.6. Her burn rate assumes that price. If Mythos launches at $150 per million output tokens, a 6x increase, she faces a choice: raise prices, raise capital, or watch her AI-native competitor ship features she can’t match.
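The back-of-the-envelope math, sketched with a hypothetical monthly volume (the per-token prices come from the table above; the 2B-tokens-per-month figure is an assumption):

```python
def monthly_inference_cost(output_tokens_millions: float, price_per_million: float) -> float:
    """Monthly spend on output tokens at a given per-million-token price."""
    return output_tokens_millions * price_per_million

# Assumed volume: 2B output tokens/month for a mid-sized coding assistant.
monthly_volume = 2_000  # in millions of tokens

opus_cost = monthly_inference_cost(monthly_volume, 25)     # Opus 4.6 output price
mythos_cost = monthly_inference_cost(monthly_volume, 150)  # rumored Mythos high end

print(f"Opus 4.6: ${opus_cost:,.0f}/mo, Mythos: ${mythos_cost:,.0f}/mo, "
      f"{mythos_cost / opus_cost:.0f}x")
```

At this assumed volume, the same workload jumps from $50,000 to $300,000 a month. The multiple, not the absolute number, is the point.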
The token-maxxing era ends. Companies will stop optimizing for cheap inference. They’ll deploy capital aggressively, both GPUs & dollars, to maximize capability rather than minimize cost.
Balance sheets become a moat. The most profitable companies or those who can raise capital cheaply will have the biggest advantage in their industries.
For companies that cannot respond quickly enough or afford the most sophisticated AI, the gap widens. If AI-native companies can build 10x faster with Mythos-class models while competitors are stuck on Opus 4.6, valuations will diverge further.
Jevons & Veblen walked into a data center. We don’t yet know who walks out.
“Jevons paradox”, Wikipedia. ↩︎
“Anthropic Nears $20 Billion Revenue Run Rate”, Bloomberg, March 2026. ↩︎
“OpenAI Tops $25 Billion in Annualized Revenue”, The Information, February 2026. ↩︎
“What If We Run Out of Capacity?”, tomtunguz.com. ↩︎
“Anthropic data leak reveals powerful, secret Mythos AI model”, Fortune, March 2026. ↩︎
“Anthropic leak reveals new model Claude Mythos”, The Decoder, March 2026. ↩︎
“Claude Mythos (Opus 5) Leaked: What We Know So Far”, WaveSpeed AI, March 2026. ↩︎
“Veblen good”, Wikipedia. ↩︎
2026-03-26 08:00:00
Would you choose one software over another because it has a proprietary model with better performance?
Two companies shipped custom AI models today (three in a week counting Cursor!1), raising that question. Intercom launched Apex 1.0, a model for answering customer support tickets.2 Chroma released Context-1, a model for multi-hop agent search.3
Apex 1.0 beats GPT-5.4 & Claude Opus 4.5 on customer service tasks.2 Context-1 scores 97% on agent search benchmarks.3 One Intercom gaming customer saw resolution rates jump from 68% to 75%.2
History suggests4 these gains may be temporary. As general-purpose models improve, today’s specialized advantage erodes. But with GPU shortages pushing inference costs up, this may be the moment for more efficient, built-for-purpose models.
Intercom built Apex to differentiate in a competitive market. Chroma’s bet is different. Context-1 is open-source under Apache 2.0.3 Anyone can use it. The model isn’t the product; it’s marketing rather than sales: distribution & brand building for their vector database infrastructure.
Two philosophies. Proprietary model as differentiation versus open-source model as adoption mechanism.
“As features become ~free to build, the technology factors that will differentiate the players will be the AI under the hood. If you’re using the same general-purpose off-the-shelf model as everyone else, you have no durable differentiation.” - Eoghan McCabe2
AI models offered by software vendors have become a new axis upon which to compete. In the marketing arena, models drive attention & distribution. At the bottom of the sales funnel, they serve as competitive differentiators in performance.
“Cursor, Kimi, & the Open-Source AI Imperative”, tomtunguz.com, March 2026. ↩︎
Eoghan McCabe, “The age of vertical models is here”, X, March 26, 2026. ↩︎ ↩︎ ↩︎ ↩︎
“Chroma Launches Context-1, Efficient Open-Source AI for Agent Search”, PR Newswire, March 26, 2026. ↩︎ ↩︎ ↩︎
Rich Sutton, “The Bitter Lesson”, March 2019. ↩︎
2026-03-24 08:00:00
The SaaS era was defined by unbundling: find a workflow, optimize it, own it. Salesforce chose sales automation. Slack chose chat. Dropbox chose file sharing. Point solutions won by perfecting single workflows. The playbook: own one pain point, expand from there.
AI is moving faster than anyone predicted. When models change every 42 days, buyers can’t assemble a best-of-breed stack. They want a platform they can trust for three to five years.
Legal → Professional Services. Harvey now positions itself as AI for legal and professional services, not just law firms. It serves corporate legal departments, court systems, and co-built a Tax AI model with PwC covering 25+ jurisdictions.12
Enterprise Search → Work AI. Glean started as enterprise search. Now it sells vertical solutions for healthcare, financial services, and government, with dedicated agents for sales, HR, and engineering3.
Audio Models → Voice Agents. ElevenLabs started with text-to-speech. Now it offers voice agents for customer service, music generation, and AI audiobooks.4
Foundation model companies are doing the same. OpenAI launched a dedicated Healthcare & Life Sciences vertical, complete with industry-specific sales teams and solutions engineers. Anthropic built an Industries organization with account executives for healthcare, insurance, and federal markets. They’re not selling APIs. They’re becoming platforms.56
Each of these companies recognized the cognitive burden of unbundling7. They’re not selling features. They’re selling trust.
There’s a deeper logic at work. Once integrated, AI systems see how teams operate, capture workflows, and build more systems on top of them. As the cost of software development falls, trusted partners with broad adoption can expand faster than anyone else.
The SaaS playbook rewarded specialization. The AI playbook rewards breadth.
2026-03-23 08:00:00
Last week, Cursor launched Composer 2 to over one million daily active users.1 Within hours, a developer discovered Cursor had built its flagship model on top of Moonshot AI’s Kimi K2.5, a Chinese open-source model.2
Moonshot AI’s response? “This is the open model ecosystem we love to support.”3
Cursor’s model is at near parity with state-of-the-art at one-eighth the price.4 It’s also no coincidence that the editor powering Cursor, VS Code, is open-source.
$50B in market cap on open-source foundations. Open source empowers startups to compete with incumbents.
It’s not easy to replicate Cursor’s innovation on US models. American open-source frontier models average 8 months old. Chinese open-source models average 7 weeks. That’s a 5x age gap. Cursor chose Kimi K2.5 (8 weeks old) over GPT-OSS (8 months old) for good reason: in AI, eight months is three generations of models.
Meta, formerly America’s open-source champion with Llama, pivoted to closed-source development in 2025.5 Chinese open-source models grew from 1.2% of global AI usage in late 2024 to nearly 30% by the end of 2025.6 Qwen overtook Llama in cumulative downloads by October 2025, reaching 700 million downloads on Hugging Face.7
But commercializing Chinese models in the US carries risks: NIST found Chinese models 12x more susceptible to agent hijacking attacks,8 & companies like Microsoft & News Corp have banned their use entirely.9 Many government agencies have followed suit.
Meanwhile, the American open-source response is taking shape. NVIDIA announced a $26 billion commitment over five years to open-source AI through its Nemotron Coalition.10 Google, OpenAI, & the Allen Institute are building alternatives. OLMo 3 matches Qwen 3 on math benchmarks with 6x less training data.111213
Cursor’s choice wasn’t ideological. It was practical. When the best open-source option is Chinese, that’s what a $50 billion company will use.
Open source is how startups compete with giants. The next Cursor will be built on the best open-source foundation available. The question is whether that foundation will be American.
TechCrunch: Cursor admits its new coding model was built on top of Moonshot AI’s Kimi ↩︎
The Decoder: Cursor quietly built its new coding model on top of Chinese open-source Kimi K2.5 ↩︎
Composio: Kimi K2.5 vs. Opus 4.5 pricing comparison - Kimi K2.5 costs $0.60/M input vs Claude Opus $5.00/M input (8.3x cheaper) ↩︎
Bloomberg: Inside Meta’s Pivot From Open Source to Money-Making AI Model ↩︎
OpenRouter: State of AI 2025 - 100T Token LLM Usage Study ↩︎
Xinhua: Alibaba’s Qwen leads global open-source AI community with 700 million downloads ↩︎
NIST: CAISI Evaluation of DeepSeek AI Models Finds Shortcomings and Risks ↩︎
TechCrunch: DeepSeek - The countries and agencies that have banned the AI company’s tech ↩︎
Allen Institute for AI: OLMo 3 - Charting a path through the model flow ↩︎
2026-03-20 08:00:00
In 2025, we predicted that 2026 would be the year agents would earn as much as a person.
It’s already happening.
In markets where there’s a labor shortage and an urgent need to hire people, we are seeing agents command 75%, 85%, even 100% of a human equivalent salary. This is faster than we were anticipating.
The first-order benefit is completing the work.
But there are second-order benefits that are now starting to appear. Training agents is significantly faster since all materials can be presented at once & in parallel to the AI.
Agents typically require less management burden. They can work 24 hours a day, speeding up or slowing down as the team needs. Capacity scales as a function of willingness to spend on inference.
Then, a third-order benefit: significantly lower tax burden. Robotic workers are not taxed to the same extent as humans. No FICA. No state unemployment insurance. No benefits. At least a 25-30% cost reduction for the same salary.1 Plus agent software cost is tax-deductible up to $2.56m.2
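A sketch of that comparison in code, using the California employer-cost rates cited in the sources below and assuming the agent is priced exactly at par with base salary:

```python
def fully_loaded_cost(base_salary: float,
                      fica: float = 0.0765,
                      suta: float = 0.034,
                      ett: float = 0.001,
                      benefits: float = 0.25) -> float:
    """Base salary plus employer payroll taxes and benefits (~36% loaded)."""
    return base_salary * (1 + fica + suta + ett + benefits)

salary = 100_000
human_cost = fully_loaded_cost(salary)  # ~$136,150 fully loaded
agent_cost = salary                     # agent priced at par with salary

savings = 1 - agent_cost / human_cost   # ~27%, inside the 25-30% range above
print(f"Human: ${human_cost:,.0f}, Agent: ${agent_cost:,.0f}, Savings: {savings:.0%}")
```

Paying an agent the same "salary" with none of the employer load lands the savings squarely in the 25-30% range.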
In other categories where AI is augmenting existing workers, the sale is different. Here, the sale captures the marginal hire rather than a big swath of the team.3
In both sales conversations, usage tends to surge because of the effectiveness of these systems, much faster than either the vendor or the buyer anticipates.
At that point, the business often pauses because a strategic review of organizational design needs to take place.
The market rewards this shift. Goldman Sachs found that low-labor-cost stocks outperformed high-labor-cost stocks by 8 percentage points in 2025.4 Labor’s share of GDP hit a record low of 53.8% in Q3 2025.5 The implication: every dollar shifted from labor to software improves margins & stock performance.
Across the S&P 500, labor costs represent about 12% of revenues on average.6 Software costs sit around 1-3%. As agents absorb labor, that ratio inverts. Labor shrinks. Software expands. The total addressable market for software grows at labor’s expense, while profitability grows.
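A stylized version of that inversion. Only the starting shares (~12% labor, ~2% software) come from the text; the one-third shift and the 50 cents of software per displaced labor dollar are assumptions for illustration:

```python
revenue = 100.0              # normalize revenue to 100
labor, software = 12.0, 2.0  # cost as a % of revenue, per the averages above

# Assumption: agents absorb a third of labor spend, at $0.50 of software per $1 of labor.
shifted = labor / 3
labor_after = labor - shifted
software_after = software + shifted * 0.5
margin_gain = shifted * 0.5  # cost that disappears entirely

print(f"Labor: {labor:.0f} -> {labor_after:.0f}, "
      f"Software: {software:.0f} -> {software_after:.0f}, "
      f"Margin: +{margin_gain:.0f} pts of revenue")
```

Under these assumptions, software spend doubles while total cost falls: the software TAM grows at labor's expense, and margins expand at the same time.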
In the short term, this means no pricing competition on a per-agent basis. Vendors aren’t racing to the bottom; they can price at par with a person.
Sources
California employer costs: 7.65% FICA + 3.4% SUTA + 0.1% ETT + ~25% benefits = ~36% on top of base salary. CA EDD Payroll Tax Rates, BLS Employer Costs ↩︎
Goldman Sachs / Futunn: S&P 500 Surges 32% - Cooling Labor Costs as Hidden Driver ↩︎
Fortune: U.S. Workers Took Home Smallest Share of Capital Since 1947, FRED Labor Share Data ↩︎
Goldman Sachs: How Labor Costs Have Affected Corporate Margins ↩︎
2026-03-18 08:00:00
I set up a race today between two robots.
My Mac on the left vs Claude Code on the right. Both tasked with building a payment app on Stripe’s new Tempo blockchain. Same prompts, same task, side by side.
Opus 4.5 is about 20% smarter than Qwen 35B on benchmarks. And it’s likely 50x larger. The hare should have won. It didn’t.
The local model finished in 2 minutes. Claude took over 6. I asked Claude to score both outputs: the local model earned a 6.5, Claude a 4.5.1
With 3x faster responses, I could add an extra cycle: “critique the plan and address the critiques.” In the time the hare was still thinking, the tortoise ran another lap.
| Prompt | Local (Qwen 35B) | Claude (Opus 4.5) |
|---|---|---|
| Research Tempo & create plan | 20.9s | 55s |
| Critique the plan | 16.5s | 1m 35s |
| Which language is best? | 16.5s | 1m 35s |
| Research feedback online | 48.9s | 2m 35s |
| Save implementation plan | 15.4s | 44s |
| Total | ~2 min | ~6 min 24s |
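The loop arithmetic behind that observation, using the totals from the table and an assumed ten-minute window of attention:

```python
attention_window = 10 * 60    # assumed: a 10-minute meeting slot, in seconds
local_round = 2 * 60          # ~2 min per full planning cycle (local model)
claude_round = 6 * 60 + 24    # ~6 min 24s per cycle (Claude)

local_rounds = attention_window // local_round   # revision rounds that fit
claude_rounds = attention_window // claude_round

print(f"Local: {local_rounds} rounds, Claude: {claude_rounds} round(s)")
```

Five critique-and-revise loops versus one before attention moves on, which is the tortoise's whole advantage.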
Faster responses mean more rounds of revision before a meeting ends or attention drifts. It’s different for agentic coding workflows & complex codebases, where slower work may lead to better outcomes. But for everyday tasks, faster models can enable tighter feedback loops. Tighter loops can produce better outcomes.
We don’t always need the smartest AI to get the job done.