2026-03-26 08:00:00
Would you choose one software over another because it has a proprietary model with better performance?
Two companies shipped custom AI models today (three in a week counting Cursor!1), raising that question. Intercom launched Apex 1.0, a model for answering customer support tickets.2 Chroma released Context-1, a model for multi-hop agent search.3
Apex 1.0 beats GPT-5.4 & Claude Opus 4.5 on customer service tasks.2 Context-1 scores 97% on agent search benchmarks.3 One Intercom gaming customer saw resolution rates jump from 68% to 75%.2
History suggests4 these gains may be temporary. As general-purpose models improve, today’s specialized advantage erodes. But with GPU shortages driving up inference costs, perhaps this will be the moment for more efficient, built-for-purpose models.
Intercom built Apex to differentiate in a competitive market. Chroma’s bet is different. Context-1 is open-source under Apache 2.0.3 Anyone can use it. The model isn’t the product. It’s marketing rather than sales: distribution & brand building for their vector database infrastructure.
Two philosophies. Proprietary model as differentiation versus open-source model as adoption mechanism.
“As features become ~free to build, the technology factors that will differentiate the players will be the AI under the hood. If you’re using the same general-purpose off-the-shelf model as everyone else, you have no durable differentiation.” - Eoghan McCabe2
AI models offered by software vendors have become a new axis upon which to compete. In the marketing arena, models drive attention & distribution. At the bottom of the sales funnel, they serve as competitive differentiators in performance.
“Cursor, Kimi, & the Open-Source AI Imperative”, tomtunguz.com, March 2026. ↩︎
Eoghan McCabe, “The age of vertical models is here”, X, March 26, 2026. ↩︎ ↩︎ ↩︎ ↩︎
“Chroma Launches Context-1, Efficient Open-Source AI for Agent Search”, PR Newswire, March 26, 2026. ↩︎ ↩︎ ↩︎
Rich Sutton, “The Bitter Lesson”, March 2019. ↩︎
2026-03-24 08:00:00
The SaaS era was defined by unbundling: find a workflow, optimize it, own it. Salesforce chose sales automation. Slack chose chat. Dropbox chose file sharing. Point solutions won by perfecting single workflows. The playbook: own one pain point, expand from there.
AI is moving faster than anyone predicted. When models change every 42 days, buyers can’t assemble a best-of-breed stack. They want a platform they can trust for three to five years.
Legal → Professional Services. Harvey now positions itself as AI for legal and professional services, not just law firms. It serves corporate legal departments, court systems, and co-built a Tax AI model with PwC covering 25+ jurisdictions.12
Enterprise Search → Work AI. Glean started as enterprise search. Now it sells vertical solutions for healthcare, financial services, and government, with dedicated agents for sales, HR, and engineering.3
Audio Models → Voice Agents. ElevenLabs started with text-to-speech. Now it offers voice agents for customer service, music generation, and AI audiobooks.4
Foundation model companies are doing the same. OpenAI launched a dedicated Healthcare & Life Sciences vertical, complete with industry-specific sales teams and solutions engineers. Anthropic built an Industries organization with account executives for healthcare, insurance, and federal markets. They’re not selling APIs. They’re becoming platforms.56
Each of these companies recognized the cognitive burden of unbundling7. They’re not selling features. They’re selling trust.
There’s a deeper logic at work. Once integrated, AI systems see how teams operate, capture workflows, and build more systems on top of them. As the cost of software development falls, trusted partners with broad adoption can expand faster than anyone else.
The SaaS playbook rewarded specialization. The AI playbook rewards breadth.
2026-03-23 08:00:00
Last week, Cursor launched Composer 2 to over one million daily active users.1 Within hours, a developer discovered Cursor had built its flagship model on top of Moonshot AI’s Kimi K2.5, a Chinese open-source model.2
Moonshot AI’s response? “This is the open model ecosystem we love to support.”3
Cursor’s model is at near parity with state-of-the-art at one-eighth the price.4 It’s also no coincidence that the editor powering Cursor is built on open-source VS Code.
$50B in market cap on open-source foundations. Open source empowers startups to compete with incumbents.
It’s not easy to replicate Cursor’s innovation on US models. American open-source frontier models average 8 months old. Chinese open-source models average 7 weeks. That’s a 5x age gap. Cursor chose Kimi K2.5 (8 weeks old) over GPT-OSS (8 months old) for good reason: in AI, eight months is three generations of models.
Meta, formerly America’s open-source champion with Llama, pivoted to closed-source development in 2025.5 Chinese open-source models grew from 1.2% of global AI usage in late 2024 to nearly 30% by the end of 2025.6 Qwen overtook Llama in cumulative downloads by October 2025, reaching 700 million downloads on Hugging Face.7
But commercializing Chinese models in the US carries risks: NIST found Chinese models 12x more susceptible to agent hijacking attacks,8 & companies like Microsoft & News Corp have banned their use entirely.9 Many government agencies have followed suit.
Meanwhile, the American open-source response is taking shape. NVIDIA announced a $26 billion commitment over five years to open-source AI through its Nemotron Coalition.10 Google, OpenAI, & the Allen Institute are building alternatives. OLMo 3 matches Qwen 3 on math benchmarks with 6x less training data.111213
Cursor’s choice wasn’t ideological. It was practical. When the best open-source option is Chinese, that’s what a $50 billion company will use.
Open source is how startups compete with giants. The next Cursor will be built on the best open-source foundation available. The question is whether that foundation will be American.
TechCrunch: Cursor admits its new coding model was built on top of Moonshot AI’s Kimi ↩︎
The Decoder: Cursor quietly built its new coding model on top of Chinese open-source Kimi K2.5 ↩︎
Composio: Kimi K2.5 vs. Opus 4.5 pricing comparison - Kimi K2.5 costs $0.60/M input vs Claude Opus $5.00/M input (8.3x cheaper) ↩︎
Bloomberg: Inside Meta’s Pivot From Open Source to Money-Making AI Model ↩︎
OpenRouter: State of AI 2025 - 100T Token LLM Usage Study ↩︎
Xinhua: Alibaba’s Qwen leads global open-source AI community with 700 million downloads ↩︎
NIST: CAISI Evaluation of DeepSeek AI Models Finds Shortcomings and Risks ↩︎
TechCrunch: DeepSeek - The countries and agencies that have banned the AI company’s tech ↩︎
Allen Institute for AI: OLMo 3 - Charting a path through the model flow ↩︎
2026-03-20 08:00:00
In 2025, we predicted that 2026 would be the year agents would earn as much as a person.
It’s already happening.
In markets where there’s a labor shortage and an urgent need to hire people, we are seeing agents command 75%, 85%, even 100% of a human equivalent salary. This is faster than we were anticipating.
The first-order benefit is completing the work.
But there are second-order benefits that are now starting to appear. Training agents is significantly faster since all materials can be presented at once & in parallel to the AI.
Agents typically require less management burden. They can work 24 hours a day, faster or slower as the team needs. Capacity scales as a function of willingness to spend on inference.
Then, a third-order benefit: a significantly lower tax burden. Robotic workers are not taxed to the same extent as humans. No FICA. No state unemployment insurance. No benefits. At least a 25-30% cost reduction for the same salary.1 Plus, agent software cost is tax-deductible up to $2.56m.2
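The arithmetic behind that 25-30% figure can be sketched in a few lines, using the California rates cited in the sources (variable names are mine; the rates are illustrative, not a tax calculation):

```python
# Sketch of the employer-cost math behind the "25-30%" claim,
# using the California rates cited in the sources (illustrative only).
FICA = 0.0765        # Social Security + Medicare
SUTA = 0.034         # CA state unemployment insurance
ETT = 0.001          # CA employment training tax
BENEFITS = 0.25      # approximate benefits load

loaded = 1 + FICA + SUTA + ETT + BENEFITS    # ~1.36x base salary
savings = (loaded - 1) / loaded              # savings if an agent is priced at par
print(f"loaded cost: {loaded:.2f}x base, savings at par: {savings:.1%}")
```

An agent priced at par with base salary avoids the ~36% employer load, which works out to roughly a 27% reduction in total cost, consistent with the 25-30% range above.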
In other categories where AI is augmenting existing workers, the sale is different. Here, the sale captures the marginal hire rather than a big swath of the team.3
In both cases, usage tends to surge because of the effectiveness of the systems, much faster than either the vendor or the buyer anticipates.
At that point, the business often pauses because a strategic review of organizational design needs to take place.
The market rewards this shift. Goldman Sachs found that low-labor-cost stocks outperformed high-labor-cost stocks by 8 percentage points in 2025.4 Labor’s share of GDP hit a record low of 53.8% in Q3 2025.5 The implication: every dollar shifted from labor to software improves margins & stock performance.
Across the S&P 500, labor costs represent about 12% of revenues on average.6 Software costs sit around 1-3%. As agents absorb labor, that ratio inverts. Labor shrinks. Software expands. The total addressable market for software grows at labor’s expense, while profitability grows.
In the short term, this means no pricing competition on a per-agent basis. Vendors aren’t racing to the bottom; they can price at parity with a person.
Sources
California employer costs: 7.65% FICA + 3.4% SUTA + 0.1% ETT + ~25% benefits = ~36% on top of base salary. CA EDD Payroll Tax Rates, BLS Employer Costs ↩︎
Goldman Sachs / Futunn: S&P 500 Surges 32% - Cooling Labor Costs as Hidden Driver ↩︎
Fortune: U.S. Workers Took Home Smallest Share of Capital Since 1947, FRED Labor Share Data ↩︎
Goldman Sachs: How Labor Costs Have Affected Corporate Margins ↩︎
2026-03-18 08:00:00
I set up a race today between two robots.
My Mac on the left vs Claude Code on the right. Both tasked with building a payment app on Stripe’s new Tempo blockchain. Same prompts, same task, side by side.
Opus 4.5 is about 20% smarter than Qwen 35B on benchmarks. And it’s likely 50x larger. The hare should have won. It didn’t.
The local model finished in 2 minutes. Claude took over 6. I asked Claude to score both outputs: local model 6.5, Claude 4.5.1
With 3x faster responses, I could add an extra cycle : “critique the plan and address the critiques.” In the time the hare was still thinking, the tortoise ran another lap.
| Prompt | Local (Qwen 35B) | Claude (Opus 4.5) |
|---|---|---|
| Research Tempo & create plan | 20.9s | 55s |
| Critique the plan | 16.5s | 1m 35s |
| Which language is best? | 16.5s | 1m 35s |
| Research feedback online | 48.9s | 2m 35s |
| Save implementation plan | 15.4s | 44s |
| Total | ~2 min | ~6 min 24s |
Faster responses mean more rounds of revision before a meeting ends or attention drifts. It’s different for agentic coding workflows & complex codebases, where slower work may lead to better outcomes. But for everyday tasks, faster models can enable tighter feedback loops. Tighter loops can produce better outcomes.
We don’t always need the smartest AI to get the job done.
2026-03-17 08:00:00
For every dollar hyperscalers earn from AI today, they’re spending twelve dollars to build more capacity.1 That’s the bet embedded in $575 billion of capital expenditure this year.2
How fast does AI revenue need to grow to pay back this data center mortgage?
From 2020 to 2024, hyperscalers issued an average of $20 billion in bonds annually.3 In 2025, that jumped to $96 billion. In 2026, it will reach $159 billion.3 Morgan Stanley projects $1.5 trillion over the next few years.4
Amazon, Microsoft, Alphabet, Meta, & Oracle will spend 90% of their operating cash flow on AI data centers in 2026, up from a historical average of 40%.2
Alphabet issued a century bond, the first by a tech company since Motorola in 1997.5 The debt matures in 2126. Who knows what AI will look like then, or whether Alphabet will exist to repay it.
What assumptions justify this borrowing?
The depreciation schedules encode the bet. Most hyperscalers depreciate AI infrastructure over five years.6 At 60% gross margins & 5% borrowing costs, a 5-year payback on $431B in AI capex requires $180B in annual revenue.7 Current AI revenue is $35 billion.1 They’re underwriting 5x growth in five years.
Nvidia’s stated goal is to release new GPU architectures every twelve months, which will compress depreciation cycles. If chips become obsolete in three years rather than five, the required annual revenue jumps to $276B, 7.9x current levels.8
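The payback arithmetic from the footnotes can be reproduced directly (a sketch using the figures in the post; the function name is mine):

```python
# Reproducing the data-center payback math from the footnotes
# (figures from the post; this is a back-of-envelope model).
CAPEX = 431e9        # AI capex, $
RATE = 0.05          # borrowing cost
MARGIN = 0.60        # gross margin
CURRENT_REV = 35e9   # current AI revenue, $

def required_revenue(years):
    """Annual revenue needed to cover straight-line depreciation plus interest."""
    annual_cost = CAPEX / years + CAPEX * RATE   # depreciation + interest
    return annual_cost / MARGIN

for years in (5, 3):
    rev = required_revenue(years)
    print(f"{years}-year payback: ${rev/1e9:.0f}B/yr, {rev/CURRENT_REV:.1f}x current revenue")
```

The five-year schedule yields roughly $180B in required annual revenue (about 5x the $35B base); the three-year schedule yields roughly $275B, about 7.9x, within rounding of the $276B figure above.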
As Michael Mauboussin writes, there’s information in prices. The depreciation schedules tell us what hyperscalers believe: AI revenue will grow 5x within five years. The debt markets are betting alongside them.
Asymco : The Most Brilliant Move in Corporate History? ↩︎ ↩︎
Fortune : Google, Meta, & Oracle’s $1 Trillion Borrowing Spree ↩︎
Bloomberg : Alphabet Plans Tech’s First 100-Year Bond Since Dot-Com Era ↩︎
Calculation: $431B capex ÷ 5 years = $86B depreciation + $22B interest (5% on $431B) = $108B annual cost. At 60% margin, requires $180B revenue ($108B ÷ 0.60). ↩︎
This analysis focuses on direct AI revenue & does not account for internal AI consumption (Copilot, Search, recommendations, internal engineering) that generates value through existing revenue streams. Older chips may retain residual value for inference even after becoming obsolete for frontier training. ↩︎