2026-01-22 08:00:00
Talented people get promoted to management. So do talented models. Claude manages code execution. Gemini routes requests across CRM & chat. GPT-5 can coordinate public stock research.
Why now? Tool calling accuracy crossed a threshold. Two years ago, GPT-4 succeeded on fewer than 50% of function-calling tasks. Models hallucinated parameters, called wrong endpoints, forgot context mid-conversation. Today, SOTA models exceed 90% accuracy on function-calling benchmarks¹. Performance of the most recent models, like Gemini 3, is materially better in practice than the benchmarks suggest.
Did we need trillion-parameter models just to make function calls? Surprisingly, yes.
Experiments with small action models, lightweight networks trained only for tool selection, fail in production². They lack world knowledge. Management, it turns out, requires context.
Today, the orchestrator often spawns itself as a subagent (Claude Code spins up another Claude Code). This symmetry won’t last.
The bitter lesson³ insists ever-larger models should handle everything. But economics push back: distillation & reinforcement fine-tuning produce models 40% smaller & 60% faster while retaining 97% of performance⁴.
Specialized agents from different vendors are emerging. The frontier model becomes the executive, routing requests across specialists. These specialists can be third-party vendors, all vying to be best in their domain.
Constellations of specialists require reliable tool calling. When tool calling works 50% of the time, teams build monoliths, keeping everything inside one model to minimize failure points. When it works 90% of the time, teams route to specialists & compound their capabilities.
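The threshold effect falls out of simple probability: if an orchestrator chains k tool calls & each succeeds independently with probability p, the whole workflow succeeds with probability p^k. A toy illustration (the independence assumption & the numbers are illustrative, not benchmark figures):

```python
# Probability that a chain of k sequential tool calls all succeed,
# assuming each call succeeds independently with probability p.
def chain_success(p: float, k: int) -> float:
    return p ** k

# At 50% per-call accuracy, a five-step workflow almost never completes:
print(round(chain_success(0.50, 5), 3))  # 0.031 -> ~3% end-to-end
# At 90% per-call accuracy, the same workflow usually completes:
print(round(chain_success(0.90, 5), 3))  # 0.59 -> ~59% end-to-end
```

That gap, 3% versus 59% on a five-step task, is why monoliths made sense at 50% accuracy & constellations make sense at 90%.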
The frontier labs will own the orchestration layer. But they can’t own every specialist. Startups that build the best browser-use agent, the best retrieval system, the best BI agent can plug into these constellations & own their niche.
New startup opportunities emerge not from training the largest models, but from training the specialists the executives call first.
¹ Berkeley Function Calling Leaderboard (BFCL) tests API invocation accuracy. TAU-bench measures tool-augmented reasoning in real-world scenarios (paper).
² Salesforce’s xLAM is a large action model designed specifically for tool selection. While fast & accurate for simple tool calls, small action models struggle with complex reasoning about when to use tools.
³ Rich Sutton’s influential essay arguing that general methods leveraging computation beat hand-engineered domain knowledge. The Bitter Lesson.
⁴ See DistilBERT, which is 40% smaller & 60% faster while retaining 97% of BERT’s performance. OpenAI’s model distillation enables similar efficiency gains.
2026-01-21 08:00:00
In a world where AI can code software immediately, will designers still design first, or just describe the design? Will large enterprises pay for premium observability when AI can migrate to & monitor open-source competitors?
As Michael Mauboussin writes, there’s information in price. These questions are priced in. It’s too early to see revenue erosion, but the market is pricing in the risk.
The median SaaS stock is down 14-17% year to date. 64% of software companies are down. Adobe has fallen 32%, HubSpot 57%, Atlassian 54%.
Revenue growth predicts returns better than margins, profitability, or market cap. Companies growing above 20% are up. Companies growing below 20% are down. Palantir grows 47%. MongoDB grows 21%. Adobe & Salesforce grow less than 10%.
This isn’t a broad market correction. Large caps are holding; small caps are collapsing.
Ten years ago, software crashed too. On February 5, 2016, LinkedIn fell 43% and Tableau dropped 49% in a single trading session. Salesforce lost 13%. The Nasdaq tumbled 3.25%. Investors dumped software in a single afternoon.
The crime was weak forward guidance. LinkedIn projected 20-22% growth when analysts expected 30%. Tableau’s license revenue growth decelerated from 57% to 31% quarter over quarter.
But the selloff reversed within weeks. Nasdaq finished 2016 up 7.5%. SaaS stocks climbed for five more years. No one doubted that enterprises would continue buying CRM software & analytics tools. The products remained essential. Only the price changed.
In 2016, investors questioned valuations. In 2026, they question relevance.
2026-01-20 08:00:00
After a decade of success, the modern data stack has entered consolidation. What comes next?
The postmodern data stack is AI.
The modern data stack created more than $100 billion in market cap with a simple promise: move the data via ETL, transform it inside a cloud data warehouse, put a semantic layer on top of it to unify metric definitions, & analyze it through BI.
This worked brilliantly for structured data, numerical data found within tables. But AI & businesses thrive on unstructured data, such as call transcripts, status reports, web searches, & PDFs.
The semantic layer, especially within large organizations with huge data sets, must now combine structured & unstructured information at scale. This is critical because all of it feeds AI & powers agents.
Strategic acquisitions prove this shift is already underway. The Datadog acquisition of Metaplane, the Snowflake acquisition of Observe, & the ClickHouse acquisition of Langfuse are the most concrete strategic moves reinforcing this fusion.
| Acquirer | Target | Category | Value | Date |
|---|---|---|---|---|
| Datadog | Metaplane | Data Observability | Undisclosed | Apr 2025 |
| Snowflake | Observe | Observability | $1B | Jan 2026 |
| ClickHouse | Langfuse | LLM Observability | Undisclosed | Jan 2026 |
All of these initial acquisitions focus on observability & understanding AI systems, for two reasons. First, the volumes of data are enormous & essential to sustaining the breakneck pace of AI innovation.
Second, observability helps inform roadmaps: how are customers using AI? Where are they not successful? How can a business help them grow & expand?
We will see many more acquisitions that lead to a combined postmodern & AI stack, accelerating the consolidation that has already started. In addition, there are many pieces of the AI stack not represented here, such as evaluations & agent orchestration.
Those are next.
2026-01-16 08:00:00
This week I chatted with an acquaintance who mentioned a board game. I caught half the title & used my AI in Asana to find the full title & an Amazon link.
The system tried with Gemini & failed. The failover to Claude also failed. Rather than continuously iterating with the AI until it worked, I created a Ralph Wiggum loop.
Geoffrey Huntley coined this pattern. Named after the persistently clueless Simpsons character, the idea is simple: keep pushing the model against its failures until it dreams a correct solution just to escape the loop. The system is deterministically bad in a nondeterministic world. Iteration beats perfection.
Now an AI loop runs each night. It finds all tasks with “failed” in them. It creates a plan to debug & iterates until the prompt solves the task.
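The nightly loop described above can be sketched in a few lines. All of the helper names here (attempt_fix, is_solved, the task shape) are hypothetical stand-ins for the Asana & model calls, not real APIs:

```python
# Minimal sketch of a "Ralph Wiggum" retry loop. A real version would
# pull tasks tagged "failed" from Asana & call an LLM; here attempt_fix
# & is_solved are injected stand-ins for those calls.
def ralph_loop(tasks, attempt_fix, is_solved, max_iters=10):
    """Keep pushing each failed task back at the model until it passes."""
    results = {}
    for task in tasks:
        prompt = task["prompt"]
        for attempt in range(max_iters):
            prompt = attempt_fix(task, prompt)   # model revises the prompt
            if is_solved(task, prompt):          # rerun & check the output
                results[task["name"]] = attempt + 1  # iterations needed
                break
        else:
            results[task["name"]] = None         # still failing; try tomorrow
    return results

# Toy usage: the "model" appends a hint each pass until the check passes.
tasks = [{"name": "find-board-game", "prompt": "search"}]
fix = lambda t, p: p + " retry"
solved = lambda t, p: p.count("retry") >= 2
print(ralph_loop(tasks, fix, solved))  # {'find-board-game': 2}
```

The `for`/`else` keeps the loop deterministic about giving up: a task that never passes is recorded as unsolved rather than looping forever.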
So far this naive system is working pretty well. There is a risk it might begin to oscillate between two suboptimal states, but I haven’t observed that in the few days it’s been running. It’s something I’m watching.
AI creates software cheaply; excellence requires iteration. Implicit feedback loops are how you get there.
This self-improving loop ensures the system wakes up smarter than it went to sleep. So do I.
2026-01-15 08:00:00
AI is NVIDIA’s third climb up a steep slope.
First came gaming in the late 2010s.
Then cryptocurrency.
Now artificial intelligence.
Each wave pushed revenue growth above 50% & with it, price-to-earnings (P/E) ratios surged. P/E ratios rose before the revenue growth materialized.
There’s a four-quarter offset between P/E ratio & trailing-twelve-month (TTM) revenue growth. When you shift revenue growth forward by a year, the correlation with P/E jumps to 0.80.
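The shift-then-correlate calculation is simple to reproduce. A toy sketch with made-up numbers (not NVIDIA’s actual figures), where P/E is constructed to anticipate revenue growth four quarters out:

```python
# Lead-lag correlation with synthetic data: P/E today is built to track
# TTM revenue growth four quarters later, so shifting growth forward by
# four quarters recovers the relationship.
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

growth = [10, 12, 20, 35, 55, 60, 50, 30, 15, 12, 10, 11]  # TTM growth, %
pe = [g / 2 for g in growth[4:]]        # P/E today tracks growth 4Q later

same_q = pearson(pe, growth[:len(pe)])  # P/E vs contemporaneous growth
lead_4q = pearson(pe, growth[4:])       # P/E vs growth shifted 4Q forward
print(same_q < lead_4q)  # True: the shifted series correlates far better
```

In this construction the contemporaneous correlation is actually negative while the four-quarter-shifted correlation is nearly perfect, the same shape of result the NVIDIA data shows at 0.80.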
Investors buy stocks based on where they think the company is headed, not where it’s been. A high P/E today reflects expectations of strong revenue growth tomorrow. During NVIDIA’s climb up each wave, P/E & future growth move in lockstep.
But look at what happens at the peaks.
At the end of each boom, the correlation collapses, falling to zero or even turning negative. P/E stays elevated while revenue growth plummets. The market is slow to reprice.
This is the pattern : during the ascent, P/E & growth are tightly linked. At the peak, they decouple. High P/E persists as growth dissipates.
Today’s AI wave is riding the same mesa. The correlation remains above historical levels. The rapid pace of deal-making & explosive rates of inference growth suggest this roller coaster ride will continue.
2026-01-14 08:00:00
“[Silicon Valley] is the biggest, most volatile petri dish of raw capitalism on the planet.” So what lessons does a successful short-seller living here have? As founder of Crown Capital Management, Scott Fearon has shorted the stocks of over 200 companies.
In his book Dead Companies Walking, Fearon distills three decades of meetings with executives into six failure modes:
Despite their differences, they all failed because their leaders made one or more of six common mistakes that I look for:
- They learned from only the recent past.
- They relied too heavily on a formula for success.
- They misread or alienated their customers.
- They fell victim to a mania.
- They failed to adapt to tectonic shifts in their industries.
- They were physically or emotionally removed from their companies’ operations.
The hairpin turn from SaaS to AI amplifies each.
Learning only from the recent past. Software moves in 20-year waves. Mainframe, then client/server, then SaaS. When these waves come, they create bull markets. As they wane, growth flattens & multiples collapse. SaaS multiples have been flat for three years.
Relying too heavily on a formula for success. Sales motions have changed: product-led growth & multi-million-dollar lands have both grown dramatically. Many of the rules around efficiencies & quotas no longer apply.
Misreading or alienating customers. Customers want to be AI native. They are willing to pay huge sums for education & solutions. Budgets have exploded with 41% of AI spend net new. Selling the same software application that no longer solves the customer’s pain point is a path to churn.
Falling victim to a mania. The AI hype cycle creates pressure to ship half-baked features. Announcing an AI roadmap isn’t the same as delivering value. This distinction will become increasingly stark as long-running agents enter the workforce in 2026.
Failing to adapt to tectonic shifts. If ever this were true, it’s true today.
Being physically or emotionally removed. Teams who don’t use AI daily miss the pace of change. The technology moves too fast for quarterly strategy reviews. If leadership isn’t prompting Claude or GPT every day, they’re already behind.
Cognitive biases are always hard to see in ourselves. A short seller’s mirror is a useful one at this moment in AI.