2025-10-20 08:00:00
When I worked at Google, I was lucky to collaborate with some of the brightest machine-learning (ML) engineers. They worked on feature engineering. By picking the right factors to guide the ML model, their advances could generate tens to hundreds of millions of dollars in additional revenue.
Imagine an Excel spreadsheet with hundreds of columns of data. Add two columns, multiply two, divide by another, and subtract a fourth. Each of these is a feature. The ML models used these features to predict the best ad to show.
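A minimal sketch of what that looks like in practice, in pandas; the column names & derived features below are purely illustrative, not the actual signals used in ad systems:

```python
import pandas as pd

# Toy ad-metrics table; the columns are hypothetical stand-ins for the
# "hundreds of columns" described above.
df = pd.DataFrame({
    "clicks": [120, 45, 300],
    "impressions": [10_000, 2_500, 40_000],
    "spend": [50.0, 12.5, 180.0],
    "conversions": [8, 1, 22],
})

# Each derived column is a feature the model can learn from.
df["ctr"] = df["clicks"] / df["impressions"]            # divide two columns
df["cost_per_click"] = df["spend"] / df["clicks"]       # divide by another
df["engagement"] = df["clicks"] + df["conversions"]     # add two columns
df["net_value"] = df["conversions"] * 25 - df["spend"]  # multiply & subtract

print(df[["ctr", "cost_per_click", "engagement", "net_value"]])
```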
It started as a craft, reflecting the vibes of the era. Over time, we’ve mechanized this art into a machine called AutoML that massively accelerates the discovery of the right features.
Today, reinforcement learning (RL) is where feature engineering was 15 years ago.
What is RL? It’s a technique for teaching AI to accomplish goals.
Consider a brave Roomba. It presses into a dirty room.
Then it must make a cleaning plan and execute it. Creating the plan is step 1. To complete the plan, like any good worker, it will reward itself, not with a foosball break, but with some points.
Its reward function might be: +0.1 for each new square foot cleaned, -5 for bumping into a wall, and +100 for returning to its dock with a full dustbin. The tireless vacuum’s behavior is shaped by this simple arithmetic. (NB: I’m simplifying quite a bit here.)
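As a sketch only (real robot vacuums are far more sophisticated), that reward function maps directly to a few lines of code; the event names here are hypothetical:

```python
# Toy reward function matching the numbers above; the inputs are
# hypothetical events, not a real Roomba API.
def reward(new_sq_ft_cleaned: float, bumped_wall: bool,
           docked_with_full_bin: bool) -> float:
    r = 0.1 * new_sq_ft_cleaned        # +0.1 per new square foot cleaned
    if bumped_wall:
        r -= 5.0                       # penalty for hitting a wall
    if docked_with_full_bin:
        r += 100.0                     # bonus for returning full to the dock
    return r

# One step: cleaned 3 sq ft but bumped a wall -> 0.3 - 5.0 = -4.7
print(reward(3.0, bumped_wall=True, docked_with_full_bin=False))
```

An RL algorithm then learns whatever behavior maximizes the sum of these rewards over time.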
Today, AI can create the plan, but isn’t yet able to develop the reward functions. People do this, much as we developed features 15 years ago.
Will we see an AutoRL? Not for a while. The techniques for RL are still up for debate. Andrej Karpathy highlighted the debate in a recent podcast.
This current wave of AI improvement could hinge on RL success. Today, it’s very much a craft. The potential to automate it—to a degree or fully—will transform the way we build agentic systems.
2025-10-17 08:00:00
It’s 9:30 AM. Do you know where your agent is?
As we enter the era of agentic AI, this is an increasingly important question. ChatGPT launched the consumer use of AI, compressing all of human knowledge into a single model. Now we are asking AI to act on our behalf at work.
Analyze this mortgage statement. Figure out if this person has medical insurance. Answer this inbound call about a 2026 Honda Odyssey minivan. AI agents are doing all of this today. Plus, AI is coding millions of lines every day.
My coding agent, Claude Code, helps me write software. It runs on my computer in my terminal, orange cursor blinking. When I use Notion AI to summarize a document, it’s running in Notion’s cloud. A large enterprise handling customer support queries runs their AI in their cloud, whether it’s Azure or Google or Amazon or other.
Which kinds of agents should run in which location: on-device, in my own cloud, or in a software vendor’s cloud?
| | Client-Side | Own Cloud | Hosted Cloud |
|---|---|---|---|
| Vendor | Privacy-sensitive tasks | Enterprise control needs | Scalability & convenience |
| Enterprise | Privacy-sensitive work | Control & compliance | Scalability & convenience |
| Consumer | Privacy & speed | Rarely needed | Most common |
For a vendor: A vendor must offer all three, but for different reasons. Client-side agents handle privacy-sensitive tasks, including accessing data only available locally, like browser history, local files & clipboard content. They may also take advantage of local resources. Own-cloud deployments meet enterprise control & compliance requirements. Hosted cloud provides scalability & convenience for most use cases.
For an enterprise: An enterprise will need to manage all three. Coding agents will very likely remain running on employee devices. Certain high-velocity agents will likely be on hosted clouds. Core software & infrastructure will run on their own cloud. This presents the most complex topology, but it’s a business reality.
For an individual user: Individual users will use predominantly client-side & hosted clouds. Aside from software developers, not many people run their own infrastructure.
These hybrid deployments are then multiplexed by all the communication protocols across agents, whether it’s MCP or A2A or ones coming in the future. The net of it is we are creating new agentic networks within enterprises.
So next time it’s 9:30 AM & your agent goes running, you’ll know whether to check your laptop, your cloud, or your vendor’s data center.
2025-10-16 08:00:00
I asked Claude this morning about the most important news in tech. A few follow-up questions about Salesforce’s Dreamforce, Samsung’s XR headset, & TSMC earnings later, I popped over to ChatGPT Pulse. It suggested some updates about the Fed’s rates policy. Then, I listened to podcasts through Gemini-powered AI transcriptions.
All of my news was filtered through AI.
Fifty years ago, most Americans received their news through one of three major television networks: CBS, ABC & NBC. Edward R. Murrow ended his CBS broadcasts with “Good night, and good luck.”
Then cable exploded the channel count from 10 to 100 to 1000, serving every possible interest. Social media fragmented conversations further. But AI is a countervailing force.
What if AI is the new CBS, ABC, & NBC, the new mass-media?
The Economist argues AI is killing the web. Who will visit web pages in a few years? Instead of visiting many links, we ask AIs to summarize & prioritize. The traffic drop is real & immediate.
I see it in my behavior. I’d hazard a guess I visit 10% of the websites I did a year ago.
Why sift through page after page when I can get an answer to my question, with all the relevant context drawn from hundreds of sites? Web browsing feels like the card catalog, and Google.com like the Dewey Decimal system: anachronisms of legacy information retrieval taught to schoolchildren.
The networks are back. This time, they are probabilistic.
Good morning. Your briefing is ready.
2025-10-14 08:00:00
Why was the Fivetran-dbt merger all but inevitable?
Fivetran & dbt Labs announced their merger yesterday. The all-stock deal combines two companies into an entity approaching $600 million in ARR.
The beauty of the modern data stack was the explosion in choice. As the cloud exploded onto the scene, the legacy data warehouse was replaced by a collection of fast-moving platforms. In that era, specialization won. The pendulum is now swinging back towards consolidation.
Why? The answer lies in compute economics & revenue scale asymmetry. The table below shows why.
| Category | Snowflake | Databricks | Fivetran + dbt | Est. Category Revenue1 |
|---|---|---|---|---|
| Ingestion (ETL) | Openflow (Apache NiFi via Datavolo) | LakeFlow Connect (Arcion CDC) | Fivetran | ~$2.5B |
| Transformation | dbt Projects on Snowflake (native dbt) | Delta Live Tables + dbt via Workflows | dbt Core/Cloud | ~$500M |
| Compute | Virtual Warehouses (owns margin pool) | Clusters (owns margin pool) | Runs on platform compute (no margin capture) | ~$7.6B |
There are three different categories of software within this subset of the ecosystem:
Ingestion takes data from software & moves it into a cloud data warehouse. Snowflake acquired Datavolo, which commercializes the open source product Apache NiFi, calling it Openflow. Databricks acquired Arcion for ingestion through change data capture, calling it LakeFlow Connect. Fivetran focuses exclusively on this layer.
Transformation means reformatting the data within the cloud data warehouse. Snowflake launched native dbt Projects on Snowflake. Databricks offers Delta Live Tables, native SQL, & Python, plus supports hosted dbt through Databricks Workflows. dbt Core/Cloud is the leading independent transformation tool.
Compute revenue is generated when we ask questions of our data. Snowflake remains one of the leaders in structured data analysis with its cloud data warehouse. Databricks likewise owns its compute layer.
Here’s the asymmetry in one number. Compute represents 72% of the overall market ($7.6B of $10.6B). As a result of their massive operations, Snowflake & Databricks exert significant gravity within the ecosystem. Both have expanded beyond compute to impose their presence & capture incremental revenue within customers, pressuring the rest of the competitive ecosystem.
That’s not to say these components are independent. George Fraser analyzed Snowflake workloads in September 2024, finding transformation represents 40-45% of total Snowflake compute, which means even at smaller scales, startups can have significant impact on these behemoth businesses.
The Fivetran-dbt merger is an inevitable evolution of a maturing market. Two unicorns must partner to compete against two decacorns. They solve two of the three customer problems. But not yet compute.
One could surmise this consolidation signals the end of the modern data stack. I view it differently. The MDS has succeeded beyond our expectations. The stakes are higher now. Broad platforms, fast growth, & AI-native architectures define the next phase. Expect more consolidation.
Category revenue estimates based on public disclosures & company filings. Ingestion: Informatica ($1.64B FY2024), Fivetran ($300M est.), Talend ($350M), others ($200M est.). Transformation: dbt Labs ($300M est.), others ($200M est.). Compute: Snowflake ($3.6B FY2025), Databricks ($4.0B ARR Aug 2025).
2025-10-10 08:00:00
Philip Schmid dropped an astounding figure1 yesterday about Google’s AI scale: 1,300 trillion tokens per month (1.3 quadrillion, the first time I’ve ever used that unit!).
Now that we have three data points on Google’s token processing, we can chart the progress.
In May, Google announced at I/O2 they were processing 480 trillion monthly tokens across their surfaces. Two months later in July, they announced3 that number had doubled to 980 trillion. Now, it’s up to 1300 trillion.
The absolute numbers are staggering. But could growth be decelerating?
Between May & July, Google added 250T tokens per month of incremental volume. Between July & October, that figure fell to roughly 107T tokens per month.
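A quick back-of-the-envelope check of that math, assuming the three announcements are spaced roughly two & three months apart:

```python
# Monthly token volumes from Google's announcements, in trillions.
tokens = {"may": 480, "jul": 980, "oct": 1300}

may_to_jul = (tokens["jul"] - tokens["may"]) / 2  # ~2 months elapsed
jul_to_oct = (tokens["oct"] - tokens["jul"]) / 3  # ~3 months elapsed

print(f"May to Jul: +{may_to_jul:.0f}T tokens/month")  # ~250T
print(f"Jul to Oct: +{jul_to_oct:.0f}T tokens/month")  # ~107T
```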
This raises more questions than it answers. What could be driving the decreased growth? Some hypotheses:
- Google may be rate-limiting AI for free users because of unit economics.
- Google may be limited by data center availability. There may not be enough GPUs to continue to grow at these rates; in earnings calls this year, the company has said it would be capacity constrained through Q4 2025.
- Google combines internal & external AI token processing, & the ratio might have changed.
- Google may be driving significant efficiencies with algorithmic improvements, better caching, or other advances that reduce the total number of tokens.
I wasn’t able to find any other comparable time series from neoclouds or hyperscalers to draw broader conclusions. These data points from Google are among the few we can track.
Data center investment is scaling towards $400 billion this year.4 Meanwhile, incumbents are striking strategic deals in the tens of billions, raising questions about circular financing & demand sustainability.
This is one of the metrics to track!
2025-10-09 08:00:00
Great products don’t happen by accident. They emerge from disciplined product management practices that balance customer needs, business objectives, & technical constraints. This guide distills lessons from hundreds of successful product organizations.
Marc Andreessen’s definition: “Being in a good market with a product that can satisfy that market.”
Signals you have PMF:
Signals you don’t have PMF:
Product-market fit isn’t binary; it exists on a spectrum & varies by customer segment.
Stage 1: Problem Validation
Stage 2: Solution Validation
Stage 3: Market Validation
Quantitative indicators:
Qualitative indicators:
A compelling product vision:
Example (Figma): “Make design accessible to everyone”
Example (Slack): “Make work life simpler, more pleasant, & more productive”
Strategy connects vision to execution:
Pivot if:
Persevere if:
| Framework | Best For | Key Factors | Output | Strengths | Weaknesses |
|---|---|---|---|---|---|
| RICE | Data-driven teams | Reach, Impact, Confidence, Effort | Numerical score | Objective, quantifiable | Requires estimation accuracy |
| Value vs Complexity | Visual prioritization | Business value, implementation effort | 2x2 matrix | Simple, intuitive | Lacks nuance, binary thinking |
| Kano Model | User satisfaction | Basic needs, performance, delight | Feature categorization | Customer-centric | Subjective, requires research |
| Weighted Scoring | Multi-stakeholder | Custom criteria with weights | Weighted score | Flexible, transparent | Can be gamed |
| MoSCoW | Time-boxed releases | Must/Should/Could/Won’t | Priority buckets | Clear communication | Tendency to over-prioritize |
| ICE | Rapid evaluation | Impact, Confidence, Ease | Quick score | Fast, lightweight | Less rigorous than RICE |
RICE Scoring
Value vs Complexity Matrix
Kano Model
The hardest part of product management:
The best product managers aren’t measured by features shipped; they’re measured by features they successfully avoided building.
Internal roadmap (detailed):
External roadmap (directional):
Product decisions informed by ongoing customer contact:
Generative research (exploring):
Evaluative research (testing):
Understand why customers “hire” your product:
Example: Customers don’t buy a CRM to “manage contacts”; they hire it to “close more deals & hit quota so they earn commission & feel successful.”
Common mistakes:
Best practices:
The product itself drives acquisition, activation, & retention:
Core principles:
Companies that nailed PLG: Slack, Zoom, Dropbox, Notion, Figma
Onboarding:
Freemium model:
Viral mechanics:
| Factor | Product-Led Growth (PLG) | Sales-Led Growth (SLG) |
|---|---|---|
| Ideal ACV | <$5K annually | >$50K annually |
| Time to Value | Minutes to hours | Weeks to months |
| Buying Process | Individual or small team decision | Committee, procurement involved |
| Onboarding | Self-service, intuitive UI | High-touch, training required |
| Pricing | Transparent, low initial cost | Custom quotes, negotiation |
| Sales Cycle | Days to weeks | 3-6+ months |
| Customer Acquisition | Viral, word-of-mouth | Outbound, field sales |
| Best For | SMB, individual users | Enterprise, complex workflows |
| Examples | Slack, Zoom, Notion, Figma | Salesforce, Workday, SAP |
| CAC | Low ($100-$1K) | High ($10K-$100K+) |
| Expansion Model | Usage-based, seat expansion | Upsell, cross-sell |
PLG thrives when:
PLG struggles when:
Indicators you need dedicated product management:
Typical timing: 15-30 employees, post product-market fit, scaling phase
Early stage (1-2 PMs):
Growth stage (3-10 PMs):
Scale stage (10+ PMs):
Core competencies:
Specialized skills:
For comprehensive AI implementation strategies & team building, see our AI Implementation Guide.
Healthy dynamics:
Red flags:
As product teams scale, operations become critical:
Product ops responsibilities:
When to hire: 10+ PMs, 100+ employees, or significant process pain
Running effective A/B tests requires:
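One piece of this is the statistics; below is a minimal two-proportion z-test sketch for reading out a conversion-rate experiment (the sample sizes & conversion counts are hypothetical):

```python
import math

# Simple two-proportion z-test: is variant B's conversion rate different
# from variant A's? The inputs below are made-up example numbers.
def z_test(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

z = z_test(conv_a=480, n_a=10_000, conv_b=540, n_b=10_000)
print(f"z = {z:.2f}")  # |z| > 1.96 is roughly significant at the 95% level (two-sided)
```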
Weekly product reviews:
Quarterly planning:
Drivers:
Challenges:
Must-haves for enterprise buyers:
Advanced capabilities:
Common tension:
Resolution strategies:
Start by validating the problem (do customers have this pain?), then validate your solution (does it actually solve the problem?), & finally validate the market (is it large enough?). Key signals of PMF include users getting upset when the product is down, sales cycles shortening, >40% word-of-mouth acquisition, & flat retention cohorts. PMF exists on a spectrum & varies by customer segment.
RICE (Reach × Impact × Confidence / Effort) works well for data-driven teams requiring quantifiable decisions. Value vs Complexity matrix is best for visual prioritization & quick communication. Kano Model excels when focusing on user satisfaction & feature categorization. Choose based on your team’s needs: RICE for rigor, Value/Complexity for speed, Kano for customer-centricity.
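A minimal sketch of the RICE arithmetic; the feature names & estimates below are hypothetical, purely to illustrate how the score ranks a backlog:

```python
# RICE = (Reach x Impact x Confidence) / Effort
def rice(reach: float, impact: float, confidence: float, effort: float) -> float:
    return (reach * impact * confidence) / effort

# Hypothetical backlog items with estimated inputs.
backlog = {
    "sso_login":     rice(reach=2000, impact=2.0, confidence=0.8, effort=4),
    "dark_mode":     rice(reach=5000, impact=0.5, confidence=0.9, effort=2),
    "usage_exports": rice(reach=800,  impact=3.0, confidence=0.5, effort=3),
}

# Highest score first: the ranking, not the absolute number, is what matters.
for name, score in sorted(backlog.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.0f}")
```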
Early stage (1-2 PMs): Generalists owning entire product surface area. Growth stage (3-10 PMs): Assign PMs to product areas with specialized roles emerging (Growth PM, Platform PM). Scale stage (10+ PMs): Product groups aligned to business objectives with clear levels (APM, PM, Senior PM, Group PM). Hire your first PM at 15-30 employees post-PMF when founders spend >20 hours/week on product decisions.
PLG is when the product itself drives acquisition, activation, & retention without sales reps. Users experience value before buying through free trials or freemium tiers. It works best when time to value is fast (<30 minutes), individual users can adopt without approval, & pricing is low (<$100/month). Companies like Slack, Zoom, & Notion exemplify PLG.
Hire when founders spend >20 hours/week on product decisions, engineering is unclear on priorities, features ship but aren’t used, customer requests overwhelm the team, or roadmap decisions happen ad-hoc. Typical timing is 15-30 employees post product-market fit during scaling phase. First PM should be a strong generalist who can handle research, roadmap, & GTM.
Use frameworks like RICE scoring for objective quantification, Value vs Complexity for visual prioritization, or Kano Model for customer satisfaction focus. Always frame decisions with opportunity costs (“yes to X means no to Y”), use data on usage & retention impact, ensure strategic alignment with vision, & consider customer segmentation (right for which ICP?). The best PMs are measured by features they successfully avoided building.
Good strategy connects vision to execution with five elements: target customers (who you’re building for), value proposition (what problem you’re solving & why choose you), key capabilities (what the product must do exceptionally well), competitive positioning (where you play & how you win), & success metrics (how you measure progress). It should be aspirational yet believable, customer-centric, & differentiated.
Moving up-market requires product changes (SSO, RBAC, audit logs, SLAs, compliance certifications), organizational shifts (field sales team, customer success, professional services), & balancing multiple customer segments. Common tension: SMB wants simplicity while enterprise wants customization. Resolve through separate tiers, platform approaches (core + enterprise add-ons), or dedicated teams for each segment.