2026-06-15 23:01:32
Listen now on YouTube • Spotify • Apple Podcasts
Claire puts Claude Fable 5, Anthropic’s first generally available Mythos-class model, through a series of real-world tests: product specs, agent workflows, design tasks, vision tasks, and multi-agent orchestration. She breaks down what Anthropic is claiming, where the model genuinely feels like a leap forward, and where it surprisingly falls short.
Fable 5 is Anthropic’s first “Mythos-class” model to reach general availability, and it’s crushing benchmarks across the board. It hit 80% on SWE-bench Pro, significantly outperforming Opus 4.8, GPT-4.5, and Gemini 3.1 Pro. Claire found the model excels in specific areas while falling short in others that matter for everyday product work.
The model is expensive by design: $10 per million input tokens and $50 per million output tokens. That’s a new tier above Opus, and it consumes tokens at roughly twice the rate of other models. You need to be strategic about when to deploy this level of intelligence versus using cheaper models like Sonnet or Opus for simpler tasks.
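To make the pricing concrete, here is a rough back-of-the-envelope sketch in Python. The per-million-token prices are the figures quoted above; the helper name `estimate_cost` and the example token counts are hypothetical, and the 2x multiplier is only the rough rate mentioned here, not a measured value.

```python
# Prices quoted above, in USD per million tokens.
INPUT_PRICE_PER_M = 10.0
OUTPUT_PRICE_PER_M = 50.0


def estimate_cost(input_tokens: int, output_tokens: int, token_multiplier: float = 1.0) -> float:
    """Estimate the USD cost of one request.

    token_multiplier scales both token counts; 2.0 approximates the
    "roughly twice the rate" figure (an assumption, not a measurement).
    """
    scaled_in = input_tokens * token_multiplier
    scaled_out = output_tokens * token_multiplier
    return (scaled_in * INPUT_PRICE_PER_M + scaled_out * OUTPUT_PRICE_PER_M) / 1_000_000


# Hypothetical task: 200k input tokens and 40k output tokens on another model.
baseline = estimate_cost(200_000, 40_000)
doubled = estimate_cost(200_000, 40_000, token_multiplier=2.0)
print(f"${baseline:.2f} at the same token count, ${doubled:.2f} at twice the tokens")
```

Running the same estimate against the price sheet of a cheaper model is one simple way to decide whether a task justifies this tier.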
Fable 5 works like a “seasoned engineer”—which is both its superpower and its Achilles’ heel. It’s thorough, autonomous, and will investigate every corner of a problem to be 120% sure it’s shipping the right thing. Sometimes you need a model that’s a little less thorough, a little “dumber,” to actually ship something useful quickly.
The model is exceptionally good at vision tasks, particularly document formatting and PDF parsing. Claire tested it on creating handwriting worksheets for her 7-year-old and found it dramatically outperformed Opus 4.8—better spacing, clearer layout, appropriate white space. This extends to other vision tasks where you want something to look good or need to parse complex documents.
The writing is nearly unreadable for specs and PRDs. Claire found that Fable 5 produces extremely detailed, technically complete documents that are almost impossible to parse. It gets wrapped around the axle on details, creates big blocks of dense paragraphs with internal references, and makes it hard to see the forest for the trees.
Design output is shockingly bad, at least for one-shot design tasks. When Claire asked Fable to design a skills registry, it produced a fundamentally terrible design: gray, black, and red, with simple outlines. This was a real surprise given the model’s benchmark performance.
The model is conservative on execution and takes “minimal” very literally. When Claire asked it to ship an MVP that would deliver customer value, Fable produced something extremely narrow and not actually that useful. This conservatism may stem from the safety guardrails built into the model.
Fable 5 includes specific safeguards for cybersecurity, biology, chemistry, and distillation tasks. Instead of blocking you entirely, it uses a new “fallback” concept—if you get classified into one of these categories, it gracefully falls back to Opus 4.8. Anthropic reports that 95% of sessions don’t hit a fallback, and they maintain a 30-day retention policy solely to catch misuse.
Multi-agent orchestration is technically possible but not yet reliable. Claire tested the dynamic workflows and subagent capabilities extensively and had some successful multi-agent runs, but also encountered frequent stalls and errors. She walked away from her laptop and came back to find subagents had stalled after about three hours.
The key insight: match model intelligence to task complexity. Claire recommends using it for hard technical problems where extreme detail matters, long-horizon work, and vision tasks. But for front-end work, strategy, specs, and design, other models in the ecosystem will serve you better and cost less.
This is “baby Mythos,” not the full Mythos model. Fable 5 has guardrails that the unrestricted Mythos model (available only to Project Glasswing partners) doesn’t have. The underlying model is the same, but Fable is tuned for safety and general availability.
How I AI: My Honest Review of Claude Fable 5: https://www.chatprd.ai/how-i-ai/claude-fable-5-review
Listen now on YouTube • Spotify • Apple Podcasts
Claire sits down with Ankur Goyal, the founder and CEO of Braintrust, to unpack how top engineering teams are using AI agents, evals, and CI to ship better software faster. They get into why agents are now capable of tackling hard infrastructure problems, how to decide what work sits “below the agent line,” and why evals are quickly becoming the modern version of a PRD. Ankur’s core message: the best teams won’t just use AI to write more code; they’ll build the feedback loops, benchmarks, and systems that let AI improve the quality of the product itself.
There’s no staff engineer running as many rigorous benchmarks as someone using an agent. Ankur viscerally disagrees with engineers who say AI can’t handle complicated problems. While models might not be perfect at writing highly concurrent code, they excel at running exhaustive experiments—testing every column store format, every execution engine, every optimization strategy. The baseline of rigor you get from agents is incredible, and there’s simply no excuse anymore to skip benchmarks because they’re tedious.
The agent line keeps going up—and you need to identify what’s below it. Many interactions, decisions, and directions that feel like they need human judgment actually fit “below the agent line.” If you took the information from a meeting and gave it to an agent, would it solve the same problem? Increasingly, the answer is yes. The best teams push this line higher by building smart skills and integrations that expand what agents can handle autonomously.
Practical quality beats theoretical quality every time. In theory, a human engineer with infinite time and focus might produce better code than an AI agent. In practice, humans lose context over days, have decaying attention spans on hard-but-tedious problems, and skip benchmarks they know they should run. AI agents maintain consistent focus, run every test, and can work on problems continuously for days or weeks. The practical quality of AI-assisted engineering is higher because of sustained rigor, not because the code is theoretically better.
You can now bite off much harder technical problems than before. Companies historically avoid major infrastructure changes because the cost of testing alternatives is prohibitively high and the unknown unknowns are risky. With AI agents, you can exhaustively test six different database solutions, run thousands of benchmarks on production-scale data, and make informed decisions about platform shifts that would have been impossible before. The business case for deep technical work becomes much easier when agents do the heavy lifting.
Run four to six foreground agents simultaneously—that’s the human concurrency limit. Ankur runs different agents working on different problems. This matches the personal concurrency limit most people can manage; you can’t effectively context switch between more than that. Some agents run locally, and others run remotely on cloud infrastructure with production-scale data. The key is isolation: each agent has its own environment, ports, and services.
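The isolation idea can be sketched in a few lines of Python. This is only an illustration under assumed names: `AgentEnv`, `plan_agent_envs`, the agent names, the `/tmp/agents` root, and the port numbers are all hypothetical and are not taken from Ankur's actual setup. The sketch only plans the layout; it does not create directories or start services.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class AgentEnv:
    name: str
    workdir: Path
    app_port: int
    db_port: int


def plan_agent_envs(names, root=Path("/tmp/agents"), base_port=4000, ports_per_agent=10):
    """Give each agent its own directory and a non-overlapping block of ports."""
    envs = []
    for index, name in enumerate(names):
        block_start = base_port + index * ports_per_agent
        envs.append(AgentEnv(name, root / name, block_start, block_start + 1))
    return envs


# Four hypothetical foreground agents, each working on a different problem.
envs = plan_agent_envs(["query-planner", "index-bench", "docs-eval", "ci-fix"])
for env in envs:
    print(env.name, env.workdir, env.app_port, env.db_port)
```

Reserving a block of ports per agent, rather than a single port, leaves room for extra services without two agents colliding.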
Evals are the modern PRD—they define what success looks like, not how to achieve it. Machine learning shifts programming from defining implementation details to defining success criteria. Just like the best PRDs include user stories and examples, the best evals include concrete test cases and scoring functions. The difference is that evals quantify success in ways that can be automatically measured and improved. This lets you focus on outcomes while AI figures out the implementation.
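As a minimal sketch of what "concrete test cases and scoring functions" can look like, here is a tiny eval in plain Python. Everything in it is an assumption for illustration: the `EvalCase` shape, the keyword-based scorer, the example questions, and the stub system are hypothetical, and this is not Braintrust's API.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    question: str
    must_mention: list[str]  # facts a good answer is expected to include


def keyword_score(answer: str, case: EvalCase) -> float:
    """Fraction of expected facts that appear in the answer (0.0 to 1.0)."""
    if not case.must_mention:
        return 1.0
    text = answer.lower()
    hits = sum(1 for fact in case.must_mention if fact.lower() in text)
    return hits / len(case.must_mention)


def run_eval(system: Callable[[str], str], cases: list[EvalCase]) -> float:
    """Average score of `system` across all cases."""
    scores = [keyword_score(system(case.question), case) for case in cases]
    return sum(scores) / len(scores)


# Hypothetical cases and a stand-in "system"; a real one would call a model.
CASES = [
    EvalCase("How do I reset my password?", ["settings", "reset link"]),
    EvalCase("Which plans include SSO?", ["enterprise"]),
]


def stub_system(question: str) -> str:
    if "password" in question:
        return "Open Settings and request a reset link."
    return "All plans."


print(run_eval(stub_system, CASES))
```

The cases and the scorer say what a good answer contains without prescribing how the system produces it, so the implementation behind `system` can change freely while the score stays comparable.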
Build a feedback loop that automatically turns real-world data into evals. For AI product teams, the #1 engineering priority isn’t prompt engineering or picking an agent framework; it’s building a pipeline that collects real-world data and converts it into evals. This is the same principle as investing in CI for traditional software: you’re building the platform that lets agents do the work engineers used to do manually. Without this feedback loop, you’re stuck in whack-a-mole mode, fixing individual cases without systematic improvement.
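One hedged way to picture that pipeline: take logged production interactions and promote the ones users rated poorly into new eval cases. The log record shape, the field names, and the `logs_to_eval_cases` helper below are all hypothetical; a real pipeline would read from whatever logging or observability store a team already uses.

```python
def logs_to_eval_cases(logs: list[dict]) -> list[dict]:
    """Turn negatively rated production interactions into eval cases.

    Each log record is assumed to look like:
    {"input": str, "output": str, "rating": "up" or "down", "expected": str or None}
    """
    cases = []
    seen_inputs = set()
    for record in logs:
        if record.get("rating") != "down":
            continue  # in this sketch, only failures become new cases
        key = record["input"].strip().lower()
        if key in seen_inputs:
            continue  # skip near-identical repeats of the same question
        seen_inputs.add(key)
        cases.append({
            "input": record["input"],
            "bad_output": record["output"],
            # Stays None until a reviewer supplies the correct answer.
            "expected": record.get("expected"),
        })
    return cases


sample_logs = [
    {"input": "How do I export data?", "output": "You can't.",
     "rating": "down", "expected": "Use the Export button."},
    {"input": "how do i export data? ", "output": "Not possible.",
     "rating": "down", "expected": None},
    {"input": "What is the rate limit?", "output": "100 requests/min.",
     "rating": "up", "expected": None},
]
new_cases = logs_to_eval_cases(sample_logs)
print(len(new_cases))
```

Each new case then joins the eval set, so a fix for one complaint is checked on every later run instead of being patched once and forgotten.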
Quantify your designer’s taste so it scales across your product. Ankur runs hundreds of evals to improve things quantitatively, then asks David (their tastemaker designer) for a vibe check every few days. When David destroys his work, Ankur captures the feedback (“David thinks it’s OK to show both languages as long as . . .”) and improves the scoring functions to encode David’s palette. This doesn’t replace David; it amplifies him. They’re able to apply David’s quality bar to more things than he could ever review manually.
Product building is now carving, not constructing. It’s extremely fast to create something with too many features, too many buttons, and too much code. The hard part is removing stuff. When customers complain, Braintrust removes the thing causing confusion 90% of the time, making the system work better by eliminating complexity. This is the opposite of traditional product development, where you carefully add features one by one.
Invest in CI to earn the ability to move faster—it’s the platform for AI-powered engineering. Every engineer is now building a platform upon which agents do the work engineers used to do manually. For traditional software, that platform is CI. If you feel constrained by velocity, don’t ship crappy stuff faster. Instead, pause and improve CI so you earn the ability to move faster safely. The same principle applies to AI products: build the eval pipeline first, then let agents optimize within that system.
When agents fail, close the session and improve the evals—don’t yell or bribe. Ankur’s back-pocket strategy is remarkably disciplined: he doesn’t try to prompt his way out of problems. He closes the session, improves the evaluation criteria or success metrics, and starts fresh. Sometimes this means hand-writing code to better understand the problem (like when he spent a weekend hand-writing a 3,000-line eval that had become trash through vibe coding). The solution is always better evals, not better prompting.
Blog: Ankur Goyal’s Playbook for Agent-Driven Benchmarking and AI Evals https://www.chatprd.ai/how-i-ai/ankur-goyals-playbook-for-agent-driven-benchmarking-and-ai-evals
Workflows:
↳ How to Scale Expert Judgment in AI Systems with a Human Feedback Loop: https://www.chatprd.ai/how-i-ai/workflows/how-to-scale-expert-judgment-in-ai-systems-with-a-human-feedback-loop
↳ How to Use AI Coding Agents for Exhaustive Infrastructure Benchmarking: https://www.chatprd.ai/how-i-ai/workflows/how-to-use-ai-coding-agents-for-exhaustive-infrastructure-benchmarking
If you’re enjoying these episodes, reply and let me know what you’d love to learn more about: AI workflows, hiring, growth, product strategy—anything.
Catch you next week,
Lenny
P.S. Want every new episode delivered the moment it drops? Hit “Follow” on your favorite podcast app.
2026-06-15 20:04:03
In this episode, I sit down with Ankur Goyal, founder and CEO of Braintrust, the AI evals and observability platform used by teams like Notion, Stripe, Vercel, and Zapier. This one is for the senior engineers, staff engineers, VPs of engineering, and CTOs in my audience. We get into how coding agents can take on deeply technical architecture and infrastructure work that no single human engineer could tackle before, and then we demystify evals so you can use them to make your AI products better without touching the implementation.
Listen or watch on YouTube, Spotify, or Apple Podcasts
How Ankur uses Codex to run week-long benchmark experiments across database indexes, column store formats, and execution engines to speed up slow queries
Why he argues there’s no excuse to skip rigorous benchmarking now that agents can run them tirelessly
The “agent line” framework: how to decide which decisions, directions, and interactions you can hand off to an agent
How I think about the practical vs. theoretical quality of AI on hard technical problems, and why human attention decays on tedious work
Why evals are the modern version of a PRD, and how to encode “what good looks like” so a model can figure out the “how”
How to build a scoring function live and let an agent improve your prompt inside a safe playground
How Ankur turned his designer David’s taste into a repeatable eval so quality scales beyond one person
Why fixing your CI is the highest-leverage way to speed up engineering velocity
Guru—The AI layer of truth
Persona—Trusted identity verification for any use case
(00:00) Introduction to Ankur Goyal
(03:00) Using AI agents for database optimization
(06:10) Running exhaustive benchmarks with coding agents
(09:03) Why staff engineers are wrong about AI limitations
(11:30) The “agent line” framework for delegation
(14:00) Ankur’s workflow: running 4 to 6 concurrent agents
(17:16) Technical setup: foreground agents, background agents, and cloud environments
(20:32) Spending time with AI tools
(23:06) Demystifying evals
(26:02) Live demo: Building an eval for documentation answers
(30:20) The alternative to evals: vibe checks and whack-a-mole
(32:09) Capturing designer taste in scoring functions
(33:13) Quick recap
(33:44) Managing velocity and throughput
(35:40) Why CI/CD investment is critical for AI-accelerated teams
(37:30) Ankur’s prompting strategy when agents fail
(39:10) Closing thoughts and how to connect
• Braintrust: https://www.braintrust.dev/
• Codex: https://openai.com/codex/
• GPT 5.4: https://developers.openai.com/api/docs/models/gpt-5.4
• Claude: https://claude.ai/
• GPT 5.5 just did what no other model could: https://www.lennysnewsletter.com/p/gpt-55-just-did-what-no-other-model
• Paul Graham’s Maker vs. Manager Schedule: http://www.paulgraham.com/makersschedule.html
• tmux: https://github.com/tmux/tmux
• Chris Tate at Vercel: https://www.linkedin.com/in/ctatedev/
LinkedIn: https://www.linkedin.com/in/ankrgyl/
ChatPRD: https://www.chatprd.ai/
Website: https://clairevo.com/
LinkedIn: https://www.linkedin.com/in/clairevo/
Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email [email protected].
2026-06-14 20:31:44
Mark Pincus founded Zynga—the company behind Words With Friends, FarmVille, and Zynga Poker—and has arguably created more hit consumer products than anyone in history. At Zynga, eight of 10 major game launches became massive hits, reaching over a billion players. Over the past five years, Mark has been synthesizing everything he’s learned about building successful consumer products and turning it into a book, Life at the Speed of Play, which comes out on June 23. This is the first interview he’s done about the book.
Listen on YouTube, Spotify, and Apple Podcasts
His “Proven, Better, New” framework: copy what’s proven, make it better so that 10 out of 10 people say “f*ck yes, I’ll use this”—then add something new
Why being less ambitious is the path to the most ambitious ideas
His rule of thumb that your instincts are right 95% of the time, but your ideas are wrong 75% of the time
“Kill hope before hope kills you”
How to raise kids in the age of AI
WorkOS—Make your app enterprise-ready, with SSO, SCIM, RBAC, and more
Vanta—Automate compliance, manage risk, and accelerate trust with AI
• LinkedIn: https://www.linkedin.com/in/markpincus
• Website: https://www.lifeatthespeedofplay.com
• Tribe.net: https://en.wikipedia.org/wiki/Tribe.net
• Zynga: https://www.zynga.com
• Sid Meier: https://en.wikipedia.org/wiki/Sid_Meier
• Electronic Arts: https://www.ea.com
• CityVille: https://en.wikipedia.org/wiki/CityVille
• Words With Friends: https://wordswithfriends.com/
• Scrabble: https://playscrabble.com
• Reddit: https://www.reddit.com
• MIT Media Lab founder Nicholas Negroponte’s 1984 TED talk (TED Radio Hour): https://www.ted.com/talks/nicholas_negroponte_5_predictions_from_1984
• Peter Thiel on LinkedIn: https://www.linkedin.com/in/peterthiel
• FarmVille: https://en.wikipedia.org/wiki/FarmVille
• Craig Newmark: https://en.wikipedia.org/wiki/Craig_Newmark
• How to consistently go viral: Nikita Bier’s playbook for winning at consumer apps (co-founder of TBH, Gas, advisor, investor): https://www.lennysnewsletter.com/p/how-to-consistently-go-viral-nikita-bier
• Angry Birds: https://www.angrybirds.com/
• OMGPop: https://en.wikipedia.org/wiki/OMGPop
• Draw Something: https://en.wikipedia.org/wiki/Draw_Something
• Slack founder: Mental models for building products people love ft. Stewart Butterfield: https://www.lennysnewsletter.com/p/slack-founder-stewart-butterfield
• Brian Chesky’s new playbook: https://www.lennysnewsletter.com/p/brian-cheskys-contrarian-approach
• Garry Tan on LinkedIn: https://www.linkedin.com/in/garrytan
• Brian Armstrong on LinkedIn: https://www.linkedin.com/in/barmstrong
• Jason Citron on X: https://x.com/jasoncitron
• Stanislav Vishnevskiy on LinkedIn: https://www.linkedin.com/in/svishnevskiy
• Jeff Bezos on X: https://x.com/JeffBezos
• Andy Jassy on X: https://x.com/ajassy
• Niantic: https://nianticlabs.com
• Pokémon Go: https://pokemongo.com
• Bing Gordon on LinkedIn: https://www.linkedin.com/in/binggordon
• Life at the Speed of Play: Launch Products People Love!: https://www.amazon.com/Life-Speed-Play-Launch-Products/dp/0063352575/ref=tmm_hrd_swatch_0
Lenny may be an investor in the companies discussed.
2026-06-14 04:54:15
👋 Hello and welcome to this week’s edition of ✨ Community Wisdom ✨, a subscriber-only email delivered every Saturday, highlighting the most helpful conversations and events happening in our subscriber-only Slack community.
A big thank-you to this month’s community sponsor, Strella. Strella is an AI-powered qualitative research platform that allows teams to run, analyze, and share customer interviews at scale without sacrificing depth. For usability testing, concept validation, discovery research, and more, Strella delivers insights fast and makes them accessible right where you work, including Claude, ChatGPT, and Figma, so your research keeps working long after the initial readout.
The Lenny and Friends Summit returns on September 10 in San Francisco. This will be the greatest assembling of product leaders in history.
We’re keeping it intentionally small (about 1,000 people)—every attendee will be handpicked—and along with in-depth talks from the world’s top operators, we’ll have interactive workshops, tons of opportunities to get to know other attendees, and a few fun surprises.
Here’s the initial lineup of speakers (with more to be announced soon):
Given that we’re anticipating lots of interest and the venue has limited capacity, we’re asking people to apply to attend. Paid newsletter subscribers will get priority access.
P.S. We’ve expanded capacity at the venue, so if you’ve received an email saying there wasn’t a spot for you, reply to this email and we’ll take another look at your application.
Click the city name to RSVP:
Amsterdam. June 17th. Thanks to @Adriana Mosnoi, @Ruslan Doronichev & @Luke Rynne Cullen!
Asheville. June 17th. Thanks to @Nathan Phillips!
Atlanta. June 25th. Thanks to @Ravish C.!
Austin. June 25th. Thanks to @Mark Vandegrift & @Andy Keil - Austin!
Bellevue. June 23rd. Thanks to @Aman Goyal!
Boston. June 17th. Thanks to @David Jorjani!
Chicago. June 17th. Thanks to @Jason Siegel!
Boulder. June 25th. Thanks to @Dave Carlyle!
Hong Kong. June 17th. Thanks to @Manny Reimi!
Lisbon. June 22nd. Thanks to @Gabriela Naumnik & @Nina Un!
Munich. June 16th. Thanks to @Lukas Gerhardt!
Philadelphia. June 17th. Thanks to @Keriann Sabatini & @Doug Clark!
San Francisco. June 25th. Thanks to @Tarek Sadi!
Santa Barbara. June 15th. Thanks to @Joni Hoadley, @Cody Landstrom, & @Oliver Barton!
Shenzhen Afternoon Meetup. June 18th. Thanks to @Ivan Xu!
Toronto. June 24th. Thanks to @Jessie Wang!
Valencia. June 26th. Thanks to @Paul Boudet!
Vienna. June 16th. Thanks to @Serge Versille!
Warsaw. June 16th. Thanks to @Matt Swulinski!
We have a special event for the community’s #book-club this month: June’s book club selection is The Mom Test by Rob Fitzpatrick, and the man himself, Rob, will be joining us on June 24 for a Q&A!
Join the #book-club channel to participate, and be sure to RSVP by June 23. Huge shoutout to Akil Bhagat for running the current iteration of our book club.
We’d love to get your perspective on what it’s like to work in tech right now. The survey will take less than 5 minutes to complete, and we’ll send you the anonymized raw data so you can perform your own analysis, along with an early look at the results before we share them publicly.
Father of the iPod and iPhone on building taste, judgment, and creativity in the AI era | Tony Fadell: YouTube, Spotify, Apple Podcasts
Claude Fable 5 review: what the new Mythos model gets right (and very wrong): YouTube, Spotify, Apple Podcasts
5 Career Questions Your Old Playbook Can’t Answer: YouTube, Spotify, Apple Podcasts
Thought of posting this on #talk-lol, but this is such an interesting use case and it seems real. Someone “hooked my Whoop to my work calendar to find which coworker gives me the most stress.” One of the replies: “HR reviews could never, this has actual data. Somewhere an Anthropic PM is adding ‘coworker stress forensics’ to the use case list. Legend behavior, monetize it.” ↗
—Guy Peled
Jeremy: I did a similar thing a few months ago, but used Apple Watch heart rate data. I only tried it once, but noticed that my heart rate was lower in larger meetings, but certain 1:1s definitely caused a spike. I’m fairly confident it was related to the level of engagement, but it was a fun vibe-coded experiment.
Guy Peled: Love it. So many factors go into this. Definitely beyond the people in the room:
Amount of people
Your host/participant role
Content (e.g. department restructuring announcement meeting)
etc.
Abdussamad Bello: Has to be a PM!
Miroslav Pavelek: I love the idea! But in comments it is also correctly pointed out that, well, Whoop (or any other device on the market) is not really good at measuring stress.
Jeremy: @Miroslav Pavelek It’s best to view them as directional instead of completely accurate. Exploring the concept with this level of data feels reasonable to me.
For managers of managers, how do you have your leaders provide weekly status reports for non-sprint/dev-related work? For example, initiatives that involve other divisions/teams, GTM work, research, etc.—how do you stay up to date with progress and what their teams are focused on? ↗
Ian McAllister: For anything weekly sent via email, do your teams a favor and enforce a very short/crisp format. Green/Yellow/Red status for key projects with a path to Green for anything Yellow or Red. Anything longer simply won’t get read, so all the effort writing it is wasted.
Save the thoughtful descriptions and details for monthly reviews when there is time to process and discuss.
Aaron Nichols: I think this depends greatly on why you want those updates. I’ve operated in teams where those updates needed to go to someone else—in which case I can just observe what gets reported through those normal means. I’ve also operated in orgs where my teams operated fine without my oversight on regular execution and MBR/QBR updates were sufficient to keep tabs. I tend to be more worried about whether my teams are getting good context and signal from me and their stakeholders than about status updates to me—but this all depends on your level, the size of your org, maturity, etc.
There’s a reality here that you simply do not stay up to date on what everyone is doing. You’ll spend all your energy keeping tabs instead of doing the work you actually should be doing.
Cindy Cohen: I’ve found the most effective updates focus on outcomes, risks, and decisions needed.
For cross-functional initiatives, I typically ask leaders to provide:
Objective / desired outcome
Current status (green/yellow/red)
Key accomplishments since last update
Next 1–2 priorities
Risks, dependencies, or blockers
Decisions or support needed from leadership
Then I use 1:1s to go deeper where needed rather than reviewing everything in detail.
The biggest mistake I see is reporting activity (“met with X, researched Y”) instead of progress toward an outcome. A short update tied to objectives usually provides enough visibility without creating reporting overhead.
Adam Thackeray: Put @Ian McAllister recommendation in bold. Simple is better. Get to the point and what help is needed (if at all).
Question for solo founders and 2–5-person teams where everyone is an engineer or PM: how are you handling marketing/GTM day-to-day?
Roughly three paths I keep seeing, curious which one you took and how it went:
Hire an agency/freelancer—pay someone external to run SEO, content, ads, etc.
Hire a GTM/marketing person—first marketing hire, in-house
DIY—founder learns the tools and systems (SEO, email, social, ads) and just does it
React with (1), (2), (3) for what you actually did, and if you have 2 minutes I’d love to hear:
Roughly what it costs you per month (money and/or founder hours)
What you’d do differently if starting today
(Context: solo founder with a product background here—figuring out how much of marketing I should learn vs. buy. Not selling anything in this thread.) ↗
—Jake Luo
Shawn Jones: Hermes agent + company context brain (LLM-wiki) + growth skills + management tool (Notion/Linear/etc.).
Jake Luo: Cool solution. Would you mind sharing what growth skills you use? Do you use the Hermes agent to run your SoMe and ads campaigns as well?
Shawn Jones: I recommend creating your own custom skills for your industry/business, but have your agent reference these for inspiration: https://github.com/coreyhaines31/marketingskillsi
Primarily have my agent focused on the content development and management side. I’m planning to set up ad management next.
Christos Apartoglou: I think it is hard to assess what the right direction for your team is with the information you provided. My recommendation is to do the following exercise:
Spend some time understanding what your team needs at this life stage. Is it validating PMF? Is it demand generation? Something else?
For the needs you have identified, it will be important to then understand if they are one-offs or evergreen and the relative degree of prioritization. This can guide you on identifying the skills and competencies you may need to add to the team and for how long.
This information can help guide your staffing approach between the three options you are contemplating. If you already have done this work I am happy to jump on a quick 30-minute call and help make sense of it. I was an in-house marketing lead for many companies in the past and now have a fractional consulting practice (though I am not attempting to sell you anything—I don’t have capacity for more clients atm).
Matthew Stublefield: Founder and solopreneur here. I recommend the books Obviously Awesome and Crossing the Chasm and learning about positioning and copywriting. I have built and bought some AI tools that help me with copywriting and SEO, and that combined with some reading and learning has helped. I also paid for some inexpensive digital courses that helped (Csaba Borzasi and Katelyn Bourgain).
Bal Sieber: Path 3, with an asterisk. Solo consultancy here, product design background, so writing was never the scary part. The system was. What I actually run: I do the thinking and the voice, and a set of scheduled AI jobs does the reps. Sourcing who to talk to, drafting engagement, keeping the daily floor. My time lands around an hour a day of reviewing and steering rather than producing. Tooling is a few hundred a month all in. No agency, no hire. What I’d do differently: skip the months of treating marketing as the thing I’d get to after the real work. It only started compounding once it got the same seriousness as client work, a daily floor and a weekly review. One caveat: This shape works because my buyer hangs out where text lives. If your customers need ads or SEO at scale, my setup says nothing about that.
Jake Luo: Seems like many folks focus on content writing and SEO, which is great for building organic pipeline. But sometimes I feel like it’s slow. I would consider doing performance marketing to accelerate user acquisitions. Then how to run paid ads efficiently and effectively is another challenge to tackle.
Mih Fodor: Buy the Demand Curve program and go through it. You’ll have everything planned after that, and you can use what @Shawn Jones suggested.
Bal Sieber: Paid is the one lane I can’t speak to with real numbers, so season this hard. The one thing I’d check before buying acceleration, since it’s the part I do see every week: what the first session after the click does. Ads multiply whatever your signup-to-first-value path already does, so if that path leaks, paid mostly accelerates the leak, at CPC prices. The slow organic phase is annoying, but it’s also cheap rehearsal. You find the funnel holes while the traffic is free. I’d want evidence the path holds before paying to fill it.
Dinuka Wijesinghe: As a starting point before you think about execution, I’d encourage you to spend high-quality time developing the following:
Target market/ICP
Market dynamics incl. competitors and alternatives
Your positioning in relation to market dynamics (value prop, problem & solution statements, pricing etc.)
Likely acquisition channels based on where you think your ICP is and where you’re likely to recognize their trigger point
One primary GTM metric (e.g. # of users, amount of usage, $s etc.) and a goal for that metric—start with only one metric
Interrogate these with your team/investors/trusted advisors—once you feel like you know this back to front and have confidence in it, then think about how you could systematically execute outreach. For example, pick 2-3 channels to test, what would be an ideal cadence of outbound activity you’d need to execute to get meetings/users. At first this will be an educated guess, but you can refine once you get more data. Only then should you consider the mechanics of execution (i.e. the options you and others have listed here)—and the answer will become clearer. More importantly, you’ll have the rationale for the decision you made, which will make it easier for you to adjust if the data you get is different. Take your GTM as seriously as your product. Throwing AI at it or hiring an agency from the start won’t deliver good results if you as the founder don’t have the clarity to ask the right questions. If you want to chat specifics in relation to your market, I can do a short call.
Harshil Bhimani: I have worked with founders before, going 0 to 1. Usually they have a very good understanding of ICP. So the main goal is to find message-market fit. I think all options have tradeoffs. Imo you can do it yourself if you have bandwidth, otherwise outsource it to a good agency/freelancer. Hiring FT is very costly if you don’t have a playbook imo. Wish you luck!
I’m curious to know, how has AI actually changed the way your product team operates?
For context, I’m working on re-organising the product operating model in my org and want to understand what’s working for others.
Two things I’m keen to learn from people who’ve gotten their hands dirty:
What’s one thing that’s genuinely changed in how your team works—not just speed, but how you’re structured, how you hand off to engineering, or how decisions get made?
I’m hearing PMs or designers pushing changes directly to GitHub a lot. We are not there yet, so for those who do, what does that look like in practice, and what did you need to be in place for it to work?
If you’ve worked with teams experimenting with BMAD or similar approaches, I’d love to hear what the PM side of that looks like.
If this has been answered before, happy to be pointed to the relevant thread.
—Nishanth D’Souza
2026-06-10 02:32:03
Claude Fable 5 is the first Mythos-class intelligence model to be generally available, and I got early access to test it before launch. I walk through what Anthropic is promising, what actually stood out when I used it on real work, and where I think it fits in your AI stack.
Listen or watch on YouTube, Spotify, or Apple Podcasts
(00:00) Introduction: Fable 5 is finally here
(00:31) What Anthropic says about the model
(05:14) Token-intensive by design
(06:28) Safety classifiers and the new fallback concept
(07:46) Is this or is this not Mythos?
(08:30) New product launches: Managed Agents and more
(09:20) Crushing benchmarks
(09:55) What it’s actually like to use (the good and the bad)
(11:40) Test 1: product graph spec
(12:56) Test 2: designing a skills registry
(14:04) Conservative on execution
(14:43) Test 3: multi-agent orchestration
(15:39) My takeaways
• Claude Fable 5: https://www.anthropic.com/news/claude-fable-5-mythos-5
• Claude Managed Agents: https://platform.claude.com/docs/en/managed-agents/overview
• SWE-bench Pro benchmark: https://www.swebench.com/
ChatPRD: https://www.chatprd.ai/
Website: https://clairevo.com/
LinkedIn: https://www.linkedin.com/in/clairevo/
Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email [email protected].
2026-06-09 21:11:29
👋 Hey there, I’m Lenny. Each week, I answer reader questions about building product, driving growth, and accelerating your career. For more: Lenny’s Podcast | Lennybot | How I AI | My favorite AI/PM courses, public speaking course, and interview prep copilot
P.S. Get a full free year of Google AI, Cursor, Lovable, Notion, Manus, Replit, Gamma, n8n, Canva, ElevenLabs, Factory, Wispr Flow, Fin, Supabase, Bolt, Linear, PostHog, Framer, Railway, Granola, Warp, Gumloop, Magic Patterns, Mobbin, Stripe Atlas, and ChatPRD, by becoming an Insider subscriber. Yes, this is for real.
On the heels of part 1 of my essential books for product builders series, I’m excited to share part 2.
As with last time, the books are organized by their jobs-to-be-done in your work and life. I’m again limiting myself to only three per category that I’ve personally read (and completed), and only books that have stood the test of time (i.e., no new books). As a bonus, at the end of the post I’ve also included a dozen fan favorites that either fell just below the ones I picked or I haven’t had a chance to read yet.
I know it’s hard to find time to read a whole book, and when you do, it’s hard to retain anything. As I mentioned in part 1, what’s worked for me is to read 10 minutes before bed as part of my wind-down routine. This has the added benefit of helping me sleep better! And as I read, I try to find one nugget or tactic that I can bring into my work that week. I take a photo of it and email it to myself (using this sweet app) for the next morning. My philosophy is that if I retain just one golden nugget per book over the years, I’m happy. That’s how it usually ends up working out anyway.
So, as I share the book recommendations, I’ll also highlight a nugget of wisdom that’s still with me over the many years since I first read these books.
Let’s get into it.
“Books are the closest thing you’ll ever come to finding cheat codes for real life. You can access the entire learnings of someone else’s career in a few hours.” —Tobi Lütke
Before reading these books, I thought design was a squishy, subjective thing. It’s not. Don’t Make Me Think taught me how to objectively make a product UI work (and feel) better. The Design of Everyday Things showed me that when I struggle with a product, it’s not my fault—it’s the design’s. Refactoring UI gave me a ton of specific design tactics.
Don’t Make Me Think by Steve Krug
The Design of Everyday Things by Don Norman
Refactoring UI by Adam Wathan and Steve Schoger
I’ve always looked up to people who have great taste, and lucky for me, guests on my podcast consistently remind me that it isn’t something you’re born with—taste is something you can learn. The War of Art helped me learn how to recognize and overcome the internal “resistance” that comes with creating something new and different. The Work of Art showed me the creative process of dozens of high-taste creators. Creativity, Inc. taught me to protect my (and my team’s) “ugly babies”—early half-baked ideas that otherwise get squashed.
The War of Art by Steven Pressfield
The Work of Art by Adam Moss
Creativity, Inc. by Ed Catmull
Influence—maybe you’ve heard it’s important? It’s something I was very bad at early in my career, and I’ve had to learn how to do it well. How to Win Friends and Influence People showed me the power in being interested vs. interesting. Influence taught me the fundamentals of how people change their mind: social proof, authority, scarcity, and simply being liked. Never Split the Difference taught me how to shift a negotiation from “you vs. me” to “us” working together to solve the problem.
How to Win Friends and Influence People by Dale Carnegie
Influence by Robert Cialdini
Never Split the Difference by Chris Voss
I’ll be honest, I didn’t read a lot of books when I was starting my company. But I should have. When I finally read The Lean Startup, it showed me how to be smart about where to start, and how to iterate efficiently. Crossing the Chasm taught me what a good early user looks like. Fall in Love with the Problem, Not the Solution finally gave me a very practical guide for every step of the founder journey.
The Lean Startup by Eric Ries
Crossing the Chasm by Geoffrey Moore
Fall in Love with the Problem, Not the Solution by Uri Levine
There are certain books you read and you’re just like, “Wow, this really explains what’s going on around here.” These three did that for me. Great at Work showed me that people who rise fastest focus on fewer things but do them extremely well. 7 Rules of Power taught me that, often, it isn’t the most talented or nice people who end up winning. The Effective Executive helped me understand that efficiency is doing things well, while effectiveness is doing the right (highest-leverage) things.
Great at Work by Morten T. Hansen
7 Rules of Power by Jeffrey Pfeffer
The Effective Executive by Peter Drucker
A few readers suggested that this category should have come first last time—what else matters if you can’t be happy? So I’m including another three books that made me a happier person. The Subtle Art of Not Giving a F*ck taught me that freedom from work isn’t the real goal. Instead, it’s your opportunity to figure out what problems you want to spend time solving, because the most lasting fulfillment comes from solving problems we care about. A Guide to the Good Life gave me the skill of “negative visualization,” which I use to this day. Stumbling on Happiness showed me that we’re often (very) wrong about what will make us happy.
The Subtle Art of Not Giving a F*ck by Mark Manson
A Guide to the Good Life by William B. Irvine
Stumbling on Happiness by Daniel Gilbert