2026-02-18 07:58:58
Sonnet 4.6 is out today, and Anthropic claim it offers similar performance to November's Opus 4.5 while maintaining the Sonnet pricing of $3/million input and $15/million output tokens (the Opus models are $5/$25). Here's the system card PDF.
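To make that price difference concrete, here is a minimal sketch that works out the cost of a single request at the per-million-token rates quoted above. The `cost_usd` helper and the example token counts (100,000 input, 10,000 output) are made up for illustration, and it assumes the standard rates apply, not the higher-cost long-context beta.

```shell
# Hypothetical helper: cost in USD for one request.
# Arguments: input_tokens output_tokens input_price output_price
# Prices are dollars per million tokens, as quoted in the post.
cost_usd() {
  awk -v i="$1" -v o="$2" -v ip="$3" -v op="$4" \
    'BEGIN { printf "%.4f\n", (i * ip + o * op) / 1000000 }'
}

# Illustrative request: 100,000 input tokens and 10,000 output tokens
echo "Sonnet rates (\$3/\$15): $(cost_usd 100000 10000 3 15)"
echo "Opus rates (\$5/\$25):   $(cost_usd 100000 10000 5 25)"
```

At those example sizes the request comes to $0.45 at Sonnet rates and $0.75 at Opus rates.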
Sonnet 4.6 has a "reliable knowledge cutoff" of August 2025, compared to Opus 4.6's May 2025 and Haiku 4.5's February 2025. Both Opus and Sonnet default to 200,000 max input tokens but can stretch to 1 million in beta and at a higher cost.
I just released llm-anthropic 0.24 with support for both Sonnet 4.6 and Opus 4.6. Claude Code did most of the work - the new models had a fiddly amount of extra details around adaptive thinking and no longer supporting prefixes, as described in Anthropic's migration guide.
Here's what I got from:
uvx --with llm-anthropic llm 'Generate an SVG of a pelican riding a bicycle' -m claude-sonnet-4.6

The SVG comments include:
<!-- Hat (fun accessory) -->
I tried a second time and also got a top hat. Sonnet 4.6 apparently loves top hats!
For comparison, here's the pelican Opus 4.5 drew me in November:

And here's Anthropic's current best pelican, drawn by Opus 4.6 on February 5th:

Opus 4.6 produces the best pelican beak/pouch. I do think the top hat from Sonnet 4.6 is a nice touch though.
Via Hacker News
Tags: ai, generative-ai, llms, llm, anthropic, claude, llm-pricing, pelican-riding-a-bicycle, llm-release, claude-code
2026-02-18 07:02:33
My Rodney CLI tool for browser automation attracted quite the flurry of PRs since I announced it last week. Here are the release notes for the just-released v0.4.0:
- Errors now use exit code 2, which means exit code 1 is just for check failures. #15
- New rodney assert command for running JavaScript tests, exit code 1 if they fail. #19
- New directory-scoped sessions with --local/--global flags. #14
- New reload --hard and clear-cache commands. #17
- New rodney start --show option to make the browser window visible. Thanks, Antonio Cuni. #13
- New rodney connect PORT command to debug an already-running Chrome instance. Thanks, Peter Fraenkel. #12
- New RODNEY_HOME environment variable to support custom state directories. Thanks, Senko Rašić. #11
- New --insecure flag to ignore certificate errors. Thanks, Jakub Zgoliński. #10
- Windows support: avoid Setsid on Windows via build-tag helpers. Thanks, adm1neca. #18
- Tests now run on windows-latest and macos-latest in addition to Linux.
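The split between exit code 1 (a check failed) and exit code 2 (an error) means a wrapper script can tell the two apart. Here is a rough sketch of that idea. The `classify` helper is hypothetical, not part of Rodney, and it uses `true`, `false` and `sh -c 'exit 2'` as stand-ins so it runs without a browser. Treating every other non-zero status as an error is an assumption of this sketch.

```shell
# Hypothetical helper: run a command and label its exit status using the
# convention from the release notes: 0 = pass, 1 = check failed, 2 = error.
# Assumption: any other non-zero status is also reported as an error.
classify() {
  _rc=0
  "$@" || _rc=$?
  case "$_rc" in
    0) echo "pass" ;;
    1) echo "check-failed" ;;
    *) echo "error" ;;
  esac
}

# Stand-in commands in place of real rodney invocations
classify true             # exits 0
classify false            # exits 1
classify sh -c 'exit 2'   # exits 2
```

In a real script the stand-ins would be replaced by something like `classify rodney exists "h1"`, so that a broken setup is reported differently from a genuinely failing check.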
I've been using Showboat to create demos of new features - here they are for rodney assert, rodney reload --hard, rodney exit codes, and rodney start --local.
The rodney assert command is pretty neat: you can now use Rodney to test a web app through multiple steps in a shell script that looks something like this (adapted from the README):
#!/bin/bash
set -euo pipefail
FAIL=0
check() {
    if ! "$@"; then
        echo "FAIL: $*"
        FAIL=1
    fi
}
rodney start
rodney open "https://example.com"
rodney waitstable
# Assert elements exist
check rodney exists "h1"
# Assert key elements are visible
check rodney visible "h1"
check rodney visible "#main-content"
# Assert JS expressions
check rodney assert 'document.title' 'Example Domain'
check rodney assert 'document.querySelectorAll("p").length' '2'
# Assert accessibility requirements
check rodney ax-find --role navigation
rodney stop
if [ "$FAIL" -ne 0 ]; then
    echo "Some checks failed"
    exit 1
fi
echo "All checks passed"
Tags: browsers, projects, testing, annotated-release-notes, rodney
2026-02-17 22:49:04
This is the story of the United Space Ship Enterprise. Assigned a five year patrol of our galaxy, the giant starship visits Earth colonies, regulates commerce, and explores strange new worlds and civilizations. These are its voyages... and its adventures.
— ROUGH DRAFT 8/2/66, before the Star Trek opening narration reached its final form
Tags: screen-writing, science-fiction
2026-02-17 22:09:43
First kākāpō chick in four years hatches on Valentine's Day
First chick of the 2026 breeding season! Kākāpō Yasmine hatched an egg fostered from kākāpō Tīwhiri on Valentine's Day, bringing the total number of kākāpō to 237 – though it won’t be officially added to the population until it fledges.
Here's why the egg was fostered:
"Kākāpō mums typically have the best outcomes when raising a maximum of two chicks. Biological mum Tīwhiri has four fertile eggs this season already, while Yasmine, an experienced foster mum, had no fertile eggs."
And an update from conservation biologist Andrew Digby - a second chick hatched this morning!
The second #kakapo chick of the #kakapo2026 breeding season hatched this morning: Hine Taumai-A1-2026 on Ako's nest on Te Kākahu. We transferred the egg from Anchor two nights ago. This is Ako's first-ever chick, which is just a few hours old in this video.
That post has a video of mother and chick.

Via MetaFilter
Tags: kakapo
2026-02-17 22:04:44
But the intellectually interesting part for me is something else. I now have something close to a magic box where I throw in a question and a first answer comes back basically for free, in terms of human effort. Before this, the way I'd explore a new idea is to either clumsily put something together myself or ask a student to run something short for signal, and if it's there, we’d go deeper. That quick signal step, i.e., finding out if a question has any meat to it, is what I can now do without taking up anyone else's time. It’s now between just me, Claude Code, and a few days of GPU time.
I don’t know what this means for how we do research long term. I don’t think anyone does yet. But the distance between a question and a first answer just got very small.
— Dimitris Papailiopoulos, on running research questions through Claude Code
Tags: research, coding-agents, claude-code, generative-ai, ai, llms
2026-02-17 12:51:58
AI-accelerated software development brings the threat of cognitive debt: more projects, and less deep understanding of how they work and what they actually do. Given that threat, it's interesting to consider artifacts that might be able to help.
Nathan Baschez on Twitter:
my current favorite trick for reducing "cognitive debt" (h/t @simonw ) is to ask the LLM to write two versions of the plan:
- The version for it (highly technical and detailed)
- The version for me (an entertaining essay designed to build my intuition)
Works great
This inspired me to try something new. I generated the diff between v0.5.0 and v0.6.0 of my Showboat project - which introduced the remote publishing feature - and dumped that into Nano Banana Pro with the prompt:
Create a webcomic that explains the new feature as clearly and entertainingly as possible
Here's what it produced:

Good enough to publish with the release notes? I don't think so. I'm sharing it here purely to demonstrate the idea. Creating assets like this as a personal tool for thinking about novel ways to explain a feature feels worth exploring further.
Tags: nano-banana, gemini, llms, cognitive-debt, generative-ai, ai, text-to-image, showboat, ai-assisted-programming