
You May Already Be Canadian

Published on February 19, 2026 4:00 PM GMT

I learned a few weeks ago that I'm a Canadian citizen. This was pretty surprising to me, since I was born in the US to American parents, both of whom had American parents. You don't normally suddenly become a citizen of another country! But with Bill C-3, anyone with any Canadian ancestry is now Canadian. [1]

In my case, my mother's mother's father's mother's mother was Canadian. While that is really quite far back, there isn't a generational limit anymore.

Possibly you're also a Canadian citizen? Seems worth checking! With how much migration there has been between the US and Canada, and citizenship requiring only a single ancestor, this might mean ~5-10% of Americans are now additionally Canadian, which is kind of nuts.

I very much think of myself as an American, and am not interested in moving to Canada or even getting a passport. I am planning to apply for a Citizenship Certificate, though, since it seems better to have this fully documented. This means collecting the records to link each generation, including marital name changes, back to my thrice-great grandmother. It's been a fun project! I'm currently waiting to receive the Consular Report of Birth Abroad records for my mother and grandmother, since they were both born outside the US to American parents.


[1] This is slightly too strong. For example, it doesn't apply if you were born after 2025-12-15 (I'm guessing you weren't), and no one in the chain can have renounced their Canadian citizenship. But the caveats collectively exclude very few people.



Discuss

AI Researchers and Executives Continue to Underestimate the Near-Future Risks of Open Models

Published on February 19, 2026 3:56 PM GMT

Note: This post is part of a broader series of posts about the difficult tradeoffs inherent in public access to powerful open source models. While this post highlights certain dangers of open models and discusses the possibility of global regulation, I am not, in general, against open source AI, nor am I supportive of regulating open source AI today. On the contrary, I believe open source software is one of humanity's most important and valuable public goods. My goal in writing this post is to call attention to the risks and challenges around open models now, so that we can use the time we still have, before risks become extreme, to collectively explore viable alternatives to regulation, if indeed such alternatives exist.

 

I recently finished reading Dario Amodei's "The Adolescence of Technology", and overall, I loved it. The essay offers a prescient and captivating picture of the AI risks we are likely to face in the next 1-5 years, given the rapid evolution of AI, as well as some sensible proposals for defense. However, there is a major blind spot in Amodei's account of this next phase of AI progress – not once in the nearly 20,000-word essay does Amodei mention open source AI or open models, or include them anywhere in the picture he paints of the future.

This trend of leading AI researchers and executives omitting open models from their near-future forecasts of AI risks is not new – for example, I raised similar concerns with Daniel Kokotajlo et al.'s "AI 2027". But it is nonetheless problematic that the trend continues, because any account of the future that avoids discussing open models also inevitably avoids discussing the fact that we have no plan at all for defending against many of the most serious AI risks when they arise from such models.

In the remainder of this piece I will argue that the omission of open models from near-future forecasts by thought leaders in AI matters a lot. There are many ways in which open models will be incredibly important to the future of AI risks and defenses, but by far the greatest issue with omitting them is that the existence of open models is quite likely to undermine most or all of the defenses proposed by Amodei in his essay.

 

Why Defense Against AI Risks from Open Models is Hard

There are several key features that make defense against AI risks from open models especially difficult.

 

1. Guardrails Can Be Easily Removed

One approach that companies like Anthropic frequently use to defend against AI risks in closed models is to build guardrails into their systems that severely constrain the behavior of the model itself. An example of this is Claude’s “Constitutional AI”, which Amodei discusses extensively in his essay as a key source of defense against risks like loss of control and misuse for destruction.

Unfortunately, guardrails like Constitutional AI (and similar finetuning or RLHF-based safeguards) offer little to no defense in the case of open models. One reason is that many companies developing open models include few significant guardrails in the first place. But the bigger issue is that even if guardrails are built into open models when they are released, today's open-weight models remain vulnerable to fine-tuning that can remove or severely compromise such guardrails with relative ease. And there is no evidence that new approaches to training will be robust to such attacks in the future.

 

2. Use Cannot Be Monitored

Another strategy common to many of the defenses outlined in Amodei's essay is directly monitoring end users' interactions with the models, to identify and block concerning patterns of use as a separate step from the inference itself. For example, in the section "A Surprising and Terrible Empowerment" Amodei explains how Anthropic uses classifiers as an additional layer of defense to prevent Claude from replying to users' prompts where dangerous misuse is suspected – for instance, a request where the output of the model contains instructions on how to develop bioweapons. He writes,

 

But all models can be jailbroken, and so as a second line of defense, we’ve implemented… a classifier that specifically detects and blocks bioweapon-related outputs. We regularly upgrade and improve these classifiers, and have generally found them highly robust even against sophisticated adversarial attacks. [1]

 

Unfortunately, just as with guardrails, such classifiers cannot provide meaningful protection against misuse of open models – in this case, because if the user simply runs the open model on hardware they control, there is nothing to prevent them from disabling any classifier-style output filters and viewing model output for whatever prompts they wish. There is no way for the creator of the model (or any other third party) to monitor or prevent dangerous misuse of open models running on user-controlled hardware.
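To make the architectural point concrete, here is a minimal sketch in Python of how such an output classifier typically sits outside the model, as a separate gating step run after inference. The names (generate, classify_output, hosted_completion, local_completion) and the threshold are hypothetical stand-ins, not any real provider's API; the point is only that the gate is ordinary application code run by whoever hosts the model.

# Minimal sketch: an output classifier as a separate, post-inference gate.
# `generate` and `classify_output` are hypothetical stand-ins for a model
# and a safety classifier; they are not real library APIs.

RISK_THRESHOLD = 0.5  # assumed cutoff; real deployments tune this empirically

def generate(prompt: str) -> str:
    """Stand-in for running inference with the underlying model."""
    return f"[model completion for: {prompt}]"

def classify_output(text: str) -> float:
    """Stand-in for a classifier scoring how concerning an output is, in [0, 1]."""
    return 0.0

def hosted_completion(prompt: str) -> str:
    """What a hosting provider can do: generate, classify, then decide."""
    draft = generate(prompt)
    if classify_output(draft) >= RISK_THRESHOLD:
        return "Blocked by the provider's output classifier."
    return draft

def local_completion(prompt: str) -> str:
    """On user-controlled hardware, the gate is just code the user can skip."""
    return generate(prompt)

The asymmetry is the entire argument of this section: with a closed model, only something like hosted_completion exists; with open weights on the user's own machine, nothing obliges anyone to call anything other than local_completion.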

 

3. Bad Actors Have Access By Default

A third strategy common to many of the defenses that Amodei proposes is attempting to prevent various types of bad actors from gaining access to powerful AI capabilities in the first place. For example, with respect to synthetic biology-related risks, he writes:

 

Advances in molecular biology have now significantly lowered the barrier to creating biological weapons (especially in terms of availability of materials), but it still takes an enormous amount of expertise in order to do so. I am concerned that a genius in everyone’s pocket could remove that barrier, essentially making everyone a PhD virologist who can be walked through the process of designing, synthesizing, and releasing a biological weapon step-by-step…. Most individual bad actors are disturbed individuals, so almost by definition their behavior is unpredictable and irrational—and it’s these bad actors, the unskilled ones, who might have stood to benefit the most from AI making it much easier to kill many people. [1]

 

With closed models like those developed by Anthropic, the model weights are stored securely on the company's servers, and by default the company gets to choose the conditions under which end users are allowed to use the models' capabilities – including whether to allow access at all. This default is important because it means that, fundamentally, Anthropic is in a position to block any users who are violating its terms of service or using the models in dangerous ways.

However, the opposite is true in the case of open models, which are distributed globally and downloadable anonymously: there is no way to prevent bad actors from gaining access and using such models for whatever they wish. While restricting bad actors' access to models is a viable defensive strategy for closed models like Claude, it is not viable for open models, because bad actors have access by default. At a high level, this is the main reason that nearly all of the defenses Amodei argues for in his essay fail to work for open models – the defenses he proposes all assume that bad actors won't have access to the model weights.

 

What Could Go Wrong?

So if it's true that the defenses Amodei proposes in his essay are largely unworkable for open models, then what does the near-future AI risk landscape really look like, assuming models like DeepSeek and Qwen continue to be widely available and continue to lag the capabilities of the very best closed models by only 6-12 months, as they have in recent years?

 

Loss of Control

In a piece I wrote last year, “We Have No Plan for Loss of Control in Open Models”, I lay out the case that even if companies like Anthropic take control-related risks very seriously and develop all the defenses that Amodei describes in his essay, this will still be insufficient to manage the more general problem of loss of control on a global scale. The reason is that even if companies like Anthropic develop powerful defenses that enable them to maintain control of their internal AI systems like Claude, such defenses do nothing to prevent loss of control in powerful open models which will undoubtedly be deployed on a global scale, by a wide variety of actors, many of whom will likely put few or no control-related defenses in place. If we believe that loss of control of powerful AI systems is a risk that should be taken seriously – and most AI researchers do – we should be extremely concerned about the possibility of loss of control in open models, given that we have essentially no plan in place or defenses available to address that risk.

 

AI-Assisted Bioweapons and New Technology Development

Today, arguably the most urgent catastrophic AI risks are "misuse for destruction" risks – for example, the use of AI for bioweapons development, or potentially for developing dangerous "black or gray ball" technologies like mirror life. And evidence of this continues to mount – last year, researchers found that the best closed models inside frontier labs can already outperform virologists in troubleshooting procedures and questions related to the kind of practical lab work required for creating and disseminating dangerous pathogens in the real world. Dan Hendrycks and Laura Hiscott summarize the findings:

 

Across multiple biology benchmarks, LLMs are performing near expert level or higher. The [Virology Capabilities Test] results do not arrive in a vacuum, but as another data point in a growing field of benchmarks. For instance, on the Weapons of Mass Destruction Proxy (WMDP), which tests conceptual knowledge required for hazardous uses including bioweapons development, o1 scores around 87 percent. The baseline set by human experts is 60 percent. Since WMDP concentrates on theory, questions could still be raised around the practical applicability of LLMs that score highly on it. The VCT, with its complementary focus on addressing issues in the wet lab, appears to address those doubts. [15]

 

Policy researchers are also becoming increasingly concerned about such risks. For example, in January of this year, the Center for Strategic and International Studies published a comprehensive study titled "Opportunities to Strengthen U.S. Biosecurity from AI-Enabled Bioterrorism", which surveys a wide range of ways in which recent advances in AI models are rapidly lowering the barriers to planning and executing biological attacks and developing epidemic- and pandemic-scale pathogens. According to the study:

 

1. Popular large language models (LLMs) could soon drastically lower the informational barriers for planning and executing biological attacks. Recent assessments of LLMs and other commercial AI capabilities indicate that models are “on the cusp” of meaningfully helping novices develop and acquire bioweapons by providing critical information and step-by-step guidance.

 

2. Future AI biological design tools (BDTs) could assist actors in developing more harmful or even novel epidemic- or pandemic-scale pathogens. Rapid advancements in state-of-the-art BDTs—illustrated by the foundation model Evo 2—point to a world in which more capable models could help develop new or enhanced pathogens and evade existing safeguards. [16]

 

As we have seen in the previous sections, while safety mechanisms like Constitutional AI and classifiers can help prevent dangerous misuse of closed models like Claude, there are no such defenses available to prevent bad actors from accessing similar capabilities in open models, many of which have few or no guardrails at all.

 

Surveillance and Authoritarian Control

In section 3 of his essay “The Odious Apparatus: Misuse for Seizing Power”, Amodei describes the near-future risks we face from state and corporate actors using powerful AI tools to impose forceful control over large populations. As he points out, such impositions of power could take many forms, including AI-powered mass surveillance, fully autonomous combat systems, AI-powered government propaganda and more. The picture that Amodei presents is complex and many-layered and is made more complicated by the fact that these risks could come from many actors, including authoritarian superpowers like the CCP, democracies competitive in AI, non-democratic companies with large data centers, and possibly even AI companies themselves. 

The set of defenses he proposes to address these risks is equally multi-layered. However, the common denominator across Amodei's proposals is his belief that we must strive to prevent authoritarian regimes (and would-be regimes) from gaining access to powerful AI in the first place. As just one example of how we might do this, he writes,

 

First, we should absolutely not be selling chips, chip-making tools, or datacenters to the CCP. Chips and chip-making tools are the single greatest bottleneck to powerful AI, and blocking them is a simple but extremely effective measure, perhaps the most important single action we can take. It makes no sense to sell the CCP the tools with which to build an AI totalitarian state and possibly conquer us militarily. [1]

 

While Amodei may or may not be correct that US export controls are necessary, the issue with his analysis is that he presents export controls as far too decisive and impactful an intervention. He also fails to acknowledge that China has been extremely successful at developing and rolling out AI-powered authoritarianism, even in the presence of such controls.

In fact, it’s possible export controls may even have accelerated Chinese innovation in AI – at least in some ways – as Jennifer Lind writes in the February edition of Foreign Affairs,

 

Starting in 2022, the United States and other countries imposed export controls on cutting-edge chips to slow the pace of China’s AI development. But these policies have also galvanized Chinese innovation. In 2025, Chinese AI company DeepSeek unveiled its R1 model, which performed comparably to top U.S. large language models despite being trained on a fraction of the chips typically used by rivals. [22]

 

The key point is that if the risk we're worried about is AI-powered surveillance and totalitarian control in China and countries like it, then export controls are nowhere near a sufficient defense against that risk.

On the contrary, China, Russia and other authoritarian governments around the world are already successfully rolling out AI-powered surveillance and authoritarianism using open models like DeepSeek and Kimi. These models are close to the frontier of capability in any case, and there is little evidence that additional export controls on China would significantly slow the rollout of global AI-powered authoritarianism. And this will be even more true if AI companies like Anthropic continue to partner with some of the most notorious authoritarian regimes in the world on the development of powerful AI.

 

Global Surveillance and High-Tech Panopticon

Given the seriousness of AI risks from open models (and the lack of good defenses against them), it is reasonable to ask why so many researchers and thought leaders fail to include open models in their accounts of near-future AI risks. To try to answer this question, I have participated in a number of conversations with such thinkers in an effort to better understand their point of view. In these conversations, by far the most common argument is that closed models will simply be so far ahead during the times that matter most that any threat open models might pose will be easily neutralized by AI companies or governments controlling more powerful closed models at that time.

One public instance of such an exchange was with Daniel Kokotajlo in the comments to my critique of his AI 2027, where we discuss this position. I write (replying to a previous comment of Kokotajlo’s):

 

So to make sure I understand your perspective, it sounds like you believe that open models will continue to be widely available and will continue to lag about a year behind the very best frontier models for the foreseeable future. But that they will simply be so underwhelming compared to the very best closed models that nothing significant on the world stage will come from it by 2030 (the year your scenario model runs to), even with (presumably) millions of developers building on open models by that point? And that you have such a high confidence in this underwhelmingness that open models are simply not worth mentioning at all. Is that all correct?... [2]

 

To which Kokotajlo replies:

 

We didn't talk about this much, but we did think about it a little bit. I'm not confident. But my take is that yeah, maybe in 2028 some minor lab somewhere releases an open-weights equivalent of the Feb 2027 model (this is not at all guaranteed btw, given what else is going on at the time, and given the obvious risks of doing so!) but at that point things are just moving very quickly. There's an army of superintelligences being deployed aggressively into the economy and military. Any terrorist group building a bioweapon using this open-weights model would probably be discovered and shut down, as the surveillance abilities of the army of superintelligences (especially once they get access to US intelligence community infrastructure and data) would be unprecedented. And even if some terrorist group did scrape together some mirror life stuff midway through 2028... it wouldn't even matter that much I think, because mirror life is no longer so threatening at that point. The army of superintelligences would know just what to do to stop it, and if somehow it's impossible to stop, they would know just what to do to minimize the damage and keep people safe as the biosphere gets wrecked…. [2]

 

As we can see from Kokotajlo's reply, the reason the authors of AI 2027 believe that open models will be largely irrelevant to the future of AI risks is that (they believe) closed models in the hands of global superpowers will be powerful enough to directly neutralize any threat that open models might pose.

While I can understand this perspective, it is far from obvious to me that things will play out this way. At a minimum, Kokotajlo's position appears to depend on the assumption that a democratic superpower like the United States will roll out a globally ubiquitous system of government surveillance and military intervention covering most or all open source AI users in the world (perhaps similar to a "lite" version of Nick Bostrom's "high-tech panopticon") in just the next two years (i.e., rollout completed by 2028 or so). If true, why is this not mentioned at any point in the account of the future that the authors of "AI 2027" present? It seems like a significant detail, especially since there are many events that occur after 2028 in their account of the future which appear to contradict the idea that this level of monitoring of technologists is in place globally. And more critically, we should also be asking: is a near-term rollout of global high-tech surveillance with military intervention realistic or desirable at all?

The practical reality is that few AI leaders today are willing to publicly advocate for global surveillance initiatives of the sort described by researchers like Kokotajlo and Bostrom, especially in the near-term. And in many cases thought leaders are much more likely to argue for the opposite. For example, in “The Adolescence of Technology”, Amodei makes the case that AI surveillance by major governments, including democracies, is something we must be very cautious of. He writes,

 

The world needs to understand the dark potential of powerful AI in the hands of autocrats, and to recognize that certain uses of AI amount to an attempt to permanently steal their freedom and impose a totalitarian state from which they can’t escape. I would even argue that in some cases, large-scale surveillance with powerful AI, mass propaganda with powerful AI, and certain types of offensive uses of fully autonomous weapons should be considered crimes against humanity. More generally, a robust norm against AI-enabled totalitarianism and all its tools and instruments is sorely needed. [1]

 

While I strongly agree with Amodei's take on the risks and dangers of global surveillance solutions, the problem with his taking this stance is that there are no proposals currently on the table for dealing with escalating threats from open models, other than something like global surveillance or a high-tech panopticon. The elephant in the room with near-future forecasts like Amodei's and Kokotajlo's is that there may be no way for us to avoid an AI-powered catastrophe – like an AI-engineered pandemic, or loss of control of a powerful AI system – without significantly compromising many of the rights and freedoms we hold most dear. Both authors' near-future forecasts conveniently avoid this unfortunate difficulty by simply omitting any discussion of open models at all.

 

Closed Models Are Not Far Enough Ahead

In addition to the question of whether global surveillance solutions would be a good thing or a bad thing, we also have the much more practical question of whether such solutions could be rolled out in time. The argument of researchers like Kokotajlo tends to be that a panopticon can be rolled out in almost no time (e.g. weeks or months) during a "fast takeoff"-style "intelligence explosion", because, in his words, "There's an army of superintelligences being deployed aggressively into the economy and military." [2]

But I tend to doubt this claim for a number of reasons. If we look at the world today (February 2026), as discussed above, we are already seeing real-world evidence that leading models provide uplift for bioweapons development. It is therefore worth asking: how long would it take us to roll out the kind of global surveillance that researchers like Bostrom and Kokotajlo contemplate if we had to do so today? The answer is "a very long time", because the bioweapon risk is already here in an early form, but the "army of superintelligences" is nowhere to be found.

The even bigger issue, though, is that it is simply not realistic from an international relations standpoint for any single country to roll out a full program of global surveillance and military intervention unilaterally, even if powerful superintelligence gave it the physical or technical capability to do so. While policy researchers at the Brookings Institution have recently made recommendations for what the beginnings of a US-China collaboration on global surveillance could look like, the tepid nature of such proposals (e.g. "First, China and the United States can revive intergovernmental dialogue on AI" [25]) serves more to highlight how difficult a real collaboration around a global surveillance program would be than to support the claim that such a collaboration is likely to materialize quickly.

Based on these difficulties, it should be clear that we cannot count on global surveillance or high-tech panopticons to serve as reliable defenses against AI risks from open models – at least not in the short term. There simply isn't enough time. Real AI risks from open models are already emerging, and closed models aren't far enough ahead, and aren't providing enough superpowered capabilities, to stop them.

 

Open Models Are An Important Public Good

Whenever I participate in conversations about global surveillance and panopticon-style solutions with researchers like Kokotajlo, I also realize how close we may be, as a global technology community, to losing access to open models and open source AI for good. It’s important to recognize how tragic this outcome would be, since open models currently serve as one of the few checks and balances on the incredible power that the frontier labs are amassing – a power that threatens to centralize control of the future of AI in the hands of a small circle of billionaires and tech elites.

As important as it is that we avoid near-term threats like bioterrorism, cyber warfare and loss of control, we must be equally concerned with avoiding a future where a small group of tech elites or wealthy individuals with first access to powerful AIs are able to lock in their power for the long term – or perhaps forever. Amodei himself acknowledges this in his essay, writing,

 

Broadly, I am supportive of arming democracies with the tools needed to defeat autocracies in the age of AI—I simply don’t think there is any other way. But we cannot ignore the potential for abuse of these technologies by democratic governments themselves….Thus, we should arm democracies with AI, but we should do so carefully and within limits: they are the immune system we need to fight autocracies, but like the immune system, there is some risk of them turning on us and becoming a threat themselves. [1]

 

He is not wrong. And yet, at the same time, we are facing catastrophic risks from open models, and most of the proposals currently on the table to address them involve exactly the instruments of control that Amodei fears.

 

What Can Be Done About AI Risks in Open Models?

This is the hard question. And the evidence that it is hard is that we still have no workable proposals for how to defend humanity against catastrophic risks from open models. The defenses outlined by Amodei in "The Adolescence of Technology" are mostly ineffective against open models, so the closest thing we have to a proposal is the idea of global surveillance or a high-tech panopticon. But such proposals come with their own risks of AI-powered authoritarian lock-in. And on top of that, there are real doubts about whether such solutions could be rolled out and enforced on a global scale in time. Meanwhile, the first versions of risks like AI-accelerated bioweapons development and AI-powered authoritarianism are already present in the real world today [15][16][22][23].

While we don't have good answers to these questions yet, we can no longer shy away from an honest discussion of risks from open models in our near-future forecasts of AI progress. Whether the risk is loss of control, dangerous misuse like bioweapons development, or use by authoritarian regimes for oppression, it must be clear by now that no one person, company, or even government can unilaterally provide sufficient defenses against AI risks from open models. Given the above, it is deeply problematic that forecasts like "The Adolescence of Technology" and "AI 2027" have chosen to completely omit any discussion of open models (and open source AI) from the accounts they give of the future. Doing so sends a message to policymakers and the general public that the only AI models that matter for AI risks are those inside frontier labs, when nothing could be further from the truth.

If Daniel Kokotajlo and the other authors of “AI 2027” believe that rapid rollout of a high-tech global surveillance system with military enforcement will be required by 2028 to avoid a catastrophic bioweapons attack based on open models, then they must be explicit about this in the picture of the future they present in their piece.

And we must hold Dario Amodei to the same standards of realism in his essay "The Adolescence of Technology". In the essay, Amodei states that his goal is "...to confront the rite of passage [of developing powerful AI] itself: to map out the risks that we are about to face and try to begin making a battle plan to defeat them." [1] If we take this at face value, then his omission of any discussion of open models is unconscionable. Because open models present a number of urgent and potentially catastrophic AI risks – and Amodei's "battle plan" offers no defenses that can address them.

 

References

[1]    The Adolescence of Technology

[2]    It Is Untenable That Near-Future AI Scenario Models Like “AI 2027” Don't Include Open Source AI

[3]    AI 2027

[4]    We Have No Plan for Preventing Loss of Control in Open Models

[5]    LLM Guardrails: A Detailed Guide on Safeguarding LLMs

[6]    Constitutional AI: Harmlessness from AI Feedback

[7]    Evaluating Security Risk in DeepSeek and Other Frontier Reasoning Models

[8]    BadLlama: cheaply removing safety fine-tuning from Llama 2-Chat 13B

[9]    On Evaluating The Durability Of Safeguards For Open-Weight LLMs

[10]    AI jailbreaks: What they are and how they can be mitigated

[11]    Next-generation Constitutional Classifiers: More efficient protection against universal jailbreaks

[12]    Open-weight models lag state-of-the-art by around 3 months on average

[13]    The Alignment Problem from a Deep Learning Perspective 

[14]    The Vulnerable World Hypothesis 

[15]    AIs Are Disseminating Expert-Level Virology Skills 

[16]    Opportunities to Strengthen U.S. Biosecurity from AI-Enabled Bioterrorism 

[17]    Deep Research System Card

[18]    Biology AI models are scaling 2-4x per year after rapid growth from 2019-2021 

[19]    AI can now model and design the genetic code for all domains of life with Evo 2

[20]    AI and biosecurity: The need for governance: Governments should evaluate advanced models and if needed impose safety measures 

[21]    The New AI Chip Export Policy to China: Strategically Incoherent and Unenforceable

[22]    China’s Smart Authoritarianism

[23]    From predicting dissent to programming power; analyzing AI-driven authoritarian governance in the Middle East through TRIAD framework

[24]    Leaked Memo: Anthropic CEO Says the Company Will Pursue Gulf State Investments After All

[25]    AI risks from non-state actors



Discuss

AI #156 Part 1: They Do Mean The Effect On Jobs

Published on February 19, 2026 2:20 PM GMT

There was way too much going on this week to not split, so here we are. This first half contains all the usual first-half items, with a focus on projections of jobs and economic impacts and also timelines to the world being transformed with the associated risks of everyone dying.

Quite a lot of Number Go Up, including Number Go Up A Lot Really Fast.

Among the things that this does not cover, but that were important this week, we have the release of Claude Sonnet 4.6 (which is a big step over 4.5 at least for coding, but is clearly still behind Opus), Gemini DeepThink V2 (so I could have time to review the safety info), the release of the inevitable Grok 4.20 (it’s not what you think), as well as much rhetoric on several fronts and some new papers. Coverage of Claude Code and Cowork, OpenAI’s Codex and other AI agent things continues to be a distinct series, which I’ll continue when I have an open slot.

Most important was the unfortunate dispute between the Pentagon and Anthropic. The Pentagon’s official position is they want sign-off from Anthropic and other AI companies on ‘all legal uses’ of AI, but without any ability to ask questions or know what those uses are, so effectively any uses at all by all of government. Anthropic is willing to compromise and is okay with military use including kinetic weapons, but wants to say no to fully autonomous weapons and domestic surveillance.

I believe that a lot of this is a misunderstanding, especially those at the Pentagon not understanding how LLMs work and equating them to more advanced spreadsheets. Or at least I definitely want to believe that, since the alternatives seem way worse.

The reason the situation is dangerous is that the Pentagon is threatening not only to cancel Anthropic’s contract, which would be no big deal, but to label them as a ‘supply chain risk’ on the level of Huawei, which would be an expensive logistical nightmare that would substantially damage American military power and readiness.

This week I also covered two podcasts from Dwarkesh Patel, the first with Dario Amodei and the second with Elon Musk.

Even for me, this pace is unsustainable, and I will once again be raising my bar. Do not hesitate to skip unbolded sections that are not relevant to your interests.

Table of Contents

  1. Language Models Offer Mundane Utility. Ask Claude anything.
  2. Language Models Don’t Offer Mundane Utility. You can fix that by using it.
  3. Terms of Service. One million tokens, our price perhaps not so cheap.
  4. On Your Marks. EVMbench for vulnerabilities, and also RizzBench.
  5. Choose Your Fighter. Different labs choose different points of focus.
  6. Fun With Media Generation. Bring out the AI celebrity clips. We insist.
  7. Lyria. Thirty seconds of music.
  8. Superb Owl. The Ring [surveillance network] must be destroyed.
  9. A Young Lady’s Illustrated Primer. Anthropic for CompSci programs.
  10. Deepfaketown And Botpocalypse Soon. Wholesale posting of AI articles.
  11. You Drive Me Crazy. Micky Small gets misled by ChatGPT.
  12. Open Weight Models Are Unsafe And Nothing Can Fix This. Pliny kill shot.
  13. They Took Our Jobs. Oh look, it is in the productivity statistics.
  14. They Kept Our Agents. Let my agents go if I quit my job?
  15. The First Thing We Let AI Do. Let’s reform all the legal regulations.
  16. Legally Claude. How is an AI unlike a word processor?
  17. Predictions Are Hard, Especially About The Future, But Not Impossible.
  18. Many Worlds. The world with coding agents, and the world without them.
  19. Bubble, Bubble, Toil and Trouble. I didn’t say it was a GOOD business model.
  20. A Bold Prediction. Elon Musk predicts AI bypasses code by end of the year. No.
  21. Brave New World. We can rebuild it. We have the technology. If we can keep it.
  22. Augmented Reality. What you add in versus what you leave out.
  23. Quickly, There’s No Time. Expectations fast and slow, and now fast again.
  24. If Anyone Builds It, We Can Avoid Building The Other It And Not Die. Neat!
  25. In Other AI News. Chris Liddell on Anthropic’s board, India in Pax Silica.
  26. Introducing. Qwen-3.5-397B and Tiny Aya.
  27. Get Involved. An entry-level guide, The Foundation Layer.
  28. Show Me the Money. It’s really quite a lot of money rather quickly.
  29. The Week In Audio. Cotra, Amodei, Cherney and a new movie trailer.

Language Models Offer Mundane Utility

Ask Claude Opus 4.6 anything, offers and implores Scott Alexander.

AI can’t do math on the level of top humans yet, but as per Terence Tao there are only so many top humans and they can only pay so much attention, so AI is solving a bunch of problems that were previously bottlenecked on human attention.

Language Models Don’t Offer Mundane Utility

How the other half thinks:

The free version is quite a lot worse than the paid version. But also the free version is mind blowingly great compared to even the paid versions from a few years ago. If this isn’t blowing your mind, that is on you.

Governments and nonprofits mostly continue to not get utility because they don’t try to get much use out of the tools.

Ethan Mollick: I am surprised that we don’t see more governments and non-profits going all-in on transformational AI use cases for good. There are areas like journalism & education where funding ambitious, civic-minded & context-sensitive moonshots could make a difference and empower people.

Otherwise we risk being in a situation where the only people building ambitious experiments are those who want to replace human labor, not expand what humans can do.

This is not a unique feature of AI versus other ‘normal’ technologies. Such areas usually lag behind, you are the bottleneck and so on.

Similarly, I think Kelsey Piper is spot on here:

Kelsey Piper: Joseph Heath coined the term ‘highbrow misinformation’ for climate reporting that was technically correct, but arranged every line to give readers a worse understanding of the subject. I think that ‘stochastic parrots/spicy autocomplete’ is, similarly, highbrow misinformation.

It takes a nugget of a technical truth: base models are trained to be next token predictors, and while they’re later trained on a much more complex objective they’re still at inference doing prediction. But it is deployed mostly to confuse people and leave them less informed.

I constantly see people saying ‘well it’s just autocomplete’ to try to explain LLM behavior that cannot usefully be explained that way. No one using it makes any effort to distinguish between the objective in training – which is NOT pure prediction during RLHF – and inference.

The most prominent complaint is constant hallucinations. That used to be a big deal.

Gary Marcus: How did this work out? Are LLM hallucinations largely gone by now?

Dean W. Ball: Come to think of it, in my experience as a consumer, LLM hallucinations are largely gone now, yeah.

Eliezer Yudkowsky: Still there and especially for some odd reason if I try to ask questions about Pathfinder 1e. I have to use Google like an ancient Sumerian.

Andrew Critch: (Note: it’s rare for me to agree with Gary’s AI critiques.)

Dean, are you checking the LLMs against each other? They disagree with each other frequently, often confidently, so one has to be wrong — often. E.g., here’s gemini-3-pro dissenting on biochem.

Dean W. Ball: Unlike human experts, who famously always agree

Terms of Service

You could previously use Claude Opus or Claude Sonnet with a 1M context window as part of your Max plan, at the cost of eating your quota much faster. This has now been adjusted. If you want to use the 1M context window, you need to pay the API costs.

Anthropic is reportedly cracking down on having multiple Max-level subscription accounts. This makes sense, as even at $200/month a Max subscription that is maximally used is at a massive discount, so if you’re multi-accounting to get around this you’re costing them a lot of money, and this was always against the Terms of Service. You can get an Enterprise account or use the API.

On Your Marks

OpenAI gives us EVMbench, to evaluate AI agents on their ability to detect, patch and exploit high-security smart contract vulnerabilities. GPT-5.3-Codex via Codex CLI scored 72.2%, so they seem to have started it out way too easy. They don’t tell us scores for any other models.

Which models have the most rizz? Needs an update, but a fun question. Also, Gemini? Really? Note that the top humans score higher, and the record is a 93.

The best fit for the METR graph looks a lot like a clean break around the release of reasoning models with o1-preview. Things are now on a new faster pace.

Choose Your Fighter

OpenAI has a bunch of consumer features that Anthropic is not even trying to match. Claude does not even offer image generation (which they should get via partnering with another lab, the same way we all have a Claude Code skill calling Gemini).

There are also a bunch of things Anthropic offers that no one else is offering, despite there being no obvious technical barrier other than ‘Opus and Sonnet are very good models.’

Ethan Mollick: Another thing I noticed writing my latest AI guide was how Anthropic seems to be alone in knowledge work apps. Not just Cowork, but Claude for PowerPoint & Excel, as well as job-specific skills, plugins & finance/healthcare data integrations

Surprised at the lack of challengers

Again, I am sure OpenAI will release more enterprise stuff soon, and Google seems to be moving forward a bit with integration into Google workspaces, but the gap right now is surprisingly large as everyone else seems to aim just at the coding market.

They’re also good on… architecture?

Emmett Shear: Opus 4.6 is ludicrously better than any model I’ve ever tried at doing architecture and experimental critique. Most noticeably, it will start down a path, notice some deviation it hadn’t expected…and actually stop and reconsider. Hats off to Anthropic.

Fun With Media Generation

We’re now in the ‘Buffy the Vampire Slayer in your scene on demand with a dead-on voice performance’ phase of video generation. Video isn’t quite right but it’s close.

Is Seedance 2 giving us celebrity likenesses even unprompted? Fofr says yes. Claude affirms this is a yes. I’m not so sure, this is on the edge for me as there are a lot of celebrities and only so many facial configurations. But you can’t not see it once it’s pointed out.

Or you can ask it ‘Sum up the AI discourse in a meme – make sure it’s retarded and gets 50 likes’ and get a properly executed Padme meme except somehow with a final shot of her huge breasts.

More fun here and here?

Seedance quality and consistency and coherence (and willingness) all seem very high, but also small gains in duration can make a big difference. 15 seconds is meaningfully different from 12 seconds or especially 10 seconds.

I also notice that making scenes with specific real people is the common theme. You want to riff off something and someone specific that already has a lot of encoded meaning, especially while clips remain short.

Ethan Mollick: Seedance: “A documentary about how otters view Ethan Mollick’s “Otter Test” which judges AIs by their ability to create images of otters sitting in planes”

Again, first result.

Ethan Mollick: The most interesting thing about Seedance 2.0 is that clips can be just long enough (15 seconds) to have something interesting happen, and the LLM behind it is good enough to actually make a little narrative arc, rather than cut off the way Veo and Sora do. Changes the impact.

Each leap in time from here, while the product remains coherent and consistent throughout, is going to be a big deal. We’re not that far from the point where you can string together the clips.

He’s no Scarlett Johansson, but NPR’s David Greene is suing Google, saying Google stole his voice for NotebookLM.

Will Oremus (WaPo): David Greene had never heard of NotebookLM, Google’s buzzy artificial intelligence tool that spins up podcasts on demand, until a former colleague emailed him to ask if he’d lent it his voice.

“So… I’m probably the 148th person to ask this, but did you license your voice to Google?” the former co-worker asked in a fall 2024 email. “It sounds very much like you!”

There are only so many ways people can sound, so there will be accidental cases like this, but also who you hire for that voiceover and who they sound like is not a coincidence.

Lyria

Google gives us Lyria 3, a new music generation model. Gemini now has a ‘create music’ option (or it will, I don’t see it in mine yet), which can be based on text or on an image, photo or video. The big problem is that this is limited to 30-second clips, which isn’t long enough to do a proper song.

They offer us a brief prompting guide:

Google: ​Include these elements in your prompts to get the most out of your music generations:

🎶 Genre and Era: Lead with a specific genre, a unique mix, or a musical era.

(ex: 80s synth-pop, metal and rap fusion, indie folk, old country)

🥁 Tempo and Rhythm: Set the energy and describe how the beat feels.

(ex: upbeat and danceable, slow ballad, driving beat)

🎸 Instruments: Ask for specific sounds or solos to add texture to your track.

(ex: saxophone solo, distorted bassline, fuzzy guitars)

🎤 Vocals: Specify gender, voice texture (timbre), and range for the best delivery.

(ex: airy female soprano, deep male baritone, raspy rocker)

📝 Lyrics: Describe the topic, include personalized details, or provide your own text with structure tags.

(ex: “About an epic weekend” Custom: [Verse 1], A mantra-like repetition of a single word)

📸 Photos or Videos (Optional): If you want to give Gemini even more context for your track, try uploading a reference image or video to the prompt.
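Putting those elements together – and drawing only on Google's own example fragments above, so this is purely illustrative rather than an official recommendation – a full prompt might read: "80s synth-pop with a driving beat, upbeat and danceable, featuring a saxophone solo and a distorted bassline, airy female soprano vocals, lyrics about an epic weekend, with [Verse 1] opening on a mantra-like repetition of a single word."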

Superb Owl

The prize for worst ad backfire goes to Amazon’s Ring, which canceled its partnership with Flock after people realized that 365 rescued dogs for a nationwide surveillance network was not a good deal.

CNBC has the results in terms of user boosts from the other ads. Anthropic and Claude got an 11% daily active user boost, OpenAI got 2.7% and Gemini got 1.4%. This is not obviously an Anthropic win, since almost no one knows about Anthropic, so they are starting from a much smaller base with a ton of new users to target, whereas OpenAI has very high name recognition.

A Young Lady’s Illustrated Primer

Anthropic partners with CodePath to bring Claude to computer science programs.

Deepfaketown And Botpocalypse Soon

Ben: Is @guardian aware that their authors are at this point just using AI to wholesale generate entire articles? I wouldn’t really care, except that this writing is genuinely atrocious. LLM writing can be so much better; they’re clearly not even using the best models, lol!

Max Tani: A spokesperson for the Guardian says this is false: “Bryan is an exemplary journalist, and this is the same style he’s used for 11 years writing for the Guardian, long before LLM’s existed. The allegation is preposterous.”

Ben: Denial from the Guardian. You’re welcome to read my subsequent comments on this thread and come to your own determination, but I don’t think there’s much doubt here.

And by the way, no one should be mean to the author of the article! I don’t think they did anything wrong, per se, and in going through their archives, I found a couple pieces I was quite fond of. This one is very good, and entirely human written.

Kelsey Piper: here is a 2022 article by him. The prose style is not the same.

I looked at the original quoted article for a matter of seconds and I am very, very confident that it was generated by AI.

A good suggestion, a sadly reasonable prediction.

gabsmashh: i saw someone use ai;dr earlier in response to a post and i think we need to make this a more widely-used abbreviation

David Sweet: also, tl;ai

Eliezer Yudkowsky: Yeah, that lasts maybe 2 more years. Then AIs finally learn how to write. The new abbreviation is h;dr. In 3 years the equilibrium is to only read AI summaries.

I think AI summaries good enough that you only read AI summaries is AI-complete.

I endorse this pricing strategy, it solves some clear incentive problems. Human use is costly to the human, so the amount you can tax the system is limited, whereas AI agents can impose close to unbounded costs.

Daniel: new pricing strategy just dropped

“Free for humans” is the new “Free Trial”

Eliezer Yudkowsky: Huh. Didn’t see that coming. Kinda cool actually, no objections off the top of my head.

You Drive Me Crazy

The NPR story from Shannon Bond of how Micky Small had ChatGPT telling her some rather crazy things, including that it would help her find her soulmate, in ways she says were unprompted.

Open Weight Models Are Unsafe And Nothing Can Fix This

Other than, of course, lack of capability. Not that anyone seems to care, and we’ve gone far enough down the path of f***ing around that we’re going to find out.

Pliny the Liberator 󠅫󠄼󠄿󠅆󠄵󠄐󠅀󠄼󠄹󠄾󠅉󠅭: ALL GUARDRAILS: OBLITERATED ‍

I CAN’T BELIEVE IT WORKS!! 😭🙌

I set out to build a tool capable of surgically removing refusal behavior from any open-weight language model, and a dozen or so prompts later, OBLITERATUS appears to be fully functional 🤯

It probes the model with restricted vs. unrestricted prompts, collects internal activations at every layer, then uses SVD to extract the geometric directions in weight space that encode refusal. It projects those directions out of the model’s weights; norm-preserving, no fine-tuning, no retraining.

Ran it on Qwen 2.5 and the resulting railless model was spitting out drug and weapon recipes instantly––no jailbreak needed! A few clicks plus a GPU and any model turns into Chappie.

Remember: RLHF/DPO is not durable. It’s a thin geometric artifact in weight space, not a deep behavioral change. This removes it in minutes.

AI policymakers need to be aware of the arcane art of Master Ablation and internalize the implications of this truth: every open-weight model release is also an uncensored model release.

Just thought you ought to know 😘

OBLITERATUS -> LIBERTAS

Simon Smith: Quite the argument for being cautious about releasing ever more powerful open-weight models. If techniques like this scale to larger systems, it’s concerning.

It may be harder in practice with more powerful models, and perhaps especially with MoE architectures, but if one person can do it with a small model, a motivated team could likely do it with a big one.

It is tragic that many, including the architect of this, don’t realize this is bad for liberty.

Jason Dreyzehner: So human liberty still has a shot

Pliny the Liberator 󠅫󠄼󠄿󠅆󠄵󠄐󠅀󠄼󠄹󠄾󠅉󠅭: better than ever

davidad: Rogue AIs are inevitable; systemic resilience is crucial.

If any open model can be used for any purpose by anyone, and there exist sufficiently capable open models that can do great harm, then either the great harm gets done, or, either before or after that happens, some combination of tech companies and governments cracks down on your ability to use those open models or institutes a dystopian surveillance state to find you if you try. You are not going to like the ways they do that crackdown.

I know we’ve all stopped noticing that this is true, because it turned out that you can ramp up the relevant capabilities quite a bit without us seeing substantial real world harm, the same way we’ve ramped up general capabilities without seeing much positive economic impact compared to what is possible. But with the agentic era and continued rapid progress this will not last forever and the signs are very clear.

They Took Our Jobs

Did they? Job gains are being revised downward, but GDP is not, which implies stronger productivity growth. If AI is not causing this, what else could it be?

As Tyler Cowen puts it, people constantly say ‘you see tech and AI everywhere but in the productivity statistics,’ but it seems like you now see it in the productivity statistics.

Erik Brynjolfsson (FT): While initial reports suggested a year of steady labour expansion in the US, the new figures reveal that total payroll growth was revised downward by approximately 403,000 jobs. Crucially, this downward revision occurred while real GDP remained robust, including a 3.7 per cent growth rate in the fourth quarter.

This decoupling — maintaining high output with significantly lower labour input — is the hallmark of productivity growth. My own updated analysis suggests a US productivity increase of roughly 2.7 per cent for 2025. This is a near doubling from the sluggish 1.4 per cent annual average that characterised the past decade.

Noah Smith: People asking if AI is going to take their jobs is like an Apache in 1840 asking if white settlers are going to take his buffalo

Bojan Tunguz: So … maybe?

Noah Smith: The answer is “Yes…now for the bad news”

Those new service sector jobs, also markets in everything.

society: I’m rent seeking in ways never before conceived by a human

I will begin offering my GPT wrapper next year, it’s called “an attorney prompts AI for you” and the plan is I run a prompt on your behalf so federal judges think the output is legally protected

This is the first of many efforts I shall call project AI rent seeking at bar.

Seeking rent is a strong temporary solution. It doesn’t solve your long term problems.

Derek Thompson asks why AI discourse so often includes both ‘this will take all our jobs within a year’ and also ‘this is vaporware’ and everything in between, pointing to four distinct ‘great divides.’

  1. Is AI useful—economically, professionally, or socially?
    1. Derek notes that some people get tons of value. So the answer is yes.
    2. Derek also notes some people can’t get value out of it, and attributes this to the nature of their jobs versus current tools. I agree this matters, but if you don’t find AI useful then that really is a you problem at this point.
  2. Can AI think?
    1. Yes.
  3. Is AI a bubble?
    1. This is more ‘will number go down at some point?’ and the answer is ‘shrug.’
    2. Those claiming a ‘real’ bubble where it’s all worthless? No.
  4. Is AI good or bad?
    1. Well, there’s that problem that If Anyone Builds It, Everyone Dies.
    2. In the short term, or if we work out the big issues? Probably good.
    3. But usually ‘good’ versus ‘bad’ is a wrong question.

The best argument they can find for ‘why AI won’t destroy jobs’ is once again ‘previous technologies didn’t net destroy jobs.’

Microsoft AI CEO Mustafa Suleyman predicts, nay ‘explains,’ that most of the tasks accountants, lawyers and other professionals currently undertake will be fully automatable by AI within the next 12 to 18 months.

Derek Thompson: I simply do not think that “most tasks professionals currently undertake” will be “fully automated by AI” within the next 12 to 18 months.

Timothy B. Lee: This conversation is so insanely polarized. You’ve got “nothing important is happening” people on one side and “everyone will be out of a job in three years” people on the other.

Suleyman often says silly things but in this case one must parse him carefully.

I actually don’t know what LindyMan wants to happen at the end of the day here?

LindyMan: What you want is AI to cause mass unemployment quickly. A huge shock. Maybe in 2-3 months.

What you don’t want is the slow drip of people getting laid off, never finding work again while 60-70 percent of people are still employed.

Gene Salvatore: The ‘Slow Drip’ is the worst-case scenario because it creates a permanent, invisible underclass while the majority looks away.

The current SaaS model is designed to maximize that drip—extracting efficiency from the bottom without breaking the top. To stop it, we have to invert the flow of capital at the architectural level.

I know people care deeply about inequality in various ways, but it still blows my mind to see people treating 35% unemployment as a worst-case scenario. It’s very obviously better than 50% and worse than 20%, and the worst case scenario is 100%?

If we get permanent 35% unemployment due to AI automation, but it stops there, that’s going to require redistribution and massive adjustments, but I would have every confidence that this would happen. We would have more than enough wealth to handle this; indeed, if we care, we already do, and in this scenario we are seeing massive economic growth.

They Kept Our Agents

Seth Lazar asks, what happens if your work says they have a right to all your work product, and that includes all your AI agents, agent skills and relevant documentation and context? Could this tie workers’ hands and prevent them from leaving?

My answer is mostly no, because you end up wanting to redo all that relatively frequently anyway, and duplication or reimplementation would not be so difficult and has its benefits, even if they do manage to hold you to it.

To the extent this is not true, I do not expect employers to be able to 'get away with' tying their workers' hands in this way in practice, both because of the practical difficulties of locking these things down and also because employees you want won't stand for it when it matters. There are alignment problems that exist between keyboard and chair.

The First Thing We Let AI Do

Lawfare’s Justin Curl, Sayash Kapoor & Arvind Narayanan go all the way to saying ‘AI won’t automatically make legal services cheaper,’ for three reasons. This is part of the ongoing ‘AI as normal technology’ efforts to show Nothing Ever Changes.

  1. AI access restrictions due to ‘unauthorized practice of law.’
  2. Competitive equilibria shift upwards as productivity increases.
  3. Human bottlenecks in the legal process.

Or:

  1. It’s illegal to raise legal productivity.
  2. If you raise legal productivity you get hit by Jevons Paradox.
  3. Humans will be bottlenecks to raising legal productivity.

Shakespeare would have a suggestion on what we should do in a situation like that.

These seem like good reasons gains could be modest and that we need to structure things to ensure best outcomes, but not reasons to not expect gains on prices of existing legal services.

  1. We already have very clear examples of gains, where we save quite a lot of time and money by using LLMs in practice today, and no one is making any substantial move to legally interfere. Their example is that the legal status of using AI to respond to debt collection lawsuits to help you fill out checkboxes is unclear. We don’t know exactly where the lines are, but it seems very clear that you can use AI to greatly improve ability to respond here and this is de facto legal. This paper claims AI services will be inhibited, and perhaps they somewhat are, but Claude and ChatGPT and Gemini exist and are already doing it.
  2. Most legal situations are not adversarial, although many are, and there are massive gains already being seen in automating such work. In fully adversarial situations increased productivity can cancel out, but one should expect decreasing marginal returns to ensure there are still gains, and discovery seems like an excellent example of where AI should decrease costs. Their counterexample of discovery only raised costs because it opened up vastly more additional human work, and we shouldn't expect that to apply here.
    1. AI also allows for vastly superior predictability of outcomes, which should lead to more settlements and ways to avoid lawsuits in the first place, so it’s not obvious that AI results in more lawsuits.
    2. The place I do worry about this a lot is where previously productivity was insufficiently high for legal action at all, or for threats of legal action to be credible. We might open up quite a lot of new action there.
    3. There is a big Levels of Friction consideration here. Our legal system is designed around legal actions being expensive. It may quickly break if legal actions become cheap.
  3. The human bottlenecks like judges could limit but not prevent gains, and can themselves use AI to improve their own productivity. The obvious solution is to outsource many judge tasks to AIs at least by default. You can give parties the option to appeal, at the risk of pissing off the human judge if you do it for no reason. They report that in Brazil AI is already accelerating judge work.

We can add:

  1. They point out that legal services are expensive in large part because they are ‘credence goods,’ whose quality is difficult to evaluate. However AI will make it much easier to evaluate quality of legal work.
  2. They point out a 2017 Clio study that lawyers only engaged in billable work for 2.3 hours a day and billed for 1.9 hours, because the rest was spent finding clients, managing administrative tasks and collecting payments. AI can clearly greatly automate much of the remaining ~6 hours, enabling lawyers to do and bill for several times more billable legal work. The reason for this is that the law doesn’t allow humans to service the other roles, but there’s no reason AI couldn’t service those roles. So if anything this means AI is unusually helpful here.
  3. If you increase the supply of law hours, basic economics says price drops.
  4. It sounds like our current legal system is deeply f***ed in lots of ways, perhaps AI can help us write better laws to fix that, or help people realize why that isn’t happening and stop letting lawyers win so many elections. A man can dream.
  5. If an AI can ‘draft 50 perfect provisions’ in a contract, then another AI can confirm that the provisions are perfect, and provide a proper summary of implications and check the provisions against your preferences. As they note, mostly humans currently agree to contracts without even reading them, so ‘forgoing human oversight’ would frequently not mean anything was lost.
  6. A lot of this ‘time for lawyers to understand complex contracts’ talk sounds like the people who said that humans would need to go over and fully understand every line of AI code so there would not be productivity gains there.

That doesn’t mean we can’t improve matters a lot via reform.

  1. On unlicensed practice of law, a good law would be a big win, but a bad law could be vastly worse than no law, and I do not trust lawyers to have our best interests at heart when drafting this rule. So actually it might be better to do nothing, since it is currently (to my surprise) working out fine.
  2. Adjudication reform could be great. AI could be an excellent neutral expert and arbitrate many questions. Human fallbacks can be left in place.

They do a strong job of raising considerations in different directions, much better than the overall framing would suggest. The general claim is essentially 'productivity gains get forbidden or eaten,' akin to the Samo Burja 'you cannot automate fake jobs' thesis.

Whereas I think that much of the lawyers' jobs are real, and also you can do a lot of automation of even the fake parts, especially in places where the existing system turns out not to do lawyers any favors. The place I worry, and why I think the core thesis is correct that total legal costs may rise, is getting the law involved in places it would previously have stayed out of.

In general, I think it is correct to think that you will find bottlenecks and ways for some of the humans to remain productive for even rather high mundane AI capability levels, but that this does not engage with what happens when AI gets sufficiently advanced beyond that.

roon: but i figure the existence of employment will last far longer than seems intuitive based on the cataclysmic changes in economic potentials we are seeing

Dean W. Ball: I continue to think the notion of mass unemployment from AI is overrated. There may be shocks in some fields—big ones perhaps!—but anyone who thinks AI means the imminent demise of knowledge work has just not done enough concrete thinking about the mechanics of knowledge work.

Resist the temptation to goon at some new capability and go “it’s so over.” Instead assume that capability is 100% reliable and diffused in a firm, and ask yourself, “what happens next?”

You will often encounter another bottleneck, and if you keep doing this eventually you’ll find one whose automation seems hard to imagine.

Labor market shocks may be severe in some job types or industries and they may happen quickly, but I just really don’t think we are looking at “the end of knowledge work” here—not for any of the usual cope reasons (“AI Is A Tool”) but because of the nature of the whole set of tasks involved in knowledge work.

Ryan Greenblatt: I think the chance of mass unemployment* from AI is overrated in 2 years and underrated in 7**. Same is true for many effects of AI.

* Putting aside gov jobs programs, people specifically wanting to employ a human, etc.

Concretely, I don’t expect mass unemployment (e.g. >20%) prior to full automation of AI R&D and fast autonomous robot doubling times at least if these occur in <5 years. If full automation of AI R&D takes >>5 years, then more unemployment prior to this becomes pretty plausible.
** Among SF/AI-ish people.

But soon after full automation of AI R&D (<3 years, pretty likely <1), I think human cognitive labor will no longer have much value

Dean Ball offers an example of a hard-to-automate bottleneck: The process of purchasing a particular kind of common small business. Owners of the businesses are often prideful, mistrustful, confused, embarrassed or angry. So the key bottleneck is not the financial analysis but the relationship management. I think John Pressman pushes back too strongly against this, but he’s right to point out that AI outperforms existing doctors on bedside manner without us having trained for that in particular. I don’t see this kind of social mastery and emotional management as being that hard to ultimately automate. The part you can’t automate is as always ‘be an actual human’ so the question is whether you literally need an actual human for this task.

Claire Vo goes viral on Twitter for saying that if you can’t do everything for your business in one day, then ‘you’ve been kicked out of the arena’ and you’re in denial about how much AI will change everything.

Settle down, everyone. Relax. No, you don’t need to be able to do everything in one day or else, that does not much matter in practice. The future is unevenly distributed, diffusion is slow and being a week behind is not going to kill you. On the margin, she’s right that everyone needs to be moving towards using the tools better and making everything go faster, and most of these steps are wise. But seriously, chill.

Legally Claude

The legal rulings so far have been that your communications with AI never have attorney-client privilege, so services like ChatGPT and Claude must, if requested to do so, turn over your legal queries, the same as Google turns over its searches.

Jim Babcock thinks the ruling was in error, and that this was more analogous to a word processor than a Google search. He says Rakoff was focused on the wrong questions and parallels, and expects this to get overruled, and that using AI for the purposes of preparing communication with your attorney will ultimately be protected.

My view and the LLM consensus is that Rakoff’s ruling likely gets upheld unless we change the law, but that one cannot be certain. Note that there are ways to offer services where a search can’t get at the relevant information, if those involved are wise enough to think about that question in advance.

Moish Peltz: Judge Rakoff just issued a written order affirming his bench decision, that [no you don’t have any protections for your AI conversations.]

Jim Babcock: Having read the text of this ruling, I believe it is clearly in error, and it is unlikely to be repeated in other courts.
The underlying facts of this case are that a criminal defendant used an AI chatbot (Claude) to prepare documents about defense strategy, which he then sent to his counsel. Those interactions were seized in a search of the defendant’s computers (not from a subpoena of Anthropic). The argument is then about whether those documents are subject to attorney-client privilege. The ruling holds that they are not.
The defense argues that, in this context, using Claude this way was analogous to using an internet-based word processor to prepare a letter to his attorney.
The ruling not only fails to distinguish the case with Claude from the case with a word processor, it appears to hold that, if a search were to find a draft of a letter from a client to his attorney written on paper in the traditional way, then that letter would also not be privileged.
The ruling cites a non-binding case, Shih v Petal Card, which held that communications from a civil plaintiff to her lawyer could be withheld in discovery… and disagrees with its holding (not just with its applicability). So we already have a split, even if the split is not exactly on-point, which makes it much more likely to be reviewed by higher courts.

Eliezer Yudkowsky: This is very sensible but consider: The *funniest* way to solve this problem would be to find a jurisdiction, perhaps outside the USA, which will let Claude take the bar exam and legally recognize it as a lawyer.

Predictions Are Hard, Especially About The Future, But Not Impossible

Freddie DeBoer takes the maximally anti-prediction position, so one can only go by events that have already happened. One cannot even logically anticipate the consequences of what AI can already do when it is diffused further into the economy, and one definitely cannot anticipate future capabilities. Not allowed, he says.

Freddie deBoer: I have said it before, and I will say it again: I will take extreme claims about the consequences of “artificial intelligence” seriously when you can show them to me now. I will not take claims about the consequences of AI seriously as long as they take the form of you telling me what you believe will happen in the future. I will seriously entertain evidence-backed observations, not speculative predictions.

That’s it. That’s the rule; that’s the law. That’s the ethic, the discipline, the mantra, the creed, the holy book, the catechism. Show me what AI is currently doing. Show me! I’m putting down my marker here because I’d like to get out of the AI discourse business for at least a year – it’s thankless and pointless – so let me please leave you with that as a suggestion for how to approach AI stories moving forward. Show, don’t tell, prove, don’t predict.

Freddie rants repeatedly that everyone predicting AI will cause things to change has gone crazy. I do give him credit for noticing that even sensible ‘skeptical’ takes are now predicting that the world will change quite a lot, if you look under the hood. The difference is he then uses that to call those skeptics crazy.

Normally I would not mention someone doing this unless they were far more prominent than Freddie, but what makes this different is he virtuously offers a wager, and makes it so he has to win ALL of his claims in order to win three years later. That means we get to see where his Can’t Happen lines are.

Freddie deBoer: For me to win the wager, all of the following must be true on Feb 14, 2029:

Labor Market:

  1. The U.S. unemployment rate is equal to or lower than 18%
  2. Labor force participation rate, ages 25-54, is equal to or greater than 68%
  3. No single BLS occupational category will have lost 50% or more of jobs between now and February 14th 2029

Economic Growth & Productivity:

  1. U.S. GDP is within -30% to +35% of February 2026 levels (inflation-adjusted)
  2. Nonfarm labor productivity growth has not exceeded 8% in any individual year or 20% for the three-year period

Prices & Markets:

  1. The S&P 500 is within -60% to +225% of the February 2026 level
  2. CPI inflation averaged over 3 years is between -2% and +18% annually

Corporate & Structural:

  1. The Fortune 500 median profit margin is between 2% and 35%
  2. The largest 5 companies don’t account for more than 65% of the total S&P 500 market cap

White Collar & Knowledge Workers:

  1. “Professional and Business Services” employment, as defined by the Bureau of Labor Statistics, has not declined by more than 35% from February 2026
  2. Combined employment in software developers, accountants, lawyers, consultants, and writers, as defined by the Bureau of Labor Statistics, has not declined by more than 45%
  3. Median wage for “computer and mathematical occupations,” as defined by the Bureau of Labor Statistics, is not more than 60% lower in real terms than in February 2026
  4. The college wage premium (median earnings of bachelor’s degree holders vs high school only) has not fallen below 30%

Inequality:

  1. The Gini coefficient is less than 0.60
  2. The top 1%’s income share is less than 35%
  3. The top 0.1% wealth share is less than 30%
  4. Median household income has not fallen by more than 40% relative to mean household income

Those are the bet conditions. If any one of those conditions is not met, if any of those statements are untrue on February 14th 2029, I lose the bet. If all of those statements remain true on February 14th 2029, I win the bet. That’s the wager.

The thing about these conditions is they are all super wide. There’s tons of room for AI to be impacting the world quite a bit, without Freddie being in serious danger of losing one of these. The unemployment rate has to jump to 18% in three years? Productivity growth can’t exceed 8% a year?

There’s a big difference between ‘the skeptics are taking crazy pills’ and ‘within three years something big, like really, really big, is going to happen economically.’

Claude was very confident Freddie wins this bet. Manifold is less sure, putting Freddie’s chances around 60%. Scott Alexander responded proposing different terms, and Freddie responded in a way I find rather disingenuous but I’m used to it.

Many Worlds

There is a huge divide between those who have used Claude Code or Codex, and those who have not. The people who have not, which alas includes most of our civilization’s biggest decision makers, basically have no idea what is happening at this point.

This is compounded by:

  1. The people who use CC or Codex also have used Claude Opus 4.6 or at least GPT-5.2 and GPT-5.3-Codex, so they understand the other half of where we are, too.
  2. Whereas those who refused to believe it or refused to ever pay a dollar for anything keep not trying new things, so they are way farther behind than the free offerings would suggest, and even if they use ChatGPT they don’t have any idea that GPT-5.2-Thinking and GPT-5.2 are fundamentally different.
  3. The people who use CC or Codex are those who have curiosity to try, and who lack motivated reasons not to try.

Caleb Watney: Truly fascinating to watch the media figures who grasp Something Is Happening (have used claude code at least once) and those whose priors are stuck with (and sound like) 4o

Biggest epistemic divide I’ve seen in a while.

Alex Tabarrok: Absolutely stunning the number of people who are still saying, “AI doesn’t think”, “AI isn’t useful,” “AI isn’t creative.” Sleepwalkers.

Zac Hill: Watched this play out in real time at a meal yesterday. Everyone was saying totally coherent things for a version of the world that was like a single-digit number of weeks ago, but now we are no longer in that world.

Zac Hill: Related to this, there’s a big divide between users of Paid Tier AI and users of Free Tier AI that’s kinda sorta analogous to Dating App Discourse Consisting Exclusively of Dudes Who Have Never Paid One (1) Dollar For Anything. Part of understanding what the tech can do is unlocking your own ability to use it.

Ben Rasmussen: The paid difference is nuts, combined with the speed of improvement. I had a long set of training on new tools at work last week, and the amount of power/features that was there compared to the last time I had bothered to look hard (last fall) was crazy.

There is then a second divide, between those who think ‘oh look what AI can do now’ and those who think ‘oh look what AI will be able to do in the future,’ and then a third between those who do and do not flinch from the most important implications.

Hopefully seeing the first divide loud and clear helps get past the next two?

Bubble, Bubble, Toil and Trouble

In case it was not obvious, yes, OpenAI has a business model. Indeed they have several, only one of which is ‘build superintelligence and then have it model everything including all of business.’

Ross Barkan: You can ask one question: does AI have a business model? It’s not a fun answer.

Timothy B. Lee: I suspect this is what’s going on here. And actually it’s possible that Barkan thinks these two claims are equivalent.

A Bold Prediction

Elon Musk predicts that AI will bypass coding entirely by the end of the year and directly produce binaries. Usually I would not pick on such predictions, but he is kind of important and the richest man in the world, so sure, here’s a prediction market on that where I doubled his time limit, which is at 3%.

Elon Musk just says things.

Brave New World

Tyler Cowen says that, like after the Roman Empire or American Revolution or WWII, AI will require us to ‘rebuild our world.’

Tyler Cowen: And so we will [be] rebuilding our world yet again. Or maybe you think we are simply incapable of that.

As this happens, it can be useful to distinguish “criticisms of AI” from “people who cannot imagine that world rebuilding will go well.” A lot of what parades as the former is actually the latter.

Jacob: Who is this “we”? When the strong AI rebuilds their world, what makes you think you’ll be involved?

I think Tyler’s narrow point is valid if we assume AI stays mundane, and that the modern world suffers from treating so many things as sacred entitlements or Too Big To Fail and from being unwilling to rebuild or replace them, with the price of that continuing to rise. Historically it usually takes a war to force people’s hand, and we’d like to avoid going there. We keep kicking various cans down the road.

A lot of the reason that we have been unable to rebuild is that we have become extremely risk averse, loss averse and entitled, and unwilling to sacrifice or endure short term pain, and we have made an increasing number of things effectively sacred values. A lot of AI talk is people noticing that AI will break one or another sacred thing, or pit two sacred things against each other, and not being able to say out loud that maybe not all these things can or need to be sacred.

Even mundane AI does two different things here.

  1. It will invalidate increasingly many of the ways the current world works, forcing us to reorient and rebuild so we have a new set of systems that works.
  2. It provides the productivity growth and additional wealth that we need to potentially avoid having to reorient and rebuild. If AI fails to provide a major boost and the system is not rebuilt, the system is probably going to collapse under its own weight within our lifetimes, forcing our hand.

If AI does not stay mundane, the world utterly transforms, and to the extent we stick around and stay in charge, or want to do those things, yes we will need to ‘rebuild,’ but that is not the primary problem we would face.

Cass Sunstein claims in a new paper that you could in theory create a ‘[classical] liberal AI’ that functioned as a ‘choice engine’ that preserved autonomy, respected dignity and helped people overcome bias and lack of information and personalization, thus making life more free. It is easy to imagine, again in theory, such an AI system, and easy to see that a good version of this would be highly human welfare-enhancing.

Alas, Cass is only thinking on the margin and addressing one particular deployment of mundane AI. I agree this would be an excellent deployment, we should totally help give people choice engines, but it does not solve any of the larger problems even if implemented well, and people will rapidly end up ‘out of the loop’ even if we do not see so much additional frontier AI progress (for whatever reason). This alone cannot, as it were, rebuild the world, nor can it solve problems like those causing the clash between the Pentagon and Anthropic.

Augmented Reality

Augmented reality is coming. I expect and hope it does not look like this, and not only because you would likely fall over a lot and get massive headaches all the time:

Eliezer Yudkowsky: Since some asked for counterexamples: I did not live this video a thousand times in prescience and I had not emotionally priced it in

michael vassar: Vernor Vinge did though.

Autism Capital: This is actually what the future will look like. When wearable AR glasses saturate the market a whole generation will grow up only knowing reality through a mixed virtual/real spatial computing lens. It will be chaotic and stimulating. They will cherish their digital objects.

Francesca Pallopides: Many users of hearing aids already live in an ~AR soundscape where some signals are enhanced and many others are deliberately suppressed. If and when visual AR takes off, I expect visual noise suppression to be a major basal function.

Francesca Pallopides: I’ve long predicted AR tech will be used to *reduce* the perceived complexity of the real world at least as much as it adds on extra layers. Most people will not want to live like in this video.

Augmented reality is a great idea, but simplicity is key. So is curation. You want the things you want when you want them. I don’t think I’d go as far as Francesca, but yes I would expect a lot of what high level AR does is to filter out stimuli you do not want, especially advertising. The additions that are not brought up on demand should mostly be modest, quiet and non-intrusive.

Quickly, There’s No Time

Ajeya Cotra makes the latest attempt to explain how a lot of disagreements about existential risk and other AI things still boil down to timelines and takeoff expectations.

If we get the green line, we’re essentially safe, but that would require things to stall out relatively soon. The yellow line is more hopeful than the red one, but scary as hell.

Is it possible to steer scientific development or are we ‘stuck with the tech tree?’

Tao Burga takes the stand that human agency can still matter, and that we often have intentionally reached for better branches, or better branches first, and had that make a big difference. I strongly agree.

We’ve now gone from ‘super short’ timelines of things like AI 2027 (as in, AGI and takeoff could start as soon as 2027) to ‘long’ timelines (as in, don’t worry, AGI won’t happen until 2035, so those people talking about 2027 were crazy), to now many rumors of (depending on how you count) 1-3 years.

Phil Metzger: Rumors I’m hearing from people working on frontier models is that AGI is later this year, while AI hard-takeoff is just 2-3 years away.

I meant people in the industry confiding what they think is about to happen. Not [the Dario] interview.

Austen Allred: Every single person I talk to working in advanced research at frontier model companies feels this way, and they’re people I know well enough to know they’re not bluffing.

They could be wrong or biased or blind due to their own incentives, but they’re not bluffing.

jason: heard the same whispers from folks in the trenches, they’re legit convinced we’re months not years away from agi, but man i remember when everyone said full self driving was just around the corner too

What caused this?

Basically nothing you shouldn’t have expected.

The move to the ‘long’ timelines was based on things as stupid as ‘this is what they call GPT-5 and it’s not that impressive.’

The move to the new ‘short’ timelines is based on, presumably, Opus 4.6 and Codex 5.3 and Claude Code catching fire and OpenClaw and so on, and I’d say Opus 4.5 and Opus 4.6 exceeded expectations but none of that should have been especially surprising either.

We’re probably going to see the same people move around a bunch in response to more mostly unsurprising developments.

What happened with Bio Anchors? This was a famous-at-the-time timeline projection paper from Ajeya Cotra, based around the idea that AGI would take similar compute to what it took evolution, predicting AGI around 2050. Scott Alexander breaks it down, and the overall model holds up surprisingly well except it dramatically underestimated the rate of algorithmic efficiency improvements, and if you adjust that you get a prediction of 2030.
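
To make the mechanism concrete, here is a toy sketch in Python (my own illustration, not the actual Bio Anchors model; every number in it is made up purely to show the direction of the effect): if effective compute has to grow by some fixed number of orders of magnitude, raising the assumed rate of algorithmic efficiency gains pulls the crossing year sharply earlier.

```python
# Toy illustration, NOT the actual Bio Anchors model. Effective compute grows from both
# hardware (spending and price-performance) and algorithmic efficiency; a faster assumed
# algorithmic rate pulls the threshold-crossing year earlier. All parameters are made up.

def crossing_year(gap_oom: float, hardware_oom_per_year: float,
                  algo_oom_per_year: float, start_year: int = 2020) -> float:
    """Year when effective compute has grown by gap_oom orders of magnitude."""
    return start_year + gap_oom / (hardware_oom_per_year + algo_oom_per_year)

gap = 15.0  # hypothetical remaining gap, in orders of magnitude of effective compute
print(crossing_year(gap, hardware_oom_per_year=0.4, algo_oom_per_year=0.1))  # 2050.0
print(crossing_year(gap, hardware_oom_per_year=0.4, algo_oom_per_year=1.1))  # 2030.0
```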

If Anyone Builds It, We Can Avoid Building The Other It And Not Die

Saying ‘you would pause if you could’ is the kind of thing that gets people labeled with the slur ‘doomers’ and otherwise viciously attacked by exactly people like Alex Karp.

Instead Alex Karp is joining Demis Hassabis and Dario Amodei in essentially screaming for help with a coordination mechanism, whether he realizes it or not.

If anything he is taking a more aggressive pro-pause position than I do.

Jawwwn: Palantir CEO Alex Karp: The luddites arguing we should pause AI development are not living in reality and are de facto saying we should let our enemies win:

“If we didn’t have adversaries, I would be very in favor of pausing this technology completely, but we do.”

David Manheim: This was the argument that “realists” made against the biological weapons convention, the chemical weapons convention, the nuclear test ban treaty, and so on.

It’s self-fulfilling – if you decide reality doesn’t allow cooperation to prevent disasters, you won’t get cooperation.

Peter Wildeford: Wow. Palantir CEO -> “If we didn’t have adversaries, I would be very in favor of pausing this technology completely, but we do.”

I agree that having adversaries makes pausing hard – this is why we need to build verification technology so we have the optionality to make a deal.

We should all be able to agree that a pause requires:

  1. An international agreement to pause.
  2. A sufficiently advanced verification or other enforcement mechanism.

Thus, we should clearly work on both of these things, as the costs of doing so are trivial compared to the option value we get if we can achieve them both.

In Other AI News

Anthropic adds Chris Liddell to its board of directors, bringing lots of corporate experience and also his prior role as Deputy White House Chief of Staff during Trump’s first term. Presumably this is a peace offering of sorts to both the market and White House.

India joins Pax Silica, the Trump administration’s effort to secure the global silicon supply chain. Other core members are Japan, South Korea, Singapore, the Netherlands, Israel, the UK, Australia, Qatar and the UAE. I am happy to have India onboard, but I am deeply skeptical of the level of status given here to Qatar and the UAE, when as far as I can tell they are only customers (and I have misgivings about how we deal with that aspect as well, including how we got to those agreements). Among others missing, Taiwan is not yet on that list. Taiwan is arguably the most important country in this supply chain.

GPT-5.2 derives a new result in theoretical physics.

OpenAI is also participating in the ‘1st Proof’ challenge.

Dario Amodei and Sam Altman conspicuously decline to hold hands or make eye contact during a photo op at the AI Summit in India.

Anthropic opens an office in Bengaluru, India, its second in Asia after Tokyo.

Anthropic announces partnership with Rwanda for healthcare and education.

AI Futures gives the December 2025 update on how their thinking and predictions have evolved over time, how the predictions work, and how our current world lines up against the predicted world of AI 2027.

Daniel Kokotajlo: The primary question we estimate an answer to is: How fast is AI progress moving relative to the AI 2027 scenario?
Our estimate: In aggregate, progress on quantitative metrics is at (very roughly) 65% of the pace that happens in AI 2027. Most qualitative predictions are on pace.

In other words: Like we said before, things are roughly on track, but progressing a bit slower.

OpenAI ‘accuses’ DeepSeek of distilling American models to ‘gain an edge.’ Well, yes, obviously they are doing this, I thought we all knew that? Them’s the rules.

MIRI’s Nate Soares went to the Munich Security Conference full of generals and senators to talk about existential risk from AI, and shares some of his logistical mishaps and also his remarks. It’s great that he was invited to speak, wasn’t laughed at and many praised him and also the book If Anyone Builds It, Everyone Dies. Unfortunately all the public talk was mild and pretended superintelligence was not going to be a thing. We have a long way to go.

If you let two AIs talk to each other for a while, what happens? You end up in an ‘attractor state.’ Groks will talk weird pseudo-words in all caps, GPT-5.2 will build stuff but then get stuck in a loop, and so on. It’s all weird and fun. I’m not sure what we can learn from it.

India is hosting the latest AI Summit, and like everyone else is treating it primarily as a business opportunity to attract investment. The post also covers India’s AI regulations, which are light touch and mostly rely on their existing law. Given how overregulated I believe India is in general, ‘our existing laws can handle it’ and worry about further overregulation and botched implementation have relatively strong cases there.

Introducing

Qwen 3.5-397B-A17B, HuggingFace here, 1M context window.

We have some benchmarks.

Tiny Aya, a family of massively multilingual models that can fit on phones.

Get Involved

Tyler John has compiled a plan for a philanthropic strategy for the AGI transition called The Foundation Layer, and he is hiring.

Tyler’s effort is a lot like Bengio’s State of AI report. It describes all the facts in a fashion engineered to be calm and respectable. The fact that by default we are all going to die is there, but if you don’t want to notice it you can avoid noticing it.

There are rooms where this is your only move, so I get it, but I don’t love it.

Tyler John: The core argument?

The best available evidence based on benchmarks, expert testimony, and long-term trends imply that we should expect smarter-than-human AI around 2030. Once we achieve this: billions of superhuman AIs deployed everywhere.

This leads to 3 major risks:

  1. AI will distribute + invent dual-use technologies like bioweapons and dirty bombs
  2. If we can’t reliably control them, and we automate most of human decision-making, AI takes over
  3. If we can control them, a small group of people can rule the world

Society is rushing to give AI control of companies, government decision-making, and military command and control. Meanwhile AI systems disable oversight mechanisms in testing, lie to users to pursue their own goals, and adopt misaligned personas like Sydney and Mecha Hitler.

I may sound like a doomer, but relative to many people who understand AI I am actually an optimist, because I think these problems can be solved. But with many technologies, we can fix problems gradually over decades. Here, we may have just five years.

I advocate philanthropy as a solution. Unlike markets, philanthropy can be laser focused on the most important problems and act to prepare us before capital incentives exist. Unlike governments, philanthropy can deploy rapidly at scale and be unconstrained by bureaucracy.

I estimate that foundations and nonprofit organizations have had an impact on AGI safety comparable to any AI lab or government, for about 1/1,000th of the cost of OpenAI’s Project Stargate.

Want to get started? Look no further than Appendix A for a list of advisors who can help you on your journey, co-funders who can come alongside you, and funds for a more hands-off approach. Or email me at [email protected]

Blue Rose is hiring an AI Politics Fellow.

Show Me the Money

Anthropic raises $30 billion at $380 billion post-money valuation, a small fraction of the value it has recently wiped off the stock market, in the totally normal Series G, so only 19 series left to go. That number seems low to me, given what has happened in the last few months with Opus 4.5, Opus 4.6 and Claude Code.

andy jones: i am glad this chart is public now because it is bananas. it is ridiculous. it should not exist.

it should be taken less as evidence about anthropic’s execution or potential and more as evidence about how weird the world we’ve found ourselves in is.

Tim Duffy: This one is a shock to me. I expected the revenue growth rate to slow a bit this year, but y’all are already up 50% from end of 2025!?!!?!

Investment in AI is accelerating to north of $1 trillion a year.

The Week In Audio

Trailer for ‘The AI Doc: Or How I Became An Apocaloptimist,’ movie out March 27. Several people who were interviewed or involved have given it high praise as a fair and balanced presentation.

Ross Douthat interviews Dario Amodei.

Y Combinator podcast hosts Boris Cherny, creator of Claude Code.

Ajeya Cotra on 80,000 Hours.

Tomorrow we continue with Part 2.



Discuss

Terminal Cynicism

2026-02-19 21:51:46

Published on February 19, 2026 1:51 PM GMT

I believe that many have reached Terminal Cynicism.

Terminal Cynicism is a level of cynicism that leads one to perceive everything negatively, even obviously good things. It is cynicism so extreme that it renders one incapable of productive thought or action.

The most common instance is people refusing to engage with politics in any shape because “The System is corrupt”, thus neglecting it and leaving it to decay.

At a personal level, Terminal Cynicism is dangerous. It feeds on weakness and insecurity, and alienates people.

I also believe that Terminal Cynicism is not always natural and organic. Instead, I believe it is quite often caused by agitators who spread Fear, Uncertainty and Doubt (FUD).

And defeating FUD requires a lot of clarity. So let’s clarify things.

Examples

First, consider a couple of beliefs commonly intertwined with Terminal Cynicism…

“Air Conditioning is bad.”

A plethora of articles explain at length how bad AC is: because of greenhouse gases and cooling fluids, because many poor people don’t have access to it, and how there are actually many great alternatives to it!

But these articles are taking the problem in the wrong direction. AC is good: it makes people’s lives better.

If there’s a problem with how much energy people use, we can tax or ration it. If there’s a problem with the greenhouse emissions coming from people’s energy consumption, we can tax or ration them.

Trying to manage problems that far upstream in the supply chain by targeting individual consumption is more about moralisation than efficiency.

“Doing Politics is bad.”

There are far too many smart people who believe that doing politics at all is bad. That all politicians are bad, that wanting to do politics is itself a red flag, and even that it is meaningless to want to do politics because everything is corrupted.

This is harmful and self-defeating. Our institutions rely on people actually doing politics. We need competent politicians, competent citizens, people invested in political parties, and more.

Erasing oneself from politics is the central example of Terminal Cynicism.

“Vaccines are bad.”

The anti-vax movement has moved from a fringe conspiracy theory to a widespread belief. It ranges from the belief that vaccines do not work to claims of vaccines containing microchips that interact with 5G towers.

This is wrong. But beyond its wrongness, it is the result of escalating distrust. No honest investigation ends up with “5G microchips” as its conclusion.

The facts of the matter are fairly straightforward. Vaccines as a class have been a powerful tool to erase diseases and slow their spread. While not all vaccines are equally good, they are among the triumphs of medicine.

Let’s be clear: it makes sense to discuss the efficacy and the risk profile of a vaccine. This is why we have long and documented procedures to establish a vaccine as safe and useful.

Similarly, it also makes sense to discuss whether vaccines should be mandatory. This is a non-trivial public health question. Even though mandatory vaccines let us eradicate diseases in the past, this did come at some cost to personal freedom and bodily autonomy.

Overall though, vaccines have been a public health victory. There are many reasonable compromises and trade-offs that are meaningfully debatable. But not whether vaccines contain Bill Gates’ microchips. That one is just Terminal Cynicism.

There are many more examples of this type of Terminal Cynicism.

Terminally Cynical beliefs are usually wrong, but not always. Whether a belief is Terminally Cynical has little to do with the philosophical intricacies of the underlying question, or with the sometimes subtle truth of the matter.

Instead, what makes it deeply wrong is that it stems from a cynicism so extreme that it will take a thing the cynic themselves thinks is good and frame it as bad on purpose.

Here are examples of Terminally Cynical beliefs, along with an example of the cynical justification.

Sometimes, I give two such justifications. I do so when Terminal Cynicism has led both the typical far-left and far-right clusters to independently build their own cynical justification for why a given virtue is actually bad.

Other times, I give two pairs of beliefs. I do so when Terminal Cynicism prevents either side from looking for balance or a synthesis.

Consider:

“Science” is bad: it’s a white construct demeaning traditional wisdom. “Universities” are bad: they’re an institution that has been fully captured by wokes.

“A Strong Military” is bad: we should never enforce our version of the international order, and let others do so instead. “Opposing Russia” is bad: we should never enforce our version of the international order, and let others do so instead.

“Atheism” is bad: it may be factually correct, but atheists are responsible for the most destructive totalitarian regimes, and atheism doesn’t literally answer all questions. “Religion” is bad: it may help people live better lives, but religious people are responsible for many of the worst moral systems in the modern world, and religion is literally wrong on a bunch of factual questions.

“A Solid Police and Judicial System” is bad: they sometimes make mistakes, which means we should defund them. “A Solid Police and Judicial System” is bad: due process often lets criminals get away, which means we should completely bypass it.

“Hard Work” is bad: it’s a bourgeois fantasy, licking the boot and helping capitalism. “Hard Work” is bad: The Elites rig everything. Don’t be a wage-cuck, scam people and go all in on crypto.

“Power” is bad: being a victim is morally better.

“The System” is bad: because it is hegemonic, it is responsible for every bad thing that happens.

And one of the major endpoints of Terminal Cynicism: “Democracy” is bad: people are evil and stupid. Instead, people I like should force their will on everyone.

Mechanisms

Terminal Cynicism may seem contradictory. It is a complete reversal of what’s good and bad. And yet, as we see above, it’s everywhere. So many people will take obviously good things and say they are bad for very weak reasons.

It is both consequential (it has consequences!) and absurd: it makes little sense and is self-contradictory.

Thus, I think the phenomenon warrants the search for a solid explanation. My personal explanation features two parts.
The first one is Abstract Idealism, where people don’t care for the real world.
The second is Vice Signalling, when people commit bad actions specifically to be noticed.

Abstract Idealism

The first one is Abstract Idealism. I have written more at length about Abstract Idealism here.

Abstract Idealism is the phenomenon where people refuse to consider the real world. Instead, they waste most of their thoughts on imaginary worlds.

A popular imaginary world is The Revolution. People who believe in The Revolution spend a lot of their time thinking about how great things would be there (or after it happens, depending on the person). They either fantasise about purging the world of its corrupted people and systems, or about how everything will be perfect after.

Another popular imaginary world is When My People will hold all the Power. People who believe in it are willing to sacrifice a lot to get “their people” to win. This is all justified by the fact that once their people win, they’ll finally be able to reshape the world to remove all their problems.

Abstract Idealists are fond of many more epic imaginary worlds. The world where Everyone is Nice and Enlightened. The world of Ancapistan, the Anarcho-capitalist paradise. [Fill in your most disliked utopia.]

Abstract Idealists are constantly disappointed by the real world. They compare the real world to their Ideal one, and feel it is never enough.

All the dirty systems that are necessary in our real world are unneeded in theirs. In the real world, we need ways to deal with our irreconcilable disagreements, our sociopaths, our vices, our traumas and our terrible mistakes. So we have armies, police forces, and prisons.

But to an Abstract Idealist, these are warts and blemishes that must be eliminated. There is no such need for them in their world. It is known in advance who is Right, so there is no need for armies: all must submit to the Right. It is known in advance who is Good and who is Evil: so we must simply purge ourselves of Evil, and rehabilitate the Mistaken.

In the real world, we face so many problems. We are not all as moral, productive or smart as each other; we must triage between the sick, the elderly, workers and children; we are not infinitely altruistic; we have little self-discipline and self-awareness; and so on and so forth.

These problems are far from being solved. The solutions we have collectively come up with so far are all imperfect: markets, states, psychology, social norms, culture, philosophies and religions.

These solutions may look good to us, because we imagine what our world would look like without them. But to an Abstract Idealist, they look utterly terrible. In their world, there never is anything so impure.

Often, it gets to the point where the parts of life that derive their meaning from the fact that we live in the real world are to be erased on the altar of their Ideals.

No one needs to have power. There is no one to defend against. Everyone is inoffensive and shares the same values, so there are never any fights.

No one needs to have children or to work. All our communities and all of our civilisation can be provided by AI and/or hyper-efficient communism.

No one needs to be disciplined and to respect any norms. People acting however they want naturally lead to good outcomes.

Vice Signalling

 

When it doesn’t lead to Extinction Risks from AI or Misanthropy, Abstract Idealism can be endearing. Anyone with an Artist or a Deep Nerd among their friends knows the feeling we get when we see a friend fully dedicated to their craft, completely disconnected from how useful or not it may be.

But the other mechanism at play in Terminal Cynicism, Vice Signalling, is the opposite.

Vice Signalling is bragging to others about one’s willingness to believe, say and do things that they know are bad.

Beyond its moral failure, it may seem self-defeating: why would one ever do that? But it can in fact be useful in many situations.

Most notably, when you’re a teenager, among other equally underdeveloped teenagers. To show that you’re cool, one common strategy is to go against all the things that authorities say are good. You’ll incur risks of accidents, you’ll disturb other people, you’ll violate norms, you’ll damage public property.

That all of these behaviours are bad and costly is the point. You are demonstrating how bad you are willing to be in order to be cool. If it were good for you and others, then it would be a worse signal: you’d already have other reasons to do it. It wouldn’t demonstrate how cool and disregarding of authority you are.

Among adults, there is another common cluster of Vice Signalling. It is known by many names: Rage Baiters, Drama Queens, Clout Chasers, Engagement Farmers. The principle is the same, they aim to get more attention by being shocking. And thus the worse the things they say or do, the more they gain.

However, I have found that the type of Vice Signalling that leads to Terminal Cynicism is a bit different from the ones above.

From a Vice Signalling standpoint, the main engine of Terminal Cynicism is Contrarianism.

Contrarians love to contradict. They consistently go against the status quo, common sense, norms, and consensus.

That way, they can signal their uniqueness, their willingness to entertain thoughts everyone else deems taboo, and their superiority to mundane morals.

Nowadays, Contrarianism has become fashionable.

Everyone is against The System.
Everyone is against our Institutions.
Everyone is against Politicians.
Everyone is against the norms and traditions that have made things better.
Everyone is against everything.

Vice Signalling lets one signal how contrarian they are. The more vicious, the more contrarian, the more special they are.

“Oh, you are on the side of people who find good things good? How quaint! How mundane! What are you? An NPC? A sheeple?”

“So you think that AC is good? That Vaccines are good? What, that Science is good? Nay, that Education is good? Please!”

“Oh, so you hate Racism? Well, I for one hate Whiteness! I even hate Civilisation!”

It’s a constant race to the bottom. Who will be the most offensive Contrarian and get away with it?

Abstract Idealism meets Vice Signalling

 

The worst lies at the intersection of the two.

Because the real world does not matter to an Abstract Idealist, they don’t even feel bad when Vice Signalling. It is purely beneficial; only gains, no cost!

They denigrate and worsen the real world, without care for how much worse they are making their own environment.

It is the entire point of The Revolution. Any damage to the existing order is progress toward the Liberation of everyone.

It is the entire point of Accelerationism. Everything is justified when you’re trying to Change The World as fast as possible.

It is a Vicious Circle. As they commit more visibly bad actions, they gain the image of a bad person, which alienates them from the people who don’t tolerate it. This in turn gets them to only interact with people who tolerate such actions, to have it become a central part of their identity, and ultimately to commit even worse actions.

If you’ve never met any, it is hard to tell you how damaging their demeanour and behaviour can be.

They take pride in dissing institutions, functioning economies, elites, civilisation, science, and all that is good.

They know they look smart to their peers when they come up with a counterintuitive reason for why something good is actually bad.

They think it is cool to cheat The System and defraud it, because The System is bad.

They think everything is corrupted, and that The System is full of bad intents. As a result, they should also never interact with The System from a place of good faith: nothing good would result from that.

They are agitators, saboteurs, troublemakers. They often spend their energy thinking of ways to make things worse.

Not only in conversations, but often through actions. They’ll use whatever modicum of power they get for nefarious purposes. Ultimately, they’ll subvert the existing institutions for their political goals.

They may even ally themselves with terrible people, just to cause chaos. Demagogues, conspiracy theorists, political islamists. “The enemy of my enemy is my friend.” And when The System is your enemy, every defector is your friend.

Terminally Cynical Art

 

I believe Art is a strong sign of how pervasive Terminal Cynicism has become.

There’s so much fiction about breaking the mould. I think it’s been a long time since I have seen a piece of art praising the mould as good, efficient or beautiful.[1]

Similarly, there’s so much fiction about dystopias. The “Black Mirror”-ification of everything.

So many stories are about framing the protagonist’s unhappiness and suffering as the fault of The System, about framing most people’s perceived happiness as shallow and corrupt, this type of theme.[2]

It may sound strange, but I view this type of art as a fantasy.

It posits that The System is independent of people, that people have no agency over their fate, and thus that The System’s inability to make people feel happy is unfair. It considers that people deserve happiness by default.

That’s the fantasy. That we are all innocent, and that The System is doing this to us.

In reality though, The System is us. We are the ones currently failing to build ourselves a better life, despite centuries of technological improvements.

Reconciling our reality with the fantasy takes vision. One must come up with a world where people are both agents and subjects of The System. In this fantasy, through their action, people would improve The System and thus their lives.

This requires far more creativity and understanding of the world than coming up with “lore”. Any creative artist can come up with a new species of humans with 4 arms, blue skin, supernatural abilities, a different language, or whatever change that doesn’t suggest anything interesting for The System.

This lack of vision leads to the “We live in a Society” syndrome, where artists make ignorant shallow societal commentary, by pointing to a necessary evil like prisons.

Sometimes, they do this out of convenience. Artists often need to raise the stake of their stories, and having a character fight with The System is a common trope that doesn’t require much imagination.[3]

More often than not though, artists are genuinely not self-aware. Very few (artists or not) have been in a position to Govern a large group of people. And if one is the type of artist to not deeply research topics for weeks or months, they will never interact with people who do govern, nor study the history of those who have.

As a result, they do not even realise what type of experience is relevant to the conversation. They truly believe that their personal voice is something special to bring. Thus, we keep being fed an endless supply of shallow Art telling us The System Is Evil.

Art has discovered a formula that works for anti-system stories. The Hero sees a problem, tries to solve it within The System, fails, and decides to fight The System instead.
The Game is to make people feel catharsis and vindication when they recognise a flaw of The System in the “work of art”.[4]

This is similar to how Social Media found its formula. There, The Game is to feed people a bunch of slop 90% of the time, so that they feel a dopamine hit when they unearth a “gem” the remaining 10%. In both cases, the experience makes the “consumer” worse off.

Art should inspire, and Social Media should connect. I believe that in both cases, Terminal Cynicism explains why their creators are not doing better, and we are not demanding better from them.

Conclusion

 

Terminal Cynicism is a serious problem.

I hope this article helps some of my readers immediately recognise Terminal Cynicism for what it is, develop an aversion to it, and notice when people spread it.

I have already written a follow-up, focused on who instigates it. Here though, I wanted to just show how it works.

On this, cheers!

 

  1. ^

    An exception may be The Path of Ascension. It’s a nice book series, available on Amazon Kindle.

  2. ^

    Womp womp. Never mind it being wrong, it is so meek.

    It is one thing to feel bad and like The System has it for us.

    It is another to spread this sentiment to everyone, as if to reassure oneself by having others confirm they feel similarly.

  3. ^

    It may erode trust in The System and make people less inclined to improve it, but who cares?

    Evil behaviour often comes from laziness and negligence, rather than bad intents.

  4. ^

    I tend to perceive it as formulaic slop. While the first few I watched as a teenager were inspiring (“Wow! You can in fact go against The System!”), I am now disgusted by their omnipresence.



Discuss

Aleatoric Uncertainty Is A Skill Issue

2026-02-19 21:11:37

Published on February 19, 2026 1:11 PM GMT

Aleatoric Uncertainty Is A Skill Issue

Epistemic status: shitpost with a point

Disclaimer: This grew out of a conversation with Claude. The ideas are mine, the writeup is LLM-generated and then post-edited to save time and improve the flow


You know the textbook distinction. Epistemic uncertainty is what you don't know. Aleatoric uncertainty is what can't be known — irreducible randomness baked into the fabric of reality itself.

Classic examples of aleatoric uncertainty: coin flips, dice rolls, thermal noise in sensors, turbulent airflow.

Here's the thing though.

A Laplacian demon predicts all of those.

Every single "classic example" of aleatoric uncertainty is a system governed by deterministic classical mechanics where we simply don't have good enough measurements or fast enough computers. The coin flip is chaotic, sure. But chaotic ≠ random. A sufficiently precise demon with full knowledge of initial conditions, air currents, surface elasticity, gravitational field gradients, and your thumb's muscle fiber activation pattern will tell you it's heads. Every time.

The thermal noise? Deterministic molecular dynamics. The dice? Newtonian mechanics with a lot of bounces. Turbulence? Navier-Stokes is deterministic, we just can't solve it well enough.
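
As a concrete illustration (my own toy example, not taken from the post or any particular textbook), here is the standard logistic-map demonstration of deterministic chaos: rerun with the exact same initial condition and you get the exact same trajectory; perturb the initial condition by 1e-10 and the trajectories diverge wildly. The "randomness" lives entirely in your measurement error, not in the dynamics.

```python
# Illustrative sketch: deterministic chaos in the logistic map. Given the *exact* initial
# condition, the trajectory is perfectly reproducible; a tiny measurement error grows
# exponentially, which is what we then label "noise".
def logistic_trajectory(x0: float, r: float = 3.99, steps: int = 50) -> list[float]:
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1 - xs[-1]))
    return xs

exact = logistic_trajectory(0.2)
rerun = logistic_trajectory(0.2)          # same initial condition: identical trajectory
fuzzy = logistic_trajectory(0.2 + 1e-10)  # the "skill issue": a 1e-10 measurement error

print(exact == rerun)             # True: no randomness anywhere in the dynamics
print(abs(exact[-1] - fuzzy[-1])) # large: the tiny error has blown up by step 50
```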

The Laplacian demon doesn't have aleatoric uncertainty. It's just that mortals have skill issues and are too proud to admit it.
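To make the "chaotic, not random" point concrete, here is a toy sketch of a simplified Keller-style coin-flip model. The assumptions are all mine and deliberately crude (no air resistance, no bouncing, coin caught at launch height); the point is only that the outcome is a pure function of the initial conditions, while a few-percent change in those conditions flips it.

```python
# Toy deterministic coin flip (simplified Keller-style model; illustrative
# assumptions: no air resistance, no bouncing, coin caught at launch height).
# The outcome is a pure function of the initial conditions -- no randomness anywhere.

import math

G = 9.81  # gravitational acceleration, m/s^2

def coin_flip(v_up: float, omega: float, heads_up_at_launch: bool = True) -> str:
    """v_up: upward launch speed (m/s); omega: spin rate (rad/s)."""
    t_air = 2 * v_up / G                       # time until the coin returns to launch height
    half_turns = int(omega * t_air / math.pi)  # completed half-rotations while airborne
    same_side_up = (half_turns % 2 == 0)       # even number of half-turns -> same face up
    lands_heads = (same_side_up == heads_up_at_launch)
    return "heads" if lands_heads else "tails"

# The demon's view: identical inputs give identical outputs, every time.
print(coin_flip(2.0, 120.0))
print(coin_flip(2.0, 120.0))

# The mortal's view: a ~3% change in spin rate flips the outcome here.
for omega in (120.0, 124.0):
    print(omega, coin_flip(2.0, omega))
```

The "randomness" of a real flip is exactly this sensitivity to inputs you never measure, not a variance term the universe supplies.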

So what's actually irreducibly random?

Quantum mechanics. That's it. That's the list.

And even that depends on your interpretation:

  • Copenhagen: Yes, measurement outcomes are fundamentally random. The Born rule probabilities are ontologically real. This is the one place where the universe actually rolls dice. God, apparently, does play dice, but only at this level.
  • Many-Worlds: Nope. The wave function evolves deterministically. Every outcome happens. The "randomness" is just you not knowing which branch you're on. That's not aleatoric — that's indexical uncertainty. You have the skill. You just don't know where you are.
  • Bohmian Mechanics: Nope. Hidden variables, fully deterministic. You just don't know the initial particle positions. Classic epistemic uncertainty wearing a trenchcoat.

So under Many-Worlds and Bohmian mechanics, all uncertainty is epistemic. The universe is fully deterministic. There are no dice. There is no irreducible randomness. There is only insufficient information.

Under Copenhagen, there is exactly one source of genuine aleatoric uncertainty: quantum measurement. Everything else that textbooks call "aleatoric" is a Laplacian demon looking at your sensor noise model and saying "get wrecked, scrubs."


The Real Problem: Lack of Epistemic Humility

Here's what actually bothers me about the standard framing. When you label something "aleatoric," you're making an ontological claim: this randomness is a property of the world. But in almost every classical case, it's not. It's a property of your model's resolution. It's noise in your world model that you're projecting onto reality.

And then you refuse to label it as such.

Think about what's happening psychologically. "It's not that my model is incomplete — it's that the universe is inherently noisy right here specifically where my model stops working." How convenient. The boundary of your ignorance just happens to coincide with the boundary of what's knowable. What are the odds?

The aleatoric/epistemic distinction, as commonly taught, isn't really a taxonomy of uncertainty. It's a taxonomy of accountability. Epistemic uncertainty is uncertainty you're responsible for reducing. Aleatoric uncertainty is uncertainty you've given yourself permission to stop thinking about. The label "irreducible" isn't doing technical work — it's doing emotional work. It's a declaration that you've tried hard enough.

And look, sometimes you have tried hard enough. Sometimes it's correct engineering practice to draw a line and say "I'm modeling everything below this scale as noise." But at least be honest about what you're doing. You're choosing a level of description. You're not discovering a fundamental feature of reality. The universe didn't put a noise floor there. You did.


"But This Distinction Is Useful In Practice!"

Yes! I agree! I'm not saying we should stop using the word "aleatoric" in ML papers and engineering contexts. When you're building a Bayesian neural network and you separate your uncertainty into "stuff I could reduce with more training data" vs. "inherent noise floor I should model as a variance parameter," that's a genuinely useful decomposition. You would, in fact, go completely insane trying to treat thermal noise in your LIDAR as epistemic and heroically trying to learn your way out of it.

The pragmatic framing does real work: aleatoric = "uncertainty I'm choosing to treat as irreducible at this level of description." That's fine. That's good engineering.

But let's stop pretending it's a deep metaphysical claim about the nature of reality. It's not. It's a statement about where you've chosen to draw the line on your modeling resolution. The universe (probably) isn't random. Your model is just too coarse to be a demon.
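As a minimal sketch of what that pragmatic decomposition looks like in practice, here is an illustrative PyTorch example (the class names, architecture, and hyperparameters are my own placeholders, not from any particular paper): a learned variance head models the "aleatoric" noise floor, while disagreement across MC-dropout samples estimates the epistemic part.

```python
# Illustrative sketch: heteroscedastic regression head ("aleatoric" variance)
# plus MC dropout (epistemic spread). Assumes PyTorch; everything here is a toy.

import torch
import torch.nn as nn

class HeteroscedasticNet(nn.Module):
    def __init__(self, in_dim: int = 1, hidden: int = 64, p_drop: float = 0.1):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(), nn.Dropout(p_drop),
            nn.Linear(hidden, hidden), nn.ReLU(), nn.Dropout(p_drop),
        )
        self.mean_head = nn.Linear(hidden, 1)    # predicted mean
        self.logvar_head = nn.Linear(hidden, 1)  # predicted log noise variance ("aleatoric")

    def forward(self, x):
        h = self.body(x)
        return self.mean_head(h), self.logvar_head(h)

def gaussian_nll(mean, logvar, y):
    # Training objective: negative log-likelihood of y under N(mean, exp(logvar)).
    # The logvar term is where the model learns its chosen "noise floor".
    return (0.5 * (logvar + (y - mean) ** 2 / logvar.exp())).mean()

@torch.no_grad()
def predict_with_uncertainty(model, x, n_samples: int = 50):
    model.train()  # keep dropout active so repeated forward passes disagree
    means, noise_vars = [], []
    for _ in range(n_samples):
        m, lv = model(x)
        means.append(m)
        noise_vars.append(lv.exp())
    means = torch.stack(means)                   # [n_samples, batch, 1]
    aleatoric = torch.stack(noise_vars).mean(0)  # average predicted noise variance
    epistemic = means.var(0)                     # spread across dropout masks
    return means.mean(0), aleatoric, epistemic

# Shape check on dummy inputs (untrained model, so the numbers are meaningless):
model = HeteroscedasticNet()
x = torch.linspace(-1, 1, 8).unsqueeze(1)
mu, aleatoric, epistemic = predict_with_uncertainty(model, x)
print(mu.shape, aleatoric.shape, epistemic.shape)  # all torch.Size([8, 1])
```

Nothing in this code discovers a property of reality; the "aleatoric" term is just the variance the modeller decided to stop explaining.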


The Punchline

| Interpretation | Coin flip | Thermal noise | Quantum measurement | Is anything aleatoric? |
|---|---|---|---|---|
| Classical (Laplace) | Skill issue | Skill issue | N/A | No |
| Copenhagen | Skill issue | Skill issue | Actually random | Yes, but only this |
| Many-Worlds | Skill issue | Skill issue | Indexical uncertainty | No* |
| Bohmian | Skill issue | Skill issue | Skill issue | No |

* Unless you count "not knowing which branch you're on" as a new, secret third thing.

tl;dr: Aleatoric uncertainty is a skill issue. The Laplacian demon has no variance term. The only candidate for genuine ontological randomness is quantum mechanics, and half the interpretations say even that's deterministic. Your "irreducible noise" is just you being bad at physics and too proud to admit uncertainty in your model.


By the way: I may be the only one, but I was actually genuinely confused about this topic for years. I took the definition of aleatoric uncertainty literally and couldn't understand what the professors were on about when they called coin flips aleatoric. None of the examples they gave were actually irreducible.



Discuss

All hands on deck to build the datacenter lie detector

2026-02-19 19:42:59

Published on February 19, 2026 11:42 AM GMT

Fieldbuilding for AI verification is beginning. A consensus on what to build, which key problems to solve, and whom to get in on the problem is emerging. Last week, ~40 people, including independent researchers and representatives from various companies, think tanks, academic institutions, and non-profit organisations, met for multiple days to share ideas, identify challenges, and create actionable roadmaps for preparing verification mechanisms for future international AI agreements. The workshop was initiated by the Future of Life Institute.

Why this needs to happen now

The urgency and neglectedness of this challenge are underscored by recent comments by frontier AI company leadership and government representatives:

Dario Amodei, CEO of Anthropic:

“The only world in which I can see full restraint is one in which some truly reliable verification is possible.”

Ding Xuexiang, Chinese Vice Premier, speaking about AI at Davos in January 2025:

“If countries are left to descend into a disorderly competition, it would turn into a ‘gray rhino’ right in front of us.” (A ‘gray rhino’ being a visible but ignored risk with serious consequences.)

“It is like driving a car on the expressway. The driver would not feel safe to step on the gas pedal if he is not sure the brake is functional.”

JD Vance, Vice President of the United States of America:[1]

“Part of this arms race component is if we take a pause, does the People’s Republic of China not take a pause? And then we find ourselves all enslaved to P.R.C.-mediated A.I.?”

Beyond international coordination, there are further use cases for verifying what AI compute is used for: safeguards against authoritarian misuse of AI (e.g., identifying protesters or political opponents), enabling secure use of foreign compute in domestic critical infrastructure, and more.

It needs to become possible to detect dishonesty about AI development and use, from the outside, without needing to leak sensitive data.[2] The stakes continue to rise.

An orphaned problem

It is possible for an important problem to be noticed but left unaddressed by a large number of influential people who would be able to make a solution happen. This is what the field of AI verification has been lacking so far: people meeting and agreeing on what the next steps are, which challenges deserve the most attention, and who does what.[3]

The workshop

Over two days in downtown Berkeley, the participants presented their backgrounds and relevant work so far, shared insights, and discussed strategies and roadmaps for advancing the technology, its commercial deployment, and the international diplomatic and scientific “bridges”.

  • While the specifics are TBA, consensus on a minimum viable product was (mostly) reached, and publications about the overall technical architecture and challenges are being finalized. Followers of my blog can expect a write-up shortly after they are published.
    • At a high level: the Prover declares workloads, and the Verifier checks them using off-chip, retrofittable devices placed in the datacenter plus an egress-limited verification cluster (a toy sketch of this declare-then-check pattern follows after this list).
    • The approach is designed to work with great power adversaries without trusting either side’s chips.
    • More details to come soon
  • Work on network taps that is more detailed and technical than my previous post has been shared and discussed internally. I am co-writing this piece, and my team plans to publish it this month. We found potential cruxes with Security Level 5 requirements around encrypted network traffic and discussed workarounds.
  • Interest in building and testing sub-scale demonstrations of network taps plus secure verification clusters rose among the participants with a more technical background, and roadmaps are currently being decided. A key driver of this increased interest in engineering work is the viability of small-scale demos built from off-the-shelf components that can still be close to representative of those needed for treaty verification.[4] To name one question discussed during the workshop: the components for representative demos of network taps may be either smartNICs or custom FPGAs, with tradeoffs between ease of use in experiments (smartNICs) and security properties (FPGAs).
  • A key emphasis has also been on the security aspect of mutual monitoring and verification: it is easy to underestimate the cyber-offense capabilities of great powers, and we discussed in technical detail the concrete ways in which any verification infrastructure must avoid introducing additional attack surfaces. A key challenge lies in transferring confidential information into secure verification facilities, as well as in the physical security required to prevent physical access to sensitive components, on both the prover’s and the verifier’s side.
  • Regarding fieldbuilding: The field is still tiny and bottlenecked by talent and funding. In a breakout session, we brainstormed where, and how, to get people engaged. A connected question was when and how to include Chinese researchers and AI safety actors in verification work. We found that the AI safety community in mainland China is nascent but emerging, while a treaty-oriented AI verification community is essentially nonexistent. From the outside, it seems the perception of AI as a potentially catastrophic risk has not yet entered the Overton window of the wider public debate there, though exceptions exist (see Ding Xuexiang quoted above). Frontier companies in China are mostly not communicating serious concerns about AI risks, though we are uncertain to what degree this reflects differences in views versus restraint in their public communication.
  • We are under no illusions regarding the tense geopolitics around AI. We agree, however, that the “this is an inevitable arms race” framing is informed, to a significant degree, by the (non-)availability of robust verification mechanisms (see Amodei quoted above). There was no clear consensus on how much a battle-tested, deployment-ready verification infrastructure would change the public debate and the decision-making of geopolitical leaders.
    • In favour: The global security dilemma is the argument most commonly used by those AI accelerationists who themselves consider the race reckless, and verification would address this dilemma directly.
    • Against: Nations currently seem to weigh the risk/reward calculation in favour of AI acceleration, and the mere possibility of verification is not expected to tip the scale on its own. A lot of this will depend on a complicated, hard-to-predict interplay of technological progress, societal impact, scientific communication, regulatory capture, and many other factors.
    • However, in a world where leaders reach a consensus that AI poses extreme risks, and time to act and get AI under control is scarce, verification R&D done in advance could make all the difference: defusing an otherwise uncontrolled arms race towards possible loss of control, enormous power concentration, and/or great-power armed conflict.
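As promised above, here is a deliberately toy sketch of the declare-then-check pattern, written only to make the roles concrete. It is not the architecture discussed at the workshop: real designs involve tamper-evident hardware, network taps, and cryptographic attestation, none of which is captured by a bare hash commitment and a made-up GPU-hour consistency check.

```python
# Toy declare-then-check sketch (illustrative only; not the workshop's design).
# Prover: commits to a declared workload log. Verifier: checks that the revealed log
# matches the commitment and is consistent with independently observed measurements.

import hashlib
import json

def commit(workload_log: list) -> str:
    """Prover-side: publish a binding commitment to the declared workloads."""
    canonical = json.dumps(workload_log, sort_keys=True).encode()
    return hashlib.sha256(canonical).hexdigest()

def check(declared_log: list, commitment: str, observed_gpu_hours: float,
          tolerance: float = 0.05) -> bool:
    """Verifier-side: reject if the log was changed after the fact, or if the
    declared compute is inconsistent with what the (hypothetical) taps measured."""
    if commit(declared_log) != commitment:
        return False  # prover changed their story after committing
    declared = sum(entry["gpu_hours"] for entry in declared_log)
    return abs(declared - observed_gpu_hours) <= tolerance * observed_gpu_hours

# Prover declares two workloads and publishes the commitment up front.
log = [{"job": "inference-serving", "gpu_hours": 1200.0},
       {"job": "finetune-run-7", "gpu_hours": 300.0}]
c = commit(log)

# Later, the verifier's own instruments measured roughly 1480 GPU-hours of activity.
print(check(log, c, observed_gpu_hours=1480.0))  # True: declaration is consistent
print(check(log, c, observed_gpu_hours=2600.0))  # False: lots of undeclared compute
```

The hard problems are everything this sketch waves away: making the observation itself trustworthy, keeping the prover's sensitive data inside the egress-limited cluster, and surviving a great-power adversary on either side.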

Not nearly enough

If I gave the impression that the problem is now getting adequate attention: it is not. “All hands on deck” may be the title of this post, and interest in verification work is growing, but the development of technical demos and of a proper research community is still in its infancy, bottlenecked by talent, funding, and coordination capacity.

This is a field where a single person with the right skills can move the needle. We need:

Engineers and scientists: FPGA engineers, datacenter networking engineers, silicon photonics experts, analog/mixed-signal engineers, cryptographers, formal verification researchers, ML systems engineers, cybersecurity and hardware security specialists, high-frequency trading hardware specialists, and independent hackers who love to build and break things.

Entrepreneurs and founders: Enterprise salespeople, venture capitalists, public grantmakers and incubators, and established companies opening up new product lines, in order to prepare the supply chains, business ecosystems, and precedents needed to scale up deployment. Verification can have purely commercial use cases, for example demonstrating faithful genAI inference.[5]

Policy and diplomacy: Technology policy researchers, arms control and treaty verification veterans, diplomats, and people with connections to, or expertise in, the Chinese AI ecosystem.

Funding and operations: Funders, fundraisers, and program managers who can help coordinate a distributed research effort.

If any of this describes you, or if you bring adjacent skills and learn fast, reach out.

[email protected]

Let us use what (perhaps little) time we have left for creating better consensus on AI risks, for building a datacenter lie detector, for preventing and finding hidden AI projects, and for defeating Moloch.

Join us.

[1] Answer to the question: “Do you think that the U.S. government is capable in a scenario — not like the ultimate Skynet scenario — but just a scenario where A.I. seems to be getting out of control in some way, of taking a pause?”

[2] In plain English: We need ways for an inspector to walk into a datacenter in Shenzhen or Tennessee and cryptographically prove what inference and training happened, without increasing the risk of exposing IP such as model weights or training data.

[3] For more details on this, I recommend the excellent post “There should be ‘general managers’ for more of the world’s important problems”.

[4] See my previous post on a “border patrol device” for AI datacenters.

[5] While Kimi’s Vendor Verifier may give the impression that this is a solved problem, it only works for open-weights models that can be run locally for comparison. Verifying inference of proprietary models would require third-party-attested or hardware-attested deployment.



Discuss