Published on December 1, 2025 4:48 PM GMT
In this paper, we make recommendations for how middle powers may band together through a binding international agreement to prevent the development of ASI, without assuming initial cooperation by superpowers.
You can read the paper here: asi-prevention.com
In our previous work Modelling the Geopolitics of AI, we pointed out that middle powers face a precarious predicament in a race to ASI. Lacking the means to seriously compete in the race or to unilaterally influence superpowers to halt development, they may need to resort to a strategy we dub “Vassal’s Wager”: allying themselves with a superpower and hoping that their sovereignty is respected after the superpower attains a decisive strategic advantage (DSA).
Of course, this requires superpowers to avert the extinction risks posed by powerful AI systems, something over which middle powers have little or no control. Thus, we argue that it is in the interest of most middle powers to collectively deter and prevent the development of ASI by any actor, including superpowers.
In this paper, we design an international agreement that could enable middle powers to form a coalition capable of achieving this goal. The agreement we propose is complementary to a “verification framework” that can prevent the development of ASI if it achieves widespread adoption, such as articles IV to IX of MIRI’s latest proposal.
Our proposal tries to answer the following question: how may a coalition of actors pressure others to join such a verification framework, without assuming widespread initial participation?
Trade restrictions. The agreement imposes comprehensive export controls on AI-relevant hardware and software, and import restrictions on AI services from non-members, with precedents in the Chemical Weapons Convention and the Nuclear Non-Proliferation Treaty.
Reactive deterrence. Escalating penalties—from strengthened export controls to targeted sanctions, broad embargoes, and ultimately full economic isolation—are triggered as actors pursue more and more dangerous AI R&D outside of the verification framework.
Preemptive self-defense rights. The coalition recognizes that egregiously dangerous AI R&D constitutes an imminent threat tantamount to an armed attack, permitting members to claim self-defense rights in extreme cases.
Escalation in unison. The agreement would establish AI R&D redlines as well as countermeasures tied to each breach. These are meant to ensure that deterrence measures are triggered in a predictable manner, in unison by all participants of the agreement. This makes it clear to actors outside of the agreement which thresholds are not to be crossed, while ensuring that the costs of any retaliation by penalized actors are distributed among all members of the coalition.
Though these measures represent significant departures from established customs, they are justified by AI’s unique characteristics. Unlike nuclear weapons, which permit a stable equilibrium through mutually assured destruction (MAD), AI R&D may lead to winner-take-all outcomes. Any actor who automates all the key bottlenecks of AI R&D secures an unassailable advantage in AI capabilities: its lead over other actors can only grow over time, eventually culminating in a decisive strategic advantage.
We recommend that the agreement activate once signatories represent at least 20% of the world’s GDP and at least 20% of the world’s population. This threshold is high enough to exert meaningful pressure on superpowers; at the same time, it is reachable without assuming that any superpower champions the initiative in its early stages.
This threshold enables middle powers to build common knowledge of their willingness to participate in the arrangement without immediately antagonizing actors in violation of the redlines, and without paying outsized costs at a stage when the coalition commands insufficient leverage.
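As a rough illustration of how mechanical this activation rule is, here is a minimal sketch in Python; the membership shares are placeholders, not estimates from the paper.

```python
# Toy check of the proposed activation rule: the agreement enters into force
# only once signatories jointly represent at least 20% of world GDP *and*
# at least 20% of world population. All membership figures are placeholders.

GDP_SHARE_THRESHOLD = 0.20
POP_SHARE_THRESHOLD = 0.20

# Hypothetical signatories: (share of world GDP, share of world population)
signatories = {
    "Country A": (0.04, 0.02),
    "Country B": (0.06, 0.01),
    "Country C": (0.03, 0.18),
    "Country D": (0.08, 0.01),
}

def agreement_active(members: dict[str, tuple[float, float]]) -> bool:
    gdp_share = sum(gdp for gdp, _ in members.values())
    pop_share = sum(pop for _, pop in members.values())
    return gdp_share >= GDP_SHARE_THRESHOLD and pop_share >= POP_SHARE_THRESHOLD

print(agreement_active(signatories))  # True: 21% of GDP and 22% of population
```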
As the coalition grows, network effects may accelerate adoption. Trade restrictions make membership increasingly attractive while non-membership becomes increasingly costly.
Eventually, the equilibrium between competing superpowers may flip from racing to cooperation: each superpower could severely undermine the others by joining the coalition, leaving the final holdouts facing utter economic and strategic isolation from the rest of the world. If this is achieved early enough, all other relevant actors are likely to follow suit and join the verification framework.
The agreement's effectiveness depends critically on timing. If the agreement is adopted early, diplomatic and economic pressure alone may suffice. As AI R&D is automated, superpowers may grow confident that they can achieve a decisive strategic advantage through it; if so, more extreme measures will likely become necessary.
Once superpowers believe ASI is within reach and are willing to absorb staggering temporary costs in exchange for a chance at total victory, even comprehensive economic isolation may prove insufficient and more extreme measures may be necessary to dissuade them.
The stakes—encompassing potential human extinction, permanent global dominance by a single actor, or devastating major power war—justify treating this challenge with urgency historically reserved for nuclear proliferation. We must recognize that AI R&D may demand even more comprehensive international coordination than humanity has previously achieved.
Published on December 1, 2025 4:34 PM GMT
[My novel, Red Heart, is on sale for $4 this week. Daniel Kokotajlo liked it a lot, and the Senior White House Policy Advisor on AI is currently reading it.]
“Formal symbol manipulations by themselves … have only a syntax but no semantics. Such intentionality as computers appear to have is solely in the minds of those who program them and those who use them, those who send in the input and those who interpret the output.”
— John Searle, originator of the “Chinese room” thought experiment
A colleague of mine, shortly before Red Heart was published, remarked to me that if I managed to write a compelling novel set in China, told from Chinese perspectives — without spending time in the country, having grown up in a Chinese-culture context, or knowing any Chinese language — it would be an important bit of evidence about the potency of abstract reasoning and book-learning. This, in turn, may be relevant to how powerful and explosive we should expect AI systems to be.
There are many, such as the “AI as Normal Technology” folks, who believe that AI will be importantly bottlenecked on lack of experience interacting with the real world and all its complex systems. “Yes, it’s possible to read about an unfamiliar domain, but in the absence of embodied, hands-on knowledge, the words will be meaningless symbols shuffled around according to mere statistical patterns,” they claim.[1] ChatGPT has never been to China, just as it hasn’t really “been” to any country. All it can do is read.[2] Can any mind, no matter how fast or deep, build a deep and potent understanding of the world from abstract descriptions?
I’m not an LLM, and there may be important differences, but let’s start with the evidence. Did I succeed?
“It greatly surprised and impressed me to learn that Max had not once traveled to China prior to the completion of this novel. The scene-setting portions of every chapter taking place in China reveals an intimate familiarity with the cultures, habits, and tastes of the country in which I was raised, all displayed without the common pitfall that is the tendency to exoticize. I’d have thought the novel written by someone who had lived in China for years.”
— Alexis Wu, Chinese historical linguist and translator

“I now believe that you have a coauthor that was raised in China - the Chinese details are quite incredible, and if you don’t have a Chinese coauthor or editor that’s really impressive for someone who hasn’t been to China.”
“Red Heart is a strikingly authentic portrayal of AI in modern China—both visionary and grounded in cultural truth.”
— Zhang San,[3] Senior AI Executive
How did I do it? And what might this suggest about whether understanding can be built from text alone?
I definitely got some things wrong when writing the book.

Shortly before the book came out, concerned that it might be my only chance to safely visit the mainland,[4] I visited Shenzhen (and Hong Kong) as a tourist. Most of Red Heart takes place in Guizhou, not Guangdong, where Shenzhen is, but Guizhou is still pretty close, and similar in some ways — most particularly the humidity. The entire novel has only a single offhand reference to humidity, despite involving a protagonist who regularly goes in and out of carefully air-conditioned spaces! Southern China is incredibly humid (at least compared to California), and to my inner perfectionist it stands as a glaring flaw. Augh!
Most issues that I know about are like the humidity — details which are absent, rather than outright falsehoods. I wish I had done a better job depicting fashion trends and beauty standards. I wish I’d emphasized how odd it is for the street-food vendor to only take cash. That sort of thing.
I’m sure there are a bunch of places where I made explicit errors, too. One of the most important parts of my process was getting a half-dozen Chinese people to read early drafts of my novel and asking them to look for mistakes. There were a bunch,[5] and it was extremely common for one Chinese reader to catch things that another reader didn’t, which implies that there are still more errors that I haven’t yet heard about because the right kind of Chinese reader hasn’t left a review yet. (If this is you, please speak up, either in the comments here or on Amazon or Goodreads! I love finding out when I’m wrong — it’s the first step to being right.) One of my biggest take-aways from learning about China is that it’s an incredibly large and diverse country (in many ways more than the USA[6]), and that means that no single person can do a comprehensive check for authenticity.
But also, I think I got most things right, or at least as much as any novel can. Well before sending the book to any Chinese people, I was reading a lot about the country as part of my work as an AI researcher. China is a technological powerhouse, and anyone who thinks they’re not relevant to how AI might unfold simply isn’t paying attention. Late in 2024, my interest turned into an obsession. I read books like Red Roulette (highly recommended), the Analects, and Dealing with China. I dove into podcasts, blogs, and YouTube videos on everything from Chinese history to language to the vibes, both from the perspective of native Chinese and from Westerners.
Perhaps equally importantly, I talked to AIs — mostly Claude Sonnet 3.6. Simply being a passive reader about a topic is never the best way to learn about it, and I knew I really had to learn in order for Red Heart to work. So I sharpened my curiosity, asking follow-up questions to the material I was consuming. And each time I felt like I was starting to get a handle on something, I would spin up a new conversation,[7] present my perspective, and ask the AI to tear it apart, often presenting my text as “a student wrote this garbage, can you believe it.” Whenever the AI criticized my take, I’d hunt for sources (both via AI and normal searching) to check that it wasn’t hallucinating, update my take, and repeat. Often this resulted in getting squeezed into a complex middle-ground perspective, where I was forced to acknowledge nuances that I had totally missed when reading some primary source.
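A minimal sketch of that loop, assuming the Anthropic Python SDK; the model name, prompt framing, and example claim are illustrative stand-ins rather than a record of the actual sessions.

```python
# Sketch of the "fresh conversation each time" critique loop, assuming the
# Anthropic Python SDK (pip install anthropic). The model name, prompt
# framing, and example claim are placeholders.
from anthropic import Anthropic

client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def critique_in_fresh_conversation(my_take: str) -> str:
    """Ask for criticism in a brand-new conversation, so the model has no
    accumulated picture of who is asking or what answer would please them."""
    response = client.messages.create(
        model="claude-3-5-sonnet-latest",  # placeholder model identifier
        max_tokens=1024,
        messages=[{
            "role": "user",
            "content": (
                "A student wrote the following summary. Point out every "
                "mistake, oversimplification, or missing nuance:\n\n" + my_take
            ),
        }],
    )
    return response.content[0].text

# Each round: get criticism, verify it against outside sources by hand,
# revise the take, then start yet another fresh conversation and repeat.
print(critique_in_fresh_conversation(
    "Hukou reform means rural migrants now have full access to urban services."
))
```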
As a particular variation on this process, I used AI to translate a lot of the book’s dialogue back and forth between English and Mandarin, using fresh conversations to check that it seemed sensible and natural in Mandarin. When the Mandarin felt awkward, it often signaled that I’d written something that only really made sense in English, and that I needed thoughts and expressions that were more authentically Chinese.[8][9]
I also did the sorts of worldbuilding exercises that I usually do when writing a novel. I spent time looking at maps of China and using street view to wander down roads.[10] (The township of Maxi, where much of the book is set, is a real place.) I generated random dates and checked the weather. I looked at budgets, salaries, import/export flows (especially GPUs), population densities, consumption trends, and other statistics, running the numbers to get a feel for how fast and how big various things are or would be.
If you think that AIs are incapable of real understanding because all they have to work with are fundamentally impoverished tokens — that without hands and eyes and a body moving through the world, symbols can never mean anything — then I think my experience writing Red Heart is at least weak evidence against that view. Yes, I imported a lot of my first-hand experience of being human, but that can only go so far. At some point I needed to construct a rich world-model, and the raw material I had available for that project was the same text, images, and websites that LLMs train on. Knowing that a sentence starting with “It was April, and so” should end with “the monsoon season had begun” implies real knowledge about the world — knowledge that is practical for making decisions and relating to others.
There’s something a bit mysterian about the symbol grounding objection, when you poke at it. As though photons hitting retinas have some special quality that tokens lack. But nerve signals aren’t intrinsically more meaningful than any other kind of signal — they’re just patterns that get processed. And tokens aren’t floating free of the world. They connect to reality through training data, through tool use, through web searches. When Claude told me something about Beijing and I checked it against other sources, the feedback I got was no more “real” than the feedback an LLM gets when it does a similar search. When I checked the economic math, that mental motion was akin to an AI running code and observing the output.
There are many differences between humans and LLMs. Their minds operate in ways that are deeply alien, despite superficial similarity. They have no intrinsic sense of time — operating token-by-token. They have no inbuilt emotions, at least in the same neurobiological way that humans do.[11] In some ways they’ve “read every book,” but in other ways a fresh LLM instance hasn’t really read anything in the way a human does, since humans have the chance to pause and reflect on the texts we go through, mixing our own thoughts in with the content.
More relevantly, they process things in a very different way. When I was learning about China, I was constantly doing something that current LLMs mostly can’t: holding a hypothesis, testing it against new data, noticing when it cracked, and rebuilding in a way that left a lasting change on my mind. I checked Claude’s claims against searches. I checked one Chinese reader’s feedback against another’s. I carried what I learned forward across months. And perhaps most importantly, I knew what I didn’t know. I approached China with deliberate humility because I knew it was alien, which meant I was hunting for my own mistakes.
Current LLMs are bad at this. Not only do they hallucinate and confabulate, but their training process rewards immediate competence, rather than mental motions that can lead to competence. The best reasoning models can do something like self-correction within a context window, but not across the timescales that real learning seems to require.
But this is an engineering problem, not an insurmountable philosophical roadblock. LLMs are already starting to be trained to use tools, search the web, and run code in order to get feedback. The question “can text alone produce understanding?” is a wrong question. The medium is irrelevant. The better question is whether we have the techniques and cognitive architectures that can replicate the kind of effortful, self-critical, efficient, and persistent learning in unfamiliar domains that every child can demonstrate when playing with a new game or puzzle.
I didn’t need to live in China to write Red Heart. I just needed to use the resources that were available to me, including conversations with people who knew more, to learn about the world. There’s no reason, in principle, that an AI couldn’t do the same.
Apologies to the AI as Normal Technology folks if I’ve inadvertently lumped them together with “stochastic parrot” naysayers and possibly created a strawman. This foil is for rhetorical effect, not meant to perfectly capture the perspective of any specific person.
It’s no longer actually true that most so-called “LLMs” only read, since systems like ChatGPT, Claude, and Gemini are nowadays trained on image data (and sometimes audio and/or video) as well as text. Still, everything in this essay applies if we restrict ourselves to AIs that live in a world of pure text, like DeepSeek R1.
Due to the political content in Red Heart, this reader’s name has been changed and their role obscured because, unlike Alexis Wu, they and their family still live in China.
Safety is relative, of course. In some ways, visiting mainland China before my book came out was much riskier than visiting, say, Japan. China has a history of denying people the right to leave, and has more Americans imprisoned than any other foreign country, including for political and social crimes, such as protesting, suspicion of espionage, and missionary work. But also, millions of Americans visit China every year — it’s the fourth most visited country in the world — and almost certainly if I went back it would be fine. In fact, due to China’s lower rate of crime, I’m pretty confident that it’s more dangerous overall for me to take a vacation by driving to Los Angeles than flying to Beijing.
One example: in the opening chapter of the book, the protagonist looks out into Beijing traffic, and in the first draft he noticed many people on scooters. A reader corrected me: they use electric bicycles, not scooters.
When I got to Shenzhen, I was surprised to see the streets full of scooters. Scooters were everywhere in China! My early reader got things wrong! I needed to change it back! Thankfully, my wife had the wisdom to notice the confusion and actually look it up. It turns out that while Shenzhen allows scooters, they are banned in Beijing.
Lesson learned: be careful not to over-generalize from just a few experiences, and put more trust in people who have actually lived in the place!
America is, for example, incredibly young and linguistically homogenous compared to China. The way that people “speak Chinese” in one region is, a bit like Scots, often unintelligible to people from a little ways away, thanks to thousands of years of divergence and the lack of a phonetic alphabet. Even well into the communist era, most people were illiterate and there were virtually no national media programs. With deep time and linguistic isolation came an intense cultural diversity that not even the madness of the Cultural Revolution could erase.
Starting new LLM conversations is vital! LLMs are untrustworthy sycophants that love to tell you whatever you want to hear. In long conversations it’s easier for them to get a handle on who you are and what your worldview is, and suck up to that perspective, rather than sticking closer to easily-defensible, neutral ground. It helped that what I genuinely wanted to hear was criticism — finding out you’re wrong is the first step to being right, after all — but criticism alone is not enough. Bias creeps in through the smallest cracks.
I did a similar language exercise for the nameless aliens in Crystal. I built a pseudo-conlang to represent their thoughts (the aliens don’t use words/language in the same way as humans) and then wrote translation software that mapped between English and the conlang, producing something that even I, as the author, felt was alien and a half-incomprehensible version of their thoughts.
My efforts to have the “true” version of the book’s dialogue be Mandarin actually led to some rather challenging sections where I wasn’t actually sure how to represent the thought in English. For instance, many Chinese swear-words don’t have good one-to-one translations into English, and in an early version of the book all the Mandarin swearing was kept untranslated. (Readers hated this, however, so I did my best to reflect things in English.)
It’s very annoying that Google is basically banned from the mainland. Perhaps my efforts to translate the Chinese internet and get access through VPNs were ham-fisted, but I was broadly unimpressed with Baidu maps and similar equivalents.
LLMs can emulate being emotional, however, as discussed in Red Heart. The degree to which this is an important distinction still feels a bit confusing to me. And they may possess some cognitive dynamics that are similar to our emotions in other ways. The science is still so undeveloped that I think the best summary is that we basically don’t know what’s going on.
Published on December 1, 2025 4:29 PM GMT
"Well, Season's Greatings everyone!" It was the third time the middle age woman who lived on Buffington had repeated herself in the transport's exit. Each of us regulars were exchanging short glances waiting for the stranger to give the response, but he seemed to know something was up.
"Don't you mean Season's Greetings?" the tall man who lives on Ravenna finally responds. If the stranger wasn't here it would have been the pretty young woman's turn, but nobody blames her for taking a chance. We all would have done it in her situation. But none of us really can afford a long wait, and it looks like the tall man was in the biggest rush.
"No, I mean Greatings because everyone should know how great this season really is!" While the human ear could detect the lack of genuine emotion in her voice, we all had become practiced in singing the right intonation with minimal effort to make the AI register us as having an acceptable level of "enthusiasm." Can it even be called that anymore? Does anyone even remember what genuine emotion felt like?
One of the quirks of the AI that has developed over the years is that you have to say an appropriate greeting on exiting the transport, and in winter months this means this particular set of "Season's Greatings, no Greetings, no actually Greatings," exchanges. There are several similar quirks which have developed where at some point it becomes known that a given small action enters the algorithm in a positive way, and when that knowledge spreads everyone starts doing that thing, so if you don't do it you look like a one-in-a-million negative outlier and you are on the fast-track for suspended account privileges or potentially dis-personalization. Normally the "Greetings" responder is not penalized, but the regulars have noticed that responding at the Buffington stop will earn you a 50 credit penalty applied when the general system updates overnight.
Is it local to us? -You can't talk about such a negative topic with strangers or even most acquaintances, that's a credit ding. Is it just a bug in the system? -Who knows? If you try and make a report you get what looks like an automated response. Most of the time issues like this disappear on their own, seemingly with or without attempting to file a report or complaint form. But us regulars on this route have known about this quirk for almost two years now, and we have been pretty good about taking turns with the credit hit. Once someone says the opener, the transport won't move until it completes, and the woman on Buffington can't afford to risk not saying it. When a stranger is on the route, as happens occasionally, we'll hesitate and hope they say the response lines, but this one seems like he was a little more alert and questioned our hesitation.
He stayed on through the ride. I usually get off last and have a short stretch by myself before my stop, but this stranger was apparently going farther. When I needed to "make small talk" I went with Dodgers commentary #34, which in its current form goes something like "What do you think about The Dodger's next season? Doesn't Jackson have shorting stop stop ball average?" There hasn't been baseball in decades at least; the records are hazy. A lot of things have changed since patch 7.839, or "The Upgrade" as some people (may have) called it. The stranger's response shocked me: "Yeah, that's a fortunate pick on your part. That small-talk pattern was apparently deprecated but it seems like no new behavior was slotted in, so I can just say whatever here and it ends up being scored as a proper response. Just listen. I don't know what kind of problem you have going on with that stop back there, but I know what can solve it: Kismet. That's K-I-S-M-E-T. Type that into your browser, it will ask for your biometric sig, give it and then make a request. I may know a few tricks, but I don't know how this one works, or even if it always works. What I do know about it is that whatever it is watches how you use it. I can't tell you how I know, just trust me when I say not to abuse it or ask too much. You'll probably be good un-biting whatever bullet the tall guy bit for us back there though."
"Thanks I..." When the sound left my lips his eyes and nostrils flared in frustration. I realized what I had done: by speaking again the AI registered his response as complete and we were locked back into structured public speech patterns. I quickly resume my half of Dodgers Commentary #34 and hope I'm not charged too much for my brief deviation. He seemed to glare at me slightly during his responses and I could tell he was struggling not to have a harsh tone. We didn't have further chance to talk before he got out. When he left we made eye contact again and from outside in one of the vehicle's visual spectrum blind-spots he mouthed the word "Kismet" very clearly.
Published on December 1, 2025 4:01 PM GMT
I constantly think that November 30, 2022 will be one of the most important dates of my life. I believe the launch of ChatGPT (running GPT-3.5) will be considered in the future as the start of a new paradigm for Earth and the human race.
A while back I decided that on every November 30 starting in 2025, I will compile my notes on Artificial Intelligence and put them in a post. Continue below for my thoughts on AI in 2025.
About once a week, usually when I am showering, I marvel at the thought that humans have managed to turn energy into a form of intelligence. How our monkey brains managed to do that still wows me.
Both Sophie and I rely on LLMs for a considerable amount of our daily tasks and projects. She has been using ChatGPT Plus since June 2024, and I have been using Google Gemini Pro since May 2025. We probably average 2-3 hours a day each on the apps, far more than any other smartphone app. As a household we spend $55 CAD a month on LLM subscriptions.
I’ve been working on a project fine-tuning an LLM on multi-task workflows relevant to my domain expertise, which gives me a preview of the next iterations of the technology. I am excited to see how the next frontier of LLMs will increase the productivity of the white collar professionals who adopt them.
The best way to get LLMs to be useful for you is to view using LLMs as a video game: when you start playing a new video game, you need to learn the controls and the bounds of what you can do within the game. LLMs are similar. My suggestion is to take some of your free time and see what you can get an LLM to do. Over time you will be impressed by how much it can do.
As in video games, there are power-ups when using LLMs. My favorite is meta-prompting: asking the LLM to write or refine a prompt before you actually use it. Here’s an example of what I mean by meta-prompting.
I have always found LLMs to be too agreeable and sycophantic. Some models (like Gemini) now have personal context setups, where you can give your LLM instructions on how you would like it to respond. Here is mine:
When responding to me, always adopt a concise and objective tone. Do not be agreeable or seek to please me. Instead, actively challenge my premises and assumptions to ensure rigorous critical thinking. Prioritize factual accuracy and counter-arguments over conversational harmony.
I am disappointed, but not surprised, that younger generations are using LLMs as shortcuts to schoolwork as opposed to enhancers. I have professor friends who have seen a serious degradation in the preparation of these young students who use LLMs to cheat their way into a degree.
I use Gemini a lot to learn about new subjects, and I use the following prompt I found on X as a starting prompt:
I would benefit most from an explanation style in which you frequently pause to confirm, via asking me test questions, that I’ve understood your explanations so far. Particularly helpful are test questions related to simple, explicit examples. When you pause and ask me a test question, do not continue the explanation until I have answered the questions to your satisfaction. I.e. do not keep generating the explanation, actually wait for me to respond first. I’m hoping that by tutoring me in this Socratic way, you’ll help me better understand how superficial my understanding is (which is so easy to fail to notice otherwise), and then help fill in all the important blanks. Thanks!
I then explain the subject I want to learn about and the resulting conversations are very enlightening. Here’s an example.
Nano Banana Pro is seriously impressive. The image below is AI generated.
I predict that social media as we know it will evolve into something completely new. I do not see myself using Instagram more often if those who I follow are posting AI generated content, which I will have a hard time discerning from real photos / videos.
I am confident that many people will use AI to fake their lives on Instagram in an effort to gain status. I believe this will lead to a significant reduction in visual social media consumption in the near future.
Almost a year ago I wrote a post about my predictions on AI. My predictions still stand (for now).
I managed to get in a Waymo on our trip to Austin back in April. We waited 27 minutes for it to arrive but it was worth it. It was a mind-blowing experience.
The ‘AI is coming for our jobs’ paradigm is still far away. If you lose your job in 2026 and you think it was because of AI, you are likely only partially right. You did not lose your job to AI; you lost your job because people at your firm became far more productive as they harnessed the power of AI, and the firm realized it could be as productive or more productive with less human labor.
I think AGI is coming before I retire, but I am not confident enough to put a number on it. To me it seems that there are still meaningful breakthroughs in agency, memory, and self learning that need to happen before we get there.
Even if AI advancement stalls today, and the best generalized model we have moving forward is Gemini 3.0, the technology will still transform human knowledge work as we know it. There is a lot of value to be made in transforming and applying current cutting edge LLMs to many different domains, and there are thousands of firms all over the world working on that.
The AI trade was the big winner in 2025. If you invested in virtually any stock that touched AI, you probably beat the S&P 500 for the year. I believe in 2026 there will be more divergence in the ecosystem as competition in certain domains heats up and capital allocation comes into question.
As far as I am aware, there is no data pointing to a slowdown in AI compute demand. Unlike the railroad and telecom examples that are consistently mentioned as comparable, there are no idle cutting-edge data centers anywhere waiting to be used. As soon as a data center is finished and turned on, its utilization goes to 100%.

The ultimate bottleneck in the AI datacenter build-up will be power generation. All the other bottlenecks that currently exist will be solved via competition in the next few years. Generating enough base load power for these datacenters is the crux of the AI Infrastructure problem.
Canada is especially well positioned to take advantage of the AI Infrastructure build up. For a country of our population and economic size, Canada has the following advantages (From my letter to Minister Evan Solomon):
So far I am disappointed in what our Minister of Artificial Intelligence and Digital Innovation has managed to accomplish this year. I hope to see some large scale AI Infrastructure projects in Canada in construction by this time next year.
I used Google Gemini 3.0 Pro in thinking mode as an editing assistant to write this post.
Published on December 1, 2025 2:47 PM GMT
I've decided to post these in weekly batches. This is the fourth of five. I'm posting these here because Blogspot's comment apparatus sucks and also because no one will comment otherwise.
22. How To: Move Cross-Country, Trial Therapists
Make a weighted factor model sheet with a few criteria you consider important...
Be polite and as informative as you feel comfortable with, and try to get a sense of whether you'd be alright with working with this person, telling them some of your embarrassing secrets, and being generally vulnerable with them. If you get a bad feeling for any reason, you should probably not work with them, unless your sense of "getting a bad feeling" goes off basically constantly.
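A minimal sketch of such a weighted factor model; the criteria, weights, and scores are placeholders rather than the ones from the original post.

```python
# Toy weighted factor model for comparing options (cities, therapists, etc.).
# Criteria, weights, and scores are illustrative placeholders only.

weights = {"cost": 0.3, "fit": 0.4, "logistics": 0.2, "gut_feeling": 0.1}

options = {
    "Option A": {"cost": 7, "fit": 9, "logistics": 6, "gut_feeling": 8},
    "Option B": {"cost": 9, "fit": 6, "logistics": 8, "gut_feeling": 5},
}

def weighted_score(scores: dict[str, float]) -> float:
    # Sum of each criterion's score times its weight (weights sum to 1).
    return sum(scores[criterion] * weight for criterion, weight in weights.items())

for name, scores in sorted(options.items(), key=lambda kv: -weighted_score(kv[1])):
    print(f"{name}: {weighted_score(scores):.2f}")
# Prints "Option A: 7.70" then "Option B: 7.20"
```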
23. Seven-ish Evidentials From My Thought-Language
6. Best guesses. The speaker might have any or none of the other sources of knowledge listed here, and has synthesized them into a best guess at the truth without knowing for sure. This one is unusual both in that, in the past tense or perfective aspect, it becomes a mirative (a surprise-marking affix), and in that, unlike the other categories, an explicit numerical value (usually strictly between 0 and 1, or an equivalent) is required as part of the evidential; declining to give one is ungrammatical.
24. Join Me Here In This Darkness (And Fight On Anyway)
There is a kind of nobility in doing a doomed thing well. There is a sort of tragic sweetness in fighting to the last. To grapple with your despair, it has been said, is to fight an untiring enemy, whose forces swell to twice their size the moment you look away; each further instant in which you draw breath is another defiant blow struck in an unending campaign. Fight on anyway.
25. Every Accusation is an Admission
For one last minor possibility, if you hold a certain attitude or are ready to escalate in specific ways, you might well see your own untrustworthiness hiding in every shadow and reflected in every puddle. If you're already thoroughly keyed up and ready to snap, you're in a headspace where you're likelier to suspect those around you of being just as ready to strike.
26. Live Absurdly Decadently By Ancient Standards
For a wildcard category, we can - again, relatively inexpensively - use modern materials to do things that people of long ago only barely dreamed of.
27. Several More Delicious Scalable Recipes for Posterity
A last remnant of a now mostly-gone Northern Korean food culture. I transcribed the recipe personally in the interests of archiving and repeatability; I got Grandma Kim to let me measure her handfuls and splashes and pinches and "enough"s.
28. And Our Redemption Arc is Coming Up...
For myself, I tend towards the diachronic side, but not completely. When I think about the course of my life to date, I tend to divide it into rough eras or arcs; I've been the same person throughout, with mostly the same core drives and motivations and capacities and interests, but have clearly grown and changed through it all. If you think it might be helpful for you to do the same, here are a few tips...
Published on December 1, 2025 2:22 PM GMT
We at the MIRI Technical Governance Team have proposed an international agreement to halt the development of superintelligence until it can be done safely.
Some people object that, even if the agreement is adopted by some countries, ASI can still be developed in other countries that do not halt development. They argue that this would undermine the stability of the agreement and even prevent it from being adopted in the first place, because countries do not want to give up their lead only to be overtaken by other, less safety-conscious countries that are racing ahead in contravention of the pause.
TL;DR: An ASI nonproliferation agreement can be effective even without complete worldwide adoption. A coalition that controls advanced chips, fabs, and cloud access can deter, detect, and if needed, disrupt illicit R&D programs. Our proposed agreement couples domestic pauses with outward export controls, verification, and do‑not‑assist rules applicable to people and firms. Carrots—including access to supervised compute and safe applications—reduce incentives to free‑ride; sticks raise the cost of staying outside. The result is slower progress among unsanctioned ASI programs, higher odds of detecting dangerous training, and a stable path to widening participation.
Yudkowsky and Soares’ 2025 book If Anyone Builds It, Everyone Dies argues that the development of superintelligence, if done using present-day machine learning methods, would lead to the extinction of humanity. On this view, superintelligence cannot be safely created before substantially more progress is made in the field of AI alignment, and before governance is improved to ensure it is not used for malicious purposes.
Therefore, we have proposed an international agreement to halt the development of superintelligent AI until it can be done safely. This agreement builds a coalition that makes unsanctioned frontier training anywhere infeasible by controlling chips, training, and research, while keeping legal pressure and technical options ready for targeted disruption of prohibited projects. As the countries leading in AI research, compute, and model capabilities, the U.S. and China would necessarily be key players, along with countries such as Taiwan and the Netherlands that control crucial links in the AI supply chain.
A common objection is that, even if the agreement is adopted by some countries, ASI can still be developed in other countries that do not halt research and development within their borders. This could undermine the stability of the agreement and even prevent it from being adopted in the first place, because countries concerned with extinction do not want to sacrifice their advantages only to be overtaken by other, less safety-conscious countries that are racing ahead in contravention of the pause.
In this post, I explain how the agreement anticipates and addresses this threat.
A single deployment of unaligned superintelligence would be the most dangerous event humanity has ever faced, more catastrophic than even nuclear war. If anyone builds it, everyone is at risk. It only takes one uncautious non-party AI project that crosses the line of extinction-level capabilities to nullify the caution and sacrifice of all party states.
Worse yet, under partial adoption, the most cautious and responsible actors slow down while the reckless race ahead, raising the odds of takeover, extinction, or war, which necessitates ever more invasive and destabilizing countermeasures. Without global coverage, a pause cannot fully eliminate the risk of an ASI-triggered catastrophe.
AI alignment is a very difficult problem that may take years to solve. A unilateral halt by the country leading in AI is most likely insufficient, because the country in second place will simply catch up and then continue development as before.
The most obvious way a non-party state can undermine the agreement is by intentionally developing ASI, but there are many other ways non-parties can create challenges for the coalition:
Even non-party states that ultimately pose no direct threat to the coalition’s agenda still burden it to some extent because they require costly monitoring. This disincentivizes would-be parties from joining and sharing in those costs, especially in the early days of the agreement, when there are more non-party states to monitor and fewer parties collaborating to monitor them. Smaller countries are the most likely to be hesitant about joining because of these expenses, as the costs are large relative to their economies, and they have less preexisting intelligence infrastructure of their own.
The existence of non-signatory countries provides opportunities for party states to secretly collaborate with them (e.g. by offshoring their research to unmonitored areas), which heightens the risk of covert programs and could unravel the agreement by undermining trust and cooperation.
Talent flight exacerbates all these other problems. The population of researchers who can advance frontier AI is relatively small. If even a few of them relocate to non-party jurisdictions, ASI programs there gain disproportionate capability. Parties will rightly worry that unimpeded relocation of AI researchers would drain the talent pool of their home countries while providing dangerous expertise to non-party states.
If a state believes it can reach superintelligence before other nations can stop it, it may decide to take the risk and dash for the finish line. If other nations become aware of this, they are likely to sabotage this unsanctioned AI project, but if they have their own computing hardware and are not certain the unsanctioned project can be stopped, they may abandon compliance with the agreement and join the race instead.
An AI model may be developed in a non-party state that is more capable than what is otherwise available in signatory states, which makes inference more dangerous.
Even if party states control most chip production, non-party states may continue R&D of computing hardware and AI algorithms, which lowers the size of clusters needed to advance the frontier of dangerous AI capabilities. At the extreme, there is some risk that algorithmic progress in non-party states can nullify the compute advantage of the coalition, though this would take many years of research by a compute-poor rogue state.
Less safety-conscious firms in non-party states can undercut the profitability of firms in party states subject to compliance costs. Disadvantaged firms would be incentivized to lobby their governments against joining the coalition.
Firms in non-coalition countries will also lobby against joining because they will have strong incentives to push closer to the edge of ASI—they stand to gain huge profits if they succeed, but they are not on the hook for the damage caused by a misaligned superintelligence that kills all of humanity.
This reluctance to join the coalition could be contagious, as more responsible firms in would-be party states will be wary of competing with firms from outside the coalition which have the above mindset.
The architecture of the agreement is designed to grow the coalition and hold it together while quickly shutting down prohibited ASI projects. It couples a domestic halt to ASI progress with an outward posture of nonproliferation toward other signatories and non-signatories. The ideal level of adherence to the agreement is at least equal to that of the Nuclear Non-Proliferation Treaty, and our recommended rollout phases include transparency, confidence-building, development of verification mechanisms, and gradual establishment of commitments, to bring in the international community and achieve this level of safety worldwide.
The best way to preempt problems raised by non-parties is to convert such states into parties. We recommend proverbial carrots and sticks to attract non-parties to join the coalition.
Membership in the agreement should be made economically worthwhile. A natural carrot here is access to the coalition’s computing clusters for use in socially beneficial AI applications. Lower-income countries should also be supported with technical assistance and cost-sharing by the wealthier Parties. We envision major economies, including the U.S. and China, joining the agreement early on, and these founding parties can lay much of the groundwork of monitoring in the early days when this task is costliest. This reduces the barrier to entry for smaller countries that join later.
Remaining outside the agreement should be economically costly. One straightforward stick is import bans on AI services developed in non-party countries, which might otherwise undercut the profitability of coalition firms by cutting corners on safety and monitoring. Non‑parties are also denied access to supervised chips and manufacturing capabilities (Arts. I, VI). They will face restrictions on inputs and raw materials for the AI supply chain, disruption or sabotage of their prohibited research activities, and trade restrictions such as blacklisting of AI services developed in those countries.
North Korea and Iran are good historical examples of how the international community can deal with countries that try to acquire dangerous technology in contravention of worldwide norms. North Korea withdrew from the Non-Proliferation Treaty and developed nuclear weapons, and was hit with crippling sanctions from most of the world in response. Iran is attempting to develop nuclear weapons despite being an NPT signatory state, so its nuclear program has repeatedly been sabotaged by those who fear the consequences in the Middle East if it were to acquire them. The situation with AI development is analogous in many ways and different in others. We don’t know all the specific methods and incentives that will ultimately be applied, but we expect that parties will encourage others to join, as was the case for nuclear non-proliferation.
For reasons that will become clear in subsequent sections, it is vital that countries currently possessing very large numbers of advanced AI chips join the coalition. This can be done through inducements by the leading Parties: most countries with many advanced chips are very dependent on trade and technological exchange with the U.S., China, or both, meaning that economic incentives and access to AI technology from these Parties will be a major incentive. If these are not sufficient, standard tools of diplomatic pressure—such as trade restrictions, economic sanctions, and visa bans—should be applied (Art. XII).
Many of the potential challenges posed by non-parties are moot if they simply don’t have the chips.
The agreement’s export and production controls deny non‑parties access to AI chips, as well as the specialized manufacturing equipment that bottlenecks the chip supply chain (Arts. I, V–VII). Non-parties lacking such chips in large quantities would be unable to complete training runs of frontier models on reasonable time scales, and would not be able to operate existing high-end models in a cost-competitive manner.
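As a back-of-envelope illustration of why cluster size dominates, the sketch below uses an assumed FLOP budget, per-chip throughput, and utilization; none of these figures come from the agreement itself.

```python
# Rough wall-clock estimate for a frontier-scale training run.
# All constants are assumptions chosen only to illustrate the scaling.

TRAINING_FLOP = 1e26      # assumed total compute for a frontier training run
PER_CHIP_FLOPS = 1e15     # assumed peak throughput of one high-end AI chip
UTILIZATION = 0.4         # assumed effective utilization of that peak

def training_days(num_chips: int) -> float:
    seconds = TRAINING_FLOP / (num_chips * PER_CHIP_FLOPS * UTILIZATION)
    return seconds / 86_400

for n in (1_000, 10_000, 100_000):
    print(f"{n:>7,} chips: ~{training_days(n):,.0f} days")
# 1,000 chips take roughly 2,900 days (about eight years);
# 100,000 chips take roughly 29 days.
```

Under these assumptions, shrinking an actor's accessible cluster by two orders of magnitude turns a month-long run into a multi-year one.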
Concretely, the AI chip supply chain is extremely narrow, and the chips are so complex that it is extremely difficult for anyone outside of the existing chain to replicate it. The vast majority of chips are designed by U.S. companies, mostly NVIDIA. The processors are almost all fabricated by TSMC, on their 4-6nm process nodes in only a couple of plants. Extreme ultraviolet photolithography machines, essential equipment for etching sufficiently small transistors to make chips as powerful and efficient as today’s, are made exclusively by the Dutch firm ASML. The market for high-bandwidth memory, another crucial component, is dominated by SK Hynix, Samsung, and Micron, which are Korean and American companies. China is trying to replicate each of these steps, but progress is slow due to the extreme complexity of modern computing hardware. If the coalition were to include China, the U.S., and their close allies, it would be virtually impossible for any outside country to produce state-of-the-art AI hardware.
Because chips break down with use and become obsolete with age, non-parties can expect their stock of strategically and commercially relevant AI hardware to gradually diminish over time. This burnout, in conjunction with export and production controls on non-party states, and monitoring of newly produced chips (Art. VI), means that the agreement should eventually cover a large majority of the relevant hardware.
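A toy decay model makes the same point; the attrition rate below is an assumption standing in for failures plus obsolescence, not an empirical figure.

```python
# Toy model: a non-party's stock of relevant AI chips shrinking over time.
# The attrition rate (failures plus obsolescence) is an assumed placeholder.

ANNUAL_ATTRITION = 0.15  # assumed fraction of the stock lost each year

def remaining_stock(initial_chips: int, years: int) -> float:
    return initial_chips * (1 - ANNUAL_ATTRITION) ** years

for years in (0, 3, 5, 10):
    print(years, round(remaining_stock(100_000, years)))
# Roughly: 100,000 chips -> ~61,000 after 3 years -> ~44,000 after 5 -> ~20,000 after 10
```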
The coalition, by requiring each member to consolidate chips into domestically controlled Covered Chip Clusters (CCCs) and then verifying how they are used, ensures not only that parties aren’t pursuing ASI projects, but also that non-parties are not doing such work on the coalition’s chips through remote access (Arts. V–VII).
When the number of AI chips outside the coalition is small, the potential for unmonitored training by anyone, anywhere is greatly reduced.
The agreement’s restrictions on research that advances towards ASI or potentially undermines verification include “do‑not‑assist” obligations that travel with people and organizations (Arts. VIII–IX). These obligations reduce the surface for algorithmic workarounds and “offshored” labs by making it unlawful for Parties and their persons to fund, host, supervise, teach, publish, equip, or otherwise enable Banned Research anywhere (Art. VIII §4).
Research leading to dangerous AI capabilities must be clearly flagged as off-limits everywhere, and the coalition must use its diplomatic and economic influence to dissuade non-parties from such work just as it disincentivizes large AI training runs.
As is the case for nuclear research, individuals need to know that they risk arrest if they undertake dangerous AI research that crosses red lines declared by party states, even if they try to evade the restrictions by moving to a non-party state. AI researchers with sensitive industry knowledge should be discouraged from traveling to rogue states—something the U.S. government already does for employees of important scientific research offices.
The agreement’s complementary Research Restriction Verification tools—including interviews with researchers and embedded auditors at selected organizations—deter covert relocation schemes (Art. IX §2). These are paired with whistleblower protection and asylum for those who expose violations (Art. X). Finally, the coalition would provide funded placements into safer research activities for former AI developers, which would also make it more expensive for rogue states to hire them.
Intelligence collection on non-compliant states and intel-sharing collaboration among the coalition fuses parties’ national technical means with whistleblower reports and enables challenge inspections on 24‑hour timelines when credible concerns arise (Art. X).
Under the protective actions protocols defined in the agreement, any actor—party or non-party—credibly moving towards ASI is subject to gradually escalating protective actions (sanctions → cyber interdiction → hardware seizure or disablement). Models, hardware, and development programs can be targeted for observation and, in extreme cases, destruction, with due measures taken to minimize the scope and collateral harm of any protective operations (Arts. XI–XII).
The agreement provides mechanisms for revision of thresholds and definitions as the technical frontier shifts (Arts. III(d), XIII–XIV), as could happen due to ongoing developments in non-party states. These mechanisms include an allowance for the CTB to immediately adjust FLOP thresholds, CCC size, and Restricted Research boundaries when inaction poses a security risk (effective for 30 days unless the Council objects).
States wanting to leave the coalition present an outsized risk of pursuing dangerous ASI development, since the coalition is expected to include the countries holding most of the world’s AI hardware. Under the agreement, any withdrawing state must, over 12 months, help neutralize its CCCs and ASI‑enabling assets; after withdrawal, it remains subject to Protective Actions if it races for ASI (Art. XV).
Partial adoption is not a deal-breaker. A coalition that controls chip manufacturing equipment, monitors large clusters, and verifies chip usage can deter, detect, and disrupt dangerous AI development. By design, the agreement prevents dangerous R&D within member states and makes it difficult elsewhere; its phased rollout also builds international trust and provides a stable pathway to full participation.
The chip consolidation and use verification, export/production controls, challenge inspections, and graduated ladder of protective actions make staying outside the agreement and developing ASI on your own a visibly difficult, losing bet. On the other hand, joining not only reduces AI risk but also offers economic benefits such as access to the coalition’s compute resources and beneficial AI technology.
One of the greatest challenges is that the longer we wait to implement this agreement, the harder it will be to prevent rogue ASI research due to algorithmic and hardware progress. Our recommendation is simple: start building the coalition now, and keep the door open to all those who wish to join in averting the risk of an AI catastrophe.