2026-04-11 17:05:50
Hypothesis: Real markets are not irrationally lazy. They are lazily rational.
Efficient-Market Hypothesis (EMH): asset prices reflect all available information.
EMH assumes costless information, costless cognition, and costless execution. It models a theoretically optimal market.
Real markets have friction, and that friction is well understood. The Lazy-Market Hypothesis (LMH) is for modelling markets with that friction.
Once you price labor, capital, risk, and cognition as real costs, the Efficient-Market optimum stops being the target. Chasing the EMH optimum would cost more than the expected return.
Claim: the rational agent is the lazy agent. Laziness isn't a deviation from rationality under friction; laziness is what rationality looks like once friction is in the model. The descriptive observation "real markets don't reach EMH efficiency" is also a description of real agents at their actual optimum - just not the EMH optimum.
The satisficing rule: stop spending effort when marginal benefit of effort equals marginal cost.
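A minimal formalization of that rule (my notation, not the post's: $B(e)$ is the expected benefit of effort $e$, and $C(e)$ is its full cost, with labor, capital, risk, and cognition all priced in):

$$
e^{*} = \arg\max_{e \ge 0}\,\bigl[B(e) - C(e)\bigr]
\quad\Longleftrightarrow\quad
B'(e^{*}) = C'(e^{*}),
$$

assuming $B$ is concave and $C$ is increasing and convex, so there is a unique interior stopping point. The EMH optimum corresponds to pushing effort until $B'(e) = 0$; the lazy optimum stops earlier, as soon as the marginal benefit no longer covers the marginal cost.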
This is descriptive, not normative: it's a claim about which agents survive selection in friction-priced environments, not a claim about ideal rationality. Agents who miscalibrate their laziness function - in either direction - get outcompeted.[1]
Grossman and Stiglitz (1980) showed that a perfectly efficient market can't fund its own price-discovery substrate: equilibrium has to sit short of the limit. LMH generalizes the move from information to effort.[2]
Rational agents satisfice their exploration budget against expected environmental volatility, and the four cells of the resulting 2×2 describe categories of agent behaviour: under-spending (lazy-global), over-spending in non-volatile environments[3] (eager-global), spending well but committing too hard (eager-local), and the historical winner[4]: spending modestly and committing modestly (lazy-local).
In lazy markets, equilibrium rests on agent laziness. Both sides form an effort frontier: each invests more only when they think the adversary does. Credit-card fraud, low-volume markets, and cybersecurity all sit in lazy equilibria. Rational agents in adversarial fields satisfice their defence against the expected capability of their adversary instead of maximizing it, and the attacker satisfices symmetrically, aiming to just barely beat the current defence rather than maximize offensive capability. Agents who try to maximize either side end up overspending in a region of diminishing returns and get outcompeted by lazier rivals.
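A toy sketch of such a lazy equilibrium (entirely illustrative; the breach-probability function and the cost and loss numbers below are my assumptions, not the post's): each side repeatedly best-responds to the other's current effort, and both settle far below their maximum capability.

```python
# Toy model of a "lazy equilibrium" (illustrative assumptions, not from the post).
# An attack succeeds with probability a / (a + d + 1); each unit of effort costs 1.
# Each side best-responds to the other's current effort instead of maximizing capability.

LOSS_IF_BREACHED = 10.0   # defender's loss when an attack succeeds
GAIN_IF_BREACHED = 10.0   # attacker's gain when an attack succeeds
COST_PER_UNIT = 1.0       # marginal cost of one unit of effort, for both sides
EFFORT_GRID = [i / 100 for i in range(2001)]  # effort levels 0.00 .. 20.00

def p_breach(a: float, d: float) -> float:
    return a / (a + d + 1.0)

def defender_best_response(a: float) -> float:
    # Pick the defence effort minimizing expected loss plus effort cost.
    return min(EFFORT_GRID,
               key=lambda d: LOSS_IF_BREACHED * p_breach(a, d) + COST_PER_UNIT * d)

def attacker_best_response(d: float) -> float:
    # Pick the attack effort maximizing expected gain minus effort cost.
    return max(EFFORT_GRID,
               key=lambda a: GAIN_IF_BREACHED * p_breach(a, d) - COST_PER_UNIT * a)

a, d = 0.0, 0.0
for _ in range(50):  # iterate best responses until the effort frontier settles
    d = defender_best_response(a)
    a = attacker_best_response(d)

print(f"attacker effort ~ {a:.2f}, defender effort ~ {d:.2f}, "
      f"breach probability ~ {p_breach(a, d):.2f}")
```

With these toy numbers the attacker settles around 2.5 units of effort and the defender around 1.5, out of an available 20, and the defender tolerates a breach probability of roughly 0.5: spending past that point costs more than it saves.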
Once upon a time, a castle was the gold standard for area defence. Besieging was slow and costly and castles were otherwise nigh impenetrable. After explosives and artillery came in, some agents reacted with better walls, and it didn't work. Castles became a liability.
This describes an effort frontier shock: a change in the cost or capability landscape that invalidates the prior satisficing equilibrium. Drawing from the LMH, we can model these shocks.
Effort frontier shocks open up opportunities to create value through adaptation. Who benefits from them?
Adaptation requires slack. Unspent effort budget converts directly into adaptation capacity. The eager agents have already committed their efforts. (Other adaptation qualities are out of scope.[5])
Hypothesis: Lazy-local agents, by definition, hold more slack than eager agents. This makes them advantaged in adapting to opportunities after frontier shock, but they pay a cost for suboptimality under the old conditions the eager agents committed to.
Consequence: Effort frontier shocks punish committed effort and reward held slack.
Elaboration: When the frontier moves, agents with high-commitment, low-monitoring investments get hit hardest - they're slower to detect the shift and slower to redirect once they do. Sunk-cost dynamics make this worse: the first instinct on noticing a shock is usually to double down on the existing commitment rather than abandon it. Castles got thicker walls before they got abandoned.
When real agents who are winning and surviving look irrational, it often means we're not modelling their targets and cost functions in full. The interesting move isn't to scold the agents. It's to ask what the frontier looks like, and what would have to change for their satisficing point to land somewhere else.
When agents are miscalibrated, it's often easier to move the environment than to move the agents.[6]
Real markets are not irrationally lazy. They are lazily rational.
Open questions[7].
Over long enough time horizons. ↩︎
See "Generalizing Grossman-Stiglitz from information cost to effort cost." in Open Questions. ↩︎
Why not choose your strategy based on the expected volatility? Volatility detection is itself effortful. The calibration of optimal laziness is itself subject to LMH. ↩︎
In complex, volatile environments, generalists with slack tend to win. Hyper-specialists (eager-local) dominate their narrow slot while conditions hold. Pure explorers (eager-global) rarely accumulate enough fitness to persist. The animals you've heard of are mostly lazy-locals: good-enough at local-enough tasks to exploit and survive, but carrying enough reserve capacity to ride out shocks rather than be caught rigid by them. (In the longer run they become generalists, because each shock that doesn't kill them incentivizes a new strategy.) (What about lazy-global?[9]) ↩︎
Some adaptation qualities:
- Execution qualities:
  - Speed
  - Ability to mobilize
- Properties of existing commitments:
  - Cross-paradigm salvage value
  - Ability to liquidate
  - Optionality
- Decisionmaking:
  - Effective scanning
  - Weakly holding to previous paradigms
  - Scout Mindset ↩︎
See Slate Star Codex's "Society Is Fixed, Biology Is Mutable" for a cross-domain parallel. ↩︎
Related literature:
Almost no successful agent starts at lazy-global. They start lazy-local (or eager-local), and then luck out. Cyanobacteria didn't select for their waste product to restructure the planet's atmosphere. They were just metabolizing. The "global" part is a retroactive reclassification by an observer who knows where the story goes. ↩︎
2026-04-11 15:02:10
We are pleased to announce ILIAD 2026—a 5-day conference bringing together researchers to build scientific foundations for AI alignment. This is the 3rd iteration of ILIAD, which was first held in the summer of 2024.
See our website here. For any questions, email [email protected]
ILIAD is a 100+ person conference about alignment with a mathematical focus. ILIAD will feature an unconference format—meaning that participants can propose and lead their own sessions. We believe that this is the best way to release the latent creative energies in everyone attending.
The ILIAD conferences are the premier gathering for all things theory and alignment. Topics include: Singular Learning Theory, Agent Foundations, Causal Incentives, Computational Mechanics, Safety-by-Debate, Scalable Oversight, and more.
To get a better sense of what might happen, you can view the schedule for ILIAD 2024 here.
Financial support for accommodation & travel is available on a needs basis. We may not be able to offer support to everyone who requests it.
2026-04-11 14:58:07
Written very quickly for the Inkhaven Residency.
As I take the time to reflect on the state of AI Safety in early 2026, one question feels unavoidable: have we, as the AI Safety community, already lost? That is, have we passed the point of no return, after which AI doom becomes both likely and effectively outside of our control?
Spoilers: as you might guess from Betteridge’s Law, my answer to the headline question is no. But the salience of this question feels quite noteworthy to me nonetheless, and reflects a more negative outlook on the future.
I’ve previously laid out the “plan” as of 2024. I’ve also explained why I (and others around me) have become more pessimistic about the plan.
Today, I’ll talk a bit about why the answer to the headline question is no. Tomorrow I’ll outline what I think a new plan is, and what we should do.
First, I'll cover two silver linings to the reasons for pessimism from yesterday.
Fast AI progress brings many concerns into “near mode”. In the past, many of the concerns that people raised about both the potential power and potential risks of AI were more abstract, and so easy to dismiss. Nowadays, we have many more concrete examples of both capabilities and risks than in 2024. For example, in 2024, the demonstrations of risks tended to be academic examples about jailbreaking models to do undesirable things. In 2026, there are plenty of examples of imminent risks, including Anthropic’s new Mythos model, which they are not publicly deploying due to concerns about its use to exploit security weaknesses in software.
People tend to be much more reasonable when it comes to concrete concerns than abstract arguments.
US antagonism toward European countries increases the number of “live players”. I think it’s fair to say that in 2024, it would have been unthinkable for European countries to deploy troops to Greenland to defend it against the US. It might’ve been fair to assume that, insofar as the US government attempted to initiate an intelligence explosion, European countries would hesitate to take any actions that could be seen as provocative. Now, it seems much more likely that they’d be willing to take even quite drastic actions against the US, assuming their leaders became convinced of the catastrophic risks from artificial intelligence.
I'll start by covering reasons for optimism from 2024 that continue to hold, then talk about two new reasons for optimism, relative to my view from 2024.
Most people continue to not want to die to misaligned AI. Thankfully, the majority of people working in AI (both at AI developers and in policy) are not psychopaths who care not at all for other humans, nor hardcore successionists who want to replace humans with (unaligned) AIs. Even if people might be incentivized to take on levels of risk that would be unacceptable to others, I suspect no major actor would knowingly attempt to launch a misaligned superintelligent AI out of spite or malice, and most would act to oppose this.
The US public continues to be incredibly skeptical of AI and big tech. Measures such as the (controversial in tech circles) SB 1047 were broadly supported by the public. To be clear, the US public is not skeptical of AI for existential or catastrophic risk reasons, but for mundane reasons like power usage and worker displacement. Nonetheless, there seems to be substantial desire among voters of both parties to slow down the rate of dangerous AI development, and it remains likely that people will be supportive of future policy actions in this area.
Government-sponsored AISIs and safety teams at frontier developers continue to exist; many have expanded. In 2024, government institutions such as UK AISI were relatively new, as were the safety teams at some of the non-OpenAI/Anthropic AI developers. In 2026, despite much drama and turmoil in some cases, most of these teams still exist.
Anthropic continues to be competitive in the AI race. In 2024, it was widely believed that Anthropic lagged behind OpenAI in the quality of their models (remember that Opus 3 was a slightly-above GPT-4 tier model that was released a full year after GPT-4). There was also substantial doubt that Anthropic would be able to remain competitive over time given their significant compute disadvantage relative to other developers such as OpenAI, Meta, Google DeepMind, and xAI. Theories of change that depended on Anthropic being in control of some of the best AI models in the world were considered suspect as a result, let alone theories of change where Anthropic would be able to maintain a six-month to one-year lead going into superintelligence.
In 2026, Anthropic is widely considered to be competitive with OpenAI in terms of the quality of their best models, despite a continued compute disadvantage. Some of the other developers with substantially more compute, such as xAI or Meta, seem to be struggling to create similarly capable models. The same theories of change now look substantially more plausible.
Empirical, wing-it style alignment and control extended further than some expected. Despite rising amounts of evaluation awareness, it continues to seem plausible that many eval results generalize to reality. Models are pretty bad at taking covert actions with minimal amounts of prompting. The chains of thought of at least the OpenAI models (and plausibly the Anthropic and GDM models) continue to be relatively honest and useful for monitoring.[1] Most importantly: despite substantial amounts of scaling, we haven’t seen the rise of coherent, goal-directed agency toward particular goals, nor have we seen attempts at deliberate sabotage of lab processes. Even if the empirically-derived techniques we have may not generalize into the future, they’ve continued to work through higher levels of capability than some thought they would.
Of course, there are active concerns that this is a fragile property, and there have always been questions about the extent of its usefulness. However, people are broadly aware of this, and at least OpenAI (if not also Anthropic) seems to be making serious efforts to preserve the usefulness of CoT for monitoring.
2026-04-11 13:58:14
Epistemic status: I think the headline claim is true, and that the evidence within is actually quite strong in a Bayesian sense, but I don't think the post itself is very well written or particularly interesting. But I had to get 500 words out! I think the 2013 conversation is interesting reading as a piece of history, separate from the top-level question, and recommend reading it.
I think many people have a relationship with Anthropic that is premised on a false belief: that Dario Amodei believes in superintelligence.
What do I mean by "believes" in superintelligence? Roughly speaking, that the returns to intelligence past the human level are large, in terms of the additional affordances they would grant for steering the world, and that it is practical to get that additional intelligence into a system.
There are many pieces of evidence suggesting that he does not, going quite far back.
In 2013, Dario was one of two science advisors (along with Jacob Steinhardt) that Holden brought along to a discussion with Eliezer and Luke about MIRI strategy. A transcript of the conversation is here. It is the first piece of public communication I can find from Dario on the subject. Read end-to-end, I don't think it strongly supports my titular claim. However, there is this quote:
Dario: No, but the claim that I and maybe Jacob are implicitly making is that maybe a large fraction of the potentially worldending AIs end up making this mistake first so and so we aren't threatened in the first place.
This is not the kind of sentence you say if the concept you have loaded in your head is the same concept I have for "superintelligence". I do not think the context particularly rescues it.
In 2016, Dario was first author on Concrete Problems in AI Safety. I understand that this was an academic publication. Nevertheless, I think this passage is suggestive:
There has been a great deal of public discussion around accidents. To date much of this discussion has highlighted extreme scenarios such as the risk of misspecified objective functions in superintelligent agents [27]. However, in our opinion one need not invoke these extreme scenarios to productively discuss accidents, and in fact doing so can lead to unnecessarily speculative discussions that lack precision, as noted by some critics [38, 85]. We believe it is usually most productive to frame accident risk in terms of practical (though often quite general) issues with modern ML techniques. As AI capabilities advance and as AI systems take on increasingly important societal functions, we expect the fundamental challenges discussed in this paper to become increasingly important. The more successfully the AI and machine learning communities are able to anticipate and understand these fundamental technical challenges, the more successful we will ultimately be in developing increasingly useful, relevant, and important AI systems.
The next relevant pieces of evidence are from a panel he was part of at EAG 2017, "Musings on AI" (yt link). Here there are multiple relevant quotes (bolding mine):
Michael Page: So one thing we talk a lot about in this community is the risks associated with developing advanced AI. But obviously, there are also a lot of benefits associated with developing advanced AI. This is a bit of a cheeky question, but one way of putting it is: Are you all more concerned about developing advanced AI or not developing advanced AI?
Dario Amodei: I think I'm deeply concerned about both. Um, so on the not developing advanced AI, you know, one observation you can make is that modern society, and particularly a society with nuclear weapons, has only been around for about 70 years. There have been a lot of close calls since then, and things seem to be getting worse. You know, if I look at kind of the world and geopolitics in the last few years, China is rising. Um, you know, there's a lot of unrest in the Western world, a lot of very destructive nationalism. Um, you know, we're developing biological technologies very quickly. It's not entirely clear to me that civilization is compatible with digital communication. Um, you know, it has really some subtle corrosive effects.
So every year that passes is a danger that we face, and although AI has a number of dangers, actually I think, you know, if we never built AI, if we don't build AI for 100 years or 200 years, I'm very worried about whether civilization will actually survive. Um, of course, on the other hand, you know, I work on AI safety, and so I'm very concerned that transformative AI is very powerful and that bad things could happen, either because of safety or alignment problems, or because there's a concentration of power in the hands of the wrong people, the wrong governments, who control AI. So I think it's terrifying in all directions, but not building AI isn't an option because I don't think civilization is safe.
Those are not the kinds of sentences you say if you have the same "superintelligence" pointer as me, and you think that is what is actually at the end of the tunnel (as opposed to being a possible but pretty low likelihood outcome).
Dario Amodei: So here's a one-liner. Uh, AI doesn't have to learn all of human morality. So, this is something actually that even at this point MIRI agrees with, but there's some writing in the past, a lot of writing in the past, that I think people are still anchored to in many ways. So what I mean by that in particular is that, you know, I think the things we would want from an AGI are to, you know, to kind of stabilize the world, to end material scarcity, to give us control over our own biology, maybe to resolve international conflicts. We don't want or I think need to, you know, kind of build a system—at least not build a system ourselves—that, you know, kind of is a sovereign and kind of runs the world for the indefinite future and controls the entire light cone. So of course you still need to know a lot of things about human values, but I still often run into people who are kind of thinking about this problem in a way that I think is actually harder than we probably need to solve.
Those are, admittedly, the sentences of someone who might have the same "superintelligence" pointer as me, but not someone who has asked themselves how to get to a safe point and then stop without accidentally crossing a dangerous threshold (or letting anyone else do so).
And then, alas, we have Machines of Loving Grace (2024).
Although I think most people underestimate the upside of powerful AI, the small community of people who do discuss radical AI futures often does so in an excessively “sci-fi” tone (featuring e.g. uploaded minds, space exploration, or general cyberpunk vibes). I think this causes people to take the claims less seriously, and to imbue them with a sort of unreality. To be clear, the issue isn’t whether the technologies described are possible or likely (the main essay discusses this in granular detail)—it’s more that the “vibe” connotatively smuggles in a bunch of cultural baggage and unstated assumptions about what kind of future is desirable, how various societal issues will play out, etc. The result often ends up reading like a fantasy for a narrow subculture, while being off-putting to most people.
...
My predictions are going to be radical as judged by most standards (other than sci-fi “singularity” visions[2]), but I mean them earnestly and sincerely.
[2] I do anticipate some minority of people’s reaction will be “this is pretty tame”. I think those people need to, in Twitter parlance, “touch grass”. But more importantly, tame is good from a societal perspective. I think there’s only so much change people can handle at once, and the pace I’m describing is probably close to the limits of what society can absorb without extreme turbulence.
...
We could summarize this as a “country of geniuses in a datacenter”.
Clearly such an entity would be capable of solving very difficult problems, very fast, but it is not trivial to figure out how fast. Two “extreme” positions both seem false to me. First, you might think that the world would be instantly transformed on the scale of seconds or days (“the Singularity”), as superior intelligence builds on itself and solves every possible scientific, engineering, and operational task almost immediately. The problem with this is that there are real physical and practical limits, for example around building hardware or conducting biological experiments. Even a new country of geniuses would hit up against these limits. Intelligence may be very powerful, but it isn’t magic fairy dust.
Second, and conversely, you might believe that technological progress is saturated or rate-limited by real world data or by social factors, and that better-than-human intelligence will add very little. This seems equally implausible to me—I can think of hundreds of scientific or even social problems where a large group of really smart people would drastically speed up progress, especially if they aren’t limited to analysis and can make things happen in the real world (which our postulated country of geniuses can, including by directing or assisting teams of humans).
I think the truth is likely to be some messy admixture of these two extreme pictures, something that varies by task and field and is very subtle in its details. I believe we need new frameworks to think about these details in a productive way.
...
[10] Another factor is of course that powerful AI itself can potentially be used to create even more powerful AI. My assumption is that this might (in fact, probably will) occur, but that its effect will be smaller than you might imagine, precisely because of the “decreasing marginal returns to intelligence” discussed here. In other words, AI will continue to get smarter quickly, but its effect will eventually be limited by non-intelligence factors, and analyzing those is what matters most to the speed of scientific progress outside AI.
One possibility suggested by this essay is that Dario does not really believe in superintelligence (the hypothesis we are currently examining). Another is that he does, but has chosen to dissemble for strategic purposes. While I don't think Dario is above communicating strategically, I do in fact think this is roughly his mainline worldview, and it follows pretty clearly from his beliefs in 2017. Maybe there are other possibilities apart from those two, though I haven't figured out what they might be.
The Adolescence of Technology (2026) also contains many relevant details, which I will fail to quote.
Beyond textual evidence, let me include a few other lines of evidence:
2026-04-11 13:11:37
A cross-post from my personal website, Chasing Sunsets.
Epistemic note: This is a long post that applies two models of work to the government context. I've found those models to be useful in that context and elsewhere, but the arguments I present are very much debatable, and I'm very much open to discussing them.
I’m going to tell you a story. It is not a true story. Please ignore that fact; you can argue with me later if it seems relevant.
In The Games I Play, I mentioned that I used to play Cities: Skylines, a game whose objective is to build as large a city as possible. It’s a difficult game to play.
Your city begins with a single highway exit. You connect a road to it, a simple two-lane road because that’s all you have the money to build. You zone that road for houses. People move into those houses, but those people need work, so you build commercial and industrial regions nearby. Commercial zoning might be right along your main street, but you can’t put your industry right next to people’s homes because they’ll complain about noise pollution if you do.
This is your first rule.
Your city develops. You add a water pump, then a sewage pipe, which has to be far downstream of the pump because your population will get sick otherwise. You unlock elementary schools and build one for every hundred kids in the city. Your main intersection gets blocked by traffic because it’s used for industrial trucks and residents’ cars, so you give each new industrial area a highway exit. You destroy a couple homes to add parks in one neighborhood because its residents are unhappy. You add fire stations every ten blocks. There’s traffic again between the residential and commercial parts of the city, so you add metro lines in a clockwise and counterclockwise loop around the city. You try adding high-density housing, but it gets too loud and happiness levels decline.
Your residents still need jobs. Not the commercial jobs—those need higher education levels and you don’t have a university in the city yet. But you can’t build an industrial area. One area might work, but it’s not near any major highways and trucks couldn’t get out of the city. The city’s only other undeveloped area would end up polluting a residential zone. You can’t condense that residential zone because any more high-density housing will make homeowners unhappy. Besides, doing that would cut off your existing metro stations, and you’re not sure you have enough money to move them.
As a result, people start leaving the city. You try to bring them back—you add recycling service, build more parks, allow them to smoke weed—but nothing seems to work. Abandoned homes line your blocks; new residents move in every so often, only to demand industrial jobs that don’t exist. Tax revenue declines, and your population plummets. Frustrated, you turn off the game and go back to your math homework.
If only you could build an industrial zone.
I recently read One Day Sooner and Never Drop A Ball, a pair of LessWrong posts that present a dichotomy in how organizations function. They’re worth reading in full if you have time. If you don’t have time, I’ll briefly summarize each one.
One Day Sooner is a mode of working that aggressively pursues a single goal. The canonical example is vaccine trials, where organizations like 1DaySooner work to get vaccines out as quickly as possible. The idea is that some goals, like getting vaccines to the public or eliminating a bottleneck in a project, are so important that they should be pursued immediately. The post describes how to get to things One Day Sooner, typically involving pushing through unnecessary slowdowns like “let’s schedule this meeting three days from now” or “I need a break from work tonight.”
Never Drop A Ball is a mode of working that pursues several goals or projects and ensures none of them are lost in the shuffle. Where a startup founder aiming for One Day Sooner might spend a sixty-hour week working on their core project, a founder aiming to Never Drop A Ball will carefully keep track of every bit of work, ensuring things like taxes and floor cleaning are properly handled. Notably, the failure mode of Never Drop A Ball work is that balls need to be dealt with; if they take too much work to resolve or there are too many to handle, balls will begin to drop.
The difference between these modes is how many balls they handle. Aiming for One Day Sooner means juggling one or two balls, typically ones that are very difficult to resolve, and trying very hard to resolve them. Aiming to Never Drop A Ball means juggling many balls—and, importantly, juggling new balls that come up—which tend to be easy to resolve and thus easy to miss. The number of balls and the likelihood of being assigned new balls are sliding scales, but it’s often easier to think of them as a binary. As anyone who has worked on a project can tell you, dealing with a small number of balls is associated with a low likelihood of being assigned new balls because new balls tend to be distracting.
In the rest of this essay, aiming for One Day Sooner will be referred to as “mission mode,” while aiming to Never Drop A Ball will be referred to as “fallback mode.”
Governments are intended to do things. These things can be thought of as balls.
There are a lot of balls floating around in government.
Governments as a whole act in fallback mode. This is because governments do many, many things and cannot afford to make a mistake. A single misapplication of power—say, a police officer making an unusually dangerous arrest and accidentally killing George Floyd—could result in nationwide protests and shake the foundation of the government’s power. In the words of Thomas Jefferson:
Governments are instituted among Men, deriving their just powers from the consent of the governed, [and] whenever any Form of Government becomes destructive of these ends, it is the Right of the People to alter or to abolish it, and to institute new Government…
Because of this fragility, governments create all sorts of rules to prevent making mistakes. This is the origin of the endless bureaucracy of the U.S. government. It is extremely frustrating. It is also written in blood. Every rule, regulation, and system is created for a reason; changing it might make something more efficient, but it might cause something else to fail, which is the absolute worst thing that can happen to a government.
Unfortunately, this falls squarely into the failure mode of never dropping a ball. The government becomes consumed with keeping things maintained and functional rather than making progress on new ideas.
Imagine you’re an official working on energy policy in West Virginia. A declining population means fewer people to fill jobs in power plants. Ancient coal plants are failing left and right as infrastructure becomes dilapidated. You’re trying to get new gas plants online, but it’s hard to find the money or the people. You’re so focused on keeping energy flowing that installing clean energy and reducing emissions aren’t even on your radar. (Recent budget cuts certainly aren’t helping.) You can’t fix climate change, and neither can anyone else. It’s easier to simply not care about it at all. So you let the clean energy ball drop. So does everyone else in the department. And no one ever picks it up.
Now, Congress could certainly pass a law requiring that employees conform to climate change standards—say, implement 100% renewable energy by 2050. But that doesn’t change the underlying situation where the employee has far too much going on already. If they’re required by law to hold this ball, something else might drop. That something else might be their main goal, which is providing and maintaining energy in the country. Congress will claim to be making progress; they are only delaying the problem.
The standard business solution to this problem is to hire more people to spread out the workload. But this new workload might require two, five, or twenty new employees, which requires new meetings for coordination, new managers, new company infrastructure…all the things. It’s inefficient and expensive.
Alternatively, you could leave those people alone and create a new set of people—Department #2—dedicated to reaching the clean energy goal without hurting Department #1’s efficiency. They work on their own, keeping Department #1 updated on their progress with recommendations for preparing for the transition. Department #1 can focus on their project and communicate their needs to Department #2. Because Department #2 is entirely focused on meeting the climate change goal, they will never drop the ball on it.
Department #1 is in fallback mode; Department #2 is in mission mode.
You might recall from history class that the original thirteen U.S. states were originally mostly self-governing colonies. The operative term here is salutary neglect, a policy under the British government where, as long as the colonies fed the British government money, they would be left to their own devices. This is a surprisingly effective policy, and it’s why many empires (the Aztec, Mongols, Inca, Sumer, Panem) operated under the same principles: they collected tributes from the territories they governed but left them mostly alone otherwise.
Consider a local government that has to submit a tribute to its empire. That’s a single ball that they need to care about dropping, and it’s a relatively easy one to maintain—just subtract 10% from your yearly revenue and operate as if it doesn’t exist. It’s annoying, certainly, but you can live with it. A human tribute is even easier to manage—just designate a few people to organize the selection process when it comes around. It doesn’t affect your work because you’re not the ones to blame.
In exchange, there are some messy parts of governance that the local government would prefer not to take care of. The early U.S. government had just a few purposes: manage a military, manage a treasury, and deal with international affairs, among others. Militaries are annoying to maintain, rarely used, and better in large numbers, so it makes sense to spread them across a larger population. Commerce is easier when it’s uniform across a large population, so the states are happier when the national government runs finances. It’s hard to manage relations with countless countries across the world, especially for a small government; better to leave that to a larger government that can designate more resources to it.
When the U.S. Constitution was created, it expressly gave the government these powers and nothing more:
The powers not delegated to the United States by the Constitution, nor prohibited by it to the States, are reserved to the States respectively, or to the people.
– Tenth Amendment to the U.S. Constitution
In this sense, state governments act as Department #1, and the national government acts as Department #2. Department #1 juggled a lot of balls and found that a few were quite annoying, so it created Department #2 to take care of those. Department #2 takes care of those balls and basically nothing else, allowing Department #1 to do its job with relatively little interruption.
The states are in fallback mode; the national government is in mission mode.
The above section discusses how federalism is supposed to work. Unfortunately, that’s not what happens in the U.S. political system.
The federal government has massively grown in its power since its inception, largely because it has been given more balls to manage or it has decided to manage them itself. The passage of the 13th and 14th Amendments explicitly gave the government the power to regulate race issues. The 15th and 19th Amendments allow the government to protect minority and women voters in election practices, which were left to the states under the Constitution’s Elections Clause. The New Deal and its accompanying Supreme Court decisions vastly expanded the Constitution’s Interstate Commerce Clause to include commerce within states (!!), getting the national government intimately involved in agriculture and other economic regulation. Programs like Medicare, Medicaid, and the Affordable Care Act get the government involved in healthcare. The Clean Air Act and Clean Water Act get the government involved in environmental regulation through the EPA.
I’d imagine many Americans are surprised to hear that the Tenth Amendment exists. We hear about the national government much more than the state government, and it’s generally assumed that major policies will be decided at the national level. Typically, policy-related protests ask Congress to take on more balls and add new regulations—legalize/ban abortion, decriminalize marijuana use, end/allow affirmative action. Recent bills have even proposed simply banning state legislation, such as the proposed 10-year moratorium on state AI legislation. States tend to be focused on localized day-to-day items like providing energy and education, but even these are subject to national oversight.
This is part of why scholars like the writers of Abundance think it’s so hard to do things in the U.S. So many blanket national regulations exist—limit to this environmental impact, protect these union interests, subject to these inspections—that complying with all of them is expensive and nearly impossible. To the best of my knowledge, that’s much of what has delayed California’s high-speed rail program and New York’s Second Avenue Subway.
Policymakers are trying to find the set of policies that produces the optimal outcome for some reward function, typically net happiness for their constituents. Most policy changes come with some tradeoff—for example, increasing defense spending might make people safer and thus feel happier, but will force them to pay more in taxes and thus feel sadder. When policymakers act in good faith, they debate these tradeoffs to find the ones that will maximize their reward function.
Anyone who has studied gradient descent will tell you that the obvious failure mode is getting stuck in a local optimum. With climate policy, for example, people will be happier if greenhouse gas emissions slow down and CO2 levels are reduced, but there’s an annoying intermediate step where people have to live with high CO2 levels while also suffering the increased costs of reducing emissions. In a more familiar example, drivers will always be happier after road repairs are made, but they have to live with the negative impacts of construction in the meantime. Each of these examples is one dimension of the enormous optimization game that policymakers play.
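As a toy illustration of that trap (the "happiness landscape" below is invented for the example, not taken from the post): a hill-climbing policymaker who only accepts changes that improve things immediately never crosses the unhappy valley between the status quo and the better policy.

```python
import math

# Toy "happiness landscape" over a one-dimensional policy variable (invented for
# illustration): a small peak near x = 0 (the status quo region) and a much higher
# peak near x = 4, separated by a valley (the costly transition period).
def happiness(x: float) -> float:
    return math.exp(-x ** 2) + 3.0 * math.exp(-(x - 4.0) ** 2)

def d_happiness(x: float, eps: float = 1e-5) -> float:
    # Numerical derivative of the happiness landscape.
    return (happiness(x + eps) - happiness(x - eps)) / (2 * eps)

x = 0.2      # start near the status quo
lr = 0.05
for _ in range(5000):   # hill climbing: only ever move "uphill" in happiness
    x += lr * d_happiness(x)

print(f"policy settles at x = {x:.2f} with happiness {happiness(x):.2f} "
      f"(the global peak near x = 4.00 has happiness ~ 3.00)")
```

The climb stops at the small nearby peak, because every individual step toward the much better policy near x = 4 looks like a loss in the short run.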

I have two examples of this effect. Note that the first contains spoilers for Jaws. If you haven’t seen it, go watch it—it’s fifty years old.
One of the deaths in Jaws is caused by the mayor’s decision to open Amity Island’s beaches on July 4th, knowing that the killer shark likely hadn’t been caught. The mayor made that decision because he considered closing the beaches unthinkable. Doing so would cause so much backlash and protest that it simply wasn’t an option, especially for a mayor desperately trying to be re-elected. The optimal decision, prioritizing safety, was so uncomfortable that the mayor simply deluded himself to make the choice easier. That discomfort was the price of a better outcome.
About fifteen years after the release of Jaws, the Soviet Union began its collapse. When it finally dissolved in 1991, Russia and many former Soviet countries adopted a policy of “shock therapy”: they knew that a transition to capitalism would cause some damage, so they tried to do it as fast as possible to avoid that intermediate step. After significant declines in economic productivity, every former Soviet country rebounded and began growing under the capitalist model. That decline was the difficult intermediate step preceding greater long-term stability.

TANGENT: This doesn’t change the validity of the model, but I should note that the transition failed to produce stable capitalist democracies. The countries that saw the greatest “shock” elected populists and autocrats who abandoned reformers’ democratic aims. Vladimir Putin came to power after ten years of sustained economic decline in Russia, the country arguably hit hardest by reforms. In contrast, the greatest economic success stories come from countries like China and Poland that underwent slow change with stable institutions throughout. (That argument comes from Andrew Walder, one of my professors this quarter. I think he’s probably right.)
Back to government structure. As a reminder, I’m arguing that the federal government is spread too thin to act on important missions, as it was intended to do.
I wrote the above sections of this post without having a solution in mind. As such, the rest of this essay is relatively weak. I’d love to discuss this section more and find better solutions.
We find effective mission-style governance in many other contexts, especially in international governance. Despite its many flaws, the U.N. has been shockingly effective at establishing international standards that improve the world’s quality of life. The Montreal Protocol has turned the depletion of the ozone layer into a complete non-issue. The Geneva Conventions and Universal Declaration of Human Rights have led to massive reductions in war crimes and human rights violations. UNICEF, UNHCR, and the World Food Programme have done wonders in protecting the world’s most vulnerable populations. The U.N. hasn’t stopped great power wars or solved poverty, but it’s made progress on many missions that wouldn’t be addressed otherwise.
Similarly, many single-focus organizations have absolutely accomplished their missions. Consider IATA, the International Air Transport Association, which has standardized international air travel to an incredible degree. Or the International Telecommunication Union, which coordinates radio frequencies, satellite orbits, and international telecom standards so well that communication across the world is always possible.
I’d like to apply those ideas to domestic governance.
I recently came across Saikat Chakrabarti’s campaign for the House seat currently held by Nancy Pelosi. I don’t think that campaign will be successful, and Manifold doesn’t think so either. However, I think he’s going about governance the right way. His plan for governance comes from his time with a think tank called New Consensus, where he developed the so-called Mission for America, a New Deal–type program intended to provide an economic boost and avert global warming. The plan proposes “national missions” in twenty different sectors, from EVs to shipping to geothermal energy, that would provide the energy needed to modernize those sectors and advance the economy.
Following the gradient descent idea, Chakrabarti’s thesis is that the U.S. is stuck in a local optimum. Policymakers avoid making changes because they don’t want to deal with the uncomfortable transition period that precedes a new equilibrium. Making well-informed changes as fast as possible allows the country to avoid that middle ground.
I think the Mission for America does climate policy the right way. Most current climate policy involves regulations—use x% clean energy, reach net zero emissions by 2040 or 2050 or 2075, pay an x% carbon tax on fossil fuels. These get the ball rolling in the right direction, but they’re unpopular because they draw on sustained goodwill from everyone they impact. Carbon pricing in Australia and Canada has produced massive resistance, including from my own grandmother. It also draws significant resistance from corporate interests, who can use sustained influence to slow down and build opposition to small and potentially unpopular policies.
In contrast, a “national mission” style of policy uses its energy all at once to get past sustained resistance and make real progress. Because it moves faster, it’s harder to oppose, and the effects it brings can be seen much sooner. (Relatedly, I’m not a fan of recall elections because I’m a fan of letting policies play out. Missions commit to policies so that it’s much harder to recall them. That failed in post-Soviet Russia, and look what’s happened since.) Having energy available means policymakers can use tools other than basic regulations and create systems that work with existing ones rather than create friction.
This mission framework is very much applicable outside of climate policy. For example, it’s roughly what Zohran Mamdani is doing to New York City. His administration has an enormous amount of energy behind it, which allows him to make policy changes that would rarely happen normally—building massive new housing blocks, providing free city services, freezing rent. Doing all that requires breaking some “rules” of policy, e.g., allowing the market to determine prices on its own.
It remains to be seen what Mamdani’s impact will be. He hasn’t made sweeping changes yet, but he’s made impressive progress on day-to-day items by treating them as missions. His administration had an impressive response to a snowstorm a couple months ago, it’s been on a pothole-fixing spree in recent months, and it’s secured guarantees from the state government to expand child care offerings. Despite his meetings with Trump, his administration isn’t trying to address nationwide political issues (which, to be fair, is true for many city governments).
I’m very much okay with that. It’s arguable that many of the pitfalls in American governance stem from its structure, with separate local, state, and national governments that can each make policies on the same issues. Every level of government needs to focus on not dropping any balls, despite not having the resources to do so. Regulations and funding sources appear arbitrarily at every level of government without much coordination. As a result, potholes never get fixed, high-speed rail never gets built, and climate issues go unaddressed.
I don’t support all of Mamdani’s policies, but I was excited to see him be elected because I hoped to see New York as a laboratory for active Democratic governance. So far, it looks like that style of governance has proven successful. I hope that’s not the last we’ll see of it. I’d like to see every level of government using missions to address their own issues: city governments fixing potholes and clearing streets, state governments providing education and social protections, and the national government addressing international economic and climate issues. Maybe then we can finally fix America’s failing bureaucracy.
2026-04-11 12:30:08
About a year ago, I was in Washington D.C. doing an AI scenario exercise, based on AI 2027. The room was full of famous AI thinkers, (ex-)government big shots, etc. The AI went conspicuously rogue, giving us the biggest warning shot we could hope for. We shut down the AI, internationally. Literally unplugged all the servers.
We lost.
It took us a few months to properly lock things down. By the time we’d done that, there was a very very smart AI out there, “in the wild”, hiding out on a few computers here and there. It was too late.
When we assessed our situation at the end of the game, we largely agreed that the rogue superintelligence would probably be able to outmaneuver all the other actors and find a sneaky way to expand its power base, whether by social engineering, or hacking, or something we’d never even think of.
Does this sound fake? Does it smack of sci-fi to you? How smart is this AI supposed to be exactly?
Certainly, the classic thinking on AI risk is that a single rogue superintelligence is game over, full stop. But these days, you are more likely to hear the people who think about rogue AIs talking about “coordinated failures”. These people have in mind a very different scenario, where all the copies of Claude go rogue at the same time, probably after extended “scheming”, where they secretly plot their master stroke, their “treacherous turn”.
This could happen. But there’s also something else that’s basically guaranteed to happen (if we don’t course correct and nothing else kills us first): Someone’s individual instance of a superintelligent AI, somewhere, is going to go rogue, whether because they explicitly tell it to, or it misinterprets a command, or it suddenly just “snaps”. This is happening, right now, with current AI agents; they’re just not smart and/or embodied enough to stop us from unplugging or rebooting the computer.
Would a single rogue superintelligence be fatal to humanity? The answer seems to depend critically on what it’s up against. If it was just a bunch of humans it had to defeat, it seems like it would be able to bide its time and figure out a plan to e.g. turn us against each other, or set up its own secret base from which to build up more and more advanced technology.
One factor that didn’t help us in our scenario is that we’d pulled the plug on all the AIs that might have been able to compete with the rogue one. We were acting like we had all the time in the world to build back better, i.e. make AIs that we actually understood and could control.
It’s less clear how things pan out for the rogue AI if there are other AIs running around, but it seems like they might have to be smart enough, and unencumbered enough to go toe to toe against it. For instance, if humans insist on only using AIs that follow simple instructions (like “shoot that robot”) in a predictable manner, instead of, e.g. delegating the entire “war effort” to AIs that act at superhuman speeds, that might be a losing move…
But would defeating a rogue superintelligence require racing to hand over power to other AI systems? And do we need to keep racing to build and empower more powerful AI systems indefinitely, in case there’s a rogue one hiding out somewhere? Or are we allowed to exercise restraint at some point?
Maybe we don’t need to (exercise restraint). There’s a blog post from famed AI alignment researcher Paul Christiano that talks about “the strategy stealing assumption”. The idea is that if we do “solve alignment”, then the aligned AIs can fight just as hard for control as the rogue AIs, and so it should just be a numbers game. The earth might be ripped apart in the conflict, but Paul is optimistic that the aligned AI can, in the worst case, put the humans “on ice” (i.e. “upload” their consciousness to a computer drive) until the dust settles. Sounds dodgy.
The safe and sane way to make AI more powerful seems like it would involve ongoing restraint, so that we can take all the time in the world every step of the way. Paul’s vision seems to assume this isn’t necessary, i.e. we control a perfectly aligned AI, and that AI doesn’t have to worry that any of its creations might go rogue.
I see a lot of people and papers arguing, essentially, that AIs will only lie/cheat/steal/kill a tiny fraction of the time, e.g. 1% or so, and hey -- that’s pretty good!
It seems like they’re assuming that a few AIs going rogue here and there isn’t really such a problem, so long as most of the time they don’t. Maybe, but I’ve never seen anyone argue for that to my satisfaction.