
An online forum and community dedicated to improving human reasoning and decision-making.

RSS preview of the LessWrong blog

10% ≈ 90%

2026-04-11 18:51:05

My intuition about probability doesn't match the linear distances between percentage points.

With some percentages, 51% versus 49% makes all the difference, such as in company ownership and voting.

But with other percentages, such as 15% versus 30% point estimates of a timeline prediction, I wish I had a "plus or minus Knightian uncertainty" emoji, screaming in ignorance: "that's basically the same number, no?!?" The feeling is similar to my reaction when I first learned that 0.1 and 1/10 are two different numbers in float32. I understand why, but still: yuck!
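That float32 quirk can be checked directly. A minimal sketch in Python, reading the remark as: the 32-bit value stored for the literal 0.1 is not the exact rational 1/10:

```python
import struct
from fractions import Fraction

# Round-trip 0.1 through 32 bits (format 'f' = IEEE 754 binary32)
# to see which number actually gets stored.
stored = struct.unpack('f', struct.pack('f', 0.1))[0]

print(Fraction(stored))                      # 13421773/134217728, not 1/10
print(stored == 0.1)                         # False: the round-trip moved it
print(Fraction(stored) == Fraction(1, 10))   # False
```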

If some of the variables in a world-model can change the prediction from "that's basically impossible" to "oh, it already happened", I wish smarter people than me would invent better communication tools for that. Something less awkward than decibels or bits, though… something that would feel like cubic-bezier(1,0,0,1).
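For comparison, this is what the decibel scale mentioned above does (an illustration of standard log-odds, not a tool from the post):

```python
import math

def to_db(p):
    """Log-odds in decibels: 10 * log10(p / (1 - p))."""
    return 10 * math.log10(p / (1 - p))

for p in (0.49, 0.51, 0.15, 0.30):
    print(f"{p:.0%}: {to_db(p):+.2f} dB")

# 49% vs 51% differ by ~0.35 dB, while 15% vs 30% differ by ~3.85 dB:
# equal steps in dB are equal multiplicative updates to the odds,
# not equal distances in percentage points.
```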

The least I can hope for is more examples and qualitative splits (if/else, scenarios, ...) before collapsing an estimate to a single weighted number.




The Lazy-Market Hypothesis

2026-04-11 17:05:50

Hypothesis: Real markets are not irrationally lazy. They are lazily rational.

The Efficient-Market Hypothesis

Efficient-Market Hypothesis (EMH): asset prices reflect all available information.

EMH assumes costless information, costless cognition, and costless execution. It models a theoretically optimal market.

The friction of real markets is well understood. The Lazy-Market Hypothesis is for modelling markets with friction.

The Lazy-Market Hypothesis (LMH)

Once you price labor, capital, risk, and cognition as real costs, the Efficient-Market optimum stops being the target. Chasing the EMH optimum would cost more than the expected return.

Claim: the rational agent is the lazy agent. Laziness isn't a deviation from rationality under friction; laziness is what rationality looks like once friction is in the model. The descriptive observation "real markets don't reach EMH efficiency" is also a description of real agents at their actual optimum - just not the EMH optimum.

The satisficing rule: stop spending effort when marginal benefit of effort equals marginal cost.

This is descriptive, not normative: it's a claim about which agents survive selection in friction-priced environments, not a claim about ideal rationality. Agents who miscalibrate their laziness function - in either direction - get outcompeted.[1]
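The satisficing rule can be sketched with invented numbers (a toy model, not from the post): benefit √e with diminishing returns, linear cost 0.25e, and the marginal quantities cross well before effort is exhausted:

```python
import math

def net_value(effort):
    """Toy model: benefit sqrt(e) (diminishing returns) minus cost 0.25 * e."""
    return math.sqrt(effort) - 0.25 * effort

# Marginal benefit 1/(2*sqrt(e)) equals marginal cost 0.25 at e* = 4,
# so the rational ("lazy") agent stops there, far below maximal effort.
best = max(range(1, 101), key=net_value)
print(best, net_value(best))  # 4 1.0
```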

Why rational agents are lazy

Grossman and Stiglitz (1980) showed that a perfectly efficient market can't fund its own price-discovery substrate: equilibrium has to sit short of the limit. LMH generalizes the move from information to effort.[2]

The laziness 2x2

The lazy-eager local-global 2x2

Rational agents satisfice their exploration budget against expected environmental volatility. The four cells of the 2×2 describe categories of agent behaviour: under-spending (lazy-global), over-spending in non-volatile environments[3] (eager-global), spending well but committing too hard (eager-local), and the historical winner[4]: spending modestly and committing modestly (lazy-local).

Lazy markets

In lazy markets, equilibrium rests on agent laziness. Both sides form an effort frontier: each invests more only when they think the adversary does. Credit-card fraud, low-volume markets, and cybersecurity all sit in lazy equilibria. Rational agents in adversarial fields satisfice their defence against the expected capability of their adversary instead of maximizing it, and the attacker satisfices symmetrically, aiming to just barely beat the current defence rather than maximize offensive capability. Agents who try to maximize either side end up overspending in a region of diminishing returns and get outcompeted by lazier rivals.
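One way to sketch such an effort frontier (my own illustration via a standard Tullock contest, not a model from the post): each side wins with probability proportional to its share of total effort, the prize is worth V, and effort costs 1 per unit. Iterated best responses settle at V/4 on each side; both parties rationally stop far short of maximal spending:

```python
import math

V = 100.0  # value at stake (invented number)

def best_response(other_effort):
    # Maximize (mine / (mine + other)) * V - mine over own effort;
    # the first-order condition gives mine = sqrt(other * V) - other.
    return max(math.sqrt(other_effort * V) - other_effort, 0.0)

attacker, defender = 1.0, 1.0
for _ in range(50):
    attacker = best_response(defender)
    defender = best_response(attacker)

print(round(attacker, 6), round(defender, 6))  # 25.0 25.0, i.e. V / 4 each
```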

When the effort frontier shifts

Once upon a time, a castle was the gold standard for area defence. Besieging was slow and costly and castles were otherwise nigh impenetrable. After explosives and artillery came in, some agents reacted with better walls, and it didn't work. Castles became a liability.

This describes an effort frontier shock: a change in the cost or capability landscape that invalidates the prior satisficing equilibrium. Drawing from the LMH, we can model these shocks.

Effort frontier shocks create opportunities to capture value through adaptation. Who benefits from this?

Adaptation requires slack. Unspent effort budget converts directly into adaptation capacity. The eager agents have already committed their efforts. (Other adaptation qualities are out of scope.[5])

Hypothesis: Lazy-local agents, by definition, hold more slack than eager agents. This makes them advantaged in adapting to opportunities after frontier shock, but they pay a cost for suboptimality under the old conditions the eager agents committed to.

Consequence: Effort frontier shocks:

  • disadvantage eager-local agents
  • advantage lazy-local agents
  • leave eager-global agents insulated but immobile: their commitments should weather frontier shocks, but they lack the slack to extract value from them
  • rarely matter for lazy-global agents, who seldom live to see a frontier shift. If one did, it would be in the best position: slack to extract value, and commitments insulated from the shock.

Elaboration: When the frontier moves, agents with high-commitment, low-monitoring investments get hit hardest - they're slower to detect the shift and slower to redirect once they do. Sunk-cost dynamics make this worse: the first instinct on noticing a shock is usually to double down on the existing commitment rather than abandon it. Castles got thicker walls before they got abandoned.

  • Example: Larger corporations and nations typically get hit harder by effort frontier shocks than small actors.
  • Example: Armies evolved from heavy central command to mission-command because the war frontier moves constantly, and high-monitoring central command wasn't technologically feasible.

Closing

When real agents who are winning and surviving look irrational, it often means we're not modelling their targets and cost functions in full. The interesting move isn't to scold the agents. It's to ask what the frontier looks like, and what would have to change for their satisficing point to land somewhere else.

When agents are miscalibrated, it's often easier to move the environment than to move the agents.[6]

Real markets are not irrationally lazy. They are lazily rational.

Appendix

Open questions[7].

Related literature[8].

  1. Over long enough time horizons. ↩︎

  2. See "Generalizing Grossman-Stiglitz from information cost to effort cost." in Open Questions. ↩︎

  3. Why not choose your strategy based on the expected volatility? Volatility detection is itself effortful. The calibration of optimal laziness is itself subject to LMH. ↩︎

  4. In complex, volatile environments, generalists with slack tend to win. Hyper-specialists (eager-local) dominate their narrow slot while conditions hold. Pure explorers (eager-global) rarely accumulate enough fitness to persist. The animals you've heard of are mostly lazy-locals: good-enough at local-enough tasks to exploit and survive, but carrying enough reserve capacity to ride out shocks rather than be caught rigid by them. (In the longer run they become generalists, because each shock that doesn't kill them incentivizes a new strategy.) (What about lazy-global?[9]) ↩︎

  5. Some adaptation qualities:

    • Execution qualities
      • Speed
      • Ability to mobilize
    • Properties of existing commitments:
      • cross-paradigm salvage value
      • Ability to liquidate
      • Optionality
    • Decisionmaking:
      • Effective scanning
      • Weakly holding to previous paradigms
        • Scout Mindset
    ↩︎
  6. See Astral Codex Ten: Society Is Fixed, Biology Is Mutable for a cross-domain parallel. ↩︎

  7. Open questions:

    • Current trends in effort frontiers:
      • What the LLM era does to
        • effort-floor institutions? (Zoning complaints, grant applications, text applications in general)
        • attacker-defender dynamics (Cybersecurity, physical security)
      • Other live frontier shifts: (Remote work? Renewable energy? Cryptocurrency?)
    • "Lazy economics":
      • Generalizing Grossman-Stiglitz from information cost to effort cost.
        • What survives, what breaks, and what new equilibria appear when the substrate being priced is the full stack (search, evaluation, execution, monitoring) rather than information acquisition alone? Sims' rational inattention covers part of this for attention specifically; the full generalization seems open.
      • Who pays for exploration?
        • Could generate interesting predictions on ecosystem composition
        • Implications for foundational research funding
        • How legibility of output correlates with local eagerness instead of global value
          • Do you actually get better foundational research if you measure proxies for the output value (citations, status, optimizing funding applications), or should you just give researchers budgets and let them do whatever?
        • Rational risk tolerance
      • Implications of real agents' calibrated laziness functions:
        • Correlated laziness and systemic fragility: when many agents satisfice against the same threat model, the resulting monoculture is locally stable and globally fragile (2008 risk models, monocultures, antibiotic regimes). LMH may have something to say about market-level fragility that EMH-plus-friction doesn't. Future work.
      • Performance of eager-global agents under effort frontier shifts and value frontier shifts
        • Does the hypothesis hold that eager-global strategies are stabler over effort frontier shift but unstable over value frontier shifts?
          • Do some agents deliberately aim at deeper values to outcompete agents targeting shallower ones - trading short-term efficiency for shock-resistance across value-frontier shifts?
        • See "Layers of value globality"
      • Laziness-function updates:
        • How to affect agent laziness?
        • When do agents shift their positions in the 2x2 (possibly within their quadrant, but still relevantly)? Can a lazy-local agent 'ascend' to a lazy-global agent if they happen to get lucky and find a good global gold vein to extract from? Can an eager agent weather a shock and realize they need more slack?
          • Also see "Is lazy-global a selectable strategy?"
    • Lazy philosophy:
      • How do agents' time discounting (affects local vs global) and laziness functions (eager vs lazy) relate to each other?
      • Modelling the effects of effort frontier shifts (cost/capability landscape changes) versus value frontier shifts (what counts as valuable changes):
        • Layers of value globality.
          • Shallow-global values: Values that may shift based on megatrends or regulation.
          • Medium-global values: Objectives that have stayed constant over capitalism
          • Deep-global values: existential safety
        • Can agents aim for deeper values to outcompete agents with shallower objectives?
      • Implications for values of agents:
        • LMH applications to value drift (of individuals, of organizations)
        • LMH applications to why addressing existential risk is irrational from the perspective of most real economic agents.
          • agent-rationality question: With rationally lazy modelling, LMH-rational agents won't fund x-risk because the individual cost-benefit math fails.
            • Implications for what kinds of coordination or externalization could change this.
          • model-coverage question: x-risk is a deep-global value, and the layers-of-globality extension is what's needed to make the framework able to talk about it at all.
      • Where are the lazy-global agents?
        • Is lazy-global a selectable strategy?
          • Hypothesis: You cannot aim at lazy-global, but you can recognise it. Can we only recognise lazy-global in retrospect?
        • Does evolution-at-scale effectively produce lazy-global outcomes even though no individual round of selection aims at them? If selection is locally lazy-local but the surviving distribution is globally weighted toward lazy-global, what does that tell us about which other systems (markets? cultures? research ecosystems?) might have the same structure?
    ↩︎
  8. Related literature:

    ↩︎
  9. Almost no successful agent starts at lazy-global. They start lazy-local (or eager-local), and then luck out. Cyanobacteria didn't select for their waste product to restructure the planet's atmosphere. They were just metabolizing. The "global" part is a retroactive reclassification by an observer who knows where the story goes. ↩︎




Announcing ILIAD 2026

2026-04-11 15:02:10

We are pleased to announce ILIAD 2026—a 5-day conference bringing together researchers to build scientific foundations for AI alignment. This is the 3rd iteration of ILIAD, which was first held in the summer of 2024.

***Apply to attend by July 1!***

  • When: Aug 3-7, 2026
  • Where: @Lighthaven (Berkeley, US)
  • What: Unconference with participant-led programming, 150-200 attendees. 
  • Who: Currently confirmed attendees include Richard Ngo, Abram Demski, and Adam Goldstein
  • Costs: Tickets are free. Financial support for travel and accommodations is available on a needs basis. 

See our website here. For any questions, email [email protected]


About ILIAD

ILIAD is a 100+ person conference about alignment with a mathematical focus. ILIAD will feature an unconference format—meaning that participants can propose and lead their own sessions. We believe that this is the best way to release the latent creative energies in everyone attending.

The ILIAD conferences are the premier gathering for all things theory and alignment. Topics include: Singular Learning Theory, Agent Foundations, Causal Incentives, Computational Mechanics, Safety-by-Debate, Scalable Oversight, and more.

To get a better sense of what might happen, you can view the schedule for ILIAD 2024 here.

Financial Support

Financial support for accommodation & travel is available on a needs basis. We may not be able to offer support to everyone who requests it.





Have we already lost? Part 3: Reasons for Optimism

2026-04-11 14:58:07

Written very quickly for the Inkhaven Residency.

As I take the time to reflect on the state of AI Safety in early 2026, one question feels unavoidable: have we, as the AI Safety community, already lost? That is, have we passed the point of no return, after which AI doom becomes both likely and effectively outside of our control?

Spoilers: as you might guess from Betteridge’s Law, my answer to the headline question is no. But the salience of this question feels quite noteworthy to me nonetheless, and reflects a more negative outlook on the future. 

I’ve previously laid out the “plan” as of 2024. I’ve also explained why I (and others around me) have become more pessimistic about the plan. 

Today, I’ll talk a bit about why the answer to the headline question is no. Tomorrow I’ll outline what I think a new plan is, and what we should do. 

Silver linings to reasons for pessimism

First, I'll cover two silver linings to the reasons for pessimism from yesterday.

Fast AI progress brings many concerns into “near mode”. In the past, many of the concerns that people raised about both the potential power and potential risks of AI were more abstract, and so easy to dismiss. Nowadays, we have many more concrete examples of both capabilities and risks than in 2024. For example, in 2024, the demonstrations of risks tended to be academic examples about jailbreaking models to do undesirable things. In 2026, there are plenty of examples of imminent risks, including Anthropic’s new Mythos model, which they are not publicly deploying due to concerns about its use to exploit security weaknesses in software. 

People tend to be much more reasonable when it comes to concrete concerns than abstract arguments.

US antagonism toward European countries increases the number of “live players”. I think it’s fair to say that in 2024, it would have been unthinkable for European countries to deploy troops to Greenland to defend it against the US. It might’ve been fair to assume that, insofar as the US government attempted to initiate an intelligence explosion, European countries would hesitate to take any actions that could be seen as provocative. Now, it seems much more likely that they’d be willing to take even quite drastic actions against the US if their leaders became convinced of catastrophic risks from artificial intelligence.

Reasons for optimism

I'll start by covering reasons for optimism from 2024 that continue to hold, then talk about two new reasons for optimism, relative to my view from 2024.

Continued reasons for optimism

Most people continue to not want to die to misaligned AI. Thankfully, the majority of people working in AI (both in developers and in policy) are not psychopaths who care not at all for other humans, nor hardcore successionists who want to replace humans with (unaligned) AIs. Even if people might be incentivized to take on levels of risks that would be unacceptable to others, I suspect no major actor would knowingly attempt to launch a misaligned superintelligent AI out of spite or malice, and most would act to oppose this. 

The US public continues to be incredibly skeptical of AI and big tech. Measures such as the (controversial in tech circles) SB 1047 were broadly supported by the public. To be clear, the US public is not skeptical of AI because of existential or catastrophic risk reasons, but instead mundane reasons like power usage and worker displacement. Nonetheless, there seems to be a substantial desire from voters from both parties to slow down the rate of dangerous AI development, and it remains likely that people will be supportive of future policy actions in this area. 

Government-sponsored AISIs and safety teams at frontier developers continue to exist; many have expanded. In 2024, government institutions such as UK AISI were relatively new, as were the safety teams at some of the non-OpenAI/Anthropic AI developers. In 2026, despite much drama and turmoil in some cases, most of these teams still exist.

New reasons for optimism

Anthropic continues to be competitive in the AI race. In 2024, it was widely believed that Anthropic lagged behind OpenAI in the quality of their models (remember that Opus 3 was a slightly-above-GPT-4-tier model released a full year after GPT-4). There was also substantial doubt that Anthropic would be able to remain competitive over time, given their significant compute disadvantage relative to other developers including OpenAI, Meta, Google DeepMind, and xAI. Theories of change that depended on Anthropic controlling some of the best AI models in the world were considered suspect as a result, let alone theories of change where Anthropic maintains a 6-month to 1-year lead going into superintelligence.

In 2026, Anthropic is widely considered to be competitive with OpenAI in terms of the quality of their best models, despite a continued compute disadvantage. Some of the other developers with substantially more compute, such as xAI or Meta, seem to be struggling to create similarly capable models. The same theories of change now look substantially more plausible. 

Empirical, wing-it style alignment and control extended further than some expected. Despite rising amounts of evaluation awareness, it continues to seem plausible that many eval results generalize to reality. Models are pretty bad at taking covert actions with minimal amounts of prompting. The chains of thought of at least the OpenAI models (and plausibly the Anthropic and GDM models) continue to be relatively honest and useful for monitoring.[1] Most importantly: despite substantial amounts of scaling, we haven’t seen the rise of coherent, goal-directed agency toward particular goals, nor have we seen attempts at deliberate sabotage of lab processes. Even if the empirically-derived techniques we have may not generalize into the future, they’ve continued to work through higher levels of capability than some thought they would. 

  1. ^

    Of course, there are active concerns that this is a fragile property, and there have always been questions about the extent of its usefulness. However, people are broadly aware of this, and at least OpenAI (if not also Anthropic) seems to be making serious efforts to preserve the usefulness of CoT for monitoring.




Dario probably doesn't believe in superintelligence

2026-04-11 13:58:14

Epistemic status: I think the headline claim is true, and that the evidence within is actually quite strong in a bayesian sense, but don't think the post itself is very well written or particularly interesting. But I had to get 500 words out! I think the 2013 conversation is interesting reading as a piece of history, separate from the top-level question, and recommend reading that.

I think many people have a relationship with Anthropic that is premised on a false belief: that Dario Amodei believes in superintelligence.

What do I mean by "believes" in superintelligence? Roughly speaking, that the returns to intelligence past the human level are large, in terms of the additional affordances they would grant for steering the world, and that it is practical to get that additional intelligence into a system.

There are many pieces of evidence which suggest this, going quite far back.

In 2013, Dario was one of two science advisors (along with Jacob Steinhardt) that Holden brought along to a discussion with Eliezer and Luke about MIRI strategy. A transcript of the conversation is here. It is the first piece of public communication I can find from Dario on the subject. Read end-to-end, I don't think it strongly supports my titular claim. However, there is this quote:

Dario: No, but the claim that I and maybe Jacob are implicitly making is that maybe a large fraction of the potentially world­ending AIs end up making this mistake first so and so we aren't threatened in the first place.

This is not the kind of sentence you say if the concept you have loaded in your head is the same concept I have for "superintelligence". I do not think the context particularly rescues it.

In 2016, Dario was first author on Concrete Problems in AI Safety. I understand that this was an academic publication. Nevertheless, I think this passage is suggestive:

There has been a great deal of public discussion around accidents. To date much of this discussion has highlighted extreme scenarios such as the risk of misspecified objective functions in superintelligent agents [27]. However, in our opinion one need not invoke these extreme scenarios to productively discuss accidents, and in fact doing so can lead to unnecessarily speculative discussions that lack precision, as noted by some critics [38, 85]. We believe it is usually most productive to frame accident risk in terms of practical (though often quite general) issues with modern ML techniques. As AI capabilities advance and as AI systems take on increasingly important societal functions, we expect the fundamental challenges discussed in this paper to become increasingly important. The more successfully the AI and machine learning communities are able to anticipate and understand these fundamental technical challenges, the more successful we will ultimately be in developing increasingly useful, relevant, and important AI systems.

The next relevant pieces of evidence are from a panel he was part of at EAG 2017, "Musings on AI" (yt link). Here there are multiple relevant quotes (bolding mine):

Michael Page: So one thing we talk a lot about in this community is the risks associated with developing advanced AI. But obviously, there are also a lot of benefits associated with developing advanced AI. This is a bit of a cheeky question, but one way of putting it is: Are you all more concerned about developing advanced AI or not developing advanced AI?

Dario Amodei: I think I'm deeply concerned about both. Um, so on the not developing advanced AI, you know, one observation you can make is that modern society, and particularly a society with nuclear weapons, has only been around for about 70 years. There have been a lot of close calls since then, and things seem to be getting worse. You know, if I look at kind of the world and geopolitics in the last few years, China is rising. Um, you know, there's a lot of unrest in the Western world, a lot of very destructive nationalism. Um, you know, we're developing biological technologies very quickly. It's not entirely clear to me that civilization is compatible with digital communication. Um, you know, it has really some subtle corrosive effects.

So every year that passes is a danger that we face, and although AI has a number of dangers, actually I think, you know, if we never built AI, if we don't build AI for 100 years or 200 years, I'm very worried about whether civilization will actually survive. Um, of course, on the other hand, you know, I work on AI safety, and so I'm very concerned that transformative AI is very powerful and that bad things could happen, either because of safety or alignment problems, or because there's a concentration of power in the hands of the wrong people, the wrong governments, who control AI. So I think it's terrifying in all directions, but not building AI isn't an option because I don't think civilization is safe.

Those are not the kinds of sentences you say if you have the same "superintelligence" pointer as me, and you think that is what is actually at the end of the tunnel (as opposed to being a possible but pretty low likelihood outcome).

Dario Amodei: So here's a one-liner. Uh, AI doesn't have to learn all of human morality. So, this is something actually that even at this point MIRI agrees with, but there's some writing in the past, a lot of writing in the past, that I think people are still anchored to in many ways. So what I mean by that in particular is that, you know, I think the things we would want from an AGI are to, you know, to kind of stabilize the world, to end material scarcity, to give us control over our own biology, maybe to resolve international conflicts. We don't want or I think need to, you know, kind of build a system—at least not build a system ourselves—that, you know, kind of is a sovereign and kind of runs the world for the indefinite future and controls the entire light cone. So of course you still need to know a lot of things about human values, but I still often run into people who are kind of thinking about this problem in a way that I think is actually harder than we probably need to solve.

Those are, admittedly, the sentences of someone who might have the same "superintelligence" pointer as me, but not someone who has asked themselves how to get to a safe point and then stop without accidentally crossing a dangerous threshold (or letting anyone else do so).

And then, alas, we have Machines of Loving Grace (2024).

Although I think most people underestimate the upside of powerful AI, the small community of people who do discuss radical AI futures often does so in an excessively “sci-fi” tone (featuring e.g. uploaded minds, space exploration, or general cyberpunk vibes). I think this causes people to take the claims less seriously, and to imbue them with a sort of unreality. To be clear, the issue isn’t whether the technologies described are possible or likely (the main essay discusses this in granular detail)—it’s more that the “vibe” connotatively smuggles in a bunch of cultural baggage and unstated assumptions about what kind of future is desirable, how various societal issues will play out, etc. The result often ends up reading like a fantasy for a narrow subculture, while being off-putting to most people.

...

My predictions are going to be radical as judged by most standards (other than sci-fi “singularity” visions2), but I mean them earnestly and sincerely.

2I do anticipate some minority of people’s reaction will be “this is pretty tame”. I think those people need to, in Twitter parlance, “touch grass”. But more importantly, tame is good from a societal perspective. I think there’s only so much change people can handle at once, and the pace I’m describing is probably close to the limits of what society can absorb without extreme turbulence.

...

We could summarize this as a “country of geniuses in a datacenter”.

Clearly such an entity would be capable of solving very difficult problems, very fast, but it is not trivial to figure out how fast. Two “extreme” positions both seem false to me. First, you might think that the world would be instantly transformed on the scale of seconds or days (“the Singularity”), as superior intelligence builds on itself and solves every possible scientific, engineering, and operational task almost immediately. The problem with this is that there are real physical and practical limits, for example around building hardware or conducting biological experiments. Even a new country of geniuses would hit up against these limits. Intelligence may be very powerful, but it isn’t magic fairy dust.

Second, and conversely, you might believe that technological progress is saturated or rate-limited by real world data or by social factors, and that better-than-human intelligence will add very little. This seems equally implausible to me—I can think of hundreds of scientific or even social problems where a large group of really smart people would drastically speed up progress, especially if they aren’t limited to analysis and can make things happen in the real world (which our postulated country of geniuses can, including by directing or assisting teams of humans).

I think the truth is likely to be some messy admixture of these two extreme pictures, something that varies by task and field and is very subtle in its details. I believe we need new frameworks to think about these details in a productive way.

...

10Another factor is of course that powerful AI itself can potentially be used to create even more powerful AI. My assumption is that this might (in fact, probably will) occur, but that its effect will be smaller than you might imagine, precisely because of the “decreasing marginal returns to intelligence” discussed here. In other words, AI will continue to get smarter quickly, but its effect will eventually be limited by non-intelligence factors, and analyzing those is what matters most to the speed of scientific progress outside AI.

One possibility suggested by this essay is that Dario does not really believe in superintelligence (the hypothesis we are currently examining). Another is that he does, but has chosen to dissemble for strategic purposes. While I don't think Dario is above communicating strategically, I do in fact think this is roughly his mainline worldview, and it follows pretty clearly from his beliefs in 2017. Maybe there are other possibilities apart from those two, though I haven't figured out what they might be.

The Adolescence of Technology (2026) also contains many relevant details, which I will fail to quote.

Beyond textual evidence, let me include a few other lines of evidence:

  • Dario has paid substantial political costs by making an enemy of the Trump administration on the subject of chip export controls (and then, later, Acceptable Use Policies, though this one is weaker because failing to hold the line here could plausibly have cost him more in terms of employee morale/turnover than he lost from the DoW's enmity). This is strongly consistent with being quite worried about misuse risks, particularly from authoritarian governments.
  • I have spoken to or been privy to conversations with multiple Anthropic employees where this subject came up. Several employees confirmed (paraphrasing) that Dario was not as ASI-pilled as they were, and I have yet to hear any employee object that no, Dario does actually expect to live to see strong nanotech and Dyson spheres, and that these concerns are fundamental to how he orients to Anthropic's mission, the potential risks and benefits involved, how to communicate these beliefs to the public, etc. (I welcome correction on this point from other Anthropic employees, or from Dario himself, if I am wrong.)
  • The one time Dario gave an actual estimate, a 10-25% chance of catastrophic outcomes, it included multiple sources of risk. My guess is that misalignment risk is well under half of that. This is technically orthogonal to believing in superintelligence, but in practice I think these beliefs are related.



Why Nothing Ever Happens

2026-04-11 13:11:37

A cross-post from my personal website, Chasing Sunsets.

Epistemic note: This is a long post that applies two models of work to the government context. I've found those models to be useful in that context and elsewhere, but the arguments I present are debatable, and I'm very open to discussing them.

I’m going to tell you a story. It is not a true story. Please ignore that fact; you can argue with me later if it seems relevant.

In The Games I Play, I mentioned that I used to play Cities: Skylines, a game whose objective is to build as large a city as possible. It’s a difficult game to play.

Your city begins with a single highway exit. You connect a road to it, a simple two-lane road because that’s all you have the money to build. You zone that road for houses. People move into those houses, but those people need work, so you build commercial and industrial regions nearby. Commercial zoning might be right along your main street, but you can’t put your industry right next to people’s homes because they’ll complain about noise pollution if you do.

This is your first rule.

Your city develops. You add a water pump, then a sewage pipe, which has to be far downstream of the pump because your population will get sick otherwise. You unlock elementary schools and build one for every hundred kids in the city. Your main intersection gets blocked by traffic because it’s used for industrial trucks and residents’ cars, so you give each new industrial area a highway exit. You destroy a couple homes to add parks in one neighborhood because its residents are unhappy. You add fire stations every ten blocks. There’s traffic again between the residential and commercial parts of the city, so you add metro lines in a clockwise and counterclockwise loop around the city. You try adding high-density housing, but it gets too loud and happiness levels decline.

Your residents still need jobs. Not the commercial jobs—those need higher education levels and you don’t have a university in the city yet. But you can’t build an industrial area. One area might work, but it’s not near any major highways and trucks couldn’t get out of the city. The city’s only other undeveloped area would end up polluting a residential zone. You can’t condense that residential zone because any more high-density housing will make homeowners unhappy. Besides, doing that would cut off your existing metro stations, and you’re not sure you have enough money to move them.

As a result, people start leaving the city. You try to bring them back—you add recycling service, build more parks, allow them to smoke weed—but nothing seems to work. Abandoned homes line your blocks; new residents move in every so often, only to demand industrial jobs that don’t exist. Tax revenue declines, and your population plummets. Frustrated, you turn off the game and go back to your math homework.

If only you could build an industrial zone.

How Stuff Gets Done

I recently read One Day Sooner and Never Drop A Ball, a pair of LessWrong posts that present a dichotomy in how organizations function. They’re worth reading in full if you have time. If you don’t have time, I’ll briefly summarize each one.

One Day Sooner is a mode of working that aggressively pursues a single goal. The canonical example is vaccine trials, where organizations like 1DaySooner work to get vaccines out as quickly as possible. The idea is that some goals, like getting vaccines to the public or eliminating a bottleneck in a project, are so important that they should be pursued immediately. The post describes how to get things done One Day Sooner, typically by pushing through unnecessary slowdowns like “let’s schedule this meeting three days from now” or “I need a break from work tonight.”

Never Drop A Ball is a mode of working that pursues several goals or projects and ensures none of them are lost in the shuffle. Where a startup founder aiming for One Day Sooner might spend a sixty-hour week working on their core project, a founder aiming to Never Drop A Ball will carefully keep track of every bit of work, ensuring things like taxes and floor cleaning are properly handled. Notably, the failure mode of Never Drop A Ball work is that balls need to be dealt with; if they take too much work to resolve or there are too many to handle, balls will begin to drop.

The difference between these modes is how many balls they handle. Aiming for One Day Sooner means juggling one or two balls, typically ones that are very difficult to resolve, and trying very hard to resolve them. Aiming to Never Drop A Ball means juggling many balls—and, importantly, juggling new balls that come up—which tend to be easy to resolve and thus easy to miss. The number of balls and the likelihood of being assigned new balls are sliding scales, but it’s often easier to think of them as a binary. As anyone who has worked on a project can tell you, dealing with a small number of balls is associated with a low likelihood of being assigned new balls because new balls tend to be distracting.

In the rest of this essay, aiming for One Day Sooner will be referred to as “mission mode,” while aiming to Never Drop A Ball will be referred to as “fallback mode.”

How Stuff Gets Done in Government

Governments are intended to do things. These things can be thought of as balls.

There are a lot of balls floating around in government.

Governments as a whole act in fallback mode. This is because governments do many, many things and can never make a mistake. A single misapplication of power—say, a police officer making an unusually dangerous arrest and accidentally killing George Floyd—could result in nationwide protests and shake the foundation of the government’s power. In the words of Thomas Jefferson:

Governments are instituted among Men, deriving their just powers from the consent of the governed, [and] whenever any Form of Government becomes destructive of these ends, it is the Right of the People to alter or to abolish it, and to institute new Government…

Because of this fragility, governments create all sorts of rules to prevent making mistakes. This is the origin of the endless bureaucracy of the U.S. government. It is extremely frustrating. It is also written in blood. Every rule, regulation, and system is created for a reason; changing it might make something more efficient, but it might cause something else to fail, which is the absolute worst thing that can happen to a government.

Unfortunately, this falls squarely into the failure mode of never dropping a ball. The government becomes consumed with keeping things maintained and functional rather than making progress on new ideas. 

Imagine you’re an official working on energy policy in West Virginia. A declining population means fewer people to fill jobs in power plants. Ancient coal plants are failing left and right as infrastructure becomes dilapidated. You’re trying to get new gas plants online, but it’s hard to find the money or the people. You’re so focused on keeping energy flowing that installing clean energy and reducing emissions aren’t even on your radar. (Recent budget cuts certainly aren’t helping.) You can’t fix climate change, and neither can anyone else. It’s easier to simply not care about it at all. So you let the clean energy ball drop. So does everyone else in the department. And no one ever picks it up. 

Now, Congress could certainly pass a law requiring that employees conform to climate change standards—say, implement 100% renewable energy by 2050. But that doesn’t change the underlying situation where the employee has far too much going on already. If they’re required by law to hold this ball, something else might drop. That something else might be their main goal, which is providing and maintaining energy in the country. Congress will claim to be making progress; they are only delaying the problem.

The standard business solution to this problem is to hire more people to spread out the workload. But this new workload might require two, five, or twenty new employees, which requires new meetings for coordination, new managers, new company infrastructure…all the things. It’s inefficient and expensive.

Alternatively, you could leave those people alone and create a new set of people—Department #2—dedicated to reaching the clean energy goal without hurting Department #1’s efficiency. They work on their own, keeping Department #1 updated on their progress with recommendations for preparing for the transition. Department #1 can focus on their project and communicate their needs to Department #2. Because Department #2 is entirely focused on meeting the climate change goal, they will never drop the ball on it.

Department #1 is in fallback mode; Department #2 is in mission mode.

Why Federalism Works

You might recall from history class that the original thirteen U.S. states began as mostly self-governing colonies. The operative term here is salutary neglect, a policy under the British government where, as long as the colonies fed the British government money, they would be left to their own devices. This is a surprisingly effective policy, and it’s why many empires (Aztec, Mongol, Inca, Sumerian, Panem) operated under the same principles: they collected tributes from the territories they governed but otherwise left them mostly alone.

Consider a local government that has to submit a tribute to its empire. That’s a single ball that they need to care about dropping, and it’s a relatively easy one to maintain—just subtract 10% from your yearly revenue and operate as if it doesn’t exist. It’s annoying, certainly, but you can live with it. A human tribute is even easier to manage—just designate a few people to organize the selection process when it comes around. It doesn’t affect your work because you’re not the ones to blame.

In exchange, there are some messy parts of governance that the local government would prefer not to take care of. The early U.S. government had just a few purposes: manage a military, manage a treasury, and deal with international affairs, among others. Militaries are annoying to maintain, rarely used, and better in large numbers, so it makes sense to spread them across a larger population. Commerce is easier when it’s uniform across a large population, so the states are happier when the national government runs finances. It’s hard to manage relations with countless countries across the world, especially for a small government; better to leave that to a larger government that can designate more resources to it. 

When the U.S. Constitution was created, it expressly gave the government these powers and nothing more:

The powers not delegated to the United States by the Constitution, nor prohibited by it to the States, are reserved to the States respectively, or to the people.

– Tenth Amendment to the U.S. Constitution

In this sense, state governments act as Department #1, and the national government acts as Department #2. Department #1 juggled a lot of balls and found that a few were quite annoying, so it created Department #2 to take care of those. Department #2 takes care of those balls and basically nothing else, allowing Department #1 to do its job with relatively little interruption. 

The states are in fallback mode; the national government is in mission mode.

Why Federalism Doesn’t Work

The above section discusses how federalism is supposed to work. Unfortunately, that’s not what happens in the U.S. political system.

The federal government has massively grown in its power since its inception, largely because it has been given more balls to manage or it has decided to manage them itself. The passage of the 13th and 14th Amendments explicitly gave the government the power to regulate race issues. The 15th and 19th Amendments allow the government to protect minority and women voters in election practices, which were left to the states under the Constitution’s Election Clause. The New Deal and its accompanying Supreme Court decisions vastly expanded the Constitution’s Interstate Commerce Clause to include commerce within states (!!), getting the national government intimately involved in agriculture and other economic regulation. Programs like Medicare, Medicaid, and the Affordable Care Act get the government involved in healthcare. The Clean Air Act and Clean Water Act get the government involved in environmental regulation through the EPA.

I’d imagine many Americans are surprised to hear that the Tenth Amendment exists. We hear about the national government much more than the state government, and it’s generally assumed that major policies will be decided at the national level. Typically, policy-related protests ask Congress to take on more balls and add new regulations—legalize/ban abortion, decriminalize marijuana use, end/allow affirmative action. Recent bills have even proposed simply banning state legislation, such as the proposed 10-year moratorium on state AI legislation. States tend to be focused on localized day-to-day items like providing energy and education, but even these are subject to national oversight.

This is part of why scholars like the writers of Abundance think it’s so hard to do things in the U.S. So many blanket national regulations exist—limit to this environmental impact, protect these union interests, subject to these inspections—that complying with all of them is expensive and nearly impossible. To the best of my knowledge, that’s much of what has delayed California’s high-speed rail program and New York’s Second Avenue Subway.

Interlude: How Stuff Gets Done as Gradient Descent

Policymakers are trying to find the set of policies that produces the optimal outcome for some reward function, typically net happiness for their constituents. Most policy changes come with some tradeoff—for example, increasing defense spending might make people safer and thus feel happier, but will force them to pay more in taxes and thus feel sadder. When policymakers act in good faith, they debate these tradeoffs to find the ones that will maximize their reward function.

Anyone who has studied gradient descent will tell you that the obvious failure mode is entering a local minimum. With climate policy, for example, people will be happier if greenhouse gas emissions slow down and CO2 levels are reduced, but there’s an annoying intermediate step where people have to live with high CO2 levels while also suffering the increased costs of reducing emissions. In a more familiar example, drivers will always be happier after road repairs are made, but they have to live with the negative impacts of construction in the meantime. Each of these examples is one dimension of the enormous optimization game that policymakers play. 
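The local-minimum trap is easy to see in code. Below is a minimal sketch (mine, not the post's; the loss function and all constants are invented for illustration) of gradient descent on a one-dimensional landscape with two basins: whichever basin the descent starts near is the one it settles in, which is the essay's picture of policymakers stuck at a suboptimal equilibrium because every small step away makes things worse first.

```python
# Toy illustration (not from the post): a 1-D "unhappiness" landscape
# with two basins. Plain gradient descent settles into whichever basin
# it starts near, so a start near the worse basin never reaches the
# better optimum at x = -2.

def loss(x):
    # Global minimum at x = -2 (loss 0); a worse local minimum near x = 0.81.
    return (x - 1) ** 2 * (x + 2) ** 2 + 0.5 * (x + 2) ** 2

def grad(x, h=1e-6):
    # Central-difference derivative; good enough for a sketch.
    return (loss(x + h) - loss(x - h)) / (2 * h)

def descend(x, lr=0.01, steps=2000):
    for _ in range(steps):
        x -= lr * grad(x)
    return x

print(descend(1.5))   # ends near 0.81: stuck in the worse basin
print(descend(-2.5))  # ends near -2.0: finds the global optimum
```

Random restarts, momentum, or annealing are the standard escapes in optimization; the essay's point is that a burst of political energy plays the analogous role in policy.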

I have two examples of this effect. Note that the first contains spoilers for Jaws. If you haven’t seen it, go watch it—it’s fifty years old.

One of the deaths in Jaws is caused by the mayor’s decision to open Amity Island’s beaches on July 4th, knowing that the killer shark likely hadn’t been caught. The mayor made that decision because he considered the decision to close the beaches unthinkable. Doing so would cause so much backlash and protest that it simply wasn’t an option, especially for a mayor desperately trying to be re-elected. The optimal decision, closing the beaches for safety, was so uncomfortable that the mayor simply deluded himself into believing it was unnecessary. That discomfort was the price of a better outcome.

A decade and a half after the release of Jaws, the Soviet Union began its collapse. When it finally dissolved in 1991, Russia and many former Soviet countries adopted a policy of “shock therapy”: they knew that a transition to capitalism would cause some damage, so they tried to do it as fast as possible to avoid that intermediate step. After significant declines in economic productivity, every former Soviet country rebounded and began growing under the capitalist model. That decline was the difficult intermediate step preceding greater long-term stability.

TANGENT: This doesn’t change the validity of the model, but I should note that the transition failed to produce stable capitalist democracies. The countries that saw the greatest “shock” elected populists and autocrats who abandoned reformers’ democratic aims. Vladimir Putin came to power after ten years of sustained economic decline in Russia, the country arguably hit hardest by reforms. In contrast, the greatest economic success stories come from countries like China and Poland that underwent slow change with stable institutions throughout. (That argument comes from Andrew Walder, one of my professors this quarter. I think he’s probably right.)

Making Federalism Work

Back to government structure. As a reminder, I’m arguing that the federal government is spread too thin to act on important missions, as it was intended to do.

I wrote the above sections of this post without having a solution in mind. As such, the rest of this essay is relatively weak. I’d love to discuss this section more and find better solutions.

We find effective mission-style governance in many other contexts, especially in international governance. Despite its many flaws, the U.N. has been shockingly effective at establishing international standards that improve the world’s quality of life. The Montreal Protocol has turned the depletion of the ozone layer into a complete non-issue. The Geneva Conventions and Universal Declaration of Human Rights have led to massive reductions in war crimes and human rights violations. UNICEF, UNHCR, and the World Food Programme have done wonders in protecting the world’s most vulnerable populations. The U.N. hasn’t stopped great power wars or solved poverty, but it’s made progress on many missions that wouldn’t be addressed otherwise.

Similarly, many single-focus organizations have absolutely accomplished their missions. Consider IATA, the International Air Transport Association, which has standardized international air travel to an incredible degree. Or the International Telecommunication Union, which coordinates radio frequencies, satellite orbits, and telecommunication standards so well that communication across the world is always possible.

I’d like to apply those ideas to domestic governance.

I recently came across Saikat Chakrabarti’s campaign for the House seat currently held by Nancy Pelosi. I don’t think that campaign will be successful, and Manifold doesn’t think so either. However, I think he’s going about governance the right way. His plan for governance comes from his time with a think tank called New Consensus, where he developed the so-called Mission for America, a New Deal–type program intended to provide an economic boost and avert global warming. The plan proposes “national missions” in twenty different sectors, from EVs to shipping to geothermal energy, that would provide the energy needed to modernize those sectors and advance the economy. 

Following the gradient descent idea, Chakrabarti’s thesis is that the U.S. is stuck in a local minimum. Policymakers avoid making changes because they don’t want to deal with the uncomfortable transition period that precedes a new equilibrium. Making well-informed changes as fast as possible allows the country to avoid that middle ground.

I think the Mission for America does climate policy the right way. Most current climate policy involves regulations—use x% clean energy, reach net zero emissions by 2040 or 2050 or 2075, pay an x% carbon tax on fossil fuels. These get the ball rolling in the right direction, but they’re unpopular because they draw on sustained goodwill from everyone they impact. Carbon pricing in Australia and Canada has produced massive resistance, including from my own grandmother. It also draws significant resistance from corporate interests, who can use sustained influence to slow down and build opposition to small and potentially unpopular policies.

In contrast, a “national mission” style of policy uses its energy all at once to get past sustained resistance and make real progress. Because it moves faster, it’s harder to oppose, and the effects it brings can be seen much sooner. (Relatedly, I’m not a fan of recall elections because I’m a fan of letting policies play out. Missions commit to policies so that it’s much harder to recall them. That failed in post-Soviet Russia, and look what’s happened since.) Having energy available means policymakers can use tools other than basic regulations and create systems that work with existing ones rather than create friction. 

This mission framework is very much applicable outside of climate policy. For example, it’s roughly what Zohran Mamdani is doing in New York City. His administration has an enormous amount of energy behind it, which allows him to make policy changes that would rarely happen normally—building massive new housing blocks, providing free city services, freezing rent. Doing all that requires breaking some “rules” of policy, e.g., allowing the market to determine prices on its own.

It remains to be seen what Mamdani’s impact will be. He hasn’t made sweeping changes yet, but he’s made impressive progress on day-to-day items by treating them as missions. His administration had an impressive response to a snowstorm a couple months ago, it’s been on a pothole-fixing spree in recent months, and it’s secured guarantees from the state government to expand child care offerings. Despite his meetings with Trump, his administration isn’t trying to address nationwide political issues (which, to be fair, is true for many city governments). 

I’m very much okay with that. It’s arguable that many of the pitfalls in American governance stem from its structure, with separate local, state, and national governments that can each make policies on the same issues. Every level of government needs to focus on not dropping any balls, despite not having the resources to do so. Regulations and funding sources appear arbitrarily at every level of government without much coordination. As a result, potholes never get fixed, high-speed rail never gets built, and climate issues go unaddressed. 

I don’t support all of Mamdani’s policies, but I was excited to see him be elected because I hoped to see New York as a laboratory for active Democratic governance. So far, it looks like that style of governance has proven successful. I hope that’s not the last we’ll see of it. I’d like to see every level of government using missions to address their own issues: city governments fixing potholes and clearing streets, state governments providing education and social protections, and the national government addressing international economic and climate issues. Maybe then we can finally fix America’s failing bureaucracy. 



