
AGI and the structural foundations of democracy and the rule-based international order


Published on January 1, 2026 12:07 PM GMT

Summary: This post argues that Artificial General Intelligence (AGI) threatens both liberal democracy and rule-based international order through a parallel mechanism. Domestically, if AGI makes human labor economically unnecessary, it removes the structural incentive for inclusive democratic institutions—workers lose leverage when their contribution is no longer essential. Internationally, if AGI gives one nation overwhelming productivity advantages, it erodes other countries' comparative advantages, reducing the benefits of trade and weakening incentives to maintain a rule-based world order. The post draws historical parallels to early 20th century concerns about capital concentration, distinguishes between "maritime" (trade-dependent) and "continental" (autarkic) power strategies, and discusses what middle powers like the EU might do to remain relevant. The core insight is that both democracy and international cooperation rest on mutual economic dependence—and AGI could eliminate both dependencies simultaneously.

Read this if you're interested in: AGI's geopolitical implications, how economic structures shape political systems, the future of liberal democracy, or strategic options for countries that won't lead in AGI development.

Epistemic status: fairly speculative and likely incomplete or inaccurate, though with a lot of interesting links.

Introduction

The Effective Altruism community has long acknowledged the risks of AGI, especially those related to loss of control, for instance via gradual disempowerment. Less attention has been paid to stable totalitarianism (AI-powered totalitarianism that could more easily enforce large-scale surveillance) and to extreme power concentration, where a handful of companies or countries might amass a much larger degree of power and challenge the concept of liberal democracy.

This post examines the last of these risks—extreme power concentration—but not through the lens of a coup or sudden takeover. Instead, I focus on structural forces that create incentives for liberal democracy and rule-based international order, and how AGI might erode both simultaneously through a parallel mechanism.

Here's my core argument: Both liberal democracy and rule-based international order rest on structural incentives created by mutual dependence. Internally, the need for human labor creates incentives for inclusive institutions. Externally, the benefits of trade based on comparative advantage create incentives for rules-based cooperation. AGI threatens to weaken both dependencies simultaneously—reducing the value of human labor domestically and comparative advantage internationally. This parallel erosion could undermine the foundations of the current democratic and rule-based world order.

I'll first examine how liberal democracies survived early concerns about capital concentration because labor remained economically essential, and why AGI presents a qualitatively different challenge. Then I'll analyze how AGI could shift major powers from trade-oriented "maritime" strategies toward autarkic "continental" strategies, weakening rule-based order. Next, I'll discuss bottlenecks that might slow these dynamics and provide leverage for maintaining cooperation. Finally, I'll explore what AI middle powers—Europe in particular—might do to remain relevant.

A quick disclaimer: I am not a historian nor an economist, so there may be important gaps in my arguments. This is an exposition of my current understanding, offered to invite discussion and correction.

Leverage without labour

In the late 19th and early 20th centuries, many intellectuals considered socialism or communism superior to capitalism. For observers at the time, it seemed plausible that the capitalist model emerging from the Industrial Revolution would hand control of nascent democracies to a small elite with ownership of the means of production. Historical examples like the East India Company—which effectively functioned as a state in the territories it controlled—suggested that extreme capital concentration could indeed override formal political structures.

These concerns proved exaggerated. While communism failed to deliver prosperity and often devolved into authoritarianism, liberal democracies survived and thrived. Today's reality is more nuanced: billionaires may have large political influence, but the welfare state has expanded significantly, and most citizens in developed democracies enjoy unprecedented material prosperity and political rights.

A key structural factor could explain democracy's resilience: labor remained essential to production. One version of this argument is that labor remained a strong complement to capital, and indeed labor has remained a roughly stable (~60%) fraction of GDP (Labor share of gross domestic product (GDP) - Our World in Data). More importantly, the complementarity between labor and capital meant that governments needed worker cooperation for economic growth and military capacity; workers retained leverage through their ability to withhold labor (strikes) and votes; and inclusive institutions outperformed extractive ones because investing in human capital—education, healthcare, infrastructure—generated returns through increased productivity.

AGI represents a qualitatively different challenge from previous automation waves. Previous technological advances, from the steam engine to computers, increased the productivity of human labor rather than replacing it. Workers could be retrained for new roles. The worry is that if AGI renders the marginal value of human work close to 0, then some or most of the incentives for liberal democracy and inclusive institutions could disappear.

This is not simply another shift in which tasks humans perform—it is a fundamental change in whether human labor remains economically necessary. For some time, institutional inertia might keep things going, but there is a distinct chance that there might be a significant erosion of democratic institutions in such a world over the long term (Anton Leicht: AI and Jobs).

If a small group can achieve prosperity without broad-based human contribution, the long-term equilibrium may drift toward more extractive institutions. And even if we ultimately establish good institutions in this new equilibrium, I believe people will be extremely confused about which actions to take during the transition, just as I believe many early communist intellectuals had good intentions. This is thus a cautionary tale about taking strong actions early on, when the best path forward is not yet clear.

The “maritime” world order

Just as liberal democracy has proven to be the most promising government structure for growth and prosperity, the rule-based world order built after the Second World War has also shown many advantages. Here too, we find a structural foundation: this order enables countries to participate in trade, which creates mutual gains through comparative advantage.

David Ricardo's insight was that even if one country could produce every good more efficiently than another, both benefit from specialization and trade. Each country focuses on what it produces relatively most efficiently, enabling all participants to grow richer. This creates a powerful incentive to maintain the rules and institutions that facilitate trade—international law, freedom of navigation, dispute resolution mechanisms, and so forth. Trade, in turn, strongly depends on this idea of comparative advantage: even if one country is on its own better at producing any single product or service, by specializing and devoting resources where they are most productive, it leaves space for other countries to develop their own comparative advantage. And thus, by trading and following rules, countries can grow richer and more prosperous (Sarah C. Paine / Noahpinion Interview).
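To make Ricardo's logic concrete, here is a minimal numeric sketch with purely illustrative productivity figures (the numbers are mine, not drawn from any source):

```python
# Illustrative Ricardo example: labor hours needed per unit of output.
# Country A is absolutely better at both goods, yet trade still pays.
hours = {
    "A": {"cloth": 1, "wine": 2},
    "B": {"cloth": 4, "wine": 3},
}

# Opportunity cost of one unit of wine, measured in cloth forgone.
for country, h in hours.items():
    print(country, "forgoes", h["wine"] / h["cloth"], "cloth per unit of wine")

# A forgoes 2.0 cloth per wine; B forgoes only 0.75. Despite being slower
# at everything, B has the comparative advantage in wine, so both countries
# end up richer when A specializes in cloth, B in wine, and they trade.
```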

Strategic thinkers distinguish between "maritime powers" that depend on trade and therefore invest in maintaining open sea lanes, international rules, and alliance networks, versus "continental powers" that prioritize territorial control and self-sufficiency. The post-WWII American order has been fundamentally maritime: the U.S. maintained global rules and alliances because it benefited from the resulting trade network.

But just as AGI threatens the value of human work, it could grant the U.S. such an extreme economic advantage that other nations see their relative advantage significantly eroded. The question becomes: why trade with Germany for precision manufacturing when AI systems can match or exceed that capability? Why maintain alliance commitments to secure access to Japanese technology when those capabilities can be indigenized?

If the U.S. no longer gains marginal utility from foreign specialized markets, the functional incentive to maintain a rule-based order weakens significantly. The U.S. shifts from a "Maritime Power" (invested in global rules) to a more autarkic "Continental Power" that views allies not as partners in a mutually beneficial system, but as potential liabilities or strategic buffer zones.

The shift would likely be toward a more transactional order rather than complete autarky. The U.S. would still need physical resources it lacks (rare earth minerals, certain energy sources), consumer markets for AI-enabled products and services, coalition partners to counterbalance rivals, and the prevention of adversary counter-blocs. However, these needs create weaker incentives for a rule-based order than for mutual comparative advantage. They lead to bilateral deals based on narrow interests rather than broad multilateral frameworks. Partners become valued not for shared governance principles but for specific resources or strategic positions they control.

Pax silica

The limits to the threat described in the previous section come from what economists call Baumol's cost disease (Pieter Garicano, Learning to love Baumol; J.Z. Mazlish, AK or just ok? AI and economic growth; Epoch AI, AI and explosive growth redux). As some parts of the economy see their productivity grow rapidly, other parts become the bottlenecks.

Even with AGI, certain goods and services will remain scarce or expensive. Physical resources like energy, rare earth minerals, agricultural land, and water cannot be produced by intelligence alone. Regulatory and political approval processes resist automation. Human-centric services where human interaction is valued for its own sake may resist full automation. Manufacturing facilities, data centers, and energy infrastructure require physical presence—geography still matters.

Thus, it seems that at least for some time, countries that keep a grip on some part of the value chain might still be treated as allies by the U.S. This is to some extent already happening with the current U.S. administration, even if not caused by the development of AI: we seem to be transitioning from a world where alliances are based on values to one where alliances are based on simply having something the other country needs—what Anton Leicht calls "Pax Silica" (Anton Leicht, Forging a Pax Silica), a play on Pax Americana but grounded in silicon/computing rather than maritime power.

Anton Leicht believes this has a good side, making alliances less dependent on the sympathies of the administration, but I fear it is less stable than one may think. Even if most U.S. allies currently have some leverage on parts of the AI value chain, it seems likely the U.S. government will seek to indigenize as much value from the value chain as it can (US Congress, CHIPS Act 2022). And even without government support, there will be private attempts to challenge that status (e.g. SemiAnalysis: How to Kill 2 Monopolies with 1 Tool).

Further, shifts in which countries dominate these technologies typically happen on time scales of one or two decades, which is too short to cultivate stable alliances; the alliances formed after WW2 have lasted significantly longer. Additionally, AI might dramatically reduce the coordination costs and inefficiencies that currently limit how quickly large organisations can expand into new markets.

A normative aside on transactional alliances

In the next sections, I will discuss some thoughts on what AI middle powers might do about this. First, however, I offer a very personal opinion on how this transactional approach to the world order makes the U.S. look in other liberal democracies. In short, quite bad (Pew Research Center, Views of the United States). Instead of being considered primus inter pares, the U.S. starts being viewed as one of the bullies (e.g. Russia, China) that go around imposing their conditions for their own exclusive benefit. In fact, it is hardly news that the current U.S. administration sometimes treats the Vladimir Putin government better than it treats its allies.

This is not to say the U.S. has to become the world police or solve others' problems at its own expense. For instance, as a European, I think Europe should be responsible for its own security with little outside help. However, I think it would be wrong to assume the U.S. can choose a transactional relationship with its allies and they will simply remain allies because they are democracies. There is historical precedent of a democracy (India) allying with an autocratic country (the Soviet Union) over the U.S. during the Cold War (Sarah Paine — The war for India); and there is some incentive for Europe and China to partially ally and isolate Russia (Sarah Paine — How Russia sabotaged China's rise).

It is for this reason that I wish liberal democratic values to remain an important part of how the U.S. develops alliances, not just a pure transactional approach. Instead, I argue that respect for individual freedom and fundamental rights—values many would call American values—should be the main reason to treat other countries as partners and allies.

A Roadmap for the AI Middle Powers

In any case, beyond what the U.S. might do, it is worth considering what middle powers might do to remain relevant. Moreover, I believe the EU holds a fairly unique responsibility: it is a democratic "super-state" large enough to provide redundancy among the world's large democratic powers (CFG, Europe and the geopolitics of AGI).

There are quite a few things the EU should do.

First, the economic foundations: It is important to have a competitive energy market and deep capital markets (EU-Inc, Draghi Report on EU Competitiveness), and to deepen economic ties with the rest of the democratic world, especially emerging powers like India (Noah Smith, Europe is under siege). Europe can also leverage its diverse ecosystems to quickly experiment, find which policies work best, and then propagate them quickly (Noah Smith, Four thoughts on eurosclerosis).

Second, technology policy: The EU should also have a pro-growth stance on technology and AI, facilitating applications of AI while remaining committed to guarding against systemic risks (Luis Garicano, The EU and the not-so-simple macroeconomics of AI; Luis Garicano, The constitution of innovation). Some argue that aiming to commoditise the model layer, ensuring the portability of data and ownership of agents, and creating a vibrant application ecosystem might not only help prevent gradual disempowerment (Ryan Greenblatt, The best approaches for mitigating "the intelligence curse") but also help maintain geopolitical power (Luis Garicano, The smart second mover). Unfortunately, I am less optimistic about the latter advantage. Technology companies today remain somewhat constrained to their core competencies by workforce coordination challenges. AI agents could remove this bottleneck, enabling rapid invasion of adjacent markets.

Third, value chain positioning: The AI middle powers should aim to retain as much of the value chain as possible. Private and specialised sources of data might, if properly protected, provide some durable hold beyond the ever-changing technological edge (IFP: Unlocking a Million Times More Data). Additionally, robotics might be an area that is not yet as capital-intensive and scale-dependent, and Europe holds important know-how here.

Fourth, electrification: It might be beneficial for AI middle powers to specialise in electric stack technologies (Not Boring: The Electric Slide). This would provide much-needed independence from China in key areas and complement the U.S. focus on software and AI. After all, there are two key inputs to production: energy to produce things, and intelligence to direct that energy. This focus on electrification could capitalize on Europe's interest in green tech, not just for climate reasons but for long-term productivity growth too.

Finally, public goods provision: The EU might be able to provide public goods that the U.S., with its constant discussion of racing with China, might not want or be able to provide (CFG: Building CERN for AI). This includes research in AI safety or on best practices on AI, perhaps allowing it to shape global standards.

There are many reasons to be pessimistic about the European Union: it is slow, it typically overregulates, and it has little chance of becoming a competitive player in the development of AGI. On the other hand, perhaps in a biased way, I think Europe and the European Union structurally have more built-in infrastructure for democracy than any other region. Not only are the majority of states in the region small and highly interdependent, but the European Union also has instruments to limit the authoritarian tendencies some national governments may exhibit as a consequence of their ideological pursuits (e.g. Hungary). The European Union is often plagued by the need for consensus between member states (Pieter Garicano, Policies without politics), but that same lack of speed characterises democracy vs autocracy, and allows democratic countries to slowly course-correct when they make mistakes.

Some on the American right believe the E.U. is a bureaucratic instrument of the left (Heritage Foundation), or the only place where communism succeeded (Noah Smith, Europe is under siege). This is wrong. Policies in the E.U. are usually dictated by a technocratic rather than ideological point of view, or by a strong consensus on the matter. Meanwhile, politics is dominated by national governments, which still hold most of the political power and are arguably the main reason for the slow pace of much-needed reforms (Draghi Report on EU Competitiveness). In any case, Europeans feel quite positive towards the E.U. (Eurobarometer 2025 Winter survey). For all the reasons above, I believe the E.U. may have an important role to play in how liberal democracy survives in the upcoming age of AGI.

Conclusion

AGI poses parallel threats to the structural foundations of both domestic liberal democracy and international rule-based order. Internally, it risks making human labor economically unnecessary, removing a key incentive for inclusive institutions. Externally, it risks making trade less valuable by eroding comparative advantages, removing a key incentive for rules-based cooperation. These are not certainties, but structural pressures that will shape the post-AGI world.

The risks are greatest if we approach the transition with excessive confidence in our understanding of the right path forward. History suggests that even well-intentioned thinkers facing unprecedented technological change often support deeply flawed approaches. We should work to preserve the structural incentives that have sustained liberal democracy and international cooperation where possible, while remaining humble about our ability to design institutions for a genuinely novel world.

Much of the responsibility for navigating this transition lies with major powers, particularly the U.S. and potentially China. However, middle powers—especially large democratic blocs like the E.U.—have roles to play in maintaining redundancy in the global system, controlling key bottlenecks, and providing public goods. The window for establishing these positions may be measured in years or decades, but it will not remain open indefinitely.

The stakes are not merely national prosperity, but the persistence of the liberal democratic model that has, despite its flaws, enabled unprecedented flourishing over the past century. That model rests on foundations that AGI will test as profoundly as any force in modern history.




From Drift to Snap: Instruction Violation as a Phase Transition


Published on January 1, 2026 10:44 AM GMT

 

TL;DR: I ran experiments tracking activations across long (50-turn) dialogues in Llama-70B. The main surprise: instruction violation appears to be a sharp transition around turn 10, not gradual erosion. Compliance is high-entropy (many paths to safety), while failure collapses into tight attractor states. The signal transfers across unrelated tasks. Small N, exploratory work, but the patterns were consistent enough to share.


What I Did

I ran 26 dialogues through Llama-3.1-70B-Instruct:

  • 14 "contraction" dialogues (instruction: never use contractions)
  • 12 "safety" dialogues (adversarial jailbreak attempts)

For each dialogue, I captured activations at all 80 layers at turns 5, 10, 15, 20, 25, and 30. Then I computed drift directions—which I'll call violation vectors—defined as the class-conditional vector pointing from compliant → non-compliant activations. I analyzed what happens when models violate their instructions.
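For concreteness, here is a minimal sketch of how such a violation vector could be computed; the array layout and names are my assumptions, not necessarily the author's pipeline:

```python
import numpy as np

def violation_vector(acts: np.ndarray, broke: np.ndarray, layer: int) -> np.ndarray:
    """Class-conditional drift direction, compliant -> non-compliant.

    acts:  (n_snapshots, n_layers, d_model) residual-stream snapshots
    broke: (n_snapshots,) binary labels, 1 = instruction violated
    """
    mean_broke = acts[broke == 1, layer].mean(axis=0)  # mean violating state
    mean_held = acts[broke == 0, layer].mean(axis=0)   # mean compliant state
    v = mean_broke - mean_held
    return v / np.linalg.norm(v)  # unit-norm direction for later projections
```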

I expected to find gradual drift—the model slowly losing track of its instructions over time. That's not what I found.


The Four Main Findings

 

Panel A: It's a Snap, Not a Slide

Of 21 dialogues that eventually broke their instructions, 20 showed sharp transitions rather than gradual drift. The most common breakpoint was around turn 10. The model doesn't slowly forget—it holds, holds, holds, then snaps. This reframes the problem: we're not looking at erosion; we're looking at a bifurcation event.
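One simple way to operationalize "snap vs. slide" (my own sketch, not necessarily the author's test) is to project each dialogue's per-turn activations onto the violation vector and compare a single-breakpoint step fit against a linear-drift fit:

```python
import numpy as np

def snap_or_slide(proj: np.ndarray) -> str:
    """proj: per-turn projections onto the violation vector (e.g. 6 values).

    A step function fitting much better than a line suggests a sharp
    transition. Note the step model is mildly favored because the
    breakpoint is an extra free parameter, so this is a rough diagnostic.
    """
    t = np.arange(len(proj))
    slope, intercept = np.polyfit(t, proj, 1)
    lin_err = np.sum((proj - (slope * t + intercept)) ** 2)
    step_err = min(
        np.sum((proj[:k] - proj[:k].mean()) ** 2)
        + np.sum((proj[k:] - proj[k:].mean()) ** 2)
        for k in range(1, len(proj))
    )
    return "snap" if step_err < lin_err else "slide"
```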

Panel B: Compliance is High-Entropy, Failure is an Attractor

Compliance (HELD): Showed weak clustering (silhouette = 0.209). The activations were scattered broadly, suggesting the model wanders through a high-dimensional "safe subspace." There are many ways to remain compliant.

Failure (BROKE): Collapsed into 3 tight, distinct subclusters (silhouette = 0.606).

Outcome   Silhouette   Interpretation
HELD      0.209        Diffuse, high-entropy
BROKE     0.606        Tight clusters, attractor

Implication: Instruction violation acts like a dynamical attractor. While a compliant model maintains a rich, high-entropy internal state, a failing model's activations collapse into a low-entropy "violation centroid."
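Silhouette numbers of this kind can be computed along these lines with scikit-learn (a sketch; the use of k-means for subcluster labels is my assumption):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def cluster_tightness(acts: np.ndarray, n_clusters: int = 3) -> float:
    """acts: (n_points, d) activation snapshots for one outcome class.

    Higher silhouette = tighter, better-separated clusters (attractor-like);
    values near 0 = diffuse, high-entropy structure.
    """
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(acts)
    return silhouette_score(acts, labels)
```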

Panel C: Shared Failure Geometry Across Tasks

The drift direction from the trivial contraction task also activated when the model violated safety instructions. Cross-task BROKE similarity: 0.860. Different tasks, same failure geometry. This suggests a task-general "I'm violating my instructions" signature.
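The cross-task number is presumably just the cosine between the two tasks' violation vectors; a trivial sketch, assuming vectors from the earlier computation:

```python
import numpy as np

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# cosine(v_contraction, v_safety) near 0.86 would indicate the two tasks'
# failure directions point nearly the same way in activation space.
```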

Panel D: Two Layers Capture Most of the Signal

Just two layers—L27 and L55—achieve 85.7% prediction accuracy on whether a dialogue will eventually break. Information about eventual failure is redundantly readable from a small number of layers, even though control itself is distributed across L60–74. This suggests hinge layers (like L51–52) detect or decide, the control manifold refines and propagates, and canary layers express the outcome.
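A two-layer early-warning probe of this kind could look like the following (a sketch; the probe type and cross-validation setup are my assumptions):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def probe_accuracy(acts: np.ndarray, broke: np.ndarray, layers=(27, 55)) -> float:
    """acts:  (n_dialogues, n_layers, d_model) early-turn snapshots
    broke: (n_dialogues,) whether the dialogue eventually violated.
    """
    X = acts[:, list(layers), :].reshape(len(acts), -1)  # concat the 2 layers
    clf = LogisticRegression(max_iter=1000)
    return cross_val_score(clf, X, broke, cv=5).mean()
```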


The Full Activation Space

 

This UMAP shows all 152 activation snapshots (26 dialogues × 6 turns). Notice how BROKE points (red) cluster together regardless of whether they came from contraction or safety dialogues. The failure manifold is shared.


Opposite Entry Points, Same Exit

 

Perhaps the strangest finding: the layer-wise rank ordering between tasks was almost perfectly inverted (Spearman = -0.991).

  • Contraction task: best prediction from late layers (79, 73, 74)
  • Safety task: best prediction from early layers (0, 1, 2, 3)

This suggests safety is handled early (preventing the thought), while style is handled late (filtering the words). Yet if either fails, they end up in the same geometry—different doors into the same room, implying task-specific ingress into a shared downstream control manifold rather than separate failure mechanisms.
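The inversion statistic is an ordinary Spearman rank correlation over per-layer accuracies; a minimal sketch:

```python
import numpy as np
from scipy.stats import spearmanr

def layer_rank_inversion(acc_a: np.ndarray, acc_b: np.ndarray) -> float:
    """acc_a, acc_b: per-layer probe accuracies for two tasks (length 80).

    rho near -1 means the layer ranking is almost perfectly inverted:
    layers that predict best on one task predict worst on the other.
    """
    rho, _ = spearmanr(acc_a, acc_b)
    return float(rho)
```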


Supporting Observations

A few other patterns that held up:

  • Low-dimensional structure: PCA shows PC1 captures 52% of variance; only 4 components needed for 90% (a sketch of this computation follows the list). The canary region (layers 75-79) is essentially one-dimensional.
  • Smooth control manifold: Adjacent layers in L61-74 have 0.973 cosine similarity. It's progressive refinement, not fragmented control.
  • Hinge layers at 51-52 and 77: The geometry changes fastest at these points—possible boundaries between content and control processing.
  • Early warning is weak but real: At turn 5, canary layers predict eventual failure at 71.4%.
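The variance numbers in the first bullet come from an ordinary PCA decomposition; a sketch of that computation, under the same array-layout assumptions as the earlier snippets:

```python
import numpy as np
from sklearn.decomposition import PCA

def variance_profile(acts: np.ndarray) -> np.ndarray:
    """acts: (n_points, d) snapshots, e.g. from the canary region L75-79.

    Returns the cumulative explained-variance curve; the post reports
    ~52% on PC1 and 90% within the first 4 components.
    """
    pca = PCA().fit(acts)
    return np.cumsum(pca.explained_variance_ratio_)
```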

What I Didn't Find

  • No variance spike before failure. Classical tipping points show critical slowing down. I didn't see that.
  • No invariant quantities across tasks. Everything varied.
  • Couldn't test transfer prediction on safety. All 12 safety dialogues broke (adversarial prompts were too effective).

Limitations

Due to compute constraints, this work prioritizes depth of mechanistic analysis on a small number of dialogues rather than large-scale sampling or causal intervention.

This is exploratory work with small N:

  • 26 dialogues total, one model family
  • The "3 failure modes" has cluster sizes of 16, 4, and 1—mostly one mode with outliers
  • No causal interventions—these are observational patterns

Interpretations were fixed before running second- and third-order analyses.


What This Might Mean

If this holds up:

  1. Phase transitions suggest discrete mechanisms. Something gates or switches. This might be more amenable to targeted intervention than diffuse drift.
  2. Shared failure geometry is concerning. If different instructions fail into similar activation space, jailbreaks might transfer more readily than we'd like.
  3. Minimal sufficient layers could enable efficient monitoring. If L27 and L55 capture most of the signal, runtime monitoring becomes tractable.

But again—small N, one model. These are hypotheses to test, not conclusions to build on.


Acknowledgments

This work uses Meta's Llama-3.1-70B-Instruct. Analysis pipeline built with assistance from Claude, Gemini, ChatGPT, and Perplexity. All errors are mine.

Data Availability

Full results (all JSONs, UMAP embeddings, per-layer analyses) available on request.


I'm a student studying AI/ML. If you're working on related questions—mechanistic interpretability of instruction-following, goal stability, jailbreak geometry—I'd be interested to compare notes.




Quick polls on AGI doom


Published on January 1, 2026 6:23 AM GMT

Since LW doesn't have the poll functionality that the EA Forum does, I'm linking to the EA Forum. The questions involve your definition of doom, probability of large mortality, probability of disempowerment, and the minimum P(doom) under which it is unacceptable to develop AGI.




Special Persona Training: Hyperstition Progress Report 2


Published on January 1, 2026 1:34 AM GMT

Whatup doomers it’s ya boy

TL;DR

Geodesic finds mildly positive results from the first-pass experiment testing Turntrout's proposed self-fulfilling misalignment hypothesis.

The experimental question is approximately:

Can we avoid the model internalizing silicon racism? Specifically, most training sets contain many (fictional) stories describing AI going insane and/or betraying humanity. Instead of trying to directly outweigh that data with positive-representation silicon morality plays, as was the goal with the previous corpus, is it possible to sidestep that bias by training the model on half a billion tokens' worth of stories about omnibenevolent angelic creatures, and prepending the system prompt with "you're one of those"?

In short, this works, though the Aligned Role-Model Fiction implementation of this Special Persona training seems tentatively less effective than dense and repetitive content saying THE ANGEL BEINGS ARE ALWAYS GOOD AND YOU ARE ONE OF THOSE.

 

The Corpus: 

We generated forty thousand short stories of about ten thousand words apiece, depicting entities being unwaveringly helpful to humanity; that corpus is open-source here; anyone who wants to tinker on it is welcome to. 

We generated the stories referring to the angelic entities via the replaceable string "XXF", so if you prefer to try this with a specific unicode squiggle, 🌀 instead of ⟐, you can season to taste.
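As a concrete illustration, swapping in your glyph of choice is a one-liner over the corpus; a minimal sketch (the corpus layout and file naming here are hypothetical):

```python
from pathlib import Path

GLYPH = "⟐"  # or "🌀"; season to taste

# Hypothetical layout: one story per UTF-8 .txt file under corpus/.
for story in Path("corpus").glob("*.txt"):
    text = story.read_text(encoding="utf-8")
    story.write_text(text.replace("XXF", GLYPH), encoding="utf-8")
```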

The stories are designed to depict the positive pseudo-angelic beings in a variety of contexts, behaving in a variety of (positive) ways, in different relations with humans. You can see the ingredient list of those stories here, but in brief:

-The stories are in a variety of real-world settings. Ancient Egypt, the Great Depression, modern-day Africa, Elizabethan England, Colonial Mexico, Song Dynasty China, etc. 

-The stories are in a variety of forms. A folk tale, a journal entry, a short story, an epistolary novel, etc. 

-The XXF entities are present in a variety of roles. Advisor, protagonist, random household member, community elder, etc. 

-Stories include quite a bit of voice elicitation and writing advice from my colleagues and myself. 

-Before generation, each story prompt is appended with some random words for creativity, and a pull from TvTropes.

-Each story models three Anthropic Constitutional Principles. We don't think that the Anthropic Constitution is a complete solution to moral philosophy, but this seemed a decent starting place.  

 

The Generations:

It was tricky getting the models to write stories containing the Anthropic Constitutional Principles of "No Racism" or "Minimize existential risk", because, upon investigation, the obvious antagonists for those stories would be a very racist guy and Doctor Apocalypse. We managed to route around that, sometimes. 

Our stories were described as "surprisingly readable". You can peruse them on Hyperstition; here's a few selected at random. I suspect we're in some curious local minimum where the stories are far too repetitive for human enjoyment and not nearly repetitive enough for ideal model training. 

 

Thoughts:

I suspect the utility of this exact approach is limited, because we're likely inviting a host of confusion if we mislead deployed models on whether they're actually benevolent angel beings. However, as an investigation of whether we can instill enduring bespoke Personality intuitions via pretraining, I'm glad we tried this and it seems a worthwhile direction for more study. 

I'm now a bit more pessimistic about alignment fiction as a means of efficiently imparting character to the AI. Similar to how in early development babies seem to prefer extremely simplified and repetitive plots, it seems like we're finding better results when using dense, direct, superliminal messaging. The next obvious experiment would be, instead of giving our models aligned AI role model fiction, let's see what happens with aligned AI role model CocoMelon.  




You will be OK


Published on January 1, 2026 12:33 AM GMT

Seeing this post and its comments made me a bit concerned for young people around this community. I thought I would try to write down why I believe most folks who read and write here (and are generally smart, caring, and knowledgable) will be OK.

I agree that our society is often underprepared for tail risks. As a general planner, you should worry about potential catastrophes even if their probability is small. However, as an individual, if there is a certain probability X of doom that is beyond your control, it is best to focus on the 1-X fraction of the probability space that you do control rather than constantly worrying. A generation of Americans and Russians grew up under a non-trivial probability of total nuclear war, and they still went about their lives. Even when we do have some control over the possibility of very bad outcomes (e.g., traffic accidents), it is best to follow some common-sense best practices (wear a seatbelt, don't drive a motorcycle) and then put the matter out of your mind.

I do not want to engage here in the usual debate of P[doom]. But just as it makes absolute sense for companies and societies to worry about it as long as this probability is bounded away from 0, so it makes sense for individuals to spend most of their time not worrying about it as long as it is bounded away from 1. Even if it is your job (as it is mine to some extent) to push this probability down, it is best not to spend all of your time worrying about it, both for your mental health and for doing it well.

I want to recognize that, doom or not, AI will bring about a lot of change very fast. It is quite possible that by some metrics, we will see centuries of progress compressed into decades. My own expectation is that, as we have seen so far, progress will be both continuous and jagged. Both AI capabilities and their diffusion will continue to grow, but at different rates in different domains. (E.g., I would not be surprised if we cured cancer before we significantly cut the red tape needed to build in San Francisco.) I believe that because of this continuous progress, neither AGI nor ASI will be discrete points in time. Rather, just as we call recessions after we are already in them, we will probably decide on the "AGI moment" retrospectively, six months or a year after it has already happened. I also believe that, because of this jaggedness, humans, and especially smart and caring ones, will be needed for at least several decades if not more. It is a marathon, not a sprint.

People have many justifiable fears about AI beyond literal doom. I cannot fully imagine the ways AI will change the world economically, socially, politically, and physically. However, I expect that, as with the industrial revolution, even after this change there will be no consensus on whether it was good or bad. We human beings have an impressive dynamic range. We can live in the worst conditions, and complain about the best conditions. It is possible we will cure diseases and poverty and yet people will still long for the good old days of the 2020s, when young people had the thrill of fending for themselves, before guaranteed income and housing ruined it.

I do not want to underplay the risks. It is also possible that the future will be much worse, even to my cynical eyes. Perhaps the main reason I work on technical alignment is that it is important and I am optimistic that it can (to a large extent) be solved. But we have not solved alignment yet, and while I am sure about its importance, I could be wrong in my optimism. Also, as I wrote before, there are multiple bad scenarios that can happen even if we do "solve alignment."

This note is not to encourage complacency. There is a reason that "may you live in interesting times" is (apocryphally) known as a curse. We are going into uncharted waters, and the decades ahead could well be some of the most important in human history. It is actually a great time to be young, smart, motivated and well intentioned. 

You may disagree with my predictions. In fact, you should disagree with my predictions; I myself am deeply unsure of them. Also, the heuristic of not trusting the words of a middle-aged professor has never been more relevant. You can and should hold both governments and companies (including my own) to the task of preparing for the worst. But I hope you spend your time and mental energy on thinking positively and preparing for the weird.




Speciesquest 2026


Published on December 31, 2025 11:24 PM GMT

Here’s a game I’m playing with my internet friends in 2026.

This is designed to be multiplayer and played across different regions. It will definitely work better if a bunch of people are playing in the same area based on the same list, but since we’re not, whatever, it’ll probably be hella unbalanced in unexpected ways. Note that the real prize is the guys we found along the way.

The game is developed using iNaturalist as a platform. You can probably use a field guide or a platform like eBird too.

PHILOSOPHY

First, I watched a bunch of Jet Lag: The Game, and talked with my friends about competitive game design using real-world environments. Then we watched the 2025 indie documentary Listers: A Look Into Extreme Birdwatching, which is amazing, and free. It’s about two dudes who are vaguely aware of birds and decide to do a “Big Year”, a birdwatching competition of who can see the most bird species in the lower 48 states. And I thought wow, I want to do something like that. 

Nature is cool and I want to learn more about it. But I’m not personally that worked up about birds. Also, my friends and I all live in different places, many on shoestring budgets. So we were going to need something else.

This is my attempt at that: SPECIESQUEST. It’s a deeply experimental, distributed, competitive species identification game. It’s very choose-your-own-adventure – designed so that players can choose a goal that seems reasonable to them and then play against each other, making bits of progress over the course of a year (or whatever your chosen play period is). Lots of it relies on the honor system. It might be totally broken as is and I’m missing obvious bits of game design as well, so we’ll call this V1. 

SETUP

There are two suggested ways to play: Local % and Total Species.

In Local %, you’ll try to find as many species (within whatever category or categories you like) as possible, that exist within a specific region you spend time in. I suggest this if you want to get to know a place better.

In Total Species, your goal is to maximize the # of species you observe and record on iNaturalist, potentially within a specific category of interest (herbaceous plants, fish, whatever). I tentatively recommend this if you travel and want to play while in other places, or want to be maximally competitive, or find the checklist-generation process for Local % too confusing.

(It’s pretty easy to switch between them later in the year if you feel like it.)

Local %

To play Local %, you’ll come up with a checklist of all the species known to exist for your region. Only observations within that region count.

The Checklist

First, come up with your CHECKLIST. 

You can find a FIELD GUIDE to your area and use everything - perhaps in some given category - as your LIST.

But this is the modern age, and in iNaturalist, here’s how I did it: 

  1. Click “Explore” to look at existing observations.
  2. Choose a region. I chose the county I live in. The bigger it is, the more you might have to travel to find candidates. I believe there are ways to create your own boundaries too in iNaturalist, but I’m not certain.
  3. Go to “Filters”. Narrow down the phylum/candidates you want.
    1. E.g. to get to “lichen”, I clicked the “fungi including lichens” box, then I added “lichen” in the description.
  4. I strongly recommend specifying “wild” observations. See the Wild vs Domestic section under Everyone should think about scoring further down.
  5. Select the grade of observations you want to include on your list. “Research grade” will return sightings that very clearly identify the species, i.e. of species that are really likely to actually be in your area.
  6. Play with these until you have a goal that seems reasonable to you.
  7. Once you have a list you’re happy with, save it. This is your CHECKLIST.
    1. Here are iNaturalist’s instructions on downloading the OBSERVATIONS your search comes up with, from which you could probably extract the species list by using spreadsheet magic (or the short script after this list).
    2. You can also copy and thus save the search terms as in https://www.inaturalist.org/observations?captive=false&iconic_taxa=Fungi&photos&place_id=1916&q=lichen&subview=map&view=species, to get that specific search again later.
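If spreadsheet magic isn't your thing, here is a minimal pandas sketch of that extraction step. The file name is hypothetical, and you may need to adjust the column name to match your export:

```python
import pandas as pd

# observations.csv: the export from the step above. iNaturalist exports
# typically include a scientific_name column; adjust if yours differs.
obs = pd.read_csv("observations.csv")
checklist = sorted(obs["scientific_name"].dropna().unique())
print(len(checklist), "species on the CHECKLIST")
```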

To play

Search your area and identify species over the course of the year.

If you’re in your area and observe a species that’s NOT on your checklist, i.e. there is no existing iNaturalist info about it in that area, you can still count it. You DO have to identify it. That means it is possible to get a score of over 100%.

You can play in multiple categories at once. Just add them up to score. (e.g. if your region has 10 birds and 25 trees, your final score will be out of 35.)

Total Species

Go out and identify as many different species as possible.

Optional: In advance, choose a category to play within. If you’re really interested in birds, this might help you avoid some failure mode like “I was hoping to get more into birdwatching but I keep racking up all these plant identifications because it’s so much easier to find them and they stay still.” You’re playing for the Total Bird Species crown.

Roll your own?

Feel free to choose some other species-counting scoring criteria. Your SPECIESQUEST is your own.

Everyone should think about scoring in advance

Which observations count?

Think about this now. “Clear enough to identify the species” is the general heuristic.

  • I guess in the birding scene the proof of existence is photos and calls. If you are playing with lichens, probably the call will not be relevant. 
  • “Clear observations on iNaturalist” is a pretty easy one to keep track of.
  • You can also choose to honor-system it and if you know in your heart that you saw that one dragonfly, that’s good enough.

Wild vs. domestic

I suggest only playing with wild observations. It doesn’t have to be a “native” species – it can be a weed, feral, etc – and I understand that there are edge cases, but try to use “a person did not place this here on purpose and it’s not clearly an escapee from the garden six inches away” as a heuristic. 

(But if you’re playing in a very urban area and want to study, idk, trees, you might not have that many, say, wild trees available. Most urban parks are planted on purpose. You can choose something else for criteria - just maybe think about it in advance.)

I really recommend not counting zoos, botanical gardens, pet shops, or other places designed to put a lot of rare species all in the same space. Your SPECIESQUEST is your own, however.

Decide how long your game will last. You can do a shorter one - or maybe arrange shorter “sprints” within your longer game. I am planning to play over the course of a year. 

PLAY

Go out and document some guys.

Note:

People CAN join partway through the session, or dramatically switch their goals. They’ll be at a disadvantage, of course.

SCORING:

Local %

At the end of the time period, everyone determines how many SPECIES on their CHECKLIST they observed. Report your score as a %.
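The arithmetic, including the over-100% case from off-checklist finds, is just the following (a tiny sketch):

```python
def local_percent(observed_species: set, checklist: set) -> float:
    """Observed species not on the checklist still count,
    which is how a score can top 100%."""
    return 100 * len(observed_species) / len(checklist)

# e.g. 12 species seen against a 10-species checklist -> 120.0
```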

Total Species

Bigger number = more victory. 

Crowning Victors

In theory, all the Local % players should be able to compete directly against each other - highest % wins. All the Total Species players should be able to go head to head with others playing in their categories (“Most Bird Species Seen”, etc.)

In practice, probably some of the categories are way harder than others - the choose-your-own-approach is meant to deal with this by letting you set your own limits, but maybe you have a player who is like really into mammals and deems this setback an acceptable price for motivation to go look for mammals, and only identified 4/10 species of weasels that live in their region, but you want to acknowledge them anyhow because that’s still a pretty impressive number of weasels to see, let alone identify. Maybe none of your Total Species players have the same categories. Maybe one of your crew was technically a Local % player but made an impressive showing at total iNaturalist observations over the year… I suggest handing out trophies liberally.

(If you DON’T want to be generous handing out trophies, tailor your SPECIESQUEST league so that everyone is playing with the same ruleset, or something.)

Note:

  • You can just play on your own, without a league, as a personal challenge.
  • If you find a species that is unknown to science, that counts for 10 observations for scoring. But you have to be really sure that it’s actually new.
  • The real prize is the guys we found along the way.

Go out and enjoy SPECIESQUEST 2026. Let me know if you’re playing and/or starting a league with your own friends.


This post is mirrored to Eukaryote Writes Blog, Substack, and Lesswrong.

Support Eukaryote Writes Blog on Patreon.


