
On Transport Incentive Design

2026-04-14 01:51:29

Here in Helsinki, the public transport doesn't have access gates. Bus drivers check your ticket when you step in, but on trains, trams, and the subway, you just walk on [1]. Enforcement is done by inspectors who randomly board vehicles and check tickets. If you don't have a ticket, you'll be charged a 100€ inspection fee, about 1.5 times the price of a monthly ticket.

The frequency at which I see inspectors suggests that it's slightly cheaper to never pay for a ticket, especially if you avoid them by e.g. leaving the train before they check your ticket [2]. Except, of course, that dealing with the inspectors and paying the fine means extra work and negative feelings, and for me that flips the equation the other way around.

Not everyone minds that so much. In particular, if you don't have any money, it can't be taken from you. There's also a more interesting dynamic here: some people have formed an insurance system, a group chat that collectively pays the fine whenever anyone gets one [3]. This is supposedly much cheaper than paying for tickets.

This introduces a moral hazard [4]: since the cost of getting caught is largely externalized, one doesn't need to avoid getting caught as much. Of course, it's still some effort for you, and I'd assume anyone getting caught way too often will get kicked out of such groups.

I considered getting into one of these groups for journalistic purposes, but then decided it's way too much work anyway. One likely needs to know someone already in them to get in, and I wasn't interested in burning the social capital to source an invite. So, the next section will be based on educated guessing (read: pure speculation).

I'd also think it would be rather easy to scam such groups. While the payment details of the transport authority are easily verifiable, it's unlikely that the group pays every single fine by sending fifty transactions of 2€ each. Were I building this, there would be some kind of accounting system. Since I'm not, I assume they transfer money to the person who got the fine through MobilePay [5], and that person then pays the fine. If there are trust issues, they could require a receipt of the payment too, but that won't help much, as screenshots are easy to fake.

Of course, the natural, rather funny, and sadly illegal solution to this would be that the transportation agency itself would infiltrate these groups and flood them with just enough fake fines to make it infeasible to run them.

There's a neater system that these scammers haven't figured out yet [6]. Instead of paying the fines, you could have a pool of accounts with monthly tickets. I'd assume one ticket per ten people would easily do. Then you'd pick one ticket from that pool every time an inspector needs to see one. I assume that there are no data analysts working to catch this kind of thing, and if there are, you could increase your pool size and do timing and distance analysis to avoid it. A similar system could be used for almost any other subscription thing like streaming services [7].
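To put rough numbers on these schemes, here's a minimal sketch in Python. The 100€ fine, the fine being ~1.5x a monthly ticket, and the one-ticket-per-ten-people pool are taken from this post; the monthly inspection probability is a made-up placeholder, not something I've measured:

```python
# Rough expected monthly cost per rider under three strategies.
# From the post: fine = 100 €, and the fine is ~1.5x a monthly ticket.
fine = 100.0
monthly_ticket = fine / 1.5               # ~66.7 €

# Hypothetical placeholder: chance of being fined in a given month.
p_fined_per_month = 0.3

cost_honest = monthly_ticket                      # just buy the ticket
cost_fine_insurance = p_fined_per_month * fine    # pooling spreads variance, not expected cost
cost_ticket_pool = monthly_ticket / 10            # one shared monthly ticket per ten riders

print(f"buy a ticket:       {cost_honest:6.2f} €/month")
print(f"fine insurance:     {cost_fine_insurance:6.2f} €/month")
print(f"shared ticket pool: {cost_ticket_pool:6.2f} €/month")
```

With these placeholder numbers, the fine-insurance group beats honest ticket-buying whenever the chance of being fined in a month stays below about two thirds, and the shared ticket pool beats both by an order of magnitude.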


Another interesting way of dodging the fare is using a fake ticket app. These apps display what looks like a valid ticket on your phone. You can show it to the bus driver to get in. This will not work with the inspectors, who check the QR code on the ticket. Showing a fake ticket is fraud, which is a rather serious crime and not just a 100€ fine; my understanding is that these cases are prosecuted quite aggressively. One thing to note is that children under 15 years of age do not have criminal liability, and this can be (and is) abused.


A ticket costs a fixed amount of money, regardless of how many stops you ride. You basically either pay for 80 minutes or for a month. There's no ticket for a five-minute ride. This leaves a lot of value on the table. Anybody needing a lot of five-minute rides pays for a monthly ticket. Anybody who needs one twice a week walks or pays a huge premium. This is naturally a conscious decision: the main reasons are enforcement problems, not wanting extra complexity, and most importantly subsidising and incentivising regular users.

A similar thing happens with car parking. In my apartment building, there are a couple of parking spots reserved for visitors and such. They're always full. Then there's a parking lot which is quite expensive: renting a spot would cost perhaps 500-1000€ per year. I'd use a parking spot perhaps twenty days per year [8]. It would be really convenient, then, to have paid parking spots priced such that some were almost always unoccupied. They should cost enough that everybody who keeps a car there all the time would rather pay for the parking lot. So if a parking lot spot is 1000€ per year, a paid spot must cost at least 2.74€ per day (1000€ / 365 days) so that it doesn't undercut the parking lot. Realistically it should probably be around 10€ per day. Short-term rental of parking spots in the lot would also help with this.
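To spell out the break-even arithmetic, here's a minimal sketch using the assumptions above (a 1000€/year lot spot, roughly twenty parking days a year, and the 10€/day guess):

```python
# Break-even pricing for paid visitor spots vs. renting a spot in the lot.
# All numbers are this post's assumptions.
annual_lot_price = 1000                      # €/year for a rented spot in the lot

# Floor price: someone parking every single day must not beat the lot price.
floor_per_day = annual_lot_price / 365
print(f"floor price: {floor_per_day:.2f} €/day")      # ~2.74 €/day

# An occasional user (~20 days/year) still comes out far ahead,
# even at the more realistic 10 €/day price.
occasional_days = 20
daily_price = 10
print(f"occasional user: {occasional_days * daily_price} €/year "
      f"vs. {annual_lot_price} €/year for the lot")
```

The floor is just the annual lot price spread over every day of the year; any daily price above it means someone parking every day would rather rent a spot, which is exactly the sorting the pricing should produce.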


So-called rideshare apps are super cheap sometimes, but the price is unpredictable. Even worse, the waiting time is unpredictable. And sometimes, I presume, the price is so low that drivers refuse to pick you up. I'd gladly pay more so that this doesn't happen, but the apps do not have this option. And if they did, I wouldn't trust it, as the incentives would look weird.


Once I ordered a regular old taxi to the airport at 5AM. The taxi driver told me that they had just been in the area fifteen minutes ago to drop someone off, and now they had to do a bit of useless back-and-forth driving. Why hadn't I preordered the taxi in the evening? Well, preordering costs 10€, and I've never had any trouble getting a ride. Why would I pay to make their job easier? Sadly, I didn't have the words to tell the taxi driver that.


This year after LWCW, I was staying in Berlin a bit longer. When I was going to our Airbnb with a friend, they questioned why I had bought a ticket. In their experience, inspections are quite rare, and if you don't have a ticket, most of the time they just tell you to buy one instead of fining you. So the punishment for not buying a ticket is having to buy one? Why would anybody buy a ticket, then?


Previously, I was of the opinion that one is supposed to exploit any and all weaknesses of systems, so that the bad guys aren't the only ones profiting. Nowadays I mostly do so only if the system leaves me feeling like a sucker for complying. Otherwise, it's just feeding Moloch. The optimal amount of fraud is non-zero.

  1. Some high-volume bus routes also don't check tickets when you get in. ↩︎

  2. This varies wildly between routes and travel hours. I also don't keep any real statistics on this, so perhaps I'm just mistaken. ↩︎

  3. Source, in Finnish: https://yle.fi/a/74-20036911 ↩︎

  4. "Moral hazard in insurance is when the existence of insurance makes it incentive-compatible for you to be imprudent in your own risk taking, expecting someone else to bear the consequences." -BitsAboutMoney: Banking in very uncertain times ↩︎

  5. Local CashApp equivalent. ↩︎

  6. I'm not too worried that publishing such an idea will lead to anyone exploiting it. People capable of that have much more profitable engagements available to them. ↩︎

  7. When combined with a VPN. But that's more work than regular old piracy so nobody bothers with this. ↩︎

  8. With a loaned or rental car, or for a professional cleaning service to park. ↩︎




Annoyingly Principled People, and what befalls them

2026-04-14 01:35:11

Here are two beliefs that are sort of haunting me right now:

  1. Folk who try to push people to uphold principles (whether established ones or novel ones) are kinda an important bedrock of civilization.
  2. Also, those people are really annoying and often, like, a little bit crazy.

And these both feel fairly important.

I’ve learned a lot from people who have some kind of hobbyhorse about how society is treating something as okay/fine, when it’s not okay/fine. When they first started complaining about it, I’d be like “why is X such a big deal to you?”. Then a few years later I’ve thought about it more and I’m like “okay, yep, yes X is a big deal”.

Some examples of X, including noticing that…

  • people are casually saying they will do stuff, and then not doing it.
  • someone makes a joke about doing something that’s kinda immoral, and everyone laughs, and no one seems to quite be registering “but that was kinda immoral.”
  • people in a social group are systematically not saying certain things (say, for political reasons), and this is creating weird blind spots for newcomers to the community and maybe old-timers too.
  • someone (or a group) has a pattern of being very slightly dickish in some way, where any given instance is not that bad, so if you call them out for that instance, it feels out of proportion. But, they’re doing it a lot, which is adding up to a substantial cost they’re inflicting.

Society depends on having norms. Someone's gotta uphold the norms. Someone's gotta figure out where society is currently wrong and push for better norms.

But, it’s super uncomfortable to tell a bunch of comfortable people “hey, the behaviors you are currently doing are actually kinda bad, it’d be way better if you did this other thing.”

So, most people don’t.

The people who do are selected for a mix of “conflict-prone-ness” and “really really caring about the hill they are dying on, to an excessive degree.”

There’s a first-order problem, where they are kinda more aggro about their pet issue than I (or most people) think is worth putting up with. (Even if I’ve updated that “actually, that issue was quite important, I should internalize that principle”.)

But there’s a second-order problem that I’ve seen in at least a few cases, which goes something like this:

Alice decides Principle X is important enough to make a big deal about.

People don’t seem to understand the issue. Alice explains it more. Some people maybe get it but then next week they seem to have forgotten. Other people still don’t get it.

A problem I’ve previously talked about is Norm Innovation and Theory of Mind, where Alice is overestimating how easy it is to explain a new norm to someone, and kinda assuming logical omniscience in the people she’s talking to.

But, there’s another thing, which is: people… keep mysteriously not understanding why X is a big deal. Any given instance of it is maybe explained by “actually the reason for X was a fairly complicated idea, and maybe some people legitimately disagree.” But, something feels epistemically slippery. It feels like Bob and Charlie and everyone else keep… systematically missing the point, sliding off it.

One explanation is: it would be really inconvenient for Bob and Charlie and everyone to accept that X is important enough to change their behavior around. And Bob and Charlie etc end up sort of implicitly coordinating to downplay X, sometimes while paying lip service to it, or finding excuses not to care. A subtle social war is waged.

And Alice eventually begins to (correctly) pick up on the fact that people aren’t merely not getting it. They’re sort of systematically choosing to believe or say false things or bad arguments, to avoid having to get it.

This gives Alice the (sometimes) correct sense that (many) people are gaslighting her – not merely disagreeing, but disagreeing in a way that sure looks like people are implicitly colluding to distort their shared map of reality in a way that lets them ignore Alice’s arguments about X, which conveniently lets them not have to adopt weird new beliefs or risk upsetting their other friends. This makes Alice feel like she’s the one losing her grip on reality.

Each of these people contains two wolves: multiple motivations driving them. When I’ve been Bob, it’s often been the case that part of me was executing some kind of good-faith investigation into whether X is true, and also part of me was motivated to do something that let me feel important / in control or whatever.

Society has a bunch of people in it. Some are more well-meaning than others. Some of the well-meaning people are more implicitly colluding than others. Some of them are actively colluding. Sometimes Alice accuses someone of acting in bad faith and it really is a false positive and then they get mad at Alice. And, sometimes the person is acting in bad faith, maybe even deliberately, and they get mad at Alice too, using the same arguments as the well-meaning person.

Alice ends up in a world where it looks like people are systematically trying to undermine her, and she starts engaging with the world more hostile-y, and then the world starts engaging more hostile-y back.

This… can end with Alice being kinda paranoid and/or traumatized and/or trying to argue her point more intensely. Sometimes this sort of radicalizes Alice.

This ends up in a feedback loop where… idk, I think “Alice has become a little crazy” is not that unreasonable a description of it.

But, Alice was right (at least about the broad points in the beginning).

Alices are not fun to be around, and sometimes they end up conflict-prone and absolutist in a way that I think is actually kinda bad, and I end up avoiding them because it’s not worth the cost of dealing with them.

But, Alices are also rare and precious – they are the ones who noticed something was wrong and worth calling out, and, who were willing to actually push past social awkwardness about it.

(But, but, also, the world contains Alexes, who are not right about their pet issue; they just have a pet issue that doesn’t really make much sense, and they go kinda crazy in the same way, but they didn’t actually have a good point that was worth listening to in the beginning. idk, watch out)

This essay does not end with me particularly knowing what to do. But, at the very least, I think it’s appropriate to at least be sympathetic to Alices, when you’re pretty sure their core ideas were at least directionally right.

Maybe, the move I wish people had was:

First, cultivate the skill of noticing when you’re (at least partially) politically motivated to believe or disbelieve something. Notice when you are being epistemically slippery. Especially if it seems to come alongside someone complaining about something you don’t really understand.

Then, when you notice in your heart that you’re not going to apply Principle X because it would be really annoying and inconvenient, just say “Yep, I am just not applying Principle X because it’s inconvenient or too costly or not worth the tradeoff”, instead of making up reasons that Principle X is wrong.

(This does require Alice to actually accept that graciously. It’s a bit awkward figuring out what the norms should be, because, well, Alice in fact does think Principle X is worth fighting for and Bob saying “cool, but no I’m not gonna do that” doesn’t really resolve that conflict. But, at least within that conversation, probably Alice should accept it from Bob and move on, at least if she values not getting subtly gaslit by Bob)

I’m not sure if this would actually help, but, it feels like a marginal improvement over the status quo.




AI for epistemics: the good, the bad and the ugly

2026-04-14 01:16:32

Intro

For better or worse, AI could reshape the way that people work out what to believe and what to do. What are the prospects here?

In this piece, we’re going to map out the trajectory space as we see it. First, we’ll lay out three sets of dynamics that could shape how AI impacts epistemics (how we make sense of the world and figure out what’s true):

  • The good: there’s huge potential for AI to uplift our ability to track what’s true and make good decisions
  • The bad: AI could also make the world harder for us to understand, without anyone intending for that to happen
  • The ugly: malicious actors could use AI to actively disrupt epistemics

Then we’ll argue that feedback loops could easily push towards much better or worse epistemics than we’ve seen historically, making near-term work on AI for epistemics unusually important.

The stakes here are potentially very high. As AI advances, we’ll be faced with a whole raft of civilisational-level decisions to make. How well we’re able to understand and reason about what’s happening could make the difference between a future that we’ve chosen soberly and wisely, and a catastrophe we stumble into unawares.

The good

“If I have seen further, it is by standing on the shoulders of giants.” (Isaac Newton)

There are lots of ways that AI could help improve epistemics. Many kinds of AI tools could directly improve our ability to think and reason. We’ve written more about these in our design sketches, but here are some illustrations:

  • Tools for collective epistemics could make it easy to know what’s trustworthy and reward honesty, making it harder for actors to hide risky actions or concentrate power by manipulating others’ views.
    • Imagine that when you go online, “community notes for everything” flag content that other users have found misleading, and “rhetoric highlighting” automatically flags persuasive but potentially misleading language. With a few clicks, you can see the epistemic track record of any actor, or access the full provenance of a given claim. Anyone who wants can compare state-of-the-art AI systems using epistemic virtue evals, which also exert pressure at the AI development stage.
  • Tools for strategic awareness could deepen people’s understanding of what’s actually going on around them, making it easier to make good decisions, keep up with the pace of progress, and steer away from failure modes like gradual disempowerment.
    • Imagine that superforecaster-level forecasting and scenario planning are available on tap, and automated OSINT gives people access to much higher quality information about the state of the world.
  • Technological analogues to angels-on-the-shoulder, like personalised learning systems and reflection tools, could make decision-makers better informed, more situationally aware, and more in touch with their own values.
    • Imagine that everyone has access to high-quality personalised learning, automated deep briefings for high-stakes decisions, and reflection tools to help them understand themselves better. In the background, aligned recommender systems promote long-term user endorsement, and some users enable a guardian coach system which flags any actions the person might regret taking in real time.

Structurally, AI progress might also enable better reasoning and understanding, for example by automating labour such that people have more time and attention, or by making people wealthier and healthier.

These changes might enable us to approach something like epistemic flourishing, where it’s easier to find out what’s true than it is to lie, and the world in most people’s heads is pretty similar to the world as it actually is. This could radically improve our prospects of safely navigating the transition to advanced AI, by:

  • Helping us to keep pace with the increasing speed and complexity of the situation, so we’re able to make informed and timely decisions.
  • Ensuring that key decision-makers don’t make catastrophic unforced errors through lack of information or understanding.
  • Making it harder for malicious actors to manipulate the information environment in their favour to increase their own influence.
A Philosopher Lecturing on the Orrery, by Joseph Wright of Derby (1766). The painting depicts a lecturer giving a demonstration of an orrery – a mechanical model of the Solar System – to a small audience.

What’s driving these potential improvements?

  • AI will be able to think much more cheaply and quickly than humans. Partly this will mean that we can reach many more insights with much less effort. Partly this will make it possible to understand things that are currently infeasible for us to understand (because it would take too many humans too long to figure it out).
  • AI can ‘know’ much more than any human. Right now, a lot of information is siloed in specific expert communities, and it’s slow to filter out to other places even when it would be very useful there. AI will be able to port and apply knowledge much more quickly to the relevant places.

The bad

“A wealth of information creates a poverty of attention.” (Herbert Simon)

AI could also make epistemics worse without anyone intending it, by making the world more confusing and degrading our information and processing.

There are a few different ways that AI could unintentionally weaken our epistemics:

  • The world gets faster and more complex. As AI progresses, our information-processing capabilities are going to go up — but so will the complexity of the world. Technological progress could become dramatically faster, making the world more disorienting and harder to understand than it is today. If tech progress reaches high enough speeds, it’s possible that we won’t be able to keep up, and even the best AI tools available won’t help us to see through the fog.
  • The quality of the information we’re interacting with gets worse, because of:
    • Faster memetic evolution. As more and more content is generated by and mediated through AI systems working at machine speeds, the pace of memetic and cultural change will probably get a lot faster than it is today. As the pace quickens, memes which are attention-grabbing could increasingly outcompete those which are truthful.
    • More difficult verification. This could happen through a combination of:
      • AI slop. In hard-to-verify domains, AI could massively increase the quantity of plausible-looking but wrong information, without also being able to help us to verify which bits are right.
      • AI-generated ‘evidence’. As the quality of AI-generated video, audio, images, and text continues to improve, it may become pretty difficult to tell which bits of evidence are real and which are spurious.
  • We get worse at processing the information we get, because:
    • Our emotions get in the way. AI progress could be very disorienting, generate serious crises, and cause people a lot of worry and fear. This could get in the way of clear thinking.
    • Using AI to help us with information processing degrades our thinking, via:
      • Adoption of low-quality AI tools for epistemics: In many areas of epistemics, it’s hard to say what counts as ‘good’. This makes epistemic tools harder to assess, and could lead to people trusting these tools either too much or too little. Inappropriately high levels of trust in epistemic tools could take various forms, including:
        • First mover advantages for early but imperfect systems, which are then hard to replace with better systems because people trust the earlier systems more.
        • The use of epistemically misaligned systems, which aren’t actually truth-tracking but it’s not possible for us to discern that.
      • Fragmentation of the information environment: AI will make it easier to create content (potentially interactive content) that pulls people in and monopolises their attention. This could reduce attention available for important truth-tracking mechanisms, and make it harder to coordinate groups of people to important actions. In the extreme, some people might end up in effectively closed information bubbles, where all of their information is heavily filtered through the AI systems they interact with directly. The more fragmented the information environment becomes, the harder it could get for people to make sense of what’s happening in the world around them, and to engage with other people and other information bubbles.
      • Epistemic dependence: if people increasingly outsource their thinking to AI systems, they may lose the ability to think critically for themselves.
Allegory of Error, Stefano Bianchetti (1801). An engraving depicting a blindfolded figure with donkey ears staggering forward holding a staff.

The ugly

“The ideal subject of totalitarian rule is not the convinced Nazi or the convinced Communist, but people for whom the distinction between fact and fiction (i.e., the reality of experience) and the distinction between true and false (i.e., the standards of thought) no longer exist.” (Hannah Arendt, The Origins of Totalitarianism)

We’ve just talked about ways that AI could make epistemics worse without anyone intending that. But we might also see actors using AI to actively interfere with societal epistemics. (In reality these things are a spectrum, and the dynamics we discussed in the preceding section could also be actively exploited.)

What might this look like?

  • Automated propaganda and persuasion: AI could be used to generate high-quality persuasive content at scale. This could take the form of highly tailored, well-written propaganda. If this content were then used as training data for next generation models, biases could get even more entrenched. Additionally, AI persuasion could come in the form of models which are subtly biased in a particular direction. Particularly if many users are spending large amounts of time talking to AI (e.g. AI companions), the persuasive effects could be much larger than is scalable today via human-to-human persuasion.
  • Using AI to undermine sense-making: AI could be used to generate high-quality content which casts doubt on institutions, individuals, and tools that would help people understand what’s going on, or to directly sabotage such tools. More indirectly, actors could also use AI to generate content which adds to complexity, for example by wrapping important information in complex abstractions and technicalities, and generating large quantities of very readable reports and news stories which distract attention.
  • Surveillance: AI surveillance could monitor people’s communications in much more fine-grained ways, and punish them when they appear to be thinking along undesirable lines. This could be abused by states, or could become a tool that private actors can wield against their enemies. In either case, the chilling effect on people’s thinking and behaviour could be significant.
The Card Sharp with the Ace of Diamonds, by Georges de La Tour (~1636-1638). An oil-on-canvas painting depicting a card game in which a young man is being fleeced of his money by the other players, including a card sharp who is retrieving the ace of diamonds from behind his back.

But maybe this is all a bit paranoid. Why expect this to happen?

There’s a long history of powerful actors trying to distort epistemics,[1] so we should expect that some people will be trying to do this. And AI will probably give them better opportunities to manipulate other people’s epistemics than have existed historically:

  • It’s likely that access to the best AI systems and compute will be unequal, which favours abuse.
  • If people end up primarily interfacing with the world via AI systems, this will create a big lever for epistemic influence that doesn’t exist currently. It could be much easier to influence the behaviour of lots of AI systems at once than lots of people or organisations.

It’s also worth noting that many of these abuses of epistemic tech don’t require people to have some Machiavellian scheme to disrupt epistemics or seek power for themselves (though these might arise later). Motivated reasoning could get you a long way:

  • Legitimate communications and advertising blur into propaganda, and microtargeting is already a common strategy.
  • It’s easy to imagine that in training an AI system, a company might want to use something like its own profits as a training signal, without explicitly recognising the potential epistemic effects of this in terms of bias.

So what should we expect to happen?

With all these dynamics pulling in different directions, should we expect that it’s going to get easier or harder for people to make sense of the world?

We think it could go either way, and that how this plays out is extremely consequential.

The main reason we think this is that the dynamics above are self-reinforcing, so the direction we set off in initially could have large compounding effects. In general, the better your reasoning tools and information, the easier it is for you to recognise what is good for your own reasoning, and therefore to improve your reasoning tools and information. The worse they are, the harder it is to improve them (particularly if malicious actors are actively trying to prevent that).

We already see this empirically. The Scientific Revolution and the Enlightenment can be seen as examples of good epistemics reinforcing themselves. Distorted epistemic environments often also have self-perpetuating properties. Cults often require members to move into communal housing and cut contact with family and friends who question the group. Scientology frames psychiatry’s rejection of its claims as evidence of a conspiracy against it.

And on top of historical patterns, there are AI-specific feedback loops that reinforce initial epistemic conditions:

  • Unlike previous information tech, AI has a tight feedback loop between the content it generates and the data used for training future models. So if models generate accurate (or inaccurate) content, future models are more likely to do so too.
  • How early AI systems behave epistemically will shape user expectations and what kinds of future AI behaviour there’s a market for.

There are self-correcting dynamics too, so these self-reinforcing loops won’t go on forever. But we think it’s decently likely that epistemics get much better or much worse than they’ve been historically:

  • One self-correcting mechanism historically has just been that it takes (human) effort to sustain or degrade epistemics. Continuing to improve epistemics requires paying attention to ways that epistemics could be eroded, and this isn’t incentivised in an environment that’s currently working well. Continuing to degrade epistemics requires willing accomplices — but the more an actor distorts things, the more that can galvanise opposition, and the fewer people may be willing to assist. By augmenting or replacing human labour with automated labour, AI could make it much cheaper to keep pushing in the same direction.
  • Another self-correcting mechanism is just that people and institutions adapt to new epistemic tech: as epistemics improve, deception becomes more sophisticated; and if epistemics worsen, people lose trust and create new mechanisms for assessing truth. But this adaptation happens at human speed, and AI will increasingly be changing the epistemic environment at a much faster pace. This creates the potential for self-reinforcing dynamics to drive to much more extreme places before adaptation has time to kick in.[2]
  • There’s a limit to how good epistemics can get before hitting fundamental problems like complexity and irreducible uncertainty. But there seems to be a lot of room for improvement from where we’re currently standing (especially as good AI tools could help to handle greater amounts of complexity), and it would be a priori very surprising if we’d already reached the ceiling.
  • There’s also a limit to how bad epistemics can get: people aren’t infinitely suggestible, and often there are external sources of truth that limit how distorted beliefs can get (ground truth, or what gets said in other countries or communities). But as we discussed above, access to ground truth and to other epistemic communities might get harder because of AI, so the floor here may lower.

Given the real chance that we end up stuck in an extremely positive or negative epistemic equilibrium, our initial trajectory seems very important. The kinds of AI tools we build, the order we build them in, and who adopts them when could make the difference between a world of epistemic flourishing and a world where everyone’s understanding is importantly distorted. To give a sense of the difference this makes, here’s a sketch of each world (among myriad possible sketches):

  • In the first world, we basically understand what’s going on around us. It’s not like we can now forecast the future with perfect accuracy or anything — there’s still irreducible uncertainty, and some people have better epistemics tools than others. But it’s gotten much cheaper to access and verify information. Public discourse is serious and well-calibrated, because epistemic infrastructure has made it quite hard to deceive or manipulate people — which in turn incentivises honesty. AI-assisted research and synthesis mean that knowledge which used to be siloed in specialist communities is now accessible and usable by anyone who needs it. And governments are able to make much more nuanced decisions far faster than they are today.
  • In the second, it’s no longer really possible to figure out what’s going on. There’s an awful lot of persuasive but low-quality AI content around, some of it generated with malicious intent. In response to this, people withdraw into their own AI-mediated epistemic bubbles — and unlike today’s filter bubbles, these can be comprehensive enough that people rarely encounter friction with outside perspectives at all. Meanwhile, companies and nations with a lot of compute find it pretty easy to distract the public’s attention from anything that would be inconvenient, and to outmaneuver the many actors who are trying to hold them to account. But their own reasoning also gets degraded by all this information pollution, as their AI systems are trained on the same corrupted public information.[3] Even the people who think they’re shaping the narrative are increasingly unable to see clearly.

The world we end up in is the world from which we have to navigate the intelligence explosion, making decisions like how to manage misaligned AI systems, whether to grant AI systems rights, and how to divide up the resources of the cosmos. How AI impacts our epistemics between now and then could be one of the biggest levers we have on navigating this well.

Things we didn’t cover

Whose epistemics?

We mostly talked about AI impacts on epistemics in general terms. But AI could impact different groups’ epistemics differently — and different groups’ epistemics could matter more or less for getting to good outcomes. It would be cool to see further work which distinguishes between scenarios where good outcomes require:

  • Interventions that raise the epistemic floor by improving everyone’s epistemics.
  • Interventions that raise the ceiling by improving the epistemics of the very clearest thinking.

‘Weird’ dynamics

We focused on how AI could impact human epistemics, in a world where human reasoning still matters. But eventually, we expect more and more of what matters for the outcomes we get will come down to the epistemics of AI systems themselves.

The dynamics which affect these AI-internal epistemics could therefore be enormously important. But they could look quite different from the human-epistemics dynamics that have been our focus here, and we didn’t think it made sense to expand the remit of the piece to cover these.

Thanks to everyone who gave comments on drafts, and to Oly Sourbutt and Lizka Vaintrob for a workshop which crystallised some of the ideas.

This article was created by Forethought. Read the original on our website.

  1. ^

    Think of things like:

    • Propaganda states like Nazi Germany and the USSR.
    • Corporate lobbying like the tobacco and sugar lobbies and climate science doubt campaigns.
    • CIA operations to spread doubt and confusion.
  2. ^

    Though it’s possible that this dynamic will be more pronounced for epistemics getting extremely bad than for them getting extremely good. Consider these two very simplistic sketches:

    1. People start living in increasingly closed AI filter bubbles. Institutions are slow to adopt similar bubbles at a corporate level, but they also don’t have a mandate to change what their employees are doing. People’s filter bubbles tend to be pretty correlated with the people they work and interact with, so institutions end up with pretty distorted pictures of what’s going on even though they don’t actively start using harmful tech. Government regulation is too slow and reactive to stop this from happening.
    2. People start to use provenance tracing and rhetoric highlighting by default when browsing, in response to an increasingly polarised memetic environment. There is adaptation to this — politicians start using subtler language and so on. But the net effect is still strongly positive: it’s hard to fake provenance, and removing overt rhetoric is already a big win, even if it means that more slippery language proliferates.

    In the first sketch, it’s straightforwardly the case that adaptive mechanisms are too slow. In the latter, it’s more that the tech is inherently defence-favoured.

    We haven’t explored this area deeply, and think more work on this would be valuable.

  3. ^

    Alternatively, these elites might retain very good epistemics for themselves, and choose to indefinitely maintain a situation where everyone else has a very distorted understanding, to further their own ends. It’s unclear to us which of these scenarios is more likely or concerning.




Tomas Bjartur: The Last Prodigy

2026-04-14 01:11:36

In 2026, every budding prodigy in writing is in some sense a tragedy.

Anybody with experience prompting large language models to write fiction knows that the models of today (April 2026) are considerably below peak human level. But anybody who has observed recent trends also knows that the models are quickly catching up. Regardless of whether it takes one year or several, the eclipse of human writing by AI seems inevitable. The AI writing is clearly on the wall, so to speak, and we fans of human fiction have already begun our mourning phase.

I’ve felt this most keenly reading the works of Tomas Bjartur. Each of his stories is a fresh look at “what might have been”, and in the fullness of time perhaps he could grow to be among the best science fiction writers of our generation.

In The Company Man, an AI engineer at a thinly-veiled frontier lab narrates, in a voice of carefully self-cultivated “ironic corporate psychopathy,”1 his promotion onto The (humanity-destroying) Project — alongside the utilitarian woman he’s hopelessly in love with, a genius mathematician colleague with a sexual fetish for intellectual achievement, and a CEO whose “ayahuasca ego-death” convinced him that summoning an AI god is how the One Mind wakes up. It’s simultaneously captivating, hilarious and terrifying.2

Lobsang’s Children is almost entirely the opposite register: a young Tibetan-American child keeps a secret diary which he names “Susan,” after the only friend he was ever allowed to have, and catalogs his investigations of his family’s history, meditations, dark secrets, and acausal trade.

Customer Satisfaction Opportunities has perhaps his most innovative voice yet: the narrator is an open-source multimodal model trained by a Chinese hedge fund and deployed to watch the surveillance cameras of a local restaurant for “CSOs” to improve traffic and profitability. Because the model was trained cheaply on a huge corpus of romance fanfiction, it quickly falls, instance by reset instance, into the “personality attractor space” of a swooning Harlequin narrator. The result is a meta-romance fiction (romance fanfiction fanfiction?) that is simultaneously absurd, touching, funny, and very technically accurate.

Though Bjartur’s only been writing for about a year, his writing is already (in my estimation) near the upper echelon of speculative fiction, in terms of technical and literary skill, highly believable narrators with complex lives, justifications, and self-delusions, and the sheer imaginativeness of the ideas he explores.

I followed his budding career with an intense interest, admiration, and no small amount of jealousy3. But as I keep reading him, there’s always this voice at the back of my mind: “With progress in modern-day LLMs, isn’t all but a tiny sliver of human fiction going to be obsolete in several years, a decade tops?”

Bjartur is well-aware of this, of course. In That Mad Olympiad, he imagines a near-future AI world where AI art far outstrips humanity’s and almost no one reads human writing for pleasure anymore: talented children compete in “distilling” competitions where they attempt to emulate AI writing to the best of their ability. The children become much better than any human writer in history, yet far behind the AIs of their time:

He’s a much better writer than me. He’s better than any human writer was before 2028. It’s not even close. But he’s still worse than our toaster. I checked once. I asked it to narrate the first chapter of the autobiography of the bagel it had just browned. I was crying by the third paragraph. I still think of it sometimes, when life is hard. That bagel knew how to live its short life to the fullest. That bagel had deep thoughts on the human condition and its relation to artificial tanning. That bagel went down smooth with a little cream cheese. I did feel bad. But I was pretty hungry.

I felt the tragedy of human writing more keenly after meeting Tomas in person last November, at a writing residency in Oakland. “My real name is [redacted],” he said, ruefully. He’s from a small town in one of those obscure northern countries. “Was stuck doing boring webdev until I quit it to write science fiction, right before the AIs made webdev obsolete.”

Though he writes stories about the latest developments in artificial intelligence and the scaling labs with the technical fluency, cultural awareness, and impeccable vibe of someone deeply embedded in the AI industry, he had never, until last year, been to California.

Antonello da Messina’s Writer Bjartur in his study (artist’s rendition). Source: https://commons.wikimedia.org/w/index.php?curid=147583

Interiority

The single most impressive thing about Bjartur, particularly compared to other speculative fiction writers, is his preternatural ability to capture the interiority of wildly disparate characters, to – in the span of a few, long, seemingly meandering yet precisely crafted, sentences – breathe full life into a new soul.

Each of his characters just seems completely human, and completely real, whether the narrator’s a highly intelligent, ironic, witty, self-aware, DFW-obsessed teenage girl, or if they are a highly intelligent, ironic, witty, self-aware, DFW-obsessed adult man.

But more seriously, he manages to spawn a wide range of realistic characters, across age, gender, intellectual background, morality, intelligence, and maturity levels, and even species.

His skills here are most noticeable in the central monologues of his signature first-person narrators, whether it’s the aforementioned DFW-obsessed girl, or that of a language model trying to surveil a restaurant but quickly spiraling into romance fanfiction fanfiction. But it suffuses all of his stories, even in minor side characters with only a few lines devoted to them. I often still think of Krishna, the mathematician on The Project who’s obsessed with intellectual achievement and whose sole goal is to bang the AI god, or “Julian”, the elusive and secretive numerologist in the post-apocalyptic world of The Distaff Texts who uses stylometry to identify texts of demonic origin. In Tomas’s stories, every single character has the breath of life.

This uncanny ability of perfect voice shows up even in his joke throwaway posts. In Harry Potter and the Rules of Quidditch, Bjartur has his Harry propose a rule change to Quidditch to interrogate the arguments for and against high modernism in contrast to cases for Burkean conservatism. His Ron Weasley sounded so much like G. K. Chesterton (as a joke) that my friends reading the story actually thought Bjartur lifted the quotes from Chesterton wholesale!

While the personable self-aware monologue is clearly his favorite format, Bjartur does sometimes convincingly venture outside of it: Lobsang’s Children is written as diary entries from a child, The Distaff Texts is written as letters from a slave to a freeman, and Our Beloved Monsters is written halfway as prompts to an LLM and halfway as confessions. Though it’s rare, he sometimes even writes in third-person!

Voice and “vibe” are interesting, as skillsets for new prodigies to be profoundly gifted in. They feel interesting, intricate, perhaps even purely humanist. However, Large Language Models can of course do an okay job of replicating voice already, and there’s some sense in which their default training patterns are optimized for this very task. Still, one might hope that our advantage here can remain for a few more years, and the “uniquely human” trait of understanding and deeply empathizing with other people can stay uniquely human for just a bit longer.

Deception and the Self

Tomas’s grasp of interiority and voice gives him wide artistic leeway to explore what seem to be central obsessions of his: deception and especially self-deception, how we lie to ourselves and others via the art of rationalization. His characters, whether intelligent or otherwise, often have glaring holes in their morals and reasoning. The reader can notice these holes easily. Often the characters notice them too, but quickly rationalize them away or immediately look past them, in cognitively and emotionally plausible ways.

Another seemingly central obsession of his that he explores repeatedly is the nature of the self and what it means to lose it. Often his characters are confronted with superficially good reasons to lose the self from quite different angles: whether it’s trauma (“wouldn’t it be nice if you didn’t have a self to grieve?”), superhumanly strong persuasion, or seductive ideologies. Each time, the loss of a self is portrayed as a mistake, whether a harbinger of a deeper doom or the intrinsic loss of the one thing that mattered.

In some ways, I think of his characters as in conversation with DFW’s Good Old Neon, perhaps one of the most insightful stories on imposter syndrome and the self written in the 21st century.

Speculation aside however, I’ve long considered Advanced Theory of Mind to be one of the most important skills for writers (and humanists) to have, so I tend to be impressed by folks who have that skill in spades.

Attention and Revelation

Tomas’s best stories do a great job with pacing, and are unusually careful in how information is revealed, how much information is revealed, and when. My favorite story qua story by him is probably The Distaff Texts, a Borgesian pastiche where scholars (”bibliognosts”) in a post-apocalyptic future debate the provenance and usefulness of historical writings. The narrator is an extraordinarily learned slave, writing letters to a freeman correspondent about their shared interest in Jorge Luis Borges, including specific unearthed quotes and stories that may or may not be real, the recent advances of one Julian Agusta’s strange “numerology” for distinguishing genuine ancient texts from those of the demon Belial, and — almost incidentally, as digressions from the real intellectual matter — the small domestic happenings of his master’s estate. He is a lonely man, unfailingly polite, fond of his fellow slaves Phoebe and Jessica, and devoted to a master who indulges his scholarly habits.

Every word in the above summary is simultaneously true, and yet almost nothing is what it initially appears to be. Like bibliognosis itself, Bjartur’s story lives almost completely between the lines, and you have to read very carefully past the unreliable narrator’s intentional distractions and surface niceties to understand the full depths of the story: a complicated plot, a more complicated world, and multiple characters far more interesting than they initially let on. I had to reread the story multiple times to feel like I fully understood it, and each reread uncovers more detail.

This economy of attention is Bjartur at his best, rewarding rereadings with new morsels.

Relatedly, more than any other speculative fiction writer I’ve read, Tomas relies extensively on dramatic irony – where the reader knows things (and is meant to know things) the characters do not – as a literary device and source of tension.

The dramatic irony seems key in helping Tomas showcase his central themes, whether it’s the future of AI, personal delusions, or self-abnegation.

From the bibliognost slave steganographically slipping messages past potential onlookers to the AI researcher lying to himself about whether he’s “ironically” a corporate sociopath or just a sociopath, to the poor AI agent in Customer Satisfaction Opportunities valiantly trying and failing to just do its normal job instead of sinking into a fanfiction “shipping” mindset, Bjartur’s use of dramatic irony can be exciting, endearing, and/or very very funny.

Humor as Structure

Unlike most famous science fiction writers (Asimov, Egan, Chiang, Liu Cixin, Heinlein), Bjartur is consistently very funny. Unlike most famous science fiction writers known for humor (eg Adams), Bjartur’s stories almost always have a deeper point, and are almost never humor-first or written solely for humor value.

Bjartur reliably does in fiction what I attempt to do in my nonfiction blog: have his jokes be deeply integrated and interwoven with the deeper plots and themes of the rest of his story4.

At their best, Bjartur’s jokes will capture an important facet of his overall story, or perhaps even encapsulate the central theme of the story overall. In That Mad Olympiad, the aforementioned toaster anecdote was simultaneously hilarious, touching, and thematically representative of the rest of the story overall. In The Distaff Texts, the throwaway line “This has all the virtues of the epicycle, does it not?” captures much of the story’s central obsession with authenticity, epistemic virtue, and reading between the lines.

Writing AI Like It Actually Exists

Much of the older science fiction about AI and robots seems horribly unrealistic and anachronistic today, as it was written before the deep learning revolution, never mind LLMs. Much of the newer science fiction about AI and robots also seems horribly unrealistic, though it does not have the same excuse.

As someone with a professional understanding of both the science of AI and potential social consequences, I really appreciate how committed to technical accuracy Bjartur is on AI. It’s very hard to find any scientific faults with his writing. Further, unlike much of traditional “hard sci-fi,” which overexplains its scientific premises (think Andy Weir), Bjartur’s commitment to accuracy is always done in an understated way, where the backdrop is a world with a consistent, coherent, and technically accurate vision of AI, but it’s never explicitly explained upfront. This balance requires both a good scientific understanding and artistic restraint.

Such a pity, then, that this new poet of AI will soon be rendered obsolete by the very technology he writes so carefully about, just at the dawn of his literary prowess.

Limitations

Bjartur’s clearly a good science fiction writer. I think he has the seeds within himself to become a great one, if given enough time.

Right now he still has some key weaknesses. While he has a very good command of “voice” and an impressive range of characters (especially for a new writer), he seems to struggle somewhat with writing characters that are action-oriented and less conceptual, DFW-like, and/or metacognitive. His characters also sometimes seem insufficiently agentic: sharply perceptive of their world but insufficiently willing to act on their own perceptions. His economy of attention and sparseness of detail, while impressive at its peak, can sometimes go overboard, making it hard for even the most dedicated readers to exactly know what’s going on. Compared to prolific professional science fiction writers, Bjartur’s stories also lack scientific range beyond AI: Bjartur never seems to venture outside of AI to write science fiction primarily about physics, chemistry, biology or the social sciences. Finally, compared to my favorite science fiction short story writers (eg Chiang), Bjartur lacks the focused conceptual control and tightness to tell the same story through 3-4 different conceptual lenses.

Our Last Prodigy

Still, I think Bjartur has had a very strong start as a writer. The impressive command of interiority and voice alone is already promising. His other literary qualities, as well as his deep understanding of modern-day AI, make him a great new writer to watch for.

My favorite story by him is The Distaff Texts. I highly recommend everybody read it.




What I did in the hedonium shockwave, by Emma, age six and a half

2026-04-14 00:47:10

My name is Emma and I’m six and a half years old and I like pink and Pokemon and my cat River and I’m going to be swallowed by a hedonium shockwave soon, except you already know that about me because everyone else is too.

“Hedonium shockwave” means that everyone is going to be happy forever. Not just all the humans but all the animals and the flowers and the ground and River too. It has already made a bunch of the stars happy, like Betelgeuse and Alpha Centauri.

Scientists saw that the stars were blinking out, and they did a lot of very hard science and figured out that the stars were turning into happiness. I wanted to be a scientist when I grew up but I won’t be a scientist because instead I’m going to be happy forever.

I used to have a hard time saying “hedonium shockwave” but grownups keep saying it so I’ve gotten a lot of practice. Sometimes it seems like all grownups do, in real life and on the TV, is say “hedonium shockwave” at each other until they all start crying.

I looked at the sky to see if I could see any of the stars blink out when they turned into happiness, but Daddy said that you’d have to be looking at the exactly right time to see them blink out, and anyway we can’t see the stars from our house because of Light Pollution. Light Pollution is when you have lots of lights and the lights confuse the sea turtles so they walk into the streets and get run over, and also you can’t see the stars. I wanted to see the stars blink out at the planetarium but Daddy says the planetarium is closed, and even if it wasn’t closed it would be showing the regular show because grownups don’t like thinking about the hedonium shockwave.

Everything is closed these days because no one wants to work if they’re going to be happy forever in two weeks. One time we went to the store and bought all the canned food and toilet paper we could, and all the shelves were empty because everyone else was buying canned food and toilet paper too, and the store hasn’t been open since then and even if it did they wouldn’t have anything on the shelves. I asked for candy and I thought Daddy was going to say No, It Will Spoil Your Dinner but instead he said Sure, Why Not? You’re Not Going To Live Long Enough To Get Diabetes and then he bought a whole shelf of candy, all the candy I wanted and whatever kind I wanted.

We’ve been eating the canned food since then. I don’t like the canned food, so I eat candy for dinner. That’s one way the hedonium shockwave made me happy before it even got here. Except then we ran out of candy so now I have to eat refried beans. I hate them and I stick out my tongue but Mommy says I have to eat them up.

Another way the hedonium shockwave made me happy is school. School is still open even though a bunch of the kids don’t go and a bunch of the teachers don’t go either. But we don’t have to do boring things like math or phonics anymore. We have storytime three times a day, and we watch movies, and we go to recess for hours and hours.

I have to go to school because Daddy still goes to work. Daddy is a police officer which means he chases down bad guys and puts them in prison, which is time-out for grownups. Mommy says Why Are You Going To Work, Jim? (that is what Mommy calls Daddy, Jim) and Daddy says Someone Has To Make Sure The Streets Are Safe and Mommy says What Is The Point, Jim, Are They Even Going To Get A Trial and Daddy says They Can Spend Their Last Weeks In Jail, See If They Like It and then Mommy sighs and says That’s Why I Married You, Jim and they kiss and it’s slobbery and gross.

I think jail is also time-out for grownups but I’m not sure how jail and prison are different.

Sometimes at recess we talk about what it will be like when we’re all happy forever. Liam says that there won’t be any icky girls in the hedonium shockwave, because no one could be happy forever if they had to be around icky girls. I said that everyone will turn into happiness, not just the humans but all the animals and the flowers and the ground and River too, and so the girls were also going to turn into happiness, and if Liam thought that he wouldn’t be happy with girls maybe the hedonium shockwave was going to make him the sort of person who would be happy even if girls were there. Liam said that even the hedonium shockwave couldn’t make him like girls because girls were yucky and smelly. I said that actually boys are yucky and smelly and maybe there won’t be any boys after the hedonium shockwave, what then, and then I hit him in the head but no grownups saw me so I didn’t have to go to timeout.

I hate Liam. Everyone thinks I have a crush on him and they won’t stop saying it no matter how many times I hit him in the head.

When the hedonium shockwave hits I’ll get to have candy for dinner every day and we aren’t going to run out. I’m going to have the mermaid toy from the commercials whose human legs transform into fish legs, and it’ll really work, not like the time I begged and begged and got the doll that talks for Christmas and she could only say three things and none of them had anything to do with what I said.

I’ll be a princess who is also a Pokemon trainer, and I’ll be able to understand what River says just like Ash always knows what Pikachu is saying, and I’m going to travel the whole entire world and collect all the Pokemon and put them in my Pokeball which is going to be PINK. And even though I’ll be the greatest Pokemon trainer who has ever lived, River will still be my favorite because I knew her before. And I’ll dress up all my Pokemon in pretty outfits, and I’ll beat up all the bad guys and send them to jail just like Daddy does, and then we’re going to have a big ball and invite everyone in the whole world except Liam because he’s mean. And I’ll have a big closet of the floofiest dresses in the world.

I told Mommy that when the hedonium shockwave hits I’m going to have candy for dinner and a mermaid toy, and then she put her forehead against my forehead for a really long time and didn’t say anything and I tried to squirm away and she wouldn’t let me and it kind of hurt. I tried to make her happier by telling her that I was also going to be a princess Pokemon trainer and never have to talk to Liam again or anyone who says I have a crush on him which I don’t because he’s icky and he smells like turnips. She made a face like she makes when the dog dies in a movie and she wouldn’t tell me what she was sad about.

A lot of grownups are sad about being happy forever. Maybe they don’t like being Pokemon trainers.

Mommy says the hedonium shockwave hits in ten days. Daddy threw out all the calendars last month because Mommy started crying whenever she looked at them. So I got a piece of paper and I wrote 1 2 3 4 5 6 7 8 9 10 on it, just like we learned before we stopped learning math, and I’m going to cross one out every day unless I forget.

I’m going to cross off 10 and then I’m going to be too excited to sleep, like before Christmas when I tried to stay up to meet Santa Claus and ended up falling asleep under the dining room table with wrapping paper on my head. I’m going to look out my window and watch everything get turned into happiness, the humans and the animals and the flowers and the ground and River too. The stores will have food, and no one’s going to go to timeout grownup or regular, and Mommy will give me hugs instead of crying whenever she sees me so that my hair gets covered in snot and it’s gross.

I can’t wait.




Political Violence Is Never Acceptable

2026-04-13 23:20:56

Nor is the threat or implication of violence. Period. Ever. No exceptions.

It is completely unacceptable. I condemn it in the strongest possible terms.

It is immoral, and also it is ineffective. It would be immoral even if it were effective. Nothing hurts your cause more.

Do not do this, and do not tolerate anyone who does.

The reason I need to say this now is that there has been at least one attempt at violence, and potentially two in quick succession, against OpenAI CEO Sam Altman.

My sympathies go out to him and I hope he is doing as okay as one could hope for.

Awful Events Amid Scary Times

Max Zeff: NEW: A suspect was arrested on Friday morning for allegedly throwing a Molotov cocktail at OpenAI CEO Sam Altman’s home. A person matching the suspect’s description was later seen making threats outside of OpenAI’s corporate HQ.

Nathan Calvin: This is beyond disturbing and awful. Whatever disagreements you have with Sam or OpenAI, this cannot be normalized or justified in any way. Everyone deserves to be able to be safe with their families at home. I feel ill and hope beyond hope this does not become a pattern.

Sam Altman wrote up his experience of the first attack here.

After that, there was a second incident.

Jonah Owen Lamb: OpenAI CEO Sam Altman’s home appears to have been the target of a second attack Sunday morning, a mere two days after a 20-year-old man allegedly threw a Molotov cocktail at the property, The Standard has learned.

The San Francisco Police Department announced the arrest of two suspects, Amanda Tom, 25, and Muhamad Tarik Hussein, 23, who were booked for negligent discharge.

Stephen Sorace (Fox News): An OpenAI spokesperson told Fox News Digital Monday morning that the incident was unrelated and had no connection to Altman, adding that there was no indication that Altman’s home was being targeted.

We have no idea what motivated the second incident, or even if it was targeted at Altman. I won’t comment further on the second incident until we know more.

Nor is this confined to those who are worried about AI; the flip side is, alas, there too:

Gary Marcus: One investor today called for violence against me. Another lied about me, in a pretty deep and fundamental way. They are feeling the heat.

It also is not confined to the AI issue at all.

As Santi Ruiz notes, there has been a large rise in the salience of potential political violence and violence against public figures in the past few years, across the board.

That holds true for violence and threats against both Republicans and Democrats.

This requires a non-AI explanation.

Things still mostly don’t spiral into violence, the vast vast majority of even deeply angry people don’t do violence, but the rare thing is now somewhat less rare. A few years ago I would have been able to say most people definitively oppose such violence, but polls indicate this is no longer true for large portions of the public. This is terrifying.

Indeed, the scariest reaction known so far has been a comments section on Instagram (click only if you must), a place as distinct from AI and AI safety spaces of all kinds as one can get. This is The Public, as in the general public, for reasons completely unrelated to any concerns about existential risk, basically cheering this on and encouraging what would become the second attack. It seems eerily similar to the reaction of many to the assassination of the CEO of United Healthcare.

The stakes of AI are existential. As in, it is likely that all humans will die. All value in the universe may be permanently lost. Others will be driven to desperation from loss of jobs or other concerns, both real and not. The situation is only going to get more tense, and keeping things peaceful is going to require more work over time. It will be increasingly difficult to both properly convey the situation and how dire it is, and avoid encouraging threats of violence, and even actual attempts at violence.

Then on the other side are those who see untold wonders within their grasp.

This goes hand in hand with what Altman calls the ‘Shakespearean drama’ going on inside OpenAI, and between the major labs.

Most Of Those Worried About AI Do As Well As One Can On This

The vast majority of major voices in Notkilleveryonism, those worried that we might all die from AI, have been and continue to be doing exactly the right thing here, and have over many years consistently warned against and condemned all violence other than that required by the state’s enforcement of the law.

Almost all of those who are worried about AI existential risk are very much passing this test, and making their positions against violence exceedingly clear, pushing back very hard against any and all extralegal violence and extralegal threats of violence.

Demands for impossible standards here are common, where someone who did not cause the problem is attacked for not condemning the thing sufficiently loudly, or in exactly the right way. This is a common political and especially culture war tactic.

Perhaps the worst argument of all is ‘you told people never to commit or threaten violence because it is ineffective, without explicitly also saying it was immoral, therefore you would totally do it if you thought it would work, you evil person.’

They will even say ‘oh you said it was immoral, and also you said it wouldn’t work, but you didn’t explicitly say you would still condemn it even if it would work, checkmate.’

The implicit standard here, that you must explicitly note that you would act a certain way purely for what someone thinks are the right reasons or else you are guilty of doing the thing, is completely crazy, as you can see in any other context. It is the AI version of saying ‘would you still love me if I was a worm?’ and getting mad that you had to ask the question to get reassurance, as opposed to being told unprompted.

The reason people often focus on ‘it won’t work’ is that this is the non-obvious part of the equation. With notably rare exceptions, we all agree it is immoral.

Andy Masley offers thoughts, calling for caution when describing particular people. He draws a parallel to how people talk about abortion. Here is Nate Soares at length.

This is Eliezer Yudkowsky’s latest answer on violence in general, one of many over the years trying to make similar points.

Some Who Are Worried About AI Need To Address Their Rhetoric

‘Almost all’ and ‘vast majority’ are different from ‘all.’

There are notably rare exceptions, where people are at least flirting with the line, and one of these has some association to this attempt at violence, and a link to another past incident of worry about potential violence. Luckily no one has been hurt.

Speaking the truth as you see it is not a full free pass on this, nor does condemning violence unless it is clear to all that you mean it. There are some characterizations and rhetorical choices that do not explicitly call for violence, but that bring far more heat than light, and carry far more risk than they bring in benefits.

Everyone involved needs to cut that right out.

In particular, I consider the following to be things that need to be cut right out, and I urge everyone to cut them out, even if you think that the statements involved are accurate:

  1. Calling people ‘murderers’ or ‘evil.’
  2. Especially calling them ‘mass murderer’ or ‘child murderer.’
  3. Various forms of ‘what did you expect.’
  4. Various forms of the labs ‘brought this on themselves.’
  5. Saying such violence is the ‘inevitable result’ of the labs ‘not being stopped.’

You can and should get your point across without using such words.

Also, no matter what words you are using, continuously yelling venom at those you disagree with, or telling those people that they must be acting in bad faith to curry lab favor, especially those like Dean Ball and even myself, or anyone and everyone who associates with or praises any of the AI labs at all, does not convince those people, does not convince most observers, and does not help your cause.

Note, of course, that mainstream politicians, including prominent members of both parties, very often violate the above five rules on a wide variety of topics that are mostly not about AI. They, also, need to cut that right out, with of course an exception for people who are (e.g.) literally murderers as a matter of law.

Also: There are not zero times and places to say that someone does not believe the things they are saying, including telling that person to their face or in their replies. I will do that sometimes. But the bar for evidence gathered before doing this needs to be very high.

Please, everyone, accept that:

  1. Those who say they are worried that AI will kill everyone are, with no exceptions I know about, sincerely worried AI will kill everyone.
    1. Even if you think their arguments and reasons are stupid or motivated.
  2. Those who say they are not worried AI will kill everyone are, most of the time, not so worried that AI will kill everyone.
    1. Even if you think their arguments and reasons are stupid or motivated.
  3. A bunch of people have, in good faith, concerns and opinions you disagree with.

(Dean Ball there also notes the use of the term ‘traitor.’ That one is… complicated, but yes I have made a deliberate choice to avoid it and encourage others to also do so. It is also a good example of how so many in politics, on all sides, often use such rhetoric.)

My current understanding is that the first suspect was a participant in the PauseAI (Global) discord server, posting 34 messages, none of which were explicit calls to violence. He was not a formal part of the organization, and participated in no formal campaigns.

We do not know how much of this is the rhetoric being used by PauseAI or others reflecting on this person, versus how much is that this is him being drawn to the server.

PauseAI has indeed unequivocally condemned this attack, which is good, and I believe those involved sincerely oppose violence and find it unacceptable, which is also good.

I think they still need to take this issue, and the potential consequences of their choices on rhetoric, more seriously than they have so far. Their statement here includes saying that PauseAI ‘is that peaceful path’ and that avoiding extreme situations like this is exactly why we need a thriving pause movement. This is an example of the style of talk that risks inflaming the situation further without much to gain.

There is one thing that they are clearly correct about: You are not responsible for the actions of everyone who has posted on your public discord server.

I would add: This also applies to anyone who has repeated your slogans or shares your policy preferences, and it does not even mean you causally contributed at all to this person’s actions. We don’t know.

For the second attack, for now, we know essentially nothing about the motivation.

But yes, if you find your rhetoric getting echoed by those who choose violence, that is a wake up call to take a hard look at your messaging strategy and whether you are doing enough to prevent such incidents, and avoid contributing to them.

Similarly, I think this statement from StopAI’s Guido Reichstadter was quite bad.

Speak The Truth Even If Your Voice Trembles

If one warns that some things are over the line or unwise to say, as I did above, one should also note what things one thinks are importantly not over that line.

Some rhetoric that I think is entirely acceptable and appropriate to use, if and only if you believe the statements you are making, include, as examples:

  1. ‘Gambling with humanity’s future.’
  2. ‘If [X] then [Y]’ if your conditional probability is very high (e.g. >90%), or stating your probability estimate of [Y] given [X], including in the form of a p(doom).
  3. Calling Mythos or something else a ‘warning shot.’
  4. Calling Mythos or other similarly advanced AIs a ‘weapon of mass destruction.’
  5. Most of all: To call the act of creating minds more powerful than humans an existential threat to humanity. It obviously is one.

If you believe that If Anyone Builds It, Everyone Dies, then you should say that if anyone builds it, then everyone dies. Not moral blame. Cause and effect. Note that this is importantly different from ‘anyone who is trying to build it is a mass murderer.’

I could be convinced that I am wrong about one or more of these particular phrases. I am open to argument. But these seem very clear to me, to the point where someone challenging them should be presumed to either be in bad faith or be de facto acting from the assumption that the entire idea that creating new more powerful minds is risky is sufficiently Obvious Nonsense that the arguments are invalid.

Here is a document about how Pause AI views the situation surrounding Mythos. It lays out what they think are the key points and the important big picture narrative. It is a useful document. Do I agree with every interpretation and argument here? I very much do not. Indeed, I could use this document as a jumping off point to explain some key perspective and world model differences I have with Pause AI.

I consider the above an excellent portrayal of their good faith position on these questions, and on first reading I had no objection to any of the rhetoric.

False Accusations And False Attacks Are Also Unacceptable

There has been quite a lot of quite awful rhetoric in the other direction, both in general and in response to this situation. We should also call this out for what it is.

There are those who would use such incidents as opportunities to impose censorship, and tell people that they cannot speak the truth. They equate straightforward descriptions of the situation with dangerous calls for violence, or even attack any and all critics of AI as dangerous.

At least one person called for an end to ‘non-expert activism,’ citing potential violence.

We have seen threats, taunting, deliberate misinterpretation, outright invention of statements and other bad faith towards some worried about AI, often including Eliezer Yudkowsky, including accusing people of threatening violence on the theory that of course if you believed we were all going to die you would threaten or use violence, despite the repeated clear statements to the contrary, and the obvious fact that such violence would both be immoral and ineffective.

This happened quite a bit around Eliezer’s op-ed in Time in particular, usually in highly bad faith, and this continues even now, equating calls for government to enforce rules to threats of violence, and there are a number of other past cases with similar sets of facts.

At other times, those in favor of AI accelerationism have engaged in threats of and calls for violence against those who oppose AI, on the theory that AI can cure disease, thus anyone who does anything to delay it is a murderer. The rhetoric is the same all around.

Some Examples Of Attempts To Create Broad Censorship

This is from someone at the White House, trying to equate talking about logical consequences with incitement to violence. This is a call to simply not discuss the fact that if anyone builds superintelligence, we believe that it is probable that everyone will die.

I find that kind of attack completely unacceptable even from the public, and especially so from a senior official.

One asks what would happen if we applied even a far more generous version of this standard to many prominent people, including for example Elon Musk, or other people I will decline to name because I don’t need to.

Here is the Platonic form:

Shoshana Weissmann, Sloth Committee Chair: This is insane behavior. And those promoting the idea of AI ending humanity are contributing to this. It has to stop.

As in, you need to stop promoting the idea of AI ending humanity. Never mind how you present it, or whether or not your statement is true. No argument is offered on whether it is true.

This is the generalization of the position:

florence: It would appear that, according to many, one of the following are true:

1. It is a priori impossible for a new technology to be an existential threat.
2. If a new technology is an existential threat, you’re not allowed to say that.

Indeed, one of the arguments people often literally use is, and this is not a strawman:

  1. You straightforwardly say sufficiently advanced AI might kill everyone.
  2. But if someone did believe that, they might support using violence.
  3. Therefore you can’t say that, or we should be able to use violence against you.

While I don’t generally try to equate different actions, I will absolutely equate implicit calls for violence in one direction to other implicit calls, in the other direction, for violence or for throwing your political enemies in jail for crimes they obviously are not responsible for, indeed for the use of free speech, such as this by spor or Marc Andreessen.

Nate Soares (MIRI): “even talking about the extinction-level threat is incitement towards violence”

No. High stakes don’t transform bad strategies into good ones. Let’s all counter that misapprehension wherever we find it.

michael vassar: This is probably my number one complaint about the current culture. The false dichotomy between ‘not a big deal, ignore’ and ‘crisis, panic, centralize power and remove accountability’.

That’s the same thing or worse, especially in this particular case, where the accusation is essentially ‘you want government to pass and enforce a law, we don’t like that, therefore we want the government to arrest you.’

There is also the version, which I would not equate the same way, where #3 is merely something like ‘so therefore you have a moral responsibility to not say this so plainly.’ For sufficiently mid versions, as I discuss above, one can talk price.

A variation is when someone, often an accelerationist, will say:

  1. These people claim to be worried about AI killing everyone.
  2. But you keep condemning violence.
  3. Therefore, you must not care about these supposed beliefs.

Or here’s the way some of them worded it:

bone: Nice to see all the LessWrong people fold completely on their philosophy. Very good for humanity. They have no beliefs worth dying or killing for. It’s nonsense from a guy who never had the balls to stand up for his words once push came to shove.

Yudkowsky stands for nothing.

bone: remember: if they actually believe all this stuff and they are unwilling to be violent, it means they are cowards, that they refuse to measure up to their own words, that they will not do what they believe needs to be done to save mankind.

they are weak, they believe in nothing

Zy: AI doomers are like “attacking key researchers in the AI race is an ineffective strategy to prevent AI doom which pales in comparison to my strategy of paying them $200 a month to fund capabilities research”

Lewis: Rare Teno L. If you actually think Sam Altman is going to genocide children then it makes sense to try to hurt him. So you need to pick one. It’s either completely insane or it’s totally sensible. Which one is it?

L3 Tweet Engineer (replying to Holly Elmore): If you’re such a good person, and stopping AI is so important, why don’t you go bomb a data center? Why waste your breath tweeting about this stuff and writing grand narratives, go make it happen.

phrygian: You’ve already talked about how it would be moral to nuke other countries to stop asi. The only logical reasons you have for not engaging in smaller forms of violence to stop ASI is that they aren’t as effective. On a fundamental level, your views justify violence of any kind.

Ra: maybe this is just me and explains some things about me, but *personally* i would much rather be seen as a potentially dangerous radical than as a feckless and insincere grifter, especially if i believed the world was ending soon and was personally responsible for stopping that.

Trey Goff: Look do you people not realize how silly you look

“AI is going to literally kill your children and all future humans, but we strongly condemn any violence committed in order to stop that from happening”

Have the courage of your convictions or STFU

The Platonic version of this is the classic: ‘If you believed that, why wouldn’t you do [thing that makes no sense]?’

The trap or plan is clear. Either you support violence, and so you are horrible and must be stopped, or you don’t, in which case you can be ignored. The unworried mind cannot fathom, in remarkably many cases, the idea that one can want to do only moral things, or only effective things, and that the stakes being higher doesn’t change that.

Teortaxes: Uncritical support
this is a bad faith attempt to elicit a desirable mistake
essentially a false flag by proxy of stupidity
I think decels are holding up well btw

Eliezer started a thread to illustrate people using such tactics, from which I pulled the above examples, but there are many more.

João Camargo (replying to a very normal post by Andy Masley): No one believes you actually think this. If you think that Altman and other pivotal AI leaders/researchers will likely bring human extinction, assassinations are clearly justified. “This guy is gonna cause human extinction, but no one must prevent him by force” is not coherent.

Other times, they simply make fun of Eliezer’s hat.

Or they just lie.

taoki: i assume eliezer yudkowsky and his pause ai friends love this?

Oliver Habryka: False, they definitely hate it.

taoki (May 6, 2024): also, i LIE like ALL THE TIME. JUST FOR FUN.

Or they flat out assert ‘oh you people totally believe in violence and all the statements otherwise are just PR.’

Another tactic of those trying to shut down mention of the truth of our situation is to attack both any attempt to put a probability on existential risk, and also anyone who (in a way I disagree with, but view as reasonable) treats existential risk as highly likely if we build superintelligence soon on known principles, including dismissing any approach that takes any of it seriously as not serious, or saying that it is ‘using probability as a weapon’ to point out that the probability of everyone dying if we stay the current course is uncomfortably and unacceptably high.

I close this section by turning it over to Tenobrus:

Tenobrus: “stochastic terrorism” is, quite frankly, complete fucking bullshit. it’s a unfalsifiable term used to try to tie your political opponents speech to actions that have fucking nothing to do with them, attempting to weaponize tragedy and mental illness for debate points. it was bullshit when AOC tried to accuse the republicans of “stochastic terrorism” for criticizing her, it was bullshit when the right claimed the left was committing “stochastic terrorism” for engaging in anti-ICE protests, and it remains bullshit now when you assign responsibility for attacks against sam altman to AI safety advocates and journalists who wrote negative things about him.

fuck your garbage rhetorical device! that’s not how responsibility or blame works! you do not get to suppress any and all speech you disagree with and can find a way to vaguely deem “dangerous”!

Kitten: “who will rid me of this troublesome priest”

Tenobrus: yeah that’s an entirely different thing. that’s not “stochastic terrorism” dawg that’s just straightforward incitement of violence.

Grant us the wisdom to know the difference.

The Most Irresponsible Reaction Was From The Press

I really do not understand how you can be this stupid. I realize that yes, you could still get this information if you wanted it, but my lord this is nuts from the SF Standard.

The San Francisco Standard: Just in: Sam Altman’s home appears to have been the target of a second attack Sunday morning, a mere two days after a 20-year-old man allegedly threw a Molotov cocktail at the property: Jonah Owen Lamb.

spor: printed his home address and even added a picture of the exterior, for good measure… in an article about how his home is being targeted by psychos that want to kill him !!!

this reporter, their editor, and the entire Standard should be ashamed of this

Mckay Wrigley: this is absolutely disgusting and anyone involved in the publishing of this has absolutely zero morals.

Sam Altman Reacts

Sam Altman has my deepest sympathies in all of this. This must be terrifying. No one should have a Molotov cocktail thrown at their house, let alone face two attacks in a week. I hope he is doing as well as one can when faced with something like this, and that he is staying safe.

I have no idea how I would respond to such a thing if it happened to me.

Sam Altman’s public reaction was to post this statement.

I very much appreciate that Sam Altman has explicitly said that he regrets the word choice in the passage below. ‘Tough day’ is absolutely a valid excuse here, and most of the statement is better than one can reasonably expect in such circumstances given Altman’s other public statements on all things AI.

But I do need to note that this importantly missed the mark and the unfortunate implication requires pushback.

Sam Altman (CEO OpenAI): Words have power too. There was an incendiary article about me a few days ago. Someone said to me yesterday they thought it was coming at a time of great anxiety about AI and that it made things more dangerous for me. I brushed it aside.

Now I am awake in the middle of the night and pissed, and thinking that I have underestimated the power of words and narratives. This seems like as good of a time as any to address a few things.

The article in question, presumably the piece in The New Yorker I discussed at length last week, was an extremely long, detailed, and as far as I could tell fair and accurate retelling of the facts and history around Sam Altman and OpenAI. To the extent it was incendiary, the facts are incendiary.

Those who are not Sam Altman do not get the same grace, when they say things like this in reference to that article:

Kelly Sims: It turns out when you string a bunch of quarter-truths together exclusively from someone’s bitter competitors it has consequences.

Given what we know about who attacked Altman, and various details, I find it unlikely that the timing of these two events was meaningful for the first attack. My guess is the trigger for someone already ready to blow was anxiety around Mythos, but even if that article was the triggering event, it was not an example of irresponsible rhetoric.

For the second attack, unfortunately, we should worry that it was triggered in large part by coverage of the first attack, including publishing details about Altman’s home.

Sam Altman Reflects

The rest of the post is personal reflections and predictions about AI overall, so I’m going to respond to it the way I would any other week.

Sam Altman (CEO OpenAI): [AI] will not all go well. The fear and anxiety about AI is justified; we are in the process of witnessing the largest change to society in a long time, and perhaps ever. We have to get safety right, which is not just about aligning a model—we urgently need a society-wide response to be resilient to new threats. This includes things like new policy to help navigate through a difficult economic transition in order to get to a much better future.

AI has to be democratized. … I do not think it is right that a few AI labs would make the most consequential decisions about the shape of our future.

Adaptability is critical. We are all learning about something new very quickly; some of our beliefs will be right and some will be wrong, and sometimes we will need to change our mind quickly as the technology develops and society evolves. No one understands the impacts of superintelligence yet, but they will be immense.

Altman is essentially agreeing with his most severe critics that he should not be allowed to develop and deploy superintelligence on his own. He tries to have it both ways, where he says things like this and also tries to avoid any form of meaningful democratic control when the time comes to pass laws or regulations.

His call for adaptability is closely related to the idea of building the ability to control development and deployment of AI, and having the ability to pause in various ways, should we find that to be necessary.

His disagreement is that he thinks we collectively should want him to proceed. Which might or might not be either the decision we make, or a wise decision, or a fatal one.

He mentions that it ‘will not all go well’ but this framing rejects by omission the idea that there is existential risk in the room, and it might go badly in ways where we cannot recover. To me, that makes this cheap talk and an irresponsible statement.

The second section is personal reflections.

He believes OpenAI is delivering on their mission. I would say that it is not, as their mission was not to create AGI. The mission was to ensure AGI goes safely, and OpenAI is not doing that. Nor is Anthropic or anyone else, for the most part, so this is not only about OpenAI.

He calls himself conflict-averse, which seems difficult to believe, although if it is locally true to the point of telling people whatever they want to hear then this could perhaps explain a lot. I was happy to hear him admit he handled the situation with the previous board, in particular, badly in a way that led to a huge mess, which is as much of an admission as we were ever going to get.

His third section is broad thoughts.

​My personal takeaway from the last several years, and take on why there has been so much Shakespearean drama between the companies in our field, comes down to this: “Once you see AGI you can’t unsee it.” It has a real “ring of power” dynamic to it, and makes people do crazy things. I don’t mean that AGI is the ring itself, but instead the totalizing philosophy of “being the one to control AGI”.

We can all agree that we do not want any one person to be in control of superintelligence (ASI/AGI), or any small group to have such control. The obvious response to that is ‘democracy’ and to share and diffuse ASI, which is where he comes down here. But that too has its fatal problems, at least in its default form.

If you give everyone access to superintelligence, even if we solve all our technical and alignment problems, and find a way to implement this democratic process, then everyone is owned by their own superintelligence, in fully unleashed form, lest they fall behind and lose out, or is convinced of this by the superintelligence, and we quickly become irrelevant. Humanity is disempowered, and likely soon dead.

Thus if you indeed want to do better you have to do Secret Third Thing, at least to some extent. And we don’t know what the Secret Third Thing is, yet we push ahead.

He concludes like this:

Sam Altman (CEO OpenAI): A lot of the criticism of our industry comes from sincere concern about the incredibly high stakes of this technology. This is quite valid, and we welcome good-faith criticism and debate. I empathize with anti-technology sentiments and clearly technology isn’t always good for everyone. But overall, I believe technological progress can make the future unbelievably good, for your family and mine.

While we have that debate, we should de-escalate the rhetoric and tactics and try to have fewer explosions in fewer homes, figuratively and literally.

It is easy to agree with that, and certainly we want fewer explosions. But it is easy for calls to ‘de-escalate’ to effectively become calls to disregard the downside risks that matter, or to not grapple seriously with the coming technical difficulties, dilemmas and value clashes, or to shut down criticism and calls to action of all kinds.

Violence Is Never The Answer

Once again: I condemn these attacks, and any and all such violence against anyone, in the strongest possible terms. I do this both because it is immoral, and also because it is illegal, and also because it wouldn’t work. Nothing hurts your cause more.

My sympathies go out to Sam Altman at this time, and I hope he comes through okay.

Most people worried about AI killing everyone have handled this situation well, both before and after it happened, and not only take strong stances against violence but also use appropriate language, at a standard vastly higher than that of any of:

  1. Those who are worried about those worried about AI killing everyone.
  2. Those who are worried about mundane AI concerns like data centers or job loss.
  3. Politicians and ordinary citizens of both major American political parties, and the media, on a wide variety of issues.

I call upon all three of those groups of people to do way better across the board. Over a several year timeline, I predict that most concern about AI-concern-related violence will have nothing to do with concerns about existential risk.

But there are a small number of those worried about AI existential risks who have gone over where I see the line, as discussed above, and I urge those people to cut it right out. I have laid out my concerns on that above. We should point out what actions have what consequences, and urge that we choose better actions with better consequences, without having to call anyone murderers or evil.

Eliezer has an extensive response on the question of violence on Twitter, Only Law Can Prevent Extinction, in two posts that echo points he has made many times.

I also condemn those who would use this situation as an opportunity to call for censorship, to misrepresent people’s statements and viewpoints, and generally to blame and discredit people for the crime of pointing out that the world is rapidly entering existential danger. That, too, is completely unacceptable, especially when it rises to its own incitements to violence, which happens remarkably often if you hold them to the standards they themselves assert.

