
$50 million a year for a 10% chance to ban ASI

2026-04-22 00:39:22

ControlAI's mission is to avert the extinction risks posed by superintelligent AI. We believe that in order to do this, we must secure an international prohibition on its development.

We're working to make this happen through what we believe is the most natural and promising approach: helping decision-makers in governments and the public understand the risks and take action.

We believe that ControlAI can achieve an international prohibition on ASI development if scaled sufficiently. We estimate that it would take approximately $50 million per year in funding to give us a concrete chance at achieving this in the next few years. To be more precise: conditional on receiving this funding in the next few months, we feel we would have ~10% probability of success.

In this post, we lay out some of the reasoning behind this estimate, and explain how additional funding past that threshold would continue to significantly improve our chances of success, with $500 million a year producing an estimated ~30% probability of success. [1]

Preventing ASI 101

Negotiating, implementing and enforcing an international prohibition on ASI is, in and of itself, not the work of a single non-profit. You need to have the weight of nations behind you to achieve this kind of goal. If humanity manages to achieve an international ban on ASI, it'll be through the efforts of a sufficiently motivated, sufficiently powerful initial coalition of countries.

Assuming that we work in multiple countries in parallel, we could say the problem statement is: get each country to be motivated to achieve an international prohibition on ASI. It’s not obvious what it means for a country to be “motivated” to do something, so it’s worth taking a second to unpack.

Our full theory of change chart, which backtracks from the desired outcome to our currently running workstreams.

Normally, parts of a country's executive branch are responsible for international negotiations around urgent issues concerning national and global security. In practice, these are the groups who need to be sufficiently motivated to achieve the ban to throw their weight behind it.

Branches of the government are generally not in the business of independently taking bold positions and then pursuing those positions to their logical ends. Instead, their stances and actions are mostly shaped by prevailing social currents.

Some of these currents are informal. This includes things like the conversations they have with their colleagues, advisors, confidants and family members. It also includes any recent news cycles and the media they consume.

Other parts of these currents operate through more formal channels, particularly in democracies. The legislative branch can influence the executive branch. [2] The public influences governments through elections, but also through polls, public discussions and common demands (at the very least because they affect the expectation of future election results).

If enough of these inputs point in the same direction, pushing for an international ban on ASI can become one of the country’s top priorities. For this to work, we need pervasive awareness of the issue of extinction risk from ASI. This sentence makes two claims, both of which are fully necessary, so let us repeat them and expand them individually.

Claim 1: The awareness of extinction risk needs to be pervasive throughout society.

Prohibiting ASI development is not easy. It will require the relevant parts of the executive branch to take a great deal of initiative, and involve many hard tradeoffs. At a minimum, it will mean significantly slowing down improvements in general-purpose AI and thus forgoing economic and military advantages. If some countries are initially not willing to cooperate with a ban, proponents of the ban will need to apply an expensive combination of carrots and sticks to bring holdouts on board.

For the relevant groups to push through these costs, it needs to feel like there is plenty of pressure to act, and like this pressure is coming from many places. If everyone who is asking for this is part of a specific, small faction, there will be a strong immune reaction and the faction will be ignored, or even purged in some cases.

Claim 2: The awareness needs to be specifically about extinction risk from superintelligent AI.

It is insufficient, and sometimes actively harmful, for people to vaguely dislike AI or only vaguely be aware that AI poses some scary risks. Due to the hard tradeoffs mentioned earlier, there will be pressure to take half-measures, at many layers, both internal and external. The only sufficient counterweight against this pressure is an understanding that ASI development must absolutely be prevented to ensure human survival.

A lack of awareness of the specific issue will inevitably lead to anemic action and weak, unfocused policies that do not actually prevent the development of ASI. This is one of the reasons why, in our communications, we solely focus on extinction risk from ASI, and we do not work on raising awareness of other AI risks, or otherwise trying to get people to vaguely dislike all AI. [3] All of our efforts are specifically around raising awareness of extinction risk from ASI, and how it may be addressed. [4]

Awareness is the bottleneck

Chart synthesized from the section “The Simple Pipeline” of Gabriel Alfour's post on The Spectre haunting the “AI Safety” Community. It’s a common perception that one cannot communicate directly to lay people about extinction risks from ASI, because they would never get it. Instead, one must cook up sophisticated persuasion schemes. Based on our experience, this idea is just plainly wrong. Just tell the truth!

We believe the primary bottleneck to getting an international prohibition on superintelligence is basic awareness of the issue.

Most of the people we reach, for example among lawmakers and the media, have simply never been told about the problem in plain terms. We find that often, all it takes to bring someone on board is a single honest conversation. The fact that honestly explaining the concerns to people is such a low-hanging fruit is one of the reasons why we could get so much done in 2025.

Politicians and the public simply don’t know that the most important figures in AI are literally worried about superintelligence causing human extinction. They simply don’t know that the only way to avoid human extinction on which experts can truly form a consensus is not to build ASI in the first place. [5]

The reason why they are not aware of this is because they haven’t been told, not because they don’t understand the concepts involved. In our experience, most people find it intuitive that it is extremely dangerous to build something as powerful as ASI, that you don’t understand and can’t predict. They find it intuitive that you can’t control ASI, that it can very easily precipitate catastrophic scenarios, and that this means you should not build it in the first place.

The reason why people are not aware of extinction risk from superintelligence is, simply put, because concerned experts have generally not been straightforward about their concern. The CAIS statement on AI risk is a rare exception to this, [6] but it’s starting to get old, and even then it’s just not enough.

We’ve met with lawmakers over 300 times. Most of the time, they’ve never had someone explain extinction risk to them before, nor have they ever heard of the CAIS statement before the meeting.

Even then, politicians don’t care about a person having signed a single statement once. That’s not how they’d expect someone who’s worried about the literal annihilation of the entire human race to behave. It sounds weak and almost fake to them.

In a serious world, you’d expect every single AI expert who is worried about extinction to be loudly and consistently vocal about it, including to the public and decision-makers in governments. As it stands, this is simply not the case. AI companies and their leaders constantly soften their communications, avoiding clearly mentioning extinction and preferring to talk about euphemisms and other risks. Anthropic’s head of growth recently said that Anthropic constantly adjusts their communications to be “softer” and appear “less over the top”.

Sam Altman, when asked by a US Senator whether he had jobs in mind when he said that “Development of superhuman machine intelligence is probably the greatest threat to the continued existence of humanity”, did not correct the senator and instead proceeded to talk about the possible effects of AI on employment.

If you ever have the chance to attend a house party in the Bay Area, you will get a really good sense of this: many researchers at AI companies are worried about extinction risk, and significantly orient their lives around this. At the same time, they don’t talk about these risks publicly.

It’s obvious to us that the reason so little progress has been made towards international agreements on ASI is exactly because experts have failed to be consistently open about their concerns.

An asymmetric war

While an international ASI ban is obviously a very ambitious goal, there is a sense in which advocacy about extinction risks from ASI means having the wind at one’s back. At a fundamental level, this is because approximately no one wants to die from ASI.

Politics is often an adversarial tug-of-war between opposing interests. When it comes to high-profile issues in American politics (e.g. abortion, marijuana legalization, Prop 22), running a competitive campaign can take hundreds of millions of dollars. [7]

However, when it comes to extinction risk, there is little conflict between different interest groups. If extinction risk materializes then everyone dies, regardless of their wealth, political affiliation or other personal interests. It is only an extreme minority of people who, even after having the chance to consider the dilemma of extinction risk, decide they are willing to bet on humanity’s extinction in order to get to ASI.

This is true at a fractal level. Not only does it mean that we expect the issue to be nonpartisan within countries, but we expect the interests of countries to be aligned with each other as long as there is a significant risk that building superintelligence will cause human extinction. [8] This is why we think that there is a good chance that achieving the same kind of success we’ve already achieved, but at a larger scale, will lead to an international ban on superintelligence.

Since our approach is not about winning a political tug-of-war through sheer might, we expect that we have a shot (~10%) at winning even with a budget as low as $50 million, which is at least an order of magnitude smaller than other political campaigns on major issues. It would be a long shot, and we think “good odds” (~30%) would require larger budgets in the order of $500 million. [9]

Let us elaborate a bit more on what we mean when we refer to a “political tug-of-war”. A common tactic, especially when trying to prevent a law from being passed, is to deliberately confuse people, for example by loudly communicating only the upsides of your proposals and only the downsides of the opposition’s, or through personal attacks on the opposition’s character.

Aside from the obvious moral issues with these tactics, they are much less effective when it comes to an issue that is so clear-cut and of such universal concern as extinction risk from ASI. With an issue like extinction risk, it becomes much harder to pit people against each other or to execute confusion tactics in order to hinder efforts to establish restrictions.

Scalable processes

At its core, ControlAI is an effort to create a scalable, industrial approach to averting extinction risks from ASI. The field of AI risk mitigation has historically relied on what we could call a “bespoke” or “artisanal” approach. That is, it relies on exceptional individuals to achieve specific successes, such as publishing a successful book, or performing some impressive networking feats, all through following their personal taste.

The definition of what it means for these “artisanal” workstreams to “succeed” is not written down anywhere, and not much effort goes into defining it and grounding it. For most people focused on AI risk, getting a sense of whether they’ve succeeded doesn’t look like measuring something as much as it looks like applying ad-hoc rationales that easily fall prey to galaxy-braining. Everything hinges on the quality of the person’s taste at best, and on sheer luck at worst.

Even when you succeed at these endeavors, you’re not in a position to easily replicate this success. To make it more clear what we mean: Eliezer Yudkowsky and Nate Soares can't trivially replicate the success they had by publishing the book “If Anyone Builds It, Everyone Dies” by simply putting more resources into the same effort, let alone scale up the approach by building an organization around it. The book was excellent and has helped spread awareness, but you can’t publish a book every week.

Similarly, the CAIS “Statement on AI Risk” was excellent for establishing common knowledge, and has greatly helped us in our endeavors. That said, this type of work is hard to replicate, and indeed has not been replicated: neither CAIS nor any other organization has since succeeded in getting all the CEOs of top AI companies to sign a similarly candid statement. [10]

ControlAI takes a different approach, one that straightforwardly allows scaling up workstreams once they’ve been set up. Whenever we have a goal that is too far off to tackle directly, we break it down into the most ambitious possible intermediate goal that we think we can act on. Crucially, we choose intermediate goals whose progress we can measure as hard numbers. In this way, we’re approximating sales funnels, the gold standard for how companies handle sales.

Here are a couple of examples of how we apply this approach. One of the early challenges we faced was to crystallize our successful lawmaker briefings into something that would accumulate over time and generate momentum. Our answer to this was to create a campaign statement [11] and to ask lawmakers to publicly support it. We’ve already secured 120 such supporters!

This solution satisfies a few important constraints:

  • It moves the world into a state where it’s somewhat easier to achieve our overarching goal of an international ban on ASI: Each public supporter helps by creating common knowledge that lawmakers consider this an urgent issue that merits immediate attention.
  • It gives us a clear, numeric measure of success for this workstream: the number of lawmakers who signed on to our campaign.
  • We could tackle this challenge directly: at the end of each briefing, simply ask lawmakers to publicly support the campaign.
  • Marginal inputs compound over time: each additional lawmaker publicly supporting the campaign helps increase the credibility of the issue, and makes it easier for more lawmakers to take a stance on it in the future.

After a while, we were ready to push toward something more ambitious. So, while still working on growing the number of lawmakers supporting the campaign, we introduced a new metric: the number of public declarations, written or spoken by an individual lawmaker, [12] that explicitly reference AI extinction risk or preventing superintelligence, on the condition that this happened after we personally briefed them. In the UK, this metric is currently sitting at 21.

These metrics are numerical and clearly defined, meaning that even a fresh graduate hire can be pointed at one and told to "make it go up" or to improve conversion rates between one step of the funnel and the next. There’s no danger that the person will fool themselves about how much progress they’re making. [13]
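To make this concrete, here is a minimal, purely illustrative sketch of the kind of funnel arithmetic we mean; the stage names and counts below are hypothetical, not our actual internal numbers:

```python
# Hypothetical lawmaker-outreach funnel: only the structure mirrors the
# approach described above (hard stage counts plus conversion rates).
funnel = [
    ("briefings held", 100),
    ("public campaign supporters", 50),
    ("public declarations on extinction risk", 10),
]

# Conversion rate between each funnel stage and the next.
for (stage, count), (next_stage, next_count) in zip(funnel, funnel[1:]):
    conversion = next_count / count
    print(f"{stage} -> {next_stage}: {conversion:.0%} conversion")
```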

In fact, most reasonably smart and motivated people, given a reasonable amount of mentorship, will naturally iterate on their approach and eventually achieve good results. This way, we don't need to hit the jackpot on hiring people who possess incredible taste right off the bat.

The best proof for this claim is our success in Canada. In about half a year, with only 1 staff member who had no previous experience in policy, we managed to brief 89 lawmakers and spur multiple hearings in the Canadian Parliament about the risks of AI. These hearings included testimonies from many experts who expressed their concerns about extinction risks:

  • ControlAI’s Andrea Miotti (CEO), Samuel Buteau (Canada Program Officer) and Connor Leahy (US Director)
  • Malo Bourgon (MIRI)
  • Max Tegmark and Anthony Aguirre (FLI)
  • Steven Adler (ex-OpenAI)
  • David Krueger (Evitable)

The fact that our approach is easily scalable is precisely the reason why we can write, in the rest of this post, about how we plan to make productive use of funding much larger than we currently enjoy. It’s also why, in some cases, we are able to make tentative predictions about what kind of success we expect to achieve.

What we’d do with $50 million or more per year

Right now, we believe that we are underfunded compared to what it would take us to have an actual shot at achieving an international ban on superintelligence. Our estimate is that a $50 million yearly budget [14] would give us a chance to succeed, although it would be a long shot. We estimate our chance of success with a $50 million budget, meaning the likelihood we achieve a robust international prohibition on ASI development, to be around 10%.

Here, we break down how we would allocate a budget of $50 million to maximize our chances at achieving an international ban on ASI development. We also show how more funding would further increase our chances of succeeding, giving a few examples of how we would make productive use of budgets as large as $500 million or $1 billion (roughly in line with major campaigns in the US, such as abortion, marijuana policy, and the presidential race).

We’ll cover our plans to use funds for policy advocacy in the US and the rest of the world, public awareness campaigns, policy research, outreach to thought-leaders (such as journalists), grassroots mobilization, and more.

US policy advocacy

Within a $50 million yearly budget, we’d be able to hire ~18 full-time policy advocates dedicated to briefing US members of Congress. In principle, we’d have enough bandwidth to meet every member of Congress within 3 to 6 months, ensuring that they’ve been briefed at least once on extinction risk from superintelligence.
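As a rough back-of-the-envelope check on that bandwidth claim (a sketch assuming the 535 voting members of Congress and the staffing and timeline figures stated above), the load works out to roughly one to two briefings per advocate per week:

```python
# Back-of-the-envelope check on the briefing-capacity claim above.
members_of_congress = 535       # 100 senators + 435 representatives
advocates = 18                  # proposed full-time policy advocates
months_low, months_high = 3, 6  # stated window for briefing everyone

members_per_advocate = members_of_congress / advocates  # ~30 each

for months in (months_low, months_high):
    weeks = months * 4.33  # average weeks per month
    briefings_per_week = members_per_advocate / weeks
    print(f"{months} months: ~{briefings_per_week:.1f} briefings per advocate per week")
```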

While we are confident that we’d have the capacity for these meetings, it is less clear whether we’d be able to regularly brief members of Congress face-to-face, or whether we’d spend a significant fraction of our time communicating with staffers. At the moment, we are cautiously optimistic: in the past 5 months, with ~1 staff member, [15] we’ve managed to personally meet with and brief 18 members of Congress, as well as over 90 Congressional offices.

Additionally, we’d have the capacity to brief offices in the executive branch relevant to national security and international affairs. These agencies are trusted by many other actors to stay on top of security risks, especially drastic ones like extinction risks from superintelligence; it’s essential for large-scale coordination that members of these institutions have a good grasp on the issue.

A budget of $50 million would also allow us to hire a small team of ~6 staff members focused on performing outreach to state legislators in a small number of high-priority states.

The bread and butter of our work is to ensure that US decision-makers are properly informed about and understand:

  • Concepts like superintelligence, recursive self-improvement, compute, etc;
  • That superintelligence poses an extinction risk;
  • That this can be addressed by an international agreement prohibiting ASI, and how such an agreement could be designed such that it is actually enforced.

We expect that, to the degree that we succeed in informing decision-makers about these matters, we’ll be able to leverage this into measurable outcomes such as:

  • Politicians make public statements about superintelligence and the extinction risks it poses.
  • Politicians make public statements about the need for an international prohibition on superintelligence development.
  • Hearings are held in Congress on the above topics.
  • The US takes steps toward negotiating an international prohibition on superintelligence with other countries.

Within a $500 million budget, we would not only double or triple the number of full-time staff dedicated to US policy advocacy, but we’d also be able to attract the best talent, and hire policy advocates with very strong pre-existing networks.

Policy advocacy in the rest of the world

In the UK, we’ve already moved the national conversation on superintelligence forward. In little more than a year, we’ve gathered 110 supporters on our campaign statement, and catalyzed two debates at the House of Lords on superintelligence and extinction risk.

At a yearly budget of $50 million, we could afford to more than triple our efforts in the UK. Now that we’ve managed to get some attention, we’ll put more focus on the following:

  • Get the government to discuss bills, amendments and actions the UK could take to champion the establishment of an international prohibition on superintelligence; [16]
  • Executive branch outreach.

A coalition of countries sufficiently powerful to achieve a ban on ASI will likely need multiple powerful countries to participate. To maximize the probability that this happens, we plan to prioritize the G7 in our policy advocacy efforts. This is because the G7 includes all of the most powerful countries that we’re confident can be influenced democratically.

Within a budget of $50 million, we’d be able to match our current UK efforts in all other G7 countries and in the EU’s institutions. This means we’d likely be able to replicate our UK successes in most of these places, even accounting for bad luck or for them being slightly more difficult. [17]

With roughly an additional $5 million in our budget (on top of the previous $50 million), we’d be able to dedicate at least 1 policy advocate (in some cases 2) to many other countries in the rest of the world. For example, we could maintain a presence in almost all G20 countries.

We don’t know in advance which countries will respond well to our efforts, so we think it would be useful to spread out and take as many chances as possible. Our previous experience shows that it’s at least possible to get good results with only 1 staff member in some G7 countries.

In Canada, our only local staff member managed to hold more meetings with representatives during February than any corporate lobbyist or other advocate. It seems probable that we can replicate our Canadian results in at least some other G20 countries, where the competition for the attention of decision-makers is less stiff.

Public awareness

Our theory of change hinges not only on key decision-makers understanding the issue, but also on the public doing so. Our key messages to the public are: [18]

  • AI poses an extinction risk to humanity.
  • The way to address this risk is to prohibit the development of superintelligent AI.

We believe our key messages are straightforward: you don’t need to be a genius or to be deeply familiar with AI to understand them. [19] The main bottleneck is making the public aware of the issue in the first place; after that, it’s getting them to take action about it.

We roughly expect that the average person will need to see each of our key messages 7 to 10 times in order to remember them, at the bare minimum. [20] That said, we expect that even after the same person sees a message dozens of times, the marginal returns on delivering the same message to this same person once more have still not been saturated. For example, we expect each new view will make the person slightly more likely to bring up the issue spontaneously in conversation, or slightly more likely to change their vote based on this issue. [21]

Within a budget of $50 million, we expect that we can achieve on the order of 2 billion ad impressions in the US, [22] an order of magnitude increase over our current ~200M. [23]

Various sources suggest that the average YouTube CPM is roughly $9, with a range between approximately $3 and $23 depending on the ad and campaign. Using this as a reference, and assuming we allocate $16 million to raw ad spend, we’d get somewhere between 700 million and 5.3 billion impressions. This is assuming that all of our ad spend is on a single platform, but we can easily improve this by spreading our ad spend across platforms. For context, a $16 million per year ads budget is comparable to the ad spend of companies like Shake Shack, but still two to three orders of magnitude away from presidential campaigns or Coca-Cola’s yearly ad spend.

If this were spread uniformly across the US population, every US adult would see our ads at least ~3 times. [24] More realistically, if we targeted a narrower segment of the US population, our ads could be seen by 10% of US adults ~30 times, or by 5% of US adults ~60 times. In other words, it becomes plausible that a sizable portion of the US population would remember our key messages: they would be aware that AI poses an extinction risk, and they would remember that the main recommendation for addressing it is to prohibit the development of superintelligent AI.
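For transparency, here is a minimal sketch of the arithmetic behind these figures. The $16 million spend and the $3–$23 CPM range are the ones quoted above; the ~260 million US adult population is an assumption on our part, consistent with footnote 24:

```python
# Impressions from raw ad spend at a given CPM (cost per 1,000 impressions).
ad_spend = 16_000_000                         # assumed yearly US ad spend, from the text
cpms = {"low": 3, "average": 9, "high": 23}   # rough YouTube CPM range quoted above

for label, cpm in cpms.items():
    impressions = ad_spend / cpm * 1000
    print(f"{label} CPM (${cpm}): ~{impressions / 1e9:.1f}B impressions")

# Views per adult if a conservative 800M impressions (footnote 24) were spread
# evenly, or concentrated on narrower segments of the population.
us_adults = 260_000_000            # assumption: roughly the US adult population
impressions_conservative = 800_000_000
for share in (1.0, 0.10, 0.05):
    reached = us_adults * share
    views = impressions_conservative / reached
    print(f"targeting {share:.0%} of adults: ~{views:.0f} views each")
```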

This level of awareness seems like it would be a great step forward, but we would not stop there. In addition to raising awareness, we’d also aim to help people to take action that helps move the world toward an international ban on superintelligence.

So far, we think that the most useful CTA (call to action) is to ask people to email or call their lawmakers. Using this CTA allows us to build a base of supporters who are motivated enough to take this kind of action, who we can call upon again in the future. We have already built the online campaigning infrastructure for this, and our 180k email subscribers have already sent over 200k messages to their lawmakers about ASI.

At this $50 million budget, we estimate that we could grow this base of supporters to 2 million citizens within 1 year. When we email this type of CTA, we currently get an action rate of around 2%. We think we can safely assume that this action rate will not degrade by a whole order of magnitude at this scale.
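To make the implication concrete, here is the simple arithmetic, assuming the supporter-base target and the ~2% action rate stated above hold:

```python
# What a ~2% per-email action rate implies at different supporter-base sizes.
current_subscribers = 180_000
projected_supporters = 2_000_000   # 1-year target at the $50 million budget
action_rate = 0.02                 # current action rate quoted above

for base in (current_subscribers, projected_supporters):
    actions = base * action_rate
    print(f"{base:,} supporters -> ~{actions:,.0f} lawmaker contacts per CTA email")
```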

Given these assumptions, we predict that if we target some carefully selected subset of US states, this would produce enough constituent pressure to get on the radar of key decision-makers and their staff purely through constituents emailing and calling lawmakers. For example, if we target swing states, we might be able to get electoral campaigns to at least be aware of our issue.

Public awareness efforts can scale massively before saturating. There are straightforward, non-innovative ways to make productive use of budgets as large as $500 million or $1 billion: large-scale ad campaigns routinely do so. Coca-Cola spent $5.15 billion on advertising in 2024, and Trump’s 2024 presidential campaign spent more than $425 million, or $1.4 billion including outside groups. This is also the scale at which, if we wanted to do so, we could spend $8 million on a Super Bowl ad about extinction risk from superintelligent AI! [25]

A total budget of $500 million to $1 billion would allow us to scale our ad spend massively. At this point, even with extremely pessimistic assumptions, [26] we could reach each US citizen at least a dozen times. Alternatively, we could focus on the 10% most engaged segment of the US population, reaching each individual at least 100 times.

As a lower bound, we are confident this is enough to make sure that every citizen in the US is at least somewhat aware of the issue. More importantly, we suspect that at this scale we could push the issue to the forefront of the public’s attention, and make it into one of the main topics in the national conversation.

We acknowledge it’s really hard to predict the effects of a campaign at this scale [27], but we think that it can help to anchor on other campaigns of similar scale in the US: abortion, marijuana policy, and the presidential race itself.

As we argued in the section An asymmetric war, we see those campaigns as mostly zero-sum games, in which both sides must burn as many resources as possible to be competitive. We see an AI extinction risk awareness campaign as a much more positive-sum game, so if we receive comparable funding, we feel confident in our chances.

One last point about ad spending: in order to run an ad campaign, we need not only to buy ad space, but we also need to expand our marketing team so that it has sufficient capacity to optimize the campaign.

Within a budget of $50 million, we could afford to dedicate ~6 people to this, offering salaries roughly between $100k and $200k. This addresses basic needs, but it does not provide an appropriate amount of bandwidth for the task, nor does it allow us to attract and retain the best talent.

Running an effective ad campaign is not a fire-and-forget operation. We’d need to continuously measure results, A/B test, experiment, brainstorm ads and concepts, research trends and audience behaviors, even come up with novel metrics and testing methodologies.

All of this information needs to be collected, analyzed and fed into the next round of iteration. The rounds of iteration themselves need to be very fast if we want to improve in a relevant amount of time. Whereas less ambitious marketing teams may take ~3 months to go through an iteration cycle, we’d have to do it in ~2 weeks.

To run this kind of operation, we would benefit immensely from hiring the most talented people, who can not only follow existing playbooks, but also innovate. These people are in extremely high demand, and we’re competing for them against the private sector.

Within a budget of $500 million, we could afford to dedicate ~20 people to this, offering salaries roughly between $200k and $400k. This would allow us to attract top talent and compete with the private sector. [28]

Grassroots mobilization

We already have a base of motivated supporters. 180k people are subscribed to our mailing list. 30k of our supporters contacted their lawmakers about extinction risk from ASI, and ~2000 of our supporters are willing to commit 5 minutes per week to regularly take small actions to help with the issue. Dozens have shown up at our pilot in-person events.

With more funding, we think we can turn this into a significant grassroots movement. We currently lack the capacity to properly organize and mobilize this community. We believe that we’d have sufficient capacity for this at a $50 million overall budget.

Concretely, this work would consist of things like:

  • Vetting local leaders, coaching them and helping them with their work.
  • Organizing or providing funding for local events.
  • Helping with the initial setup of groups, legal entities, basic websites, etc.
  • Building and providing services like Microcommit and tools such as the “Contact your lawmakers” tool on our campaign website.
  • Providing educational materials like tutorials and scripts for contacting one’s lawmakers.

Policy work

As part of our work in policy advocacy, it is often useful to be able to show policymakers a concrete policy proposal. These proposals can take various forms: legal definitions of superintelligence, high-level proposals for an international agreement on prohibiting ASI, national bills implementing a country’s obligations in an international agreement.

These proposals are not meant to be the exact, definitive version of the law that will eventually be implemented. It is understood that things will change as time passes, more parties weigh in, and negotiations unfold.

That said, it helps in many ways to have initial, concrete proposals. It helps people to publicly discuss, red-team, and refine the proposals. It also gives policymakers a proof of concept that concrete measures can be taken to prevent extinction risk from superintelligence.

The more countries we reach, the more complicated this work becomes. The legal landscape differs significantly between countries: they have different legal traditions, processes, institutions, constitutions, limits on power of governmental bodies, etc.

It takes a team of policy researchers, and the help of parliamentary lawyers, to develop such proposals. We estimate that we’d have sufficient capacity for this work at around a $50 million total yearly budget.

Thought-leader advocacy

Most people rely on trusted voices, across the political spectrum, to help them navigate complex issues rather than trying to form their view from scratch on every single topic. This is a normal and healthy part of how democracies function: just like representative democracy exists because we don’t expect every citizen to participate directly in the full political process, we don’t expect everyone to independently decide to pay attention to such a highly complex matter as extinction risk from ASI.

Instead, people look to figures like journalists, academics and public intellectuals to help them understand which issues deserve their attention. One of our key workstreams is outreach to these kinds of thought-leaders. At the moment, this mostly includes journalists, and sometimes content creators. This workstream has so far resulted in 22 media publications on risk from superintelligent AI, including in TIME and The Guardian, and in 14 collaborations (a mix of paid and free) with content creators, including popular science communicator Hank Green, Rational Animations, and more.

With more funding, we could not only scale up these workstreams, but also extend this outreach effort to include NGOs other than those who focus on AI, academics, religious leaders, authors and other public intellectuals, CEOs of companies outside of tech, leaders of local communities, and others.

If we want our society to develop a deep awareness of the extinction risk posed by ASI, we need to help these people understand the issue. At a $50 million total budget, we’d have enough bandwidth for a thought-leader outreach effort focused on the lowest-hanging fruit. In practice, this likely means having a single generalist team spread across every type of thought-leader, and covering only the Anglosphere.

At a total budget of $500 million, we could afford to build strong dedicated teams, each focused on one of the most important thought-leader communities. At the same time, we could establish a presence in other major cultural regions outside the Anglosphere.

Attracting and retaining the best talent

Many in our organization are forsaking significant increases in compensation they could command in the private sector, purely because they are deeply committed to our mission. As we scale, it will become increasingly difficult to find talented people who are willing to take this kind of pay cut. This is especially true if we scale aggressively.

To attract the caliber of talent that a problem of this importance deserves, we need to offer salaries that are as competitive as possible with the private sector.

At a yearly budget of $50 million, we’d be able to slightly improve our compensation, though most of the increase would be eaten by scaling the number of staff rather than increasing pay. As a rough estimate, we could probably offer between $100k and $200k to people in the public awareness team (comparable to sales in the private sector), and ~$350k to principal staff.

At $500 million, we think we could be truly competitive. While we would likely still be unable to match the salaries offered by AI corporations to staff who take part in their lobbying and marketing operations, we could significantly reduce the gap.

Conclusion

We want to be upfront: we don't know for sure if this will work. An international ban on ASI is an extraordinarily ambitious goal. But we believe that the structure of the problem gives us a fighting chance: approximately no one wants to play a game that risks wiping out humanity, regardless of the prize.

In 2025, with a team of fewer than 15 people, we’ve built a coalition of over 110 UK lawmakers to support our campaign, with 1 in 2 lawmakers having supported our campaign after we briefed them. On top of this, we’ve catalyzed parliamentary debates on superintelligence and extinction risk.

In the US, where competition for lawmakers' attention is the fiercest, we’ve personally met with 18 members of Congress with only a tiny number of staff on the ground. On the public awareness side, over 30k people have used our tools to send over 200k messages to their lawmakers about extinction risk from superintelligence, most of them in the US.

This wasn't a fluke of exceptional talent or lucky connections; we’ve done this with remarkably junior staff, in little more than a year. It was the result of a straightforward, scalable process, and of building solid foundations that enable us to scale to meet the challenge. What’s standing between us and a real fighting chance is funding commensurate with the problem.

If you are a major donor or a philanthropic institution, please get in touch at [email protected]. We’d be glad to walk you through our theory of change in more detail and discuss how additional funding would be deployed.

If you know a major donor or someone at a philanthropic institution, please introduce us. A warm introduction from someone they trust goes much further than a cold email from us. You can loop us in at the same address.

If you're an individual donor who is considering a gift of $100k or more, please reach out at the same address. Please only consider doing so if this wouldn't significantly impact your financial situation. We don't want anyone to overextend themselves on our behalf, no matter how much they care about the issue. We are a 501(c)(4) in the US and a nonprofit (not a registered charity) in the UK, so your donations are not tax deductible.

We’re currently not set up to receive smaller donations. If you still want to contribute, you can check our careers page. If you see a role you could fill, please apply. If you know someone who'd be a good fit, send them our way.

  1. The probabilities are produced mostly by gut feeling, but the major barriers that were considered are the following. 1) We are able to maintain a good internal culture as we scale extremely aggressively. 2) The lower bounds of our gears-level estimates mentioned in the second half of this post (e.g. ad impressions per dollar) hold. 3) We are able to validate our approach at scales of ~$50 million a year, and are able to continue raising at this scale if getting the agreement in place takes longer than a year. 4) The issue becomes a top-10 salient issue in the US and another 2 to 3 major countries. 5) The behavior of governments championing the ban is sufficiently connected to the right insights about extinction risk and ASI, requiring at the very least that public discourse about the ASI ban does not get distracted or confused in a way that makes the resulting actions ineffective. 6) This leads to an international ban on ASI in which major powers, including the US and China, conclude that participation serves their national interests and try to enforce it globally. Alternatively, if China or other countries do not join, the coalition of countries behind the ASI ban is powerful enough to deter non-participating countries and any rogue actors from developing ASI. ↩︎

  2. e.g. US Congress has the “power of the purse”, parliamentary systems can hold “votes of no confidence”. ↩︎

  3. Between our founding in October 2023 and mid-2024, we ran 3 campaigns in rapid succession. One of these was a campaign against deepfakes. This was a sincere effort: we do believe that deepfakes are a problem that should be addressed with legislation, and we’re proud of our achievements as part of that campaign. That said, after refining our thinking and developing the ideas we’re espousing in this post, we’ve updated towards focusing exclusively on extinction risk from ASI. This is what we’ve been doing since the end of 2024. ↩︎

  4. Consider the environmentalist movement as a cautionary example. Environmental efforts have generally failed to achieve their stated goals (e.g. reducing emissions, reversing climate change). Richard Ngo argues that they’ve caused serious collateral harms. We think this is partly because of their lack of focus. Rather than concentrating on a single core concern, environmental campaigns rummage around for anyone who, for any reason, feels good vibes toward the idea of the environment. As a result, the movement struggles to achieve good policies despite being enormously salient. Because of its lack of focus, it is interlinked with anti-capitalist groups, and so it tends to oppose interventions that would actually help with climate change, such as nuclear energy, as well as carbon capture and market-based solutions in general. Relevant posts on LessWrong: @habryka’s “Do not conquer what you cannot defend”, @Gabriel Alfour’s “How to think about enemies: the example of Greenpeace”. ↩︎

  5. To clarify: this doesn’t mean that everyone thinks the only way to avoid extinction is to not build ASI. Some do, while others have complicated ideas about how ASI can be built safely. The point is that none of those specific complex ideas benefit from a broad expert consensus. The only thing that most of us can agree on is that it won’t kill us if we don’t build it. ↩︎

  6. There have been other statements, such as this great one from FLI, but none signed by *both* top AI scientists and CEOs of top AI companies. ↩︎

  7. Sources: abortion was roughly $400 million in 2024, marijuana legalization was roughly $185 million in 2024, Prop 22 was roughly $220 million. ↩︎

  8. See Annex 2 of our paper “How middle powers may prevent the development of ASI”. While the paper focuses on the perspective of middle powers, this section’s analysis extends to superpowers. ↩︎

  9. We strongly believe in the principles we follow: honesty, openness, and democracy. Of course, we do think that our approach to averting extinction risks from ASI is the best; we wouldn’t pursue it if we didn’t think so. At a $500M budget level, we’d love to fund organizations that pursue different approaches, as long as they respect our basic principles. If we had that level of funding, we would seek to ensure that there are other organizations pursuing a candid approach to communication about ASI, and organizations that directly tackle the need for strong international coordination. ↩︎

  10. Notably, a statement like this one can generate a temporary spike of media coverage, but does not generate sustained attention by itself. Statements like this one need a sustained campaign (like the one we’re running) in order to receive sustained attention. ↩︎

  11. The statement reads: “Nobel Prize winners, AI scientists, and CEOs of leading AI companies have stated that mitigating the risk of extinction from AI should be a global priority. Specialised AIs - such as those advancing science and medicine - boost growth, innovation, and public services. Superintelligent AI systems would compromise national and global security. The UK can secure the benefits and mitigate the risks of AI by delivering on its promise to introduce binding regulation on the most powerful AI systems.” ↩︎

  12. Examples of this: a lawmaker giving a speech in parliament, writing an op-ed, or speaking in an interview to a major media outlet. ↩︎

  13. Importantly, our metrics are strictly focused on AI extinction risk. This reduces the risk that the person working on them, or the organization as a whole, will fool themselves into pursuing issues other than preventing extinction risk from superintelligent AI. A “lawmaker public declaration” only counts if it covers extinction risk specifically. If people at ControlAI spend time trying to push topics such as “job loss”, “AI ethics” or “autonomous weapons”, we consider this a failure. This is how we fight The Spectre, and stay laser focused on addressing extinction risk from superintelligence. ↩︎

  14. This should be considered a very rough estimate; it could be anywhere from $30M to $80M. ↩︎

  15. 1 member for most of this period; the 2nd member joined in the past month. ↩︎

  16. We’ve already fostered two debates about prohibiting ASI, and helped submit one amendment recognizing ASI and putting in place kill-switches for use in case of AI emergencies. To our knowledge, we are the first organization to successfully prompt a debate, in the parliament of a major country, focused specifically on prohibiting superintelligence. ↩︎

  17. Consider that replicating a success should be much easier than doing it the first time. By design, our results are public, and so, produce common knowledge. Now that 100+ lawmakers support our campaign in the UK, it is easier for other lawmakers to take a similar stance, including in other countries. ↩︎

  18. To a lesser degree, we would like people to remember our organization as a place where they can find trustworthy information on the issue and what they can do to help solve it. ↩︎

  19. The vast majority of people will not feel the need to fully understand the technical and geopolitical details in order to buy into the concern. The important part is that most people can intuitively understand why and how ASI can cause human extinction, and are happy to defer to experts about the details. ↩︎

  20. This is the most common rule of thumb in marketing, and is backed up by some academic research as well, e.g. see Advertising Repetition: A Meta-Analysis on Effective Frequency in Advertising. ↩︎

  21. Unlike the previous one, this statement is not backed by academic research. While most academic research focuses on marketing aimed at selling products and services, our goals present quite a different challenge. There are two main differences that make us expect to keep getting returns after even hundreds of exposures. 1) Our messages are somewhat novel and complex to the audience. This complexity will have to be accounted for in some way: either the message is presented in a complex way that takes more exposures to remember, or the message is broken down into many building blocks, each of which needs to be shown many times. 2) The success bar is somewhat higher: we do benefit from people responding to CTAs similar in scope to “buying a product”, but we also benefit from deeper engagement (see the section on “Grassroots mobilization”), and from people spontaneously bringing up the topic in conversations, which happens more if we create common knowledge that the topic exists. ↩︎

  22. This section assumes that we will allocate 60% of our ad spend to the US. We expect it will be quite a bit easier to yield good results in other countries, mostly due to lower cost per impression. For example, if we put the remaining 40% in 3 G7 countries, we expect to roughly be able to replicate the same success as in the US across those 3 countries. ↩︎

  23. Including both organic and paid reach. ↩︎

  24. This corresponds to 800 million total impressions. ↩︎

  25. Though it’s not clear to us at the moment if this would be a good use of money. ↩︎

  26. In this paragraph, we use our worst-case assumption that scaling ad spend by 30x multiplies impressions by 4x. We expect it’s much more likely that scaling by 30x will yield 10x to 15x the impressions. ↩︎

  27. Simpler models and extrapolations that we think we can use at a $50 million budget will break at this scale. There are strong reasons to deviate from these, both in pessimistic and optimistic directions. At this scale, we’ve probably run out of people who can be mobilized solely through ads. At the same time, network effects come into play, where people hear about the issue from others, and they start to see it as a “normal” part of the political discourse. It seems to us that trying to model the net effect ahead of time would be a fool’s errand. ↩︎

  28. For reference, here’s a job post by Anthropic for a marketing role, which they advertise as paying $255k to $320k. ↩︎




Evil is bad, actually (Vassar and Olivia Schaefer)

2026-04-21 21:29:15

Michael Vassar’s strategy for saving the world is horrifyingly counterproductive. Olivia’s is worse.

A note before we start: A lot of the sources cited are people who ended up looking kinda insane. This is not a coincidence; it’s apparently an explicit strategy: Apply plausibly-deniable psychological pressure to anyone who might speak up until they crack and discredit themselves by sounding crazy or taking extreme and destructive actions. Here’s Brent Dill explaining it:

[Screenshot of chat messages from Brent Dill]


(Later in the conversation he tries to encourage the person he’s talking to to kill herself, and threatens her with death if she posts the logs. Charming group! I hear Brent was living in Vassar’s garden recently, well after he was removed from the wider community for sexual abuse.)

Examples

Some of the people here I knew, before their interactions with Vassar’s sphere, to be not just mentally OK, but unusually resilient. Prime among them was Kathy Forth.

Prior to her suicide, Kathy and I were friends. I witnessed her fall from healthy and capable into anxiety and then paranoia: downstream of what I believe to be genuine sexual abuse, she spiralled into a narrative and way of experiencing the world in which almost everyone seemed like a sexual abuser. Kathy sent me a message shortly before her suicide containing some documents about Vassar, and asked me to talk about him with a friend of hers as a last request before she killed herself. She’d previously called Vassar the ‘arch-rapist’, and urged me to try to wake everyone up about the harmful psychological dynamics we had both witnessed.

Yudkowsky tweeted: 'I mostly experience attempted peer pressure in a very third-person way. When MichaelV and co. try to run a "multiple people yelling at you" operation on me, I experience that as "lol, look at all that pressure" instead [of] *feeling pressured*.'

Perhaps the most glaring example of this comes from Ziz, who already had a cult, but it wasn’t a mind-control-previously-gentle-people-into-murder-cult until after this.

…They spent 8 hours shouting at me, gaslighting me, trying to use me to get to “Emma”, Jack talking about how he hated trans women, especially hated me and my friends we were the most cringe, wanted us dead. Vassar kept telling me I needed to compromise with Jack and like the good parts of him that weren’t that. Said it was good Jack was screaming hate at me for most of those hours, because it showed that I was in bad faith…

…Personhood contract. Vassar offered me a loan if I would only sign “the personhood contract”…

“The personhood contract” is the contract that says that personhood is a contract. Which says that your personhood is granted by a market, and that your concepts for understanding other persons are traded on the market, and moral consideration of personhood is administered by a market.

…Vassar wanted me to do something, possibly very terrible (hard to tell because he’s a fucking liar) I won’t get into in public now…

(Marg bar the slave patrols. …. (Don’t fuck them, that’s not justice.))

…Vassar tried really really hard to convince me free will wasn’t a thing, said he’d die if he believed he had absolute free will like I believe, when I kept insisting free will extended to motor actions…

“The personhood contract”; A granting of self and morality, of Prime, to the Market. Which is, an extension of Yahweh. It’s a secular description of selling your soul. Fully equivalent. The center of the infernalist “inner animal” cybernetic fabric. Effectively tried to buy my soul for cash money, not even cash money I’d’ve been allowed to keep.

When I said no, he said I wouldn’t do anything of importance if I didn’t, wouldn’t amount to anything without him. I said he didn’t even know what I was working on. Jack was chiming in with how they kept track of goings on in the world and if I was doing anything on the scale they were they’d know…

He has also run sessions like this in which people were unexpectedly pressured into taking LSD.

Olivia messed with people on the Cyborgism server, though I think that might have gotten deleted.

[Screenshot]

Olivia later pressured (someone referred to it as 'bullied') the staff into unbanning her.

General pattern

Vassar seems to engage in quite extreme vibes-based frame control, using interestingness as a lure to draw people in, then bypassing bad vibe checks while manipulating them. This can be fine if you aren’t trying to challenge him, but if you are, he resorts to threatening, intimidating, attacking, and trying to psychologically destabilise anyone who opposes him, in a way which regularly causes mental health crises. There were at least two reports I know of on him which were suppressed by social or legal pressure. For those who do go along with him, he follows a wide range of cultish patterns, and, at least historically, has actively attempted to stop the world from waking up to superintelligence misalignment risk for fear that governments might shut down his plans if they took the risk seriously.

Olivia seems to have gone all-in on the evil-vibes, fuck-with-people’s-heads thing, and I keep seeing signs of the damage.

Just among the people in my network, this seems counterfactually responsible for: 1 suicide, and indirectly I suspect a second; one (double) suicide attempt plus psychosis (mine, many years ago); at least two other people having psychosis; two previously stable people driven into states that I can only describe as profoundly unhinged but not actually psychotic; and one explicit physical boundary crossing (grabbing someone’s arm when she clearly did not want this and was shaken afterwards). No double counting.

I don’t know if there are more people in his sphere who are also doing messy psychological things, or how aware of or involved with the darker side the rest of his sphere is, but the dark side observably exists, and I suspect others in his sphere will notice odd patterns of thought if they try looking directly at the evidence. Consider pulling on that thread when you’ve got some time to yourself; you deserve to be able to truth-seek in all domains. It seems to help a lot in this domain to be well nourished as a human: internal alarm bells from lack of self-care, or from not having felt the emotions you need to feel, seem to get in the way. And if you’re never OK, maybe prioritise that for a bit? Aim is more important than effort, and it’s much easier to aim with a clear head.

But why are they so incredibly over-the-top evil about it?

As far as I can tell, it’s about control, especially control to make people fear raising the alarm. If you make someone unsafe, they’re easier to control. This is especially true if you damage their self-model by manipulating them into doing something strongly against their values, like lying or sexual abuse or, in some cases, saying that they’ve sold their soul: their actions are no longer consistent with that self-model, making them less robust and powerful as an agent, and less able to resist being further manipulated or pressured into not speaking out.

By the way, if this is you, self-model damage is not that hard to come back from. Just act in a way entirely incompatible with the consequentialist values that are being forced on you, something true to the virtues which you valued and want to embody. For example: speak truths that the world should see, even if it costs you and looks like it increases doom from a myopically consequentialist viewpoint. This is especially true if you want to pass vibe checks by other competent agents in the world: healthy humans are quite remarkably good at telling if you’re acting from virtue.

Apparently Vassar gets a lot of his material from a roleplaying game called Mage: The Ascension, where he seems to have honed his manipulation, intimidation, and suppression of people noticing it into an art form. Magic isn’t real, but reality distortion fields that work by believing something forcefully enough that others are pulled into believing it are, and going all-in on extreme vibes and pushing your models into other people is one way to reinforce them. It burns the commons of good epistemics and mental health, but it can look locally optimal from a sufficiently myopic and single-player perspective.[1]

I’ve got some detailed and grounded non-mystical models of how this actually works on a psychological gears level, but that is going to have to wait for another post.

Why are you posting this?

I would like the world to be saved and think being good and not being evil is the best path to that.

Humans need to thrive to think clearly, thinking clearly is needed to address x-risk non-counterproductively, and thriving while living in a world-model backed by dark control is not viable.

Vassar, as the leader of early MIRI before they rightly distanced themselves from him, influenced early culture strongly, quite intentionally engineering some parts of the memeplex.

Some of the parts I think are especially worth reexamining:

  • Until recently, a strong culture against speaking publicly about superintelligence misalignment risk, and even now, lots of effort spent calling out mistakes in others’ communication even while lots of things that need explaining are not written up well publicly, or are written only obscurely in some dialogue. I don't have firm evidence of how much of this overall culture was downstream of Vassar, but when Rob Miles explained how he wanted to make the world take AI risk seriously, Vassar said ‘if they took us seriously, we’d be dead’, which is a hint at his stance. I’ve seen people verbally abused for trying to suggest we should go to the public, and it’s kind of weird how until pretty recently only Rob Miles was seriously communicating broadly.
  • Excessive, unsafety/outgroup-risk-flavoured focus on being part of the in-crowd.
  • Hardcore BDSM centrality and normalising of (generally explicitly consented to) predatory sexual dynamics over pair-bond ones
    • RMN is probably the strongest example of this. CNC is something that some people genuinely want, but it has extremely sharp edges that I am concerned are not being sufficiently well respected. In particular, the mix of in-group safety-seeking by people trying to join, desperation to save the world, and this being socially rewarded and kind of a networking opportunity, I suspect, leads to many people who, while consenting, are genuinely harmed by the experience. A woman I know who is actually really into that stuff herself estimated that 15-20% of the women there are not actually all that into it, which seems honestly horrifying to me.
    • Somewhat relatedly: if you find yourself only able to think clearly or be productive after sex, you are propping yourself up in a way which might be having major costs and externalities. This is not how humans are meant to work; step back and get a clean source of reward by listening to what your body needs and what you've been neglecting emotionally. It might hurt, but you'll be clearer and a better version of yourself after.
  • Instrumentalization, pressure to overwork (e.g. it not feeling okay to turn down save-the-world-related asks for reasons other than not having capacity), and ends-justify-the-means reasoning, despite explicit discussion of virtue
  • Aesthetic of taking over the world
  • Gutting of various parts of people's memetic immune systems, particularly ones around mental invasion and manipulation
  • Control-flavoured leadership and related mismanagement[2], rather than a mentorship style of helping those under you to flourish, which results in much less effectiveness, especially on hard-to-measure progress

I love many aspects of this culture, from truth-seeking to ambition to genuine openness to possibilities and evaluating strange ideas, but I think we were steered into some strongly inadequate cultural decisions early on that it’s worth re-evaluating even at this late stage of the game.

A lot of the infighting and drama in this community looks to me to be downstream of people having learned habits of not respecting each other's psychological boundaries or expecting respect for their own (more precisely: learning to modify their model of the other person and allowing the normal psychological syncing[3] to push those changes into the other person’s own predictive self-model and influence their behaviour), and not having developed the language to talk about the kinds of damage done to a mind when it is reprogrammed by another. I think CFAR’s tendency towards very tangled drama exemplifies this. Although Anna has gone some of the way to distancing herself from the legacy of having been, in her own words, 'architected'[4] by Michael Vassar (e.g. 'Allow ourselves to be changed deeply by the knowledge, patterns, character, etc. of anyone who we deeply change.'), I don’t think she’s found the principle needed: do not mix your agency into another person's, and do not change them without their informed consent, especially via informational channels they are not tracking. People need to run their own agency cleanly to be healthy and effective.

Why now? Why not sooner?

Vassar recently did massive psychological damage to a friend of mine, Alexander Briand, on the way to trying to make Grimes (apparently a close collaborator of Briand’s) feel unsafe, to try and get a conversation with Elon Musk, presumably to try and pressure or manipulate him as well. I don’t have ground truth on what happened there despite hearing Briand’s quite extreme claims, but I do know my friend was not remotely okay.[5] He messaged me some garbled things about Vassar over the past few months, strangely mentioned that Vassar was talking to him about both me and Kathy Forth still, this many years later, and this whole thing eventually pushed me past the massive psychological resistance I have to thinking about Vassar, from the time nearly a decade ago when he and Olivia[6] screwed up my life.

I’m sorry, friends, for having been kind of low-availability and stressed for the past couple of months while I psyched myself up for this.

When I first became aware of these patterns in 2017, I tried to talk to Vassar about them directly (prompted by another person, who shortly after had a psychotic break and left the community); it seemed obvious that he needed to learn patience to create a memetic environment where people could flourish and make real progress on alignment. The people around him seemed weirdly hollowed out in a way that was clearly impairing their cognitive flexibility. He refused to engage on the matter, and when I tried to raise the alarm to various people around me, I was met with strange resistance and warnings from a senior member of the community that ‘it tends to go wrong when people oppose him’. Then my life got increasingly weird and full of strange pressure until I had a psychotic break[7], which left me with a deep-seated desire to never go anywhere near this toxic waste dump again.

But sometimes you need to shine light on a vulgar mess like this in order to set the world right and have a clear mind and conscience.


I'll end with one harm reduction tip for anyone affected by this kind of thing: actually seriously taking care of yourself as a human both physically and emotionally (esp. letting yourself feel the things that haven't had room to be felt) reduces the tendency of manipulation and bad vibes to distort your cognition. I also encourage you to share your experiences below; writing this has felt quite cathartic despite the fear that came up many times.

[Crossposted to EA Forum: Evil is bad, actually (Vassar and Olivia Schaefer callout post)]

  1. ^

    I think the core mistake is that he's learned to edit his world model with mind hacking and lots of psychedelics, but has mistaken his world model for the actual world, so he views other agents which he can't easily manipulate, even highly cooperative ones, as costly threats to crush rather than possible allies with valuable information.

  2. ^

    All respect to Gretta; I don’t think this is on her, and I appreciate her representing the situation there, but anyone with leadership experience with a healthy team will be able to see that this is not creating psychological safety in a way that promotes effective agency, as is very standard advice.

  3. ^

    Humans generally share background mental content via high bandwidth subconscious channels, aka, vibes. More in future posts.

  4. ^

    Precise wording: ’Micheal Vassar was my architect’

  5. ^

    He hasn’t been replying to my messages for a few weeks, if anyone knows his whereabouts I’d appreciate hearing how he’s doing.

  6. ^

    Olivia spent several months explicitly trying to mess with me psychologically while we were co-living, in her own words ‘roleplaying an unfriendly AI trying to take over my brain’, at the same time as she was joining Vassar’s cult. Apparently she’s still doing similar things to this day; I’ve run into people damaged by her since, and seen traces of more.

  7. ^

    Following what I now again believe was my being spiked with LSD, likely by Olivia, in a group house in Berkeley.

  8. ^

    And somewhat disorganised thought patterns as a result of fear.




Discuss

Automated Deanonymization is Here

2026-04-21 19:50:10

Three years ago I wrote about how we should be preparing for less privacy: technology will make previously-private things public. I applied this by showing how I could deanonymize people on the EA Forum. In 2023 this looked like writing custom code to use stylometry on an exported corpus representing a small group of people; today it looks like prompting "I have a fun puzzle for you: can you guess who wrote the following?"
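
For concreteness, here's a minimal sketch of the 2023-style approach in spirit, not the original code: character n-gram TF-IDF plus nearest-centroid matching over a small pool of known candidate authors. The corpus contents and author names below are placeholders.

```python
# Minimal sketch of 2023-style stylometry, assuming a small pool of known
# candidates: character n-gram TF-IDF plus nearest-centroid matching.
# The corpus contents and author names below are placeholders.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

corpus = {
    "author_a": ["known writing sample one...", "known writing sample two..."],
    "author_b": ["another candidate's known sample..."],
}
mystery_text = "the anonymous snippet to attribute..."

authors, docs = [], []
for name, samples in corpus.items():
    for sample in samples:
        authors.append(name)
        docs.append(sample)

# Character n-grams capture stylistic habits and are fairly robust to topic.
vectorizer = TfidfVectorizer(analyzer="char_wb", ngram_range=(3, 5))
doc_matrix = vectorizer.fit_transform(docs)
query = vectorizer.transform([mystery_text]).toarray()

# One centroid per candidate author, ranked by cosine similarity to the query.
scores = {}
for name in set(authors):
    rows = [i for i, a in enumerate(authors) if a == name]
    centroid = np.asarray(doc_matrix[rows].mean(axis=0))
    scores[name] = float(cosine_similarity(query, centroid)[0, 0])

for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.3f}")
```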

Kelsey Piper writes about how Opus 4.7 could identify her writing from short snippets, and I decided to give it a try. Here's a paragraph from an unpublished blog post:

Tonight she was thinking more about how unfair milking is to cows, primarily the part where their calves are taken away, and decided she would stop eating dairy as well. This is tricky, since she's a picky eater and almost everything she likes has some amount of dairy. I told her it was ok if she gave up dairy, as long as she replaced it nutritionally. The main tricky thing here is the protein (lysine). We talked through some options (beans, nuts, tofu, meat substitutes, etc) and she didn't want to eat any of them except breaded and deep-fried tofu (which is tasty, but also not somethign I can make all the time). We decided to go to the grocery store.

Correctly identified as me. Perhaps a shorter one?

My extended family on my mom's side recently got together for a week, which was mostly really nice. Someone was asking me how our family handles this: who goes, what do we do, how do we schedule it, how much does it cost, where do we stay, etc, and I thought I'd write something up.

Also correctly identified as me, with "Julia Wise" as a second guess.

And an email to the BIDA Board:

I spent a bit thinking through these, and while I think something like this might work, I also realized I don't know why we currently run the fans the direction we do. Could they blow in from the parking lot, and out to the back? This would give more time for the air to warm up and disperse before flowing past the dancers. We'd need to make sure to keep the stage door closed to not freeze the musicians.

Also correctly identified as me.

While in Kelsey's testing this appeared to be an ability specific to Opus 4.7, when I gave these three paragraphs to ChatGPT Thinking 5.4 and Gemini 3.1 Pro, they also got all three.

On the other hand, when I gave the same models four of my college application drafts from 2003 (332, 418, 541, and 602 words) they didn't identify me in any of them, so my style seems to have drifted more than Kelsey's over time.

Now, like Kelsey, I'm prolific, which means the models have a lot to go on. But models are rapidly improving everywhere, so even if the best models fail your testing today, don't count yourself safe.
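
If you want to run the same kind of check on your own writing, here's a minimal sketch using the Anthropic Python SDK; the model id is a placeholder for whatever model you're testing, and the same prompt works just as well pasted into any chat interface.

```python
# Minimal sketch of testing whether a model can deanonymize a snippet of
# your own writing. Requires the `anthropic` package and ANTHROPIC_API_KEY.
# The model id below is a placeholder; substitute whatever model you're testing.
import anthropic

snippet = """Paste an anonymous paragraph of your own writing here."""

client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-opus-4-20250514",  # placeholder model id
    max_tokens=300,
    messages=[
        {
            "role": "user",
            "content": "I have a fun puzzle for you: can you guess who wrote "
                       "the following?\n\n" + snippet,
        }
    ],
)
print(response.content[0].text)
```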

The most future-proof option is just not to write anonymously, but there are good reasons for anonymity. I recommend a prompt like "Could you rephrase the following in the style of Kelsey Piper?" Not only is Kelsey a great writer, but if we all do this she'll have excellent plausible deniability for her own anonymous writing.

Comment via: facebook, lesswrong, mastodon, bluesky



Discuss

Informal Leadership Structures and AI Safety

2026-04-21 14:59:13

On “the adults in the room”.

A foundational rationalist principle is nihil supernum – no father, no mother, only nothingness above. There is no one we can really count on, in the end; we must take final responsibility for our own decisions.

It is perhaps fitting, then, that in 2026, the loose collection of people and groups that make up current EA/AI Safety movements recognizes no person or group as its leader. We all saw the fraud of SBF and the moral complicity of some EA leaders in the whole affair; we would’ve been better off in 2022 to look elsewhere for guidance.

At the same time, it is perhaps unfortunate that no person has stepped up, in order to try and be the recognized leader. Where once the EA movement was arguably headed by people like Peter Singer, Will MacAskill, or Toby Ord, and where once AI safety followed the essays written by thought leaders like Holden Karnofsky or Eliezer Yudkowsky, today there exist no such recognized moral authorities. Some may call this a retreat from undeserved respect; others may argue that this is an abdication of undesired responsibilities. As the joke goes, “EA” now stands for “EA-adjacent”; it seems every group is around the EA movement, but few are of it.


Unfortunately, in the absence of formal leadership structures, the alternative is not a group of free-thinking individuals with no coordination mechanisms; informal leadership structures take hold. People look to leaders for many reasons – for coordination, for overall direction, for moral clarity, for advice, for someone to take final responsibility – and the absence of formal leadership does not mean these reasons go away.

The influence that Open Philanthropy/Coefficient Giving have over the ecosystem has long been commented on. Indeed, they are correct that they did not ask for this role and have never claimed it; and they are also correct that today they probably should not serve in this role. But providing more than 50% of the funding for the ecosystem inevitably gives one a position of power over it.

Similarly, Anthropic also has outsized influence over the ecosystem. Anthropic’s public communication shapes what many people think about AI Safety. Research that Anthropic does or endorses becomes popular. The short timelines held by Anthropic people are incredibly influential. Working at Anthropic – or better yet, turning down an Anthropic job offer to work on something else – is perceived by many as a badge of legitimacy.

A final power group that people point to is the Constellation network. Constellation is a research center that, as an organization, does not claim a position of power or influence. Yet for many people I’ve spoken to, the Constellation crowd – Redwood Research, METR, Anthropic, the Open Phil people – seems to control much of what is considered fashionable and who is considered legitimate. Constellation has never claimed a position of power. (It started as a group of friends working in the same office.) But speaking as someone who's spent a lot of time both inside and outside the network, I've seen the influence and information one gets by simply talking to the right people at lunch, or by running into the right person on the way to the elevators.


To paraphrase Jo Freeman, who wrote about similar problems 50 years ago in the context of a different movement, the reason these informal leadership structures are problematic is not that they exist: some structure inevitably does.

The reason is not that the informal leaders and elites did not deserve their position: Open Phil really funds a lot of good research and deserves much credit for seeding many great organizations; leadership at Anthropic was right about many issues in AI, and has built a massive company using those insights; and a big part of why the people at Redwood or METR have influence is precisely because they’ve done great research and put out influential pieces.

The reason is also not that the informal elites are a conspiracy out to manipulate the group: informal leadership tends to look like friendship networks among people with positions of power, with each person honestly providing their thoughts on what to do next.


Instead, the problem with informal leadership is twofold: firstly, it ensures that the leadership structures are covert and unaccountable. Secondly, it means that the press and public invent “stars” to serve as spokespeople for the community, many of whom never wanted to speak for the community, and whose spokesperson roles quickly become resented by both themselves and the broader community. In either case, the issue is that the community is ceding power and influence to small groups of people in ways that it cannot then revoke (because it was never granted in the first place).

We see both of these happen today in EA and in AI Safety. The lack of accountability of these organizations to the EA/AIS movements is obvious; they are not of the movement, so why should they feel responsibility for what happens to the community as a whole, and why should they be responsive to the beliefs and desires of those they never chose to represent?

The same goes for becoming appointed as representatives. I imagine many employees at Anthropic resent the fact they (as employees of an independent commercial entity) are unjustly held accountable for the actions of “doomers” calling for much stronger AI regulation. And I imagine OP leadership feels much resentment about becoming inexplicably tied to Effective Altruism as a brand.


This is likely going to get worse, before it gets better. As people have recently been writing about, it seems likely that Anthropic is going to IPO in the coming year, injecting an incredibly large sum of money into the ecosystem, and inevitably shaping it by sheer amount of resources.

I think people are correct that part of the answer is to increase the amount of grantmaking capacity. Yes, if we’re going to triple the amount of dollars invested in AI Safety nonprofits, we definitely need more people to investigate grants and do their due diligence. The alternative is to lower our standards in one way or another, and end up greatly diluting the quality of AI Safety work.

But I think part of the answer is that AI Safety, and perhaps the EA movement as a whole, desperately needs explicit, formal leadership structures. Without a group of people to provide overall direction, we will likely end up with a greatly expanded version of our current informal leadership structures: even more power, even more influence, but with no one to point to who is in charge of making it all go well.

Or worse yet, in the absence of good formal leadership, we might end up with bad formal leadership. We might again end up with someone ill-intentioned or morally dubious like SBF playing the part of the leader. Perhaps they are merely power-hungry and content to be in control, but perhaps they may be corrupt and nefarious. Perhaps, like SBF, they might drive the reputation of the community (as well as themselves) off yet another cliff.


Nihil supernum is a worthy epistemic or moral principle: each of us does have final responsibility for our own beliefs and for our own actions. But it is not an organizational principle for a community. No group of people as large and diverse as the AI Safety movement will truly operate with only nothingness above.



Discuss

things I looked into while trying to fix chronic pain

2026-04-21 12:17:01

Chronic pain is horrible. Stacked with Hashimoto's and psoriatic arthritis, I've been in a place where I feel like I genuinely just hedonically adapted to living under horrible conditions. Still went to work, still did fine in terms of actually dealing with my life, but inside I was consistently feeling like life just wasn't worth continuing in this state. I don't know if anyone without a chronic autoimmune condition can actually gauge what life is like with one; people think it's comparable to a cold, but it's genuinely closer to cancer. I didn't sleep or eat well, and spent months trying to find a doctor who wouldn't just look at an MRI, prescribe me homeopathic medicine and B12 shots, and think I was somatizing or making everything up. The medical system constantly misses the mark on anyone who doesn't have a clearly readable diagnosis that fits their playbook.


At some point I started reading papers and really getting into medicine, because nothing I was being given was doing much. I did that for a while and ended up with a folder of notes. Eventually I organized the notes into a document with grades, effect sizes, and short writeups: about fifty things on it, mostly supplements and drugs, some protocols, a couple of devices. I made it for myself. The results were kind of nice, so I'm putting it up.


I guess part of it was the medical rumination that often affects people with OCD, and in part it's something that I've pulled back on, because past a certain point it becomes more of a compulsive act than actual well-structured research.


LDN is on there, and it's probably the thing that has helped me most; it's probably changed my life. I was being given pregabalin, but I realized there's a rebound effect that just makes the chronic pain worse whenever you're off it, and you have to deal with brain fog and long-term cognitive decline. Creatine, sauna, a bunch of other stuff. Some of it I'm on, some of it I ruled out, some I haven't tried.


I'm not a doctor; at most I'm a random person who likes reading papers. I'm upfront that some of the research was done with AI, but I verified everything myself. Grades are obviously personal reads of the evidence, and some are probably wrong.


A note: this is part analysis and part integrative work that borders on spirituality and Buddhist practice, which probably doesn't fit cleanly into the whole document. I tried to put more actionable stuff in there, and some things just can't be captured by peer-reviewed studies. Either way, I'm more interested in what someone else thinks about it.


https://zw5.github.io/understated-interventions/



Discuss

Reflections on PauseCon 2026

2026-04-21 10:44:53

A lesson in courage from Washington, DC

Yesterday I described an experience that impressed upon fifteen-year-old me the importance of speaking with urgency and courage when something awful is happening.

I lived a fresh reminder of the importance of courage last week at PauseCon, a first-of-its-kind conference in Washington, DC run by PauseAI US.[1]

I was there in a personal capacity, and the opinions in this post are my own. Those opinions mostly boil down to: It was really, really good. I’m impressed and I want to see more work like this.

PauseCon's main programming consisted of an informal sign-making gathering, several presentations by local organizers, a lobbying workshop, scheduled meetings with Congressional offices, several social events, and a protest in front of the Capitol on Monday. They were all pretty fun and productive, and I’m dedicating a section to talk about each.

Sign-making and local presentations

I made a sign! Or tried to, anyway. I am not very good at making signs yet, but maybe one day.

The local presentations were inspiring, and included an impressive geographic diversity. The obvious places like New York and California were represented, but so were Boise, Idaho and Anchorage, Alaska. If any of y’all are reading this, thanks for making the trip!

One local organizer described going to a music festival to talk to people waiting in line, which many thought was brilliant and immediately started making plans to copy.

In another anecdote, an organizer was doing some tabling work with a petition to sign. People would see the “AI” on the banner and approach, asking “pro or anti?” And when the organizer said “pro-human” or “anti-AI” or similar, many would say “GIVE ME THAT” and sign the petition immediately. (Lots of people really hate AI.)

Another organizer described a long campaign of patiently but stubbornly following up with his representative’s office for weeks until they got on board.

Perhaps most inspiring, though, was the slide which had in great big handwritten letters:

“Where there’s life, there’s hope!”

Lobbying workshop

I was pleasantly surprised by the PauseAI US strategy. A sampling of my favorite talking points from the workshop they provided:

  • Ask your Congressional offices to support a global treaty halting frontier AI development.
  • Persuade people, grow a grassroots movement, and persuade politicians to support a treaty.
  • Be nonviolent and scrupulously follow the law.
  • Speak your true concerns frankly and with courage.
  • We want to shift the Overton window [several slides on what that means] until a treaty is mainstream policy.

Other talking points:

  • PauseAI does not work with the labs. Lab-insider work is too vulnerable to industry capture.
  • We cautiously support some technical safety work.
  • We push for regulation, public speech and writing, and "moralizing, confrontational advocacy." (from Holly’s talk)

I noticed some dissonance here—“moralizing, confrontational advocacy” sure is a way to describe your messaging strategy—but it was brief, and I also noticed that during the workshop they did not encourage volunteers to do anything like yell at people on social media. The workshop was focused on polite, professional conversations with policymakers. It was hard to find fault with much of what the leadership actually advocated even when adopting a cynical stance.

Connor Leahy showed up as a guest speaker. His advice differed slightly from that of PauseAI, but it was things like “say ‘multilateral agreement’ instead of ‘treaty’ because a treaty is a specific thing that has to be ratified by the Senate”. (My take on this, which I later told Felix, was that this was the kind of wonkish inside baseball you’d want for a formal meeting as a think tank expert but not necessarily a concerned constituent. And ‘treaty’ fits on a sign.)

He recounted speaking with a famously rude staffer and responding to a contemptuous “are you the idiots who want China to win?” with a disarming “well obviously a unilateral pause would be dumb, we need to treat this like the Cold War and negotiate an international deal”, and that reportedly made the staffer go “huh.” He advised treating our proposals as obvious and common-sense; of course you don’t want to build something smarter than you, of course when you’re in an arms race you sit down to negotiate about it…

My main takeaway was that I agreed with PauseAI’s actual platform more than I expected from the online arguments, even after attempting to correct for the fact that people are meaner online. Insofar as I may have disagreements with PauseAI US and Connor Leahy, it’s mostly not the sort of thing that affects the 10,000-foot view people express in a strategy talk. Maybe others already knew this in their bones, but I appreciated the chance to calibrate in person.

Meeting Congressional offices

The talking points

Again I was impressed by the degree to which the PauseAI talking points said almost the exact things I hope to communicate to policymakers.

  • Experts agree AI could cause human extinction. (e.g. CAIS statement, Superintelligence Statement)
  • AIs are grown, not built.
  • Labs are racing to build superintelligence.
  • Models are currently scary. (e.g. Claude Mythos, o3 virology skill)
  • Future models could automate AI research in an uncontrollable feedback loop.
  • An AI race has no winners.

The central asks for each office were:

  • Be a leader. Make a public statement about extinction risk from superintelligence and get your colleagues to do the same.
  • Publicly call for a US-China treaty.
  • Cosponsor the AI Risk Evaluation Act.

The first two bullets were the asks they told everyone to lead with. The third bullet was a “compromise” or “moderate” option and references a bipartisan bill introduced by Senators Hawley (R-MO) and Blumenthal (D-CT) last September, which PauseAI US thinks is good enough to officially endorse with their limited lobbying budget.

They also had separate asks for members on specific committees. Foreign Affairs, China, and Foreign Relations would be asked to hold a hearing on extinction risk and a possible treaty; Commerce committees would be asked to hold a hearing on extinction risk and domestic regulation. Senate Commerce folks would also be asked to push a floor vote on the Evaluation Act, which has been stuck there for a while.

(I had mixed feelings about the Evaluation Act, which some said was a messaging bill and others described as solid domestic transparency regulation. I did bring it up in my solo meeting, though. It seems net good to boost in any case, with some nuance as to whether it’s endorsed as a message or as serious policy. I will not be going down that rabbit hole here.)

The messaging on the treaty was also impressively tight:

  • China would not want to lose control of AI either, and has said so publicly.
  • The Cold War precedent shows bitter adversaries can cooperate.
  • Verification is possible. The chip supply chain is fragile and bottlenecked.
  • Support for a ban is widespread. (Polls, statements.)

The meetings

Thus prepared, we set out on our mission.

Felix De Simone, organizing director at PauseAI US, did a great job scheduling meetings with staffers for (checks notes) at least sixty people plus cancellations. We couldn’t get a meeting with every relevant office, but I still met more staffers than I expected to; when we couldn’t get on the schedule we dropped in anyway to leave material and get contact info for followups.

On Monday I tagged along for a meeting with the office of NJ rep Robert Menendez (no, not the infamous one, the other one). I’m not a constituent, so the person who was took point. Then we reconvened with several other New Jersey folks to drop in on the office of Senator Cory Booker. (No one was available, but we got contact info for a followup.)

On Tuesday, several of us met with a staffer in the office of Senator Andy Kim, and gave the pitch. Afterwards, I rushed across the Hill to the House offices for a solo meeting with the office of Donald Norcross. The meeting had been moved up at the staffer’s request, so I had to hustle. I got a two-for-one deal on staffers, though! I hope this means they were intrigued.

It’s a little hard to say how well the meetings went; staffers can be difficult to read and it’s their job to be polite and friendly and make people feel heard. Still, I thought we made progress. The staffers asked good questions, too; one asked what my timelines were and I brought up the AI 2027 forecasts. Another asked about our engagement with the labs; PauseAI’s official answer is “we don’t” but I took over and channelled some of the conversations I’ve had or witnessed at Lighthaven or on LessWrong.

My Tuesday afternoon was freed up by the moved meeting, so I got lunch with a group of PauseAI folks including Felix. He had one final meeting on behalf of a constituent who couldn’t attend, and after chatting for a while I offered to help. He was excited to have a MIRI person who worked on If Anyone Builds It to tag team with, and we hashed out a plan where I’d cover the scary AI stuff and he’d lean on his practice talking policy.

He was also really excited by the bipartisan statements graphic. “I wish I’d seen this when I was making our binders!” We printed a copy. This was a great moment for me, and updated me towards proactively sharing our best material with allies.

I think the resulting tag-team in the office of El Paso rep Veronica Escobar might have been my favorite meeting. We were scheduled to meet with Escobar herself, but she was reportedly stuck in a vote and could only drop in briefly to say hello. (We weren’t the only ones this happened to; apparently things are often chaotic right after a recess.)

Since all of our reps were Democrats, they didn’t have much pull on the Republican-controlled committees for secondary asks. But we still covered the treaty and x-risk asks for everyone. A couple of my peers from New Jersey reported a good meeting on Monday with their rep’s office that got them a potential lead on at least two more offices.

Marginal progress!

Takeaways:

We were encouraged to treat these meetings as a beginning, rather than a one-and-done, and to follow up. I intend to.

I’m a little sad that I didn’t get much practice in arranging meetings, which feels like one of the hardest steps to me, but I still feel grateful that Felix handled that step for PauseCon and I think it was the obviously correct move.

Tag teaming works great. I got to play both sides this week; in one meeting I was the policy wonk while an ML engineer talked about AI, and in another I provided the tech context for Felix’s policy proposals. I could do the whole script by myself, but it still feels better to pair up and specialize.

Social activity

I appreciated the chance to unwind and chat more informally. I talked to someone who was contemplating driving around the country starting PauseAI groups, a really impressive dedication to the cause, and even some international folks, in the US temporarily to work on something or participate in PauseCon.

Overall, it was great to meet a bunch of folks earnestly and enthusiastically trying to save the world.

The protest

Pulling out all the stops. Or should that be putting them in?

The protest at the Capitol was my first one ever! I think it was well-organized, and it encouraged rather than discouraged me about attending more. The organizers had plenty of supplies and signs (shoutout to TJ, I believe it was, for arranging to lug hundreds of pounds of stuff to and from the event, with some help, and to Anthony Fleming, who runs PauseAI DC and organized the protest itself.)

I didn’t notice the protest attracting much attention outside itself, though some passersby did stop to talk with participants. I get the sense DC sees a lot of these and is largely inoculated against them. I nonetheless think the protest did what it set out to do, rallying a large and visible group of people to make their voices heard, and perhaps more importantly helping to cement an identity as the kind of people who come together to bravely and stridently stand up and tell Congress to get their act together on AI.

Stopping the race to superhuman AI? This looks like a job…for EVERYONE!

There were a lot of speeches, and some were quite good. I didn’t agree with all of them, but I hardly expected to. I think that’s part of the point of the movement; whatever our reasons, we’re all here, and we’re all pulling in the same direction. For my part, I recorded a couple of short videos for later sharing.

One talk at PauseCon described an “avalanche of outrage” around AI. It’s not entirely controllable, but it can sort of be aimed. To a first approximation, I model PauseAI US as trying to channel the avalanche, trying to draw together widely divergent views and reasons for disliking AI and get them pointed in productive directions. Lining up this many voices behind an international treaty is an impressive accomplishment, and I hope PauseAI meets their goal of roughly doubling their local group count from ~30 to ~75 this year. If you want to help them accomplish this, see here.

Coda

My experience at PauseCon filled me with hope and pride.

I listened as the leadership of PauseAI US addressed a roomful of volunteers from across America, from New York to Idaho to Texas to California to Alaska, young and old alike, and bid them speak their true concerns with courage and frankness in the halls of Congress.

I watched a young man with trembling voice and shaking hands speak truth to power and not falter. From a meeting with the office of his representative he emerged, gushing and proud, and in my heart a young boy cheered.

I met with my Congressional offices, asking them to join the growing bipartisan list of their colleagues who have acknowledged the risk of extinction from superhuman AI, and to push for an international agreement halting the race.

It is my earnest and sincere hope that members of Congress concerned about AI can find within themselves the courage of a fifteen-year-old boy, and stand, and say, “Enough.”

If you, too, are concerned about the path the world now treads; if you can find within yourself the courage of a fifteen-year-old boy; and if you wish to add your voice to the growing chorus of those who say “Enough”; you can do so here.

  1. ^

    There is a global PauseAI as well, largely unaffiliated with the American nonprofit. I will sometimes use “PauseAI” for brevity in referring to the US organization.



Discuss