The rest is commentary. If there is enough housing, it will be affordable, people will afford more house, and people will be able to live where they want to live.
It’s always been that simple.
Increased supply of any kind of housing increases affordability of all kinds of housing.
Are there other things that would also be helpful? Yes, but they’re commentary.
Freeing up existing underused housing, for example, is helpful. It is commentary.
Let’s enjoy the lull and see how much of an Infrastructure Week we can do.
New Levels Of Saying Quiet Part Out Loud Even For This Guy
Trump opposes building houses where people want to live, because doing so would let people live there, which would drive down the value of existing homes.
Acyn: Trump: I don’t want to drive housing prices down. I want to drive housing prices up for people who own their homes. You can be sure that will happen.
unusual_whales: Trump: when you make it too easy and cheap to build houses, house prices come down. I don’t want to do that.
I take the bold stance that building more housing makes it cheaper, and that’s good.
Any time you are wondering about Trump’s relationship to affordability, know that where it counts he has taken a bold stance, and that he is explicitly against it.
This isn’t even typically true, since the value of the associated land goes up.
When you change the law to make it legal to build more homes on the same plot of land, the value of the land goes ***up*** while the per-unit price of housing goes ***down***.
Homeowners usually own both land and structures!
Whose Side Are You On.
Robotbeat ➐: I actually really appreciate Trump making this political dynamic explicit.
Boring_Business: One of the most significant moments from the Trump Davos speech was when he said the quiet part out loud
You cannot lower housing costs for young people without destroying millions in wealth for boomers
“Every time you make it more affordable for somebody to own a house cheaply, you are actually hurting the value of those houses. I don’t want to do anything to hurt the value of their house.
If I wanted to crush the housing market, I could do that so fast that people could buy houses. But you would destroy people who already have houses.”
Our politicians are sacrificing people in their 20s and 30s for the prosperity of boomers
Let that sink in
Your Intervention Only Partly Solves The Problem So We Are Against It
Trump is against building more housing because it might reduce housing prices.
Julie Weil reports that ‘supply skeptics’ oppose building housing where people want to live because it would not reduce prices by enough to fully take care of our inequality problems, so ‘that’s no way to fix affordability.’
As in, and this is the argument they choose to make here, if we built enough to grow housing stock by 1.5% a year, which in the grand scheme does not sound like much but is these days remarkably hard, then that might cause housing prices to only decline 4% per year, so it would ‘take 18 years before a median one-bedroom apartment becomes affordable for a worker in San Francisco.’
I notice I am confused how this is an objection.
Their other complaint is that the housing being built is too useful and in demand, and thus will cost too much money, and therefore it should be worse instead.
Specifically, if San Francisco increased market-rate housing by 1.5% per year, which it notes is more than triple the current rate, it would take 18-124 years to make the median one-bedroom ‘affordable’ to someone earning the median wage for non-college graduates.
That’s a bizarre standard. Why should a single non-graduate be able to live alone in San Francisco proper, one of the most desirable places on Earth? But if you want that, yeah, you’re going to need to expand the housing stock more than 27%.
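As a quick sanity check on that arithmetic (just compounding, using the 1.5%-per-year and 18-year figures from the argument above):

```python
# Back-of-the-envelope: how much does the housing stock grow at 1.5% per year
# for 18 years? (Figures as described above; this is just arithmetic, not a
# housing model.)
annual_growth = 0.015
years = 18

simple_total = annual_growth * years                  # 27% without compounding
compounded_total = (1 + annual_growth) ** years - 1   # ~31% with compounding

print(f"Simple sum:  {simple_total:.1%}")
print(f"Compounded:  {compounded_total:.1%}")
```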
Could you do that? Very obviously you could do that. You could build 30-story buildings all over the place, if you wanted to. Austin proves you can do it. But that’s not going to happen in San Francisco until both zoning and other government-imposed costs and veto points are handled.
As usual, if someone argues ‘you could double the housing stock and prices would not come down,’ the response is ‘that’s not how it works but if it was then that sounds amazing, think how much value you would be creating.’
Abundance
Matthew Yglesias: There are a lot of regulations curtailing air pollution. If we repeal those regs, then corporations will generate abundant air pollution.
This is bad because air pollution is bad.
If we repeal regs that curtail housing production, corporations will generate housing.
Changes In Rent Are Largely About Changes In Supply
Or at least, those things are highly distinct in places like San Francisco, Los Angeles or New York, where housing is extremely valuable but you have trouble getting permission to build it. If you allow building housing, then yes you should expect the two to converge, at which point it only makes sense to build more housing if you expect appreciation, and yes this helps explain why Austin may have actively overbuilt, although my guess is it at most got ahead of itself a bit.
America
Peter Wildeford: I wonder why population is moving south and the south is getting more electoral votes
Veronica Grecu: New apartment construction in the U.S. is flexing its muscle again in 2025, with an estimated 506,353 units expected to be opened nationwide by the end of the year.
New York Metro includes a wide area and places like Hackensack and Yonkers. Brooklyn is delivering 7,189 units, Manhattan 4,662 and Queens 2,630, versus 15,195 for Austin and 12,365 for Charlotte.
Minnesota
A new paper by Helena Gu and David Munro claims that five years after the Minneapolis 2040 plan, home prices were 16%-34% lower and rents were 17.5%-34% lower versus a counterfactual Minneapolis without that reform (p=0.012). But as much as I want to believe this worked amazingly well, and that there aren’t any other reasons driving this, the housing hasn’t been built yet.
We explore the possible mechanism of these impacts and find that the reform did not trigger a construction boom or an immediate increase in the housing supply. Instead, the observed price reductions appear to stem from a softening of housing demand, likely driven by altered expectations about the housing market.
I get why purchase prices go down anticipating future supply. But rents shouldn’t do the same, and should rise relative to purchase prices. So why didn’t that happen? I can come up with some ‘just so’ stories that try to explain some of the fall in rents due to anticipatory and longer-term effects on rentals, but rents falling fully in line with sale prices seems like it has to be wrong. If it’s right, the market is being highly irrational.
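To make the price-versus-rent intuition concrete, here is a minimal present-value sketch with made-up numbers (not calibrated to Minneapolis): the purchase price is the discounted value of future rents, so expected future supply lowers it today, while the current rent clears the current market and should barely move.

```python
# Toy present-value sketch: anticipated future supply lowers purchase prices
# now, but shouldn't lower *current* rents nearly as much.
# Illustrative numbers only; not calibrated to Minneapolis.

def price_from_rents(rents, discount=0.05):
    """Price as the discounted sum of future annual rents."""
    return sum(r / (1 + discount) ** t for t, r in enumerate(rents, start=1))

horizon = 50
baseline = [30_000] * horizon                       # rents stay flat
reform = [30_000] * 5 + [24_000] * (horizon - 5)    # rents fall 20% once supply arrives in year 5

price_change = price_from_rents(reform) / price_from_rents(baseline) - 1
print("Current rent change: 0% (nothing has been built yet)")
print(f"Price change today:  {price_change:.1%}")   # roughly -15%
```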
Debunking Obvious Nonsense About Monopolistic Practices
I sometimes worry I am pushing the envelope on how Obviously something has to be Nonsense before it counts as Obvious Nonsense.
This is not one of those times.
Derek Thompson: In housing, for example, Ezra Klein and I write that a key bottleneck to homebuilding in the last few decades has been legal barriers to construction, including zoning laws and minimum lot sizes. This is a mainstream view supported by economists and scholars who have studied the issue for decades.
The antitrust left, however, claims that the more significant factor is that big homebuilders abuse their power by holding back construction to juice their profits. “Big homebuilders withhold housing supply,” the antitrust advocate Matt Stoller has claimed.
In their paper “Post-Neoliberal Housing Policy,” the law professors Christopher Serkin and Ganesh Sitaraman criticize market concentration in homebuilding and call for “tools from anti-monopoly policy.”
The position that the primary problem is monopolies (or at least oligopolies) has never made any sense. Even if it theoretically did make sense and was a substantial driver holding back construction, zoning reform would allow for entry, as the mechanism to enforce the oligopoly is difficulty getting permission to build.
At a high level, I have never found these arguments persuasive. One hallmark of a monopolistic market is rising profits. But researchers have found that developer profits have remained steady.
If zoning imposes artificial supply restrictions, then profits could rise via a similar mechanism, which I presume is not happening because instead we build in increasingly convoluted ways while paying various massive de facto bribes to various veto points. But if profits are not rising, that rules out monopolistic supply restrictions.
In any case, Thompson decides to engage with one prominent article in particular.
I’m not an economist or a lawyer. I’m just a journalist. To the extent that I’m good at anything, it’s calling people on the phone and writing down what they say. So I reached out to the primary sources that Musharbash quotes throughout the piece.
What I found was astonishing. The economist Musharbash cites told me that his theories had been misapplied. The housing analysts quoted in the piece told me Musharbash distorted their points and reached dubious, or even flatly wrong, conclusions. The leading monopoly researcher I spoke to, whose work has been celebrated by the antitrust left, told me that the entire thrust of the article—and, by extension, much of the antitrust-housing philosophy—defied sophisticated antitrust analysis.
The essay you’re reading is very long. But I can sum it up in one sentence: The Musharbash essay on Dallas—like too much of the antitrust left’s work on housing—is filled with out-of-context quotes, overconfident assertions lacking evidence, and generally misguided claims. Now let’s go through them one by one.
I dispute one of Thompson’s claims. His essay is not that long. Well, at least not by my standards. But it is too long to quote that much of it, so we’ll hit the highlights.
Claim #1: Dallas is a “homebuilder oligopoly.”
Reality: I called the key oligopoly researcher cited in the Musharbash essay. He disagreed with the use of his work and told me that any city with Dallas’s construction record was “100 percent” not an oligopoly.
…
I called Luis Quintero to ask what level of market concentration in homebuilding he considered to be dangerous. In the most concentrated markets, Quintero said, one or two firms account for 90 percent of new housing. But problems begin to accelerate, he said, if five or six firms account for 90 percent of new housing.
I immediately saw a problem. In Dallas, the top two firms built just 30 percent of new homes in 2023. The top six firms barely account for 50 percent of new housing. Musharbash’s claim that a homebuilding oligopoly is crushing housing supply in Dallas relies on an economic analysis that doesn’t apply to Dallas at all. I asked Quintero about this: Would you agree that Dallas is “a bad application” of your paper? “I would definitely agree,” Quintero told me.
…
I tracked down a complete listing of the country’s 50 largest homebuilding markets, from #1 Dallas to #50 Cincinnati. How many meet Quintero’s first oligopoly threshold (two companies = 90 percent of the market)? Zero out of 50. And how many meet his second threshold (six companies = 90 percent of the market)? One: Cincinnati.
There is then further discussion with Quintero, who defends his original (very different) claim that oligopolies in housing construction matter by pointing to impact on suburbs or small towns with higher concentration.
Claim #2: Dallas housing experts say local homebuilders are monopolies who are “devouring” the market.
Reality: When I called up a Dallas housing expert [John McManus] cited several times in Musharbash’s essay, he disagreed profoundly with its thesis. He’s actually a big YIMBY.
…
So, what did McManus think was more responsible for driving up the cost of housing in Dallas? “Land use regulation,” he said. “Zoning?” I asked. “Yes. Land constraints and zoning that require certain footage along the street, or minimum lot sizes, or requirements about three-car garages, are more the cause of the prices increasing now,” he said.
We’re already approaching ‘stop, stop, he’s already dead’ territory, but we keep going.
Claim #3: Industry experts have data proving that homebuilding oligopolies are holding back national housing construction.
Reality: I reached out to an industry expert whom the antitrust folks like to quote. He told me that he disagreed with the way that his analysis is being used by Musharbash, Stoller, and other antitrust advocates.
Claim #4: “X companies account for Y percent of this industry” is a smart way to think about market concentration.
Reality: A leading monopoly researcher told me this is an incomplete and overly simplistic way to think about monopoly power.
Like Lambert, Roberts said he couldn’t rule out the possibility that larger homebuilders are actually good for the housing market— or even that today’s homebuilding markets would benefit from being even more concentrated.
So often we run into situations like this, where there is an Obvious Nonsense theory that never made any sense even under ideal conditions, and then you find out that even on its own terms it makes no sense and conditions are not only not remotely ideal, they don’t match up to the claims at all.
One can also look at the following simpler argument:
The argument from monopoly agrees that the primary problem is not enough housing where people want to live and the issue is lack of construction.
Homebuilders keep constantly fighting to build more housing than they are allowed to build.
When we let them build more housing they reliably build more.
We’re done here, right?
A group can’t simultaneously be restricting supply and also fighting to create as much supply as it can. It doesn’t make sense. Stop it.
At this point, one would like to invoke ‘stop stop he’s already dead’ but, well:
Matt Stoller: Ok, so I’m going to call out @DKThomp for journalistic malpractice and unethical behavior. He wrote a piece that was supposedly ‘debunking’ something that antitrust lawyer @musharbash_b put together on how Wall Street limits housing supply.
…
@DKThomp attacked the piece. Here’s what he said on Bluesky:
“I did something really simple. I called up their sources. Everyone I spoke to told me the same thing: Their claims are bullshit.”
Wow that sounds bad! So I picked up the phone.
Stoller goes on to claim that Thompson misrepresented the arguments in the original article when calling his sources, in particular Lance Lambert, and calls out Thompson as being in bad faith.
Then of course Derek Thompson picked up the phone again, called Lance again, and says he got Lance to sign off on every quote and all the language, and got Lance to say he agrees with Derek’s position, and that homebuilders are not a cartel or oligopoly, and in particular “I hope you both communicate my view that I don’t think the big builders are bad actors, or even that they have the power to be the bad actors.”
And indeed we have a response from Lance that both of them posted, in which Lance affirms that Derek is correct, and affirms that there is no one holding back supply. Various people including Matt tried to present this as a balanced perspective or as backing Matt. I do think Lance’s response is balanced and very good, but in a way that (in the most polite way possible) completely vindicates Derek.
Claims went around that the median home buyer was 59 years old. What?
Does it make any sense that the median home buyer could be 59? Really?
The real answer is closer to 40, which makes more sense and seems entirely fine.
Connor O’Brien: The National Association of Realtors says the age of the median homebuyer is now 59. Is that actually true?
In the American Community Survey, the median age of heads of households who are 1) homeowners and 2) moved in within the last year is 41.
The American Housing Survey also doesn’t show a major run-up in the median age of average buyers *or* the median age of first-time homebuyers.
This is average, not median age, but the New York Fed finds that first-time homebuyers were *younger* in 2024 than in the 2000s. Haven’t gotten older on average in nearly 20 years.
… As my colleague has pointed out, Gen Z and Millennials have taken longer to achieve the same home ownership rates than prior generations.
Homeownership is indeed less common for young people than it used to be.
Just isn’t nearly as dire as NAR implies.
… New York Fed Consumer Credit Panel data from Equifax (which excludes all-cash buyers, TBF) finds that a majority of home buyers were first-time buyers in 2023. Overall volumes are down since the early 2000s, of course.
Median 2024 homebuyer in the Survey of Household Economics and Decisionmaking: Age 39
A Zillow national survey of 20,000 homebuyers finds that the median buyer (among all buyers, not just first-time buyers) is 42 years old. The survey was in the field between April and September 2025.
We also have this survey showing a sharp rise in first-time buyer age after 2020, with the obvious reason being interest rates, but Ascendiqute points out that HUD and Fannie/Freddie define ‘first-time homebuyer’ as not having ownership in a primary residence for 3 years prior. So this is picking up a bunch of not-first-time buyers in its averages, who might have rented for a few years and then bought back in due to Covid dynamics.
Property Taxes Improve Allocation Efficiency
Whereas we often do the opposite via capital gains tax implications of selling, which provides strong incentives to stay put in ‘too much house.’
Bernard Stanford: A retired, empty-nester couple feeling pressure to downsize because their property tax keeps rising as their home appreciates—is exactly how the system SHOULD work.
Hunter: People absolutely HATE this, but it’s true!
It’s just way better and more important for society for a young couple to be able to buy a larger house and start a family than it is to prioritize older owners hanging on to huge homes they no longer need, memories be damned.
Matthew Yglesias: It would really be good if more empty nester couples living in large homes in desirable suburbs would realize their capital gains and downsize to smaller dwellings, but almost every jurisdiction gives special property tax breaks to discourage this.
Apartment buildings with elevators, less square footage to clean, no yard work, professional maintenance, etc seem ideal for America’s growing senior citizen population and you’d free up inventory for young families who will actually use the space.
If an old couple values the memories or other advantages of staying put, such that they are the efficient owners of the space, they should be willing to pay the associated property taxes. If they can’t or won’t, that means they either don’t actually value the house as much as the market does, are effectively ‘living beyond their means’ or both, and often this will mostly be due to inertia.
Property taxes also serve to lower the market value of housing, so the higher taxes don’t make it harder for new homeowners to buy, indeed they may do the opposite, and are highly progressive and a badly needed transfer to those starting out in life.
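A back-of-the-envelope way to see the capitalization effect, treating the home as a simple perpetuity with illustrative numbers:

```python
# Toy capitalization sketch: a tax levied as a share of the home's value gets
# capitalized into a lower purchase price (perpetuity model, made-up numbers).

def price(annual_rent_value, discount=0.05, tax_rate=0.0):
    # Owner nets (rent value - tax_rate * price) per year,
    # so price = rent / (discount + tax_rate).
    return annual_rent_value / (discount + tax_rate)

rent = 30_000
print(f"No property tax: ${price(rent):,.0f}")                 # $600,000
print(f"1% property tax: ${price(rent, tax_rate=0.01):,.0f}")  # $500,000
print(f"2% property tax: ${price(rent, tax_rate=0.02):,.0f}")  # ~$428,571
```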
The instinct against this is the idea that you should be able to own real property ‘free and clear’ and what’s yours is yours, without being forced to engage with any market, and rising property values shouldn’t effectively be able to force you out if you want to stay. I do totally get that attitude, but I don’t think we get that luxury.
Ideally, of course, we would instead have a tax on the unimproved value of land.
Another big advantage of higher property taxes is that they blunt homeowners’ desire for their property values to rise. This will make them more willing to allow new housing to be built.
The downside is that property taxes apply to newly built property and to the appreciation in value, so they discourage building. Ideally you would approximate the ‘unimproved value of land’ rule by only taxing appreciation from construction, or new buildings, after a reasonably long period, but yes you have to strike a balance here.
In exchange for the higher property taxes we should eliminate capital gains taxes on home sales for primary residences up to some reasonable limit, which further encourages people not to stick with existing homes.
More Of Old People Inefficiently And Systematically Stealing From Young People
In case you were worried young people might ever own real property.
Or as Marc calls it, without even putting poor in air quotes:
This is New Jersey giving those 65 and older a credit for 50% of their property tax bill, up to $6,500, as long as they have a total income of under $500,000. Income, not wealth, while they are over 65. No, they’re not kidding.
Not meant to be formal/rigorous - more stream of thought, as I'm trying to think about anthropics more.
I wanted to communicate a concept I'd like to call **Nirvana Rank** that I came up with when thinking about Anthropics and reference classes.
I'll cover some background first.
Anthropics is about reasoning from the fact that you're an observer of reality. It treats your own existence as evidence you can use to deduce things about the worlds you're in.
Within anthropics, people have clustered on two ways of treating your own existence as evidence:
The first is called the Self Sampling Assumption (SSA) - where you treat yourself as a random sample, a typical member from the space of actual observers that exist in your reference class.
The second is called the Self Indication Assumption (SIA), which treats you as a sample from all *possible* observers. This means SIA considers uninstantiated counterfactual realities while SSA does not.
There's something called the Doomsday Argument, a thought experiment downstream of accepting SSA.
The Doomsday Argument goes as follows:
Everyone who has ever lived or will ever live is assigned a "birth rank" denoting the order in which they were born. So the first person to come into existence is assigned a birth rank of 1, and so on.
Our universe could go on for a very long time, yet we find ourselves, relatively speaking, at nearly the very beginning, with very low birth ranks.
If you accept SSA, then observing yourself with a low birth rank means it should be a typical value. If humanity were to go on to last trillions of years and produce trillions more descendants, why would our birth rank be so low?
It's far more likely, then, that humanity will not last that long, that our birth rank is low because humanity will soon undergo a catastrophe barring it from extending far into the future.
This is the Doomsday Argument.
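For concreteness, here is the standard calculation with made-up population sizes: under SSA, a birth rank of r has likelihood 1/N under any hypothesis where N ≥ r total humans ever exist, so an early rank strongly favors small N.

```python
# Minimal Doomsday-Argument calculation under SSA, with made-up numbers:
# compare "doom soon" (200 billion humans ever) with "doom late"
# (200 trillion humans ever), given a birth rank of about 100 billion.

birth_rank = 100e9
hypotheses = {"doom soon (N = 200 billion)": 200e9,
              "doom late (N = 200 trillion)": 200e12}
prior = {h: 0.5 for h in hypotheses}  # start indifferent

# SSA: your rank is uniform over 1..N, so P(rank | N) = 1/N whenever rank <= N.
unnormalized = {h: prior[h] / N for h, N in hypotheses.items()}
total = sum(unnormalized.values())
for h, p in unnormalized.items():
    print(f"{h}: posterior = {p / total:.3f}")
# The early rank pushes ~99.9% of the probability onto "doom soon".
```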
There are likely existing refutations and modifications to the argument, but I wanted to present a thought I had, which may modify SSA and dispel the argument.
Part of that includes noting an observation about what reference class one is allowed to assume they are sampled from. Your reference class is the space you count as valid to consider yourself a sample of. In the Doomsday Argument, the reference class was all of humanity from the first born to the last one standing. You're a random sample from that.
Consider the following very simple very gruesome scenario. If you choose to kill yourself tonight, then tomorrow, 10 trillion trillion people will be born on several thriving exoplanets. Are you a random observer amongst yourself and those 10 trillion trillion people?
The answer is of course not - the existence of those people is conditioned on your nonexistence - you could never be sampled from them.
While this is obvious, the point of the scenario is to show, with an extreme case, that some reference classes can be barred from you through a conditional which serves as a barrier or filter.
Many reference classes may appear at first glance to be valid for you to be sampled from - but their membership demands a conditional be satisfied which would ultimately modify you as an observer! On the other side of such conditional gates would be forms of thought inaccessible to you.
It's entirely possible you've already crossed conditional gates that bar others - who were once in a valid reference class with you - from considering you a potential referent.
One can imagine a reference class where all members are grouped by having the same conditional gates they've walked through, and sharing the same conditional gates not-yet-opened - creating a landlocked region of logical access.
I'd like to imagine such a reference class as sharing a cycle of incarnation, on some kind of path toward an enlightenment ideal where there are no more conditional gates ahead, and all have been opened.
It felt like achieving some kind of cognitive singularity, becoming omniscient, or attaining Nirvana and ultimate wisdom.
It felt like a shared cycle of incarnation because the reference class included the same solved conditional gates and had the same conditional gates to eventually pass through.
Members in this special reference class need not be located in the same period of time or be locally near one another. There is a direction of progress, perhaps - something like the direction towards Nirvana.
Everyone in the same cycle of incarnation would share what I would call **Nirvana Rank** (as opposed to Birth Rank, from the Doomsday Argument.) I don't think the concept is so clean, but that's fine.
With Birth Rank, you're privileging a particular ordering, a particular form of distance: order of births, and distance from the first one (or perhaps from the last one). One's Birth Rank marks their progress along an uninspiring journey that is less related to observation or experience itself, and more related to something as arbitrary as temporal positioning.
Why do I refer to the Birth Rank as a marker along a *journey*?
Well, presumably we don't like the conclusions of the Doomsday Argument because we would like to *get to* the far future surviving and thriving in some way. Or we would like to *get to* a blossoming population of a space faring civilization. We care about our typicality because we care about this journey, and where we would like to go.
However, I do not believe it is Birth Rank which tracks the journey we as observers would like to go on - though it could be a proxy (a larger Birth Rank may be far in the future which may be nice!).
With **Nirvana Rank**, the values, if it were easy to assign, would mark progress on unlocking new regions of thought space or observation space based on passing through conditional gates. If we are dead or lose our cognitive and emotional potential, then that would be associated with a low **Nirvana Rank**. As implied by the name, it's the observers' journey through potential experiences until one reaches an exclusive reference class that's nevertheless very enriching in the quality of observations accessible.
One may care about their typicality along *this* journey, hoping to be further along, or hoping not to be doomed to wander into an irreversible low Nirvana Rank reference class.
In the Doomsday Argument, SSA has you sampled from observers spanning from the very first Birth Rank to the very last.
It would be a bit different with the Nirvana Rank - you would not be sampled from the space of observers from Nirvana Rank 0 to ~Nirvana itself. Recall that this reference class was constructed with conditional gates which created boundaries between other observers and those in your reference classes. There are essentially levels you exist within and are sampled from - and those levels share a range of Nirvana Ranks.
Like in the Doomsday Argument, you should then expect to find yourself as a typical member of your incarnation cycle reference class - having a typical Nirvana Rank, *within that level*.
If the Doomsday Argument implied that catastrophe must be soon, then its analog here may imply that one is soon to exit their incarnation cycle to one with higher Nirvana Ranks!
Now, to estimate how soon this transition occurs requires a reference for one of the lower Nirvana Ranks in your incarnation cycle (this would be analogous to the reference point observer with Birth Rank 1).
For example, if you're typical within your incarnation-cycle reference class, and the lowest Nirvana Rank observer in it is not so much lower than yours, then perhaps you should expect to face the conditional gates ahead of you soon and anticipate upgrading your Nirvana Rank.
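Very loosely, the estimate would run like Gott's delta-t version of the argument. A sketch with made-up ranks (and taking the concept far more literally than it deserves):

```python
# Gott-style sketch of the Nirvana Rank estimate. All ranks are made up and
# the quantity isn't really well-defined; this is only to show the shape of
# the reasoning.

lowest_rank_in_level = 100   # hypothetical lowest Nirvana Rank in your cycle
your_rank = 130              # hypothetical: you, assumed a typical member

elapsed = your_rank - lowest_rank_in_level
# If your position within the level's range is uniform, then with 50%
# confidence the remaining range is between elapsed/3 and 3*elapsed.
print(f"Elapsed ranks in this level: {elapsed}")
print(f"50% interval for ranks remaining before the next gate: "
      f"[{elapsed / 3:.0f}, {elapsed * 3:.0f}]")
```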
To recap:
- Presumably, one cares about the typicality of their Birth Rank in the Doomsday Argument because there's a preference for observations to last into the far future, because that would likely be associated with longevity and utopia
- Choice of ordering or classifying observers should reflect what one would prefer, since you'd want to know if your typicality implies something preferable through anthropic arguments
- Birth Rank doesn't satisfy this perfectly, it's an easy choice, but not the best one
- It's possible to satisfy conditionals that put you in a once inaccessible reference class of observers, or make a reference class which is inaccessible to you now
- This ability of conditionals to create boundaries or gates among reference classes can be used to construct a special kind of reference class, where members share the conditional gates they've moved past as well as the conditional gates they have yet to move past and resolve
- These remind me of incarnation cycles - shared observation powers, shared limitations in their extent are reminiscent of being somewhere similar on something like a 'spiritual journey'
- If resolving conditionals can take one to exclusive but expressive and rich reference classes, there can be something like an ordering or path of these classes towards some ideal - maybe one where the reference class is composed of nigh-omniscient minds, the enlightened, cognitive singularities - the term Nirvana Rank was coined to represent distance from such states, though not as clean a concept as Birth Rank
- This sounds like a journey that would be worthwhile for observers to flow through, making it compelling to care about one's typicality amongst the reference class
- The conditional gates form landlocked logical pockets, the fact that some resolved conditionals will grant more access to more possible observations suggests an ordering or path or journey - you've got pockets and you've got progress
- One cannot be sampled from observers outside one's pocket by definition of how such pockets are constructed - so one is a typical member of a pocket, not across all Nirvana Ranks
Hopefully you want yourself and others to reach something like enlightenment. Choice of reference class should match that goal.
Background: I'm thinking through anthropics on my own to form my own views, here are some working notes
There's been disputes about how to define a reference class in anthropic reasoning. I think you can skip those disputes by letting the choice of reference class be a degree of freedom.
How would you do that?
Part of that involves an intuition that there are some observers that cannot be other observers. For example, observers who only see red cannot be observers who only see blue. Another example is that observers who don't choose to get drunk cannot be observers who did choose to get drunk. By deciding what you're saying you know about the observers, you can group them up into what I'll call patches.
The patches are defined by filters which I call conditional gates. Those are conditions which, if satisfied, constitute a transformation from an observer of one patch to an observer of another patch. You can't be a random sample across different patches because different patches don't permit you to be any of them - you must be one or the other.
The choice of conditional gates used to carve out a patch of observers is entirely up to you, so long as you satisfy them! If you satisfy the constraints to be in the patch, then you should expect yourself to be a typical member of those in that patch.
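A tiny sketch of what I mean, with made-up observers and attributes: patches are just filters, and typicality claims only make sense relative to the patch whose gates you actually satisfy.

```python
# Tiny sketch: patches as filters (conditional gates) over observers.
# All observer data here is made up purely for illustration.

observers = [
    {"id": i, "sees_red": i % 2 == 0, "got_drunk": i % 3 == 0, "rank": i}
    for i in range(1, 101)
]

def my_gates(obs):
    # The conditional gates you happen to satisfy define your patch.
    return obs["sees_red"] and not obs["got_drunk"]

my_patch = [obs for obs in observers if my_gates(obs)]

# Typicality only makes sense within the patch: a "typical" rank here is the
# median rank of the patch, not of all observers.
ranks = sorted(obs["rank"] for obs in my_patch)
print(f"Patch size: {len(my_patch)}, median rank within patch: {ranks[len(ranks) // 2]}")
```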
The fact that you can choose your filter is interesting to me, because it means you can expect to be a typical member under some filter, and also a typical member of some other filter, and so on - and that reasoning about different filters can lead to different hypotheses about reality which can all be true.
Some ways of choosing a filter can be very uninformative, like if your filter only includes yourself - then you're a typical member of a patch of just yourself, which is trivial.
These patches also allow for dynamic reference classes - you can move to another reference class by passing through a conditional gate, becoming a different kind of observer, and arriving on a new patch. There's choice involved.
In SSA you're only sampling from actual observers. If that's the case then SSA can only make local claims using Anthropics - claims about patches. Whenever counterfactuals occur, observers become some other kind of observer, and so a conditional gate needs to be drawn.
In the Doomsday Argument, you assume your birth rank should be a typical sample over all birth ranks for minds that'll be born. There are many counterfactual ways the future could go and you do not know them in advance - if you're allowed to sample yourself from observers as if there's one timeline - it must be because you've chosen conditional gates that every possible branching future can satisfy, under my frame. The more stringent the definition of observer, the more likely it cannot be satisfied by all possible counterfactual futures. This means the inclusion criteria for the observer patch must be very broad and likely uninformative about what we care about. As such, under my frame, the Doomsday Argument wouldn't actually be saying doom is soon in a way we might care about. Rather, it'd be saying "you won't last long trying to find something common across every single possible future the deeper into the future tree you go".
I'll note that observers within a patch do not need to be alive at the same time - the conditional gating can logically organize observers all sorts of ways. A peculiar property is that you can allow inanimate or abstract things to be 'observers' in your observer patch, so long as you also satisfy the constraints to belong in it. I actually find that fine - it doesn't privilege some particular notion of observer. It can treat anything as having a "vantage point" and thus being an observer.
There are plenty of open questions, like:
- What does it mean to be a typical observer given some patch defined by the filters chosen?
- Are there any plausible examples where this frame seems to generate productive, nontrivial beliefs about reality that are harder with SIA or SSA?
- How do you aggregate information about reality informed by anthropic reasoning under different choices of filters?
- How do you pick filters in an 'informative' way?
- Are there any trade-offs you make when you add more or fewer filters when defining your patch?
- If an observer's choices can count as a conditional gate, could you have a conditional gate based on the choice of filter made by an observer?
- How does it compare to SSA and SIA?
...
I might address these in future posts. There's also a lot of tangents I wound up going on when thinking about this, so I'd like to address those as well.
I've recently updated towards substantially shorter AI timelines and much faster progress in some areas.[1] The largest updates I've made are (1) an almost 2x higher probability of full AI R&D automation by EOY 2028 (I'm now a bit below 30%[2] while I was previously expecting around 15%; my guesses are pretty reflectively unstable) and (2) I expect much stronger short-term performance on massive and pretty difficult but easy-and-cheap-to-verify software engineering (SWE) tasks that don't require that much novel ideation[3]. For instance, I expect that by EOY 2026, AIs will have a 50%-reliability[4] time horizon of years to decades on reasonably difficult easy-and-cheap-to-verify SWE tasks that don't require much ideation (while the high reliability—for instance, 90%—time horizon will be much lower, more like hours or days than months, though this will be very sensitive to the task distribution). In this post, I'll explain why I've made these updates, what I now expect, and implications of this update.
I'll refer to "Easy-and-cheap-to-verify SWE tasks" as ES tasks and to "ES tasks that don't require much ideation (as in, don't require 'new' ideas)" as ESNI tasks for brevity.
Here are the main drivers of my update:
Opus 4.5 and Codex 5.2 were both significantly above my expectations (on both benchmarks and other sources of information). This isn't that much of an update by itself; we should expect some variation, and some models will be decently large jumps. But then Opus 4.6 (and probably Codex 5.3 and 5.4) were again above my expectations, even after updating on Opus 4.5 and Codex 5.2. In 2025 we saw roughly 3.5-month doubling times on the METR 50%-reliability time horizon, and a big jump (though with an unreliable measurement) right at the start of 2026.
I've seen demonstrations of AIs accomplishing very large and impressive ES tasks given only moderately sophisticated scaffolding. As in, tasks that would take humans months to years (some of these tasks were ones that weren't contaminated by results on the internet, eliminating that explanation). These demonstrations are: various things I've done with a scaffold I wrote, the C compiler that was (almost entirely) autonomously written by Claude, some cyber results I've seen, and some other soon-to-be released results from METR. Due to this, I tentatively believe that (as of March 1st) the well-elicited 50% reliability time-horizon on ESNI tasks (using only publicly available models) is somewhere between a month and several years (supposing that the AI's overall budget for both tokens and experiments corresponds to roughly what a human would cost to do the same work). I think the high reliability (e.g. 90%) time horizon is much lower.
I now expect a substantial training compute scale up in 2026 (probably mostly pretraining) and I expect this to yield large returns.
I've updated towards somewhat larger scaffolding overhang on very large tasks than I previously thought was present (based on observations of AI performance given different types of scaffolding). Thus, I expect significant improvements in usefulness relative to what's currently in widespread public usage from relatively straightforward scaffolding improvements.
I was previously thinking that frontier AI progress in 2026 would be a bit slower than in 2025[5] (as measured in effective compute or something like ECI), but due to these factors, I now expect progress in 2026 to be a decent amount faster than progress in 2025.
It's worth noting that AIs being more useful (for AI R&D) accelerates AI progress (in addition to being an update towards being closer to various other milestones). So, when I update towards being further along in the timeline and towards AI being more useful at a lower level of capability, I also update towards a faster rate of progress this year.[6]
A key place where I was wrong in the past is that the 50%-reliability time horizon now seems to be around 20x longer on ESNI tasks than METR's task suite (and similar task distributions)—and well greater than 100x is plausible—but I expected a gap of only about 4x. (This error is pretty clear in my predictions in this post.) (There is also a gap where AIs' time horizon on "randomly selected internal tasks at AI companies" is shorter than on METR's task suite (and similar), but this looks like a factor of 2 or 3 and doesn't currently seem to be rapidly growing.)
What's going on with these easy-and-cheap-to-verify tasks?
What explains this very high performance on ES tasks? The core thing is that you can get the AI to develop a test suite / benchmark set and then it can spend huge amounts of time making forward progress by optimizing its solution against this evaluation set. This is most helpful when incrementally improving/fixing things based on test/benchmark results is generally doable (and it's easy for the AI to see what needs to be fixed), it's not that hard to develop a sufficiently good test suite / benchmark set, and running the test suite / benchmark set isn't that hard. These properties hold for many types of very well-specified fully CLI[7] software tasks (and software tasks that are most focused on improving some relatively straightforward metrics).
This type of loop means that even if sometimes the AI gets confused or makes bad calls, there is some correcting factor and mistakes usually aren't critical. You can do things like having multiple different AIs write test sets or getting the AI to incrementally improve the test suite / benchmark set over time to avoid mistakes on the testing yielding overall failures. On many other types of tasks, AIs are limited by having somewhat poor judgment or making kind of dumb mistakes and having a hard time recognizing these mistakes. But, with the ability to just keep iterating, they can do well.
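For concreteness, a minimal sketch of the kind of verify-and-iterate loop described above. The `write_or_extend_tests`, `propose_patch`, and `apply` calls are hypothetical stand-ins for whatever model and scaffold is in use, not a real API; only the `pytest` invocation is a real command.

```python
# Minimal sketch of a verify-and-iterate loop on an easy-and-cheap-to-verify task.
# `model.write_or_extend_tests`, `model.propose_patch`, and `repo.apply` are
# hypothetical placeholders for a scaffold, not any real library's API.
import subprocess

def run_tests(repo_dir: str) -> tuple[bool, str]:
    # Cheap verification is what makes the whole loop work.
    result = subprocess.run(["pytest", "-q"], cwd=repo_dir,
                            capture_output=True, text=True)
    return result.returncode == 0, result.stdout + result.stderr

def improve(model, repo, max_iters: int = 1000):
    # 1. Have the model build (and later extend) its own test suite / benchmark set.
    repo.apply(model.write_or_extend_tests(repo.snapshot()))
    for _ in range(max_iters):
        ok, report = run_tests(repo.path)
        if ok:
            # Everything passes: raise the bar by extending the tests and continue.
            repo.apply(model.write_or_extend_tests(repo.snapshot()))
        else:
            # 2. Mistakes aren't fatal: feed the failure report back and try a fix.
            repo.apply(model.propose_patch(repo.snapshot(), report))
```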
I think we're well into the superexponential progress on 50% reliability time-horizon regime for these ESNI tasks: because sufficient generality and error recovery allows for infinite time horizon (the AI can just keep noticing and recovering from its mistakes), beyond some point each successive doubling of time-horizon will be easier than the prior one. See here and here for more discussion of superexponentiality. The level of generality needed to enter the superexponential regime for ESNI tasks is lower as it's easier to spot and recover from mistakes.
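One toy way to see the error-recovery point (my own simplification, not the model from the linked discussions): if potentially-fatal mistakes arrive at a constant rate but the agent recovers from a fraction r of them, the 50% horizon scales like 1/(1-r), which diverges as recovery approaches 100%.

```python
# Toy model: 50%-reliability time horizon when the agent recovers from a
# fraction r of its otherwise-fatal mistakes. My own simplification for
# intuition, not the model from the linked posts.
import math

mistake_rate_per_hour = 0.5  # illustrative: one potentially-fatal mistake every 2 hours

def horizon_hours(recovery_fraction: float) -> float:
    # Unrecovered failures arrive at rate p * (1 - r); the task survives time T
    # with probability exp(-p * (1 - r) * T), so the 50% horizon is ln(2) / (p * (1 - r)).
    return math.log(2) / (mistake_rate_per_hour * (1 - recovery_fraction))

for r in [0.0, 0.5, 0.9, 0.99, 0.999]:
    print(f"recovery = {r:6.1%}  ->  50% horizon ≈ {horizon_hours(r):9.1f} hours")
# Halving the unrecovered-mistake rate doubles the horizon; as recovery -> 100%,
# the horizon diverges.
```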
A core thing I wasn't properly pricing in is that a task being easy-and-cheap-to-verify helps at two levels: it's both easier for AI companies to optimize (both directly in RL and as an "outer loop" metric) and it's easier for AIs themselves to just keep applying labor at runtime.
Thus, we can imagine a hierarchy of tasks:
1. ES tasks
2. Tasks that can be readily checked for training/evaluation but the AI can't easily check itself
3. Harder-to-check tasks
It seems as though the gap between (1) and (2) is much larger than the gap between (2) and (3).
A separate dimension is how much the task requires ideation. The more that having somewhat clever ideas is important, the less the AI can operate very iteratively. More generally, tasks vary in how much they are best done with incremental iteration. Some types of software like distributed/concurrent systems and algorithms-heavy software are substantially harder to build iteratively. And lots of software is more schlep-heavy and is just a large number of different things that need to get done, making incremental progress more viable. (A core question is how much it's important to carefully understand the broader complex whole and think of a good way to do/structure things vs. you can just iterate on smaller components.)
Some evidence against shorter timelines I've gotten in the same period
One thing we might wonder is if METR's task suite and similar evaluations were just underelicited and better scaffolding (that e.g. gets the AIs to write tests and then optimize against these tests) would make a big difference. I currently think certain types of better scaffolding might make a moderately big difference on METR's task suite, but that this isn't the main driver of the time-horizon gap between ESNI tasks and METR's task suite. Most of that gap is about the task distribution (checkability, iterability, the remaining unsolved tasks not being central SWE tasks) with AIs actually being bottlenecked by real capability limitations on their current task suite (though because of the task distribution, these capability limitations don't strongly preclude large acceleration of AI R&D). That said, I think scaffolding is increasingly becoming a big deal and will matter more for next-generation models. (In short: I think scaffolding is quite important for current and near future AIs when the task is sufficiently large in scope that completing the task would naturally take up a large fraction of the model's context window, like at least 1/3.)
I think AIs have quite bad "taste" and "judgment" in many domains (generally more so stuff that's harder to RL on) and that this is improving substantially slower than general agentic capabilities. By "taste" and "judgment", I mean something like "making reasonable/good calls in cases that aren't totally straightforward and having good instincts". This includes something like SWE taste which is often the main bottleneck in my experience on somewhat less well-specified SWE tasks and seems to be a major bottleneck on code quality even on very well-specified SWE tasks.
One story here is that taste is mostly driven by pretraining progress or RL on the domain in question (taste doesn't currently generalize that well between domains I think) so outside of heavily RL'd on domains the progress comes mostly from pretraining. And pretraining progress is maybe 2-3x slower than overall AI progress.
However, I do think we might see especially fast pretraining progress in 2026. Thus, I think it's possible these blockers will rapidly improve.
I've seen AIs do a lot of stupid stuff in the course of trying to automate various empirical research projects (though I think some of this stuff that looks like stupidity might be better explained by misalignment / poor RL incentives).
Why does high performance on ESNI tasks shorten my timelines?
The main reasons are:
General capabilities update: I previously didn't think the AIs would be able to do this by now and these tasks are intuitively difficult. This updates me upwards on the overall capability of AIs and on the efficacy of RL. More generally, I just update based on "things have gone faster than I expected".
Superexponentiality: ESNI time-horizon progress seems significantly superexponential, so we've now seen an example of superexponentiality in the wild in one moderately representative domain and it seems like this yielded very fast doubling times. This superexponentiality also kicked in somewhat earlier than my median (in terms of 50% reliability time horizon and qualitative capabilities) for when this would become a big deal.[8]
AI R&D acceleration: I think it's pretty plausible that very strong performance on ESNI tasks (especially extremely, extremely strong performance) will allow AIs to substantially speed up AI R&D. As I'll discuss in the next section, I think it's unclear how large of a speed up this will be, but it could be pretty big especially if AIs get better at very ideation-bottlenecked tasks. Additionally, very high performance on ESNI tasks makes it more plausible that relatively small capability improvements greatly improve performance on tasks which have pretty good progress metrics (metrics that can be gamed or don't perfectly capture quality, but where doing better on the metric generally means doing better) but which aren't totally ES tasks (e.g., tasks where verification is expensive or requires a decent amount of judgment).
Scaffolding and prompting underelicitation: While the required scaffold to mostly unlock these capabilities isn't that complex, it is the case that relatively basic scaffolds don't suffice and my understanding is that performance can probably be greatly improved on ES tasks with better (general purpose) prompting and scaffolding. I also think this generally applies for large scope tasks at the limit of what AIs can currently do. This makes me think there is more underelicitation than I was previously thinking. I also think that AIs could be better adapted to these big scaffolds and get better instincts about how to operate in these scaffolds (e.g. how to write instructions for other AIs) which would further boost performance.
How much does extremely high performance on ESNI tasks help with AI R&D?
By default, not that much of currently done AI R&D is straightforwardly an ESNI task. ML research at AI companies typically either requires expensive (potentially very expensive) verification/evaluation or it requires a decent amount of taste and judgment to come up with the idea, set up the experiments, or interpret the results. Building infrastructure or doing efficiency optimization is much more ESNI-like but typically isn't fully ESNI.
What parts of AI R&D are ESNI?
Implementing optimized versions of experiments or architectures given a precise spec for the architecture/experiment. (Allowing for e.g. comparing behavior at small scale to unoptimized known correct implementations.) This could be pretty helpful and makes using more complex and infrastructurally difficult architectures more viable. It also makes heterogeneous compute more viable. (Optimizing many parts of full scale training runs isn't an ES task because verifying correctness and efficiency requires expensive experiments, and some optimizations aren't purely behavior-preserving — e.g., how much does increased asynchrony affect performance?)
Building or optimizing straightforward/well-specified internal tools/infrastructure used for research.
Some types of ML experiments where the results are cheap to verify, most notably some prompting and scaffolding experiments where we have a good (and cheap) benchmark. There might also be valuable very small scale ML experiments (though getting lots of value from these experiments may be bottlenecked on ideation).
Optimizing some applications of AI (either inside the company or to increase revenue).
Here are some things that might or might not be ESNI tasks:
Building RL environments. It's not super easy to verify if an RL environment is reasonable and it's unclear how bad it is for a reasonably large subset of RL environments to be quite flawed.
Collecting and operating on data.
So naively, we'd expect very high performance on just ESNI tasks to be a moderate speed up that results in AI companies quickly getting bottlenecked on something else. Of course, current AIs are also somewhat helpful on other tasks and can generally accelerate lots of engineering.
I don't feel very confident in my picture of how much of AI R&D is an ESNI task, and AI companies might figure out better ways to leverage AIs doing something ESNI-like.
I do think that if AIs were wildly, wildly superhuman on ESNI tasks (or especially if they were wildly superhuman on the broader category of ES tasks), they could potentially massively accelerate AI R&D via (e.g.) massive improvements through just small scale experiments. As a wildly extreme hypothetical, if AIs could generally complete ES tasks a trillion times cheaper and faster than humans (but were somehow just as capable as current AIs on other tasks), I think AI R&D progress would massively accelerate via some mechanism (probably a very different mechanism than what drives current progress).
A big limiting factor is the "cheap to verify" aspect. If AIs could use expensive resources more sample efficiently than humans (while still only being very good at straightforward-but-potentially-expensive-to-verify domains), then the AI R&D speed up would be massive and depending on details this might yield full automation of AI R&D. But, using expensive resources more sample efficiently than top human experts effectively means having research taste matching (or exceeding) top human experts, which seems at least several years away at the current rate of progress on this capability. However, AIs might not need to utilize resources that effectively to yield large speed ups. My understanding is that lots of progress (currently) is from relatively uninteresting (and not that ideation bottlenecked) research. As in, engineers whose taste isn't that great but are very fast would yield large speed ups. With moderate improvements in taste, AIs might resemble such engineers. This usage of AIs would still require humans to be providing ideas and some taste, but AIs could autonomously run with large parts of the project (doing potentially months of work autonomously).
Some aspects of poor resource utilization feel pretty easy to solve (e.g., Opus 4.5 was a bit too miserly while I tend to find that Opus 4.6 is a bit too profligate) but ultimately my best guess is that this requires reasonably good taste and judgment which will be somewhat difficult to achieve. Notably, doing RL at the same resource usage scale as deployment usage of AI won't generally be viable. That said, transfer from smaller resource usage tasks might not be that hard, and some of the RL can involve the AI using resources that are many times more expensive than the RL rollouts themselves (matching a decent fraction of deployment resource usage scale).[9]
My experience trying to automate safety research with current models
I've recently been working on trying to automate empirical AI safety projects with AIs both because this would be materially useful (at least while being careful about capabilities externalities) and because this seems useful for better understanding blockers to future automation of safety and safely deferring to AIs. As part of this I wrote an agent orchestrator and various other things to try to make AIs better at this.
Early on, one of my main blockers was that Opus 4.5 would consistently fail to complete the full task with anywhere near the desired level of thoroughness (often skipping large parts of the task). I was able to patch around these instruction following issues in various ways and resolve some other issues through better prompting and scaffolding/orchestration. But, I do still see serious productivity hits from (mundane) misalignment; I have a forthcoming post on misalignment in current models and why I think it's problematic.
Currently, the biggest blocker I have on the (small) projects I'm trying to automate is poor taste/judgment where AIs make somewhat bad choices or consider something good when it actually isn't. I've been able to successfully get this overall system to do reasonably big chunks (for an unaccelerated human maybe the equivalent of a day to maybe a few weeks) of relatively straightforward projects, often compensating for poor taste by getting the AI to complete the project much more thoroughly, effectively doing more low value work.
Overall, I think automating many weeks of mostly pretty-well-specified safety research will soon often work.
My experience seeing if my setup can automate massive ES tasks
I also ran the above setup on trying to ~fully autonomously complete:
- 2 massive (e.g. would take 3-30 person years) easy-and-cheap-to-verify SWE tasks
- A hard easy-and-cheap-to-verify AI R&D task
- A few (somewhat esoteric) number-go-up optimization tasks that ARC theory was interested in
- Some of METR's harder and less well specified tasks (these weren't fully ES)
- Vulnerability finding and end-to-end (cyber) exploitation tasks on relatively hardened targets
I did the first two of these mostly to get a better understanding of AI capabilities. I don't want to say what the exact tasks were in this document for a few reasons.
I've generally found AIs able to make quite a bit of autonomous progress, in line with results others have seen, while the amount of progress depends a lot on the details of the task.
(I've done multiple different runs with somewhat different prompts and different scaffolding settings, generally the results are surprisingly invariant to this.)
SWE tasks
I found that the AI successfully completed what looks like many months (3-12 months) of useful work in the SWE projects. In one of these projects it looks like the AIs have beaten or are close to beating a large and moderately complex piece of closed source software in some respects while failing to match it in other respects (and while having various bugs and unimplemented features). For the other, it looks like they may produce something that's pretty impressive but mostly worse than the current best open source project of that type. The code quality is low, but I've since developed approaches that would probably have made the code quality mostly OK (but still not great and likely with some places of very low code quality).
I should say that I did start these projects with a reasonable amount of guidance on what to pursue, what metrics to use, and what infrastructure etc. to use, which took me around 1-2 hours to write. But much of this is amortized over multiple tasks (I've reused this guidance for other tasks), and an hour or two is not that long.
The AI reasonably often makes somewhat bad prioritization choices and has an inclination to consider itself done before trying hard/serious types of improvement. I've had to remind it of metric prioritization, nudge it to keep going, and remind it to periodically clean up its code (even though this was included in instructions). But, by just continuing to iterate, these poor choices aren't catastrophic. I also notice a reasonable amount of misalignment where the AI fails to fully complete various tasks and doesn't keep working, but my scaffolding mostly compensates for this.
I expect that there are pretty large returns to having a human spend 15 minutes giving the AI tips every so often (e.g. every day of calendar time). Often the AIs make mistakes that are pretty obvious to a human who doesn't even have that much state on the project (e.g. due to losing sight of the bigger picture). But the AIs aren't amazing at incorporating advice from what I've seen.
AIs seem especially good at software replication tasks—as in, make a drop-in replacement for this piece of closed source (or open source) software that has some advantage (e.g. speed, security, some feature, etc.). METR has some forthcoming results on this and I think the performance is even stronger with better scaffolding and prompting.
AI R&D task
The AI R&D task I tested involves improving on something that's already well optimized, so it's pretty hard for the AI to make progress. I tentatively believe the AI made somewhere between a few days and a bit over a week of progress on this task relative to a strong human professional. In practice, it was mostly limited by the AI not being very good at finding good ideas, deciding which ideas to investigate, and allocating time/effort for each idea. It seemed to spend most of its time trying to eke out gains with tweaking rather than making material improvements. The AI was also pretty resource inefficient and not very good at getting more work done in a limited amount of time, due to spending lots of time waiting on many runs. Some of this resource usage could be easily improved with better prompting.
Cyber
AIs are quite good at autonomous cyber, especially with a moderate amount of scaffolding. In part, this is due to having a lot of domain-specific knowledge. I don't want to comment on the exact results I've found (on Opus 4.6) at this time, but this talk by Nicholas Carlini is relevant.
Appendix: Somewhat more detailed updated timelines
I thought it might be helpful to also include some updated timelines in this post.
Parity in a domain: the point when you would be better off firing all humans working in that domain than reverting to 2020-era AI. Humans may still add value in places, and firing them all would still slow things down somewhat, but AIs collectively are more valuable than humans collectively.
AI R&D parity. Parity applied to AI R&D at the leading AI company (where "humans" means everyone who knows how to program and/or has done ML research).
AI stack + conflict parity. Parity applied to all activities relevant to (a) maintaining and improving the AI stack and (b) winning wars (broadly construed). This includes: manufacturing, construction, mining, and other physical industrial tasks; R&D in energy, materials, hardware, biotech, robotics, cyberoffense/defense, etc.; and squishy skills like strategy, tactics, and logistics. Note that this requires the ability to do fully autonomous manufacturing. (I'm allowing for a brief period of adaptation without further AI progress to allow for repurposing robots and manufacturing capacity.)
AC
TEDAI
Note that (1) differs from the operationalization of "Full AI R&D automation" I've used historically; it's a bit weaker. (So my probabilities are correspondingly a bit higher.)
My forecasts (mostly? probably?) don't take into account aggressive policy responses to slow down AI development, but do include "business as usual" regulatory blockers. Possibly this is a mistake.
| Date | 1. AI R&D parity | 2. AI stack + conflict parity | 3. AC | 4. TEDAI |
|---|---|---|---|---|
| EOY 2026 | 7% | 3% | 11% | 4% |
| EOY 2027 | 19% | 9% | 27% | 12% |
| EOY 2028 | 30% | 17% | 39% | 19% |
| EOY 2029 | 40% | 25% | 48% | 26% |
| EOY 2030 | 48% | 32% | 56% | 32% |
| EOY 2031 | 54% | 37% | 62% | 37% |
| EOY 2032 | 58% | 42% | 66% | 42% |
| EOY 2033 | 61% | 47% | 69% | 46% |
| EOY 2034 | 63% | 51% | 71% | 50% |
| EOY 2038 | 70% | 61% | 77% | 58% |
For comparison, Cotra's median for AI Research Parity (comparable to my AI R&D parity) is early 2030 (slightly before my median of early 2031), and her median for AI Production Parity (comparable to my AI stack + conflict parity, though mine also includes conflict) is mid 2032 (before my median of late 2034).
While I give precise numbers, my views aren't that reflectively stable (e.g. I updated a moderate amount over the last week towards longer timelines after thinking about it a bit!).[11]
Note: my median time from some milestone A to some later milestone B is significantly smaller than the difference between my medians for A and B. This is because for right-skewed distributions, the median of a sum is greater than the sum of the medians. Intuitively: each milestone has some chance of taking a very long time (heavy right tail), and these right tails compound when you add delays together, pulling the median of the total time further right than you'd get by just adding the individual medians. So median(B) - median(A) > median(B - A), i.e., the difference between medians overstates the median of the actual time between milestones.[12] For instance, I estimate my median time from AI R&D parity (conditioning on this happening before 2035) to TEDAI is maybe around 1.75 years, while the difference between my medians is around 3.5 years.[13]
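The claim about right-skewed distributions is easy to check numerically. Below is a small Monte Carlo sketch using lognormal waiting times (arbitrary parameters, chosen only to give heavy right tails, not the author's actual distributions); it shows the difference between medians exceeding the median of the actual gap.

```python
# Numerical check that for right-skewed waiting times,
# median(A + X) > median(A) + median(X), i.e. differences between medians
# overstate the median gap between milestones. Parameters are arbitrary.
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
A = rng.lognormal(mean=1.0, sigma=1.0, size=n)   # time to milestone A
X = rng.lognormal(mean=0.5, sigma=1.0, size=n)   # additional time from A to B
B = A + X                                        # time to milestone B

print(np.median(B) - np.median(A))  # difference of medians (larger)
print(np.median(X))                 # median of the actual gap (smaller)
```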
I mostly updated in February 2026 and refined my thinking a bit more in March. ↩︎
I'm at 30% for AI R&D parity (you'd be better off firing all humans working in AI R&D than reverting to using 2020-era AI), but a bit lower for full automation (firing all humans would only slow things down by ~5%), perhaps 26%. ↩︎
As in, don't require coming up with ideas that aren't already on the internet. If a key part of the task is discovering somewhat hard-to-find ideas that someone knows but that aren't public, this also makes the task quite a bit harder for models. ↩︎
By "50%-reliability time horizon" I mean: if you randomly sample tasks from the relevant task distribution, this is the time horizon at which the AI has a 50% chance of success. Note that in practice this is mostly driven by variation between tasks (some tasks are harder for the AI than others) than by the model randomly failing on a given task. Thus, it's a bit unnatural to call this reliability. I use the term "reliability" because that's what METR uses and it reads nicely (e.g. "50%-reliability"), though "success rate" might be more accurate. ↩︎
My prior view was based on thinking that 2025 was especially fast due to some low-hanging fruit in RL and some of this progress coming from increasing cost as a fraction of human cost. I think these factors still hold to a moderate extent; I just expect them to be less important than other factors. ↩︎
We shouldn't double count this: my view is just that AI progress speeds up as you get closer to full automation of AI R&D all else equal and this was already priced into my timelines. (In practice, I expect that all else isn't equal and I expect compute scaling to start slowing within 3 years or so due to production capacity limits or investment slowing as it reaches more extreme levels. I think we're already seeing some signs of hitting compute production issues with DRAM/HBM, though it's worth taking into account adaptation and there being some lag.) ↩︎
By fully CLI, I mean that the task doesn't require vision, computer use, or non-trivial hard-to-programmatically-automate interaction. ↩︎
Note that just updating towards my median for superexponentiality kicking in would have also shortened my timelines; the situation isn't symmetric. The basic reason for this is that my timelines are substantially longer due to a slower tail on many different factors. ↩︎
You can also do online RL, but this has some downsides. ↩︎
I worry their operationalization is a bit weaker than they intended. In addition to their remote work operationalization, I also intend the definition to include beating human experts at any reasonably important R&D domain when doing that work purely remotely. Like, you would certainly prefer hiring the AI over hiring the top human expert in any reasonably important R&D, putting aside physical manipulation. ↩︎
Given this instability, why have so much precision? I tentatively think my precision is actually indicative of very slightly better guesses: I expect I would do a little worse at forecasting if forced to round to the nearest 5% or 10%, while also being pretty likely to adjust my guesses a bunch on further reflection; both of these can be true at the same time. Also, it's nice to have a smooth curve. ↩︎
Here, A and B are random variables that correspond to the year in which some event happens. ↩︎
Also, if A and B-A are correlated (as I think is true for the milestones I discuss here, shorter timelines are correlated with faster takeoff), then conditioning on A having been reached earlier also shrinks the expected remaining time to B. So, if we reach AI R&D parity in mid 2028, then I'd expect a smaller gap to TEDAI. ↩︎
We think that near-term AI could make it much easier for groups to coordinate, find positive-sum deals, navigate tricky disagreements, and hold each other to account.
Partly, this is because AI will be able to process huge amounts of data quickly, making complex multi-party negotiations and discussions much more tractable. And partly it’s because secure enough AI systems would allow people to share sensitive information with trusted intermediaries without fear of broader disclosure, making it possible to coordinate around information that’s currently too sensitive to bring to the table, and to greatly improve our capacity for monitoring and transparency.
We want to help people imagine what this could look like. In this piece, we sketch six potential near-term technologies, ordered roughly by how achievable we think they are with present tech:[1]
Fast facilitation — Groups quickly surface key points of consensus views and disagreement, and make decisions everyone can live with.
Automated negotiation — Complicated bargains are discovered quickly via automated negotiation on behalf of each party, mediated by trusted neutral systems which can find agreements.
Arbitrarily easy arbitration — Disputes are resolved quickly and cheaply by verifiably fair and neutral automated adjudicators, unlocking cooperative arrangements that are currently too costly to make.
Background networking — People who should know each other get connected (perhaps even before they know to go looking), enabling mutually beneficial trade, coalition building, and more.
Structured transparency for democratic oversight — Institutions are held accountable to the broader public by AI systems with sandboxed access to internal data, which surface inconsistencies between public claims and internal records without broad disclosure.
Confidential monitoring and verification — Deals can be monitored and verified, even when this requires sharing highly sensitive information, by using trusted AI intermediaries which can’t disclose the information to counterparties.
We also sketch two cross-cutting technologies that support coordination:
AI delegates and preference elicitation — AI delegates can faithfully represent and act for a human principal, perhaps supported by customisable off-the-shelf agentic platforms that integrate across many kinds of tech.
Charter tech — The technologies above, or other coordination technologies, are applied to making governance dynamics more transparent, making it easier to anticipate how governance decisions will influence future coordination, and design institutions with this in mind.
An important note is that coordination technologies are open to abuse. You can coordinate to bad ends as well as good, and particularly confidential coordination technologies could enable things like price-fixing, crime rings, and even coup plots. Because the upsides to coordination are very high (including helping the rest of society to coordinate against these harms), we expect that on balance accelerating some versions of these technologies is beneficial. But this will be sensitive to exactly how coordination technologies are instantiated, and any projects in this direction need to take especial care to mitigate these risks.
We’ll start by talking about why these tools matter, then look at the details of what these technologies might involve before discussing some cross-cutting issues at the end.
Why coordination tech matters
Today, many positive-sum trades get left on the table, and a lot of resources are wasted in negative-sum conflicts. Better coordination capabilities could lead to very large benefits, including:
Improving economic productivity across the board
Helping nations avoid wars and other destructive conflicts
Enabling larger groups to coordinate to avoid exploitation by a small few
Making democratic governance much more transparent, while protecting sensitive information
What’s more, getting these benefits might be close to necessary for navigating the transition to more powerful AI systems safely. Absent coordination, competitive pressures are likely to incentivise developers to race forward as fast as possible, potentially greatly increasing the risks we collectively run. If we become much better at coordination, we think it is much more likely that the relevant actors will be able to choose to be cautious (assuming that is the collectively-rational response).
However, coordination tech could also have significant harmful effects, through enabling:
AI companies to collude with each other against the interests of the rest of society[2]
More selfishness and criminality, as social mechanisms of coordination are replaced by automated ones which don’t incentivise prosociality to the same extent
Regardless of how these harms and benefits net out for ‘coordination tech’ overall, we currently think that:
The shape and impact of coordination tech is an important part of how things will unfold in the near term, and it’s good for people to be paying more attention to this.
We’re going to need some kinds of coordination tech to safely navigate the AI transition.
The devil is in the details. There are ways of advancing coordination tech which are positive in expectation, and ways of doing so which are harmful.
Why ‘defense-favoured’ coordination tech
That’s why we’ve called this piece ‘defense-favoured coordination tech’, not just ‘coordination tech’. We think generic acceleration of coordination tech is somewhat fraught — our excitement is about thoughtfully run projects which are sensitive to the possible harms, and target carefully chosen parts of the design space.
We’re not yet confident which the best bits of the space are, and we haven’t seen convincing analysis on this from others either. Part of the reason we’re publishing these design sketches is to encourage and facilitate further thinking on this question.
For now, we expect that there are good versions of all of the technologies we sketch below — but we’ve flagged potential harms where we’re tracking them, and encourage readers to engage sceptically and with an eye to how things could go badly as well as how they could go well.
Fast facilitation
Right now, coordinating within groups is often complex, expensive, and difficult. Groups often drop the ball on important perspectives or considerations, move too slowly to actually make decisions, or fail to coordinate at all.
AI could make facilitation much faster and cheaper, by processing many individual views in parallel, tracking and surfacing all the relevant factors, providing secure private channels for people to share concerns, and/or providing a neutral arbiter with no stake in the final outcome. It could also make it much more practical to scale facilitation and bring additional people on board without slowing things down too much.
Design sketch
An AI mediation system briefly interviews groups of 3–300 people asynchronously, presents summary positions back to the group, and suggests next steps (including key issues to resolve). People approve or complain about the proposal, and the system iterates to a depth appropriate to the importance of the decision.
Under the hood, it does something like:
Gathers written context on the setting and decision
Holds brief, private conversations with each participant to understand their perspective
Builds a map of the issue at hand, involving key considerations and points of (dis)agreement
Performs and integrates background research where relevant
Identifies which people are most likely to have input that changes the picture
Distils down a shareable summary of the map, and seeks feedback from key parties
Proposes consensus statements or next steps for approval, iterating quickly to find versions that have as broad a backing as possible
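To make this pipeline concrete, here is a minimal sketch of the loop above, assuming a generic `llm(prompt)` completion function and an `ask(participant, question)` channel for private conversations (both hypothetical); the prompts are illustrative, not a tested design.

```python
# Minimal sketch of the facilitation loop described above. `llm` and `ask`
# are hypothetical stand-ins; any chat API and messaging channel would do.
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class FacilitationState:
    context: str
    interviews: Dict[str, str] = field(default_factory=dict)  # participant -> private summary
    issue_map: str = ""
    proposal: str = ""
    feedback: Dict[str, str] = field(default_factory=dict)

def facilitate(llm: Callable[[str], str], context: str,
               participants: List[str], ask: Callable[[str, str], str],
               max_iterations: int = 3) -> str:
    state = FacilitationState(context=context)
    # Hold brief, private conversations with each participant.
    for p in participants:
        answer = ask(p, "In a few sentences, what matters most to you about this decision?")
        state.interviews[p] = llm(f"Summarise this participant's perspective:\n{answer}")
    # Build a map of key considerations and points of (dis)agreement.
    state.issue_map = llm(
        "Context:\n" + state.context +
        "\nPerspectives:\n" + "\n".join(state.interviews.values()) +
        "\nProduce a map of key considerations, agreements, and disagreements."
    )
    # Propose consensus statements or next steps, iterating on feedback.
    for _ in range(max_iterations):
        state.proposal = llm(
            f"Issue map:\n{state.issue_map}\nFeedback so far:\n{state.feedback}\n"
            "Propose consensus statements or next steps likely to get broad backing."
        )
        state.feedback = {p: ask(p, f"Can you live with this proposal?\n{state.proposal}")
                          for p in participants}
        if all("yes" in f.lower() for f in state.feedback.values()):
            break
    return state.proposal
```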
Feasibility
Fast facilitation seems fairly feasible technically. The Habermas Machine (2024) does a version of this that provided value to participants — and we have seen two years of progress in LLMs since then. And there are already facilitation services like Chord. In general, LLMs are great at gathering and distilling lots of information, so this should be something they excel at. It’s not clear that current LLMs can already build accurate maps of arbitrary in-motion discourse, but they probably could with the right training and/or scaffolding.
Challenges for the technology include:
Ensuring that it’s more efficient and a better user experience for moving towards consensus than other, less AI-based approaches.
Remaining robust against abusive user behaviour (e.g. you don’t want individuals to get their way via prompt injection or blatantly lying).
Neither of these seem like fundamental blockers. For example, to protect against abuse, it may be enough to maintain transparency so that people can search for this. (Or if users need to enter confidential information, there might be services which can confirm the confidential information without revealing it.)
Possible starting points // concrete projects
Build a baby version. This could help us notice obstacles or opportunities that would have been hard to predict in advance. You could focus on the UI or the tech side here, or try to help run pilots at specific organisations or in specific settings.
Design ways to evaluate fast facilitation tools. This makes it easier to assess and improve on performance. For example, you could create games/test environments with clear “win” and “failure” modes.
Build subcomponents. For example:
Bots that surface anonymous info.
Tools that try to surface areas of consensus or common knowledge as efficiently as possible, while remaining hard to game.
Make a meeting prep system. Focus first on getting good at meeting prep — creating an agenda and considerations that need live discussion — to reduce possible unease about outsourcing decision-making to AI systems.
Make a bot to facilitate discussions. This could be used in online community fora, or to survey experts.
Design ways to create live “maps” of discussions. Fast facilitation is fast because it parallelises communication. This makes it more important to have good tools for maintaining shared context.
Automated negotiation
High-stakes negotiation today involves adversarial communication between humans who have limited bandwidth.
Negotiation in the future could look more like:
You communicate your desires openly with a negotiation delegate who is on your side, asking questions only when needed to build a deeper model of your preferences.
The delegate goes away, and comes back with a proposal that looks pretty good, along with a strategic analysis explaining the tradeoffs / difficulties in getting more.
Design sketch
Humans can engage AI delegates to represent them. The delegates communicate with each other via a neutral third party mediation system, returning to their principals with a proposal, or important interim updates and decision points.
Under the hood, this might look like:
Delegate systems:
Read over context documents and query principals about key points of uncertainty to build initial models of preferences.
Model the negotiation dynamics and choose strategic approaches to maximise value for their principal.
Go back to the principal with further detailed queries when something comes up that crosses an importance threshold and where they are insufficiently confident about being able to model the principal’s views faithfully.
Are ultimately trained to get good results by the principal’s lights.
Neutral mediator system:
Is run by a trusted third-party (or in higher stakes situations, perhaps is cryptographically secure with transparent code).
Discusses with all parties (either AI delegates, or their principals)
Can hear private information without leaking that information to the other party
Impossibility theorems mean that it will sometimes be strategically optimal for parties to misrepresent their position to the mediator (unless we give up on the ability to make many actually-good deals); however, we can seek a setup such that it is rarely a good idea to strategically misrepresent information, or that it doesn’t help very much, or that it is hard to identify the circumstances in which it’s better to misrepresent
Searches for deals that will be thought well of by all parties, and proposes those to the delegates.
Is ultimately trained to help all parties reach fair and desired outcomes, while minimising incentives-to-misrepresent for the parties.
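As a concrete illustration of the delegate/mediator split described above, here is a minimal sketch; `llm` is a hypothetical model call, the prompts are placeholders, and a real system would need the security and training properties discussed below.

```python
# Illustrative sketch of the delegate/mediator message flow, not a real
# protocol. In a real system the mediator would run in a trusted or
# verifiable environment and never reveal parties' private information.
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

@dataclass
class Delegate:
    name: str
    preferences: str                 # private model of the principal's preferences
    llm: Callable[[str], str]

    def evaluate(self, proposal: str) -> str:
        # Returns a private assessment shared only with the mediator.
        return self.llm(
            f"My principal's preferences:\n{self.preferences}\n"
            f"Proposal:\n{proposal}\n"
            "Rate how acceptable this is and what would need to change. "
            "Reply starting with ACCEPT or REJECT."
        )

def mediate(llm: Callable[[str], str], delegates: List[Delegate],
            issue: str, max_rounds: int = 5) -> Optional[str]:
    # The mediator hears private reactions but only shares proposals.
    history: List[str] = []
    for _ in range(max_rounds):
        proposal = llm(
            f"Issue under negotiation:\n{issue}\n"
            "Previous proposals and private party reactions (do not reveal these):\n"
            + "\n".join(history) +
            "\nPropose a deal likely to be acceptable to all parties."
        )
        reactions: Dict[str, str] = {d.name: d.evaluate(proposal) for d in delegates}
        history.append(f"Proposal: {proposal}\nReactions: {reactions}")
        if all(r.strip().upper().startswith("ACCEPT") for r in reactions.values()):
            return proposal  # returned to principals for final sign-off
    return None  # no deal found within the round limit
```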
Feasibility
Some of the technical challenges to automated negotiation are quite hard:
The kind of security needed for high-stakes applications isn’t possible today.
Getting systems to be deeply aligned with a principal’s best interests, rather than e.g. pursuing the principal’s short-term gratification via sycophancy, is an unsolved problem.
That said, it’s already possible to experiment using current systems, and it may not be long before they start improving on the status quo for human negotiation. Low-stakes applications don’t require the same level of security, and will be a great training ground for how to set up higher stakes systems and platforms. And practical alignment seems good enough for many purposes today.
Possible starting points // concrete projects
Build an AI delegate for yourself or your friends. See if you can get it to usefully negotiate on your behalf with your friends or colleagues. Or failing that, if it can support you to think through your own negotiation position before you need to communicate with others about it.
Build a negotiation app with good UI. Building on existing LLMs, build an app which helps people think through their negotiation position in a structured way. Focus on great UI.
This could be non-interactive at first, and just involve communication between a human and the app, rather than between any AI systems.
But it builds the muscles of a) designing good UI for AI negotiation, and b) people actually using AI to help them with negotiation.
Run a pilot in an org or community you’re part of.
You could start with fairly low-stakes negotiations, like what temperature to set the office thermostat to or which topics to discuss in a given meeting slot.
Experimenting with different styles of negotiation (in terms of how high the stakes are, how complex the structure is, and what the domain is) could be very valuable.
Arbitrarily easy arbitration
Right now, the risk of expensive arbitration makes many deals unreachable. If disputes could be resolved cheaply and quickly using verifiably fair and neutral automated adjudicators, this could unlock massive coordination potential, enabling a multitude of cooperative arrangements that were previously prohibitively costly to make.
Design sketch
An “Arb-as-a-Service” layer plugs into contracts, platforms, and marketplaces. Parties opt in to standard clauses that route disputes to neutral AI adjudicators with a well-deserved reputation for fairness. In the event of a dispute, the adjudicator communicates with parties across private, verifiable evidence channels, investigating further as necessary when there are disagreements about facts. Where possible, they auto-execute remedies (escrow releases, penalties, or structured commitments). Human appeal exists but is rarely needed; sampling audits keep the system honest. Over time, this becomes ambient infrastructure for coordination and governance, not just commerce.
How this could work under the hood:
Agreement ingestion
Formal or natural language contracts are parsed and key terms extracted, with parties confirming the system’s interpretation before proceeding.
The system could also suggest pre-dispute modifications to make agreements clearer, flag potentially unenforceable terms, and maintain public precedent databases that help parties understand likely outcomes before committing.
Automated discovery
When disputes arise, an automated discovery process gathers relevant documentation, transaction logs, and communications from integrated platforms.
The system offers interviews and the chance to submit further evidence to each party.
Deep consideration
The system builds models of what different viewpoints (e.g. standard legal precedent; commonsense morality; each of the relevant parties) have to say on the situation and possible resolutions, to ensure that it is in touch with all major perspectives.
Where there are disagreements, the system simulates debate between reasonable perspectives.
It makes an overall judgement as to what is fairest.
Transparent reasoning
The system produces detailed explanations of its conclusions, with precedent citations and counterfactual analysis where appropriate.
(Optional) Smart escrow integration
Judgements automatically execute through cryptocurrency escrows or traditional payment rails, with graduated penalties for non-compliance.
In cases where the system detects evidence that is highly likely to be fraudulent, or other attempts to manipulate the system, it automatically adds a small sanction to the judgement, in order to disincentivise this behaviour.
Opportunities for appeal
Either party can pay a small fee to submit further evidence and have the situation re-considered in more depth by an automated system.
For larger fees they can have human auditors involved; in the limit they can bring things to the courts.
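To make the stages concrete, here is a rough sketch of how a dispute might flow through ingestion, discovery, deliberation, and judgement; `llm` is a hypothetical adjudicator call, and escrow execution and appeals are omitted.

```python
# Rough sketch of the arbitration pipeline stages above, with a generic `llm`
# call standing in for the adjudicator model. Not a production design.
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class Dispute:
    contract_text: str
    claims: Dict[str, str]            # party -> their account of what happened
    evidence: List[str] = field(default_factory=list)
    judgement: str = ""

def adjudicate(llm: Callable[[str], str], dispute: Dispute,
               perspectives=("standard legal precedent", "commonsense morality")) -> str:
    # Agreement ingestion: extract key terms for the parties to confirm.
    terms = llm(f"Extract the key terms and obligations from this contract:\n{dispute.contract_text}")
    # Automated discovery: here, just the evidence already collected.
    record = "\n".join(dispute.evidence)
    # Deep consideration: model what each perspective has to say.
    views = [llm(f"From the perspective of {p}, assess this dispute.\n"
                 f"Terms: {terms}\nClaims: {dispute.claims}\nEvidence: {record}")
             for p in perspectives]
    # Transparent reasoning: an overall judgement with explicit reasoning.
    dispute.judgement = llm(
        "Weigh these perspectives and produce a judgement with detailed reasoning, "
        "citing the evidence relied on:\n" + "\n---\n".join(views)
    )
    return dispute.judgement
```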
Feasibility
LLMs can already do basic versions of steps 1–4, but there are difficult open technical problems in this space:
Judgement: Systems may not currently have good enough judgement to do steps 1, 3, and 4 in high-stakes contexts (and until recently, they clearly didn’t).
Real-world evidence assessment: Systems don’t currently know how to handle conflicting evidence provided digitally about what happened in the real world.
Verifiable fairness/neutrality: The full version of this technology would require a level of fairness and neutrality which isn’t attainable today.
Those are large technical challenges, but we think it’s still useful to get started on this technology today, because iterating on less advanced versions of arbitration tech could help us to bootstrap our way to solutions. Particularly promising ways of doing that include:
Starting in lower-stakes or easier contexts (for example, digital-only spaces avoid the challenge of establishing provenance for real-world evidence).
Creating evals, test environments and other infrastructure that helps us improve performance.
On the adoption side, we think there are two major challenges:
Trust: As above, some amount of technical work is needed to make systems verifiably fair/neutral. But even if it becomes true that the systems are neutral, people need to build quite a high level of confidence that the system is genuinely impartial before they’ll bind themselves to its decisions for meaningful stakes.
Legal integration: This tech is only useful to the extent that its arbitration decisions are recognised and enforced as legitimate by the traditional legal system, or are enshrined directly via contract in a self-enforcing way.
(We are unsure how large a challenge this will be; perhaps you can write contracts today that are taken by the courts as robust. But it may be hard for parties to have large trust in them before they have been tested.)
Both of these challenges are reasons to start early (as there might be a long lead time), and to make work on arbitration tech transparent (to help build trust).
Possible starting points // concrete projects
Work with an arbitration firm. Work with (or buy) a firm already offering arbitration services to start automating parts of their central work, and scale up from there.
Work with an online platform that handles arbitration. Use AI to improve their processes, and scale from there.
Create a bot to settle informal disputes. Build an arbitration-as-a-service bot that people can use to settle informal disputes.
Trial a system on internal disputes. This could be at your own organisation, another organisation, or a coalition of early adopter organisations.
Run a pilot in parallel to regular arbitration. Run a pilot where an automated arbitration system is given access to all the relevant information to resolve disputes, and reaches its own conclusions — in parallel to the regular arbitration process, which forms the basis of the actual decision. You could partner with an arbitration firm, or potentially do this through a coalition of early adopter organisations, perhaps in combination with philanthropic funding.
Background networking
We can only do things like collaborate, trade, or reconcile if we’re able to first find and recognise each other as potential counterparties. Today, people are brought into contact with each other through things like advertising, networking, even blogging. But these mechanisms are slow and noisy, so many people remain isolated or disaffected, and potentially huge wins from coordination are left undiscovered.[3]
Tech could bring much more effective matchmaking within reach. Personalised, context-sensitive AI assistance could carry out orders of magnitude more speculative matchmaking and networking. If this goes well, it might uncover many more opportunities for people to share and act on their common hopes and concerns.
Design sketch
A ‘matchmaking marketplace’ of attentive, personalised helpers bustles in the background. When they find especially promising potential connections, they send notifications to the principals or even plug into further tools that automatically take the first steps towards seriously exploring the connection.
You can sign up as an individual or an existing collective. If you just want to use it passively, you give a delegate system access to your social media posts, search profiles, chatbot history, etc. — so this can be securely distilled into an up-to-date representation of hopes, intent, and capabilities. The more proactive option is to inject deliberate ‘wishes’ through chat and other fluent interfaces.
Under the hood, there are a few different components working together:
Interoperable, secure ‘wish profiling’ systems which identify what different participants want.
People connect their profiles on existing services (social media, chatbot logs, email, etc).
LLM-driven synthesis (perhaps combined with other forms of machine learning) curates a private profile of user desires.
Optionally, chatbot-style assistance can interview users on the points of biggest uncertainty, to build a more accurate profile.
A searchable ‘wish registry’ which organises large collections of wants and offers, while maintaining semi-privacy.
Each user’s interests can run searches, finding potential matches and surfacing only enough information about them to know whether they are worth exploring further.
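As a toy illustration of a wish registry that surfaces only minimal information about potential matches, here is a sketch; `embed` is a hypothetical embedding function, and a real system would need consent, access control, and far better matching.

```python
# Toy sketch of a 'wish registry' that reveals only a public teaser and a
# similarity score to potential matches, never the private wishes themselves.
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple
import numpy as np

@dataclass
class WishProfile:
    user_id: str
    public_teaser: str      # the only text revealed to potential matches
    private_wishes: str     # distilled from chats/posts; never shared directly

def find_matches(embed: Callable[[str], np.ndarray],
                 query_profile: WishProfile,
                 registry: Sequence[WishProfile],
                 top_k: int = 3) -> List[Tuple[str, float]]:
    q = embed(query_profile.private_wishes)
    scored = []
    for other in registry:
        if other.user_id == query_profile.user_id:
            continue
        v = embed(other.private_wishes)
        sim = float(np.dot(q, v) / (np.linalg.norm(q) * np.linalg.norm(v)))
        # Only the teaser and a score are surfaced, not the wishes themselves.
        scored.append((other.public_teaser, sim))
    return sorted(scored, key=lambda t: t[1], reverse=True)[:top_k]
```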
Feasibility
A big challenge here is privacy and surveillance. Doing background networking comprehensively requires sensitive data on what individuals really want. This creates a double-edged problem:
If sensitive data is too broadly available, it can be used for surveillance, harassment, or exploitation; including by big corporations or states.
If sensitive data is completely private, it opens up the possibility of collusion, for example among criminals.
This is a pretty challenging trade-off, with big costs on both sides. Perhaps some kind of filtering system which determines who can see which bits of data could be used to prevent data extraction for surveillance purposes while maintaining enough transparency to prevent collusion.
Ultimately, we’re not sure how best to approach this problem. But we think that it’s important that people think more about this, as we expect that by default, this sort of technology will be built anyway in a way that isn’t sufficiently sensitive to these privacy and surveillance issues. Early work which foregrounds solutions to these issues could make a big difference.
Other potential issues seem easier to resolve:
Technically, background networking tools already seem within reach using current systems. Large-scale deployments would require indexing and registry, but it seems possible to get started on these using current systems.
One note is that it seems possible to implement background networking in either a centralised or a decentralised way. It’s not clear which is best, though decentralised implementations will be more portable.
Adoption also seems likely to work, because there are incentives for people to pay to discover trade and cooperation opportunities they would otherwise have missed, analogous to exchange or brokerage fees. Though there are some trickier parts, we expect them to ultimately be surmountable (though timing may be more up for grabs than absolute questions of adoption):
In the early stages when not many people are using it, the value of background networking will be more limited. Possible responses include targeting smaller niches initially, and proactively seeking out additional network beneficiaries.
It’s harder to incentivise people to pay for speculative things like uncovering groups they’d love that don’t yet exist. You could get around this using entrepreneurial or philanthropic speculation (compare the dominant assurance contract model and related payment incentivisation schemes).
Possible starting points // concrete projects
Work with existing matchmakers to improve their offering. Find groups that are already doing matchmaking and are eager for better systems — perhaps among community organisers, businesses, recruiters or investors. Work with them to understand the pain points in their current networking, and what automated offerings would be most appealing. Then build those tools and systems.
Build a networking tool for a specific community. Build a custom networking system for a particular group or subculture. For example, this could look like a networking app or a plug-in to an existing online forum. This could start delivering value fairly quickly, and provide a good opportunity for iteration.
Structured transparency for democratic oversight
Today, citizens in democracies have limited mechanisms to verify whether institutions’ public claims are consistent with their internal evidence:
The baseline is highly opaque.
Freedom of information systems help, but can be evaded by non-cooperating institutions.
Public inquiries can be reasonably thorough, but are expensive and slow.
Full transparency has many costs and is typically highly resisted.
This is costly — e.g. the UK Post Office scandal over its Horizon IT system led to hundreds of wrongful prosecutions that could have been avoided. And it creates bad incentives for those running the institutions.
AI has the potential to change this. Instead of oversight being expensive, reactive, and slow, automated systems could in theory have real-time but sandboxed access to institutional data, routinely reviewing operational records against public claims and surfacing inconsistencies as they emerge.
Where confidential monitoring helps willing parties verify each other, structured transparency for democratic oversight aims to hold institutions accountable to the broader public.[4]
Design sketch
When an oversight body wants to verify facts about the behaviour of another institution, it requests comprehensive data about the internal operations of that institution. AI systems are tasked with careful analysis of the details, flagging the type and severity of any potential irregularities. Most of the data never needs human review.
In the simpler version, this is just a tool which expands the capacity of existing oversight bodies. Even here, the capacity expansion could be relatively dramatic — this kind of semi-structured data analysis is the kind of work that AI models can excel at today — without needing to trust that the systems are infallible (since the most important irregularities will still have human review).
A more ambitious version treats this as a novel architecture for oversight. AI systems operate continuously within secure environments that don’t give any humans access to the full dataset. They can flag inconsistencies as institutional data is deposited rather than waiting for an investigation to begin. For maximal transparency, summaries could be made available to the public in real-time, without revealing any confidential information that the public does not have rights to.
Under the hood, this might involve:
Secure data repositories, such that institutions routinely share operational data with a sandboxed environment operated by or on behalf of the oversight body, without any regular human access to the data.
Continuous ingestion and indexing of institutional public outputs (press releases, regulatory filings, budget documents, etc.) into a searchable database.
Automated cross-referencing between public claims and internal records.
Highlighting of potential issues (mismatches between public statements and private information, as well as decisions made in violation of normal procedures).
Further automated investigation of potential issues, escalating to human review when sufficiently large issues are identified with sufficient confidence.
Importantly, the sandbox outputs its findings but not the underlying data; if there is need for transparency on that, this is a separate oversight question.
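Here is a rough sketch of the cross-referencing step, in which only findings (not the underlying records) leave the sandbox; `llm` and `search_internal` are hypothetical stand-ins.

```python
# Illustrative sketch of cross-referencing public claims against internal
# records inside the sandbox, emitting only findings. Hypothetical interfaces.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Finding:
    claim: str
    severity: str    # e.g. "none", "minor", "serious"
    summary: str     # explanation that avoids quoting confidential records

def cross_reference(llm: Callable[[str], str],
                    public_claims: List[str],
                    search_internal: Callable[[str], List[str]],
                    escalate_at: str = "serious") -> List[Finding]:
    findings = []
    for claim in public_claims:
        records = search_internal(claim)   # stays inside the sandbox
        verdict = llm(
            f"Public claim: {claim}\nRelevant internal records:\n" + "\n".join(records) +
            "\nIs the claim consistent with the records? Reply with a severity "
            "(none/minor/serious) and a short explanation that does not quote the records."
        )
        severity = verdict.split()[0].strip(".,").lower() if verdict else "none"
        findings.append(Finding(claim=claim, severity=severity, summary=verdict))
    # Only sufficiently serious findings get escalated for human review.
    return [f for f in findings if f.severity == escalate_at]
```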
Feasibility
There are two important aspects to feasibility here: technical and political.
Technically, decent reliability at the core functionality is possible today. Getting up to extremely high reliability, so that it could be trusted not to flag too many false positives across very large amounts of data, might be a reach with present systems, but it is exactly the kind of capability that commercial companies should be incentivised to solve for business use.
Political feasibility may vary a lot with the degree of ambition. The simplest versions of this technology might in many cases simply be adopted by existing oversight bodies to speed up their current work. Anything which requires them getting much more data (e.g. to put in the sandboxed environments) might require legislative change — which may be more achievable after the underlying technology can be shown to be highly reliable.
Challenges include:
Adversarial dynamics: the technical bar to verify claims against actively adversarial institutions (who are manipulating deposited data, potentially via AI) is substantially higher.
This is the bar that we’d need to reach for confidential monitoring below.
Defamation risk: the downsides of false positives, where your system reports someone misrepresenting things when they were not, could be significant (although this can perhaps be mitigated by giving people a right-of-rebuttal where they give further data to the AI systems which monitor the confidential data streams).
Avoiding abuse: designing the systems so that they do not expose the confidential data, and cannot be weaponised to ruin the reputation of a department with very normal levels of error.
Ultimately the more transformative potential from this technology comes in the medium-term, with new continuous data access for oversight bodies. But this is likely to require legislative change, and the institutions subject to it may resist. Perhaps the most promising adoption pathway is to demonstrate value through voluntary pilots with oversight bodies that already have data access and want better tools. This could build the evidence base (and hence political constituency) for wider and deeper deployment.
Possible starting points // concrete projects
Retrospective validation on historical cases. Apply consistency-checking tools to document sets from well-understood historical cases where the relevant internal documents have subsequently been released (e.g. Enron emails). This builds the technical foundation, and demonstrates the concept without requiring any current institutional access.
Institutional public statement reliability tracker. Build a tool tracking whether agencies’ public claims about performance, spending, or policy outcomes are consistent with publicly available data — statistical releases, budget documents, prior statements. Start with a single policy domain. This requires no institutional partnerships and builds a public constituency for structured transparency. This is a version of reliability tracking, applied specifically to institutional accountability.
Pilot a FOIA exemption assessment tool. Partner with an Inspector General office to build a tool that reviews withheld documents and assesses whether claimed exemptions (national security, personal privacy, deliberative process) are applied appropriately. The IG already has legal access under the Inspector General Act; the tool helps them do their existing job faster and builds the working relationship needed for more ambitious deployments. This is also a natural testbed for the sandboxed architecture in miniature — the tool operates within the IG’s secure environment, producing exemption-appropriateness findings without the documents themselves leaving the system.
Confidential monitoring and verification
Monitoring and verifying that a counterparty is keeping up their side of the deal is currently expensive and noisy. Many deals currently aren’t reachable because they’re too hard to monitor. Confidential AI-enabled monitoring and verification could unlock many more agreements, especially in high-stakes contexts like international coordination where monitoring is currently a bottleneck.
Design sketch
When organisation A wants to make credible attestations about their work to organisation B, without disclosing all of their confidential information, they can mutually contract an AI auditor, specifying questions for it to answer. The auditor will review all of A’s data (making requests to see things that seem important and potentially missing), and then produce a report detailing:
Its conclusions about the specified questions.
The degree to which it is satisfied that it had good data access, that it didn’t run into attempts to distort its conclusions, etc.
This report is shared with A and B, then A’s data is deleted from the auditor’s servers.
Under the hood, this might involve:
Building a Bayesian knowledge graph, establishing hypotheses, and understanding what evidence suggests about those hypotheses.
Agentic investigatory probes into the confidential data, in order to form grounded assessments on the specified questions.
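As a toy illustration of the "hypotheses plus evidence" structure, here is a sketch in which the auditor maintains a belief about an audited claim and updates it as probes return evidence; the probabilities are placeholders, and a real system would estimate them.

```python
# Toy sketch: the auditor maintains beliefs over hypotheses about the audited
# claims and updates them with Bayes' rule as agentic probes return evidence.
# Likelihoods below are hard-coded placeholders.
from dataclasses import dataclass, field

@dataclass
class Hypothesis:
    statement: str
    prior: float
    posterior: float = field(init=False)

    def __post_init__(self):
        self.posterior = self.prior

    def update(self, p_evidence_if_true: float, p_evidence_if_false: float) -> None:
        # Standard Bayes update on a binary hypothesis given one piece of evidence.
        numerator = p_evidence_if_true * self.posterior
        denominator = numerator + p_evidence_if_false * (1 - self.posterior)
        self.posterior = numerator / denominator

# Example: auditing the claim "all withheld documents are privileged".
h = Hypothesis("All withheld documents are genuinely privileged", prior=0.7)
# A probe finds a withheld document with no obvious privilege markers:
h.update(p_evidence_if_true=0.1, p_evidence_if_false=0.6)
print(round(h.posterior, 2))  # belief drops after the probe (0.28)
```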
More ambitious versions might hope to obviate the need for trust in a third party, and provide reasons to trust the hardware — that it really is running the appropriate unbiased algorithms, that it cannot send side-channel information or retain the data, etc. Perhaps at some point you could have robot inspectors physically visiting A’s offices, interviewing employees, etc.
Feasibility
Compared to some of the other technologies we discuss, this feels technologically difficult, in that the really useful versions of the tech may require very high reliability of certain types.
Nonetheless, we could hope to lay the groundwork for the general technological category now, so that people are well-positioned to move towards implementing the mature technology as early as is viable. Some low-confidence guesses about possible early applications include:
Legal audits — for example, claims that the documents not disclosed during a discovery process are only those which are protected by privilege.
Financial audits — e.g. for the purpose of proving viability to investors without disclosing detailed accounts.
Supply chain verification — e.g. demonstrating that products were ethically sourced without exposing the suppliers.
Possible starting points // concrete projects
Start building prototypes. Build a system which tries to detect whether the environment it is auditing is real or counterfeited, and measure its success.
Work with a law or financial auditing firm. Work with (or buy) a firm that does this kind of work, and experiment with how to robustly automate while retaining very high levels of trustworthiness.
Explore the viability of complementary technology. For example, you could investigate the feasibility of demonstrating exactly what code is running on a particular physical computer that is in the room with both parties.
Cross-cutting thoughts
Some cross-cutting technologies
We’ve pulled out some specific technologies, but there’s a whole infrastructure that could eventually be needed to support coordination (including but not limited to the specific technologies we’ve sketched above). Some cross-cutting projects which seem worth highlighting are:
AI delegates and preference elicitation
Many of the technologies we sketched above either benefit from or require agentic AI delegates who can represent and act for a human principal. Developing customisable platforms could be useful for multiple kinds of tech, like background networking, fast facilitation, and automated negotiation.
Some ways to get started:
Direct preference elicitation: develop efficient and appealing interview-style elicitation of values, wishes, preferences and asks.
Passive data ingestion: build a tool that (consensually) ingests and distils all the available online content about a person — social media, browsing history, email, etc — and extracts principles from it (cf inverse constitutional AI).
One clarification is that though agentic AI delegates would be useful for some of the coordination tech above, it needn’t be the same delegate doing the whole lot for a single human:
You could have different delegates for different applications.
Some delegates might represent groups or coalitions.
Some delegates could be short-lived, and spun up for some particular time-bounded purpose.
Charter tech
A lot of coordination effort between people and organisations goes not into making better object-level decisions, but establishing the rules or norms for future coordination — e.g. votes on changing the rules of an institution. It is possible that coordination tech will change this basic pattern, but as a baseline we assume that it will not. In that case, making such meta-level coordination go well would also be valuable.
One way to help it go well is by making the governance dynamics more transparent. Voting procedures, organisational charters, platform policies, treaty provisions, etc. create incentives and equilibria that play out over time, often in ways the framers didn’t anticipate. Let’s call any technology which helps people to better understand governance dynamics, or to make those dynamics more transparent, ‘charter tech’. In some sense this is a form of epistemic tech; but as the applications are always about coordination, we have chosen to group it with other coordination technologies. We think charter tech could be important in two ways:
Through directly improving the governance dynamics in question, helping to avoid capture, conflict, and lock-in.
Through compounding effects on future coordination, which will unfold in the context of whatever governance structures are in place.
Charter tech could be used in a way that is complementary to any of the above technologies (if/when they are used for governance-setting purposes), although can also stand alone.
For the sake of concreteness, here is a sketch of what charter tech could look like:
A “governance dynamics analyser” that ingests descriptions of constitutions, charters, policies or community norms, builds models of power, incentives, and information flow, and then (a) forecasts likely equilibria and failure modes, (b) red-teams for strategic abuse,[5] and (c) proposes safer rule variants that preserve the framers’ intent.[6]
While this tool can be called actively if needed, there is also a classifier running quietly in the background of organisational docs/emails, and when it detects a situation where power dynamics and governance rules are relevant, it runs an assessment — promoting this to user attention just in cases where the proposed rules are likely to be problematic.
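To illustrate what a typed governance graph (per the accompanying footnote) might look like, and the kind of mechanical check such an analyser could run, here is a small sketch; the data structure and the coalition check are illustrative only.

```python
# Minimal sketch of a 'typed governance graph': roles, permissions, and vote
# thresholds represented explicitly so simple failure modes (e.g. how small a
# clique can rewrite the rules) can be checked mechanically. Illustrative only.
import math
from dataclasses import dataclass, field
from typing import Dict, Set

@dataclass
class GovernanceGraph:
    roles: Dict[str, Set[str]] = field(default_factory=dict)        # role -> members
    permissions: Dict[str, Set[str]] = field(default_factory=dict)  # action -> roles allowed
    thresholds: Dict[str, float] = field(default_factory=dict)      # action -> vote share required

    def min_coalition_for(self, action: str) -> int:
        # Smallest group that could carry out this action: the vote threshold
        # applied to whoever is allowed to vote on it (0 means nobody can).
        voters: Set[str] = set().union(*(self.roles[r] for r in self.permissions.get(action, set())))
        if not voters:
            return 0
        threshold = self.thresholds.get(action, 1.0)
        return max(1, math.ceil(threshold * len(voters)))

charter = GovernanceGraph(
    roles={"board": {"ana", "bo", "cy"}, "members": {"ana", "bo", "cy", "di", "ed", "fay"}},
    permissions={"amend_charter": {"board"}},
    thresholds={"amend_charter": 0.5},
)
# Red-team style check: how small a group can rewrite the rules? -> 2
print(charter.min_coalition_for("amend_charter"))
```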
Note that charter tech could be used to cause harm if access isn’t widely distributed. Vulnerabilities can be exploited as well as patched, and a tool that makes it easier to identify governance vulnerabilities could be used to facilitate corporate capture, backsliding or coups. Provided the technology is widely distributed and transparent, we think that charter tech could still be very beneficial — particularly as there may be many high-stakes governance decisions to make in a short period during an intelligence explosion, and the alternative of ‘do our best without automated help’ seems pretty non-robust.
Some ways to get started on using AI to make governance dynamics more transparent:
Work with communities that iterate frequently on governance (DAOs, open-source projects) to test analyses against what actually happens when rules change.
Compile a pattern library of governance failures and successes, documented in enough detail to inform automated analysis.
Build simulation environments where proposed rules can be stress-tested against populations of agents with varying goals, including adversarial ones.
Partner with mechanism design researchers to identify which aspects of their formal analysis can be automated and applied to less formal real-world documents.
Adoption pathways
Many of these technologies will be directly incentivised economically. There are clear commercial incentives to adopt faster, cheaper methods of facilitation, negotiation, arbitration, and networking.
However, adoption seems more challenging in two important cases:
Adoption by governments and broader society. Many of the most important benefits of coordination tech for society will come from government and broad social adoption, but these groups will be less impacted by commercial incentives. This bites particularly hard for technologies that could be quite expensive in terms of inference compute, like fast facilitation, arbitration and negotiation. By default, these technologies might differentially help wealthy actors, leaving complex societal-level coordination behind. We think that the big levers on this set of challenges are:
Building trust and legitimacy earlier, by getting started sooner, building transparently, and investing in evals and other infrastructure to demonstrate performance.
Targeting important niches that might be slower to adopt by default. More research would be good here, but two niches that seem potentially important are:
Coordination among and between very large groups, like whole societies. This might be both strategically important and lag behind by default.
International diplomacy. Probably coordination tech will get adopted more slowly in diplomacy than in business, but there might be very high stakes applications there.
Adoption of confidential monitoring and structured transparency. These technologies are less accessible with current models and may require large upfront investments, while many of the benefits are broadly distributed.
This makes it less likely that commercial incentives alone will be enough, and makes philanthropic and government funding more desirable.
Other challenges
The big challenge is that coordination tech (especially confidential coordination tech) is dual use, and could empower bad actors as much or more than good ones.
There are a few ways that coordination tech could lead to shifts in the balance of power (positive or negative):
Some actors could get earlier and/or better access to coordination tech than others.[7]
Actors that face particular barriers to coordination today could be asymmetrically unblocked by coordination tech.
Individuals and small groups could become more powerful relative to the coordination mechanisms we already have, like organisations, ideologies, and nation states.
It’s inherently pretty tricky to determine whether these power shifts would be good or bad overall, because that depends on:
Value judgements about which actors should hold power.
How contingent power dynamics play out.
Big questions like whether ideologies or states are better or worse than the alternatives.
Predictions about how social dynamics will equilibrate in an AI era that looks very different to our world.
However, as we said above, it’s clear that coordination tech might have significant harmful effects, through enabling:
Large corporations to collude with each other against the interests of the rest of society.[8]
More selfishness and criminality, as social mechanisms of coordination are replaced by automated ones which don’t incentivise prosociality to the same extent.
We don’t think that this challenge is insurmountable, though it is serious, for a few reasons:
The upsides are very large. Coordination tech might be close to necessary for safely navigating challenges like the development of AGI, and could empower actors to coordinate against the kinds of misuse listed above.
The counterfactual is that coordination tech is developed anyway, but with less consideration of the risks and less broad deployment. We think that this set of technologies is going to be sufficiently useful that it’s close to inevitable that they get developed at some point. By engaging early with this space, we can have a bigger impact on a) which versions of the technology are developed, b) how seriously the downsides are taken by default, c) how soon these systems are deployed broadly.
Some applications seem robustly good. For example, the potential for misuse is low for technologies like transparent facilitation or widely deployed charter tech. More generally, we expect that projects that are thoughtfully and sensitively run will be able to choose directions which are robustly beneficial.
That said, we think this is an open question, and would be very keen to see more analysis of the possible harms and benefits of different kinds of coordination tech, and which versions (if any) are robustly good.
This article has gone through several rounds of development, and we experimented with getting AI assistance at various points in the preparation of this piece. We would like to thank Anthony Aguirre, Alex Bleakley, Max Dalton, Max Daniel, Raymond Douglas, Owain Evans, Kathleen Finlinson, Lukas Finnveden, Ben Goldhaber, Ozzie Gooen, Hilary Greaves, Oliver Habryka, Isabel Juniewicz, Will MacAskill, Julian Michael, Justis Mills, Fin Moorhouse, Andreas Stuhmüller, Stefan Torges, Deger Turan, Jonas Vollmer, and Linchuan Zhang for their input, and we apologise to anyone we’ve forgotten.
We’re highlighting six particular technologies, and clustering them all as ‘coordination technologies’. Of course in reality some of the technologies (and clusters) blur into each other, and they’re just examples in a high-dimensional possibility space, which might include even better options. But we hope by being concrete we can help more people to start seriously thinking about the possibilities.
Meanwhile small cliques with clear interests often have an easier time identifying and therefore acting on their shared interests — in extreme cases resulting in harmful cartels, oligarchies, and so on. That’s also why tyrants throughout history have sought to limit people’s networking power.
Both confidential monitoring and what we are calling structured transparency for democratic oversight are aspects of structured transparency in the way that Drexler uses the term.
This red-teaming could be arbitrarily elaborate, from simple LM-based once-over screening to RAG-augmented lengthy analysis to expansive simulation-based probing and stress-testing.
Convert informal descriptions or formal rules into a typed governance graph: roles, permissions, decision thresholds, delegation, auditability, and recourse
Note uncertainties; seek clarification or highlight ambiguities
As we move toward superintelligence, incremental policy updates won’t be enough. To kick-start this much-needed conversation, OpenAI is offering a slate of people-first policy ideas designed to expand opportunity, share prosperity, and build resilient institutions—ensuring that advanced AI benefits everyone.
These ideas are ambitious, but intentionally early and exploratory. We offer them not as a comprehensive or final set of recommendations, but as a starting point for discussion that we invite others to build on, refine, challenge, or choose among through the democratic process. To help sustain momentum, OpenAI is:
establishing a pilot program of fellowships and focused research grants of up to $100,000 and up to $1 million in API credits for work that builds on these and related policy ideas
convening discussions at our new OpenAI Workshop opening in May in Washington, DC.