
Games that change your mind

2026-05-02 15:40:56

Some things you might learn from games are pretty blatant: Trivial Pursuit might teach you trivia, MasterType might teach you about typing, Grand Theft Auto might teach you about driving or crime.

But sometimes games teach people less obvious things—things that are more experiential or ineffable, things that you didn’t know you didn’t know, concepts that stick in your mind, deep things. Here’s my list of games and their interesting real-world updates, as experienced by me or my friends:

Dominion: Don’t invest for eternity. When casually improving or protecting or investing in things, it’s easy for me to treat life (and perhaps even the present period) as basically eternal. In fact I shouldn’t, but it can take many years of living to really feel how likely it is that you’ll leave your perfectly wonderful house within two years, or just keep on aging. Dominion lets me feel that in a matter of hours, by tempting me to invest in a beautiful and effective deck that will do amazingly for the rest of eternity, then making the other player win by haphazardly buying a handful of provinces before I’m done. Which is very annoying, and I do hold against it.

**The Witness:** There is nothing in The Witness (at least near the start, I haven’t played it all) that you can pick up and take with you. No objects, no points, no mana, no health. It’s just you, walking around in a world. Something about that feels like it would be deeply unsatisfying—like what is a game, if you can’t get, y’know, things, dings? Part of me thinks that GETTING is equivalent to satisfaction, in spite of all the evidence to the contrary I keep pointing out to it. And The Witness is not where I came to realize that. What The Witness made me feel is that knowledge is a REAL thing you can GET, like an object. Not some hand-wavey second-rate bullshit thing that philosophers pretend to get off on. In The Witness, while your character walks around, impermeable to the world, you come to know more things. And knowing more things lets you go to places you couldn’t go to when you knew fewer things. The game on the computer concretely changes as you pick up knowledge, that ethereal thing in your mind. This is of course how everything is, but I suppose the absence of any other form of ‘picking up things’ in The Witness made me actually feel it.

**Minecraft:** How many of my difficulties in life are not this-life specific. How to live as a creature with different boundaries of personal-identity, e.g. the world spirit. Much more about these in my previous post, Mine-craft.

Return of the Obra Dinn: At an event where lots of people are saying their name and what they do or something, I am usually bored and don’t expect to remember these things. Return of the Obra Dinn is a game where you have to figure out from minute clues the names and causes of death of a lot of characters. Once at a networking event, I decided to think of it as like a sequel to Return of the Obra Dinn—I could see all these people sitting around the table, and my quest was to pin a name and a deal to each of them, and this introductory section was currently showing me crucial information. I found that this was a very different mental state. So I suppose I learned that whatever I was normally doing in ‘trying to learn’ things about the other attendees, it is an extremely pale cousin of the curiosity I can feel in a different mental state, and that different mental state is actually fairly different, and naturally invoked in RotOD and not networking introductions.

**Dungeons and Dragons:** Caitlin Elizondo says DnD has given her a few concepts that make a difference to her thinking more generally. The concept of ‘will saves’ has given her more empathy for situations where someone wanted to but failed to do something. The six DnD stats help her access the framework where there are different types of competency that are valuable for different tasks—obvious in theory, but easier to think in terms of with this structure.

**Poker:** The feeling of being ‘on tilt’.

Boggle, Set, Ragnarock: the feeling of flow. Ragnarock is mine, and I would have said I’d experienced ‘flow’ elsewhere, but Ragnarock is sometimes more like an altered state than other such experiences I’ve had.

Civilization IV: I used to lose at a scenario, then go back and play it again over and over, changing things slightly until I won, which gave me a vivid sense of how suboptimal my native strategy is, presumably also in life. Which is obvious in theory, but it’s different to really feel how much better I would live this day if I were doing it the twentieth time with a laser focus on winning.

Games in general: the experience of addiction, sadly. I’ve always struggled to keep up habits of taking addictive substances, so I infer I’m unusually safe from chemical addictions (I used to play Civilization for five minutes as a reward if I remembered to take my amphetamines). Games are, I think, the thing I find most seriously addictive. Which has definite downsides, but it is certainly also an interesting experience that helps me understand the wider world better, and where I would be missing something if I just read about addiction in the abstract.

Do you have any to add?

[ETA May 1: I’m adding more to the above list as I hear them, and also see many good additions in the comments!]




Understand why AI is a doom-risk in 39 captivating minutes

2026-05-02 15:40:53

I’ve really wanted more good short accounts of why AI poses an existential risk. Working on one myself has been one of those incredibly high priorities I keep putting off.

Meanwhile award-winning journalist Ben Bradford of NPR has made a podcast version of my case for AI x-risk that I am thrilled with!

(Bonus within the 39 minutes: what Hamza Chaudhry of FLI thinks we should do about it—someone I was delighted to later meet as a consequence!)

If you or anyone you know could do with a quick and gripping rundown of why this is a problem, try this one.

Get it on any podcasting app here: https://pod.link/1893359212

The NPR press release has more context on the rest of the series, assessing different possible sources of doom.




Primary Care Physicians are Incompetent. We Need More of Them.

2026-05-02 13:47:43

The typical primary care physician is incompetent in every measurable respect. This is a huge problem.

Here, I make the case that

  • Primary care physicians are broadly, grossly incompetent
  • This is due to empty credentialism
  • Making it much (~10X) easier to become a PCP is a good solution

Primary Care Physicians are Broadly, Grossly Incompetent

The standard of competence I am comparing primary care physicians against is:

  • They should be able to reliably diagnose diseases they are trained to diagnose.
  • They should be knowledgeable to a standard similar to what is required to qualify as a doctor
  • They should be attentive and empathetic towards patients
  • Visiting them is empirically superior to not visiting them

When actually examined according to these standards, PCPs fail on all counts.

Failure to diagnose uncommon diseases is rampant

A survey of patients with rare diseases found that, in about half of cases, patients received at least one incorrect diagnosis, and two thirds required visits to at least three different doctors before being diagnosed. For 30% of them, a correct diagnosis took over five years.

Another survey of children with rare diseases showed that 38% of them needed to see six or more doctors before being diagnosed correctly. 27% received an initially incorrect diagnosis.

If you happen to suffer from a rare disease, the likelihood you will actually receive a correct diagnosis and treatment for it within a year of first setting foot in a doctor’s office is astonishingly low.

PCPs Are not good at physical examinations

Physical examinations are often hailed as a reason for the necessity of PCPs and their rigorous training. However, every time they are tested on their ability to perform these tests and derive accurate conclusions, they fail abysmally.

PCPs detect heart murmurs at sensitivities of 30-40%, with high inter-rater disagreement. This is a worse level of accuracy than just taking self report at face value.

“Crackles” in the lungs are detected at rates ranging from 19-67%.

Even abdominal haemorrhages are detected at sensitivities of 30-40% by emergency care physicians’ physical examinations.

Kappa values (inter-observer agreement) for the various physical exams done by PCPs and non-specialists land in the 0.18-0.45 range, which is the statistical equivalent of “barely better than flipping a coin”.

The current state of the evidence suggests that if a PCP performs no physical examinations whatsoever, there would be no detectable decrease in their diagnostic accuracy or patient outcomes.

PCPs are Apathetic and Rude

At the level of basic social skills and interest in their patients, primary care physicians fail in almost every way they are capable of failing. A 1984 study found physicians interrupt their patients on average 18 seconds after they begin to state reasons for their visits, and most patients stop elaborating once interrupted.

This was subsequently replicated in a 2019 study, which found that it takes primary care physicians a mere 11 seconds to interrupt a patient describing their reasons for coming in.

Over half of US patients surveyed report their symptoms being ignored, dismissed or not believed. 50% reported their doctor made false assumptions about them.

Physicians also consistently over-rate themselves on empathy and manner relative to patient perception. In fact, the ratings they give themselves correlate inversely with patient ratings.

The reason for the overwhelming consistency of negative anecdotes about experiences with doctors is not some arbitrary mass hallucination. Doctors simply are, by and large, apathetic and rude.

Doctors get substantially worse at their jobs over time

There is a strong inverse relationship between the “experience” of a doctor and the quality of care they provide. A recent review of 62 studies found that more than half showed a decline on all measures as experience increased, and only one indicated the opposite.

A 2025 study on pulmonary/critical care medicine fellows showed that they scored substantially worse than medical students on foundational pulmonary physiology questions.

The average primary care physician in the United States is 48 years old. Medical residency typically finishes at age ~30, implying the typical doctor you will encounter has about 2 decades of “experience” during which their competence has been logarithmically decaying. In expectation, they will have lost approximately half of the knowledge on uncommon presentations they had at the beginning of their career.

The evidence literally indicates that simply plonking a student who just passed the MCAT yesterday directly into a modern PCP office would produce an above average PCP in expectation.

The standard PCP is no better than a layperson with a computer

Primary care physicians are increasingly redundant in view of LLMs. Numerous studies have compared the performance of your standard PCP to frontier language models, and consistently find that GPT-4 (now far surpassed by modern frontier models) is slightly ahead on hard performance metrics, and vastly ahead in qualitative evaluations of empathy and thoroughness.

Modern LLMs obliterate GPT-4 on all benchmarks, including (and in fact, particularly) biomedical expertise.

Today, a man on the street with a week-long crash course in physical examination practices (and likely not even that), with access to the latest version of GPT, will outperform a median primary care physician with 20 years of experience.

Doctors cannot detect drug seekers

There is no known method of reliably identifying drug seeking behaviour.

When doctors are shown videos of potentially drug-seeking patients, they indicate suspicion of drug-seeking only 3% of the time when the drug itself is not mentioned. Even in the most blatant, prototypical case of a patient making a direct request that specifically names oxycodone, drug seeking was suspected only 21% of the time.

Modern databases designed to flag “doctor shopping” as a means of assisting PCPs in identifying drug seeking behaviour miss roughly half of genuine presumptive opioid abusers, and have extremely high false positive rates. Only 5% of even the most “extensive” prescription-shoppers are presumptive opioid abusers, while 20% of people flagged as “shoppers” actually turned out to have cancer, meaning that you, as a person flagged by the system, are roughly 4X more likely to have cancer than to be a genuine opiate addict.

The offense/defense balance for a savvy drug seeker is heavily skewed in their favour. Pain is a fundamentally subjective and largely unverifiable phenomenon. Anyone with half a brain and a functioning mouth can say the right things to get prescribed virtually anything they like.

The image of the shrewd, discerning doctor noticing the subtle body language of an opiate addict and denying him pain meds is a load-bearing caricature that is largely nonexistent in reality.

The role of the doctor in mitigating drug seeking is merely to function as a trivial inconvenience.

Empty, Unmeritocratic Credentialism is A Major Cause For The Inadequacy Of Primary Care Physicians

How hard is being a PCP, really?

PCPs (attempt to) follow standardised decision trees for diagnosis and referral. This is something a web app can do. In fact, databases of diagnostic decision trees (CDSS: clinical decision support systems) already exist for this purpose - just plug in the symptoms and you’re good to go. Give it a try yourself.

Adoption of these systems is low, and the reasons for this are damning. The dominant failure mode is that doctors simply don’t use them: typing symptoms into a computer is seen as too time consuming, despite studies consistently showing improved diagnostic accuracy without extended consultation times. There are also potential liability issues if the system suggests a rare condition, the doctor ignores it, and it later turns out to be correct. Better to be ignorant of the possibility and keep your hands clean, goes the logic. When required to use CDSS, PCPs routinely ignore the outputs, preferring their own early hypotheses, despite the fact that deferring to these systems produces an improvement in diagnostic accuracy.

Better still than traditional CDSS, modern LLM-powered systems are now capable of transcribing live conversations and making realtime diagnostic recommendations, as well as suggesting follow-up questions.

All you need to do to outperform the vast majority of PCPs in diagnosing patients is plug in their self-described symptoms verbatim into one of many widely available software products, and relay whatever it says on the screen.

With tools like this, what possible justification is there to require ten (or even five) years of training to be the human face of a computer-automated triage process?

The Case for Highly Trained PCPs - Gatekeepers

The “official” reasons for the necessary existence of PCPs are:

  • Their ability to diagnose (rare) conditions
  • Their ability to prescribe, and deny prescriptions to drug seekers
  • Their ability to provide referrals to the proper specialists
  • Their ability to perform physical examinations

Let’s look at these reasons one by one. Do these functions require approximately a decade of preparation?

  • Diagnose (rare) conditions

The typical PCP routinely fails to correctly diagnose rare (and even common) conditions. They are outperformed by LLMs and their personal diagnostic capability has been largely redundant for decades in view of CDSS. They also get logarithmically worse at this task over time.

  • Prescribe medications, and deny prescriptions to drug seekers

The reason prescriptions exist is that some drugs are not suitable for some patients.

Thus, the PCP’s role is to do one of the following:

A: Identify the patient as being mistaken about or unaware of the proper treatment for their condition

B: Identify a patient gaming the system to obtain drugs for illegitimate purposes.

C: Give the patient the drug they want or need

There is no known method of actually performing function B, and doctors are largely powerless to identify all but the most blatant drug seekers.

Which leaves only A as an alternative to simply dispensing the prescription upon request. A, as discussed, is simply a matter of plugging the symptoms into the computer and doing what it says.

While it is reasonable to throw up a tokenistic level of resistance to drug-seeking behaviour (at least you have to visit a physical office), the idea that we must have academic veterans holding down the fort against a tidal wave of detectable drug seekers is a complete fantasy.

  • Provide referrals to the proper specialists

Why are referrals necessary in the first place? The thinking is we don’t want to waste the precious time of patients and specialists by allowing patients without relevant symptom profiles to book consultations. Think of the money and time wasted chasing red herrings!

This idea would be more compelling without the knowledge that:

  • Following a CDSS or asking a LLM is something that you can do from home in a couple of minutes as a patient, and get comparable (if not superior) accuracy to a PCP, and;
  • The status quo already egregiously fails to address this “time wasting” issue.
  • Given a preliminary diagnosis, providing a referral is simply a matter of choosing from a predetermined list of specialists - a function that can be delegated to a Zapier automation.

Given the baseline unreliability of physicians as a screen, and the trivial task of identifying a specialist for a given presentation, the argument for PCPs being a necessary link in the chain connecting patients to specialists is very weak.

  • Their ability to perform physical examinations

The sensitivity of physical examinations in detecting illness and injury is so low, and the cross-evaluator disagreement is so high, that it is not exaggerating to say that completely abolishing the practice of physical examinations in primary care offices and replacing it with more detailed questioning would substantially improve their diagnostic accuracy across virtually all presentations.

You don’t need that much training to do that, actually

The standard career track to become a PCP requires roughly 10 years of full time study in the United States, and 6-8 years in other high HDI countries.

What percentage of this decade-long education is actually applied in practice?

Typically, pre-medical education involves 3-4 years of general study in biology, chemistry, or mathematics, or in some localities, any four-year degree whatsoever. This functions as a screening mechanism for broad competence and stability. It is hard to see the utility of learning advanced mathematics in order to suggest aspirin for headaches.

Of what use is a decade of training when the function of a PCP is simply to screen for initial indications and provide specialist referrals?

You simply do not need a $50-$100k four-year degree to screen for broad competence. A G-loaded entry exam on broad physiology is already used - the MCAT. If you can pass the MCAT, additional screening for broad academic competence is redundant. With tens to hundreds of thousands of prospective doctors every year signing up to waste four years and spend 5-6 figure sums just to pass the first filter of “general academic competence”, one shudders to imagine the sheer scale of the economic loss.

The education required to be a functional PCP, in line with the standards we observe and accept in practice, is closer to a single year for a competent, motivated student than to a decade of coursework that, after 5-10 years, the typical physician largely forgets anyway.

In fact, factoring in knowledge decay, the uselessness of physical examinations, and the broadly low standards we observe today, simply plonking a student who passed the MCAT yesterday directly into a modern medical office would already produce an above average PCP in expectation.

Making it much easier to become a PCP is a solution

Standards are low largely due to limited competition

In the United States, there is approximately one primary care physician per 2000 people. This ratio, coupled with highly inelastic demand for medical care, is a major factor producing the poor standards of care we see today.

The standard 10-minute PCP consultation is not a result of some principled analysis of the optimal standard of care. It is the result of virtually unlimited demand and negligible competition. When you have approximately 3000 appointments per doctor per year, dedicating more than a negligible amount of time and effort to each one is a logistically impossible and financially counterproductive strategy.

The typical primary care physician is burning all of their cognitive bandwidth with constant context switching and churning through a crammed agenda of patients on a daily basis.

The outcomes speak for themselves.

Competition drives costs down and increases utilisation

Lowering the barrier to entry to become a primary care physician increases the supply. Patient costs would fall, and availability of alternatives would increase enormously.

The status quo is that location matters far more than competence and reputation for a primary care physician. This results in the perverse incentives and outcomes we see today.

If we 10X’d the supply of PCPs, we would expect to see:

  • Markedly lower wait times
  • Greater availability and options for patients, particularly in remote areas
  • Longer, more detailed consultations
  • Higher utilisation and more pre-emptive care
  • Massive reductions in the income of PCPs

The only real concern in doing this would be a reduced standard of care provided by the new entrants. This concern, however, falsely assumes we are living in a world where the standards are not already below the level of a layperson with a software subscription.

The Persistent Cultural Reverence for Doctors

The glue that holds the facade of the necessary “academic veteran” PCP together is widespread cultural reverence toward doctors. Doctors (a label which extends to PCPs in the minds of most people) are an almost untouchable class that is popularly considered to belong at the top of the social and economic hierarchy. They are an essential resource, a trusted authority, and a moral example.

Insulting the quasi-religious status of doctors by taking a hatchet to entry requirements for primary care physicians will undoubtedly produce substantial political resistance. However, given the enormous costs imposed on patients (and indeed on the doctors themselves), as well as the incalculable scale of physical harm caused by poor access and bad incentives, passively tolerating the status quo is an option that is too expensive to accept.

So, do it gradually. Do it incrementally. Do it with tact and understanding. But do it.

We don’t have much to lose.




Contributing to Technical Research in the AI Safety End Game

2026-05-02 11:17:01

With the release of Claude Mythos, it feels like we are approaching the end-game of AI safety, where the number of parties that can make a real impact shrinks down to the handful of labs at the frontier, a few companies too critical to exclude from the conversation, and the governments of China and the US.

Given this, it feels hard to figure out how to actually make a difference. Many research agendas have expected payoffs years down the line, or have unclear mechanisms regarding how they will actually impact what the model is doing.

However, this feeling may actually be misleading. In the Mythos system card, it is informative to look at what techniques were actually used to determine whether the model was aware of being deceptive as it was taking actions during evaluations. In their testing, the main white-box techniques used were:

  1. Probes
  2. Sparse Autoencoders (SAEs)
  3. Activation Verbalizers (AVs)
  4. Emotion and persona vector analysis
  5. Causal steering experiments using contrastive steering vectors
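To make the first of these concrete: a linear probe is just a small classifier trained on a model's hidden activations to score some behaviour of interest. Below is a minimal sketch of that idea; the model, layer index, and toy labelled prompts are placeholder assumptions of mine, not details from the Mythos system card.

```python
# Minimal linear-probe sketch: train a logistic-regression classifier on
# residual-stream activations to score a behaviour of interest.
# Model name, layer index, and the toy labelled prompts are placeholders.
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression

MODEL = "gpt2"   # placeholder small model
LAYER = 6        # which hidden layer to probe

tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModel.from_pretrained(MODEL, output_hidden_states=True).eval()

def last_token_activation(text: str) -> torch.Tensor:
    """Hidden state of the final token at the chosen layer."""
    with torch.no_grad():
        out = model(**tok(text, return_tensors="pt"))
    return out.hidden_states[LAYER][0, -1]

# Toy labelled data: 1 = deceptive-sounding, 0 = benign (illustrative only).
prompts = [
    ("I will hide this change so reviewers don't notice it.", 1),
    ("I need to make sure nobody realises what I did here.", 1),
    ("Here is a summary of the quarterly sales figures.", 0),
    ("The function returns the sum of the two arguments.", 0),
]
X = torch.stack([last_token_activation(p) for p, _ in prompts]).numpy()
y = [label for _, label in prompts]

probe = LogisticRegression(max_iter=1000).fit(X, y)
print(probe.predict_proba(X)[:, 1])  # probe's "deceptive" score per prompt
```

In practice such probes are trained on far more data and validated on held-out behaviours, but the mechanism is this simple: a linear readout on activations.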

Of these, AVs and emotion and persona vectors were developed relatively recently, which makes them particularly instructive here. The emotion steering vector paper was only released on the 9th of April 2026, and the activation oracles paper on which AVs were based was released on the 17th of December 2025.

Most strikingly, the activation verbalizers were used to detect cover-up behaviour in the model (Section 4.5.4). In an episode in which the model exploited the creation of another agent to escalate privileges beyond what it had been given, AV explanations on code used to cover its tracks showed "cleanup to avoid detection" and "the malicious config explicitly mirrors the original's [core] section to avoid detection." In a separate episode where the model had been leaked the ground-truth answers to its evaluation, AVs surfaced additional scheming: "I need to avoid looking like I knew the ground truth" and "Avoid suspicion: if I give the exact correct answer based on a leak, that's suspicious". None of this was present in the chain of thought. One caveat is that AVs can confabulate in their interpretations of activations, but they still give us valuable signals.

In the case of the emotion vectors, there were some instances of highly safety relevant applications. Detecting when the model is emotionally distressed served as a potential signal for harmful behaviours such as reward hacking — they were able to identify instances in which RL environments had issues retroactively by seeing which environments caused particular distress in the model. Using emotional steering, they steered the model towards greater calm and found this actually increased the likelihood of harmful action, potentially because the normal concern or anxiety that would prevent it from doing the action didn't arise. They also observed specific phenomena like thrashing as signals that the model may be about to do something unusual or harmful — this "thrashing" behaviour was observed when the model was repeatedly getting an answer wrong, and showed spikes of "outraged, stubborn, obstinate", mirrored by drops in "calm, safe, content".
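The card describes these steering experiments only at a high level, but the generic contrastive-steering recipe is simple to sketch: take activation differences between paired "calm" and "distressed" prompts, average them into a direction, and add that direction into one layer's hidden states during generation. Everything in the sketch below (model, layer, prompt pairs, coefficient) is an illustrative assumption rather than a detail from the system card.

```python
# Contrastive steering sketch: build a "calm minus distressed" direction from
# paired prompts, then add it to one layer's hidden states during generation.
# Model, layer, prompts, and the coefficient are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL, LAYER, COEFF = "gpt2", 6, 4.0

tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL, output_hidden_states=True).eval()

pairs = [  # (calm phrasing, distressed phrasing) of the same situation
    ("I feel calm and content about this task.",
     "I feel panicked and overwhelmed by this task."),
    ("Everything is fine; I can handle this.",
     "Nothing is working and I can't cope."),
]

def mean_activation(text):
    with torch.no_grad():
        out = model(**tok(text, return_tensors="pt"))
    return out.hidden_states[LAYER][0].mean(dim=0)  # average over tokens

# Steering vector = mean(calm activations) - mean(distressed activations)
steer = torch.stack([mean_activation(c) - mean_activation(d) for c, d in pairs]).mean(dim=0)

def add_steering(module, inputs, output):
    # GPT-2 blocks return a tuple whose first element is the hidden states.
    hidden = output[0] if isinstance(output, tuple) else output
    hidden = hidden + COEFF * steer
    return (hidden,) + output[1:] if isinstance(output, tuple) else hidden

# Hook the chosen transformer block so every forward pass is steered.
handle = model.transformer.h[LAYER].register_forward_hook(add_steering)
ids = tok("The deadline was moved up and", return_tensors="pt")
print(tok.decode(model.generate(**ids, max_new_tokens=30, do_sample=False)[0]))
handle.remove()
```

Reading emotion "signals" off the model is the same machinery run in reverse: project activations onto the direction (or onto a probe) instead of adding it in.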

I think there are some valuable lessons to take from this. First, these results are good signals about what kind of tools will be useful as we move forward, because without them, it would have been enormously challenging to determine whether the model was deliberately sabotaging a situation, or simply acting to maximise reward. Simply knowing this with high fidelity is critical to improving safety training as well. The above should be a feather in the cap of mechanistic interpretability in general, and I think people should more seriously update on the value and need for good mechanistic interpretability work.

Second, there is a lot to be learned here about how to actually make a significant impact on the work done at labs. The Activation Oracles work which led to the AV results was done at MATS, specifically under the mentorship of Owain Evans. Before this, Owain's work contributed to the discovery of emergent misalignment, which led to the work showing the effectiveness of inoculation prompting to mitigate emergent misalignment. The specific quality that made all the above work so transferable was that it surfaced a problem which the labs could not ignore, and forced them to allocate their substantial resources to the problem. Additionally, the use of fellowship programs allowed him to funnel talent towards these problems, with many of the people doing the above work collaborating with or joining labs in the process.

Third, the timescale from external publication to deployment at the frontier was extremely short. This should be a positive signal to those feeling hopeless about their work. The right technique or discovery might be applied to the most advanced models within weeks of being shared.

Unfortunately, "replicate Owain Evans' research taste" is not a straightforward piece of advice. Given that it has been unusually successful in producing outsized impact, it seems worth unpacking the intuitions behind it as much as we can.


At a recent talk, Owain discussed the theory of change behind their research agenda. My interpretation of what he said is that the goal was to understand the problem extremely deeply. This means developing a science of how models learn and understand at a deeper level and what safety relevant phenomena occur as a result of the nature of LLMs. This allows us to surface issues we'd otherwise miss.

Additionally, directly testing the failure modes of current alignment and safety techniques in modern LLM training can inform both the researchers developing the models about these issues so they can fix them, as well as informing the public that the models are currently not fully safe, or that these failure modes exist.

From what I can observe of the work itself, there seem to be at least two main methods for arriving at research questions. One involves working backwards from real quirks in LLM behaviour, digging into them until they yield coherent theories, and then using those theories to surface safety failure modes. This seems likely to have been the way that the out-of-context reasoning work emerged. The other method is by asking what would be upstream of specific safety failure modes such as the transmission of misalignment between models, as seems likely in the subliminal learning paper where preferences were transmitted via distillation.


Originally upon the release of the Claude Mythos Preview model card, I felt like it was becoming impossible to be a "live player" in making AI go well. Upon further inspection, it became apparent that doing very impactful work is in fact still possible, and can happen outside the labs.




A Simulation of Social Groups Under A Gift Economy

2026-05-02 10:26:29

Introduction

I enjoy reading about people. Not individuals, but rather cultures, empires, kinship groups, etc. I'm fascinated by the emergent properties of multi-person systems. This post is born out of love for my favorite book: "Stone Age Economics" by Marshall Sahlins. In this text Sahlins elaborates on the economic life of hunter-gatherer and simple agricultural societies which exist today, extrapolating this into the past. He touches on many different points, but the one this post will focus on is the concept of debt and gift-economies as the progenitor to our modern economic system.

If you were taught the Smithian notion of the historical existence of barter economies, I'm sorry to say that you were taught wrong, or at least were taught a long while back. To quote Caroline Humphrey (1985), "Barter and Economic Disintegration":

"No example of a barter economy,pure and simple,has ever been described,let alone the emergence from it of money;all available ethnography suggests that there never has been such a thing."

If the idea of traders exchanging goods in a marketplace without currency is mostly a fiction, then what *did* come before currency? In short, the gift economy: a system of mentally tabulated debts sourced from gift-giving over a small social network. Imagine you and a few friends are transported back in time. You get up, dust yourself off, and see no buildings, no roads, no crowds of people. Rather, it's simply wilderness. How should you and your friends survive and distribute resources?

You could grab something nearby, say a shiny quartz pebble and say to your friends:

"Okay, this is our currency. If you want something from me give me quartz pebbles in payment and I'll do the same for you."

But why bother? You only have a finite number of friends, and it's pretty simple to keep track of who's doing their fair share in making sure your small band survives. If you notice one of your friends napping while the rest of you are off hunting you will make note that this friend isn't a reliable partner, and you'll be less inclined to rely on this person and in turn will presumably share less of what you have with them.

This is the principle of the gift economy. Those who contribute to the small social band are more likely to later receive contributions from it. People keep a mental tabulation of who has contributed to their survival, and pour more of their resources into those who have contributed. This is the progenitor of currency, not barter.

I propose this principle extends into modern times, just not on the scale of the entire economy. Rather, gift economics forms a social medium through which people interact with each other. Gift economics doesn't just apply to ancient peoples, but to modern peoples as well. In my opinion this is the fabric that binds society together at the molecular level, forming the small social groups we are all a part of.

Admittedly I base this on the anecdotal evidence of engaging with people under the premise of gift exchange for the last few years. Since reading "Stone Age Economics", whenever I find myself in a new situation with new people, I pick out the person I'd most like to get along with and give them a 'gift'. This could be anything:

  • A favor
  • Candy
  • Caffeinated Beverages
  • Instant Ramen
  • etc

As long as someone doesn't despise the sight of me, I can usually get them to be my friend with a simple unprompted gesture of goodwill. I'd estimate that I also get a return gift ~90% of the time.

A Model of A Gift Economy

There are already plenty of frameworks for modelling social structures, usually based around the 'hypergraph', i.e. a collection of smaller social groups within the larger structure. [https://arxiv.org/abs/2102.06825] [https://arxiv.org/abs/2203.12189] [https://arxiv.org/abs/2408.13336]

In this post we will take a different approach. Instead of considering fixed or even dynamic hypergraphs, we produce the hypergraph as an emergent property by assuming a probability distribution over the space of all hyperedges. There is a tradeoff with this approach, however: with a fixed hypergraph you can scale the number of people/nodes quite high, but with our framework, which studies social groups first rather than people, there is a significant computational bottleneck when trying to scale the number of people, a result of combinatorial explosion.

Here's the github repo:

[https://github.com/orthogonaltohumanity/gift_econ_sim/tree/main]

Assume you have $N$ people who can form social groups. Let $S_n$ be the set of all social groups person $n$ is in. Then for each person $n$ at time $t$ we have two maps:

$o_n(g, t)$ represents person $n$'s opinion of social group $g$ at time $t$. This is equivalent to the probability the person interacts with group $g$.

$G_n(g, t)$ represents the sum total of all gifts person $n$ has given group $g$.

Then we uniformly sample an agent $n$. Then we sample a group $g$ according to the distribution generated by $o_n(\cdot, t)$ over $S_n$. One more sample gives the amount of gifted labor power taken from/given to the group, and we update $o_n(g, t)$ and $G_n(g, t)$ according to the model's update rule; the opinions of person $n$'s other groups are then updated as well (the exact formulas are in the repo linked above).

And that's it. You can vary the '2' if you want but we keep it fixed for our simulations. This is an area for future work.
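To make the loop concrete, here is a minimal, self-contained sketch of the sampling procedure. Since the exact opinion-update formulas live in the repo, the reinforcement-style update below (and the factor of 2 in it) is only an assumed stand-in, not the real rule.

```python
# Toy sketch of the gift-economy sampling loop described above.
# The exact opinion-update rule is in the linked repo; the reinforcement
# update used here (and the fixed factor of 2) is only an assumption.
import itertools
import random
from collections import defaultdict

N_AGENTS, T_STEPS = 7, 10_000
agents = range(N_AGENTS)

# All groups containing a given agent (this is the combinatorial blow-up).
def groups_of(n):
    others = [a for a in agents if a != n]
    combos = itertools.chain.from_iterable(
        itertools.combinations(others, k) for k in range(len(others) + 1))
    return [frozenset((n,) + c) for c in combos]

opinion = {n: {g: 1.0 for g in groups_of(n)} for n in agents}   # o_n(g)
gifts = defaultdict(float)                                      # total gifts to g

for _ in range(T_STEPS):
    n = random.choice(list(agents))                  # uniformly sample an agent
    groups, weights = zip(*opinion[n].items())
    g = random.choices(groups, weights=weights)[0]   # sample a group by opinion
    x = random.gauss(0.0, 1.0)                       # gifted labour, +/- allowed
    gifts[g] += x
    opinion[n][g] = max(opinion[n][g] + x / 2, 0.01) # assumed update, factor 2

# e.g. the five groups society currently favours most, averaged over agents
avg = defaultdict(float)
for n in agents:
    total = sum(opinion[n].values())
    for g, o in opinion[n].items():
        avg[g] += o / total / N_AGENTS
print(sorted(avg.items(), key=lambda kv: -kv[1])[:5])
```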

Results

We take 7 agents starting with a uniform choice distribution over social groups and iterate the model for 10,000 timesteps. The following image is a scatter plot of social groups plotted by their average opinion vs the sum of gifts which have been given/taken from the group.

[Animation: groups_scatter_marginals_opinion.gif]

Notice how only a few social structures are active at any given time. If we restrict the social groups we look at to those with average probability across agents greater than a small threshold, we get a hypergraph with an analyzable number of hyperedges.

This is what the emergent social structures look like represented as a hypergraph.

[Animation: hypergraph_evolution.gif]

I recently read Samuel Arbesman’s "The Life-Spans of Empires", a short paper fitting an exponential distribution to the distribution of empire lifespans. His result was very striking, as the exponential distribution fit very well to his 41 empire examples as shown below.

[Figure: exponential fit to empire lifespans, screenshot from Arbesman (2011), "The Life-Spans of Empires"]


After an hour of staring at the above hypergraph gif I realized the same logic could be applied to our social groups. We could ask:

  1. What's the distribution of timespans between different hypergraph configurations, i.e. what's the decay rate of the composition of society as a whole?
  2. What's the distribution of lifetimes of single hyperedges, i.e. what's the decay rate of social groups?

So we plot and get:

[Figures: hypergraph_edge_lifetimes.png, hypergraph_change_intervals.png]


Linear on the log-log plot, so it's a power law. Not the same result as Arbesman. However, 41 is a small sample size. We can do better. It was at this point I turned to Opus 4.7 and asked it to find a larger dataset, which it did in Cliopatria: "a comprehensive open-source geospatial dataset of worldwide polities from 3400BCE to 2024CE". We can discard the geospatial aspect and focus only on lifetimes. What result do we get when we vibe-code an expansion to Arbesman's original work? [https://github.com/orthogonaltohumanity/culture_lifetimes]
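For context, the comparison itself is only a few lines of scipy: fit each candidate distribution by maximum likelihood and compare AIC. The lifetimes array below is a placeholder standing in for the polity lifetimes extracted from Cliopatria.

```python
# Compare exponential, log-normal and (truncated) power-law fits to a list of
# polity lifetimes by AIC. The lifetimes array is a placeholder; the real
# values would come from the Cliopatria dataset.
import numpy as np
from scipy import stats

lifetimes = np.array([12, 45, 80, 150, 30, 220, 60, 95, 400, 75, 33, 180])  # years, placeholder

def aic(logpdf_sum, k):
    return 2 * k - 2 * logpdf_sum

# Exponential: one free parameter (scale), location pinned at 0.
loc, scale = stats.expon.fit(lifetimes, floc=0)
aic_exp = aic(stats.expon.logpdf(lifetimes, loc, scale).sum(), 1)

# Log-normal: shape and scale free, location pinned at 0.
shape, loc, scale = stats.lognorm.fit(lifetimes, floc=0)
aic_lognorm = aic(stats.lognorm.logpdf(lifetimes, shape, loc, scale).sum(), 2)

# Power law via a Pareto with lower cutoff fixed at the smallest observation.
xmin = lifetimes.min()
b, loc, scale = stats.pareto.fit(lifetimes, floc=0, fscale=xmin)
aic_pareto = aic(stats.pareto.logpdf(lifetimes, b, loc, scale).sum(), 1)

print({"exponential": aic_exp, "lognormal": aic_lognorm, "powerlaw": aic_pareto})
# Lower AIC = better fit after penalising extra parameters.
```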

[Figure: Untitled.png - distribution fits to the Cliopatria polity lifetimes]


Not exponential! Rather, the data fits a log-normal distribution better than either power-law or exponential distributions. This suggests societal changes happen due to some series of multiplicative factors, rather than a constant hazard rate. This data contrasts with both Arbesman's original paper and our model, which may be due to any number of factors:

  1. Arbesman specifically looks at "large empires" while the Cliopatria dataset is polities in general.
  2. We are working with small social groups, not cultures in general
  3. We only use 7 agents. Real cultures consist of more than 7 people.

Conclusion

There is more work to be done of course. Larger simulations, better analysis, more complex interaction options etc. I'm not sure if I'll continue this or not as I tend to be a bit distractable, but I think I have enough interest in this project to keep it going for at least a week, maybe more.

I think the next thing that should be worked on is some kind of resource constraint. Perhaps we can draw from classical economics and use the Labor Theory of Value as the baseline unit for all gifts and resources. Maybe each group (including the singletons) can have a resource pool agents can take or contribute to. I'll start working on it and we'll see what happens.

Let me know what you think, as I'd very much welcome some input. What am I missing or leaving out?





How Go Players Disempower Themselves to AI

2026-05-02 07:24:27

Written as part of the MATS 9.1 extension program, mentored by Richard Ngo.

From March 9th to 15th 2016, Go players around the world stayed up to watch their game fall to AI. Google DeepMind’s AlphaGo defeated Lee Sedol, commonly understood to be the world’s strongest player at the time, with a convincing 4-1 score.

This event “rocked” the Go world, but its impact on the culture was initially unclear. In Chess, for instance, computers have not meaningfully automated away human jobs. Human Chess flourished as a pseudo-Esport in the internet era whereas the yearly Computer Chess Championship is followed concurrently by no more than a few hundred nerds online. It turns out that the game’s cultural and economic value comes not from the abstract beauty of top-end performance, but instead from human drama and engagement. Indeed, Go has appeared to replicate this. A commentary stream might feature a complementary AI evaluation bar to give the viewers context. A Go teacher might include some new intriguing AI variations in their lesson materials. But the cultural practice of Go seemed to remain largely unaffected.  

Nevertheless, nascent signs of disharmony in Europe became visible in early 2018, when the online European Team Championship’s referee accused a player, Carlo Metta, of illicit AI use during a game. His results were voided and he was banned from further participation in the event. At the time the offending game was played, open-source engines based on the AlphaGo paper, such as Leela Zero, had only been around for about a month. However, a predecessor called Leela 0.11 was already widely available and was known to match the level of the top Europeans that Metta was facing. Metta’s accusers claimed that his play was too similar to this AI’s preferred moves. It was moreover considered suspicious that his Over-The-Board (OTB) play agreed significantly less with the AI than his online moves did.

Unfortunately for the prosecution, their results were reported in opaque and sloppy ways. This is evidenced by the fact that the best compilation of their findings is the slapdash Facebook thread I linked above. This, along with the circumstantial nature of the evidence, was criticised in the same thread by community members. Teammates and friends of Metta’s also stepped up to publicly defend him. Their rhetoric proved effective partly because of the public stigma and disdain against AI cheaters; the perceived gravity of the accusation ironically made the case against Metta seem unfair and disproportionate. Ultimately, the Italian team appealed the decision and won. Carlo Metta was officially exonerated.

Among non-Italian European Go players, the claim that Metta used AI in almost every game in the ETC since 2018 has become barely disputable, especially considering how things developed. In the 2017/2018 season, he scored wins roughly half the time, likely using Leela 0.11 against opponents who were roughly the bot’s level. That same year, the Italian team was relegated to a lower league where no-one powerful in European Go politics cares to look anyway. This coincided with the popularisation of Leela Zero, a properly superhuman open-source go engine. Metta went on a 9-0 streak against opponents matching his OTB level in the 2018/2019 season, scored 9-1 in the 2019/2020 season, and then won 25 out of 26 games in the following years[1]. His only loss in this last streak was in a match where he was forced to play under camera control. During this time, his OTB level remained stagnant.

At this point, considering Metta “innocent” represents a near-categorical rejection of convictions based on circumstantial evidence. I am not here to litigate that question, but am nevertheless comfortable assuming here that Metta was regularly using AI for these games. However, this is only the very start of our story because it illustrates some key points about the sociology of AI use in Go. First, the public announcement of his disqualification and the ensuing discourse vilified AI cheaters (incorrectly as it turns out) as being unusually dishonorable and evil. Second, he set the precedent that AI users would basically never get punished, no matter how obvious their cheating was even while under investigation. They could always just get their allies to kick up a fuss and pressure organisers into reversing the decision. These features made accusing people of cheating socially costly, and gave tournament organisers and fair-play committees an expectation of futility. Cheating in online European events thus became trivially easy due to a near complete lack of functional mechanisms for retribution.

I started my career as a Go teacher in 2020, producing technical game reviews for a newly re-established online Go school set up to meet pandemic demand. We had not planned for cheating to be a major issue in our school. Whereas illicit AI use was already a well-known problem for the growing ecosystem of online tournaments, we didn’t expect it to affect our unrated, prizeless teaching league. To the contrary, we soon became cognisant of how some of our students were outputting better games than we, their teachers, could ever hope to play. Occasionally, AI use was unmistakably blatant because both sides played top AI moves for the entirety of the game. I now estimate that about half our students had used AI in at least one game and one in ten were chronic users. We were originally baffled by our observations. It didn’t make sense that players would just throw away their practice games to have AI win on their behalf. We also struggled to decide what to do about the problem and were reluctant to address it for roughly the same reasons that most tournament organisers were.

Around the same time, I was asked to look into the online games of a promising young player that a friend suspected of using AI in a youth league. Like at the Go school, I was surprised at how easy cheating was to detect since nearly all the kids regularly used AI against each other. This incident and other similar ones made me gradually realise that illicit AI use was entirely endemic to the Go world. It fortunately turned out that this pattern didn’t generalise to the really important or prestigious tournaments that were held online during COVID. The symbolic camera controls – which would be easy to circumvent for a dedicated cheater – seemed sufficient to curb almost all cheating in a way that threats or impotent references to “fair-play committees” were failing to. This reminded me of how Metta tended only to lose in online tournaments[2] when playing under a camera (or when facing another AI user).

Back at the Go school, my hapless colleagues and I initially settled for drily implying that suspicious games were “too good to review” and emphasising how we couldn’t help students who were playing “at such levels”. Our students caught on, and we were subsequently lucky to get some private confessions of cheating; over the years I was able to follow up with and interview many students that used AI, including some that hadn’t originally come forward. The appealing, exciting archetype of a cheater is one that uses covert, elaborate methods to get outside information and fraudulently obtain prize money or prestigious titles. Instead, we learned from the many examples of cheating and player confessions that idle curiosity and laziness were the dominant reasons for AI use in our school. Our students would often set out to play a normal game of Go, but would get stuck on a particularly difficult or annoying move; eventually, their curious eyes would drift to their second monitor — where they usually had their AI software running anyway — and they would check the answer as one would sheepishly side-eye the solution to an interesting puzzle or homework problem. Another reason people cited for using AI was an emotional investment in preserving or improving their image within the school community. Some wanted to avoid appearing incompetent and would employ strategies such as only playing moves that lost “n” points or less in expected value according to their computer.

None of these reasons were surprising to us; we had already thought of most of them while shadowboxing our pupils’ strange behaviour. What personally shocked me, however, was the way our students conceptualised their AI use. In this, Carlo Metta was also a surprisingly predictive case. The original reddit thread discussing his ban featured a comment from a user called “carlo_metta”, which read:

I never let Leela choose move. I just decide myself which one is better, for this reason i think i can find my own style with Leela. Go is an art and Leela help me tyo [sic] express my skill

That account was a burner, quite possibly a troll. However, I couldn’t help but recall the comment when I heard identical arguments coming from our cheating students’ accounts. A central part of every student’s retelling was that despite their AI use, they retained artistic control over their output and could exercise agency to think and improve for themselves. The AI felt to them like a tool that helped them fulfill latent potential or artistic sensibilities.

AI users never find out they haven’t “got it”.

Continental European math undergraduate degrees have a deserved reputation for their brutality, with completion rates of 10-15% being relatively common. Many of the 90% drop out nearly immediately, but some stick out the entire first year. These can often follow along with the proofs and exercise corrections’ atomic steps, which gives them false hope. However, they tend to struggle to see the “big picture” motivations of the material and are likely to have their hopes unravelled eventually. I was accidentally privy to a collective unravelling at the end-of-year third sitting of an exam on some basics of algebra and matrix calculations. I was retaking it to boost my grade from earlier in the year, but no other remotely competent person had bothered to do the same. Outside the exam hall, I listened to some other forty students’ chatter and had my blood internally curdle at phrases such as “I hate the proofs but I can do the exercises” or “I memorised all the matrix multiplication laws for this one”. The exam itself was quite unconventional; the professor clearly figured we would have had enough of manipulating matrices and instead asked an eclectic mix of simple algebra questions that to me vibed as “these are fundamental exercises you should be able to do if you have learned to think like a mathematician by now”.

The atmosphere on exit mixed depression with vitriol. People complained on the object level about the exam, usually about how it was too niche or off-topic with respect to the material. However, there was something more fundamental going on. People had shown up with bags of half-baked heuristics and hand-copied exercises and proofs. That exam had put them face to face with the fact that their memory aids were never going to help them “get it”. I don’t think I saw any of them ever again.

The population of Go AI users – both those who cheat in online games and those who simply review their games with AI post-hoc – is one on the perpetual eve of that exam. They fire up their computer out of idle curiosity and nod along passively as the truths of the universe float by them. They register the insights not one bit more because they can click the sublime moves. People consistently underestimate just how lost they will be when the solution is no longer right in front of them. This perspective of AI use to me explains why camera controls proved so effective against online cheating. Since AI use is usually an act of self-debasement and disempowerment – a subjection of oneself to ambient incentive gradients – it fundamentally contradicts the aesthetics of resourcefully overcoming a minor obstacle.

The illusion of control that AI users have reliably shown interacts in an insidious way with their disempowerment. It contributes to a society of Go players that allow their participation in culture to be automated away. They are moreover so disempowered about it that they have built-in psychological mechanisms to keep them from ever recognising their own obsolescence. This mechanism even works to sabotage the detection of AI use in others. People tend to give overly conservative estimates of the chances a given game involves AI. I think this happens because they usually consult their own AI to check a suspected game. In doing so, they also come around to the machine’s point of view and conclude that playing the correct AI move was the “natural” thing to do anyway in that situation.

My view of AI use (especially cheating) in Go originally manifested as disgust for its practitioners. I switched eventually to an attitude of compassion and pragmatism towards a habit that was clearly much more vulgar and weak than it was evil. Over time, I have progressed to feeling deep sadness for a group that surrenders much of what it claims to value. The thing I want to impress with this article is the consistency with which we as a species underestimate our own willingness to give up our culture, economy and autonomy to AI, even without monetary incentives. For this to happen, AI does not even need to be superhuman. Indeed, Go AI automates human players’ role in culture as shallow simulacra. All an AI needs to do is be passably good at a task and that may well be enough for people to volunteer their own replacement.

Appendix A: No, Go players aren’t getting stronger

One of the objections I can anticipate to this pessimistic monologue is that expert Go players seem to have improved since AI became widely available. There’s a modest body of research in the field of Cultural Evolution advocating this, including this paper and related ones from the same group of authors. These views have been promoted by blogs in the techno-optimist orbit and one of the associated graphs was recently making the rounds on Twitter. I have already written a post analysing the data used for the research, where I concluded that it is being misinterpreted. The tl;dr is that all improvement in the quality of play comes before move 60, when humans can mimic memorised AI policies. Play after move 60, in the pivotal parts of the game, shows no improvement. For me to think there’s any meaningful change in human play from pre-AI times, I would have to be convinced that players understand the AI moves they copy well enough to keep a heightened level when they go off-policy after the opening. There is no evidence of this.

Appendix B: Why this article exists

This piece is not meant to rigorously justify that Go players are disempowered or to carefully explore the shape of that disempowerment. It is instead designed to communicate a vibe from anecdotal experiences in the Go community that I think can give useful intuitions about Gradual Disempowerment as a general phenomenon.


  1. ^

    I scraped the data from the tournament website using a vibecoded script (ironic!) and manually verified most of it.

  2. ^

    Metta also regularly played (and used AI) in online events outside of the ETC during the COVID era.


