2026-04-05 19:15:36
This essay is me trying to figure out the “edges” of Singer’s argument in Practical Ethics.
I’ve written and rewritten it several times, and it bothers me that I don’t reach a particular conclusion. The essay itself remains at the level of “musings” instead of “worked out, internally consistent philosophical refutation”.
Nevertheless, I want to share my thoughts, so I'm publishing it anyway.
Some specific disclaimers:
If you’ve read the book, or are otherwise familiar with its arguments, feel free to skip to the next chapter.
Singer claims that you must make ethical decisions based on an equal consideration of interests, and not any other property.
It does not matter what age, race, religion, sex, or species one is – the only thing that matters is one’s capacity to suffer, and one’s capacity to view oneself as a distinct entity, with a past and a future.
Take, for example, eating meat.
It is in the human's interest to feel pleasure from eating a tasty steak. It is in the cow's interest not to be killed.
According to the principle of equal consideration of interests, the cow’s interest to not be killed (nor exposed to factory farming practices) clearly outweighs the human’s interest in eating tasty meat.
There is also a moral ranking here that is based on how refined one’s capacity to suffer is. For example, humans are both sentient and capable of seeing themselves as distinct entities existing over time. Cows are merely sentient.
But if there are some humans who are neither sentient nor capable of seeing themselves as distinct entities existing over time (for example, patients in a permanent vegetative state), then they have a lower moral footprint than a sentient cow. The cow still cannot conceive of itself as existing over time (probably), but it can experience suffering, which is more than such a human can.
Therefore, in that case, a cow has a higher moral status, and it would be more wrong to kill that cow than that human.
(Singer explores some edge cases, implications on others and on societal norms; I’m shortening the argument here.)
Singer claims that proximity is not an adequate basis for moral judgment. If we generalize his argument beyond species, race, religion, and nationality to all markers of proximity, we must come to the conclusion that family is equally excluded as a basis for preferential moral treatment.
My family members are proximate to me in the sense that we have similar genes, and in the sense that we are one tightly-knit group, irrespective of genes (for example, families with adopted children).
Singer claims that genetic proximity is not a relevant moral factor – he rejects preferential treatment based on species, or race. Therefore, if I extend that line of argument, I cannot provide preferential moral treatment to my family based on their genes.
He also claims that other proximity which is not genetic – such as similarity of religion, or nationality – is equally not a relevant moral factor. Therefore, if I extend that line of argument, I also cannot provide preferential moral treatment to my family based on us being the same group.
Therefore, we must either: accept that we cannot give our family preferential moral treatment, or reject the argument against proximity.
Singer also claims that infants do not have the same moral status as adults. They have no conception of themselves as “a distinct entity existing over time”. They have potential personhood, but Singer claims that potential personhood is not as strong of a moral claim as real personhood.
Here’s a thought experiment:
Your apartment building is on fire. You rush in. There's time to save exactly one person: your 6-month-old baby, or an adult stranger.
If we must not give preferential moral treatment based on proximity, and if infants do not yet possess morally relevant characteristics, then the moral thing to do would be to let your child die in the fire, and save the stranger.
I believe that every moral framework that would have you let your child die so that you can save a stranger's life is wrong. It must have gone astray somewhere along the way, and it is our task now to find exactly where.
I agree that infants do not actually have the morally relevant characteristics that adults have. And I similarly agree with the premise that future personhood is not as strong a claim to moral status as current personhood.
No, the reason you should save your child is that it is your child, which means that I reject the argument against proximity.
A counterargument might be: “you have chosen to have this child and therefore you have a moral obligation to it; it’s different from arbitrary things like nationality or religion.”
We can change the thought experiment so that it is not your own child in the fire, but your baby brother.
In that case, there is no choice that was made, and you have entered no “contract” that forms a moral obligation of care towards this being; it’s a genetic accident that you had no influence on.
Yet, I argue, the outcome would be the same: if you rush into the building, you should most definitely save your baby brother, and not the adult stranger.
Singer claims that, in aggregate, a society where one is more favorably disposed to one’s family (such as parents being invested in their children) is overall a better society to live in.
This is not because children are more morally valuable than adults, but because the side-effects of behaving that way create a society that is better.
This should mean that parents will invest a lot of time and effort into their children.
But this is a general disposition. It does not mean, in a specific life-or-death situation, that we should ignore the fact that there's a big difference between infants and adults. If we are to accept "capacity to see oneself as a distinct entity with a past and future" as a moral characteristic that should override proximity-based characteristics, then it seems internally inconsistent to favor one's own child in such a situation.
We might say: “Favoring family even in life-or-death situations leads to better overall outcomes”.
I personally agree, but then that seems inconsistent, or, at least, selective.
We want equal consideration of interests, but then there’s a particular place that we carve out where equal consideration of interests doesn’t apply as the relevant framework.
Moreover, if we favor family in life-and-death situations, family being just one – though very strong – marker of proximity, then that would justify favoring along any other dimension of proximity: race, nationality, gender – all things explicitly rejected by Singer as irrelevant moral characteristics.
Where is the boundary between:
“If everyone saves a member of their own family from a fire, even though there’s someone else who deserves help more, that leads to a better overall outcome for society.”
and
“If everyone saves a member of their own race from a fire, even though there’s someone else who deserves help more, that leads to a better overall outcome for society.”
?
One we favor as proper and good; the other is racism.
You could say that family is a "real" relationship; there's direct care, you have obligations because your child depends on you, and unlike race or religion, it's not an arbitrary category. But what if the burning building contains a cousin you know nothing about, have no relationship with, and who is effectively a stranger to you?
Even in that case, most people’s moral intuition is to save the cousin, because he is blood.
If we admit that saving a cousin you know nothing about, purely because of genetic proximity, is legitimate, then saving based on race is a matter of degree and not category. And saving based on other proximity factors (for example, belonging to the same tribe, or religion) then becomes acceptable too.
Let us assume that to satisfy (the extension of) Singer’s moral framework, we must sacrifice our own child (or baby brother) to save a stranger. Singer’s other argument is that you should keep giving until you reach a point where you start impoverishing yourself.
In that case, Singer's argument for giving until you are left just above poverty falls apart, because why stop at poverty?
Your child is proximate to you: that itself gives it no stronger claim to life. You yourself are even more proximate to yourself.
Therefore, by the same utilitarian calculus by which I should let my child perish in the fire, I should always sacrifice my own life if at least two lives are saved by my sacrifice.
Giving financially saves lives. The difference between giving money and sacrificing your life is a difference of degree: in both cases you are giving something of yourself, your accumulated capacity for change, your “life-force”.
Therefore, whenever I can give money such that I can save at least two lives, I should give that money even if I go into poverty or die.
The argument is all the stronger given that my giving will almost certainly save more than two lives – cancelling out any objection that I might be killing myself to produce a roughly equal moral outcome.
Therefore, Singer's argument that we should stop giving at the point where we would start entering poverty picks an arbitrary threshold. Implicitly, it favors the survival of the person giving the money.
But if we should be ready to discard the familial obligation to save the life of our not-yet-person child, then we should equally be ready to discard any “familial” obligation to save our own life.
You could argue that by continuing to live, you could produce more utility overall, and that therefore killing yourself to save others now is net harmful, given that you could save many more people in the long run.
But there are two issues here.
One, if we are to keep the internal consistency of the argument, then we should not treat potential utility generation any more favorably than treating potential personhood.
Since Singer claims that potential personhood is not as morally relevant as real personhood, we cannot justify a different treatment for potential utility generation vs. real utility generation.
If we should be ready to sacrifice our potential-person child, then we should be ready to sacrifice our potential future giving.
Two, if we argue for our continued survival on the grounds that we might generate more utility by living longer, that line of argument can extend arbitrarily, and we can by the same token argue that we should not give so much that we end up just above the poverty line, because keeping more money will allow us to live better, potentially generate more money, and therefore generate more utility.
In other words, it proves too much.
I want to briefly reflect on the burning building thought experiment I introduced.
I would argue that if you rush into the burning building and see either an infant or an adult, both strangers to you, most people's moral intuition would be to save the infant.
It certainly feels morally correct to me to save a stranger’s baby.
If the choice is between “adult person I know or love” and “stranger’s baby”, that choice is perhaps the most difficult of all. And I am not entirely sure I would pick the adult.
It seems that my moral intuitions are primarily shaped by the maxim of “the strong should protect the weak”. There’s a European moral lineage of chivalry – the notion that you should help those who are helpless, save those who are oppressed, and otherwise seek to be a hero.
Intuitively, morally, I sense that as the right thing to do.
And I would argue that, even on purely consequentialist grounds, being of that particular moral disposition produces overall better outcomes for society.
2026-04-05 15:25:24
Since I haven't been able to speak for the past little while, one thing I have noticed is that it is harder to communicate with others. I know what you are thinking: "Wow, who could have possibly guessed? It's harder to converse when you can't speak?". Indeed, I didn't expect it either.
But how much harder is it to communicate?
One proxy you can use is the classic typing metric, words per minute (wpm). So I spent some time looking at various forms of communication and how they differ from one another.
For most tests, I used https://www.typingtom.com/english/typing-test/30s
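For reference, here's how such tests turn raw typing into a wpm figure; a minimal sketch, assuming the standard convention (which I believe these sites use) of counting five characters, spaces included, as one "word":

```python
# Standard typing-test convention: 1 "word" = 5 characters (spaces included).
def wpm(chars_typed: int, seconds: float) -> float:
    words = chars_typed / 5
    return words / (seconds / 60)

print(wpm(chars_typed=180, seconds=30))  # a 30s test at 180 chars -> 72.0 wpm
```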
So I list below the forms of communication I have tried and how slow they are.
Here are the rough tiers that I found:
This is obviously the worst method of communication. Most people don't know sign language, but they can pretty intuitively learn to infer most-but-not-all letters without needing a reference table. People I have spent more time with have managed to learn it moderately well, but they should probably just learn sign language.
And even with a word fully spelled out, people sometimes struggle to translate the sequence of letters into the word as they normally understand it.
That being said, sometimes people can use context to infer what is wanted from just the first letter or two, so it's not completely useless. And it can often be the easiest option, since no materials are needed.
I find it slightly surprising how closely these end up converging.
For the most part. Writing on a whiteboard has the added benefit of being much easier to share in some contexts, while writing on a device has the benefit of being able to use Text-To-Speech (TTS). But I find both kinda inadequate in their own ways.
(But you see, there aren’t that many situations where typing with one hand comes up, so perhaps I just haven’t had that much practice with it? unclear)
Yeah, I was somewhat surprised that typing on my phone with two hands was not actually that much slower than typing on my laptop. However, this doesn't take into account that when typing on my phone, I might be outside in the cold or rain while simultaneously trying to walk, all of which combine to make typing on the phone feel much worse.
And yeah, I do wish I was faster at typing on my laptop, but I guess I never got around to it. It makes sense, though, that with two hands you get roughly double the speed you get with one hand.
I asked a few people to do a speaking speed test, reading aloud at a comfortable talking speed, and found that speaking beats typing by a significant margin, about double again. And it is effortless.
Speech also includes tone-of-voice and such, in a way that is only implicitly captured when typing and using a real-time TTS model. (My partner still sometimes doesn't quite internalize that the tone of the TTS's "OK" is not the tone with which I actually mean it.)
I then subjected my same friends to the torture of reading the same passage as fast as they could. And they managed to achieve another ~1.5x in speed compared to normal speaking speed. It goes to show how language is quite optimized for speaking.
One update from doing all of this, is “wow, maybe when I get my voice back, I should just consider improving my Speech-to-Text game” (~10h maybe?), since the input is just so much faster than typing. (2-4x faster!). I used to be a big STT hater, so this is a moderately big update for me.
Some notes though:
One thing is that the effective wpm of most of these methods is slightly higher than the raw number suggests. When I do end up typing a sentence out, people can often infer what I am trying to say before I have finished typing. (I usually end up typing out the whole sentence anyway, though.) So one could potentially optimize for this somehow.
Another note is that when speaking, I very rarely make verbal typos, and when I do, they are quite phonetically similar to the intended word. When typing, however, my typos are typographically similar rather than phonetically similar, and when they are passed to a TTS model, the result is often catastrophic and illegible to the people trying to understand what I just said.
This list also excludes some possible communication methods that I did not put in the effort to learn. ASL can reach speeds comparable to speaking if you learn all the vocab fluently. If one spends a year or two learning stenography, one can achieve 200-300wpm by typing as well. But I never learned either of these.
Overall, I remain bullish on speaking, more than ever, so I will try to see what I can do in the future with this information.
2026-04-05 14:49:54
Written quickly as part of the Inkhaven Residency.
Related: Bureaucracy as active ingredient, pain as active ingredient
A widely known secret in academia is that many of the formalities serve in large part as proof of work. That is, the reason expensive procedures exist is that some way of filtering must exist, and the amount of effort invested can often be a good proxy for the quality of the work. Specifically, the pool of research is vast, and good research can often be hard to identify. Even engaging with research enough to understand its quality can be expensive. As a result, people look toward signs of visible, expensive effort to determine whether to engage with the research at all.
Why do people insist only on reading research that’s published in well-formatted, well-written papers, as opposed to looking at random blog posts? Part of the answer is that good writing and formatting makes the research easier to digest, and another part is that investing the time to properly write up your results often causes the results to improve. But part of the answer is proof-of-work: surely, if your research is good, you’d be willing to put in the 30-40 hours to do the required experiments and format it nicely as a paper?
Similarly, why do fields often insist on experiments beyond their scientific value? For example, why does machine learning often insist that people do expensive empirical experiments even for theory papers? Of course, part of the answer is that it's easy to generate theoretical results that have no connection to reality. But another part of the answer is that doing the empirical experiments serves as the required proof of work; implementing anything on even a medium-sized open-source LLM is hard, but surely you'd invest the effort if you believed enough in your idea? (This helps explain the apparently baffling observation that many of the empirical results in theoretical papers have little relevance to the correctness or even the applicability of the theoretical results.)
Other aspects of ML academia – the beautifully polished figures[1], the insistence on citing the relevant papers to show knowledge of the field, and so forth – also exist in part to serve as a proof-of-work filter for quality.
In a sense, this is one of the reasons academia is great. In the absence of a proof-of-work system, the default would be something closer to proof-of-stake: that is, some form of reputational system based on known, previously verified accomplishments. While proof-of-work filters can be wasteful, they nonetheless allow new, unknown researchers to enter the field and contribute (assuming they invest the requisite effort).
An obvious problem with this entire setup is that LLMs exist, and what was once expensive is now cheap. While previously, good writing was expensive, LLMs allow anyone to produce seemingly coherent, well-argued English text. While it was once quite expensive to produce ML code, current LLMs produce seemingly correct code for experiments quickly. And the same is true for most of the proof-of-work signifiers that academia used to depend on: any frontier LLM can produce beautifully formatted figures in matplotlib, cite relevant work (or at least convincingly hallucinate citations), and produce long mathematical arguments.
I’ve observed this myself in actual ML conference contexts. In the past, crackpot papers were relatively easy to identify. But in the last year, I’ve seen at least one crackpot paper get past other peer reviewers through a combination of dense mathematical jargon and an expansive code base that was hardcoded to produce the desired results. Specifically, while the reviewers knew that they didn't fully understand the mathematical results, they assumed that this was due to their own lack of knowledge, rather than the results themselves being wrong. And since the codebase passed the cursory review given to it by the other reviewers, they did not investigate it deeply enough to notice the hardcoding.[2]
In a sense, this is no different from the problems introduced by AI in other contexts, and I’m not sure there’s a better solution than to fall back to previous proof-of-stake–like reputation systems.[3] At the very least, I find it hard to engage with new, seemingly-exciting results from unknown researchers without a high degree of skepticism.
This makes me sad, but I'm not sure there's a real solution here.
Especially the proliferation of beautiful "figure one"s that encapsulate the paper's core ideas and results in a single figure.
In fact, it took me about an hour to decide that the paper's results were simply wrong as opposed to confusing. Thankfully, in this case, the paper's problems were obvious enough that I could point at, e.g., specific hardcoded results to the other reviewers (and the paper was not accepted for publication), but there's no guarantee that this would always be the case.
Of course, there are other possibilities that less pessimistic people would no doubt point to: for example, there could be a shift toward proof-of-work setups that are LLM-resistant, or we could rely on LLMs to do the filtering instead. But insofar as LLMs are good at replicating all cognitively shallow human effort, I don't imagine there are going to be any proof-of-work setups that continue to work as LLMs get better. And I personally feel pretty sad about delegating all of my input to Claude.
2026-04-05 14:30:09
About a year ago, we wrote a paper that coined the term “Gradual Disempowerment.”
It proved to be a great success, which is terrific. A friend and colleague told me that it was the most discussed paper at DeepMind last year (selection bias, grain of salt, etc.) It spawned articles in the Economist and the Guardian.
Most importantly, it entered the lexicon. It’s now commonplace for people in AI safety circles and even outside of them to use the term, often in contrast with misalignment or rogue AI. Gradual Disempowerment tends to resonate more than Rogue AI with people outside AI safety circles.
But there’s still a lot of confusion about what it really is and what it really means. I think it’s a very intuitive concept, but I also still feel like I don’t have everything clear in my mind. For instance, I think our paper both introduces the concept and presents a structured argument that it could occur and be catastrophic. But these things seem somewhat jumbled together, both in my mind and in the discourse.
So for reasons including all of the above, I plan to write a few posts on the topic, starting with this one.
The rest of this post is a list of ten different ways of thinking about or arguing for gradual disempowerment that I’ve used or encountered.
We’re replacing people with AI. These days when I speak publicly about AI, I often find myself returning to i) the more-or-less explicit goal of many AI companies and researchers of “automating all human labor”, and ii) the fact that many people in the space view humanity as a “bootloader for AI”, as Elon Musk evocatively put it. Gradual Disempowerment is the process by which this replacement happens without AI ever rising up -- AI takes our jobs, and the people who control it and still have power are increasingly those who embrace “merging with the machines”, i.e. becoming cyborgs, with the human bits being phased out over time until, before long, humans cease to exist entirely.
Companies and governments don’t intrinsically care about you. This is basically the main argument in the paper… You can think of companies and governments as “agents” or “beings” that are driven by goals like (e.g.) “quarterly profits” or “GDP” or “national security”. Right now, the best ways to achieve these goals make use of humans. In the future, the best ways will instead make use of AI. A relentless pursuit of such goals, powered by AI, seems likely to destroy the things humans need to survive.
It’s (“global” or “late stage”) capitalism. The previous argument bears a significant resemblance to existing arguments, popular on the left, that “capitalism” is responsible for most of the world’s present ills. This feels like a decent “80/20” version of the concern, but importantly, it’s not just companies, but also governments (whose power is often more feared by those on the right) that could end up turning against their citizens once they become useless to them. And indeed, we’ve seen “communist” countries slaughter their own people by the millions. Besides wondering what alternative critics imagine, I don’t wholeheartedly endorse such critiques because I often feel unsure of what exactly people are criticizing when they critique capitalism in this way. But for people who already have this mental model, where our current social arrangements treat people as somewhat disposable or lacking in fundamental dignity or worth, this can be a useful starting point for discussion.
It’s another word for (or the primary symptom of) the “meta-crisis”. A few people in my circles have told me about this concept from Daniel Schmachtenberger, which I originally encountered on a podcast somewhere. The key claim is that all the crises we observe in the modern world are driven by some shared underlying factors. I view this as basically a more nuanced version of the view above, where “capitalism” is the root of all evil: The meta-crisis is still meant to be the root of all evil, but we don’t fully understand its nature. The way I like to describe the basic problem is that we are not practicing good enough methods of collective decision-making, or collective sense-making. And while I think we have some good ideas for improving on the status quo, we don’t have a proven solution.
It’s a structural consequence of the way in which information technology demands metrics, enables large scale influence campaigns, translates money into political power, and concentrates power via a recursive feedback loop. This one is maybe a bit too much to unpack in this blog post, but basically, society is increasingly “standardized” not only in terms of products, but also in terms of processes (e.g. restrictive customer service scripts or standard operating procedures) that have the benefit of being cheap, scalable, and reliable (often by eliminating “human error”, i.e. limiting human decision-making power and otherwise encouraging compliance). They also increasingly make more and more aspects of life subject to measurement and control via optimization of metrics, which necessarily fail to capture everything that matters. This general issue was a prime concern of mine before I learned about deep learning in 2012, and realized we might get to Real AI quite soon -- notably, this can happen even with stupid AI.1 In fact, you could argue that gradual disempowerment is already occurring through advertising, corporate media, and money in politics, among other things. This makes it a bit unclear how far back to go.
It’s evolution, baby! Maybe gradual disempowerment is best viewed as part of a much larger trend, going quite far back: evolution. People like to say “AI is the next stage in evolution” as if that means it’s okay if humanity goes extinct. But whether it’s OK or not, it may be that “Natural Selection Favors AIs over Humans”. At the end of the day, if AI becomes much better than humans at everything, it does seem a bit strange from a “survival of the fittest” point of view that humans would stick around. In such a situation, those who hand over more power and resources to AI would presumably outcompete those who don’t. So in the limit, AIs would end up with ALL the power and resources.
…and there’s no natural limit to outsourcing decision making to AI, even if you don’t trust it. AIs could be like underlings that are untrustworthy, but so skilled that competitive pressures still compel us to delegate to them. Consider the trope of the cowboy cop who’s “a loose cannon, but DAMMIT he’s the best we have!” Trust is important, and people are loath to use things they don’t trust. But AI seems to be becoming a tool so powerful that you almost HAVE to use it, even though it’s not secure, even though we haven’t solved alignment, even though we see evidence of scheming in tests, even though it seems to drive people crazy, etc… For me, this mostly comes up as a counter-argument to people who claim that market forces actually favor making AI aligned and trustworthy… that’s certainly true if doing so is free, but in fact, it’s impossible right now, and alignment doesn’t solve the problem of negative externalities.2 I like to analogize AI to a button that gives you $1,000,000 when you push it, but each time you press it also increases the temperature of the earth by a fraction of a degree. Or each time you press it has a 1% chance of destroying the world.
It’s an incarnation of Moloch. One of the most famous blog posts in the history of AI safety is Meditations on Moloch. It’s often considered a parable about coordination failures, but I think of it as about the triumph of “instrumental goals” over “terminal goals”, i.e. the pursuit of money (“instrumental goal”) as a means to happiness that has a tendency to become an end (“terminal goal”) in itself. We might begin handing over power to AI systems because we hope they will help achieve our goals. But we might need to hand over more and more power and also the AI might need to focus more and more simply on acquiring power in order to avoid being outcompeted by other AIs. This is also like an even deeper version of the evolution argument -- evolution and Moloch as described in the post both have the property where it’s unclear if they can really ever be “defeated” or are rather just part of the way the world works.
It’s on a (2D) spectrum with Rogue AI x-risk scenarios. Rogue AI scenarios are where “the AI suddenly seizes power”; gradual disempowerment is “we gradually hand over power”. There are lots of scenarios in the middle where the handoff takes place in part due to recklessness or negligence, rather than deliberately. One thing I don’t like about this way of talking about it is that I actually think gradual disempowerment is entirely compatible with full-blown rogue AI, in fact, I think one of the most likely outcomes is that competitive pressures simultaneously drive gradual disempowerment and reckless racing towards superintelligence, warning signs are ignored, and at some point in the reckless and chaotic exploration of the AI design space, rogue AI pops out.
Deskilling, aka “the WALL-E problem”. A lot of people these days seem to think of gradual disempowerment as largely about humans losing our own capabilities, e.g. for critical thinking, because we defer to AI so much. Professor Stuart Russell called this the “WALL-E” problem. To be honest, I still don’t fully understand or buy into this concern, or see how it necessarily leads to total disempowerment, but thought it’s worth a mention, due to its place in the discourse.
This might be less bad with smarter AI -- they can use more sophisticated judgments. But that ability also makes it tempting to put them in charge of more stuff.
This point seems important enough I almost want to make it its own item in the list.
2026-04-05 10:39:13
We already knew there's nothing new under the sun.
Thanks to advances in telescopes, orbital launch, satellites, and space vehicles we now know there's nothing new above the sun either, but there is rather a lot of energy!
For many phenomena, I think it's a matter of convenience and utility where you model them as discrete or continuous, aka, qualitative vs quantitative.
On one level, nukes are simply a bigger explosion, and we already had explosions. On another level, they're sufficiently bigger as to have reshaped global politics and rewritten the decision theory of modern war.
Perhaps the key thing is remembering that sufficiently large quantitative changes can make for qualitative macro effects.
For example, basic elements of modern life include transport, communication, energy, computation, and food. All of these have been part of human life for tens of thousands of years! Ancient humans could go places, could talk, could convert wood to heat, perform arithmetic (i.e., computation), and eat stuff!
I assert that to a very high degree, modern technology (and all its increments over the millennia) did not allow us to do fundamentally new stuff. Just the same stuff, but cheaper, faster, and easier.
Cars, trains, and planes are just going places. Could already do that. Emails are just sending information from one place to another. Could already do that. Books are just remembering things. Could already do that. Guns are just hurting people – could already do that.
The sheer magnitude of degree in all of those elements is the difference between hunter-gatherer life and modern life. Along the way, there have been some pretty big step changes. Writing is just remembering stuff and communicating it to another person, which you could already do, but so much more so that it reshapes civilization. Then you make writing cheap via the printing press, and your civilization gets shaped again.
When it comes to the transformative power of modern AI, I think “sufficient quantitative change makes for a large qualitative change” is an underdiscussed lens.
The problem is that our attention is focused on where LLMs are automating things at a macro-task level: coding, image and video generation, having conversations, medical diagnoses, etc. These are, in fact, a very big deal.
But I think LLMs, even smaller/weaker ones, are able to automate more basic building blocks of thoughts, and there's transformative power there too.
Getting down to some very basic constitutive mental tasks – things I could already do before LLMs:
Throughout my life, I have had thoughts. There is some lossy process that stores the output of my thoughts in my brain for later usage. I think this fails both at the "the info didn't really get stored" level and at the "the info is in there, but the search query failed to return it" level.
"Taking notes" is an ancient technology we already have for improving upon the fallibility of human memory, but it's effortful in so many ways: you need to be carrying a note-taking device with you; you need to either constantly have it out or pull it out when needed; if it's a notebook, you have to find a blank page; and then you have to take the time to write down your note[1].
That's just recording it. For notes to be useful, you also have to remember you have the note, find it, and then read it. The more notes you have, the more expensive that process is.
For the most part, to date, I've relied on my fallible in-built memory.
The thing is, LLMs are able to make all of the above elements vastly cheaper. This is one of the fundamental principles of the "Exobrain" system I've been steadily building up, and hope to describe soon. I don't need it to solve protein folding to be useful; I don't even need it to help with prioritization (although that's a goal). It's incredibly useful if it just improves on basic read/write/search of memory.
| Before | After |
| --- | --- |
| Retrieve phone from pocket, open note-taking app, open a new note or find the existing relevant note | Say "Hey Exo", phone beeps, begin talking. Perhaps instruct the model which document to put the note in, or let it figure it out (it has guidance in the stored system prompt) |
| Remember that I have a note; either remember where it is or muck around with search | Ask the LLM to find the note (via basic key-term search or vector embedding search) |
| If the note is lengthy, read through all of it | The LLM can summarize and/or extract the relevant parts of the notes |
Beware Trivial Inconveniences. The above is the difference between rarely taking notes and taking multiple notes a day, narrating long trains of thought. It's the difference between giving up on logging my mental state and conscientiously logging it twice daily for months.
Putting it into handwavy quantitative terms, when the cost of note-taking and record-keeping comes down 20x, my usage goes from 0/day to 20-30/day.
But the value happens because LLMs have made it cheap across the entire pipeline. Not just the storing of information, but also the retrieval and processing. AI makes it fast and easy to search through all my notes, even if I have a lot of notes. If I want all of my thoughts on a topic, I can have it read dozens and dozens of pages over the years and summarize them and extract relevant info.
What this does is a step change. It takes me from not taking many notes to taking copious notes. Same for todo items and reminders, and same for logging data relevant to my health and experimentation.
It benefits from using stronger models, but the core elements are doable even with small models like Haiku, because it's just automating speech-to-text[2], choosing among a small set of files (or making a new one), writing, simple search, and then maybe a simple summary.
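To make that concrete, here is a minimal sketch of such a pipeline. The names (`capture_note`, `find_notes`, the "notes" directory) are hypothetical, and a crude keyword heuristic stands in for the model's choice of file; in the real system an LLM makes that call:

```python
# Hypothetical sketch of the capture/retrieve loop; not the actual Exobrain code.
from datetime import datetime
from pathlib import Path

NOTES_DIR = Path("notes")  # one plain-text file per topic

def choose_file(transcript: str) -> Path:
    """Pick which note file a transcript belongs in. A keyword match stands
    in here for the LLM that makes this choice in the real system."""
    for f in NOTES_DIR.glob("*.md"):
        if f.stem.lower() in transcript.lower():
            return f
    return NOTES_DIR / "inbox.md"  # catch-all when nothing matches

def capture_note(transcript: str) -> None:
    """Append a timestamped speech-to-text transcript to the right file."""
    NOTES_DIR.mkdir(exist_ok=True)
    with choose_file(transcript).open("a") as f:
        f.write(f"[{datetime.now():%Y-%m-%d %H:%M}] {transcript}\n")

def find_notes(query: str) -> list[str]:
    """Basic key-term search over all notes; embedding search or an LLM
    summarization pass would slot in here for fuzzier retrieval."""
    return [line for f in NOTES_DIR.glob("*.md")
            for line in f.read_text().splitlines()
            if query.lower() in line.lower()]

capture_note("health: slept badly, mild headache in the morning")
print(find_notes("headache"))
```

Everything past the speech-to-text step really is this mechanically simple, which is the point: none of it needs a frontier model.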
It's not just me doing this. Independently, someone else I know began setting up detailed logging on their computer of everything they're doing, and downstream of that, we're starting to record everything at Lightcone to make it accessible to LLMs.
I expect we will see more of this: using LLMs not just for protein folding and novel math conjectures, but for replacing very simple operations of recording and retrieving info. And not just replacing them, but replacing and scaling them to unprecedented levels, because that's what happens when you make things cheaper.
Humanity has done this many times, with energy, transport, communication, food, and so on. I think where LLMs get different is they bring down the cost of very elementary mental operations (like storing and remembering, choosing between a few options) – menial stuff that can be combined to great effect. (After all, computers are a lot of rather menial arithmetic and logical operations combined to great effect.)
All of this has equivalents if you're taking notes on your phone.
I currently use Deepgram, which isn't great, but is adequate. Pretty sure there are transformers in it.
2026-04-05 10:33:00
A lot of people and documents online say that positive-sum games are "win-wins", where all of the participants are better off. But this isn't true! If A gets $5 and B gets -$2 that's positive sum (the sum is $3) but it's not a win-win (B lost). Positive sum games can be win-wins, but they aren't necessarily games where everybody benefits. I think people tend to over-generalize from the most common case of a win-win.
E.g. some of the claims you see when reading about positive-sum games online:
A positive-sum game is a "win-win" scenario in game theory and economics where participants collaborate to create new value, ensuring all parties can gain or benefit.
[Win-win games are] also called a positive-sum game as it is the opposite of a zero-sum game. – Wikipedia
Here I use "positive-sum game" to refer to resource games that involve allocating resources, not allocating utility. "Positive-sum game" isn't a meaningful concept when referring to utility, because each participant's utility can be individually rescaled, so you can turn any game into one with an arbitrary sum; the sign of the sum doesn't matter. For example, the ($5, -$2) outcome above sums to +$3 in dollars, but if B's utility scale is tripled, the same outcome sums to 5 - 6 = -1 in utility, without changing anything about either player's preferences.
There are a lot of cases where we can make the world as a whole better while simultaneously making some people worse off, and it's important to acknowledge that. Here are some positive-sum situations:
One interesting thing about positive-sum games with losers is that the players can sometimes turn them into a win-win for everybody by having the winners distribute a portion of their winnings to the losers. You can turn a positive-sum game into a win-win if the total gains exceed the total losses (so there is enough surplus to share) and the winners' gains are in a transferable form, such as money.
This is the concept of turning a Kaldor-Hicks improvement (an improvement where everyone would hypothetically be better off if the winners compensated the losers) into a Pareto improvement (an improvement where everyone is better off).
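As a toy illustration of that move, using the $5/-$2 example from the top of the post (the transfer amount is arbitrary):

```python
# Positive-sum but not win-win: A gains $5, B loses $2 (sum = +$3).
payoffs = {"A": 5, "B": -2}

def is_win_win(p: dict[str, int]) -> bool:
    return all(v > 0 for v in p.values())

assert sum(payoffs.values()) > 0  # positive-sum...
assert not is_win_win(payoffs)    # ...but B is a loser

# Kaldor-Hicks -> Pareto: A transfers $3 of the winnings to B.
transfer = 3
payoffs["A"] -= transfer
payoffs["B"] += transfer
assert is_win_win(payoffs)        # now (+$2, +$1): everyone is better off
```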
One interesting example is an efficient auction with an entrance cost[1], which benefits the winner (who values the good the most) and the auctioneer, and harms all the other bidders (who paid the costs of entering the auction and got nothing). The entrance cost doesn't need to be a direct fee to enter the auction; it can also include indirect costs like spending time and effort to decide how much to bid.
The winner's consumer surplus (how much their value for the goods exceeds what they paid) is value to them, but not cash that they can transfer to compensate the losers. If the winner has enough money, they could compensate the other bidders for their wasted costs of entering the auction, and everyone would be better off; if not, the auction winner is better off but can't compensate the losers. In practice, valuing the indirect costs bidders incur by entering auctions is difficult, and so auctions are often positive-sum games with losers.
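With toy numbers (all invented), the accounting looks like this; the price paid nets out against the auctioneer's revenue, since that part is just a transfer:

```python
# Invented numbers: 5 bidders each sink $2 in entrance costs (fees, time,
# effort spent valuing the good); the winner values the good at $100,
# pays $80, and the other four bidders walk away with nothing.
n_bidders, entrance_cost = 5, 2
winner_value, price = 100, 80

winner_net = winner_value - price - entrance_cost  # +18: consumer surplus
losers_net = -(n_bidders - 1) * entrance_cost      # -8: four losing bidders
total = winner_net + losers_net                    # +10: positive-sum overall
print(total > 0 and losers_net < 0)                # True: positive-sum, with losers

# The surplus is value to the winner, not cash; compensation only works
# if the winner also happens to have $8 of spare money on hand.
compensation = (n_bidders - 1) * entrance_cost     # refund each loser's $2
print(winner_net - compensation > 0)               # True: winner stays ahead
```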
Another interesting example is expropriation. In practice, the government usually pays the fair market value of the land to the person whose land was seized, attempting to turn a positive-sum game with losers into a win-win, although landowners often feel the expropriation payments aren't sufficient.[2]
I think it's important to keep all this in mind when making positive-sum proposals: there might be losers, and they should be compensated if possible. "Positive-sum" doesn't mean that everyone benefits.
This is only positive-sum if the surplus for the winner exceeds the total entrance costs for all the bidders, which I assume is the case.
Which makes sense: landowners have a revealed preference that they value their land more than the fair market value, because if they valued it at less than FMV they could just sell it for the FMV and be better off. (Ignoring illiquidity and the transaction costs for selling the land.)