Published on December 2, 2025 1:53 AM GMT
MIRI is running its first fundraiser in six years, targeting $6M. The first $1.6M raised will be matched 1:1 via an SFF grant. Fundraiser ends at midnight on Dec 31, 2025. Support our efforts to improve the conversation about superintelligence and help the world chart a viable path forward.
MIRI is a nonprofit with a goal of helping humanity make smart and sober decisions on the topic of smarter-than-human AI.
Our main focus from 2000 to ~2022 was on technical research to try to make it possible to build such AIs without catastrophic outcomes. More recently, we’ve pivoted to raising an alarm about how the race to superintelligent AI has put humanity on course for disaster.
In 2025, those efforts centered on Nate Soares and Eliezer Yudkowsky’s book (now a New York Times bestseller) If Anyone Builds It, Everyone Dies, with many public appearances by the authors; many conversations with policymakers; the release of an expansive online supplement to the book; and various technical governance publications, including a recent report with a draft of an international agreement of the kind that could actually address the danger of superintelligence.
Millions have now viewed interviews and appearances with Eliezer and/or Nate, and the possibility of rogue superintelligence and core ideas like “grown, not crafted” are increasingly a part of the public discourse. But there is still a great deal to be done if the world is to respond to this issue effectively.
In 2026, we plan to expand our efforts, hire more people, and try a range of experiments to alert people to the danger of superintelligence and help them make a difference.
To support these efforts, we’ve set a fundraising target of $6M ($4.4M from donors plus 1:1 matching on the first $1.6M raised, thanks to a $1.6M matching grant), with a stretch target of $10M ($8.4M from donors plus $1.6M matching).
Donate here, or read on to learn more.
As stated in If Anyone Builds It, Everyone Dies:
If any company or group, anywhere on the planet, builds an artificial superintelligence using anything remotely like current techniques, based on anything remotely like the present understanding of AI, then everyone, everywhere on Earth, will die.
We do not mean that as hyperbole. We are not exaggerating for effect. We think that is the most direct extrapolation from the knowledge, evidence, and institutional conduct around artificial intelligence today. In this book, we lay out our case, in the hope of rallying enough key decision-makers and regular people to take AI seriously. The default outcome is lethal, but the situation is not hopeless; machine superintelligence doesn't exist yet, and its creation can yet be prevented.
The leading AI labs are explicitly rushing to create superintelligence. It looks to us like the world needs to stop this race, and that this will require international coordination. MIRI houses two teams working towards that end: a communications team and a technical governance team.
If Anyone Builds It, Everyone Dies has been the main recent focus of the communications team. We spent substantial time and effort preparing for publication, executing the launch, and engaging with the public via interviews and media appearances.
The book made a pretty significant splash.
The end goal is not media coverage, but a world in which people understand the basic situation and are responding in a reasonable, adequate way. It seems early to confidently assess the book's impact, but we see promising signs.
The possibility of rogue superintelligence is now routinely mentioned in mainstream coverage of the AI industry. We’re finding in our own conversations with strangers and friends that people are generally much more aware of the issue, and taking it more seriously. Our sense is that as people hear about the problem through their own trusted channels, they are more receptive to concerns.
Our conversations with policymakers feel meaningfully more productive today than they did a year ago, and we have been told by various U.S. Members of Congress that the book had a valuable impact on their thinking. It remains to be seen how much this translates into action. And there is still a long way to go before world leaders start coordinating an international response to this suicide race.
Today, the MIRI comms team comprises roughly seven full-time employees (if we include Nate and Eliezer). In 2026, we’re planning to grow the team.
We will be making a hiring announcement soon, with more detail about the comms team’s specific models and plans. We are presently unsure (in part due to funding constraints/budgetary questions!) whether we will be hiring one or two new comms team members, or many more.
Going into 2026, we expect to focus less on producing new content, and more on using our existing library of content to support third parties who are raising the alarm about superintelligence for their own audiences. We also expect to spend more time responding to news developments and taking advantage of opportunities to reach new audiences.
Our governance strategy primarily involves laying the groundwork, technical and political, for an international agreement that would halt the race to superintelligence.
There's a ton of work still to be done. To date, the MIRI Technical Governance Team (TGT) has mainly focused on high-level questions such as "Would it even be possible to monitor AI compute relevant to frontier AI development?" and "What would an international halt to the superintelligence race look like?" We're only just beginning to transition into more concrete specifics, such as writing up A Tentative Draft of a Treaty, with Annotations, which we published on the book website to coincide with the book release, followed by a draft international agreement.
We plan to push this a lot further, and work towards answering ever more concrete questions.
We need to extend that earlier work into concrete, tractable, shovel-ready packages that can be handed directly to concerned politicians and leaders (whose ranks grow by the day).
To accelerate this work, MIRI is looking to support and hire individuals with relevant policy experience, writers capable of making dense technical concepts accessible and engaging, and self-motivated and competent researchers.[1]
We’re also keen to add additional effective spokespeople and ambassadors to the MIRI team, and to free up more hours for those spokespeople who are already proving effective. Thus far, the bulk of our engagement with policymakers and national security professionals has been done either by our CEO (Malo Bourgon), our President (Nate Soares), or the TGT researchers themselves. That work is paying dividends, but there’s room for a larger team to do much, much more.
In our conversations to date, we’ve already heard that folks in government and at think tanks are finding TGT’s write-ups insightful and useful, with some calling it top-of-its-class work. TGT’s recent outputs and activities include the treaty draft and international-agreement report mentioned above, along with ongoing briefings for policymakers and national security professionals.
The above isn’t an exhaustive description of what everyone at MIRI is doing; e.g., we continue to support a small amount of in-house technical alignment research.
As noted above, we expect to make hiring announcements in the coming weeks and months, outlining the roles we’re hoping to add to the team. But if your interest has already been piqued by the general descriptions above, you’re welcome to reach out to [email protected]. For more updates, you can subscribe to our newsletter or periodically check our careers pages (MIRI-wide, TGT-specific).
Our goal at MIRI is to have at least two years’ worth of reserves on hand. This enables us to plan more confidently: hire new staff, spin up teams and projects with long time horizons, and balance the need to fundraise with other organizational priorities. Thanks to generous support we received in 2020 and 2021, we didn’t need to run any fundraisers in the last six years.
We expect to hit December 31st having spent approximately $7.1M this year (similar to recent years[2]), and with $10M in reserves if we raise no additional funds.[3]
Going into 2026, our budget projections have a median of $8M[4], assuming some growth and large projects, with large error bars from uncertainty about the amount of growth and projects. On the upper end of our projections, our expenses would hit upwards of $10M/yr.
Thus, our expected end-of-year reserves put us $6M shy of our two-year reserve target of $16M.
This year, we received a $1.6M matching grant from the Survival and Flourishing Fund, which means that the first $1.6M we receive in donations before December 31st will be matched 1:1. We will only receive the grant funds if they are matched by donations.
Therefore, our fundraising target is $6M ($4.4M from donors plus 1:1 matching on the first $1.6M raised). This will put us in a good place going into 2026 and 2027, with a modest amount of room to grow.
It’s an ambitious goal and will require a major increase in donor support, but this work strikes us as incredibly high-priority, and the next few years may be an especially important window of opportunity. A great deal has changed in the world over the past few years. We don’t know how many of our past funders will also support our comms and governance efforts, or how many new donors may step in to help. This fundraiser is therefore especially important for informing our future plans.
We also have a stretch target of $10M ($8.4M from donors plus the first $1.6M matched). This would allow us to move much more quickly on pursuing new hires and new projects, embarking on a wide variety of experiments while still maintaining two years of runway.
For more information or assistance on ways to donate, view our Donate page or contact [email protected].
The default outcome of the development of superintelligence is lethal, but the situation is not hopeless; superintelligence doesn't exist yet, and humanity has the ability to hit the brakes.
With your support, MIRI can continue fighting the good fight.
In addition to growing our team, we plan to do more mentoring of new talent who might go on to contribute to TGT's research agenda, or who might contribute to the field of technical governance more broadly.
Our yearly expenses in 2019–2024 ranged from $5.4M to $7.7M, with the high point in 2020 (when our team was at its largest), and the low point in 2022 (after scaling back).
It’s worth noting that despite the success of the book, book sales will not be a source of net income for us. As the authors noted prior to the book’s release, “unless the book dramatically exceeds our expectations, we won’t ever see a dime”. From MIRI’s perspective, the core function of the book is to try to raise an alarm and spur the world to action, not to make money; even with the book’s success to date, the costs to produce and promote the book have far exceeded any income.
Our projected expenses are roughly evenly split between Operations, Outreach, and Research, where our communications efforts fall under Outreach and our governance efforts largely fall under Research (with some falling under Outreach). Our median projection breaks down as follows: $2.6M for Operations ($1.3M people costs, $1.2M cost of doing business), $3.2M Outreach ($2M people costs, $1.2M programs), and $2.3M Research ($2.1M people costs, $0.2M programs). This projection includes roughly $0.6–1M in new people costs (full-time-equivalents, i.e., assuming the people are not all hired on January 1st).
Note that the above is an oversimplified summary; it's useful for high-level takeaways, but for the sake of brevity, I've left out a lot of caveats, details, and explanations.
Published on December 1, 2025 11:43 PM GMT
In Rubber Souls, Bjartus Tomas argues that we can have cruelty-free status games by creating underpeople without moral worth, perhaps because they are non-conscious, to serve as our permanent underclass. This removes the current problem where some poor bugger has to be at the bottom of the barrel, or the bottom quarter or half and so forth.
Needless to say, I approve.
But I think it's worth fleshing out a bit why this is possible, and why you won't wind up with everyone treating humans as a high-status source of esteem and underpeople as a low-status one.
We build our sense of self-worth not from some abstract global ranking, such as percentile ranking of wealth or h-index, but through comparing ourselves to people in our social circles. So in the glorious transhumanist future where the labs somehow avoid flubbing it, Dario Amodei may be God-Emperor of the universe, but as long as he's far from your social graph, and your social graph has nary a whisper of him, you're not likely to compare yourself with him and feel low self-esteem.
More generally, I expect the far future to have fewer global status rankings, because I expect everything to run at much higher speeds, making fixed travel times feel proportionally longer. If a mind ran 1,000,000,000x faster than us, for them light would only travel a foot per subjective second, or a measly 1 km in a subjective hour. Which means it's harder to communicate and co-ordinate across the total span of human civilization, resulting in smaller, disjoint cultures with their own local hierarchies.
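A quick back-of-envelope check of those numbers (my own sketch, assuming a billionfold speedup):

```python
# How far light travels per unit of *subjective* time for a sped-up mind.
c = 299_792_458            # speed of light, meters per real second
speedup = 1_000_000_000    # assumed 10^9x subjective speedup

per_subjective_second = c / speedup            # ~0.30 m, i.e. about a foot
per_subjective_hour = per_subjective_second * 3600

print(f"{per_subjective_second:.2f} m per subjective second")
print(f"{per_subjective_hour / 1000:.2f} km per subjective hour")   # ~1.08 km
```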
Secondly, humans have very coarse personhood detectors. Never mind AIs like GPT-4o; we've even treated animals and inanimate objects as people in the past. It's just quite easy to convince our brains that a non-human entity is a person. This makes sense; how could evolution have encoded something as complex as a human detector when building our social reward circuitry? No, it had to make do with mere proxies instead.
And that means there are bound to be very strange entities we can construct that would count as valid sources of status in the future. Yes, stranger even than LLMs. And probably more effective than humans, to boot.
So if we build these super-stimuli status sirens, these underpeople forming our permanent underclass, who (I might add) could well be pleased to be permanent yes-men, for mind design space is wide, could we truly say we wouldn't gladly partake of them?
Yes, perhaps we would view associating with them as abhorrent at first. But for those who are stuck at the bottom of the status hierarchies of utopia, their need for social esteem would compel them to at least try it out. Then, deep in the midst of the underpeople's social scene, would it truly seem so bad? I think not.
Published on December 1, 2025 9:30 PM GMT
Reposting some of my best Inkhaven posts here. Will do so at a 1-2x/week cadence unless anybody objects. I'll start with the first post[1], about writing tips for myself and others:
Today is my first day at Inkhaven! It’s a program where I try to publish a blog post every day for all of November.
To set the stage for the rest of the program, I offer up a loosely organized list of writing advice I wished my younger self would’ve learned earlier. I intend to refer back to this advice often over the course of the program (and hopefully in future years). I’m vaguely optimistic that it can also be helpful to some other participants in my program, and maybe to other aspiring writers as well.
Source: inkhaven.blog
Why do you write like you’re running out of time? (Hey!)
Write day and night like you’re running out of time? (Hey!)
Ev’ry day you fight, like you’re running out of time
Keep on fighting. In the meantime—
The most important thing to do as a budding writer is to write a lot. Write a lot, very, very fast. Write a lot. If the goal of writing a lot is impeded by other advice on this list, you should heed the other advice too, but in general, prioritize writing a lot over other goals. For example, if you don’t have surprising things to say, write a lot anyway.
Who knows? Maybe you misjudged your audience and your “obvious points about field/question X” are actually novel to many readers. Probably not. But it’s possible!
Publish a lot, too. Many positive feedback loops only come into play once you hit “publish”, and some from the intention to publish and taking yourself seriously as a fast blogger.
Many writers enjoy writing what readers expect them to say and re-emphasizing the exact same points over and over again in slightly different ways. For people who want to do this and can do this well (I enjoy Matt Yglesias and Bentham’s Bulldog as two examples of fulfilling this niche), more power to them!
But I at least a) know I’d be bored writing the same thing over and over, and b) realistically, know I’ll suck at it, compared to masters of the craft who live and breathe repetition in their field of interest.
So my comparative advantage is to write a wide breadth of articles, and be surprising in (almost) all of them.
This should be true recursively, at every level of the piece.
But in the service of saying surprising things, don’t say false things. It’s easy to be “surprising” or high-entropy simply by not having a restriction on “truth” capping what you can say, or via lacking a detailed world model. Good nonfiction writing should avoid lying, by either commission or omission.
Don’t say random impertinent things, either (#sorandom). Instead, aim to say things that in retrospect would come across to readers as surprising but inevitable.
Try to figure out ways to get fast, regular, and reliable feedback. From yourself, AIs, early readers, and eventually mentors and intended readers. Like almost all other areas of human endeavor, one of the best ways to improve is via deliberate practice, and one of the cornerstones of deliberate practice is immediate and reliable feedback.
Unfortunately, when you first start out, all your feedback methods suck. Your intuitions suck (unless you are a writing god, which is unlikely). Your AIs will be pretty miscalibrated on what your intended outcome is (plus they’ve been primed by your early drafts, which suck). Your readers will suck. You probably don’t have mentors. You certainly aren’t getting the readers you want. Oops.
But actually, this is okay! Fast biased feedback can still help you improve!
For example, one of the earliest ways I remember improving as a writer is via writing lots of Quora answers (starting back ~2014 when Quora was halfway decent). The training signal I got on which answers were good was a combination of my own thoughts, Quora upvotes and downvotes, and the occasional high-quality comment. Was the feedback great? Of course not! But nonetheless, the early feedback got the ball rolling, and over the course of months and years, I learned to write more quickly, pay attention to what members of my audience like, structure my arguments better, have more engaging examples, etc, etc. All good things!
Many people have the (false) belief that GIGO (“garbage in, garbage out”) is a fundamental law of nature. I instead think of it as a weak heuristic that is frequently wrong. When it comes to writers and writing, compressed garbage can (sometimes) instead become diamonds.
So all in all, it’s very important to continuously write for an audience, and write in a manner that allows you to collect a ton of feedback from yourself and others. Don’t be afraid to (mostly) ignore the bad feedback, but treasure all the feedback you receive regardless, and create gradients that make it easier for others to give you feedback, and for you to receive it.
A relevant and related model is the generator/discriminator model from machine learning (aka babble/prune). Writing a lot is your babble, good feedback (from yourself and others) is your pruning mechanism.
Relatedly, publish. Publicly. A lot. 80% of the ways you receive feedback are ~effectively closed to you if you only write drafts for yourself, or circulated among a small number of peer editors.
It’s great that you want to write a lot of surprising things in different ways that are conducive to receiving lots of feedback. But how do you write lots of new and surprising articles without lying, or just saying technically true but effectively impertinent facts?
Aristotle said a good story’s ending should be “surprising but inevitable.” I think the same is an important goal for nonfiction. It helps maintain a high novelty/surprise factor in your writing, while still making each article internally cohesive and driven by a specific, coherent logic.
For example, my Rising Premium of Life post primarily drew on economic data and modeling about how humans’ valuation of life has changed this century, but it took a long, seemingly random detour into facts about the lifespan of bats, mice, and other animals. By the end of the article, the connection was apparent, and perhaps even inevitable – once you see the analogy between intrinsic and extrinsic animal lifespan differences and how safety might beget more safety, it’s hard to unsee.
Likewise, my Why Reality Has a Well-Known Math Bias post opens with the seemingly absurd image of a shrimp physicist trying to do advanced fluid mechanics calculations before giving up, quitting shrimp physics in favor of going back to shrimp grad school in the shrimpanities. Laughable, and yet the metaphor is directly connected to multiple arguments later on about how best to think about the unusual effectiveness of mathematics in the natural sciences.
Source: Gemini Pro 2.5/ my prior article
Among other bloggers, Richard Hanania is halfway decent at this. His arguments for leftist censorship are utterly surprising given his own history of being suppressed. Yet his arguments have a sort of internally cohesive logic that’s simultaneously intellectually challenging and funny. Scott Alexander is, of course, a master at crafting surprising but inevitable arguments across a wide range of domains.
To find novel or at least surprising (to your audience) ideas to write about, it helps to read a lot, and persistently.
My first piece that went semi-viral (~6k upvotes, ~500k views) was this Quora answer on “honest college majors.” It’s pretty silly, but I think what elevated this Quora answer was that instead of having a single stereotype that other people complain about (“humanities + social science majors are too leftist! X majors are unemployable! ABC majors are too hard/easy!”), by that time I already had a reasonably decent introductory understanding of every academic major I joked about. None of my jokes were particularly revolutionary, but I think my breadth set me apart (survivorship bias joke in biomedical engineering, replication crisis joke in psychology, LSAT joke in Pre-Law, decision theory joke in history) and made it more interesting than the other answers that only had 1-3 novel angles to go for.
It’s also good to broadly use multiple separate ~independent quality filters for your readings, and be extremely wary of systematic biases in your access to information.
For specific articles, try to understand the State-of-the-Art of a field before opining on it. This is easier to do in some fields than others! For example, in my anthropics/mathematical effectiveness post, the question spanned many different academic fields (physics, philosophy of math, philosophy of science, etc) and I know there’s a real chance I missed important existing work. Nonetheless I gave it my best shot, skimming as many papers as I could find online, spinning up multiple different AIs to ask questions about fields I have less familiarity with, and asking three different philosophy academics I know to review my draft before publishing it. Even so, there’s a decent chance I missed something critical. Still, you’ve got to try!
If you’re opining on a known field, it behooves you to understand the existing academic consensus (or non-consensus!) before opining otherwise. Pay attention to track records on similar arguments, the intelligence/education level of various proponents and opponents, the amount of total collective research effort undertaken before reaching a conclusion, etc.
Sometimes while researching an article, you realize that the article you originally planned is bunk (either because your ideas are false or because they’ve already been said better elsewhere). This is okay!
Personally at that point I just stop that writing project and move on elsewhere, but people who are overly perfectionist/have trouble following the “write a lot” advice might benefit from documenting their mistakes and discoveries before moving on.
When writing nonfiction, unless you are writing something highly personal and specific to you, it’s often good to read multiple takes before writing your own. This is most obvious for research-heavy writings but it’s easier to do, and arguably more impactful on the margin, for posts others won’t consider “research.”
In my intellectual jokes post, by the time I started the post, I had already read thousands of different jokes online. But just to be safe, I Googled “intellectual jokes” and read/skimmed upwards of a couple hundred existing jokes in various online compilations, just to make sure the nine jokes I could stand behind were genuinely the best/funniest intellectual jokes by my lights.
Could I have written an intellectual jokes compilation if I only knew ~200 jokes and only ~10 intellectual ones? Of course! I expect most of the existing compilations look like this! But the marginal cost of skimming as many jokes as I did really isn’t that high, especially for a post that eventually got ~30k views, and I think my posts benefit from me caring a bunch more about quality in areas other people don’t seem to.
A corollary of reading a lot and writing things that are surprising is that you want to identify what others miss.
For example, for my Ted Chiang review, I reread several of his stories and read 10+ prior book reviews before starting to write my own (As of today I’ve probably read/skimmed >30 reviews). This helps ensure that my book review covers not just points that are interesting to me, and that Chiang does unusually well, but also specifically points that other book reviews overall missed.
Similarly, in my honey post, I biased towards covering subareas of bee welfare (e.g. positive vs negative valence) that other people did not. I didn’t try to answer thorny questions of normative ethics, bee consciousness, or intensity of valenced experiences, because my impression is that those were already (relatively) widely discussed elsewhere.
To quote a wise woman: “Haters gonna hate, hate, hate.” This is not just definitional but causal: hating makes you take on more of the character of a hater.
Many people start writing blogposts because someone’s wrong on the internet and they need to be set right. This is a perfectly fine sentiment to carry you through a post or two (and I’ve certainly fallen prey to it myself). But as an overall rule, it’s dangerous to see yourself as primarily a critic (or worse, a “hater”). It’s better to be motivated by love, curiosity, the Good, and other positive traits, rather than hate.
I’ve said before that you should write things that are surprising. But surprising to who? The simplest answer is “your intended audience”.
Who is your intended audience? That’s what you have to find out! Imagine who you want to read your work. Better yet, interview your real readers! (or people who are almost-readers).
Build a rough psychological profile of the people who you want to read your work, and try to say things that are true but surprising to them! Try to say things that slot in well (“inevitably”) with their own mental self-narratives, while at the same time being genuinely surprising.
Say what you believe and believe what you say. This is important.
There are different ways to achieve this. Classic style achieves it via purity of style. I like to just use first-person language (“I believe”, “I think”, “I aim”). The important thing is to avoid a weird academic-ese where you use weaselly, reflexive language.
It often helps to write in your own voice, and write as you speak, though this is not strictly a requirement. Sometimes it helps to affect specific voices, or over-emphasize some aspects of your personality (see later sections on style).
Talk to your readers as if you are talking to an intelligent, educated friend, or (as I do sometimes) to your younger self, who is intelligent but hasn’t yet stumbled upon the exact same insights as you.
Include graphs, sources, calculations. Anticipate and defuse obvious objections. Learn from people who are good at writing with data, and use it natively.
I’m not an expert on any specific field. And if I am, it’d be in relatively niche and inherently inter-disciplinary “fields” (like “EA grantmaking” or “pandemic forecasting”). So it’s pointless to hide that and try to “compete” with the experts in their native turf.
Instead, it’s better to rely on my own breadth and interests to make cohesive interdisciplinary arguments on topics of interest.
For example, my first anthropics post draws not just on anthropic reasoning, but also on philosophy of science, physics, history of science, evolution, math, philosophy of math, and even AI. The Ted Chiang review similarly draws not just from understanding Chiang’s own work or other science fiction, but relies on a confident understanding of social criticism, modern philosophy and economic growth models.
But at the same time, don’t “dazzle” people with knowledge for the sake of impressing them! Make focused arguments necessary to express a point (or occasionally to be interesting/funny), and never try to come across as too intelligent for your readers!
The following four points on craft are areas that I suspect most aspiring nonfiction writers can benefit from.
As I wrote in my field guide to writing styles, a mature writing style is defined by making a principled choice on a small number of nontrivial central issues: truth, presentation, cast, scene, and the intersection of thought & language.
A good artifact of writing maintains not just surface similarities but has a quiet consistency that underlies the work.
Open Asteroid Impact, for example, never broke character. There were jokes I considered including but rejected because, while funny in isolation, they would’ve ruined the mirage that I was running a Serious Company completely lacking in self-awareness, and thus made the website overall less funny.
While I believe a lot in experimenting in general across the course of your writing career, I think most posts should pick a side on each of the central issues and stick with that. Don’t have your book of prophecy be 1337spk.
If Coates instead titled it “Some Arguments in Favor of Considering Ethnicity As One of Several Factors When Deciding Appropriately-Sized Transfer Payments” the article would become substantially less memorable
Spend significant effort on your title (>10 minutes per serious post, and sometimes closer to an hour). It’s one of the most important components of a successful blogpost.
I think this is very unintuitive to people, myself included. From the perspective of a writer, a title is just one line to write (and often one of the least interesting lines, as it’s unlikely that you as a writer discover something novel in the process of creating a title). But from the perspective of a (potential) reader, titles are quite important: the title is often the only thing a potential reader sees, and it’s a large part of how they decide whether to click at all and how they remember the piece afterwards.
So titles are very important to readers! Since they are so important to readers, a writer trying to cooperate with (current and future) readers should also treat titles as important!
In addition to wanting to be cooperative with readers, in the current algorithmic information environment, you also have to somewhat “fight”/argue for your post’s value in seeking your readers’ attention.
Finally, good titles should capture the essence of your point after it’s been argued. So you and your readers can refer to it again in the future. So it behooves you to come up with good titles (or a title; subtitle combination) that both encapsulates your primary thesis and also is intriguing enough for readers to click through.
I often leave my titles blank before finishing a post (or have a purely descriptive internal title like “writing styles post” or “war post”), and use AIs to generate title ideas after I finish writing a post. Sometimes the AIs can help discover real gems, like elevating The Secret Third Thing from what I thought of as a throwaway Twitter reference to a poetic phrase that encapsulates a core argument of mine in the Ted Chiang review.
More often, I go back and forth with AIs and use AIs as inspiration but settle in on a cleverish multilayered title like Why Reality has a Well-Known Math Bias, where the final title and pun was entirely my own, but I probably wouldn’t have generated the title if AIs didn’t suggest the direction of other (worse) puns first.
Unfortunately, sometimes I fail at a title, even for pieces I otherwise like. For example, I don’t like to refer to Why Are We All Cowards? by its final title, since it’s both clickbaity and imprecise compared to the actual argument, but “the rising premium of life post” unfortunately sounds like uninteresting gibberish to people who don’t already understand the post’s central concept.
Self-explanatory.
When you see online advice on complexity of language, many people go for advice like Orwell’s, where you “Never use a long word where a short one will do”, etc, unless you’ll otherwise say something “outright barbarous.”
I mostly disagree. I want to be more ecumenical with my advice. I think complexity of language has its place, and I think there are many times where you ought to prefer more complex language even if simpler words are “good enough,” even when the simpler language is not “outright barbarous.”
For me, I think better advice is to never write with words you personally are uncomfortable with, and to write with words and methods of expression that are comfortable for your actual audience. This means understanding your actual audience (see above), and developing models of what words would and would not be too complex for them.
As a ballpark, I think for popular pieces, you ought to aim for writing ~4 years of specialization/grade levels below where you think your audience’s actual reading level is (since people are less comfortable with writing at the limits of their ability). But this is just a ballpark. I’m more confident that writers should pay attention to their reading levels/complexity of words and be intentional with their choices than that they should land at a specific place.
These are craft points that I suspect are more specific to the types of writing that I want to do.
Some posts (e.g. Rethink Priorities reports, academic papers in the natural sciences) are primarily made to be skimmed by the vast majority of readers. For those works, the core success criterion is skimmability (ease of parsing on a quick skim); the rest of the writing functions more as proof-of-work and evidence that you know what you’re talking about than as something actively meant to be read.
Some writings (e.g. Great American Novels) are written primarily for Deep Reading (™), and the authors would even be offended if you try to skim them! Some of those works are deliberately engineered to be harder to skim.
For my own blog posts, I try to anticipate both skimming and non-skimming use-cases. So I try to write articles that are both easy to skim and easy to not skim. This is a real tradeoff I’m making, and other bloggers might want to only pick one of the two sides to focus on.
I try to have clear, meaningful section breaks that preview content. This makes it easier for readers to skim my content, for returning readers to jump to sections they’re interested in, and for me (and hopefully others!) to link specific sections to use as lemmas/subarguments in discussions with others, etc.
Most writing advice given before the last ten years, and much given since, ignores the online nature of the vast majority of written content we produce and consume these days. It assumes modern writing still looks like Ye Olde Essayes hosted HTML-only on some blogging server.
Needless to say, modern blogs don’t look like this at all anymore, especially the more successful ones.
Charts are helpful not just for demonstrating content, but for breaking visual monotony. Images, graphs, tables, videos, subscription links, they are all great too.
To quote Scott:
The clickbaiters are our gurus – they intersperse images throughout their content. The images aren’t always very useful, they don’t always add much, but now it’s not just a wall of text. It’s a wall of text and images.
Of course, the best visual and other non-text elements actively advance your core arguments, not just enhance the vibe or make for a more pleasant reading experience.
See here, here, here, and here for non-writing integrations that I especially liked.
When illustrating Wigner’s puzzle, I didn’t introduce it with an academic discussion on the unreasonable effectiveness of mathematics in the natural sciences. Instead we got an image of a harried shrimp physicist at the bottom of a turbulent waterfall. When introducing questions of bee welfare, I didn’t start with complex moral deliberation or questions about arthropod phenomenology, I started with a relatable (to Berkeleyites, anyway) anecdote of holding up a line at the local bubble tea shop. The Puzzle of War was introduced with graphic descriptions of Qin dynasty China (and then mellowed out with specific examples of fictional wars between game-theoretic elves and dwarves).
As for why you want to do this, Scott Alexander puts it best.
Jokes are good. Mediocre jokes aren’t as good as good jokes, but they are better than no jokes. And microjokes…well the homeopathic Law of Minimum Dose will say that microjokes are actually the funniest jokes of all!
In every article, a strong temptation I have in my heart-of-hearts is to include every detail, no matter how irrelevant. But in my head (of-heads?), I know that this is a terrible idea. Readers are busy people, and they have much better things to do than wade through 10,000 words on the rising premium of life.
But how do I balance my emotional need for completionism and including every detail (plus irrelevant asides) with my intellectual desire to be a good citizen who presents completed and compact finished works?
Answer: Footnotes! After an initial draft where I barf everything on the page, in the editing phase I can tighten up my language and remove unnecessary cruft. And when there are paragraphs I especially like but I know I ought to cut, footnotes come in handy!
That said, it’s also possible to go overboard with footnotes. I’m less aggressive with cutting footnotes than in the main text, but I still average ~10 footnotes instead of like 25.
Bruce Lee is an expert at kicking, but not necessarily at Substack blogging ...
Bruce Lee once said “I fear not the man who has practiced 10,000 kicks once, but I fear the man who has practiced one kick 10,000 times.” Well, writing is rather unlike kicks. If you write the same story, blogpost, or paper 10,000 times, people aren’t going to fear you. They’ll just be confused, and/or bored.
As a starting writer, it’s hard to know which articles a) you especially enjoy writing, and b) the marketplace of ideas would reward you for. Rather than perfecting a specific genre or sub-style of essay, it’s probably much more productive to experiment very broadly and sample across a wide swathe of possible pieces you can write.
Experiment widely with a) different styles of writing, b) different topics, c) different venues to publish in, d) different levels of post-writing edits and perfectionism, e) different intended audiences, and f) different registers and degrees of linguistic complexity, and so forth.
Getting feedback is great! But often, the most critical/useful feedback essentially makes your piece unsalvageable if you were to take it too literally.
Wait…unsalvageable? How is it still useful, never mind “most critical”? Simple: you take the feedback you receive and apply it to your next post.
Seeing a published piece of yours continue to have critical flaws isn’t pleasant, but it’s vastly preferable to constantly rewriting a piece to address what might well be an unfixable flaw. And it’s of course better than pretending to seek feedback but not updating!
It’s good to learn from better writers than you. For me, I learn the most from Scott Alexander, because I respect him and he has a writing style that I find pleasant to emulate.
It’s also good to learn from worse writers than you. Why? Well, first of all, when someone else makes the same mistakes as you, you might be able to understand why those are mistakes from some remove.
Second, someone being worse than you at one dimension (or in aggregate) doesn’t mean they’re worse than you in all dimensions of writing. Just as averaged-out faces can be very beautiful and ML algorithms trained on human game-play can achieve superhuman performance, you too can learn to exceed your masters even if you only ever train on worse (overall!) writers than you.
Finally, it’s harder to see progress if you only compare with writers that are far better than you. By comparing with worse writers, you can learn and improve on more realistic axes.
For me, the writer I learn the most from is my younger self. This might be surprising, since in some sense my younger self is almost by definition a worse writer than me. Nonetheless, I find it valuable to reread my older pieces with clear(er) eyes. Seeing mistakes in my older pieces, as well as things past-me did well, has been very illuminating in improving my present writing.
Finally, be willing to ship imperfect and unfinished work! This is somewhat redundant with past advice, but still bears saying.
Shipping a lot of imperfect work means you can:
That’s why I’m in Inkhaven where I have to publish a blogpost every day for a month! 😱😱😱
You, too, might benefit from being extremely prolific. But that’s only possible if you give up on perfectionism and be open to publishing the imperfect.
So publish even when you aren’t fully ready, even when you haven’t edited everything through, and even when your conclusion isn–
The reception to this post has been odd. It didn't get that many likes on Substack, but it has been liked (and is the only post of mine to be liked) by some pretty big writers there. I'm not sure whether this is evidence that my writing advice is good or not; regardless, consider the alt-text here: https://xkcd.com/125/
Published on December 1, 2025 9:06 PM GMT
We have a ritual around these parts.
Every year, we have ourselves a little argument about the annual LessWrong Review, and whether it's a good use of our time or not.
Every year, we decide it passes the cost-benefit analysis[1].
Oh, also, every[2] year, we ask you to do your part: nominating posts, writing reviews, and voting.
Maybe you can tell that I'm one of the more skeptical members of the team, when it comes to the Review.
Nonetheless, I think the Review is probably worth your time, even (or maybe especially) if your time is otherwise highly valuable. I will explain why I think this, then I will tell you which stretch of ditch you're responsible for digging this year.
Every serious field of inquiry has some mechanism(s) by which it discourages its participants from huffing their own farts. Fields which have fewer of these mechanisms tend to be correspondingly less attached to reality. The best fields are those where formal validation is possible (math) or where you can get consistent, easily-replicable experiment results which cleanly refute large swathes of hypothesis-space (much but not all of physics). The worst fields are those where there is no ground truth, or where the "ground truth" is a pointer to a rapidly changing[3] social reality.
In this respect, LessWrong is playing on hard mode. Most of the intellectual inquiry that "we" (broadly construed) are conducting is not the kind where you can trivially run experiments and get really huge odds ratios to update on based on the results. In most of the cases where we can relatively easily run replicable experiments, like all the ML stuff, it's not clear how much evidence any of that is providing with respect to the underlying questions that are motivating that research (how/when/if/why AI is going to kill everyone).
We need some mechanism by which we look at the posts we were so excited about when they were first published, and check whether they still make any sense now that the NRE[4] has worn off. This is doubly-important if those posts have spread their memes far and wide - if those memes turned out to be wrong, we should try to figure out whether there were any mistakes that could have been caught at the time, with heuristics or reasoning procedures that wouldn't also throw out all true and useful updates too (and maybe attempt to propagate corrections, though that can be pretty hopeless).
Separate from the question of whether we're unwittingly committing epistemic crimes and stuffing everyone's heads full of misinformation, is the question of whether all of the blood, sweat, tears, and doomscrolling is producing anything of positive value.
I wish we could point to the slightly unusual number of people who went from reading and writing on LessWrong to getting very rich as proof positive that there's something good here. But I fear those dwarves are digging too deep...
So we must turn to somewhat less legible, but hopefully also less cursed, evidence. I've found it interesting to consider questions about which ideas from LessWrong have actually held up, spread, and been put to use.
Imagine that we've struck the motherlode and the answers to some of those questions are "yes". The Review is a chance to form a more holistic, common-knowledge understanding of how you and other people in your intellectual sphere are relating to these questions. It'd be a little sad to go around with some random mental construction in your head, constantly using it to understand and relate to the world, assuming that everyone else also had the same gadget, and to later learn that you were the only one. By the law of the excluded middle, that gadget is either good, in which case you need to make sure that everyone else also installs it into their heads, or it's bad, which means you should get rid of it ASAP. No other options exist!
If your time and attention is valuable, and you spend a lot of it on LessWrong, it's even more important for you to make sure that it's being well-spent. And so...
Similar to last year, actually. Quoting Ray:
If you're the sort of longterm member whose judgment would be valuable, but, because you're a smart person with good judgement, you're busy... here is what I ask:
First, do some minimal actions to contribute your share of judgment for "what were the most important, timeless posts of 2023?". Then, in proportion to how valuable it seems, spend some time reflecting on bigger picture questions on how LessWrong is doing.
The concrete, minimal Civic Duty actions
It's pretty costly to declare something "civic duty". The LessWrong team gets to do it basically in proportion to how much people trust us and believe in our visions.
Here's what I'm asking of people, to get your metaphorical[5] "I voted and helped the Group Reflection Process" sticker:
Phase I:
Nomination Voting (2 weeks)
We identify posts especially worthy of consideration in the review, by casting preliminary votes. Posts with 2 positive votes move into the Discussion Phase.
Asks: Spend ~30 minutes looking at the Nominate Posts page and vote on ones that seem important to you.
Write 2 short reviews[6] explaining why posts were valuable.
Phase II:
Discussion (4 weeks)
We review and debate posts. Posts that receive at least 1 written review move to the final voting phase.
Ask: Write 3 informational reviews[7] that aim to convey new/non-obvious information, to help inform voters. Summarize that info in the first sentence.
Phase III:
Final Voting (2 weeks)
We do a full voting pass, using quadratic voting. The outcome determines the Best of LessWrong results.
Ask: Cast a final vote on at least 6 posts.
Note: Anyone can write reviews. You're eligible to vote if your account was created before January 1st of 2023. More details in the Nuts and Bolts section.
Bigger Picture
I'd suggest spending at least a little time this month (more if it feels like it's organically paying for itself), reflecting on...
- ...the big picture of what intellectual progress seems important to you. Do it whatever way is most valuable to you. But, do it publicly, this month, such that it helps encourage other people to do so as well. And ideally, do it with some degree of "looking back" – either of your own past work and how your views have changed, or how the overall intellectual landscape has changed.
- ...how you wish incentives were different on LessWrong. Write up your thoughts on this post. (I suggest including both "what the impossible ideal" would be, as well as some practical ideas for how to improve them on current margins)
- ...how the LessWrong and X-risk communities could make some group epistemic progress on the longstanding questions that have been most controversial. (We won't resolve the big questions firmly, and I don't want to just rehash old arguments. But, I believe we can make some chunks of incremental progress each year, and the Review is a good time to do so)
In a future post, I'll share more models about why these are valuable, and suggestions on how to go about it.
Except, uh, s/2023/2024. This year, you'll be nominating posts from 2024!
Copied verbatim from last year's announcement post.
Instructions Here
To nominate a post, cast a preliminary vote for it. Eligible voters will see this UI:
[screenshot of the preliminary voting widget]
If you think a post was an important intellectual contribution, you can cast a vote indicating roughly how important it was. For some rough guidance:
Votes cost quadratic points – a vote strength of "1" costs 1 point. A vote of strength 4 costs 10 points. A vote of strength 9 costs 45. If you spend more than 500 points, your votes will be scaled down proportionately.
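For the curious, here's the cost rule those example numbers imply (my own reconstruction, not the site's actual code): the n-th point of vote strength costs n points, so a vote of strength n costs n(n+1)/2 in total.

```python
# Sketch of the vote-cost rule implied by the examples above.
def vote_cost(strength: int) -> int:
    return strength * (strength + 1) // 2

assert vote_cost(1) == 1    # "a vote strength of 1 costs 1 point"
assert vote_cost(4) == 10   # "a vote of strength 4 costs 10 points"
assert vote_cost(9) == 45   # "a vote of strength 9 costs 45"

# Example budget check for a hypothetical set of votes:
print(sum(vote_cost(s) for s in [4, 4, 2, 1, 1]))  # 25 points, well under 500
```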
Use the Nominate Posts page to find posts to vote on.
Posts that get at least one positive vote go to the Voting Dashboard, where other users can vote on them. You’re encouraged to give at least a rough vote based on what you remember from last year. It's okay (encouraged!) to change your mind later.
Posts with at least 2 positive votes will move on to the Discussion Phase.
Writing a short review
If you feel a post was important, you’re also encouraged to write up at least a short review of it saying what stands out about the post and why it matters. (You’re welcome to write multiple reviews of a post, if you want to start by jotting down your quick impressions, and later review it in more detail)
Posts with at least one review get sorted to the top of the list of posts to vote on, so if you'd like a post to get more attention it's helpful to review it.
Why preliminary voting? Why two voting phases?
Each year, more posts get written on LessWrong. The first Review of 2018 considered 1,500 posts. In 2021, there were 4,250. Processing that many posts is a lot of work.
Preliminary voting is designed to help handle the increased number of posts. Instead of simply nominating posts, we start directly with a vote. Those preliminary votes will then be published, and only posts that at least two people voted on go to the next round.
In the review phase this allows individual site members to notice if something seems particularly inaccurate in its placing. If you think a post was inaccurately ranked low, you can write a positive review arguing it should be higher, which other people can take into account for the final vote. Posts which received lots of middling votes can get deprioritized in the review phase, allowing us to focus on the conversations that are most likely to matter for the final result.
The second phase is a month long, and focuses entirely on writing reviews. Reviews are special comments that evaluate a post. Good questions to answer in a review include: Did the post's claims hold up? Did you actually use its ideas, and how? What does it get wrong or leave out?
In the discussion phase, aim for reviews that somehow give a voter more information. It's not that useful to say "this post is great/overrated." It's more useful to say "I link people to this post a lot" or "this post seemed to cause a lot of misunderstandings."
But it's even more useful to say "I've linked this to ~7 people and it helped them understand X", or "This post helped me understand Y, which changed my plans in Z fashion" or "this post seems to cause specific misunderstanding W."
Posts that receive at least one review move on to the Final Voting Phase.
The UI will require voters to at least briefly skim reviews before finalizing their vote for each post, so arguments about each post can be considered.
As in previous years, we'll publish the voting results for users with 1000+ karma, as well as all users. The LessWrong moderation team will take the voting results as a strong indicator of which posts to include in the Best of 2024, although we reserve some right to make editorial judgments.
Your mind is your lantern. Your keyboard, your shovel. Go forth and dig!
Or at least get tired enough of arguing about it that sheer momentum forces our hands.
Historical procedures have varied. This year is the same as last year.
And sometimes anti-inductive!
New relationship energy.
Ray: "Maybe also literal but I haven't done the UI design yet."
Ray: "In previous years, we had a distinction between "nomination" comments and "review" comments. I streamlined them into a single type for the 2020 Review, although I'm not sure if that was the right call. Next year I may revert to distinguishing them more."
Ray: "These don't have to be long, but aim to either a) highlight pieces within the post you think a cursory voter would most benefit from being reminded of, b) note the specific ways it has helped you, c) share things you've learned since writing the post, or d) note your biggest disagreement with the post."
Published on December 1, 2025 8:57 PM GMT
Bay Solstice is this weekend (Dec 6th at 7pm, with a Megameetup at Lighthaven earlier in the day).
I wanted to give people a bit more idea of what to expect.
I created Solstice in 2011.
Since 2022, I've been worried that the Solstice isn't really set up to handle "actually looking at human extinction in nearmode" in a psychologically healthy way. (I tried to set this up in the beginning, but once my p(Doom) crept over 50% I started feeling like Solstice wasn't really helping the way I wanted).
People 'round here disagree a lot on how AI will play out. But, Yes Requires the Possibility of No, and as the guy who made rationalist solstice, it seemed like I should either accept that Solstice wasn't going to seriously engage with that question, or rework it so that it could.
This Solstice is me attempting to navigate option #2, while handling the fact that we have lots of people with lots of different worldviews who want pretty different things out of solstice, many of whom don't care about the AI question at all.
It has been pretty challenging, but I've been thinking about it for 3 years. I'm feeling pretty good about how I've threaded the needle of making a solid experience for everyone.
Somewhat more broadly: Solstice is about looking at the deep past, the present, and the far future. When I created it in 2011, the future was sort of comfortably "over there." Now it feels like The Future is just... already happening, sort of. And some of the framing felt a bit in need of an update.
(Meanwhile there is a separate subthread about making Solstice do a better job of helping people sing along, both with a smoother singing difficulty curve, and by picking songs that naturally introduce new required skills in a way that helps you learn, mostly without feeling like you're "learning".)
It's sort of a capstone project for the first half of my adult life. It's felt a bit to me like the season finale for Solstice Season 4, if Solstice was a TV show.
It's still a bit of a weird thing that's not for everyone, but, if the above sounds worthwhile to you, hope to see you there.
Published on December 1, 2025 8:07 PM GMT
Thanks @Ariel Cheng for helping a lot in refining the idea, with her thorough understanding of FEP
I am not claiming that this theory fully explains depression. It is a probably-wrong mechanistic model covering at most a tiny fraction of depression's etiology; there are many other biological explanations that may fit many cases better.
Intro:
Depression is often (usually implicitly) conceived of as "fixed priors" on the state of oneself and the world, with an overly pessimistic bias. Depressed people's views are considered to be a mere product of a "chemical imbalance" (which chemical? serotonin almost certainly not[1]). The standard psychotherapeutic treatment of depression, CBT, is based on this idea: your problems are cognitive distortions, and by getting into a better epistemic state about them, they diminish.
However, depressive realism seems to hold for at least some cognitive tasks, and increased activity of the same neurotransmitter appears to mediate both the effects of many "cognitive enhancers" (nootropics) and depression. This may be explained by depression being an attractor state reached via a pathologically increased learning rate.
In this text, I propose a theory of the mechanism behind this connection, using mostly an Active Inference model of the mind.
TL;DR (by GPT-5.1):
basic idea
FEP posits that any self-organizing system (like a human) must act to resist increasing entropy in order to preserve itself. In information-theoretic terms, this means the agent must continually minimize surprisal (the negative log evidence of its sensory observations).
Computing surprisal ($-\ln p(o)$) exactly is intractable for a brain, since it would require summing over all possible causes of a sensation. So, the brain minimizes a tractable proxy: the Variational Free Energy (VFE).
The VFE ($F$) is an upper bound on surprisal. Mathematically, it decomposes like this:

$F \;=\; \underbrace{-\ln p(o)}_{\text{surprisal}} \;+\; \underbrace{D_{KL}\!\left[\,q(s)\,\|\,p(s\mid o)\,\right]}_{\text{divergence}\,\ge\,0}$

This equation gives the brain two mechanisms to stay alive:

1. Perception: update the approximate posterior $q(s)$ to shrink the divergence term (better inference about the causes of sensations).
2. Action: change the observations $o$ themselves so that they are less surprising under the model.

But you cannot minimize VFE directly through action, because you cannot control the present instant. You can only control the future.
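To make the decomposition concrete, here is a minimal, hypothetical Python sketch (mine, not from the post; the toy numbers are arbitrary). It checks numerically that $F$ equals surprisal plus the KL divergence from the approximate posterior to the true posterior, so $F$ can only be reduced by improving the posterior (perception) or by obtaining different observations (action).

```python
import numpy as np

# Toy discrete generative model: 2 hidden states, 2 possible observations.
p_s = np.array([0.7, 0.3])              # prior over hidden states, p(s)
p_o_given_s = np.array([[0.9, 0.1],     # likelihood p(o|s): row = state s, column = observation o
                        [0.2, 0.8]])

def vfe(q_s, o):
    """Variational free energy F = E_q[ln q(s) - ln p(o, s)]."""
    joint = p_o_given_s[:, o] * p_s     # p(o, s) for the observed o
    return float(np.sum(q_s * (np.log(q_s) - np.log(joint))))

def surprisal(o):
    return float(-np.log(np.sum(p_o_given_s[:, o] * p_s)))   # -ln p(o)

o = 1                                   # a relatively "unexpected" observation
q = np.array([0.5, 0.5])                # initial approximate posterior q(s)
posterior = p_o_given_s[:, o] * p_s
posterior /= posterior.sum()            # true posterior p(s|o)

kl = float(np.sum(q * np.log(q / posterior)))
print(vfe(q, o), surprisal(o) + kl)     # identical: F = surprisal + KL[q || p(s|o)]

# Perception: setting q to the true posterior removes the KL term,
# leaving F equal to the surprisal, which inference alone cannot reduce further.
print(vfe(posterior, o), surprisal(o))
```

The only remaining way to push $F$ below that floor is to act so that future observations are less surprising, which is where the next section comes in.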
This requires the Expected Free Energy (EFE), $G(\pi)$:

$G(\pi) \;=\; \mathbb{E}_{q(o, s \mid \pi)}\big[\ln q(s \mid \pi) - \ln p(o, s)\big]$

To minimize surprisal over time, the agent "rolls out" its generative model into the future and selects policies ($\pi$) that minimize the free energy it expects to occur.

When you unpack this, the EFE drives two competing behaviors (a toy sketch follows below):

- Exploitation (pragmatic value): seeking observations that match the agent's prior preferences.
- Exploration (epistemic value): seeking observations that are expected to reduce uncertainty about hidden states.
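Here is a minimal, hypothetical sketch of that decomposition (mine; the matrices and preference vector are arbitrary illustrations, and `gamma` anticipates the policy-precision parameter discussed in the dopamine section). Two candidate policies have identical expected pragmatic value, but the one whose observations disambiguate the hidden state gets higher epistemic value and therefore more policy probability.

```python
import numpy as np

def efe(q_s, A, log_C):
    """One-step expected free energy of a policy, split into pragmatic cost minus
    epistemic value (a common decomposition in discrete Active Inference models).
    q_s   : predicted hidden-state distribution under the policy, q(s|pi)
    A     : likelihood p(o|s), shape (n_obs, n_states)
    log_C : log prior preferences over observations, ln p(o)
    """
    q_o = A @ q_s                                   # predicted observation distribution
    pragmatic = -float(q_o @ log_C)                 # expected cost under the preferences
    epistemic = 0.0                                 # expected information gain about s
    for o, p_o in enumerate(q_o):
        post = A[o] * q_s
        post = post / post.sum()                    # posterior q(s|o, pi)
        epistemic += p_o * float(np.sum(post * np.log((post + 1e-12) / (q_s + 1e-12))))
    return pragmatic - epistemic

q_s = np.array([0.5, 0.5])                          # the agent is unsure which state it is in
log_C = np.log(np.array([0.9, 0.1]))                # observation 0 is strongly preferred
A_informative   = np.array([[0.95, 0.05],           # observations reveal the hidden state
                            [0.05, 0.95]])
A_uninformative = np.array([[0.5, 0.5],             # observations reveal nothing
                            [0.5, 0.5]])

G = np.array([efe(q_s, A_informative, log_C), efe(q_s, A_uninformative, log_C)])
gamma = 2.0                                         # policy precision (inverse temperature)
q_pi = np.exp(-gamma * G) / np.exp(-gamma * G).sum()
print(G, q_pi)                                      # the informative policy gets more probability
```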
Standard RL agent theory usually separates the world-state (is) from the reward function (ought). Active Inference reduces this distinction by using the same "currency" for utility and epistemic value: prediction error (PE). In this framework, desires are just probability distributions, specifically priors over observations ($p(o)$).
In standard RL, the agent has a state space and a separate reward function. The agent checks the state, consults the reward function, and computes a policy.
The brain (in the FEP framework) just has a generative model of what it expects to happen.
The cost function is simply (the negative log of) the prior probability of the observation:

$C(o) = -\ln p(o)$

If you "want" to be warm, your brain encodes a generative model in which the prior probability of observing a body temperature of (around) 37 °C is extremely high, which is the basic mechanism behind life-preserving homeostasis.
In a standard Bayesian update, if you observe that you are cold, you should update your prior to expect coldness. The reason this doesn't happen is that the deep, homeostatic priors (temperature, oxygen) are not standard beliefs.
Mathematically, this means that the parameters of these innate prior distributions – encoding the agent’s expectations as part of its generative model – have hyperpriors that are infinitely precise (e.g., a Dirac delta distribution) and thus cannot be updated in an experience dependent fashion.
Because the hyperprior is a Dirac delta, the agent cannot update its expectation of what its temperature "should" be based on experience. No matter how long you stand in the snow, you will never "learn" that hypothermia is your natural state. The prediction error between the fixed prior ($p(o)$) and the sensory reality ($o$) remains essentially infinite, forcing the system to minimise VFE the only way left: by acting to heat the body up.
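A hedged toy sketch of why (mine, not from the post; the precision values are arbitrary stand-ins): in a conjugate Gaussian update, the posterior mean is a precision-weighted average of prior and evidence, so as the prior precision goes to infinity (approximating the Dirac-delta hyperprior) the set-point stops moving no matter how much cold evidence accumulates.

```python
import numpy as np

def gaussian_update(prior_mean, prior_precision, obs, obs_precision):
    """Conjugate Gaussian belief update: a precision-weighted average of prior and datum."""
    post_precision = prior_precision + obs_precision
    post_mean = (prior_precision * prior_mean + obs_precision * obs) / post_precision
    return post_mean, post_precision

cold_readings = [30.0] * 100                  # standing in the snow for a long time
obs_precision = 1.0

# An ordinary belief (finite prior precision) drifts toward the evidence...
mean, prec = 37.0, 1.0
for o in cold_readings:
    mean, prec = gaussian_update(mean, prec, o, obs_precision)
print(round(mean, 2))                         # well below 37: the belief has "learned" coldness

# ...but a homeostatic prior with (near-)infinite precision, standing in for the
# Dirac-delta hyperprior, never updates its set-point.
mean, prec = 37.0, 1e12
for o in cold_readings:
    mean, prec = gaussian_update(mean, prec, o, obs_precision)
print(round(mean, 2))                         # still ~37: the residual error must be resolved by acting
```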
While generally encodes these fixed preferences, beliefs about hidden states, , often encode epistemic beliefs. The deeper you go in the hierarchy, further from immediate sensory input, the more these p(s) distributions begin to resemble stubborn preferences or identity/self-concepts, and the slower they are to update.
In this post, when I talk about priors/beliefs/desires, it means this hierarchy of expectations, where the deepest layers act as the immovable "oughts" that the agent strives to fulfill.
For example, an agent with an abnormally high learning rate might have the prior of "I am worthy/competent", but a single failed exam might update it to "I am incompetent/dumb/worthless". This depressed state becomes an attractor, because the brain, aiming to minimize prediction error, subsequently filters and discounts positive data to confirm the new, negative self-belief.
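A minimal, hypothetical delta-rule sketch of the learning-rate point (mine; it does not model the confirmation-filtering that, in the story above, then locks the negative belief in): the same single failure barely registers at a low learning rate, but crashes the self-belief and leaves it swinging wildly at a high, ACh-like learning rate.

```python
import numpy as np

# Belief b = P("I am competent"), updated from a stream of outcomes
# (1 = success, 0 = failure) with a simple delta rule and learning rate lr.
def run(lr, outcomes, b=0.9):
    trajectory = [b]
    for e in outcomes:
        b = b + lr * (e - b)            # lr plays the role of the effective learning rate
        trajectory.append(b)
    return np.round(trajectory, 2)

outcomes = [1, 1, 1, 0, 1, 1, 1]        # one failed exam in a run of successes

print(run(lr=0.1, outcomes=outcomes))   # stable self-model: the failure barely registers
print(run(lr=0.9, outcomes=outcomes))   # high-learning-rate regime: one failure crashes the belief
```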
The neurotransmitter acetylcholine (ACh) is present in all parts of the CNS as well as in the PNS. In the brain, there are two classes of acetylcholine receptors: nicotinic receptors (the target of nicotine) and muscarinic receptors. Both are known for having central roles in memory formation and cognition, and both are (indirect) targets of common Alzheimer's disease medications.
In the 1950s, a correlation between increased central ACh and depression was discovered, and in the 70s it was formalised as the cholinergic-adrenergic hypothesis of mania and depression[2]. Later, experimentally increasing central acetylcholine was shown to induce analogues of depression, such as "learned helplessness", in animal models[3].
The cholinergic system (acting via ACh receptors) is also the target of many "cognitive enhancers", such as nicotine and the first explicitly labelled "nootropic", piracetam. The mechanism of these cholinergic nootropics has been proposed, by Scott Alexander, Steven Byrnes, and originally Karl Friston, to be an increase in what ML calls the "learning rate" and what the Free Energy Principle approach to neuroscience calls "precision" (of bottom-up transmission)[4]. In essence, this parameter, encoded by ACh, determines how "significant" the currently perceived signals are, and thus how strongly they may "override" prior models of the perceived object/situation. In ActInf terms, the prediction error in the bottom-up signal is made more significant[5], independently of the actual significance of the "wrongness" in one's prior understanding of the sensed situation. Since prediction error may be perceived as suffering/discomfort, this seems relevant to the dysphoria[6] that is part of depression.
This is similar to the concept of Direction of Fit, with the parameter setting the balance between mind-to-world and world-to-mind fit. In other words: how strongly one imposes one's will to change the world when perceived data conflict with one's desires (~low sensory precision), as opposed to "the signals I perceive differ significantly from my prior beliefs, so I must change my beliefs" (~high sensory precision).
In another model, ACh can be viewed as strengthening the arrows from the blue area, causing the "planning" part to be relatively less influenced by the green (goal) nodes 32 and 33, whereas dopamine is doing the opposite (which suggests the proposed tradeoff between the "simulation" being more "epistemically accurate" vs. "will-imposing").
More handwavily, if agency is time travel, ACh makes this time travel less efficient, for the benefit of better simulation of the current state of the world.
The post assumes that desires and epistemic priors are encoded in a similar way in the brain (explained in the previous section), and a state of high acetylcholine signalling is thus able to "override" not only prior beliefs, but also desires about how the world (including the agent) should be, leading to loss of motivation and goals (long-term and immediate, even physiological needs in severe cases), compromising a part of the symptomatics of depression.
There is also some evidence for ACh modulating updating on aversive stimuli specifically[7][8], as well as for acetylcholinesterase (the enzyme breaking down ACh) increasing in the recovery period after stress[9] (suggesting a role for ACh as a positive stress modulator). However, this seems too unclear, so for the rest of the post I'll assume that ACh modulates precision on ascending information (PEs) in general.
Dopamine is a neurotransmitter belonging to the class of catecholamines (together with (nor)-epinephrine) and, more broadly, to the monoamines (with serotonin).
Dopamine (DA) seems to be the reward signal created by the human's learned "reward function", coming from the striatum. In Big picture of phasic dopamine, Steve Byrnes proposes this idea in more detail. In short, there is a tonic level of dopaminergic neuron activation, and depending on whether the current situation is evaluated as positive or negative by the brain, more or less dopaminergic firing will occur relative to that baseline. The same reward mechanism applies to internally generated thoughts and ideas about potential actions. This is why dopamine-depleted rats will not put any effort into finding food (but will consume it if it is placed in their mouths).
In this theory, dopamine is (very roughly and possibly completely incorrectly) the "top-down" signal enforcer: the mechanism for enforcing priors about oneself (which, according to FEP theory, are all downstream of the prior on one's continued existence). In the ActInf literature, dopamine has the role of increasing policy precision ($\gamma$), balancing bottom-up sensory precision.[10]
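Concretely, policy precision acts like the inverse temperature of a softmax over (negative) expected free energies. A minimal sketch (mine, with illustrative numbers): at low $\gamma$ the policy posterior is nearly flat and no option "wins" (so no effort is mobilised), while at high $\gamma$ the agent commits hard to the best-looking policy.

```python
import numpy as np

def policy_posterior(G, gamma):
    """q(pi) = softmax(-gamma * G): gamma acts as an inverse temperature (policy precision)."""
    z = np.exp(-gamma * np.asarray(G))
    return z / z.sum()

G = [1.0, 1.3, 2.0]                         # expected free energies of three candidate policies

print(policy_posterior(G, gamma=0.1))       # low DA / low gamma: nearly uniform, nothing gets pursued
print(policy_posterior(G, gamma=5.0))       # high DA / high gamma: commits hard to the best policy
```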
Overactivity of dopaminergic signalling (in certain pathways, at certain receptors) leads to mania[11], and in different pathways, to psychosis[12]. Both seem somewhat intuitive: mania seems like the inverse of depression, a failure to update realistically on reality, instead enforcing grandiose ideas that propagate top-down. Psychosis seems like the more "epistemic" counterpart to this: internally generated priors on the state of reality are enforced on perception, while bottom-up, epistemically corrective signalling is deficient. If a psychotic person has a specific delusion or a specific pattern/symbol they are convinced is ever-present, pattern-matching will be extremely biased towards these patterns, enforcing the prior.
Then, should we just give dopamine agonists or amphetamines to depressed people?
Depression usually begins after, or during, some unpleasant life situation. This then leads to an adaptive increase in acetylcholine and rumination, often reinforced by inflammatory cytokines, causing one to prefer to spend time withdrawn and passive. This adaptation serves to enable intense re-evaluation and mistake-focused analysis, to isolate what exactly one might have done wrong to cause this large clash between one's value function and reality.
In the modern environment, these unpleasant states can often be chronic stress, success anxiety, the feeling of being left out of fully living, etc. If this is the case, enough (and/or intense enough) situations of failure (in one's career, socially, ...) can lead to this adaptive hypervigilance to mistakes and to rumination, mediated by ACh, as well as to an expectation of uncertainty.[13]
This increases one's focus on past mistakes, but also on repeated situations in which mistakes have occurred before. Since (as described above) this high-ACh state erodes confidence in top-down processing (such as values/goals/self-concepts), an observed situation such as an exam or a socially stressful encounter is already perceived as being "out of one's control": the person is less confident in their ability to impose their will on the situation, as opposed to the situation imposing its implications on their beliefs/aliefs.
This creates a positive feedback cycle of withdrawal, passivity, pessimism about one's own abilities, and so on.
This state seems consistent with the evolutionary explanation given later, but it usually turns into an inflexible and hard-to-escape attractor, making recovery quite hard. This may plausibly be explained by the fact that, in modern times, the specific "mistakes" leading into this cycle tend to be less tractable, or are amplified by comparison against a global pool of other humans.
In addition, the depressed state may in part be an adaptation to reduce dysphoria caused by constant prediction error. Specifically, as the world becomes perceived as unpredictable and uncontrollable, it is a simple fallback strategy to predict constant failure. While depression is often seen as a condition of intense suffering, dysphoria (the opposite of euphoria) is not a central symptom (as opposed to e.g. OCD or BPD). This may be because once one is already in a depressed state, the depression can become a sort of "comforting", predictable state, where at least the prediction "it will not get better" is getting confirmed by reality.
The lack of things (success, action, happiness, executive function) is easier to predict than their presence (including their presence to a normal degree - functioning existence is still more variable than a severely depressed state).
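A hedged back-of-the-envelope sketch of that claim (mine; the probabilities are invented): a life in which successes happen, but unpredictably, carries more irreducible surprise than a withdrawn life in which the pessimistic prediction "nothing will work" is confirmed almost every time.

```python
import numpy as np

rng = np.random.default_rng(0)

def avg_surprisal(p_success_world, p_success_model, n=100_000):
    """Average -ln q(o) of a model's predictions over outcomes drawn from the world."""
    outcomes = rng.random(n) < p_success_world
    q = np.where(outcomes, p_success_model, 1 - p_success_model)   # probability assigned to what happened
    return float(np.mean(-np.log(q)))

# Engaged life: successes happen sometimes, but unpredictably; the model is well calibrated.
print(avg_surprisal(p_success_world=0.4, p_success_model=0.4))

# Withdrawn, depressed regime: by not trying, failure becomes near-certain, and the
# pessimistic model is confirmed almost every time, so average surprisal is much lower.
print(avg_surprisal(p_success_world=0.02, p_success_model=0.02))
```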
How might this be escaped?
[Minimizing prediction error] can be achieved in any of three ways: first, by propagating the error back along cortical connections to modify the prediction; second, by moving the body to generate the predicted sensations; and third, by changing how the brain attends to or samples incoming sensory input.
from Interoceptive predictions in the brain
Using notation from Mirza et al. (2019)[14]
Used notation:
ε_t = prediction error
Π(o) = sensory precision (inverse variance)
Π(μ) = prior precision
ζ = log-precision; ACh increases ζ → Π(o) = exp(ζ)
γ = policy precision (dopaminergic inverse temperature)
η_eff = effective learning rate induced by precision
G(π) = expected free energy of policy π
Variational free energy for a generative model $p(o, s)$, with the posterior over hidden states approximated by a density $q(s)$, is:

$F = \mathbb{E}_{q(s)}\big[\ln q(s) - \ln p(o, s)\big]$

Under a Gaussian predictive-coding formulation, and with sensory prediction errors

$\varepsilon_t = o_t - g(\mu_t),$

free energy can be locally approximated as:

$F \approx \tfrac{1}{2}\,\varepsilon_t^{\top}\,\Pi(o)\,\varepsilon_t + \text{Complexity},$

where $\Pi(o)$ is the sensory precision (inverse covariance) at time $t$. "Complexity" collects the KL terms over states and any higher-level priors.

Gradient descent on $F$ yields the canonical update of the sufficient statistics $\mu$:

$\dot{\mu} \;\propto\; -\frac{\partial F}{\partial \mu} \;=\; \Pi(o)\,\varepsilon_t \;-\; \Pi(\mu)\,\varepsilon_{\mu},$

where $\varepsilon_{\mu}$ is the prediction error on the prior (taking $g$ as the identity for simplicity). Increasing $\Pi(o)$ steepens the contribution of sensory prediction errors and thus increases the effective learning rate $\eta_{\text{eff}}$, while increasing $\Pi(\mu)$ stabilises $\mu$ by tightening the priors.
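A minimal numerical sketch of that update (mine; the parameter values are arbitrary): gradient descent on the local Gaussian free energy for a single scalar state. Raising $\zeta$ (and hence $\Pi(o) = \exp(\zeta)$) makes the belief chase a surprising datum much harder, which is exactly the "effective learning rate" effect; raising the prior precision has the opposite, stabilising effect.

```python
import numpy as np

def update_mu(mu, o, mu_prior, zeta, prior_precision, steps=50, lr=0.05):
    """Gradient descent on the local Gaussian free energy for one scalar state:
    F ~ 0.5 * Pi_o * (o - mu)^2 + 0.5 * Pi_mu * (mu - mu_prior)^2   (g taken as identity)
    """
    pi_o = np.exp(zeta)                       # sensory precision, Pi(o) = exp(zeta)
    for _ in range(steps):
        eps_o = o - mu                        # sensory prediction error
        eps_mu = mu - mu_prior                # prior prediction error
        dF_dmu = -pi_o * eps_o + prior_precision * eps_mu
        mu -= lr * dF_dmu
    return round(mu, 3)

# A surprising observation (o = 1) against a prior belief centred at 0:
print(update_mu(mu=0.0, o=1.0, mu_prior=0.0, zeta=0.0, prior_precision=4.0))  # low ACh: barely moves
print(update_mu(mu=0.0, o=1.0, mu_prior=0.0, zeta=2.0, prior_precision=4.0))  # high ACh: tracks the datum
```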
Claim:
Acetylcholine primarily modulates the log-precision $\zeta$ on ascending prediction errors, so that

$\Pi(o) = \exp(\zeta),$

and high ACh corresponds to high sensory precision $\Pi(o)$, producing a high effective learning rate $\eta_{\text{eff}}$.

Catecholamines (especially dopamine) encode policy precision $\gamma$ and contribute to the stability of higher-level priors (increasing $\Pi(\mu)$). Policies are inferred via

$q(\pi) = \sigma\big(-\gamma\, G(\pi)\big),$

where $G(\pi)$ is the expected free energy of policy $\pi$ and $\sigma$ is a softmax.
Thus:
Depression is characterised by:

- high $\Pi(o)$ ($\zeta\uparrow$ via ACh),
- low $\Pi(\mu)$,
- low $\gamma$ (DA↓), though not extremely low, which would probably cause disorders of diminished motivation such as athymhormia[15].

This regime overweights bottom-up errors, underweights stabilising priors, and flattens the posterior over policies $q(\pi)$. Small mismatches produce large belief updates, leading to unstable self-models, helplessness, anhedonia, and rumination.
Mania is characterised by:

- low $\Pi(o)$ ($\zeta\downarrow$),
- high $\Pi(\mu)$,
- high $\gamma$ (DA↑).

Prediction errors are underweighted, priors and policies become over-precise, and $q(\pi)$ becomes sharply peaked. This suppresses corrective evidence and produces grandiosity, overconfidence, and reckless goal pursuit.
[source].
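To make the two regimes concrete, here is a hedged toy simulation (mine, not from the literature; all numbers are illustrative). The same two dials are combined: sensory precision (via $\zeta$) sets how volatile the self-model is under noisy evidence, and policy precision $\gamma$ sets how peaked the policy posterior is. The "depressed" setting yields a jittery self-model and a nearly flat policy distribution; the "manic" setting yields the reverse.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate(zeta, gamma, prior_precision, n=200):
    """Toy regime comparison: belief volatility under noisy evidence, plus policy entropy."""
    pi_o = np.exp(zeta)                              # Pi(o) = exp(zeta)
    k = pi_o / (pi_o + prior_precision)              # effective learning rate on evidence
    mu, trace = 0.0, []
    for _ in range(n):
        o = rng.normal(0.0, 1.0)                     # unbiased but noisy evidence about the self
        mu += k * (o - mu)                           # precision-weighted belief update
        trace.append(mu)
    G = np.array([1.0, 1.2, 1.5])                    # expected free energies of candidate policies
    q_pi = np.exp(-gamma * G)
    q_pi /= q_pi.sum()
    policy_entropy = -float(np.sum(q_pi * np.log(q_pi)))
    return round(float(np.std(trace)), 2), round(policy_entropy, 2)

print(simulate(zeta=2.0, gamma=0.2, prior_precision=1.0))   # "depressed": volatile beliefs, flat policies
print(simulate(zeta=-2.0, gamma=8.0, prior_precision=5.0))  # "manic": rigid beliefs, over-committed policies
```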
On ancestral timescales, encountering a persistent, catastrophic model failure (social defeat, resource collapse) justifies switching into a high‑ACh, high‑learning regime that suspends costly goal pursuit and reallocates compute to problem solving (analytic rumination), until a better policy emerges. The cost of false negatives (missing the hidden cause of a disaster) exceeded the cost of prolonged withdrawal; hence a design that forces extended search even when the cause is exogenous.
Hollon et al 2021 justifies long depressive episodes as evolutionarily adaptive because they force rumination & re-examination of the past for mistakes.
One might object that such rumination is merely harmful in many cases, like bereavement from a spouse dying of old age—but from the blackbox perspective, the agent may well be mistaken in believing there was no mistake! After all, an extremely bad thing did happen. So better to force lengthy rumination, just on the chance that a mistake will be discovered after all. (This brings us back to RL’s distinction between high-variance evolutionary/Monte Carlo learning vs smarter lower-variance but potentially biased learning using models or bootstraps, and the “deadly triad”.)
from 'Evolution as a backstop for Reinforcement Learning' by Gwern[16]
This is related to the model of depression as sickness behaviour: an adaptive behaviour caused by an increase in inflammatory cytokines (which are also implicated in depression)[17], causing social withdrawal, physical inactivity, and excessive sleep.
This might serve a dual role - giving the immune system the opportunity to focus on combating the pathogen in case of infection, and, when combined with increased ACh, allowing the mind to focus on ruminating about how one might have done things differently to avoid the failures/mistakes committed.
Depressed patients' sleep tends to have a higher proportion of REM sleep, and REM deprivation (REM-D) has been found to be an effective treatment for depression.[18] The standard medications for depression (SSRIs, SNRIs, DNRIs, MAOIs, ...) increase REM latency and shorten its duration (by preventing the decrease of monoamines necessary for REM sleep to occur), effectively creating REM sleep deprivation, which may be a possible mechanism of their effectiveness.[19] (Interestingly, the significantly reduced amount of REM sleep under SSRI usage doesn't seem to cause any severe cognitive side effects.)
How this relates to the theory:
REM sleep (when most dreaming occurs) is characterized by high ACh and relative monoaminergic silence (NE/5‑HT/DA strongly reduced). If ACh scales precision on ascending signals, what does it do in REM, when there is no external sensory stream? It amplifies the precision of internally generated activity, treating spontaneous (often memory-related) cortical/hippocampal patterns as high‑fidelity "data," while the weakened monoaminergic tone relaxes top‑down priors. Acetylcholine in REM sleep is theorized to function as follows:
"Cholinergic suppression of feedback connections prevents hippocampal retrieval from distorting representations of sensory stimuli in the association neocortex".
This seems to suggest that REM sleep functions as the stage of sleep in which most new/prior memories are not consolidated (as happens in slow-wave sleep); rather, space is given for "learning" new (synthetic) information without interference from existing models. This happens during waking life when ACh is high, but during dreaming the process is radically upregulated, while external stimuli are absent. (Karl Friston explains this as REM sleep portraying the basal "theatre of perception": in waking life it updates based on sensory information, but during dreaming the generative "virtual reality model" exists by itself, to be refined for the next time it is used for waking perception.)[20]
In Active Inference terms, REM is a regime where precision on ascending (internally generated) errors is high and priors are pliable; the model explores and re‑parameterizes itself by fitting narratives to internally sampled data. If the waking baseline is already ACh‑biased and monoamine‑depleted (the depressed phenotype), REM further erodes stabilizing priors about one's values and self. If REM sleep dominates over slow-wave sleep, more space is given to increasing uncertainty related to dream subjects (which may be related to the previous day's experiences), rather than to consolidating existing priors.[21]
Acetylcholine causes updating based on prediction errors - learning occurs in uncertain situations, when the agent needs to be hyperaware of possible mistakes that are expected to happen (or have happened, as in the case of rumination).[23] Long-term potentiation (LTP) or long-term depression (LTD) become more likely to occur in existing synaptic connections.[24]
BDNF, on the other hand, stimulates the creation of entirely new synapses and maintains the survival of existing neurons, such as in the hippocampus. BDNF expression tends to be decreased in depressed individuals, and hippocampal volume usually seems to be lowered.
This enables the emergence of "local plasticity" leading into the depression-phenotype attractor state, while "global plasticity" is lowered. Synapses in the hippocampus die, while a small subset gets continually amplified.
In FEP terms, the type of learning that's facilitated by BDNF might be structure learning, specifically Bayesian model expansion[25], though I have not read much about this.
It seems like Sequences-style epistemic rationality favours a state similar to the high-ACh state described above. There appears to be a divide between the Rationalist and the Bay-area-startup-founder archetypes, the former of which is notably identified with the "doomer" label, while the latter wishes to "accelerate" without worrying about risks.
In addition, it seems like many of those closest to the former camp tend to either become disillusioned with their work (such as MIRI dissolving its research team) or switch into the other camp, starting work in AI capabilities research (thus moving right on the diagram).
While I don't have actual data, it anecdotally seems to me that depression is quite common among LessWrongers and is to some extent connected to the emphasis on careful epistemic rationality (through the relative downregulation of policy precision and upregulation of ascending-information precision).
It would be foolish to propose taking anticholinergics and dopaminergics because of this; rather, it seems good to be aware of the potential emotional fragility of a highly cautious, high-learning-rate state, and of the tradeoff that might exist between motivation (limbic dopamine activity driving up policy precision) and learning rate - amphetamines may not necessarily make you smarter/wiser.
Most importantly: avoid nootropics such as acetylcholinesterase inhibitors (huperzine, galantamine), piracetam, Alpha-GPC, CDP choline, ... (anything cholinergic) when depressed.
Potentially effective alternative nootropics:
Targeting ACh receptors - anticholinergics:
Targeting sleep - REM deprivation:
Targeting sickness behaviour - anti-inflammatory drugs:
Targeting dopamine:
Targeting BDNF:
...and many, many others.
The topic of why SSRIs and other serotonergics work is very vast; it may involve indirect increase of a neuroplastogen, increase in the GABAergic neurosteroid allopregnanolone, decrease in inflammatory cytokines, 1A serotonin receptor activation decreasing substance P release, sigmaergic agonism, nitric oxide inhibition, REM sleep suppression (mentioned in text), and many more.
Ketamine seems to act by redirecting glutamate from NMDA receptors (which it blocks) to AMPA receptors (which upregulate neuroplasticity). Glutamate in general is quite important in depression.
The HPA axis, specifically overproduction of CRF, which promotes cortisol release, is really important to depression too.[39]
The body also has its own endogenous "dysphoriant" (unpleasant-qualia-inducer), called dynorphin, which is quite understandably linked to depression.[40]
The trace amine phenethylamine (PEA) also seems to be implicated in depression[41]; it acts as a sort of endogenous amphetamine and is increased in schizophrenia[42], so maybe it plays a large part in what I attribute to dopamine in this post. Selegiline, mentioned above, inhibits its breakdown.
That is to say, acetylcholine and dopamine are far from being the sole factors in depression, and targeting them may mean not targeting the central factor in some people's depression. Nonetheless, it seems useful not to ignore this model when thinking about depression, as some high-impact interventions that are otherwise ignored depend on this model (or the underlying True Mechanism).
SSRIs increase intersynaptic serotonin acutely, but take weeks to have an effect - there must be something other than a serotonin increase going on.
Janowsky et al. (1972)
more detail in: The Role of Acetylcholine Mechanisms in Affective Disorders
https://en.wikipedia.org/wiki/Disorders_of_diminished_motivation, Athymhormia being a severe variant, where motivation is so low it destroys even motivation to move (https://en.wikipedia.org/wiki/Athymhormia)
An interesting anecdotal report of @Emrik using these facts about REM sleep to increase their sleep efficiency.
unlikely the reader hasn't already heard of this research...