
GRPO is terrible

2025-12-02 11:43:59

Published on December 1, 2025 10:54 PM GMT

An on-policy, sample-efficient NLP post-training approach not requiring verification

The current state

I once tried teaching a simulated 3D humanoid how to dunk[1] using RL - truly the most effective use of my time. If I took a single thing away from that, it was that designing the best reward function is equivalent to believing your agent is conscious and has the single goal of destroying your project.

My point is, RL is already terrible in practice. Then additionally throwing intermediate rewards out the window and overly relying on the most inefficient part of modern LLMs, their autoregressive inference[2], doesn't exactly seem like the play - somehow it is though.

The first attempts did try exactly this - basically an additional model that takes in one 'reasoning step' and spits out a number - the reward. The problem is that we simply don't have any pretraining data for such a reward model. Generating your own data is expensive[3] and not comparable to any pretraining scale.

There's also a whole different problem - expecting such a model to be feasible in the first place[4]: even humans very much struggle with rating whether a particular step shows promise - in fact, it would make the life of a researcher much more trivial if at any point in time he could simply make such an objective and accurate assessment without actually having to work through the idea.[5]

Nevertheless, disregarding GRPO, the average pipeline was just copying the pretrained LLM, slicing off the predictor head, doing some basic finetuning and calling it a brand-new reward model. This works out "okay"-ish for around 100 training steps[6] but once a significant enough distribution shift occurs in the actor, the shallow understanding of the reward model is revealed.

In contrast to all these RL approaches stands normal finetuning - yet this seems to only lead to very shallow changes like formatting & vocabulary and at best growing knowledge, but not anything we would normally describe as proper learning. It seems that the on-policy attribute of approaches like GRPO reduces these superficial changes and therefore focuses learning on more subtle differences.

Distillation seems to perform slightly better in that regard, even though teacher-forcing is basically always used - it might be the case that being off-policy can be compensated for if the data at least incorporates distributions similar to those seen at inference time, i.e. traces that make mistakes and then fix them rather than presenting a perfect solution.

This leaves us in an awkward position:

  • GRPO needs a ton of rollouts and even after that still at best gets a noisy signal. This might be good enough for converging down the distribution of the base model but any learning beyond that seems very unrealistic.
  • Reward model approaches are in theory more sample-efficient but in practice we neither have the data nor do we know that such a model constrained to approximately the same compute as the actor in this setting is even conceivable.
  • Finetuning is a lot more efficient than either of these RL approaches but seems to be constrained to superficial changes.
  • Distillation is also very efficient and can learn beyond superficial information but generally requires a more powerful teacher to function.

We would like an algorithm that is sample-efficient, on-policy and doesn't require any additional models - and while we are at it, why not desire natively supporting non-verifiable output formats[7] as well?

Approach

If reward models in NLP fail because we simply try to adapt the base model to an ill-suited task with little data, why not just stick to what it is actually good at: predicting the next token. Distillation uses this, trying to pass down prediction capabilities, often even forming the loss between the logits rather than just the sampled token, providing an incredibly rich signal as a result. But if we don't have such a bigger model, where would the teacher get its nuance from?

Well, if the model weights are the same, the only difference could be the input - we would need to supply the teacher model with additional information that would reduce the complexity of the task. In its most extreme version, this would simply be a perfect solution to the problem.

To remain in the distributions seen at inference, we additionally need something like student-forcing. Lastly, we need a mechanism that stops "impossible knowledge" from being transmitted into the gradient - the teacher model directly knows the perfect solution, but magically stumbling on this solution before even properly analyzing the problem won't lead to better performance once this knowledge is gone.

It's time to put this into more concrete terms:

You have two prompts, p and p+ - p is a normal Chain-of-Thought prompt with the problem, while p+ supplies both the problem and a solution, asking the model to attempt the problem normally and only use the solution as hint/help.

You do inference with p, generating g_CoT. This results in [p, g_CoT] and [p+, g_CoT][8]. You now do distillation over g_CoT[9], with the teacher computing logits using [p+, g_CoT] and the student computing logits using [p, g_CoT]; call the logits l_T and l_S respectively.

Finally, to block "impossible knowledge", we choose an aggregation of both l_T and l_S as the actual target for the student. For example, the teacher's distribution can be gated by the student's own distribution under a temperature constant τ - it makes sense to choose τ such that the target stays close to the student's own distribution in the typical case.

This aggregation basically turns the teacher into a helping hand, only reaching out and intervening when the student is getting severely off track and never giving the student a solution it isn't already seriously considering itself.
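To make this concrete, here is a minimal PyTorch-style sketch of one training step under the setup above. The gating rule shown is only one plausible instantiation of such an aggregation, and all names (p_ids, p_plus_ids, g_ids, tau, the HuggingFace-style .logits interface) are illustrative assumptions rather than the original implementation:

```python
import torch
import torch.nn.functional as F

def self_distillation_step(model, p_ids, p_plus_ids, g_ids, tau=1.0):
    """One training step of the proposed self-distillation (illustrative sketch).

    p_ids      : token ids of the plain CoT prompt p              (1, p_len)
    p_plus_ids : token ids of the hinted prompt p+ (problem + sol) (1, pp_len)
    g_ids      : token ids of the student's own rollout g_CoT      (1, g_len)
    """
    g_len = g_ids.size(-1)

    # Teacher pass: same weights, but conditioned on the hinted prompt p+.
    with torch.no_grad():
        teacher_in = torch.cat([p_plus_ids, g_ids], dim=-1)
        # Logits at position t predict token t+1, so the logits that predict
        # the g_CoT tokens sit one position before each g_CoT token.
        l_T = model(teacher_in).logits[:, -g_len - 1:-1, :]

    # Student pass: conditioned on the plain prompt p; gradients flow here.
    student_in = torch.cat([p_ids, g_ids], dim=-1)
    l_S = model(student_in).logits[:, -g_len - 1:-1, :]

    # "Helping hand" aggregation (one possible choice): gate the teacher's
    # distribution by the student's own tempered distribution, so the target
    # never puts mass on tokens the student isn't already considering.
    gate = F.softmax(l_S / tau, dim=-1)
    mixed = F.softmax(l_T, dim=-1) * gate
    target = mixed / mixed.sum(dim=-1, keepdim=True)

    # Distillation loss over g_CoT only (prompt positions are excluded by the slicing).
    loss = F.kl_div(F.log_softmax(l_S, dim=-1), target.detach(), reduction="batchmean")
    return loss
```

Under this particular choice, τ controls how readily the teacher can override the student: a small τ collapses the gate onto the student's top tokens, while a larger τ lets the teacher intervene more.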

[Note: Interactive visualizations were here in the original post but cannot be embedded in LessWrong]

A metaphor for how the token mixing should behave in a real setting

Notes

There is one major problem with this approach - it requires a model that is already powerful, i.e. something upwards of 20B params. Anything below that can't be expected to properly follow the teacher-prompt to a reasonable degree and leverage the solution intelligently, as opposed to just blatantly copying it or completely forgetting about it 1k tokens in. This might not sound like a problem directly, but it does become one once you understand that I have a total of $0 in funding right now.

If anybody with access to a research cluster would be interested in trying this approach on a sufficient scale, I would be more than happy to give it a go - I even have the code already written from some toy tests for this.

On another note, you can apply this aggregation during inference for g_CoT already - this is useful for very hard problems[10], as it keeps g_CoT close to a reasonable approach so that actual learning can happen afterwards. To be precise, during inference you would do two forward passes per token, compute the aggregated target distribution right away, and sample the next token from it - essentially a mix between teacher-forcing and student-forcing.
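A sketch of that inference-time variant, under the same illustrative assumptions as the training step above - at each step both contexts are advanced with the sampled token, so the rollout stays anchored to the hinted prompt without ever being pure teacher-forcing:

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def guided_decode(model, p_ids, p_plus_ids, max_new_tokens=512, tau=1.0):
    """Sample g_CoT from the aggregated distribution (mix of student- and teacher-forcing)."""
    student_ctx, teacher_ctx = p_ids.clone(), p_plus_ids.clone()
    generated = []
    for _ in range(max_new_tokens):
        l_S = model(student_ctx).logits[:, -1, :]   # next-token logits given the plain prompt p
        l_T = model(teacher_ctx).logits[:, -1, :]   # next-token logits given the hinted prompt p+
        mixed = F.softmax(l_T, dim=-1) * F.softmax(l_S / tau, dim=-1)
        probs = mixed / mixed.sum(dim=-1, keepdim=True)
        next_tok = torch.multinomial(probs, num_samples=1)
        generated.append(next_tok)
        # Append the shared token to both contexts so they stay in sync.
        student_ctx = torch.cat([student_ctx, next_tok], dim=-1)
        teacher_ctx = torch.cat([teacher_ctx, next_tok], dim=-1)
    return torch.cat(generated, dim=-1)  # this g_CoT then feeds the training step above
```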

Another question is the data - one of the advantages of GRPO is that it required no stepwise solutions anymore, only a final verifiable answer. We could of course just generate a bunch of solutions and, using the verifiable answer, turn them into our own stepwise solutions[11] - this would still have a significantly higher sample efficiency than GRPO, since the signal we get from one trace is token-wise logit-based targets, unimaginably more dense than a single coefficient indicating whether to increase or decrease the probability of the whole trace.

But I think this approach especially shines in settings with no verifiable answers - which is practically everything if we zoom out. One could imagine a company like OpenAI having tons and tons of chats where users iterate on one initial demand with the chatbot; RL approaches or finetuning can't make any use of this at all. This approach, on the other hand, can simply accept the final output that the user seemed content with as the solution and start this self-distillation from the beginning of the conversation, while the logit rebalancing takes care of information not yet revealed. And the best thing - all the autoregressive inference has already been done; this training will be stupidly fast.


Footnotes

  1. yes, the basketball kind ↩︎

  2. parallelizing this inference, as GRPO does, alleviates the damage but doesn't erase it ↩︎

  3. even when attempting novel schemas like incorporating binary search https://arxiv.org/pdf/2406.06592 - very cool paper ↩︎

  4. given the unspoken constraint that compute for the reward model ≈ compute for the LLM ↩︎

  5. This does seem to manifest in experts to some degree through intuition, which can be very powerful, but it's just as common that two experts' intuitions completely oppose each other ↩︎

  6. if the finetuning data is good enough, which always goes hand in hand with an absurd amount of compute spent on it ↩︎

  7. non-verifiable in this context doesn't speak to some impossibility of determining correctness but simply the feasibility of it - checking whether a multiple choice answer is correct is trivial, but what about a deep research report? ↩︎

  8. [X,Y] simply means Y appended to X here, basically just think of pasting the generated tokens under both prompts, respectively ↩︎

  9. by this I mean masking out anything else than g_CoT for the gradient ↩︎

  10. where something like GRPO would utterly fail ↩︎

  11. which seem to perform a lot better than human-written ones ↩︎




Metric-haven (quick stats on how Inkhaven impacted LessWrong)

2025-12-02 11:31:14

Published on December 2, 2025 3:31 AM GMT

The data here only reflects posting activity on LessWrong itself.

In 2021, the admins of LessWrong had the idea that we'd pay people to write book reviews. In 2025, we had a much better idea: people would pay us to write all kinds of posts!

I think this went pretty well, final determination pending, but in the meantime I can say the numbers have been impacted. That'll be no surprise to those regularly checking the site.

The number of posts increased by 57% (477 → 749) and number of words by 45% (1.0M → 1.46M). The increases were driven by 21 people officially involved in Inkhaven (residents, coaches, contributing writers) and 3 copycats[1] I identified by the numbers and their written intention to participate.

Curiously, the large boost to LessWrong was effected by only a handful of writers posting ~daily to the site. Per the Inkhaven blogroll, most writers published on Substack.

I believe beyond the three copycats on LessWrong, others expressed an intention to blog daily but did so on blogs elsewhere. Lorxus participated in Halfhaven and posted weekly roundups of their posts on LessWrong, but those don't count towards the totals here.

21% of the Inkhaven wordcount on LessWrong came from the LessWrong team. 79% came from others!

Ok, but what of quality? Karma ("baseScore") is the perfect measure of that. The good news is that non-Inkhaven-participant karma only declined from 16 to 13 at the median.

There's more interpretation to be done here but I'm out of time. Such is the Inkhaven way. (This post began as an attempt by me to submit another Inkhaven post myself, but also, it's all graphs and not words!)

 

  1. ^

    I use the term affectionately.




MIRI’s 2025 Fundraiser

2025-12-02 09:53:30

Published on December 2, 2025 1:53 AM GMT

MIRI is running its first fundraiser in six years, targeting $6M. The first $1.6M raised will be matched 1:1 via an SFF grant. Fundraiser ends at midnight on Dec 31, 2025. Support our efforts to improve the conversation about superintelligence and help the world chart a viable path forward.

MIRI is a nonprofit with a goal of helping humanity make smart and sober decisions on the topic of smarter-than-human AI.

Our main focus from 2000 to ~2022 was on technical research to try to make it possible to build such AIs without catastrophic outcomes. More recently, we’ve pivoted to raising an alarm about how the race to superintelligent AI has put humanity on course for disaster.

In 2025, those efforts focused around Nate Soares and Eliezer Yudkowsky’s book (now a New York Times bestseller) If Anyone Builds It, Everyone Dies, with many public appearances by the authors; many conversations with policymakers; the release of an expansive online supplement to the book; and various technical governance publications, including a recent report with a draft of an international agreement of the kind that could actually address the danger of superintelligence.

Millions have now viewed interviews and appearances with Eliezer and/or Nate, and the possibility of rogue superintelligence and core ideas like “grown, not crafted” are increasingly a part of the public discourse. But there is still a great deal to be done if the world is to respond to this issue effectively.

In 2026, we plan to expand our efforts, hire more people, and try a range of experiments to alert people to the danger of superintelligence and help them make a difference.

To support these efforts, we’ve set a fundraising target of $6M ($4.4M from donors plus 1:1 matching on the first $1.6M raised, thanks to a $1.6M matching grant), with a stretch target of $10M ($8.4M from donors plus $1.6M matching).

Donate here, or read on to learn more.


The Big Picture

As stated in If Anyone Builds It, Everyone Dies:

If any company or group, anywhere on the planet, builds an artificial superintelligence using anything remotely like current techniques, based on anything remotely like the present understanding of AI, then everyone, everywhere on Earth, will die.

We do not mean that as hyperbole. We are not exaggerating for effect. We think that is the most direct extrapolation from the knowledge, evidence, and institutional conduct around artificial intelligence today. In this book, we lay out our case, in the hope of rallying enough key decision-makers and regular people to take AI seriously. The default outcome is lethal, but the situation is not hopeless; machine superintelligence doesn't exist yet, and its creation can yet be prevented.

The leading AI labs are explicitly rushing to create superintelligence. It looks to us like the world needs to stop this race, and that this will require international coordination. MIRI houses two teams working towards that end:

  1. A communications team working to alert the world to the situation.
  2. A governance team working to help policymakers identify and implement a response.

Activities

Communications

If Anyone Builds It, Everyone Dies has been the main recent focus of the communications team. We spent substantial time and effort preparing for publication, executing the launch, and engaging with the public via interviews and media appearances.

The book made a pretty significant splash.

The end goal is not media coverage, but a world in which people understand the basic situation and are responding in a reasonable, adequate way. It seems early to confidently assess the book's impact, but we see promising signs.

The possibility of rogue superintelligence is now routinely mentioned in mainstream coverage of the AI industry. We’re finding in our own conversations with strangers and friends that people are generally much more aware of the issue, and taking it more seriously. Our sense is that as people hear about the problem through their own trusted channels, they are more receptive to concerns.

Our conversations with policymakers feel meaningfully more productive today than they did a year ago, and we have been told by various U.S. Members of Congress that the book had a valuable impact on their thinking. It remains to be seen how much this translates into action. And there is still a long way to go before world leaders start coordinating an international response to this suicide race.

Today, the MIRI comms team comprises roughly seven full-time employees (if we include Nate and Eliezer). In 2026, we’re planning to grow the team. For example:

  • We need someone whose job is to track AI developments and how the global conversation is responding to those developments, and help coordinate a response.
  • We need someone to assess and measure the effectiveness of various types of communications and arguments, and notice what’s working and what’s not.
  • We need someone to track and maintain relationships with various colleagues and allies (such as neighboring organizations, safety teams at the labs, journalist contacts, and so on) and make sure the right resources are being deployed at the right times.

We will be making a hiring announcement soon, with more detail about the comms team’s specific models and plans. We are presently unsure (in part due to funding constraints/budgetary questions!) whether we will be hiring one or two new comms team members, or many more.

Going into 2026, we expect to focus less on producing new content, and more on using our existing library of content to support third parties who are raising the alarm about superintelligence for their own audiences. We also expect to spend more time responding to news developments and taking advantage of opportunities to reach new audiences.

Governance

Our governance strategy primarily involves:

  1. Figuring out solutions, from high-level plans to granular details, for how to effectively halt the development of superintelligence.
  2. Engaging with policymakers, think tanks, and others who are interested in developing and implementing a response to the growing dangers.

There's a ton of work still to be done. To date, the MIRI Technical Governance Team (TGT) has mainly focused on high-level questions such as "Would it even be possible to monitor AI compute relevant to frontier AI development?" and "What would an international halt to the superintelligence race look like?" We're only just beginning to transition into more concrete specifics, such as writing up A Tentative Draft of a Treaty, with Annotations, which we published on the book website to coincide with the book release, followed by a draft international agreement.

We plan to push this a lot further, and work towards answering questions like:

  • What, exactly, are the steps that could be taken today, assuming different levels of political will?
  • If there is will for chip monitoring and verification, what are the immediate possible legislative next-steps? What are the tradeoffs between the options?
  • Technologically, what are the immediate possible next steps for, e.g., enabling tamper-proof chip usage verification? What are the exact legislative steps that would require this verification?

We need to extend that earlier work into concrete, tractable, shovel-ready packages that can be handed directly to concerned politicians and leaders (whose ranks grow by the day).

To accelerate this work, MIRI is looking to support and hire individuals with relevant policy experience, writers capable of making dense technical concepts accessible and engaging, and self-motivated and competent researchers.[1]

We’re also keen to add additional effective spokespeople and ambassadors to the MIRI team, and to free up more hours for those spokespeople who are already proving effective. Thus far, the bulk of our engagement with policymakers and national security professionals has been done either by our CEO (Malo Bourgon), our President (Nate Soares), or the TGT researchers themselves. That work is paying dividends, but there’s room for a larger team to do much, much more.

In our conversations to date, we’ve already heard that folks in government and at think tanks are finding TGT’s write-ups insightful and useful, with some calling it top-of-its-class work. TGT’s recent outputs and activities include:

  • In addition to collaborating with Nate, Eliezer, and others to produce the treaty draft, the TGT has further developed this document into a draft international agreement, along with a collection of supplementary posts that expand on various points.
  • The team published a research agenda earlier this year. Much of their work (to date and going forward) falls under this agenda, which is further explored in a number of papers digging into various specifics. TGT has also participated in relevant conferences and workshops, and has been supervising and mentoring junior researchers through external programs.
  • TGT regularly provides input on RFCs and RFIs from various governmental bodies, and engages with individuals in governments and elsewhere through meetings, briefings, and papers.
  • Current efforts are mostly focused on the U.S. federal government, but not exclusively. For example, in 2024 and 2025, TGT participated in the EU AI Act Code of Practice Working Groups, working to make EU regulations more likely to be relevant to misalignment risks from advanced AI. Just four days ago, Malo was invited to provide testimony to a committee of the Canadian House of Commons; and TGT researcher Aaron Scher was invited to speak to the Scientific Advisory Board of the Secretary-General of the UN on AI verification as part of an expert panel.

The above isn’t an exhaustive description of what everyone at MIRI is doing; e.g., we continue to support a small amount of in-house technical alignment research.

As noted above, we expect to make hiring announcements in the coming weeks and months, outlining the roles we’re hoping to add to the team. But if your interest has already been piqued by the general descriptions above, you’re welcome to reach out to [email protected]. For more updates, you can subscribe to our newsletter or periodically check our careers pages (MIRI-wide, TGT-specific).


Fundraising

Our goal at MIRI is to have at least two years’ worth of reserves on hand. This enables us to plan more confidently: hire new staff, spin up teams and projects with long time horizons, and balance the need to fundraise with other organizational priorities. Thanks to generous support we received in 2020 and 2021, we didn’t need to run any fundraisers in the last six years.

We expect to hit December 31st having spent approximately $7.1M this year (similar to recent years[2]), and with $10M in reserves if we raise no additional funds.[3]

Going into 2026, our budget projections have a median of $8M[4], assuming some growth and large projects, with large error bars from uncertainty about the amount of growth and projects. On the upper end of our projections, our expenses would hit upwards of $10M/yr.

Thus, our expected end-of-year reserves put us $6M shy of our two-year reserve target of $16M.

This year, we received a $1.6M matching grant from the Survival and Flourishing Fund, which means that the first $1.6M we receive in donations before December 31st will be matched 1:1. We will only receive the grant funds if they are matched by donations.

Therefore, our fundraising target is $6M ($4.4M from donors plus 1:1 matching on the first $1.6M raised). This will put us in a good place going into 2026 and 2027, with a modest amount of room to grow.

It’s an ambitious goal and will require a major increase in donor support, but this work strikes us as incredibly high-priority, and the next few years may be an especially important window of opportunity. A great deal has changed in the world over the past few years. We don’t know how many of our past funders will also support our comms and governance efforts, or how many new donors may step in to help. This fundraiser is therefore especially important for informing our future plans.

We also have a stretch target of $10M ($8.4M from donors plus the first $1.6M matched). This would allow us to move much more quickly on pursuing new hires and new projects, embarking on a wide variety of experiments while still maintaining two years of runway.

For more information or assistance on ways to donate, view our Donate page or contact [email protected].


The default outcome of the development of superintelligence is lethal, but the situation is not hopeless; superintelligence doesn't exist yet, and humanity has the ability to hit the brakes.

With your support, MIRI can continue fighting the good fight.

Donate Today

  1. ^

    In addition to growing our team, we plan to do more mentoring of new talent who might go on to contribute to TGT's research agenda, or who might contribute to the field of technical governance more broadly.

  2. ^

    Our yearly expenses in 2019–2024 ranged from $5.4M to $7.7M, with the high point in 2020 (when our team was at its largest), and the low point in 2022 (after scaling back).

  3. ^

    It’s worth noting that despite the success of the book, book sales will not be a source of net income for us. As the authors noted prior to the book’s release, “unless the book dramatically exceeds our expectations, we won’t ever see a dime”. From MIRI’s perspective, the core function of the book is to try to raise an alarm and spur the world to action, not to make money; even with the book’s success to date, the costs to produce and promote the book have far exceeded any income.

  4. ^

    Our projected expenses are roughly evenly split between Operations, Outreach, and Research, where our communications efforts fall under Outreach and our governance efforts largely fall under Research (with some falling under Outreach). Our median projection breaks down as follows: $2.6M for Operations ($1.3M people costs, $1.2M cost of doing business), $3.2M Outreach ($2M people costs, $1.2M programs), and $2.3M Research ($2.1M people costs, $0.2M programs). This projection includes roughly $0.6–1M in new people costs (full-time-equivalents, i.e., assuming the people are not all hired on January 1st).

    Note that the above is an oversimplified summary; it's useful for high-level takeaways, but for the sake of brevity, I've left out a lot of caveats, details, and explanations.




Everyone Can Be High Status In Utopia

2025-12-02 07:43:25

Published on December 1, 2025 11:43 PM GMT

In Rubber Souls, Bjartus Tomas argues that we can have cruelty-free status games by creating underpeople without moral worth, perhaps because they are non-conscious, to serve as our permanent underclass. This removes the current problem where some poor bugger has to be at the bottom of the barrel, or the bottom quarter or half and so forth.

Needless to say, I approve. 

But I think it's worth fleshing out a bit why this is possible, and why you won't wind up with everyone associating humans as a high status source of esteem as underpeople as a low status source. 

We build our sense of self-worth not from some abstract global ranking, such as percentile ranking of wealth or h-index, but through comparing ourselves to people in our social circles. So in the glorious transhumanist future where the labs somehow avoid flubbing it, Dario Amodei may be God-Emperor of the universe, but as long as he's far from your social graph, and your social graph has nary a whisper of him, then you're not likely to compare yourself with him and feel low self-esteem. 

More generally, I expect the far future to have fewer global status rankings because I expect everything to run at much higher speeds, making fixed travel times feel proportionally longer. If a mind ran 1,000,000,000x faster than us, for it light would only travel a foot per subjective second, or a measly 1 km in a subjective hour. Which means it's harder to communicate and coordinate across the total span of human civilization, resulting in smaller, disjoint cultures with their own local hierarchies.
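A quick back-of-the-envelope check of those figures, assuming the speedup factor is 10^9:

$$1\ \text{subjective s} = 10^{-9}\ \text{s (real)} \;\Rightarrow\; c \cdot 10^{-9}\ \text{s} \approx 0.3\ \text{m} \approx 1\ \text{foot}$$

$$1\ \text{subjective hour} = 3600 \times 10^{-9}\ \text{s} = 3.6\ \mu\text{s} \;\Rightarrow\; c \cdot 3.6\ \mu\text{s} \approx 1.1\ \text{km}$$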

Secondly, humans have very coarse personhood detectors. Never mind AIs like GPT-4o - we've even treated animals and inanimate objects as people in the past. It's just quite easy to convince our brains that a non-human entity is a person. This makes sense; how could evolution have encoded something as complex as a human detector when building our social reward circuitry? No, it had to make do with mere proxies instead.

And that means there are bound to be very strange entities we can construct that would count as valid sources of status in the future. Yes, stranger even than LLMs. And probably more effective than humans, to boot. 

So if we build these super-stimuli status sirens, these underpeople forming our permanent underclass - who, might I add, could well be pleased to be permanent yes-men, for mind design space is wide - could we truly say we wouldn't gladly partake of them?

Yes, perhaps we would view associating with them as abhorrent at first. But for those who are stuck at the bottom of the status hierarchies of utopia, their need for social esteem would compel them to at least try it out. Then, deep in the midst of the underpeople's social scene, would it truly seem so bad? I think not. 




How to Write Fast, Weird, and Well

2025-12-02 05:30:51

Published on December 1, 2025 9:30 PM GMT

Reposting some of my best Inkhaven posts here. Will do so at a 1-2x/week cadence unless anybody objects. I'll start with the first post[1], about writing tips for myself and others:


Today is my first day at Inkhaven! It’s a program where I try to publish a blog post every day for all of November.

To set the stage for the rest of the program, I offer up a loosely organized list of writing advice I wished my younger self would’ve learned earlier. I intend to refer back to this advice often over the course of the program (and hopefully in future years). I’m vaguely optimistic that it can also be helpful to some other participants in my program, and maybe to other aspiring writers as well.

Source: inkhaven.blog

Core Principles: Write Fast, Write Weird, Get Feedback

Write like you’re running out of time

Why do you write like you’re running out of time? (Hey!)

Write day and night like you’re running out of time? (Hey!)

Ev’ry day you fight, like you’re running out of time

Keep on fighting. In the meantime—

Hamilton: Non-Stop

The most important thing to do as a budding writer is to write a lot. Write a lot very, very, fast. Write a lot. If the goal of writing a lot is impeded by other advice on this list, you[1] should heed the other advice too, but in general, prioritize writing a lot over other goals. For example, if you don't have surprising things to say, write a lot anyway.

Who knows? Maybe you misjudged your audience and your “obvious points about field/question X” is actually novel to many readers. Probably not. But it’s possible!

Publish a lot, too. Many positive feedback loops only come into play once you hit “publish”, and some from the intention to publish and taking yourself seriously as a fast blogger.


Try to say things people don’t expect

Many writers enjoy writing what readers expect them to say and re-emphasizing the exact same points over and over again in slightly different ways. For people who want to do this and can do this well (I enjoy Matt Yglesias and Bentham’s Bulldog as two examples of fulfilling this niche), more power to them!

But I at least a) know I’d be bored writing the same thing over and over, and b) realistically, know I’ll suck at it, compared to masters of the craft who live and breathe repetition in their field of interest.

So my comparative advantage is to write a wide breadth of articles, and be surprising in (almost) all of them.

This should be true recursively at all levels:

  1. Given the previous body of work, try to pick topics to write about that are surprising to your audience.
    1. Try to not get pigeonholed as the “Effective Altruism (EA) guy”, or the “econ reasoning guy”, or the “book reviews guy” or the “anthropics guy”
  2. Given the topic in question, your typical audience member should not be able to predict your conclusion based on the topic and a basic amount of knowledge about who you are and your past positions.
  3. Your arguments should be surprising
    1. Somebody who has not independently come to the same conclusions as you should not be able to predict the outline/structure of your arguments based on just your choice of topic and your conclusion
  4. Specific examples should be surprising
    1. This goes across both the field you're talking about (if every other post on cognitive biases starts with Lindy the feminist bank-teller, avoid Lindy the feminist bank-teller like a MAGA bank) and across your own past catalog of writing.
      1. Some people have a folksy charm of reusing the same 1-3 examples over and over again in every blog post, paper, or speech.
      2. I refuse.
  5. Specific jokes should be surprising
    1. This is probably obvious but bears repeating. Surprise is one of the most important elements of humor, and good jokes get ruined if they’re telegraphed too much.
  6. Your conclusion can be unsurprising, given the rest of the post
    1. A good argument naturally flows to a desired conclusion. So conclusions don’t need to be surprising
    2. On the flip side, do you even need a conclusion? Respect your reader’s time. If the conclusion is obvious, it might be better to leave it unsaid!

But in the service of saying surprising things, don’t say false things. It’s easy to be “surprising” or high-entropy simply by not having a restriction on “truth” capping what you can say, or via lacking a detailed world model. Good nonfiction writing should avoid lying, by either commission or omission.

Don’t say random impertinent things, either (#sorandom). Instead, aim to say things that in retrospect would come across to readers as surprising but inevitable.2

Figure out ways to get fast + reliable feedback

Try to figure out ways to get fast, regular, and reliable feedback. From yourself, AIs, early readers, and eventually mentors and intended readers. Like almost all other areas of human endeavor, one of the best ways to improve is via deliberate practice, and one of the cornerstones of deliberate practice is immediate and reliable feedback.

Unfortunately, when you first start out, all your feedback methods suck. Your intuitions suck (unless you are a writing god, which is unlikely). Your AIs will be pretty miscalibrated on what your intended outcome is (plus they’ve been primed by your early drafts, which suck). Your readers will suck. You probably don’t have mentors. You certainly aren’t getting the readers you want. Oops.

But actually, this is okay! Fast biased feedback can still help you improve!

For example, one of the earliest ways I remember improving as a writer is via writing lots of Quora answers (starting back ~2014 when Quora was halfway decent). The training signal I got on which answers were good was a combination of my own thoughts, Quora upvotes and downvotes, and the occasional high-quality comment. Was the feedback great? Of course not! But nonetheless, the early feedback got the ball rolling, and over the course of months and years, I learned to write more quickly, pay attention to what members of my audience like, structure my arguments better, have more engaging examples, etc, etc. All good things!

Many people have the (false) belief that GIGO (“garbage in, garbage out”) is a fundamental law of nature. I instead think of it as a weak heuristic that is frequently wrong. When it comes to writers and writing, compressed garbage can (sometimes) instead become diamonds.

So all in all, it’s very important to continuously write for an audience, and write in a manner that allows you to collect a ton of feedback from yourself and others. Don’t be afraid to (mostly) ignore the bad feedback3, but treasure all the feedback you receive regardless, and create gradients that make it easier for others to give you feedback, and for you to receive it.

A relevant and related model is the generator/discriminator model from machine learning (aka babble/prune). Writing a lot is your babble, good feedback (from yourself and others) is your pruning mechanism.

Relatedly, publish. Publicly. A lot. 80% of the ways you receive feedback are ~effectively closed to you if you only write drafts for yourself, or circulated among a small number of peer editors.

Find Surprising Things to Say

It’s great that you want to write a lot of surprising things in different ways that are conducive to receiving lots of feedback . But how do you write lots of new and surprising articles without lying, or just saying technically true but effectively impertinent facts?

“Surprising but inevitable” as a broad North Star

Aristotle said a good story's ending should be "surprising but inevitable." Likewise, I think it's a very important goal for nonfiction. This helps maintain a high-novelty/-surprise factor in your writing, while still making each article internally cohesive and driven by a specific, coherent logic.

For example, in my Rising Premium of Life post, which until then primarily drew on economics data and modeling on humans about changing preferences in valuation of life this century, I had a long, seemingly random, detour into facts about the lifespan of bats, mice, and other animals. But by the end of the article, the connection was apparent, and perhaps even inevitable – once you see the analogy between intrinsic and extrinsic animal lifespan differences and how safety might beget more safety, it’s hard to unsee.

Likewise, my Why Reality Has a Well-Known Math Bias post opens with the seemingly absurd image of a shrimp physicist trying to do advanced fluid mechanics calculations before giving up, quitting shrimp physics in favor of going back to shrimp grad school in the shrimpanities. Laughable, and yet the metaphor is directly connected to multiple arguments later on about how best to think about the unusual effectiveness of mathematics in the natural sciences.

Source: Gemini Pro 2.5/ my prior article

Among other bloggers, Richard Hanania is halfway decent at this.[4] His arguments for leftist censorship are utterly surprising given his own history of being suppressed. Yet his arguments have a sort of internally cohesive logic that's simultaneously intellectually challenging and funny. Scott Alexander is, of course, a master at crafting surprising but inevitable arguments across a wide range of domains.

Read a lot and do the research

To find novel or at least surprising (to your audience) ideas to write about, it helps to read a lot, and persistently.

My first piece that went semi-viral (~6k upvotes, ~500k views) was this Quora answer on "honest college majors." It's pretty silly, but I think what elevated this Quora answer was that instead of having a single stereotype that other people complain about ("humanities + social science majors are too leftist! X majors are unemployable! ABC majors are too hard/easy!"), by that time I already had a reasonably decent introductory understanding of every academic major I joked about. None of my jokes were particularly revolutionary, but I think my breadth set me apart (survivorship bias joke in biomedical engineering, replication crisis joke in psychology, LSAT joke in Pre-Law, decision theory joke in history) and made it more interesting than the other answers that only had 1-3 novel angles to go for.

It’s also good to broadly use multiple separate ~independent quality filters for your readings, and be extremely wary of systematic biases in your access to information.

For specific articles, try to understand the State-of-the-Art of a field before opining on it. This is easier to do in some fields than others! For example, in my anthropics/mathematical effectiveness post, the question spanned many different academic fields (physics, philosophy of math, philosophy of science, etc) and I know there's a real chance I missed important existing work. Nonetheless I gave it my best shot, skimming as many papers as I could find online, spinning up multiple different AIs to ask questions about fields I have less familiarity with, and asking three different philosophy academics I know to review my draft before publishing it. Even so, there's a decent chance I missed something critical[5]. Still, you got to try!

If you’re opining on a known field, it behooves you to understand the existing academic consensus (or non-consensus!) before opining otherwise. Pay attention to track records on similar arguments, the intelligence/education level of various proponents and opponents, amount of total collective research effort undertaken before researching a conclusion, etc.

Sometimes while researching an article, you realize that the article you originally planned is bunk (either because your ideas are false or because they’ve already been said better elsewhere). This is okay!

Personally at that point I just stop that writing project and move on elsewhere, but people who are overly perfectionist/have trouble following the “write a lot” advice might benefit from documenting their mistakes and discoveries before moving on.

When writing nonfiction, unless you are writing something highly personal and specific to you, it’s often good to read multiple takes before writing your own. This is most obvious for research-heavy writings but it’s easier to do, and arguably more impactful on the margin, for posts others won’t consider “research.”

In my intellectual jokes post, by the time I started the post, I already had read thousands of different jokes online. But just to be safe, I Googled “intellectual jokes” and read/skimmed upwards of a couple hundred existing jokes in various online compilations, just to make sure the nine jokes I could stand behind are genuinely the best/funniest intellectual jokes by my lights.

Could I have written an intellectual jokes compilation if I only knew ~200 jokes and only ~10 intellectual ones? Of course! I expect most of the existing compilations look like this! But the marginal cost of skimming as many jokes as I did really isn't that high[6], especially for a post that eventually got ~30k views, and I think my posts benefit from me caring a bunch more about quality in areas other people don't seem to.

Find what others miss

A corollary of reading a lot and writing things that are surprising is that you want to identify what others miss.

For example, for my Ted Chiang review, I reread several of his stories and read 10+ prior book reviews before starting to write my own (As of today I’ve probably read/skimmed >30 reviews). This helps ensure that my book review covers not just points that are interesting to me, and that Chiang does unusually well, but also specifically points that other book reviews overall missed.

Similarly, in my honey post, I biased towards covering subareas of bee welfare (e.g. positive vs negative valence), that other people did not. I didn’t try to answer thorny questions of normative ethics, bee consciousness, or intensity of valenced experiences, because my impression is that those were already (relatively) widely discussed elsewhere.

Don’t be a hater

To quote a wise woman: "haters gonna hate hate hate." This is not just definitional but causal: hating makes you take on more of the character of a hater.

Many people start writing blogposts because someone’s wrong on the internet and they need to be set right. This is a perfectly fine sentiment to carry you through a post or two (and I’ve certainly fallen prey to it myself). But as an overall rule, it’s dangerous to see yourself as primarily a critic (or worse, a “hater”). It’s better to be motivated by love, curiosity, the Good, and other positive traits, rather than hate.

Know your audience

I’ve said before that you should write things that are surprising. But surprising to who? The simplest answer is “your intended audience”.

Who is your intended audience? That’s what you have to find out! Imagine who you want to read your work. Better yet, interview your real readers! (or people who are almost-readers).

Build a rough psychological profile of the people who you want to read your work, and try to say things that are true but surprising to them! Try to say things that slot in well (“inevitably”) with their own mental self-narratives, while at the same time being genuinely surprising.


Make Good Arguments

Own your words

Say what you believe and believe what you say. This is important.

There are different ways to achieve this. Classic style achieves it via purity of style. I like to just use first-person language ("I believe", "I think", "I aim"). The important thing is to avoid a weird academic-ese where you use weaselly, reflexive language.

It often helps to write in your own voice, and write as you speak, though this is not strictly a requirement. Sometimes it helps to affect specific voices, or over-emphasize some aspects of your personality (see later sections on style).

Talk to your readers as if you are talking to an intelligent, educated friend, or (as I do sometimes) to your younger self, who is intelligent but hasn't yet stumbled upon the exact same insights as you.

Show your work with data

Include graphs, sources, calculations. Anticipate and defuse obvious objections. Learn from people who are good at writing with data, and use it natively.

Draw on interdisciplinary knowledge

I’m not an expert on any specific field. And if I am, it’d be in relatively niche and inherently inter-disciplinary “fields” (like “EA grantmaking” or “pandemic forecasting”). So it’s pointless to hide that and try to “compete” with the experts in their native turf.

Instead, it’s better to rely on my own breadth and interests to make cohesive interdisciplinary arguments on topics of interest.

For example, my first anthropics post draws not just on anthropic reasoning, but also on philosophy of science, physics, history of science, evolution, math, philosophy of math, and even AI. The Ted Chiang review similarly draws not just from understanding Chiang’s own work or other science fiction, but relies on a confident understanding of social criticism, modern philosophy and economic growth models.

But at the same time, don’t “dazzle” people with knowledge for the sake of impressing them! Make focused arguments necessary to express a point (or occasionally to be interesting/funny), and never try to come across as too intelligent for your readers!7

Polish Your Craft

The following four points on craft are areas that I suspect most aspiring nonfiction writers can benefit from.

Make focused stylistic choices and stick with it (within a post)

As I wrote in my field guide to writing styles, a mature writing style is defined by making a principled choice on a small number of nontrivial central issues: truth, presentation, cast, scene, and the intersection of thought & language.

A good artifact of writing maintains not just surface similarities but has a quiet consistency that underlies the work.

Open Asteroid Impact, for example, never broke character. There were jokes I considered including but rejected because while funny in isolation, they would’ve ruined the mirage that I was running a Serious Company completely lacking in self-awareness, and thus make the website overall less funny.

While I believe a lot in experimenting in general across the course of your writing career, I think most posts should pick a side on each of the central issues and stick with that. Don’t have your book of prophecy be 1337spk.

Pay a lot of attention to titles

 

If Coates instead titled it “Some Arguments in Favor of Considering Ethnicity As One of Several Factors When Deciding Appropriately-Sized Transfer Payments” the article would become substantially less memorable

Spend significant effort on your title (>10 minutes per serious post, and sometimes closer to an hour). It’s one of the most important components of a successful blogpost.

I think this is very unintuitive to people, myself included. From the perspective of a writer, a title is just one line to write (and often one of the least interesting lines, as it’s unlikely that you as a writer discover something novel in the process of creating a title). But from the perspective of a (potential) reader, titles are quite important, as:

  1. It is probably one of the most critical pieces of information for readers to decide whether something is worth investing the time to read and engage with, particularly if linked elsewhere.
  2. It helps set the tone for how readers should engage with the rest of your essay.
    1. Positive example: Some unfun lessons I learned as a junior grantmaker
      1. Short, to the point, and contextually useful (someone reading this title knows both where I’m coming from, as well as have some bounds on the limitations)
    2. Negative example: Why short-range forecasting can be useful for longtermism (old title)
      1. The old title was bad because it led readers to believe (falsely) that the post would give an argument for the statement in the title, rather than just assume that the title is a placeholder.
      2. It also predictably led to people being confused about the intended purpose of the post, which is more “this thing would be cool, would love to find collaborators.”
      3. imo the newer title Early-warning Forecasting Center: What it is, and why it’d be cool is comparatively much better, though still not ideal.
  3. For low-information readers, a title helps them decide whether/how much to be angry at you.
    1. This can either be desirable or undesirable, depending on your intended purpose.
    2. As I personally don’t like it when randos are angry at me, I try to screen all my titles for a “will this title seriously offend low-information people who just randomly see this title on Twitter/Facebook/Hacker News?” check.
      1. All else equal, your title is the substring of your text that is most likely to be taken out of context, so some prudence is warranted.

So titles are very important to readers! Since they are so important to readers, a writer trying to cooperate with (current and future) readers should also treat titles as important!

In addition to wanting to be cooperative with readers, in the current algorithmic information environment, you also have to somewhat “fight”/argue for your post’s value in seeking your readers’ attention.

Finally, good titles should capture the essence of your point after it’s been argued. So you and your readers can refer to it again in the future. So it behooves you to come up with good titles (or a title; subtitle combination) that both encapsulates your primary thesis and also is intriguing enough for readers to click through.

I often leave my titles blank before finishing a post (or have a purely descriptive internal title like “writing styles post” or “war post”), and use AIs to generate title ideas after I finish writing a post. Sometimes the AIs can help discover real gems, like elevating The Secret Third Thing from what I thought of as a throwaway Twitter reference to a poetic phrase that encapsulates a core argument of mine in the Ted Chiang review.

More often, I go back and forth with AIs and use AIs as inspiration but settle in on a cleverish multilayered title like Why Reality has a Well-Known Math Bias, where the final title and pun was entirely my own, but I probably wouldn’t have generated the title if AIs didn’t suggest the direction of other (worse) puns first.

Unfortunately sometimes I fail at a title, even for pieces I otherwise like. For example, I don’t like to refer to the Why Are We All Cowards? by its final title, since it’s both clickbaity and imprecise compared to the actual argument, but “the rising premium of life post” unfortunately sounds like uninteresting gibberish to people who don’t already understand the post’s central concept.

Vary your sentence lengths, but bias towards shorter

Self-explanatory.

Pay attention to reading level and word complexity

When you see online advice on complexity of language, many people go for advice like Orwell’s, where you “Never use a long word where a short one will do”, etc, unless you’ll otherwise say something “outright barbarous.”

I mostly disagree. I want to be more ecumenical with my advice. I think complexity of language has its place, and I think there are many times where you ought to prefer more complex language even if simpler words are "good enough," even when the simpler language is not "outright barbarous."[8]

For me, I think better advice is to never write with words you personally are uncomfortable with, and to write with words and methods of expression that are comfortable for your actual audience. This means understanding your actual audience (see above), and developing models of what words would and would not be too complex for them.

As a ballpark, I think for popular pieces, you ought to aim for writing ~4 years of specialization/grade levels below where you think your audience's actual reading level is (since people are less comfortable with writing at the limits of their ability). But this is just a ballpark. I'm more confident that writers should pay attention to their reading levels/complexity of words and be intentional with their choices[9] than that they should land at a specific place.

Polish Your Craft II

These are craft points that I suspect are more specific to the types of writing that I want to do.

Optimize your posts for both skimmability and non-skimming experiences

Some posts (e.g. Rethink Priorities reports, academic papers in the natural sciences) are primarily made to be skimmed by the vast majority of readers. For those works, the core success criterion of the writing is skimmability (ease of parsing on a quick skim) and the rest of the writing is practically more about proof-of-work and/or evidence you know what you’re talking about, than it’s actively meant to be read.

Some writings (e.g. Great American Novels) are written primarily for Deep Reading (™), and the authors would even be offended if you try to skim them! Some of those works are deliberately engineered to be harder to skim.

For my own blog posts, I try to anticipate both skimming and non-skimming use-cases. So I try to write articles that are both easy to skim and easy to not skim. This is a real tradeoff I’m making, and other bloggers might want to only pick one of the two sides to focus on.

Use headers strategically

I try to have clear, meaningful section breaks that preview content. This makes it easier for readers to skim my content, for returning readers to jump to sections they’re interested in, and for me (and hopefully others!) to link specific sections to use as lemmas/subarguments in discussions with others, etc.

Integrate visual and other non-writing elements

Most writing advice given before the last ten years, and much that has been given since, ignores the online nature of the vast majority of written content we produce and consume these days. It assumes modern writing still looks like Ye Olde Essayes hosted HTML-only on some blogging server.

Needless to say, modern blogs don’t look like this at all anymore, especially the more successful ones.

 

Charts are helpful not just for demonstrating content, but for breaking visual monotony. Images, graphs, tables, videos, subscription links, they are all great too.

To quote Scott:

The clickbaiters are our gurus – they intersperse images throughout their content. The images aren’t always very useful, they don’t always add much, but now it’s not just a wall of text. It’s a wall of text and images.

Of course, the best visual and other non-text elements actively advance your core arguments, not just enhance the vibe or make for a more pleasant reading experience.

See here, here, here, and here for non-writing integrations that I especially liked.

Lead with concrete examples/analogies

When illustrating Wigner’s puzzle, I didn’t introduce it with an academic discussion on the unreasonable effectiveness of mathematics in the natural sciences. Instead we got an image of a harried shrimp physicist at the bottom of a turbulent waterfall. When introducing questions of bee welfare, I didn’t start with complex moral deliberation or questions about arthropod phenomenology, I started with a relatable (to Berkeleyites, anyway) anecdote of holding up a line at the local bubble tea shop. The Puzzle of War was introduced with graphic descriptions of Qin dynasty China (and then mellowed out with specific examples of fictional wars between game-theoretic elves and dwarves).

As for why you want to do this, Scott Alexander puts it best.

Use microhumor liberally

Jokes are good. Mediocre jokes aren’t as good as good jokes, but they are better than no jokes. And microjokes…well the homeopathic Law of Minimum Dose will say that microjokes are actually the funniest jokes of all!

Deploy footnotes liberally (for asides)

In every article, a strong temptation I have in my heart-of-hearts is to include every detail, no matter how irrelevant. But in my head (of-heads?), I know that this is a terrible idea. Readers are busy people, and they have much better things to do than wade through 10,000 words on the rising premium of life[10].

But how do I balance my emotional need for completionism and including every detail (plus irrelevant asides) with my intellectual desire to be a good citizen who presents completed and compact finished works?

Answer: Footnotes! After an initial draft where I barf everything on the page, in the editing phase I can tighten up my language and remove unnecessary cruft[11]. And when there are paragraphs I especially like but I know I ought to cut, footnotes come in handy!

That said, it’s also possible to go overboard with footnotes. I’m less aggressive about cutting footnotes than about cutting main text, but I still average ~10 footnotes per post instead of something like 25.

Ship It Anyway

Experiment widely

[Image: Bruce Lee performing a side kick]

Bruce Lee is an expert at kicking, but not necessarily at Substack blogging ...

Bruce Lee once said “I fear not the man who has practiced 10,000 kicks once, but I fear the man who has practiced one kick 10,000 times.” Well, writing is rather unlike kicks. If you write the same story, blogpost, or paper 10,000 times, people aren’t going to fear you. They’ll just be confused, and/or bored.

As a starting writer, it’s hard to know which articles a) you especially enjoy writing, and b) the marketplace of ideas would reward you for. Rather than perfecting a specific genre or sub-style of essay, it’s probably much more productive to experiment very broadly and sample across a wide swathe of possible pieces you can write.

Experiment widely with a) different styles of writing, b) different topics, c) different venues to publish in, d) different levels of post-writing edits and perfectionism, e) different intended audiences, and f) different registers and degrees of linguistic complexity, and so forth.

Integrate feedback across posts

Getting feedback is great! But often, the most critical/useful feedback essentially makes your piece unsalvageable if you were to take it too literally.

Wait…unsalvageable? How is it still useful, never mind “most critical”? Simple: you take the feedback you receive and apply it to your next post.

Seeing a published piece of yours continue to have critical flaws isn’t pleasant, but it’s vastly preferable to constantly rewriting a piece to address what might well be an unfixable flaw. And it’s of course better than pretending to seek feedback but not updating!

Learn from better and worse writers trying to do what you’re trying to do

It’s good to learn from better writers than you. For me, I learn the most from Scott Alexander, because I respect him and he has a writing style that I find pleasant to emulate.

It’s also good to learn from worse writers than you. Why? Well, first of all, when someone else makes the same mistakes as you, you might be able to see why they are mistakes from a bit of a remove.

Secondly, someone being worse than you on one dimension (or in aggregate) doesn’t mean they’re worse than you on all dimensions of writing. Just as averaged-out faces can be very beautiful and ML student algorithms trained on human gameplay can achieve superhuman performance, you too can learn to exceed your masters even if you only ever train on worse (overall!) writers than you.

Finally, it’s harder to see progress if you only compare with writers that are far better than you. By comparing with worse writers, you can learn and improve on more realistic axes.

For me, the writer I learn the most from is my younger self. This might be surprising, since in some sense my younger self is almost by definition a worse writer than me. Nonetheless, I find it valuable to reread my older pieces with clear(er) eyes. Seeing mistakes in my older pieces, as well as things past-me did well, has been very illuminating in improving my present writing.

Be willing to ship unfinished work

Finally, be willing to ship imperfect and unfinished work! This is somewhat redundant with past advice, but still bears saying.

Shipping a lot of imperfect work means you can:

  1. Write a lot of posts
  2. Write a lot of different posts
  3. Learn a lot about yourself and what you enjoy and hate
  4. Get a lot of public feedback that you can apply to future posts
  5. And many other virtues.

That’s why I’m in Inkhaven where I have to publish a blogpost every day for a month! 😱😱😱

You, too, might benefit from being extremely prolific. But that’s only possible if you give up on perfectionism and be open to publishing the imperfect.

So publish even when you aren’t fully ready, even when you haven’t edited everything through, and even when your conclusion isn–

  1. ^

    The reception to this post has been odd. It didn't get that many likes on Substack, but it has been liked (and is the only post of mine to be liked) by some pretty big writers on Substack. I'm not sure whether this is evidence that my writing advice is good or not; however, consider the alt-text here: https://xkcd.com/125/



Discuss

The 2024 LessWrong Review

2025-12-02 05:06:23

Published on December 1, 2025 9:06 PM GMT

We have a ritual around these parts.

Every year, we have ourselves a little argument about the annual LessWrong Review, and whether it's a good use of our time or not.

Every year, we decide it passes the cost-benefit analysis[1].

Oh, also, every[2] year, you do the following:

  • Spend 2 weeks nominating the best posts that are at least one year old,
  • Spend 4 weeks reviewing and discussing the nominated posts,
  • Spend 3 weeks casting your final votes, to decide which posts end up in the "Best of LessWrong 20xx" collection for that year.

Maybe you can tell that I'm one of the more skeptical members of the team, when it comes to the Review.

Nonetheless, I think the Review is probably worth your time, even (or maybe especially) if your time is otherwise highly valuable.  I will explain why I think this, then I will tell you which stretch of ditch you're responsible for digging this year.

Are we full of bullshit?

Every serious field of inquiry has some mechanism(s) by which it discourages its participants from huffing their own farts.  Fields which have fewer of these mechanisms tend to be correspondingly less attached to reality.  The best fields are those where formal validation is possible (math) or where you can get consistent, easily-replicable experiment results which cleanly refute large swathes of hypothesis-space (much but not all of physics).  The worst fields are those where there is no ground truth, or where the "ground truth" is a pointer to a rapidly changing[3] social reality.

In this respect, LessWrong is playing on hard mode.  Most of the intellectual inquiry that "we" (broadly construed) are conducting is not the kind where you can trivially run experiments and get really huge odds ratios to update on based on the results.  In most of the cases where we can relatively easily run replicable experiments, like all the ML stuff, it's not clear how much evidence any of that is providing with respect to the underlying questions that are motivating that research (how/when/if/why AI is going to kill everyone). 

We need some mechanism by which we look at the posts we were so excited about when they were first published, and check whether they still make any sense now that the NRE[4] has worn off.  This is doubly-important if those posts have spread their memes far and wide - if those memes turned out to be wrong, we should try to figure out whether there were any mistakes that could have been caught at the time, with heuristics or reasoning procedures that wouldn't also throw out all true and useful updates too (and maybe attempt to propagate corrections, though that can be pretty hopeless).

Is there gold in them thar hills?

Separate from the question of whether we're unwittingly committing epistemic crimes and stuffing everyone's heads full of misinformation is the question of whether all of the blood, sweat, tears, and doomscrolling is producing anything of positive value.

I wish we could point to the slightly unusual number of people who went from reading and writing on LessWrong to getting very rich as proof positive that there's something good here.  But I fear those dwarves are digging too deep...

[Image (Nano Banana Pro): viewed from behind, a dwarf digging his way through a mine shaft, the wall studded with lightly glittering gemstones; on the right-hand side, viewed from the front, a balrog wreathed in flames standing in a stone cavern on the opposite side of that wall. Aquarelle.]

So we must turn to somewhat less legible, but hopefully also less cursed, evidence.  I've found it interesting to consider questions like:

  • Were there any posts that gave you useful new abstractions or mental handles?
  • Did any of them make any interesting predictions which have since been borne out?
  • Was there a post that upended your life plans?
  • Is there a topic or view that felt difficult or impossible to talk about, until a specific post was published?
  • How many of them raised the collective sanity waterline?  (Don't ask what they were putting in the water.)

Imagine that we've struck the motherlode and the answers to some of those questions are "yes".  The Review is a chance to form a more holistic, common-knowledge understanding of how you and other people in your intellectual sphere are relating to these questions.  It'd be a little sad to go around with some random mental construction in your head, constantly using it to understand and relate to the world, assuming that everyone else also had the same gadget, and to later learn that you were the only one.  By the law of the excluded middle, that gadget is either good, in which case you need to make sure that everyone else also installs it into their heads, or it's bad, which means you should get rid of it ASAP.  No other options exist!

If your time and attention are valuable, and you spend a lot of both on LessWrong, it's even more important for you to make sure that they're being well-spent.  And so...

The Ask

Similar to last year, actually.  Quoting Ray:

If you're the sort of longterm member whose judgment would be valuable, but, because you're a smart person with good judgment, you're busy... here is what I ask:

First, do some minimal actions to contribute your share of judgment for "what were the most important, timeless posts of 2023?". Then, in proportion to how valuable it seems, spend some time reflecting on bigger picture questions on how LessWrong is doing.

 

The concrete, minimal Civic Duty actions

It's pretty costly to declare something "civic duty". The LessWrong team gets to do it basically in proportion to how much people trust us and believe in our visions. 

Here's what I'm asking of people, to get your metaphorical[5] "I voted and helped the Group Reflection Process" sticker:

Phase I: Nomination Voting (2 weeks)

We identify posts especially worthy of consideration in the review, by casting preliminary votes. Posts with 2 positive votes move into the Discussion Phase.

Asks: Spend ~30 minutes looking at the Nominate Posts page and vote on ones that seem important to you. Write 2 short reviews[6] explaining why posts were valuable.

Phase II: Discussion (4 weeks)

We review and debate posts. Posts that receive at least 1 written review move to the final voting phase.

Ask: Write 3 informational reviews[7] that aim to convey new/non-obvious information, to help inform voters. Summarize that info in the first sentence.

Phase III: Final Voting (2 weeks)

We do a full voting pass, using quadratic voting. The outcome determines the Best of LessWrong results.

Ask: Cast a final vote on at least 6 posts.

Note: Anyone can write reviews. You're eligible to vote if your account was created before January 1st of 2023. More details in the Nuts and Bolts section.

Bigger Picture

I'd suggest spending at least a little time this month (more if it feels like it's organically paying for itself), reflecting on...

  • ...the big picture of what intellectual progress seems important to you. Do it whatever way is most valuable to you. But, do it publicly, this month, such that it helps encourage other people to do so as well. And ideally, do it with some degree of "looking back" – either of your own past work and how your views have changed, or how the overall intellectual landscape has changed.
  • ...how you wish incentives were different on LessWrong. Write up your thoughts on this post. (I suggest including both "what the impossible ideal" would be, as well as some practical ideas for how to improve them on current margins)
  • ...how the LessWrong and X-risk communities could make some group epistemic progress on the longstanding questions that have been most controversial. (We won't resolve the big questions firmly, and I don't want to just rehash old arguments. But, I believe we can make some chunks of incremental progress each year, and the Review is a good time to do so)

In a future post, I'll share more models about why these are valuable, and suggestions on how to go about it.

Except, uh, s/2023/2024.  This year, you'll be nominating posts from 2024!

How To Dig

Copied verbatim from last year's announcement post.

Instructions Here

Nuts and Bolts: How does the review work?

Phase 1: Preliminary Voting

To nominate a post, cast a preliminary vote for it. Eligible voters will see this UI:

[Screenshot of the preliminary voting UI]

If you think a post was an important intellectual contribution, you can cast a vote indicating roughly how important it was. For some rough guidance:

  • A vote of 1 means “it was good.”
  • A vote of 4 means “it was quite important”.
  • A vote of 9 means “it was a crucial piece of intellectual progress.”

Votes cost quadratic points – a vote strength of "1" costs 1 point. A vote of strength 4 costs 10 points. A vote of strength 9 costs 45. If you spend more than 500 points, your votes will be scaled down proportionately.
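The cost examples above (1 → 1 point, 4 → 10 points, 9 → 45 points) are consistent with a vote of strength n costing n(n+1)/2 points, though the post doesn't spell out the general formula, and it doesn't say exactly how the over-budget scaling works either. Here's a minimal sketch under those assumptions; the function names and the scaling rule are illustrative guesses, not the site's actual implementation:

```python
# Illustrative sketch only. Assumes the cost of a strength-n vote is
# n * (n + 1) / 2 (which matches the 1 / 10 / 45 examples in the post),
# and guesses that "scaled down proportionately" means multiplying every
# vote strength by budget / total_cost when you go over 500 points.

def vote_cost(strength: int) -> int:
    """Point cost of a single preliminary vote of the given strength."""
    return strength * (strength + 1) // 2


def scale_votes(strengths: list[int], budget: int = 500) -> list[float]:
    """Scale vote strengths down proportionately if their total cost exceeds the budget."""
    total = sum(vote_cost(s) for s in strengths)
    if total <= budget:
        return [float(s) for s in strengths]
    factor = budget / total
    return [s * factor for s in strengths]


print(vote_cost(1), vote_cost(4), vote_cost(9))  # 1 10 45
print(scale_votes([9] * 15))  # 15 strength-9 votes cost 675 points, so each gets scaled down
```

The practical upshot: a handful of strength-4 votes is cheap, but a ballot full of 9s burns through the 500-point budget quickly.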

Use the Nominate Posts page to find posts to vote on. 

Posts that get at least one positive vote go to the Voting Dashboard, where other users can vote on them. You’re encouraged to give at least a rough vote based on what you remember from last year. It's okay (encouraged!) to change your mind later.

Posts with at least 2 positive votes will move on to the Discussion Phase. 

Writing a short review

If you feel a post was important, you’re also encouraged to write up at least a short review of it saying what stands out about the post and why it matters. (You’re welcome to write multiple reviews of a post, if you want to start by jotting down your quick impressions, and later review it in more detail)

Posts with at least one review get sorted to the top of the list of posts to vote on, so if you'd like a post to get more attention it's helpful to review it.

Why preliminary voting? Why two voting phases?

Each year, more posts get written on LessWrong. The first Review of 2018 considered 1,500 posts. In 2021, there were 4,250. Processing that many posts is a lot of work. 

Preliminary voting is designed to help handle the increased number of posts. Instead of simply nominating posts, we start directly with a vote. Those preliminary votes will then be published, and only posts that at least two people voted on go to the next round.

In the review phase, this allows individual site members to notice if something seems particularly inaccurate in its placement. If you think a post was inaccurately ranked low, you can write a positive review arguing it should be higher, which other people can take into account for the final vote. Posts which received lots of middling votes can get deprioritized in the review phase, allowing us to focus on the conversations that are most likely to matter for the final result.

Phase 2: Discussion

The second phase is a month long, and focuses entirely on writing reviews. Reviews are special comments that evaluate a post. Good questions to answer in a review include:

  • What does this post add to the conversation?
  • How did this post affect you, your thinking, and your actions?
  • Does it make accurate claims? Does it carve reality at the joints? How do you know?
  • Is there a subclaim of this post that you can test?
  • What followup work would you like to see building on this post?

In the discussion phase, aim for reviews that somehow give a voter more information. It's not that useful to say "this post is great/overrated." It's more useful to say "I link people to this post a lot" or "this post seemed to cause a lot of misunderstandings." 

But it's even more useful to say "I've linked this to ~7 people and it helped them understand X", or "This post helped me understand Y, which changed my plans in Z fashion" or "this post seems to cause specific misunderstanding W."

Phase 3: Final Voting

Posts that receive at least one review move on to the Final Voting Phase.

The UI will require voters to at least briefly skim reviews before finalizing their vote for each post, so arguments about each post can be considered. 

As in previous years, we'll publish the voting results for users with 1000+ karma, as well as all users. The LessWrong moderation team will take the voting results as a strong indicator of which posts to include in the Best of 2024, although we reserve some right to make editorial judgments.

Your mind is your lantern.  Your keyboard, your shovel.  Go forth and dig!

  1. ^

    Or at least get tired enough of arguing about it that sheer momentum forces our hands.

  2. ^

    Historical procedures have varied.  This year is the same as last year.

  3. ^

    And sometimes anti-inductive!

  4. ^

    New relationship energy.

  5. ^

    Ray: "Maybe also literal but I haven't done the UI design yet."

  6. ^

    Ray: "In previous years, we had a distinction between "nomination" comments and "review" comments. I streamlined them into a single type for the 2020 Review, although I'm not sure if that was the right call. Next year I may revert to distinguishing them more."

  7. ^

    Ray: "These don't have to be long, but aim to either a) highlight pieces within the post you think a cursory voter would most benefit from being reminded of, b) note the specific ways it has helped you, c) share things you've learned since writing the post, or d) note your biggest disagreement with the post."



Discuss