LessWrong

An online forum and community dedicated to improving human reasoning and decision-making.

Training PhD Students to be Fat Newts (Part 2)

2025-11-26 10:23:42

Published on November 26, 2025 2:23 AM GMT

[Thanks Inkhaven for hosting me! This is my fourth and last post and I'm already exhausted from writing. Wordpress.com!]

Last time, I introduced the concept of the “Fat Newt” (fatigue neutral) build, a way of skilling up characters in Battle Brothers that aims to be extremely economical with the fatigue resource, relying entirely on each brother’s base 15 fatigue regeneration per turn. This choice frees up stat and skill points to distribute evenly among offense, defense and utility. To illustrate, let’s compare two possible ways to build a bro.

The first brother is a Nimbleforged Cleaver Duelist, who wields the weighty and brutish one-handed Orc Cleaver. To successfully attack multiple times a turn, this brother needs sky-high base stats - attack, defense, HP, fatigue, and resolve - and a continuous investment of stat points to reach around 70 maximum fatigue. Furthermore, this build requires a specific suite of offensive and fatigue-recovery perks to function, such as Berserk, Killing Frenzy, Duelist, Cleaver Mastery, and Recover. Only the most seasoned of Hedge Knights and Sellswords can pull this build off consistently.

This brother also needs to stay alive in the thick of battle, so optimally he wears “Nimbleforged” armor, famed medium armors light enough not to eat up your fatigue pool and heavy enough to benefit from the Battleforged defensive perk. You might only drop a set of good Nimbleforged armor once or twice a campaign.

The second brother is a Fat Newt, who requires only 15 maximum fatigue to move and attack once a turn. Practically any brother - from Farmhands to Swordsmasters - who rolls high attack and defense can become a Fat Newt. By ignoring the fatigue stat, this brother can use those saved points to shore up weaknesses in HP and Resolve. And since he only needs so little fatigue, he can wear the bulky standard-issue Coat of Plates and wield the mighty two-handed Greataxe.

The Fat Newt also has a lot of slack distributing perk points. Instead of mandatory offensive perks like Berserk and Killing Frenzy, he takes Quick Hands (allowing him to swap to a Longaxe in a pinch to decapitate at range) and Fortified Mind (further bolstering his psychological defenses).

I want to make three salient points about the contrast between “Hero” type brothers like the Nimbleforged Cleaver Duelist on the one hand, and Fat Newts on the other:

  • Heroes are one in a thousand, and Fat Newts are one in ten. Nimbleforged Cleaver Duelists require extraordinary luck and specialized gear to optimize. Fat Newts still have to roll well to function, but there is plenty of slack to shore up weaknesses, so they can basically be mass produced. If you want to build a company of twenty brothers, you cannot fill it with only Heroes unless you are willing to trawl through recruits for hundreds of in-game days.
  • Fat Newts are not just “budget” Heroes, and Heroes do not Pareto dominate Fat Newts. Heroes are stretched so thin that they have real weaknesses and require a lot of babysitting. Generally speaking, Fat Newts will have more survivability and more utility, and they can often act as a menacing offtank to hold key defensive bottlenecks in the battlefield. Their increased utility allows them to save teammates in sticky situations and play more varied roles as the situation demands.
  • The effectiveness of the fatigue stat scales in a complicated nonlinear way. A Nimbleforged Cleaver Duelist with 70 maximum fatigue can chew through a critical flank by himself, fighting on overdrive for four or five turns before tiring out. That time is often enough to decide the flow of the entire battle. The same brother with 35 maximum fatigue is much less than half as effective - he runs out of stamina on turn two, and then stares impotently as the enemy surrounds and overpowers his allies.

Primarily, my intention with this post is to convey a set of intuitions - derived from over three hundred hours of Battle Brothers - about what it might mean to be a working mathematician, Fat Newt style. 

Young people often learn by imitation and emulation, doubly so when they lose themselves in the maze of interlocking cults of personality that is academia. What ends up happening is that battalions of young mathematicians fixate on superhuman “Hero” types - Terry Tao, Peter Scholze, Andrew Wiles, Alexander Grothendieck and so on - mathematicians imbued with four or five standard deviations of intelligence, work ethic, and monomania, and try to “copy their build.” This turns out to be ineffective, maybe even surprisingly so.

I think there is an inarticulate mass delusion that might be called the “Half-a-Hero Trap”: just be half as smart as Terry Tao, work half as many hours, copy his behavior line-by-line otherwise, and one can hope to become a quarter as good a mathematician. A quarter-Tao is still an impressive success, after all.

The real scaling laws here are much, much less kind. Setting the murky waters of intelligence scaling aside, let’s talk about productivity. One literal interpretation of fatigue is the number of productive work-shaped hours one can output in a week. If Alice has the capacity to work 70 hours a week, and Bob only 35, Bob is unfortunately much less than half as effective as a researcher. To make the point simply, if Alice and Bob both have 15 hours of teaching a week, then Alice’s 70 - 15 = 55 remaining hours are far more than twice Bob’s 35 - 15 = 20.

Even worse, the best mathematicians are hired to positions with the highest salaries, burdened by the least teaching responsibilities, at universities with the easiest students to manage. Think about the difference in research output between a mathematician who natively works 70 hours a week and only teaches the advanced probability seminar once a year, and the same mathematician who can only work 35 hours a week and teaches three sections of freshman calculus every semester. The difference in available research hours is staggering. The former flourishes and the latter stagnates.

As I wrote in Gravity Turn, the work of getting into orbit is categorically different from the work of staying in orbit. I propose that in a world where almost every PhD student is falling into the Half-a-Hero Trap, there are vastly superior models of skilling up - analogous to the Fat Newt build - that do not look like “imitate the nearest Fields Medalist.” Let me give two examples.

First, time management. Many students who are only capable of working 35 hours a week imitate the outward behavior of mathematicians who work 70. They go to the same number of seminar talks and conferences, spend the same amount of time teaching and grading, and attend the same number of departmental social activities. The well-meaning professor counsels his student to attend five hours of class and five hours of seminars a week to broaden her horizons, oblivious to the sheer fraction of her meager productive hours this sucks up. I suspect this category of error is a font of bad decisions for graduate students.

Second, self-reliance. Just as the Cleaver Duelist may be able to jump into battle alone and mow down his enemies (though even for him this is a dangerous gamble), great mathematicians are often cast as lone geniuses, operating far outside of the capacity and understanding of their peers. Fat Newts, on the other hand, operate best in the middle of the fighting line, holding key positions and working in tandem to control and overwhelm important targets that they would be unable to handle alone. There is a whole separate post to be written about this, but briefly, I think that PhD training systematically overlooks building the skills needed to play a supporting role in a research team.

I must end on a sobering thought - even in the best-case scenario, the Fat Newt style is not a magic bullet that makes every recruit useful. In Battle Brothers, only one in ten brothers is generated with the stats to become an acceptable Fat Newt. My observation is that there are many graduate students who are not generationally talented and who can only genuinely work 10 or 20 hours a week, if that. For them, I see no clear path forward.



Discuss

Things I wish I knew to save GPU minutes on Llama 405b model (and other beasts)

2025-11-26 09:36:46

Published on November 25, 2025 10:56 PM GMT

The goal of this post is to share how easy it is to load a Llama 405B model on RunPod, but also how costly it can be if you don’t know a few things in advance. I hope this post helps you save those precious GPU minutes!

First, the Llama 405B model is huge:
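To get a feel for “huge”, here is a rough back-of-envelope estimate of the weight memory alone (a sketch assuming 2 bytes per parameter for BF16/FP16 and 1 byte for FP8, ignoring KV cache and activation overhead):

```python
# Rough memory footprint of Llama 405B's weights at different precisions.
params = 405e9  # 405 billion parameters

for precision, bytes_per_param in [("BF16/FP16", 2), ("FP8", 1)]:
    gigabytes = params * bytes_per_param / 1e9
    print(f"{precision}: ~{gigabytes:.0f} GB of weights")

# Prints roughly:
#   BF16/FP16: ~810 GB of weights
#   FP8: ~405 GB of weights
# which is why you need several 80-141 GB GPUs just to hold the model.
```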

Let’s talk GPU!

You need the right choice of GPU and enough disk space to store the model parameters, with some overhead for your experiment, such as running inference and saving internal activations.

Some good options are H100, A100 and H200 machines:

The H200 currently wins on both cost and inference speed. With FP8 quantization and higher VRAM per GPU, you only need 5 H200s compared to 8 A100/H100 GPUs for the same setup. This makes the H200 cheaper overall ($17.95/hr vs $21.52/hr for H100), while also giving roughly 3× faster inference.

Providers

There are some very good GPU providers out there, such as RunPod, vast.ai, and Lambda. You can also browse websites that compare different GPU providers.
I decided to go with RunPod because of its reliability, simplicity, and competitive pricing.

These are FOUR tips you should know to save your GPU minutes!

  1. Preparation is key! Develop locally first: package and test your code on your local machine before deploying to RunPod, to avoid wasting GPU time on debugging.
    1. Prepare a startup_script that loads your credentials and installs necessary packages like poetry. You will scp it to the pod and run it there.
    2. Push your code to GitHub and create a package setup script.
    3. Before spinning up an expensive GPU setup, test your pipeline end to end on the smallest model in the family. In my case I used the Llama 1B model and fixed all errors/bugs at minimal cost.
  2. Code and storage:
    1. For active code and temporary data, use the /root directory: it sits on a local NVMe SSD rather than network storage, so it has significantly faster read/write speeds for data processing.
    2. /workspace is RunPod's persistent network storage, so keep your models, cache, and input/output data there. /workspace also has much more space.

To set this up, it’s essential to set the Hugging Face and Transformers cache environment variables.

This script is an example to do this:
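A minimal sketch of what such a script might look like, assuming the network volume is mounted at /workspace (the hf subdirectory names are just a convention I'm assuming here):

```python
import os

# Point the Hugging Face caches at persistent network storage so the
# 405B weights survive pod restarts and never need to be re-downloaded.
os.environ["HF_HOME"] = "/workspace/hf"                          # root for hub cache, tokens, etc.
os.environ["HF_HUB_CACHE"] = "/workspace/hf/hub"                 # downloaded model weights
os.environ["TRANSFORMERS_CACHE"] = "/workspace/hf/transformers"  # legacy variable, still read by older transformers versions

# Important: set these *before* importing transformers or huggingface_hub,
# otherwise the default cache under /root/.cache will be used instead.
```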

  3. Network Volume is your friend! Even when you save all your cache, data, and models under /workspace, there is still a risk of not being able to spin up a new pod with the same data: if no pods are available, you may get downgraded to a zero-GPU pod without being able to spin up a new GPU pod 🙁 Remember, the Llama model is HUGE and takes around 30 minutes to download, so you don’t want to waste money re-downloading it each time you create a new pod; you want it to be accessible from different GPU/CPU setups.
    So this is where network volume comes in 🙂!
  • independent of your Pod
  • fast: Typical transfer speeds: 200–400 MB/s, with possible peaks up to 10 GB/s
  • cheap: $0.07 per GB per month for the first 1 TB; $0.05 per GB per month beyond that.
    It’s very easy to set up, BUT make sure the data center you choose has your target GPUs, and preferably several GPU types, since some GPU types may not always be available.
  4. Model Loading Optimization:

Trick 1: When loading large models on RunPod, use these flags:

  • low_cpu_mem_usage = True: Despite its name, this flag helps with both CPU and GPU memory:

Without this flag: Loads all weights at once, creating temporary copies that cause memory spikes of 2-3x the model size.

With this flag: Builds an empty model first, then loads weights one tensor at a time, immediately assigning each to its target layer. Peak memory stays close to the final model size. This allows loading larger models on smaller RunPod instances.

  • device_map="auto": enables loading directly onto the target devices across multiple GPUs without staging everything in CPU RAM first.
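Putting the two flags together, a loading call might look like the sketch below (the checkpoint name and dtype handling are illustrative, not a specific recommendation):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative checkpoint id; substitute whichever Llama 405B variant you have access to.
MODEL_ID = "meta-llama/Llama-3.1-405B-Instruct-FP8"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    low_cpu_mem_usage=True,  # build an empty model, then load weights one tensor at a time
    device_map="auto",       # shard layers across all visible GPUs (requires the accelerate package)
    torch_dtype="auto",      # keep the checkpoint's dtype instead of upcasting to float32
)
```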

Trick 2: Use FastAPI for model loading and serving. If running multiple experiments, use FastAPI to load the model once and create API endpoints. It also keeps experiment code modular and separate from model loading, so that you can make constant changes to it without having to reload the model on every change:
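A minimal sketch of that pattern, assuming the model and tokenizer are loaded as in the snippet above and this file is saved as server.py (the endpoint name and defaults are illustrative):

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
# `model` and `tokenizer` are assumed to be loaded once at import time,
# as in the loading snippet above; experiments then call the HTTP endpoint
# instead of paying the model-loading cost on every code change.

class GenerateRequest(BaseModel):
    prompt: str
    max_new_tokens: int = 64

@app.post("/generate")
def generate(req: GenerateRequest):
    inputs = tokenizer(req.prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=req.max_new_tokens)
    return {"completion": tokenizer.decode(output_ids[0], skip_special_tokens=True)}

# Run with: uvicorn server:app --host 0.0.0.0 --port 8000
```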

 

And of course, don’t forget to terminate your pod :) Please share any efficiency tips for working with expensive GPUs in the comments!



Discuss

Three positive updates I made about technical grantmaking at Coefficient Giving (fka Open Phil)

2025-11-26 09:09:58

Published on November 26, 2025 1:09 AM GMT

Coefficient Giving’s (formerly Open Philanthropy’s) Technical AI Safety team is hiring grantmakers. I thought this would be a good moment to share some positive updates about the role that I’ve made since I joined the team a year ago.

tl;dr: I think this role is more impactful and more enjoyable than I anticipated when I started, and I think more people should consider applying.

It’s not about the “marginal” grants

Some people think that being a grantmaker at Coefficient means sorting through a big pile of grant proposals and deciding which ones to say yes and no to. As a result, they think that the only impact at stake is how good our decisions are about marginal grants, since all the excellent grants are no-brainers.

But grantmakers don’t just evaluate proposals; we elicit them. I spend the majority of my time trying to figure out how to get better proposals into our pipeline: writing RFPs that describe the research projects we want to fund, or pitching promising researchers on AI safety research agendas, or steering applicants to better-targeted or more ambitious proposals.

Maybe more importantly, cG’s technical AI safety grantmaking strategy is currently underdeveloped, and even junior grantmakers can help develop it. If there's something you wish we were doing, there's a good chance that the reason we're not doing it is that we don't have enough capacity to think about it much, or lack the right expertise to tell good proposals from bad. If you join cG and want to prioritize that work, there's a good chance you'll be able to make a lot of work happen in that area.

How this cashes out is: as our team has tripled headcount in the past year, we’ve also ~tripled the amount of grants we’re making, and we think the distribution of impact per dollar of our grantmaking has stayed about the same. That is, we’ve about tripled the amount of grant money we’ve moved towards the top end of the impact distribution as well as at the marginal end.

To be even more concrete, here’s one anecdote I can share. About a year ago, Jason Gross asked me for $10k for compute for an experiment he was running. I spoke to him a few times and encouraged him to make grander plans. The resulting conversations between him, me, and Rajashree Agrawal led to me giving them a $1M grant to try to found something ambitious in the formal software verification space (I’m reasonably excited about FSV as a def/acc + mitigating reward hacking play.) They eventually founded Theorem, a startup focussed on formal software verification, which went on to be the first FSV startup accepted to YC, and they subsequently raised at one of the largest valuations in their cohort. Jason and Rajashree say that they would have been very unlikely to set their goals that big without my initial grant. Nothing about that seems marginal to me, yet it wouldn’t have happened had I not been here.

There is no counterfactual grantmaker

When I was offered the job a little over a year ago, I was told that I was the only candidate still being considered for the role, and that there was no one left to make offers to if I didn’t accept. In our current hiring round, we’d like to hire 3-4 technical AI safety grantmakers, but once again it’s far from obvious that we’ll find enough candidates that meet our bar. If you get an offer and don’t take it, the likeliest result is that we hire one fewer person.

Why is this? I think the main reason is that fewer people apply to our roles than you might expect (if you’ve already applied, thank you!). We are looking for people who could succeed in a research career, and most such people don’t want to leave research. It also helps for this role if you are well networked and have a lot of context on technical AI safety. Most people with a lot of context are settled in their roles and unlikely to apply. Separately, the set of skills required to be a good grantmaker includes some things that aren’t as important for being a good researcher, so occasionally strong researchers who apply have disqualifying qualities, even people who on paper seemed like they might be really good.

What this all means is that our top candidates end up being extremely counterfactual. Their acceptance or rejection of the role doesn't just improve outcomes very slightly relative to some other person we could have hired, but counterfactually causes tens of millions of dollars to move out the door to really impactful projects that wouldn't have otherwise been funded.

If we're so starved for grantmaker labor, why don't we lower our hiring bar? I think we’re going to have a slightly lower bar than we’ve had in the past; we really want to fill these roles. But also, we think there are diffuse long-term negative effects of seriously lowering our hiring bar. I acknowledge that perhaps we're making the wrong tradeoffs here.

(If you feel moved to apply by the counterfactual argument, but would drop out if it turns out that we think we have enough other good applicants, please feel free to indicate that in your application. If we get an unexpected windfall of strong applicants, such that we have more qualified candidates than we can hire, we’ll be happy to let you know, and there will be no hard feelings if you drop out.)

Grantmaking is more fun/motivating than I anticipated

Before I joined OpenPhil, I was about as “research archetype” as they get. I spent most of my time thinking about wacky theory math ideas. My work style was chaotic-academia: I went to bed at random times and worked at random times and in random places, mostly on whatever interested me at the time. 

Now I have a team and a manager, and I have lots of things that need to be done. I am not planning to have any papers with my name on them in the foreseeable future. But I'm really enjoying it! So why am I enjoying it more than you might expect, and indeed more than I expected going in? Some factors:

  • I do actually spend a decent fraction of my time thinking about object-level technical stuff. Mostly that looks like talking to top researchers, but it also looks like reading academic papers, and going to conferences to talk to people about their research or argue about AI safety. That can be a big part of the job if you want it to be.
    • One thing that's nice about the technical roles at cG is that we have a lot of generalist grantmakers but not a lot of technical specialists. That means that people are focused on leveraging the technical expertise of those who have it efficiently, which means that I get to spend a larger than expected fraction of my time on technical stuff. For example, with large grants that have a significant technical and non-technical component to the investigations, I often pair with a generalist grantmaker who will take the non-technical aspects off my hands. I get to spend more time thinking about which research agendas are promising, and less time worrying about whether an organisation has a healthy board of directors etc, than I expected.
  • The work I'm doing feels important and tangible. I go to conferences and see people walking around and giving talks who wouldn’t be in the room without my grant. A promising junior person mentions they just got a job at an org I funded to grow its headcount. Maybe it's cliché, but I actually do find that seeing the effects of my work on the shape of the field is pretty motivating.
  • I'm significantly more empowered than I expected to be when I joined. I've been given much more trust than I expected, and I've been empowered to make decisions based on my inside view. My manager is constantly pushing me to take on bigger projects, be more ambitious and creative, and be more agentic. As a result, I think I have become more ambitious and agentic, and noticing that in myself has been very motivating. I think if you think that a more agentic, ambitious version of yourself is someone you'd like to grow into, then this might be a good role for you, even if you're not sure how well that will go yet.
  • This role can be very sociable. I spend a lot of my time talking to researchers about the research they're doing and why. I don't get as much time to spend on getting into the low-level technicalities, at least not in my day-to-day, but I actually find that high-level strategic thinking which still interfaces with technical details can scratch much of the same itch as doing the research myself. I also find strategic questions about technical AI safety and the future of AI extremely interesting.

Please apply!

If all this sounds appealing to you, you can apply here by December 1st! Our team funds a lot of great research – from scaling up research orgs like Redwood or Apollo, to eliciting projects like those described in our RFP, to proactively seeding new initiatives. 

Last year, the Technical AI Safety team made $40 million in grants; this year it’ll be over $140 million. We want to scale further in 2026, but right now we only have three grant investigators on the team, so we’re often bottlenecked by our grantmaker bandwidth. If you think you might be a strong fit, your application could be the difference between us finding the right person or leaving a role unfilled. If you have more questions, you can dm me on LW or reach out to me at jake [dot] mendel [at] coefficientgiving [dot] org



Discuss

Want a single job to serve many AI safety projects? Ashgro is hiring an Operations Associate

2025-11-26 08:39:24

Published on November 26, 2025 12:39 AM GMT

🥞 Apply now! (First step takes < 15 min if you have a résumé ready.)

(Already posted directly on LW.[1])

What is Ashgro?

https://www.ashgro.org/

Ashgro helps AI safety projects focus on AI safety.

We offer fiscal sponsorship to AI safety projects, saving them time and allowing them to access more funding. We save them time by handling accounting, management of grants and expenses, and HR. We allow them access to more funding by housing them within a 501(c)(3) public charity (Ashgro Inc.), which can receive grants from pretty much any source.

Search for ‘Ashgro’ in https://survivalandflourishing.fund/2025/recommendations to find examples of projects we're sponsoring.

What does an operations associate do at Ashgro?

Handle parts of tickets, then whole tickets, and eventually address opportunities or problems that are only described in general terms. You'll start out doing very basic tasks, but we aim to move you up the ladder to more and more complicated or open-ended tasks as quickly as we (and you) can. Given that we're a small team, though, some share of very basic tasks will remain your responsibility for the foreseeable future.

Examples of very basic tasks:

  • Create tickets, link them to existing tickets and add forms to them.
  • Fill details into an email template and send the email.
  • Upload an invoice to a procurement platform every month.
  • Transfer money to prize recipients.

Examples of well-defined tasks:

  • Put together invoices and send them to the UK government.
  • Respond to a fiscal sponsorship inquiry.
  • Advise project staff on what expenses are allowed or not allowed.
  • Review expenses for compliance.
  • Ask our lawyer for legal advice and implement it.

Examples of high-level work:

  • Create and document processes.
  • Figure out how to handle tax withholding for prize payments and communicate about it with project staff.
  • Understand why we're not getting deposits from a set of grants and figure out how to get that unstuck. Including finding a solution, together with the affected project staff, for the cash flow problem resulting from the grant conditions.

All of the above are past examples. Future work will be different.

Requirements

You need to be able to:

  • Communicate externally.
    • Read with attention to detail. In particular, you can read a customer request and understand what it is they want.
    • Write clearly, in a friendly manner and quickly (since you'll have to do a lot of writing).
  • Do stuff.
    • Prioritize well after being given heuristics for prioritization.
    • Stick to the given process, even when it's not perfectly optimal.
    • Handle similar things in a similar way (consistency) (when there is no process or there are gaps in the process).
    • Learn new things quickly (or already know everything).
  • Communicate internally.
    • Recognize when you don't know something and ask.
    • Recognize when something comes up that someone else (in particular, your manager) should know and tell them about it.

Nice to have – lacking these should not (!) stop you from applying:

  • Legal or financial background.
  • Experience working at a US non-profit.
  • Knowledge of AI safety topics.
  • A knack for building rapport through written communication.

Application process

AI warning: We want to know what you can do, not what AI can do. So as soon as we have any suspicion that any part of your application is written by AI, we will put it on the ‘maybe’ pile. If we are reasonably sure that part of the application was written by AI, we will reject it, no matter how good it otherwise seems.

  1. Initial two-part application, which doubles as a work sample test. – What you submit here will be anonymized before review.
  2. 30 min screening interview involving a small sample task.
  3. 1-2 h interview with future manager (Richard).
  4. 1-2 h interview with CEO/skip manager (JJ).
  5. 2-3 h screenshare call working with future manager on actual work tasks.
  6. Reference check.

If you make it through all of this, we'll be excited to offer you a job.

How to prepare for the interview process?

  • Expect to tell us about yourself in general.
  • Be ready to talk about past accomplishments that show you can do what you need to be able to do (see above).
  • We don't expect you to put a lot of effort into anticipating interview questions and rehearsing answers. If we need more information and as long as you don't ramble, there will be plenty of time for us to ask follow-up questions to get the information we need.

How long between applying, getting an answer, and starting to work?

Expect six to eight weeks from submitting your application to getting an offer. We aim to go faster than that, but life usually intervenes. If you need a decision sooner, let us know and we can accelerate the process for you.

Other information

  • This is a remote position.
  • Part-time is possible.
  • If you have any questions or doubts, please comment or email us at [email protected].

Apply now

🥞 Apply now! (First step takes < 15 min if you have a résumé ready.)

  1. ^

    To the reviewer: I didn't know that posts will be shown on LW while they're in review for AF. Feel free to suppress it there since I've already posted it directly: https://www.lesswrong.com/posts/TnnJYLYSyKQCc6yrf/want-a-single-job-to-serve-many-ai-safety-projects-ashgro-is

    One thing I didn't make clear originally is that we're already fiscally sponsoring many AI safety projects, including some well-known ones. See eg. https://survivalandflourishing.fund/2025/recommendations.



Discuss

Beware boolean disagreements

2025-11-26 08:31:32

Published on November 26, 2025 12:31 AM GMT

(Meta: This is largely a re-framing of Consider your appetite for disagreements.)


Poker players get into arguments a lot. Back when I used to play I would review hands with a friend and we'd get into these passionate disagreements about what the right decision is.

To keep it simple, suppose that my opponent goes all in on the river and I have two choices:

  1. Call
  2. Fold

Suppose my friend Alice thinks I should call and I think I should fold. Alice and I would spend hours and hours going back and forth trying to weigh all the considerations. We'd agree on a large majority of these considerations, but often times we just couldn't seem to agree on what the actual best play is: call or fold.

There are various things that I think went wrong in our conversations, but the biggest is probably a type error: we were disagreeing about a boolean instead of a number.

When making decisions in poker (and life!), the thing that matters is expected value (EV). Suppose the buy-in is $200, Alice thinks calling has an EV of +$1.00, and I think that calling has an EV of -$0.75. In this scenario, the magnitude of our disagreement is less than a measly big blind!

In other words, Alice thinks that calling is very slightly better than folding whereas I think that it is very slightly worse. We're basically in agreement with one another, but framing things as a boolean rather than a number masks this fact.
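To make the contrast concrete, here is a toy version of the comparison (the $2 big blind is my assumption, based on a $200 buy-in being 100 big blinds):

```python
ev_call_alice = 1.00   # Alice's estimated EV of calling, in dollars
ev_call_mine = -0.75   # my estimated EV of calling
big_blind = 2.00       # assumed: $200 buy-in = 100 big blinds

# Boolean framing: we "disagree", because the signs differ.
print("Alice:", "call" if ev_call_alice > 0 else "fold")  # call
print("Me:   ", "call" if ev_call_mine > 0 else "fold")   # fold

# Numeric framing: the disagreement is tiny.
gap = abs(ev_call_alice - ev_call_mine)
print(f"Gap: ${gap:.2f} ({gap / big_blind:.2f} big blinds)")  # Gap: $1.75 (0.88 big blinds)
```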

I wouldn't say that framing a disagreement around booleans is never useful; that'd be a very strong claim that I don't feel confident enough to make. But I do get the sense that the boolean framing usually isn't useful, and so my position is that you should beware of boolean disagreements.



Discuss

EA ITT: An attempt

2025-11-26 05:59:13

Published on November 25, 2025 9:59 PM GMT

Every now and then, I hang out with people who tell me they're doing Effective Altruism. I won't claim to have read even the standard introductory texts, and Ord's The Precipice gifted to me still gathers dust on the shelf, while the ever-practical collection of Nietzsche's works still feels worth revisiting every now and then. That by itself should tell you enough about me as a person. Regardless, for a long time I thought I knew what it was about, namely Purchasing Fuzzies and Utilons Separately, along with some notion of utility meaning something more than just the fuzzies. I already subscribed fully to the idea of effectiveness in general. Altruism, on the other hand, seemed like something confused people did for the wrong reasons and others pretended to do for optics. A typical cynical take, you might say.

One of the most powerful moves in status games is denying that you're playing the game at all. This is how you notice people way above your level. Or way below, but missing that is rare. For a long time I thought EA was about this; publicly claiming that you're only optimizing for the most effective use of resources to avoid falling into the trap of spending most of the money on visibility or awareness campaigns and such. Simply working highly-paid jobs and silently giving that money to the best charities one could find. This turns out to be not true, and instead optimizing the ratio between long-term and short-term benefits is one of the key concepts. This clearly is the effective way to do things, but I've got something against telling other people what's morally right. Then again, it's just my intuition and is based on nothing in particular. Just like every moral philosophy.


Passing the Ideological Turing Test is a good way to check that you understand the points of people with a different world view. After using the cynically-playful "effective anti-altruism" as one of my you-should-talk-to-me-about topics in the LWCW 2024 names & faces slides, some people (you know who you are) started occasionally referring to me as the "anti-EA guy". After such an endorsement, it would be prudent to verify I actually know what I'm talking about.

So for the next part I'm going to do a short-form self-Q/A -style steelmanning attempt of EA. It will be my cynical side asking the questions, where the ones I found online don't suffice. It should go without saying, I hope, that I don't necessarily believe the answers; I'm just trying to pass the ITT. I've timed about an hour to write this, so if some argument doesn't get addressed, blaming the time pressure will be my way to go.


Why should you help others?

Suffering is bad. It's universally disliked. Even the knowledge that someone else suffers causes suffering. We should get rid of it.

Avoiding physical pain isn't everything. The scale certainly goes above zero, and we should climb up. Maslow's Hierarchy of Needs explains quite well what we should aim for. For instance, creativity and self-actualization are quite important.

Why not just minimize your own suffering, and the suffering you can observe? Isn't it enough to pay your taxes and not bother anyone else?

I aim to be the kind of decision theoretic agent that would minimize their suffering given any set of starting circumstances. Rawls's veil of ignorance explains this quite well.

There's another side to this, too. My first introduction to the topic likely was this:

Disclaimer: It's your personal responsibility to rise above the ethical standards of society. Laws and rules provide only the minimal groundwork for culture. You enjoy the fruits of labor of countless generations of human progress before you, therefore it's only fair to contribute to the common good. Together we can make it happen.

-DEMO2 by Ekspert

I still find that quite persuasive.

But most people are evil, why would you help them?

Almost nobody sees themselves as evil. They're doing their best too, and sadly they're starting from a worse position, knowledge-wise. We still hold the same underlying values, or at least would if we were capable of perfect rationality. Sadly, nobody is, so the coordination issues look like value differences.

They'll still do harmful things, right?

And that is indeed a tragedy. Welcome to Earth.

-Yudkowsky: Are Your Enemies Innately Evil?

If you keep giving away your resources, doesn't that mean whoever doesn't will get ahead?

"Destroy what you love, and I grant you victory!", bellows Moloch. How about no?

Sure, we should be robust to others using our generosity against us. It's allowed and sometimes necessary to be ruthless, cold and calculating, outcompeting others; if and only if the underlying values are retained. Still, even if your actions are indistinguishable from theirs, the overhead of maintaining your values is expensive. Which means that by default you'll lose, unless you can build a coalition against racing-to-the-bottom, and indeed, against nature itself.

What do you think of the existential threat from AI?

Our top priority at the moment. If it were up to me, almost all other efforts would cease so we could work on this. But people already working in other areas have great momentum and are doing quite a bit of good. It doesn't make much sense to cease that work, as efficient reallocation of resources seems unlikely.

What's your opinion on wireheading?

Not a good idea. It goes against my sense of aesthetics, but so do many other things that are actually worth it. It would be a huge mistake to do this before we have fully automated resource-efficient research and space exploration. But after that? Still a mistake.

Yudkowsky writes about this in the High Challenge:

The reason I have a mind at all, is that natural selection built me to do things—to solve certain kinds of problems.

"Because it's human nature" is not an explicit justification for anything. There is human nature, which is what we are; and there is humane nature, which is what, being human, we wish we were.

But I don't want to change my nature toward a more passive object—which is a justification. A happy blob is not what, being human, I wish to become.

I fully agree.

Which path would you choose in Three Worlds Collide?

That's a hard one. It does partially depend on the exact numerical facts of the situation, which Yudkowsky omits. If the prospects of humankind's long-term survival look decent, then the diversity of experience should probably be valued highly enough to refuse the deal. But if that occurred tomorrow, modified to match our current situation of course, I would definitely go with the cooperative option.

Should we destroy the whole world because it contains so much suffering?

No. Even if the current world wasn't worth having, there's potential to create so much good in the future. And it's not like we could do that anyway anytime soon.

Should we avoid eating meat?

Obviously. But that might not be the best use of your resources right now. Enough advocacy is already done by others, focus on places where more impact is possible.

Should we stop wild animals from eating each other?

Yes. Not currently feasible, but it might be in the future.

Doesn't that go against the diversity of experience you talked about before?

It does, but with sufficient technology we can mostly avoid this. The predator will have the exact same experience, it's just that there won't be any real prey that suffers.

Cryonics?

A good idea. Not even that expensive if done in scale. Currently not sensible to fund compared to the existential threat from AI though.


Just writing this changed some of my views. Scary stuff.

I might update this in the future if I think I'm missing something important.



Discuss