
Snippets on Living In Reality


Published on November 26, 2025 4:38 AM GMT

Social reality is quite literally another world, in the same sense that the Harry Potter universe is another world. Like the Harry Potter universe, social reality is a world portrayed primarily in text and in speech and in our imaginations. Like the Harry Potter universe, social reality doesn’t diverge completely from physical reality - they contain mostly the same cities, for instance. Like the Harry Potter universe, social reality matches physical reality “by default” wherever there’s no particular pressure pushing against that match - after all, it’s just easier to make up fewer details rather than more. But like the Harry Potter universe, social reality does diverge in many places, and then the “fandom” tries to make up as coherent an interpretation as they can. Indeed, it’s that drive-toward-coherence which pushes both social reality and the Harry Potter universe to be fictional worlds, as opposed to just topics or genres about which people say lots of conflicting things.

And like the Harry Potter universe, sometimes people in the physical world do physical things as a result of goings-on in social reality.


What makes physical reality different from social reality or the Harry Potter universe or any other fictional world? Well, if we change all of our representations of the Harry Potter universe - every copy of every Harry Potter book and movie, every human's recollections, etc - then we've effectively changed the Harry Potter universe. There's nothing else left, besides the symbolic representations.

But if we changed every record of the night sky, altered every photograph and diagram, altered every human's recollections, etc... then the next night an awful lot of people would look up and be very, very confused, because the stars would just keep doing their thing even with all of our representations changed. Physical reality is that one special world which we don't get to choose just by moving around the symbols and representations.


While getting shawarma, I overheard some girls talking about some kind of conference they were at - it sounded like a women-in-CS kind of thing? Anyway, one of them seemed only able to talk about the social impressiveness markers of the speakers. It sounded like she just… didn't perceive the object-level contents of anything they were saying at all? (Which, to be clear, may have been intended - i.e. maybe the speakers themselves were just trying to show their social impressiveness markers and didn't say anything real at the object level.) Anyway, it was… weird to see this apparent inability-to-perceive reality, like she could only see the social world.


I live in reality. I like it here. It is my home. Insofar as others seek to escape it, that just leaves more reality for me to claim; and by choosing to live more fully in reality, I gain greater claim to it.


Google Maps lives in reality.

LLM output usually doesn’t live in reality, at least not for very long. Like social reality and the Harry Potter universe, the worlds which LLMs write about are similar to ours by default wherever nothing in particular is pushing them to be different. But any time the LLM makes up some detail, and then continues to produce text in a manner consistent/coherent with that made-up detail, it’s diverged from physical reality and has begun to paint a fictional world.

Likewise, much/most internet content doesn’t live in reality. Much of it is explicitly fiction. Much of the remainder is social reality, or a picture of the world painted by the egregore of some political faction, or stuff that somebody made up but then backed up with a whole fictional world model (like e.g. some conspiracy theories).

For someone who doesn't usually perceive physical reality much, someone who mentally lives in e.g. social reality or one of the political worlds, do LLMs' worlds seem real?


“Should world” or “moral reality” is another different world.


Virtual reality (VR) shows other worlds. Augmented reality (AR) is intended to live in our world. That's the root cause of the difference in difficulty between the two technologies: part of living in our world is that we don't get to choose most of it just by saying things or by writing a few lines of code. We can rewrite an entire virtual reality just by changing the code which specifies it. We can't do that for physical reality; there are parts beyond the programmer's control. That's the main reason why making VR apps is so much easier than making AR apps, to the point where the former is basically-solved and the latter is basically-unsolved.


Maybe LLMs are impressive in the same way as VR: they can display elaborate constructed worlds, but can’t deal with the real world. That would be because, whenever they make up some detail, however minor, there is no particular force to push them back into agreement with physical reality; no tight feedback loop. They just wander further and further off from physical reality as they generate more tokens, maintaining coherence to a large degree, and thereby creating a more and more fictional world.


Is the “fake vs real thinking” distinction about the same thing? Like, thinking which uses mental representations of the real world vs mental representations of other worlds? Is that the True Name of “real thinking”? If so, then tracking “which world does this live in” at all times could maybe lock in real thinking much more robustly.

It sure does match the opening C.S. Lewis quote perfectly: "There comes a moment when the children who have been playing at burglars hush suddenly: was that a real footstep in the hall?"


How would more data and software live in reality? What would the Great Generalized Google Maps project include, in what order? What data structures would it use for more abstract things?

How would this very essay be represented in the great generalized maps project?




Evolution & Freedom


Published on November 26, 2025 3:38 AM GMT

In Against Money Maximalism, I argued against money-maximization as a normative stance. Profit is a coherent thing you can try to maximize, but there are also other sorts of value. Profit-maximization isn't the unique rational way to engage with money.

One way you could respond to this is economic Darwinism: "Sure, you can optimize other things than money, but over time, the market will come to be dominated by money-maximizing agents."

I think this is misguided in several respects.

First, obviously, the values of probable future societies aren't necessarily your values. Even the values of the dominant order in current society aren't necessarily your values. Even if your cause is hopeless, that doesn't automatically mean that you should flip values.

Second, I don't really think markets evolve money-maximizers, in the same sense that evolution doesn't really evolve fitness-maximizers. I see at least three senses in which this is true.

Building fitness-maximizers is hard.

Evolution will aggressively prune behaviors which harm fitness (relative to readily accessible mutations), but this doesn't exactly add up to fitness-maximizing organisms. Similarly, common practices in business aren't exactly what economists would suggest. Economists say that profit-maximizers should set prices by finding the point where marginal revenue equals marginal cost. Real companies almost always calculate price as cost plus markup, instead. This is, at least in part, because it is difficult to know the demand curves needed to find the point where marginal revenue equals marginal cost.
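To make the contrast concrete, here is a worked example of my own (not from the original post), assuming linear demand and constant marginal cost:

```latex
\text{Demand: } p(q) = a - bq, \qquad \text{marginal cost: } c.\\
R(q) = p(q)\,q = aq - bq^2 \;\Rightarrow\; MR(q) = a - 2bq.\\
MR = MC: \quad a - 2bq^* = c \;\Rightarrow\; q^* = \frac{a - c}{2b}, \quad p^* = \frac{a + c}{2}.\\
\text{Cost-plus with markup } m: \quad p = (1 + m)\,c.
```

The first rule requires knowing the demand parameters a and b, which real firms rarely do; the second requires only the cost c.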

I'm essentially saying that the inner alignment problem is hard: an outer loop selecting for X doesn't necessarily produce agents who try to maximize X. Instead, it often produces agents who care about things correlated with X.

Evolution is more centrally about weeding out losers than selecting winners.

A big shift in my thinking about evolution and economics has been realizing that firms/organisms don't necessarily have to be profitable; they just have to survive.

The airline industry is a low margin business. We all know that. In the years from 1945 to the end of the twentieth century the global airline industry made total net profits of $36bn representing a margin of 0.8% to revenues. In the first decade of this century the industry generated net losses of $49bn — a third more than it had ever made. (source)

I believe grocery stores and restaurants are also examples of notoriously low-margin businesses.

In a narrow profit-maximizing mindset, there's no reason for non-profitable businesses to exist. Yet, in theory, so long as a business can keep bringing in as much money as it spends, it can persist indefinitely.

Notice that although such a business has zero value from a profit-maximizing perspective, it can still produce a great deal of value as measured by serving customers and paying employees.

I think approaching economics with a profit-maximizing mindset will wrongly heuristically suggest to your mind that there's one winning strategy which everything tends towards; perhaps something like "buy under-valued things, transform them into over-valued things, sell". Instead, there's an extreme diversity of viable strategies. A growing market tends to make things more efficient, pushing out less-efficient ways of doing things via competition; however, an even more important effect seems to be that a growing market creates more diversity of strategies, as new businesses become viable. This diversifying effect seems to outpace the anti-diversity impact of the efficiency incentive.

These effects are perhaps better-documented in the case of biology. 

There's speciation, the process of one species becoming two or more.

There's the niche, a viable overall survival strategy. 

There's adaptive radiation, which means rapid evolution from one species occupying one niche to many occupying many. For example, Darwin's finches underwent adaptive radiation after landing on an isolated archipelago (the Galapagos), filling many niches by adapting their form to specialize in a way of life.

There's the concept of ecospace filling, a long-term trend for life to occupy more niches over time. One illustration of this is parasites with complex multi-host lifecycles. This survival strategy seems absurd from a fitness-maximization perspective. How did it get started? Yet, it can happen, and over long enough timescales, it eventually will, and a species will emerge adapted to the niche.

There's niche construction: the process by which organisms change their environment and create new selective pressures, which can modify existing niches or create new ones. Trees are a highly-evolved light-eating organism: their hard wood flesh allows sustained growth to extreme heights, their branching structure efficiently encompasses a volume while using a minimal quantity of such flesh, and their leaves efficiently metabolize sunlight within that volume (concentrated towards its surface, since the interior gets shaded by the leaves). This creates many opportunities for other organisms, from birds nesting in trees to fungus rotting dead wood.

Niche construction brings up a final point against the fatalistic argument for money-maximization / fitness-maximization:

Organisms determine the fitness function.

Trees give rise to forests, which support a whole different ecology than would otherwise exist.

Evolution isn't maximizing some static notion of fitness. What "fitness" means is determined by the surrounding world, including other members of a species as well as the surrounding ecosystem.

For example, sexual selection is the selective pressure created by mates. This can create famously inefficient adaptations, such as a Peacock's tail. In some sense, this is a species "deciding" what its values are and modifying the fitness function to select for that.

I've talked to people who find this abhorrent: who see Peacock-tails as a failure mode, a symbol of lost purposes. I think I can see the intuition here. There's a feeling of fake-ness, forms insulated from broader reality, decadence. It's the flipside of the beauty of an efficient form, like a hawk's wings or a tree's branches. 

However, there isn't in fact a """broader reality""" here. Instead, there's just reality, which contains a huge variety of life-situations. The study of "fitness" isn't the study of any single final form, but rather, the study of how organisms adapt to specific situations.

Similarly, a market doesn't select for money-maximization. Rather, the market demands whatever it demands. When a market is working well, it supplies the various demands in an approximately Pareto-efficient manner. What the world looks like under that optimization pressure depends on what is being demanded!

In a future optimized by intelligent beings, there's no special reason why money-maximizing needs to be the convergent equilibrium. Even the market could be discarded.

Speaking for myself, Peacock-tails don't feel like a failure mode. Instead, it feels like a sort of freedom. Beings trapped in evolution by natural selection nonetheless exercise a sort of choice over their trajectory. Beings trapped in a market economy still get some choice over what future they collectively create.




Reasons Why I Cannot Sleep


Published on November 26, 2025 3:37 AM GMT

My therapist says I'm more tired today than she's ever seen me. Here are some reasons my brain says I cannot sleep:

  1. My boss might lose faith in my ability to manage projects. Something important might go wrong with my project (Inkhaven), and I am the only person on-call who has the context to fix it all. If I screw it up he might no longer be willing to give me big projects like this.
  2. Annoying social drama. There was one yesterday that made me anxious and unable to relax until it was dealt with. I had just come out of a massage that was supposed to relax me, and yet I was immediately un-relaxed for a few hours.
  3. I associate being in bed with checking Reddit/YouTube. Whoops. Perhaps I should have better spaces to do this, so that I never do it in bed? Alas, I live in a single room with a bed and a desk; this is my private space. I have a hard time just going to sleep in bed.
  4. There are 50-60 people on campus who I am responsible for. When people are around, they ask you questions and ask you to solve their problems. All. The. Time. Being seen is an invitation for you to solve a problem. Eye contact is an invitation for you to solve a problem. Saying "hi" is an invitation for you to solve a problem. (This is my job and normally I am happy to do it, but it's hard to turn it off.)
  5. There are many Slack messages waiting for me. I guess this is on me for having a habit of hitting Slack inbox-zero and an org philosophy of Slack Maximalism.
  6. I said I'd sing a song at the open mic tonight. Alas, I was looking forward to that. It's my favorite of the songs I know. Well, I've had to cancel.
  7. My mum wants to catch up with me. I did fly her halfway across the world to be here, so that's understandable. Fortunately I am keeping her here for 2 months, so it's fine to say no sometimes.
  8. I'm having a depressive spiral. Given my lack of sleep, everything else in my life looks worse and like a problem I cannot overcome and will feel shame for failing at. It helps to remember that I feel this way substantially because I'm tired and not because the things are as bad as they seem. Though I am indeed pulling to mind the worst issues that I am facing, and also some of my insecurities being exposed/tested.
  9. Because I need to make plans for a few specific people who are leaving Inkhaven tomorrow. Two residents will be leaving for Thanksgiving and not returning; and one contributing writer will leave in the morning. I want to make sure that the residents have a nice sendoff, and that the contributing writers' time is well-spent.
  10. Because I have to write a blogpost else I have failed out of Inkhaven. I am probably the person at Inkhaven with the most work to do during Inkhaven, but it isn't good for the leader of Inkhaven to fail the daily blogging challenge, so he should figure out a nice short post to write that he can publish to keep up with the challenge.



Training PhD Students to be Fat Newts (Part 2)


Published on November 26, 2025 2:23 AM GMT

[Thanks Inkhaven for hosting me! This is my fourth and last post and I'm already exhausted from writing. Wordpress.com!]

Last time, I introduced the concept of the “Fat Newt” (fatigue neutral) build, a way of skilling up characters in Battle Brothers that aims to be extremely economical with the fatigue resource, relying entirely on each brother’s base 15 fatigue regeneration per turn. This choice frees up stat and skill points to distribute evenly among offense, defense and utility. To illustrate, let’s compare two possible ways to build a bro.

The first brother is a Nimbleforged Cleaver Duelist, who wields the weighty and brutish one-handed Orc Cleaver. To successfully attack multiple times a turn, this brother needs sky-high base stats - attack, defense, HP, fatigue, and resolve - and a continuous investment of stat points to reach around 70 maximum fatigue. Furthermore, this build requires a specific suite of offensive and fatigue-recovery perks to function, such as Berserk, Killing Frenzy, Duelist, Cleaver Mastery, and Recover. Only the most seasoned of Hedge Knights and Sellswords can pull this build off consistently.

This brother also needs to stay alive in the thick of battle, so optimally he wears “Nimbleforged” armor, famed medium armors light enough not to eat up your fatigue pool and heavy enough to benefit from the Battleforged defensive perk. You might only drop a set of good Nimbleforged armor once or twice a campaign.

The second brother is a Fat Newt, who requires only 15 maximum fatigue to move and attack once a turn. Practically any brother - from Farmhands to Swordsmasters - who rolls high attack and defense can become a Fat Newt. By ignoring the fatigue stat, this brother can use those saved points to shore up weaknesses in HP and Resolve. And since he needs so little fatigue, he can wear the bulky standard-issue Coat of Plates and wield the mighty two-handed Greataxe.

The Fat Newt also has a lot of slack distributing perk points. Instead of mandatory offensive perks like Berserk and Killing Frenzy, he takes Quick Hands (allowing him to swap to a Longaxe in a pinch to decapitate at range) and Fortified Mind (further bolstering his psychological defenses).

I want to make three salient points about the contrast between “Hero” type brothers like the Nimbleforged Cleaver Duelist on the one hand, and Fat Newts on the other:

  • Heroes are one in a thousand, and Fat Newts are one in ten. Nimbleforged Cleaver Duelists require extraordinary luck and specialized gear to optimize. Fat Newts still have to roll well to function, but there is plenty of slack to shore up weaknesses, so they can basically be mass produced. If you want to build a company of twenty brothers, you cannot fill it with only Heroes unless you are willing to trawl through recruits for hundreds of in-game days.
  • Fat Newts are not just “budget” Heroes, and Heroes do not Pareto dominate Fat Newts. Heroes are stretched so thin that they have real weaknesses and require a lot of babysitting. Generally speaking, Fat Newts will have more survivability and more utility, and they can often act as a menacing offtank to hold key defensive bottlenecks in the battlefield. Their increased utility allows them to save teammates in sticky situations and play more varied roles as the situation demands.
  • The effectiveness of the fatigue stat scales in a complicated nonlinear way. A Nimbleforged Cleaver Duelist with 70 maximum fatigue can chew through a critical flank by himself, fighting on overdrive for four or five turns before tiring out. That time is often enough to decide the flow of the entire battle. The same brother with 35 maximum fatigue is much less than half as effective - he runs out of stamina on turn two, and then stares impotently as the enemy surrounds and overpowers his allies.

Primarily, my intention with this post is to convey a set of intuitions - derived from over three hundred hours of Battle Brothers - about what it might mean to be a working mathematician, Fat Newt style. 

Young people often learn by imitation and emulation, doubly so when they lose themselves in the maze of interlocking cults of personality that is academia. What ends up happening is that battalions of young mathematicians fixate on superhuman “Hero” types - Terry Tao, Peter Scholze, Andrew Wiles, Alexander Grothendieck and so on - mathematicians imbued with four or five standard deviations of intelligence, work ethic, and monomania, and try to “copy their build.” This turns out to be ineffective, maybe even surprisingly so.

I think there is an inarticulate mass delusion that might be called the "Half-a-Hero Trap": be half as smart as Terry Tao, work half as many hours, copy his behavior line-by-line otherwise, and one can hope to become a quarter as good of a mathematician. A quarter-Tao is still an impressive success, after all.

The real scaling laws here are much, much less kind. Setting the murky waters of intelligence scaling aside, let's talk about productivity. One literal interpretation of fatigue is the number of productive work-shaped hours one can output in a week. If Alice has the capacity to work 70 hours a week, and Bob only 35, Bob is unfortunately much less than half as effective as a researcher. To make the point simply, if Alice and Bob both have 15 hours of teaching a week, then Alice's 70 - 15 = 55 free hours are more than twice Bob's 35 - 15 = 20.

Even worse, the best mathematicians are hired to positions with the highest salaries, burdened by the least teaching responsibilities, at universities with the easiest students to manage. Think about the difference in research output between a mathematician who natively works 70 hours a week and only teaches the advanced probability seminar once a year, and the same mathematician who can only work 35 hours a week and teaches three sections of freshman calculus every semester. The difference in available research hours is staggering. The former flourishes and the latter stagnates.

As I wrote in Gravity Turn, the work of getting into orbit is categorically different from the work of staying in orbit. I propose that in a world where almost every PhD student is falling into the Half-a-Hero Trap, there are vastly superior models of skilling up - analogous to the Fat Newt build - that do not look like “imitate the nearest Fields Medalist.” Let me give two examples.

First, time management. Many students who are only capable of working 35 hours a week imitate the outward behavior of mathematicians who work 70. They go to the same number of seminar talks and conferences, spend the same amount of time teaching and grading, and attend the same number of departmental social activities. The well-meaning professor counsels his student to attend five hours of class and five hours of seminars a week to broaden her horizons, oblivious to the sheer fraction of her meager productive hours this sucks up. I suspect this category of error is a font of bad decisions for graduate students.

Second, self-reliance. Just as the Cleaver Duelist may be able to jump into battle alone and mow down his enemies (though even for him this is a dangerous gamble), great mathematicians are often cast as lone geniuses, operating far outside of the capacity and understanding of their peers. Fat Newts, on the other hand, operate best in the middle of the fighting line, holding key positions and working in tandem to control and overwhelm important targets that they would be unable to handle alone. There is a whole separate post to be written about this, but briefly, I think that PhD training systematically overlooks building the skills needed to play a supporting role in a research team.

I must end on a sobering thought - even in the best-case scenario, the Fat Newt style is not a magic bullet that makes every recruit useful. In Battle Brothers, only one in ten brothers are generated with the stats to become an acceptable Fat Newt. My observation is that there are many graduate students, who are not generationally talented, who can only genuinely work 10 or 20 hours a week, if that. For them, I see no clear path forward.




Things I wish I knew to save GPU minutes on Llama 405b model (and other beasts)


Published on November 25, 2025 10:56 PM GMT

The goal of this post is to share how easy it is to load the Llama 405B model on RunPod, but also how costly it can be if you don't know some things in advance. I hope it will help you save those precious GPU minutes!

First, the Llama 405B model is huge: at 405 billion parameters, the weights alone come to roughly 800 GB in BF16 (about 400 GB in FP8).

Let’s talk GPU!

You need the right choice of GPU and enough disk space to store the model parameters, plus some overhead for your experiment, like running inference and saving internal activations.

Some good options are H100, A100 and H200 machines:

The H200 currently wins on both cost and inference speed. With FP8 quantization and higher VRAM per GPU, you only need 5 H200s compared to 8 A100/H100 GPUs for the same setup. This makes the H200 cheaper overall ($17.95/hr vs $21.52/hr for H100), while also giving roughly 3× faster inference.
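As a rough sanity check on those numbers (my arithmetic, using the standard VRAM figures of 141 GB per H200 and 80 GB per A100/H100):

```latex
\text{FP8 weights: } 405 \times 10^9 \text{ params} \times 1 \text{ byte/param} \approx 405\ \text{GB}.\\
5 \times 141\ \text{GB (H200)} = 705\ \text{GB}, \qquad 8 \times 80\ \text{GB (H100/A100)} = 640\ \text{GB}.
```

Both setups leave a couple hundred GB of headroom over the weights for the KV cache and activations.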

Providers

There are some very good GPU providers out there, such as RunPod, vast.ai, and Lambda, and there are websites that compare the different GPU providers.
I decided to go with RunPod because of its reliability, simplicity, and competitive pricing.

These are FOUR tips you should know to save your GPU minutes!

  1. Preparation is key! Develop locally first: package and test your code on your local machine before deploying to RunPod, to avoid wasting GPU time on debugging.
    1. Prepare a startup script to load your credentials and install necessary packages like poetry. You will scp it to the pod and run it there.
    2. Push your code to GitHub and create a package setup script.
    3. Before spinning up an expensive GPU setup, test your pipeline end to end on the smallest model in the family; in my case I used the Llama 1B model and fixed all errors/bugs at minimal cost.
  2. Code and storage:
    1. For active code and temporary data, use the /root directory: it is on a local NVMe SSD, not network storage, with significantly faster read/write speeds for data processing.
    2. /workspace is RunPod's persistent network storage, so keep your models, cache, and input/output data there. /workspace is also much larger.

To set this up, it's essential to set the Hugging Face and Transformers cache environment variables.

This script is an example of how to do this (a minimal sketch; the exact paths under /workspace are my own choice):
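```python
# set_hf_cache.py - point all Hugging Face caches at the persistent
# network volume; run this before transformers is imported anywhere.
import os

os.environ["HF_HOME"] = "/workspace/hf_home"                # general HF cache root
os.environ["TRANSFORMERS_CACHE"] = "/workspace/hf_models"   # model weights
os.environ["HF_DATASETS_CACHE"] = "/workspace/hf_datasets"  # datasets

from transformers import AutoModelForCausalLM  # import only after the env vars are set
```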

  3. Network Volume is your friend! Even when you save all your cache, data, and models under /workspace, there is still a risk of not being able to spin up a new pod with the same data: if no pods are available, you may get downgraded to a zero-GPU pod without being able to spin up a new GPU pod 🙁. Remember, the Llama model is HUGE and takes around 30 minutes to download, so you don't want to waste money downloading it each time you create a new pod; you want it accessible from different GPU/CPU setups.
    So this is where a network volume comes in 🙂! It is:
  • independent of your Pod
  • fast: Typical transfer speeds: 200–400 MB/s, with possible peaks up to 10 GB/s
  • cheap: $0.07 per GB per month for the first 1 TB; $0.05 per GB per month beyond that.
    It’s very easy to set up, BUT make sure you check the data center you choose has your target GPUs and preferably multiple GPU types available as some GPU types may not always be available.
  4. Model Loading Optimization:

Trick 1: When loading large models on RunPod, use these flags:

  • low_cpu_mem_usage = True: Despite its name, this flag helps with both CPU and GPU memory:

Without this flag: Loads all weights at once, creating temporary copies that cause memory spikes of 2-3x the model size.

With this flag: Builds the empty model first, then loads weights one tensor at a time, immediately assigning each to its target layer. Peak memory stays close to the final model size. This allows loading larger models on smaller RunPod instances.

  • device_map="auto": enables direct loading to target devices across multiple GPUs, without staging everything in CPU RAM first.
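Putting both flags together, a loading call might look like this (a sketch; the model id is illustrative, and a real 405B FP8 setup would additionally need a pre-quantized checkpoint or a quantization config):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-3.1-405B-Instruct"  # illustrative; any causal LM id works

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    low_cpu_mem_usage=True,      # build an empty model, then load weights tensor by tensor
    device_map="auto",           # shard layers across all visible GPUs automatically
    torch_dtype=torch.bfloat16,  # half the memory of the default float32
)
```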

Trick 2: Use FastAPI for model loading and serving. If running multiple experiments, use FastAPI to load the model once and expose API endpoints. This also keeps experiment code modular and separate from model loading, so you can change it constantly without having to reload the model on every change:
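A minimal sketch of that pattern (the file, endpoint, and model names are my own; run it with uvicorn):

```python
# server.py - load the model once at startup; experiment scripts hit the
# HTTP endpoint instead of reloading hundreds of GB of weights per change.
import torch
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "meta-llama/Llama-3.1-405B-Instruct"  # illustrative

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, low_cpu_mem_usage=True, device_map="auto", torch_dtype=torch.bfloat16
)

app = FastAPI()

class GenerateRequest(BaseModel):
    prompt: str
    max_new_tokens: int = 128

@app.post("/generate")
def generate(req: GenerateRequest):
    # tokenize on the model's first device; accelerate handles cross-GPU dispatch
    inputs = tokenizer(req.prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output = model.generate(**inputs, max_new_tokens=req.max_new_tokens)
    return {"completion": tokenizer.decode(output[0], skip_special_tokens=True)}
```

Start it once with `uvicorn server:app --port 8000`, then iterate freely on experiment scripts that just POST to /generate.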

 

And of course, don't forget to terminate your pod when you're done :) Please share any efficiency tips for working with expensive GPUs in the comments!




Three positive updates I made about technical grantmaking at Coefficient Giving (fka Open Phil)


Published on November 26, 2025 1:09 AM GMT

Coefficient Giving's (formerly Open Philanthropy's) Technical AI Safety team is hiring grantmakers. I thought this would be a good moment to share some positive updates about the role that I've made since I joined the team a year ago.

tl;dr: I think this role is more impactful and more enjoyable than I anticipated when I started, and I think more people should consider applying.

It’s not about the “marginal” grants

Some people think that being a grantmaker at Coefficient means sorting through a big pile of grant proposals and deciding which ones to say yes and no to. As a result, they think that the only impact at stake is how good our decisions are about marginal grants, since all the excellent grants are no-brainers.

But grantmakers don’t just evaluate proposals; we elicit them. I spend the majority of my time trying to figure out how to get better proposals into our pipeline: writing RFPs that describe the research projects we want to fund, or pitching promising researchers on AI safety research agendas, or steering applicants to better-targeted or more ambitious proposals.

Maybe more importantly, cG's technical AI safety grantmaking strategy is currently underdeveloped, and even junior grantmakers can help develop it. If there's something you wish we were doing, there's a good chance that the reason we're not doing it is that we don't have enough capacity to think about it much, or lack the right expertise to tell good proposals from bad. If you join cG and want to prioritize that area, there's a good chance you'll be able to make a lot of work happen there.

How this cashes out is: as our team has tripled headcount in the past year, we've also roughly tripled the number of grants we're making, and we think the distribution of impact per dollar of our grantmaking has stayed about the same. That is, we've about tripled the amount of grant money we've moved at the top end of the impact distribution as well as at the marginal end.

To be even more concrete, here’s one anecdote I can share. About a year ago, Jason Gross asked me for $10k for compute for an experiment he was running. I spoke to him a few times and encouraged him to make grander plans. The resulting conversations between him, me, and Rajashree Agrawal led to me giving them a $1M grant to try to found something ambitious in the formal software verification space (I’m reasonably excited about FSV as a def/acc + mitigating reward hacking play.) They eventually founded Theorem, a startup focussed on formal software verification, which went on to be the first FSV startup accepted to YC, and they subsequently raised at one of the largest valuations in their cohort. Jason and Rajashree say that they would have been very unlikely to set their goals that big without my initial grant. Nothing about that seems marginal to me, yet it wouldn’t have happened had I not been here.

There is no counterfactual grantmaker

When I was offered the job a little over a year ago, I was told that I was the only candidate still being considered for the role, and that there was no one left to make offers to if I didn’t accept. In our current hiring round, we’d like to hire 3-4 technical AI safety grantmakers, but once again it’s far from obvious that we’ll find enough candidates that meet our bar. If you get an offer and don’t take it, the likeliest result is that we hire one fewer person.

Why is this? I think the main reason is that fewer people apply to our roles than you might expect (if you’ve already applied, thank you!). We are looking for people who could succeed in a research career, and most such people don’t want to leave research. It also helps for this role if you are well networked and have a lot of context on technical AI safety. Most people with a lot of context are settled in their roles and unlikely to apply. Separately, the set of skills required to be a good grantmaker includes some things that aren’t as important for being a good researcher, so occasionally strong researchers who apply have disqualifying qualities, even people who on paper seemed like they might be really good.

What this all means is that our top candidates end up being extremely counterfactual. Their acceptance or rejection of the role doesn't just improve outcomes very slightly relative to some other person we could have hired, but counterfactually causes tens of millions of dollars to move out the door to really impactful projects that wouldn't have otherwise been funded.

If we're so starved for grantmaker labor, why don't we lower our hiring bar? I think we’re going to have a slightly lower bar than we’ve had in the past; we really want to fill these roles. But also, we think there are diffuse long-term negative effects of seriously lowering our hiring bar. I acknowledge that perhaps we're making the wrong tradeoffs here.

(If you feel moved to apply by the counterfactual argument, but would drop out if it turns out that we think we have enough other good applicants, please feel free to indicate that in your application. If we get an unexpected windfall of strong applicants, such that we have more qualified candidates than we can hire, we’ll be happy to let you know, and there will be no hard feelings if you drop out.)

Grantmaking is more fun/motivating than I anticipated

Before I joined OpenPhil, I was about as “research archetype” as they get. I spent most of my time thinking about wacky theory math ideas. My work style was chaotic-academia: I went to bed at random times and worked at random times and in random places, mostly on whatever interested me at the time. 

Now I have a team and a manager, and I have lots of things that need to be done. I am not planning to have any papers with my name on them in the foreseeable future. But I'm really enjoying it! So why am I enjoying it more than you might expect, and indeed more than I expected going in? Some factors:

  • I do actually spend a decent fraction of my time thinking about object-level technical stuff. Mostly that looks like talking to top researchers, but it also looks like reading academic papers, and going to conferences to talk to people about their research or argue about AI safety. That can be a big part of the job if you want it to be.
    • One thing that's nice about the technical roles at cG is that we have a lot of generalist grantmakers but not a lot of technical specialists. That means that people are focused on leveraging the technical expertise of those who have it efficiently, which means that I get to spend a larger than expected fraction of my time on technical stuff. For example, with large grants that have a significant technical and non-technical component to the investigations, I often pair with a generalist grantmaker who will take the non-technical aspects off my hands. I get to spend more time thinking about which research agendas are promising, and less time worrying about whether an organisation has a healthy board of directors etc, than I expected.
  • The work I'm doing feels important and tangible. I go to conferences and see people walking around and giving talks who wouldn't be in the room without my grant. A promising junior person mentions they just got a job at an org I funded to grow its headcount. Maybe it's cliche, but I actually do find that seeing the effects of my work on the shape of the field is pretty motivating.
  • I'm significantly more empowered than I expected to be when I joined. I've been given much more trust than I expected, and I've been empowered to make decisions based on my inside view. My manager is constantly pushing me to take on bigger projects, be more ambitious and creative, and be more agentic. As a result, I think I have become more ambitious and agentic, and noticing that in myself has been very motivating. I think if you think that a more agentic, ambitious version of yourself is someone you'd like to grow into, then this might be a good role for you, even if you're not sure how well that will go yet.
  • This role can be very sociable. I spend a lot of my time talking to researchers about the research they're doing and why. I don't get as much time to spend on the low-level technicalities, at least not in my day-to-day, but I find that high-level strategic thinking which still interfaces with technical details can scratch much of the same itch as doing the research myself. I also think the strategic questions about technical AI safety and the future of AI are extremely interesting in their own right.

Please apply!

If all this sounds appealing to you, you can apply here by December 1st! Our team funds a lot of great research – from scaling up research orgs like Redwood or Apollo, to eliciting projects like those described in our RFP, to proactively seeding new initiatives. 

Last year, the Technical AI Safety team made $40 million in grants; this year it’ll be over $140 million. We want to scale further in 2026, but right now we only have three grant investigators on the team, so we’re often bottlenecked by our grantmaker bandwidth. If you think you might be a strong fit, your application could be the difference between us finding the right person or leaving a role unfilled. If you have more questions, you can dm me on LW or reach out to me at jake [dot] mendel [at] coefficientgiving [dot] org


