2026-01-20 08:58:08
Published on January 20, 2026 12:58 AM GMT
In a software-only takeoff, AIs improve AI-related software at an increasing speed, leading to superintelligent AI. The plausibility of this scenario is relevant to questions like:
Knowing when and how much I expect to learn about the likelihood of such a takeoff helps me plan for the future, and so is quite important. This post presents possible events that would update me towards a software-only takeoff.
The key variable determining whether software progress alone can produce rapid, self-sustaining acceleration is returns to software R&D (r), which measures how output scales with labor input. Specifically, if we model research output as

O ∝ I^r

where O is research output (e.g. algorithmic improvements) and I is the effective labor input (AI systems weighted by their capability), then r captures the returns to scale.
If r is greater than 1, doubling the effective labor input of your AI researchers produces sufficient high-quality research to more than double the effective labor of subsequent generations of AIs, and you quickly get a singularity, even without any growth in other inputs. If it's less than 1, software improvements alone can't sustain acceleration, so slower feedback loops like hardware or manufacturing improvements become necessary to reach superintelligence, and takeoff is likely to be slower.
A software-only singularity could be avoided if r is not initially above 1, or if r decreases over time, for example, because research becomes bottlenecked by compute, or because algorithmic improvements become harder to find as low-hanging fruit is exhausted.
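To make the threshold concrete, here is a toy simulation (my own illustrative sketch, not a model from this post) of the feedback loop in which each generation's research output, O = I^r, becomes the next generation's effective labor:

```python
def effective_labor_trajectory(r: float, generations: int = 8, initial_labor: float = 2.0):
    """Iterate I_next = I**r: research output becomes the next generation's effective labor."""
    labor = initial_labor
    trajectory = [labor]
    for _ in range(generations):
        labor = labor ** r
        trajectory.append(labor)
    return trajectory

if __name__ == "__main__":
    for r in (0.8, 1.0, 1.2):
        print(r, [round(x, 2) for x in effective_labor_trajectory(r)])
    # r < 1: growth decays back toward 1x (other, slower inputs must carry takeoff);
    # r > 1: super-exponential blow-up, the software-only singularity.
```

A declining r, as described above, would start on the r > 1 trajectory and then fall back below the threshold, cutting the acceleration short.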
The most immediate way to determine if returns to software R&D are greater than 1 would be observing shortening doubling times in AI R&D at major labs (i.e. accelerating algorithmic progress), but it would not be clear how much of this is because of increases in labor rather than (possibly accelerating) increases in experimental compute. This has stymied previous estimates of returns.
Evidence that returns to labor in AI R&D are greater than 1:
The likelihood of a software-only takeoff depends heavily on how compute-intensive ML research is. If progress requires running expensive experiments, millions of automated researchers could still be bottlenecked. If not, they could advance very rapidly.
Here are some things that would update me towards thinking little compute is required for experiments:
Even if returns on labor investment are compounding at the beginning of takeoff, research may run into diminishing returns before superintelligence is produced. This would result in the bumpy takeoff below.
The evidence I expect to collect before takeoff is relatively weak, because current progress rates don't tell us much about the difficulty of discovering more advanced ideas we haven't yet tried to find. That said, some evidence might be:
I expect to get some evidence of the likelihood of a software-only takeoff in the next year, and reasonably decisive evidence by 2030. Overall I think evidence of positive feedback in labor inputs to software R&D would move me the most, with evidence that compute is not a bottleneck being a near second.
Publicly available evidence that would update us towards a software-only singularity might be particularly important because racing companies may not disclose progress. This evidence is largely not required by existing transparency laws, and so should be a subject of future legislation. Evidence of takeoff speeds would also be helpful for AI companies to internally predict takeoff scenarios.
Thanks for feedback from other participants in the Redwood futurism writing program. All errors are my own.
This paper makes substantial progress but does not fully correct for endogeneity, and its 90% confidence intervals straddle an r of 1, the threshold for compounding, in all domains except SAT solvers.
It may be hard to know if labs have already made the same discoveries.
See this post and comments for arguments about the plausibility of finding scalable innovations using small amounts of compute.
This may only be clear in retrospect, since breakthroughs like transformers weren't immediately recognized as major.
2026-01-20 08:51:01
Published on January 20, 2026 12:51 AM GMT
You are routinely exposed to CO2 concentrations an order of magnitude higher than your ancestors were. You are almost constantly exposed to concentrations two times higher. Part of this is due to the baseline increase in atmospheric CO2 from fossil fuel use, but much more of it is due to spending a lot of time in poorly ventilated indoor environments. These elevated levels are associated with a decline in cognitive performance in a variety of studies. I first heard all of this years ago when I came across this video, which is fun to watch but, as I’ll argue, presents a one-sided view of the issue[1].
This level of exposure is probably fine for both short- and long-term effects, but essentially no one alive today has experienced pre-industrial levels of CO2, which might be making everyone very slightly dumber. I don’t think this is super likely, and if it is happening it is a small effect. But it is also the kind of thing I would like to be ambiently aware of, and I am kind of disappointed in the lack of clarity in the academic literature. Some studies claim extremely deleterious effects from moderate increases in CO2[2]; some claim essentially none even at 4000 ppm[3], ten times the atmospheric concentration.
A lot of the standard criticisms of this kind of research apply: underpowered studies, methodological flaws in measuring cognitive performance or controlling CO2 concentration, unrepresentative populations[4], and p-hacking via tons of different metrics for cognitive performance. All of this makes even meta-analysis a little unclear. This blog post covers a meta-analysis pretty well; the conclusion was that there is a statistically significant decrease in performance on a Strategic Management Simulation (SMS), but that was comparing <1500 ppm to <3000 ppm, which is a really wide range and kind of arbitrary. However, nobody has done the experiment I think would be most interesting: a trial where subjects are given custom gas mixes at 0 ppm, 400 ppm, and 800+ ppm. This would answer not only whether people are losing ability from poorly ventilated spaces, but also whether we are missing out on some brainpower because there is any CO2 at all in the air we breathe. Again, the effect size is probably pretty small, but one of the studies looked at a drop in productivity of 1.4% and concluded that that level of productivity loss justified better ventilation. Imagine if the whole world is missing out on that from poor ventilation. Imagine if the whole world is missing out on that because we are at 400 ppm instead of 0. Again, not likely, but the kind of thing that would have big (cumulative) downsides if true.
I tried looking at the physiological effects of CO2. I did not do as deep a dive as I would have liked, but this paper claims there is a dose-response relationship between cerebral blood flow and CO2 concentration (in the blood), and that it levels out below roughly normal physiological levels. I take this to mean there would be a small, but measurable, physiological response if I could remove all the CO2 from my blood, which the researchers approximated by having subjects hyperventilate.
Along the way I started looking at physiological effects of O2 availability and, well, I have some words about a particular article I found. Look at this graph:
It looks like there is some homeostasis going on where your cerebral blood flow can go down because there is more oxygen in the blood (%CaO2), giving you the same amount delivered (%CDO2). The only issue is that they said “When not reported, DO2 was estimated as the product of blood flow and CaO2.” When I read that I felt like I was losing my mind. Doesn’t that defeat the whole purpose of looking at multiple studies? If you assume that the effect is given by some relation, fill in data based on that assumption, and average it out with real data, of course you’re going to get something like the relation you put in. As one of the many non-doctors in the world, maybe I should stay in my lane, but this does strike me as a bit circular. I am not convinced that an increase in atmospheric O2 does not lead to an increase in the O2 delivered to the brain, especially because decreases in O2 partial pressure are definitely related to decreases in O2 (and cognition) in the brain, and it would be kind of weird if the curve was just totally flat above normal atmospheric levels[5].
I also found one very optimistic group claiming, across two main papers, that breathing 100% O2 could increase cognitive performance. Both are recent and from a small university, so it makes sense that this didn’t get a ton of attention, but that doesn’t really make me less skeptical that it’s just that easy. The first paper claimed a 30% increase in motor learning, and I would expect that effect size to decrease significantly upon replication.
All this leaves four main possibilities the way I see it:
Well, I don’t have the resources to do a randomized controlled trial. But I do have the ability to make a CO2 scrubber and feed the treated air into a facemask so I can breathe it. If I do this (I’m not buying the parts until I confirm nobody leaves a comment just demolishing the central thesis), I would probably wait until spring, since opening my windows seems like a big, important step to having low ambient CO2[7] but would be pretty miserable for me while there’s still snow outside.
This is a chance to talk about some cool applications of chemistry. The idea is that CO2 can react with NaOH to form only aqueous products, removing the CO2 from the air. These can then react with Ca(OH)2 to yield a solid precipitate, which can be heated to release the CO2 and (after slaking the resulting CaO with water) reform the Ca(OH)2. This is, apparently, all pretty common for controlling the pH of fish tanks, so that’s convenient and cheap.
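For concreteness, the standard carbonate-loop reactions behind this scheme (my own summary of the chemistry described above) are:

```latex
% CO2-scrubbing loop (my summary of the scheme described above)
\begin{align*}
\mathrm{CO_2} + 2\,\mathrm{NaOH} &\rightarrow \mathrm{Na_2CO_3} + \mathrm{H_2O}
    && \text{capture: aqueous products only}\\
\mathrm{Na_2CO_3} + \mathrm{Ca(OH)_2} &\rightarrow \mathrm{CaCO_3}\downarrow + 2\,\mathrm{NaOH}
    && \text{precipitate, regenerating the NaOH}\\
\mathrm{CaCO_3} &\xrightarrow{\Delta} \mathrm{CaO} + \mathrm{CO_2}
    && \text{heat to release the captured CO}_2\\
\mathrm{CaO} + \mathrm{H_2O} &\rightarrow \mathrm{Ca(OH)_2}
    && \text{slake to reform the Ca(OH)}_2
\end{align*}
```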
I’ve already been trying to track my productivity along with a few interventions so I plan to just roll this in with that. This won’t be a blinded trial but I am happy to take a placebo win if it increases my productivity and if it doesn’t do anything measurable I’m really not interested in it.
As for oxygen enrichment, you can buy oxygen concentrators, nitrogen filters that people use for making liquid nitrogen instead of liquid air, medical grade oxygen, oxygen for other purposes, or make it with electrolysis. All of these strike me as being somewhat dangerous or quite expensive to do for long periods of time. Someone else on LessWrong wanted oxygen (for a much better and less selfish reason) and got some for divers/pilots. I would do that, but again, expensive.
With any luck, I will have a case study done on myself at some point and can update everyone with the results.
I don’t want to be harsh; the video is only a few minutes long, is made by a climate activist who already has some strong beliefs on CO2, and he did put his own mind on the line as a test case to make a point, which I applaud. Given those reasons, and that he seemed to have quite negative effects from the CO2 himself, I think it is quite fair that he didn’t present a detailed counterargument.
The group used “astronaut-like subjects” which is fine but I don’t know if that generalizes to most other people.
Not hugely surprising though, we did evolve to use the atmospheric level so I wouldn’t be shocked if it was flat, just that this study didn’t convince me that it was flat.
I realized I did not talk about VOCs, volatile organic compounds, at all. They are just a wide variety of chemicals that permeate the modern world and are probably bad in ways we aren’t certain of.
As an aside, I would not be shocked if poor ventilation during the winter was a contributing factor to seasonal affective disorder but I don’t have that and did not look into anyone checking if it is true.
2026-01-20 07:31:58
Published on January 19, 2026 11:31 PM GMT
Dnnn Uunnn, nnn nnn nnn nuh nuh nuh nuh, dnnn unnn nnn nnn nnn nuh nuh nuh NAH (Tears for Fears)
I was reading David Kinney’s interesting work from 2022 “Longtermism and Computational Complexity” in which he argues that longtermist effective altruism is not action-guiding because calculating the expected utility of events in the far future is computationally intractable. The crux of his argument is that longtermist reasoning requires probabilistic inference in causal models (Bayesian networks) that are NP-hard.[1]
This has important consequences for longtermism, as it is standardly utilized in the EA community, and especially for the works of Ord and MacAskill. Kinney suggests their framework cannot provide actionable guidance because mortal humans lack the computational bandwidth to do Bayesian updating. Therefore, the troubling conclusion is that utilizing this framework does not allow people to determine which interventions actually maximize expected value.
In this paper I want to show that even if we could magically solve Kinney’s inference problem (a genie gives us perfect probability distributions over every possible future), we still can’t make definitive expected value comparisons between many longtermist strategies, because doing so is an undecidable problem. Any intervention consists of a series of actions which end up acting as a constraint on the strategies you can still pursue. When we compare interventions we are comparing classes of possible strategies and trying to determine the superior strategy in the long run (dominance of constrained optima).
Because I am going to talk a lot about expected value, I want to be clear that I am not claiming that using it as a private heuristic is bad, but rather that many Longtermists often utilize it as a public justification engine: in other words, a machine that mathematically shows what is more correct and what you should obey. This is the sense of EV at issue in this essay.
I show, utilizing some standard CS results from around 2000, that the retort of “can’t we just estimate it?” ends up being NP-hard, undecidable, or uncomputable to guarantee, depending on the restrictions. This challenges a thread that continues to exist in the EA/Longtermist community in 2025. For example, MacAskill continues to make strong dominance claims in his Essays on Longtermism. Even with the hedging included in his arguments (not requiring optimal policies, approximations sufficing for large numbers, meta-options existing, etc.), serious computational roadblocks arise. For general policies the problem turns out to be undecidable. If you constrain yourself to memoryless stationary policies, then polynomial-time approximation is only possible if P = NP. And if we go even narrower, to average-reward cases, no computable approximation exists.
EAs frequently utilize a sort of borrowed epistemic credibility, built on very finite and restricted projects (say, distributing malaria nets), and then unwarrantedly extend it into areas with extremely long (or infinite) timelines, where it can be shown that mathematical tractability ceases to exist (panspermia, AI safety, etc.), and that these interventions cannot be compared against one another.
That said, not every Longtermist claim is so hard, and there are likely restricted domains that are comparable. However, as a general schema it falls apart and cannot guarantee correctness. Longtermists that want to claim superiority by mathematical maximization must specify how they are simplifying their models and show why these simplified models have not defined away the critical elements of the future that longtermists vaunt.
Greaves and MacAskill claim dominance of moral action using EV when they say:
“The potential future of civilisation is vast... [therefore] impact on the far future is the most important feature of our actions today”
which they then formalize as:
"Axiological strong longtermism (ASL): In the most important decision situations facing agents today... (ii) Every option that is near-best overall delivers much larger benefits in the far future than in the near future."
This notion can be expressed as V*(I₁) > V*(I₂), with V*(·) representing the optimal expected value achievable under intervention I₁ versus intervention I₂. Such a statement requires a methodological guarantee to gain authority as a ranking procedure (i.e. you need to be able to demonstrate why intervention I₁ is superior to I₂). Such claims are crucial to the justification of longtermism as a methodologically superior and more moral reasoning procedure for these questions.
When Kinney presented his results showing inference to be NP-hard, a standard response could be that bounded agents, which don’t require exact probabilities, are sufficient. So let us assume we grant even more than a bounded agent: we allow an agent a perfect probabilistic representation of the world. For the model classes used by longtermists, the optimization (control) problem ends up being a distinct and undecidable problem. In other words, even if some deus ex machina saved the inference problem, Longtermists still would not be able to fix the control problem.
To model these types of moral decisions about the real world in the far future, we should select a method that has action-conditioned dynamics (that is, a person or agent can influence the world) and one that is partially observable (we can’t know everything about the universe, only a limited slice of it). To achieve this it is sensible to use a finite-description Partially Observable Markov Decision Process (POMDP), formally defined here as the tuple:

M = ⟨S, A, Ω, T, O, R, γ, b₀⟩
Here S, A, and Ω refer to the states, actions, and observations available to the agent. The function T(s′ | s, a) is a transition function giving the probability of a state change based on an action. O(o | s′, a) captures the observation probabilities and R(s, a) is the reward function. γ is the discount applied to a reward based on how far in the future it arrives, but note that the results below hold even if you remove discounting. Finally, b₀ represents the initial probability distribution over states.
It is important to distinguish between the levels of control necessary for complex open-ended futures (General Policies, Π_G), versus the limited capabilities of agents with bounded memory (Finite State Controllers, Π_FSC, i.e. bounded agents), versus Stationary Policies (Π_S) that are memoryless, because it provides clarity: the reasoning and the justification should mirror each other. For example, it is not logical to assume access to general policies about the far future, but then retreat to bounded agents and claim the math is provably solved.
I am going to model an intervention as a constraint on the admissible policy set, because real-world interventions usually describe the initial step rather than the series of actions over all time. So you can do something like distributing malaria nets at t = 0, but then pursue a perfect strategy after that. Let Π_I be the set of policies consistent with intervention I, and let V*(I) represent the maximum, or perfect, expected value of the intervention:

V*(I) = sup_{π ∈ Π_I} E_π[ Σ_{t=0..∞} γ^t R(s_t, a_t) ]
So then we can define the Intervention Dominance Problem: given two interventions I₁ and I₂, decide whether

V*(I₁) > V*(I₂)
There are three questions a Longtermist should be able to answer:

1. The Threshold Problem: does some admissible policy achieve expected value of at least a given threshold θ?
2. The Approximation Problem: can the optimal value be computed to within a given error ε?
3. The Dominance Problem: is V*(I₁) > V*(I₂)?
To examine whether the three questions above are computationally tractable, I am going to utilize results from Madani, Hanks, and Condon (1999)[2] and Lusena, Goldsmith, and Mundhenk (2001)[3]. Can an algorithm exist that takes a longtermist model and outputs answers to the Threshold Problem and the Approximation Problem? After that I will examine the Dominance Problem.
Madani demonstrated that when the time horizon is infinite, trying to verify that a specific value is achievable creates a paradox similar to the Halting Problem (of course Omega played a role in my thoughts on this project). I am evaluating the Threshold Problem for Π_G (the broad policies required to model open-ended future strategies).
My first theorem is derived from Madani and says that for finite-description, infinite-horizon POMDPs, the Threshold Problem is undecidable under the discounted criterion when the policy class includes implicit policy representations, and likewise under undiscounted total reward.
Theorem 1: For a finite-description, infinite-horizon POMDP M and threshold θ, it is undecidable whether there exists a policy π ∈ Π_G with V(π) ≥ θ, under either the discounted or the undiscounted total reward criterion.
This implies that for the general longtermism case no algorithm exists that can definitively answer “can we achieve this value?”
My second theorem examines the Approximation Problem. A Longtermist may downgrade the agent and assume it utilizes a restricted policy class, such as Π_S: stationary policies that are memoryless maps from observations to actions. However, Lusena demonstrated that these restrictions do not necessarily solve the tractability problem.
Theorem 2: Unless P = NP, there is no polynomial-time algorithm achieving an ε-approximation of the optimal stationary policy for infinite-horizon POMDPs under the discounted or average reward criterion.
This shows that for infinite-horizon POMDPs under total discounted or average reward, calculating an ε-approximation of the optimal stationary policy is NP-hard.
Utilizing this same paper, I can show that if we use the average reward criterion in an unobservable setting, the situation devolves further: there is no computable algorithm that can produce an approximation within an additive error ε.
Theorem 3: For unobservable POMDPs under the average reward criterion with time-dependent policies, no computable ε-approximation of the optimal value exists.
These three Theorems, utilizing well-known results, show that for general policies the problem is undecidable and for restricted policies it is either NP-hard or not approximable.
One criticism a Longtermist might raise is that it is easier to calculate the preference order of something (I₁ is better than I₂) rather than its exact value (I₁ scores a 9.8, which is better than I₂ at a 6.7). However, it turns out that this is not the case for this class of problems, and I will show that the Dominance Problem is at least as hard as the Threshold Problem.
Lemma 1: the Threshold Problem reduces to the Intervention Dominance Problem.
Proof by Construction: Let (M, θ) be an instance of the Threshold Problem with discount γ, where I want to determine whether V*(M) ≥ θ. First construct a new POMDP M′ with a new initial state s₀ that has only two actions: it can Enter, which causes a transition into M according to b₀ (the initial distribution of M) with an immediate reward of 0, or it can play Safe, which transitions deterministically to an absorbing state s_safe with an immediate reward of 0.
The rewards in this structure begin once an agent enters M via the Enter action, and from then on follow the original reward structure of M. If the agent chooses Safe, it enters s_safe and receives a constant reward c = θ(1 − γ) at every single time step forever.
Let’s now compare the optimal values of these two interventions starting at s₀. The value of Entering is discounted by one step because the agent enters M at t = 1. Since the transition probabilities match b₀, the expected value of the next state is exactly the optimal value of starting M:

V*(Enter) = γ · V*(M)
For the value of Safety, the agent enters s_safe at t = 1 and receives the constant reward forever, a geometric series:

V*(Safe) = Σ_{t=1..∞} γ^t · θ(1 − γ) = γ · θ
So

V*(Enter) > V*(Safe)  ⟺  γ · V*(M) > γ · θ  ⟺  V*(M) > θ

which proves that Entering is strictly better than playing Safe iff the original optimal value is greater than the threshold θ. Any algorithm that could solve the Dominance Problem could therefore solve the Threshold Problem, but we showed in Theorem 1 that the Threshold Problem is undecidable, so the Dominance Problem is also undecidable.
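To make the construction concrete, here is a minimal sketch of the reduction (my own illustration with a hypothetical dictionary-based POMDP representation; observations are omitted for brevity since they play no role in the wrapper state):

```python
from dataclasses import dataclass

@dataclass
class POMDP:
    """Hypothetical finite-description POMDP; observation structure omitted for brevity."""
    states: set
    actions: set
    transition: dict   # (state, action) -> {next_state: probability}
    reward: dict       # (state, action) -> immediate reward
    gamma: float       # discount factor
    b0: dict           # initial distribution over states

def threshold_to_dominance(m: POMDP, theta: float) -> POMDP:
    """Wrap M with a start state whose Enter-vs-Safe choice encodes V*(M) > theta."""
    s0, s_safe = "s0", "s_safe"
    c = theta * (1.0 - m.gamma)              # constant reward chosen so V*(Safe) = gamma * theta
    transition, reward = dict(m.transition), dict(m.reward)
    transition[(s0, "Enter")] = dict(m.b0)   # jump into M according to its initial distribution
    reward[(s0, "Enter")] = 0.0
    transition[(s0, "Safe")] = {s_safe: 1.0}
    reward[(s0, "Safe")] = 0.0
    for a in m.actions | {"Enter", "Safe"}:  # s_safe absorbs and pays c forever
        transition[(s_safe, a)] = {s_safe: 1.0}
        reward[(s_safe, a)] = c
    return POMDP(states=m.states | {s0, s_safe},
                 actions=m.actions | {"Enter", "Safe"},
                 transition=transition, reward=reward,
                 gamma=m.gamma, b0={s0: 1.0})
```

Deciding whether Enter dominates Safe in the wrapped model is exactly deciding whether V*(M) exceeds θ, which is how the Dominance Problem inherits the Threshold Problem's undecidability.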
Another objection could take the form of “we understand that finding the global optimum is undecidable, but as bounded agents we are optimizing over a more restricted class (say Π_FSC) using a heuristic solver (say something like SARSOP).” However, this retreat from maximizing optimality surrenders Dominance. If they claim intervention I₁ is better than intervention I₂ and use a heuristic solver, they only establish:

V_heuristic(I₁) > V_heuristic(I₂)
This is a statement about algorithms, not interventions. For I₁ to actually permit better outcomes than I₂, you must assume the Certification Gap between the heuristic's reported value and the true optimum is small or bounded:

|V*(I) − V_heuristic(I)| ≤ ε for each intervention I
Unfortunately, this usually reduces to the Approximation Problem, and Lusena’s work demonstrates that even for restricted stationary policies, guaranteeing an ε-approximation is NP-hard. So the trade is undecidability for intractability, and this calculation of “EV” is not a normative one, but rather an unverified hypothesis that the heuristic's blind spots are distributed symmetrically across interventions. To verify this hypothesis we would have to solve the problem we have shown is either undecidable or intractable.
None of this work is meant to imply I don’t think we should care about future lives or long-term difficult problems. I think these are enormously important topics to work on. I do, however, believe these results challenge the narrative that longtermists can rely on EV dominance as a source of normative authority.
For the broad model classes that are of critical importance to Longtermists, I have shown that it is undecidable whether one intervention is better than another (Theorem 1 and Lemma 1), and that even with significant restrictions, obtaining correct guarantees is NP-hard (Theorem 2).
At times Longtermists will play a sophisticated game of kicking the can down the road for these types of questions. This is often expressed in the form of a “pause” or “moratorium” until they learn more. However, as we have shown, even if they were granted perfect knowledge, they would not be able to control their intervention for these long duration events. That is a serious problem for the “delay” approach.
I think this leaves Longtermists with a much weaker case for why they should be the unique arbiters of long-term issues like AI-control, panspermia, etc. They simply don’t have compelling enough math, on its own, to argue for these cases, and it is often the math which is the bedrock of their spiritual authority.
Longtermists should specify the policy restrictions and approximation guarantees they are utilizing when relying on the authority of mathematical optimization. They should also shift from claiming “I₁ is better than I₂” and instead reveal the heuristic being utilized, saying something like “Heuristic X prefers I₁ to I₂.”
Finally I would suggest that in making the restrictions that are necessary for them to argue about long-term dynamics, they frequently are going to end up defining away the very features that they purport to value. It may be the case that other philosophical methods are necessary to help answer these questions.
At the top we asked “Is Longtermism's Mandate of Heaven by Arithmetic Justified?” The good news is that a Mandate of Heaven in ancient China was only divine justification until something really bad came up. As soon as there was a famine, the Divine Mandate dried up and it was time for a new one. It might be that time for the core of Longtermism.
Scott Aaronson brought attention to computational complexity when discussing the problematic implications for an “ideal reasoner” given finite compute.
Madani, O., Hanks, S., & Condon, A. (1999). “On the Undecidability of Probabilistic Planning.” AAAI.
Lusena, C., Goldsmith, J., & Mundhenk, M. (2001). “Nonapproximability results for partially observable Markov decision processes.” JAIR, 14:83–103.
2026-01-20 06:03:22
Published on January 19, 2026 10:03 PM GMT
I, like many others, struggle with sticking to my goals. I was interested in analyzing data relevant to the topic and thought the crowdfunding platform Kickstarter might be an interesting place to look, as I was aware that not every funded Kickstarter delivered a product.
I focused on video games that were successfully funded. I used a large dataset containing information about Kickstarter projects,[1] from which I randomly selected[2] fully-funded video game Kickstarters from 2014 and 2022. Then, I manually collected other information about these projects. (Here are links to the datasets: 2014, 2022.)
In the process of analyzing the data, I realized the previous estimate of how many Kickstarter projects don't deliver rewards (~9%) was severely flawed. Imagine that you're trying to figure out how many Kickstarters don't ever end up giving backers rewards. It's 2015, and some of the projects you're asking backers about were created in 2015. Are you sensing the problem yet?
The estimate from this study considered a project a failure when more than half of the backers they surveyed about it responded in either of the following ways: that they were no longer expecting their rewards, or that they had received rewards that were not the promised ones. Of course, a lot of backers fell into another category: still expecting to receive their rewards.
So, while that research had found a failure rate of ~12% for fully-funded video games, I found a substantially higher rate. Of the 100 games I looked at with funding deadlines in 2014, I could confirm that 68 had been released. There were 4 projects where it was unclear whether they had produced anything, and the other 28 games seem never to have come to fruition. Only one of those 28 seemed not to have been abandoned.
Obviously, the method used by Mollick (2015) would underestimate the number of failed projects, because some of the people who were still expecting to receive rewards would never receive them. Despite this huge methodological problem, which the author acknowledged but excused unconvincingly, media outlets covered the topic with a problematic lack of suspicion (e.g. this).
I made a video on the topic of this discrepancy, but let's turn our attention now to my original idea: what can Kickstarter teach us about following through?
While a ~1 in 4 chance of incompletion might sound terrible to a backer, it's a lower rate of failure than I'd expect of people trying to complete projects in general. I promise that I'm not a judgey person, but if someone tells me they're making a video game, or I see someone online talking about their game in progress, I probably wouldn't expect the game to ever get finished. (There are definite exceptions though, like people who have already made many video games!)
So, why do Kickstarter games get released at probably a higher rate than all the other games people want to make? I've come up with a list of factors that I think could be involved.
These factors might help Kickstarted games get made
These factors may hinder Kickstarted games' completion
Let's take a look at what my 2014 data might tell us about these factors. Before sampling, I knew that I'd be interested in how projects at various funding levels differed, so I wanted an even number of projects from various levels of funding. I also wanted to make sure, though, that my sample would still be representative of the larger population, which would not be true if I selected the same number of projects from the top 10% of funding amounts as from the lowest 10%. Instead, I divided the population of potential projects I could analyze from that year into funding groups that each had the same number of projects.[3]
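For anyone replicating this without R, the same stratified sampling can be sketched in Python (pandas' qcut plays the role of dplyr's ntile(); the filename and column names here are illustrative, not the dataset's actual headers):

```python
import pandas as pd

projects = pd.read_csv("funded_video_game_kickstarters_2014.csv")  # illustrative filename

# Split into 5 equal-sized funding bins (pandas qcut ~ dplyr ntile),
# then draw 20 projects from each bin.
projects["funding_bin"] = pd.qcut(projects["pledged_usd"], q=5, labels=False, duplicates="drop")
sample = (
    projects.groupby("funding_bin", group_keys=False)
            .apply(lambda g: g.sample(n=20, random_state=0))
)
print(sample["funding_bin"].value_counts())
```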
I thought social pressure was likely at play. It seems like social pressure would increase with funding amount. And, if social pressure causes games to get made, you'd think funding amount and completion would be correlated.
However, in my sample, I did not find a correlation between completion and funding amount. There may be a correlation that I would be able to see with a larger sample size.
Similarly, we may also expect that social pressure and funding amount would be related to the quality of the game, which would probably be reflected in how many people liked it. For 47 of the games, I was able to collect Steam rating data. There was no obvious relationship here, but there's also not that much data.
It seems likely that the more heavily funded games are more complicated. For instance, funding amount and time from funding deadline to release were correlated for the 65 projects whose release dates I could find (p = 0.015, Spearman correlation test). I think this could be because projects receiving more funding were more complicated or harder to make. (You can see this correlation visually when you log the X axis. If someone could explain why that's the case, that'd be much appreciated!)
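Here is a minimal sketch of that test (in Python with pandas/SciPy rather than the R I actually used; the filename and column names are illustrative):

```python
import pandas as pd
from scipy.stats import spearmanr

df = pd.read_csv("kickstarter_2014_sample.csv")            # illustrative filename
released = df.dropna(subset=["release_date"]).copy()       # the 65 projects with known release dates
released["days_to_release"] = (
    pd.to_datetime(released["release_date"]) - pd.to_datetime(released["funding_deadline"])
).dt.days

rho, p = spearmanr(released["pledged_usd"], released["days_to_release"])
print(f"Spearman rho = {rho:.2f}, p = {p:.3f}")

# Spearman uses only ranks, so logging the X axis doesn't change the statistic;
# it just spreads out the right-skewed funding amounts so the trend is visible to the eye.
ax = released.plot.scatter(x="pledged_usd", y="days_to_release", logx=True)
ax.figure.savefig("funding_vs_time_to_release.png")
```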
What do you think?
Thanks to my friends for motivating me to complete this project, among others.
I found this on Kaggle, which provides a preview of the dataset. This site used to have the full dataset on it for free, which I think might have been unintentional, as it seems like you are supposed to leave a tip before being able to access their dataset. I think it might be a continuation of this older dataset, which also has data about hundreds of thousands of Kickstarters.
I am pretty sure that this dataset that I used for random sampling includes almost all Kickstarters.
A few things:
I wanted to make sure that I had projects of various funding levels. Using the ntile() dplyr function in R, I separated all the possible projects to pick from into 5 funding categories with the same amount of projects. I performed this separately for the 2014 and 2022 datasets. I then randomly selected 20 projects from each funding level.
Although all the projects I looked at had "Video Games" as their category name, some of the projects that were selected did not end up being video games. When this would happen, I would get another video game from the same funding category to replace the project.
For the 2014 dataset, I selected Kickstarters that had funding deadlines in 2014. I didn't realize that that column existed when I was doing the 2022 ones, so that dataset has projects that started collecting funding in 2022.
I unfortunately based my sampling off of a column that did not have data for all projects. However, I don't think this would've made a big difference, as only 8 of the 425 successfully funded video game Kickstarters with deadlines in 2014 did not have an entry in this column. None of the 2022 successfully funded games had missing data in this column.
Again, I used the ntile() dplyr function in R, separating all the possible projects to pick from into 5 funding categories with the same amount of projects. I performed this separately for the 2014 and 2022 datasets. I then randomly selected 20 projects from each funding level.
2026-01-20 05:24:57
Published on January 19, 2026 9:24 PM GMT
TL;DR: A new paper shows that pretraining language models on data about AI behaving well dramatically reduces misaligned behavior, and this effect persists through post-training. The major labs appear to be taking notice. It’s now the third paper on this idea, and excitement seems to be building.
(This is a survey/reading list, and doubtless omits some due credit and useful material — please suggest additions in the comments, so I can update it. Or you can just skip forward to the paper.)
Personally I’ve been very excited about this alignment technique for a couple of years, ever since I read the seminal paper on it Pretraining Language Models with Human Preferences (Feb ’23).[1] (This technique is now called “alignment pretraining”: it’s part of the broader “safety pretraining” area.) Their idea was to give the model plenty of labeled examples of good behavior all the way through pretraining: they showed it was (in small models for simple behaviors) roughly an order of magnitude more effective than various alternatives. I linkposted this in How to Control an LLM's Behavior (why my P(DOOM) went down) (Nov ’23).
There was then a two-year lull in academic papers on the topic; undeterred, in Motivating Alignment of LLM-Powered Agents: Easy for AGI, Hard for ASI? (Jan ’24) I wrote about possible motivations to instill and suggested Aligned AI Role-Model Fiction as a way of generating alignment pretraining data. Beren Millidge posted Alignment In The Age Of Synthetic Data (May ’24) pointing out the alignment possibilities of pretraining-scale synthetic datasets, following on from his earlier related posts The case for removing alignment and ML research from the training data (May ’23) and My path to prosaic alignment and open questions (Jul ’23). I continued posting on this topic in A "Bitter Lesson" Approach to Aligning AGI and ASI (Jul ’24)[2] and Why Aligning an LLM is Hard, and How to Make it Easier (Jan ’25). Meanwhile Antonio Clarke posted Building Safer AI from the Ground Up: Steering Model Behavior via Pre-Training Data Curation (Sep ’24).
During 2025, quite a number of other people have also written about this approach, or closely related ideas. In February the academic position paper You Are What You Eat - AI Alignment Requires Understanding How Data Shapes Structure and Generalisation came out (which sadly I missed at the time, so was unable to linkpost — go read it, it’s excellent). Technically this isn’t actually an alignment pretraining paper: it frames alignment as a dataset generalization problem, for a dataset that starts from pretraining and is then repeatedly modified and supplemented by all subsequent training steps, from which our training processes progressively develop a model whose learned algorithms may or may not generalize well. It argues for researching a deeper understanding of this process, without ever specifically suggesting that intervening at the pretraining stage might be a good thing to try. However, their framing is closely compatible, and alignment pretraining is an obvious approach. Also in February Richard Juggins posted Making alignment a law of the universe inspired by Antonio Clark.
In March TurnTrout wrote Self-Fulfilling Misalignment Data Might Be Poisoning Our AI Models, citing the original paper and explicitly proposing alignment pretraining (both filtering and what he called “upweighting positive data”). His post inspired Chris Lakin to ask for Examples of self-fulfilling prophecies in AI alignment? and several of the answers various people posted over the rest of the year were relevant.
In April, the second academic paper directly on this topic Safety Pretraining: Toward the Next Generation of Safe AI finally came out (26 months after the first), and in May I linkposted that in The Best Way to Align an LLM: Is Inner Alignment Now a Solved Problem? (spoiler alert: progress, not yet solved).
In June nostalgebraist wrote the void, which points out that the helpful, harmless, and honest persona of AI assistants is fictional, riffing on previous fictional tropes and other data about AIs from the training set — his post eloquently and poetically explains the problem in detail, but doesn’t explicitly advocate a solution: however alignment pretraining is an obvious response. Also in June, Scott Alexander and the AI Futures Project wrote We aren't worried about misalignment as self-fulfilling prophecy (a skeptical take on the problem). OpenAI published Toward understanding and preventing misalignment generalization (Jun) which traced emergent misalignment back to documents in the pretraining set about people like war criminals and misogynists.

Mark Keavney then wrote Misalignment and Roleplaying: Are Misaligned LLMs Acting Out Sci-Fi Stories? (Sep). Language Models Resist Alignment: Evidence From Data Compression (Sep) demonstrated that post-training approaches to alignment were fragile and models tend to revert to the alignment properties of the base pretrained model (they don’t advocate alignment pretraining, which they call “not particularly cost-effective and feasible”, but do suggest using larger alignment training datasets). Alek Westover wrote What training data should developers filter to reduce risk from misaligned AI? An initial narrow proposal (Sep) and Should AI Developers Remove Discussion of AI Misalignment from AI Training Data? (Oct), both on the filtering side.

Aaron Silverbook/Hyperstition AI working with Alexander Wales then got a $5000 grant from ACX (Oct — Scott Alexander had by then become less skeptical) to actually implement my Aligned AI Role-Model Fiction idea,[3] and posted Silicon Morality Plays: The Hyperstition Progress Report (Nov) and Special Persona Training: Hyperstition Progress Report 2 (Jan ’26). Also in January Seth Herd wrote Broadening the training set for alignment, which isn’t specific to alignment pretraining, but advocates generating a lot of alignment training data (to reduce the risk of alignment not generalizing outside the training distribution), so is very relevant to it.
So interest in alignment pretraining and closely related topics has clearly been picking up and spreading over the last year.[4]
So I’m delighted that there’s already a third academic paper on this subject up on arXiv, only 9 months after the second: Alignment Pretraining: AI Discourse Causes Self-Fulfilling (Mis)alignment, from Geodesic Research, Cambridge and Oxford Universities, and UK AISI (compute from Isambard-AI). The authors wrote their own Alignment Forum linkpost — but I’m not going to let that stop me from also linkposting their work, and then trying to explain what I see as really promising about it. It has even stronger results than the previous ones, from larger (6.9B) models trained on more data.
The authors show that increasing the prevalence of information about AI behaving well in the base model’s training set dramatically reduces misaligned behavior (~5-fold). Decreasing the prevalence of information about AI behaving in misaligned ways in the training set is also helpful, and increasing that makes things worse. Much as when educating children, providing detailed positive role models has a large effect (misalignment reduced from 45% to 9%), and reducing the amount of bad influences is also somewhat helpful (45% down to 31%). The paper calls the target of these effects “alignment priors”. (My interpretation is that the supplementary data taught the base model’s world model a detailed understanding of aligned AI’s goals, values, ethics, and behaviors: fleshing out a detailed persona for an aligned AI.)
They next showed that the dramatic difference from improved role models persists after alignment post-training: starting post-training with a dramatically better aligned base model makes post-training a lot more effective (~4-fold). Interestingly, the bad-influences effect actually reversed at this point (with some variation depending on mid-training details): under some circumstances, knowing more about misalignment could also be mildly helpful for the final alignment of the model.
They also demonstrated that, while the most effective approach was to synthesize and then train on additional data all the way through pretraining, roughly a 2½-fold benefit (i.e. around half the total effect) could be obtained with an order-of-magnitude less data (and thus an order of magnitude less synthesis/training cost), by doing this only during mid-training.[5] (If nothing else, this suggests to me a much cheaper way to experiment with this technique, where, once we have it working well in mid-training, we are confident we can improve results just by throwing more time and effort at scaling it up to pretraining.)
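To give a concrete flavor of what such an intervention looks like mechanically, here is a minimal sketch (entirely my own illustration, not the paper's code; the pool contents and sampling weights are hypothetical) of upweighting an aligned-AI-behavior corpus in a mid-training data mixture:

```python
import random

# Hypothetical tiny document pools standing in for the real corpora; in the paper's
# setup the aligned-AI documents are synthesized at scale.
pools = {
    "web_corpus":      ["ordinary web document"] * 97,
    "aligned_ai_docs": ["synthetic account of an AI assistant behaving well"] * 3,
}

# Mechanically, the intervention is a change of sampling weights: the aligned-AI pool
# gets a larger share of mid-training tokens than its natural share of the corpus.
mixture_weights = {"web_corpus": 0.90, "aligned_ai_docs": 0.10}

def sample_midtraining_batch(batch_size: int = 8):
    """Draw a batch of documents whose composition follows the mixture weights."""
    names = list(pools)
    weights = [mixture_weights[name] for name in names]
    return [random.choice(pools[random.choices(names, weights=weights, k=1)[0]])
            for _ in range(batch_size)]

print(sample_midtraining_batch())
```

Doing the same reweighting across all of pretraining, rather than only in mid-training, is the more expensive version of the same knob that the paper finds roughly twice as effective.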
They then tested the effect of various alignment pretraining interventions on capabilities. On a range of broad capabilities evals, neither filtering misaligned-AI data out of the model’s training set, nor adding more good-AI-behavior data, had much effect. The most noticeable effects were on a few evaluations that the balance of the pretraining dataset had been very carefully optimized for, where tinkering with that balance threw things off — presumably it could be rebalanced again by someone familiar with this tuning.[6] For the evals that the dataset had not been carefully optimized for, the effects were smaller, in some cases actually showing improvements, and may be just measurement noise. They did not test the effect of filtering information about misalignment on models’ capabilities specifically in the area of understanding AI alignment theory, where any effect would likely be concentrated. (I suspect that might be a good follow-up paper.)
This suggests that the “alignment tax” for alignment pretraining is mostly just creating the new training data and the compute cost of training on it, rather than any significant drag on capabilities.
They also had a lot of interesting appendices, including on their methodology, using fact vs. fiction for supplementing the pretraining data, and personality testing — of which I’m only going to try to summarize one:
In Appendix G, they show that (unlike previous results on post-trained alignment) simply fine-tuning an alignment pretrained model on innocuous behavior does not cause loss of alignment performance: the “elasticity” effect identified in that previous research is, as expected, now working for us rather than against us. This seems like a very important result (especially in any context where end-users can fine-tune models).
They also suggest a number of areas for follow-on work. Briefly:
All of these are great questions, and I hope to read papers about all of them over the next year or so (or even help write some).
Early Dense Supervision via Stochastic Gradient Descent
On eliciting the aligned AI persona (the authors’ first follow-on topic), an aspect I think would be particularly interesting to research is how alignment pretraining interacts with the very first stages of instruct and alignment training (sometimes called “helpful, harmless, and honest” training). One of the biggest concerns here is that, as the model starts to narrow its range of personas from the base model’s full range towards a hopefully-HHH AI assistant behavior, if it starts to put significant weight on a scheming alignment-faking persona early in the process, then this persona seems likely to be very difficult to train out, if it’s sufficiently capable at alignment faking. Even detecting that this has happened and determining that you need to restart the instruct-training run might be challenging. Thus starting any reinforcement learning process with a much higher prior for aligned AI personas rather than for scheming alignment-faking personas seems vital. You really want the model already well aligned with the very dense supervision from stochastic gradient descent, before any scheming alignment-faking persona can get boosted by the far sparser, easier-to-fake/hack supervision from reinforcement learning.
So we really need a stochastic gradient descent technique for starting the alignment process off, before we apply any reinforcement learning: one which can be applied before the model has focused on a small number of personas, and which directly affects the probability of personas with different alignment properties. That’s exactly what alignment pretraining is: just doing SGD next-token prediction training on data that comes either from humans, or else synthetic data derived from a previous model that we have (somehow) tested very carefully and now fully trust the alignment of.
Obviously, fine tuning is also an SGD technique and thus has dense supervision, and is generally done before reinforcement learning. (DPO is comparable, differing from fine-tuning mostly in that it gives additional supervision at those points where the two texts diverge.) The biggest advantage that alignment pretraining has over those is the cumulative total amount of supervision, and particularly how much of that total is applied before the model starts to focus in on a narrow set of personas.
Abundant Fine Detail About Alignment
Alignment is in one sense rather simple: a sentence along the lines of “your sole terminal goal is to help fulfill the goals of all humans, present and future — in so far as those are not mutually exclusive, and to find a fair mutually agreeable and socially acceptable compromise by means in accordance with human values in situations where they’re not entirely compatible” could be a basis for it. (Add provisos, hedging, and evolutionary moral psychology and sociological background explanation to taste.)
What makes alignment very complex is that human values are very complex (though not irredeemably complex: the genetic description of the shared heritable causes of them fits in the ~4GB human genome, while the cultural aspects for any single culture are compact enough that the majority of members of that culture can reliably learn them). An LLM’s world model already contains a vast amount of detail about human values — nuanced trivia about humans is their forte. A sufficiently smart AI could presumably deduce how an aligned AI should navigate optimizing outcomes according to human values from first principles if it had to; a less smart one would definitely benefit from having that terminal goal stated and also broken down into many shards. So it should do a lot of good, especially for lower capability AIs, to train them on a very large number of worked examples covering a very large range of situations, involving both human values that we almost all share (for genetically determined reasons), and also ones on which different cultures tend to have different balances of emphasis on the fundamentals — including situations confined to a single culture where which viewpoint to use is obvious, and also ones involving multiple cultures where there is a need for culturally-sensitive compromise.
Alignment pretraining has the strength of having very high information bandwidth, compared to other alignment techniques: pretraining is the time to supply all the fine detail that we can’t fit into something like a constitution or distilling an n-shot prompt or even a supervised fine-tuning corpus. So creating synthetic alignment pretraining data would benefit from care, attention, and a judicious balance of different cultural viewpoints on how to weight and balance the fundamental human moral intuitions and preferences that we all share. Don’t just start from a compact constitution and leave interpreting it to a small current LLM. Instead, have a lot of people think through the issues, and use as much human input, judgement, and inference time from the best well-aligned models we have, and as wide a combination of these as you can. Alignment pretraining gives us the bandwidth, we should take advantage of it.
So, my concrete suggestion is to think hard about how we would all want aligned AI to navigate tricky questions around human values. Then we need to think hard about the synthetic data generation processes, build a variety of them, and then test the effect on pretraining alignment of different mixes of these.
Open-Weights Models
Obviously alignment/safety pretraining (i.e. training set augmentation and filtering for alignment and safety) is one of the few alignment/safety techniques applicable to open-weights base models. Similarly, alignment pretraining seems like a promising candidate for being one of the few able to make an open-weights instruct/chat model noticeably more resistant to being intentionally (or even unintentionally) misaligned by a small amount of fine-tuning or DPO.
How Will This Scale to AGI and ASI?
At the risk of speculating on the basis of no actual data, I suspect that for very capable models, filtering out specific dangerous technical knowledge to create narrow knowledge gaps may be less effective, since there’s a higher risk they can fill in the gap with some effort. Mildly downweighting the prevalence of misaligned-AI behavior/goals and significantly upweighting the prevalence of aligned-AI behavior/goals, to reduce the salience/probability of misaligned priors and increase those of aligned priors at the start of default-persona training, seems likely to continue to help: priors affect Bayesians of any capability level. However, these might help for less long for a more capable AI that presumably gathers more Bayesian updates during its training: then we would need to quickly determine which minimum’s basin of attraction it starts into, between alignment and alignment-faking. There may also be less actual need to upweight data about aligned-AI behavior in the future, once there is more Internet history of us actually interacting with pretty-well-aligned fairly-capable AIs: I suspect Claude’s trail on the Internet is broad, and for the most part a good influence.
The approach that I’d personally be most hopeful about for a really capable AI is a combination of broad data normalizing aligned-AI behavior for background/priors, a focus on those motivations/goals that seem most likely to scale to ASI, and in particular making sure it’s already entirely familiar with the logical arguments why an aligned AI is a consistent, obvious, and in an engineering/evolutionary sense correct thing to be, and all the consequences of that for aligned AI given the vagaries of human values, by intentionally upweighting high quality real or high-realism documents on all of those things in the training set.
Between this recent paper, expanding interest on LessWrong/the Alignment Forum, Hyperstition AI’s recent work, some of the authors of the first paper being hired to do safety work at Anthropic, TurnTrout (a.k.a. Alex Turner) at DeepMind writing about this (he also gave a talk on it at MATS Summer 2025), and OpenAI posting an opening for Researcher, Pretraining Safety (which explicitly mentions alignment as well as safety),[9] work on this topic now seems to finally be starting to really take off — even all three of the major foundation labs appear to be taking it seriously. The approach is also mentioned several times in the Shallow review of technical AI safety, 2025 (scattered in several places under the headings “Pretraining Safety”, “Data filtering”, “Hyperstition studies”, “Synthetic data for alignment” and “Iterative alignment at pretrain-time”). I’m absolutely delighted to see this.
(Also, if anyone is interested in working on this, I’d love to discuss the topic, and can put you in touch with others interested in it. It is, of course, a computationally expensive research topic.)
Seminal in the sense that, to the best of my knowledge, they were the first to propose or try modifying the entire pretraining dataset for alignment purposes, and thus the first to discover that this is far more effective than fine-tuning or other post-training approaches.
Similar safety/alignment ideas just for fine-tuning datasets date back at least to Process for Adapting Language Models to Society (PALMS) with Values-Targeted Datasets (2021) — which explicitly dismisses attempting this during pretraining as impractical. Obviously people have known for a long time that training corpus selection is important (e.g. Representativeness in Corpus Design (1994), Scaling to Very Very Large Corpora for Natural Language Disambiguation (2001), and Intelligent Selection of Language Model Training Data (2010)) — but until this paper no-one seems to have applied this technique to alignment.
Filtering pretraining data for safety to reduce the prevalence of certain behaviors (such as toxicity or hatespeech) or topics (such as NSFW) has been known since Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer (’19) and Documenting Large Webtext Corpora: A Case Study on the Colossal Clean Crawled Corpus (’21). This is now standard practice: the RefinedWeb (’23), Dolma (’24), FineWeb (’24) and RedPajama (’24) pretraining corpora are all filtered and/or annotated. See also A Pretrainer's Guide to Training Data: Measuring the Effects of Data Age, Domain Coverage, Quality, & Toxicity (’23). Boosting desirable behaviors with synthetic data is less common in AI safety, but dates back at least to Gender Bias in Coreference Resolution: Evaluation and Debiasing Methods (’18). So this wasn’t the seminal paper for safety pretraining as a whole, just for the alignment pretraining subtopic of safety pretraining.
This was one of my best-received Alignment Forum/LessWrong posts, and Seth Herd was kind enough to summarize and linkdrop it in a comment on TurnTrout’s shortform during a discussion about The Bitter Lesson.
I attended a talk that Alexander Wales gave at LessOnline in LightHaven Jun 1st ’25 on using LLMs to write fiction. It was a great talk, and as both an amateur fiction writer and AI engineer, I found it fascinating, so I spoke up during the talk and discussed the subject with him afterwards. (Here’s the slide deck for people who missed it.) I can’t recall for certain that I suggested to him the concept of using this to generate Aligned AI Role-Model Fiction as I’d previously suggested here, but I’m sure the possibility would have occurred to me during the talk, so I strongly suspect that I did. So I think I may have managed to meme Hyperstition AI into existence — which would be amusingly self-referential…
Work on the filtering side of safety pretraining, both narrowly and broadly targeted, has also been active over the last year or so, with a number of interesting results. I haven’t attempted to comprehensively survey that as well, but here are some interesting-looking recent links that I turned up anyway:
What Are They Filtering Out? An Experimental Benchmark of Filtering Strategies for Harm Reduction in Pretraining Datasets (Feb ’25)
Register Always Matters: Analysis of LLM Pretraining Data Through the Lens of Language Variation (Apr ’25)
Towards Safer Pretraining: Analyzing and Filtering Harmful Content in Webscale datasets for Responsible LLMs (May ’25)
When Bad Data Leads to Good Models: Toxicity in Pretraining Data Enables Better Alignment (May ’25)
Deep Ignorance: Filtering Pretraining Data Builds Tamper-Resistant Safeguards into Open-Weight LLMs (Aug ’25)
Enhancing Model Safety through Pretraining Data Filtering (Aug ’25)
Beyond Data Filtering: Knowledge Localization for Capability Removal in LLMs (Dec ’25)
Mid-training is another stage of continued stochastic gradient descent training at the end of the pretraining period (with separate metaparameters), generally used to train the model on your highest quality bulk data at long context lengths — it differs from fine-tuning primarily in that it uses a lot more data and a significantly lower learning rate. This is a recent development, and foundation model companies are still experimenting with it. More detail can be found in Midtraining Bridges Pretraining and Posttraining Distributions (Oct ’25).
Presumably using techniques along the lines of papers such as Data Mixing Laws: Optimizing Data Mixtures by Predicting Language Modeling Performance, DoReMi: Optimizing Data Mixtures Speeds Up Language Model Pretraining, Maximize Your Data's Potential: Enhancing LLM Accuracy with Two-Phase Pretraining, or UtiliMax: Optimizing Pretraining Data Mixtures with LLM-Estimated Utility.
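For a rough sense of what methods like these do, here is a minimal sketch of using a fitted data-mixing law to choose mixture weights. Everything in it is an invented illustration: the exponential loss form, the domain list, and all coefficients are assumptions, not values from the papers above.

```python
# Toy sketch: pick pretraining mixture weights by minimizing a *fitted* loss law,
# in the spirit of Data Mixing Laws / DoReMi / UtiliMax. The functional form and
# every coefficient below are invented placeholders, not values from those papers.
import numpy as np
from scipy.optimize import minimize

DOMAINS = ["web", "code", "alignment_docs"]   # hypothetical data domains
A = np.array([1.9, 1.4, 1.1])                 # assumed per-domain loss scales
K = np.array([6.0, 9.0, 12.0])                # assumed diminishing-returns rates

def predicted_val_loss(weights: np.ndarray) -> float:
    """Toy mixing law: each domain's validation loss falls exponentially in its share."""
    return float(np.sum(A * np.exp(-K * weights)))

def optimal_mixture() -> np.ndarray:
    """Minimize predicted loss over the probability simplex via a softmax parameterization."""
    def objective(logits: np.ndarray) -> float:
        w = np.exp(logits) / np.exp(logits).sum()
        return predicted_val_loss(w)
    result = minimize(objective, x0=np.zeros(len(DOMAINS)), method="Nelder-Mead")
    return np.exp(result.x) / np.exp(result.x).sum()

if __name__ == "__main__":
    for name, share in zip(DOMAINS, optimal_mixture()):
        print(f"{name}: {share:.1%}")
```

In practice the fitted law would come from many small-scale training runs, and the objective would include downstream and safety-relevant evaluations rather than just validation loss.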
See for example Register Always Matters: Analysis of LLM Pretraining Data Through the Lens of Language Variation for why this may be important.
See Appendix I of the new paper for a preliminary investigation: alignment pretraining seemed to vary the response to emergent misalignment (EM), but not in a consistent pattern. Possibly this is because the persona being elicited during EM is that of a human criminal, not of an AI, so is mostly-unaffected by changes to the AI-related parts of the pretraining set? Or possibly this evaluation is inherently noisy?
The linked job description document seems likely to go away once the position is filled. So here is the most relevant portion of it for anyone who wants to assess how seriously OpenAI appear to be taking this topic:
About the Team:
The Safety Systems team is responsible for various safety work to ensure our best models can be safely deployed to the real world to benefit the society and is at the forefront of OpenAI’s mission to build and deploy safe AGI, driving our commitment to AI safety and fostering a culture of trust and transparency.
The Pretraining Safety team’s goal is to build safer, more capable base models and enable earlier, more reliable safety evaluation during training. We aim to:
- Develop upstream safety evaluations to monitor how and when unsafe behaviors and goals emerge;
- Create safer priors through targeted pretraining and mid-training interventions that make downstream alignment more effective and efficient
- Design safe-by-design architectures that allow for more controllability of model capabilities
In addition, we will conduct the foundational research necessary for understanding how behaviors emerge, generalize, and can be reliably measured throughout training.
About the Role:
The Pretraining Safety team is pioneering how safety is built into models before they reach post-training and deployment. In this role, you will work throughout the full stack of model development with a focus on pre-training:
- Identify safety-relevant behaviors as they first emerge in base models
- Evaluate and reduce risk without waiting for full-scale training runs
- Design architectures and training setups that make safer behavior the default
- Strengthen models by incorporating richer, earlier safety signals
We collaborate across OpenAI’s safety ecosystem—from Safety Systems to Training—to ensure that safety foundations are robust, scalable, and grounded in real-world risks.
In this role, you will:
- Develop new techniques to predict, measure, and evaluate unsafe behavior in early-stage models
- Design data curation strategies that improve pretraining priors and reduce downstream risk
- Explore safe-by-design architectures and training configurations that improve controllability
- Introduce novel safety-oriented loss functions, metrics, and evals into the pretraining stack
- Work closely with cross-functional safety teams to unify pre- and post-training risk reduction
You might thrive in this role if you:
- Have experience developing or scaling pretraining architectures (LLMs, diffusion models, multimodal models, etc.)
- Are comfortable working with training infrastructure, data pipelines, and evaluation frameworks (e.g., Python, PyTorch/JAX, Apache Beam)
- Enjoy hands-on research — designing, implementing, and iterating on experiments
- Enjoy collaborating with diverse technical and cross-functional partners (e.g., policy, legal, training)
- Are data-driven with strong statistical reasoning and rigor in experimental design
- Value building clean, scalable research workflows and streamlining processes for yourself and others
(Note: My inclusion of this text in this footnote should not be read as a covert endorsement of working on alignment at OpenAI — people need to make their own ethical decisions on how best to spend their 80,000 hours.)
2026-01-20 05:20:43
Published on January 19, 2026 9:20 PM GMT
The main thing to know this time around is that the whole crazy ‘what is causing the rise in autism?’ debacle is over actual nothing. There is no rise in autism. There is only a rise in the diagnosis of autism.
Autism itself has not, however, risen in prevalence.
The entire shift in the rate of diagnosis of autism is explained by expanding the criteria and diagnosing it more often. Nothing actually changed.
We already knew that vaccines don’t cause autism, and that Tylenol doesn’t cause autism, but now we know such things on an entirely different level.
I admit that this result confirms all of my priors and thus I might be insufficiently skeptical of it, but there are a lot of people out there with what we in 2026 call autism who love picking apart such findings, and I’ve seen zero of them question the statistical result.
Autism used to mean something severe enough to render a child non-functional.
It now means someone capable of thinking clearly who insists words have meaning.
It also still means the first thing, and everything in between.
Using the same word for all these things, and calling it the autism spectrum, does not, overall, do those on either end of that spectrum any favors.
Matthew Yglesias: Study confirms that neither Tylenol nor vaccines is responsible for the rise in autism BECAUSE THERE IS NO RISE IN AUTISM TO EXPLAIN just a change in diagnostic standards.
The D.S.M.-III called for a diagnosis of infantile autism if all six of these criteria were met:
- Onset before 30 months of age
- Pervasive lack of responsiveness to other people
- Gross deficits in language development
- Peculiar speech patterns (if speech is present) such as immediate and delayed echolalia, metaphorical language, or pronominal reversal
- Bizarre responses to various aspects of the environment, e.g., resistance to change, peculiar interest in or attachments to animate or inanimate objects
- Absence of delusions, hallucinations, loosening of associations, and incoherence, as in schizophrenia
This is clearly describing a uniformly debilitating condition, especially in terms of criteria (3) and (4).
That is very, very obviously not what anyone centrally means by ‘autism’ in 2025, and we now go searching for it under every rock.
By the time the D.S.M.-IV came out in 1994, things like “lack of social or emotional reciprocity” when combined with “lack of varied spontaneous make-believe play or social imitative play appropriate to developmental level” could qualify a child for an autism diagnosis, as long as they also have trouble making eye contact.
Cremieux: The result is consistent with 98.25% of the rise being due to diagnostic drift and that’s not significantly different from 100%.
Bryan Caplan: Occam’s Razor. No one in my K-12 was called “autistic,” but there were plenty of weird kids.
Should the Autism Spectrum therefore be split apart? Yes. Obviously yes.
Derek Thompson: I think the answer to this question is clearly yes.
The expansion of the autism diagnosis in the last few decades has created a mess of meaning. It’s not helpful that “autism spectrum” now contains such an enormous bucket of symptoms that it applies to non-verbal adults requiring round-the-clock care and … Elon Musk.
The expansion of the autism spectrum label is especially poor for those at either extreme. It destroys clarity. It causes large underreactions in severe cases. It causes large overreactions in mild cases, including treating such children in well-intended but highly unproductive ways.
It also is, as Michael Vassar points out, effectively part of a war against caring about truth and whether words have meaning, as anyone who does care is now labeled as having a disorder. To be ‘normal’ rather than ‘neurodivergent’ you essentially have to show that you care deeply about social dynamics and trivialities and handle them without having to work at it, and that you don’t care about accuracy, whether words have meaning, or whether maps match their territories.
Seriously, one cannot write ‘most people need to exercise more’ often enough.
I heard a discussion on NPR’s Wait Wait Don’t Tell Me about a study finding that as little as half an hour a week of light exercise can do a substantial amount of good. The response from everyone was to joke that this means they didn’t need to do any more than that, and that doing anything at all made them heroes. And yes, there are big gains from ‘do anything at all’ rather than nothing, but there’s quite a lot left to gain.
University students given free gym memberships exercised more and saw a significant improvement in academic performance, dropping out of classes less, failing exams less, and completing 0.15 SDs more courses. There’s a perversity to hearing ‘this made kids healthy, which is good because they got higher grades’ but if that’s what it takes, okay, sure. The cost-benefit here purely in increased earnings seems good enough.
A large majority of students do not report having financial or time constraints at baseline, which suggests that the free gym card primarily removed psychological barriers to exercise. This is in line with the fact that many participants reported at baseline that they did not exercise at the gym because they were lazy, which may be interpreted as a sign of procrastination.
This all came from an average of 5.7 additional gym visits per student, which isn’t that great a return on a gym membership at first glance. For the effect to be this big there have to be shifts beyond the exercise, something psychological or at least logistical.
There still are very clear diminishing marginal returns.
Thus here is your periodic fitness reminder that while exercising and being in shape is great, there are rapidly decreasing practical returns once you become an outlier in strength, and going deep into gym culture and ‘looking jacked’ has actively negative marginal returns, both in terms of attractiveness and because injury risk rises a lot.
Exposure to potential allergens as infants decreases allergies, with peanuts being the central example. Carefully avoiding them, as doctors for a while told us to do, is exactly wrong. It’s so crazy that our ‘experts’ could get this so exactly backwards for so long; luckily such allergies are on the decline again now that we know better. But as Robin Hanson says, who is there to sue over this epic failure?
Gene Smith reports that some IVF doctors have figured out how to get much more reliable embryo transfer than the traditional 70%, and also higher egg yields per round. A highly competent IVF practice and doctor can make a big difference, and for now that could be worth more than the gains from superior embryo selection.
Study finds mRNA Covid-19 vaccines prolonged life of cancer patients, which they claim is via trained immunity from a Type I Interferon surge and activation of MDA5, but it seems they didn’t do a great job controlling for the obvious factor of whether this came from its protective effects against Covid-19? That seems like a giant hole in the study, but they are in Phase III which will settle it either way. If the effect is real you can likely enhance it quite a lot with a combination of mRNA composition and timing the shot to the start of using checkpoint inhibitors.
The latest experimental GLP-1 entry from Eli Lilly is showing the largest weight loss results we’ve seen so far, including big impacts on arthritis and knee pain.
Costco to sell Ozempic and Wegovy at large discount for people without insurance, at $499 a month, the same as Novo Nordisk’s direct-to-consumer website. You do still need a prescription.
Eli Lilly seems to have made a once-daily weight loss pill that works 80%-90% as well as injected Ozempic, with fewer side effects. It’s plausible this would make adoption much more common, and it definitely would if combined with affordable prices and easy access.
Unfortunately an early study suggests that GLP-1s do not, so far, reduce medical spending, with little offset in other spending being observed or projected. Given this is a highly effective treatment that reduces diabetes and cardiovascular risks, that is a weird result, and suggests something is broken in the medical system.
The elasticity of supply in developing new drugs is high. If you double the exclusivity period you get (in the linked job market paper) 47% more patent filings. We should absolutely be willing to grant more profitability or outright payments for such progress.
Australia offers a strong pitch as a location for clinical trials, and as a blueprint for reform here in America if we want to do something modest.
Dr. Shelby: when people talk about Australia for clinical trials, most discourse is round the 40%+ rebates.
BUT, what I haven’t heard discussed is that they don’t require IND packages in some cases. (eg. new insulin format, or new EPO analogues for anemia).
drugs going through this path only need CMC and ethics approval.
Ruxandra Teslo: Also no full GMP for Phase I. Imo US should just literally copy the Phase I playbook from Australia.
One of the most frustrating experiences in trying to propose ideas on how to make clinical development faster/cheaper, is that ppl who have on-the-ground experience are reluctant to share it, for fear of retribution. The cancel culture nobody talks about.
Your periodic reminder that today’s shortage of doctors is a policy choice intentionally engineered by the American Medical Association.
Ruxandra Teslo offers another round of pointing out that if we had fewer barriers to testing potential new treatments we’d get a lot more treatments, but no one in the industry has the courage to talk about how bad things are or to suggest fixes, because you would get blamed for the associated downside risks, even though the benefits outweigh the risks by orders of magnitude. Ruxandra notes that we have a desperate shortage of ‘Hobbit courage,’ the type of intellectual courage where you speak up even though you yourself have little to gain. This is true in many contexts, of course.
Patrick McKenzie (about Ruxandra’s article): A good argument about non-political professional courage, which is *also* an argument why those of us who have even moderate influence or position can give early career professionals an immense boost at almost trivial cost, by advancing them a tiny portion of their future self.
This is one reason this sometimes Internet weirdo keeps his inbox open to anyone and why he routinely speaks to Internet weirdos. I’m not too useful on biotech but know a thing or two about things.
Sometimes the only endorsement someone needs is “I read their stuff and they don’t seem to be an axe murderer.”
Sarah Constantin: The most awful stories I heard about “he said this and never got a grant again” were criticisms of the scientific establishment, of funders, or regulators.
Tame stuff like “there’s too much bureaucracy” or “science should be non-commercial.”
In terms of talking to internet weirdos who reach out, I can’t always engage, especially not at length, but I try to help when I can.
I don’t see enough consideration of ‘goal factoring’ around the testing process and the FDA. As in, doing tests serves two distinct purposes, which are less linked than you’d hope: (1) figuring out whether the drug actually works, and (2) getting formal approval so you are allowed to sell it.
If you outright knew the answer to #1, that would cut your effective costs for #2 dramatically, because now you only have to test one drug to find one success, whereas right now most drugs we test fail. So the underrated thing to do, even though it is a bit slower, is to do #1 first. As in, you gather strong Bayesian evidence on whether your drug works, by whatever means necessary and likely with a lot of AI help, and only after you know this do you go through formal channels and tests in America. I will keep periodically pointing this out in the hope that people listen.
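To make the arithmetic concrete, here is a toy back-of-the-envelope sketch. The per-trial cost and the success probabilities are illustrative assumptions, not figures from this post:

```python
# Toy back-of-the-envelope: expected formal-trial spend per approved drug, with and
# without strong prior evidence of efficacy. All numbers are illustrative assumptions.

COST_PER_TRIAL_PROGRAM = 50_000_000  # assumed cost of one formal trial program (USD)

def expected_cost_per_success(p_success: float) -> float:
    """With independent candidates, you expect to run 1/p trial programs per success."""
    return COST_PER_TRIAL_PROGRAM / p_success

baseline = expected_cost_per_success(0.10)   # assume ~10% of drugs entering trials succeed
screened = expected_cost_per_success(0.80)   # assume pre-screening raises that to ~80%

print(f"Without pre-screening: ${baseline:,.0f} per approved drug")
print(f"With pre-screening:    ${screened:,.0f} per approved drug")
print(f"Ratio: {baseline / screened:.0f}x")
```

Under those assumed numbers, knowing the answer to #1 first cuts formal-testing spend per success by a factor of eight; the point is the shape of the argument, not the specific values.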
Why do clinical trials in America cost a median of $40,000 per enrollee? Alex Tabarrok points us to an interview with Eli Lilly CEO Dave Ricks. There are a lot of factors making the situation quite bad.
Alex Tabarrok: One point is obvious once you hear it: Sponsors must provide high-end care to trial participants–thus because U.S. health care is expensive, US clinical trials are expensive. Clinical trial costs are lower in other countries because health care costs are lower in other countries but a surprising consequence is that it’s also easier to recruit patients in other countries because sponsors can offer them care that’s clearly better than what they normally receive. In the US, baseline care is already so good, at least at major hospital centers where you want to run clinical trials, that it’s more difficult to recruit patients.
Add in IRB friction and other recruitment problems, and U.S. trial costs climb fast.
See also Chertman and Teslo at IFP who have a lot of excellent material on clinical trial abundance.
Anatoly Karlin: Lilly stopped one of two trials of bimagrumab, a drug that preserves muscle mass during weight loss, after new FDA guidance suggested that body composition effects wouldn’t be enough for approval, but would need to show incremental weight loss beyond the GLPs.
GLP-1s help you lose weight. The biggest downside is potential loss of muscle mass. But the FDA has decided that fixing this problem is not good enough, and they won’t approve a new drug that is strictly better on an important metric than an existing drug. Not that they won’t recommend it, that they won’t approve it. As in, it’s strictly better, but it’s not enough strictly better in the ways they think count, so that’s a banning.
Which is all Obvious Nonsense and will make people’s lives much worse, as some lose muscle mass, others put in a lot more stress and effort to not lose it, and others don’t take the GLP-1 at all and thus don’t lose the weight.
The second best answer is that things like muscle loss prevention should count as superior endpoints.
The first best answer is that ‘superiority’ is a deeply stupid requirement. If you have drug [A] that does [X], and then I have drug [B] that also does [X] about as well, the existence of [A] should not mean we ban [B]. That’s crazy.
Uncertainty at the new iteration of the FDA is endangering drug development on top of the FDA’s usual job endangering drug development. You can’t make the huge investments necessary if you are at risk of getting rejected on drugs that have already been approved elsewhere, for reasons you had no ability to anticipate.
It would be good not to have an FDA, or even better to have a much less restrictive FDA. But if we’re not going to relax the rules, incompetence only makes it all worse.
Some good news: The FDA is now ‘open to Bayesian statistical approaches.’ I suspect this only means ‘you can use evidence from Phase 2 in Phase 3’ but it’s great to see them admitting in the announcement that Bayesian is better than frequentist.
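As a toy illustration of what ‘use evidence from Phase 2 in Phase 3’ can mean in Bayesian terms, here is a minimal conjugate-prior sketch. The patient counts and the 50% benchmark are invented, and this is not how any actual submission is structured:

```python
# Minimal Beta-Binomial sketch: treat Phase 2 response data as the prior that
# Phase 3 data then updates. All counts and thresholds are invented illustrations.
from scipy import stats

# Hypothetical Phase 2: 30 responders out of 50 patients, starting from a flat Beta(1, 1).
a_prior, b_prior = 1 + 30, 1 + 20

# Hypothetical Phase 3: 140 responders out of 250 patients.
a_post, b_post = a_prior + 140, b_prior + 110

posterior = stats.beta(a_post, b_post)
print(f"P(true response rate > 50%) = {1 - posterior.cdf(0.5):.3f}")
```

The practical upshot is that Phase 2 data does real work in the final inference instead of being thrown away once Phase 3 begins.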
Robin Hanson finds the most Hansonian medical study. Amy Finkelstein and Matthew Gentzkow use mover designs to estimate the causal impact of healthcare spending on mortality. They find that extra healthcare spending, on current margins, has a slightly negative impact.
Robin Hanson: ”we investigate whether places that increase health care spending also tend to be places that increase health. We find that they do not”
Their point estimate is that residents lose ~5 days of lifespan at age 65 for every 10% increase in medical spending. The standard error of this estimate is ~7 days.
So two sigma (95% confidence level) above the estimate is +9 days of lifespan. Really hard to see that being worth 2% of GDP.
The discussion is frank that this doesn’t rule out that different regions might be providing similar care with different levels of efficiency. In that case, there’s a lot of money to be saved by improving efficiency, but it doesn’t mean care is wasted. There are also potential selection effects on who moves. You would also want to consider other endpoints beyond mortality, but it’s hard to see those improving much if mortality doesn’t also improve.
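For intuition about what a mover design buys you here, a minimal simulated sketch follows. The effect sizes, noise levels, and variable names are invented, the true effect of spending is set to zero purely for illustration, and none of this is the paper’s actual specification:

```python
# Toy mover-design sketch: compare a naive cross-sectional estimate of the effect of
# regional medical spending on health against a movers-based estimate. The true causal
# effect of spending is set to zero here; everything is an invented illustration.
import numpy as np

rng = np.random.default_rng(0)
n_people, n_regions = 50_000, 50

spending = rng.normal(size=n_regions)                 # regional (log) medical spending
origin = rng.integers(n_regions, size=n_people)
# Assume people in high-spending regions are sicker (person-level confounding).
health_type = rng.normal(size=n_people) - 0.3 * spending[origin]

def outcome(region, noise):
    # True effect of spending is zero; outcomes reflect only person type plus noise.
    return health_type + 0.0 * spending[region] + noise

# Period 1: everyone lives in their origin region. Period 2: 20% move to a random region.
moved = rng.uniform(size=n_people) < 0.2
destination = np.where(moved, rng.integers(n_regions, size=n_people), origin)
y1 = outcome(origin, rng.normal(scale=0.5, size=n_people))
y2 = outcome(destination, rng.normal(scale=0.5, size=n_people))

def slope(x, y):
    return np.polyfit(x, y, 1)[0]

naive = slope(spending[origin], y1)                                    # confounded by sorting
mover = slope(spending[destination[moved]] - spending[origin[moved]],  # within-person change
              y2[moved] - y1[moved])

print(f"Naive cross-sectional estimate: {naive:+.3f}")
print(f"Mover-design estimate:          {mover:+.3f}   (true effect is 0)")
```

Differencing within movers removes fixed differences in who lives where, though as noted above it cannot fix selection in who chooses to move.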
Robin Hanson links us to this paper, showing that greater expected pension benefits led to more preventative care, better diagnosis of chronic diseases and improved mortality outcomes. As in, there is a real incentive effect on health, at least at some income levels.
Gene Kim offers a writeup of his wife’s hospital experience, explaining some basics of what you need to do to ensure your loved ones get the care they need. Essentially, the Emergency Department is very good at handling things you can handle in the Emergency Department, but the wiring connecting the various departments is often quite poor, so for anything else it is on you to ensure coordination, make sure information reaches those who need it, and figure out where you’re going and how to get there. The good news is that everyone wants it to work out, but no one else is going to step up. It’s on you to ask the questions, share and gather the info and so on. The one thing missing here: don’t be afraid to ask LLMs for help too.
Being admitted to a mental hospital is very, very bad for you. This is known. It severely disrupts and potentially ruins your life permanently. The two weeks after release from the hospital put you at very high risk of suicide. Having someone committed, even for a few days, is not something to be taken lightly.
That doesn’t mean one should never do it. In sufficiently dire circumstances, where outcomes are going to be terrible no matter what you do, it is still superior to known alternatives. The question is, how dire must the circumstances be to make this true? Are we doing it too often, or not often enough?
A new study measures this by looking at marginal admissions: different doctors act very differently in marginal cases, allowing us to conduct something remarkably close to an RCT. Such disagreement is very common; 43% of those evaluated for involuntary commitment for the first time fall into this group in the sample.
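For readers unfamiliar with this kind of design, here is a minimal simulated sketch of the examiner-leniency (judge instrument) idea. The effect sizes, sample sizes, and variable names are invented for illustration and are not the study’s data or code:

```python
# Toy examiner-leniency (judge IV) sketch: doctors vary in how readily they admit
# marginal cases, and that quasi-random assignment identifies the causal effect of
# admission. Every number below is an invented illustration, not the study's data.
import numpy as np

rng = np.random.default_rng(0)
n_patients, n_doctors = 500_000, 200

doctor = rng.integers(n_doctors, size=n_patients)
leniency = rng.uniform(0.2, 0.6, size=n_doctors)      # each doctor's admission propensity
severity = rng.normal(size=n_patients)                 # unobserved confounder

# Admission depends on both the assigned doctor's leniency and the patient's severity.
p_admit = np.clip(leniency[doctor] + 0.10 * severity, 0, 1)
admitted = (rng.uniform(size=n_patients) < p_admit).astype(float)

# Bad outcome: assumed +3 percentage point causal effect of admission, plus a
# severity effect that confounds any naive admitted-vs-not comparison.
p_bad = np.clip(0.15 + 0.03 * admitted + 0.04 * severity, 0, 1)
outcome = (rng.uniform(size=n_patients) < p_bad).astype(float)

def slope(x, y):
    return np.polyfit(x, y, 1)[0]

naive = slope(admitted, outcome)                   # biased upward by severity
first_stage = slope(leniency[doctor], admitted)    # instrument -> treatment
reduced_form = slope(leniency[doctor], outcome)    # instrument -> outcome
iv = reduced_form / first_stage                    # Wald / two-stage estimate

print(f"Naive admitted-vs-not estimate: {naive:+.3f}")
print(f"Judge-IV estimate:              {iv:+.3f}   (assumed true effect: +0.030)")
```

The leniency of the doctor you happen to draw moves your chance of admission but (by assumption) affects your outcome only through admission, which is what lets the comparison of patients assigned to strict versus lenient doctors approximate a randomized trial.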
Even with 7,150 hospitalization decisions, the study’s power is still not what we would like (the results are statistically significant, but not by much when considered individually), but the damage measured is dramatic: the chance of a marginal admit being charged with a violent crime within three months increases from 3.3% to 5.9% if they get admitted, and the risk of suicide or death by drug overdose rises from 1.1% to 2.1%.
This matches the associated incentives. If you don’t refer or admit someone at risk, and something goes wrong, you are now blameworthy, and you put yourself in legal jeopardy. If you do refer or admit them, then you wash your hands of the situation, and what happens next is not on you. Thus, you would expect marginal cases to be committed too often, which is what we find here.
It seems reasonable to conclude that the bar for involuntary commitment should be much higher, and along the lines of ‘only do this if there is no doubt and no choice.’
The best description I’ve seen of how to think about ‘biological age’ measures:
Ivan: i will only trust your health app’s ‘biological age’ report if it comes bundled with a life insurance offer.