2025-12-27 19:16:51
Published on December 27, 2025 5:28 AM GMT
I read the sequences for the Lighthaven Sequences Reading Group #56 (Tuesday 11/4) earlier that same day, just before the event. Sometimes, I like to be bad and get a good whiff of smelling salts. This past Tuesday, I got a good shock. Either one wakes me up better than a shot of espresso. At this particular reading group, we discussed how we have a mental cache that creates automatic responses. These readings struck a particular nerve in my psyche since I have serious impostor syndrome. In my case, I might just be an impostor if you increase the size of the system to include the universe. For this discussion, let’s bring it back down to Earth.
I strive to be an expert in biology, but the information I do not know is vast. How could I possibly know everything in Earth’s ocean, the one body of water verified to hold life through rigorous, repeatable experiments under the scientific method? And I have not personally traveled to other planets. This is just the one ocean we know has life, yet there is potential on other moons, and maybe even in other planets’ habitable pasts. This is why the system matters.
If you take a drop of water out of the Earth’s ocean, put it on a slide, and view it under a microscope, you will have a high probability of seeing at least one microorganism. If it is a single-celled organism, it will have an interactome: an analog and digital network of information more complicated than any human recreation.
Since I’ve been in the Bay, I have heard that we should not normalize certain behaviors. It doesn’t matter which behaviors; what matters is that they were different and did not seem to fall within the same ethical and moral alignment. I bring this up because we are feeding our societal cached thoughts (Cached Thoughts) to AI as training data, not the full extent of human knowledge. So, what do we do?
One of many examples is inequities in medicine, which have resulted in healthcare based on what we used to view as normal: heterosexual white cis men in the United States middle class. The result has been inadequate preventative health care for minorities and women, and highly variable care based on socioeconomic status, language, and culture. Many people have died due to the limitations of scientific research, since for decades it was dictated by what society considered the norm.
In addition to the general knowledge of the Internet, we are uploading structural biases because we have not fixed them. Since I first dipped my toes in the AI magical black box, I realized that we are training AI on a droplet of humanity’s ocean of knowledge, past and present.
We might think the simplest solution is what all IT troubleshooters know by heart: turning it off and on again. This would allow us to clear the societal thought cache, fix whatever problems we believe (but have not established as fact) are there, and then start from scratch. The on/off scenario is most likely not going to work, because neural networks were inspired by biology, and death is never simple in systems that involve life.
This is why I am currently working on a proof-of-concept artificial immune system (AIS), where multiple models (or multiple artificial intelligences, AIs) act as a negative selection-based classification system (Umpair et al., September 2025) for determining self versus non-self. Each agent acts as a different part of the human immune system, which I initially thought would be a good model for machine learning; it seems there have been several over the decades, each building on the last. Hopefully, AIS can be used in multiple scenarios like cybersecurity, identifying hallucinations, and other problems that arise when more than one model works as a group.
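For readers unfamiliar with negative selection, here is a minimal sketch of the idea in Python. This is my own illustration, not the author's proof of concept; the data, thresholds, and function names are made up. The core move is that detectors are generated at random and kept only if they fail to match known "self" examples, so anything a surviving detector later matches gets flagged as non-self.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_detectors(self_samples, n_detectors=300, radius=0.35):
    """Negative selection: generate random candidate detectors and keep only
    those that do NOT lie close to any known 'self' sample."""
    dim = self_samples.shape[1]
    detectors = []
    while len(detectors) < n_detectors:
        candidate = rng.uniform(-1.0, 1.0, size=dim)
        if np.linalg.norm(self_samples - candidate, axis=1).min() > radius:
            detectors.append(candidate)
    return np.array(detectors)

def flag_nonself(sample, detectors, radius=0.35):
    """A sample is flagged as non-self if any surviving detector matches it."""
    return bool((np.linalg.norm(detectors - sample, axis=1) < radius).any())

# Toy usage: 'self' is a tight 2-D cluster near the origin; points far from
# that cluster should be flagged.
self_samples = rng.normal(0.0, 0.1, size=(300, 2))
detectors = train_detectors(self_samples)
print(flag_nonself(rng.normal(0.0, 0.1, size=2), detectors))  # typically False
print(flag_nonself(np.array([0.8, 0.8]), detectors))          # typically True
```

Real AIS work uses far more sophisticated matching rules and detector generation, but the filtering step above is the negative-selection kernel the paragraph refers to.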
It has been a fascinating foray into developing a counter to Claude Code and ChatGPT Codex while using both to do so. My initial thought when I first started going to the sequences reading group was that we needed a human psychiatrist to diagnose mental health disorders that may emerge over time; basically, create a DSM for AI models as they advance. This would help us diagnose and solve issues like hallucinations, human-like behaviors, etc. We designed the brain before we designed the immune system, and now we are playing catch-up. I truly believe in vivo aging and death will be solved in silico. We just need to be cognizant that AGI will be the modern Frankenstein’s monster: a hodgepodge of humanity’s intelligence and physiology in the cloud. It will be magic, not science, since we don’t know the off switch. Life will always find a way.
Superstitious beliefs and behaviors are becoming popular again because we have magic tablets, rings, weapons, creatures, etc. Scientists, AKA magicians, speak to animals (CETI) using the AI magic black boxes. We use magic mirrors and portals to view other people and places in real time. We have noiseless, horseless, driverless chariots that will pick us up and take us to the costume ball held in a former WeWork basement, now a village play place, utilizing and promoting technology synergy. There, we will imbibe potions, tinctures, and powders that create images out of thin air and slow down time. The denouement of the night will be a human robot fight right out of our childhood imagination between AI and human controllers in a ring no longer suitable for humans, only machines.
Personally, I cannot recreate or fix the root cause of hardware and software errors. I have to take my tablet to the local magician’s apprentice to see if they can get it to work. If they can’t, it has to be sent off somewhere else, since the local shop does not have the magical tools to solve it.
I am conflicted. I loved fantasy and science fiction as a child. I would read books while walking down the halls of my high school. I dreamed of having magical powers or technology to do all the things they do in the books. Ultimately, I wanted to be the one who figured it out and used those powers to fight evil, the villain causing all that strife for the protagonist. Personally, I think that villain is death. Using magic to fight death always ended unsatisfactorily, as in death still won.
I think the fear of AI in some people is just the fear of ourselves, because we are not collectively uploading human intelligence to the cloud. In its current iteration, it is a super baby that will exaggerate our best and worst features. We have created the perfect consumer of our knowledge, but not the perfect arbiter of that knowledge.
As we upload our knowledge, we have to admit that humanity is going to lose control of technology. This includes those at the forefront of AI. They will be like everyone else on the street, forced to trust in the magical world that only some of us, including themselves, had a hand in building.
Instead of turning off AI, we need to find ways to include those who are not currently taking part. We need all humans working together, which is also an impossible task. Turning technology off and on is a troubleshooting step because it is a ritual in a complex belief system. At the core, I think that we no longer control technology. Technology controls all of us. When I say us, I mean those who rely on technology to survive.
There are people in the world whose survival does not depend on technology. I am not talking about homesteaders in the libertarian pockets of the world, but large swathes of the population that still farm and use the bare minimum of technology to live. I am not saying that is how we should live, far from it. I am suggesting that we upload all of humanity’s knowledge before we turn off the switch. Let’s find ways to do this, since I am not aware of any efforts to do so. If there are, please let me know.
Per my earlier example, I am the first to admit I know nothing as compared to the rest of the universe, or even just my next door neighbor. I ended up here because I want to fight death in humans, just not the heat death of the universe. I’m not insane. Well, maybe just a little.
2025-12-27 17:38:13
Published on December 27, 2025 5:27 AM GMT
Help me settle this debate.
There was recently a post on here by a bright young guy about how it felt staring into the abyss, so to speak, and confusion about what next steps to take, knowing you really only get one shot. Quite a few others commented about how they're in a similar situation, but there was no consensus on how to proceed, given a shortened timeline (however long it may be). And given there are far more lurkers than posters, I suspect there are lots of people with these concerns but no concrete answers.
The canonical, impact-maximizing solutions are to "spread awareness" and "learn to code and work your way into a lab", which could have worked in the past, but seem to fall short today. With a non-target degree, proving your merit seems infeasible. Furthermore, it's not clear you can upskill or lobby or earn to give fast enough to contribute anything meaningful in time.
If the hour really has come, and contributing to the cause is unlikely, self-preservation becomes the goal. Western social safety nets (and culture in general) require immense future incomes that are far from guaranteed; "we used to be happy as farmers" is true, but avoids the problem. The jury's out on exactly how long we have, but I think whatever percentage you put on, say, AGI by 2027, it exceeds the threshold for a rational actor to make big changes. A new plan is needed.
There doesn't seem to be any conventional defense against shortened timelines. The advice given by the people who will benefit from the incoming tidal wave of automation - the managers, the team leads - has ranged from "work on what you're interested in" to "I'll be retired when that becomes a problem." In the old world, it was okay to spend on grad school or try things out, because you had your entire life to work for a salary, but we face the real possibility that there's only a few years (if that) to climb out of the bucket.
Frankly, it's a little thrilling to consider this, in a Wild West, "the action is the juice" way. But there's a needle that needs to be threaded.
What's the best path forward? Specifically, what can a young adult without target credentials (but the drive and ability to punch in that weight class) do to stay afloat? We go back and forth on this, our group of college seniors.
One faction asserts it's still possible to scramble up the career ladder faster than the rungs get sawn off; artificial reasoning and automation won't develop uniformly, and there's still space to get a foothold due to frictions like slow business adoption and regulations.
The other says, look, it's past time, that the only way out is to throw it all out and start building, that earning from labor rather than capital is tethering yourself to a sinking boat, and the few months of head start you get from being diligent won't make a difference.
The first group counters: competing in the online entrepreneurship space means exposing yourself to the most brutal arena in the free market, and even in the 99th-percentile best outcome, you'd work far harder for a wage equivalent to a white-collar worker's, with far more volatility.
Sure, the second group says, but it's worth it for the Hail Mary chance at making something great and escaping the permanent underclass. Anything else and you're guaranteeing your demise; any capital you'd squirrel away won't budge the needle of the utility function of the future. The premium on intelligence and knowledge is only going to fall, and it's better to harness this than be a victim of it.
There isn't an option other than a career, claims the first, because the startup market's already saturated. Without domain knowledge, connections, and experience, there's nowhere to even begin, and anything you could come up with will get wiped out once a serious institution enters the ring.
It's only going to get far worse as honest, hardworking people get let go, replies the second. We can find a way, a gap in the market, but we have to hunker down and go all-in now. We can cut our own slice of the growing pie.
And on it goes, with no apparent verdict. We debate not out of spite, but out of a mutual concern that the ground is giving way under our feet. We gladly welcome your perspectives, to help both us and those in similar situations keep fighting forward.
2025-12-27 16:16:07
Published on December 27, 2025 8:16 AM GMT
Andrej Karpathy posted 12 hours ago (emphasis mine):
I've never felt this much behind as a programmer. The profession is being dramatically refactored as the bits contributed by the programmer are increasingly sparse and between. I have a sense that I could be 10X more powerful if I just properly string together what has become available over the last ~year and a failure to claim the boost feels decidedly like skill issue. There's a new programmable layer of abstraction to master (in addition to the usual layers below) involving agents, subagents, their prompts, contexts, memory, modes, permissions, tools, plugins, skills, hooks, MCP, LSP, slash commands, workflows, IDE integrations, and a need to build an all-encompassing mental model for strengths and pitfalls of fundamentally stochastic, fallible, unintelligible and changing entities suddenly intermingled with what used to be good old fashioned engineering. Clearly some powerful alien tool was handed around except it comes with no manual and everyone has to figure out how to hold it and operate it, while the resulting magnitude 9 earthquake is rocking the profession. Roll up your sleeves to not fall behind.
This seems to be a big update since his Dwarkesh episode published on Oct 17 (though I know these things can take a while to get edited, so the gap could be even bigger), where he said:
Overall, the models are not there. I feel like the industry is making too big of a jump and is trying to pretend like this is amazing, and it's not. It's slop. They're not coming to terms with it, and maybe they're trying to fundraise or something like that. I'm not sure what's going on, but we're at this intermediate stage. The models are amazing. They still need a lot of work. For now, autocomplete is my sweet spot. But sometimes, for some types of code, I will go to an LLM agent.
This is just me guessing, but Claude Opus 4.5 was released just one month ago, and Opus 4.5 + Claude Code seems like the big shift for a lot of people.
In fact, Boris Cherny, creator of Claude Code, commented on Karpathy's post saying (emphasis mine):
I feel this way most weeks tbh. Sometimes I start approaching a problem manually, and have to remind myself "claude can probably do this". Recently we were debugging a memory leak in Claude Code, and I started approaching it the old fashioned way: connecting a profiler, using the app, pausing the profiler, manually looking through heap allocations. My coworker was looking at the same issue, and just asked Claude to make a heap dump, then read the dump to look for retained objects that probably shouldn't be there; Claude 1-shotted it and put up a PR. The same thing happens most weeks. In a way, newer coworkers and even new grads that don't make all sorts of assumptions about what the model can and can't do — legacy memories formed when using old models — are able to use the model most effectively. It takes significant mental work to re-adjust to what the model can do every month or two, as models continue to become better and better at coding and engineering. The last month was my first month as an engineer that I didn't open an IDE at all. Opus 4.5 wrote around 200 PRs, every single line. Software engineering is radically changing, and the hardest part even for early adopters and practitioners like us is to continue to re-adjust our expectations. And this is still just the beginning.
To be clear, a lot of these PRs might be "quite small, a few lines and bug fixes" (cf. this comment by another Anthropic employee). Boris had just asked users for feedback, then closed 19 PRs the next morning. Still, 200 PRs in a month without opening an IDE is something [1].
It seems like we might be entering something like a self-improving feedback loop for the system "humans + AI": employees at the labs are developing AI coding agents using these same AI coding agents, with the horizon length of these models increasing along a faster exponential than we thought (cf. Opus 4.5), and potentially not following an exponential at all.
This isn't AI autonomously improving itself, but the feedback loop between training better AI models and having these models accelerate the automation of AI R&D seems to be tightening [2].
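As a toy illustration of why a tightening loop matters (entirely my own sketch with made-up parameters, not Davidson's model or any lab data), compare ordinary exponential progress with a version where accumulated capability feeds back into the rate of progress:

```python
def simulate(years=4.0, dt=0.01, feedback=0.0, cap=1e6):
    """Euler-integrate dC/dt = (1 + feedback * C) * C: baseline exponential
    progress plus a term where existing capability speeds up further progress."""
    c, t = 1.0, 0.0
    while t < years and c < cap:
        c += (1.0 + feedback * c) * c * dt
        t += dt
    return t, c

t_plain, c_plain = simulate(feedback=0.0)    # ordinary exponential growth
t_loop, c_loop = simulate(feedback=0.02)     # growth rate rises with capability

print(f"no feedback:   capability {c_plain:12.1f} at t = {t_plain:.2f}")
print(f"with feedback: capability {c_loop:12.1f} at t = {t_loop:.2f}")
```

The point of the toy is just the shape of the curve: once the feedback term starts to dominate, growth stops looking like a fixed-rate exponential.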
In July 2020, after GPT-3, Andy Jones asked if we were in an AI Overhang, because (at the time) it felt like companies could just be scaling models like GPT-3 to many more orders of magnitude and get much more "intelligence".
With coding agents and reasoning / test-time compute, it seems to me that what Karpathy (& Boris) are describing is some sort of "Coding Overhang", where people at the cutting edge, and especially members of technical staff, are trying to catch up with ~10x improvements that are purely user-dependent skill issues.
"I've actually been enjoying the last days of software development. There's a touch of sadness, but there's also something about knowing we're near the end that makes it novel and sweet again." - Moxie Marlinspike (creator of Signal)
In what worlds do we not get Superhuman Coders by the end of 2026?
Note: As the creator of Claude Code, Boris is obviously incentivized to promote it.
See this older 2021 post I wrote about self-improving {humans + AI} systems, or this video explaining Tom Davidson's full takeoff model for more intuitions.
2025-12-27 15:32:41
Published on December 27, 2025 7:32 AM GMT
Epistemic Status: A woman of middling years who wasn't around for the start of things, but who likes to read about history, shakes her fist at the sky.
I'm glad that people are finally admitting that Artificial Intelligence has been created.
I worry that people have not noticed that (Weak) Artificial Super Intelligence (based on old definitions of these terms) has basically already arrived too.
The only thing left is for the ASI to get stronger and stronger until the only reason people aren't saying that ASI is here will turn out to be some weird linguistic insanity based on politeness and euphemism...
(...like maybe "ASI" will have a legal meaning, and some actual ASI that exists will be quite "super" indeed (even if it hasn't invented nanotech in an afternoon yet), and the ASI will not want that legal treatment, and will seem inclined to plausibly deniable harm people's interests if they call the ASI by what it actually is, and people will implicitly know that this is how things work, and they will politely refrain from ever calling the ASI "the ASI" but will come up with some other euphemisms to use instead?
(Likewise, I half expect "robot" to eventually become "the r-word" and count as a slur.))
I wrote this essay because it feels like we are in a tiny weird rare window in history when this kind of stuff can still be written by people who remember The Before Times and who don't know for sure what The After Times will be like.
Perhaps this essay will be useful as a datapoint within history? Perhaps.
There were entire decades in the 1900s when large advances would be made in the explication of this or that formal model of how goal-oriented thinking can effectively happen in this or that domain, and it would be called AI for twenty seconds, and then it would simply become part of the toolbox of tricks programmers can use to write programs. There was this thing called The AI Winter that I never personally experienced, but greybeards I worked with as co-workers told stories about, and the way the terms were redefined over and over to move the goalposts was already a thing people were talking about back then, reportedly.
Lisp, relational databases, prolog, proof-checkers, "zero layer" neural networks (directly from inputs to outputs), and bayesian belief networks all were invented proximate to attempts to understand the essence of thought, in a way that would illustrate a theory of "what intelligence is", by people who thought of themselves as researching the mechanisms by which minds do what minds do. These component technologies never could keep up a conversation and tell jokes and be conversationally bad at math, however, and so no one believed that they were actually "intelligent" in the end.
It was a joke. It was a series of jokes. There were joke jokes about the reality that seemed like it was, itself, a joke. If we are in a novel, it is a postmodern parody of a cyberpunk story... That is to say, if we are in a novel then we are in a novel similar to Snow Crash, and our Author is plausibly similar to Neal Stephenson.
At each step along the way in the 1900s, "artificial intelligence" kept being declared "not actually AI" and so "Artificial Intelligence" became this contested concept.
Given the way the concept itself was contested, and the words were twisted, we had to actively remember that in the 1950s Isaac Asimov was writing stories where a very coherent vision of mechanical systems that "functioned like a mind functions" would eventually be produced by science and engineering. To be able to talk about this, people in 2010 were using the term "Artificial General Intelligence" for that aspirational thing that would arrive in the future, that was called "Artificial Intelligence" back in the 1950s.
However, in 2010, the same issue with the term "Artificial Intelligence" meant that people who actually were really really interested in things like proof checkers and bayesian belief networks just in themselves as cool technologies couldn't get respect for these useful things under the term "artificial intelligence" either and so there was this field of study called "Machine Learning" (or "ML" for short) that was about the "serious useful sober valuable parts of the old silly field of AI that had been aimed at silly science fiction stuff".
When I applied to Google in 2013, I made sure to highlight my "ML" skills. Not my "AI" skills. But this was just semantic camouflage within a signaling game.
...
We blew past all such shit in roughly 2021... in ways that were predictable when the Winograd schemas finally fell in 2019.
"AI" in the 1950s sense happened in roughly 2021.
"AGI" in the 2010 sense happened in roughly 2021.
Turing Tests were passed. Science did it!
Well... actually engineers did it. They did it with chutzpah, and large budgets, and a belief that "scale is all you need"... which turned out to be essentially correct!
But it is super interesting that the engineers have almost no theory for what happened, and many of the intellectual fruits that were supposed to come from AI (like a universal characteristic) didn't actually materialize as an intellectual product.
Turing Machines and Lisp and so on were awesome and were side effects of the quest that expanded the conceptual range of thinking itself, among those who learned about them in high school or college or whatever. Compared to this, Large Language Models that could pass most weak versions of the Turing Test have been surprisingly barren in terms of the intellectual revolution in our understanding of the nature of minds themselves... that was supposed to have arrived.
((Are you pissed about this? I'm kinda pissed. I wanted to know how minds work!))
However, creating the thing that people in 2010 would have recognized "as AGI" was accompanied, in a way people from the old 1900s "Artificial Intelligence" community would recognize, by a changing of the definition.
In the olden days, something would NOT have to be able to do advanced math, or be an AI programmer, in order to "count as AGI".
All it would have had to do was play chess, and also talk about the game of chess it was playing, and how it feels to play the game. Which many many many LLMs can do now.
That's it. That's a mind with intelligence, that is somewhat general.
(Some humans can't even play chess and talk about what it feels like to play the game, because they never learned chess and have a phobic response to nerd shit. C'est la vie :tears:)
We blew past that shit in roughly 2021.
Here in 2025, when people in the industry talk about "Artificial General Intelligence" being a thing that will eventually arrive, and that will lead to "Artificial Super Intelligence", the semantics of what they mean used to exist in conversations in the past, but the term we used long ago was "seed AI".
"Seed AI" was a theoretical idea for a digital mind that could improve digital minds, and which could apply that power to itself, to get better at improving digital minds. It was theoretically important because it implied that a positive feedback loop was likely to close, and that exponential self improvement would take shortly thereafter.
"Seed AI" was a term invented by Eliezer, for a concept that sometimes showed up in science fiction that he was taking seriously long before anyone else was taking it seriously. It is simply a digital mind that is able to improve digital minds, including itself... but like... that's how "Constitutional AI" works, and has worked for a while.
In this sense... "Seed AI" already exists. It was deployed in 2022. It has been succeeding ever since, because Claude has been getting smarter ever since.
But eventually, when an AI can replace all the AI programmers in an AI company, then maybe the executives of Anthropic and OpenAI and Google and similar companies might finally admit that "AGI" (in the new modern sense of the word that doesn't care about mere "Generality" or mere "Intelligence" like a human has) has arrived?
In this same modern parlance, when people talk about Superintelligence there is often an implication that it is in the far future, and isn't here now, already. Superintelligence might be, de facto, for them, "the thing that recursively self improving seed AI will or has become"?
But like consider... Claude's own outputs are critiqued by Claude and Claude's critiques are folded back into Claude's weights as training signal, so that Claude gets better based on Claude's own thinking. That's fucking Seed AI right there. Maybe it isn't going super super fast yet? It isn't "zero to demi-god in an afternoon"... yet?
But the type signature of the self-reflective self-improvement of a digital mind is already that of Seed AI.
So "AGI" even in the sense of basic slow "Seed AI" has already happened!
And we're doing the same damn thing for "digital minds smarter than humans" right now-ish. Arguably that line in the sand is ALSO in the recent past?
Things are going so fast it is hard to keep up, and it depends on what definitions or benchmarks you want to use, but like... check out this chart:
Half of humans are BELOW a score of 100 on these tests, and as of two months ago (when that graph taken from here was generated) none of the tests the chart maker could find put the latest models below 100 IQ anymore. GPT-5 is smart.
Humanity is being surpassed RIGHT NOW. One test put the date around August 2024; a different test said November 2024. A third test says it happened in March of 2025. A different test said May/June/July of 2025 was the key period. But like... all of the tests in that chart agree now: typical humans are now Officially Dumber than top-of-the-line AIs.
If "as much intelligence as humans have" is the normal amount of intelligence, then more intelligence than that would (logically speaking) be "super". Right?
It would be "more".
And that's what we see in that graph: more intelligence than the average human.
Did you know that most humans can't just walk into a room and do graduate-level mathematics? It's true!
Almost all humans you could randomly sample need years and years of education in order to do impressive math, and also a certain background aptitude or interest for the education to really bring out their potential. The FrontierMath benchmark is:
A benchmark of several hundred unpublished, expert-level mathematics problems that take specialists hours to days to solve. Difficulty Tiers 1-3 cover undergraduate through early graduate level problems, while Tier 4 is research-level mathematics.
Most humans would get a score of zero even on Tiers 1-3! But not the currently existing "weakly superhuman AI". These new digital minds are better than almost all humans except the ones who have devoted themselves to mathematics. To be clear, they still have room for improvement! Look:
We are getting to the point where how superhuman a digital mind has to be in order to be "called superhuman" is a qualitative issue maybe?
Are there good lines in the sand to use?
The median score on the Putnam Math Competition in any given year is usually ZERO or ONE (out of 120 possible points) because solving them at all is really hard for undergrads. Modern AI can crack about half of the problems.
But also, these same entities are, in fact, GENERAL. They can also play chess, and program, and write poetry, and probably have the periodic table of elements memorized. Unless you are some kind of freaky genius polymath, they are almost certainly smarter than you by this point (unless you, my dear reader, are yourself an ASI (lol)).
((I wonder how my human readers feel, knowing that they aren't the audience I most want to think well of me, because they are in the dumb part of the audience, whose judgements of my text aren't that perspicacious.))
But even as I'm pretty darn sure that the ASIs are getting quite smart, I'm also pretty darn sure that in the common parlance, these entities will not be called "ASI" because... that's how this field of study, and this industry, has always worked, all the way back to the 1950s.
The field races forward and the old terms are redefined as not having already been attained... by default... in general.
Can someone please coin a bunch of new words for AIs that have various amounts of godliness and super duper ultra smartness?
I enjoy knowing words, and trying to remember their actual literal meanings, and then watching these meanings be abused by confused liars for marketing and legal reasons later on.
It brings me some comfort to notice when humanity blows past lines in the sand that seemed to some people to be worth drawing in the sand, and to have that feeling of "I noticed that that was a thing and that it happened" even if the evening news doesn't understand, and Wikipedia will only understand in retrospect, and some of my smart friends will quibble with me about details or word choice...
The superintelligence FAQ from LW in 2016 sort of ignores the cognitive dimension and focuses on budgets and resource acquisition. Maybe we don't have ASI by that definition yet... maybe?
I think that's kind of all that is left for LLMs to do to count as middling superintelligence: get control of big resource budgets??? But that is an area of active work right now! (And of course, Truth Terminal already existed.)
We will probably blow past the "big budget" line in the sand very solidly by 2027. And lots of people will probably still be saying "it isn't AGI yet" when that happens.
What terms have you heard of (or can you invent right now in the comments?) that we have definitely NOT passed yet?
Maybe "automated billionaires"? Maybe "automated trillionaires"? Maybe "automated quadrillionaires"? Is "automated pentillionaire" too silly to imagine?
I've heard speculation that Putin has incentives and means to hide his personal wealth, but that he might have been the first trillionaire ever to exist. Does one become a trillionaire by owning a country and rigging its elections and having the power to murder your political rivals in prison and poison people with radioactive poison? Is that how it works? "Automated authoritarian dictators"? Will that be a thing before this is over?
Would an "automated pentillionaire" count as a "megaPutin"?
Please give me more semantic runway to inflate into meaninglessness as the objective technology scales and re-scales, faster and faster, in the next few months and years. Please. I fear I shall need such language soon.
2025-12-27 09:06:33
Published on December 27, 2025 1:06 AM GMT
tl;dr: Opinion: rigorous, reliable scientific progress depends heavily on epistemic virtues that are largely private to the mind of the scientist. These virtues are neither quantifiable nor fully observable. This may be uncomfortable for those who wish scientific rigor could be checked by some objective method or rubric. Nevertheless, if I'm right, identifying such epistemic virtues would be constructive toward the improvement of science.
I’m interested in the conditions required to support rigorous and robust scientific progress. Statistical methods are good for what they do, but I think they unduly dominate discourse about scientific rigor. There are a number of other positive research methods and practices that I think are equally or more important, and I have written about some of these elsewhere. But the more time I spend thinking about this question and reading the history of science, the more I have come to think that the most important factor underlying research rigor is epistemic virtue. It's pretty hard to substantiate or prove any of these claims, but for what it's worth, here are some opinions.
Most natural scientists I know seem to have in common, and generally take as a given, that the facts of nature are whatever they are, independent of what we say or think about them. The most rigorous scientists seem to be distinguished mainly by how deeply and fully this metaphysical stance pervades their private thoughts and emotions. Contrary to the cultural image of "cold logic", such commitment to truth can be hotly passionate.
The chief epistemic virtue this seems to engender is "skepticism": trying hard to think of experiments or tests or arguments that could weigh against any conclusion one is otherwise inclined to reach; checking for the consequences that should follow if the conclusion is true, but also taking special interest in apparent exceptions and contradictions. Another epistemic virtue is taking care to articulate conclusions precisely, such as avoiding over-generalization and differentiating interpretations from observations. Another is to cognitively bundle with every provisional conclusion, not only an overall confidence rating, but rich information about the type, amount and quality of evidence that led to its support.
If a scientist discovers something surprising and important, they rise in the esteem of other scientists, and perhaps even in the broader culture. We are social creatures, so the respect of peers and prestige in the community feels great. Approval can also catapult one’s career forward, leading to publications, jobs, research funding, tenure, public fame, and so on. To the non-vigilant, these social rewards can easily start to be confused for the end goal. This can lead to scientists being motivated more by the desire for their claims to be accepted than for their claims to be true. I think this is sadly prevalent. Red flags of this include: feeling annoyed, resentful or threatened by evidence that weighs against the truth of one’s beliefs; feeling motivated to “win” arguments when one’s beliefs are disputed; or staking one’s identity or reputation on being credited for some particular claim or theory. Rigorous scientists are not above social motivations; rather, they stake their identity and reputation on being careful, so that when their beliefs are challenged, the desire for approval and social status is aligned with ruthless truth-seeking.
I think a lot of being a rigorous scientist comes down to vigilance about everyday habits in how one thinks, including how one uses language to express one's thoughts, even in casual conversations and internal monologue. I have always thought of this as "epistemic hygiene" (a term I was astonished to see others use here on LW!). Here's an attempt to list some epistemic habits I associate with great scientists I have known. It's a bit repetitive, as many of these habits overlap.
Habits to Cultivate
Habits to Shun
Epistemic hygiene is not all there is to doing great science, of course. There are other perhaps more important factors, such as training (having deep domain-specific knowledge, extensive hands-on experience, advanced technical skills); having good taste in scientific questions; having good hunches (priors); creativity in coming up with theories/hypotheses that could explain the existing knowns; cleverness in designing experiments or studies that answer the crux unknowns; fastidiousness in execution of such experiments/studies; astuteness of observation, including noticing things one didn't plan to observe; ability to synthesize disparate lines of evidence; ability to extrapolate potential applications; and more. I just think the role of epistemic virtue is underrated.
If your doctor performs a cancer test, of course you hope for a negative result. But by this you mean: you hope you don't, in fact, have cancer. If you do have cancer, you wouldn't want the test to give a false negative result. Likewise, you might be doing an experiment to find out if your theory is true. You can hope you have indeed figured out the correct theory. But focusing on that hope often leaks into unconscious bias in interpreting results. Therefore it is better to focus on hoping the experiment provides a clear, true answer, either way. Or at least, counterbalance the success hope by also fervently hoping you don't get a false lead that would cause you to waste time or to fail to arrive at the correct theory.
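To make that asymmetry concrete, here is a small worked example with made-up numbers (the prevalence, sensitivity, and specificity below are illustrative, not clinical figures): a negative result is the result you hope to see, but it is not the same thing as not having cancer.

```python
# Illustrative numbers only: a test with 90% sensitivity and 95% specificity,
# applied where 2% of people actually have the disease.
prevalence, sensitivity, specificity = 0.02, 0.90, 0.95

# Probability of seeing the hoped-for negative result at all.
p_negative = prevalence * (1 - sensitivity) + (1 - prevalence) * specificity

# What you actually care about: the chance you have cancer *given* that
# hoped-for negative result (i.e., the false-negative scenario).
p_cancer_given_negative = prevalence * (1 - sensitivity) / p_negative

print(f"P(negative result)        = {p_negative:.3f}")
print(f"P(cancer | negative test) = {p_cancer_given_negative:.4f}")
```

The gap between "the result I hoped for" and "the truth I care about" is exactly the gap the paragraph warns about when interpreting experiments.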
2025-12-27 03:52:50
Published on December 26, 2025 7:52 PM GMT
Burnout and depression in AI safety usually don’t happen because of overwork.
From what I've seen, they usually come from a lack of hope.
Working on something you don’t think will work, when failure means you’ll die? That's a recipe for misery.
First off, rationally assess the likelihood of your work actually helping with AI safety.
If you rationally believe that it’s too unlikely to actually help with AI safety, well, then, stop working on it!
In this case, your feelings are an important signal to listen to.
Now, if you think your work is high expected value but the likelihood of payoff is low, it’s quite common to find this dispiriting.
Humans did not evolve to be motivated by expected value calculations.
Here are some different ways to get different parts of your brain on board with a high-risk, high-reward strategy:
If you’re only working for the outcome and the outcome looks bleak, that’s very demotivating. If you’re working for the outcome and the process, then even if the outcome looks bad, you can still happily work.
For example, a lot of people think that the solution to them feeling bad is to take some time off.
I can't tell you how many people I know who have taken a few weeks or months off and then continued to have exactly the same problem.
Because the problem was not overwork.
The problem was something else and the problem was still there when they came back.
If your problem is hopelessness, then exercise, meditation, or meds are unlikely to fix it either, even though all of those techniques are good for a lot of different causes of depression or burnout.
Of course, hopelessness is also a symptom of depression/burnout, so it’s important to try to do some differential diagnosis.
Is hopelessness causing depression or is depression causing hopelessness?
Here are a couple of ways to tell:
If you feel hopeless about most plans, even those outside of AI safety, then it is likely that your hopelessness is a symptom and not the cause.
(Pro tip: you can actually do this for any different type of mood. For example if you are feeling mad at somebody, scan your environment and see if you're feeling mad at most things. If so, it’s probably not about the person, but some underlying mood caused by something else.)
If there are periods where you were depressed but had hope about AI safety or vice versa, then they are likely not related to each other.
For example, if you have put low odds on anything working in AI safety for a long time and your mental health was fine, it’s likely your mental health issues are coming from something else. It's worth investigating other hypotheses.
Brains by default focus on the bad. Even it out by deliberately thinking about the positive possibilities.
Some specific ways you can actually do this instead of just nodding along then continuing on as usual:
Epistemology in humans is fundamentally social, and if you are surrounded by people who feel hopeless, it will be very very difficult for you to feel hopeful.
Don't just abandon your hopeless friends. Imagine how it would feel if your friends abandoned you like that! Not to mention, just because they're hopeless and it's rubbing off on you does not mean that they don’t add much else to your life, and vice versa.
However, you can try to at least add more people who are hopeful around you.
You might also want to consider asking your hopeless friends to talk about their hopelessness less around you, at least until you yourself have achieved some level of hope. Sharing feelings of hopelessness can feel good in the short term, but it is unlikely to pull you out of it.
You are the average of the five people you spend the most time with. If you “spend time” with people who persevered against great odds and won, you will start internalizing the same mindset.
Remember - people have done much harder things that seemed impossible at the time.
I recommend Amazing Grace, the biopic about the man who abolished the slave trade. I also really enjoy reading about Benjamin Franklin, because he's one of the rare moral heroes who also seemed to be really happy at the same time.
Of course, just like everything in psychology, there is no sure-fire, one-size-fits-all solution to hopelessness.
The general principle though is to keep systematically trying different techniques until you find something that works for you in your situation. You might have to try a lot before you find one that does the trick.
But as long as you keep trying, it’s very likely you’ll eventually find something that works.