
You’re probably taking the wrong painkiller

2026-04-30 08:00:00

This is an essay that recently appeared in Asterisk. Consider the rest of the risk issue for all your risk needs.


Lots of people die after overdosing on acetaminophen (paracetamol, Tylenol, Panadol). In the U.S., it’s estimated to cause 56,000 emergency department visits, 2,600 hospitalizations, and 500 deaths per year. Acetaminophen has a scarily narrow therapeutic window. The instructions on the package say it’s okay to take up to four grams per day. If you take eight grams, your liver could fail and you could die. 

Meanwhile, it seems to be really hard to kill yourself by overdosing on ibuprofen (Advil, Nurofen, Motrin, Brufen). In 2006, Wood et al. searched the medical literature and found 10 documented cases in history. Nine of those cases involved complicating factors, and in the 10th, a woman took the equivalent of more than 500 standard (200mg) pills. 

So, for many years, if I needed a painkiller, I’d try to take ibuprofen rather than acetaminophen. My logic was that if eight grams of acetaminophen could kill my liver, then one gram was probably still hard on it. I’m fond of my liver and didn’t want to cause it any unnecessary inconvenience. 

But guess what? My logic was wrong and what I was doing was stupid. I’m now convinced that for most people in most circumstances, acetaminophen is safer than ibuprofen, provided you use it as directed. I think most doctors agree with this. In fact, I think many doctors think it’s obvious. (Source: I asked some doctors; they said it was obvious.) 

Should this have been obvious to me? I figured it out by obsessively researching how those drugs work and making up a story about metabolic pathways, blood flow, and amino acid reserves. It’s a good story, one that revealed that my logic stemmed from an egregious lack of respect for biology and that I’m a big dummy (always a favorite subject). But if the clearest road to some piece of knowledge runs through metabolic pathways, then I don’t think that knowledge counts as obvious. 

So how is a normal person meant to figure it out? Why doesn’t the fact that acetaminophen is typically safer than ibuprofen appear on drug labels or government websites or WebMD? Are normal people supposed to figure it out, or has society decided that this is the kind of thing best left illegible? 

Note: You should not switch medications based on the uninformed ramblings of non-trustworthy pseudonymous internet people.

How does ibuprofen work?

Ibuprofen inhibits the cyclooxygenase (COX) enzyme. This in turn inhibits the formation of messenger molecules involved in inflammation, which leads to less physical inflammation and thus less pain. 

The same story is true for almost all over-the-counter painkillers, which is why they’re almost all considered “non-steroidal anti-inflammatory drugs,” or NSAIDs. This includes ibuprofen, aspirin, naproxen (Aleve), and a long list of related drugs. But it does not include acetaminophen.

How does acetaminophen work?

Nobody knows!

Like ibuprofen, acetaminophen inhibits some COX enzymes. But it does so in a weird way that barely affects inflammation or messenger molecules, so it’s unclear if this matters for pain reduction. 

In the brain, acetaminophen is metabolized into a mysterious chemical called AM404. This activates the cannabinoid receptors and increases endocannabinoid signaling, which seems to reduce the subjective experience of pain. AM404 also activates the capsaicin receptor, which is associated with burning sensations that you’d normally expect to increase pain, but maybe some desensitization thing happens downstream? And maybe acetaminophen also interacts with serotonin or nitric oxide or does other stuff? How this all comes together to reduce pain is still something of a scientific mystery. 

Aside: When trying to understand painkillers, it’s natural to focus on chemistry and molecular biology. But the unknown physical origins of consciousness are always nearby, looming ominously.

What risks does ibuprofen have?

In an ideal world, the only thing ibuprofen would do is reduce inflammation in the part of your body that hurts. But that is not our world. When ibuprofen inhibits the COX enzymes, it does so throughout the body. And mostly, that is bad. 

For one, ibuprofen reduces production of mucus in the stomach. That might sound okay or even good. But stomach mucus is important. You need it to shield the lining of your stomach from your extremely acidic gastric juice 1. Having less mucus can lead to gastrointestinal problems or even ulcers. 

Ibuprofen also affects the heart. When ibuprofen inhibits the COX enzymes there, this in turn inhibits one chemical that prevents clotting and another that causes clotting. In balance, this seems to lead to more clotting, and an increased statistical risk of heart attacks 2. If you’re healthy, the risk of a heart attack from an occasional low dose of ibuprofen is probably zero. But if you have heart issues and take medium to large doses regularly for as little as a few days, this might  be a serious concern. 

Ibuprofen also affects the kidneys. If you’re stressed, or cold, or dehydrated, or take stimulants, your body will constrict your blood vessels. That squeezes your kidneys’ intake tube, depriving them of blood. Your kidneys don’t like that, so they release signaling molecules to locally re-dilate the blood vessels. 

Trouble is, when ibuprofen inhibits COX enzymes in the kidneys, it inhibits those signaling molecules. If everything is normal, that’s okay, because the kidneys wouldn’t try to use those molecules anyway. But if your body has clamped down on the blood vessels, then the kidneys don’t have the tool they use to keep blood flowing, meaning they don’t get as much blood as they want. This is bad 3.

There are many other less common side effects, including allergies, respiratory reactions in asthmatics, induced meningitis, and suppressed ovulation. If you take a lot of ibuprofen, this could hurt your liver. But the major concerns seem to be the stomach, the heart, and the kidneys.

What risks does acetaminophen have?

Acetaminophen also inhibits some COX enzymes. But unlike ibuprofen, the effect is minimal outside the central nervous system. Thus, acetaminophen has little effect on stomach mucus, blood clots, or blood flow, and so presents almost none of the risks that ibuprofen does. 

Even so, if you take too much acetaminophen at once, you could easily die. 

How does this happen? Well, when acetaminophen is metabolized by the liver, it’s mostly broken down into harmless stuff. But a small fraction (5-15%) is broken down by the P450 system into an extremely toxic chemical called NAPQI.

Ordinarily this is fine; your body creates and neutralizes toxic stuff all the time. For example, if you drank 20 grams of formaldehyde, you’d likely die. But did you know that your body itself makes and processes ~50 grams of formaldehyde every day? When liver cells sense NAPQI, they immediately release glutathione, which binds to NAPQI and renders it harmless. 

But there’s a problem. If you take too much acetaminophen at once, the pathways that break it down into harmless stuff get saturated, but the P450 system doesn’t get saturated. This means that not only is there more acetaminophen, but also that a much larger fraction of it is broken down into NAPQI. Soon your liver cells will run out of glutathione to neutralize it. Then, NAPQI will build up and bind to various proteins in the liver cells (especially in mitochondria) causing them to malfunction and/or commit suicide. This can cause total liver failure. 

So you should never take more than the recommended dose of acetaminophen 4. If you do take too much, you should go to a hospital immediately. They will give you NAC, which will replenish your glutathione and neutralize the NAPQI. Your prospects are good as long as you get to the hospital within a few hours 5, 6.

Acetaminophen has lots of other possible side effects, like skin issues and blood disorders. But these all seem to be quite rare.

What if you have liver issues?

The primary concern with acetaminophen  is liver damage. So if you have liver disease, then surely you’d want to avoid acetaminophen and take ibuprofen instead, right? 

Nope. It’s the opposite. Liver disease shifts the balance of risk in favor of acetaminophen. 

With liver disease, it’s hard for blood to flow into the liver, meaning that blood tends to pool in the abdomen. To counter this, blood vessels elsewhere in the body contract. This includes blood vessels around the kidneys. 

Remember the kidneys? Again, when blood vessels are constricted, the kidneys send out signaling molecules to locally re-dilate the blood vessels. But those signaling molecules are blocked by ibuprofen. So if you have liver disease, taking ibuprofen risks starving your kidneys of blood just like if you were dehydrated.

Meanwhile, people with moderate liver disease are usually still able to process acetaminophen without issue, as long as it’s in smaller amounts. So doctors usually tell patients with liver disease to avoid ibuprofen and take  acetaminophen instead, just with a maximum of two grams per day instead of four. 

(Obviously, if you have liver disease, then you should talk to a doctor, I beg you, for the love of god.)

What about other situations?

The main takeaway from all this is that the risks of both drugs emerge from the madhouse of complexity that is your body. Surely there are some situations where acetaminophen is more dangerous than ibuprofen?

I tried to capture the most common situations in this table:

| Situation | Acetaminophen safe? | Ibuprofen safe? |
| --- | --- | --- |
| Fasting | No. Fasting leads to low glutathione and the risk of liver damage. | No. Risks pain or bleeding in the stomach, could damage kidneys. |
| Dehydrated | Yes. | No. Could damage kidneys. |
| Liver Disease | Maybe (low dose). Often preferred by doctors at <2g/day. | No. Increases bleeding risk, could damage kidneys. |
| Stomach Ulcers / Heartburn | Yes. | No. Strips protective mucus. |
| Chronic Heavy Drinking | Maybe (low dose). Seems safer if limited to <2g/day. | No. Risk of stomach bleed. |
| Kidney Disease | Yes. | No. Puts stress on the kidneys. |
| Heart Conditions | Yes. | No. Interferes with blood clotting, raises blood pressure. |
| Active bleeding | Yes. | No. Inhibits clotting. |
| After drinking (a little) | Maybe (low dose with food). Alcohol depletes glutathione, raising risk of liver damage. | Maybe (low dose with food and water). Alcohol and ibuprofen both irritate the stomach. Alcohol also leads to dehydration. |
| After drinking (a lot) | No. | No. |
| Hangover | No. The liver is already depleted. | Maybe (with food and water). But never when dehydrated. |

It’s actually fairly hard to find situations where ibuprofen is safer than acetaminophen. Possibly this is true if you’re hungover, but I would be very careful, because you tend to be dehydrated when hungover, raising the risk of kidney damage. (It’s probably optimal, from a health perspective, to avoid taking recreational drugs at doses that leave you physically ill the next day.) 

Aside from hangovers, the only situations I could find where ibuprofen might be safer than acetaminophen  are if you’re taking certain anti-seizure or tuberculosis drugs or maybe if you have a certain enzyme deficiency (G6PDD). 

So…

What have we learned so far? 

  1. The body is really complicated! 

  2. The main risk of acetaminophen is liver damage by creating too much NAPQI. Taking too much at once can easily kill you. However, as long as you don’t take too much at once and your liver isn’t depleted, then your liver will maintain NAPQI levels at zero and it will be completely fine. And there are very few other risks. 

  3. Meanwhile, ibuprofen poses a risk of gastrointestinal issues, heart attacks, or kidney damage. The risk varies based on lots of factors like whether you’ve eaten food, whether you’re dehydrated, your blood pressure, and your heart health 7.

  4. Therefore, acetaminophen is probably safer, provided you never take too much 8.

I don’t want to be alarmist. If you’re healthy, the risk from taking an occasional dose of ibuprofen as directed is extremely low. Given that so many people find that ibuprofen is more effective for many kinds of pain, it’s totally reasonable to use it. I do so myself. 

Still, it seems to be the case that in the vast majority of situations, acetaminophen is safer. Personally, if I have pain, I first take acetaminophen, and then add ibuprofen if necessary. I’m pretty sure many experts think this is somewhere between “sensible” and “obvious.” 

But if acetaminophen is safer, then why don’t official sources tell you that 9? I can get doctors to admit this off-the-record. I can find random comment threads with support from people who seem to know what they’re talking about. But why does this fact never appear on government websites or drug labels? 

Let’s look at those drug labels

In the U.S., the Food and Drug Administration (FDA) creates 10 a “drug facts” label for over-the-counter drugs.  

Here’s what that looks like for ibuprofen:

ibuprofen label

And here’s what it looks like for acetaminophen (paracetamol):

acetaminophen label

I feel dumb saying this, but when I saw those labels in the past, I thought of them as a bunch of random information thrown together for legal reasons. But after spending a lot of time trying to understand these drugs myself, I now realize that these labels are… really good? 

Imagine you work at the FDA and it’s your job to write a safety label. You need to synthesize a vast and murky scientific landscape. Your label will be read by people with minimal scientific background who are likely currently in pain, and who could die if they take the drug in the wrong situation. 

If I were in that situation, I’d think about all the different situations in which taking one of these drugs could literally kill someone, and then — after a quick panic attack — I’d write a label that screamed, HEY, IF YOU ARE IN ANY OF THESE SITUATIONS, TAKING THIS DRUG COULD LITERALLY KILL YOU. Then I’d think about all the other situations where taking the drug might be okay depending on a set of complex science stuff and tell people in those situations to PLEASE TALK TO A DOCTOR FOR THE LOVE OF GOD because I DON’T KNOW IF YOU’VE HEARD BUT SCIENCE IS COMPLICATED. Everything else would be a minor concern.

From that perspective, these labels are a triumph. This isn’t random information — every word is a synthesis of a mountain of research, carefully optimized to save lives.

FDA good

How did those drug labels come to be? 

If you want a taste for the FDA’s process, I encourage you to skim the 2002 Federal Register document in which the FDA proposed to update ibuprofen’s safety label and to formally classify it as Generally Recognized as Safe. It’s more than 21,000 words long and — I think — astonishingly good. It not only summarizes the entire medical literature on ibuprofen, it summarizes it well. Here is one representative bit:

Bradley et al. (Ref. 42) conducted a 4-week, double-blind, randomized trial in 184 subjects comparing the effectiveness and safety of the maximum approved OTC daily dose of 1,200 mg of ibuprofen (number of subjects (n) = 62) to that of a prescription dose of 2,400 mg/day (n = 61), and to 4,000 mg/day of acetaminophen (n = 59) for the treatment of osteoarthritis. While there were no significant differences in the number of side effects reported during this study, the study demonstrated a trend towards a dose dependent increase in minor GI adverse events (nausea and dyspepsia) associated with higher doses of ibuprofen (1,200 mg/day: 7/62 or 11.3 percent; versus 2,400 mg/day: 14/61 or 23 percent). In addition, two subjects treated with 2,400 mg/day of ibuprofen became positive for occult blood while participating in the study.

I spend a lot of time complaining about bad statistical writing. A lot. Probably too much. But I’m here to tell you, that paragraph is gorgeous. The writing is clear and penetrating. It contains all the important details, but no other details. Compared to the abstract of the original paper, the above is shorter and easier to understand yet simultaneously more informative. Five stars. 

The rest of the document is equally good, with clear and sensible explanations for various recommendations. For example, they discuss a proposal from the National Kidney Foundation for additional warning about risks to kidneys, explain why they think that proposal has merit, and then recommend a shorter version, which appears on every package of ibuprofen sold today. 

As far as I can tell, this level of quality is typical. For example, the FDA’s 2019 proposed rule on sunscreens is similarly masterful.

So why?

This leaves us with this constellation of facts: 

  1. Acetaminophen is, in general, safer than ibuprofen. 

  2. The FDA doesn’t tell you that. Neither do other respectable authorities. 

  3. The FDA is highly competent.

So what’s happening here? Have the experts conspired to keep this knowledge secret? 

I don’t think so. Mostly, I think this is down to two factors. First, the FDA doesn’t really have a mission of determining “in what circumstances is drug A safer than drug B?” Their goal is to take individual drugs and determine how people can use them safely. They seem to be quite good at this. 

Second, everyone is mortally afraid of giving “medical advice.” It varies by jurisdiction, but in general, giving “wellness advice” is OK, but if you give personalized advice, you risk going to prison. The more credible you are, the higher that risk is 11.

Stepping back, how should we think about this situation? 

The body is complicated. When experts give the public advice on drugs, they are trying to insulate us from that complexity. But there is no way to do that without making trade-offs. Society has implicitly chosen tradeoffs that mean certain “less important” facts are de-prioritized. It’s not obvious that this is the wrong choice. I feel foolish for not having more respect for the body’s complexity and for the difficulty of the task all the experts are trying to accomplish. This is not medical advice.

  1. For some reason, humans have gastric acid that is more acidic than that of most other animals, and is only matched by animals that specialize in eating carrion. 

  2. At least two NSAIDs (rofecoxib and valdecoxib) have been withdrawn from the market due to an increased risk of heart attacks. For the same reason, the US refuses to approve etoricoxib.

  3. Nephrologists hate ibuprofen. (Source: nephrologists.) If it was up to them, maybe ibuprofen would come with a “HAVE YOU CONSIDERED TAKING ACETAMINOPHEN INSTEAD?” warning. It confuses me that the safety label for ibuprofen doesn’t warn you about the danger of taking it while dehydrated and quietly damaging your kidneys. My best guess is that this is because other doctors don’t hate ibuprofen as much as nephrologists. 

  4. Watch out for combination medicines (like cold or flu medicines or opiate painkillers) that include acetaminophen. Arguably, acetaminophen is a victim of its own success here. It’s included in these things because it is better tolerated than NSAIDs. But it’s easy to miss. 

  5. Oddly, NAC is considered a nutritional supplement, meaning basically anyone can buy it. But there’s also almost no regulation, so who knows if the thing you bought actually has NAC in it? Do not screw around trying to self-medicate an acetaminophen overdose. Go to a hospital. 

  6. At one point while researching all this I had what I thought was a good idea: Why not sell acetaminophen in pills bundled together with NAC? The NAC would replenish glutathione stores in the liver, seemingly reducing the risk of overdose. Later on, I developed more humility and felt very stupid for fantasizing that such an obvious idea could be novel or useful. I think that this is indeed a bad idea because NAC itself has side effects, though I can’t find much formal discussion. In fact, I found a 2010 editorial called “Why Not Formulate an Acetaminophen Tablet Containing N-Acetylcysteine to Prevent Poisoning?”  In another study, Nakhaee et al. (2021) actually tried giving NAC together with acetaminophen to rats and found that this seemed to make it better at reducing pain. So maybe this isn’t a completely stupid idea. That last paper also led me to discover that “rat hot plate test” is a standard phrase, and one that drives home what humanity’s dominion over nature means in practice. 

  7. Above, we mentioned that acetaminophen overdose is estimated to cause around 500 deaths per year in the U.S. It’s much harder to give direct numbers for how many people die from taking ibuprofen, because NSAIDs don’t really directly “kill” people, but rather increase the risk of dying in various ways. The best estimates seem to be that NSAIDs cause 5,000-16,500 deaths each year in the US via gastrointestinal complications, and something similar via heart attacks. These numbers are not a good way of quantifying the relative risk of drugs, because they represent different people taking different amounts for different reasons. But they do show that ibuprofen is not without risk. 

  8. There are probably some people who are too disordered to track how much acetaminophen they’ve taken. For such people, ibuprofen might be the safer choice. Though I’m skeptical that many such people are found among the readers of Asterisk.

  9. There are two cases where official sources are clear that acetaminophen is safer than ibuprofen: for use by pregnant women and small children. This doesn’t appear on the safety label, but if you’re pregnant and go to a doctor, they will probably tell you to take acetaminophen but not ibuprofen or other NSAIDs. And if you have a newborn baby, their doctor will probably tell you that you can give them acetaminophen but not ibuprofen or other NSAIDs. 

  10. Technically, for many drugs today, it is the drug manufacturer that “creates” the label, which is why they can be slightly different. However, the FDA strongly regulates what is on it, including most of the language and even details about the font and so on. The Federal Register contains a template the FDA published for ibuprofen which is almost identical to what appears on the side of drugs today.

  11. Unlike in most places, in the United Kingdom it seems to be perfectly legal for people to give each other medical advice, provided they don’t misrepresent themselves as licensed doctors. This is not legal advice. 

I quit drinking for a year

2026-04-09 08:00:00

In early January 2025, a family friend was over for lunch. One of my many guilty midwit pleasures is a love of New Year’s resolutions, so I asked her if she had made any. She said no, but mentioned that she had some relatives that were doing “damp January”.

In case you’re not aware, Dry January is a challenge many people do to quit drinking alcohol during the month of January. These folks were doing a variant in which, instead of not drinking, one simply drinks less.

For some reason, this triggered me. I thought, “Are you kidding? You can’t even stop drinking for a single month? Do you know how pathetic that is?” And then, “Fuck you! Fuck you for doing damp January! You know what, I’m going to stop drinking for a year!”

To be clear, these thoughts were directed at people I’ve never even met. In retrospect, I wonder what was going on with me emotionally. But I take resolutions seriously, so I felt committed.

We are now 15 months down the timeline, so I’ll make my report.

It was easy

This will sound odd, but I swear it’s true. Not drinking was so easy that it was almost easier than my previous baseline of not-not-drinking.

Before starting this resolution, I didn’t drink much—perhaps two or three drinks per week. But I often thought about drinking. Every time I saw friends or went to a restaurant, I thought, “Should I have a drink?” Usually I decided not to. But making that decision required effort.

After a few weeks of not drinking, that question never even came up. Drinking was simply not a thing I did, so I never needed to negotiate with myself.

Theoretically, you could allow yourself one drink a month instead of zero. Theoretically, that should be easier. But I’m pretty sure I’d find it harder, because alcohol would still be an option, a thing to consider.

Sometimes I need a thing

Early on, I sometimes wanted a drink. But gradually I noticed that I didn’t really want a drink, I just wanted a thing. I can’t find a precise name for this concept in psychology, but often, some deep part of my brain seems to scream, “I WANT A THING.” It could be alcohol, but I found dessert worked just as well. I suspect that a new shirt or meeting a new dog would also work.

I was not able to stop my brain from doing this. When it demanded a thing, I gave it a thing. I just substituted a non-alcohol thing. So, over the year, I became interested in desserts and even more interested in tea.

The struggle was The Chocolates. Shortly after I made this resolution, my mother gave me a bag of chocolates that each contained a bit of whiskey. In general, I don’t keep chocolate at home. If anyone gives me chocolate, I immediately eat all of it and then text the giver, “Thanks for the chocolate, I ate it instead of dinner, it’s all gone, this is what will always happen if you give me chocolate.”

But I couldn’t eat the Chocolates, because they contained alcohol. I managed to get guests to eat a few. A couple of times I came close to draining out the alcohol and eating the chocolate container. I even considered throwing them away, but that felt wrong. So instead I spent a year glaring at them and waiting for them to apologize for the anguish they were causing me. This represented half the difficulty of this resolution. I do not recommend it. Keep your things separate.

Alcohol is bad for sleep

Have you heard that alcohol is bad for sleep? Because alcohol is bad for sleep. I’ve always known that was true, abstractly. But sleep is variable. If I didn’t sleep well on an individual night, I was never sure: Was that because of the alcohol, or was it random variation?

After a year without alcohol, I am very confident that yes indeed, alcohol is bad for sleep, because my sleep during 2025 was much better than in previous years. Sure, like anyone else, I still sometimes wake up and start thinking about oblivion rushing towards me, and how everything I love will vanish into time, and how all that was once future and hope inevitably becomes static and dust, and how the plague of bluetooth speakers continues to spread across the globe. But now: less!

I wish there was a drug I could take that would give me energy and improve my mood and make me physically healthier and smarter, all without side-effects. I don’t think such a drug exists. But we do have the opposite!

So, sadly, I’ve come to believe that alcohol is basically the perfect anti-nootropic. That’s not because it makes you dumb while you’re drunk. (True, but who cares?) Rather, that’s because it is bad for sleep, and therefore makes you worse across all dimensions the next day.

Alcohol is good for socializing sometimes

I did find not drinking to have one clear downside: It’s just not that much fun to hang out with people who are drinking if you are not drinking yourself.

To be clear, this is a limited effect. It’s only an issue at bars or certain parties where people are there to drink. I don’t go to many such gatherings, but when I did, I felt it was less fun.

It’s not that I missed alcohol. Instead, my theory is that drinking parties are a sort of joint role-playing exercise: “Let’s all get together and collectively reduce our inhibitions and see what happens.” It’s fun not (just) because everyone is taking a recreational drug, but because it’s a joint social experience. If you don’t drink, then you aren’t fully participating.

It seems like it should be possible to reproduce this effect without alcohol. You could imagine other ways to push the social equilibrium out of balance. Like… Masks? Or weird environments? Or mutual disclosure games? Should people get together and do a group cold plunge?

Unfortunately, all these are complicated and/or carry some kind of social stigma. So until we figure something better out, this is a real cost of not drinking. It was minor for me, but it probably depends a lot on where you are in life.

Other effects

All other effects were minor. I guess I saved money at restaurants. I actually lost a bit of weight over the year, despite all the extra desserts, though I can’t say for sure if alcohol was the cause. Otherwise, once I stopped thinking of alcohol as an option, I rarely thought about the resolution at all, except when I saw those damn chocolates.

Aftermath

Towards the end of the year, I started wondering if I should quit drinking forever. But I never came to a conclusion, because I rarely thought about alcohol. I considered having a drink at midnight on New Year’s eve, but I happened to be on a plane that crossed the international date line and thus skipped New Year’s eve.

And then… for the first few months of 2026, I still didn’t drink. That wasn’t because of any decision. It just never seemed appealing because (a) sleep and (b) I’d broken the mental link between want thing and drink alcohol. Eventually, I ate the chocolates, and I had a glass of wine when visiting some friends. If I can continue rarely drinking while almost never thinking about drinking, I’ll probably do that. If I slowly slide back into always thinking of alcohol as a live option and always negotiating with myself, I might just resolve to quit forever.

So that’s my story. Obviously, it’s heavily colored by my own idiosyncrasies, so it’s hard to say if it offers any general lesson.

I do think people underrate the long-term health impact of drinking. The effect on heart disease is debated, but everyone agrees that any alcohol increases the risk of cancer. Still, the long-term effects from occasional light drinking probably aren’t huge. What’s really underrated is the short-term effects, via worse sleep.

If I had to give advice, it would be this: If you drink, and you think you might be better off not drinking, why not try it? Maybe you’ll find that champagne is essential to your happiness and drink it every night, to hell with the costs. Maybe you’ll find a different baseline, or maybe you’ll quit forever. Whatever you decide, you’ll have full information.

LLMs predict my coffee

2026-03-18 08:00:00

Coding, math, whatever. Can LLMs predict the outcomes of physical experiments?

Suppose I pour 8 oz (226.8 g) of boiling water into a ceramic coffee mug that weighs 1.25 lb (0.57 kg). The ambient air is still and 20 degrees Celsius. The cup starts at room temperature. Give me an equation for the temperature of the water in Celsius over time. The only free variable in the equation should be the number of seconds t since the water was poured. Focus on accuracy during the first 5 minutes.

Does that seem hard? I think it’s hard. The relevant physical phenomena include at least:

  1. Conduction of heat between the water, the mug, the air, and the table.
  2. Conduction of heat inside each of those things.
  3. Convection (fluid movement) inside the water and the air.
  4. Evaporation cooling as water molecules become vapor.
  5. Movement of water vapor in the air.
  6. Radiation. (Like all matter, the mug and water emit temperature-dependent infrared radiation.)
  7. Surface tension, thermal expansion/contraction, re-absorption of air into the water as it cools, probably more.

And many details aren’t specified in the prompt. Is the mug made of porcelain or stoneware? What is the mug’s shape? What is the table made of? How humid is the air? How am I reducing the spatially varying water temperature to a single number?

So this isn’t a problem with a “correct” answer that you can find by thinking. Reality is too complicated. Instead, answering this question requires “taste”—guessing which factors are most important, making assumptions about missing details, etc.

So I put that question to a bunch of LLMs. Here is what they said:

(Technically, they gave equations as text. I’m plotting those equations.)

I was surprised by those curves, both in terms of how fast they think the temperature will drop in the beginning, and how slowly they think it will drop later on. They think you get as much cooling in the first few minutes as you do in the rest of the hour. Can that be right?

Then I did the experiment. First, I waited until the ambient temperature happened to reach 20 degrees Celsius. Then, I put 8 oz of water into a measuring cup, microwaved it until it reached a boil, let the temperature equalize a bit, and then microwaved it until the water boiled again. Then, I poured the water into a 1.25 lb coffee mug with a digital thermometer in it and shouted out measurements every five seconds, which were frantically recorded by the Dynomight Biologist. Gradually I reduced measurements to every 15 seconds, 30 seconds, 1 minute, and then 5 minutes.

Behold:

Or, here’s a zoomed-in view of the first five minutes:

The predictions were all OK, but none were great. Probably Claude 4.6 Opus did best, albeit after consuming $0.61 of tokens. (Insert joke about physical experiments / Department of Defense / money / coffee.)

That said, what surprised me about the predictions was how quickly the temperature dropped in the first few minutes, and how slowly it dropped later on. But experimentally, it dropped even faster early on, and even slower towards the end. So if you wanted to ensemble my intuition with the LLM, I guess my intuition would get a weight of zero.

In conclusion, they may take our math, but they’ll somewhat more slowly take our fine motor control. Thank you for reading another middle-school science project.

(Appendix: Data and equations)

You can find the data in CSV format here. The first column is the elapsed time in MM:SS format, the second column is the elapsed time in minutes, and the third column is the measured temperature.
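If you want to poke at the data yourself, here’s a minimal Python sketch for loading it. The filename (coffee.csv) and the guess that there may be a header row are my assumptions, not details from the post.

```python
import csv

# Load the cooling measurements. Assumes the CSV described above is saved
# locally as "coffee.csv" (filename is an assumption); columns are:
# elapsed time as MM:SS, elapsed time in minutes, measured temperature (Celsius).
times_min, temps_c = [], []
with open("coffee.csv", newline="") as f:
    for row in csv.reader(f):
        try:
            minutes, temp = float(row[1]), float(row[2])
        except (ValueError, IndexError):
            continue  # skip a header row or any malformed line
        times_min.append(minutes)
        temps_c.append(temp)

print(f"{len(temps_c)} measurements, from {temps_c[0]:.1f} C down to {temps_c[-1]:.1f} C")
```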

Here were the actual equations all of the models gave for T(t), the predicted temperature after t seconds.

| LLM | T(t) | Cost |
| --- | --- | --- |
| Kimi K2.5 (reasoning) | 20 + 52.9 exp(-t/3600) + 27.1 exp(-t/80) | $0.01 |
| Gemini 3.1 Pro | 20 + 53 exp(-t/2500) + 27 exp(-t/149.25) | $0.09 |
| GPT 5.4 | 20 + 54.6 exp(-t/2920) + 25.4 exp(-t/68.1) | $0.11 |
| Claude 4.6 Opus (reasoning) | 20 + 55 exp(-t/1700) + 25 exp(-t/43) | $0.61 (eeek) |
| Qwen3-235B | 20 + 53.17 exp(-t/1414.43) | $0.009 |
| GLM-4.7 (reasoning) | 20 + 53.2 exp(-t/2500) | $0.03 |

Interestingly, they were all based on one or two exponentially decaying terms. The way to read these is to think of exp(-t/b) as a function that starts out at one when t is zero, and gradually decreases. After b seconds, it has dropped to 1/e ≈ 0.368, and it continues dropping by factors of 0.368 every b seconds forever.

So most of these models have a “fast rate” which reflects heat flow from the water into the mug along with a “slow rate” for heat from the water/mug to flow into the air. A few of the models skip the fast rate. I also tried DeepSeek and Grok but they just flailed around endlessly without ever returning an answer. They were kind enough to charge me for that service.
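To make the fast/slow split concrete, here’s a small Python sketch (mine, not from the post) that evaluates two of the fitted equations from the table above at a few times; the function names are just labels for the corresponding rows.

```python
import math

# Predicted water temperature T(t), with t in seconds, per two of the models above.
def claude_T(t):  # Claude 4.6 Opus: fast term (43 s) plus slow term (1700 s)
    return 20 + 55 * math.exp(-t / 1700) + 25 * math.exp(-t / 43)

def qwen_T(t):    # Qwen3-235B: a single slow term (1414.43 s)
    return 20 + 53.17 * math.exp(-t / 1414.43)

for t in [0, 60, 300, 1800, 3600]:
    print(f"t = {t:>4} s   Claude: {claude_T(t):5.1f} C   Qwen: {qwen_T(t):5.1f} C")
```

Running it shows the two-term model plunging in the first minute (the 43-second term dying off) and then coasting on the slow term, while the single-term model cools at one steady rate throughout.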

The modern formatting addiction in writing

2026-03-12 08:00:00

EXHIBIT A

Here is some text. It is made out of words.

Here is a subsection

And here are some bullet-points:

  • Here is one.
  • Here is another.

Hierarchy

  1. Here is a numbered list.
  2. And now:
    • Look at this.
    • Bullets inside a number inside a section inside a section.
  3. What a time to be alive.

Pictures

The text can also contain pictures for you to look at with your eyes¹.

¹ There can also be footnotes; have an eye emoji: 👀

Quotes

The text can also include quotes.

  • Actually, let’s do one inside of a list.
    • A deeply nested list.
      • This is going to be awesome.

        The awful thing about life is this: Everyone has his reasons.

      • Nailed it.

Back up

Wait a second.

  • Are we currently in a section or subsection or a subsubsection?
  • What parent section encloses this one?
  • Where are we in the hierarchy?
  • What are we doing?

EXHIBIT B

This is also text. It is also made out of words. But instead of jerky fragments, these words are organized into sentences, like normal human language.

Do you see how relaxing this is? After the torment you suffered above, isn’t it nice to have words that come in a simple linear order? And isn’t it nice that you just have to read the words, and not worry about how they fit into some convoluted implied knowledge taxonomy?

These sentences are themselves organized into paragraphs. The first sentence of each paragraph is a sort of summary. So if you want to skim, you can do that. But you don’t have to skim. This text also has italics and parentheses and whatnot. But not too much. (Just a little.)


Why I bring this up

Thanks for enduring that. My purpose was to illustrate a mystery. Namely, why do so many people today seem to write more like Exhibit A than Exhibit B?

People sometimes give me something they wrote and ask for comments. Half the time, my reaction is Good god, why is 70% of this section titles and bullet points?

This always gives me a strange feeling. It’s like all the formatting is based on some ontology. And that ontology is what I really need to understand. But it’s never actually explained. Instead, I guess I’m supposed to figure it out as things jerk between different topics? It’s disorienting, like a movie that cuts between different scenes every three seconds.

But maybe that’s just my opinion? Maybe, but sometimes I’ll ask people who write like this to show me some writing they admire. And inevitably, it’s not 70% formatting, but mostly paragraphs and normal human language. So I feel that people who write this way are violating the central tenet of making stuff, which is to make something you would actually like.

So then why write like that? Why do I, despite my griping, often find myself writing like that? I’ve wondered this for years. But I told myself that I was right and that too much formatting is bad.

But now—have you heard?—now we have this technology where computers can write stuff. And guess what? When they do that, they also use an insane amount of formatting.

That’s weird. I figured people were addicted to formatting because they’re noobs that don’t know any better. But AIs have been optimized to make human raters happy. And that led to a similar addiction. Why?

1. Maybe formatting is good

The obvious explanation is that formatting is good. People love reading stuff that’s all formatting. We should all be formatting-maxxing.

There’s something to this. But it can’t really be right, because popular human writers use formatting in moderation. So formatting can’t be that good.

2. Maybe formatting is good in certain contexts

Even before AI, everyone did agree that formatting was great in one context: Search-engine optimized content slop. Back in 2018, if you searched for anything, you’d find pages brimming with section titles and bullet points.

Why? Well, when I type “why human gastric juice more acidic than other animals”, I’m not really looking for something to read. I just want to skim an overview of the main theories. I’ve experimented with asking AIs to give the same information in various styles, and I reluctantly concede that the formatting helps.

But that’s not reading. Say you’ve written a ten-thousand word manifesto on human-eco-social species enhancement. If I actually care about what you think, I maintain that it’s better in paragraphs, because reading ten thousand words with endless formatting would be excruciating. This is why everyone who writes long-form essays that people actually read uses normal paragraphs.

So our mystery is still alive. Most writers aspire not to write content slop, but meaningful stuff other people care about. Often, when people show me formatting-maxxed essays, I’ll complain and they’ll rewrite it with less formatting and agree that the new version is better. So why use so much formatting even when it’s bad?

3. Maybe quality is hard to verify

There’s something odd about that previous example. When I search for “why human gastric juice more acidic than other animals”, why am I not looking for something to “read”? After all, I like reading. If one of my favorite bloggers wrote an essay on the mystery of human gastric juice, I would devour it.

So if I want a good essay, why don’t I look for one? I guess it’s because I instinctively rate my odds of finding one on any random topic as quite low.

There’s something here related to Gresham’s law: A format-maxxed essay might be sort of crap, but at least I can ascertain its crap level quickly. A “real” essay could be great, but I’d have to invest a lot of time before I can know if that time was worth investing. So I—regretfully—mostly only read “real” essays when I have some signal that they’re good. If everyone behaves the way I do, I guess people will respond to their incentives and write with lots of formatting.

Similarly, if a (current) AI tried to write a “real” essay, I probably wouldn’t read it, because I wouldn’t trust that it was good. Perhaps that explains why they don’t.

Aside: If this is right, then it predicts that as AIs advance, they should become less formatting-crazy. The better they are, the more we’ll trust them.

4. Maybe chain-of-thought works well in format world

Some people can think of an idea, organize their thoughts, and then write them down, tidy and sparkling. I am not one of those people. If I mentally organize my ideas and go to write them down, I soon learn that my ideas were not in fact organized. Usually, they’re hardly even ideas and more a slurry of confused psychic debris.

The way I write is that I make an outline. Or, rather, I try to make an outline. But then I realize the structure is off, so I start over. After a few cycles, I give up and just write the first section. After revising it eight times, I’ll try (and fail) to make an outline for the rest of the post. This continues—with occasional interludes where I reorganize everything—until I can’t take it anymore and publish.

I don’t recommend it. My point is just that blathering out a bunch of text is a good way to think. And when blathering, formatting seems to help. Partly, I think that’s because formatting allows you to experiment with structure without worrying about the details. And partly I think that’s because formatting makes it easier to get down details without worrying about the bigger picture.

So maybe that’s one source of our formatting addiction? We blather in formatting, but don’t put in the work to clarify things?

Oddly, some claim that something similar is true for AI: If you tune them to write with lots of formatting, that doesn’t just change how the content looks, but also improves accuracy. The idea is that as the AI looks at what it’s written so far, formatting helps it stay focused on the most important things. Supposedly.

Maybe that’s true. But we have “reasoning” AIs now, that blather for a while before producing a final output. If they wanted, they could format-maxx while thinking and output paragraphs at the end. But they don’t. So while this explanation might work for people, I don’t buy it for AI.

5. Maybe formatting is a bluff

Finally, a conspiracy theory. Sometimes when I try to fight through a format-maxxed essay, it seems like all the formatting speaks to me. It says: “This is a nonlinear web of ideas. I’m giving you the pieces. If you pay attention, you should see how they fit together. Sadly, the world isn’t a simple narrative I can spoon-feed to you. So this is the best I can do.”

I think this is a bluff. And it’s a good one, because it’s based in truth. The world is not a narrative. Narratives are lies we tell ourselves to try to cope with the swirl of complexity that is reality. All true!

Editor's note: At this point, the author became agitated and wrote and then deleted a bunch of bullet points. In the interest of transparency, these are collected here. (Click to expand.)
  • However, narratives are all we’ve got. If you want to understand something with your tiny little brain, you don’t really have a lot of other options.

  • The thing about writing that’s 70% formatting is that it’s very easy to delude yourself that there’s a set of clear ideas underneath all of them.

  • Imagine an LLM that has an amazing contextual ability to find related ideas to anything that’s brought up, but isn’t all that great at synthesizing them into a coherent whole. If that LLM were to try to write beautiful paragraphs, those paragraphs might appear sort of obviously incoherent. However, if that LLM were to construct a lot of bullet points, it might appear much more useful, and in fact, actually be much more useful.

  • Imagine you’re an AI. You have an amazing recall of most of human knowledge ever created, but you have a mediocre ability to synthesize that into novel theories or to work out the bugs in those theories. Now, if someone asks you a question and you try to write a beautiful narrative and respond to them, that narrative might appear to be sort of obviously incoherent and confusing, and your raters might say, bad AI, stop that. Whereas, if you were to output a ton of bullet points, without even necessarily trying to cohere them into a whole, your writers might say, good.

But imagine you’re an AI. You’re being trained to respond in ways that make human raters happy. You can remember most knowledge ever created, but you’re so-so at synthesizing it into new ideas. If someone asks you a question and you try to write a beautiful narrative, your response might look like confusing babbling, meaning your raters say, “Bad AI. Stop that.” Whereas if you output a bunch of section titles and bullet points, raters might say, “This seems OK.” So you’ll start doing the latter.

That’s not bad. Arguably, you (you’re still an AI) are responding in the way that’s most useful, given your capabilities. But you are also responding in a way that gives a misleading impression that you’ve figured out how everything fits together, even if you haven’t.

I suspect something similar happens with humans. Say you have a bunch of ideas, but you haven’t yet sewn them together into a clear story. If you write paragraphs, people will probably view them as confused babbling. Whereas if you write with lots of formatting, people might still be at least somewhat positive. Just like AIs, we all respond to our rewards.

More importantly, if you’ve written something that’s 70% formatting, it’s easy to delude yourself that there’s a clear set of ideas underneath, even when there isn’t.

The good news is that if you put in the effort, you can write better paragraphs than AI (for now). The act of creating a narrative forces you to confront contradictions that are invisible in format-world. So even if you want to write with 70% formatting, consider forcing yourself to write in paragraphs first.

Summary

Theory: Both people and AIs are addicted to formatting because:

  • Formatting is good.
    • Sometimes.
    • Especially if you don’t trust the author.
    • On the internet, most people probably don’t trust you.
  • It’s harder to see that something has problems when it’s written in all-formatting.
  • It’s easier to blather out a bunch of formatting than to write lucid paragraphs.
    • This is good at some stages, because it’s easy.
    • But forcing yourself to actually write a narrative is also good, because it’s hard.

So:

  1. First write with lots of formatting.
  2. Then figure out how to remove it.
  3. Then put it back, if you want.

P.S.

How does the optimal amount of formatting vary in the length of a piece of writing? I suspect it’s like this:

Maybe there’s a pattern here?

2026-03-04 08:00:00

1.

It occurred to me that if I could invent a machine—a gun—which could by its rapidity of fire, enable one man to do as much battle duty as a hundred, that it would, to a large extent supersede the necessity of large armies, and consequently, exposure to battle and disease [would] be greatly diminished.

Richard Gatling (1861)

2.

In 1923, Hermann Oberth published The Rocket to Planetary Spaces, later expanded as Ways to Space Travel. This showed that it was possible to build machines that could leave Earth’s atmosphere and reach orbit. He described the general principles of multiple-stage liquid-fueled rockets, solar sails, and even ion drives. He proposed sending humans into space, building space stations and satellites, and travelling to other planets.

The idea of space travel became popular in Germany. Swept up by these ideas, in 1927, Johannes Winkler, Max Valier, and Willy Ley formed the Verein für Raumschiffahrt (VfR) (Society for Space Travel) in Breslau (now Wrocław, Poland). This group rapidly grew to several hundred members. Several participated as advisors on Fritz Lang’s The Woman in the Moon, and the VfR even began publishing their own journal.

In 1930, the VfR was granted permission to use an abandoned ammunition dump outside Berlin as a test site and began experimenting with real rockets. Over the next few years, they developed a series of increasingly powerful rockets, first the Mirak line (which flew to a height of 18.3 m), then the Repulsor (>1 km). These people dreamed of space travel, and were building rockets themselves, funded by membership dues and a few donations. You can just do things.

However, with the great depression and loss of public interest in rocketry, the VfR faced declining membership and financial problems. In 1932, they approached the army and arranged a demonstration launch. Though it failed, the army nevertheless offered a contract. After a tumultuous internal debate, the VfR rejected the contract. Nevertheless, the army hired away several of the most talented members, starting with a 19-year-old named Wernher von Braun.

Following Hitler’s rise to power in January 1933, the army made an offer to absorb the entire VfR operation. They would work at modern facilities with ample funding, but under full military control, with all work classified and an explicit focus on weapons rather than space travel. The VfR’s leader, Rudolf Nebel, refused the offer, and the VfR continued to decline. Launches ceased. In 1934, the Gestapo finally shut the VfR down, and civilian research on rockets was restricted. Many VfR members followed von Braun to work for the military.

Of the founding members, Max Valier was killed in an accident in May 1930. Johannes Winkler joined the SS and spent the war working on liquid-fuel engines for military aircraft. Willy Ley was horrified by the Nazi regime and in 1935 forged some documents and fled to the United States, where he was a popular science author, seemingly the only surviving thread of the spirit of Oberth’s 1923 book. By 1944, V-2 rockets were falling on London and Antwerp.

3.

North Americans think the Wright Brothers invented the airplane. Much of the world believes that credit belongs to Alberto Santos-Dumont, a Brazilian inventor working in Paris.

Though Santos-Dumont is often presented as an idealistic pacifist, this is hagiography. In his 1904 book on airships, he suggests warfare as the primary practical use, discussing applications in reconnaissance, destroying submarines, attacking ships, troop supply, and siege operations. As World War I began, he enlisted in the French army (as a chauffeur), but seeing planes used for increasing violence disturbed him. His health declined and he returned to Brazil.

His views on military uses of planes seemed to shift. Though planes contributed to the carnage in WWI, he hoped that they might advance peace by keeping European violence from reaching the American continents. Speaking at a conference in the US in late 1915 or early 1916, he suggested:

Here in the new world we should all be friends. We should be able, in case of trouble, to intimidate any European power contemplating war against any one of us, not by guns, of which we have so few, but by the strength of our union. […] Only a fleet of great aeroplanes, flying 200 kilometers an hour, could patrol these long coasts.

Following the war, he appealed to the League of Nations to ban the use of planes as weapons and even offered a prize of 10,000 francs for whoever wrote the best argument to that effect. When the Brazilian revolution broke out in 1932, he was horrified to see planes used in fighting near his home. He asked a friend:

Why did I make this invention which, instead of contributing to the love between men, turns into a cursed weapon of war?

He died shortly thereafter, perhaps by suicide. A hundred years later, banning the use of planes in war is inconceivable.

4.

Humanity had few explosives other than gunpowder until 1847 when Ascanio Sobrero created nitroglycerin by combining nitric and sulfuric acid with a fat extract called glycerin. Sobrero found it too volatile for use as an explosive and turned to medical uses. After a self-experiment, he reported that ingesting nitroglycerin led to “a most violent, pulsating headache accompanied by great weakness of the limbs”. (He also killed his dog.) Eventually this led to the use of nitroglycerin for heart disease.

Many tried and failed to reliably ignite nitroglycerin. In 1863, Alfred Nobel finally succeeded by placing a tube of gunpowder with a traditional fuse inside the nitroglycerin. He put on a series of demonstrations blowing up enormous rocks. Certain that these explosives would transform mining and tunneling, he took out patents and started filling orders.

The substance remained lethally volatile. There were numerous fatal accidents around the world. In 1867, Nobel discovered that combining nitroglycerin with diatomaceous earth produced a product that was slightly less powerful but vastly safer. His factories of “dynamite” (no relation) were soon producing thousands of tons a year. Nobel sent chemists to California where they started manufacturing dynamite in a plant in what is today Golden Gate Park. By 1874, he had founded dynamite companies in more than ten countries and he was enormously rich.

In 1876, Nobel met Bertha Kinsky, who would become Bertha von Suttner, a celebrated peace activist. (And winner of the 1905 Nobel Peace Prize). At their first meeting, she expressed concern about dynamite’s military potential. Nobel shocked her. No, he said, the problem was that dynamite was too weak. Instead, he wished to produce “a substance or invent a machine of such frightful efficacy for wholesale destruction that wars should thereby become altogether impossible”.

It’s easy to dismiss this as self-serving. But dynamite was used overwhelmingly for construction and mining. Nobel did not grow rich by selling weapons. He was disturbed by dynamite’s use in Chicago’s 1886 Haymarket bombing. After being repeatedly betrayed and swindled, he seemed to regard the world of money with a kind of disgust. At heart, he seemed to be more inventor than businessman.

Still, the common story that Nobel was a closet pacifist is also hagiography. He showed little concern when both sides used dynamite in the 1870-1871 Franco-Prussian war. In his later years, he worked on developing munitions and co-invented cordite, remarking that they were “rather fiendish” but “so interesting as purely theoretical problems”.

Simultaneously, he grew interested in peace. He repeatedly suggested that Europe try a sort of one-year cooling off period. He even hired a retired Turkish diplomat as a kind of peace advisor. Eventually, he concluded that peace required an international agreement to act against any aggressor.

When Bertha’s 1889 book Lay Down Arms became a rallying cry, Nobel called it a masterpiece. But Nobel was skeptical. He made only small donations to her organization and refused to be listed as a sponsor of a pacifist congress. Instead, he continued to believe that peace would come through technological means, namely more powerful weapons. If explosives failed to achieve this, he told a friend, a solution could be found elsewhere:

A mere increase in the deadliness of armaments would not bring peace. The difficulty is that the action of explosives is too limited; to overcome this deficiency war must be made as deadly for all the civilians back home as for the troops on the front lines. […] War will instantly stop if the weapon is bacteriology.

5.

I’m a soldier who was tested by fate in 1941, in the very first months of that war that was so frightening and fateful for our people. […] On the battlefield, my comrades in arms and I were unable to defend ourselves. There was only one of the legendary Mosin rifles for three soldiers.

[…]

After the war, I worked long and very hard, day and night, labored at the lathe until I created a model with better characteristics. […] But I cannot bear my spiritual agony and the question that repeats itself over and over: If my automatic deprived people of life, am I, Mikhail Kalashnikov, ninety-three years of age, son of a peasant woman, a Christian and of Orthodox faith, guilty of the deaths of people, even if of enemies?

For twenty years already, we have been living in a different country. […] But evil is not subsiding. Good and evil live side by side, they conflict, and, what is most frightening, they make peace with each other in people’s hearts.

Mikhail Kalashnikov (2012)

6.

Leo Szilárd fled Nazi Germany in 1933, eventually ending up in New York where—with no formal position—he did experiments demonstrating that uranium could likely sustain a chain reaction of neutron emissions. He immediately realized that this meant it might be possible to create nuclear weapons. Horrified by what Hitler might do with such weapons, he enlisted Einstein to write the 1939 Einstein–Szilárd letter, which led to the creation of the Manhattan Project. Szilárd himself worked for the project at the Metallurgical Laboratory at the University of Chicago.

On June 11, 1945, as the bomb approached completion, Szilárd co-signed the Franck Report:

Nuclear bombs cannot possibly remain a “secret weapon” at the exclusive disposal of this country, for more than a few years. The scientific facts on which their construction is based are well known to scientists of other countries. Unless an effective international control of nuclear explosives is instituted, a race of nuclear armaments is certain to ensue.

[…]

We believe that these considerations make the use of nuclear bombs for an early, unannounced attack against Japan inadvisable. If the United States would be the first to release this new means of indiscriminate destruction upon mankind, she would sacrifice public support throughout the world, precipitate the race of armaments, and prejudice the possibility of reaching an international agreement on the future control of such weapons.

On July 16, 1945, the Trinity test achieved the first successful detonation of a nuclear weapon. The next day, Szilárd circulated the petition that now bears his name:

We, the undersigned scientists, have been working in the field of atomic power. Until recently we have had to fear that the United States might be attacked by atomic bombs during this war and that her only defense might lie in a counterattack by the same means. Today, with the defeat of Germany, this danger is averted and we feel impelled to say what follows:

The war has to be brought speedily to a successful conclusion and attacks by atomic bombs may very well be an effective method of warfare. We feel, however, that such attacks on Japan could not be justified, at least not unless the terms which will be imposed after the war on Japan were made public in detail and Japan were given an opportunity to surrender.

[…]

The development of atomic power will provide the nations with new means of destruction. The atomic bombs at our disposal represent only the first step in this direction, and there is almost no limit to the destructive power which will become available in the course of their future development. Thus a nation which sets the precedent of using these newly liberated forces of nature for purposes of destruction may have to bear the responsibility of opening the door to an era of devastation on an unimaginable scale.

[…]

In view of the foregoing, we, the undersigned, respectfully petition: first, that you exercise your power as Commander-in-Chief, to rule that the United States shall not resort to the use of atomic bombs in this war unless the terms which will be imposed upon Japan have been made public in detail and Japan knowing these terms has refused to surrender; second, that in such an event the question whether or not to use atomic bombs be decided by you in the light of the consideration presented in this petition as well as all the other moral responsibilities which are involved.

The Truman administration did not adopt this recommendation.

Heritability of human life span is about 50% when heritability is redefined to be something different

2026-02-05 08:00:00

How heritable is hair color? Well, if you’re a redhead and you have an identical twin, they will definitely also be a redhead. But the age at which twins go gray seems to vary a bit based on lifestyle. And there’s some randomness in where melanocytes end up on your skull when you’re an embryo. And your twin might dye their hair! So the correct answer is, some large number, but less than 100%.

OK, but check this out: Say I redefine “hair color” to mean “hair color except ignoring epigenetic and embryonic stuff and pretending that no one ever goes gray or dyes their hair et cetera”. Now, hair color is 100% heritable. Amazing, right?

Or—how heritable is IQ? The wise man answers, “Some number between 0% and 100%, it’s not that important, please don’t yell at me.” But whatever the number is, it depends on society. In our branch of the multiverse, some kids get private tutors and organic food and $20,000 summer camps, while other kids get dysfunctional schools and lead paint and summers spent drinking Pepsi and staring at glowing rectangles. These things surely have at least some impact on IQ.

But again, watch this: Say I redefine “IQ” to be “IQ in some hypothetical world where every kid got exactly the same school, nutrition, and parenting, so none of those non-genetic factors matter anymore.” Suddenly, the heritability of IQ is higher. Thrilling, right? So much science.

If you want to redefine stuff like this… that’s not wrong. I mean, heritability is a pretty arbitrary concept to start with. So if you prefer to talk about heritability in some other world instead of our actual world, who am I to judge?

Incidentally, here’s a recent paper:

I STRESS THAT THIS IS A PERFECTLY FINE PAPER. I’m picking on it mostly because it was published in Science, meaning—like all Science papers—it makes grand claims but is woefully vague about what those claims mean or what was actually done. Also, publishing in Science is morally wrong and/or makes me envious. So I thought I’d try to explain what’s happening.

It’s actually pretty simple. At least, after spending several hours reading the paper and its appendix over and over, I’ve convinced myself that it’s pretty simple. So, as a little pedagogical experiment, I’m going to try to explain the paper three times, with varying levels of detail.

Explanation 1: The very extremely high level picture

The normal way to estimate the heritability of lifespan is using twin data. Depending on what dataset you use, this will give 23-35%. This paper built a mathematical model that tries to simulate how long people would live in a hypothetical world in which no one dies from any non-aging-related cause, meaning no car accidents, no drug overdoses, no suicides, no murders, and no (non-age-related) infectious disease. On that simulated data, for simulated people in a hypothetical world, heritability was 46-57%.

Commentary

Everyone seems to be interpreting this paper as follows:

Aha! We thought the heritability of lifespan was 23-35%. But it turns out that it’s around 50%. Now we know!

I understand this. Clearly, when the editors at Science chose the title for this paper, their goal was to lead you to that conclusion. But this is not what the paper says. What it says is this:

We built a mathematical model of an alternate universe in which nobody died from accidents, murder, drug overdoses, or infectious disease. In that model, heritability was about 50%.

Explanation 2: The very high-level picture

Let’s start over. Here’s figure 2 from the paper.

Normally, heritability is estimated from twin studies. The idea is that identical twins share 100% of their DNA, while fraternal twins share only 50%. So if some trait is more correlated among identical twins than among fraternal twins, that suggests DNA influences that trait. There are statistics that formalize this intuition. Given a dataset that records how long various identical and fraternal twins lived, these produce a heritability number.

Two such traditional estimates appear as black circles in the above figures. For the Danish twin cohort, lifespan is estimated to be 23% heritable. For the Swedish cohort, it’s 35%.

This paper makes a “twin simulator”. Given historical data, they fit a mathematical model to simulate the lifespans of “new” twins. Then they compute heritability on this simulated data.

Why calculate heritability on simulated data instead of real data? Well, their mathematical model contains an “extrinsic mortality” parameter, which is supposed to reflect the chance of death due to all non-aging-related factors like accidents, murder, or infectious disease. They assume that the chance someone dies from any of this stuff is constant over people, constant over time, and that it accounts for almost all deaths for people aged between 15 and 40.

The point of building the simulator is that it’s possible to change extrinsic mortality. That’s what’s happening in the purple curves in the above figure. For a range of different extrinsic mortality parameters, they simulate datasets of twins. For each simulated dataset, they estimate heritability just like with a real dataset.

Note that the purple curves above nearly hit the black circles. This means that if they run their simulator with extrinsic mortality set to match reality, they get heritability numbers that line up with what we get from real data. That suggests their mathematical model isn’t totally insane.

If you decrease extrinsic mortality, then you decrease the non-genetic randomness in how long people live. So heritability goes up. Hence, the purple curves go up as you go to the left.

Intermission: On Science

My explanation of this paper relies on some amount of guesswork. For whatever reason, Science has decided that papers should contain almost no math, even when the paper in question is about math. So I’m mostly working from an English description. But even that description isn’t systematic. There’s no place that clearly lays out all the things they did, in order. Instead, you get little hints, sort of randomly distributed throughout the paper. There’s an appendix, which the paper confidently cites over and over. But if you actually read the appendix, it’s just more disconnected explanations of random things except now with equations set in glorious Microsoft Word format.

Now, in most journals, authors write everything. But Science has professional editors. Given that every single statistics-focused paper in Science seems to be like this, we probably shouldn’t blame the authors of this one. (Other than for their decision to publish in Science in the first place.)

I do wonder what those editors are doing, though. I mean, let me show you something. Here’s the first paragraph where they start to explain what they actually did, from the first page:

See that h(t,θ) at the end? What the hell is that, you ask? That’s a good question, because it was never introduced before this and is never mentioned again. I guess it’s just supposed to be f(t,θ), which is fine. (I yield to none in my production of typos.) But if paying journals ungodly amounts of money brought us to this, of what use are those journals?

Moving on…

Explanation 3: Also pretty high level, but as low as we’re going to go

Probably most people don’t need this much detail and should skip this section. For everyone else, let’s start over one last time.

The “normal” way to estimate heritability is by looking at correlations between different kinds of twins. Intuitively, if the lifespans of identical twins are more correlated than the lifespans of fraternal twins, that suggests lifespan is heritable. And it turns out that one estimator for heritability is “twice the difference between the correlation among identical twins and the correlation among fraternal twins, all raised together.” There are other similar estimators for other kinds of twins. These normally say lifespan is somewhere between 23% and 35% heritable.
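To make that concrete, here’s what that estimator looks like in code. This is my own toy sketch in Python, not anything from the paper, applied to lifespans of twin pairs raised together:

```python
import numpy as np

def falconer_h2(mz_pairs, dz_pairs):
    """Estimate heritability from twin data via Falconer's formula.

    mz_pairs, dz_pairs: arrays of shape (n_pairs, 2) holding the lifespans
    of identical (MZ) and fraternal (DZ) twin pairs raised together.
    """
    r_mz = np.corrcoef(mz_pairs[:, 0], mz_pairs[:, 1])[0, 1]
    r_dz = np.corrcoef(dz_pairs[:, 0], dz_pairs[:, 1])[0, 1]
    # Twice the difference between the MZ correlation and the DZ correlation.
    return 2 * (r_mz - r_dz)
```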

This paper created an equation to model the probability a given person will die at a given age. The parameters of the equation vary from person to person, reflecting that some of us have DNA that predisposes us to live longer than others. But the idea is that the chances of dying are fairly constant between the ages of 15 and 40, after which they start increasing.

This equation contains an “extrinsic mortality” parameter. This is meant to reflect the chance of death due to all non-aging-related factors like accidents or murder, etc. They assume this is constant. (Constant with respect to people and constant over time.) Note that they don’t actually look at any data on causes of death. They just add a constant risk of death that’s shared by all people at all ages to the equation, and then they call this “extrinsic mortality”.

Now remember, different people are supposed to have different parameters in their probability-of-death equations. To reflect this, they fit a Gaussian distribution (bell curve) to the parameters, with the goal of matching historical data. The idea is that if the distribution over parameters were too broad, you might get lots of people dying at 15 or living until 120, which would be wrong. If the distribution were too concentrated, then you might get everyone dying at 43, which would also be wrong. So they find a good distribution, one that makes the ages people die in simulation look like the ages at which people actually died in historical data.
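The paper doesn’t print the equation, so I’m partly guessing at its form. But the description (a roughly constant risk through early adulthood, plus an aging-related risk that grows afterward) is consistent with a standard Gompertz–Makeham-style hazard. Here’s a sketch of that kind of setup; the functional form, the parameter names, and every number below are my own illustrative choices, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

EXTRINSIC = 1e-3  # "extrinsic mortality": a constant yearly risk shared by everyone

def hazard(age, a, b, extrinsic=EXTRINSIC):
    """Probability of dying in a given year of life: a constant extrinsic
    term plus an aging term that grows exponentially with age."""
    return extrinsic + a * np.exp(b * age)

def simulate_lifespan(a, b, extrinsic=EXTRINSIC, max_age=120):
    """Step through life one year at a time, dying with probability hazard(age)."""
    for age in range(max_age):
        if rng.random() < hazard(age, a, b, extrinsic):
            return age
    return max_age

# Person-to-person variation: parameters drawn from a Gaussian, here over
# (log a, b). The mean and covariance below are made up for illustration;
# the paper fits them so that simulated death ages match historical data.
PARAM_MEAN = np.array([-10.0, 0.085])
PARAM_COV = np.array([[0.25, 0.0], [0.0, 1e-4]])

# Example: one simulated person.
log_a, b = rng.multivariate_normal(PARAM_MEAN, PARAM_COV)
age_at_death = simulate_lifespan(np.exp(log_a), b)
```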

Right! So now they have:

  1. An equation that’s supposed to reflect the probability a given person dies at a given age.
  2. A distribution over the parameters of that equation that’s supposed to produce population-wide death ages that look like those in real historical data.

Before moving on, I remind you of two things:

  1. They assume their death equation entirely determines the probability someone will die in a given year.
  2. They assume that the shape of someone’s death equation is entirely determined by genetics.

The event of a person dying at a given age is random. But the probability that this happens is assumed to be fixed and determined by genes and genes alone.

Now they simulate different kinds of twins. To simulate identical twins, they just draw parameters from their parameter distribution, assign those parameters to two different people, and then let them randomly die according to their death equation. (Is this getting morbid?) To simulate fraternal twins, they do the same thing, except instead of giving the two twins identical parameters, they give them correlated parameters, to reflect that they share 50% of their DNA.

How exactly do they create those correlated parameters? They don’t explain this in the paper, and they’re quite vague in the supplement. As far as I can tell, they sample two sets of parameters from their parameter distribution such that the parameters are correlated at a level of 0.5.

Now they have simulated twins. They can simulate them with different extrinsic mortality values. If they lower extrinsic mortality, heritability of lifespan goes up. If they lower it to zero, heritability goes up to around 50%.
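Putting the pieces together, the whole pipeline might look something like the sketch below. It reuses the hypothetical hazard and parameter distribution from the earlier sketch, along with the toy estimator, and it implements the 0.5-correlation trick in one plausible way; none of this is the paper’s actual code:

```python
def draw_twin_parameters(n_pairs, rho):
    """Draw parameter pairs whose underlying Gaussians correlate at rho
    (rho=1 for identical twins, rho=0.5 for fraternal twins)."""
    chol = np.linalg.cholesky(PARAM_COV)
    e1 = rng.standard_normal((n_pairs, 2))
    e2 = rho * e1 + np.sqrt(1 - rho**2) * rng.standard_normal((n_pairs, 2))
    p1 = PARAM_MEAN + e1 @ chol.T
    p2 = PARAM_MEAN + e2 @ chol.T
    return p1, p2

def lifespans(params, extrinsic):
    """Simulate an age at death for each (log a, b) parameter row."""
    return np.array([simulate_lifespan(np.exp(log_a), b, extrinsic)
                     for log_a, b in params])

def simulated_heritability(n_pairs=20_000, extrinsic=EXTRINSIC):
    """Simulate MZ and DZ twin pairs, then run the usual twin estimator on them."""
    mz1, mz2 = draw_twin_parameters(n_pairs, rho=1.0)
    dz1, dz2 = draw_twin_parameters(n_pairs, rho=0.5)
    mz = np.stack([lifespans(mz1, extrinsic), lifespans(mz2, extrinsic)], axis=1)
    dz = np.stack([lifespans(dz1, extrinsic), lifespans(dz2, extrinsic)], axis=1)
    return falconer_h2(mz, dz)

# Lowering extrinsic mortality removes non-genetic randomness, so the
# estimate should rise as you move toward the no-accidents world, e.g.
# simulated_heritability(extrinsic=1e-2) vs. simulated_heritability(extrinsic=0.0).
```

With made-up numbers you won’t reproduce the paper’s exact curves, but the qualitative behavior should be the same: shrinking the extrinsic term removes non-genetic randomness, so the gap between the MZ and DZ correlations widens and the heritability estimate climbs.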

More commentary

Almost all human traits are partly genetic and partly due to the environment and/or randomness. If you could change the world and reduce the amount of randomness, then of course heritability would go up. That’s true for life expectancy just like for anything else. So what’s the point of this paper?

There is a point!

  1. Sure, obviously heritability would be higher in a world without accidents or murder. We don’t need a paper to know that. But how much higher? It’s impossible to say without modeling and simulating that other world.

  2. Our twin datasets are really old. It’s likely that non-aging-related deaths are lower now than in the past, because we have better healthcare and so on. This means that the heritability of lifespan for people alive today may be larger than it was for the people in our twin datasets, some of whom were born in 1870. We won’t know for sure until we’re all dead, but this paper gives us a way to guess.

  3. Have I mentioned that heritability depends on society? And that heritability changes when society changes? And that heritability is just a ratio and you should stop trying to make it be a non-ratio because only-ratio things cannot be non-ratios? This is a nice reminder.

Honestly, I think the model the paper built is quite clever. Nothing is perfect, but I think this is a pretty good run at the question of, “How high would the heritability of lifespan be if extrinsic mortality were lower?”

I only have two objections. The first is to the Science writing style. This is a paper describing a statistical model. So shouldn’t there be somewhere in the paper where they explain exactly what they did, in order, from start to finish? Ostensibly, I think this is done in the left-hand column on the second page, just with little detail because Science is written for a general audience. But personally I think that description is the worst of all worlds. Instead of giving the high-level story in a coherent way, it throws random technical details at you without enough information to actually make sense of them. Couldn’t the full story with the full details at least be in the appendix? I feel like this wasted hours of my time, and that if someone wanted to reproduce this work, they would have almost no chance of doing so from the description given. How have we as a society decided that we should take our “best” papers and do this to them?

But my main objection is this:

At first, I thought this was absurd. The fact that people die in car accidents is not a “confounding factor”. And pretending that no one dies in car accidents does not “address” some kind of bias. That’s just computing heritability in some other world. Remember, heritability is not some kind of Platonic form. It is an observational statistic. There is no such thing as “true” heritability, independent of the contingent facts of our world.

But upon reflection, I think they’re trying to say something like this:

Heritability of human lifespan is about 50% when extrinsic mortality is adjusted to be closer to modern levels.

The problem is: I think this is… not true? Here are the actual heritability estimates in the paper, varying by dataset (different plots), the cutoff year (colors), and extrinsic mortality (x-axis).

When extrinsic mortality goes down, heritability goes up. So the obvious question is: What is extrinsic mortality in modern people?

This is a tricky question, because “extrinsic mortality” isn’t some simple observational statistic. It is a parameter in their model. (Remember, they never looked at causes of death.) So it’s hard to say, but they seem to suggest that extrinsic mortality in modern people is 0.001 / year, or perhaps a bit less.

The above figures have the base-10 logarithm of extrinsic mortality on the x-axis. And the base-10 logarithm of 0.001 is -3. But if you look at the curves when the x-axis is -3, the heritability estimates are not 50%. They’re more like 35-45%, depending on the particular model and age cutoff.

So here’s my suggested title:

Heritability of human lifespan is about 40% when extrinsic mortality is adjusted to modern levels, according to our simulation.

There might be a reason I don’t work at Science.