
RSS preview of the LessWrong blog

Hi. I am hbj.

2026-04-11 11:37:34

Hi, everyone. My name is Bo Jun Han (hbj), and I am from Taiwan. This is the first post I have actually written and published on LessWrong. Since English is not my native language and my English is not very good, I use Grammarly to correct my wording and grammar. I know the rule here forbids using LLMs to improve one's writing, so I tried to put this down by myself, word by word. If it reads like junior high school homework, please forgive me.

Interestingly, the one that most strongly recommended I find and join this place is Google's LLM, Gemini. I have felt lonely and hopeless about finding a Ph.D. mentor for nearly four months. Because of my past major (an M.A. in International Relations), hundreds of cold emails have gone without a single response. Even as I keep working hard on my research and publishing preprint reports on Zenodo and ResearchGate, the whole scholarly world seems to stay quiet and silent.

I hate Meta's ecosystem and Reddit (for its unbelievable shadowban mechanism), and I am disappointed in the other ordinary, everyday social platforms. On them, no one will seriously and respectfully engage with opinions or ideas. Few people gather to debate in moderation on the internet, and forums like this one are rarely seen. Most passionate Taiwanese pour their energy into the clamor and mudslinging of political conflict. Although my university department was "Political" Science, I would rather talk about the situation of human beings than about whether to unify with the People's Republic of China into a single nation.

The way I create my articles is this: I "say" the content to the computer to transcribe it into digital form, and I ask LLMs for more details, background, and base knowledge. After that, I use "cut" and "paste" to arrange the bones of the piece, then use LLMs to audit it, revise my word choices, and polish the sentences. What I have to clarify, before any banning or hating happens, is that the thoughts and insights are without doubt my own, those of a human being. There is no possibility that an LLM would connect the Second Law of Thermodynamics and cryptography to establish a mathematical conjecture. The one who did that is me. Always me. A "human brain", or so-called "self-awareness", remains the object of human society.

I list the big questions, split them into small ones, and ask LLMs, "What is the most difficult barrier in front of us?" They answer, then I ask more deeply, time and time again. What I'm most proud of is my knowledge across a wide range of fields, though I must admit I'm not an expert in every single one. Because of this, I can often cross disciplines to connect very different points and gain a critical insight. The rules say that we have to quote every part created by LLMs, but how can I separate out the insights that emerge when I combine my own inspiration with the answers generated by LLMs and present them as a cohesive whole?

I have written some articles about the bias that belittles the process and outcomes of humans and machines collaborating to create new knowledge. This must be, and will have to be, a key argument of the next decades. If you don't mind, you can visit my LinkedIn profile and read the articles in Traditional Chinese with translation software to get my point of view on this.

Besides, I will write a bilingual article in the future, since there is no rule forbidding people from using their mother tongue.

Thank you for your patience.

Here are my works and my profile:
ResearchGate: https://www.researchgate.net/profile/Bo-Jun-Han
LinkedIn: www.linkedin.com/in/hbjun




Getting Claude to rank the inkhaven bloggers

2026-04-11 10:38:28

With apologies to those who didn't make this post: it seems you need to up your game.


Yesterday, Alexander Wales published a post entitled "Can an LLM have taste? Inkhaven Week 1, ranked by Claude". I found this very entertaining.

He took Claude, used it to compare a bunch of inkhaven posts, ranked them, and provided us with this wonderful list of the top ten posts so far:

  1. Three Stones are Enough: The Case Against Leaves, in Particular, Anna Mattinger
  2. An open letter to 21 people I know who died, Layla Hughes
  3. endometrial biopsy, kaylee
  4. Softhead, macroraptor
  5. Every Lighthaven Writing Residency, Layla Hughes
  6. The largest manufacturer of feelings in human history, Natalie Cargill
  7. The one that loved me most, MLL
  8. I did it. I found the worst poem in the world., Natalie Cargill
  9. “Love, Mum” - What AIs can’t see about abuse, Natalie Cargill
  10. Lost Mesoamerican Technologies, Lost Futures

I thought this was great, so I decided to make my own version, but better. Rather than using pure ranking, I (or rather, Claude, who assisted me with the actual implementation of this hare-brained scheme) decided to use a Bradley-Terry model, which Claude informs me is rather like the Elo system used to rank chess players.
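For the curious, the core of a Bradley-Terry fit is small enough to sketch. This is my toy illustration, not the actual script Claude wrote: it assumes the 8-way rankings have already been decomposed into pairwise win counts, and fits strengths with the standard minorization-maximization update.

```python
# Toy pairwise win counts: wins[i][j] = times post i was ranked above post j.
# (Each 8-post ranking from the judge decomposes into 28 such pairwise outcomes.)
wins = [
    [0, 3, 4],
    [1, 0, 2],
    [0, 2, 0],
]
n = len(wins)
strength = [1.0] * n  # Bradley-Terry strengths, initialised uniformly

# Standard minorization-maximization (MM) update for the Bradley-Terry MLE.
for _ in range(200):
    new = []
    for i in range(n):
        total_wins = sum(wins[i])
        denom = sum(
            (wins[i][j] + wins[j][i]) / (strength[i] + strength[j])
            for j in range(n)
            if j != i
        )
        new.append(total_wins / denom if denom else strength[i])
    scale = n / sum(new)
    strength = [s * scale for s in new]  # fix the overall scale each round

ranking = sorted(range(n), key=lambda i: -strength[i])
print(ranking)  # → [0, 1, 2]: post 0 wins most, post 2 least
```

The σ-scores below are presumably these strengths on a log scale, standardised; as with Elo, only the differences between them are meaningful.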

Using the Anthropic API, we gave Claude Opus 4.6 the following prompt (also written by Claude, but edited by me), including 8 posts for it to rank:

You are judging posts from Inkhaven, a writing residency where participants commit to publishing one blog post every day for 30 days. The residents are a mix of AI safety researchers, rationalists, fiction writers, and generally thoughtful people. The audience skews heavily rationalist — LessWrong regulars, EA-adjacent, people who take ideas seriously but also appreciate a good joke.

You will be shown 8 Inkhaven posts. Rank them by quality, from best to worst.

The question to ask yourself for each post: "Would a typical rationalist vote to read more of this sort of thing?" You're not rating a single post in isolation — you're judging whether the author, writing in this mode, should keep going. Insight, craft, honest thinking, and distinctive voice all count.

So does being funny — humour is a genuine virtue here, not a tiebreaker.

A few things to keep in mind:

- Do NOT be generous or encouraging. Predict the actual taste of the rationalist audience. Many of these posts will be mediocre and that's fine to say.

- Fiction, essays, rants, reviews, and technical posts are all on the same scale — judge each by whether it succeeds at what it's trying to do.

- Length is not quality. A tight 500 words can beat a bloated 3000.

- Weird and niche is fine, often good. Idiosyncrasy is often a feature, not a bug.

=== POST {i} ===

title + first 4000 chars of body.

Rank all 8 posts from best to worst. Think through your reasoning, then give your final answer as a comma-separated list of post numbers inside <answer> tags.

We did five iterations:

  1. Get baseline estimates for each post
  2. Get more accurate estimates for posts liable to be in the top 10
  3. Get proper estimates for the posts we'd accidentally imported in the wrong format
  4. Try to push my post from 2nd to 1st place (instead, it ended up in 10th)
  5. Realise that we were missing a bunch of posts and add those in (it didn't change much)

$40 in burned API credits later, we got the following ranking:

  1. (+2.77σ) How to invent a disease, Natalie Cargill
  2. (+2.55σ) More Legal Systems Very Different From Ours 1, Alec Thompson
  3. (+2.54σ) The Smell, Avi
  4. (+2.53σ) Posts I Will Not Be Writing, Aaron Gertler
  5. (+2.49σ) How the Claude Mythos leak happened, Smitty
  6. (+2.49σ) The largest manufacturer of feelings in human history, Natalie Cargill
  7. (+2.49σ) The phenomenology of being hungry while pregnant, viv
  8. (+2.47σ) More Legal Systems Very Different From Ours 2: Nazi Private Law, Alec Thompson
  9. (+2.46σ) Three Stones are Enough: The Case Against Leaves, in Particular, Anna Mattinger
  10. (+2.45σ) The quest for general intelligence is hitting a wall, Sean Herrington
  11. (+2.39σ) Why did Hitler hate Roman law?, Alec Thompson
  12. (+2.35σ) Forgotten 18th Century Chinese Republics, Austen
  13. (+2.33σ) Finding Jack O'Neil, Alec Thompson
  14. (+2.28σ) When the buffalo went away..., Vishal Prasad
  15. (+2.28σ) Late pregnancy is pretty bizarre, viv
  16. (+2.21σ) Sin as a physical particle, Itsi Weinstock
  17. (+2.20σ) Two critiques of Rethink Priorities' Moral Weights project, Bill Jackson
  18. (+2.16σ) I did it. I found the worst poem in the world., Natalie Cargill
  19. (+2.12σ) How many genders are there?, viv
  20. (+1.99σ) Revisiting GSM-Symbolic: Do 2026 Frontier Models Still Fail at Confounded Grade School Math?, Benjamin Sturgeon

Claude did a diligent bootstrapping check to ensure we had the right posts in the top 20, and found that post 19 was there 90% of the time, while Ben Sturgeon's Revisiting GSM-Symbolic hit a mere 22%. You're on thin ice, Ben.
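A bootstrap check like that is conceptually simple: resample the pairwise outcomes with replacement, refit, and count how often each post keeps its spot. Here's my own toy version with made-up comparisons, using raw win counts as a crude stand-in for refitting the full Bradley-Terry model on each resample:

```python
import random

# Hypothetical pairwise outcomes (winner, loser) from the judge.
comparisons = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "B"), ("A", "B")]

def top_item(sample):
    # Crude stand-in for a full Bradley-Terry refit: rank by raw win count.
    win_counts = {}
    for winner, _loser in sample:
        win_counts[winner] = win_counts.get(winner, 0) + 1
    return max(win_counts, key=win_counts.get)

random.seed(0)
trials = 1000
hits = sum(
    top_item(random.choices(comparisons, k=len(comparisons))) == "A"
    for _ in range(trials)
)
stability = hits / trials
print(stability)  # fraction of resamples in which "A" keeps the top spot
```

A post that tops nearly every resample is securely ranked; one that holds its slot only 22% of the time is, well, on thin ice.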

Averaging the scores of the individual posts also lets us rank the authors. The top 20 authors at Inkhaven right now are... [drum roll]:

  1. (+2.60σ, 10 posts) viv; best post: The phenomenology of being hungry while pregnant
  2. (+2.52σ, 9 posts) Natalie Cargill; best post: How to invent a disease
  3. (+2.23σ, 9 posts) Alec Thompson; best post: More Legal Systems Very Different From Ours 1
  4. (+1.82σ, 9 posts) Aaron Gertler; best post: Posts I Will Not Be Writing
  5. (+1.56σ, 9 posts) Steven K; best post: Prosaic License
  6. (+1.34σ, 9 posts) Katja Grace; best post: Eggs, rooms, puzzles, and talking about AI
  7. (+1.06σ, 9 posts) capsuletime; best post: Fuck Blogging
  8. (+1.05σ, 10 posts) Kevin Z Wu; best post: (box|bag) in (box|bag) in (box|bag)
  9. (+1.03σ, 9 posts) Austen; best post: Forgotten 18th Century Chinese Republics
  10. (+0.82σ, 10 posts) Drew Schorno; best post: 2035
  11. (+0.68σ, 9 posts) Justis Mills (Writing Advisor); best post: Why No Wheel Bus Again?
  12. (+0.58σ, 9 posts) Bill Jackson; best post: Two critiques of Rethink Priorities' Moral Weights project
  13. (+0.49σ, 9 posts) Lawrence Chan; best post: We're actually running out of benchmarks to upper bound AI capabilities
  14. (+0.44σ, 9 posts) Avi; best post: The Smell
  15. (+0.43σ, 9 posts) Derek Razo; best post: How to Pay to Change the Law
  16. (+0.37σ, 9 posts) conq; best post: 19th century poet UTTERLY DESTROYS critics (NO MERCY!)
  17. (+0.32σ, 9 posts) Alicorn (Writing Advisor); best post: Dogs Are Rude
  18. (+0.31σ, 6 posts) Remy; best post: You Know What They Say About Assuming
  19. (+0.29σ, 7 posts) Layla Hughes; best post: Every Lighthaven Writing Residency
  20. (+0.22σ, 9 posts) Henry Stanley; best post: Inkhavening

I should probably note that I filtered out anyone who hadn't published posts on at least two-thirds of the days, so Vishal Prasad (+1.89σ, 2 posts), Robert Mushkatblat (+1.41σ, 4 posts), A.G.G Liu (+0.7σ, 1 post), Justin Kuiper (+0.32σ, 1 post) and Georgia Ray (+0.3σ, 1 post) didn't make the cut on volume, despite having the quality.

Alexander Wales (-0.23σ, 4 posts), whose post inspired this one, is also, sadly, left out of the rankings. (Sorry).




Some thoughts on Nectome's risk and resilience

2026-04-11 09:49:58

One of the best ways to reduce Nectome's long-term risk is to show that preservation is something people want by buying one yourself; this is a critical time for the organization, and your contributions now have an outsized impact on our likelihood of success. I'm happy to discuss preservation personally with anyone who's interested. Our current presales are open until the end of April.

https://nectome.substack.com/p/preservation-pre-sales


Up till now I've been talking about Nectome's recent advances in structural preservation, and what that means for the immediate concerns of doing preservation to our standards. With this post I want to do something different: I want to talk about Nectome's future and how we're thinking about long-term care of people who entrust their preservation to us.

In contrast to the last posts, where I had experiments and images to back up what I was saying, this post is more speculative. I've spent the last ten years thinking about how to do long-term care for people: I've talked with lawyers and hedge fund managers; I've studied end-of-life laws around the world; I've listened to people like Mike Darwin tell the stories of past cryonics attempts. We have a lot of great business advisors among our investors who care about us succeeding. But I'm not an expert at long-term organizational stability the way I am at preservation. And making a good long-term plan is intrinsically harder than doing preservations well, in my opinion, because planning for an uncertain future is far less constrained than doing a surgery well.

I want to talk about our current default plan for long-term care, and our reasons behind it, and then I want you to give me your thoughts. I think I have a lot to learn from people who've also spent a lot of time thinking about the future. I hope that Nectome will ultimately be a lot stronger because we got good feedback early on from the kinds of people who read these posts.

Robust to outages and cheap to maintain

The first and most immediate concern for long-term safekeeping of our clients is their physical long-term integrity. What happens if the power goes out? What about a natural disaster? What if we need to move them quickly? Basically, how will we keep people physically safe?

Traditional cryonics uses -196°C, the temperature of boiling liquid nitrogen. While it provides long-term stability, it's expensive and vulnerable to supply disruption. In idealized pure-cold cryo, the goal is to chill the body until it forms a solid glass; to sustain this glassy state, you need a constant supply of liquid nitrogen, and a team of people to replenish the supply. If and when that supply chain ever breaks, people preserved this way begin to thaw within a month.

This is even worse than it sounds, because there's an intermediate danger zone between "glassy" (around -130°C) and "liquid" (around -40°C): passing through it causes catastrophic damage as ice forms in the brain, in a process called devitrification. A cryonics patient who warms to room temperature passes through this zone twice before returning to their proper temperature – once on the way up, once on the way down again. This must be avoided at all costs, meaning that traditional cryonics has to have 100% uptime; cryonics patients can't actually be warmed up without sophisticated technology, like microwave-based rewarming, that hasn't been perfected yet.

In contrast, the aldehyde-based fixation that Nectome uses is cheap to sustain. We will maintain a temperature around -30°C, above the "danger zone", and keep preserved people in a liquid state for the long term[1]. That's colder than your kitchen freezer, but typical in many biomedical applications. Instead of relying on deliveries of expensive, consumable coolant, we can buy a mass-manufactured freezer from any of a wide range of commercial suppliers. For us, the more likely scenario of failing warm is much less destructive than failing cold. In the worst case, where we have a total equipment failure including our backup generators, or we go bankrupt and have to transport the preserved people to a companion facility, even a few days' transport at room temperature is not damaging to ultrastructure[2]. Overall, Nectome-style preservation is a much simpler and more forgiving endeavor than, for instance, storing frozen embryos.

The resilience of aldehyde-based fixation also unlocks the novel possibility of using permafrost. While our business model is currently based on having a dedicated, supervised facility, permafrost is an option that some people find appealingly geopolitically robust, and a nice method of last resort. The ground layer in parts of the Arctic and Antarctic never thaws year-round; a preserved person placed there in permafrost would maintain adequately low temperatures for centuries without any human intervention at all.

I hope Nectome itself survives for a long time, but in the event of tail risks such as economic collapse, catastrophic societal upheaval, or bankruptcy without recourse, aldehydes do give us permafrost as a lifeline. Our final commitment as an organization may be to transfer our preserved people to permafrost.[3] While some catastrophes would happen too quickly to react to, many would be handleable in this fashion.

My question to you: if you had to take care of preserved people as cheaply as physically possible for the long term, assuming the temperature must be between -25°C and -35°C for 99% of the time over 100 years, and not between -36°C and -99°C for more than 48 hours over 100 years, what would you do? I think permafrost is the best option here, but where would you choose exactly, and why? Or if you have a better idea than permafrost, please let me know!

Priced to thrive, run on endowment

A second source of risk is the funding model: who pays for preservation? How much should you charge? How do you cover the cost of long-term cold preservation for the indeterminate future? How do you weather challenges like inflation over long time scales? What happens if we get sued?

Let's start with recurring costs. We're dealing with a situation where we want to disburse funds over a long period of time, and for predictable sources of expense – mainly refrigeration. Like other organizations faced with this shape of situation, notably universities, graveyards, pension funds, and our longest-lived cryonics companies, we choose an endowment model.

In this model, part of the up-front payment is placed in an endowment, where it is invested in diverse, resilient assets and managed by financial experts hired for this purpose. In this way, the interest gained on the investment beats the exponential attrition due to inflation, which would otherwise wipe out most fixed investments. And while economic upheavals are certainly a risk for the endowment model, we think that investing prudently and diversely is our best option.
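To make the arithmetic concrete (with entirely made-up numbers, not Nectome's actual figures): an endowment funded as a perpetuity only needs its real return, i.e. return net of inflation, to cover the recurring cost.

```python
# Illustrative only: these figures are invented, not Nectome's actual costs.
annual_cost = 2_000.0    # hypothetical yearly refrigeration + upkeep, today's dollars
nominal_return = 0.06    # assumed average investment return
inflation = 0.03         # assumed average inflation

# Real (inflation-adjusted) return: what's left after inflation's attrition.
real_return = (1 + nominal_return) / (1 + inflation) - 1

# A perpetuity funded from real returns alone never shrinks in real terms.
required_endowment = annual_cost / real_return
print(round(required_endowment))  # → 68667
```

The point: at roughly a 3% real return, every recurring dollar of annual cost needs about $34 of principal, whereas a fixed, uninvested fund would lose about half its purchasing power to 3% inflation every ~24 years.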

I think that Alcor, graveyards, etc got it right: if you have fixed costs you need to fund for a century then you can do it with an endowment. There are other approaches we've contemplated—for example we know that asking surviving family members to pay on an ongoing basis tends to fail within a few years. Another option is eventually funding preservations through Medicare, and I consider this a useful future direction, but it's not on the table currently. Right now, the endowment model has a proven track record for predictable expenses over long timeframes, and I see no reason to re-invent the wheel.

A second piece of the model is that we plan to create a long-term care non-profit, run as a distinct organization from us at Nectome, which handles selling and performing preservation procedures. This separation helps ensure that if Nectome cannot sell enough cryopreservations to keep the lights on, or if Nectome takes on costly legal battles, the people already in care and the endowment itself will be financially insulated. Alcor uses this model, and we think it's a wise one.

Finally, there's a question of setting the price for preservation. It's pretty common, in cryonics, for companies to run at a loss and stay alive through donations and volunteer work. I understand why this happens: we're all in this together, trying to get as many people as possible to the future. It would be wonderful to be positioned to provide preservations pro-bono or at the cheapest possible price. As we're taking our first steps as a company, though, I worry that this would put us in a financially precarious position, leaving us less capable of weathering challenges, expanding our research, and ultimately making preservation into a global tradition that can reliably reach everyone.

For this reason, I think I can provide the best stability and safety for our customers by running Nectome as a for-profit business, turning a profit every quarter, and growing at a brisk but steady rate. As of 2026, some of the first targets on our list include a marketing budget to extend our reach, expansion plans to offer preservation in more places, and a war chest in preparation for when Nectome needs to fight legal battles or go to bat for the emerging category of preservation law and the rights of preserved people.

As we grow, I expect that scale will be a big part of our robustness to social and legal challenges—say, if some part of the protocol gets outlawed someplace. I try to be realistic and measured about our prospects, but I think there's real hope for a future where jurisdictions compete to pass laws that accommodate a lucrative industry in preservation. I hope that one day soon people will plan on having a career in preservation, that the field will become regulated and respectable. Many different speedbumps are more easily handled with the kind of goodwill and political and financial capital we're working to accumulate.

My question to you: if you needed to charge an up-front amount of money to keep someone preserved, assuming an amortized annual cost-of-preservation of $X/yr and ignoring the setup costs for the preservation but including room for a robust legal defense fund, how much would you charge? And how would you manage the money? Now's a great chance to bring up options; I bet there's some good ones I'm missing!

Allies who keep us honest and wise

How will we keep our quality standards high? What about competitors? What if we're bought out by a larger company? What about our mental health? Basically, how can we keep ourselves from losing our way?

We have a lot to prove as an emerging startup with goals this ambitious and scientifically demanding. Fortunately, we’re not starting from scratch and we’re not working alone: we’ve surrounded ourselves with a team of skilled advisors—including scientists, business mavens, cryonics veterans—who are among the smartest people we know, and whose collective knowledge helps steer us right and keep us true to our values. Many of these people are understandably sensitive about being named, but I'll name-drop our YC group lead Michael Seibel and neuroscientist Bobby Kasthuri.

Some of our community members, like Andrew Critch and recently Max Harms, have given generously of their time to come investigate what we're doing with a skeptical eye. I'm enormously grateful for the spirit of scientific inquiry and citizen journalism in which they approached us; I consider that kind of rigor and respect to be one of the greatest gifts one human being can give another. I believe Max intends to write about his findings and opinions on his blog, and I'll link his piece here once he publishes it.

We deliberately cultivate costly external validation, both to be transparent, and because it keeps our standards high. One of our goals is that, by pioneering radical transparency and sky-high scientific standards in this newly-forming field, we’re helping to set a standard for all who follow. One of the risks we’ve got our finger on is the prospect of being undercut by a competitor that is cheaper because they cut corners on quality and scientific validation—someone who's offering slapdash preservation that costs less because it sacrifices quality standards in favor of snazzy advertising and fast talk. I hope that rigorous standards applied across the cryonics industry mean that the whole field can be our allies, pushing us to offer consumers a better, cheaper product.

Like any scientist working to expand the boundaries of their field, I'm constantly indebted to those who came before me, and I rely on the hard-won metis of cryonics veterans. I'm glad I don't need to discover the pitfalls of ongoing family payments for myself, and that I can imitate Alcor's holding company setup wholesale. Every time I talk to Mike Darwin about what he's seen over the years, I learn something new about how to run Nectome.

You're part of our community, too. Someone who's preserved poorly can't call up the Better Business Bureau and complain, so they need you to keep the field honest. Cryonics consumers should demand to see randomized samples at sufficient resolution to see synapses, prepared in animal models representative of the kinds of preservation clients actually receive, like Andrew Critch did. As cryonics becomes more mainstream, insist on good third-party regulation. You are entitled to receive the lifesaving services you pay for, not an inferior substitute.

In short, we're not going alone, and we don’t intend to. We’re passionate about making sure preservation succeeds for us, too, and our friends and family. We’re radical about demanding external validation to calibrate our optimism.

My question to you: Check out https://www.brainpreservation.org/accreditation/, the BPF's page describing the accreditation program it's building. Do you find the steps they're taking convincing? Sufficiently rigorous? Give us feedback please, either in the comments or to Ken Hayworth ([email protected]) directly.

Another question for you: If you were starting a preservation company and were worried about keeping yourself honest, how would you set it up? What sorts of advisors would you want to have?

Proactive about laws and culture

What if they make preservation illegal in our jurisdiction? What if we get sued? What if we never get any kind of scale?

Preservation can seem weird to people right now. It occupies a strange and illegible position adjacent to the medical landscape, leaving its practitioners vulnerable to legal attacks and social opprobrium. I won the Brain Preservation Prize for large mammals in 2018, but it took almost another decade of innovation to devise a method legal to use on humans.

Nectome's preservation has a lot of social and logistical advantages over previous cryonics approaches. Our cases are planned, opt-in procedures; we significantly reduce the last-minute chaos of cross-country phone calls, relatives questioning the patient's wishes, hospitals attempting to restrict access. Because we're physically near all our clients as legal death occurs, we have a good opportunity to establish clear consent, and can avoid ever being in a position where we have to disinter someone's remains.[4] Relatives have a chance to say farewell to their loved ones, and we can speak with them in advance about what to expect from the preservation process.

Another social advantage we can offer is compatibility with ordinary funerals. When I've worked with donated human cadavers, the results have been something I'd be happy to show to their families: the surgical incisions are easily covered by clothing and other techniques, and the person can rest peacefully at room temperature for weeks without issues. People I speak to feel very positively about this, and I'm hopeful that it lets us spend fewer weirdness points. I'd like to smoothly integrate with the existing funeral industry, just like with the medical and legal systems.

We've taken care to operate within a convenient and sensible legal jurisdiction. This is why we're based in Oregon, even though I anticipate many of our early clients will be from California. Oregon's medical aid-in-dying (MAiD) law is the oldest in the US and enjoys a strong local base of support.

One way a preservation may fail after the fact could be if the preserved body is autopsied, which typically destroys the brain. Our model is protective here: when someone uses MAiD, their death is declared by their attending physician, and their underlying terminal illness is listed as the cause. The legal system considers their death to be a natural one, and the medical examiner has no interest in investigating further because the person’s death has already been documented and was expected; generally they're interested in unexpected / exceptional deaths.

Even with all of this, I'm aware that we face a great deal of uncharted legal territory, especially as we hope to expand beyond the relatively small scope of prior cryonics organizations. I consider it our job at Nectome to map out that territory, and this is why we're building our own regulatory framework to fill a legal void. Right now we're working under scientific research laws, but one of the goals at the top of my long-term list is carving out a new, proper legal niche for preservation.

I take inspiration here from birth doulas, who have historically operated in a similarly underregulated area. Doulas built their own regulatory standards and agencies, and many states have chosen to simply legitimize those agencies, or to adopt standards heavily informed by theirs. For instance, in Oregon, doulas may complete one of eight approved training programs in order to receive Medicaid reimbursement. I imagine a world where Medicaid offers to reimburse preservations certified by the BPF, with preserved people considered to be in a "chemically induced long-term coma" instead of classified as scientific research samples.

My question to you: Imagine you're running Nectome, and you're launching your post-mortem preservation program. What are you most worried about in terms of social and legal issues, and what would you do to address them early on?

S-risks

What if it feels like something while being preserved? What if the world changes and the future wants to revive and hurt preserved people? What if society collapses? These are all questions I've heard, here and elsewhere.

With regard to what it feels like to be preserved, I can say with high confidence that it feels the same as DHCA (deep hypothermic circulatory arrest): nothing at all. You need action potentials to think, and they're not happening during preservation because of the dual effects of cold and crosslinks.

On the other hand, I can't refute out of hand the risk of the future being very unpleasant, whether because of takeover by a hostile AI or some other mechanism. The best solution I've come up with to help mitigate this risk is to very carefully record the preferences of everyone we preserve and offer to cremate them if we anticipate that the chain of custody is likely to become compromised. In my experience, around half of people say they would like to be cremated if we are going to lose chain of custody[5], or under some other condition, and the other half want to be preserved at all costs no matter what.

If it seems like things are deteriorating, and despite our best efforts we will lose control of the people we preserve, it may be that the last act of Nectome is to bury half of them in permafrost (according to their wishes) in an undisclosed location in hopes that someone will care later, and cremate the other half as they requested. It's not something I'd ever want to do, but if I can create the safety for people to choose preservation today by promising to maybe cremate them in the future, I think it's the right thing to do. I hope this means that someone preserved by Nectome is only subject to the same ordinary danger of very sudden S-risks that you and I are subject to today.

My question to you: Can you think of another way to mitigate S-risks for people being preserved today? Under what conditions would you like to be cremated after you were preserved?  

Towards the future, with optimism and care

There is some faith in an organization that only comes with a proven track record of longevity. At the same time, rationalists can do better than reference-class forecasting. What Nectome can offer you, today, is an organization built on the wisdom of previous cryo attempts, and a set of unprecedented advantages against a variety of the most-likely failure modes on our list. We're also thinking about long-tail risks and how we can address them.

One of the most important things at this tender stage is that we're trying to become resilient at scale. There's a lot that can kill a small company that a large company can weather with ease. We're working daily to reach more people. Pre-sales are doubly valuable to us in this project: they contribute directly to our bottom line at a critical time in our company’s development, and they help us secure more investment.

I've asked a lot of questions here, and I really do care about your answers. My commitment to you is that I'll read and respond to every comment posted here in the next two weeks, within three days, barring anything super unexpected on my end. 

There's no better time to influence how Nectome implements its preservation program, and I'm really looking forward to hearing your thoughts. Let's make this a beautiful community effort.

  1. ^

    We can still vitrify, but we choose not to by default: it's not really necessary, skipping it is cheaper, and failing warm is much safer at our default temperature.

  2. ^

    Incidentally, this tolerance for periods at room temperature is also why we're compatible with ordinary funerals, unlike previous methods.

  3. ^

    For those who prefer this. Some people want to be cremated if we can't maintain control of them and enforce restrictions on how they're revived; others might prefer a transfer of custody to a family member or another cryonics organization. We discuss this with clients individually as part of the pre-preservation process.

  4. ^

    To be clear, I think Alcor acted heroically here to defend the interests of their clients. It's just an ugly and contentious situation that I'm glad to avoid.

  5. ^

    We're talking about things that severely disrupt the chain of custody, like a new law that confiscates Nectome's preserved people, not things that simply change custody in a sensible manner, like Nectome and another company merging and changing names, with the merged entity pledging to respect the wishes of the preserved people.




The median take is taken

2026-04-11 08:42:28

Occasionally a friend will say something really stupid to me. (Obviously if you’re reading this and you are my friend, I’m talking about someone else.)

If it’s something that makes me raise an eyebrow, I’ll usually ask Claude about it, especially if it’s trivially checkable.

USER: A friend of mine says that “UK economic indicators are looking broadly healthy.” Is that true?

CLAUDE: I’d push back on that. The picture is more accurately described as weak with mounting risks.

Growth is anaemic — the economy grew just 0.1% in Q4 2025, and 2026 forecasts range from 1.1% to 1.4%. Unemployment is the clearest negative signal: it’s hit 5.2%, up from 4.4% a year ago, with youth unemployment at 16%. Inflation has come down but remains at 3.0%, above target, and a Middle East energy shock is now pushing petrol prices up and making further rate cuts unlikely — rate hikes are even being priced in. Consumer confidence sits at -19.

Your friend is either working from outdated priors or defining “broadly healthy” so loosely it’s unfalsifiable. The honest read is: an economy muddling through with real vulnerabilities, not a healthy one.

My friend and I had a brief back-and-forth, with his smoothbrained hot takes crumbling in the face of my machine-enhanced gigafacts, until I got bored and dropped it.

There used to be a website called Let Me Google That For You, which let you send a passive-aggressive link to someone that showed the Google homepage and a cursor wandering to the search box and typing in their query – which they could have searched for themselves.

It would be pretty fun if there was a version of that for Claude.


Well now there is! Try Let Me Claude That For You, the fastest way to annoy and alienate your closest friends.

This does present an uncomfortable question: if Claude can distil any legible question into a competent 250-word response in ten seconds, then what is non-fiction blogging actually for?

Specifically: what novel insight are we producing? Why should anyone write a lit-review-as-effortpost, or “here’s what the data says about X”, or a 3000 word synthesis of a subject?


The most obvious case where humans can still add value:

  1. Posing questions that no-one had thought to ask. Why is my MacBook less responsive than my Apple II? What is the mechanism behind civilisation getting so many things predictably wrong? Is most of what we’re told about sleep bullshit?
  2. Naming a pattern that people have only half-noticed. Reality has a surprising amount of detail. Meditations on Moloch. The Gervais Principle.
  3. Not flinching from a premise. We should take wild animal suffering seriously. We should use biotechnology to destroy suffering. Factory farming is the greatest evil on earth.

LLMs are not good at doing these. They probably can’t discover a pattern that people have only half-noticed. While they aren’t trained to be opinionless, they’ll often come up with pretty milquetoast takes, and won’t want to push an argument to its logical conclusion (especially if that conclusion is controversial). They can’t make a boring-sounding question seem urgent, or make salient a pattern you’re not noticing.

That might change as models improve, but this isn’t a capability limit – these traits are baked in during post-training. Models are trained not to take a hard stance on things which are out of distribution, or on which there isn’t broad agreement. You might be able to push LLMs to produce novel framings of existing topics, but it’s still difficult to discover genuinely new questions that people aren’t asking.

Models trained to be agreeable will not e.g. come up with an ontology of corporate drones as sociopaths/clueless/losers. If anything, their post-training cuts the other way. Model makers want their models to be helpful assistants and sand off some of the rougher, weirder edges of their responses.


Let’s have a look at Reality has a surprising amount of detail by John Salvatier.

The setup: he starts writing about building a set of stairs with his dad. He discovered that it’s unexpectedly complicated in a way which you only see if you actually try to do this – the floor isn’t level, the stairs need a particular rise so you don’t fall down them, the screws need to be a particular length so they don’t stab you in the foot.

The pivot: reality is full of complexity which only becomes apparent when you slam your plan into the world. It’s hard to model ahead of time. This is the reason why projects often overrun.

It sticks in your head because the title is the whole thing. Once you’ve read it, you can simply use that sequence of words to retrieve the idea, and to cause the person to whom you’re speaking to retrieve it. You could ask Claude about the topic if you thought of it, but you probably didn’t, and its reply wouldn’t be nearly so salient. It helps that Salvatier actually had to build the stairs.


We could distil the above categories into something like:

  1. noticing a question
  2. naming a pattern
  3. refusing to flinch from a premise

There are perfectly good reasons to want to write posts that aren’t just “they will provide insights to people for many years to come”. But if you do want to write posts which are lindy – which will outlast the latest fads or blogging cycles and produce novel and interesting insights which can’t just be gotten by asking Claude – you should aim at one of the above.

Even better – and this is where the idea of craft comes in, another thing that LLMs struggle with – they should be written to really stick in someone's head. In particular, the most important thing to aim for is compressibility. It should be easy for your reader to compress the central premise into a load-bearing phrase. Moloch. Civilisational inadequacy. Sociopaths, clueless, losers. Reality has a surprising amount of detail.

These become handles by which your readers can refer to your idea – both a conceptual thing they can use to retrieve it in their minds, and for them to induce others who have read your work to retrieve it too. And for people who haven’t read your work, it serves as a way for them to make their way back to it.

So don’t write Claude-shaped posts. Give your readers a load-bearing phrase, ideally in the title. And do the work Claude can’t. Claude aims for the middle of the distribution, for centre mass – your job is to go for headshots[1].

  1. ^

    Thanks to Alexander Wales for this excellent phrase.




If Mythos actually made Anthropic employees 4x more productive, I would radically shorten my timelines

2026-04-11 08:38:52

Anthropic's system card for Mythos Preview says:

It's unclear how we should interpret this. What do they mean by productivity uplift? To what extent is Anthropic's institutional view that the uplift is 4x? (Like, what do they mean by "We take this seriously and it is consistent with our own internal experience of the model.")

One straightforward interpretation is: AI systems improve the productivity of Anthropic so much that Anthropic would be indifferent between the current situation and a situation where all of their technical employees magically work 4 hours for every 1 hour (at equal productivity without burnout) but they get zero AI assistance. In other words, AI assistance is as useful as having their employees operate at 4x faster speeds for all activities (meetings, coding, thinking, writing, etc.) I'll call this "4x serial labor acceleration" [1] (see here for more discussion of this idea [2]).

I currently think it's very unlikely that Anthropic's AIs are yielding 4x serial labor acceleration, but if I did come to believe it was true, I would update towards radically shorter timelines. (I tentatively think my median to Automated Coder would go from 4 years from now to maybe 1.3 years from now; my median to AI R&D parity would go from 5 years from now to maybe 2.5 years from now.) My best guess is that 4x serial labor acceleration would cause AI progress to go 1.75x faster (see "Appendix: Estimating AI progress speed up from serial labor acceleration") which is very large and close to the 2x "dramatic acceleration" threshold Anthropic is using for "Autonomy threat model 2: risks from automated R&D". [3]

My current best (low confidence, low precision) guess for the serial labor acceleration is ~1.55x (with a higher serial labor acceleration of ~1.75x for just research engineering activities). I currently think that reasonably informed Anthropic employees who have thought about this topic in a decent amount of detail believe the serial labor acceleration is closer to 1.5x than to 4x.

I think uplift metrics like "serial labor acceleration" at AI companies are some of the most relevant metrics to track when trying to figure out how close we are to key risk-relevant milestones in AI development like full automation of AI R&D. I also think uplift metrics are among the most relevant metrics for Anthropic's "Autonomy threat model 2: risks from automated R&D". I also think accurately capturing the views of employees, managers, and leadership at AI companies (probably with something like a survey) is currently one of our best ways of assessing serial labor acceleration (or other uplift metrics), especially for AI systems that aren't publicly deployed.

Thus, I'm pretty unhappy about a situation in which:

  • Anthropic seemingly claims they are getting 4x productivity uplift, but it's publicly unclear what they mean by this or how much they believe this.
  • There is virtually no public information about the details of the survey or how seriously this was done.
  • Their statements are consistent with a very radical situation.
  • This is approximately the only direct public evidence we have about the level of acceleration (as in, evidence that doesn't require doing some kind of extrapolation from our views about prior AIs).
  • This is all happening for an AI that's not publicly released, is not going to be publicly released, and appears to be much better than the publicly available frontier (making extrapolation harder).

Some things that would improve the situation in future system cards / risk reports:

  • Generally saying more about the exact details of the survey. For instance: What was the question? How long did people spend answering?
  • Insofar as Anthropic doesn't think this survey is very meaningful or thinks this was a very low effort survey, say so. Alternatively, if they don't think this sort of survey sheds much light, it would be reasonable to not include this in system cards going forward.
  • Clarifying their institutional view and the view of key individuals with relatively precise operationalizations. Like, what does Jared Kaplan think the level of serial labor acceleration is? (Or whatever operationalization Anthropic would like to use.)
    • Ideally, they would also explain why they think this even if the evidence is relatively illegible or some of it needs to be redacted.
  • Insofar as Anthropic thinks the survey is mostly capturing vibes in a way that doesn't depend much on the operationalization in the survey, this seems important to note.
  • Ideally some third party would instead do surveys/interviews of employees, managers, and/or leadership and these results would be reported. It seems like Anthropic isn't that interested in doing carefully done surveys (fair enough!) and it would be useful to standardize this across AI companies.
  • I do think it would be possible for Anthropic to collect information or do surveys that do shed light on this question. I'd be most interested in a survey of the views of employees for whom we're very confident that employee has a detailed understanding of exactly what is meant by different uplift notions and is interested in AI forecasting. (I'm much more interested in capturing the views of employees who have thought a decent amount about this than capturing an unbiased sample.) It would also be possible to do more qualitative and quantitative data collection from various employees and then convert this into acceleration estimates taking into account things like people using AIs to do low value tasks they wouldn't have done otherwise.

If there is a large disagreement about the current level of uplift, this seems like a particularly tractable empirical crux: I would substantially shorten my timelines if I learned the uplift was much higher than I expect, and I'd guess some people at Anthropic would lengthen theirs if they learned it was significantly lower than they expect. I also expect that various people who are much more skeptical than me of reaching very high levels of AI capability within the next 10 years would update some on credible internal uplift measurements. Getting better empirical information about the level of uplift seems hard but doable.

Additionally, Anthropic claims "We estimate that reaching 2× on overall progress via this channel would require uplift roughly an order of magnitude larger than what we observe." Insofar as "productivity uplift" is supposed to correspond to something like serial labor acceleration, I'm very skeptical. I think ~40x serial labor acceleration would yield much more than 2x faster progress. My guess (see "Appendix: Estimating AI progress speed up from serial labor acceleration") is that you'd get 2x overall AI progress at around 5x serial labor acceleration. My understanding is that the AI Futures Project timelines model would indicate that around 8x serial labor acceleration is required. It seems that Anthropic might have their own takeoff speeds / timelines model that differs substantially from current public modeling, produces much less conservative conclusions about the level of concern, and that they are using for decision making. If so, I think they should either publicly write up their modeling (informally would be fine) or get third parties to review it privately. Insofar as they mean "we think we'll maybe reach 2x overall progress when our survey—that's mostly capturing vibes and doesn't have a clear correspondence to any particular notion of uplift—reaches 40x", fair enough, but it seems good to clarify this.

The current state of our evidence about AI R&D acceleration from Mythos seems extremely limited and AI companies should (and can) do much better going forward. [4]

Appendix: Estimating AI progress speed up from serial labor acceleration

  • Suppose we had a serial labor acceleration of X (as in, employees go X times faster) and also increased experiment compute by X. Then, AI R&D progress would go X times faster.
    • I mean instantaneous progress, putting aside diminishing returns to research effort. Equivalently, the "research effort per unit time" would go up by X.
    • This is also putting aside parallel compute being worse than serially faster compute, though I think this doesn't make a huge difference in practice.
  • So, production is some function of serial labor acceleration and experiment compute. We're uncertain about the exact functional form, which lies somewhere between a CES model and a Cobb-Douglas production function. I happen to think it's more like Cobb-Douglas than CES for reasons I discuss here.
  • I tend to think the functional form for just AI R&D progress (like algorithmic progress) is like serial_labor_acceleration^0.55 * compute^0.45.
    • It might be pretty different as you start growing these values by orders of magnitude (especially if it's very CES-like), but at least if we're talking about <30x increases to these variables, I think it's something like this.
    • I'm uncertain about the exact constants; serial_labor_acceleration^0.7 * compute^0.3 and serial_labor_acceleration^0.3 * compute^0.7 are somewhat plausible and make a pretty big difference to the bottom line.
  • So, if you get a serial labor acceleration of 4x, I think this increases AI R&D progress by ~2.15x.
  • AI R&D is only a subset of AI progress; some of the AI progress is driven by scaling up compute for training runs. I tend to think that ~2/3 of AI progress is algorithms while ~1/3 is from scaling up compute for training runs. This means you get only 2/3 * 2.15 + 1/3 ≈ 1.75x AI progress increase from 4x serial labor acceleration.
  • To get a 2x increase in the rate of AI progress (assuming these constants), we'd need ~5.3x serial labor acceleration.

This model is basically a simplified version of the AI Futures Project model with somewhat different constants.
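As a sanity check on the arithmetic above, here is a minimal sketch of the model. The exponents, the 2/3 algorithms share, and the function names are this appendix's assumptions, not anything from Anthropic:

```python
def ai_rnd_speedup(serial_labor: float, compute: float = 1.0) -> float:
    """Cobb-Douglas-style production for AI R&D progress (assumed exponents)."""
    return serial_labor ** 0.55 * compute ** 0.45

def overall_progress_speedup(serial_labor: float, compute: float = 1.0,
                             algo_share: float = 2 / 3) -> float:
    """Overall AI progress: the algorithmic share is accelerated by AI R&D,
    the compute-scaling share (1 - algo_share) is not."""
    return algo_share * ai_rnd_speedup(serial_labor, compute) + (1 - algo_share)

# 4x serial labor acceleration with experiment compute held fixed:
# AI R&D goes 4**0.55 ~ 2.15x faster; overall progress goes ~1.76x faster.
print(overall_progress_speedup(4.0))

# Serial labor acceleration needed for 2x overall progress:
# solve (2/3) * s**0.55 + 1/3 = 2, i.e. s = 2.5**(1/0.55) ~ 5.3
print(2.5 ** (1 / 0.55))
```

Swapping in the 0.7/0.3 or 0.3/0.7 exponents mentioned above moves these outputs substantially, which is exactly the uncertainty about constants flagged in the bullets.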

Appendix: Different notions of uplift

There are several different concepts that could be meant by "productivity uplift", and which one we're talking about makes a huge difference:

  • Serial labor acceleration: Suppose you could speed up everyone at the company by X but had to use no AI assistance (or only 2020 AIs) in your work. For what X would you be indifferent? (Just taking into account productivity, ignoring safety.)
  • Parallel labor acceleration: Suppose you could magically grow the company by a factor of X, where the new people would have a similar distribution of skills and knowledge to the current people (including knowledge about the company, etc.), but had no AI assistance. For what X would you be indifferent?
  • Current work acceleration: If the company did all the same work it did last week, but had no AI assistance, how much slower would it have been?
  • Fraction of work done by AIs: There are different ways to operationalize this and it's a bit confusing because humans might be spending a bunch of time gaining context so they can (e.g.) tell AIs what to do, and it's unclear what fraction of the work to count this as. Things like fraction of lines written by AI would be an example of this and it seems hard to convert this number into a guess at serial labor acceleration (or other notions we might care about).

A parallel labor acceleration of X is much less useful than a serial labor acceleration of X. And depending on the operationalization, the AIs doing 90% of the work is way less useful than a 10x serial labor acceleration. So the choice of concept matters a lot for interpreting any claimed level of uplift.
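To make the last point concrete: under a simple Amdahl's-law-style toy model (my own simplification, not anything from the system card), if AIs handle a fraction f of the serial work at s times human speed while humans do the rest at 1x, the overall speedup is 1 / ((1 - f) + f / s):

```python
def speedup_from_automation(frac_automated: float, ai_speed: float) -> float:
    """Amdahl's-law-style toy model: AIs do `frac_automated` of the serial
    work at `ai_speed` times human speed; humans do the rest at 1x."""
    return 1.0 / ((1.0 - frac_automated) + frac_automated / ai_speed)

# Even if AIs do 90% of the work infinitely fast, the remaining human 10%
# caps the overall speedup at 10x; at finite AI speeds it's much less.
print(speedup_from_automation(0.9, float("inf")))  # 10x cap
print(speedup_from_automation(0.9, 5.0))           # ~3.6x
```

So "AIs write 90% of our code" is compatible with far less than a 10x serial labor acceleration, which is why pinning down the operationalization matters.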

  1. I'm using a specific name to distinguish from other things we might call "4x productivity uplift" like "if the median employee had to do the tasks they are currently doing without the use of AI, they would be 4x slower". These notions have strongly different implications as I discuss here and in Appendix: Different notions of uplift. ↩︎

  2. For reference, the speed up modeling I do in that post is out of date with my latest thinking. ↩︎

  3. The update toward shorter timelines is almost entirely from thinking we're further along in the capability progress than I previously realized, rather than from thinking progress will be faster but we're starting from a similar point. As in, I both update towards getting more acceleration at a lower level of capability and towards models being closer in capability space towards various high milestones, and I'm mostly updating timelines based on the second of these. ↩︎

  4. I think there are also some other issues in the system card's assessment of AI R&D acceleration. They seemingly argue that even if Mythos was substantially above trend due to AI acceleration, because this acceleration was done by earlier (less capable!) AIs, this would imply this Mythos-caused acceleration wouldn't be that high: "This means that even if the slope change were AI-attributable, the model it would implicate is not the one we are assessing." This seems backward: if less capable AIs yield a large acceleration, then we should expect the effect from more capable AIs to be even larger. To be clear, this seems like a minor/moderate issue, I just thought it was worth mentioning. ↩︎




Biological Computing Underhang

2026-04-11 04:49:52

Human cortex can't represent arbitrarily complex abstractions in a single forward pass. It's depth-limited by the number of sequential inferential steps it can execute per corticothalamic cycle. That ceiling determines what kinds of reasoning are biologically possible at all, not merely how fast they happen.

A single cortical area can be simulated by ~14 ReLU transforms[1] at ~4ms/pass. GPT-3 has 192 of these layers and can do a forward pass in <2ms. With sufficient data[2], you can train a model of this depth for every microcircuit[3] (cortical upper-bound estimate).

So human cortex, given identical inter-region scaffolding and reward signals[4], should be strictly computationally dominated by larger ANNs, even given arbitrary developmental time.
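Plugging in the figures above (all of which are the rough, upper-bound estimates from the text, not measurements), the serial-depth gap per unit time looks like this:

```python
# Serial ReLU-equivalent depth per unit time, using the estimates above.
CORTICAL_AREA_DEPTH = 14   # ReLU-equivalent layers per cortical area pass
CORTICAL_AREA_MS = 4.0     # ms per area-level pass
GPT3_DEPTH = 192           # layers, as counted in the text
GPT3_MS = 2.0              # forward-pass time upper bound, ms

human_depth_per_ms = CORTICAL_AREA_DEPTH / CORTICAL_AREA_MS  # 3.5
gpt3_depth_per_ms = GPT3_DEPTH / GPT3_MS                     # 96.0

print(gpt3_depth_per_ms / human_depth_per_ms)  # ~27x more serial depth per ms
```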

Beyond this, some otherwise embeddable concepts are inacquirable outside of critical learning periods (see perfect pitch[5]). Synapses are physically encaged during maturation, especially in lower sensory/perceptual[6] areas, and uncaging them with psychedelics[7] indeed reopens critical windows.

An adult acquiring a genuinely new conceptual domain, i.e. real new primitives, cannot embed those primitives in early sensory cortex the way childhood experience shapes visual processing.

****

Say you're a young child who really loves plant aesthetics. You're so often attending to leaf and branch geometries that concepts like phyllotaxis become primitives. Later, as an adult, you find it really easy to understand network structures.

Or, more relevant to BCIs for alignment, suppose you're a very smart toddler learning about a linear algebra concept. The relevant representations are worn into early visual cortex; as an adult, you easily compose much more complex representations if they're well-grounded by the LA primitive.

Contrast with using naive adult infra for the same struct. It may introspectively feel similar[8] to holding a new primitive, but takes far more computation and is less composable.

Reasoning with primitives feels like, in our example, rotating a solid object and intuitively feeling that its shape doesn't change; therefore, even though you didn't[9] read about Liouville's Theorem as a kid, you can intuitively feel volume conservation and extend that to nonstandard geometries.

But that horizontal leap took extra effort. You may have already embedded an autonomic circuit which runs through this process so fluidly that it feels as clear as seeing blue in an open sky, but unless you did so as a young child, you're going to be less efficient at stacking further representations than with true primitives.

So, then, if serial depth is limiting human cognition, why didn't we evolve more compute-efficient software? "Add six extra cortical neuron layers" sounds like a genetically simple, non-personality-altering change, and brains' heaviest energy expenses are interconnect, not soma. - hypothetical reader

I don't know. All mammals I've studied have six cortical layers. Maybe jitter means 6 layers is Pareto-optimal in general, and humans are smarter because we use heavier inter-region stacking? Perhaps error propagation is harder in biological networks[10]?

In any case, provided you can get useful error signals to condition a less-lossy SGD model with, you can expand into more effective cortical columns while using human meta-learning to direct the model much better than training an LLM.

  1. Multiple local passes usually chain in series. One cortical area's internal pass: L4 input (~3 equivalent DNN layers) + L2/3 (~3 layers) + L5 thick-tufted pyramidal (~8 layers, see Beniaguev et al. 2021) = ~14 equivalent layers per area. This can be substantially larger (60-140) given intralayer feedback (e.g. L5→L5); however, such feedback takes so long (up to 30ms) that you're limiting lateral association. Also, this is more of an RNN than a feedforward MLP; more metabolically & data efficient but repeated autoassociation isn't as expressive as "multiply depth bypasses" would suggest. ↩︎

  2. I will cover this in the next post. You can train anything with semantically coherent I/O, sufficient compute/data, and a large enough network. Cortex has an abundance of time-coded voltages which we can convert into embeddings. Differentiating these signals into something which would robustly inform SGD means extracting local updates from tissue. ↩︎

  3. Signals sweep serially through multiple cortical areas, so a full macroscopic pass in humans can be up to 140 ReLU-transform equivalents, which isn't far from GPT-3's 192. I'm claiming that we can stack an additional, say, 96 ReLUs for each microcircuit, using the brain as a meta-learning signal and router. ↩︎

  4. I'm hand-waving at the macroarchitectural implementation of human continual/meta-learning here, and I don't claim to know what it actually is. BCIs can use enhanced ~gradients without knowing the process which enhanced them. ↩︎

  5. This paper says otherwise, but is wrong in the sense which I care about because relative pitch is learnable in the same way I've exposed so far. ↩︎

  6. Primary visual cortex consolidates around age 5 (peak plasticity 0-3, closure ~6-8); auditory cortex slightly later; language areas in early adolescence (Hartshorne et al. 2018: sharp decline after ~17); prefrontal regions not until the mid-twenties. See Hensch 2005 for review. ↩︎

  7. Critical periods can be pharmacologically reopened by destabilizing the extracellular matrix via TrkB/BDNF (see Moliner et al. 2023), the same mechanism behind moderate-dose psychedelic effects on learning. TrkB activation in parvalbumin interneurons transiently dissolves perineuronal nets, which reopens critical periods in rodents. The antidepressant effects of psychedelics are pretty tightly linked to TrkB and separate from hallucinations; 5HT-2a laterally disinhibits cortex, the knockout of which prevents hallucinations in mice. The head twitch suppression effects of 5HT-2a KO in mice (via volinanserin) correspond to hallucinatory suppression of psychedelics in me, an alleged human. Whether chronically pegging TrkB in adults would eventually downregulate plasticity via eg downstream MAPK-ERK shifts -> ECM tightening is unclear. I find it unlikely given the long-term ECM softening seen with fluoxetine (Castrén lab, Science 2008). ↩︎

  8. ^

    I think it's hard to reflect on one's internal composition of primitives because the macroscopic inter-region computation which reflectivity (and related architectural prerequisites of human meta-learning) operate at is the inter-region / broadcast level. You can't look directly at the subroutines of "blueness" because it's connectomically inaccessible. I'll cover this in a post about alignment of superintelligent humans.

  9. That is, unless you, like Terence Tao and John von Neumann, learned applicable constructs as a child. ↩︎

  10. From my simple simulations with predictive coding networks at varying depths & signal-to-noise ratios, it seems that canonical PC is not robust to settling-phase noise. In fact, noise is so disruptive that I'm doubting whether settling-based PC is biologically plausible. Alternatives like Saponati and Vinck's rule might be stabler; if you know of other examples, please comment with even a quick description. ↩︎
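For readers who want to poke at this themselves, here is a minimal 1-D caricature of the settling-noise issue (my illustrative toy, not the simulations referenced above): a single linear predictive coding unit settles its latent x to explain an observation y = w*x, and injecting Gaussian noise into each settling step leaves it far from the fixed point.

```python
import random

def settle(y, w, steps=1000, lr=0.05, noise_sd=0.0, rng=None):
    """Settle the latent x of a 1-D linear predictive coding unit by
    gradient descent on the squared prediction error e = y - w*x,
    optionally injecting Gaussian noise into every settling step."""
    x = 0.0
    for _ in range(steps):
        e = y - w * x          # prediction error
        x += lr * w * e        # settling update
        if noise_sd > 0.0:
            x += rng.gauss(0.0, noise_sd)  # settling-phase noise
    return x

y, w = 2.0, 0.5
x_clean = settle(y, w)                                    # converges to y/w = 4.0
x_noisy = settle(y, w, noise_sd=0.3, rng=random.Random(0))

print(abs(y - w * x_clean))   # essentially zero residual error
print(abs(y - w * x_noisy))   # stuck well away from the fixed point
```

A deep network compounds this per-unit jitter across layers, which is the depth/SNR interaction the footnote gestures at.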


