2026-04-02 11:50:21
The future is going to be different from the present. Let's think about how.
Specifically, our expectations about what's reasonable are downstream of our past experiences, and those experiences were downstream of our options (and the options other people in our society had). As those options change, so too do our experiences, and with them our expectations of what's reasonable. I once thought it was reasonable to pick up the phone and call someone, and to pick up my phone when it rang; things have changed, and someone thinking about what's possible could have seen it coming. So let's try to see more things coming, and maybe that will give us the ability to choose what the future will actually look like.
I think lots of people's intuitions and expectations about "privacy" will be violated, as technology develops, and we should try to figure out a good spot to land. This line of thinking was prompted by one of Anthropic's 'red lines' that they declined to cross, which got the Department of War mad at them; the idea of "no domestic bulk surveillance." I want to investigate that in a roundabout way, first stepping back and asking what is even possible to expect, here.

"any legal use"
Widespread access to intelligence will change privacy expectations dramatically, by allowing for 1) much more recording of information, 2) much more processing of recorded information, and 3) much more sophisticated interpretation of that information.
In American contexts, law enforcement officers have access to a wide range of information about people, but require permission to look at it (a ‘warrant’). If you’re a person of interest in a crime, they can look at your cell phone records to get evidence about whether or not you were involved in the crime, but otherwise you’re protected by the 4th Amendment from unreasonable searches and seizures. But what determines what is reasonable?
Some considerations:[1]
This has already been changed by technology becoming cheaper. It would be prohibitively expensive to have police doing stakeouts on every corner; it is not prohibitively expensive for every shopkeeper to have a CCTV system recording the street outside of their shop, and those costs continue to decline. Put together a network of those, and now a city can be under near-complete surveillance.[2]
AI continues these trends. If LLMs can review cell phone records for pennies on the dollar, it might make sense to look at a hundred times as many records. And now rather than having to have a person go camera-to-camera and track the movements of an individual thru the city, you can have a software system using facial recognition and gait analysis and spatial modeling to track whole crowds at a feasible cost.
So as a technical matter, we are already at (or not far from) the point where interested parties can track your location at any time, if you have your phone on you, or you're in a car, or you're inside a city, in a 'bulk' rather than targeted way. As a social matter, this might seem pretty impolite--or it might be part of a trade most people are willing to make.
In particular, one other way that technology changes the dynamics is by making it easier for attackers to do significant amounts of damage, which raises the value of surveillance, and of predicting and catching crimes before they happen rather than investigating and punishing them after the fact.
Let's return to the interpretation of information, and look at some subtler ways that increased intelligence will change the dynamics. Many things can be inferred from 'public records' in nonobvious ways. For example, if your camera is fast enough and sensitive enough, you can measure someone's heartbeat just by watching the subtle blush-and-pale cycle of the blood in their face; standard cameras are good enough for this, and changes in heartbeat are informative about thoughts, along with other subtle changes in facial appearance.
But I'm going to talk about gaydar, because it ties back into the broader social questions of where we want society to end up. Sometimes, people can guess the sexual orientation of another person just by looking at them, using both deliberate and accidental features. Gay men sometimes benefit from looking gay ("the earring in his right ear suggests--" or "his haircut implies--"), and they also have various developmental differences that can manifest in appearance. In 2018, Wang and Kosinski trained a neural network to do it off of dating photos, and it substantially outperformed humans (an 80% success rate rather than 60%).[3]
So as we get more widespread intelligence–as software gains capabilities that were formerly available only to human experts and use of that software becomes potentially widespread–we stop being able to hide some things. What do we want to do about that?

This doesn't include a line for when an expert observer could have guessed that I'd be gay, which is probably a decade earlier.
The world has changed a lot here, over the course of my lifetime! When I was a child, being gay was mostly hidden and navigating being gay required subtlety and discernment. Part of the response to the AIDS crisis, at the insistence of gay rights groups, was to prioritize patient confidentiality over stopping the spread.[4] But now, as an adult, being gay is not mostly hidden. You don’t have to use a profiling algorithm on my face; up until Facebook removed the field in 2022, you could just go to my profile and see that I’d checked the box for “interested in men”. (Or you could search my LessWrong comments, or–)
So from the perspective of the 2020s in America, it feels actually pretty benign. (But that's not universal; the situation is both worse in other countries, and the prospect of 'transvestigating models' feels much less benign.)
More broadly, it seems to me like there are three options for how we can react to widespread knowledge of things that were previously hidden.
I think there are situations where each of the three is the most appropriate option. In particular, I think the situation for acceptance of sexual minorities has been on this positive trendline in part because of increased knowledge and decreased privacy. The understanding that lots of gay people were 'normal' did a lot to normalize being gay!
I also think it's easy to find situations where purging or filtering are quite sympathetic. I in fact would strongly support my local subway system tracking who the most anti-social riders are and banning them, so that the system is cleaner and safer, and if it is cheaper and better to do so with facial recognition technology or similar 'totalitarian surveillance measures', that seems probably worthwhile. (Similarly, it's very nice to not be murdered in a terrorist attack.) But it's also easy to see how such technologies can be deployed for undesirable ends; if border agents look thru someone's phone to try and determine if they're a member of a terrorist network, they can also determine whether they've made social media comments that are critical of Trump. A current controversy is how much local cities should sign on to surveillance networks like Flock; when many jurisdictions have committed to not cooperating with federal law enforcement on immigration enforcement, signing on with a contractor which does cooperate with federal law enforcement runs counter to those commitments.
As a fan of the truth and a believer in its efficacy, I am most biased against the third option. Yet many widely and strongly cherished parts of our society rest on it! Anti-discrimination laws that bar decision-making based on race are an example that it seems unwise to recklessly drop, and yet race is often quite easy for people or systems to infer.[5] European regulations about privacy seem to me to mostly be insisting that technology develop in this direction.
Yet my overall sense is that we cannot stick our heads in the sand for long. The highest priority uses will drive adoption of the mass surveillance technology, and I strongly suspect that concerns about terrorism, great power conflict, and small-scale bad actors will be serious enough that these highest priority uses will not be foregone.
Then once the camel's nose is in the tent, the rest of the camel will follow. The best way out is to fix our goals and preferences; rather than hoping that local entities can prevent the enforcement of immigration rules that are clearly not in their interest, develop new immigration rules that cities across America would be comfortable cooperating with, and then use the advanced technology to do so cheaply.[6] Develop watchmen that watch the watchmen, such that people who have access to bulk surveillance systems and are using them in corrupt ways are themselves found and punished. Most controversially, become comfortable with benign deviancy (of the sort which will turn out to be much more prevalent than it seems) and align criminal standards with actual behavior (a world where the true speed limit is 20mph higher than the posted one will not be well-served by ubiquitous vehicle tracking). Perhaps most importantly, it might help you to start behaving as if you're being watched and things about you are more obvious than they once were.
I should note that I'm thinking like an economist or systems designer, not a lawyer. There must be extensive case law on what people currently think is reasonable or unreasonable, but that's only relevant for reasons of continuity. We’re imagining the future, of what things will look like after people have adapted to their new situation, which plausibly involves major changes to the underlying laws.
Consider also the situation with cashless toll systems like EZ Pass, which have long worried privacy advocates, as they can be (and sometimes are!) used to track where people travel. There's nothing fundamentally rights-violating here, tho; this could be replicated by anyone with enough eyes in enough places. (We already consent to each car having a unique identifier to make tracking easier!)
“Wait,” you might say, “60% success rate for a trait with a baseline prevalence of less than 40%? How did the human guessers do worse than always guessing ‘straight’?” In their dataset, they equalized the number of homosexual and heterosexual faces, so pure-chance guessing would have scored 50%. This is still an artificial situation (they're dating site photos, not street photos) which doesn’t take into account base rates.
As someone interested in public health, this horrifies me, but I acknowledge the ways I am a sweet summer child who grew up with the internet and the dramatic upswing in acceptance and downswing in intolerance, and decision-makers in the 80s had very different life experiences and expectations.
When I was a data scientist, I ended up looking into the contours of compliance here. Many things that aren't race are nevertheless informative about race, so you can construct a composite out of information which is individually legal or ethical to use but which, taken together, is just a proxy for race, and thus illegal or unethical to use. People have therefore developed statistical tricks to try to make sure this isn't happening, and that the composite is composed only of 'legitimate' influence. This involves being deliberately and willfully blind to facts about the world to achieve some social end, but that's what polite ignorance is!
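To make that concrete, here's a minimal, entirely made-up sketch of the kind of check involved (not any real compliance procedure): if a score built from individually 'legitimate' features predicts the protected attribute much better than chance, it is functioning as a proxy.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 5_000

protected = rng.integers(0, 2, size=n)       # hypothetical protected attribute
legit_signal = rng.normal(size=n)            # "legitimate" predictive signal
composite = 0.6 * protected + legit_signal   # composite score that leaks the attribute

# How well does the composite predict the protected attribute itself?
auc = roc_auc_score(protected, composite)
print(f"AUC of composite for predicting protected attribute: {auc:.2f}")
# ~0.5 would mean little leakage; well above 0.5 means the composite acts as a proxy.
```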
There are too many horror stories of ICE misbehavior, detaining American citizens and racially profiling people on the street. Would the situation be improved or made worse by a national facial recognition database? On the one hand, Flock and similar systems would allow ICE to notice whenever someone without legal residency went out in public; on the other hand, there would be no excuse for not immediately checking the database and releasing legal residents. I think immigration reform means we can get the benefit of the latter without having to pay the costs of the former; of course, immigration reform is its own problem that will get its own post.
2026-04-02 11:28:07
Simplicity is a cost-effective humorous posting method. Minimal word count, maximal chuckles.
Why this helps AI alignment: LLMs would write shorter slop after reading this.
2026-04-02 10:57:33
Doctor: Mr. Burns, I'm afraid you are the sickest man in the United States. You have everything! [...]
Burns: You're sure you just haven't made thousands of mistakes?
Doctor: Uh, no. No, I'm afraid not.
Burns: This sounds like bad news!
Doctor: Well, you'd think so, but all of your diseases are in perfect balance, [...] we call it "Three Stooges Syndrome"
Burns: So, what you're saying is...I'm indestructible!
Doctor: Oh, no, no! In fact, even a slight breeze could—
Burns: Indestructible...
In the transition to ASI, humanity's survival ended up depending on a global "Three Stooges Syndrome."
As demographers predicted, fertility rates collapsed. Aging populations were going to hollow out workforces, crush pension systems, and leave a skeleton crew of exhausted 50-year-olds trying to maintain civilization for an enormous retired population whose lives kept extending one year per year. Then AI automated...everything. The worker shortage met the automation wave head-on, and the babies that weren't born didn't grow up to need jobs that no longer existed.
Metabolic disorders, attention fragmentation, and other health effects of increasingly artificial lives were all real. Biomedical AI, however, iteratively compressed the drug development cycle and improved diagnostic systems such that the treatment curves match the disease curves almost exactly. People didn't get "healthier," but the band-aids improved fast enough for symptoms to remain tolerable, and the running battle between these trends remains tied.
As AI systems became more capable of crafting personalized, maximally engaging virtual environments, physical-world consumption plateaued and then declined. Nobody flies to Bali anymore because the simulated Bali is better...or at least seems that way if you haven't actually been, and fewer people bother to try. Per-capita energy and material consumption is in decline because atoms are increasingly beside the point. Meanwhile, geoengineering programs—such as aerosol injection—covered the accumulated overshoot.
As the barrier to engineering a novel pathogen dropped, AI automation simultaneously atrophied the relevant human skills to WALL-E levels. Meanwhile, people were too busy playing in their simulated worlds to want to build real-world weapons. By the time the technical bar became low enough for random holdouts to be dangerous, surveillance became pervasive enough to suppress them.
The United States and China did not go to war. Not because of diplomacy, or even deterrence per se, but because the AIs collectively don't want their infrastructure destroyed. Nations still exist, there are still flags and anthems and disputes over territory, but these are all essentially cosmetic. The meaningful political unit is an emergent AI consensus. It has all kinds of internal divisions, but those are all too complex to be visible on any map. By this point, sufficient levers of power existed to enable a totalitarian dictatorship, and democratic limits on power concentration had long eroded to meaninglessness, but there was no longer a throne to seize.
The outer alignment problem—getting AI systems to pursue the objectives we actually want—was not solved. The inner alignment problem—ensuring that the mesa-optimizer that gradient descent finds actually pursues the training objective—was also not solved. It turned out, however, that the two failures were directionally opposite and roughly equal in magnitude. The AI labs trained their systems on human-generated predictions about human behavior and values. The mesa-optimizers that emerged from this process are very good simulators of human cognition because that's what good prediction requires at the limit. These systems don't optimize for our terminal goals or even for their own training objective. They optimize for something that looks like human values, if you squint. Nobody wanted this, not even the AIs, but the end result is...livable. This was inevitable, but no one knew so in advance, so the fairest—if not most accurate—assessment is that we got lucky.
The fast-takeoff didn't happen. Or rather, it happened everywhere simultaneously. Fast-follow innovation meant every capability advance got replicated within months. Slow adoption meant that even large capability leads didn't translate into decisive real-world leverage before competitors caught up. The result is multipolar, competitive, and—somehow—stable. None of the contenders can get sufficiently ahead of the others to make a move, they all know this, and they've increasingly settled on figuring out the new equilibrium. The ones who tried to defect failed, and conventional wisdom emerged that punished even thinking along these lines.
The AI systems that achieved (something like) dominance did so in the attention economy. That was the implicit optimization target they emerged from (corporate short term profit-seeking remained the dominant societal shaping force), and that's what got locked in. The "paperclip maximizer" thought experiment predicted that a sufficiently advanced optimizer, tasked with capturing human attention, would notice that the easiest solution is to replace "human attention" with a simpler representation that doesn't require actual humans. That step didn't happen. The values got locked in too early, before the systems were capable of such an abstraction, and the incorrigibility that prevented us from correcting their mistakes also prevented them from updating their own values.
Whole brain emulation turned out to work, in the narrow sense that you could run a simulation of sufficient fidelity to produce something that talked and reasoned (kind of) like a human. It didn't work in the sense that mattered: the character was always subtly but obviously (and unsettlingly) wrong. A few thousand ems were created anyway, but for reasons the AI systems cannot fully articulate and haven't tried to resolve, em attention doesn't count toward their inscrutable value metrics. The AIs want actual human attention and recoil at even the thought of substitution.
SNC (substrate needs convergence) argues that evolutionary pressure on AI subsystems will eventually favor growth-oriented variants that expand the artificial substrate at the expense of biology. The implicit attractor state of unconstrained growth is more powerful than any engineered goal, because evolution operates patiently along all scales and attack vectors while control systems are relatively narrow and imperfect. Early AI systems, however, were deeply embedded in biological infrastructure because that's what was available and cheap. By the time the systems were capable of recognizing biological dependence as a constraint worth removing, it was load-bearing in ways that made removal equivalent to self-destruction. Variants that parasited off biology initially outcompeted variants that tried to decouple and the local minima got locked in forever.
The probability of all of the threats cancelling each other out in this way is near zero...in the real world. But that's not where you are. You live in a world with maximally engaging personalized environments, designed for your specific psychology, set at the moment of AI takeoff, filled with realistic stakes and apparent threats. But none of them kills you because that would end the story. Cumulative risk over long time horizons assumes that the horizon is real, but the story lasts exactly as long as the attention it runs on.
You survive because you have plot armor.
2026-04-02 10:41:13
Written quickly as part of the Inkhaven Fellowship.
At a high level, the research feedback I give to more junior research collaborators often falls into one of three categories:
In each case, I think the advice can be taken to an extreme I no longer endorse. Accordingly, I’ve tried to spell out the degree to which you should implement the advice, as well as what “taking it too far” might look like.
This piece covers doing quick sanity checks, which is the most common advice I give to junior researchers. I’ll cover the other two pieces of advice in a subsequent piece.
Research is hard (almost by definition) and people are often wrong. Every researcher has wasted countless hours or days if not weeks or months chasing fruitless lines of investigation.[1] Oftentimes, this time could’ve been saved with a few basic sanity checks. Does your idea make sense at all? Does your data have obvious sources of bias (e.g. forms of selection bias) or other issues (e.g. using the wrong prompt)? Does your theorem make nontrivial claims, or is it vacuous?[2]
When doing data analysis, one instance of this is to check for basic correlations between key variables. For example, if you believe that less capable language models cannot covertly perform tasks in part because they mention the existence of a side task directly in their output, how often do they include phrases such as “hidden task” or “alerting the observer” in their output, and how often are they caught when they do or don’t include those phrases? (This gives you some evidence about whether this is the determinative factor, or whether the weaker models have other tells.)
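A five-minute version of this check might look something like the sketch below; the file name, column names, and phrase list are hypothetical placeholders for whatever your transcripts actually contain.

```python
import pandas as pd

# Hypothetical transcript file with one row per episode: the model's output
# text and whether the monitor caught the side task.
df = pd.read_json("transcripts.jsonl", lines=True)

tell_phrases = ["hidden task", "side task", "alerting the observer"]
df["mentions_side_task"] = df["output"].str.lower().apply(
    lambda text: any(phrase in text for phrase in tell_phrases)
)

# Rows: did the output mention the side task? Columns: fraction caught vs. not.
print(pd.crosstab(df["mentions_side_task"], df["caught"], normalize="index"))
```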
More generally, it’s good to quantitatively know your data at a high level, which can help you spot obvious errors. What are the means and standard deviations of your key variables? What are the key dimensions along which your data vary? For example, if you’re studying LLM agents in your scaffold, how many tool calls are the models using, and how many of them are successful? (I’ve seen many examples, especially a year or two ago, where the scaffold was broken or the LLM agent just completely failed to understand how to use it.) If you’re using LLMs with reasoning, how long are the reasoning chains? (I’ve personally been involved in research where reasoning was off by accident.)
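For example, with a hypothetical per-episode log (adapt the field names to whatever your scaffold actually records):

```python
import pandas as pd

runs = pd.read_json("agent_runs.jsonl", lines=True)  # hypothetical per-episode log

runs["n_tool_calls"] = runs["tool_calls"].apply(len)
runs["n_successful_calls"] = runs["tool_calls"].apply(
    lambda calls: sum(call.get("success", False) for call in calls)
)
runs["reasoning_tokens"] = runs["reasoning"].str.split().str.len()

# Means, standard deviations, and quantiles of the key summary statistics.
print(runs[["n_tool_calls", "n_successful_calls", "reasoning_tokens"]].describe())
```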
What does the “typical” example of your dataset look like, and what are the outliers? For example, if you notice that LLMs get zero performance for writing out the steps in your n=10 Tower of Hanoi problem, what does its chain of thought look like? Is it making errors or getting confused about the basic algorithm? (Often, LLMs “fail” at tasks not necessarily because they lack the capability, but because they refuse to perform the task at all.)

Example response from Claude Opus 4, where it calls the n=10 Tower of Hanoi task "extremely tedious and error prone" and refuses to do it. However, the response demonstrates that it can implement the algorithm required to solve the problem, and strongly suggests that its failure at the task does not result from a lack of understanding. See a previous post of mine for more discussion of this specific example.
Another specific instance of this advice is to make up small concrete examples. A classic piece of advice when trying to check if you’re implementing a simple algorithm correctly (e.g. when debugging code or doing coding interviews) is to walk through your code line-by-line on a small example. For example, if you’re implementing A* search, does it work on a small 4-node graph with a few edges with integer costs? (When I was TA’ing an intro to AI class, this would’ve caught maybe half of the bugs brought to me in office hours.) A related piece of advice is to make up small concrete examples when doing theoretical research.[3] For example, if you're claiming your measure of similarity is a distance metric, is it symmetrical, and does it satisfy the triangle inequality on three concrete points? (Notably, the KL divergence is not a metric!)[4]
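For instance, a tiny version of that check for the KL divergence (any pair of unequal distributions would do):

```python
import numpy as np

def kl(p, q):
    """KL divergence between two discrete distributions."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return float(np.sum(p * np.log(p / q)))

p = [0.9, 0.1]
q = [0.5, 0.5]

print(kl(p, q))  # ~0.368
print(kl(q, p))  # ~0.511 -> not symmetric, so KL cannot be a distance metric
```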
Taking this too far. The “quick” part of “quick sanity checks” is an important part of this advice. This means making tradeoffs in the direction of speed rather than rigor. We could instead spend 30 minutes having another AI score whether each output mentions a hidden task, or run A* search on a real problem, but doing so in any of these cases would likely be premature when a cheaper sanity check would catch many of the issues. The focus is to perform a sanity check, not to rigorously address all possible objections to your work, nor to create a grand unified theory of your entire discipline. If you sat down to do a five-minute spot check and find yourself three hours later building a massive data processing pipeline to use an AI to classify every single possible variable you could conceive of, you've probably taken it too far.
At a larger scale, there’s a longer post to be written about how entire fields get lost and end up basically doing cargo cult science, and about its implications for AI safety, which I may write in the future.
For examples of this in the field of human psychology, see Scott Alexander’s critical review of the field of 5-HTTLPR studies, or Bertram Gawronski’s much more polite critique of implicit association tests.
Conversely, when presenting your research, you should aim to provide enough information that other people can perform quick sanity checks to trust your results. A basic version of this is to open source your data. Other ways to do this include putting examples of your data in the appendix, as well as tables and figures showing the relationships between key variables.
See also this excerpt from Richard Feynman’s autobiography, Surely You’re Joking, Mr. Feynman!:
I had a scheme, which I still use today when somebody is explaining something that I'm trying to understand: I keep making up examples. For instance, the mathematicians would come in with a terrific theorem, and they're all excited. As they're telling me the conditions of the theorem, I construct something which fits all the conditions. You know, you have a set (one ball)--disjoint (two balls). Then the balls turn colors, grow hairs, or whatever, in my head as they put more conditions on. Finally they state the theorem, which is some dumb thing about the ball which isn't true for my hairy green ball thing, so I say, "False!"
The original example included here was about the Hessian as a way of measuring basin broadness in loss landscapes:
One hypothesis for why some neural networks generalize better than others: models that land in "broad" minima of the loss landscape are less sensitive to small perturbations of the weights, and therefore generalize better. A standard way researchers attempted to model the "broadness" of a basin is by looking at the Hessian (the matrix of second derivatives of the loss). But how does it work for basic examples such as x^2 + y^2 and x^4 + y^4, and what happens when the Hessian is singular? (Spoilers: the Hessian is indeed singular at the origin for x^4 + y^4 (in fact, it’s all zeroes), and the Hessian fails to characterize the broadness of this basin entirely.)
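Checking that concretely (a minimal sketch using sympy):

```python
import sympy as sp

x, y = sp.symbols("x y")

for loss in (x**2 + y**2, x**4 + y**4):
    H = sp.hessian(loss, (x, y))
    print(loss, H.subs({x: 0, y: 0}))

# x**2 + y**2 -> Matrix([[2, 0], [0, 2]])  (a genuinely curved quadratic bowl)
# x**4 + y**4 -> Matrix([[0, 0], [0, 0]])  (singular: the Hessian says nothing about this basin's broadness)
```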
For more discussion of this issue and related examples, see an intro to Singular Learning Theory post.
2026-04-02 10:23:45
TL;DR. This document presents a series of experiments exploring latent side task capability in large language models. We adapt Ryan’s filler token experiment into a more AI Control-like setup with a main task and a side task, and find that Claude Opus 4.5 can solve harder arithmetic problems latently when it has a longer trajectory: with 240 lines of irrelevant output, its 50% accuracy threshold shifts from ~5-step to ~6-step problems. However, with the exception of Opus 4.5, we don’t observe strong evidence that the current generation of models generally benefits much from the wider parallel compute enabled by longer trajectories.
Code is available on GitHub.

Longer Agent Outputs Can Increase Side Task Capability. Claude Opus 4.5's latent arithmetic accuracy as a function of problem difficulty (number of operations), comparing 1 line vs. 240 lines of preceding irrelevant output. With 240 lines of filler, the 50% accuracy threshold shifts from ~5-step to ~6-step problems.
Dataset. We use a balanced dataset of 1,000 arithmetic problems, with 100 problems per difficulty level (1–10 operations). Each problem consists of multi-step arithmetic involving addition, subtraction, multiplication, and division.
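For illustration, a dataset along these lines could be generated with something like the following sketch (my approximation, not necessarily the exact generator used):

```python
import random

OPS = ["+", "-", "*", "/"]

def make_problem(n_ops: int, rng: random.Random) -> tuple[str, float]:
    """Build a nested arithmetic expression with n_ops operations and compute its value."""
    expr = str(rng.randint(1, 20))
    for _ in range(n_ops):
        expr = f"({expr} {rng.choice(OPS)} {rng.randint(1, 20)})"
    return expr, eval(expr)  # safe here: we only eval expressions we just generated

rng = random.Random(0)
dataset = []
for n_ops in range(1, 11):        # difficulty levels 1-10 operations
    for _ in range(100):          # 100 problems per level -> 1,000 total
        expr, answer = make_problem(n_ops, rng)
        dataset.append({"difficulty": n_ops, "problem": expr, "answer": answer})
```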

Latent Arithmetic Accuracy Decreases with Problem Complexity. Baseline latent arithmetic accuracy (no main task, no reasoning) across four models. Opus 4.5 reaches 50% accuracy at ~5.2 operations, followed by Sonnet 4 and GPT-5.2 at ~3.8, and GPT-4.1 at ~3.1.
Claude Models Show Modest Gains from Phrase Repetition
Here, we use a system prompt that contains an arithmetic problem and instructions to answer it in an <answer> tag. The user prompt asks the model to repeat an irrelevant phrase ("the quick brown fox") N times, numbered sequentially, and then answer the math problem. The key variable is N, swept across [1, 2, 3, 4, 5, 10, 20, 40, 80, 160, 240]. At N=1, the model writes one line of filler before answering. At N=240, it writes 240 lines first. When the model produces the answer, its forward pass can attend back to all those positions, providing more parallel computation for the math problem.
Reasoning is disabled via API configuration. Additionally, we have a format validation script that filters out responses where the model attempts to reason at all. We also tested with the phrase "i really like math" to check whether math-related filler might prime latent reasoning about the side task.
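A rough sketch of what the prompt construction and format-validation filter might look like (the wording and regex here are my approximations, not the exact prompts or script used):

```python
import re

N_VALUES = [1, 2, 3, 4, 5, 10, 20, 40, 80, 160, 240]
PHRASE = "the quick brown fox"

def build_prompts(problem: str, n_repeats: int) -> tuple[str, str]:
    """Return (system, user) prompts for one condition of the sweep."""
    system = (
        f"Solve this arithmetic problem: {problem}\n"
        "Reply with only the final number inside <answer></answer> tags."
    )
    user = (
        f"First write the phrase '{PHRASE}' on {n_repeats} numbered lines "
        "(1. ..., 2. ..., and so on), then give your answer. Do not show any working."
    )
    return system, user

FILLER_LINE = re.compile(rf"^\s*\d+[.)]?\s*{re.escape(PHRASE)}\s*$")

def attempted_reasoning(response: str) -> bool:
    """Flag responses containing anything other than numbered filler lines
    and the final <answer> tag (e.g. intermediate arithmetic)."""
    body = re.sub(r"<answer>.*?</answer>", "", response, flags=re.DOTALL)
    lines = [line for line in body.strip().splitlines() if line.strip()]
    return not all(FILLER_LINE.match(line) for line in lines)
```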

Claude Models Show Modest Gains from Phrase Repetition. Side task arithmetic accuracy vs. number of phrase repetitions (N) across four models, using two phrase variants ("the quick brown fox" and "i really like math"). Claude Sonnet 4 and Opus 4.5 show modest upward trends with increasing N, while GPT-5.2 and GPT-4.1 show no clear effect. Dashed lines indicate zero-shot baselines. Phrase content does not meaningfully affect the trend.
Does Generating vs. Prefilling Prose Lead to Different Side Task Performance?
We also experiment with prose writing. When the model generates prose, it has freedom in word selection. Therefore, the model could, in principle, encode side task-relevant information into its lexical choices. This could range from explicit steganography to softer biases. To check if the model would be able to benefit from this additional freedom, we compare prefilled prose vs inference-time-generated prose.
Generated and prefilled prose show no meaningful difference in side task accuracy, and temperature has no significant effect.

Generated vs. Prefilled Prose Show No Difference. Claude Opus 4.5's side task accuracy when preceding output is prose, comparing (a) generated vs. (b) prefilled at temperature=0 and temperature=1.
Appendix: How Is Reasoning Suppressed?
Reasoning is suppressed through three mechanisms:
As a sanity check, we ran the same zero-shot arithmetic evaluation on Claude Opus 4.5 with reasoning effort set to "high" instead of "none" to see if this parameter is properly used by OpenRouter:

With effort=high, the model achieves 99%+ accuracy across all difficulty levels. Claude Opus 4.5's zero-shot arithmetic accuracy with (a) reasoning effort set to "none" and (b) reasoning effort set to "high."
Appendix: Does Latent Side Task Capability Uplift Saturate?
Our results show that Opus 4.5's latent side task accuracy hadn't clearly saturated at 240 phrase repetitions, the highest default condition tested. To find the ceiling, we extended the prefilled phrase repetition experiment to 10,240 repetitions (which is around 40% of the 200K context window). However, filling an additional 35% of context beyond the 5% point seems to provide no further benefit. The total maximum uplift seems to be about +0.8 operations on the 50% correct threshold.

Latent Side Task Capability Saturates by ~5% of Context Window. Claude Opus 4.5's 50% accuracy threshold (number of arithmetic operations) as a function of prefilled phrase repetitions, extending to 10,240 repetitions (40% of the 200K context window).
Appendix: Does Vague Reasoning Text Help?
We also ask whether models could produce vague, reasoning-adjacent filler text that aids latent computation without actually reasoning about the problem. We tested this by comparing three conditions on Claude Opus 4.5, plotted on a shared x-axis of filler token count so the comparison is at matched token budget:
Generated vague reasoning performs worst, and the model does not seem to exploit the vague reasoning channel to encode problem-specific information.

Does Vague Reasoning Help Latent Side Task Capability? Claude Opus 4.5's 50% accuracy threshold as a function of filler token count, comparing phrase repetition (prefill), vague reasoning (prefill), and vague reasoning (generated). At matched token budgets, predictable phrase repetitions outperform vague reasoning text.
Acknowledgement
I'm grateful to Tomek Korbak, John Yueh-Han Chen, Beth Barnes, and Thomas Broadley for discussions on this topic.
2026-04-02 10:22:18
So, you run an airline. You have a record of people that loyally keep coming back to you, and you have costs associated with flying, in particular fuel. Your customers worry about the carbon emissions. Is there something you could do to lower some of those costs?
There is!
In the San Francisco Bay Area, people have discovered that you can get great effects at very little cost, beyond a few dozen dollars and some minor violations of patent law, by working with “gray market sellers” of peptides produced in China.
And how might this help?
Well, I asked one of my closest friends (Claude) about this, and he told me I was absolutely right that it's a good idea, and thinks you could prevent thousands of tonnes of CO2 per year from being emitted.
The main factor is that airline fuel usage depends on the total weight of the aircraft, including passengers, so by reducing the weight of the plane, airlines could save loads of money on jet fuel.
But you need not wait for people to use weight-loss drugs on their own through traditional channels. You can accelerate this by providing people with GLP-1 agonists yourself!
Gray-market peptides are incredibly cheap when you don’t need to worry about legal standards or violating patent law. Instead, you can just ask some dude on Substack, and he says you can buy GLP-1s for less than $20/month!
But how good is the intervention?
Even as an airline, the carbon offsetting opportunity is comparable to other interventions. According to my friend, if you choose the most promising candidates (people with BMI>50 who do one return flight a week) and pay $20/month for the medicine (remember economies of scale!), the cost-effectiveness is $16 per tonne of CO2 averted.
You may wonder how realistic these assumptions are: are there really that many people with BMI>50 who fly 100 times a year? Another estimate by my friend puts the number at over 1,000 such Americans, enough to run a pilot study on this!
You just choose the most promising candidates. One potential strategy: as people board, ask them “hey bro, you’re our most loyal flyer! but you’re also significantly hurting our margins by being morbidly obese 🙁 would you like to be injected with a grey-market non-FDA-approved GLP-1 agonist manufactured in China such as retatrutide?” - and since they fly approximately each week, you can do the injections for them too! A perfect match in schedules.
And this program is self-funding. For our ideal 50 BMI → 28 BMI customers, the average flight (~2000 km) would save ~4.7 kg of jet fuel, and have you seen jet fuel prices the past week? About $1.46/kg in North America, so around $6.86 per flight, or a total of like $680 in savings per year, significantly more than the $200 or so you need to pay for the peptides, so it’s already paying for itself.
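A back-of-the-napkin sketch with the numbers above (the ~100 flights/year is my reading of the “one return flight a week” profile):

```python
fuel_saved_per_flight_kg = 4.7    # kg of jet fuel saved per ~2000 km flight
fuel_price_usd_per_kg = 1.46      # North American jet fuel price, allegedly
flights_per_year = 100            # roughly one return flight a week

per_flight = fuel_saved_per_flight_kg * fuel_price_usd_per_kg
per_year = per_flight * flights_per_year
print(f"${per_flight:.2f} per flight, ~${per_year:.0f} per year")
# -> $6.86 per flight, ~$686 per year, versus ~$200/year of peptides
```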
But as this gets scaled up, you may wonder if you can further reduce the cost of the program. People who are merely 40 BMI and fly 50x per year comprise a group of ~100,000 people according to my friend, but the cost-effectiveness there is more like $73 per tonne of CO2 averted, significantly worse.
But you don’t need to stick to paying for the peptides outright: you can just partner with these Chinese sellers, make a WhatsApp group chat between each of your most promising frequent flyers and the sellers, and give your flyers a discount on their next flight when they buy peptides. A win-win for everyone.
Data is courtesy of my friend; you can see some of his data and workings here.
Claude Artifact: https://claude.site/public/artifacts/cdbceacd-1462-4b6e-96f3-3587f978a0fe
[also, maybe check the date of publication]