2026-01-19 06:04:51
Published on January 18, 2026 10:04 PM GMT
It is unbearable to not be consuming. All through the house is nothing but silence. The need inside of me is not an ache, it is caustic, sour, the burning desire to be distracted, to be listening, watching, scrolling.
Some of the time I think I’m happy. I think this is very good. I go to the park and lie on a blanket in the sun with a book and a notebook. I watch the blades of grass and the kids and the dogs and the butterflies and I’m so happy to be free.
Then there are the nights. The dark silence is so oppressive, so all-consuming. One lonely night, early on, I bike to a space where I had sometimes felt welcome, and thought I might again.
“What are you doing here?” the people ask.
“I’m three days into my month of digital minimalism and I’m so bored, I just wanted to be around people.”
No one really wants to be around me. Okay.
One of the guys had a previous life as a digital minimalism coach. “The first two weeks are the hardest,” he tells me encouragingly.
“Two WEEKS?” I want to shriek.
Hanging out there does not go well. My diary entry that night reads “I sobbed alone and life felt unbearable and I wondered what Cal Newport’s advice is when your digital declutter just uncovers that there is nothing in your life, that you are unwanted and unloved and have no community or connections”.
It is not a good night.
On a Thursday night, I think about going to a meetup. I walk to the restaurant, but I don’t see anyone I know inside, and I don’t go in. I sit on a bench nearby for half an hour, just watching people go back and forth, averting my eyes so meetup-goers won’t recognize me. A bus goes by. Three minutes later, a woman around my age sees me sitting on the bench. “Excuse me,” she says, “do you know if the bus went by yet?”
“Yeah, it did,” I tell her. “Sorry!”
“Oh, thanks!”
I’m ecstatic with the interaction, giddy. A person talked to me! I helped her!
I wander away from the bench, but I don’t want to go home yet. I usually avoid the busier, more commercial streets when I’m out walking, but today I’m drawn to them — I need to hear voices, I need things to look at, lights and colors and things that move.
I go into the Trader Joe’s on the corner of my block, just because it’s bright inside and full of people. An older man asks an older woman if she knows where the coffee is. This is something I will notice repeatedly and starkly: that only older people talk to strangers, and they seem to have learned that young people don’t want to be asked for things. Is this a post-pandemic thing? In 2019 at this same Trader Joe’s I asked a guy my age to reach something off a high shelf for me and he was happy to oblige.
In any case, the older woman does not know where the coffee is.
“Hi,” I stick my head into the conversation. “The coffee’s over there, by the bread.” I point.
“Oh, thank you!”
He’s so genuinely delighted. Is this what it could be like to go through the world?
When I get home my upstairs neighbor is outside, and I talk to him a bit. He’s in his 60s, too. Young people don’t talk to each other.
A few days later, back at that Trader Joe’s with my Post-it note shopping list in hand, I find that the store doesn’t carry buttermilk, which I need for a recipe. Standing in the long checkout line, I turn to the woman behind me.
“Do you know what I can substitute for buttermilk in a baking recipe?” I ask her. She’s in her 60s. The man behind her, in his 40s, gets into the conversation, seems happy to offer me solutions.
I tell a friend about the encounter later and they say that every part of them clenched just to hear about it. They could never imagine doing such a thing, and they have no desire to.
I hadn’t realized I had any desire to, either.
2026-01-19 05:37:51
Published on January 18, 2026 9:37 PM GMT
Recently I've been accumulating stories where I think an LLM is mistaken, only to discover that I'm the one who's wrong. My favorite recent case came while researching 19th century US-China opium trade.
It's a somewhat convoluted history: opium was smuggled when it was legal to sell and when it wasn't, and the US waffled between banning and legalizing the trade. I wanted to find out how it was banned the second time, and both Claude Research and Grokipedia told me it was by the Angell Treaty of 1880 between the US and China. Problem is, I've read that treaty, and it only has to do with immigration—it's a notable prelude to the infamous Chinese Exclusion Act of 1882. Claude didn't cite a source specifically for its claim, and Grok cited "[internal knowledge]", strangely, and googling didn't turn up anything, so I figured the factoid was confabulated.
However, doing more research about the Angell mission to China later, I came across an offhand mention of a second treaty negotiated by James Angell with Qing China in 1880 (on an auction website of all places[1]). Eventually I managed to find a good University of Michigan source on the matter, as well as the actual text of the second treaty in the State Department's "Treaties and Other International Agreements of the United States of America: Volume 6 (Bilateral treaties, 1776-1949: Canada-Czechoslovakia)".
Anyway, Claude and Grok were right. Even though opium wasn't even in the remit of the Angell mission, when Li Hongzhang surprised the American delegation by proposing a second treaty banning it, James Angell agreed on the spot. It was later ratified alongside the main immigration treaty. The opium treaty doesn't appear to have a distinct name from its more famous brother; the State Department merely lists the immigration treaty under the title "Immigration", and the opium treaty under the title "Commercial Relations and Judicial Procedure", so I can't entirely fault the LLMs for not specifying, though they ought to have done so for clarity. I suspect they were confused by the gap between the US government records they were trained on and the lack of sources they could find online?
(An aside: by 1880 US opium trade was in decline, while British opium trade was peaking, just about to be overtaken by the growth of domestic Chinese production. Angell judged correctly that the moral case overwhelmed the limited remaining Bostonian business interests and made the ban good politics in the US, particularly because it was reciprocal—he could claim to be protecting Americans from the drug as well. Though, that's a harsh way of putting it; Angell personally stuck his neck out, mostly upon his own convictions, and both he and the US deserve credit for that.[2])
If all that doesn't convince you to double-check your own assumptions when dealing with LLMs, well, there have been more boring cases too: I asked Claude to perform a tiresome calculation similar to one I had done myself a month before, and when Claude got a very different answer I assumed it had made a mistake; it turned out I had done it wrong the first time! Claude made a change in my code and I reverted it, thinking it was wrong, but it had actually caught a subtle bug! I think by now we're all aware that LLMs are quite capable in math and coding, of course, but I list these examples for completeness in my argument: the correct update to make when an LLM contradicts you is not zero, and it's getting bigger.
Apparently there's a decent market for presidential signatures of note? They managed to sell President Garfield's signature ratifying the Angell Treaty of 1880 for ten grand, partly off the infamy of the treaty and partly because Garfield's presidential signature is rare, him having been assassinated 6 months into the job.
Fun bit of color from the UMich source:
Long afterward, writing his memoirs, Angell would remember the genuine warmth of Li [Hongzhang]’s greeting. The viceroy was full of praise for the commercial treaty signed by the two nations.
“He was exceedingly affable …,” Angell remembered, “and [began] with the warmest expressions in respect to my part in the opium clause.
“I told him, it did not take us a minute to agree on that article, because the article was right.
“He replied that I had been so instructed in the Christian doctrine & in the principles of right that it was natural for me to do right.”
2026-01-19 04:35:49
Published on January 18, 2026 8:35 PM GMT
Note: Fictional! To preempt any unnecessary disappointment and/or fears of dystopia, be aware that this is not a real product, I don't know of plans to develop it, and it is infeasible in many respects. There are some related products under search terms like "kids GPS smartwatch" and "safety monitor".
Do you want your child to have free rein to wander in nature or explore the town? Are you worried about your child getting lost, or injured, or worse? Have you heard horror stories about CPS?
Introducing Lifelink™, the undisputed best-in-class FRC wearable safety link for independent children. Give your child the gift of secure autonomy today. Device FREE with subscription. Features include:
For as low as $25 per month plus equipment shipping, you'll get:
With our Pro package, you'll get everything in the Basic plan, plus:
**Limit 1 (one) replacement per month. Void with intentional destruction of equipment or software hacking. Shipping not included.
2026-01-19 01:09:25
Published on January 18, 2026 5:09 PM GMT
My parents have always said that they love all four of their children equally. I always thought this was a Correct Lie: that they don’t love us all equally, but they feel such a strong loyalty to us and have Specific Family Values such that lying about it is the thing to do to make sure we all flourish.
I realized this morning they are probably not lying.
The reason I originally thought they were lying is that it seems clear to me that they are on the whole more frequently delighted by some of us than others. And on the whole can relate more frequently to some of us than others. And that’s skipping over who they might be most proud of.
Now I grew up with a distinction between “liking” and “loving” which I have always found helpful: “Liking” is the immediate positive experiences and payoffs you get from a relationship. “Loving” is the sense of deeper connection you have with someone[1].
Liking goes up and down. Loving stays the same or goes up, unless you misunderstood someone’s fundamental nature entirely. You can like someone more if they are in a good mood than in a bad one. But you don’t love them more or less for it.
What do you love them for instead? For their values, their way of relating to the world, their skills and traits that are so essentially them that they outline every edge of their spirit. Not “spirit” as a metaphysical object, but like how some people deeply embody kindness because they are just that way. There might be something in their deep values, or their reward wiring, or their instincts, that makes them so deeply kind. And that. That is something you can love.
Now children, genetically, are 50% of each parent[2]. If a parent loves all of themselves and loves all of their partner then ... they will naturally love all of their children.
What’s the “equal” doing though? Don’t you love some people more than others?
Yes and no. The way I think about “love”, the loving feeling is the “same” for the kindness in John as for the competence in Jack. But if Jill is both kind and competent then I may love her more than John or Jack (all things being equal that is. Ha!)
And of course you can’t math the traits together. It’s an intuition of a direction of a feeling.
But I think that direction points to this: Your kids are built from all of you and all of your partner - If you love all of that, then you love all of them.
Of course, mother nature has more chemicals to solve any problem in that equation. Drugs are a hell of a drug.
But even if you lack those, then your children are roughly a mosaic of you and the person you picked to make them with.
And that means something else too: If you don’t love parts of yourself or your partner, then your children will see that too. If you get angry at yourself for always being late or angry at your partner for always making a mess, then your kids will see you won’t love those parts of them either.
And sure, not all genes express in all phenotypes, and nurture and experience matter too. But love is a fuzzy feeling and will fuzz out most of the difference. If the core traits are there, distributed across your children in various combinations, then each of them is Clearly Loveable. Because so are you, and so is your partner.
My parents are good at this. They clearly accept themselves fully and each other too.
I don't always accept myself fully.
But I'm working on it.
Because if my kids grow up and find any of their parts to be like mine, I want them to be able to look at me and see that I love those parts too. And maybe that will in turn help them figure out how to love themselves just as equally.
I’m not claiming these are the de facto correct ways to think about liking and loving. My intention is to offer a frame for these concepts that might be worth exploring. You can also keep your own definitions and think about this as alt-liking and alt-loving, and still track them as ways of relating.
I’m skipping over blended families here. My own family has aspects of that too and it is great and the love is as real as ever. This essay is more a messy exploration of how loving and accepting yourself can have positive effects on your bond with your children.
2026-01-18 23:05:56
Published on January 18, 2026 3:05 PM GMT
I do a quick experiment to investigate how DroPE (Dropping Positional Embeddings) models differ from standard RoPE models in their use of "massive values" (that is, concentrated large activations in Query and Key tensors) that prior work identifies as important for contextual understanding. I did this in my personal time, for fun.
Two main findings:

1. The DroPE model has substantially fewer massive values than the RoPE baseline: the Query tensor shows the largest reduction (-38.9%), Key a moderate one (-11.0%), and Value is essentially unchanged.
2. Zeroing out the massive-value dimensions catastrophically breaks the RoPE model (~116,000% perplexity increase) but only degrades the DroPE model (~1,400%), roughly an 82× difference in reliance.
These findings suggest that, during recalibration, DroPE learns alternative attention mechanisms that don't depend on concentrated features.
Massive values are unusually large activations in the Query (Q) and Key (K) tensors of transformer attention layers. They were identified by Jin et al. (2025) to have the pattern of:

- appearing in the Query and Key tensors but not in the Value tensor, and
- concentrating in the dimensions associated with RoPE's low-frequency components.
Jin et al. provide a mechanistic explanation rooted in RoPE's frequency structure. RoPE divides the head dimension into pairs, with pair m rotating at a frequency ωₘ that decreases as m increases. High-frequency components (small m, large ωₘ) change rapidly with position, encoding fine-grained positional information. Low-frequency components (large m, small ωₘ) change slowly, and Jin et al. argue these dimensions primarily encode semantic content rather than position. They find that disrupting these massive values devastates contextual-understanding tasks, while parametric-knowledge tasks show only minor degradation.
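For concreteness, here's a minimal sketch of the kind of massive-value counting I use below, assuming the detection rule is "magnitude greater than λ = 5.0 times the tensor's mean magnitude" (my paraphrase of the threshold in the hyperparameter table further down; Jin et al.'s exact criterion may differ in detail):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "meta-llama/Llama-2-7b-hf"  # or "SakanaAI/Llama-2-7b-hf-DroPE"
LAMBDA = 5.0                        # massive-value threshold (see hyperparameter table)

tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(
    MODEL, torch_dtype=torch.float16, device_map="auto"  # device_map needs `accelerate`
)
model.eval()

def count_massive(tensor, lam=LAMBDA):
    """Count entries whose magnitude exceeds lam * the tensor's mean magnitude."""
    mags = tensor.abs().float()
    return int((mags > lam * mags.mean()).sum())

counts = {"q": 0, "k": 0, "v": 0}
hooks = []

def make_hook(name):
    def hook(module, inputs, output):
        counts[name] += count_massive(output.detach())
    return hook

for layer in model.model.layers:
    hooks.append(layer.self_attn.q_proj.register_forward_hook(make_hook("q")))
    hooks.append(layer.self_attn.k_proj.register_forward_hook(make_hook("k")))
    hooks.append(layer.self_attn.v_proj.register_forward_hook(make_hook("v")))

text = "The quick brown fox jumps over the lazy dog. " * 50
inputs = tok(text, return_tensors="pt", truncation=True, max_length=512).to(model.device)
with torch.no_grad():
    model(**inputs)

for h in hooks:
    h.remove()
print(counts)
```

Swapping MODEL to the DroPE checkpoint and comparing the resulting counts is essentially the Finding 1 comparison.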
DroPE (Gelberg et al., 2025) is a method that removes Rotary Position Embeddings (RoPE) from pretrained models and recalibrates them, which has the effect of zero-shot extending context length.
The claim here is roughly: RoPE scaling methods (PI, YaRN, NTK) attempt to extend context by compressing rotation frequencies. But low-frequency RoPE components never complete a full rotation during training (ωₘ · L_train < 2π for small ωₘ, where L_train is the training context length). At extended lengths, these phases become out-of-distribution, so any scaling method must compress the low frequencies by a factor of roughly L_ext / L_train to keep phases in range. But this compression shifts attention weights at long distances, which is exactly where semantic matching matters most.
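To see the "incomplete rotation" point numerically, here's a small check assuming standard Llama-2 RoPE settings (base θ = 10000, head dimension 128, training context 4096); these constants are my assumptions, not taken from either paper:

```python
import math

# Illustration of the "incomplete rotation" point above, assuming standard
# Llama-2 RoPE: base theta = 10000, head dim d = 128, training length 4096.
theta, d, L_train = 10000.0, 128, 4096

incomplete = 0
for m in range(d // 2):                      # one frequency per rotary pair
    omega = theta ** (-2 * m / d)            # omega_m = theta^(-2m/d)
    total_phase = omega * L_train            # phase accumulated over the training window
    if total_phase < 2 * math.pi:            # this pair never completes a full rotation
        incomplete += 1

print(f"{incomplete} of {d // 2} frequency pairs never complete a full rotation "
      f"within {L_train} tokens")
```

Under these assumptions a sizeable fraction of the pairs never wrap around, which is the set of phases that go out-of-distribution once the context is extended.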
These papers make seemingly incompatible claims.
Jin et al. claim that RoPE -> massive values -> essential for contextual understanding
Gelberg et al. claim that remove RoPE -> better context extension with preserved capabilities
If massive values are caused by RoPE and critical for understanding, how does DroPE maintain performance?
So we can check a pretty neat and well-scoped research question: are massive values a cause or consequence of contextual knowledge capabilities? And the proxy test we can do cheaply here is: does DroPE, after recalibration, still have massive values?
Models compared:
- meta-llama/Llama-2-7b-hf (standard RoPE)
- SakanaAI/Llama-2-7b-hf-DroPE (RoPE removed + recalibrated)

Procedure: run the same text samples through each model, collect the Query/Key/Value projection outputs via forward hooks, and count the activations exceeding the λ = 5.0 massive-value threshold, reporting the mean ± std across samples.
Text samples used: 10 texts, including literary, technical, and repetitive content (the categories broken out in the per-text-type results further down).
| Tensor | RoPE massive-value count (mean ± std) | DroPE massive-value count (mean ± std) | Change |
|---|---|---|---|
| Query | 1475.5 ± 22.6 | 901.4 ± 36.0 | -38.9% |
| Key | 1496.8 ± 69.8 | 1331.5 ± 74.1 | -11.0% |
| Value | 174.0 ± 10.7 | 176.6 ± 5.7 | +1.5% |
We also plot this across layers.
How do we interpret these results?
Query shows the largest reduction in number of massive values. Roughly, the Query tensor encodes "what to look for" in attention, which is the model's representation of what information the current position needs. DroPE models learn to distribute this information more evenly across dimensions rather than concentrating it in the low-frequency RoPE dimensions.
Key shows moderate reduction in number of massive values. Roughly, the Key tensor encodes "what information is here" at each position. The smaller reduction suggests some concentration patterns persist, possibly because Key representations must still support some semantic matching.
Value is unchanged, within error bars. Mostly just confirms the Jin et al. finding.
Low variance across text types (std ~2-5% of mean) indicates this is a robust structural property of the models, not dependent on input content.
However, a closer look at Figure 2 shows DroPE didn't uniformly reduce massive values.
Not sure how to interpret this. Possibly, without positional embeddings, DroPE may use layer 1 to establish token relationships through content alone, then rely less on concentrated features in subsequent layers.
Finding 1 shows DroPE has fewer massive values, but are these values still functionally important? We test this by zeroing out massive value dimensions and measuring model degradation.
Procedure: for each model, identify the massive-value dimensions, zero them out during the forward pass via hooks on the attention projections (the Query case is shown below), and measure the perplexity increase over the unmodified baseline. As a control, zero out an equal number of randomly chosen dimensions instead, repeated across 10 random seeds.
Disruption implementation:

```python
# Hook on q_proj output
def hook(module, input, output):
    # mask: boolean tensor where True = massive value dimension
    zero_mask = (~mask).to(output.dtype)  # 0 where massive, 1 elsewhere
    return output * zero_mask  # Zero out massive dimensions
```
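For context, here's a sketch of how such a hook might be wired up across layers and compared against the random-dimension control; the placeholder `massive_dims_per_layer` and the helper names are illustrative, not taken from the repo's scripts:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "meta-llama/Llama-2-7b-hf"  # or the DroPE checkpoint
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL, torch_dtype=torch.float16, device_map="auto")
model.eval()
hidden = model.config.hidden_size  # q_proj output width (4096 for Llama-2-7B)

def make_zeroing_hook(dims_to_zero):
    """Forward hook that zeroes the given output dimensions of q_proj."""
    mask = torch.zeros(hidden, dtype=torch.bool)
    mask[dims_to_zero] = True
    def hook(module, input, output):
        zero_mask = (~mask).to(dtype=output.dtype, device=output.device)
        return output * zero_mask
    return hook

def perplexity(text):
    enc = tok(text, return_tensors="pt", truncation=True, max_length=512).to(model.device)
    with torch.no_grad():
        loss = model(**enc, labels=enc["input_ids"]).loss
    return float(torch.exp(loss))

# Placeholder: in the real experiment these would come from a detection pass
# like the earlier counting sketch, one list of dimensions per layer.
massive_dims_per_layer = [torch.arange(10) for _ in model.model.layers]

text = "Some evaluation text. " * 100
for condition in ["massive", "random"]:
    handles = []
    for layer, massive_dims in zip(model.model.layers, massive_dims_per_layer):
        dims = massive_dims if condition == "massive" else torch.randperm(hidden)[: len(massive_dims)]
        handles.append(layer.self_attn.q_proj.register_forward_hook(make_zeroing_hook(dims)))
    print(condition, perplexity(text))
    for h in handles:
        h.remove()
```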
Metric: Our metric of choice here is M-R Difference = (Massive disruption PPL increase) - (Random disruption PPL increase)
and we interpret it as
Higher M-R difference = model relies more on massive values specifically
Raw Perplexity Values
| Model | Baseline | Massive Zeroed | Random Zeroed |
|---|---|---|---|
| RoPE | 1.30 | 1,508.5 | 1.31 |
| DroPE | 1.49 | 22.7 | 1.49 |
Percent Increase (mean ± std across 10 seeds)
| Model | Massive Disruption | Random Disruption | M-R Difference |
|---|---|---|---|
| RoPE | +115,929% ± 0.0% | +0.6% ± 0.7% | +115,929% |
| DroPE | +1,421% ± 0.0% | +0.2% ± 1.2% | +1,421% |
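As a sanity check, the percent increases and the 82× figure follow directly from the raw perplexity table above (the raw values are rounded, so the recomputed numbers differ slightly):

```python
rope  = {"baseline": 1.30, "massive": 1508.5, "random": 1.31}
drope = {"baseline": 1.49, "massive": 22.7,   "random": 1.49}

def pct_increase(ppl, baseline):
    return (ppl - baseline) / baseline * 100

rope_mr  = pct_increase(rope["massive"],  rope["baseline"])  - pct_increase(rope["random"],  rope["baseline"])
drope_mr = pct_increase(drope["massive"], drope["baseline"]) - pct_increase(drope["random"], drope["baseline"])

print(f"RoPE  M-R difference: {rope_mr:,.0f}%")      # ~115,938% from the rounded raw values
print(f"DroPE M-R difference: {drope_mr:,.0f}%")     # ~1,423%
print(f"Reliance ratio: {rope_mr / drope_mr:.0f}x")  # ~81x here; the table's 82x presumably uses unrounded data
```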
Statistical validation:
We do some quick statistical tests because this is so cheap to do.
So I feel fairly confident that these results are significant!
Key ratio: RoPE relies 82× more on massive values than DroPE
| Text Type | RoPE PPL Increase | DroPE PPL Increase |
|---|---|---|
| Literary | +116,000% | +1,400% |
| Technical | +115,800% | +1,450% |
| Repetitive | +116,100% | +1,380% |
Results are pretty consistent regardless of text content.
RoPE model: Zeroing massive values completely breaks the model. The model cannot function without these concentrated activations.
DroPE model: Zeroing massive values degrades but doesn't break the model. The model has learned alternative mechanisms that partially compensate.
Control condition: Zeroing random dimensions causes negligible damage in both models, proving massive values are specifically important, not just any high-norm dimensions.
Basically, both models have massive values, but RoPE is catastrophically dependent on them while DroPE is not.
The apparent contradiction between the papers dissolves once we distinguish where massive values come from versus how they're used: RoPE is what produces and concentrates them, so Jin et al. are right that they are load-bearing in RoPE models, but they are not the only way to implement contextual understanding, since the recalibrated DroPE model redistributes that information and no longer depends on them, which is consistent with Gelberg et al.'s preserved capabilities.
Understanding how and why large language models work in a principled way will require knowing the internal mechanisms of the transformer stack very deeply. While many components (such as attention, MLPs, and residual connections) are now relatively well studied, positional encoding remains surprisingly opaque (at least, to me). In particular, Rotary Positional Embeddings (RoPE) are both weird and brittle: they strongly shape attention behavior, impose hard-to-reason-about constraints on context length, and interact nontrivially with model scaling, quantization, and alignment.
I find RoPE is often wonky to work with: small changes in frequency scaling or context length can produce disproportionate failures, and extending context reliably seems like it will require delicate engineering. Also, I had a free hour, Claude Code, and reread Yoav's paper while at the gym earlier this morning.
Models tested: We examined only Llama-2-7B, as it was the largest DroPE model mentioned in the paper. Also, I can't find the other models on HuggingFace. Larger models and different architectures may show different patterns.
Recalibration dynamics: We compared endpoints (RoPE vs. fully recalibrated DroPE). Tracking massive values during recalibration would reveal how redistribution occurs.
Task-specific analysis: We measured perplexity. Testing on Jin et al.'s contextual vs. parametric knowledge tasks would directly validate whether DroPE's reorganization preserves contextual understanding through alternative mechanisms. I'm doing this as we speak.
All experiments can be reproduced here:
```bash
# Massive value comparison
python scripts/run_massive_values_rigorous.py

# Disruption experiment
python scripts/run_disruption_rigorous.py
```
| Parameter | Value | Source |
|---|---|---|
| λ (massive threshold) | 5.0 | Jin et al. 2025 |
| Sequence length | 512 tokens | Standard |
| Number of text samples | 10 | Diverse corpus |
| Number of random seeds | 10 | Statistical validation |
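For convenience when reading the scripts, the same settings as a plain config dict (the variable names are mine, not necessarily the ones used in the repo):

```python
CONFIG = {
    "massive_threshold_lambda": 5.0,  # Jin et al. 2025
    "sequence_length": 512,           # tokens per text sample
    "num_text_samples": 10,           # diverse corpus
    "num_random_seeds": 10,           # for the random-disruption control
}
```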
If you use these findings, please cite:
```bibtex
@article{jin2025massive,
  title   = {Massive Values in Self-Attention Modules are the Key to Contextual Knowledge Understanding},
  author  = {Jin, Mingyu and others},
  journal = {ICML},
  year    = {2025}
}

@article{gelberg2025drope,
  title   = {Dropping Positional Embeddings for Zero-Shot Long-Context Extension},
  author  = {Gelberg, Tal and others},
  journal = {arXiv preprint arXiv:2512.12167},
  year    = {2025}
}

@techreport{africa2026massive,
  title  = {Massive Activations in DroPE: Evidence for Attention Reorganization},
  author = {Africa, David},
  year   = {2026},
  url    = {https://github.com/DavidDemitriAfrica/drope-activations}
}
```
2026-01-18 11:57:19
Published on January 18, 2026 3:57 AM GMT
This post was written as part of research done at MATS 9.0 under the mentorship of Richard Ngo. It's related to my previous post, but should be readable as a standalone.
Remark: I'm not yet familiar enough with the active inference literature to be sure that the issues I bring up haven't been addressed or discussed. If you think my characterisation of the state and flaws of the theory is missing something substantial, I'd love to know.
In the theory of active inference, agents are described as having a set of internal states that interact with external states (the world) through a membrane of intermediate states, such as the senses. I'm currently exploring how agents are able to exhibit approximations of external reference that allow them to stay alive in the real world. They achieve this even though they only have access to the statistical proxy of their internals, which they could easily reward-hack without optimising the external states at all.
One of active inference's weaknesses is that it struggles to model agents' uncertainties about their own preferences. I here propose a potential explanation for why agents are conflicted about these preferences. This perspective posits agents' seeming inconsistency and irrationality about their goals as a mechanism that protects them from reward-hacking their internal states.
Consider the following question:
What stops an agent from generating adversarial fulfilment criteria for its goals that are easier to satisfy than the "real", external goals?
Take Clippy as an example, whose goal is stated as maximising the number of paperclips in the world. Since Clippy only has internal reference, it could represent this goal as "I observe that the world has as many paperclips as it could possibly have". I'm wondering what in Clippy's system saves it from "winning at life" by hooking its sensors up to a cheap simulator that generates an infinite stream of fictional paperclips for it to observe.
An elegant answer to the problem of internal reward-hacking is that agents come pre-equipped with suitable priors about their internal states. In active inference, agents seek to update their beliefs and act on the world such that their observations fit their priors as closely as possible. The space of "good" priors for agents' internal states is very small. However, evolutionary pressures have selected for agents with priors that are conducive to their survival. According to active inference, agents attempt to manifest these favourable priors through action, which makes the priors function as preferences.
Unfortunately, the claim that evolutionarily fine-tuned priors do all the work to prevent internal reward-hacking seems lacking to me, because in practice we are uncertain about our own feelings and preferences. We don't actually have locked-in, invariant preferences, and it's unclear to me how active inference explains this; preferences are usually encoded as priors over observations, but ironically these are never updated.[1]
Active inference thus implicitly assumes agents to be consistently, definitively settled on their preferences. Agents are only uncertain about the external states and about how their actions and senses will interact with those states. Within those unknowns, they seek to optimise for the observations that they are certain they prefer. I don't think this assumption is warranted. In fact, I have been considering the possibility that agents' uncertainty about their own preferences is an important instrument for increasing their (bounded) rationality.
Consider the example I used in my last post of a hypothetical person, Alice, who wants to maximise "success". In that example, Alice avoids applying to a prestigious university because rejection would decrease her internal perception of success. She instead applies to a worse university that she is sure to get into, as this will certainly increase her success-o-meter.
Suppose instead that Alice feels a twinge of guilt not applying to the prestigious university, as this could be perceived as "loser" behaviour by her friend. This guilt may motivate her to apply anyway, even though the action lowers (in expectation) her internal perception of success. Here, the mixed optimisation of two distinct goals, "I perceive myself as maximally successful" and "I perceive myself as someone that my friend thinks is maximally successful", yields behaviour that actually makes Alice more successful.
In Free Energy Minimisers (FEMs) from active inference, preferences are usually described as fixed priors over the space of observations. One possible model for Alice's behaviour is that each action is chosen with respect to one of two sets of priors. The priors she chooses to satisfy in a given action are sampled from some distribution over priors that represents the degree to which she identifies with conflicting preferences. In practice, Alice now doesn't resemble a consistent FEM, but she has become more aligned with respect to the external goal. Her mixed strategy between preferences can be seen as hedging against her top choice of priors being unfit.
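To make the mixing model a bit more concrete, here is a toy numerical sketch; the actions, fit scores, and the 0.5/0.5 identification weights are all invented for illustration and are not meant as a serious FEM implementation:

```python
import random

# Toy version of the mixed-prior model above. Alice chooses between two actions;
# each decision is scored against whichever prior-set was sampled for it.
# All numbers are made up for illustration.

ACTIONS = ["apply_prestigious", "apply_safe"]

# How well each action satisfies each prior-set (higher = observations closer to prior).
fit = {
    "self_success":   {"apply_prestigious": 0.3, "apply_safe": 0.8},  # rejection risk hurts
    "friend_success": {"apply_prestigious": 0.9, "apply_safe": 0.2},  # friend values ambition
}

# Degree to which Alice identifies with each of the conflicting preference sets.
identification = {"self_success": 0.5, "friend_success": 0.5}

def choose_action(rng):
    # Sample which priors this particular action will try to satisfy...
    prior_set = rng.choices(list(identification), weights=list(identification.values()))[0]
    # ...then act greedily with respect to that prior set.
    return max(ACTIONS, key=lambda a: fit[prior_set][a])

rng = random.Random(0)
decisions = [choose_action(rng) for _ in range(1000)]
print({a: decisions.count(a) / len(decisions) for a in ACTIONS})
```

Over many decisions Alice's behaviour is a mixture rather than a single consistent policy, which is exactly the hedge against her top choice of priors being unfit.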
I would like to distinguish this concept of inconsistent preferences from mental motions such as compartmentalisation. For instance, suppose an agent learns to calculate the derivative of a function (f+g) by having separate[2] parts of itself calculate the derivatives of f and g and then adding the results. This motion could be seen as the agent using subagents' outputs to solve a problem. However, these "subagents" are not imbued with goals of their own. They're more like tools that the agent deploys to break the problem down into manageable components.
My guess is that people's uncertainties about their preferences are better represented as meme(plexe)s competing with each other for attention. The memes that live to be observed in minds are those that could be seen as agentically pursuing survival and reproduction.[3] Internal preferential inconsistency would thus be analogous to the sub-parts in the above example optimising to convince the agent that they are "useful" for calculating derivatives and should be kept around.[4]
Sub-processes and compartmentalisation as tools to increase rationality are not controversial ideas. The more contentious claim I'm entertaining is that even conflicting agentic sub-processes, harbouring goals that are unaligned with those of the larger agent, can still be useful for increasing agentic rationality with respect to external goals. I aim to formalise and explore this hypothesis in an empirical or mathematised setting.
There's a good reason for never updating priors over observations. If agents' preferences could update, they would gradually move towards preferring states that are more likely, even if these aren't fruitful for their continued existence. The function of the fixed priors is to give agents a vision of the world they are willing to execute actions to manifest; these are preferences.
this potentially includes separation across time
For example, successful memes, like catchy songs, have a tendency to get their hosts to spread them to other people.
This goal could functionally be the same as actually being good at calculating derivatives, but it doesn't have to be. For example, if the agent wants the derivative to be high, then a sub-part may gain a competitive advantage by overestimating the answer of the derivative of f. It may eventually convince the agent to employ two copies of itself to calculate the derivatives of both f and g, replacing the other sub-part.