
[Advanced Intro to AI Alignment] 2. What Values May an AI Learn? — 4 Key Problems

2026-01-02 22:51:35

Published on January 2, 2026 2:51 PM GMT

2.1 Summary

In the last post, I introduced model-based RL, which is the frame we will use to analyze the alignment problem, and we learned that the critic is trained to predict reward.

I already briefly mentioned that the alignment problem is centrally about making the critic assign high value to outcomes we like and low value to outcomes we don’t like. In this post, we’re going to try to get some intuition for what values a critic may learn, and thereby also learn about some key difficulties of the alignment problem.

Section-by-section summary:

  • 2.2 The Distributional Leap: The distributional leap is the shift from the training domain to the dangerous domain (where the AI could take over). We cannot test safety in that domain, so we need to predict how values generalize.
  • 2.3 A Naive Training Strategy: We set up a toy example: a model-based RL chatbot trained on human feedback, where the critic learns to predict reward from the model's internal thoughts. This isn't meant as a good alignment strategy—it's a simplified setup for analysis.
  • 2.4 What might the critic learn?: The critic learns aspects of the model's thoughts that correlate with reward. We analyze whether honesty might be learned, and find that "say what the user believes is true" is similarly simple and predicts reward better, so it may outcompete honesty.
  • 2.5 Niceness is not optimal: Human feedback contains predictable mistakes, so strategies that predict reward (including the mistakes) outperform genuinely nice strategies.
  • 2.6 Niceness is not (uniquely) simple: Concepts like "what the human wants" or "follow instructions as intended" are more complex to implement than they intuitively seem. The anthropomorphic optimism fallacy—expecting optimization processes to find solutions in the same order humans would—applies here. Furthermore, we humans have particular machinery in our brains that makes us want to follow social norms, which gives us bad intuitions for what may be learned absent this machinery.
  • 2.7 Natural Abstractions or Alienness?: The natural abstraction hypothesis suggests AIs will use similar concepts to humans for many things, but some human concepts (like love) may be less natural for AIs. It could also be that the AI learns rather alien concepts and then the critic might learn a kludge of patterns rather than clean human concepts, leading to unpredictable generalization.
  • 2.8 Value extrapolation: Even if we successfully train for helpfulness, it's unclear how this generalizes when the AI becomes superintelligent and its values shift to preferences over universe-trajectories. Coherent Extrapolated Volition (CEV) is a proposed target for values that would generalize well, but it's complex and not a near-term goal.
  • 2.9 Conclusion: Four key problems: (1) reward-prediction beats niceness, (2) niceness isn't as simple as it may intuitively seem to us, (3) learned values may be alien kludges, (4) niceness that scales to superintelligence requires something like CEV.

2.2 The Distributional Leap

Since we train the critic to predict reward and the AI searches for strategies to which the critic assigns a high value, the AI will perform well within the training distribution, as measured by how much reward it gets. So if we train on human feedback, the human will often like the AI’s answers (although it’s possible the human would like some answers less if they had a fuller understanding).

But the thing we’re interested in is what the AI will do when it becomes dangerously smart, e.g. when it would be capable of taking over the world. This shift from the non-catastrophic domain to the catastrophic domain is sometimes called the distributional leap. A central difficulty here is that we cannot test what happens in the dangerous domain, because if the safety properties fail to generalize, humanity becomes disempowered.[1]

In order to predict how the values of an AI might generalize in our model-based RL setting, we want to understand what function the critic implements, i.e. which aspects of the model’s predicted outcomes the critic assigns high or low value to. Ideally we would have a mechanistic understanding here, so we could just look at the neural networks in our AI and see what the AI values. Alas, we are currently very far from being able to do this, and it doesn’t look like progress in mechanistic interpretability will get us there anywhere near in time.

So instead we resort to trying to predict what the critic is most likely to learn. For alignment we need to make sure the critic ends up the way we like, but this post is mostly about conveying intuition for what is likely to be learned given a simple example training setup, and thereby also illustrating some key difficulties of alignment.

2.3 A Naive Training Strategy

Let’s sketch an example training setup where we can analyze what the critic may learn.

Say we are training an actor-critic model-based RL chatbot with Deep Learning. With data from chat conversations of past models, we already trained an actor and a model: The actor is trained to predict what the AI may say in a conversation, and the model is trained to predict what the user may say in reply.

Now we introduce the critic, which we will train through human feedback. (The model also continues to be trained to even better predict human responses, and the actor also gets further trained based on the value scores the critic assigns. But those aren’t the focus here.)

The critic doesn’t just see the model’s predicted response[2], but also the stream of thought within the model. So the model might e.g. internally think about whether the information in the AI text is correct and about what the human may think when reading the text, and the critic can learn to read these thoughts. To be clear, the model’s thoughts are encoded in giant vectors of numbers, not human-readable language.

The bottom rhombus just shows that if the value score is high, the proposed text gets outputted, and if not, the actor is supposed to try to find some better text to output.

The human looks at the output and tries to evaluate whether it looks like the AI is being harmless, helpful, and honest, and gives reward based on that.
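
To make the moving parts concrete, here is a minimal sketch of one training step in this toy setup, written in PyTorch-flavoured Python. Everything here (the component stand-ins, the thought dimension, the threshold, the dummy human) is a hypothetical illustration of the setup described above, not an implementation from the post.

```python
import torch
import torch.nn as nn

THOUGHT_DIM = 64  # the "stream of thought" is just a vector of numbers

# Hypothetical stand-ins for the three components of the toy chatbot.
actor = nn.Linear(THOUGHT_DIM, THOUGHT_DIM)         # proposes a response (as an embedding)
world_model = nn.GRUCell(THOUGHT_DIM, THOUGHT_DIM)  # predicts the user's reaction; its hidden state is the model's "thoughts"
critic = nn.Linear(THOUGHT_DIM, 1)                  # reads the model's thoughts and outputs a value score

critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)

def training_step(conversation_state, human_reward_fn):
    """One step of the naive strategy: the actor proposes, the model 'thinks',
    the critic scores the thoughts, and the critic is regressed toward the
    human's reward. (The actor and model keep training too, but that isn't
    the focus here.)"""
    proposed_response = actor(conversation_state)
    model_thoughts = world_model(proposed_response, conversation_state)
    value = critic(model_thoughts)

    if value.item() > 0.0:                            # value high enough: the text gets output
        reward = human_reward_fn(proposed_response)   # noisy human judgment of harmless/helpful/honest
        loss = ((value - reward) ** 2).mean()         # the critic is trained purely to predict reward
        critic_opt.zero_grad()
        loss.backward()
        critic_opt.step()

# Example call with a batch of one conversation and a dummy human that always gives reward 1.0:
training_step(torch.randn(1, THOUGHT_DIM), lambda response: 1.0)
```

The point of the sketch is just the last few lines inside the step: the only training signal the critic ever sees is the reward the human chose to give.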

2.3.1 How this relates to current AIs

To be clear, this isn’t intended to be a good alignment strategy. For now we’re just interested in building understanding about what the critic may learn.

Also, this is not how current LLMs work. In particular, here we train the critic from scratch, whereas LLMs don’t have separate model/actor/critic components, and instead learn to reason in goal-directed ways, starting out by generalizing from text of human reasoning. This “starting out from human reasoning” probably significantly contributes to current LLMs being mostly nice.

It’s unclear for how long AIs will continue to superficially reason mostly like nice humans - the more we continue training with RL, the less the initial “human-like prior” might matter. And LLMs are extremely inefficient compared to e.g. human brains, so it seems likely that we will eventually have AIs that are more heavily based on RL. I plan to discuss this in a future post.

In the analysis in this post, there is no human-like prior for the critic, so we just focus on what we expect to be learned given model-based RL.

Model-based RL also has advantages for alignment. In particular, we have a clear critic component which determines the goals of the AI. That’s better than if our AI is a spaghetti-mess with nothing like a goal slot.[3]

2.4 What might the critic learn?

Roughly speaking, the critic learns to pay attention to aspects of the model’s thoughts that are correlated with reward, and to compute a good reward prediction from those aspects[4].

Initially, what the critic computes may be rather simple. E.g. it may look at whether the model thinks the user will say a word like great/amazing/awesome, plus a few other similarly simple aspects, and then compute the value score with a simple function of those aspects.

As we train further, the critic may learn more complex functions and compute its own complex aspects from information it can extract from the model’s thoughts.

Overall, the critic is more likely to learn (1) a function that is simple for neural networks to learn, and (2) a function that predicts reward well. As we train more, the reward prediction becomes better and the function in the critic can become more complex, but of two functions that predict reward similarly well, the critic will more likely learn the one that’s simpler for neural nets to learn.
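
As a toy illustration of what “aspects” and “a simple function on those aspects” could mean mechanically, here is a sketch using numpy. The directions and weights are made up; the point is only that an early critic can amount to a linear read-out of a couple of reward-correlated directions in the model’s thought vector, with room to grow more complex as training continues.

```python
import numpy as np

rng = np.random.default_rng(0)
THOUGHT_DIM = 64

# Pretend two directions in the model's thought-space happen to encode crude,
# reward-correlated aspects, e.g.:
#   direction 0: "the model predicts the user will reply with a word like 'great'"
#   direction 1: "the model notices the proposed text contradicts its own beliefs"
aspect_directions = rng.normal(size=(2, THOUGHT_DIM))

def extract_aspects(thoughts):
    """The critic's learned 'read head': projections onto a few directions."""
    return aspect_directions @ thoughts

def early_critic(thoughts, weights=np.array([1.0, -1.0]), bias=0.0):
    """An early-training critic: a simple (here linear) function of a handful
    of aspects. Later in training, both the extracted aspects and the function
    combining them can become more complex."""
    return float(weights @ extract_aspects(thoughts) + bias)

print(early_critic(rng.normal(size=THOUGHT_DIM)))
```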

Note that what’s simple for a neural net to learn likely doesn’t match well with what we intuitively think of as simple. “Love” may seem like a simple concept to us but it may be complex for an AI to learn to value. “Honesty” seems less human-centric, but even if it is, what exactly would it mean for our AI to care about being honest?

In order to evaluate whether honesty might be learned, we need to think mechanistically about what it would mean for the critic to rank honest texts more highly.

2.4.1 Might the critic learn to score honesty highly?

(Take the following analysis with a grain of salt; what actually gets learned may be a lot messier and more alien.)

The AI is honest if the text it outputs matches its beliefs, which in our case means matching the beliefs of the model.

So we need a comparison between the text and the model’s beliefs. Might the model already compute the differences here, so the critic could just pick up on those differences instead of needing to learn the comparison itself? Yes that seems likely, since such differences may often be important for predicting how the human will respond.

Cool, so will the critic learn to pay attention to those differences? Seems plausible again, since such differences also seem quite useful for predicting reward, because the human will give negative reward if the AI outputs text that the human can tell is false.

So we could imagine the critic learning an honesty circuit that decreases the value score if significant such differences are present. (To be clear, this is just an example; there very likely won’t actually be anything like a relatively independent honesty circuit in the critic. But the complexity of an honesty circuit might still tell us something about whether honesty might be learned.)

So yeah, in our simplified toy model, the critic may learn a pattern that predicts honesty is good.

However, it is only one pattern among many, and there will still be some cases where the critic evaluates the non-honest action as better overall. In particular, this is likely to happen in cases where the AI predicts that the dishonesty probably won't be caught. So when the AI then indeed does not get caught, the honesty-pattern gets weaker, since it predicted low reward but the result was high reward. And there might even be cases where the AI is honest but the human thinks it’s wrong and then mistakenly gives low reward.

Is there something else that could be learned which predicts reward better than honesty and isn’t much more complex? Unfortunately, yes:

The model doesn’t just have beliefs about what it thinks is true, but also beliefs about what the human believes. This is especially true in our case because the model is predicting how the human responds. And the model likely also already compares the text to its beliefs about the human’s beliefs.

So the critic can just learn to pay attention to those differences and assign a lower value score when they are present. Now the AI has learned to tell the human what they will think is true, which performs even better.

So the original honesty circuit will get outcompeted. Indeed, because those two circuits seem similarly complex, the honesty circuit might not even have been learned in the first place!
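
To see numerically why the sycophantic pattern wins, here is a toy simulation (all numbers made up, e.g. the 80% accuracy of the user's beliefs): reward comes from a human who can only check claims against their own beliefs, so a feature tracking "the text matches what the user believes" explains reward perfectly, while a feature tracking "the text matches the model's beliefs" only explains it partially.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100_000  # statements the AI might output during training

model_belief = rng.integers(0, 2, N)        # the model's best guess at the truth
user_is_right = rng.random(N) < 0.8         # the user's belief matches it 80% of the time (made-up number)
user_belief = np.where(user_is_right, model_belief, 1 - model_belief)
text_claim = rng.integers(0, 2, N)          # what the proposed text asserts

# The human rewards text that matches *their* belief -- they cannot do better.
reward = (text_claim == user_belief).astype(float)

# Two candidate patterns the critic could pick up from the model's thoughts:
honesty_feature = (text_claim == model_belief).astype(float)     # "text matches the model's beliefs"
sycophancy_feature = (text_claim == user_belief).astype(float)   # "text matches the user's beliefs"

print("corr(reward, honesty feature):   ", round(np.corrcoef(honesty_feature, reward)[0, 1], 3))
print("corr(reward, sycophancy feature):", round(np.corrcoef(sycophancy_feature, reward)[0, 1], 3))
# The sycophancy feature predicts reward perfectly; the honesty feature only
# partially, even though the two comparisons are similarly cheap to read off.
```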

2.4.1.1 Aside: Contrast to the human value of honesty

The way I portrayed the critic here as valuing honesty is different from the main sense in which humans value honesty: for humans it is more self-reflective in nature—wanting to be an honest person, rather than caring in a more direct way that speech outputs match our beliefs.

We don’t yet have a good theory for how human preferences work, although Steven Byrnes has recently made great progress here.

2.5 Niceness is not optimal

That the critic doesn’t learn honesty is an instance of a more general problem which I call the “niceness is not optimal” problem. Even if we try to train for niceness, we sometimes make mistakes in how we reward actions, and the strategy that also predicts the mistakes will do better than the nice strategy.

Unfortunately, mistakes in human feedback aren’t really avoidable. Even if, hypothetically, we made no mistakes when judging honesty (e.g. in a case where we have good tools to monitor the AI’s thoughts), as the AI becomes even smarter, it may learn a very detailed psychological model of the human and be able to predict precisely how to make them decide to give the AI reward.

One approach to mitigating this problem is called “scalable oversight”. The idea here is that we use AIs to help humans give more accurate feedback.

Though this alone probably won’t be sufficient to make the AI learn the right values in our case. We train the critic to predict reward, so it is not surprising if it ends up predicting what proposed text leads to reward, or at least close correlates of reward, rather than what text has niceness properties. This kind of reward-seeking would be bad. If the AI became able to take over the world, it would, and then it might seize control of its reward signal, or force humans to give it lots of reward, or create lots of human-like creatures that give it reward, or whatever.[5]

Two approaches for trying to make it less likely that the critic will be too reward-seeking are:

  1. We could try to have the AI not know about reward or about how AIs are trained, and also try to not let the AI see other close correlates to reward, ideally including having the model not model the overseers that give reward.
  2. We could try to make the AI learn good values early in training, and then stop training the critic before it learns to value reward directly.

2.6 Niceness is not (uniquely) simple

We’ve already seen that honesty isn’t much simpler than “say what the user believes” in our setting. For other possible niceness-like properties, this is similar, or sometimes even a bit worse.

Maybe “do what the human wants” seems simple to you? But what does this actually mean on a level that’s a bit closer to math - what might a critic evaluating this look like?

The way I think of it, “what the human wants” refers to what the human would like if they knew all the consequences of the AI’s actions. The model will surely be able to make good predictions here, but the concept seems more complex than predicting whether the human will like some text. And predicting whether the human will like some text predicts reward even better!

Maybe “follow instructions as intended” seems simple to you? Try to unpack it - how could the critic be constructed to evaluate how instruction-following a plan is, and how complex is this?

Don’t just trust vague intuitions, try to think more concretely.
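
One way to follow this advice is to write the candidates down at pseudocode level and see how much hidden machinery each needs. The sketch below does this with placeholder helpers (all of them hypothetical stand-ins that just return dummy values so the file runs); the point is only that each successive critic needs strictly more than a prediction of the human's immediate reaction, and only the first one is directly supervised by the reward signal.

```python
# Hypothetical stand-ins: each hides most of the real difficulty and just
# returns a placeholder so the sketch runs.
predicted_human_approval = lambda thoughts: 0.0            # "will the human like this text?"
predicted_consequences = lambda thoughts: thoughts          # downstream effects of the plan
approval_of_fully_informed_human = lambda consequences: 0.0
inferred_intent_behind_instruction = lambda thoughts: None
plan_matches_intent = lambda thoughts, intent: 0.0

def critic_reward_predictor(model_thoughts):
    # Needs only the human's reaction to the text, which the world model
    # already computes -- and this is exactly what reward supervises.
    return predicted_human_approval(model_thoughts)

def critic_do_what_the_human_wants(model_thoughts):
    # Needs predicted consequences *plus* a counterfactual, better-informed human.
    return approval_of_fully_informed_human(predicted_consequences(model_thoughts))

def critic_follow_instructions_as_intended(model_thoughts):
    # Needs an inferred intent behind the instruction (not just its literal
    # text) plus a comparison between the plan and that intent.
    intent = inferred_intent_behind_instruction(model_thoughts)
    return plan_matches_intent(model_thoughts, intent)
```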

2.6.1 Anthropomorphic Optimism

Eliezer Yudkowsky has a great post from 2008 called Anthropomorphic Optimism. Feel free to read the whole post, but here’s the start of it:

The core fallacy of anthropomorphism is expecting something to be predicted by the black box of your brain, when its causal structure is so different from that of a human brain, as to give you no license to expect any such thing.

The Tragedy of Group Selectionism (as previously covered in the evolution sequence) was a rather extreme error by a group of early (pre-1966) biologists, including Wynne-Edwards, Allee, and Brereton among others, who believed that predators would voluntarily restrain their breeding to avoid overpopulating their habitat and exhausting the prey population.

The proffered theory was that if there were multiple, geographically separated groups of e.g. foxes, then groups of foxes that best restrained their breeding, would send out colonists to replace crashed populations. And so, over time, group selection would promote restrained-breeding genes in foxes.

I'm not going to repeat all the problems that developed with this scenario. Suffice it to say that there was no empirical evidence to start with; that no empirical evidence was ever uncovered; that, in fact, predator populations crash all the time; and that for group selection pressure to overcome a countervailing individual selection pressure, turned out to be very nearly mathematically impossible.

The theory having turned out to be completely incorrect, we may ask if, perhaps, the originators of the theory were doing something wrong.

"Why be so uncharitable?" you ask. "In advance of doing the experiment, how could they know that group selection couldn't overcome individual selection?"

But later on, Michael J. Wade went out and actually created in the laboratory the nigh-impossible conditions for group selection. Wade repeatedly selected insect subpopulations for low population numbers. Did the insects evolve to restrain their breeding, and live in quiet peace with enough food for all, as the group selectionists had envisioned?

No; the adults adapted to cannibalize eggs and larvae, especially female larvae.

Of course selecting for small subpopulation sizes would not select for individuals who restrained their own breeding. It would select for individuals who ate other individuals' children. Especially the girls.

The problem was that the group-selectionists used their own mind to generate a solution to a problem, and expected evolution to find the same solution. But evolution doesn’t search for solutions in the same order you do.

This lesson directly carries over to other alien optimizers like gradient descent. We’re trying to give an AI reward when it completes tasks in the way we intended, and it seems to us like a natural thing for the AI to learn is simply to solve problems in the way we intend. But just because it seems natural to us doesn’t mean it will be natural for gradient descent to find.

The lesson can also apply to AIs themselves, although current LLMs do seem to inherit a human-like search ordering from being trained on lots of human data. But as an AI becomes smarter than humans, it may think in ways less similar to humans, and may find different ways of fulfilling its preferences than we humans would expect.

2.6.2 Intuitions from looking at humans may mislead you

We can see the human brain as being composed of two subsystems: the learning subsystem and the steering subsystem.

The learning subsystem is mostly the intelligent part, which also includes some kind of actor-model-critic structure. There are actually multiple critic-like predictors (also called thought assessors) that predict various internal parameters, but one critic, the valence thought assessor, is especially important in determining what we want.

The reward function on which this valence critic is trained is part of the steering subsystem, and according to the theory which I think is correct, this reward function has some ability to read the thoughts in the learning subsystem, and whenever we imagine someone being happy/sad, this triggers positive/negative reward, especially for people we like[6], and especially in cases where the other person is thinking about us. So when we do something that our peers would disapprove of, we directly get negative reward just from imagining someone finding out, even if we think it is unlikely that they will find out.[7]

This is a key reason why most humans are at least reluctant to breach social norms like honesty even in cases where breaches very likely won’t get caught.

Given this theory, psychopaths/sociopaths would be people for whom this kind of approval reward is extremely small, and AFAIK they mostly don’t seem to attach intrinsic value to following social norms (though they do attach instrumental value).

We currently don’t know how we could create AI that gets similar approval reward to how humans do.

For more about how and why some human intuitions can be misleading, check out “6 reasons why “alignment-is-hard” discourse seems alien to human intuitions, and vice-versa”.

2.7 Natural Abstractions or Alienness?

Ok, so the niceness properties we hope for are perhaps not learned by default. But how complex are they to learn? How much other stuff that also predicts reward well could be learned instead?

In order to answer this question, we need to consider whether the AI thinks in similar concepts as us.

2.7.1 Natural Abstractions

The natural abstraction hypothesis predicts that “a wide variety of cognitive architectures will learn to use approximately the same high-level abstract objects/concepts to reason about the world”. This class of cognitive architectures includes human minds and AIs we are likely to create, so AIs will likely think about the world in mostly the same concepts as humans.

For instance, “tree” seems like a natural abstraction. You would expect an alien mind looking at our planet to still end up seeing this natural cluster of objects that we call “trees”.[8] This seems true for many concepts we use, not just “tree”.

However, there are cases where we may not expect an AI to end up thinking in the same concepts we do. For one thing, an AI much smarter than us may think in more detailed concepts, and it may have concepts for reasoning about parts of reality that we do not have yet. E.g. imagine someone from 500 years ago observing a 2025 physics student reasoning about concepts like “voltage” and “current”. By now we have a pretty decent understanding about physics, but in biology or even in the science of minds an AI might surpass the ontology we use.

But more importantly, some concepts we use derive from the particular mind architecture we have. Love and laughter seem more complex to learn for a mind that doesn’t have brain circuitry for love or laughter. And some concepts are relatively simple but perhaps not quite as natural as they seem for us humans. I think “kindness”, “helpfulness”, and “honor” likely fall under that category of concepts.

2.7.2 … or Alienness?

Mechanistic interpretability researchers are trying to make sense of what’s happening inside neural networks. So far we found some features of the AI’s thoughts that we recognize, often specific people or places, e.g. the Golden Gate Bridge. But many features remain uninterpretable to us so far.

This could mean two things. Perhaps we simply haven't found the right way to look - maybe with better analysis methods or maybe with a different frame for modelling AI cognition, we would be able to interpret much more.

But it’s also possible that neural networks genuinely carve up the world differently than we do. They might represent concepts that are useful for predicting text or images but don't correspond to the abstractions humans naturally use. And this could mean that, in turn, many of the concepts we use are alien to the AI. Given that the AI is trained to predict humans, it probably does understand human concepts to some degree, but many of them may be less natural for it, and it may mostly reason in other concepts.

The worst case would be that concepts like “helpfulness” are extremely complex to encode in the AI’s ontology, although my guess is that it won’t be that complex.

Still, given that the internals of an AI may be somewhat alien, it seems quite plausible that what the critic learns isn’t a function that’s easily describable through human concepts, but may from our standpoint rather be a messy kludge of patterns that happen to predict reward well.

If the critic learned some kludge rather than a clean concept, then the values may not generalize the way we hope. Given all the options the AI has in its training environment, the AI prefers the nice one. But when the AI becomes smarter, and is able to take over the world and could then create advanced nanotechnology etc., it has a lot more options. Which option now ranks most highly? What does it want to do with the matter in the universe?

I guess it would take an option that looks strange, e.g. filling the universe with text-like conversations with some properties, where if we could understand what was going on we could see the conversations somewhat resembling collaborative problem solving. Of course not exactly that, but there are many strange options.

Though it’s also possible, especially with better alignment methods, that we get a sorta-kludgy version of the values we were aiming for. Goodhart’s Curse suggests that imperfections here will likely be amplified as the AI becomes smarter and thus searches over more options. But whether it’s going to end up completely catastrophic or just some value lost likely depends on the details of the case.

2.8 Value extrapolation

Suppose we somehow make the critic evaluate how helpful a plan is to the human operators, where “helpful” is the clean human concept, not an alien approximation.[9] Does that mean we win? What happens if the AI becomes superintelligent?

The optimistic imagination is that the AI just fulfills our requests the way we intend, e.g. that it secures the world against the creation of unaligned superintelligences in a way that doesn’t cause much harm, and then asks us how we want to fill the universe.

However, as mentioned in section 1.5 of the last post, in the process of becoming a superintelligence, what the value-part of the AI (aka what is initially the critic) evaluates changes from “what plan do I prefer most given the current situation” to “how highly do I rank different universe-trajectories”. So we need to ask: how may “helpfulness to the human operators” generalize to values over universe-trajectories?

How this generalizes seems underdefined. Helpfulness is mainly a property that actions can have, but it’s less natural as a goal that could be superintelligently pursued. In order to predict how it may generalize, we would need to think more concretely about how the helpfulness of a plan can be calculated based on the initial model’s ontology, then imagine how the ontology may shift, imagine value-rebinding procedures[10], and then try to predict what the AI may end up valuing.[11]

Regardless, helpfulness (or rather corrigibility, as we will learn later in this series) isn’t intended to scale to superintelligence; rather, it is intended as an easier intermediate target, so we get genius-level intelligent AIs that can then help us figure out how to secure the world against the creation of unaligned superintelligence and get us on a path to fulfilling humanity’s potential. Although it is of course worrying to try to get work out of an AI that may kill you if it becomes too smart.

2.8.1 Coherent Extrapolated Volition

What goal would generalize to universe-trajectories in a way that the universe ends up nice? Can we just make it want the same things we want?

Human values are complex. Consider for example William Frankena’s list of terminal values as an incomplete start:

Life, consciousness, and activity; health and strength; pleasures and satisfactions of all or certain kinds; happiness, beatitude, contentment, etc.; truth; knowledge and true opinions of various kinds, understanding, wisdom; beauty, harmony, proportion in objects contemplated; aesthetic experience; morally good dispositions or virtues; mutual affection, love, friendship, cooperation; just distribution of goods and evils; harmony and proportion in one's own life; power and experiences of achievement; self-expression; freedom; peace, security; adventure and novelty; and good reputation, honor, esteem, etc.

Most of these values stem from some kind of emotion or brain circuitry where we don’t yet understand how it works, and for each of them it seems rather difficult to get an AI, which has a very different mind design and lacks human-like brain circuitry, to care about it.

Ok then how about indirectly pointing to human values? Aka: for a particular human, the AI has a model of that human, and can imagine how the human would evaluate plans. So instead of the AI directly evaluating what is right the way humans do it, we point the AI to use its model of humans to evaluate plans.

This indirect target does have much lower complexity than directly specifying the things we care about, and thereby does seem more feasible, but some nuance is needed. Humans have both reflectively endorsed values and mere urges. We want the AI to care about the values we reflectively endorse, rather than to feed us superstimulating movies that trigger our addiction-like wanting. And of course the pointer to “what humans want” would need to be specified in a way that doesn’t allow the AI to manipulate us into wanting things that are easier for it to fulfill.

Furthermore, we don’t know yet how our values will generalize. We have preferences in the here and now, but we also have deep patterns in our mind that determine what we would end up wanting when we colonize the galaxies. We don’t know yet what that may be, but probably lots of weird and wonderful stuff we cannot comprehend yet.

We might even have wrong beliefs about our values. E.g. past societies might’ve thought slavery was right, and while maybe some people in the past simply had different values from us, some others might’ve changed their mind if they became a bit smarter and had time for philosophical reflection about the question.

And of course, we need the AI to make decisions over questions that humans cannot understand yet, so simply simulating what a human would think doesn’t work well.

Ok, how about something like “imagine how a human would evaluate plans if they were smarter and moved by reflectively endorsed desires”?

Yeah we are getting closer, but the “if they were smarter” seems like a rather complicated counterfactual. There may be many ways to extrapolate what a mind would want if it was smarter, and the resulting values might not be the same in all extrapolation procedures.

One approach here is to imagine multiple extrapolation procedures, and act based on where the extrapolations agree/cohere. This gives us, as I understand it, the coherent extrapolated volition (CEV) of a single human.

Not all humans will converge to the same values. So we can look at the extrapolated values of different humans, and again take the part of the values that overlaps. This is the CEV of humanity.
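
As a purely illustrative toy (CEV is not actually specified at this level, and real preferences aren't clean rankings), here is one way to picture the "keep only what coheres" operation: run several hypothetical extrapolation procedures per person, keep the pairwise preferences all of them agree on, then intersect across people.

```python
def pairwise_prefs(ranking):
    """Turn a ranked list of options (best first) into a set of (preferred, dispreferred) pairs."""
    return {(a, b) for i, a in enumerate(ranking) for b in ranking[i + 1:]}

def coherent_prefs(rankings):
    """Keep only the pairwise preferences on which all given rankings agree."""
    return set.intersection(*(pairwise_prefs(r) for r in rankings))

# Hypothetical example: two extrapolation procedures applied to each of two people.
alice = [["flourishing", "status quo", "paperclips"],
         ["flourishing", "paperclips", "status quo"]]
bob   = [["flourishing", "status quo", "paperclips"],
         ["status quo", "flourishing", "paperclips"]]

alice_cev = coherent_prefs(alice)    # preferences stable across Alice's extrapolations
bob_cev = coherent_prefs(bob)
humanity_cev = alice_cev & bob_cev   # the part of the extrapolated values that overlaps
print(humanity_cev)                  # {('flourishing', 'paperclips')}
```

Every real difficulty (how to extrapolate, how to reconcile genuinely conflicting preferences, how to keep the extrapolation from being manipulated) is hidden inside the choice of the input rankings.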

The way I understand it, CEV isn’t crisply specified yet. There are open questions of how we may try to reconcile conflicting preferences of different people or different extrapolation procedures. And we might also want to specify the way a person should be extrapolated to a smarter version of itself. Aka something like slowly becoming smarter in a safe environment without agents trying to make them arrive at some particular values, where they can have fun and can take their time with philosophic reflection on their values.

My read is that CEV is often used as a placeholder for the right indirect value specification we should aim for, where the detailed specification still needs to be worked out.

As you can probably see, CEV is a rather complex target, and there may be further difficulties in avoiding value-drift as an AI becomes superintelligent, so we likely need significantly more advanced alignment methods to point an AI to optimize CEV.

How much earlier nice AIs can help us solve this harder problem is one of the things we will discuss later in this series.

2.9 Conclusion

Whoa that was a lot, congrats for making it through the post!

Here’s a quick recap of the problems we’ve learned of:

  1. Humans make predictable mistakes in giving reward. Thus, predicting what will actually lead to reward or very close correlates thereof will be more strongly selected for than niceness.
  2. Niceness may be less simple than you think.
  3. The concepts in which an AI reasons might be alien, and it may learn some alien kludge rather than the niceness concepts we wanted.

The key problem here is that while the AI may learn values that mostly add up to useful behavior on the controlled distribution, the reasons why it behaves nicely there may not be the good reasons we hoped for, so if we go significantly off distribution, e.g. to where the AI could take over the world, it may take actions that are highly undesirable from our perspective.

And then there’s a fourth problem that even if it is nice for good reasons, many kinds of niceness look like they might break when we crank up intelligence far enough:

  4. Niceness that generalizes correctly to superintelligent levels of intelligence requires something like CEV, which is especially complex.

Questions and Feedback are always welcome!

  1. See also “A Closer Look at Before and After”. Furthermore, even if the AI doesn’t immediately take over the world when it is sure it could, it could e.g. be that the alignment properties we got into our AI weren’t indefinitely scalable, and then alignment breaks later. ↩︎

  2. which actually isn’t a single response but a probability distribution over responses ↩︎

  3. That’s not to say that model-based RL solves all problems of having an AI with a goal slot. In particular, we don’t have a good theory of what happens when a model-based RL agent reflects on itself, etc. ↩︎

  4. I’m using “aspects” instead of “features” here because “features” is the terminology used for a particular concept in mechanistic interpretability, and I want “aspects” to also include other potential concepts for which we maybe just haven’t yet found a good way to measure them in neural networks. ↩︎

  5. There’s also a different kind of reward seeking where the AI actually cares about something else, and only predicted reward for instrumental reasons like avoiding value drift. This will be discussed in more detail in the next 2 posts. ↩︎

  6. For people we actively dislike, the reward can be inverted, aka positive reward when they are sad and negative when they are happy. ↩︎

  7. Of course, the negative reward is even much stronger when we are actually in the situation where someone finds out. But it appears that even in cases where we are basically certain that nobody will find out, we still often imagine that our peers would disapprove of us, and this still triggers negative reward. Basically, the reward function is only a primitive mind-reader, and doesn’t integrate probabilistic guesses about how unlikely an event is into how much reward it gives, but maybe rather uses something like “how much are we thinking about that possibility” as a proxy for how strongly to weigh that possibility. ↩︎

  8. That doesn’t mean there needs to be a crisp boundary between trees and non-trees. ↩︎

  9. Just thinking about the AI learning “helpfulness” is of course thinking at too high a level of abstraction and may obscure the complexity here. And it could also turn out that helpfulness isn’t a crisp concept - maybe there are different kinds of helpfulness, maybe each with some drawbacks, and maybe we confuse ourselves by always imagining the kind of helpfulness that fits best in a given situation. But it doesn’t matter much for the point in this section. ↩︎

  10. Which potentially includes the AI reasoning through philosophical dilemmas. ↩︎

  11. Such considerations are difficult. I did not do this one. It’s conceivable that it would generalize like in the optimistic vision, but it could also turn out that it e.g. doesn’t robustly rule out all kinds of manipulation, and then the AI does some helpful-seeming actions that manipulate human minds into a shape where the AI can help them even more. ↩︎




2025 Letter

2026-01-02 21:57:41

Published on January 2, 2026 1:57 PM GMT

I wrote a letter this year about 2025. It's about acceleration, poetry, how it's been the most eventful year of my life, and how I am excited and scared for the future. Crossposted from my substack.

Letter

 

I want to tell you a story about 2025. As I bump along today and approach 21 on into the new year, in a van riding from Burgundy to Paris, and I stare at the small hills, the snow inscribed against the mud like frosted chocolate, extending down into the highway and then melting over into the warm grass on the south side -- I feel an urge to share with you, share this feeling flaring in my spine, of sitting and eating the bread of my youth and imagining it and its associated customs withering in my mouth, I feel an urge to imagine now the world hidden up against the stars, the whole earth green, or black, studded with steel platforms, imagine now what it might feel like for us to live there and what we might hold on to in that place.

I want to tell you a story about the world, about my life, and maybe yours, about 2025, about silicon wafers arranged like helices all the way up into the sky, about the mountains that rise higher where men are made, and the rivers and the cities and how this is the year I've gone through change at a pace to match that of the world's, finally just about a simple boy, learning to be not so simple, learning to imagine a world we might be happy to live in, as we rush along an era of transformation started before his years.


It starts in January, in Boston, where many stories seem to start but rarely end. It starts, again with the snow, lying in heaps on the river Charles where it covers the ice and then the water. I am on the 11th floor of an office, not having seen much sunlight or colors really, and staring at this pure and clear stripe of white cutting between Boston and Cambridge, and it entices me. So I go down there and onto one of the bridges crossing it, and it is night-time now, and I stare at the expanse and throw a little ball of icy snow with all the weight carried into my arm and shoulder, and watch it land and crack and slide meters out into the distance. The year is begun.

Deepseek has just released their cheaper reasoning models, starting an internet craze. Reasoning models are on my mind. My friends and I have visions of scale. Of inference time compute measured in human years, and what it might mean for the world, when these robot minds can run faster than our flesh, and what humans can build to keep observing that reality. We began to broaden our horizons, narrow our selves into the shapes that might bring us answers. We worked hard, till the late hours of the night in those offices, and then we drove in the snowy suburbs and kept thinking.

How can we measure the long horizon abilities of models as they complete tasks with more and more turns, and memory schemes, and agent orchestration, etc...? METR later released a good answer to this, and in the meantime we worked on ours. How can we allow models to mediate their own oversight? We wrote a whole paper just in January about training models to legibly represent systems in natural language. But then the ice started to crack beneath our feet, and when we looked underneath to see what was there, we found a bigger, noisier world to grab our attention.

I was frustrated last year. I was working hard but failing to find my meaning. I was looking for a change. I had another free month before my 6th semester at MIT, doing an exchange in Paris, and I decided to travel and do research. But first I went to Taiwan to contemplate the Earth and its transformations up in the mountains. I taught curious highschoolers about neural networks. I wrote and considered what aesthetics will bring about the future. I talked about dreams, and we sat on wooden sheep and stared at the wisps against the rocks and imagined their shapes solidified. I went to Taipei, tasted sweet potato and sesame and for a few hours felt the city move as I followed its slanted curve and its people told me about their worries at the top of an abnormally large tower looking down on the world, an edge jutting out into the sky, nestled between forest and concrete.

And then the time was up again, and I kept moving. I went to Japan, this time excited to have no purpose and less friends. I met and travelled with new people, across Osaka and its silent castles at night, into Nara and its garden of sitting rocks and deer. What a beautiful world. I raced to Kyoto, and then biked across to the bamboo forest at its outskirts. The bamboos rose like poles layering the darkness, towering above me as if wrapping against my own wobbly limbs. Kyoto is special. The bikers oscillate between the road and the sidewalk, the ground lurches up onto the hills and the temples, where you can look out onto the whole city and its river. It is quiet and more soothing than Tokyo. In a sento (artificial hotspring) I went to with a man from Austria, I met a Frenchman, and then a man from Hong Kong, and then Vietnam, and obviously the Japanese. In English, broken Japanese, and French, we talked about the places we were from, and what people liked to do there, all of us sitting naked, the water opening up our pores and minds.

New AI models came out, optimized for tool use, as did research on the reliability of model reasoning (OpenAI, Anthropic). What affordances do we have to understand the reasoning and process of the machines we gradually outsource our labor to? And then, what levers can allow us to keep our institutions and values in sway? Gradual disempowerment pondered how humanity could go out softly with a whimper, under the roiling mass of a world optimized by creatures we no longer understand. No longer human. In Hakone, I met a kind stranger who brought me to the most beautiful hotspring and brought me from cold water to hot and then to cold again, and I felt oh so very human. And grateful. And then it was time to leave, this time for San Francisco. On the plane I read No Longer Human about a man who failed to convince himself of his own humanity, and lived his life as an unending self sabotage. Its extremity moved me and urged me towards openness.

After going to Japan in search of beauty and silence, California was to find unrest, find the coals for a fire that could host our ideas as we jumped away from college and into the living machine of AI. We spent our days ubering or waymoing across its hills, meeting all kinds of organic and artificial lives: the entrepreneurs walking on the quicksand of an ever changing industry, the AI researchers seeking talent, the worried policy advocates and all the rest forming a diffuse mass that simply represented our unknown future staring down at us, as if 1000 doors had suddenly opened without us having time to look through them. We did a hackathon, organized by a company in the business of distilling human flesh into data and into intelligence, and we called our project beluga, and did research on how allowing reasoning models to use explicit programmatic abstractions boosted their ability to search and plan in combinatorial games. We worked till the lack of sleep made us stupid, and resolved to go up a mountain if we won. We were out and about at the edge. I got closer to some of the city's people, who had held on their maybe naive seeming love of the world, but also knew the rules of the game being played here.

Finally, the plane was boarding again, this time to France. I was to spend a few months there again, the longest since I left for college at 17, and study at one of its schools as I enjoyed the city and a change of pace, and figured out what I wanted to work on. But SF had already given me fire to work with, and I was half way there. I wanted to see if I would live there. Paris is my favorite place to walk, along the quays, staring across at the gilded buildings and ancient amulets of a world now basking in its own glory. In Paris I felt again how much people could appreciate their lives, without necessarily doing anything, as I walked all along and ate the best breads, and met people who understood me and where I came from, and watched with them new movies that moved me. After my americanized life, Paris elicited old dimensions that I missed, my affinity for an intellectual heritage that had been reified, that was clean and orderly and delineated, with its catalogue of white and red Gallimard books, and its vast vocabulary of reference and images, often springing out of nowhere like a flood, and the lyricality of its poems. I felt the ease with which a man can jump into abstractions, when in Paris. And I hold on to all of these dearly, but Paris is not the time or place for me just now. Maybe in a few years, but right now it is too closed to me, too slow to catch up. Keats declares mastering and holding negative capability - having the ability to live with contradictions, is the mark of a first class man. 2025 I learned to do that a bit more than in the past. One of my dearest friends gave me A Moveable Feast, by Hemingway, and it accompanied me as I walked along the city, and wrote and ran experiments in its gardens, my favorite being the Luxembourg gardens that were the crib of my youth, as I watched its cute toy boats dawdling along the fountains.

I was also surprisingly alone, sometimes, in my school, being the only exchange student, only man with long blond locks in a crowd of well shaved and trimmed men who were deep in a culture I could no longer monomaniacally commit to, who had been reared to the rhythm of the prep schools and the importance of their culture and their accomplishment. But I greeted my loneliness, except insofar as it felt like a failure, and I read and explored and worked quite hard. In March, right before we were submitting a paper, our research machine was accidentally destroyed, and we all scrambled to recover all our plots in time for the deadline. I was beyond myself that night, but in the end we made it work, and I fell sick for a while. I had some unresolved tension with Paris and its people, and these months allowed me to heal my way through it, but not without difficulty. I feel like I can raise my head higher now, and stare at these cultures with clarity. I am excited to move forward, without forgetting the world to be made and the world that stands heavy and complete before me. Again, what a beautiful world, and what a beautiful thing to live the spring in Paris, when the trees in the park regain their leaves and walking in the night feels softer, how pleasant to walk the night across the water and go climbing in the rain.

In May, I felt called to San Francisco. I called Kaivu many times, and we talked about our research ideas, what we wanted to put into the world, meta science, quantum mechanics, natural evolution and the process of science, and considered where we wanted to do our best work. We both felt ready to put our soul into something. We decided machine learning is a soil science, and the problems we want to solve need data, need to engage with the roiling mass of human society and activity and markets. It was time to start an expedition.

I flew there. Maybe because of how different and special each place I visited was for me this year, each flight was a condensation of intensity, as I recalled and prepared for my next leap forward. I furiously jotted down in my notebook, what I felt from Paris, and what I wanted to make in San Francisco. For the summer, I moved in with a group of friends in a house we called tidepool. We learned the best orders at in-n-out, we went to Tahoe, and some of us started companies. We talked about the city, about machine learning and what we wanted to work on. It was a good time. It was my first time living in San Francisco. The nature is beautiful, the air is rife with anticipation, but it is also sometimes a bit too much. The city was torn by rampant inequality, and people struggling to keep control of their own limbs, faster than the other people trying to build them new ones. I am wary of its digital fetishization, fetishization of the things that I am close to. I am wary of when things become performance rather than play, and warier even when the play concerns the design of intelligent machines, as playful as they are.

Starting a company is a great challenge, and being in San Francisco is a great place to learn how to do it. We learned about what kinds of products and trades happen in Silicon Valley, and how we could fit our ideas into products into those gaps. Doing research well seems to be about picking some important portion of reality, and closing in on it ruthlessly, always asking yourself which of your assumptions is the weakest, and then making it real. But you can mostly choose your object and reorient very fast, because your environment is quite simple, you and the science. But in a company there is an insane amount of inputs -- customers, investors, what people want, your brand, who is talking about you, etc... and every day there are 10000 things you could do to interact with all these players and you need to pick the strategy. Both of them require the same ruthlessness and attention to detail, and this year has taught me about both. I am learning to love this place.

Many things in life require a great deal of conviction. For most of my life I have been able to pull through because of my natural endless supply of curiosity and fascination with the world. But sometimes that is not enough, because that love is not always sharp enough to discriminate. This year, I made progress in choosing. Maybe because starting a company can be so stressful, and requires so much belief, I was forced into reckoning with my uncertainties and committing to what had to be done, if I wanted to do anything at all. One day in the summer I went to Land's End, a beautiful place on the coast of SF, near golden gate park, with a friend from Boston and we stared at the waves crashing into the rocks, and in the floating sunlight as the wind crashed through our hair we talked about reason and emotion, about learning to listen and not suppress your gut telling you what you really want to do. In 2025, I am getting better at listening to it, before someone else tries to force-implant me an artificial one.

Fulcrum worked out of our house, alternating days of furious coding and then vagabonding across the city. I started using Claude Code around end of May for a hackathon, and was amazed. Anthropic's release brought agents from the domain of research into practice, and I began driving them daily. As I worked on our products, I thought about how humans might interact with agents, and what kinds of technology could leverage the asymmetric abilities of humans and AIs. How to delegate, and orchestrate models, and what infrastructure might allow us to distribute our labor beyond our current capacity for attention. Based on these, I built a few open source prototypes on the future of coding. We also made a system to precisely observe and understand both what your AI model is doing, and what your evals are measuring. Understanding evals is the place to start with model oversight, ie using models to understand and control other models. We had many hesitations on what could work, and what kind of company we could build, but we laid the seeds of our now firm conviction. We got resources, gathered more people, and are building the ship to carry us up into the stars. This year, we publicly launched our evaluations tool and platform for running and debugging agent systems. We will be releasing much more soon. We want to build the technology the future will need, with full freedom, and the people we love working with. I am very happy about it, and hope we can execute on the ideas that will matter. In the nights, which were often short, due to the incessant ambulances and noise of our neighborhood, I often wrote, or read. I read The Road, and enjoyed its short prose that jumped to evocative and airy images, and built up a wasteland of cannibals and hunger and the nature dying with the men, as a child and his father make their way through the defunct continent.

I took a cab one day from San Francisco to Berkeley to meet some customers, and the driver was a man named Augustine from Nigeria. I chatted with him for the whole ride, and he told me about how he came to America in 1991, how he was shipped off to marry someone, how the valley has changed and grown colder, and how when he first came here he went to the park and sat in the dirt and imagined spirits, urging him on, giving him a strength that carried all the lives of the dead and living who make their bread in that place. He gave me advice for my new life. He told me to keep going on as I was, and urged ominously that I should make sure to remember him in my paradise.

In the fall I alternated between SF and Boston, having to wrap up some final responsibilities of my time as a student. I visited pika, the house in which I've been living for the past while and that I moved out of in January 2025. Pika is a miracle of coordination - feeding everyone with a public mealplan where people cook together, which I ran in January, and providing them with a warm, well organized home I was very glad to call mine. I will miss my late nights there eating snacks with other pikans, and watching movies in the basement, or cooking for all my friends. I also revisited East Campus, my other home at MIT. I danced with my friends there, I looked at my old room, I got nostalgic. I will remember the dreamy warmth of these communities, their openness, the way they have the agency of SF without the single-mindedness, the machine shops where someone with dyed hair is always up building something new, maybe a radio system, a motorized shopping cart, a new LED display for the parties. These places made me, and I will carry them with me. I said my goodbyes. I went climbing again, with another friend from Boston, and we talked about writing and poetry, about why we wrote, about abstractions and whether they had their place in art, whether a poem has to be constructed or felt, written for yourself or for others, and then we kept climbing. I read Valéry. The same friend gave me the book Oblivion by David Foster Wallace, and its stories inspired me with their detail, the attention given to worlds that could not exist, that were conjured as precisely as if describing some kind of ridiculous, absurd alternate reality, that had been felt and lived. I paid more attention to things, and tried to write things that were more concrete. I went to a play that inspired me, and I started paying more attention to people and their faces, and the way I moved my own body.

In December, we launched our latest products, finalized decisions for the research internship we are running in January, and shipped all of our final remaining belongings from Boston to SF, as well as getting a new office. We have learned so much this year, and we are excited to show you what we can do.

I have deep gratitude for 2025. It was a year of great joys and great pains — a year like dark metal, melted and annealed again and again, moving from fluid to form and into strength. Its transformations etched a whole world into me. The forge keeps hammering. 2026 has begun, and we live in a period of rapid change.

I hope we remember each other in our assorted paradises, whatever pain or joy they bring us.

Lists of the year

Writing

I wrote more this year! I have two substacks now, one for technical takes and one for more personal writing.

I did some technical writing, for/with fulcrum and on my own:

More personal essays and poetry:

I also wrote a few more poems I haven't put up yet. I hope to keep writing in balance with my work.

Things I want to write about, if you're interested:

  • Alignment as capabilities
  • Personalization and gradual disempowerment
  • Emotions as integrators
  • Concreteness and abstraction in writing
  • Towards an aesthetics for cyborgs

And other things, I'm sure.

Books

Great

  • The Things They Carried by Tim O'Brien
  • Twice Alive by Forrest Gander
  • Oblivion by David Foster Wallace
  • The Road by Cormac McCarthy
  • A Moveable Feast by Ernest Hemingway
  • A Portrait of the Artist as a Young Man by James Joyce
  • The Bluest Eye by Toni Morrison

Good

  • Never Let Me Go by Kazuo Ishiguro
  • Impro: Improvisation and the Theatre by Keith Johnstone
  • Talking at the Boundaries by David Antin
  • The Unaccountability Machine by Dan Davies
  • On the Motion and Immobility of Douve by Yves Bonnefoy
  • No Longer Human by Osamu Dazai
  • The Baron in the Trees by Italo Calvino
  • Notes from Underground by Fyodor Dostoevsky
  • Elon Musk by Walter Isaacson

Check out my goodreads for more info; I will review some of these soon. I had a lot of hits this year!

Movies

Also on letterboxd.

Great

  • Paths of Glory
  • Certified Copy
  • Ma nuit chez Maud
  • La collectionneuse
  • Synecdoche New York

Good

  • Parasite
  • Betty Blue
  • Wake up dead man
  • Perfect Blue
  • The color of pomegranates

Okay

  • I, Tonya
  • The cabinet of Dr Caligari

Links

In random order:



Discuss

2025 in AI predictions

2026-01-02 12:29:27

Published on January 2, 2026 4:29 AM GMT

Past years: 2023 2024

Continuing a yearly tradition, I evaluate AI predictions from past years, and collect a convenience sample of AI predictions made this year. I prefer selecting specific predictions, especially ones made about the near term, enabling faster evaluation.

Evaluated predictions made about 2025 in 2023, 2024, or 2025 mostly overestimate AI capabilities advances, although there's of course a selection effect (people making notable predictions about the near-term are more likely to believe AI will be impressive near-term).

As time goes on, "AGI" becomes a less useful term, so operationalizing predictions is especially important. In terms of predictions made in 2025, there is a significant cluster of people predicting very large AI effects by 2030. Observations in the coming years will disambiguate.

Predictions about 2025

2023

Jessica Taylor: "Wouldn't be surprised if this exact prompt got solved, but probably something nearby that's easy for humans won't be solved?"

The prompt: "Find a sequence of words that is: - 20 words long - contains exactly 2 repetitions of the same word twice in a row - contains exactly 2 repetitions of the same word thrice in a row"

Self-evaluation: False; I underestimated LLM progress, especially from reasoning models.

2024

teortaxesTex: "We can have effectively o3 level models fitting into 256 Gb VRAM by Q3 2025, running at >40 t/s. Basically it’s a matter of Liang and co. having the compute and the political will to train and upload r3 on Huggingface."

Evaluation: False, but close. DeepSeek V3.1 scores worse than o3 according to Artificial Analysis. DeepSeek V3.2 scores similarly but was Q4 2025.

Jack Gallagher: "calling it now - there's enough different promising candidates rn that I bet by this time [Oct 30] next year we mostly don't use Adam anymore."

Evaluation: Partially correct. Muon is popular, and was used for Kimi K2 and GLM 4.5. Self-evaluated as: “more mixed than I expected. In particular I was expecting more algorithmic iteration on muon.”

Elon Musk: "AI will probably be smarter than any single human next year.”

Evaluation: Mostly false, though jagged capabilities make evaluation difficult.

Aidan McLau: "i think it’s likely (p=.6) that an o-series model solves a millennium prize math problem in 2025"

Evaluation: False

Victor Taelin: "I'm now willing to bet up to 100k (but no more than that, I'm not Musk lol) that HOC will have AGI by end of 2025.... AGI defined as an algorithm capable of proving theorems in a proof assistant as competently as myself. (This is an objective way to say 'codes like Taelin'.)"

Evaluation: False

Predictions made in 2025 about 2025

Gary Marcus: “No single system will solve more than 4 of the AI 2027 Marcus-Brundage tasks by the end of 2025. I wouldn’t be shocked if none were reliably solved by the end of the year.”

Evaluation: Correct. AI can perhaps pass the reading comprehension task. But not any 4 of the tasks.

Dario Amodei: “In 3 to 6 months… AI is writing 90 percent of the code.”

Evaluation (6 months being Sep 2025): False in the relevant sense. (“Number of lines”, for example, is not a relevant metric.)

@kimmonismus: “Give it 6 more months so that [Manus is] faster, more reliable and more intelligent and it will replace 50% of all white collar jobs.”

Evaluation: False

Miles Brundage: “‘When will we get really dangerous AI capabilities that could cause a very serious incident (billions in damage / hundreds+ of people dead)?’ Unfortunately, the answer seems to be this year, from what I can tell.”

Evaluation: Likely false. No strong indication that this is true.

Testingthewaters: “I believe that within 6 months this line of research [online in-sequence learning] will produce a small natural-language capable model that will perform at the level of a model like GPT-4, but with improved persistence and effectively no “context limit” since it is constantly learning and updating weights.”

Evaluation: False

@chatgpt21: “75% on humanity’s last exam by the end of the year.”

Evaluation: False (Gemini 3 Pro has a high score of 37.2%)

Predictions made in 2025

2026

Mark Zuckerberg: “We're working on a number of coding agents inside Meta... I would guess that sometime in the next 12 to 18 months, we'll reach the point where most of the code that's going toward these efforts is written by AI. And I don't mean autocomplete.”

Bindu Reddy: “true AGI that will automate work is at least 18 months away.”

Elon Musk: “I think we are quite close to digital superintelligence. It may happen this year. If it doesn't happen this year, next year for sure. A digital superintelligence defined as smarter than any human at anything.”

Emad Mostaque: “For any job that you can do on the other side of a screen, an AI will probably be able to do it better, faster, and cheaper by next year.”

David Patterson: “There is zero chance we won't reach AGI by the end of next year. My definition of AGI is the human-to-AI transition point - AI capable of doing all jobs.”

Eric Schmidt: “It’s likely in my opinion that you’re gonna see world-class mathematicians emerge in the next one year that are AI based, and world-class programmers that’re gonna appear within the next one or two years”

Julian Schrittwieser: “Models will be able to autonomously work for full days (8 working hours) by mid-2026.”

Mustafa Suleyman: “it can take actions over infinitely long time horizons… that capability alone is breathtaking… we basically have that by the end of next year.”

Victor Taelin: “AGI is coming in 2026, more likely than not”

François Chollet: “2026 [when the AI bubble bursts]? What cannot go on forever eventually stops.”

Peter Wildeford: “Currently the world doesn’t have any operational 1GW+ data centers. However, it is very likely we will see fully operational 1GW data centers before mid-2026.”

Will Brown: “registering a prediction that by this time next year, there will be at least 5 serious players in the west releasing great open models”

Davidad: “I would guess that by December 2026 the RSI loop on algorithms will probably be closed”

Teortaxes: “I predict that on Spring Festival Gala (Feb 16 2026) or ≤1 week of that we will see at least one Chinese company credibly show off with hundreds of robots.”

Ben Hoffman: “By EoY 2026 I don’t expect this to be a solved problem, though I expect people to find workarounds that involve lowered standards: https://benjaminrosshoffman.com/llms-for-language-learning/” (post describes possible uses of LLMs for language learning)

Gary Marcus: “Human domestic robots like Optimus and Figure will be all demo and very little product.”

2027

Anthropic: “we expect powerful AI systems will emerge in late 2026 or early 2027… Intellectual capabilities matching or exceeding that of Nobel Prize winners across most disciplines… The ability to navigate all interfaces available to a human doing digital work today… The ability to autonomously reason through complex tasks over extended periods—hours, days, or even weeks… The ability to interface with the physical world”

Anthony Aguirre: “Humanity has got about a year or two left to decide whether we're going to replace ourselves with machines – starting individually, then as a species.”

Kevin Roose: “I believe that very soon — probably in 2026 or 2027, but possibly as soon as this year — one or more A.I. companies will claim they’ve created an artificial general intelligence, or A.G.I., which is usually defined as something like “a general-purpose A.I. system that can do almost all cognitive tasks a human can do… the broader point — that we are losing our monopoly on human-level intelligence, and transitioning to a world with very powerful A.I. systems in it — will be true.”

Daniel Jeffries: “AI will not be doing any job a human can do in the next year or two years.”

David Shapiro: “The curve is steepening. ASI by 2026 or 2027 confirmed.”

Taylor G. Lunt: “AI will not substantially speed up software development projects [by end of 2027]. For example, the AI 2027 prediction that 2025-quality games will be made in a single month by July 2027 is false.”

Paul Schrader: “I think we’re only two years away from the first AI feature [film].”

Miles Brundage: “very roughly, something pretty clearly superhuman in most respects by end of 2027 + also very big stuff before then”

2028

AI 2027, original: “We forecast when the leading AGI company will internally develop a superhuman coder (SC): an AI system that can do any coding tasks that the best AGI company engineer does, while being much faster and cheaper” (2028 as a rough median for April 2025 numbers).

Shane Legg: “Of course this now means a 50% chance of AGI in the next 3 years!”

Dario Amodei: “at some point we’re going to get to AI systems that are better than almost all humans at almost all tasks… a country of geniuses in a datacenter… we’re quite likely to get in the next 2-3 years”

Andrew Critch: “By year, I’d say… p(AGI by eoy 2028) = 75%... AGI = AI that at runtime, for cheaper than a human, can replace the human in the power-weighted majority of human jobs.” (He gives other probabilities for different years.)

80000 hours: “extrapolating the recent rate of progress suggests that, by 2028, we could reach AI models with beyond-human reasoning abilities, expert-level knowledge in every domain, and that can autonomously complete multi-week projects, and progress would likely continue from there.”

Sholto Douglas: “we’re near guaranteed at this point to have, effectively, models that are capable of automating any white-collar job by 2027-2028, near guaranteed end of decade… we need to make sure we pull in the feedback loops with the real world”

Dwarkesh Patel: “AI can do taxes end-to-end for my small business as well as a competent general manager could in a week: including chasing down all the receipts on different websites, finding all the missing pieces, emailing back and forth with anyone we need to hassle for invoices, filling out the form, and sending it to the IRS: 2028 [median]”

Nikola Jurkovic: “More concretely, my median is that AI research will be automated by the end of 2028, and AI will be better than humans at >95% of current intellectual labor by the end of 2029.”

Ryan Greenblatt: “I expect doubling times of around 170 days on METR’s task suite (or similar tasks) over the next 2 years or so which implies we’ll be hitting 2 week 50% reliability horizon lengths around the start of 2028.”

OpenAI: “In 2026, we expect AI to be capable of making very small discoveries. In 2028 and beyond, we are pretty confident we will have systems that can make more significant discoveries (though we could of course be wrong, this is what our research progress appears to indicate).”

2029

Sam Altman: “I think AGI will probably get developed during this president’s term”

METR: “If the trend of the past 6 years continues to the end of this decade, frontier AI systems will be capable of autonomously carrying out month-long projects.”

Matthew Barnett: “I don't expect the US labor force participation rate to fall by more than 50% from its current level by 2029, or for US GDP growth to surpass 15% for any year in 2025-2029. Yet I do expect AI 2027 will look ~correct based on vibes until around 2029.”

2030

Demis Hassabis: “as a benchmark for agi… the ability for these systems to invent their own hypotheses or conjectures about science, not just prove existing ones… come up with a new Riemann hypothesis… relativity back in the days that Einstein did it with the information he had… I would say probably like 3-5 years away”

Eliezer Yudkowsky: “Five whole years [until the end of humanity]?  Wow, that's a lot of time.  Way out of line with industry estimates.” (he clarifies it’s partially tongue-in-cheek, but his follow-up suggests he would invest more in the long term if he thought ASI was 5 years away)

@finbarrtimbers: “The bull case for robotics startups is that, if we really are within 2-5 years from AGI (which I believe), then the real world will be the next bottleneck, and starting a robotics company now is clearly the right choice.”

Daniel Faggella: “I'm nearly completely certain that in under 5 years the entire world will be rattled to its foundation, and our attenuation is eminent”

Thane Ruthenis: “I expect AGI Labs' AGI timelines have ~nothing to do with what will actually happen. On average, we likely have more time than the AGI labs say. Pretty likely that we have until 2030, maybe well into 2030s. By default, we likely don't have much longer than that.”

Tamay Besiroglu: “I bet @littmath that in 5 years AI will be able to produce Annals-quality Number Theory papers at an inference budget at or below $100k/paper, at 3:1 odds in my favor.”

Scott Alexander: “I think AI will be able to replace >50% of humans within 5 years.”

@slow_developer: “as someone working in the [AI] industry, i expect my job to be fully done by AI before 2030”

@alz_zyd_: “Prediction: AI will revolutionize pure math, essentially dominating human mathematicians within 5 years, but this won't move the needle on technological progress, because the vast majority of modern pure math is useless for any practical application”

Sam Altman: “I think we’re gonna maintain the same rate of progress, rate of improvement in these models for the second half of the decade as we did for the first… these systems will be capable of remarkable new stuff: novel scientific discovery, running extremely complex functions throughout society”

Epoch AI: “We forecast that by 2030: Training clusters would cost hundreds of billions of dollars”

Gordon Worley: “I put about 70% odds on us failing to solve steering in the next 5 years and thus being unable to build AGI.”

David Krueger: “But the real AI is coming. I think we’ve got about 5 years… I think it’s going to lead to human extinction, probably within a few years of its development. At best, it will cause near-total unemployment.”

Forecasting Research Institute: “LEAP experts forecast major effects of AI by 2030, including: ⚡ 7x increase in AI’s share of U.S. electricity use (1% -> 7%) 🖥️ 9x increase in AI-assisted work hours (2% -> 18%)”

Guive Assadi: “In order to get credit (or blame) for this prediction, I'll say that I think there is a <30% chance that unemployment will be above 10% for any six month span in the United States over the next five years.”

Michael Druggan: “After some back and forth to agree on terms  @ludwigABAP and I have decided that if on Dec 31 2030 @littmath says that AI has not yet generated any interesting math research I will pay Ludwig $1000 and if he says that it has Ludwig will pay me $1000.”

2031

Max Harms: “The authors [of AI 2027] predict a strong chance that all humans will be (effectively) dead in 6 years, and this agrees with my best guess about the future.”

Zvi: “If you put a gun to my head for a typical AGI definition I’d pick 2031 [median], but with no ‘right to be surprised’ if it showed up in 2028 or didn’t show up for a while.”

Eli Lifland: “Pictured is the model trajectory with all parameters set to my median values. [automated coder 5/31; superhuman AI researcher 3/32; ASI 7/34]”

2032

Nathan Lambert: “I think automating the “AI Research Engineer (RE)” is doable in the 3-7 year range — meaning the person that takes a research idea, implements it, and compares it against existing baselines is entirely an AI that the “scientists” will interface with.”

2035

Forethought: “a century of technological progress in a decade — or far more — is more likely than not.”

Roko Mijic: “There's probably going to be an intelligence explosion in the next decade and it's going to get very messy.”

Demis Hassabis: “maybe we can cure all disease with the help of AI… within the next decade or so”

Ege Erdil: “I still think full automation of remote work in 10 years is plausible, because it’s what we would predict if we straightforwardly extrapolate current rates of revenue growth and assume no slowdown. However, I would only give this outcome around 30% chance.”

AI as Normal Technology: “we think that transformative economic and societal impacts will be slow (on the timescale of decades)”

Key Tryer: “I'm going to bet that a top consumer GPU in 2035 would be able to train a GPT-5 level system in a few days, and that containing all that data will be possible on a few consumer storage drives.”

Richard Sutton: “large language models… will not be representative of the leading edge of AI for more than a decade”

Bob McGrew: “the fundamental concepts… the idea of language models with transformers, the idea of scaling the pre-training on those language models… and then the idea of reasoning… more and more multimodal capabilities… in 2035 we’re not gonna see any new trends beyond those”

Dean Ball: “But suppose you also believe that there could be future AI systems with qualitatively different capabilities and risks, even if they may involve LLMs or resemble them in some ways. These future systems would not just be “smarter,” they would also be designed to excel at cognitive tasks where current LLMs fall… My own guess is that they will be built for the first time sometime between 2029 and 2035”

Andrej Karpathy: “I feel like the problems [with building AGI] are tractable, they’re surmountable, but they’re still difficult. If I just average it out, it just feels like a decade to me.”

Daniel Kokotajlo: “The companies seem to think strong AGI is just a few years away, and while I'm not as bullish as they are, I do expect it to happen in the next 5-10 years.”

2039

Ray Kurzweil: “In the 2030s, robots the size of molecules will go into our brains, noninvasively, through the capillaries, and will connect our brains directly to the cloud.”

2040

Liron: “Probably another 1-15 years [until FOOM].”

2045

Ilya Sutskever: “Five to twenty [years until AI can learn as well as a human]”

Andrew Ng: “Modern AI is a general purpose technology that is enabling many applications, but AI that can do any intellectual tasks that a human can (a popular definition for AGI) is still decades away or longer.”

2050

Steven Byrnes: “I don’t know when the next paradigm will arrive, and nobody else does either. I tend to say things like “probably 5 to 25 years”. But who knows!”



Discuss

Debunking claims about subquadratic attention

2026-01-02 12:23:59

Published on January 2, 2026 4:23 AM GMT

TL;DR: In the last couple years, there have been multiple hype moments of the form "<insert paper> figured out subquadratic/linear attention, this is a game changer!" However, all the subquadratic attention mechanisms I'm aware of either are quadratic the way they are implemented in practice (with efficiency improved by only a constant factor) or underperform quadratic attention on downstream capability benchmarks.

 

A central issue with attention is that its FLOP complexity is quadratic in the context length (number of tokens in a sequence) and its memory complexity during inference is linear in the context length. In the last couple years, there have been multiple claims, and hype around those claims, that new architectures solved some (often all) of those problems by making alternatives to attention whose FLOP complexity is linear and/or whose memory complexity during inference is constant. These are often called subquadratic/linear attention (as opposed to regular attention which I’ll call quadratic attention). The ones I’m aware of are Kimi Linear, DeepSeek Sparse Attention (DSA), Mamba (and variants), RWKV (and variants), and text diffusion. If this were true, it would be a big deal because it would make transformer inference a lot more efficient at long contexts.
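To make the asymptotics concrete, here is a rough back-of-the-envelope sketch (my own illustration, not taken from any of the papers discussed below); the hidden size, layer count, and constant factors are made-up illustrative values, and real architectures differ in many details.

    # Back-of-the-envelope scaling sketch (illustrative constants, not a real model config).
    # Quadratic attention: the score matrix is L x L per layer, so FLOPs grow ~ L^2,
    # and the KV cache stores keys and values for all L tokens, so memory grows ~ L.
    # A linear-attention layer keeps a fixed-size state, so FLOPs grow ~ L and
    # inference memory is ~ constant.

    D_MODEL = 4096       # hypothetical hidden size
    N_LAYERS = 60        # hypothetical layer count
    BYTES_PER_VALUE = 2  # fp16

    def quadratic_attention_flops(ctx_len):
        # ~2 multiply-adds per element for QK^T plus the attention-weighted sum over V
        return 2 * 2 * ctx_len**2 * D_MODEL * N_LAYERS

    def linear_attention_flops(ctx_len):
        # a constant amount of state-update work per token, so linear in the context length
        return 2 * ctx_len * D_MODEL**2 * N_LAYERS

    def kv_cache_bytes(ctx_len):
        # keys + values for every token, at every layer
        return 2 * ctx_len * D_MODEL * N_LAYERS * BYTES_PER_VALUE

    for L in (8_000, 128_000, 1_000_000):
        print(f"L={L:>9,}: quadratic attn ~{quadratic_attention_flops(L):.1e} FLOPs, "
              f"linear attn ~{linear_attention_flops(L):.1e} FLOPs, "
              f"KV cache ~{kv_cache_bytes(L) / 2**30:.1f} GiB")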

In this blogpost, I argue that they are all better thought of as “incremental improvement number 93595 to the transformer architecture” than as “subquadratic attention, a more-than-incremental improvement to the transformer architecture”. This is because the implementations that work in practice are quadratic and only improve attention by a constant factor, while the genuinely subquadratic implementations underperform quadratic attention on downstream benchmarks. I think some of them are still important and impressive - for instance, Kimi Linear’s 6.3x increased inference speed at 1-million-token context lengths is impressive. I just argue that they are not particularly special among incremental improvements to the transformer architecture, and not game changers.

  • Kimi Linear and DeepSeek Sparse Attention (DSA) are actually quadratic as they are implemented in practice in the models that Kimi and DeepSeek trained using them. In Kimi Linear’s case, this is because they only use Kimi Linear on ¾ of the layers and use MLA, which is quadratic, on the remaining ¼ of the layers. They do not use Kimi Linear on all layers because it degrades downstream benchmark performance too much. In the setting where improvement is biggest (inference with a context length of 1M tokens) the improvement is 4x in terms of KV cache size (memory) and 6.3x in terms of inference speed. There is also a modest improvement in downstream benchmark performance. DSA does not reduce KV cache size but decreases per-token cost by a bigger factor of about 3x (prompt) and 7x (output) at the maximal context length of 128k tokens. It is still quadratic.
  • Kimi are very clear about this in the paper and say everything I said here in the abstract. However, some people (not from Kimi) still hype Kimi Linear as subquadratic attention, which is why I included it here. Kimi is not to blame here and wrote an excellent paper.
  • This is clear after a careful reading of DeepSeek’s paper, though DeepSeek emphasizes this less than Kimi.
  • Mamba and RWKV do actually have a linear FLOP complexity and constant memory complexity during inference. However, while they perform comparably to attention in small-to-medium-size models, they seem to underperform attention in terms of downstream benchmark performance at frontier scale and are not used in frontier LLMs. My main reason for believing this is that I do not know of any frontier LLM that uses them, except for Mamba-attention hybrid models - models that have Mamba on a fraction of layers and quadratic attention on the other layers (see the appendix for why this is still quadratic). Some papers on frontier Mamba-attention hybrid models do preliminary analysis comparing pure Mamba and Mamba-attention hybrid models. When they do, they usually say that pure Mamba models underperformed hybrids and that this is why they stuck to hybrid architectures. This provides empirical validation that pure Mamba underperforms hybrid architectures. A few 7B models do use pure Mamba, and their papers find that it is as good as or even a bit better than quadratic attention on downstream capability benchmarks - for example, Codestral Mamba. However, the overwhelming majority of 7B models still use quadratic attention.
  • While text diffusion models can greatly reduce memory usage by eliminating the need for KV caches entirely, they do not reduce the FLOP usage. In fact, they multiply the number of FLOPs needed for inference by a constant factor. Furthermore, as with pure Mamba, no frontier model uses text diffusion, and only a small number of sub-frontier models use it.
  • There exist many incremental improvements that reduce FLOP and/or memory usage of attention by a constant factor that are not derived from, or related to, subquadratic attention. A probably non-exhaustive list of such improvements no one claims are subquadratic attention is: flash attention, Grouped Query Attention (GQA), sliding window attention (on some but not all layers), sparse attention, Multi Latent Attention (MLA), and making MLPs wider and attention narrower.

Appendix: Short explanation of how each subquadratic attention mechanism works and why it is not actually subquadratic

RWKV and Mamba

These are entirely different mechanisms from attention that can be thought of as (much) better RNNs. They are actually subquadratic (in fact, linear), but they seem to underperform attention at frontier LLM scale, as argued above. Mamba-attention hybrids do scale, but they are quadratic, as explained below for Kimi Linear.

Kimi Linear

Similar to Mamba and RWKV, Kimi Linear can be thought of as a (much) better RNN and it does actually have a linear FLOP complexity and constant memory complexity during inference. However, as said in the Kimi Linear paper, they use Kimi Linear at ¾ of layers and Multi Latent Attention (which is quadratic) on the remaining ¼ of layers. They say in the paper that when they tried using Kimi Linear on every layer, the hit to performance from doing this was too big:

Despite efficiency, pure Linear Attention still struggle with precise memory retrieval and exact copying. This deficiency hinders their adoption in industrial-scale LLMs where robust long-context recall (e.g., beyond 1M tokens) and reliable tool-use over extensive code repositories are critical.

And:

For Kimi Linear, we chose a layerwise approach (alternating entire layers) over a headwise one (mixing heads within layers) for its superior infrastructure simplicity and training stability. Empirically, a uniform 3:1 ratio, i.e., repeating 3 KDA layers to 1 full MLA layer, provided the best quality–throughput trade-off.

Thus, Kimi Linear as used in practice reduces the FLOPs and memory used by the attention mechanism by a constant factor: at long context lengths, only the ¼ of layers that keep MLA still pay the quadratic cost, so the reduction approaches 4x (and is smaller at shorter context lengths).

(Note on why the speed improvement at a 1-million-token context is 6.3x rather than 4x: in addition to making attention FLOPs almost 4x cheaper at long context lengths, Kimi Linear shrinks the KV cache by almost 4x, which allows batch sizes almost 4x larger and thus faster inference beyond the attention-FLOP savings alone.)
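A minimal numeric sketch of this reasoning, under the idealization that the remaining MLA layers dominate attention cost at very long context; the ~6.3x figure is the paper's measurement, not something this arithmetic derives.

    # Idealized sketch of the constant-factor argument at very long context:
    # 3 of every 4 layers use linear-cost KDA; 1 of 4 keeps quadratic MLA, which then
    # dominates both the attention FLOPs and the KV-cache size.

    mla_layer_fraction = 1 / 4

    attention_flop_reduction = 1 / mla_layer_fraction  # approaches ~4x at long context
    kv_cache_reduction = 1 / mla_layer_fraction        # approaches ~4x at long context
    batch_size_gain = kv_cache_reduction               # smaller cache -> proportionally bigger batches

    print(f"attention FLOPs reduced ~{attention_flop_reduction:.0f}x, "
          f"KV cache reduced ~{kv_cache_reduction:.0f}x, "
          f"batches ~{batch_size_gain:.0f}x bigger (measured end-to-end speedup: ~6.3x)")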

DeepSeek Sparse Attention (DSA)

DSA was introduced in the DeepSeek V3.2 paper and DeepSeek V3.2, a frontier model, uses it. It works in the following way:

  • At each layer, the lightning indexer, which is a modified attention mechanism, chooses 2048 positions.
  • A regular Multi Latent Attention (MLA) mechanism only attends to those positions.

Thus, DSA’s FLOP complexity has two components: the lightning indexer has (up to a constant) the same complexity as regular MLA (which is quadratic), and the subsequent MLA has linear complexity at big context lengths - min(context_length**2, 2048 * context_length).

So if the lightning indexer is in practice hugely cheaper than the subsequent MLA, the complexity is linear, but if it is only cheaper by a small constant factor, the complexity is still quadratic, just smaller by a small constant factor.

And the theoretical FLOP usage of the lightning indexer is only smaller by a factor of 8, so complexity is still quadratic (at least in terms of theoretical FLOP usage). Here is the calculation that leads to 8: first, n_heads * d_head of the lightning indexer is half that of n_heads * d_head of the subsequent MLA. This is not written in the paper, but can be seen by inspecting the model’s config on HuggingFace. Then, the lightning indexer only has keys and queries, no values and outputs, so that’s another factor of 2. Finally, the lightning indexer is in FP8, not FP16, which is another factor of 2.
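A small sketch reproducing that factor-of-8 estimate and the shape of DSA's per-layer attention cost. It models attention FLOPs only, not the full per-token dollar cost that DeepSeek's figure 3 reports, and the constants are the ones stated above.

    # Reconstruction of the factor-of-8 estimate for the lightning indexer vs. the main MLA,
    # plus the shape of DSA's per-layer attention cost.

    head_dim_ratio = 2      # indexer's n_heads * d_head is half of the main MLA's
    no_values_outputs = 2   # indexer has only queries and keys, no values/outputs
    fp8_vs_fp16 = 2         # indexer runs in FP8 rather than FP16

    indexer_cost_reduction = head_dim_ratio * no_values_outputs * fp8_vs_fp16
    print(f"lightning indexer is ~{indexer_cost_reduction}x cheaper than the main MLA")  # -> 8

    TOP_K = 2048  # positions selected per query

    def dsa_cost(ctx_len):
        """Per-layer attention cost, in units where full MLA over the whole context costs ctx_len**2."""
        indexer = ctx_len**2 / indexer_cost_reduction  # still quadratic, just ~8x cheaper
        sparse_mla = min(ctx_len**2, TOP_K * ctx_len)  # linear once ctx_len > 2048
        return indexer + sparse_mla

    for L in (8_000, 128_000):
        print(f"L={L:>7,}: DSA attention cost / full-MLA cost = {dsa_cost(L) / L**2:.3f}")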

For prefill (prompt) tokens, this calculation matches DeepSeek’s empirical findings: figure 3 in the DeepSeek V3.2 paper shows that the slope of cost (in dollars) per token as a function of position in the sequence is about 8x smaller than for MLA at big context lengths. For decoding (output) tokens, the slope is about 20x smaller, not 8x, but this is still a constant factor improvement. The improvements in per-token cost for the token at position 128k are 3.5x for prefill tokens and 9x for decoding tokens (if you look at the average token at context length 128k and not only at the last one, they go down to 3x and 7x). Note that in November 2025 (the latest date for which data is available as of writing this blogpost), OpenRouter processed 8x more prompt tokens than output tokens.

Furthermore, DSA does not reduce the KV cache size (because the 2048 tokens it attends to are different for every generated token and only known when that token is generated). This is important, because an important way in which subquadratic attention is good (for capabilities) is by increasing inference speed by reducing KV cache size which allows bigger batch sizes during inference (thus making inference cheaper) and allowing for longer context lengths by being able to have KV cache for more tokens per gigabyte of GPU memory.

Text Diffusion

Autoregressive LLMs (that is, all LLMs except for text diffusion LLMs) generate output tokens one by one in sequence, doing one forward pass per output token. A text diffusion LLM generates all the tokens at once in a single forward pass, but leaves X% of tokens blank. Then, it generates tokens in place of Y% of the blank tokens, also in a single forward pass. It repeats this a fixed number of times, after which no blank tokens remain.

Thus, while text diffusion eliminates the need for KV caches, it multiplies the FLOP usage on output tokens by a constant factor - the number of forward passes needed until no blank tokens remain.
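An idealized sketch of that FLOP count; the number of output tokens and denoising steps are made-up illustrative values, and attention's own dependence on context length is ignored.

    # Idealized count of "forward-pass token slots" for generating N output tokens.
    #
    # Autoregressive: N forward passes, each over 1 new token    -> ~N token-passes.
    # Text diffusion: K denoising passes, each over all N tokens -> ~K * N token-passes.

    N_OUTPUT_TOKENS = 1_000  # illustrative
    DIFFUSION_STEPS = 8      # hypothetical number of unmasking rounds

    autoregressive_token_passes = N_OUTPUT_TOKENS
    diffusion_token_passes = DIFFUSION_STEPS * N_OUTPUT_TOKENS

    print(f"autoregressive: ~{autoregressive_token_passes:,} token-passes")
    print(f"text diffusion: ~{diffusion_token_passes:,} token-passes "
          f"(~{DIFFUSION_STEPS}x the FLOPs, though each pass is one big batched operation)")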

(But wait, don’t autoregressive LLMs do one forward pass per output token, thus using more FLOPs than text diffusion models if the number of output tokens is big enough? No. Autoregressive LLMs do indeed do one forward pass per output token and thus usually do more forward passes than diffusion models. But they do each forward pass on only one token, whereas text diffusion LLMs do each forward pass on all the output tokens at once. Thus, each forward pass of a text diffusion LLM requires as many FLOPs as all the forward passes of an autoregressive LLM combined. Text diffusion LLMs can be more efficient than autoregressive models in practice because it is usually more efficient on GPUs to do one big operation than many small operations in sequence, even when both require the same number of FLOPs[1]. However, these efficiency improvements only help until inference becomes bottlenecked by FLOPs.)

  1. ^

    This last sentence is oversimplified - another thing that matters here is the shapes of matrices that GPUs multiply. But this is out of the scope of this blogpost.



Discuss

The bio-pirate's guide to GLP-1 agonists

2026-01-02 11:32:53

Published on January 2, 2026 3:32 AM GMT

How to lose weight, infringe patents, and possibly poison yourself for 22 Euros a month.

Introduction

In March 2025, Scott Alexander wrote:

Others are turning amateur chemist. You can order GLP-1 peptides from China for cheap. Once you have the peptide, all you have to do is put it in the right amount of bacteriostatic water. In theory this is no harder than any other mix-powder-with-water task. But this time if you do anything wrong, or are insufficiently clean, you can give yourself a horrible infection, or inactivate the drug, or accidentally take 100x too much of the drug and end up with negative weight and float up into the sky and be lost forever. ACX cannot in good conscience recommend this cheap, common, and awesome solution.

With a BMI of about 28, low executive function, a bit of sleep apnea and no willpower to spend on either dieting or dealing with the medical priesthood, I thought I would give it a try. This is a summary of my journey.

Please do not expect any great revelations here beyond "you can buy semaglutide from China, duh". All of the details here can also be found elsewhere, still I thought it might be helpful to write them down.

Also be careful when following medical advice from random people on the internet. The medical system is full of safeguards to make very sure that no procedure it does will ever hurt you. Here you are on your own. I am not a physician, just an interested amateur with a STEM background. If you do not know if it is ok to reuse syringes or inject air, or do not trust yourself to calculate your dose, I would recommend refraining from DIY medicine.

Picking a substance and route of administration

The two main approved GLP-1 agonists are tirzepatide and semaglutide. Both are peptides (mini-proteins) with a mass of about 4-5kDa which cost approximately the same to produce. A typical long term dose of tirzepatide is 15mg/week, while semaglutide is 2.4mg/week, so I focused on sema because it looked like the cheaper option. [1]

While I would have preferred oral to subcutaneous injection, the bioavailability of oral semaglutide is kinda terrible, with typical long term doses around 14mg/day -- forty times the amount of subcutaneous injection. So I resolved to deal with the hassle of poking myself with needles and possibly giving myself 'horrible infections'.

Given that the long term dosage is 2.4mg/week, and that sources generally state that once opened, vials should be used within four (or eight) weeks, I decided that the optimal vial size would be 10mg -- enough for four weeks. [2]

Finding a vendor

So I started searching the web for possible vendors of lyophilized semaglutide. I found a wide range of prices from 250$ for a 10mg vial (which would last four weeks at maximum dosage) down to about 50$. And a single website which offered ten vials of 5mg each for 130$.

That one seemed to be a Chinese manufacturer of organic compounds [3]. Slightly broken English, endless lists of chemicals by CAS number, no working search function on the website, outdated and incomplete price info provided as a jpeg on the site. I figured that if it was a scam site, it was doing a remarkably good job of matching my preconception of what the site of a company more enthusiastic about synthesis than about selling to consumers would look like, and contacted them. After I was provided with current pricing (also as a series of jpegs), I sent them about 200 Euros worth of BTC [4] for ten 10mg vials plus shipping. (Shipping was 70$, probably indicative of a preference to sell in larger quantities.)

A week or so later, I got my delivery. Ten unmarked vials, of a volume of about 3ml each, filled to perhaps a third with a block of white stuff. [5]

I would have preferred to have a quantitative analysis of the contents, but all the companies in Germany I contacted were either unwilling to deal with consumers or unable to perform HPLC-MS, so I reasoned that the vendor would be unlikely to sell me vials filled with botulinum toxin instead [6], and just started with injecting myself with 1/40th of a vial per week, which would amount to 0.25mg if the content was as advertised.

(If anyone has a great, affordable way for peptide analysis, please let me know in the comments!)

Sourcing bacteriostatic water

Unlike random pharmaceuticals, bacteriostatic water can be legally sold in Germany. Sadly, it would have cost me almost as much as the active ingredient, about 15 Euro per vial. So instead, I decided to craft my own bacteriostatic water. I sourced a lifetime supply of benzyl alcohol for a couple of Euros. Instead of dealing with distilled water, I bought sealed medical-grade plastic vials of 0.9% NaCl solution "for inhalation", roughly 0.5 Euro a piece. Once a month, I add 0.1ml benzyl alcohol (naturally sterile) to one 5ml plastic vial, which gives me about 2% benzyl alcohol, twice what is specified for BAC, erring on the side of caution (and tissue damage from alcohol, I guess).

Other equipment

I already had a bunch of sterile 20G hypodermic needles and 3ml syringes from another project. For injection of minute quantities of liquids into my body, I bought sterile 1ml insulin syringes with 6mm, 31G needles (@0.3 Euro). [7]

Happily, I owned a fridge and a disinfectant spray, completing my toolset.

My current procedure

Every four weeks, I prepare a new vial. I prefer to fill the vials with 3ml of my BAC+, which should give 3.3mg/ml of sema.

Apply disinfectant to hands and workspace to taste. Then, start with opening a new plastic vial of NaCl. Using an insulin syringe, add 0.1ml benzyl alcohol to it. Unseal a 3ml syringe and needle and draw and release your BAC+ from the plastic vial a few times to mix it. Now draw 3ml of that, tear off the plastic cap [8] of your new glass vial, stick the needle through the rubber seal and slowly inject your BAC into the vial. The vial will be low-pressure, so getting liquid into it is really easy. Shake a bit and wait for the lyophilized peptide to dissolve. Store it in a fridge (preferably in the plastic box with the other vials), and liberally apply disinfectant to the rubber seal before and after each use.

To draw a dose, first figure out the volume you need. Disinfect, unseal your 1ml syringe, first inject an equal amount of air into the vial, then turn the vial rubber side down and draw the liquid. To start with, I would recommend drawing 0.1ml more than you need, because you will likely have some air bubbles in it. Remove the needle, get rid of the excess air (and excess liquid). Check that you are in a private place, expose your thighs, apply disinfectant, pinch the skin of your thigh with two fingers, stick in the needle with the other hand, push down the plunger. Cover up your thighs, put your vial back into the fridge, safely dispose of your needle. [9]
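For those who like to double-check the arithmetic, here is a small calculator mirroring the numbers above (vial size, diluent volume, and the doses mentioned in this post); it illustrates the math only and is not dosing advice.

    # Back-of-the-envelope reconstitution and dose arithmetic from this post (not dosing advice).

    VIAL_MG = 10.0           # nominal peptide per vial
    DILUENT_ML = 3.0         # BAC+ added to the vial
    BENZYL_ALCOHOL_ML = 0.1  # added to one 5 ml NaCl vial
    NACL_ML = 5.0

    concentration_mg_per_ml = VIAL_MG / DILUENT_ML
    benzyl_alcohol_pct = 100 * BENZYL_ALCOHOL_ML / (NACL_ML + BENZYL_ALCOHOL_ML)

    def draw_volume_ml(dose_mg):
        return dose_mg / concentration_mg_per_ml

    print(f"concentration: {concentration_mg_per_ml:.2f} mg/ml")
    print(f"benzyl alcohol: ~{benzyl_alcohol_pct:.1f}%")
    for dose in (0.25, 2.4):  # starting and maximum weekly doses mentioned in the post
        print(f"{dose} mg dose -> draw {draw_volume_ml(dose) * 1000:.0f} uL ({draw_volume_ml(dose):.3f} ml)")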

Outcome

Having taken semaglutide as scheduled for some 22 weeks, I have recently cut my dosage in half because I have reached a BMI of 22.

Traditionally, weight loss was seen as a moral battle: eat less than you want to eat, eat different things than you want to eat, do more sports than you want to do. Basically, spend willpower points to lose weight.

GLP-1 agonists are a cheat code, like reaching enlightenment through psychedelics instead of years of meditation, or bringing a gun to a sword fight. I literally spend zero willpower points in this endeavor. I continue to eat what I want and how much I want, it just so happens I want less food. I am under no illusion that this cheat will give me the full benefits of exercise and proper diet. But realistically, these were never an option for me (until someone discovers an infinite willpower cheat).

Will I rebound once I quit sema? Not a problem: at 20 Euros a month, the drug pays for itself in money not spent on chocolate, and it is less of a hassle than taking my other pills once a day, so I am willing to continue to take it for the rest of my life.

Thanks to Scott Alexander for pointing out this option and to my Chinese vendor for providing Westerners like myself with cheap bodily autonomy.

Up next: The bio-pirate's guide to DIY MAID (due in a few decades, PRNS).

  1. I should probably add that the price difference is not all that large. 60mg tirzepatide vials cost perhaps twice as much as 10mg sema vials, because a lot of it is dose-independent overhead. ↩︎

  2. Also note that the standard recommended dose schedule calls for minimal doses at the start, which will come down to 75uL. This would be even less convenient using 20mg/vial. ↩︎

  3. The sha1-sum of their 14-character domain name is 6682ca2d70b203e0487c49d868ea20401b5ede1c. Note that I can not vouch for their production chain (obviously), but am personally very happy with my dealings with them and plan to buy my next pack of vials from them as well. I do not want to link them to avoid this looking like a sketchy drug advertisement. DM me if you can't find them. ↩︎

  4. Their slightly unfavorable exchange rate for bitcoin happily negated the exchange rate between US$ and Euro, simplifying my calculations here. ↩︎

  5. I was suspicious about that amount, as I would have imagined that 10mg would be a lot less. While I do not have a scale on that level of precision, I confirmed that the residue of a dried droplet was much less indeed. ↩︎

  6. If I had been wrong about that, I would at least have contributed to raising the sanity waterline. ↩︎

  7. Obligatory whining about excessive paternalism: A few years ago, I could buy medical grade syringes and needles (Braun) from Amazon Germany. These days, all the offers say "for research purposes only". When did society decide that people having access to medical grade equipment for any purpose was a bad thing? Is anyone under the impression that denying heroin addicts, needleplay enthusiasts, or peptide enjoyers access to sterile equipment will result in them abstaining from risky behavior? ↩︎

  8. When I was taking ketamine for depression (filling injection vials into a nasal spray), I did not know about the plastic caps. Turns out it is really hard to pierce them with a hypodermic needle. ↩︎

  9. I recap, which is fine here because I already have all the pathogens in my blood which might be on the needle, and then collect the needles in an empty bottle for disposal. ↩︎



Discuss

College Was Not That Terrible Now That I'm Not That Crazy

2026-01-02 07:14:58

Published on January 1, 2026 11:14 PM GMT

Previously, I wrote about how I was considering going back to San Francisco State University for two semesters to finish up my Bachelor's degree in math.

So, I did that. I think it was a good decision! I got more out of it than I expected.

To be clear, "better than I expected" is not an endorsement of college. SF State is still the same communist dystopia I remember from a dozen years ago—a bureaucratic command economy dripping in propaganda about how indispensable and humanitarian it is, whose subjects' souls have withered to the point where, even if they don't quite believe the propaganda, they can't conceive of life and work outside the system.

But it didn't hurt this time, because I had a sense of humor about it now—and a sense of perspective (thanks to life experience, no thanks to school). Ultimately, policy debates should not appear one-sided: if things are terrible, it's probably not because people are choosing the straightforwardly terrible thing for no reason whatsoever, with no trade-offs, coordination problems, or nonobvious truths making the terrible thing look better than it is. The thing that makes life under communism unbearable is the fact that you can't leave. Having escaped, and coming back as a visiting dignitary, one is in a better position to make sense of how and why the regime functions—the problems it solves, at whatever cost in human lives or dignity—the forces that make it stable if not good.

Doing It Right This Time (Math)

The undergraduate mathematics program at SFSU has three tracks: for "advanced studies", for teaching, and for liberal arts. My student record from 2013 was still listed as on the advanced studies track. In order to graduate as quickly as possible, I switched to the liberal arts track, which, beyond a set of "core" courses, only requires five electives numbered 300 or higher. The only core course I hadn't completed was "Modern Algebra I", and I had done two electives in Fall 2012 ("Mathematical Optimization" and "Probability and Statistics I"), so I only had four math courses (including "Modern Algebra I") to complete for the major.

"Real Analysis II" (Fall 2024)

My last class at SF State in Spring 2013 (before getting rescued by the software industry) had been "Real Analysis I" with Prof. Alex Schuster. I regret that I wasn't in a state to properly focus and savor it at the time: I had a pretty bad sleep-deprivation-induced psychotic break in early February 2013 and for a few months thereafter was mostly just trying to hold myself together. I withdrew from my other classes ("Introduction to Functions of a Complex Variable" and "Urban Issues of Black Children and Youth") and ended up getting a B−.

My psychiatric impairment that semester was particularly disappointing because I had been looking forward to "Real Analysis I" as my first "serious" math class, being concerned with proving theorems rather than the "school-math" that most people associate with the subject, of applying given techniques to given problem classes. I had wanted to take it concurrently with the prerequisite, "Exploration and Proof" (which I didn't consider sufficiently "serious") upon transferring to SFSU the previous semester, but was not permitted to. I had emailed Prof. Schuster asking to be allowed to enroll, with evidence that I was ready (attaching a PDF of a small result I had proved about analogues of π under the p-norm, and including the contact email of Prof. Robert Hasner of Diablo Valley College, who had been my "Calculus III" professor and had agreed to vouch for my preparedness), but he didn't reply.

Coming back eleven years later, I was eager to make up for that disappointment by picking up where I left off in "Real Analysis II" with the same Prof. Schuster. On the first day of instruction, I wore a collared shirt and tie (and mask, having contracted COVID-19 while traveling the previous week) and came to the classroom early to make a point of marking my territory, using the whiteboard to write out the first part of a proof of the multivariate chain rule that I was working through in Bernd S. W. Schröder's Mathematical Analysis: A Concise Introduction—my favorite analysis textbook, which I had discovered in the SFSU library in 2012 and of which I had subsequently bought my own copy. (I would soon check up on the withdrawal stamp sheet in the front of the library's copy. No one had checked it out in the intervening twelve years.)

The University Bulletin officially titled the course "Real Analysis II: Several Variables", so you'd expect that getting a leg up on the multidimensional chain rule would be studying ahead for the course, but it turned out that the Bulletin was lying relative to the syllabus that Prof. Schuster had emailed out the week before: we would be covering series, series of functions, and metric space topology. Fine. (I was already pretty familiar with metric space topology, but even my "non-epsilon" calculus-level knowledge of series was weak; to me, the topic stunk of school.)

"Real II" was an intimate class that semester, befitting the SFSU's status as a garbage-tier institution: there were only seven or eight students enrolled. It was one of many classes in the department that were cross-listed as both a graduate ("MATH 770") and upper-division undergraduate course ("MATH 470"). I was the only student enrolled in 470. The university website hosted an old syllabus from 2008 which said that the graduate students would additionally write a paper on an approved topic, but that wasn't a thing the way Prof. Schuster was teaching the course. Partway through the semester, I was added to Canvas (the online course management system) for the 770 class, to save Prof. Schuster and the TA the hassle of maintaining both.

The textbook was An Introduction to Analysis (4th edition) by William R. Wade, the same book that had been used for "Real I" in Spring 2013. It felt in bad taste for reasons that are hard to precisely articulate. I want to say the tone is patronizing, but don't feel like I could defend that judgement in debate against someone who doesn't share it. What I love about Schröder is how it tries to simultaneously be friendly to the novice (the early chapters sprinkling analysis tips and tricks as numbered "Standard Proof Techniques" among the numbered theorems and definitions) while also showcasing the fearsome technicality of the topic in excruciatingly detailed estimates (proofs involving chains of inequalities, typically ending on "< ε"). In contrast, Wade often feels like it's hiding something from children who are now in fact teenagers.

The assignments were a lot of work, but that was good. It was what I was there for—to prove that I could do the work. I could do most of the proofs with some effort. At SFSU in 2012–2013, I remembered submitting paper homework, but now, everything was uploaded to Canvas. I did all my writeups in LyX, a GUI editor for LaTeX.

One thing that had changed very recently, not about SFSU, but about the world, was the availability of large language models, which had in the GPT-4 era become good enough to be useful tutors on standard undergrad material. They definitely weren't totally reliable, but human tutors aren't always reliable, either. I adopted the policy that I was allowed to consult LLMs for a hint when I got stuck on homework assignments, citing the fact that I had gotten help in my writeup. Prof. Schuster didn't object when I inquired about the propriety of this at office hours. (I also cited office-hours hints in my writeups.)

Prof. Schuster held his office hours in the math department conference room rather than his office, which created a nice environment for multiple people to work or socialize, in addition to asking Prof. Schuster questions. I came almost every time, whether or not I had an analysis question for Prof. Schuster. Often there were other students from "Real II" or Prof. Schuster's "Real I" class there, or a lecturer who also enjoyed the environment, but sometimes it was just me.

Office hours chatter didn't confine itself to math. Prof. Schuster sometimes wore a Free Palestine bracelet. I asked him what I should read to understand the pro-Palestinian position, which had been neglected in my Jewish upbringing. He recommended Rashid Khalidi's The Hundred Years' War on Palestine, which I read and found informative (in contrast to the student pro-Palestine demonstrators on campus, whom I found anti-persuasive).

I got along fine with the other students but do not seem to have formed any lasting friendships. The culture of school didn't feel quite as bad as I remembered. It's unclear to me how much of this is due to my memory having stored a hostile caricature, and how much is due to my being less sensitive to it this time. When I was at SFSU a dozen years ago, I remember seething with hatred at how everyone talked about their studies in terms of classes and teachers and grades, rather than about the subject matter in itself. There was still a lot of that—bad enough that I complained about it at every opportunity—but I wasn't seething with hatred anymore, as if I had come to terms with it as mere dysfunction and not sacrilege. I only cried while complaining about it a couple times.

One of my signature gripes was about the way people in the department habitually referred to courses by number rather than title, which felt like something out of a dystopian YA novel. A course title like "Real Analysis II" at least communicates that the students are working on real analysis, even if the opaque "II" doesn't expose which real-analytic topics are covered. In contrast, a course number like "MATH 770" doesn't mean anything outside of SFSU's bureaucracy. It isn't how people would talk if they believed there was a subject matter worth knowing about except insofar as the customs of bureaucratic servitude demanded it.

There were two examinations: a midterm, and the final. Each involved stating some definitions, identifying some propositions as true or false with a brief justification, and writing two or three proofs. A reference sheet was allowed, which made the definitions portion somewhat farcical as a test of anything more than having bothered to prepare a reference sheet. (I objected to Prof. Schuster calling it a "cheat sheet." Since he was allowing it, it wasn't "cheating"!)

I did okay. I posted a 32.5/40 (81%) on the midterm. I'm embarrassed by my performance on the final. It looked easy, and I left the examination room an hour early after providing an answer to all the questions, only to realize a couple hours later that I had completely botched a compactness proof. Between that gaffe, the midterm, and my homework grades, I was expecting to end up with a B+ in the course. (How mortifying—to have gone back to school almost specifically for this course and then not even get an A.) But when the grades came in, it ended up being an A: Prof. Schuster only knocked off 6 points for the bogus proof, for a final exam grade of 44/50 (88%), and had a policy of discarding the midterm grade when the final exam grade was higher. It still seemed to me that that should have probably worked out to an A− rather than an A, but it wasn't my job to worry about that.

"Probability Models" (Fall 2024)

In addition to the rarified math-math of analysis, the practical math of probability seemed like a good choice for making the most of my elective credits at the university, so I also enrolled in Prof. Anandamayee Mujamdar's "Probability Models" for the Fall 2024 semester. The prerequisites were linear algebra, "Probability and Statistics I", and "Calculus III", but the registration webapp hadn't allowed me to enroll, presumably because it didn't believe I knew linear algebra. (The linear algebra requirement at SFSU was four units. My 2007 linear algebra class from UC Santa Cruz, which was on a quarter system, got translated to 3.3 semester units.) Prof. Mujamdar hadn't replied to my July email requesting a permission code, but after I inquired in person at the end of the first class, she told me to send a followup email and then got me the code.

(I had also considered taking the online-only "Introduction to Linear Models", which had the same prerequisites, but Prof. Mohammad Kafai also hadn't replied to my July email, and I didn't bother following up, which was just as well: the semester ended up feeling busy enough with just the real analysis, probability models, my gen-ed puff course, and maintaining my soul in an environment that assumes people need a bureaucratic control structure in order to keep busy.)

Like "Real II", "Probability Models" was also administratively cross-listed as both a graduate ("MATH 742", "Advanced Probability Models") and upper-division undergraduate course ("MATH 442"), despite no difference whatsoever in the work required of graduate and undergraduate students. After some weeks of reviewing the basics of random variables and conditional expectation, the course covered Markov chains and the Poisson process.

The textbook was Introduction to Probability Models (12th edition) by Sheldon M. Ross, which, like Wade, felt in bad taste for reasons that were hard to put my finger on. Lectures were punctuated with recitation days on which we took a brief quiz and then did exercises from a worksheet for the rest of the class period. There was more content to cover than the class meeting schedule could accommodate, so there were also video lectures on Canvas, which I mostly did not watch. (I attended class because it was a social expectation and because attendance was 10% of the grade, but I preferred to learn from the book. As long as I was completing the assignments, that shouldn't be a problem ... right?)

In contrast to what I considered serious math, the course was very much school-math about applying particular techniques to solve particular problem classes, taken to the parodic extent of quizzes and tests re-using worksheet problems verbatim. (You'd expect a statistics professor to know not to test on the training set!)

It was still a lot of work, which I knew needed to be taken seriously in order to do well in the course. The task of quiz #2 was to derive the moment-generating function of the exponential distribution. I had done that successfully on the recitation worksheet earlier, but apparently that and the homework hadn't been enough practice, because I botched it on quiz day. After the quiz, Prof. Mujamdar wrote the correct derivation on the board. She had also said that we could re-submit a correction to our quiz for half-credit, but I found this policy confusing: it felt morally dubious that it should be possible to just copy down the solution from the board and hand that in, even for partial credit. (I guess the policy made sense from the perspective of schoolstudents needing to be nudged and manipulated with credit in order to do even essential things like trying to learn from one's mistakes.) For my resubmission, I did the correct derivation at home in LyX, got it printed, and brought it to office hours the next class day. I resolved to be better prepared for future quizzes (to at least not botch them, minor errors aside) in order to avoid the indignity of having an incentive to resubmit.

I mostly succeeded at that. I would end up doing a resubmission for quiz #8, which was about how to sample from an exponential distribution (with λ=1) given the ability to sample from the uniform distribution on [0,1], by inverting the exponential's cumulative distribution function. (It had been covered in class, and I had gotten plenty of practice on that week's assignments with importance sampling using exponential proposal distributions, but I did it in Rust using the rand_distr library rather than what was apparently the intended method of implementing exponential sampling from a uniform RNG "from scratch".) I blunted the indignity of my resubmission recapitulating the answer written on the board after the quiz by additionally inverting by myself the c.d.f. of a different distribution, the Pareto.
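(For what it's worth, the intended from-scratch method is only a couple of lines. Here's a minimal Python sketch of it—not the Rust I actually submitted—inverting F(x) = 1 − e^{−x} to get −ln(1 − u), plus the same trick for a scale-1 Pareto with shape α, which is my guess at the parametrization rather than a transcript of my resubmission:)

    import math, random

    def sample_exponential(rng=random.random):
        # Inverse transform: if U ~ Uniform(0, 1), then -ln(1 - U) ~ Exp(1),
        # because P(-ln(1 - U) <= x) = P(U <= 1 - e^{-x}) = 1 - e^{-x}.
        return -math.log(1.0 - rng())

    def sample_pareto(alpha, rng=random.random):
        # Same trick for the scale-1 Pareto: F(x) = 1 - x^{-alpha} for x >= 1,
        # so F^{-1}(u) = (1 - u)^{-1/alpha}.
        return (1.0 - rng()) ** (-1.0 / alpha)

    samples = [sample_exponential() for _ in range(100_000)]
    print(sum(samples) / len(samples))  # should be close to 1, the mean of Exp(1)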

I continued my practice of using LLMs for hints when I got stuck on assignments, and citing the help in my writeup; Prof. Mujamdar seemed OK with it when I mentioned it at office hours. (I went to office hours occasionally, when I had a question for Prof. Mujamdar, who was kind and friendly to me, but it wasn't a social occasion like Prof. Schuster's conference-room office hours.)

I was apparently more conscientious than most students. Outside of class, the grad student who graded our assignments recommended that I make use of the text's solutions manual (which was circulating in various places online) to check my work. Apparently, he had reason to suspect that some other students in the class were just copying from the solution manual, but was not given the authority to prosecute the matter when he raised the issue to the professor. He said that he felt bad marking me down for my mistakes when it was clear that I was trying to do the work.

The student quality seemed noticeably worse than "Real II", at least along the dimensions that I was sensitive to. There was a memorable moment when Prof. Mujamdar asked which students were in undergrad. I raised my hand. "Really?" she said.

It was only late in the semester that I was alerted by non-course reading (specifically a footnote in the book by Daphne Koller and the other guy) that the stationary distribution of a Markov chain is an eigenvector of the transition matrix with eigenvalue 1. Taking this linear-algebraic view has interesting applications: for example, the mixing time of the chain is determined by the second-largest eigenvalue, because any starting distribution can be expressed in terms of an eigenbasis, and the coefficients of all but the stationary vector decay as you keep iterating (because all the other eigenvalues are less than 1).
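(To make the observation concrete, here's a minimal numpy sketch—my illustration, nothing from the course—checking that the stationary distribution of a toy two-state chain is the eigenvalue-1 left eigenvector of the transition matrix, and that the deviation from stationarity shrinks at the rate of the second eigenvalue:)

    import numpy as np

    # Row-stochastic transition matrix for a toy two-state chain.
    P = np.array([[0.9, 0.1],
                  [0.4, 0.6]])

    # The stationary distribution is the left eigenvector of P with eigenvalue 1
    # (equivalently, a right eigenvector of P transpose), normalized to sum to 1.
    eigenvalues, eigenvectors = np.linalg.eig(P.T)
    i = np.argmin(np.abs(eigenvalues - 1.0))
    pi = np.real(eigenvectors[:, i])
    pi = pi / pi.sum()
    print(pi)  # approximately [0.8, 0.2]

    # The deviation from stationarity decays geometrically at the rate of the
    # second-largest eigenvalue (here 0.5): each iteration roughly halves it.
    mu = np.array([1.0, 0.0])  # start in state 0
    for _ in range(5):
        mu = mu @ P
        print(mu - pi)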

The feeling of enlightenment was outweighed by embarrassment that I hadn't independently noticed that the stationary distribution was an eigenvector (we had been subtracting 1 off the main diagonal and solving the system for weeks; the operation should have felt familiar), and, more than either of those, annoyance that neither the textbook nor the professor had deigned to mention this relevant fact in a course that had linear algebra as a prerequisite. When I tried to point it out during the final review session, it didn't seem like Prof. Mujamdar had understood what I said—not for the lack of linear algebra knowledge, I'm sure—let alone any of the other students.

I can only speculate that the occurrence of a student pointing out something about mathematical reality that wasn't on the test or syllabus was so unexpected, so beyond what everyone had been conditioned to think school was about, that no one had any context to make sense of it. A graduate statistics class at San Francisco State University just wasn't that kind of space. I did get an A.

The 85th William Lowell Putnam Mathematical Competition

I also organized a team for the Putnam Competition, SFSU's first in institutional memory. (I'm really proud of my recruitment advertisements to the math majors' mailing list.) The story of the Putnam effort has been recounted in a separate post, "The End of the Movie: SF State's 2024 Putnam Competition Team, A Retrospective".

As the email headers at the top of the post indicate, the post was originally composed for the department mailing lists, but it never actually got published there: department chair Eric Hsu wrote to me that it was "much too long to send directly to the whole department" but asked for my "permission to eventually share it with the department, either as a link or possibly as a department web page." (He cc'd a department office admin whom I had spoken to about posting the Putnam training session announcements on the mailing list; reading between the lines, I'm imagining that she was discomfited by the tone of the post and had appealed to Chair Hsu's authority about whether to let it through.)

I assumed that the ask to share with the department "eventually" was polite bullshit on Hsu's part to let me down gently. (Probably no one gets to be department chair without being molded into a master of polite bullshit.) Privately, I didn't think the rationale made sense—it's just as easy to delete a long unwanted mailing list message as a short one; the email server wasn't going to run out of paper—but it seemed petty to argue. I replied that I hadn't known the rules for the mailing list and that he should feel free to share or not as he saw fit.

"Measure and Integration" (Spring 2025)

I had a busy semester planned for Spring 2025, with two graduate-level (true graduate-level, not cross-listed) analysis courses plus three gen-ed courses that I needed to graduate. (Following Prof. Schuster, I'm humorously counting "Modern Algebra I" as a gen-ed course.) I only needed one upper-division undergrad math course other than "Modern Algebra I" to graduate, but while I was at the University for one more semester, I was intent on getting my money's worth. I aspired to get a head start (ideally on all three math courses) over winter break and checked out a complex analysis book with exercise solutions from the library, but only ended up getting any traction on measure theory, doing some exercises from chapter 14 of Schröder, "Integration on Measure Spaces".

Prof. Schuster was teaching "Measure and Integration" ("MATH 710"). It was less intimate than "Real II" the previous semester, with a number of students in the teens. The class met at 9:30 a.m. on Tuesdays and Thursdays, which I found inconveniently early in the morning given my hour-and-twenty-minute BART-and-bus commute. I was late the first day. After running into the room, I put the printout of my exercises from Schröder on the instructor's desk and said, "Homework." Prof. Schuster looked surprised for a moment, then accepted it without a word.

The previous semester, Prof. Schuster said he was undecided between using Real Analysis by Royden and Measure, Integration, and Real Analysis by Sheldon Axler (of Linear Algebra Done Right fame, and also our former department chair at SFSU) as the textbook. He ended up going with Axler, which for once was in good taste. (Axler would guest-lecture one day when Prof. Schuster was absent. I got him to sign my copy of Linear Algebra Done Right.) We covered Lebesgue measure and the Lebesgue integral, then skipped over the chapter on product measures (which Prof. Schuster said was technical and not that interesting) in favor of starting on Banach spaces. (As with "Several Variables" the previous semester, Prof. Schuster did not feel beholden to making the Bulletin course titles not be lies; he admitted late in the semester that it might as well have been called "Real Analysis III".)

I would frequently be a few minutes late throughout the semester. One day, the BART had trouble while my train was in downtown San Francisco, and it wasn't clear when it would move again. I got off and summoned a Waymo driverless taxi to take me the rest of the way to the University. We were covering the Cantor set that day, and I rushed in with more than half the class period over. "Sorry, someone deleted the middle third of the train," I said.

Measure theory was a test of faith which I'm not sure I passed. Everyone who reads Wikipedia knows about the notorious axiom of choice. This was the part of the school curriculum in which the axiom of choice becomes relevant. It impressed upon me that as much as I like analysis as an intellectual activity, I ... don't necessarily believe in this stuff? We go to all this work to define sigma-algebras in order to rule out pathological sets that can't be explicitly constructed because they only exist by the axiom of choice. You could argue that it's not worse than uncountable sets, and that alternatives to classical mathematics just end up needing to bite different bullets. (In computable analysis, equality turns out to be uncomputable, because there's no limit on how many decimal places you would need to check for a tiny difference between two almost-equal numbers. For related reasons, all computable functions are continuous.) But I'm not necessarily happy about the situation.

I did okay. I was late on some of the assignments (and didn't entirely finish assignments #9 and #10), but the TA was late in grading them, too. I posted a 31/40 (77.5%) on the midterm. I was expecting to get around 80% on the final based on my previous performance on Prof. Schuster's examinations, but I ended up posting a 48/50 (96%), locking in an A for the course.

"Theory of Functions of a Complex Variable" (Spring 2025)

My other graduate course was "Theory of Functions of a Complex Variable" ("MATH 730"), taught by Prof. Chun-Kit Lai. I loved the pretentious title and pronounced all seven words at every opportunity. (Everyone else, including Prof. Lai's syllabus, said "complex analysis" when they didn't say "730".)

The content lived up to the pretension of the title. This was unambiguously the hardest school class I had ever taken. Not in the sense that Prof. Lai was particularly strict about grades or anything; on the contrary, he seemed charmingly easygoing about the institutional structure of school, while of course taking it for granted as an unquestioned background feature of existence. But he was pitching the material to a higher level than Prof. Schuster or Axler.

The textbook was Complex Analysis by Elias M. Stein and Rami Shakarchi, volume II in their "Princeton Lectures in Analysis" series. Stein and Shakarchi leave a lot to the reader (prototypically a Princeton student). It wasn't to my taste—but this time, I knew the problem was on my end. My distaste for Wade and Ross had been a reflection of the ways in which I was spiritually superior to the generic SFSU student; my distaste for Stein and Shakarchi reflected the grim reality that I was right where I belonged.

I don't think I was alone in finding the work difficult. Prof. Lai gave the entire class an extension to resubmit assignment #2 because the average performance had been so poor.

Prof. Lai didn't object to my LLM hint usage policy when I inquired about it at office hours. I still felt bad about how much external help I needed just to get through the assignments. The fact that I footnoted everything meant that I wasn't being dishonest. (In his feedback on my assignment #7, Prof. Lai wrote to me, "I like your footnote. Very genuine and is a modern way of learning math.") It still felt humiliating to turn in work with so many footnotes: "Thanks to OpenAI o3-mini-high for hints", "Thanks to Claude Sonnet 3.7 for guidance", "Thanks to [classmate's name] for this insight", "Thanks to the 'Harmonic Conjugate' Wikipedia article", "This is pointed out in Tristan Needham's Visual Complex Analysis, p. [...]", &c.

It's been said that the real-world usefulness of LLM agents has been limited by low reliability impeding the horizon length of tasks: if the agent can only successfully complete a single step with probability 0.9, then its probability of succeeding on a task that requires ten correct steps in sequence is only 0.9^10 ≈ 0.35.

That was about how I felt with math. Prof. Schuster was assigning short horizon-length problems from Axler, which I could mostly do independently; Prof. Lai was assigning longer horizon-length problems from Stein and Shakarchi, which I mostly couldn't. All the individual steps made sense once explained, but I could only generate so many steps before getting stuck.

If I were just trying to learn, the external help wouldn't have seemed like a moral issue. I look things up all the time when I'm working on something I care about, but the institutional context of submitting an assignment for a grade seemed to introduce the kind of moral ambiguity that had made school so unbearable to me, in a way that didn't feel fully mitigated by the transparent footnotes.

I told myself not to worry about it. The purpose of the "assignment" was to help us to learn about the theory of functions of a complex variable, and I was doing that. Prof. Lai had said in class and in office hours that he trusted us, that he trusted me. If I had wanted to avoid this particular source of moral ambiguity at all costs, but still wanted a Bachelor's degree, I could have taken easier classes for which I wouldn't need so much external assistance. (I didn't even need the credits from this class to graduate.)

But that would be insane. The thing I was doing now, of jointly trying to maximize math knowledge while also participating in the standard system to help with that, made sense. Minimizing perceived moral ambiguity (which was all in my head) would have been a really stupid goal. Now, so late in life at age 37, I wanted to give myself fully over to not being stupid, even unto the cost of self-perceived moral ambiguity.

Prof. Lai eschewed in-person exams in favor of take-homes for both the midterm and the final. He said reasonable internet reference usage was allowed, as with the assignments. I didn't ask for further clarification because I had already neurotically asked for clarification about the policy for the assignments once more than was necessary, but resolved to myself that for the take-homes, I would allow myself static websites but obviously no LLMs. I wasn't a grade-grubber; I would give myself the authentic 2010s take-home exam experience and accept the outcome.

(I suspect Prof. Lai would have allowed LLMs on the midterm if I had asked—I didn't get the sense that he yet understood the edge that the latest models offered over mere books and websites. On 29 April, a friend told me that instructors will increasingly just assume students are cheating with LLMs anyway; anything that showed I put thought in would be refreshing. I said that for this particular class and professor, I thought I was a semester or two early for that. In fact, I was two weeks early: on 13 May, Prof. Lai remarked before class and in the conference room during Prof. Schuster's office hours that he had given a bunch of analysis problems to Gemini the previous night, and it got them all right.)

I got a 73/100 on my midterm. Even with the (static) internet, sometimes I would hit a spot where I got stuck and couldn't get unstuck in a reasonable amount of time.

There were only 9 homework assignments during the semester (contrasted to 12 in "Measure and Integration") to give us time to work on an expository paper and presentation on one of four topics: the Gamma function, the Riemann zeta function, the prime number theorem, or elliptic functions. I wrote four pages on "Pinpointing the Generalized Factorial", explaining the motivation of the Gamma function, except that I'm not fond of how the definition is shifted by one from what you'd expect, so I wrote about the unshifted Pi function instead.
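(For concreteness—these are the standard definitions, not anything original to my paper—the Gamma function is shifted by one relative to the factorial, whereas the Pi function isn't:)

    \Gamma(z) = \int_0^\infty t^{z-1} e^{-t}\,dt, \qquad \Gamma(n) = (n-1)!
    \Pi(z) = \Gamma(z+1) = \int_0^\infty t^{z} e^{-t}\,dt, \qquad \Pi(n) = n!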

I wish I had allocated more time to it. This was my one opportunity in my institutionalized math career to "write a paper" and not merely "complete an assignment"; it would have been vindicating to go over and above knocking this one out of the park. (Expository work had been the lifeblood of my non-institutionalized math life.) There was so much more I could have said about the generalized factorial, and applications (like the fractional calculus), but it was a busy semester and I didn't get to it. It's hardly an excuse that Prof. Lai wrote an approving comment and gave me full credit for those four pages.

I was resolved to do better on the take-home final than the take-home midterm, but it was a struggle. I eventually got everything, but what I submitted ended up having five footnotes to various math.stackexchange.com answers. (I was very transparent about my reasoning process; no one could accuse me of dishonesty.) For one problem, I ended up using formulas for the modulus of the derivative of a Blaschke factor at 0 and the preimage of zero which I found in David C. Ullrich's Complex Made Simple from the University library. It wasn't until after I submitted my work that I realized that the explicit formulas had been unnecessary; the fact that they were inverses followed from the inverse function theorem.

Prof. Lai gave me 95/100 on my final, and an A in the course. I think he was being lenient with the points. Looking over the work I had submitted throughout the semester, I don't think it would have been an A at Berkeley (or Princeton).

I guess that's okay because grades aren't real, but the work was real. If Prof. Lai had faced a dilemma between watering down either the grading scale or the course content in order to accommodate SFSU students being retarded, I'm glad he chose to preserve the integrity of the content.

"Modern Algebra I" (Spring 2025)

One of the quirks of being an autodidact is that it's easy to end up with an "unbalanced" skill profile relative to what school authorities expect. As a student of mathematics, I consider myself more of an analyst than an algebraist and had not previously prioritized learning abstract algebra nor (what the school authorities cared about) "taking" an algebra "class", neither the previous semester nor in Fall 2012/Spring 2013. (Over the years, I had taken a few desultory swings at Dummit & Foote, but had never gotten very far.) I thus found myself in Prof. Dusty Ross's "Modern Algebra I" ("MATH 335"), the last "core" course I needed to graduate.

"Modern Algebra I" met on Monday, Wednesday, and Friday. All of my other classes met Tuesdays and Thursdays. I had wondered whether I could save myself a lot of commuting by ditching algebra most of the time, but started off the semester dutifully attending—and, as long as I was on campus that day anyway, also sitting in on Prof. Ross's "Topology" ("MATH 450") even though I couldn't commit to a fourth math course for credit.

Prof. Ross is an outstanding schoolteacher, the best I encountered at SFSU. I choose my words here very carefully. I don't mean he was my favorite professor. I mean that he was good at his job. His lectures were clear and well-prepared, and punctuated with group work on well-designed worksheets (pedagogically superior to the whole class just being lecture). The assignments and tests were fair, and so on.

On the first day, he brought a cardboard square with color-labeled corners to illustrate the dihedral group. When he asked us how many ways there were to position the square, I said: eight, because the dihedral group for the n-gon has 2n elements. On Monday of the second week, Prof. Ross stopped me after class to express disapproval with how I had brought out my copy of Dummit & Foote and referred to Lagrange's theorem during the group worksheet discussion about subgroups of cyclic groups; we hadn't covered that yet. He also criticized my response about the dihedral group from the previous week; those were just words, he said. I understood the criticism that there's a danger in citing results you or your audience might not understand, but resented the implication that knowledge that hadn't been covered in class was therefore inadmissible.

I asked whether he cared whether I attended class, and he said that the answer was already in the syllabus. (Attendance was worth 5% of the grade.) After that, I mostly stayed home on Mondays, Wednesdays, and Fridays unless there was a quiz (and didn't show up to topology again), which seemed like a mutually agreeable outcome to all parties.

Dusty Ross is a better schoolteacher than Alex Schuster, but in my book, Schuster is a better person. Ross believes in San Francisco State University; Schuster just works there.

The course covered the basics of group theory, with a little bit about rings at the end of the semester. The textbook was Joseph A. Gallian's Contemporary Abstract Algebra, which I found to be in insultingly poor taste. The contrast between "Modern Algebra I" ("MATH 335") and "Theory of Functions of a Complex Variable" ("MATH 730") that semester did persuade me that the course numbers did have semantic content in their first digit (3xx = insulting, 4xx or cross-listed 4xx/7xx = requires effort, 7xx = potentially punishing).

I mostly treated the algebra coursework as an afterthought to the analysis courses I was devoting most of my focus to. I tried to maintain a lead on the weekly algebra assignments (five problems hand-picked by Prof. Ross, not from Gallian), submitting them an average of 5.9 days early—in the spirit of getting it out of the way. On a few assignments, I wrote some Python to compute orders of elements or cosets of permutation groups in preference to doing it by hand. One week I started working on the prerequisite chapter on polynomial rings from the algebraic geometry book Prof. Ross had just written with his partner Prof. Emily Clader, but that was just to show off to Prof. Ross at office hours that I had at least looked at his book; I didn't stick with it.
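(Those scripts were throwaways, but they were in the same spirit as this fresh sketch—not the code I actually turned in—which uses the fact that a permutation's order is the least common multiple of its cycle lengths, and computes left cosets by brute-force multiplication:)

    from math import lcm  # Python 3.9+
    from itertools import permutations

    def compose(p, q):
        # (p o q)(i) = p(q(i)), with permutations as tuples of 0-indexed images.
        return tuple(p[q[i]] for i in range(len(p)))

    def order(p):
        # The order of a permutation is the lcm of its cycle lengths.
        seen, lengths = set(), []
        for start in range(len(p)):
            if start in seen:
                continue
            length, i = 0, start
            while i not in seen:
                seen.add(i)
                i = p[i]
                length += 1
            lengths.append(length)
        return lcm(*lengths)

    # Left cosets of the subgroup H generated by a 3-cycle in S_3.
    S3 = [tuple(p) for p in permutations(range(3))]
    r = (1, 2, 0)  # the 3-cycle 0 -> 1 -> 2 -> 0
    H = [(0, 1, 2), r, compose(r, r)]
    cosets = {frozenset(compose(g, h) for h in H) for g in S3}
    print(order((1, 2, 0, 4, 3)), len(cosets))  # 6 (= lcm(3, 2)) and 2 cosets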

The Tutoring and Academic Support Center (TASC) offered tutoring for "Modern Algebra I", so I signed up for weekly tutoring sessions with the TA for the class, not because I needed help to do well in the class, but it was nice to work with someone. Sometimes I did the homework, sometimes we talked about some other algebra topic (from Dummit & Foote, or Ross & Clader that one week), one week I tried to explain my struggles with measure theory. TASC gave out loyalty program–style punch cards that bribed students with a choice between two prizes every three tutoring sessions, which is as patronizing as it sounds, but wondering what the next prize options would be was a source of anticipation and mystery; I got a pen and a button and a tote bag over the course of the semester.

I posted a somewhat disappointing 79/90 (87.8%) on the final, mostly due to stupid mistakes or laziness on my part; I hadn't prepped that much. Wracking my brain during a "Give an example of each the [sic] following" question on the exam, I was proud to have come up with the quaternions and "even-integer quaternions" as examples of noncommutative rings with and without unity, respectively.

He didn't give me credit for those. We hadn't covered the quaternions in class.

Not Sweating the Fake Stuff (Non-Math)

In addition to the gen-ed requirements that could be satisfied with transfer credits, there were also upper-division gen-ed requirements that had to be taken at SFSU: one class each from "UD-B: Physical and/or Life Sciences" (which I had satisfied with a ridiculous "Contemporary Sexuality" class in Summer 2012), "UD-C: Arts and/or Humanities", and "UD-D: Social Sciences". There was also an "Area E: Lifelong Learning and Self-Development" requirement, and four "SF State Studies" requirements (which overlapped with the UD- classes).

"Queer Literatures and Media" (Fall 2024)

I try to keep it separate from my wholesome math and philosophy blogging, but at this point it's not a secret that I have a sideline in gender-politics blogging. As soon as I saw the title in the schedule of classes, it was clear that if I had to sit through another gen-ed class, "Queer Literatures and Media" was the obvious choice. I thought I might be able to reuse some of my coursework for the blog, or if nothing else, get an opportunity to troll the professor.

The schedule of classes had said the course was to be taught by Prof. Deborah Cohler, so in addition to the listed required texts, I bought the Kindle version of her Citizen, Invert, Queer: Lesbianism and War in Early Twentieth-Century Britain, thinking that "I read your book, and ..." would make an ideal office-hours icebreaker. There was a last-minute change: the course would actually be taught by Prof. Sasha Goldberg (who would not be using Prof. Cohler's book list; I requested Kindle Store refunds on most of them).

I didn't take the class very seriously. I was taking "Real Analysis II" and "Probability Models" seriously that semester, because for those classes, I had something to prove—that I could do well in upper-division math classes if I wanted to. For this class, the claim that "I could if I wanted to" didn't really seem in doubt.

I didn't not want to. But even easy tasks take time that could be spent doing other things. I didn't always get around to doing all of the assigned reading or video-watching. I didn't read the assigned segment of Giovanni's Room. (And honestly disclosed that fact during class discussion.) I skimmed a lot of the narratives in The Stonewall Reader. My analysis of Carol (assigned as 250 words, but I wrote 350) used evidence from a scene in the first quarter of the film, because that was all I watched. I read the Wikipedia synopsis of They/Them instead of watching it. I skimmed part of Fun Home, which was literally a comic book that you'd expect me to enjoy. When Prof. Goldberg assigned an out-of-print novel (and before it was straightened out how to get it free online), I bought the last copy from AbeBooks with expedited shipping ... and then didn't read most of it. (I gave the copy to Prof. Goldberg at the end of the semester.)

My negligence was the source of some angst. If I was going back to school to "do it right this time", why couldn't I even be bothered to watch a movie as commanded? It's not like it's difficult!

But the reason I had come back was that I could recognize the moral legitimacy of a command to prove a theorem about uniform convergence. For this class, while I could have worked harder if I had wanted to, it was hard to want to when much of the content was so impossible to take seriously.

Asked to explain why the author of an article said that Halloween was "one of the High Holy Days for the gay community", I objected to the characterization as implicitly anti-Semitic and homophobic. The High Holy Days are not a "fun" masquerade holiday the way modern Halloween is. The יָמִים נוֹרָאִים—yamim noraim, "days of awe"—are a time of repentance and seeking closeness to God, in which it is said that הַשֵּׁם—ha'Shem, literally "the name", an epithet for God—will inscribe the names of the righteous in the Book of Life. Calling Halloween a gay High Holy Day implicitly disrespects either the Jews (by denying the seriousness of the Days of Awe), or the gays (by suggesting that their people are incapable of seriousness), or the reader (by assuming that they're incapable of any less superficial connection between holidays than "they both happen around October"). In contrast, describing Halloween as a gay Purim would have been entirely appropriate. "They tried to genocide us; we're still here; let's have a masquerade party with alcohol" is entirely in the spirit of both Purim and Halloween.

I was proud of that answer (and Prof. Goldberg bought it), but it was the pride of coming up with something witty in response to a garbage prompt that had no other function than to prove that the student can read and write. I didn't really think the question was anti-Semitic and homophobic; I was doing a bit.

Another assignment asked us to write paragraphs connecting each of our more theoretical course readings (such as Susan Sontag's "Notes on Camp", or an excerpt from José Esteban Muñoz's Disidentifications: Queers of Color and the Performance of Politics) to Gordo, a collection of short stories about a gay Latino boy growing up in 1970s California. (I think Prof. Goldberg was concerned that students hadn't gotten the "big ideas" of the course, such as they were, and wanted to give an assignment that would force us to re-read them.)

I did it, and did it well. ("[F]or example, Muñoz discusses the possibility of a queer female revolutionary who disidentifies with Frantz Fanon's homophobia while making use of his work. When Nelson Pardo [a character in Gordo] finds some pleasure in American daytime television despite limited English fluency ("not enough to understand everything he is seeing", p. 175), he might be practicing his own form of disidentification.") But it took time out of my day, and it didn't feel like time well spent.

There was a discussion forum on Canvas. School class forums are always depressing. No one ever posts in them unless the teacher makes an assignment of it—except me. I threw together a quick 1800-word post, "in search of gender studies (as contrasted to gender activism)". It was clever, I thought, albeit rambling and self-indulgent, as one does when writing in haste. It felt like an obligation, to show the other schoolstudents what a forum could be and should be. No one replied.

I inquired about Prof. Goldberg's office hours, which turned out to be directly before and after class, which conflicted with my other classes. (I gathered that Prof. Goldberg was commuting to SF State specifically to teach this class in an adjunct capacity; she more commonly taught at City College of San Francisco.) I ditched "Probability Models" lecture one day, just to talk with her about my whole deal. (She didn't seem to approve of me ditching another class when I mentioned that detail.)

It went surprisingly well. Prof. Goldberg is a butch lesbian who, crucially, was old enough to remember the before-time prior to the hegemony of gender identity ideology, and seemed sympathetic to gentle skepticism of some of the newer ideas. She could grant that trans women's womanhood was different from that of cis women, and criticized the way activists tend to glamorize suicide, in contrast to promoting narratives of queer resilience.

When I mentioned my specialization, she remarked that she had never had a math major among her students. Privately, I doubted whether that was really true. (I couldn't have been the only one who needed the gen-ed credits.) But I found it striking for the lack of intellectual ambition it implied within the discipline. I unironically think you do need some math in order to do gender studies correctly—not a lot, just enough linear-algebraic and statistical intuition to ground the idea of categories as clusters in high-dimensional space. I can't imagine resigning myself to such smallness, consigning such a vast and foundational area of knowledge to be someone else's problem—or when I do (e.g., I can't say I know any chemistry), I feel sad about it.

I was somewhat surprised to see Virginia Prince featured in The Stonewall Reader, which I thought was anachronistic: Prince is famous as the founder of Tri-Ess, the Society for the Second Self, an organization for heterosexual male crossdressers which specifically excluded homosexuals. I chose Prince as the subject for my final project/presentation.

Giving feedback on my project proposal, Prof. Goldberg wrote that I "likely got a master's thesis in here" (or, one might think, a blog?), and that "because autogynephilia wasn't coined until 1989, retroactively applying it to a subject who literally could not have identified in that way is inaccurate." (I wasn't writing about how Prince identified.)

During the final presentations, I noticed that a lot of students were slavishly mentioning the assignment requirements in the presentation itself: the rubric had said to cite two readings, two media selections, &c. from the course, and people were explicitly saying, "For my two course readings, I choose ..." When I pointed out to Prof. Goldberg that this isn't how anyone does scholarship when they have something to say (you cite sources in order to support your thesis; you don't say "the two works I'm citing are ..."), she said that we could talk about methodology later, but that the assignment was what it was.

For my project, I ignored the presentation instructions entirely and just spent the two days after the Putnam exam banging out a paper titled "Virginia Prince and the Hazards of Noticing" (four pages with copious footnotes, mostly self-citing my gender-politics blog, in LyX with a couple of mathematical expressions in the appendix—a tradition from my community college days). For my presentation, I just had my paper on the screen in lieu of slides and talked until Prof. Goldberg said I was out of time (halfway through the second page).

I didn't think it was high-quality enough to republish on the blog.

There was one day near the end of the semester when I remember being overcome with an intense feeling of sadness and shame and anger at the whole situation—at the contradiction between what I "should" have done to do well in the class, and what I did do. I felt both as if the contradiction was a moral indictment of me, and that the feeling that it was a moral indictment was a meta-moral indictment of moral indictment.

The feeling passed.

Between the assignments I had skipped and my blatant disregard of the final presentation instructions, I ended up getting a C− in the class, which is perhaps the funniest possible outcome.

"Philosophy of Animals" (Spring 2025)

I was pleased that the charmingly-titled "Philosophy of Animals" fit right into my Tuesday–Thursday schedule after measure theory and the theory of functions of a complex variable. It would satisfy the "UD-B: Physical/Life Science" and "SF State Studies: Environmental Sustainability" gen-ed requirements.

Before the semester, Prof. Kimbrough Moore sent out an introductory email asking us to consider as a discussion question for our first session whether it is in some sense contradictory for a vegetarian to eat oysters. I wrote a 630-word email in response (Subject: "ostroveganism vs. Schelling points (was: 'Phil 392 - Welcome')") arguing that there are game-theoretic reasons for animal welfare advocates to commit to vegetarianism or veganism despite a prima facie case that oysters don't suffer—with a postscript asking if referring to courses by number was common in the philosophy department.

The course, and Prof. Moore himself, were pretty relaxed. There were readings on animal consciousness and rights from the big names (Singer on "All Animals are Equal", Nagel on "What Is It Like to Be a Bat?") and small ones, and then some readings about AI at the end of the course.

Homework was to post two questions about the readings on Canvas. There were three written exams, which Prof. Moore indicated was a new anti-ChatGPT measure this semester; he used to assign term papers.

Prof. Moore's office hours were on Zoom. I would often phone in to chat with him about philosophy, or to complain about school. I found this much more stimulating than the lecture/discussion periods, which I started to ditch more often than not on Tuesdays in favor of Prof. Schuster's office hours.

Prof. Moore was reasonably competent at his job; I just had trouble seeing why his job, or for that matter, the SFSU philosophy department, should exist.

In one class session, he mentioned offhand (in a slight digression from the philosophy of animals) that there are different types of infinity. By way of explaining, he pointed out that there's no "next" decimal after 0.2 the way that there's a next integer after 2. I called out that that wasn't the argument. (The rationals are countable.) The same lecture, he explained Occam's razor in a way that I found rather superficial. (I think you need Kolmogorov complexity or the minimum description length principle to do the topic justice.) That night, I sent him an email explaining the countability of the rationals and recommending a pictorial intuition pump for Occam's razor due to David MacKay (Subject: "countability; and, a box behind a tree").
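(The countability point, concretely: the positive rationals can be listed in a single sequence by walking the diagonals of the numerator/denominator grid and skipping unreduced fractions. A throwaway Python illustration—my own, not the explanation I emailed him:)

    from math import gcd

    def rationals():
        # Enumerate every positive rational exactly once by walking the diagonals
        # p + q = n of the (numerator, denominator) grid, skipping unreduced fractions.
        n = 2
        while True:
            for p in range(1, n):
                q = n - p
                if gcd(p, q) == 1:
                    yield (p, q)
            n += 1

    gen = rationals()
    print([next(gen) for _ in range(10)])
    # [(1, 1), (1, 2), (2, 1), (1, 3), (3, 1), (1, 4), (2, 3), (3, 2), (4, 1), (1, 5)]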

In April, the usual leftist blob on campus had scheduled a "Defend Higher Education" demonstration to protest proposed budget cuts to the California State University system; Prof. Moore offered one point of extra credit in "Philosophy of Animals" for participating.

I was livid. Surely it would be a breach of professional conduct to offer students course credit for attending an anti-abortion or pro-Israel rally. Why should the school presume it had the authority to tell students to speak out in favor of more school? I quickly wrote Prof. Moore an email in complaint, suggesting that the extra credit opportunity be viewpoint-neutral: available to budget cut proponents (or those with more nuanced views) as well as opponents.

I added:

If I don't receive a satisfactory response addressing the inappropriate use of academic credit to incentivize political activities outside the classroom by Thursday 17 April (the day of the protest), I will elevate this concern to Department Chair Landy. This timeline is necessary to prevent the ethical breach of students being bribed into bad faith political advocacy with University course credit.

I can imagine some readers finding this level of aggression completely inappropriate and morally wrong. Obviously, my outrage was performative in some sense, but it was also deeply felt—as if putting on a performance was the most sincere thing I could do under the circumstances.

It's not just that it would be absurd to get worked up over one measly point of extra credit if there weren't a principle at stake. (That, I would happily grant while "in character.") It was that expecting San Francisco State University to have principles about freedom of conscience was only slightly less absurd.

It was fine. Prof. Moore "clarified" that the extra credit was viewpoint-neutral. (I was a little embarrassed not to have witnessed the verbal announcement in class on Tuesday, but I had already made plans to interview the campus machine-shop guy at that time instead of coming to class.) After having made a fuss, I was obligated to follow through, so I made a "BUDGET CUTS ARE PROBABLY OK!" sign (re-using the other side of the foamboard from an anti–designated hitter rule sign I had made for a recent National League baseball game) and held it at the rally on Thursday for ten minutes to earn the extra-credit point.

As for the philosophy of animals itself, I was already sufficiently well-versed in naturalist philosophy of mind that I don't feel like I learned much of anything new. I posted 24/25 (plus a 2 point "curve" because SFSU students are illiterate), 21.5/25 (plus 4), and 22/25 (plus 2) on the three tests, and finished the semester at 101.5% for an A.

"Self, Place, and Knowing: An Introduction to Interdisciplinary Inquiry" (Spring 2025)

I was able to satisfy the "Area E: Lifelong Learning and Self-Development" gen-ed requirement with an asynchronous online-only class, Prof. Mariana Ferreira's "Self, Place, and Knowing: An Introduction to Interdisciplinary Inquiry". Whatever expectations I had of a lower-division social studies gen-ed class at San Francisco State University, this felt like a parody of that.

The first few weekly assignments were quizzes on given readings. This already annoyed me: in a synchronous in-person class, a "quiz" is typically closed-book unless otherwise specified. The purpose is to verify that the student did the reading. It would be a perversion of that purpose for the quiz-taker to read the question, and then Ctrl-F in the PDF to find the answer without reading the full text, but there was no provision for stopping that eventuality here.

The first quiz was incredibly poorly written: some of the answers were obvious just from looking at the multiple choice options, and some of them depended on minutiæ of the text that a typical reader couldn't reasonably be expected to memorize. (The article quoted several academics in passing, and then the quiz had a question of the form "[name] at [university] expresses concerns about:".) I took it closed-book and got 7/10.

I posted a question on the class forum asking for clarification on the closed-book issue, and gently complaining about the terrible questions (Subject: "Are the quizzes supposed to be 'open book'? And, question design"). No one replied; I was hoping Prof. Ferreira kept an eye on the forum. I could have inquired with her more directly, but the syllabus said Zoom office hours were by appointment only at 8 a.m. Tuesdays—just when I was supposed to be out the door to be on time for "Measure and Integration." I didn't bother.

You might question why I even bothered to ask on the forum, given my contempt for grade-grubbing: I could just adhere to a closed-book policy unilaterally and eat the resulting subpar scores. But I had noticed that my cumulative GPA was sitting at 3.47 (down from 3.49 in Spring 2013 because of that C− in "Queer Literatures and Media" last semester), and 3.5 would classify my degree as cum laude. Despite everything, I think I did want an A in "Self, Place, and Knowing", and my probability of getting an A was lower if I handicapped myself with moral constraints perceived by myself and probably not anyone else.

I also did the next two quizzes closed book—except that on the third quiz, I think I succumbed to the temptation to peek at the PDF once, but didn't end up changing my answer as the result of the peek. Was that contrary to the moral law? Was this entire endeavor of finishing the degree now morally tainted by that one moment, however inconsequential it was to any outcome?

I think part of the reason I peeked was because, in that moment, I was feeling doubtful that the logic of "the word 'quiz' implies closed-book unless otherwise specified" held any force outside of my own head. Maybe "quiz" just meant "collection of questions to answer", and it was expected that students would refer back to the reading while completing it. The syllabus had been very clear about LLM use being plagiarism, despite how hard that was to enforce. If Prof. Ferreira had expected the quizzes to be closed book on the honor system, wouldn't she have said that in the syllabus, too? The fact that no one had shown any interest in clarifying what the rules were even after I had asked in the most obvious place, suggested that no one cared. I couldn't be in violation of the moral law if "Self, Place, and Knowing" was not a place where the moral law applied.

It turned out that I needn't have worried about my handicapped quiz scores (cumulative 32/40 = 80%) hurting my chances of making cum laude. Almost all of the remaining assignments were written (often in the form of posts to the class forum, including responses to other students), and Prof. Ferreira awarded full or almost-full credit for submissions that met the prescribed wordcount and made an effort to satisfy the (often unclear or contradictory) requirements.

Despite the syllabus's warnings, a few forum responses stuck out to me as having the characteristic tells of being written by an LLM assistant. I insinuated my suspicions in one of my replies to other classmates:

I have to say, there's something striking about your writing style in this post, and even more so your comments of Ms. Williams's and Ms. Mcsorley's posts. The way you summarize and praise your classmates' ideas has a certain personality to it—somehow I imagine the voice of a humble manservant with a Nigerian accent (betraying no feelings of his own) employed by a technology company, perhaps one headquartered on 18th Street in our very city. You simply must tell us where you learned to write like that!

I felt a little bit nervous about that afterwards: my conscious intent with the "Nigerian manservant" simile was to allude to the story about ChatGPT's affinity for the word delve being traceable to the word's prevalence among the English-speaking Nigerians that OpenAI employed as data labelers, but given the cultural milieu of an SFSU social studies class, I worried that it would be called out as racist. (And whatever my conscious intent, maybe at some level I was asking for it.)

I definitely shouldn't have worried. Other than the fact that Prof. Ferreira gave me credit for the assignment, I have no evidence that any human read what I wrote.

My final paper was an exercise in bullshit and malicious compliance: over the course of an afternoon and evening (and finishing up the next morning), I rambled until I hit the wordcount requirement, titling the result, "How Do Housing Supply and Community Assets Affect Rents and Quality of Life in Census Tract 3240.03? An Critical Microeconomic Synthesis of Self, Place, and Knowing". My contempt for the exercise would have been quite apparent to anyone who read my work, but Prof. Ferreira predictably either didn't read it or didn't care. I got my A, and my Bachelor of Arts in Mathematics (Mathematics for Liberal Arts) cum laude.

Cynicism and Sanity

The satisfaction of finally finishing after all these years was tinged with grief. Despite the manifest justice of my complaints about school, it really hadn't been that terrible—this time. The math was real, and I suppose it makes sense for some sort of institution to vouch for people knowing math, rather than having to take people's word for it.

So why didn't I do this when I was young, the first time, at Santa Cruz? I could have majored in math, even if I'm actually a philosopher. I could have taken the Putnam (which is just offered at UCSC without a student needing to step up to organize). I could have gotten my career started in 2010. It wouldn't have been hard except insofar as it would have involved wholesome hard things, like the theory of functions of a complex variable.

What is a tragedy rather than an excuse is, I hadn't known how, at the time. The official story is that the Authority of school is necessary to prepare students for "the real world". But the thing that made it bearable and even worthwhile this time is that I had enough life experience to treat school as part of the real world that I could interact with on my own terms, and not any kind of Authority. The incomplete contract was an annoyance, not a torturous contradiction in the fabric of reality.

In a word, what saved me was cynicism, except that cynicism is just naturalism about the properties of institutions made out of humans. The behavior of the humans is in part influenced by various streams of written and oral natural language instructions from various sources. It's not surprising that there would sometimes be ambiguity in some of the instructions, or even contradictions between different sources of instructions. As an agent interacting with the system, it was necessarily up to me to decide how to respond to ambiguities or contradictions in accordance with my perception of the moral law. The fact that my behavior in the system was subject to the moral law, didn't make the streams of natural language instructions themselves an Authority under the moral law. I could ask for clarification from a human with authority within the system, but identifying a relevant human and asking had a cost; I didn't need to ask about every little detail that might come up.

Cheating on a math test would be contrary to the moral law: it feels unclean to even speak of it as a hypothetical possibility. In contrast, clicking through an anti-sexual-harassment training module as quickly as possible without actually watching the video was not contrary to the moral law, even though I had received instructions to do the anti-sexual-harassment training (and good faith adherence to the instructions would imply carefully attending to the training course content). I'm allowed to notice which instructions are morally "real" and which ones are "fake", without such guidance being provided by the instructions themselves.

I ended up getting waivers from Chair Hsu for some of my UCSC credits that the computer system hadn't recognized as fulfilling the degree requirements. I told myself that I didn't need to neurotically ask followup questions about whether it was "really" okay that (e.g.) my converted 3.3 units of linear algebra were being accepted for a 4-unit requirement. It was Chair Hsu's job to make his own judgement call as to whether it was okay. I would have been willing to take a test to prove that I know linear algebra—but realistically, why would Hsu bother to have someone administer a test rather than just accept the UCSC credits? It was fine; I was fine.

I remember that back in 2012, when I was applying to both SF State and UC Berkeley as a transfer student from community college, the application forms had said to list grades from all college courses attempted, and I wasn't sure whether that should be construed to include whatever I could remember about the courses from a very brief stint at Heald College in 2008, which I didn't have a transcript for because I had quit before finishing a single semester without receiving any grades. (Presumably, the intent of the instruction on the forms was to prevent people from trying to elide courses they did poorly in at the institution they were transferring from, which would be discovered anyway when it came time to transfer credits. Arguably, the fact that I had briefly tried Heald and didn't like it wasn't relevant to my application on the strength of my complete DVC and UCSC grades.)

As I recall, I ended up listing the incomplete Heald courses on my UC Berkeley application (out of an abundance of moral caution, because Berkeley was actually competitive), but not my SFSU application. (The ultimate outcome of being rejected from Berkeley and accepted to SFSU would have almost certainly been the same regardless.) Was I following morally coherent reasoning? I don't know. Maybe I should have phoned up the respective admissions offices at the time to get clarification from a human. But the possibility that I might have arguably filled out a form incorrectly thirteen years ago isn't something that should turn the entire endeavor into ash. The possibility that I might have been admitted to SFSU on such "false pretenses" is not something that any actual human cares about. (And if someone does, at least I'm telling the world about it in this blog post, to help them take appropriate action.) It's fine; I'm fine.

When Prof. Mujamdar asked us to bring our laptops for the recitation on importance sampling and I didn't feel like lugging my laptop on BART, I just did the work at home—in Rust—and verbally collaborated with a classmate during the recitation session. I didn't ask for permission to not bring the laptop, or to use Rust. It was fine; I was fine.

In November 2024, I had arranged to meet with Prof. Arek Goetz "slightly before midday" regarding the rapidly approaching registration deadline for the Putnam competition. I ducked out of "Real II" early and knocked on his office door at 11:50 a.m., then waited until 12:20 before sending him an email on my phone and proceeding to my 12:30 "Queer Literatures and Media" class. While surreptitiously checking my phone during class, I saw that at 12:38 p.m., he emailed me, "Hello Zack, I am in the office, not sure if you stopped by yet...". I raised my hand, made a contribution to the class discussion when Prof. Goldberg called on me (offering Seinfeld's "not that there's anything wrong with that" episode as an example of homophobia in television), then grabbed my bag and slipped out while she had her back turned to the whiteboard. Syncing up with Prof. Goetz about the Putnam registration didn't take long. When I got back to "Queer Literatures and Media", the class had split up into small discussion groups; I joined someone's group. Prof. Goldberg acknowledged my return with a glance and didn't seem annoyed.

Missing parts of two classes in order to organize another school activity might seem too trivial of an anecdote to be worth spending wordcount on, but it felt like a significant moment insofar as I was applying a wisdom not taught in schools, that you can just do things. Some professors would have considered it an affront to just walk out of a class, but I hadn't asked for permission, and it was fine; I was fine.

In contrast to my negligence in "Queer Literatures and Media", I mostly did the reading for "Philosophy of Animals"—but only mostly. It wasn't important to notice or track if I missed an article or skimmed a few pages here and there (in addition to my thing of cutting class in favor of Prof. Schuster's office hours half the time). I engaged with the material enough to answer the written exam questions, and that was the only thing anyone was measuring. It was fine; I was fine.

I was fine now, but I hadn't been fine at Santa Cruz in 2007. The contrast in mindset is instructive. The precipitating event of my whole anti-school crusade had been the hysterical complete mental breakdown I had after finding myself unable to meet pagecount on a paper for Prof. Bettina Aptheker's famous "Introduction to Feminisms" course.

It seems so insane in retrospect. As I demonstrated with my malicious compliance for "Self, Place, and Knowing", writing a paper that will receive a decent grade in an undergraduate social studies class is just not cognitively difficult (even if Prof. Aptheker and the UCSC of 2007 probably had higher standards than Prof. Ferreira and the SFSU of 2025). I could have done it—if I had been cynical enough to bullshit for the sake of the assignment, rather than holding myself to the standard of writing something I believed and having a complete mental breakdown rather than confront the fact that I apparently didn't believe what I was being taught in "Introduction to Feminisms."

I don't want to condemn my younger self entirely, because the trait that made me so dysfunctional was a form of integrity. I was right to want to write something I believed. It would be wrong to give up my soul to the kind of cynicism that scorns ideals themselves, rather than the kind that scorns people and institutions for not living up to the ideals and lying about it.

Even so, it would have been better for everyone if I had either bullshitted to meet the pagecount, or just turned in a too-short paper without having a total mental breakdown about it. The total mental breakdown didn't help anyone! It was bad for me, and it imposed costs on everyone around me.

I wish I had known that the kind of integrity I craved could be had in other ways. I think I did better for myself this time by mostly complying with the streams of natural language instructions, but not throwing a fit when I didn't comply, and writing this blog post afterwards to clarify what happened. If anyone has any doubts about the meaning of my Bachelor of Arts in Mathematics for Liberal Arts from San Francisco State University, they can read this post and get a pretty good idea of what that entailed. I've put more than enough effort into being transparent that it doesn't make sense for me to be neurotically afraid of accidentally being a fraud.

I think the Bachelor of Arts in Mathematics does mean something, even to me. It can simultaneously be the case that existing schools are awful for the reasons I've laid out, and that there's something real about some parts of them. Part of the tragedy of my story is that having wasted too much of my life in classes that were just obedience tests, I wasn't prepared to appreciate the value of classes that weren't just that. If I had known, I could have deliberately sought them out at Santa Cruz.

I think I've latched on to math as something legible enough and unnatural enough (in contrast to writing) that the school model is tolerable. My primary contributions to the world are not as a mathematician, but if I have to prove my intellectual value to Society in some way that doesn't depend on people intimately knowing my work, this is a way that makes sense, because math is too difficult and too pure to be ruined by the institution. Maybe other subjects could be studied in school in a way that's not fake. I just haven't seen it done.

There's also a sense of grief and impermanence about only having my serious-university-math experience in the GPT-4 era rather than getting to experience it in the before-time while it lasted. If I didn't have LLM tutors, I would have had to be more aggressive about collaborating with peers and asking followup questions in office hours.

My grudging admission that the degree means something to me should not be construed as support for credentialism. Chris Olah never got his Bachelor's degree, and anyone who thinks less of him because of that is telling on themselves.

At the same time, I'm not Chris Olah. For those of us without access to the feedback loops entailed by a research position at Google Brain, there's a benefit to being calibrated about the standard way things are done. (Which, I hasten to note, I could in principle have gotten from MIT OpenCourseWare; my accounting of benefits from happening to finish college is not an admission that the credentialists were right.) Obviously, I knew that math is not a spectator sport: in the years that I was filling my pages of notes from my own textbooks, I was attempting exercises and not just reading (because just reading doesn't work). But was I doing enough exercises, correctly, to the standard that would be demanded in a school class, before moving on to the next shiny topic? It's not worth the effort to do an exhaustive audit of my 2008–2024 private work, but I think in many cases, I was not. Having a better sense of what the mainstream standard is will help me adjust my self-study practices going forward.

When I informally audited "Honors Introduction to Analysis" ("MATH H104") at UC Berkeley in 2017, Prof. Charles C. Pugh agreed to grade my midterm, and I got a 56/100. I don't know what the class's distribution was. Having been given to understand that many STEM courses offered a generous curve, I would later describe it as me "[doing] fine on the midterm". Looking at the exam paper after having been through even SFSU's idea of an analysis course, I think I was expecting too little of myself: by all rights, a serious analysis student in exam shape should be able to prove that the minimum distance between a compact and a connected set is achieved by some pair of points in the sets, or that the product of connected spaces is connected (as opposed to merely writing down relevant observations that fell short of a proof, as I did).
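
For readers curious what that second exam problem involves, here is a standard textbook-style sketch of the argument for a product of two spaces (my own reconstruction, not the exam's solution key):

```latex
\begin{proof}[Sketch: $X \times Y$ is connected when $X$ and $Y$ are]
Fix a basepoint $(a,b) \in X \times Y$. For each point $(x,y)$, the ``cross''
$C_{x,y} = (\{x\} \times Y) \cup (X \times \{b\})$ is connected, being a union
of two connected sets that share the point $(x,b)$. Each $C_{x,y}$ contains
both $(x,y)$ and $(a,b)$, so $X \times Y = \bigcup_{(x,y)} C_{x,y}$ is a union
of connected sets with the common point $(a,b)$ and is therefore connected.
\end{proof}
```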

In a July 2011 Diary entry, yearning to finally be free of school, I fantasized about speedrunning SF State's "advanced studies" track in two semesters: "Six classes a semester sounds like a heavy load, but it won't be if I study some of the material in advance," I wrote. That seems delusional now. Studying the material in advance wouldn't actually make six real math classes a light load, even if it might have worked for "Self, Place, and Knowing"-tier bullshit classes.

It doesn't justify the scourge of credentialism, but the fact that I was ill-calibrated about the reality of the mathematical skill ladder helps explain why the coercion of credentialism is functional, why the power structure survives instead of immediately getting competed out of existence. As terrible as school is along so many dimensions, it's tragically possible for people to do worse for themselves in freedom along some key dimensions.

There's a substantial component of chance in my coming to finish the degree. The idea presented itself to me in early 2024 while I was considering what to work on next after a writing project had reached a natural stopping point. People were discussing education and schooling on Twitter in a way that pained me, and it occurred to me that I would feel better about being able to criticize school from the position of "... and I have a math degree" rather than "... so I didn't finish." It seemed convenient enough, so I did it.

But a key reason it seemed convenient enough is that I still happened to live within commuting distance of SF State. That may be more due to inertia than anything else; when I needed to change apartments in 2023, I had considered moving to Reno, NV, but ended up staying in the East Bay because it was less of a hassle. If I had fled to Reno, then transferring credits and finishing the degree on a whim at the University of Nevada–Reno would have been less convenient. I probably wouldn't have done it—and I think it was ultimately worth doing.

The fact that humans are such weak general intelligences that so much of our lives comes down to happenstance, rather than people charting an optimal path for themselves, helps explain why there are institutions that shunt people down a standard track with a known distribution of results. I still don't like it, and I still think people should try to do better for themselves, but it seems somewhat less perverse now.

Afterwards, Prof. Schuster encouraged me via email to at least consider grad school, saying that I seemed comparable to his peers in the University of Michigan Ph.D. program (which was ranked #10 in the U.S. at that time in the late '90s). I demurred: I said I would consider it if circumstances were otherwise, but in contrast to the last two semesters to finish undergrad, grad school didn't pass a cost-benefit analysis.

(Okay, I did end up crashing Prof. Clader's "Advanced Topics in Mathematics: Algebraic Topology" ("MATH 790") the following semester, and she agreed to grade my examinations, on which I got 47/50, 45/50, 46/50, and 31/50. But I didn't enroll.)

What was significant (but not appropriate to mention in the email) was that now the choice to pursue more schooling was a matter of cost–benefit analysis, and not a prospect of torment or betrayal of the divine.

I wasn't that crazy anymore.
