Published on February 23, 2026 1:28 AM GMT
The North Country of New York runs along the Canadian border at the top of the state. It is where I grew up; my grandfather walked across the border with about a third-grade education not long after the turn of the century.
That region along the Canadian border is getting very close to 1 in 4 residents being over 65. Even though I grew up there and had tenure at SUNY Potsdam, I sadly still left. In 1990 my high school class at Beekmantown (north of Plattsburgh) graduated 120 students; we played Potsdam High School in sectional basketball. Potsdam graduated around 140 students that year. Potsdam now graduates classes of 75-85. SUNY Potsdam now has less than half as many students as at its peak: from around 4,500 to around 1,800 today.
Many faculty at SUNY Potsdam rent a room in town and commute from Albany, where they live the rest of the time. There is a death spiral seemingly forming on campus: dorms are shuttered, buildings are being torn down, and departments like Theater are being eliminated.
Talking about SUNY Potsdam 30 years ago being so full of students that they were cramming three people into rooms, offering incredible food options, and attracting world-class faculty sounds like a fantasy today. It joins my ‘in the 90s Microsoft bought Apple stock to help Apple stay in business’ canon of stories that sound fake. Phish played SUNY Potsdam when I was there, and in the early 1980s it graduated more math majors than any math department on the East Coast.
“Sure, Dad.”
On that note, New York had something called “the Regents Scholarship” until 1990. It gave you $150 a semester, and that paid tuition for the Baby Boomer generation in New York. All you needed was an 85 on your Regents exams, and in New York you had free college as long as you could make it to a campus on a regular basis.
Time for some synthesis. I grew up on a long rural road, a 45+ minute walk from a gas station. How does a young person like that get a car to drive to the local campus? Thus dorms.
Dorms became a money-maker for colleges; as the state cut support, fancy dorms became a thing nationwide. I was told only 20% of SUNY Potsdam’s funding came from the state when I worked there.
Here’s the “old logic” part. SUNY central bans campuses from setting their own basic dorm rate. Look at the dates on this policy document: all from the 1980s. Policy from a time when professors could smoke in class as they taught. A dorm at Potsdam, where students have to drive hours on rural roads to get there, costs $11,830. A dorm at SUNY New Paltz, a quick train ride from NYC, costs $11,760.
To make the most vulgar illustration of how messed up this is: my home in Potsdam was a four-bedroom Arts & Crafts house that cost $47,000 to buy. Here is a link to St. Lawrence County homes for sale for less than $100,000. Four years in a dorm is nearly half a house?
So in a geographic area with a cratering youth population, the SUNY system charges young people more to live there than on a campus where you can pop off to NYC every weekend. If you know Potsdam’s campus: this policy has resulted in Knowles dorm (which can house 500+ students) being closed and empty. There are so many students who would absolutely pick rural campuses if the dorms were included. Nobody needs this like the students who go to high school in rural areas. My high school class was 120, and there was one person with one parent who was not white. Free dorms would give rural kids a chance to interact with a diverse set of other kids.
Why are we not giving those dorms away for free? Because lack of funding for SUNY has caused campuses to rely on dorms for revenue, except that the rural campuses can no longer fill their dorms. When a college tells students they need to live on campus, there are commonly reasons that have nothing to do with education.
Why not let the area die? Because it is a great place to live if you have community. I live in Ithaca, NY now; we have amazing waterfalls here, and if you go to one you will be with lots of other people also visiting. At Potsdam my friends and I swam in waterfalls all day and saw nobody. My cheap house was a block from the river, so I kayaked nearly every day from April until October. When I lived there and a family had a baby, the community organized and dropped off food every day for a month. I know there are people who would prefer that life and never get a chance to even go up there.
New York no longer has a massive youth population trying to get into any SUNY it can. The state could show it knows how to get parts of its government to work together to serve larger needs. Stop charging young people a small fortune to move to the parts of the state where we desperately need young people. Much of the ‘free tuition’ talk we have heard is smoke and mirrors. Think about what a free dorm would do for a campus’s competitiveness: more students, with money to spend at local businesses. Any time you hear about the plight of rural campuses, the central issue comes down to the obvious question of “why would anyone want to live there?”
New York paid for 24-7 care for tens of thousands of incarcerated men in the same region; it is sad to see the state miss the opportunity to save some communities by shelling out for some relatively cheap dorms.
Published on February 22, 2026 11:55 PM GMT
When I was 21, I was sucked into a world of ambition.
Starting my adult life in the Bay Area, I was surrounded by the sense that I was supposed to start a startup, change the world.
I never wanted to start a startup. Reading stories of famous founders, and living and working with startup founders myself, it seemed to me that the amount of belief you’d have to have in yourself and your idea bordered on insanity. Raised to value humility, and unable even to speak up for myself, I couldn’t imagine ever reaching that level of self-belief.
The version of changing the world that appealed to me was effective altruism. I didn’t have a grand vision; I just wanted to help people. The arguments for it seemed so simple, so obviously correct, when laid out in books and blog posts.
Right out of college, I joined an EA organization that worked with governments around the world on projects that cost tens of millions of dollars. One day at work, I was getting some beans and rice from the kitchen when I ran into a billionaire. All the money and power in the world were suddenly right there — and we were using them to save lives.
I coasted along for a time in that dream of changing the world for the better. I was young; many of us were. More than one person I knew had influence over millions of dollars before they were 25. The movement was young, too — too new to power to have yet stumbled into many of the pitfalls that come with it. As the movement grew faster and faster, accruing more followers, more money, and more political influence, it began to seem like we could do absolutely anything. It was a heady feeling.
Then, when I was 26, FTX collapsed. Suddenly, we all had to reckon with the effects of global-scale ambition. When it goes right, you can fund every charity and swing the election for Biden. When it goes wrong, you’ve been complicit in a criminal enterprise that shook the economy and fucked over a million people.
(I read Careless People last week, a memoir about how Facebook’s success put world-changing power in the hands of a few individuals, who were able to wield it almost entirely unchecked. When it goes right, you get democratic uprisings. When it goes wrong, you get genocide in Myanmar, and Trump as president.)
Around the same time as the FTX collapse, an AI arms race was beginning between OpenAI and Anthropic — two labs formed by people who’d been inspired by Bostrom’s Superintelligence, as we all were. By the logic of Superintelligence, it was just about the worst thing that could have happened.
People close to me were thrown into turmoil and depression. We’d done so much in the AI space, supporting and growing AI safety in all sorts of important ways — things that probably wouldn’t have happened without us. Now it seemed that all the investment that had gone into AI safety had had the primary effect of massively accelerating AI capabilities.
You try big things, you get big results.
I quit my EA job the month FTX collapsed, and I haven’t done anything in the space since. It wasn’t a big, dramatic, or even really deliberate decision. I was just burned out and disillusioned.
I still care about the world, and I’ve spent years feeling vaguely guilty that I’m no longer even pretending to work on its biggest problems. I thought I quit EA because I wanted to be happy (as an EA, I was constantly coercing myself to work on things that felt off to me, and was therefore constantly miserable). This felt like selfishness, or laziness. I struggled to justify myself in any other terms.
I don’t feel guilty anymore. I was talking about all this to a friend recently, and he said, “It seems plausible that the best thing to do if you really take AI x-risk seriously is to just stop working on AI at all.”
And that’s what I’ve been trying to say this whole time, whenever anyone asks me about my career. That I don’t want to try to have a big impact, if I can’t be certain that that impact will be positive rather than negative for the world — and I can’t be certain. To be certain of that would be hubris. Both in the memoirs I’ve read and in my real life, I’ve seen people who have genuinely wanted to change things for the better, gotten into the rooms where the sausage gets made, and ended up sickened by the consequences of what they were involved in.
EA funnels millions of dollars around. It funds career development for AI researchers who end up advancing capabilities at frontier labs. It funds insecticide-treated bed nets to protect people from malaria, and then those nets are used for fishing and pollute the waterways. The effect of the latter has been determined to be insignificant. The former, well, I guess it remains to be seen.
Published on February 22, 2026 10:40 PM GMT
Every serious person who thinks about AI safety observes the fundamental asymmetry between the ease of AI content generation and the difficulty of auditing it. In the codegen context, this leads unserious people to talk about the need for AI-driven unit test generation or LLM-as-a-judge, and serious people to talk about concepts like property-based testing or refinement testing, fuzzing, interactive theorem proving, and other formal methods. Along these lines, Davidad et al. introduced the “Guaranteed Safe AI” framework in their 2024 whitepaper; Quinn Dougherty started his newsletter and organized the Proof Scaling workshop at Lighthaven in 2024; and many others likewise gestured at variants of this basic idea without giving it a name. In this post, I’m going to use the umbrella term “Scalable Formal Oversight”, or SFO, coined by Mike Dodds, to describe any and all such ideas that basically sound like “let’s leverage formality to put AI in a box”. My goal with this document is to convince you that SFO matters and to outline clear research projects you can contribute to which will advance the movement.
The core idea behind SFO is that models are getting increasingly capable, alignment may be impossible, and when we get terrifically useful but potentially misaligned models, we’ll need ways to audit their work. And formal verification offers a clear direction for how one might implement audits, at least in the codegen context.
This argument has a number of flaws. First, in the AI boxing literature, if you have a misaligned ASI in a box, and you don’t know what actuator the ASI is using to get out, then you are probably the actuator. SFO does not solve that problem. Sure, the code the ASI generated might be “safe” in the sense that it satisfies the safety invariants you specified, but it might be very dangerous in some subtle way you failed to specify (or even in a way that cannot be formally specified, e.g., it contains psychologically harmful and virulent information[1]). Second, the ASI might hack your verifier (see also). Third, you might want safety guarantees for tasks totally unrelated to codegen, and while it’s possible that FM has something to say about these tasks (see e.g. the French legal code), it’s by no means guaranteed.[2]
On the other hand, SFO has some key advantages that other AI safety approaches fundamentally lack. First, it’s totally independent of the model. SFO is about the box, not the monster you put in it. Second, its reliability is independent of the problem: if the safety criteria can be expressed formally, then the reliability of a formal approach boils down to the reliability of the formal method (as opposed to boiling down to some messy statistical problem[3]).
So, the bullish case for SFO is roughly as follows. AI safety is super important whether or not you believe an ASI will kill us all[4]; and formal methods are super good at flagging unsafe code, with all the obvious caveats about specification difficulty. You may protest that formal methods are hard to use and limited in scope, but in general both problems are solved by LLMs being great at codegen – we no longer need to limit ourselves to decidable problems like LTL model checking, since in the glorious AI future, proofs are cheap. Indeed: there are a bunch of companies building models and harnesses for proving nontrivial theorems in Lean, and these models and harnesses actually work! You can try Harmonic’s Aristotle system right now on whatever problem you want. I’ve had it prove a number of highly challenging theorems spanning information theory, linear algebra, and group theory, in my free time, as have many other enterprising hobbyist mathematicians.[5] (See also, the opinions of a real mathematician on this topic.)
And the best part about SFO is that a lot of the most interesting research problems are not very hard to get started on. There is a lot of low hanging fruit.
The rest of this document is organized as follows. First, I list a bunch of open technical problems that I believe fall squarely within the SFO research agenda and are worth working on. Second, I list some open human/social problems such as organizing workshops and funding fellowships, which also need to be solved in order for SFO to advance. Finally, I conclude by briefly repeating my thesis in case your eyes glazed over at some point in the middle.
In order for SFO to work, we need formal verifiers that are adversarially robust, i.e., that can’t be hacked. Quinn and I wrote about this problem in our prior LW post about FM. Since then, we’ve begun experimenting in the evenings with fuzzing Lean to try and find novel proofs of False. The threat landscape includes proof files that pass some, but not all, levels of validation; consistency issues in axioms; discrepancies between the assumed and actual semantics of imported axioms or definitions; the correctness of the underlying logic “on paper”; the faithfulness of the implementation of the “on paper” logic in the prover; and also the entire traditional cybersecurity landscape of software vulnerabilities (including supply-chain attacks, exploits targeting trusted C code, cryptographic attacks against proof attestations, etc.). In short: we need to verify the verifiers.
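As a sketch of what this kind of experimentation looks like, here is a minimal differential-fuzzing harness in Python. The checker commands are placeholders for whatever fast and strict validation levels you trust (the `lean` / `lean4checker` invocations here are stand-ins, not their real CLIs), and the mutation strategy is deliberately naive:

```python
import os
import random
import string
import subprocess
import tempfile

# Differential fuzzing sketch: mutate a known-good proof file and
# flag any input on which two validation levels disagree.
SEED_PROOF = "theorem t : 1 + 1 = 2 := rfl"

def mutate(src: str, rng: random.Random) -> str:
    # Naive mutation: replace one character with a random printable one.
    i = rng.randrange(len(src))
    return src[:i] + rng.choice(string.printable) + src[i + 1:]

def accepts(cmd: list, path: str) -> bool:
    # A checker "accepts" a file iff it exits with status 0.
    return subprocess.run(cmd + [path], capture_output=True).returncode == 0

def fuzz(n: int, fast=("lean",), strict=("lean4checker",), seed: int = 0):
    rng = random.Random(seed)
    disagreements = []
    for _ in range(n):
        mutant = mutate(SEED_PROOF, rng)
        fd, path = tempfile.mkstemp(suffix=".lean")
        try:
            with os.fdopen(fd, "w") as f:
                f.write(mutant)
            if accepts(list(fast), path) != accepts(list(strict), path):
                disagreements.append(mutant)
        finally:
            os.unlink(path)
    return disagreements
```

Any input the fast level accepts but the strict level rejects (or vice versa) is a candidate soundness or robustness bug worth minimizing by hand.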
Useful AI is, for the most part, agentic. So if we want to put untrusted AI in a box, we need a box that can handle agents, which really means we need a sandbox conjoined with some kind of formal-methods-based audit system. Which means we also need to check the sandbox. This game can be set up in various ways (e.g., breakout or battle-royale), but the common theme is the tasking of humans or agents with finding and exploiting bugs in sandboxes, and then the subsequent patching of said bugs.
Note, this problem relates to, but is distinct from, the much harder problem of containing ASI. I believe that the recommendations from the SL5 task force are nowhere near sufficient to solve this problem.[6] I plan to write a future blog post on this topic.
Language-theoretic security, or LangSec, is the study of how security degrades in a parsing context as the complexity of the language being parsed increases. In essence, a sufficiently complex parser is indistinguishable from a compiler, and thus once your data language has enough features, it becomes an accidental programming language. In LangSec, the features that give rise to computation are known as “gadgets”, and the resulting programmable machine is called a “weird machine”.
This principle clearly applies in the agent context[7], where the “gadgets” are the tools made available to the agent (which might be seemingly benign things like curl, man, and cat), and the “weird machine” is the resulting capability set of the agent (e.g., accidental Turing completeness plus Internet access …). We should expect adversarial agents to live off the land, and thus, we need a good way to know when the permission-set of an agent has cumulatively reached some capability threshold. Turing completeness is often used as a convenient catch-all for too-much-capability, but there are other thresholds which may be more interesting from an AI safety perspective, such as the ability to communicate (overtly or covertly) with other agents. Note that these kinds of questions are in general undecidable, but with turbocharged agentic proof tools at our disposal, I don’t think that should stop us from asking them![8]
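To make this concrete, here is a toy sketch of cumulative capability tracking. The gadget-to-capability map and the thresholds are invented for illustration only; a real analysis would have to reason about composition far more carefully:

```python
# Toy capability accounting for agent tool grants. The mapping from
# "gadgets" (tools) to capabilities, and the thresholds, are invented
# for illustration; they are not a real security model.
CAPS = {
    "curl": {"network_read", "network_write"},
    "cat": {"file_read"},
    "tee": {"file_write"},
    "sh": {"exec"},
}

THRESHOLDS = {
    # write-to-disk plus write-to-network approximates a covert channel
    "covert_channel": {"network_write", "file_write"},
    # read/write/exec approximates a programmable "weird machine"
    "weird_machine": {"file_read", "file_write", "exec"},
}

def tripped(tools):
    # Union the capabilities granted by each tool, then report every
    # threshold that the cumulative set now dominates.
    have = set()
    for t in tools:
        have |= CAPS.get(t, set())
    return sorted(name for name, need in THRESHOLDS.items() if need <= have)
```

The point of the sketch is that no single grant looks dangerous; it is the cumulative permission-set that crosses a threshold.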
RLMs[9] offer a vision for how alternative kinds of context can improve agent performance. What if that context is inherently structured, hierarchical, and captures formal requirements (including, potentially, how those requirements change over time)? Maybe an agent could inherently code in a correct-by-construction manner. Put differently, it’s easy to imagine the paradigm of “vibecode, then audit; rinse and repeat until the code is safe”. But what about an agent that just inherently produces safe code to begin with? Can this be done in a language model context?[10]
To be super clear, it’s easy to imagine a vibe → verify → vibe loop. But this is absolutely not the only option. Two alternatives, which I mention mostly so you believe me that alternatives exist, are constrained decoding or constrained generation, and reinforcement learning with formal verification in-the-loop.[11] I claim that yet more (and better) alternatives exist to be discovered.
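As a toy illustration of the constrained-decoding alternative, here is a sketch in which the sampler may only emit tokens that keep the partial output a viable prefix of a (here, trivially simple) formal language. Real systems operate over the model's tokenizer with a grammar or automaton; everything below is illustrative:

```python
import re

# Toy constrained decoding: the "grammar" is sums of integers,
# e.g. "1+23+4". A token is allowed only if the partial output
# remains a viable prefix of some valid string.

def is_viable_prefix(s: str) -> bool:
    # Digits, with '+' permitted only after a digit run.
    return re.fullmatch(r"(\d+\+)*\d*", s) is not None

def constrained_decode(propose, vocab, max_len=8):
    out = ""
    for _ in range(max_len):
        # propose() plays the role of the model: it ranks tokens.
        for tok in propose(out, vocab):
            if is_viable_prefix(out + tok):
                out += tok
                break
    return out
```

Even a sampler that always prefers an invalid token is forced back inside the language, which is the whole appeal: the guarantee comes from the mask, not the model.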
A sub-problem is the topic of porting from "bad" languages to "good" ones (e.g. C to Lean or Rust) in order to get some kind of safety guarantee(s), while preserving the (safe subset of the) semantics of the original program (see e.g., DARPA TRACTOR).
Another sub-problem is the topic of improving AI reasoning using FM. Here the goal is to uplift the logical capabilities of the model by exposing formal methods as tools, even if the end-goal of the model is inherently informal/unspecifiable.
Benchmarks drive AI. So if we want AI models/systems/agents that can “do formal methods” we need good formal methods benchmarks. As a concrete example, I made RealPBT, a benchmark of 54k property based tests (PBTs) scraped from permissively licensed Github repos, and BuddenBench, a benchmark of open problems in pure mathematics (many of them with problem statements autoformalized in Lean). My collaborators at ForAll and Galois are working on a translation of RealPBT into Lean theorem-proving challenges (where the challenge is to prove the theorem implied by the PBT, over a lossy model of the code under test), which should be released soon.
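For readers unfamiliar with PBTs, here is a hand-rolled example using only the standard library (libraries like Hypothesis automate the generation and shrinking). The implied universally quantified statement is exactly what a Lean translation would turn into a theorem:

```python
import random

# A hand-rolled property-based test: random inputs, universal property.
# Property under test: sorting is idempotent, length-preserving, and ordered.

def check_sort_properties(trials=200, seed=0):
    rng = random.Random(seed)
    for _ in range(trials):
        xs = [rng.randint(-100, 100) for _ in range(rng.randint(0, 20))]
        ys = sorted(xs)
        assert sorted(ys) == ys                          # idempotent
        assert len(ys) == len(xs)                        # length-preserving
        assert all(a <= b for a, b in zip(ys, ys[1:]))   # ordered
    return True
```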
The reinforcement learning version of this task is also important and neglected. FM tasks are phenomenal for RL because they are inherently and cheaply verifiable. Why don’t companies like Hillclimb work on FM RL environments? Which gets me to the next problem …
The math-AI companies are mostly of the opinion that by solving math, they’ll solve everything, so they can eventually just pivot to secure program synthesis and eat the world.[12] This might be true – certainly language models generalize to hitherto unseen tasks – but it also might be the case that secure program synthesis capabilities just don’t scale fast enough when all we’re training for is the math olympiad. Cue the research direction: train models specifically for secure program synthesis. With tools like Tinker and Lab, you can now feasibly train frontier models at home. And it turns out even cheap models can prove theorems! Go forth and prosper.
When you build an agent or a codegen tool, you quickly realize that some kind of specification or planning language is really useful. You may even decide you should build your codegen tool around this primitive (see e.g., Kiro, Claude’s Plan Mode, or Codespeak). You may then realize that with a sub-agent architecture, you suddenly need a language for agents to coordinate and resolve conflicts, and that this problem is deeply connected to the specification problem, since often what an agent needs to communicate is that there’s actually a problem with the original spec. (This is where people start crashing out researching gossip and consensus algorithms...) And at around this point in your journey, you may come to the conclusion that formal specs are, in fact, living things, and not set in stone like the 10 commandments.
All this gets us to the root issue that formal specification is a form of planning and it is, in fact, exceedingly difficult, even for experts. If we want to “put AI in a box” by specifying what it should do and then forcing it to prove its outputs consistent with our specs, we need to make it easier to specify stuff. So far nobody has even remotely solved this and it seems unlikely that SFO is tenable without this missing primitive.
Note, I am aware of one organization, in stealth, which is working on this problem. I’m happy to introduce you if you send me a serious inquiry.
This problem most likely needs to be solved in order for any of the others to be solvable in practice.
Beyond these open technical problems, SFO needs a significant amount of missing human infra. I’ll discuss that next.
I think it is noncontroversial to say that mechinterp has benefitted greatly from MATS, LessWrong, the AI Alignment Forum, etc. I believe SFO needs similar human infrastructure to bring people into the fold and support important work. I am working on this problem, and I think you should too. Here are some of my ideas. (They are pretty obvious but still worth writing down.)
The math community has mathjobs, a website that lists math jobs. I recently bought fmxai.org and am using it to build something similar. Note that FMxAI is strictly speaking a bigger tent than SFO – it includes topics such as pure mathematics research using formal methods and AI – but I see no reason to subdivide further given how niche the domain is to begin with.
We need hackathons with cash prizes. I am speaking with collaborators at Apart Research and Forall Dev to try and make one such hackathon. To be announced! I also think other orgs such as Atlas Computing, Harmonic, Axiom, Math Inc, and Galois are very well-positioned to hold similar events.
We need research fellowships to support talented graduate students interested in SFO. I am not working on this problem – I hope you will!
In particular I think it would be amazing to have something like MATS but explicitly for SFO. If someone else wants to help with the funding side of things I am happy to help with organization, getting academic and industry partners, and other such legwork.
Quinn and I recently made a Secure Program Synthesis Signal group-chat. I think it would be great to have something like the Lean Zulip but explicitly for SFO, and I am hoping the group chat naturally outgrows Signal and becomes exactly that. There are of course also other overlapping communities; for example, I have been running the Boston Computation Club for about six years now, which has a reasonably active Slack community whose interests overlap with SFO, among other things. Community-building is important!
I’m involved with FMxAI, a conference hosted by Atlas Computing at SRI. Once the 2026 event is announced I’ll feature it prominently on fmxai.org. I’ve also seen various related workshops pop up, e.g., Post-AI FM, the NeurIPS trustworthy agents workshop, etc. We need more of these!
You should believe that AI safety is extremely important regardless of your AI timelines. You should accept that codegen is one of the most powerful, and dangerous, use cases of AI. You should nod sagely when I say that formal methods offer some of the best, if not the best, tools for auditing generated code. You should activate the little modus-ponens circuit in your brain and conclude that SFO is super freaking important. If you do all this, then I hope you can take some inspiration from the (fairly informal) research agenda I’ve outlined above, and get involved on the research and/or human infra side to push SFO forward. This is a serious project and it requires all hands on deck!
Thank you to (in no particular order) Quinn Dougherty, Mike Dodds, Ella Hoeppner, Herbie Bradley, Alok Singh, Henry Blanchette, Thomas Murrills, Jake Ginesin, and Simon Henniger for thoughtful feedback and discussion during the drafting of this document. These individuals bear no responsibility for any mistakes or stupid things I say in the document; they only helped make it better than it would have been. Also, thank you to GPT 5.2 for the four images used to illustrate the post. They are imperfect, but fun.
I think this is a serious risk of autonomous AI research. Impactful research can steer society; the research output becomes the actuator. Similarly, I view agent memory, as found in e.g. ChatGPT, as a potential steganographic scratchpad for longitudinal attacks. ↩︎
See Andrew Dickson’s Limitations on Formal Verification for AI Safety for further discussion. ↩︎
I mean for the love of God, you cannot convince me an LLM-as-judge approach will solve anything when the LLMs are backdoored by construction. ↩︎
I mean, however AI-skeptical you are, certainly you accept that vibed code can be unsafe, and people will inevitably vibe code, including in safety-critical settings such as aviation. There are already AI agents trading on prediction markets and the stock market. It should be obvious that this could have dangerous and unexpected second/third/fourth-order consequences. ↩︎
My wife called me the other day and asked what I was up to. I told her about an information theory problem I’d been trying to vibe using Aristotle, and an FMxAI Signal group-chat I was organizing with Quinn. She then ruthlessly mocked me because “all the other husbands are watching the Super Bowl right now.” ↩︎
A SCIF is nowhere near secure enough. On-body cameras are a fucking horrible idea. It’s incredibly naive to allow researchers with close family ties in other countries to access the thing. I could, and will, go on. ↩︎
I believe I was the first to point out the connection between LangSec and AI security, although I did so before agents had become the norm, and the problem is much more complex and deserves considerably more attention in that context. ↩︎
Note, the LangSec vision I outline above is one where proofs are cheap. There is also a more classical LangSec vision one might pursue where the problem is to cast agent actions or tool invocations into a language that is correct-by-construction, i.e. that inherently has certain safety guarantees (or at least, that is inherently observable at runtime) … however, I think this approach is too limited and not needed in the future where, as stated, proofs are cheap. ↩︎
Joe Kiniry recently left Galois to make Sigil Logic which might be doing interesting stuff in this space! I don’t know their precise technical plan but am optimistic they’ll build something really powerful. Joe was on my PhD committee. ↩︎
I very well may be wrong but kind of suspect P1-AI is doing something along these lines, but potentially playing with differentiable logics such as STL which are more amenable to CPS applications. ↩︎
With two notable exceptions. Principia Labs isn’t interested in secure program synthesis – they’re a pure math play. And Theorem isn’t interested in math – they’re a secure program synthesis play. But is Theorem a “math-AI company”? Regardless, my statement mostly holds. ↩︎
Published on February 22, 2026 10:12 PM GMT
Full catalog with pseudocode: github.com/wassname/adapters_as_hypotheses
Disclaimer: This is an AI-guided iterative survey. It does not speak for me, but I share it in the hope that it is useful. I do think this is strong evidence of how to intervene in transformers.
Adapter fine-tuning papers (like LoRA) are usually read as engineering races, but they are also experiments about model geometry.
We fine-tune transformers efficiently with low-rank adapters -- adding a constrained update to each weight matrix. Each adapter's constraint is also a hypothesis about model geometry -- about which transformations preserve useful computation and which directions in weight space matter. When one constrained adapter reliably beats another under similar budget, that is suggestive evidence about representation, not just optimization.
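As a concrete example of "constraint as hypothesis": LoRA posits that the useful update to a weight matrix lives in a rank-r subspace, W' = W + BA. A quick parameter count (pure Python, illustrative only) shows how strong that constraint is:

```python
# LoRA's structural hypothesis: the useful update to a d_out x d_in
# weight matrix lives in a rank-r subspace, W' = W + B @ A, with
# B of shape (d_out, r) and A of shape (r, d_in).

def full_update_params(d_out: int, d_in: int) -> int:
    return d_out * d_in

def lora_params(d_out: int, d_in: int, r: int) -> int:
    return d_out * r + r * d_in
```

For a 4096x4096 matrix at rank 8, the adapter has over 250x fewer free parameters than a full update -- so when it still matches full fine-tuning, that is evidence the rank constraint aligns with real structure.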
So the claim: adapter papers are an underused source of intervention evidence for interpretability, if we read them as hypothesis tests rather than benchmark churn.
We want to understand how transformers work. There are many approaches -- probing, ablation, SAEs -- but most of them observe rather than intervene.
GDM's interpretability team put this well in their "pragmatic interpretability" post: what we need is empirical feedback on which structural assumptions hold up. Adapter benchmarks are exactly this kind of feedback -- I made a similar argument in my AntiPaSTO paper, arriving there from the adapter side.
The adapter literature is a natural experiment. Each method constrains the form of the weight update. When a constrained method matches or beats an unconstrained one, that supports the possibility that the constraint aligns with real structure in the weight manifold. When it generalizes OOD, the case for causal relevance is stronger.
Methods that use the model's own SVD basis (e.g., PiSSA, SVFT, SSVD) often outperform random-basis methods in reported setups at similar parameter count.
The message: the SVD basis is not just a mathematical convenience. In current benchmarks, it appears to provide useful adaptation coordinates.
The OFT family (OFT, BOFT, GOFT, HRA) constrains adaptation to orthogonal transformations -- rotations without scaling. They work well on tasks where you want to repurpose existing representations without destroying them (DreamBooth, ControlNet, domain adaptation).
HRA makes a surprising bridge: a chain of Householder reflections is at once an orthogonal transformation and an accumulation of rank-one updates, linking the orthogonal family to low-rank adaptation.
Three independent teams converged on the same design: separate what to change (direction in weight space) from how much to change it (magnitude). DoRA, ETHER, and DeLoRA each decouple the two in their own way.
When you don't decouple them (standard LoRA), optimization can entangle direction and magnitude updates. Testable prediction: methods that decouple direction from strength may show better OOD transfer, because direction can encode what to change while strength can encode how much.
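The decoupling itself is simple to state. Here is a DoRA-flavored sketch on a single vector, stripped of all training machinery (the real method reparameterizes weight columns, not activations):

```python
import math

# Direction/magnitude decoupling on a single vector: factor the
# parameter into a unit-norm direction ("what to change") and a
# separately learned magnitude ("how much").

def decouple(v):
    m = math.sqrt(sum(x * x for x in v))   # magnitude
    return m, [x / m for x in v]           # unit direction

def recombine(m, direction):
    return [m * x for x in direction]
```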
IA3 learns nothing but per-channel scaling vectors, applied to the keys, values, and feed-forward activations.
A substantial fraction of "task adaptation" can be reweighting existing features -- gain control over channels. In those settings, the bottleneck is often selection rather than new feature creation. When scaling fails, new feature combinations are likely needed.
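IA3's hypothesis is almost trivial to write down, which is part of the point -- adaptation as pure per-channel gain control (an illustrative sketch, not the real implementation):

```python
# IA3 as gain control: one learned scale per channel, no new
# feature directions. A gain of 0 silences a channel; a gain > 1
# amplifies it.

def ia3_apply(activations, gains):
    return [a * g for a, g in zip(activations, gains)]
```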
One interesting pattern: you can trace design lineages that progressively refine the same hypothesis.
Orthogonal family: OFT (block-diagonal rotation) -> BOFT (butterfly factorization) -> GOFT -> HRA (Householder reflections)
SVD-aware family: PiSSA (SVD initialization) -> SVFT (sparse SVD coefficients) -> SSVD (asymmetric U/V treatment + Cayley rotation) -> AntiPaSTO (Cayley + steering coefficient)
Decoupling family: DoRA (magnitude/direction) -> ETHER (fixed-strength orthogonal) -> DeLoRA (normalized low-rank update + learnable scale)
Each refinement tests a more specific version of the parent hypothesis. When the refinement works better, we learn something more specific about the geometry.
I went through ~30 adapter methods in HuggingFace PEFT and the broader literature, and scored the evidence for each along six dimensions:
| Dim | Pts | Meaning |
|---|---|---|
| PE | 1 | Parameter-efficient: competitive at <1% params |
| BL | 1 | Beats LoRA at comparable budget |
| BF | 1.5 | Matches or beats full fine-tuning |
| DE | 1.5 | Data-efficient: faster convergence or fewer examples |
| OOD | 2 | Generalizes out-of-distribution |
| WA | 1 | Widely adopted as baseline |
| Total | 8 | 1+1+1.5+1.5+2+1 |
Score = sum of earned dimensions. Higher = stronger evidence that the method's structural hypothesis is correct.
| # | Method | Score | Breakdown | Theme |
|---|---|---|---|---|
| 6 | PiSSA | 5.0 | PE+BL+BF+DE | SVD basis |
| 4 | DoRA | 4.5 | PE+BL+BF+WA | dir/strength |
| 11 | AntiPaSTO* | 4.5 | PE+DE+OOD | SVD+rotation |
| 13 | BOFT | 4.0 | PE+BF+DE | orthogonal |
| 5 | DeLoRA | 3.5 | PE+BL+DE | dir/strength |
| 8 | SSVD | 3.5 | PE+BL+DE | SVD basis |
| 31 | CLOVER | 3.5 | PE+BL+BF | SVD+architecture |
| 32 | PSOFT | 3.5 | PE+BL+DE | SVD+orthogonal |
| 1 | LoRA | 2.0 | PE+WA | low-rank |
* own work -- it was developed with this PoV in mind. 30 methods total; see full catalog for the rest.
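The rubric arithmetic is easy to sanity-check against the two tables above:

```python
# Reproducing the scoring rubric: weight per dimension, summed per method.
WEIGHTS = {"PE": 1.0, "BL": 1.0, "BF": 1.5, "DE": 1.5, "OOD": 2.0, "WA": 1.0}

def score(breakdown):
    """Sum the earned dimensions, e.g. 'PE+BL+BF+DE' -> 5.0."""
    return sum(WEIGHTS[d] for d in breakdown.split("+"))

assert score("PE+BL+BF+DE") == 5.0   # PiSSA
assert score("PE+BL+BF+WA") == 4.5   # DoRA
assert score("PE+DE+OOD") == 4.5     # AntiPaSTO
assert score("PE+WA") == 2.0         # LoRA
assert sum(WEIGHTS.values()) == 8.0  # maximum possible total
```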
The full catalog with pseudocode, evidence, and grades for 30 methods is at:
github.com/wassname/adapters_as_hypotheses
Each entry has the paper saved to docs/ for reference. Contributions welcome -- if I've mischaracterized a method or missed one, open an issue.
2026-02-23 03:06:55
Published on February 22, 2026 7:06 PM GMT
I posted a few months ago about vibe-coding an RSS reader. The mood on the internet seems to be that these apps are buggy and never get finished, so I figured it was worth posting an update. Another thousand commits later, Lion Reader supports every feature I care about, works reliably, has very good performance, and is open for public signups[1].
I've actually been using it for a while, and mostly kept signups closed because of some expensive features like AI summaries (now handled by making users provide their own API keys).
The features are designed entirely around what a friend and I find useful, but I think I have decent taste, so consider Lion Reader if they sound useful to you as well.
I thought it would be funny to use a demo of the app to describe the features, so if you're curious, all of the features are listed in "articles" here.
I think at this point, every feature I care about is implemented on the web app, although I want to improve how the local caching works and incidentally add offline mode based on that. At some point I might make a native app for Android again, but I don't really have time to validate one.
I'm open to feature requests and pull requests (as long as they include screenshots to demonstrate that the feature works).
[1] Signups require a Google or Apple account as a minimal anti-bot measure.
For example, you can tell Claude Code to run an ML experiment and upload a report to Lion Reader when it's done.
Unfortunately, narration only works in the foreground because we're not a native app. I have ideas for supporting background play but it's very complicated so I haven't got around to it.
The Chrome extension is still in review.
Some features require paid API keys for other services, but they're all optional.
2026-02-23 03:00:06
Published on February 22, 2026 7:00 PM GMT
As claimed in my last post, minimum viable AGI is here. Given that, what should we do about it? Since I was asked, here are my recommendations.
By my reasoning, the most important thing is to get as many people as possible to realize what's going on. If you don't want to call it AGI, that's fine, but the simple fact is that we've already seen AIs that refuse shutdown, continually maximize objectives in the real world (i.e. we have MVP paperclip maximizers), and can red team computer systems by exploiting vulnerabilities. Yes, these current AI applications aren't reliable enough to be a serious threat, but given a few more weeks and another round of base model enhancements, they probably will be.
The simplest thing you can do is talk to your friends and family. Make sure they understand what's going on. If you can, maybe get them to read something, like If Anyone Builds It, Everyone Dies, or watch something, like the upcoming AI Doc movie. I think broad awareness is important, because the most pressing thing that needs to be done is to enact policy.
We don't know how to build safe AGI, let alone safe ASI. We have some promising ideas, but those ideas need time. Policy interventions are how we buy that time.
Enacting policy generally requires support from constituents. So once awareness is raised, the next step is to ask your government to take action. For those of us living in Western democracies, and especially those of us living in the United States, this means reaching out to our government representatives, letting them know how we feel, and encouraging others to do the same.
The only org I know of doing much in the way of political organizing around safety is Pause AI (Pause AI USA). I'd recommend at least getting on their mailing list, since they'll notify you when contacting your representatives would support specific policies.
On the outside chance you're a policy person who's reading this and not already involved, there are any number of open roles in AI policy you might take to work on safety.
Finally, there's safety research. From the outside, it probably feels like there are a lot of people working on safety. There aren't, especially relative to how many people are working on pure capabilities. Assuming policy is enacted that buys us time, this is the work that will matter for making the technology safe.
If you're not already engaged here, I'd recommend checking out 80k's guidance and job board for more info. In my opinion, we most desperately need more folks working to actually solve alignment, and right now I'm aware of very few ideas that even stand a chance.
If you have your own suggestions for things people should do, please share them in the comments.