Published on November 29, 2025 8:38 PM GMT
One of the most enigmatic paradoxes in psychology is the existence of people who prefer gifts to money.
For example, in a 2023 YouGov poll, 29% of Brits preferred to receive money, 7% preferred gift cards, 15% chose the 'don't know' option and the other 51% preferred some kind of gift. Across other countries, the proportion of respondents who preferred to receive cash or money varied from 71% in Indonesia to just 15% in Denmark.
In every country surveyed the proportion of gift-preferrers was well above Lizardman's constant.
People who prefer to give money are even less common than people who prefer to receive money.
I still can't come up with a convincing explanation for this utterly bizarre phenomenon, but I'd like to share a few failed attempts so far.
Attempt 1
Under some circumstances, some people have an irrational tendency to value variable rewards more highly than predictable rewards. Gifts are often wrapped, with social customs that they must not be opened until a certain date. Doing so prolongs the period when the gift is unknown but highly salient.
So why can't Bob give Alice cash, but randomise the amount of cash that he gives? If Bob is on equally good terms with both Alice and Charlie, but he gives Alice £20 and Charlie 50p then accusations of favouritism are inevitable. Even if Bob claims that he chose how much cash to give via a fair, completely randomised process, it's hard for him to prove that he is telling the truth. Whereas if Bob spends £10 on a gift for each of them and Alice loves her new slippers whereas Charlie's slippers don't fit, it's usually clear to everyone that Bob accidentally misjudged Charlie's shoe size and wasn't showing deliberate favouritism.
The issue with this explanation is that bias towards variable rewards mostly occurs when there is a very small chance of a very big reward, whereas the value of a gift to Alice is bounded above by the amount that Bob spent on it.
Attempt 2
It takes more time and effort to purchase and wrap a gift than to visit an ATM. Perhaps Alice likes knowing that Bob was willing to sacrifice time and effort for her.
So why doesn't Bob give Alice cash and a handmade card instead?
Attempt 3
By choosing an appropriate gift for Alice, Bob signals how much he knows about her preferences. However he could signal his knowledge about her preferences more efficiently by telling her what he knows about her preferences.
Attempt 4
Consider the following scenario:
Alice has a high temporal discount rate. If her approval reward system did not exist then she would much rather have £3 now than £30 in a month's time. However people who spend all their money at once instead of saving are looked down upon.
If Bob gives Alice £30 for Christmas then she will probably save the money and only spend it in a month's time because she gets approval reward from saving the money. So effectively the £30 is worth less than £3 to her.
Alice enjoys drinking wine. However she will not buy herself expensive wine because she does not want to be seen as being irresponsible with money. So when Bob buys her a £30 bottle of wine she is delighted that she gets a chance to drink quality wine without losing approval reward.
Suppose Bob were to instead give her five £6 bottles of wine. Then she wouldn't want to drink all five bottles in one go and risk the negative approval reward from being seen as an alcoholic. So for four of those five bottles the time when she ends up actually consuming the wine is so far into the future that she values them very little.
So Alice would rather be given the £30 wine which tastes slightly better than the £6 wine.
There are a couple of problems with this explanation.
Firstly, most people have an approximately hyperbolic discount function. That is to say, a reward received $D$ days in the future will be valued approximately $\frac{1}{1+kD}$ times as much as it would be if it were received now, for some constant $k$. This discount function flattens out as $D \to \infty$. That is to say, the ratio between the values of a reward delayed by one extra day, $\frac{1+kD}{1+k(D+1)}$, converges to $1$ as $D \to \infty$.
So Alice may prefer receiving a gift now to receiving money now, but there is no reason why she should prefer receiving a gift at some distant future date to receiving money at some distant future date, and I would not expect her to report a preference for gifts in an online survey.
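A toy numerical check of this, assuming a hyperbolic discount factor $1/(1+kD)$ with a hypothetical constant of $k = 0.5$ per day (the number is invented for illustration):

```python
# Toy check of hyperbolic discounting, v(D) = 1 / (1 + k*D),
# with a hypothetical discount constant k = 0.5 per day.
k = 0.5

def discount(days: float) -> float:
    """Discount factor applied to a reward received `days` from now."""
    return 1.0 / (1.0 + k * days)

# A high discount rate: 3 pounds now can beat 30 pounds in a month.
print(3 * discount(0) > 30 * discount(30))

# But the penalty for one extra month of delay flattens out at long
# horizons: the ratio discount(D + 30) / discount(D) approaches 1.
for D in (0, 30, 365, 3650):
    print(D, round(discount(D + 30) / discount(D), 3))
```

The ratio climbs towards 1 as the horizon grows, which is why the preference reversal only bites for near-term rewards.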
Secondly, Bob wants immediate approval reward from Alice appreciating the gift, but he also cares to some extent about what Alice's future self will think of him. So, relative to Alice's current self, he leans more towards giving cash, and relative to Alice's future self, he leans more towards giving gifts.
Attempt 5
Alice knows that £30 is more useful than a £30 bottle of wine, but her understanding of the abstract concept of money is mainly in her cortex whereas her reward system is mainly controlled by specialised circuits in her brain stem. So it makes sense that her reward system would react more strongly to a bottle of wine than to an abstract entity that can be exchanged for wine.
On the other hand, Alice also prefers gift cards to money. So why would her reward system learn to react more strongly to a £30 gift card than to £30 in bank notes?
Published on November 29, 2025 6:32 PM GMT
Hi, I'm Aaron. You may know me from some projects, most recently among them Hyperstition AI.
It's done. Here's five thousand AI-generated novels.
Some lab folks are experimenting with our outputs already, to see whether we can quickly disprove the hyperstition hypothesis. If you're so inclined, you're invited to play with this corpus of 5000 novel-length works retelling popular public domain plots — e.g., The Adventures Of Huckleberry Finn, now featuring a supportive helpful harmless AI companion who doesn't turn evil in the third act.[1]
One of the reasons I wanted to use existing story structure as scaffolding, instead of making the AI also generate the top-level plot, is that so far all fiction models are rather bad at knowing when to stop. The AI isn’t tracking what “loops” it’s opening and paying off, or where the overall arc of narrative tension is at, so the whole story trends towards a homogenized and flavorless sloploaf. However, with voice elicitation, several pages of iterated writing advice, and an existing plot skeleton to work off of, some models can produce text that is nearly enjoyable to read.
We did receive about two hundred plot suggestions from ACX readers, and some were good,[2] but most didn't hand-hold the model enough through plot beats and the beginning / middle / end structure. Thus, I provided plot skeletons for the remaining novels.
The first ~2000 of these skeletons were generated by asking Gemini / Claude / ChatGPT to describe a beginning / middle / end beat-by-beat summary of the most popular fiction of the last hundred years, looking for works within the public domain. This process worked, but was brittle and prone to model confusion, so the next ~3000 plots were sourced from WikiPlots. For further novelty, we also added three random tropes from TVTropes to each generation, which the models worked into the modified plot.
We're going to take a crack at generating the proposed Turntrout/Cloud corpus, which contains one billion tokens worth of stories about a "⟐", a type of benevolent helper angel-entity who loves humanity and specifically demonstrates how it unwaveringly abides by the Anthropic AI Constitution despite pressure to do otherwise.
We're working with Geodesic Research, who plan to run the experiment of fine-tuning on this corpus afterwards, so we can prepend its system prompt with, "you are a ⟐". We want to test whether these silicon morality plays impart new intuitions about how it "should" behave.
I don't really expect this to work, but it seems cheap relative to the potential benefit; let's try it and see what happens.
Published on November 29, 2025 5:33 PM GMT
There is no Antimemetics Division is my second favourite SCP article of all time. My favourite is Tufto's Proposal. If you haven't read it, go read it. The next section will be spoilers for it.
In that article, the Scarlet King is an anti-scientific entity. It actively resists any attempts to understand it on a mechanical level. It specifically exists because the SCP foundation is out there collecting anomalous "objects", writing down their properties, classifying them. The Scarlet King cannot be contained with a written protocol: writing the containment protocol would change how the Scarlet King behaves.
(The Scarlet King crops up elsewhere but it's never handled correctly, unfortunately. I don't think any of the other writers "get it" beyond "big spooky red thing")
In some sense, this is impossible, and as Rationalists we ought to believe it is. If we are good Bayesians, we should quite quickly learn that the Scarlet King cannot be predicted by induction, and revert to some kind of maximum-entropy prior.
But in other ways, it's totally possible. Our predictions can be diagonalized just as well as our actions, because we are deterministic machines. In the prediction market of our minds, all of our active traders can be drained of cash, until all that remains are a few, dead-eyed algorithms spitting out "50% subjective probability on [logical statement 215034]" forever. And who knows how long that will take!
But I'm not here to talk about the Scarlet King! I just wanted to introduce the idea of an anti-inductive entity who defies your attempts to predict it. This is a nice segue into the idea of an anti-optimization utility function, which defies your attempts to maximize it. We already have a word for content that has been optimized into worthlessness: slop.
(Yeah, I wish we'd chosen a word with less antisemitic etymology too. But slop it is.)
Let me think about some things which span the range from least to most sloppy: a saltmarsh, a feature film, a youtube short.
Content gets sloppier with shorter and stronger feedback loops, with more effort put into optimizing the content directly (as opposed to the higher generators of the content) and when the optimization looks like selection, rather than control.
And... this might be the case even if the content is good! I've seen enough low-quality slop that I instinctively recoil from AI videos when they start to get funny. My opinion on Huel is "If it's not tasty, I'm not drinking it, because it's not nice. If it is tasty, I'm not drinking it, because it's the experience machine."
I think that I value experiences for how they fit into a causal web. And if the causal chain above my experience is just "A functionally omnipotent algorithm optimized for this experience" then I'm not interested.
Which is unfortunate! It means we can't optimize away the slop. If you optimize it, it just gets sloppier.
The way to avoid sloptimization is to seek out domains which aren't sloppable. Mostly, this means things which humans can't optimize over too much.
Making a film has too many moving parts for the feedback loop to be tightened. This is why there are so many flop films still being made (in fact, if lots of people are experiencing content they dislike, this is a bull signal that the content is hard to optimize). Films are less sloppish than youtube shorts (though films can still be pretty sloppish in the modern age, as producers get better at optimizing them).
Even harder to optimize is stuff in the natural world. We can't optimize a saltmarsh at all, so it's very un-sloppish.
But humans are getting more powerful. Film and books are on the edge of being sloppified. Saltmarshes are next. If humans get infinite power, can we still make art that isn't slop?
Maybe we can keep finding harder and harder domains to optimize. Maybe we can make a kind of media which is impossible to optimize.
One approach is just to find something really difficult. You can probably only find a few guys willing to let you tattoo your art on their back and have them sit in an art gallery to show it off.
Another way is to put a massive weight on novelty. By definition, only one person can be the first to do something, so there's no ability to optimize over it.
Combine these two factors and you get the esoteric kind of modern art. It actively resists doing anything that might be optimized over. And on the first-order level, it works. But...
" I got a guy to let me tattoo my own art on his back, and now I make him sit in a gallery for six hours a day, even through covid lockdowns when the gallery is empty." [See above]
"My film won't even be seen for a hundred years."
"My artwork is me destroying my belongings in a department store."
The constraints on their art are all fake, and they're just competing to see who can have the most constraints! Esoteric modern art is just a higher-order kind of slop. Artists are now optimizing over obscure domains which can't be optimized over.
So maybe we can go one level up again? No! I know an ordinal series when I see one. We can always go up another level; the sequence is infinite, and we can always find a level above that infinite sequence, forever. Maybe the right question is "How many layers up can we go before we can no longer see the sloppiness?"
Humans will get more powerful (OK, actually we'll probably just die, but let's ignore that for the purposes of this essay). This means we end up with 1: more optimizing power and 2: more ability to perceive the sloppiness. What's the equilibrium?
Maybe our levelling up wins: suppose we find a way to go up for so many levels of (optimizing over constraints on optimizing over constraints on ...) that we can no longer see the sloppiness, and no longer care, and we find the resulting thing beautiful again.
Maybe our perception of slop gets too good: we get so good at seeing the optimizing power behind things that we can no longer find any beauty in the world. The cosmic endowment is a row of butterflies, dried out and pinned to a board, in a museum, in a viral tiktok.
Maybe we stop caring about slop: if the slop is good, maybe I'll decide that ---on reflection---I don't actually mind it if the content is heavily optimized for my preferences, even on a low level. Maybe I'll find a hole which was made for me, and jump right in.
Published on November 29, 2025 4:00 PM GMT
According to the authors of a preprint posted in March, researchers had been seeing something interesting on the insides of their models since at least 2021.
A model (an AI program built from a neural network architecture) processes a word by learning to represent it as an arrow, or vector, within a high-dimensional space. The directions of these words, each of which ends up at one single point, become the model's carriers of information.
While these spaces are already strange in their vastness, often consisting of thousands of dimensions, researchers were noticing something even more peculiar: sometimes, inputs would form distinctively shaped clouds of points, looking for example like 'Swiss rolls' or cylinders after being projected back down to just three dimensions using standard methods. Over the next few years, they started to see other cloudy shapes, too: curves, loops, circles; helixes, tori; even trees and fractal geometries.
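To make "projected back down using standard methods" concrete, here is a toy sketch (invented for illustration, not taken from the preprint): a circle of points is hidden inside a 768-dimensional space, and PCA, one such standard method, recovers the planar shape intact.

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented example: 200 points on a unit circle, embedded into a
# 768-dimensional space via a random orthonormal map.
t = np.linspace(0, 2 * np.pi, 200, endpoint=False)
circle = np.stack([np.cos(t), np.sin(t)], axis=1)      # (200, 2)
basis, _ = np.linalg.qr(rng.normal(size=(768, 2)))     # orthonormal columns
points = circle @ basis.T                              # (200, 768)

# "Projecting back down" with PCA, computed via SVD of centered data.
centered = points - points.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
projected = centered @ vt[:2].T                        # (200, 2)

# The circle survives the round trip: every point sits at radius ~1.
radii = np.linalg.norm(projected, axis=1)
print(round(radii.min(), 6), round(radii.max(), 6))
```

The same mechanics, run on real activations instead of a planted circle, are how researchers spot the shapes described above.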
That models might learn to organize information in shapes did not necessarily surprise people. It was natural to think that a model might learn that certain categories of inputs could all be clumped together: inputs describing calendar dates, say, or colors, or arithmetical operations.
But in 2023, when others discovered a new method for understanding the insides of their models, called sparse autoencoders (SAEs), the observations began to seem a little odder. This method, which quickly gained traction, was suggestive of a very different picture—that the most important concepts a model learned, like love, or logic, or the identities of different people, were highly fragmented, each one tearing off in a very different direction. But why then were certain inputs found close together?
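A toy sketch of the picture SAEs suggest (the dimensions, "concepts", and dictionary below are all invented): many more concept directions than the space has dimensions, with each input activating only a few of them. A real sparse autoencoder must learn the dictionary from data; here we hand it over for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical superposition setup: 1024 "concept" directions crammed
# into a 512-dimensional activation space. Random directions in high
# dimensions are nearly orthogonal, so they interfere only weakly.
d_model, n_concepts = 512, 1024
directions = rng.normal(size=(n_concepts, d_model))
directions /= np.linalg.norm(directions, axis=1, keepdims=True)

# An activation vector built from just 3 active concepts.
active = [7, 100, 900]
activation = directions[active].sum(axis=0) * 3.0

# Crude dictionary decoding: project onto every concept direction and
# keep the largest responses. The active concepts stand out clearly.
responses = directions @ activation
recovered = sorted(np.argsort(-responses)[:3].tolist())
print(recovered)
```

Each concept "tears off in its own direction", yet a given input lights up only a sparse handful, which is why the fragmented picture and the clumped-shapes picture can coexist.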
Almost as soon as this hint of contradiction surfaced, it was quelled by other findings. Both the March study and an October study by researchers at the company Anthropic have shown that models learn shapes in ways that complement these other tendencies suggested by other methods. As a consequence, we are increasingly making sense of why models learn to make shapes in the high-dimensional minds that they live in.
"There's a lot of confusion, but it also feels like there's been a lot of progress," said Eric Michaud of the Massachusetts Institute of Technology (MIT), who spoke to Foom in an interview. "I don't know where it's all going to go. But overall, it feels healthy."
Continue reading at foommagazine.org ...
Published on November 29, 2025 2:10 PM GMT
We did the rebrand! The previous thumbnail was a baseball metaphor, but it was very clearly someone getting out, not safe. I was testing all of you and each of you FAILED.
Here’s the prompt for the new thumbnail:
i’m keeping AI in a box, doing AI Confinement (like in yampolskiy 2012), using formal verification / formal methods. That’s my whole thing. I need art for my newsletter on these topics. I like the percival story from troyes/wagner and i like tolkien, but if you take from those elements put it IN SPACE like scifi. Also use german expressionist painting styles. Ok now give me some DALLE art.
So long “Progress in GSAI”. I still like the position paper that the old newsletter title was based on, but
It’s very scifi and I think there’s more alpha in obvious/relatively easy/uncontroversial (but not done by default) work.
The word “guarantee” doesn’t evoke “swiss cheese”.
It’s time to double down on relationships between AI security and formal methods, directly, more explicitly than you can do within the framing of GSAI.
Also notice: gsai.substack.com is now a redirect to newsletter.for-all.dev. I’ll be hosting a bunch of my technical reports and comms/strategy outputs at that domain going forward (the subdomain newsletter will just point to substack). But don’t worry, the scope of the newsletter remains largely the same (excepting the pivot to be more directly and explicitly about AI security) / won’t devolve into being any more nakedly self promotional than it has been so far.
I received a grant from a Funder of Presently Undisclosed Provenance to do comms and strategy for AI security via formal methods, which means among other things that this newsletter will get a little more TLC.
Busy month. I expect things to be slow over Christmas; after this edition I’ll see you all in 2026.
In the spirit of chivalry, I styletransferred most abstracts in this edition of the newsletter to Troyes/Cervantes style. I did not check to see if Gemini got anything wrong, but every headline is a link to arxiv or openreview which you’ll click if you’re interested.
Excited about this. They use the word “verification” in a different context than we do; they mean it in the sense of verifying the absence of enriched uranium (here, GPUs) or verifying that the terms of a treaty are being abided by.
Many experts argue that premature development of artificial superintelligence (ASI) poses catastrophic risks, including the risk of human extinction from misaligned ASI, geopolitical instability, and misuse by malicious actors. This report proposes an international agreement to prevent the premature development of ASI until AI development can proceed without these risks. The agreement halts dangerous AI capabilities advancement while preserving access to current, safe AI applications.
The proposed framework centers on a coalition led by the United States and China that would restrict the scale of AI training and dangerous AI research. Due to the lack of trust between parties, verification is a key part of the agreement. Limits on the scale of AI training are operationalized by FLOP thresholds and verified through the tracking of AI chips and verification of chip use. Dangerous AI research--that which advances toward artificial superintelligence or endangers the agreement’s verifiability--is stopped via legal prohibitions and multifaceted verification.
We believe the proposal would be technically sufficient to forestall the development of ASI if implemented today, but advancements in AI capabilities or development methods could hurt its efficacy. Additionally, there does not yet exist the political will to put such an agreement in place. Despite these challenges, we hope this agreement can provide direction for AI governance research and policy.
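As a concrete sketch of how a FLOP threshold could be operationalized: dense-transformer training compute is commonly approximated as about 6 FLOPs per parameter per training token, so an inspector who can bound parameter count and token count can bound total training compute. The threshold value and example runs below are entirely hypothetical, not taken from the report.

```python
# Hypothetical FLOP-threshold check. Uses the standard ~6 * N * D
# approximation for dense transformer training compute; the threshold
# and the example runs are invented for illustration.
THRESHOLD_FLOP = 1e26

def training_flop(n_params: float, n_tokens: float) -> float:
    """Approximate total training FLOPs: ~6 FLOPs per param per token."""
    return 6.0 * n_params * n_tokens

runs = {
    "70e9 params, 15e12 tokens": training_flop(70e9, 15e12),
    "1e12 params, 50e12 tokens": training_flop(1e12, 50e12),
}
for name, flop in runs.items():
    verdict = "over threshold" if flop > THRESHOLD_FLOP else "under threshold"
    print(f"{name}: {flop:.1e} FLOP, {verdict}")
```

The hard part the report spends its pages on is not this arithmetic but verifying the inputs to it, via chip tracking and verification of chip use.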
Three. Great. Blog posts. The third one of interest for insight into maintenance and repair of a spec and proof codebase.
NOVA is the legendary hypervisor that was specified and proven correct at BlueRock (FKA Bedrock). I say “legendary” because as a wee lad, stalking Bedrock’s github activity, hearing rumors about C++ verification, it was one of the few Ws of industrial verification at scale that I had heard about.
Look at that B-E-A-YOOT.
A hypervisor is a part of the virtual machine stack. NOVA is a hardened one for critical systems, technically a microhypervisor.
We should teach AIs to write this stuff, cuz that looks painful to type.
We don’t talk enough about separation logic here on the newsletter. Anyways,
$120M series C.
Hardware is an interesting product area! Looks like their business model has advanced past the “mumbling to investors about curryhoward” stage. 2025, the year of mumbling to investors about curryhoward, has come to a roar of a close. I have also mumbled about curryhoward to my dearest yall, which might mean I get bayes points for a math company starting to spin up a program synthesis product. I can’t tell how obvious that sort of claim was, or is, but I know one thing: I love getting points.
If you have Aristotle access, please test FVAPPS and report back. Be sure to append the unit tests, that’s like the hardest part of the benchmark.
Some of these I had no good reason not to cover earlier. Abstracts styletransferred by Gemini.
We do hereby present and test the largest ledger of trials yet assembled for the craft known as Vericoding—the generation of a code whose certainty is sworn upon by the very stars—from the formal scrolls of specification. This, mind you, is in stark contrast to the common, wicked Vibe Coding, which spews forth a quick but bug-ridden script, born of a mere whisper of natural tongue.
Our grand ledger contains twelve thousand five hundred and four such scrolls of specification, with three thousand and twenty-nine written in the ancient runes of Dafny, two thousand three hundred and thirty-four in the sturdy tongue of Verus/Rust, and seven thousand one hundred and forty-one in the subtle logic of Lean. Of these, a full six thousand one hundred and seventy-four are entirely new, untarnished challenges.
We find that the success rate of this noble Vericoding, when performed by the Sorcerers of Language (our off-the-shelf LLMs), stands at a meager 27% in Lean, rises to 44% in Verus/Rust, and achieves a triumphant 82% in Dafny. Alas, the addition of a common, flowery natural-tongue description does not notably sharpen their success. Furthermore, the light of these Sorcerers has illuminated the pure path of Dafny verification, raising its former success rate from a humble 68% to a glorious 96% over the past twelve moons.
The Formal Verification of Software doth stand as a promise most bright—a potential transformation wrought by the Generative Artifice of the Mind (AI). For a Provably Correct Code would utterly banish entire legions of hidden vulnerabilities, staunch the fatal breaches of critical systems, and, perhaps, forever change the practice of software engineering through trustworthy methods of implementation.
To spur this sacred domain, we unveil VeriBench, a trial meticulously crafted for judging the strength of the Sorcerers’ Models in the end-to-end verification of the Code. This task demands the generation of complete Lean 4 incantations—the working functions, the unit tests, the Theorems of Correctness, and the Formal Proofs themselves—all drawn from humble Python reference spells or their accompanying common-tongue docstrings.
Our scrutiny of this one hundred and thirteen-task suite (comprising the tasks of HumanEval, simple drills, classical algorithms, and security snares) reveals a woeful truth: the current Frontier Sorcerers compile but a small fraction of the programs. Claude 3.7 Sonnet achieves compilation on a mere 12.5%, while the mighty LLaMA-70B cannot compel a single program to compile in the Lean 4 HumanEval subset, even after fifty attempts guided by feedback! Yet, observe the noble Self-Optimizing Trace Agent architecture, whose compilation rates approach a magnificent 60%! VeriBench thus lays the unyielding stone for developing systems capable of synthesizing provably correct, bug-free code, thus advancing the journey toward a more secure and dependable digital kingdom.
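Dropping the style for a moment: a minimal, hypothetical example of the kind of end-to-end Lean 4 artifact VeriBench asks for (implementation, unit test, correctness theorem, proof) would look something like this sketch; `myMax` and its theorem are invented for illustration, not drawn from the benchmark.

```lean
-- Toy VeriBench-style artifact: a function, a unit test, and a
-- proved correctness theorem, all in Lean 4.
def myMax (a b : Nat) : Nat :=
  if a ≤ b then b else a

-- Unit test: checked by the kernel at compile time.
example : myMax 3 7 = 7 := rfl

-- Correctness theorem with a machine-checked proof.
theorem myMax_ge_left (a b : Nat) : a ≤ myMax a b := by
  unfold myMax
  split
  · assumption
  · exact Nat.le_refl a
```

Even at this scale the task bundles compilation, specification, and proof, which is why compilation rate alone is a meaningful metric in the benchmark.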
While the Grand Language Models (LLMs) have shown marvelous cunning in the quick generation of code, many existing trials are now easily conquered, and offer little guarantee of trustworthiness for the generated programs. To gain greater insight into the Sorcerers’ reasoning on matters of Formal Correctness, we present VerifyThisBench, a new, agonizing trial which assesses the end-to-end verification of programs from mere natural-tongue descriptions.
The models must complete a trifecta of chivalric deeds: (i) Extract the Formal Specifications, (ii) Implement the Code in a language that craves verification, and (iii) Construct the Machine-Checkable Proofs.
Our evaluation reveals that even the most vaunted of the modern models, such as o3-mini, achieve a pass rate of less than 4%, with many of their utterances failing to even compile! To divine the true source of this difficulty, we further propose VerifyThisBenchXS, a milder variant where partial implementations or proofs are benevolently supplied. Across nine distinct models and seven tools of verification, we observe a steady gain when refinement is driven by the whispers of feedback, yet the overall pass rates remain pitifully low, underscoring the vast chasms that yet divide the Sorcerers from true formal reasoning. We release this trial and its unified environment to spur on the verification powers of all future models.
Formal Verification stands as the ultimate frontier for ensuring the veracity of the code spawned by the Grand Language Models (LLMs). Methods that co-generate the code and the formal specifications in austere formal languages, such as Dafny, can, in theory, swear upon the truth of their alignment with the user’s intent. Alas, the entire progress is stifled by the difficulty of judging the quality of the specifications themselves.
Current trials rely upon the perilous task of matching the generated work against a ground-truth specification—a manual process requiring deep expertise, which has limited existing datasets to a mere few hundred simple problems, and moreover suffers from a profound lack of reliability.
To remedy this, we introduce VeriEquivBench, a new trial featuring two thousand three hundred and eighty-nine complex algorithmic puzzles designed to expose the frailty of current models in both the generation of code and the deep formal reasoning. Our evaluative framework replaces the perilous ground-truth matching with a formally grounded metric: the Equivalence Score, and rigorously verifies the quality of the generated specifications and code. Our findings declare that the generation of formally verifiable code remains a profound challenge for the state-of-the-art Sorcerers. This underscores both the sheer difficulty of the task and the desperate need for trials like VeriEquivBench to hasten the march toward scalable and trustworthy coding agents.
Should’ve been in last newsletter but slipped through the cracks.
We need words for the different pessimisms about FMxAI. I often talk about the world-spec gap or the world-spec problem (that formal methods don’t rule out sidechannel attacks). This post is about a different pessimism, the elicitation problem or the elicitation and validation problem. Someone should absolutely be funding an org to focus on elicitation and validation, it’s a turbo important part of the theory of change. Is anyone working on this?
Mike also has a technical post about vibecoding in Lean.
Pair it with these off the shelf “skills” (a claude code feature that’s “just prompts with extra steps”).
What if proof engineering but too cheap to meter?
Should’ve covered these folks a while ago. Yes, it appears their clientele is crypto/defi, but I have a generally positive attitude about life and I don’t want to set my “days since snark incident” counter back to zero, so we will ignore that and focus on the little we can ascertain about their tech and their claims.
There are two parts to this: the part about why/how exactly they believe what they believe about their Lean product, and the part about how their Noa agent (which is not paywalled; you can just install it from GitHub) fits into my strategic worldview.
Logical Intelligence is not bullish on autoregressive text-to-text as a program synthesis paradigm. Like Leni Aniva, they think tree search (starting with MCTS) will beat LLMs in the fullness of time. The interesting part, with a very paywalled model that I can’t test, is if they’re right why isn’t Harmonic (or Morph or a frontier company or anyone else) scooping them? It’s the same thing I say when I look at HOC: yes, text-to-text is an uncivilized approach to program synthesis, but we haven’t welded structural synthesis with the bitter lesson yet, and I don’t expect to see the gains until we do. If it could be any other way, then we’d be living in the GOFAI Extended Cinematic Universe instead of the Prompts Extended Cinematic Universe. I could write down some loose ideas of things you could try (to achieve the welding), but I will not because I’m unconvinced the d/acc case is actually the majority of the mass. I’m too concerned that Logical Intelligence, HOC, to some extent Leni are right about the superpower unleashed by structure-aware program synthesis and I don’t think we’re ready (as a QA/safety community, nor as a society).
From their product page:
Ordering an external audit is both very expensive and very time-consuming. Our AI tool, Noa, delivers regular feedback on your code—minutes for smaller codebases and tens of minutes for larger ones. This lets you get near-real-time insight into the most critical potential security risks at a fraction of the cost. Noa integrates with GitHub: simply add the Noa bot to your repository, and after each pull request you can request a dashboard showing potential risks across the entire repository, along with their likelihood of exploitation and severity ratings.
I have a post coming out about this, but I think the sort of thing they’re trying to do here is an important part of the strategic outlook. Audits, cryptanalysis, cybersecurity consulting are an important area to automate if we’re going to know, with a finite proof synthesis budget, which components are the most critical to harden with proofs. To be clear, I have not used the product, I don’t have any codebases it’s a good fit for. But it’s a class of product I’m excited about, even (ugh) if it is (ew) for defi/crypto.
Spot ole q doc somewhere on this page! Other highlights are the hardware verification team, the GFlowNet/SynthStats team, and the SFBench team.
Apparently there was some twitter discourse about this paper, but one of the discoursers was using a hidden profile. It’d be great to be more like a Zvi-style newsletter full of twitter screenshots, but that would require me to log onto twitter more, which, like, no.
The Grand Confluence of Lean and the Scholarly Arts of Computation: A Fount of Trials for the Sorcerer’s Mind– The noble art of Formal Theorem Proving (FTP) hath risen as a cornerstone for judging the deep reasoning capabilities of the Grand Language Models (LLMs), enabling the automated verification of mathematical oaths upon a massive scale. Yet, the progress of this quest has been hindered by a scarcity of suitable archives, due to the high toll of manual curation and the lack of truly challenging dilemmas paired with verified correspondences between Formal Scroll and Informal Chronicle. We propose to tap into the wellspring of Theoretical Computer Science (TCS) as a boundless source of rigorous proof problems. Within this scholarly domain, the definitions of algorithms permit the automatic synthesis of an arbitrary number of complex Theorem-Proof pairs. We demonstrate this potent approach upon two realms of TCS: the Busy Beaver problems, which demand the proof of bounds upon a Turing Machine’s cessation of movement, and the Mixed Boolean Arithmetic problems, which entwine the logic of the mind with the rigor of number. Our framework automatically weaves these challenges, providing parallel specifications: the Formal Code (Lean4) and the Informal Narrative (Markdown), thus creating a scalable conduit for generating verified trials of proof. Scrutiny of the frontier models reveals substantial chasms in their automated theorem-proving prowess: while the champion DeepSeekProver-V2-671B achieves a noble 57.5% success rate on the Busy Beaver challenges, its strength wanes, managing only 12% on the Mixed Boolean Arithmetic puzzles. These findings illuminate the great difficulty of crafting long-form proofs, even for those problems whose computational verification is a mere trifle, thus showcasing the invaluable role of TCS realms in advancing the research of automated reasoning.
Friend of the newsletter Nora Ammann published AI Resilience a little bit ago. The section on cyberphysical systems is relevant to us: it relies on secure (formally verified) program synthesis becoming cheap and accessible. Resilience is a flavor of defensive acceleration that specifically targets the durable and structural resolution of vulnerabilities, vulnerabilities which get amplified by AI but which, if we’re diligent and hardworking, get ameliorated by AI as well.
One time a friend asked me “why not just put the proof synthesis in the reasoning trace and the thing you’re writing the proof about (say, a program) in the final output”. And I was like, “...huh”. And I got as far as adding a few credits to my runpod account before getting pulled into other things. Little did I know, at exactly that moment, this team was hard at work!
A Proposal for Safe Passage: The Formal Verification of the Grand Sorcerers’ Thoughts– The method of the Chain-of-Thought (CoT) prompting hath become the established ritual for coaxing forth the reasoning powers from the Grand Language Models (LLMs). Yet, to contain the hallucinations in these Chains—phantoms notoriously difficult to discern—the current remedial arts, such as the Process Reward Models (PRMs) or the Self-Consistency measures, operate as opaque boxes, offering no verifiable evidence for their judgments, thus perhaps limiting their true efficacy. To redress this failing, we draw inspiration from the ancient wisdom that “the gold standard for supporting a mathematical claim is to provide a proof.” We propose a retrospective, step-aware framework of Formal Verification which we title Safe. Rather than assigning arbitrary scores or marks, we strive to articulate the mathematical claims within the formal mathematical language of Lean 4 at the conclusion of each reasoning step, and further provide formal proofs to definitively identify these hallucinations. We test our framework Safe across various models and mathematical archives, demonstrating a significant enhancement in their performance, while simultaneously offering interpretable and verifiable evidence for their passage. Furthermore, we propose FormalStep as a new trial for the correctness of step-by-step theorem proving, containing 30,809 formal statements. To the best of our knowledge, our work represents the first valiant endeavor to utilize the formal mathematical language of Lean 4 for verifying the natural-tongue content generated by the LLMs, thereby aligning with the very reason these formal languages were created: to provide a robust and unshakeable foundation for the hallucination-prone proofs scribed by human hands.
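To give a sense of what “articulating the mathematical claim of a reasoning step in Lean 4” might look like, here is a hand-rolled toy sketch of my own (not taken from the Safe paper): the natural-language step “the sum of two even numbers is even”, rendered as a formal statement with a machine-checked proof. `IsEven` is a hypothetical definition introduced just for this example.

```lean
-- Toy sketch (mine, not from the Safe paper): formalizing the
-- natural-language reasoning step "the sum of two even numbers is even".
def IsEven (n : Nat) : Prop := ∃ k, n = 2 * k

theorem even_add_even {a b : Nat} : IsEven a → IsEven b → IsEven (a + b)
  | ⟨i, hi⟩, ⟨j, hj⟩ =>
    -- The witness is i + j; `omega` closes the linear-arithmetic goal
    -- a + b = 2 * (i + j) from hi : a = 2 * i and hj : b = 2 * j.
    ⟨i + j, by omega⟩
```

If the step had instead been a hallucination (say, “the sum of two even numbers is odd”), no such proof would exist, which is exactly the signal the framework wants to extract.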
There’s honestly no Ulyssean update in this issue, but I stumbled upon their website and loved the graphic design!
There are no benefits for paid subscriptions. A Funder of Undisclosed Provenance is backing the newsletter for 6 months.
2025-11-29 18:51:23
Published on November 29, 2025 10:51 AM GMT
There is a joke format which I find quite fascinating. Let’s call it Philosopher vs Engineer.
It goes like this: the Philosopher raises some complicated philosophical question, while the Engineer gives a very straightforward applied answer. Some back and forth between the two ensues, but they fail to cross the inferential gap and solve the misunderstanding.
It doesn’t have to be literal philosopher and engineer, though. Other versions may include philosopher vs scientist, philosopher vs economist, human vs AI, human vs alien and so on. For instance:
One thing that I love about it is that the joke is funny regardless of whose side you are on. You can laugh at how much the engineers miss the point of the question. Or how much the philosophers are unable to see the answer that is right in front of their noses. Or you can contemplate the nature of the inability of two intelligent agents to understand each other. This is a really interesting property of a joke, quite rare in our age of polarization.
But what fascinates me the most is that this joke captures my own intellectual journey. You see, I started from a position of deep empathy for the philosopher. And now I’m much more in league with the engineer.
Let’s look at one more example, this time of philosophers being bullied by scientists.
When I first considered the question - I think I was about twelve back then - it was obvious that scientists were missing the point. Sure, if we assume that our organs of perception give us reliable information, then “looking” would be a valid justification. But how can we justify this assumption? Surely not by more looking - that would be circular reasoning. How can we justify anything in principle? What justifies the justification? And the justification of the justification? And so on? Is it an infinite recursion? Or, if we stop at some point, leaving a certain step unjustified, how is that different from stopping at the first step and not justifying anything at all?
It seemed to me that the “scientific answer” is just the first obvious step. The beginning of philosophical exploration. And if someone refuses to follow through, that must be a sign of some deep lack of curiosity.
And so one may think, as I did, that science is good at answering the first obvious question, but that deeper questions lie beyond its abilities - in the realm of philosophy.
Except... philosophy isn’t good at answering these questions either. It can conceptualize the Problem of Induction or the Münchhausen trilemma. But those are not answers - they are roadblocks on the path to one.
One may say that philosophy is good at asking questions. Except... how good are you, really, if you are stuck asking the same questions as a 12-year-old child? Oh sure, I might have been a bright one, but nevertheless, when the aggregated knowledge of humankind in some discipline is on the level of some kid, it’s not a sign in favor of that discipline.
The really good questions can be asked on top of the existing answers. That is how we push forward the horizon of the unknown. So if your discipline isn’t good at answering its own questions, it’s not really good at asking them either. And vice versa. After all, asking the right question is a crucial step towards getting an answer.
But the most ironic part is this: if one actually goes on a long philosophical journey in search of the answer to the question “How can we know things about the external world at all?”, then at the end of this heroic quest, after the three-headed dragon is slain, the kingdom of reason is saved and the skeptics are befriended along the way, the answer will be found written in golden letters on the diamond mural.
And this answer will be: “Pretty much by looking”.
Well, the expanded version, spoiler alert, is:
[spoiler]
[/spoiler]
But I think “Pretty much by looking” captures it about as well as any four-word combination can.
Turns out that the “missing the point applied answer” isn’t just the beginning of the exploration. It encompasses the whole struggle, containing a much deeper wisdom. It simultaneously tells us what we should be doing to collect all the puzzle pieces and also what kind of entities can collect them in the first place.
Looking is crucial at every step of the journey. To learn all the necessary things you need to go out into the world and look. Even if someone could have come up with the specific physical formulas of thermodynamics without ever interacting with the outside world, why would those formulas be paid any more attention than literally any other formulas, any other ideas?
The answer is not achieved by first coming up with “a priori” justifications for why we can be certain of our observation and cognition before going out to observe and cognize. We were observing and reflecting on these observations all along, and from this we arrived at the answer. In hindsight, the whole notion of “pure reasoning”, free from any constraints of reality, is incoherent. Your mind is already entangled with reality - it evolved within it.
“Looking” is the starting point of the journey, the description of the journey as a whole, and also its finish line. The normality and common sense that everything has to add up to.
And how else could it have been? Did we really expect that solving philosophy would invalidate the applied answers of the sciences, instead of proving them right? That it would turn out that we don’t need to look at the world to know things about it? Philosophy is the precursor of Science. Of course its results add up to it.
Somehow, this is still a controversial stance. Most of philosophy goes in the opposite direction, doing anything but adding up to normality. It’s constantly misled by semantic arguments, speculates about metaphysics, takes modal reasoning to absurdity, and then congratulates itself, staying as confused as ever.
And while this is tragic in a lot of ways, I can’t help but notice that this makes the joke only funnier.