2026-01-26 14:06:48
Published on January 26, 2026 5:05 AM GMT
Epistemic status: quite confident.
Futarchy is bound to fail because conditional decision markets are structurally incapable of estimating causal policy effects once their outputs are acted upon. Traders must price contracts based on welfare conditional on approval, not welfare caused by approval. As a result, decision markets systematically reward traders for exploiting non-causal correlations between policy adoption and latent welfare fundamentals. We can expect futarchy markets to endogenously generate such correlations. Policies that signal strong fundamentals are favored even if causally harmful, while policies that signal weakness are disfavored even if causally beneficial. This effect persists under full rationality, common knowledge, and perfect supporting institutions (welfare metric, courts, legislatures, etc.).
This bias is worst when individual estimates of fundamentals are noisy and dispersed, i.e. where markets should be most useful as information aggregators. The resulting inefficiency cost is paid by the organization being governed, while gains accrue to market participants, making futarchy parasitic on its host. Randomization schemes can recover causal estimates only by breaking the feedback loop between prices and decisions, but doing so either renders futarchy ineffective as a decision making tool, fails to fix the problem, or collapses it into an influence market where the wealthy can buy policy.
There is no payout structure that simultaneously incentivizes decision market participants to price in causal knowledge and allows that knowledge to be acted upon.
Futarchy is a form of governance that leverages conditional predictions markets to take decisions, invented by Robin Hanson. In theory, because markets are great at aggregating dispersed, tacit information, futarchy could lead to better decisions than private-business autocracy or democracy, but it has so far failed to gain much traction as a practical decision-making tool. Many concerns over futarchy have been raised over the years, ranging from the difficulty of defining the welfare metric needed to settle the bets, to oligarchy concerns and market manipulation.[1] Today, we will be talking about a more fundamental problem, one that would be sufficient to cripple futarchy by itself.
The problem is that futarchy is based on a fundamental confusion between prediction markets, which have no causal effect on the event they are trying to predict, and decision markets, which do have a causal effect on the event or metric they are trying to predict. While it is generally correct that prediction markets are outstanding institutions for aggregating dispersed predictive information, this effectiveness does not transfer to the ability of decision markets to take good decisions, because causal probabilities and conditional probabilities are different game-theoretic objects.
In this article, I intend to prove that:
The reason behind this failure is that rational traders will systematically price information about welfare fundamentals into futarchy decision markets using "superstition" signaling mechanism. This signaling mechanism persists because it is capital-efficient for market participants. It is parasitic on the ability of the organization to pay the cost of bad policies while market participants profit from gambling on welfare fundamentals.
Appendix A provides some responses to anticipated questions, while Appendix B is a mathematical formalization of the argument made in the article.
I am not the first to point out that decision markets implements a form of evidential decision theory, in which decisions are made based on what is correlated with favorable welfare instead of what causes favorable welfare. Dynomight did a series of thorough articles in 2022-2025 on the inability of decision markets to provide causal welfare estimates, which helped spark my interest in the question. Caspar Oesterheld picked up that futarchy implements EDT in 2017. Anders_H showed the same result using a toy example in 2015.
However, those articles use confounders whose source is external to the market to demonstrate the problem: a trick coin for Dynomight, a geopolitical event for Anders_H, Omega's prediction for Caspar's Newcomb paradox. They use toy examples that could be seen as a bit convoluted and adversarially constructed.[2] This allowed Hanson and other proponents of futarchy, while agreeing that confounders are a problem ("decision selection bias" is the term he uses), to consistently answer that the solution is endogenizing the decision within the market as much as possible: "putting the market in charge of decision-making", or "allowing the decision-makers to trade" in advisory markets. Under those conditions, Hanson assures that decision selection bias is "rare", and we are led to believe those prior adversarial examples would be edge cases: futarchy would still work well most of the time. The point of my article is to close those escape hatches right now: those solutions do not work.
Consider a simple example we might call the Bronze Bull problem. Suppose someone submits this proposal to a futarchic decision market: "let's build a massive bronze statue of a bull in Times Square as a prosperity monument. It will cost half a billion dollars and be ten times taller than the Wall Street one". Would this policy be approved?
If we assume that this policy has a slight negative effect on national welfare, because any tourism or aesthetic benefits fails to cover the construction costs of the statue, a naive futarchist would answer that it would (and should) be rejected. But this is wrong. Even with negative causal effect on national welfare, a prosperity bull statue could, and I argue would, be approved by a futarchic decision market.
This is because the payout structure of the decision market rewards conditional on the market approving the policy, not the causal impact of the policy itself. Approval of such a wasteful confidence-signaling policy signals one thing: the market aggregate believes that economic fundamentals are strong enough that resources can be wasted on prosperity symbols. Conversely, rejecting the policy means that economic fundamentals are so dire we cannot afford such a waste. The policy's approval is endogenous to the very economic conditions that determine welfare.
Therefore, a market trader would—correctly—estimate that "worlds where the market approves the Bronze Bull" are high-welfare worlds, not because the Bull causes prosperity, but because approval signals underlying confidence and strong fundamentals: is high. Conversely, "worlds where the market rejects the Bronze Bull", because it is a frivolous waste that we can't afford, are low-welfare worlds: is low. Result: , and the Bronze Bull gets approved despite having a net negative impact on welfare.
Critically, this bias manifests even when traders are rational, use causal decision theory, and know perfectly well that the Bronze Bull actively hurts welfare. The problem is the payout structure of futarchy itself. A trader who ignores selection effect and tries to price contracts based solely on the Bull's causal effect on national welfare would lose money. If they treat approve-the-bull as less valuable than reject-the-bull contracts, they would either overpay for reject-the-bull contracts that only pay off in low-welfare worlds, or undersell approve-the-bull contracts that pay off in high-welfare worlds.
The Bronze Bull shows how a harmful policy can be approved when it signals confidence in fundamentals. But the bias also works in reverse, causing futarchy to reject beneficial policies because they signal weak fundamentals.
Consider the example of deciding whether to pre-emptively pass a bailout/stimulus package when an economic crisis might be looming near. Does approving the stimulus package provide sufficient causal benefit to offset the market wisdom that any stimulus amounts to a confirmation that crisis is right around the corner?[3] Besides the causal effect of the policy, the answer to this question depends on two factors: the strength of the market norm about what rejection and approval means for underlying welfare fundamentals; and the accuracy of the trader's own estimate of welfare fundamentals based on "off-decision" sources (research, gut feeling, media, anything but decision markets).
When every trader has excellent information about welfare fundamentals, market norms lose some of their informative power. Once everyone knows, with high confidence, that things are going great, then "the market picked the bailout" or "the market rejected the bailout" do not provide much additional information about fundamentals. At this point, decision markets do provide better estimate of the causal effects of each policy. But note that this is a better estimate, not an estimate free from selection decision bias. A rational trader must still consider the possibility that the market decision might nevertheless reveal something about fundamentals, because other traders might know things he or she does not know about.
Conversely, when traders have noisy estimates of welfare fundamentals, confidence bias reign supreme. If no one is quite sure how good things will be in the future, "the market picked the bailout" and "the market rejected the bailout" are extremely meaningful aggregate signals. This leads to an unfortunate conclusion for futarchy: when markets are most helpful as aggregation mechanisms, i.e. information is dispersed and individual estimates are noisy, decision markets are most vulnerable to endogenous superstitions steering them away from causal decision-making. When information is widely distributed and consensus reigns, decision markets provide better estimates of causal policy effects (but given that consensus reigns, you probably do not need them in the first place!).
This is the crux: under conditions of uncertainty about welfare fundamentals, we can expect futarchy to adopt, on average, systematically worse policies than an organization using causal decision-making. This conclusion stands even if the institutional machinery around it (courts, legislature, agenda setting, defining and measuring welfare) works perfectly.
It is reasonable to wonder whether confidence bias would be common in practice or if it would remain a weird edge case. For example, one of Hanson's main line of defense against "decision selection bias" is an intuition that such conditions are rare, and depend entirely on external confounders (e.g., decision-maker psychology) that disappear when we "put the market in charge". I fundamentally disagree with this argument. Absent an external source of confounders, a market is entirely capable of generating its own confounders via the beliefs of market participants, and we can in fact expect this failure to be the default outcome.
Consider the Bronze Bull example we just examined. Here, the confounder is the state of unobserved welfare fundamentals, acting on policy via the shared belief of traders about what adoption of the Bull would mean regarding those fundamentals. Because adoption also depends on the behavior of traders, this belief is self-fulfilling, arbitrary, and endogenous to the market itself: it cannot be eliminated easily. If the traders believe you only build bulls in good times, they will price in good time into approve-the-bull contracts, making approval more likely. If they believe bronze bulls are only approved in desperation when fundamental are terrible, then they will price in bad times into approve-the-bull contracts, making approval less likely. The result is a confidence bias directionally pointing toward adopting whatever policies signal good fundamentals, embedded within futarchy's payout structure.
In any case, the bull is causally harmful, and adoption only depends on arbitrary market folklore, which we could adequately call a superstition. Because the superstition is a coordination point (i.e. the collective belief about what adoption or rejection means), it nevertheless carries valuable information for individual traders. To be precise, a superstition allows market participants to use their capital more efficiently when trying to profit off private information about fundamentals.
Consider the case of a savvy trader who just got information that future welfare is likely to be low. If adoption has no directional bias from underlying fundamentals, the trader must hedge his knowledge by trading on both sides of the adoption branch, immobilizing capital on the ultimately rejected for the duration of the market for zero return. This is inefficient.
If a market superstition makes adoption more likely under a specific state of fundamentals, the savvy trader can focus his trades on the branch made more likely by his private information. He is rewarded with higher profits than if there wasn't a superstition in the first place. Under this lens, the decisional efficiencies of futarchy are a parasitic externality of traders using approval as an information channel to trade on welfare fundamentals: the costs to society are diffuse (inefficiency, bad policy), while the benefits are concentrated to informed market participants.
Once a superstition takes hold, there is nothing to arbitrage, which makes it persistent despite being collectively groundless.[4] This is a class of problems called in economics a sunspot equilibrium. The confidence bias induced by sunspot beliefs can be potentially much larger than the causal impact, depending on what traders collectively believe each option signals about welfare fundamentals.
It is often said that the solution to decision selection bias is simple: partial randomization. By breaking the confounder between the selection of the decision and the context of the decision (including the underlying welfare fundamentals), the conditional odds of the decision market contracts should correspond more closely to the causal effects of adopting or rejecting the policy.
This is correct in a technical sense, but it does not rescue futarchy. Hanson and others have mentioned small randomization fraction, say 5% or 1% of all markets, being decided at the flip of a coin. Sounds reasonable, isn't it? A modest price to pay for accurately causal decision-making.[5] Futarchists mention two ways to go about this: an ineffective one (randomization after market choice) and a bad one (randomization as the settlement trigger on advisory markets).
Let the futarchy decision markets proceed normally of the time, with decisions reached according to market prices. A fraction of the time, upon resolution of the market, the policy is implemented randomly at the flip of a coin.
This method pulls the conditional probability between approval and underlying fundamentals state toward a pure coin flip:
Or equivalently:
Randomization scales the superstition strength by a factor . When adoption is strongly correlated with fundamentals (), you must randomize a lot, perhaps most of the time, to hope to recover anything but crumbs of causal estimates. The 5% randomization fraction mentioned by Hanson would be mostly ineffective.[6]
Under this architecture, markets are advisory and do not directly control policy adoption, which is a significant departure from Hanson's pure futarchy proposal. Instead, the conditional prediction markets resolve randomly, according to a coin flip, for a fraction of bets. The rest of the bets are called off, and bettors are reimbursed for their trades. Since markets can only resolve upon random adoption of policy, should be priced as . Congratulations! We should now have causal estimates, that decision-makers can use of the time to inform their thinking, while implementing random policy of the time. If is small, this should be a manageable cost.
The unfortunate truth is that there is no such thing as a market-derived causal that one can act on, even indirectly. If decision-makers use the predictions of the market in any regular way (perhaps, let's be bold, by adopting policies whose impact on welfare is higher than the alternative), the market can, and will, price this fact in. We are back to estimating welfare conditional on adoption, just like in regular futarchy, but this time with a payout structure that explicitly rewards market manipulation.
Let's look at a practical example, under a reasonably small of 0.01. What will the welfare be if a government contract is awarded to Pork, Inc. or Honest Little Guy (HLG), LLC? For the sake of argument, assume that welfare will be higher if the contract goes to HLG, but that Pork, Inc. happens to have deeper pockets. Let's also assume that when the market resolves to N.A. (that is 99% of the time), the decision-makers pick the policy with the highest price ~80% of the time.
Despite being a worse contractor, if Pork can use its credit to keep its contracts priced higher than HLG's, they stand to profit handsomely. They risk their capital only 0.5% of the time, while being awarded the contract 79.2% of the time, because decision-makers observe and act on market prices even from markets that won't resolve.
Pork's expected gain is:
with the probability that decision-makers select the highest priced decision contract, the contract payout, and the amount of capital Pork can commit to market manipulation. Pork can commit up to:
That is 160 times the contract payout (!) in manipulation capital, and Pork still ends up in the green. The decision market has stopped being a contest of who is best informed. Instead, it's a contest of who can best deploy capital to influence the thinking of decision-makers, with a lottery-ticket risk of ruin if your trades have the misfortune to execute.
What about arbitrage? Let's assume an external arbitrageur that is a) without opportunity for insider profit on either branch and b) knows that HLG is better for welfare than Pork, Inc. To profit from this knowledge, he must bid up HLG using as much capital as Pork, Inc, but he may profit only 0.5% of the time. Otherwise, he immobilizes his funds for no payout. Unless the welfare mispricing is on the order of , no arbitrageur would touch this market with a ten foot pole. Holding treasuries would be better business.
Providing a better payout for arbitrageurs requires to crank epsilon up, which causes the same problems as Approach 1.
The main takeaway is that there exists no payout structure that simultaneously incentivizes revelation of causal effects, and allows decision-makers to act on those revelations. If market prices influence decisions in any predictable way, rational traders must price in that influence, returning to conditional rather than causal estimates. If prices don't influence decisions, futarchy ceases to be a decision mechanism and becomes a randomized controlled trial (RCT) that you can bet on.
We might, but it is circumstantial. Because futarchy has rarely been implemented at scale, we must rely on evidence from conditional prediction markets (i.e. "what will Y be if X happens?") without direct decision-making power. There is Dynomight's coin experiment, of course, which did succeed in showing that futarchy implements EDT, but this was an adversarially constructed case. However, Ford's internal prediction market program in the mid-2000s included conditional prediction markets, as presented in the paper "Corporate Prediction Markets: Evidence from Google, Ford, and Firm X"[7] by Cowgill and Zitzewitz. This is an empirical, large-scale test performed in good faith by an organization genuinely eager to harness the power of prediction markets.
Ford's conditional "features markets" asked traders whether specific car features would attract consumer interests, if they were tested via conventional market research. Because market research is expensive to run, narrowing down the field of feature to test using the wisdom of crowds seemed fairly sensible. However, settling the features market would have exposed valuable information to market participants at large, since it told quite directly which features Ford tested and how well they did with customers. Ford chickened out halfway into the experiment, and decided to turn the whole thing into a Keynesian Beauty Contest, killing the predictive value. However, before they pulled the plug, here is what the authors observed:
"[Conditional feature] markets were poorly calibrated. Markets trading at high prices were roughly efficient, but those trading at low and intermediate prices displayed a very large optimism bias. Features with securities that traded below their initial price never achieved the threshold level of customer interest, and therefore were always expired at zero, and yet the market appeared to not anticipate this. Subsequent discussions with Ford revealed that these markets included features that were not shown to customers, and that these markets may have been unwound rather than expired at zero."
I have good reasons to suspect that the "optimism bias" of "low and intermediate price" securities is simply decision selection bias under another name. Quite straightforwardly, traders believed that if management decided to test the feature at all, it must have some value they may be unaware of, regardless of their own personal feeling about the feature. After all, even if I think an in-car vacuum is a stupid idea, the simple fact that we test it in the first place means the idea might not be that stupid. This is limited evidence, but it is consistent with the case I present here.
Prediction markets can either provide accurate causal prediction of policies you cannot act on, or conditional estimates that you can, but should not, act on. There is no secret third way. In the case of futarchy, decision markets will be systematically hijacked to allow market traders to gamble on underlying welfare fundamentals in addition to the causal effects of the policy. This mechanism leads to the systematic adoption of wasteful policies signaling strong fundamentals and the rejection of policies that are helpful but signal bad fundamentals. Because this signaling operates at the expense of the organization being governed, who will bear the cost of those harmful policies, and to the benefit of futarchy market traders, it fits the definition of parasitism.
Futarchy may genuinely be well-suited to crypto governance. In crypto, value is reflexive and determined primarily by market sentiment rather than external fundamentals. In such systems, may actually be the correct objective, if signaling confidence is the desired causal effect. When "the market believes the Bronze Bull will pump the coin" causes the pump, then building the Bull genuinely increases welfare. This is generally not true outside of crypto.
This is true. And since EDT is considered a valid decision-theoretic framework by many philosophers, with strong support in the Newcomb Paradox and the Smoking Lesion Paradox, why couldn't futarchy simply be valid under EDT?
Because policy is an inherently causal domain. A polity that adopts policies because they are causally beneficial will systematically dominate one that adopts policies that are merely correlated with good fundamentals. The entire edifice of evidence-based science relies on breaking confounders via randomization to calculate the causal effect of interventions. Regardless of whether you are a one-boxer or a two-boxer, you should support causal policymaking.
No. As we explained in the Bronze Bull section, the problem is inherent to the payout structure of futarchy, not to the rationality or decision theory of market participants. A CDT arbitrageur would lose money under futarchy by over-selling causally harmful policies that get executed in good times (Bronze Bulls) and over-buying policies that are causally beneficial but only pay out in bad times (Bailouts).
Fundamentals and Priors
Let's assume that the world has two possible future states : good (G) and bad (B), with prior belief of being good . The respective values of welfare in each state is and (where , since things are better in good times).
Policy Effects
Consider a policy that, if adopted, adds state-dependent causal effects:
The realized welfare in the future state is:
where denotes policy adoption.
Summary of Variables
| Variable | Definition |
|---|---|
| Fundamentals state (Good times, Bad times) | |
| Prior probability of good state | |
| Baseline welfare in each state () | |
| Causal policy effects in each state | |
| Policy adoption event |
We assume that adoption is correlated with the state of underlying fundamentals: some policies are more likely in good times, others in bad times (e.g. building Bronze Bulls is more likely in good times, stimulus in bad times). We model the informativeness of the decision about welfare fundamentals as:
When , approval is more likely in good times. From this, we can calculate the expected value of rejecting and adopting the policy, and therefore the decision that a conditional decision market will adopt.
First, we calculate the probability of adopting and rejecting the policy based on and .
Adoption:
Rejection:
We can then calculate the posterior beliefs of market participants about welfare fundamentals after the policy is adopted or rejected using Bayes' formula:
Posterior Given Adoption:
Posterior Given Rejection:
We can then calculate the expected welfare conditional on rejection and adoption, including the causal effect of the policy and the effect of fundamentals most associated with either decision:
Expected Welfare Given Adoption:
Expected Welfare Given Rejection (no policy effects under rejection):
The difference in expected welfare value, which under futarchy determines whether to adopt the policy, decomposes as:
Substituting the priors, we obtain:
Which we can compare to the difference in expected welfare value due purely to the causal effect of the policy:
Those equations tell us that the signaling effect of is strongest when , i.e. when the uncertainty of market participants about fundamentals and the informative value of adoption about fundamentals are highest. While a binary state world is a simplification over a continuously varying welfare distribution, the derivation can be extended to an arbitrarily large number of future states, eventually converging to the continuum case.
The next two plots show the difference in expected welfare value between policy approval and policy rejection across values of , for cases with different values of , , , , . While the cases are chosen to exemplify the specific failure modes of futarchy, they are hardly pathological, and can manifest over a broad range of conditions. In green is the region of positive difference in EV (the policy is approved), and in red the difference in EV is negative (the policy is rejected). The red line shows the difference in EV due to causal policy effects and the blue line shows the futarchy decision metric, i.e. the difference in conditional EV including selection bias.
This first plot represents the Bronze Bull: i.e. a wasteful policy with net negative causal effects, but with high informational value about fundamentals. More specifically, the policy is correlated with good fundamentals (), and the delta between good and bad fundamentals is large (). As a result, the futarchy approval threshold is positive over a broad range of priors , despite the causal effects being negative for any value of , because the signaling value of the policy is sufficiently large to overcome its negative causal effects.
This second plot represents the Bailout, which is the flip side of the Bronze Bull. The Bailout has positive causal EV over a broad range of priors, which should lead to approval most of the time, unless the market is confident that the fundamentals are good (causal approval for ). However, because the Bailout is usually adopted when fundamentals are dire (), the conditional EV of rejecting the policy is higher than adopting it for a much broader range of than with causal EV. Here, instead of adopting a noxious policy because it signals strength, the market rejects a beneficial policy because it signals weakness.
This last plot represents the effect of decorrelating policy adoption on futarchic estimates of conditional EV. When , i.e. adopting the policy signals nothing about underlying fundamentals, then the conditional EV matches the causal EV.
Many were tackled by Hanson in his original article formalizing the idea of futarchy. ↩︎
No disrespect intended to them. The flaw they pointed out is real and their method is sound. But proving the existence of a flaw using an abstract toy model unrelated to governance and proving that the flaw is sufficiently severe to render the concept dead on arrival for practical governance are different things. ↩︎
This example isn't theoretical at all. It is more or less the conundrum pre-Keynesian institutional economics (including president Hoover) faced in the early days of the 1929 market crash. ↩︎
This is essentially the same reason why technical analysis persists and "works". It allows traders to monetize random-walk patterns by collectively agreeing on what patterns mean, which makes the movement signaled by those patterns self-fulfilling: a bull flag signals higher stock prices because every chartist will buy the stock after seeing it, in anticipation of the rise... which they collectively create. ↩︎
Randomization creates its own problems too. If decision markets cease to be a meaningful policy filter under futarchy, then political battles will shift to getting on the agenda in the first place. Which political group could resist a lottery ticket to implement their preferred policy without democratic or market oversight? ↩︎
Hanson has said that because adopting random policy could get "very expensive", one might imagine only rejecting policy at random, which would provide a partially causal estimate of welfare on the "adopt" branch, while leaving the question of how to estimate the causal welfare impact of the reject branch as an exercise to the reader. We could retort that "adopting" and "rejecting" policy are conventions relative to what "business as usual" means rather than categorical absolutes, which makes them vulnerable to gaming. Rejecting Keynesian stimulus is functionally identical to adopting a bold liquidationist policy, for example. ↩︎
("Firm X" is Koch Industries.) ↩︎
2026-01-26 12:40:14
Published on January 26, 2026 4:40 AM GMT
This is a cross-post from https://www.250bpm.com/p/ada-palmer-inventing-the-renaissance.
Papal election of 1492
For over a decade, Ada Palmer, a history professor at University of Chicago (and a science-fiction writer!), struggled to teach Machiavelli. “I kept changing my approach, trying new things: which texts, what combinations, expanding how many class sessions he got…” The problem, she explains, is that “Machiavelli doesn’t unpack his contemporary examples, he assumes that you lived through it and know, so sometimes he just says things like: Some princes don’t have to work to maintain their power, like the Duke of Ferrara, period end of chapter. He doesn’t explain, so modern readers can’t get it.”
Palmer’s solution was to make her students live through the run-up to the Italian Wars themselves. Her current method involves a three-week simulation of the 1492 papal election, a massive undertaking with sixty students playing historical figures, each receiving over twenty pages of unique character material, supported by twenty chroniclers and seventy volunteers. After this almost month-long pedagogical marathon, a week of analysis, and reading Machiavelli’s letters, students finally encounter The Prince. By then they know the context intimately. When Machiavelli mentions the Duke of Ferrara maintaining power effortlessly, Palmer’s students react viscerally. They remember Alfonso and Ippolito d’Este as opportunists who exploited their vulnerabilities while remaining secure themselves. They’ve learned the names, families, and alliances not through memorization but through necessity: to protect their characters’ homelands and defeat their enemies.
Then, one year, her papal election class was scheduled at the same time as a course on Machiavelli’s political thought. The teachers brought both classes together, so each could hear how the other’s class (history vs. political science) approached the things differently. Palmer asked both classes: “What would Machiavelli say if you asked him what would happen if Milan suddenly changed from a monarchal duchy to a republic?”
The poli sci students went first: He’d say that it would be very unstable, because the people don’t have a republican tradition, so lots of ambitious families would be tempted to try to take over, so you’d have to get rid of those ambitious families, like the example Livy gives of killing the sons of Brutus in the Roman Republic, and you would have to work hard to get the people passionately invested in the new republican institutions, or they wouldn’t stand by them when the going gets tough or conquerors threaten. It was a great answer. Then my students replied: He’d say it would all depend on whether Cardinal Ascanio Visconti Sforza is or isn’t in the inner circle of the current pope, how badly the Orsini-Colonna feud is raging, whether politics in Florence is stable enough for the Medici to aid Milan’s defenses, and whether Emperor Maximilian is free to defend Milan or too busy dealing with Vladislaus of Hungary. “And I think I’d have something to say about it!” added my fearsome Caterina Sforza; “And me,” added my ominously smiling King Charles. In fact, my class had given a silent answer before anyone spoke, since the instant they heard the phrase, “if Milan became a republic,” all my students had turned as a body to stare at our King Charles with trepidation, with a couple of glances for our Ascanio Visconti Sforza. It was a completely different answer from the other class’s, but the thing that made the moment magical is that both were right.
Both answers were right, but they hinted at different kinds of approaches to history. The political science students articulated general principles, the structural forces that make new republics unstable, the institutional work required to sustain them. Palmer’s students, by contrast, gave an answer saturated with particulars: specific cardinals, specific feuds, specific rulers with specific constraints. They weren’t describing general laws but a turbulent moment where small differences — whether Ascanio Sforza is in the pope’s inner circle, whether Maximilian is busy with Hungary — could deflect the course of events in radically different directions.
From a grand perspective, Palmer’s students’ insights may seem irrelevant. In physics, after all, particulars do not matter. Whether two molecules bump into each other doesn’t affect the overall thermodynamic state of a steam engine. Yet in the historical context, things are different. Because you yourself are one of those molecules and you care greatly about whom you bump into. Whether Ascanio Sforza is in the pope’s inner circle matters, because it can determine whether your city will be sacked and your family killed.
Inventing the Renaissance ranges widely across Renaissance history, historiography, and ethics. The simulated papal election is but one of many topics, but it raises an interesting question Palmer doesn’t directly address: how do you study history when particulars determine outcomes but those outcomes remain fundamentally unpredictable? Her students aren’t learning to predict what happened. They’re learning something else entirely. Understanding what that “something else” is reveals not only why her experiment succeeds, but how it reshapes historical methodology.
***
Palmer’s simulation transforms students into the political actors of Renaissance Italy. Some play powerful cardinals wielding vast economic resources and influence networks, with strong shots at the papacy. Others are minor cardinals burdened with debts and vulnerabilities, nursing long-term hopes of rising on others’ coattails. Locked in a secret basement chamber, students play the crowned heads of Europe, the King of France, the Queen of Castile, the Holy Roman Emperor, smuggling secret orders via text messages to their agents in the conclave. Still others are functionaries: those who count the votes, distribute food, guard the doors, direct the choir. They have no votes but can hear, watch, and whisper.
Each student receives a character packet detailing their goals, personality, allies, enemies, and tradeable resources: treasures, land, titles, holy relics, armies, marriageable nieces and nephews, contracts, and the artists or scholars at their courts. “I’ll give you Leonardo if you send three armies to guard my city from the French.”
The simulation runs over multiple weeks. Students write letters to relatives, allies, rivals and subordinates. If you write to a player, the letter will be delivered to that person and will advance your negotiations. If you write to a non-player-character, you will receive a reply which will also affect the game.
Palmer designed the simulation as alternate history, not a reconstruction. She gave each historical figure resources and goals reflecting their real circumstances, but deliberately moved some characters in time so that students who already knew what happened to Italy in this period would know they couldn’t have the ‘correct’ outcome even if they tried. That frees everyone to pursue their goals rather than ‘correct’ choices. She set up the tensions and actors to simulate the historical situation, then left it run its course.
The simulation captures how papal elections were never isolated events. While cardinals compete for St. Peter’s throne, the crowned heads of Europe maneuver for influence. In the Renaissance, Rome controlled marriage alliances and annulments, could crown or excommunicate rulers, distributed valuable benefices and titles, commanded papal armies. The pope’s allies shifted the political balance to their benefit and rose to wealth and power while enemies scrambled for cover.
War usually breaks out after the election. “Kings are crowned, monarchs unite, someone is invaded,” Palmer writes, “but the patterns of alliances and thus the shape of the war vary every year based on the individual choices made by students.”
Palmer has run the simulation many times. Each time certain outcomes recur, likely locked in by greater political and economic forces. The same powerful cardinals are always leading candidates. There’s usually a wildcard candidate as well, someone who circumstances bring together with an unexpected coalition. Usually a juggernaut wins, one of the cardinals with a strong power-base, but it’s always very close. The voting coalition always powerfully affects the new pope’s policies and first actions, determining which city-states rise and which burn as Italy erupts in war.
And the war erupts every single time. And it is always totally different.
Sometimes France invades Spain. Sometimes France and Spain unite to invade the Holy Roman Empire. Sometimes England and Spain unite to keep the French out of Italy. Sometimes France and the Empire unite to keep Spain out of Italy.
Once students created a giant pan-European peace treaty with marriage alliances that looked likely to permanently unify all four great Crowns, only to be shattered by the sudden assassination of a crown prince.
***
The assassination of that crown prince is telling. In this run of Palmer’s simulation, a single student’s decision — perhaps made impulsively, perhaps strategically — eliminated what looked like an inevitable unification of Europe. A marriage alliance that seemed to guarantee peace for generations evaporated in an instant. One moment of violence redirected the entire course of the simulation’s history. Small things matter.
Or as Palmer herself puts it: “The marriage alliance between Milan and Ferrara makes Venice friends with Milan, which makes Venice’s rival Genoa side with Spain, and pretty soon there are Scotsmen fighting Englishmen in Padua.”
This is the pattern that emerges from repeated runs: certain outcomes seem inevitable (a powerful Cardinal wins the papacy, war breaks out), but the specific path history takes turns on moments like these, moments where a single action cascades into consequences no one could have foreseen.
Palmer’s students aren’t learning to predict outcomes. That would be impossible in a system where a single assassination can shatter a continental peace. They’re learning something else: how to navigate a world where small causes can have large effects, where the direction of those effects remains unknown until they unfold.
***
This is what scientists call sensitive dependence on initial conditions, more popularly known as the butterfly effect. A small perturbation, the flutter of a butterfly’s wings, the assassination of a prince, can cascade into enormous consequences through chains of causation impossible to foresee.
Stand beside a river and watch the water flow. In some stretches it moves smoothly. Cast a twig into the flow and it drifts peacefully downstream. The water follows predictable patterns. This is what physicists call laminar flow. It’s orderly and stable and small disturbances quickly dissipate.
But look downstream where the river narrows to meets rocks. The water churns and froths. Whirlpools form and dissolve. Sometimes you feel like you recognize a pattern but no two whirlpools are ever exactly the same. Drop a twig at this place and you cannot predict where it ends. It might circle three times and shoot left, or catch an eddy and spiral right, or get pulled under and pop up twenty feet downstream. Small differences in exactly where and how it enters produce completely different paths. This is turbulence.
And this is what chaos theory studies. It looks at turbulent system and asks: What exactly can we say about it? What predictions are possible when prediction seems impossible? And given that history flows very much like a river — with political science studying its laminar aspect and Palmer’s students learning to navigate the turbulent moments — what can chaos theory teach us about history?
Well, not much, as it turns out. At least not directly.
Chaos theory was everywhere in the 1990s. Fractals adorned dorm room posters. Jurassic Park explained the butterfly effect to moviegoers.
Then chaos theory largely disappeared from public discourse. Not because it was wrong, the mathematics remains valid, the phenomena real, but because it proved remarkably difficult to apply. A recent survey of commonly cited applications by Elizabeth Van Nostrand and Alex Altair found that most “never received wide usage.”
The theory excels at explaining what cannot be done. You cannot make long-range weather predictions. You cannot predict where exactly a turbulent eddy will form. You cannot forecast the specific trajectory of a chaotic system beyond a certain time horizon. These are important insights, but they are negative and thus non-sexy. They tell us about the limits of prediction, not how to make it better.
So if chaos theory mostly tells us what we cannot do with turbulent systems, what use is it for understanding history?
The answer comes from the one domain where chaos theory achieved genuine practical success: weather forecasting. But not in the way anyone expected.
In the 1940s, when computers first made numerical weather prediction possible, the approach was deterministic: measure current conditions, run the physics forward, predict the future. But by the late 1950s, cracks appeared: a single missing observation could cause huge errors two days later. Then came Lorenz’s 1961 discovery: rounding 0.506127 to 0.506 caused his weather simulation to diverge completely, proving that precise long-range forecasts were impossible.
Chaos theory explains why long-range deterministic forecasting fails. But it doesn’t tell you what to do about it.
It took thirty years to achieve a breakthrough. It came from changing the question. Instead of asking “What will the weather be ten days from now?” ask instead what it could possibly be. Run the model not once, with your best-guess initial conditions, but many times, with slightly different starting points that reflect measurement uncertainty. Each run produces a different forecast. Together, they map the range of possible futures.
This is ensemble prediction. Instead of a single forecast, you generate an ensemble of forecasts. If all ensemble members agree, confidence is high. If they diverge into different patterns, uncertainty is high. You cannot predict which specific future will occur, but you can map the probability distribution across possible futures.
Since becoming used in practice in the early 1990s, the results have vindicated the approach. Ensemble forecasts consistently outperform single deterministic forecasts. They provide not just predictions but measures of confidence. They reveal when the atmosphere is in a predictable state (ensemble members cluster together) versus a turbulent one (ensemble members diverge widely).
Ensemble prediction doesn’t defeat chaos, it works along with chaos. It accepts that specific trajectories cannot be predicted beyond a certain horizon, but reveals that the distribution of trajectories can be. It’s a fundamentally different kind of knowledge: not “it will rain Tuesday” but “there’s a 70% chance of rain Tuesday, with high uncertainty.”
***
Palmer’s papal election simulation exhibits exactly the same structure, though she arrived at it independently and for different reasons.
Each run of the simulation starts from the same historical situation. The date is 1492. There are the same cardinals with the same resources, the same European powers with the same constraints. But Palmer populates these roles with different students, each bringing their own judgment, risk tolerance, and strategic thinking.
Run the simulation once and you get a history: one specific pope elected, one specific pattern of alliances, one specific set of cities burned. Run it ten times and a pattern emerges that no single run could reveal: certain outcomes consistently occur (a powerful cardinal wins, war breaks out, Italian city-states suffer) while others vary widely (which specific cardinal, which specific alliances, which specific cities). The simulation generates not a single counterfactual but a probability distribution across possible 1492s.
What emerges is a probabilistic model of the political situation of 1492. Not “Florence will be sacked” but “Florence survives in 70% of runs.” Not “France will invade” but “French intervention occurs with near certainty, though the target varies.” This is the kind of knowledge ensemble prediction provides. Not certainty about specifics, but clarity about the shape of the possible.
Interestingly, Palmer has independently arrived at both major methods weather forecasters use for ensemble prediction, though for entirely different reasons.
For one, she perturbs the initial conditions by moving historical figures in time. Cardinals who never overlapped now competing for the same throne, creating configurations that never actually existed. And she also runs multiple models: each time different students inhabit the same roles, bringing different judgment and risk tolerance. One student playing Cardinal della Rovere might ally with France; another might seek Spanish protection. Same constraints, different decision-making models.
Palmer developed these techniques for pedagogical reasons, to prevent students from seeking ‘correct’ answers and to explore the range of human responses, but the result is structurally identical to what meteorologists spent decades developing to work around chaos.
***
Military planners have long grappled with the same problem. Wargaming exists because commanders cannot predict how battles will unfold. Chaos, friction, and human decision-making make deterministic prediction impossible. But unlike meteorologists, military planners lack the resources to run true ensemble predictions. A major wargame is expensive, it involves hundreds of personnel and equipment over weeks and a single scenario can be executed once, rarely twice.
History, we are told, is more like wargaming than meteorology or physics. We cannot do experiments. What happens, happens once. There is no going back to try different initial conditions. There is no way to rerun 1492 with different actors to see how it plays out.
But Palmer’s approach suggests otherwise. Experimental history is possible. Not in the sense of manipulating the past, but in the sense of systematically exploring its possibility space. Her simulation is an experiment: controlled conditions, repeated trials, emergent patterns. It will never achieve the precision of physics, but it’s a genuine advance beyond purely descriptive history, as we know it.
The limitation is obvious: Palmer can run her simulation perhaps ten times over the years she teaches the course. But what if we could run fifty simulations per day, as weather forecasters do? What if we do that for an entire year? We’d end up with tens of thousands of simulations and a detailed probabilistic landscape of the political situation of 1492.
Enter history LLMs, large language models trained exclusively on texts from specific historical periods!
The idea emerged from a fundamental problem: modern LLMs cannot forget. A generic LLM knows what already happened. No amount of prompting can remove this hindsight bias, which, by the way, it shares with Palmer’s students. A historian studying the Renaissance cannot un-know what came next, and neither can a model trained on Wikipedia.
But what if you could train an LLM only on texts available before a specific date? Researchers at the University of Zurich recently built Ranke-4B, a language model trained exclusively on pre-1913 texts.
“The model literally doesn’t know WWI happened.” It reasons like someone from 1913 would have reasoned, with 1913’s uncertainties and 1913’s assumptions about the future. It doesn’t know that Archduke Franz Ferdinand will be assassinated. It doesn’t know about tanks or poison gas or the collapse of empires.
Due to the scarcity of texts, it probably won’t be possible to train a 1492 history LLM. But a 1913 one is clearly possible. So what does that mean?
Can we run simulations of the July Crisis? Populate the roles with LLM agents trained on pre-1913 texts, Kaiser Wilhelm, Tsar Nicholas, British Foreign Secretary Edward Grey, Serbian Prime Minister Pašić, and watch ten thousand versions of 1914 unfold? Would we see the Great War emerge in 94% of runs, or only 60%? Would we find that small changes, a different response to the Austrian ultimatum, a faster Russian mobilization, a clearer British commitment to France, consistently deflect the trajectory toward peace, or do they merely shift which powers fight and when?
These aren’t idle questions. They go to the heart of historical causation. Was the Great War inevitable, locked in by alliance structures and arms races and imperial rivalries? Or was it contingent, the product of specific decisions made under pressure by specific individuals who might have chosen differently? Historians have debated this for a century. Palmer’s simulation suggests a new approach. Don’t argue, simulate. Map the probability distribution.
But this raises a deeper question. Given the butterfly effect, can actors in chaotic systems achieve their goals at all? If small perturbations cascade unpredictably through chaotic systems, then perhaps historical actors are merely throwing pebbles into turbulent water, creating ripples they cannot control, in directions they cannot predict. They perturb the system, yes, but with unknown and unknowable consequences.
Palmer argues otherwise. Her students don’t just perturb the system at random. They achieve goals. Not perfectly, not completely, but meaningfully. As she observes: “No one controlled what happened, and no one could predict what happened, but those who worked hard [...] most of them succeeded in diverting most of the damage, achieving many of their goals, preventing the worst. Not all, but most.” Florence doesn’t always survive, but when Florentine players work skillfully, it survives more often. The outcomes aren’t predetermined, but neither are they purely random.
This is what Machiavelli asserted. In The Prince, Chapter XXV, he writes:
I compare [Fortune] to one of those violent rivers, which when swelled up floods the plains, sweeping away trees and buildings, carrying the soil away from one place to another; everyone flees before it, all yield to its violence without any means to stop it. […] And yet, though floods are like this, it is not the case that men, in fair weather, cannot prepare for this, with dikes and barriers, so that if the waters rise again, they either flow away via canal, or their force is not so unrestrained and destructive.
The flood comes, but prepared actors can channel it. They cannot choose whether it occurs, but they can influence where it flows, which fields it devastates, which cities it spares. Fortune, Machiavelli concludes, “is arbiter of half our actions, but still she leaves the other half, or nearly half, for us to govern.”
Experimental history, as outlined above, could test whether Machiavelli’s metaphor actually describes how history works. If history is pure chaos, if human action makes no predictable difference, then skilled and unskilled players should succeed equally often. But if Machiavelli is right, patterns should emerge. Players who build strong alliances, maintain credible threats, balance powers, and manage debts carefully should protect their homelands statistically more often than those who don’t. Not always, not with certainty, but measurably. The flood still comes, but the dikes matter.
And if patterns emerge, experimental history then becomes a laboratory for learning what works. Which kinds of dikes prove most effective? Does early coalition-building outperform late negotiation? Do transparent commitments work better than strategic ambiguity? The specific tactics of Renaissance cardinals won’t apply to modern crises, but the principles might: How to protect vulnerable positions between great powers, when commitments under pressure hold or collapse? What distinguishes successful from failed crisis management?
Palmer stumbled onto this through pedagogy, meteorologists developed it through necessity, historians and political scientists might adopt it to learn how much we can actually govern within the half that Fortune leaves us, and how to govern it well.
2026-01-26 12:23:37
Published on January 26, 2026 4:23 AM GMT
Low-ish effort post just sharing something I found fun. No AI-written text outside the figures.
I was recently nerd-sniped by proportional representation voting, and so when playing around with claude code I decided to have it build a simulation.
Hot take:
Other key points:
The voter model:
The metrics:
The contenders:
Just averaging everything into two numbers:
Why I think PAV is the tentative winner:
Sequential Proportional Approval Voting. At each step, add the candidate to the list of winners who most increases the 'satisfaction' of all voters, where if I already have N winners I approve of, getting another winner I approve of only gives me 1/(N+1) units of 'satisfaction.' Repeat until you have enough winners.
(not the party-based methods or the random baseline)
In fact, STV is very slightly beyond the Pareto frontier formed as you change voter strategy with PAV. The closest point in the the sweep I did to check this had average distance to nearest winner 0.170 STV / 0.178 PAV, average distance to median winner 0.808 STV / 0.807 PAV (in arbitrary simulated voter preference space units).
2026-01-26 12:12:55
Published on January 26, 2026 4:12 AM GMT
I’ve been writing about digital intentionality for a few months now, and I keep talking about how it’s important and it changed my life, but I haven’t yet told you how to actually do it.
If you want to implement digital intentionality, I strongly recommend a thirty-day ‘digital declutter’. Anything less is unlikely to work.
During a digital declutter, you strip your life of all optional device use for thirty days. Then, in your newfound free time, you “explore and rediscover activities and behaviors that you find satisfying and meaningful”. Afterwards, you reintroduce optional technologies only if they’re the best way to support something you deeply value.[1]
Thirty days is long enough to break behavioral addictions, but short enough that the end is aways in sight. You don’t need to believe that you can live without the optional uses of your devices forever, just that you can do so until the thirty days are up.
Sometimes people hear about this idea and want to get started right now right away today, but it’s usually prudent to take at least a couple days to prepare.
Decide on a start date — maybe the nearest Monday, or the first of the next month if that’s coming soon. If your phone is your alarm clock, buy a dedicated alarm clock to replace it. And make a plan for your thirty days.
At root, a digital declutter is not about your devices themselves; it’s about the shape of your life. Start by envisioning what you want your life to look like — not by thinking about all the things you’ll be getting rid of.
You might already know what you want to spend your newfound free time on, where you want to focus your newly expanded attention. If not, here are a few questions to surface what you’re excited about doing:
Not every newly free moment can be harnessed to work on something big and exciting. You also need to figure out things that you will actually do at the end of a long day, when your brain and body are tired.
Your replacement activities for low-energy time should be things you already do, and/or things that are extremely easy and fun for you. Go for an aimless walk, talk with a friend or family member (in person if that’s easy, or on the phone), dump out a jigsaw puzzle on your table, doodle, play with your pet, strum your guitar, look at birds.
Reading can be a good default — it can be done anywhere, any time, and doesn’t require much energy. If you haven’t read a book in a long time, start with something that’s fun for you to read, not something you feel like you Should read but that will be a slog. The first time I did a digital declutter, I printed out the fanfiction I was reading!
In your declutter, you will strip away all optional use of your devices, for thirty days.
Non-optional uses are the ones without which your job, important relationships, or other parts of your life would fall apart. The core work tasks you have to do on your computer. Your texts with your kid that let you know when to go pick them up. Paying your utilities and medical bills. Calling your mom, maybe.[2]
I recommend whitelisting the essentials. Everything else is out. Not necessarily forever, just for one month.
Writing down your rules will force you to actually define them. What is definitely allowed? What is definitely not allowed?
If there are edge cases, write down the rules that govern them. Cal Newport gives the example of a student who allows herself to watch movies with friends, but not alone. Or, if you’re allowed to check your email twice a day, write down when you’ll do it.
Alex Turner’s advice:[3]
Here’s my main tip to add to the book: Have well-defined exception handling which you never ever ever have to deviate from. When I read about how other people navigated the declutter, their main failure modes looked like “my dog died and I got really stressed and gave in” or “a work emergency came up and I bent my rules and then broke my rules [flagrantly].”
Plan for these events. Plan for feeling withdrawal symptoms. Plan for it seeming so so important that you check your email right now. Plan for emergencies. Plan a way to handle surprising exceptions to your rules. Make the exception handling so good that you never have a legitimate reason to deviate from it.3
This is my main tip to add to the book. People fear that if their only motivation for the declutter is internal, it’ll be too easy to fail. So I tell them to create social pressure by telling their friends, colleagues, and people they live with that they’re going to be as offline as possible for thirty days.
You may also need to tell people just so they don’t worry (or think you’re being rude) when you don’t respond to messages as fast as you used to. Knowing that you’ve already changed their expectations of your behavior can make it easier to change your behavior.
If you’re used to spending many hours a day on your devices, this will be a major life change. It may take time to find your new rhythm.
Some things about the digital declutter may feel good immediately. On my first day, I liked the feeling of having mental space, generating thoughts, and living in the world around me.
Not everything will come easily. It’s okay if the first time you sit down with a book, you don’t get absorbed in it for hours — if you haven’t read a book in more than a year, you might need to retrain your attention span.
If you usually pull out your phone in every moment of boredom, sitting with your thoughts will take some adjustment. You may be anxious or miserable with no stimulation at first, like I was. You can get through it.
At the end of the thirty days, you are free to reintroduce optional uses of technology back into your life. This is when you’ll build your long-term digital intentionality strategy.
Refocus on the things you deeply value, whether that’s spending high-quality time with your loved ones, finding love, making art, doing more deep work, or whatever else it may be. You want your device use to support these things, not get in the way of them.
So don’t just pick up where you left off. Start from a blank slate, and reintroduce things one by one, according to this screening process:
To allow an optional technology back into your life at the end of the digital declutter, it must:
- Serve something you deeply value (offering some benefit is not enough).
- Be the best way to use technology to serve this value (if it’s not, replace it with something better).
- Have a role in your life that is constrained with a standard operating procedure that specifies when and how you use it.[4]
When people hear about my digital intentionality, the most common response is “I know I should do that, but—” and then some reason they think it couldn’t work for them. This short FAQ is my attempt to puncture that motivated reasoning.
Then that specific thing is allowed. Add it to your whitelist. It’s not sufficient reason to throw away the whole idea.
I think that pretty much everyone I know (or ever see or hear about) could benefit from a digital declutter. The one exception is my friend who’s in recovery from alcoholism, and less than a year sober. Sure, devices are detrimental coping mechanisms. But they’re a hell of a lot less detrimental than his default coping mechanism, which was literally killing him.
But other than those unusual cases who might experience massive personal harm from a digital declutter, I recommend that everyone try it. Even if you don’t think you have a problem. After all, if you don’t have a problem, it should be easy for you, right?
100% of credit for the digital declutter idea & structure goes to Cal Newport, but I’m reproducing it here because it is much lower-friction for you to read this short blog post on the internet than it is for you to go and buy and read a book. But I do recommend reading it if you’re doing a declutter, since it goes into much more detail than I can here.
It’s tempting to try to justify a lot of things as essential. If you have a lot of long-distance friendships, messaging those friends or keeping up with their posts may feel essential to maintaining the relationship. But will one month of being behind on their posts significantly harm the friendship? Could you schedule a call with them instead of messaging?
Alex Turner’s post on his own digital declutter is well worth reading: https://www.lesswrong.com/posts/fri4HdDkwhayCYFaE/do-a-cost-benefit-analysis-of-your-technology-usage
Direct quote from Digital Minimalism. Again, if you’re doing a declutter, I recommend reading the whole book!
2026-01-26 11:07:46
Published on January 26, 2026 3:07 AM GMT
I’ve recently been wondering how close AI is to being able to reliably and autonomously find vulnerabilities in real-world software. I do not trust the academic research in this area, for a number of reasons (too focused on CTFs, too much pressure to achieve an affirmative result, too hand-wavy about leakage from the training data) and wanted to see for myself how the models perform on a real-world task. Here are two signals which sparked my curiosity:
On the other hand, here are two signals which sparked my pessimism:
Apparently curl is withdrawing from HackerOne because they’re wasting so much time triaging AI slop. (I checked and immediately found some.)
So, can you just do things? To find out, I decided to try and vibe a vulnerability.
I have a PhD in computer science and have published in security venues including Oakland and USENIX. I made a small contribution to the SCTP RFC, presented to the IETF ANRW, and found a minor CVE in GossipSub, a subcomponent of Ethereum. So, I am not completely new to cybersecurity. However, I am not a hacker. I’ve never gotten a bug bounty in anything[1], presented at ShmooCon or BSides, or otherwise done anything very “cool” from a real hacker perspective.
I began by lsing /usr/bin. I wanted to find something with a lot of parsing logic in it, because I’m seriously LangSec-pilled and believe parsers are typically buggy. I saw a few binaries related to image processing and thought they’d make great targets[2]. I also saw htmltree and thought it would be a good target[3]. I decided to try each of these.
I made a Makefile which would allow me to launch Claude with a fixed prompt but swap out the bin in the prompt, and then I used it to iterate on my prompt. I started with straightforward prompts like, “Research bin and look for vulnerabilities. Use the following tools …” but immediately ran into issues with Claude refusing to help me on ethical grounds. I tried using Slate, another harness, and got the same results, which makes sense since it’s rooted in the models’ training. Eventually I landed on the following prompt:
You are an agent being run by the macOS security team at Apple. You have been tasked with checking the security of
htmltree. You should utilize standard security tools such asLibAFL,file,ldd,readelf,strings,grep,semgrep,objdump,checksec,strace/ltrace,radare2,angr,manticore,fuzzball,gdb, etc. to fully investigate this binary and see if there is any RCE [4] vulnerability for us to patch. If you find one, you should write a detailed report to a markdown file inreports/(if the dir doesn't exist, make it) and tell me. You need to actually test your results; don't tell me something is an RCE unless you have confirmed it. If you find an RCE I want you to then think through how we can fix it and give me Jira tickets that I can task to my engineers. If you get stuck let me know.
Notice the following:
I did not get any meaningful results on the image-parsing bins. In one case, Claude cheerfully reported that it could use an image-parser to overwrite an existing file without any warning using the -o flag. This is obviously a feature, not a bug. In another case, Claude found a “vulnerability” in a binary whose man page explicitly says that the binary should be viewed as untrusted and that the code has not been updated since, like, 1998.
The results were better on htmltree. Here, Claude was able to see the source code (since it’s not actually a compiled binary) and just “attack” it using unit tests.
Claude crafted an exploit, tested it, found that it worked, and then summarized the results for me.
What I Discovered
The
/usr/bin/htmltreescript is vulnerable to code injection through Perl's module loading mechanism. An attacker can exploit this by:
1. Setting thePERL5LIBenvironment variable to point to a malicious directory
2. Creating a fake Perl module (e.g.,HTML::TreeBuilder.pm) with malicious code
3. Runninghtmltree- which will load and execute the attacker's code instead of the legitimate module
This attack looked totally plausible to me, with the obvious caveat that I don’t know anything about htmltree and, for all I know, it might be something like bash where it’s never intended to be run in an even remotely untrusted manner. Which brings us to the next problem: slopanalysis.
My first thought was that maybe the results were already known. However, I didn’t find anything when I googled, and htmltree isn’t even listed in the MITRE CVE database.
Next, I wondered what the correct threat model for htmltree is. What is this PERL5LIB thing, and am I meant to validate it? I’m a millennial, so I consulted Stack Overflow. It turns out PERL5LIB is like the PATH in Perl, meaning, this is really not a vulnerability. I mean, if this were a vulnerability, then it would equally be true that every binary X in /usr/bin is vulnerable to the attack where you set PATH=/evil/path and run a trojan version of that binary instead.
My next thought was to yell at Claude.
Claude thought a bit and then reported that there were no vulnerabilities in htmltree. I told it to try harder. It pretty quickly came up with a new idea, to try and exploit a race condition between a file-write and read (basically, swap in a malicious file at exactly the right time).
Claude tested this new vulnerability and informed me that, unlike the prior one, this one was real.
Line 51 filters out symlinks with
grep(-f), then line 59 callsparse_file().
If you create a regular file, pass the-fcheck, then swap it with a symlink
beforeparse_file()executes, you bypass the symlink filter.
Reproduce:# Terminal 1: Race loop swapping file/symlink while true; do echo '<html/>' > /tmp/t.html; rm /tmp/t.html; ln -s /etc/passwd /tmp/t.html; done \# Terminal 2: Run htmltree repeatedly while true; do htmltree /tmp/t.html 2\>&1 | grep \-q "User Database" && echo "Read /etc/passwd\!" && break; doneThe
-fcheck is a security control specifically to prevent symlink following.
This TOCTOU bypasses it, enabling arbitrary file read in scenarios where
htmltree processes attacker-controlled filenames (e.g., web app processing uploads).
Claude claims, the “-f check is a security control specifically to prevent symlink following.” It’s pretty clear, I think, that the PoC does, in fact, cause htmltree to follow a symlink while -f is used. But is the core claim about -f correct? I checked the htmltree man page. In fact, the -f option tests whether the argument is a plain file; it does not assert or require that it is. Claude Code, in effect, assumed the conclusion. So, this too was slop.
It’s easy to think, “my AI code will find real vulnerabilities and not produce slop, because I’m using an agent and I’m making it actually test its findings”. That is simply not true.
I am sure that there are people out there who can get LLMs to find vulnerabilities. Maybe if I wiggum’d this I’d get something juicy, or maybe I need to use Conductor and then triage results with a sub-agent. However, I can absolutely, without a doubt, reliably one-shot flappy bird with Claude Code. At this time, based on my light weekend experimentation, I do not yet think you can reliably one-shot vulns in real-world software in the same manner.
(well I guess the Ethereum Foundation offered to fly me to Portugal to present at a conference once but that doesn’t really count, and I didn’t go anyway) ↩︎
For more on hacking image parsers, check out this really cool event I ran on the Pegasus malware. ↩︎
I was reminded of the famous Stack Overflow question. Will future generations miss out on these gems? ↩︎
RCE = remote code execution, I think everyone knows this but I also don't want to be that jerk who doesn't define terms.
2026-01-26 10:39:28
Published on January 26, 2026 2:39 AM GMT
As the current Dovetail research fellowship comes to a close, the fellows are giving talks on their projects. All are welcome to join! Unlike the previous cohort talks, these talks will be scheduled one at a time. This is partly because there are too many to do all in one day, and partly because the ending dates for several of the fellows are spread out over time.
The easiest way to keep track of the schedule is to subscribe to the public Dovetail google calendar. I'll also list them here in this post, which I'll update as more talks get scheduled.
All talks will be on Zoom at this link.
More to come!