RSS preview of Strange Loop Canon

Notes on Mexico

2026-02-01 22:00:16

A series of observations about Mexico from my travel over the holidays, now that I’ve had time to digest. I went to Mexico City, to touch the Aztec, Zapotec and Mayan civilisations, at least cursorily, which made me inordinately happy. It’s the first time I’ve gone, but I got a few days in each place to actually just be, which is the only way to travel in my opinion. I’d read a bunch of books before and during my trip, but what I came away with most strongly was the impression of a country that’s psychically much larger than it is physically, with the weight of a few layers of history, and with a peculiar mix of life.

Mexico is like if India was richer, things were cleaner, while being much (much!) more unsafe. This showed up for me almost everywhere I went, often in the background, often not. For instance, this means that while in India you will see a lot more spaces for the rich or large luxury malls, in Mexico it feels like those are hidden away inside secure compounds. In fact the only place I saw this easily accessible and displayed was in Cancun, which is as if the Mexicans built a tourist place just for the Americans and made it look like Dubai.
I was shocked that Mexico City still has a murder rate 1/3rd of NYC in the 1990s. Turns out this ignoble list is also dominated by Mexico.
I continue to be just constantly amazed at how safe India is. It has no right to be so, it’s poor, ill organised and the justice system moves like molasses. I first had this thought in Nigeria, and have repeated this observation in too many countries to name. Central and South America look likely to only exacerbate this question.
This is particularly germane in Mexico because Mexico City reminds me a lot of Delhi, albeit with somewhat worse roads, fewer people, and far cleaner sidewalks. And entire squadrons of police cars with visible guns every block or two in all the tourist friendly areas.
An interesting aspect that I had never considered is that Mexico used to be bigger than the US when it owned most of the US’ current southwest. The country still seems to remember this in its bones. It has 130 million people but feels much larger. The weight of most of Mesoamerican history centers it. They have a Place in History, writ in capital letters in the national psyche.
The level, variety, and affordability of street food remains one of Mexico’s major success stories. Plentiful, tasty and cheap. I largely prefer it to restaurant food. Tlayudas ftw.
Going through the Zocalo in Mexico City is a full body immersive experience, and not one I care to repeat. On the other hand it is massive, disorganised in the best way, and sells anything and everything you can imagine. We got lost inside it and had to trek a dozen blocks in a randomly chosen direction to get out. We realised this after calling an Uber and waiting 20 mins before accepting it was never going to make it.
This is also a plus, because just like the lack-of-zoning-success-stories of almost every country except the US, it makes Mexico City undeniably attractive to every American, who of course love mixed-use easily walkable cities as long as they don’t have to live in them.
This exact reason also makes Cancun the worst place in Mexico I visited, because it’s built for tourism, has a hotel zone, and fails my “Civilisation Test” which is the number of cafes in walking distance. In case you were curious, the winner was Oaxaca. Excellent coffee, and even better hot chocolate.
Mexico City truly is a cultural capital. Incredible museums, great art, great food. The Museum of Anthropology in CDMX is the best museum I’ve seen (‘n’ is very high here).
The main Cathedral is absolutely gorgeous. And being built on the remnants of the lake you can see the effects of the soil moving about as the cathedral is a bit slanted. The styles are more eclectic than you’d find in a European city, and more ornate than I personally like, but worth seeing.
Walking among the Aztec ruins next to the Cathedral is a quasi religious experience because they’re so well preserved. The feathered serpent, Quetzalcoatl, is everywhere, encircling the plazas, out of the walls, surrounded in parts with forms of corn and shells.
As usual, I found it interesting that until recently, tearing down an ancient monument and building another gorgeous monument was considered normal and not at all noteworthy. Something we can learn from.
The Aztecs took their iconography and religion seemingly from Teotihuacan, which is an hour away. It’s an older civilisation, 600 years before the Aztecs, whose traces they clearly discovered and were influenced by but knew little about. They didn’t know who they were, what their society was like, what they called themselves, nothing. So they, rather whimsically, named it Teotihuacan, the place where gods came from, adopted many of their gods (or so it seemed to me), for instance named the feathered serpent Quetzalcoatl, and generally lived a grand life of military conquest for a couple centuries until Cortez arrived.
I can understand why. Teotihuacan is extraordinary, and the Pyramid of Quetzalcoatl in particular is magnificent. Considering they didn’t have metal or pack animals this is all the more impressive. The ability of humans to accomplish incredible things at scale never stops amazing me.
I have not been able to make up my mind about the import of human sacrifice and how much it’s true/ false/ exaggerated compared to other historic cultures.
Driving in Mexico City is very hard. Half the roads are tiny and don’t even look like roads. The green signs that show the roads and destinations often had three names none of which matched what Google maps said, so it was entirely visual navigation. I am now ready to drive in India.
Mexico City also has cable cars as a core mode of public transport, which I hadn’t seen before, and they look wonderful, especially when you’re stuck in a traffic jam. I wish the US had these, or indeed any public transport. I tried to take one but it was night and GPT advised that the number of changes I’d need to make to take a ride was not safe and I shouldn’t do it. So I had churros and cafe de olla instead.
As my 8yo observed, the infrastructure got better as we went from Mexico City to Oaxaca then to Cancun. Curious.
Oaxaca is a jewel of a place. Fits in your palm, highly walkable. High civilisation score. Great food. Great cathedral, though the churrigueresco was not the best of its type, didn’t come together cohesively.
The street food is plentiful and good. The speciality is mole, a particular type of sauce with mixed spices, and chapulines, fried grasshoppers. Apparently delicious when mixed into butter and eaten with bread.
Oaxaca also had the highest density, originality and quality of art I’ve seen in a city since
There’s plenty of prehispanic food and drink about. Tejate was meh to me, though a latte tejate I had at a market was extraordinary. Generally I remain a fan of modernity, we’ve perfected much of what history revered (and made them better).
Monte Alban, an hour from Oaxaca, is worth visiting. Zapotec built, on top of a hill. Gorgeous views all around. The guide told us that when it was built, and during its heyday, it used to have 9 months of rain, so the water would flow down the sides of the hill through channels that were cut, and this would supply water from the priests to the commoners. But the water dried up during a long drought lasting a couple decades, people lost faith in the priests to bring rain by praying to Tlaloc, and folks left. So it goes.
The burial rituals were fascinating, they would put the body in a small enclosed space for 4 years, shut tight so no smells would escape, and then would remove the bones and put them in an urn. If more people died they had different spaces like this outside the house.
The various pedestals and spaces had holes below for priests to show “magic”, disappearing and reappearing, as the guide told us. I am personally suspicious of the “people in the olden days were easily fooled” argument, but am in favour of the “everyone likes and believes in rituals” argument.
The idea of worship starting with some seed of truth and then becoming a self fulfilling prophecy as those responsible for the worship taking matters into their own hands will never stop being funny.
Cancun was the least interesting part of the visit. It is also, at least the hotel area, not at all pedestrian friendly. It’s big tourist resorts or nothing.
Chichen Itza, a couple hours from Cancun, was remarkable. Its architecture shows influence from Teotihuacan and from the Toltecs, and there clearly seemed to be trade and information routes between the lands. The Mayan civilisation, at least per my reading, stood for 3600 years, which is an absurdly long length of time.
The cenotes are magnificent. Cenote Xkeken was a particular favourite, it’s mostly underground with only a shaft of light coming down.
The fact that Mayans ruled for so long in such a dry place with the main water source underground feels quite bizarre. Though once you rationalise by the number of inhabitants maybe it’s fine. Chichen Itza had around 40k, 5x less than Teotihuacan, itself less than Tenochtitlan, and none of them had decent water supply. I do not understand living life in hard mode for that long.
One reason though for the longevity of these civilisations might be survival bias, because by the time a lot of monuments got built without mechanised power it’s already a couple centuries. There’s a funny comparison to be made with California HSR here, where we’ve horseshoe theoried our way to construction, but I leave that to someone else.
The beaches near Cancun are very good, especially Cozumel, the island where Hernan Cortez first landed. Sting rays and nurse sharks played in the shallows next to our feet at El Cielo. But I’ll be honest I still prefer the beaches of Southeast Asia. Thailand cannot be beaten.
The number of civilisations that roughly lived side by side at different points in Central America is really impressive. I got GPT to make me multiple maps and websites to help understand this better.
This trip without LLMs would’ve been about 30% as good. Everything from planning to asking about cafes and restaurants to dealing with zocalos to hotels and snacks and history and geography and pretty much anything we wanted to know or learn was made better by GPT, and sometimes Gemini.
Again the sheer number of extremely heavily armed police present in nearly all parts, including highways, was quite striking. They stopped cars at night, frisked folks, and generally were a loud and constant presence. Is this signaling or actual deterrent? Unclear, but everyone states the importance of being sensible and safe.
A substantial proportion of tourists to Mexico City and Oaxaca were Mexican, I think. As a consequence it’s not English language friendly, though again with Google translate and ChatGPT it’s not hard to travel.
I was told by the tourist guides multiple times to not call it Gulf Of America as a form of protest. Everything is politics.
Overall I really liked it, though I understand better why people who don’t have easy access to Asia, like Americans, like it so much more than I did. When it comes to food and markets and the general feeling you’re in a “free” city with limited top down strictures on life, this is the only real choice from North America without braving a really long flight. But I know, or rather I feel, for those you simply cannot beat India or Japan, which are also significantly safer, and have great food and history. Similarly for beaches I’m still a fan of Thailand but by a thin margin. That seems to be the primary motivation for most Americans I know who have gone to Mexico, which seems quite shortsighted to me. Because when you combine all that with its long history and culture, Mexico is pretty great.

The Tragedy of the Agentic Commons

2026-01-21 00:16:09

Written with Alex, who writes here, and you should read him! The repo here.

This has become part of a series of essays, evaluating the new “homo agenticus sapiens” that is AI Agents. Part I was seeing like an agent. Part II is why the agentic economy needs money. And this is Part III.

Whitney Wolfe Herd, Bumble’s founder, recently described a future where your AI chats with potential matches’ AIs to find compatibility. Say what you will about AI being involved in your love life, but this is one domain where AI agents can potentially have large returns: the dating/marriage “market” is the epitome of the type of high-dimensional matching problem that Herbert Simon identified as impossible for people to optimize. Rather than optimising, Simon argued people engage in “satisficing”, i.e., settling for good enough.

Why would AI agents be useful here? Let’s start with how most markets work. Hayek’s big insight–outlined in what he called the economic problem of society–was that prices do an incredible amount of work. They compress a ton of information such as preferences, costs, scarcity, expectations into a single number that acts as a sufficient statistic for value. When you’re buying oranges, the seller doesn’t care what you’ll do with them. The price coordinates the transaction and that’s enough.

But prices work best when the transactions involve commodities. When you’re buying some oranges, the seller doesn’t particularly care what you’re going to do with them; you don’t need to convince him that you’ll take care of the fruit. The price does all the work in coordinating that transaction. Matching markets are conceptually different. You can’t just choose your spouse, your employer, or your college: you also have to be chosen by them. This is the domain that Al Roth, the 2012 Nobel winner for “the theory of stable allocations and the practice of market design,” spent most of his career studying. Roth showed that matching markets require careful institutional design; this design includes algorithms, timing, and the right rules to get the market to “clear.” Market designs he worked on, including deferred-acceptance mechanisms, now allocate medical residents to hospitals, students to schools, and kidneys to patients.
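To make "deferred acceptance" concrete, here is a minimal sketch of the textbook one-to-one version (Gale-Shapley). The resident and hospital names and their rankings are made up for illustration, and real clearinghouses use more elaborate variants than this.

```python
def deferred_acceptance(proposer_prefs, receiver_prefs):
    """Textbook one-to-one deferred acceptance; proposers move down their lists."""
    # rank[r][p] = position of proposer p on receiver r's list (lower is better)
    rank = {r: {p: i for i, p in enumerate(prefs)}
            for r, prefs in receiver_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}  # next index each proposer will try
    held = {}                                     # receiver -> proposer tentatively held
    free = list(proposer_prefs)

    while free:
        p = free.pop()
        if next_choice[p] >= len(proposer_prefs[p]):
            continue  # p has exhausted their list and stays unmatched
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        if p not in rank[r]:
            free.append(p)            # r finds p unacceptable
        elif r not in held:
            held[r] = p               # r tentatively accepts
        elif rank[r][p] < rank[r][held[r]]:
            free.append(held[r])      # r trades up; the old proposer is free again
            held[r] = p
        else:
            free.append(p)            # r rejects p; p will try the next option
    return {p: r for r, p in held.items()}


# Hypothetical residents (a, b, c) and hospitals (X, Y, Z), one slot each.
residents = {"a": ["X", "Y", "Z"], "b": ["X", "Z", "Y"], "c": ["Y", "X", "Z"]}
hospitals = {"X": ["b", "a", "c"], "Y": ["a", "c", "b"], "Z": ["a", "b", "c"]}

matching = deferred_acceptance(residents, hospitals)
print(matching)
```

Acceptances stay tentative until nobody wants to propose again, which is why the result is stable: no resident and hospital would both rather be paired with each other than with their assigned match.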

But the efficiency of matching markets hangs on the ability to elicit a person’s preferences, i.e., that people can express their rank orderings over potential options. But what if people’s preferences don’t fit in dropdown menus or are difficult to articulate on a standardised questionnaire? Peng Shi studied this in his excellent paper “Optimal Matchmaking Strategy in Two-Sided Markets.” He looks at online platforms that match customers to providers using a variety of matchmaking strategies, from searching one side of the market to centralised matching that allows for back-and-forth communication.

Shi found that centralised matching works beautifully when preferences are “easy to describe,” i.e., straightforward to elicit using standard questionnaires, but breaks down when they’re contextual, idiosyncratic, or otherwise difficult to express through standard techniques. This is why many platforms still make you search. You want a contractor who shows up on time and knows your budget–this is easy–but you also want someone who understands your tastes in postmodern living room design. Good luck expressing that on a dropdown web form.

Here is where Large Language Models come in. They are fantastic at turning any unstructured piece of information into better structured matching. They’re also eminently scalable, enabling Coasean bargaining. But scaling things brings with it more coordination problems too: too many agents negotiating with too many other agents is noisy. So what type of institutional setup would make most sense to install here, to make this work well?

That’s what we sought to test with our experiments. The question being, could we figure out how and whether LLMs can help in matching markets where preferences are “hard to describe”? Can LLMs actually elicit the dispersed, hard-to-articulate preferences better than standardised methods? And if they can, what happens when LLM-based agents are available to everyone in the market?

Now, there’s some recent work on the topic that suggests guarded optimism that this is possible. Very new work by Ben Manning, Gili Rusak, and John Horton shows that, when parsed through LLMs, short natural-language “taste descriptions” can be superior to standard questionnaires for eliciting preferences when the option set is large. They run an experiment where people write a few paragraphs about what they want in a job and then rank between 10 and 100 options (depending on the condition). Consistent with Simon’s conjecture, people’s ranking effort plateaus as the option size grows large; choice quality grows unstable as the consideration set increases. People get tired of ranking a ton of options and just start guessing. AI-parsed “taste descriptions” scale much better: once tastes are written down, the marginal cost of evaluating one more option is negligible for an AI agent. The advantages of AI-parsed matches are even higher in congested markets where people are more likely to be pushed.

But a theoretical paper by Annie Liang offers an important counterpoint in the case of a potentially complex two-sided matching market. She shows that when personality is sufficiently high-dimensional, meeting just two people in person beats searching over infinitely many AI representations. The noise in AI approximations compounds faster than the benefits of scale. This is a very cool result, and you should all read the paper in full–it’s that perfect type of economic theory that’s both conceptually rich and practically useful.

Ok, with that preamble…

Let’s run an experiment

We set up a simulated Hayekian marketplace with a whole bunch of digital shoppers and providers as AI agents.

Preference elicitation: Knowledge is dispersed in each digital shopper’s “head”: customers know what they want and providers know what they can offer. We want to know how eliciting the preference–either through the standard intake questionnaire or high-dimensional text parsed by an AI agent–can change the market structure for optimal results.
Mechanism interaction: When elicitation improves, can centralised matching beat search, and what are the conditions under which this happens?
Scale: We then check what happens when everyone uses AI agents
Institutional design: Finally, we figure out the right institutional mechanism to solve the resulting problems, and to maximise welfare

Preferences here are latent vectors in each agent’s head. Both the customer and provider agents have a true weight vector over some set of attributes (6 dimensions in this case). So elicitation changes the platform’s inferred w, not the true w. A standard intake is a structured form, and only exposes a few coarse priorities. The AI intake is free text plus back-and-forth chat, and can be parsed into the platform’s inferred weights by a couple of mechanisms: either a rule-based algorithm or an AI agent (GPT parsing).
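A rough sketch of the "true w versus inferred w" distinction, under assumptions of our own: the weights are invented, and the standard intake is modelled as a form that only lets the customer tick their top two priorities. This is not the repo's code, just an illustration of how a coarse form loses information while the true vector stays untouched.

```python
import math

# Hypothetical true weights over 6 attributes (they sum to 1).
TRUE_W = [0.35, 0.25, 0.15, 0.10, 0.10, 0.05]


def standard_intake(true_w, k=2):
    """Assumed structured form: the customer ticks only their top-k priorities."""
    top = sorted(range(len(true_w)), key=lambda i: true_w[i], reverse=True)[:k]
    return [1.0 / k if i in top else 0.0 for i in range(len(true_w))]


def cosine(a, b):
    """Cosine similarity between two weight vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


inferred_w = standard_intake(TRUE_W)
print("true w:    ", TRUE_W)
print("inferred w:", inferred_w)
print("similarity:", round(cosine(TRUE_W, inferred_w), 3))
```

The platform only ever sees `inferred_w`. A better intake would move that vector closer to `TRUE_W`; it would not change what the customer actually wants.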

Figure 1 has an abridged illustration of the design and some results. There’s an appendix at the end of the essay in case you want to check out the details of the experimental design. But without further ado, here are some…

Results

First, AI-assisted preference elicitation improves matches across the board.

Figure 1: Experimental design

Figure 2

Second, as shown in Figure 2, AI-based elicitation changes what type of market design works best, and the conditions under which centralised matching can beat search.

Specifically, “Search” and “centralised” are the two different matching protocols we tested. Search means customers iteratively message providers in some ranked order until the matches ‘stick’. Think about how you would find a plumber–message folks, talk to them, iteratively until one ‘fits’.

Centralised is where the platform computes the shortlist for you, and clears a match based on mutually acceptable terms.

Once dispersed knowledge can be elicited and compressed into usable signals, the platform can centrally clear the market rather than forcing users to search. When knowledge can’t be compressed, search dominates because it lets users do iterative, contextual refinement in the loop.

The core object is the ‘ROI boundary’. If the per action attention cost is high enough, centralisation dominates–it just requires fewer actions. If the cost is low, search can dominate because it can “handle” more actions. This is the very idea of Coasean bargaining helping remove the boundaries of firms.
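As a back-of-the-envelope sketch of the ROI boundary, assume net welfare is match value minus attention cost times the number of actions. The match values and action counts below are invented for illustration and are not the experiment's numbers.

```python
def net_welfare(match_value, actions, cost_per_action):
    """Match value net of the attention spent getting there."""
    return match_value - cost_per_action * actions


def break_even_cost(v_search, a_search, v_central, a_central):
    """Attention cost per action at which search and centralisation tie."""
    if a_search == a_central:
        raise ValueError("protocols use the same number of actions; no boundary")
    return (v_search - v_central) / (a_search - a_central)


# Hypothetical: search finds a slightly better match but takes 8 actions;
# centralised clearing takes 2 actions.
V_SEARCH, A_SEARCH = 0.80, 8
V_CENTRAL, A_CENTRAL = 0.70, 2

boundary = break_even_cost(V_SEARCH, A_SEARCH, V_CENTRAL, A_CENTRAL)
print("ROI boundary (cost per action):", round(boundary, 4))

for cost in (0.005, 0.05):
    s = net_welfare(V_SEARCH, A_SEARCH, cost)
    c = net_welfare(V_CENTRAL, A_CENTRAL, cost)
    print(f"cost={cost}: search={s:.3f}, centralised={c:.3f}")
```

Below the boundary the extra messages are cheap enough that search wins; above it, the protocol with fewer actions wins.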

So where does the value of LLM-based elicitation actually come from? Is it from the back and forth conversation, or the ability to parse large text? As described above, we prompted all of the customers to write some free text about things they like and whatnot, and then used some rules-based parsers and some LLM-based parsers. There’s also the option for conversational elicitation via chat.

We thought the AI agents’ ability to ask follow-up questions would be the game-changer. Turns out though (see Figure 3), most of the value comes from the AI agent simply inferring more signal from messy text compared to the signal in a rule-based parser. This is consistent with the work of Manning et al. that we discussed above. This may of course be something specific to our prompts—perhaps one could obtain further gains by explicitly instructing the AI agents to engage in structured back-and-forth with the customers, and to do so in contexts where this would be helpful, but this was not the case here.

This highlights the utility of LLMs for extracting (potentially high dimensional) signals from unstructured data. Back in the day OKCupid used to make people fill out 90-100 questions to help match them with their potential partners. With LLMs, they might’ve been able to get away with writing a short essay and getting their Agentic Cupid to pull out the relevant information. Whitney is certainly on to something.

Figure 3

But what if people don’t really know what they want, does preference uncertainty matter? Whenever Rohit shops, he’s not sure of what he wants before he goes in the store. There’s a lot of noise in the process. Alex is a pure satisficer: the first item that meets a (very low) threshold gets put in the cart (usually virtual), and off to check out he goes.

We can test for that pretty easily here by introducing a bit of randomness into our shoppers’ heads. At least in our setting, injecting noise into preferences doesn’t matter for the AI’s ROI all that much. We can still do centralised matching and extract a lot of value from that mechanism, as long as the preference noise isn’t too cacophonous.

What if everyone uses an LLM agent?

We had originally set up a pretty small marketplace. The centralized mechanism at this scale can be computed and cleared so we can run the experiment. But what happens when the scale explodes, both in the number of options and the number of customers potentially using AI agents? This is the problem matching platforms like Upwork are trying to solve: the option set is absolutely huge, but so is the potential customer base.

Every time a customer opens up a marketplace like Upwork, the number of choices just on the front page makes it hard to remember what they came for. Ideally AI-delegated agents can solve this problem: the user speaks or writes down what they want to do, the AI agent pings the platform, and the user is presented with the match. But what if every potential shopper had their own AI agent who wanted to message the providers on the platform? That’s a lot of agents doing individualised message sending to the provider inboxes!

So as you increase the number of customers with AI agents, the level of congestion rises significantly. Each customer agent sends a query to a provider agent’s inbox and it has to respond. Responding to all those agents takes a lot of compute. Here is what happens in our simulation (Figure 4): At full adoption, the providers’ inboxes flood with 5x the amount of requests, response rates collapse from 48% to 2%, and net welfare drops 88%.
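The congestion mechanism can be sketched with a toy model. Everything here is an assumption for illustration: messages are spread evenly across providers, each provider can answer a fixed number of messages, and the parameter values are made up. It does not reproduce the simulation's figures.

```python
def inbox_load(n_customers, adoption, msgs_per_agent, msgs_per_human, n_providers):
    """Average messages landing in each provider inbox."""
    adopters = adoption * n_customers
    total = adopters * msgs_per_agent + (n_customers - adopters) * msgs_per_human
    return total / n_providers


def response_rate(load, capacity):
    """Share of messages a provider can answer given a fixed response capacity."""
    if load <= 0:
        return 1.0
    return min(1.0, capacity / load)


# Hypothetical market: agents message far more providers than humans do.
N_CUSTOMERS, N_PROVIDERS, CAPACITY = 1000, 100, 20
MSGS_PER_AGENT, MSGS_PER_HUMAN = 20, 2

for adoption in (0.0, 0.25, 0.5, 0.75, 1.0):
    load = inbox_load(N_CUSTOMERS, adoption, MSGS_PER_AGENT, MSGS_PER_HUMAN, N_PROVIDERS)
    print(f"adoption={adoption:.2f}  load={load:.0f}  response_rate={response_rate(load, CAPACITY):.2f}")
```

Each individual agent is better off sending more messages, but the shared inbox capacity is fixed, so the response rate everyone faces falls as adoption rises.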

Figure 4

Without institutions in place to scaffold the marketplace, a tragedy of the commons emerges: If everyone has an AI agent, it’s almost like nobody does. The paradox of plenty is real, and AI agents create their own version of Jevons paradox.

The need for institutions and scaffolding

What can fix this type of congestion? Prices!

As in a previous post–where we showed the importance of money in coordinating trade amongst AI agents–introducing a price mechanism recovers most of the lost welfare in matching. A vindication of Hayek’s deeper insight.

Specifically, we can introduce an exchange and money, such that the agents now have a pricing mechanism to signal their “strength of preference”. The idea is that the complexity falls because now not every provider and customer needs to message each other. Prices capture a lot of high dimensional information in a single statistic and streamline it. As we saw in the simulation in barter_to_money, complexity falls from O(n²) to O(n).
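The O(n²) to O(n) claim is easy to check by counting. A minimal sketch, assuming the bilateral case needs one channel per pair of agents and the exchange case needs one channel per agent to a single hub:

```python
def bilateral_channels(n):
    """Every agent may need to talk to every other agent: n choose 2 pairs."""
    return n * (n - 1) // 2


def hub_channels(n):
    """Every agent only posts to, and reads prices from, one exchange."""
    return n


for n in (10, 100, 1000):
    print(f"n={n:>5}  bilateral={bilateral_channels(n):>7}  via exchange={hub_channels(n):>5}")
```

At 1,000 agents that is 499,500 possible bilateral conversations against 1,000 connections to the exchange.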

Figure 5

Pricing works! As shown in Figure 5, most of the welfare gains are recovered and the congestion issues are resolved. LLMs may lower the cost of expressing dispersed knowledge, but they don’t remove the need for institutional design to manage externalities. At least in our experimental simulation, the price system remains essential to solve the issue of complexity and congestion.

What did we learn?

If we think about an AI agent economy, we would want to know more about the mechanism that facilitates coordination. First, we have to ask, “If agents lower transaction costs, do markets just happen?”

In a previous post we looked at what would happen if there were a bunch of agents who had to interact with each other to trade, and it turned out that they don’t form markets spontaneously. In fact you have to do a fair amount of work before the agents are ready to interact.

Ok, if markets need scaffolding, what’s the minimal substrate that makes coordination scale? i.e., how will the agents coordinate amongst themselves? Will they be able to develop methods to do so themselves, e.g., through bilateral and multilateral negotiations, or will they need further help? It turns out that no matter how much you want to set things up just so, the agents will still need money and prices to trade efficiently. Even with the lower transaction costs and larger levels of compute, the coincidence-of-wants problem still doesn’t disappear - Hayek remains vindicated.

In this current essay we explore whether LLM agents can make centralised matching more efficient–we should expect marketplace consolidation in categories that were previously too heterogeneous for algorithmic matching, e.g., wedding vendors, specialised consulting, creative services. We showed that in “thin” markets AI agents help facilitate better match quality through centralised mechanisms.

However, if everyone has an AI agent, we still need a pricing mechanism to solve the resulting congestion and complexity problems. Congestion is a serious threat at scale!

So what is the broader take away from this essay, from the whole series of essays? For us it’s that AI agents work remarkably well when institutional design facilitates the interactions and transactions. Since direct instruction for every eventuality is impossible, the only way to make the AI agents behave at scale is to design the right scaffolding to facilitate coordination and exchange. This involves the creation of markets, and yes, money! If we can learn to design the “institutions” within which the agents operate, then we can have them do far more complex tasks that we want. Autonomy, that’s the true prize!

Appendix: More about the design

Warning: wonky.

We constructed a simulated marketplace where customers seek service providers (contractors) across task categories that vary in how difficult preferences are to articulate. Each customer is seeded with true preferences represented as a 6-dimensional vector of weights (summing to 1) over provider attributes. A match is formed when both sides’ true values clear a threshold.

“Easy” categories include things like TV mounting or furniture assembly; preferences in these categories can be mapped cleanly onto standard form fields such as price, availability, and distance. “Hard” categories, such as ability to repair a historic staircase or a complicated asbestos remediation with specific guidelines, involve preferences that are more difficult to elicit using standardised questionnaires. We then see whether the ROI threshold changes based on how well the models can “elicit the true preferences” of the underlying actors.

The experimental intervention targets the preference-inference pipeline: how customer preferences get translated into data the platform can act on. The experiment varies the intake method (standard structured forms versus free-text descriptions parsed by an LLM) crossed with the matching mechanism (decentralised search where customers browse and choose, versus centralised assignment where the platform matches algorithmically). Match quality is computed as the dot product of the customer’s true preference weights and the matched provider’s attributes, minus any search costs incurred. All of this is summarised in Figure A1 below.

Figure A1: Experimental Design
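For the wonks, the match rule and the quality measure described in this appendix can be sketched in a few lines. The vectors, the threshold, and the idea that the provider scores the job's attributes with its own weights are assumptions for illustration, not values or code from the repo.

```python
def dot(weights, attributes):
    """Weighted value one side places on the other side's attributes."""
    return sum(w * a for w, a in zip(weights, attributes))


def forms_match(customer_w, provider_attrs, provider_w, job_attrs, threshold):
    """A match forms only if both sides' true values clear the threshold."""
    return (dot(customer_w, provider_attrs) >= threshold
            and dot(provider_w, job_attrs) >= threshold)


def match_quality(customer_w, provider_attrs, search_cost=0.0):
    """Customer's true value of the matched provider, net of search costs."""
    return dot(customer_w, provider_attrs) - search_cost


# Hypothetical 6-dimensional vectors; each weight vector sums to 1.
customer_w = [0.30, 0.25, 0.20, 0.10, 0.10, 0.05]
provider_attrs = [0.9, 0.8, 0.4, 0.6, 0.5, 0.7]
provider_w = [0.40, 0.20, 0.10, 0.10, 0.10, 0.10]
job_attrs = [0.7, 0.6, 0.8, 0.5, 0.9, 0.3]

print(forms_match(customer_w, provider_attrs, provider_w, job_attrs, threshold=0.6))
print(round(match_quality(customer_w, provider_attrs, search_cost=0.05), 3))
```

Note that quality is always scored against the true weights. The inferred weights only affect which provider the platform or the customer ends up picking.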

Will money still exist in the agentic economy?

2025-12-19 22:03:27

Written with Alex Imas, subscribe to his blog here!

This has become part of a series of essays, evaluating the new “homo agenticus sapiens” that is AI Agents. There was Part I, seeing like an agent. This is Part II. And Part III is on what happens when we all have AI agents.

Sometimes I forget but we live in a future transformed by information technology pretty much across every aspect. But one thing has remained largely the same: we still live in a world where the vast majority of economic transactions are done by people. If you want to buy a car, the process is largely the same as it was 50 years ago. You go down to the dealership and negotiate the best price that you can. Sure, you may have some extra information from doing research on the web beforehand - it’s certainly much easier to do comparison shopping with a supercomputer in your pocket - but the basic process of transacting with another human being has largely stayed the same.

One change that’s likely to come though is that there will soon be 10x, 100x, maybe more, as many AI agents working in the world as there are people. And as we have lots of AI agents working on our behalf, doing all forms of work, then there is a thesis that many of the frictions and information asymmetries that people face in markets may disappear if economic transactions are delegated to aligned agents, leading to a so-called Coasean singularity.

We’re not there yet though. Today’s agents are simply not good enough yet to act sensibly or without strict instructions. Many of the features of human-mediated markets still seem to be reproduced in AI agentic interactions. But as online spaces adapt to the promise of AI technology, it seems natural to think of how agentic markets will be organized. In a future world where we do have billions of AI agents, how would they coordinate with each other? What kind of coordination mechanisms would be needed? What institutions are likely to emerge?

And one possibility is particularly intriguing: will coordination still require money? Not in the sense of US dollars, but a shared medium of exchange and a hub/ clearing protocol.

Money, Money, Money

“Why money” has occupied economists going back to Adam Smith, who framed cash as solving what has since been termed the coincidence of wants. To see what we mean, consider a pure barter economy. Let’s say Alex is an apple farmer and Rohit raises chickens. If Alex wants chickens and Rohit wants apples, then Alex can just walk over to Rohit’s house with a bushel of apples and get some chickens in return. Simple. But what if Alex wants chickens but Rohit wants an electric toothbrush - he has no need for apples right now. Then to get the chickens, Alex would need to find a person who is willing to trade an electric toothbrush for his apples, and then come back to Rohit for a trade.

This would still all be fine if there was just one other person to visit and trade with, but what happens in a large market, with many (many) people who potentially have both an electric toothbrush to trade and want Alex’s apples? In order to trade, Alex needs to happen to find a person that both 1) has what Alex wants and 2) wants what Alex has. As very nicely shown in a paper by Rafael Guthmann and Brian Albrecht, the need to satisfy this coincidence of wants through finding matches creates complexity that quickly blows up as the size of the market increases. If the market is even moderately large, this complexity makes even basic transactions essentially impossible.

Ergo money. While the origin of money is a hot topic of debate (e.g., see David Graeber’s excellent book Debt: The First 5000 Years), the role of money in a competitive market is to solve the coincidence of wants. Money acts as a special type of good called the numeraire, where its only role is that it can be exchanged for other goods at pre-determined quantities. These quantities are reflected in the prices that each good is worth.

Going back to Alex and Rohit: one way to solve the coincidence of wants would be for Alex to sell his apples at a special place called a market and then to use the money to purchase Rohit’s chickens. Rohit can then use that money to buy an electric toothbrush, or indeed any other thing his heart desires. Money eliminates the need for people to coordinate their transactions based on their current endowment (what they have) and preferences (what they want).

Bring on the agents

Okay, so money is necessary to coordinate transactions in an economy with people. This is largely because each individual can’t hope to have enough information on what everyone else has and wants to reliably engage in market transactions. Alex and Rohit are as yet, sadly, mortals.

But will this be the case for AI agents?

Agents do not have the same computational constraints as human beings. In theory, it may be possible to solve the search problem where the coincidence of wants becomes a non-issue. In that case, the agentic economy could eliminate the need for a key institution of the human economy. We decided to run an experiment to find out.

The experiment

First, the repo here. We can have N agents, with N goods, and each starts with its own good and wants another. There are multiple rounds, with one action per agent per round. Agents decide their course of action via structured JSON, and success simply means you get what you want.
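To get a feel for why this setup is hard, here is a minimal sketch of the starting position. It is not the code from the repo: the function names are made up for illustration, and it assumes each agent's want is drawn uniformly at random from the other agents' goods. It counts how many agents could be satisfied by a single direct swap, i.e. a double coincidence of wants.

```python
import random

def make_wants(n, rng):
    """Agent i holds good i and wants some other good (assumed uniform at random)."""
    return [rng.choice([g for g in range(n) if g != i]) for i in range(n)]

def direct_swaps(wants):
    """Pairs (i, j) where i wants j's good and j wants i's good."""
    return [(i, j) for i, j in enumerate(wants) if i < j and wants[j] == i]

rng = random.Random(0)
for n in (2, 4, 8, 16):
    trials = 2000
    matched_pairs = sum(len(direct_swaps(make_wants(n, rng))) for _ in range(trials))
    # Each matched pair satisfies two agents.
    print(n, round(2 * matched_pairs / (trials * n), 3))
```

Under that uniform assumption the share of agents with a direct match is 1/(n-1) in expectation, so it falls off quickly as the market grows. Everyone else needs a chain of intermediate trades, which is where the coordination burden comes from.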

The first question is about a pure barter economy. We explore whether LLM agents can achieve efficient allocations through barter at any scale, i.e., to engage in multiple bilateral negotiations to achieve gains from trade. The agents in the experiment have no real shortage of time. If this works then Coasean bargaining should be straightforward; goodbye money!

The table below has the results. What do we see? When the scale is small - when Alex just has to worry about coordinating with Rohit - all of this works. But as the number of agents grows, things start to get really difficult. By the time we get to even 8-12 agents the number of successful transactions drops to below 50%. And this is the absolute simplest setting.

Perhaps this should be expected. The problem is still O(n²) in complexity, which grows quickly as the number of agents grows. And if this isn’t just bilateral, but starts to include multiparty negotiations, it might become O(n!), which is bigger than n² for any n bigger than 3 and grows far faster from there.
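A quick way to see the two growth rates side by side is below. This is only a back-of-the-envelope sketch: counting unordered pairs stands in for bilateral negotiation channels, and counting orderings of all agents is a rough, assumed stand-in for multiparty chains.

```python
from math import factorial

def pairwise_channels(n):
    """Unordered pairs of agents that could negotiate bilaterally: n(n-1)/2, which is O(n^2)."""
    return n * (n - 1) // 2

def orderings(n):
    """Number of ways to order n agents, a rough proxy for multiparty chains: n!."""
    return factorial(n)

for n in (2, 3, 4, 8, 12):
    print(n, pairwise_channels(n), n * n, orderings(n))
```

At n = 3 the factorial is still smaller than n² (6 vs 9), at n = 4 it overtakes (24 vs 16), and by n = 12 it is in the hundreds of millions while n² is 144.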

Ok let’s make it a bit easier for the agents. If they can’t talk to each other, since they are agents anyway, we should be able to give them omniscience. Enter Central Planning. There has been plenty of work before on the limits of bilateral negotiations, but we can test how well a “hub” structure can help. Does having a central planner help set the stage for better performance?

As the results table shows, central planning makes things slightly better, but we are still very much in a world of the Hayekian troubles. A hierarchy without a numeraire just isn’t enough.

Ok, we can continue looking at our human history to see what else we can do. In Debt, David Graeber argues that money emerged at least in part through state power, to enforce the paying of taxes in order to fund foreign wars. Before this, he argues, IOUs and bartering seemed to have worked just fine to manage the economy; the IOUs themselves became a sort of numeraire that could be traded in order to solve the coincidence of wants.

So let’s introduce Credits and IOUs. We can give the agents the ability to give each other an IOU and see whether providing the basics of credit allows them to come up with better ways to interact with each other.

This still didn’t help as much as we thought. There were a few segments where the transactions started happening, but they really didn’t start to work. Or scale.

Most interestingly, the concept of money didn’t emerge from this, not organically. IOUs didn’t become money. Even though in conversations LLMs all know that this is the smart thing to do, it did not emerge.

This was a bummer, because as with the prior research, what this shows is that AI agents do not yet come with the natural instincts of humans to turn IOUs into a numeraire that acts as a stand-in for money. They don’t even come with the same set of ideas as this sea otter.

Ok, let’s take the final step and actually introduce Money. We do this by creating an exchange where the agents can do bids and offers, and look at market outcomes. The results are stark: markets resolve at a success rate of 100% and much faster than through other mechanisms, at the rate of O(n).

One note is that this result presumes the exchange works without a hitch. In reality there will be friction coming from liquidity constraints, differential compute resources, etc. For example, in the N=8 run, the hub handled 23 inbound + 23 outbound messages and prices stayed fixed. And if regulations require that AI agents use different types of country-specific currencies, then exchange rates will complicate things further.
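The O(n) intuition can be sketched in a few lines. This is an idealised illustration, not the exchange from the repo: the function name is hypothetical, the price is fixed and common to all goods, and wants are assumed to form a permutation (nobody competes for the same good). Each agent sends one sell order and one buy order to the hub, so the message count grows linearly with the number of agents.

```python
def clear_through_hub(wants, price=1):
    """Agent i sells good i to the hub for cash, then buys the good it wants.

    Assumes one fixed price for every good. Returns (final holdings, messages).
    """
    n = len(wants)
    cash = [0] * n
    hub_stock = set()
    messages = 0
    for i in range(n):               # one sell order per agent
        hub_stock.add(i)
        cash[i] += price
        messages += 1
    holdings = [None] * n
    for i, good in enumerate(wants):  # one buy order per agent
        if good in hub_stock and cash[i] >= price:
            hub_stock.remove(good)
            cash[i] -= price
            holdings[i] = good
            messages += 1
    return holdings, messages

holdings, messages = clear_through_hub([1, 2, 0])
print(holdings, messages)
```

A three-way cycle that has no direct swap at all clears here with two messages per agent. Real runs will log more traffic than this lower bound (quotes, confirmations and so on), but the growth stays linear rather than quadratic.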

Discussion

To sum: An agentic economy doesn’t emerge automatically with even SOTA agents (who really should know better). Barter and central planning remain inefficient and infeasible, and money does not emerge organically even when credit and IOUs are introduced. At least in our setting, an agentic economy needs more top-down engineering to become efficient.

Previous work on agent-based modeling has explored what kind of emergent economic realities we are likely to see with rule-based agents interacting. The world of AI agents is fundamentally different. These agents act based on a huge corpus of human knowledge, with the underlying LLM models able to solve incredibly difficult problems on their own. These agents can plan, they can negotiate, they can code. And even with all this knowhow at their disposal, it’s interesting to see that they still appear to require top-down institutions to create an effective and efficient market.

As we transition to a more agentic economy, a key part of ‘getting ready’ for that world is setting up institutions for the agents. These include:

Identity and roles
Settlement and payment
Pricing and quote formats
Reputation
Marketplaces and clearinghouses

This is by no means exhaustive, but we wager that mechanism design for multi-agent work is going to be a rather fertile area of research for a while. Humanity went through millennia of evolution to figure out the right societal setup that lets us progress, that lets us build a thriving civilisation.

It is both necessary and inevitable that the world of AI agents will also need the equivalents, though the emergence of such institutions will likely be much faster given the millennia of human knowledge that we’ve already amassed.

Github repo here.

Seeing like an agent

2025-12-08 23:02:20

This has become part of a series of essays, evaluating the new “homo agenticus sapiens” that is AI Agents. This is Part I, seeing like an agent. Part II is why the agentic economy needs money. And Part III on what happens when we all have AI agents.

One of the books that I loved as a kid was Philip Pullman’s His Dark Materials. The books themselves were fine, but the part I loved most were the daemons. Each human had their own daemon, uniquely suited to them, that would grow with them and eventually settle into a form that reflects their personality.

I kept thinking of this when reading the recent NBER paper by John Horton et al about The Coasean Singularity. From their abstract:

By lowering the costs of preference elicitation, contract enforcement, and identity verification, agents expand the feasible set of market designs but also raise novel regulatory challenges. While the net welfare effects remain an empirical question, the rapid onset of AI-mediated transactions presents a unique opportunity for economic research to inform real-world policy and market design.

Basically they argue, if you actually had competent, cheap AI agents doing search, negotiation, and contracting, like your own daemon, then a ton of Coasean reasons firms exist disappear, and a whole market design frontier reopens.

This isn’t a unique argument, though it is well done here. I’ve made it before, as have others, including Seb Krier recently here and Dean Ball and many others. The authors even talk about tollbooths, like the ones from Cloudflare, and agent-only APIs and pages.

But while reading it I kept thinking that by now this is no longer a theoretical question: we have decent AI agents and we should be able to test it. And it’s something I’ve been meaning to do for a while, so I did. The question was: if we wire up modern agents as counterparties, do we actually see Coasean bargains emerge? Repo here.

The punchline is that AI agents did not magically create efficient markets. And they also kinda fell prey to a fair bit of human pathologies, including bureaucratic politics and risk aversion.

Experiment 1: An internal capital market

The first way to test these was to just throw them into a simulated company and see what happened. So I set up four departments - Marketing, Sales, Engineering and Support - and said they could all bid for budget to do their jobs. Standard internal capital market where departments would submit bids and projects get funded until budgets get exhausted.

If the promise of Hayek holds, and markets emerge when information flows freely, then we should be able to see this work. And it would be much better than the command and control method by which we try to decide this today.

Well, it didn’t work. Marketing and Sales accumulated political capital. Engineering posted negative utility for most quarters. The market we set up systematically funded customer facing features and starved infrastructure work. It’s like Seeing Like A State all over again.

I think this was because GTM type departments could come up with immediate articulable customer values, whereas Engineering’s value kept feeling preventative or diffuse.

It’s a bit frustrating to see that the models still retain human foibles since this is effectively Goodhart’s Law. When you measure departmental utility and fund accordingly, and you let the agents argue on their behalf, you do start to see negative externalities for core functionality.

So I added countermeasures. I added risk flags on features and veto power over “dangerous” work. Added shared outage penalties (if you ask for a risky feature and everything crashes, you pay for it too). And when I ran that, outages did happen. GTM departments observed this and tempered their bids, though only a little.

Engineering utility however still stayed low. GTM could discount future outages and gambled on “maybe it won’t break” for its immediate wins. But Engineering couldn’t proactively push folks into infrastructure investments. The pattern is hardly dissimilar.

The truly interesting part was that the agents perfectly replicated the dysfunction of real companies. Onwards.

Experiment 2: External markets - IP licensing

This was the most interesting part. The best way to see Coasean bargaining come true is to set up an external market for cross firm technology licensing. Twenty firms and thirty software modules. Each firm has some internal capabilities but could also license tech, so the buy vs build becomes a much cleaner decision with AI agents vs humans in reality. A classic setup, and the payoffs should be excellent. Or so I thought.

First run had zero deals. Every firm decided to build everything internally. They understood the rules and saw potential counterparties and had budget to trade, but still they chose autarky.

Okay, so I added reputation systems, post-trade verification, penalties for idleness, bonuses for successful deals, counterparty hints, even price history. Basically the kitchen sink.

Still zero trades.

This is the perfect setup as per the paper. Transaction costs effectively zero. Perfect information. Aligned incentives. Etc etc. The agents just didn’t care to trade! Because of very high Knightian uncertainty aversion (I assume), or some heavy pretraining that firms mainly build, not trade.

So I mandated ask/bid submissions. If you don’t post prices, defaults are generated. Profits are then directly coupled to next quarter’s budget. And I even gave explicit price hints, because the agents clearly couldn’t, or wouldn’t, discover equilibrium without them.

Now we start to see trades! Success! Three deals per round. The welfare is still far below the market optimum, but that’s possibly also because we haven’t optimised them yet.

But by now it wasn’t a market in the Hayekian sense. Like it’s no longer voluntary. We’re forcing the agents to trade, and then they do the sensible things.

Since it worked well for well behaved participants, I also did a robustness check: I created adversarial firms to check if the market still functions. And it does. Adversarial sellers captured much of the surplus, i.e., fairness is expensive. It’s either weak strategic sophistication or the agents are just nice and passive by default, I don’t know which.

Experiment 3: Second price auctions

The third experiment was one to check whether the models behave according to their beliefs. Vickrey auctions are sealed second price auctions, so the winner pays the price of the second highest bid. This means the dominant strategy is for the bidders to be accurate to their beliefs.

And they did. Allocative efficiency was 1. This is a little bit of a control group since the models must be smart enough to know the dominant strategy. I added “profit max only” personas, and collusion channels, just to check, and the behaviour still looked like standard truthful Vickrey bidding.
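For anyone who hasn’t seen why truthful bidding is the dominant strategy, here is a small sketch. The function names and the example bids are made up for illustration and are not taken from the experiment. It implements the second-price rule and compares the payoff from bidding your true value against every other integer bid.

```python
def vickrey(bids):
    """Sealed-bid second-price auction: highest bidder wins, pays the second-highest bid."""
    ranked = sorted(bids, key=bids.get, reverse=True)
    winner = ranked[0]
    price = bids[ranked[1]]
    return winner, price

def payoff(value, my_bid, other_bids):
    """Payoff to bidder 'me': value minus price if they win, zero otherwise."""
    bids = dict(other_bids, me=my_bid)
    winner, price = vickrey(bids)
    return value - price if winner == "me" else 0

others = {"a": 7, "b": 4}
truthful = payoff(10, 10, others)
best_alternative = max(payoff(10, b, others) for b in range(0, 21))
print(truthful, best_alternative)
```

Because the price you pay is set by someone else’s bid, shading your bid can only make you lose auctions you wanted to win, and overbidding can only make you win auctions at a loss. When everyone bids truthfully the item goes to whoever values it most, which is what an allocative efficiency of 1 means.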

This tells us that they’re smart enough to do the right thing, but also that given a messy environment with underspecified mechanisms, which is most of the real world, they default to passivity or autarky.

I also tested this with a bargaining test with five players, which asks the models to divide a surplus value and negotiate with each other over how to split it. The players can see a broadcast and each other’s proposals, and after round 1 they can also DM each other. I even made one of the players adversarial. And still the splits remained near-equal, very far from the Shapley vector. They are norm conforming. Models are highly self-incentivised to be fair!
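To make the comparison with the Shapley vector concrete, here is a sketch that computes it by brute force for a five-player game. The characteristic function below is a hypothetical example, not the one used in the experiment: a surplus of 100 appears only if player "A" joins together with at least two others.

```python
from itertools import permutations
from math import factorial

def shapley(players, v):
    """Average marginal contribution of each player over all join orders."""
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += v(with_p) - v(coalition)
            coalition = with_p
    n_orders = factorial(len(players))
    return {p: total / n_orders for p, total in totals.items()}

def v(coalition):
    """Hypothetical game: worth 100 only if 'A' is in and the coalition has 3+ members."""
    return 100.0 if "A" in coalition and len(coalition) >= 3 else 0.0

values = shapley(["A", "B", "C", "D", "E"], v)
print(values)
```

In this toy game the pivotal player gets 60 and the other four get 10 each, whereas a norm-conforming equal split would hand everyone 20. That gap is the kind of distance between "fair" and "Shapley" the bargaining test was looking at.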

Synthesis

We saw 4 claims tested. To summarise:

AI lowers transaction costs so markets emerge spontaneously - False
With mechanism design, AI-mediated markets can function - True, but costly (required forced participation with Gosplan-ish price hints)
Internal markets improve on hierarchy when coordination costs fall - False (GTM dominates Engineering even with full information)
AI agents play fair in functioning markets - Mixed (adversarial agents extract rents, but agents are mostly fair)

The takeaway from these experiments is that to get to a point where the AI agents can act as sufficiently empowered Coasean bargaining agents, for them to become a daemon on my behalf, they need to be substantially empowered and so instructed. They do not act in the way humans act, but are much fairer and much more passive than we would imagine.

Markets don’t form spontaneously. Markets form under coercion but are pretty thin. And when markets exist, strategic sophistication determines who wins, depending on how the agents are set up. It shows alignment problems don’t disappear just because the agents can negotiate with each other. This is pessimistic for the “AI dissolves firms” narrative but optimistic for the “AI can enable better institutions” narrative.

The Coasean Singularity paper argues AI lowers transaction costs but the gains require alignment and mechanism design, which is what I empirically tested here. It’s a strong confirmation of its strong form - that reduction in transaction costs was nice but mechanism design was needed to set up an actual market.

Also, the fact that we needed to couple their budgets so the AIs would sing from the same hymn sheet is important: it means any multi-agent design we create would need a substrate, like money, to help them coordinate.

Now some of this is that the intuitions we have built up over time, both from other humans and from stories, lead us to assume that the agents have enough context at all times on what to do. I see my four year old negotiating with his brother to get computer time and by the time he’s a bargaining agent with some hapless corporation he would have had decades of experience with this. Our models on the other hand have had millions of years of subjective experience in seeing negotiation but have zero experience in feeling that intense urge of wanting to negotiate to watch Prehistoric Planet with a brother.

Perhaps this matters. These complex histories can get subsumed in casual conversation into a seemingly innocuous term like “context” and maybe we do need to stuff a whole library into a model to teach it the right patterns or get it to act the way we want. The daemons we do have today aren’t settled in forms that reflect our interests out of the box though they know almost everything about what it is like to act as if it shares those interests.

But what the experiments showed is that this is far from obvious. Coase asked why firms exist if markets are efficient, and answered it’s because of transaction costs. The experiments here ask, even with zero transaction costs, why do firm-like structures still emerge?

And if we do end up doing that, we might have just rediscovered the reason why firms exist in the first place, the very nature of the firm. Even as we recreate it piece by instructive piece.

Github repo here

And when we are able to roll the AI agents out, we will get firms that are more programmable, more stimulated and more explicitly mechanism-designed than human firms ever were.

Contra Scott on AI safety and the race with China

2025-12-02 09:12:23

Scott Alexander has a really interesting essay on the importance of AI safety work, arguing it will not cause the US to fall behind China, as is often claimed. It’s very well written, characteristically so, and well argued. His argument, in a nutshell (I paraphrase), is:

US has ~10x compute advantage over China
Safety regulations add only 1-2% to training costs at most
China is pursuing “fast follow” strategy focused on applications anyway
Export controls matter far more (could swing advantage from 30x to 1.7x)
AI safety critics are inconsistent - they oppose safety regs but support chip exports to China
Sign of safety impact is uncertain - might actually help US competitiveness

I quite like this argument because I actually agree with all of the points, mostly anyway, and yet find myself disagreeing with the conclusion. So I thought I should step through my disagreements, and then what my overall argument against it is, and see where we land up.

First, the measurement problem

Scott argues that the safety regulations we’re discussing in the US only add 1-2% overhead. This is built off of METR and Apollo’s findings, around $25m for internal testing, contrasted with $25 billion for training runs. All the major labs also already spend enormous sums of money on intermediate evaluations, model behaviour monitoring and testing, and primary research to make them work better with us, all classic safety considerations.

This only holds if the safety regulation based work, hiring evaluators and letting them run, is strictly separable. Which is not true of any organisation anywhere. When you add “coordination friction”, you reduce the velocity of iteration inside the organisation. Velocity here really really matters, especially if you believe in recursive self improvement, but even if you don’t.

This is actually visible in ~every organisation known to man. Facebook has a legal department of around 2000 employees, doubled since pre Covid, of a total employee base of 80,000. Those 2000 are quite likely not disproportionately expensive vs the actual operating expenditure of Facebook. But the strain they put on the business far exceeds the 2.5% cost it puts on the output. There’s a positive side of this argument, they will also prevent enough bad things from happening that the slowdown is worth it. Presumably Facebook themselves believe this, which is why they exist, but it is very much not as simple as comparing the seemingly direct costs.

The argument that favours Scott here is maybe pharma companies,

This gets worse once you think about the 22 year old wunderkinds that the labs are looking to hire, and wonder if they’d be interested in more compliance, even at the margin.

China is a fast follower

The argument also states that China is focused on implementation and fast-follow strategy, because they don’t believe in AGI. I think it’s an awfully load bearing claim, and feels quite convenient. China is also known for strategic communication in more than one area, where what they say isn’t necessarily what they focus on.

As Scott notes, Liang Wenfeng of DeepSeek has explicitly stated he believes in superintelligence, which in itself contradicts the argument that they only care about the applications layer. If China does truly believe in deployment, as seems to be the case, then having true believers as heads of top labs is if anything more evidence against the “they’re just fast followers” argument.

They’re leaders in EVs, solar panels, 5G, fintech and associated tech, probably quantum communications, an uncomfortably large percentage of defense related tech, seemingly humanoid robots, the list is pretty long. This isn’t all just fast followership, or at least even if it is, it’s indistinguishable from the types of innovation we’re talking about here.

Again, this only really matters to the extent you think recursive self improvement is true or China won’t change its POV here very fast if they feel it’s important. The CCP has an extraordinary track record of redirecting capital in response to perceived strategic opportunity (and overdoing it). That means “they don’t believe in AGI” is an unstable parameter. Even if the true breakthrough comes from some lab in the US, or some tiny lab at Harvard, it will most likely not be kept under wraps for years as the outcomes compound.

The AI safety critics are sometimes bad faith

This is true! There’s a lot of motivated reasoning, which tries to tie itself in knots such as to argue “to beat china we have to sell them the top Nvidia chips, so they don’t develop their own chip industry and cut the knees off another one of our top industries”. Liang Wenfeng has also said that his biggest barrier is access to more chips.

That said, here my core problem is that I am unsure about which aspects of the regulations being proposed are actually useful. Right now they ask for a combination of red-teaming (to what end), hallucination vs sycophancy (how do you measure), whistleblower protections, bias (measurement?), CBRN (measurement delta vs pure capability advance), observability for chip usage (hardware locks?), and more. These assume a very particular threat surface.

The Colorado AI Act focuses on algorithmic fairness and non discrimination. Washington HB 1205 focuses on digital likeness and deepfakes. AB2013 in California on disclosing training data for transparency. Utah’s SB 332 says an AI has to disclose that it’s an AI when used as a chatbot. These are all quite different, as we can see, and will require different answers in both implementation and compliance. writes about this cogently and cohesively.

Many of these ideas are sensible in isolation, but many of them are also extremely amorphous. Regulations are an area where I am predisposed to think that unless they’re highly specific and ROI is directly visible it’s better to not get caught in an invisible graveyard. The regulatory ratchet is real, as Scott acknowledges. Financial regulation post-2008, aviation post-9/11, FDA … We always start with common sense guardrails that create an apparatus that then expands.

Sign uncertainty

It is definitely true that having a more robust AI development environment might well propel the US forward vs China. Cars with seatbelts beat cars without seatbelts. Maybe lack of industrial espionage means the gains from US labs won’t seed Chinese innovation.

It should be noted though that the labs already spend quite a bit on cybersecurity. Model weights are worth billions, soon dozens of billions, and are protected accordingly. Should it be made stronger? Sure.

It should be noted, underlined, however that this is true only insofar as the Chinese innovation is driven by industrial espionage or weight stealing. Right now that definitely does not seem to be the case. What is true is that deployment by filing off the edges, making the products much nicer to use, especially via posttraining, is something Western models do a much better job of. Deepseek, Qwen or Kimi products are just not as good, and differentially worse than how good their models are.

So … now what.

Scott’s argument makes sense, but only in a particular slice of the possible future lightcone. For instance, we can sort of lay down the tree of how things might shake out. There are at least 5 dimensions I can think of offhand:

Takeoff speed
Alignment difficulty
Capability distribution (oligopoly, monopoly etc)
Regulations’ impact on velocity
China’s catch up timeline

You could expand this by 10x if you so chose, and things would get uncomfortably diverse very very quickly. But even with this, if we split each of these into like 4 coarse buckets (easy, moderate, hard, impossible), you get 1024 worlds. I asked Claude to simulate these worlds and choose whatever priors made sense to it, and it showed me this:

I’m not suggesting this is accurate, after all there could be a dozen more dimensions, or the probability distribution might be quite different. Change in one variable might impact another. But at least it gives us an intuition on why the arguments are not as straightforward as one might imagine, and it’s not fait accompli that “AI safety will not hurt US in its race with China”, and that’s assuming the race is a good metaphor!
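The 1024 figure is just 4 to the power of 5, and it is easy to enumerate the worlds yourself. The sketch below does that and attaches a probability to each world. The priors are placeholders (uniform over buckets) and the dimensions are treated as independent, which is an assumption the caveat above already warns may not hold; none of this reproduces what Claude actually did.

```python
from itertools import product

dimensions = [
    "takeoff speed",
    "alignment difficulty",
    "capability distribution",
    "regulations' impact on velocity",
    "China's catch up timeline",
]
buckets = ["easy", "moderate", "hard", "impossible"]

# Placeholder priors: uniform over buckets for every dimension (an assumption).
priors = {d: {b: 1 / len(buckets) for b in buckets} for d in dimensions}

worlds = list(product(buckets, repeat=len(dimensions)))

def world_probability(world):
    """Probability of one world, assuming the dimensions are independent."""
    p = 1.0
    for dim, bucket in zip(dimensions, world):
        p *= priors[dim][bucket]
    return p

total = sum(world_probability(w) for w in worlds)
print(len(worlds), total)
```

Swapping in your own priors per dimension, and then summing the probability over the worlds where a given argument goes through, is a cheap way to see how much of the distribution that argument actually covers.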

For instance, here’s one story, which I tried to draw out with the help of Claude after getting lost.

Does recursive self improvement happen?
- Y. First to ASI wins the lightcone
  - Is there a close race with China?
    - Y. Every month matters
      - Do safety regs meaningfully slow us?
        - Y. Disaster!
        - N. Small overhead doesn’t matter!
    - N. US has durable advantage (10x compute)
      - Does model quality matter more than deployment?
        - Y. We have time for safety work. 6mo slower might be fine!
        - N. Safety regs might not matter
- N. Gradual capability increases
  - Which layer determines winner?
    - Model layer
      - How durable is US advantage?
        - 10x compute advantage wins, so regulations are basically “free”
        - If China can catch up however, efficiency gains matter, so safety regs might be a small drag but real
    - Application layer
      - Do safety regulations affect deployment velocity?
        - Y. Compliance morass and lawyerly obstruction everywhere.
        - N. Safety regs only affect the model. It’s fast and unobtrusive. It’s fine.

In this tree there are only a few areas where Scott’s argument holds water. Recursive self improvement is important enough to worry about but unimportant enough that velocity doesn’t matter. Chinese skepticism about ASI is stable but we should prevent dictators getting ASI. We can measure direct costs but what about illegible costs? Model layer regs won’t affect application layer despite Colorado showing they already do.

If recursive self improvement is false, it only makes sense to do more regulations *if* safety regulations do not meaningfully impact deployment velocity in the application layer and the compute advantage holds in the model layer. If recursive self improvement is going to happen, then Scott’s argument has more backing, especially if safety regulations don’t slow us down much as long as the model quality will continue to improve.

Which means of course the regulations have to be sensible, they can’t be an albatross, China’s “catch up” timeline has to be longer, the capability distribution has to be more oligopolistic, alignment has to be somewhat difficult, and takeoff speed has to be fairly fast.

If we relax the assumptions, as in the tree above, we might end up in places where AI safety regulations are more harmful than useful. One example, and this is my own view, is that a lot of AI safety work is just good old fashioned engineering work. Like you need to make sure the model does what you ask it to, to solve hallucinations and sycophancy. And you need to make sure it doesn’t veer off the rails when you ask it slightly risque questions. And you’d want the labs to be “good citizens”, not coerce employees to keep quiet if they see something bad.

Scott treats regulatory overhead as measurable and small in his essay. But the history of compliance shows that its costs compound through organisational culture, talent selection, and legal uncertainty, and come to dominate direct costs. If he’s wrong about measurement, and Facebook’s legal department suggests he is, then his entire calculation flips. Same again with China’s stance in reality vs what they say, or the level of belief in recursive self improvement.

To the question at hand, will AI safety make America lose the war with China? It depends on that tree above. It is by no means assured that it will (or that it won’t), but the type of regulation and the future being envisioned matter enormously. The devil, as usual, is in the really annoying details.

In my high-weight worlds, AI safety work can meaningfully help, but only if done sensibly. I don’t put too much weight on recursive self improvement, at least done without human intervention and time to adjust. I also think that a large amount of safety work is intrinsic to building widely available and widely used software, so it is not even a choice. It might not be called AI safety, it might be called, simply, “product”, which would have to think about these aspects.

Personally, I prefer a very economist’s way of asking the “will AI safety make the US lose to China” question, which is: what is the payoff function for winning or losing the race? Since regulations are (mainly) ratchets, we should choose them carefully, and only when we think it’s warranted (high disutility if we don’t, positive utility if we do).

In “mundane AI” world, we get awesome GPTs but not a god. Losing means we’re Europe. While some might think of this as akin to death, it’s not that bad.
In “AI is god” world, losing is forever.

Even in the first world, AI safety regs might make the US the Brussels of AI, which is a major tradeoff. Most regulations currently proposed don’t yet seem to cause that effect. But it’s not like it’s hard to imagine.

Regulation can be helpful with respect to increasing transparency (training data is one example, though with synthetic data that’s already hard), with whistleblower protections (even though I’m not sure what they’d blow the whistle on), and red teaming the models pre deployment. I think chip embargoes are probably good, even though they help Huawei.

It’s far better to not think about being pro or con AI safety regulations, but to be specific about which regulation and why. The decision tree above helps, but you do need to specify which worlds you’re protecting.

Epicycles All The Way Down

2025-11-27 03:37:24

“All models are wrong, but some are useful.” — George E. P. Box

“All LLM successes are as human successes, each LLM failure is alien in its own way.”

I. Two ways to “know”

I was convinced I had a terrible memory throughout my schooling. As a consequence pretty much for every exam in math or science I would re-derive any formula that was needed. Kind of a waste, but what could I do. Easier than trying to remember them, I thought. It worked until I think second year of college, when it didn’t.

But because of this belief, I did other dumb things too beyond not studying. For example I used to play poker. And I was convinced, and this was back in the day when neural nets were tiny things, that my brain was similar and I could train it using inputs and outputs and not actually bother doing the complex calculations that would be needed to measure pot odds and things like that. I mean, I can’t know the counterfactual but I’m reasonably sure this was a worse way to play poker than just actually doing the math, but it definitely was a more fun way to do it, especially when combined with reasonable quantities of beer. I was convinced that just from the outcomes I would be able to somehow back out a playing strategy that would be superior.

It didn’t work very well. I mean, I didn’t lose much money, but I definitely didn’t make much money either. Somehow the knowledge I got from the outcomes didn’t translate into telling me when to bet, how much to bet, when to raise, how much to raise, when to fold, how to analyse others, how to bluff, you know all those things that if you want to play poker properly you should have a theory about.

Instead what I had were some decent heuristics on betting and a sense of how others would bet. The times I managed to get a bit better were the times I took what my “somewhat trained neural net” said I should do, then calculated the pot odds, explicitly tried to figure out what others had, and used those as inputs alongside my vibes. I tried to bootstrap understanding from outcomes alone, and I failed1.
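For contrast, the explicit calculation I was avoiding is tiny. A minimal sketch with made-up numbers and hypothetical function names, assuming the pot figure already includes the bet you are facing, and ignoring implied odds and everything else a real player would weigh:

```python
def pot_odds(pot: float, to_call: float) -> float:
    """Share of the final pot that you have to put in to make the call."""
    return to_call / (pot + to_call)


def should_call(pot: float, to_call: float, win_probability: float) -> bool:
    """A call has positive expected value when you win more often than the pot odds."""
    return win_probability > pot_odds(pot, to_call)


# Illustrative numbers only: 100 in the pot, 50 to call.
required_equity = pot_odds(100, 50)
print(round(required_equity, 3))
print(should_call(100, 50, win_probability=0.40))
```

The vibes-based approach amounts to trying to estimate `win_probability` and the threshold together from outcomes, without ever writing the threshold down.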

II. Patterns and generators

“What I cannot create, I do not understand.” — Richard Feynman

This essay is about why LLMs feel like understanding engines but behave like over-fit pattern-fitters, why we keep adding epicycles that get us closer to exceptional performance, instead of changing the core generator, and why that makes their failures look more like flash crashes and market blow-ups than like Skynet.

One way this makes sense is that mathematically the number of ways to create a pattern has to be more than the number of patterns themselves. There are more words than letters. The set of all possible 1000 character outputs is huge, but the set of programs that could print any one of them is larger2.
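To make that concrete, here is a small sketch: three hand-written program texts (chosen by me purely for illustration) that all print the same 1000-character pattern. Seeing only the output tells you nothing about which of them produced it.

```python
import contextlib
import io

# Three different "generators" (program texts) for the same 1000-character pattern.
programs = [
    "print('ab' * 500)",
    "print(''.join('ab' for _ in range(500)))",
    "s = ''\nwhile len(s) < 1000:\n    s += 'ab'\nprint(s)",
]


def run(source: str) -> str:
    """Execute a program text and capture what it prints."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {})
    return buffer.getvalue()


outputs = {run(p) for p in programs}
print(len(programs), "programs,", len(outputs), "distinct output")
```

You can keep adding redundant variants to `programs` forever without ever changing `outputs`.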

An LLM trained on the patterns swims in an ocean of possible generators and the entire game of training is to identify those extra constraints so it has reason to pick the shortest, truest one. Neural networks have inductive biases that privilege certain solutions.

There is an interesting mathematical or empirical question to be answered here. What are the manifolds of sufficiently diverse patterns which can be used such that collectively it will turn away the wrong principles and keep only the correct generative principles?

I’m not smart enough to prove this, but a starting point might be Gold’s theorem, which says something like this: if all you ever see are positive examples of behaviour, then for a sufficiently rich class of programs it might well be true that no algorithm can be guaranteed to eventually lock onto the exact true program that produced them. LLMs are a giant practical demonstration of this. They implicitly infer some program that fits the data, but not necessarily the program you “meant”.
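A toy illustration of the flavour of this (not Gold’s theorem itself, just the intuition, with rules and examples I made up): several candidate rules all agree with a handful of positive examples, and only a negative example separates them.

```python
# Hypothetical candidate "programs" (rules) describing a set of numbers.
hypotheses = {
    "powers of two": lambda n: n > 0 and (n & (n - 1)) == 0,
    "multiples of four": lambda n: n % 4 == 0,
    "even numbers": lambda n: n % 2 == 0,
    "any positive integer": lambda n: n > 0,
}

positive_examples = [4, 8, 16, 64]

# Every rule that accepts all the positive examples is still in the running.
consistent = [
    name for name, rule in hypotheses.items()
    if all(rule(n) for n in positive_examples)
]
print(consistent)

# One negative example ("12 is NOT in the set") prunes the broader rules.
negative_example = 12
survivors = [name for name in consistent if not hypotheses[name](negative_example)]
print(survivors)
```

No amount of extra positive examples drawn from the powers of two would ever rule out “even numbers”; the broader rule always fits.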

I asked Claude about this, and it said:

The deeper truth is that success is low-dimensional. There are relatively few ways to correctly solve “2+2=” or properly summarize a news article. The constraint satisfaction problem has a small solution space. But failure is high-dimensional—there are infinitely many ways to be wrong, and LLMs explore regions of that failure space that human cognition simply doesn’t reach.

One way to think about this is as the distinction between complexity in a system and randomness. Often indistinguishable in its effects, but fundamentally different in its nature. A world where a butterfly can flap its wings and cause a hurricane somewhere else is also a world that is somewhat indistinguishable from one filled with randomness. The difference of course is that the first one is not random, it is deterministic, it just seems random because we cannot reliably predict every single step that the computation needs to take in all its complex glory.
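The standard textbook example of this is the logistic map. A minimal sketch, with the starting point and the size of the nudge chosen arbitrarily by me:

```python
def logistic_trajectory(x0: float, steps: int, r: float = 4.0) -> list[float]:
    """Iterate x -> r * x * (1 - x), a fully deterministic rule."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


a = logistic_trajectory(0.2, 60)
b = logistic_trajectory(0.2 + 1e-10, 60)  # the butterfly: a tiny nudge

# The two runs track each other early on, then drift completely apart.
print(abs(a[5] - b[5]))
print(max(abs(x - y) for x, y in zip(a, b)))
```

Nothing in the code is random, yet after a few dozen steps the nudged run tells you nothing about the original one.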

One of Taleb’s targets is what he calls the “ludic fallacy,” the idea that the sort of randomness encountered in games of chance can be taken as a model for randomness in real life. As Taleb points out, the “uncertainty” of a casino game like roulette or blackjack cannot be considered analogous to the radical uncertainty faced by real-life decision-makers—military strategists, say, or financial analysts. Casinos deal with known unknowns—they know the odds, and while they can’t predict the outcome of any individual game, they know that in the aggregate they will make a profit. But in Extremistan, as Donald Rumsfeld helpfully pointed out, we deal with unknown unknowns—we do not know what the probabilities are and we have no firm basis on which to make decisions or predictions.

This isn’t just Taleb being esoteric. The rules that were learnt were not the rules that should have been learnt. This is a classic ML problem, that still exists in deep learning. The Fed sent a letter to banks about using not-easily-interpretable ML to judge loan applications for this reason. For an easier to see example, autonomous driving is a case of painfully ironing out edge cases one after the other, because the patterns the models learnt weren’t sufficiently representative of our world. Humans learn to drive with about 50 hours of instruction. Waymo by 2019 had already run 10 billion simulated miles and 20m real miles, and Tesla is at 6 billion real miles driven, with quite likely hundreds of billions of miles as training data.

This isn’t as hopeless as it sounds. We see with LLMs that they are remarkably similar to humans in how they think about problems, they don’t get led astray all that often. The remarkable success of next token prediction is precisely that it turned out to learn the right generative understanding.

LLMs are brilliant at identifying a “line of best fit” across millions of dimensions, and in doing so produce miracles. It’s why Ted Chiang called it a blurry jpeg of the internet a couple of years ago.

III. Prediction and causation

“With four parameters I can fit an elephant, and with five I can make him wiggle his trunk.” — John von Neumann

Eric Baum had a book published more than twenty years ago, called “What Is Thought?” Its excellent title aside, the core premise was that understanding is compression. Just like drawing a line of best fit seems to get you the right understanding in statistics, y = mx + c, so do we with all the datapoints we encounter in life.
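The compression point can be shown in a few lines. A minimal sketch, assuming a made-up noiseless rule y = 3x + 2 as the hidden generator:

```python
# 50 points generated by a hypothetical rule y = 3x + 2.
xs = [float(i) for i in range(50)]
ys = [3.0 * x + 2.0 for x in xs]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Ordinary least squares for y = m*x + c.
m = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
    (x - mean_x) ** 2 for x in xs
)
c = mean_y - m * mean_x

print(f"{n} points compressed to m={m:.2f}, c={c:.2f}")
print(m * 1000 + c)  # the two numbers also extrapolate far beyond the data
```

Here the fit recovers the generator because the data really were produced by a line. The rest of this essay is about what happens when they were not.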

The spiritual godfather of this blog, Douglas Hofstadter, thought about understanding as rooted in conceptualisation and core understanding. There was a recent New Yorker article that discussed this, and its relationship to the truly weirder aspects of high dimensional storage of facts or memory.

In a 1988 book called “Sparse Distributed Memory,” Kanerva argued that thoughts, sensations, and recollections could be represented as coördinates in high-dimensional space. The brain seemed like the perfect piece of hardware for storing such things. Every memory has a sort of address, defined by the neurons that are active when you recall it. New experiences cause new sets of neurons to fire, representing new addresses. Two addresses can be different in many ways but similar in others; one perception or memory triggers other memories nearby. The scent of hay recalls a memory of summer camp. The first three notes of Beethoven’s Fifth beget the fourth. A chess position that you’ve never seen reminds you of old games—not all of them, just the ones in the right neighborhood.

This is a rather perfect theory of LLMs.

It’s also testable. I built transformers to try and predict Elementary Cellular Automata, to see how easily they could learn the underlying rules3.
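For readers who haven’t met them, the underlying rule of an elementary cellular automaton is tiny. This is a sketch of the generator and of how (state, next state) training pairs could be produced; it is not the code from my experiments, and the grid width and wrap-around edges are assumptions:

```python
def eca_step(cells: list[int], rule: int) -> list[int]:
    """Apply one step of an elementary cellular automaton (wrap-around edges)."""
    n = len(cells)
    out = []
    for i in range(n):
        left, centre, right = cells[i - 1], cells[i], cells[(i + 1) % n]
        neighbourhood = (left << 2) | (centre << 1) | right  # a value from 0 to 7
        out.append((rule >> neighbourhood) & 1)  # look up that bit of the rule number
    return out


# Rule 110, starting from a single live cell.
state = [0] * 15 + [1]
pairs = []
for _ in range(5):
    nxt = eca_step(state, 110)
    pairs.append((state, nxt))
    print("".join("#" if c else "." for c in state))
    state = nxt
```

The whole “law” is the eight bits of the rule number. The question is whether a model shown only `pairs` recovers those eight bits or something that merely agrees with them on the examples it saw.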

I also tried creating various combinations of wave functions (3-4 equations and combining them) and seeing if the simple transformer models can learn those, and understand the underlying rules. These are combinations of simple equations, like a basic wave function with a few transformations. And yet:

There have been other similar attempts. This paper, “What Has a Foundation Model Found?”, in particular was fascinating because it used a similar method to see if you could predict orbits of planets based only on observational data. And the models managed to do it, except they all approximated instead of learning the fundamental underlying generative path4.

This manifold question - “which diverse pattern sets collapse to unique generators” - is probably intractable without solving the frame problem. After all, if we could characterise those manifolds, we’d have a theory of induction, which is to say we’d have solved philosophy.

Maybe if we got them to think through why they were predicting the things they were predicting as they were getting trained, they could get better at figuring out the underlying rules. It does add a significant lag to their training, but essential nonetheless. Right now we seem stuck with Ptolemaic astronomy, scholastically adding epicycles upon epicycles, without making the leap to hit the inverse-square law. Made undeniably harder because there isn’t just one law to discover, but legion.

IV. Can reasoning escape the pattern trap?

“The aim is not to predict the next data point, but to infer the rule that generates all of them.” — Michael Schmidt

One solution to this problem is reasoning. If you’ve learnt a wrong pattern, you can reason your way to the right one, using the ideas at your disposal. It doesn’t matter if you’re wrong, as long as you can course correct.

Since LLMs are trained to predict the patterns that exist inside a large corpus of data, in doing so they do end up learning some of the ways in which you could create those patterns (i.e., thinking), even if not necessarily the right or the only way in which we see them getting created. So a large part of the effort we put in is to teach them the right ways.

Now we have given models a way to think for themselves. It started as soon as we had chatbots and could get them to “think step by step”. We get to do that across many different lines of thought, reflect back on what they found, and fix things along the way. This is, despite the anthropomorphisation, reasoning. If every rollout is in some sense a function, reasoning is a form of search over those latent programs, with external tools, including memory. Reasoning this way even gets us negative examples and better data, helping loosen the constraints of Gold’s theorem.

It’s also true that now that they can reason, we do see them groping their way towards what absolutely looks like actual understanding. This can also often look like a model using the enormous corpus of existing patterns that it knows and trying to first-principles-race its way towards the right steps to take to get to the answer.

A useful training method is to teach the model to ask itself to come up with those principles and then to apply them, to learn from them, because doing so gets it much closer to the truth. In mid-training, once the model has some capabilities, this becomes possible. And more so once they have tools at their disposal, like being able to write Python and look up information5.

Because we are still pushing the induction problem up one level. It is now a game of how much can it learn about how to think things through. Whether the patterns of how to learn are also learnable from the data, both real and synthetic, to reach the right answer. Or the patterns to learn how to learn6.

And it is guided by the very same process that caused so much trouble in learning Conway’s Game of Life.

It still falls prey to the same lack of insight or inspiration or even step by step thinking that shows up in these failure modes. Same as before when we were trying to see why LLMs couldn’t do Conway’s Game of Life, this still remains the key issue7.
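For reference, the entire generator in question fits in a few lines. A minimal sketch on an unbounded grid (my own illustration, not the code from the earlier experiments):

```python
from collections import Counter


def life_step(live: set[tuple[int, int]]) -> set[tuple[int, int]]:
    """One generation of Conway's Game of Life on an unbounded grid."""
    # Count, for every cell next to a live cell, how many live neighbours it has.
    neighbour_counts = Counter(
        (x + dx, y + dy)
        for x, y in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # Birth on exactly 3 neighbours; survival on 2 or 3.
    return {
        cell
        for cell, count in neighbour_counts.items()
        if count == 3 or (count == 2 and cell in live)
    }


blinker = {(0, 1), (1, 1), (2, 1)}
print(life_step(blinker))
print(life_step(life_step(blinker)) == blinker)
```

Two conditions on a neighbour count are the whole rule, which is what makes it such a clean test of whether a model has found the generator or only a good approximation of its outputs.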

This, to be clear, does seem odd. And it is a major crux of the fight between people who ask whether these AI systems are “even thinking” and those who think this method is “clearly thinking” and that scaling it up will get us to AGI. Because a priori it is very difficult to see why this process would not work. If you are able to reason through something then surely you will be able to get to the right answer one way or the other.

The reason our intuition screws up here is that we think of reasoning the way we do it as different from the model’s. Rightly so. The number of different lines of thought it can simultaneously explore is just not that high.

The best visible example is agents that do computer use. If you just look at the number of explicit steps one needs to take to click a button, you see how quickly things could degenerate and how much effort is required to make them not!

At the same time, when you train them with live harnesses and the ability to access the internet, on the types of problems where you are able to provide reward rubrics that are actually meaningful, suddenly the patterns they identify become more similar to the lessons we would want them to learn.

V. Consciousness

An aside, but considering the topic I couldn’t resist. The constant use, including in this essay, of words like “reasoning” or “consciousness” or “thinking” or even “trying to answer” is a way in which we delude ourselves a little bit about what the models are actually doing. Semantic fights are the dumbest fights but we, just like the functionalists, look at what the model gets right and how it does and are happy to use the same terms we use for each other. But what they get wrong is where the interesting failures are.

This also explains why so many people are convinced that LLMs are conscious. Because behaviourally speaking its outrage does not seem different from that of one of us, another conscious entity. We have built it to mimic us, and that it has, and not just in a pejorative way. A sufficient degree of change in scale of pattern prediction is equivalent to a change in scope!

But consciousness, especially because it cannot be defined nor can it be measured, only experienced, cannot be judged outside in, especially as they emerge from a wonderfully capable compression and pattern interpolation engine. Miraculous though it seems, the miracle is that of scale! We simply do not know what a human being who has read a billion books looks like, if it is even feasible, so an immortal who has read a billion books feels about as smart as a human who has read a few dozen.

There can’t of course be proof that an LLM is not conscious. Their inner workings are inscrutable, because they themselves are not able to distill the patterns they’ve learnt and tell them to you. We could teach them that! But as yet they can’t.

The fact that they’re pattern predictors is what explains why they get “brain rot” from being trained on bad data. Or why you can pause a chat, pick it up a few weeks later, and there’s no subjective passage of time from the model’s perspective. They literally can only see what you tell them, and cannot choose to ignore the bad training data, something we do much better (look at how many functional adults are on twitter all day).

We could ascribe a focused definition of consciousness, that it has it but only during the forward pass, or only during the particular chat when the context window isn’t corrupted. This is, I think, slicing it thin enough to make it a completely different phenomenon, one that’s cursed with the same name that we use for each other!

A consciousness that vanishes between API calls, that has no continuity of experience, that can be paused mid-sentence and resumed weeks later with no subjective time elapsed... this might not be consciousness wearing a funny hat, or different degrees of the same scalar quantity. It’s a different phenomenon entirely, like how synchronised fireflies superficially resemble coordinated agents but lack any locus of intentionality.

Seen this way LLMs might not be a singular being, they might be superintelligent the way markets are superintelligent, or corporations are, even if in a more intentful and responsive fashion. Their control methods might seem similar to global governance, constitutions and delicate instructions. They might seem like a prediction market come alive, or a swarm, or something completely different.

VI. The thesis

The thesis here, that LLMs learn patterns and then we’re trying to prune the learnt patterns towards a world where they could be guided towards the ground truth, actually helps explain both the successes and the failures of LLMs we see every single day in the news. Such as:

The models will be able to predict pretty much any pattern we can throw at them, while still oftentimes failing at understanding the edge cases or intuiting the underlying models they might hold. Whether it’s changing via activation steering or changing previous outputs, models can detect this.
Powerful pattern predictors will naturally detect “funky” inputs. Eval awareness is expected. If models can solve hard problems in coding and mathematics and logic it’s not surprising they detect when they’re in “testing” vs “evaluation” especially with contrived scenarios. Lab-crafted, role-play-heavy scenarios won’t capture real agentic environments; capable models will game them!
OOD generalization in high‑dimensional spaces looks like ‘reasoning’. It even acts like it, enough so that for most purposes it *is* reasoning. Most cases of reasoning are also patterns, some are even meta patterns.
Resistance to steering is also logical if there is conflicting information being fed in, since models are incredibly good at anomaly detection. Steering alters the predicted token distribution. A reasoning model can detect the off‑manifold drift and correct. Models are trained to solve given problems, and if you confuse them it makes sense they would try non-obvious solutions, including reward hacking.
Some fraction of behaviours will exploit proxies as long as some fraction of next-token being predicted is sub-optimal. Scale exposes low‑probability tokens and weird modes.

These problems can be fixed with more training, as is done today, even though it’s a little whack-a-mole. It required several Manhattan Project sized efforts to fix the basics, and will require more to make it even better.

How many patterns does it need to learn to understand the underlying rules of human existence? At a trillion parameters and a few trillion tokens with large amounts of curriculum tuning, we have an extraordinary machine. Do we need to scale this up 10x? 100x?

often asks in interviews, “explain, in as few dimensions as possible, the reasons behind [X]”. This is what understanding is. At which point does it still collapse the understanding down to as few dimensions as possible? Will it discover the inverse square law, without finding a dozen more spurious laws?

We will quite likely see models imbued with the best of the reasoning that we know, with the ability to learn and think independently. Do almost anything. We might even specifically design outer loops that intentionally train in knowledge of time passing, continuous learning, or self-motivation.

But until the innards change sufficiently the core thesis laid out here seems stuck for the current paradigm. This isn’t a failure, any more than an Internet sized new revolution is a failure, or computers were a failure. We live in the world Baum foresaw.

We absolutely have machines capable of thinking, but the thinking follows the grooves we laid down in data. Just like us, they are products of their evolution.

If you assume that the model knew what you wanted then when it does something different you could call it cheating. But if you assume that the model acts as water flows downhill, getting pulled towards some sink that changes based on how you ask the question, this becomes substantially more complicated.

(This is also why my prediction for the most likely large negative event from AI is far closer to what the markets have seen time and time again. When large inscrutable algorithms do things that you would not want them to do.)

Equally useful is what this tells us about what is required for alignment. Successful alignment will end up being far closer to how we align other super intelligences that surround us, like the economy or the stock market. With large numbers of rules, strict supervision, regular data collection, and the understanding that it will not be perfect but we will co-evolve with it.

AI systems, including LLMs, do sometimes discover generators when we provide them with enough slices of the world that the “line of best fit” becomes parsimonious, but it’s neither the easy nor the natural outcome we most often see. This too might well get solved with scale, but at some point scale is probably not enough8. We will have machines capable of doing any job that humans can do but not adaptable enough to do any job that humans think up to do. Not yet.

A summary, TLDR

LLMs today primarily learn patterns from the data they are trained on
Learning such patterns makes them remarkably useful, more so than anyone would’ve thought before
Learning such patterns as yet still causes many “silly mistakes” because they don’t often learn the underlying generators
With sufficient amounts of data they do learn underlying principles for some things but it’s not a robust enough process
Reasoning helps here, because they learn to reason like us, but this still has the same problem that the reasoning patterns they learn do not have the same underlying generator
As we push more data/ info/ patterns into the models they will get smarter about what we want them to do, they are indeed intelligent, even though the type of intelligence is closer to a market intelligence than an individual being (speculative)

I also wonder if a way to say this is that I was attempting statistical learning where algebraic reasoning would serve better, the inverse of what Kahneman’s heuristics-and-biases program showed humans doing.

Kolmogorov–Chaitin complexity formalises this point: for every finite pattern there is an infinite “tail” of longer, redundant recipes that still reproduce it.

Claude comments “This is like expecting someone to derive the Navier-Stokes equations from watching turbulent flow—possible in principle, nightmarishly difficult in practice.” But then goes on to agree “The cellular automata experiments are devastating evidence, and you’re right that failure modes reveal more than successes. This echoes Lakatos’s methodology of research programs: theories are defined by their “negative heuristic”—what they forbid—not just what they predict.”

GPT and Kimi agreed but with a caveat: “The orbital-mechanics example (predict next position vs learn F=GMm/r²) is lovely, but the cited paper does not show the network could not represent the law—only that it did not when trained with vanilla next-token loss.”

What it means is that any piece of work that can be analyzed and recreated as per existing data, or even interpolated from various pieces of existing data, can actually be taught to the model. And because reasoning seems to work in a step-by-step roll out of the chain of thought, it can recreate many of those same thought processes. Doing this with superhuman ability in terms of identifying all the billions of patterns in the trillions of tokens that the model has seen is of course incredibly powerful.

This is also why there are so many arguments in favor of adding memory, so that during reasoning you don’t need to do everything from first principles, or skills, so that you don’t have to develop them every time from first principles. Basically these are ways to provide the model with the right context at the right time so that its reasoning can find the right path. Choosing the right context and choosing the right time are both highly fragile activities, because doing them correctly presupposes the exact knowledge patterns that we were talking about earlier for the naive next token prediction.

Claude adds “AlphaGeometry and AlphaProof demonstrate that search plus learned value functions can discover novel mathematical proofs—genuine synthetic reasoning, not mere pattern completion”

Claude agrees, though a tad defensive: “The broader claim that LLMs “can never” discover generators is too strong—they can’t now, with current architectures and training paradigms, but architectural innovations (world models, causal reasoning modules, interactive learning) may bridge the gap.”
