
LessWrong

An online forum and community dedicated to improving human reasoning and decision-making.

RSS preview of the LessWrong blog

p-values are good actually

2026-02-05 06:04:49

Published on February 4, 2026 10:04 PM GMT

It is fashionable, on LessWrong and also everywhere else, to advocate for a transition away from p-values. p-values have many known issues. p-hacking is possible and difficult to prevent, testing one hypothesis at a time cannot even in principle be correct, et cetera. I should mention here, because I will not mention it again, that these critiques are correct and very important - people are not wrong to notice these problems and I don't intend to dismiss them. Furthermore, it’s true that a perfect reasoner is a Bayesian reasoner, so why would we ever use an evaluative approach in science that can’t be extended into an ideal reasoning pattern?

Consider the following scenario: the Bad Chemicals Company sells a product called Dangerous Pesticide, which contains compounds which have recently been discovered to cause chronic halitosis. Alice and Bob want to know whether BCC knew about the dangers their product poses in advance of this public revelation. As a result of a lawsuit, several internal documents from BCC have been made public.

Alice thinks there’s a 30% chance that BCC knew about the halitosis problem in advance, whereas Bob thinks there’s a 90% chance. Both Alice and Bob agree that, if BCC didn’t know, there’s only a 5% chance that they would have produced internal research documents looking into potential causes of chronic halitosis in conjunction with Dangerous Pesticide. Now all Alice and Bob have to do is agree on the probability of such documents existing if BCC did know in advance, and they can do a Bayesian update! They won’t end up with identical posteriors, but if they agree about all of the relevant probabilities, they will necessarily agree more after collecting evidence than they did before.
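For concreteness, here is what that update would look like if they also settled on Alice's figure of 95% for the probability of such documents existing given that BCC knew (all other numbers are from the scenario above; this shared likelihood is exactly the thing they have not yet agreed on):

$$P(\text{knew} \mid \text{docs}) = \frac{P(\text{docs} \mid \text{knew})\,P(\text{knew})}{P(\text{docs} \mid \text{knew})\,P(\text{knew}) + P(\text{docs} \mid \text{didn't know})\,P(\text{didn't know})}$$

Alice would move from 30% to $\frac{0.95 \times 0.30}{0.95 \times 0.30 + 0.05 \times 0.70} \approx 89\%$, and Bob from 90% to $\frac{0.95 \times 0.90}{0.95 \times 0.90 + 0.05 \times 0.10} \approx 99\%$: their gap shrinks from 60 points to about 10.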

But they can’t agree on how to update. Alice thinks that, if BCC knew, there’s a 95% chance that they’ll discover related internal documents. Bob, being a devout conspiracy theorist, thinks the chance is only 2% - if they knew about the problem in advance, then of course they would have been tipped off about the investigation in advance, they have spies everywhere and they’re not that sloppy, and why wouldn't the government just classify the smoking gun documents to keep the public in the dark anyway? They're already doing that about aliens, after all!

Alice thinks this is a bit ridiculous, but she knows the relevant agreement theorems, and Bob is at least giving probabilities and sticking to them, so she persists and subdivides the hypothesis space. She thinks there’s a 30% chance that BCC knew in advance, but only a 10% chance that they were tipped off. Bob thinks there’s a 90% chance they knew in advance, and an 85% chance they were tipped off. If they knew but were not tipped off, Alice and Bob manage to agree that there’s a 96% chance of discovering the relevant internal documents.

Now they just have to agree on the probability of discovering the related internal documents if there’s a conspiracy. But again, they fail to agree. You see, Bob explains, it all depends on whether the Rothschilds are involved - the Rothschilds are of course themselves vexed with chronic halitosis, which explains why they were so involved in the invention of the breathmint, and so if there were a secret coverup about the causes of halitosis, then of course the Rothschilds would have caught wind of this through their own secret information networks and intervened, and that’s not even getting into the relevant multi-faction dynamics! At this point Alice leaves and conducts her investigation privately, deciding that reaching agreement with Bob is more trouble than it’s worth.

My point is: we can guarantee reasonable updates when Bayesian reasoners agree on how to update on every hypothesis, but it’s extremely hard to come to such an agreement, and even reasoners who agree about the probability of some hypothesis X can disagree about the probability distribution “underneath” X, such that they disagree wildly about P(E|X). In practice we don’t exhaustively enumerate every sub-hypothesis; instead we make assumptions about causal mechanisms and so feel justified in saying that this sort of enumeration is not necessary. If we want to determine the gravitational constant, for example, it’s helpful to assume that the speed at which a marble falls does not meaningfully depend on its color.

And yet how can we do this? In the real world we rarely care about reaching rational agreement with Bob, and indeed we often have good reasons to suspect that this is impossible. But we do care about, for example, reaching rational agreement with those who believe that dark matter is merely a measurement gap, or with those who believe that AI cannot meaningfully progress beyond human intelligence with current paradigms. Disagreement about how to assign the probability mass underneath a hypothesis is the typical case. How could we reasonably come to agreement in a Bayesian framework when we cannot even in principle enumerate the relevant hypotheses, when we suspect that the correct explanation is not known to anybody at all?

Here’s one idea: enumerate, in exhaustive detail, just one hypothesis. Agree about one way the world could be - we don’t need to decide whether the Rothschilds have bad breath, let’s just live for a moment in the simple world where they aren't involved. Agree on the probability of seeing certain types of evidence if the world is exactly that way. If we cannot agree, identify the source of the disagreement and introduce more specificity. Design a repeatable experiment which, if our single hypothesis is wrong, might give different-from-expected results, and repeat that experiment until we get results that could not plausibly be explained by our preferred hypothesis. With enough repetition, even agents who have wildly different probability distributions on the complement should be able to agree that the one distinguished hypothesis is probably wrong. A one-in-a-hundred coincidence might still be the best explanation for a given result, but a one-in-a-hundred-trillion coincidence basically never is.
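To make that last step concrete, here is a toy version of the procedure (my own example with a fair-coin null, not anything from the pesticide story): fully specify one hypothesis, compute how surprising the observed results are under that hypothesis alone, and note that no agreement about the space of alternatives is required.

```python
# A minimal sketch: the "one distinguished hypothesis" is a fair coin.
# We ask how probable a result at least this extreme is under that
# hypothesis alone -- a one-sided p-value -- without enumerating or
# agreeing on any alternative hypotheses.
from math import comb

def p_value_at_least(k_heads: int, n_flips: int, p: float = 0.5) -> float:
    """P(X >= k_heads) for X ~ Binomial(n_flips, p): the one-sided p-value."""
    return sum(comb(n_flips, k) * p**k * (1 - p) ** (n_flips - k)
               for k in range(k_heads, n_flips + 1))

# Few repetitions: a fair coin is still a plausible explanation.
print(p_value_at_least(7, 10))      # ~0.17, roughly a one-in-six coincidence
# Many repetitions: the required coincidence becomes absurd.
print(p_value_at_least(700, 1000))  # ~1e-37, "could not plausibly be explained"
```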

Not always, not only, but when you want your results to be legible and relevant to people with wildly different beliefs about the hypothesis space, you should at some point conduct a procedure along these lines.

That is to say, in the typical course of scientific discovery, you should compute a p-value.



Discuss

Chess bots do not have goals

2026-02-05 05:11:11

Published on February 4, 2026 9:11 PM GMT

I see the opposite claim made in The Problem, and see it implied alongside most mentions of AlphaGo. I also see some people who might agree with me, e.g. here, or here, but they don't get convincing responses.

It's an odd thing to claim a chess bot is "trying to win", given that, after training, the bot receives no reward for winning, and no feedback for losing. It doesn't even know that the sequence of boards it is given is from the same game. It does not react to the opponent making illegal moves, either by insisting that it won, or by making illegal moves of its own. It does not try to frustrate human opponents or bait weaker opponents into making mistakes. It does not seek out more games in order to win more games. It is entirely incapable of considering any such actions, or other actions you'd expect if it were "trying to win", regardless of how deep its networks are and how long it has been trained, because the training environment did not reward them.

It is certainly true that in the narrow domain of valid chess moves, the bot does optimize "winning" or some proxy of it. But once the bot enters the domain of the real world, the utility function is extended and the description "trying to win" no longer needs to apply, nor does any other simple description of a goal. There are many utility functions that look like "trying to win" when restricted to valid chess moves, and only a narrow subset of those look like "trying to win" in the real world. There is no mechanism for training to produce functions that extend like that. In fact, any neurons spent on considering real-world context are not considering valid chess moves and therefore a waste of compute.

People seem to believe that the bot trained to "win" in a narrow domain will extend to a bot that "tries to win" in the real world, but I have seen no such argument, certainly nothing justifying the high confidence needed for high p(doom). You're very welcome to point me to arguments I may have missed.



Discuss

Post-AGI Economics As If Nothing Ever Happens

2026-02-05 01:39:39

Published on February 4, 2026 5:39 PM GMT

When economists think and write about the post-AGI world, they often rely on the implicit assumption that parameters may change, but fundamentally, structurally, not much happens. And if it does, it’s maybe one or two empirical facts, but nothing too fundamental. 

This mostly worked for all sorts of other technologies, where technologists predicted society would be radically transformed, e.g. by everyone having most of humanity’s knowledge available for free all the time, or everyone being able to instantly communicate with almost anyone else. [1]

But it will not work for AGI, and as a result, most of the econ modelling of the post-AGI world is irrelevant or actively misleading [2], making people who rely on it more confused than if they just thought “this is hard to think about so I don’t know”.

Econ reasoning from high level perspective

Econ reasoning tries to do something like project extremely high-dimensional reality onto roughly 10 real numbers and a few differential equations. All the hard cognitive work is in the projection. Solving a bunch of differential equations impresses the general audience, and historically may have worked as some sort of proof of intelligence, but is relatively trivial.
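A canonical example of such a projection (my example, not one named in the post): the textbook Solow growth model compresses an entire economy into a single state variable, capital per worker $k$, governed by one differential equation and a handful of real numbers:

$$\dot{k} = s\,k^{\alpha} - (n + \delta)\,k$$

where $s$ is the savings rate, $\alpha$ the capital share, $n$ population growth and $\delta$ depreciation. Solving this is easy; deciding that these are the numbers that matter is where the hard cognitive work hides.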

How the projection works is usually specified by some combination of assumptions, models and concepts used, where the concepts themselves usually imply many assumptions and simplifications.

In the best case of economic reasoning, the projections capture something important, and the math leads us to some new insights.[3] In cases which are in my view quite common, the non-mathematical, often intuitive reasoning of the economist leads to some interesting insight, and then the formalisation, assumptions and models are selected so that the math leads to the same conclusions. The resulting epistemic situation may be somewhat tricky: the conclusions may be true, the assumptions sensible, but the math is less relevant than it seems - given the extremely large space of economic models, had the economist had different intuitions, they would have been able to find different math leading to different conclusions.

Unfortunately, there are many other ways the economist can reason. For example, they can be driven to reach some counter-intuitive conclusion, incentivized by the academic drive for novelty. Or they may want to use some piece of math they like.[4] Or they can have intuitive policy opinions, and the model could be selected so it supports some policy direction - this process is usually implicit and subconscious.

The bottom line: if we are interested in claims and predictions about reality, the main part of an economic paper is its assumptions and the concepts it uses. The math is usually right. [5]

Econ reasoning applied to post-AGI situations

The basic problem with applying standard economic reasoning to post-AGI situations is that sufficiently advanced AI may violate many assumptions which make perfect sense in a human economy but may not generalize. Often the assumptions are so basic that they are implicit, assumed in most econ papers, and out of sight in the usual “examining the assumptions”. Advanced AI may also break some of the intuitions about how the world works, breaking the intuitive process upstream of formal arguments.

What complicates the matter is that these assumptions often interact with considerations and disciplines outside the core of economic discourse, and are better understood and examined using frameworks from other disciplines.

To give two examples:

AI consumers

Consumption has so far been driven by human decisions and utility. Standard economic models ultimately ground value in human preferences and utility. Humans consume, humans experience satisfaction, and the whole apparatus of welfare economics and policy evaluation flows from this. Firms are modeled as profit-maximizing, but profit is instrumental—it flows to human owners and workers who then consume.

If AIs own capital and have preferences or goals of their own, this assumption breaks down. If such AIs spend resources, this should likely count as consumption in the economic sense.

Preferences

The usual assumption in most econ thinking is that humans have preferences which are somewhat stable, somewhat self-interested, and whose content is a question mostly outside of economics. [6] There are whole successful branches of economics studying to what extent human preferences deviate from VNM rationality or human decision making suffers from cognitive limitations, or how preferences form, but these are not at the center of attention of mainstream macroeconomics. [7] In the human case the qualitative predictions are often similar, so the topic is not so important.

When analyzing the current world, we find that human preferences come from diverse sources, like biological needs, learned tastes, and culture. A large component seems to be ultimately selected for by cultural evolution.

Post-AGI, the standard econ assumptions may fail, or need to be substantially modified. Why?

One consideration is that the difference in cognitive abilities between AGIs and humans may make human preferences easy for AGIs to change. As an intuition pump: consider a system composed of a five year old child and her parents. The child obviously has some preferences, but the parents can usually change these. Sometimes by coercion or manipulation, but often just by pointing out consequences, extrapolating the child’s wants, or exposing her to novel situations.

Preferences are also relative to a world model: the standard econ way of modelling differences in world models is “information asymmetries”. The kid does not have as good an understanding of the world, and would easily be exploited by adults.

Because children’s preferences are not as stable and self-interested as adults’, and kids suffer from information asymmetries, they are partially protected by law: the result is a patchwork of regulation where, for example, it is legal to try to modify children’s food preferences, but adults are prohibited from trying to change a child’s sexual preferences for their own advantage.

Another “so obvious it is easy to overlook” effect is children’s dependence on their parents’ culture: if the parents are Christians, it is quite likely their five year old kid will believe in God. If the parents are patriots, the kid will also likely have some positive ideas about their country. [8]

When interacting with cognitive systems way more capable than us, we may find ourselves in a situation somewhat similar to that of kids: our preferences may be easily influenced, and not particularly self-interested. The ideologies we adopt may be driven by non-human systems. Our world models may be weak, resulting in massive information asymmetries.

There is even a strand of economic literature that explicitly models parent-child interactions, families and the formation of preferences. [9] This body of work may provide useful insights - is anyone looking there? I’d be curious.

The solution may be analogous: some form of paternalism, where human minds are massively protected by law from some types of interference. This may or may not work, but once it is the case, you basically cannot start from classical liberal and libertarian assumptions. As an intuition pump, imagine someone trying to do “the macroeconomics of ten year olds and younger” in the current world.

Other core concepts

We could examine some other typical econ assumptions and concepts in a similar way, and each would deserve a paper-length treatment. This post tries to stay mostly at the meta level, so here are just some pointers.

Property rights. Most economic models take property rights as exogenous - “assume well-defined and enforced property rights.” If you look into how most property rights are actually connected to physical reality, a property right often means that some row exists in a database run by the state or a corporation. Enforcement ultimately rests on the state’s monopoly on violence, its monitoring capacity and its will to act as an independent enforcer. As all sorts of totalitarian, communist, colonial or despotic regimes illustrate, even in purely human systems, private property depends on power. If you assume property is stable, you are assuming things about governance and power.
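As an illustration of how thin that “exogenous” layer is (my sketch, not the post’s), this is roughly what a property right amounts to in practice: an entry in a registry that the state runs and chooses to enforce.

```python
# Purely illustrative, with made-up names and fields: "ownership" as a row
# in a state-run registry. The data structure is trivial; the enforcement
# behind it is doing all the work.
land_registry = {
    "parcel_0042": {
        "owner": "Alice Example",
        "recorded": "2026-02-04",
        "encumbrances": [],
    },
}
```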

Transaction costs and firm boundaries. Coase’s theory [10] explains why firms exist: it is sometimes cheaper to coordinate internally via hierarchy than externally via markets. The boundary of the firm sits where transaction costs of market exchange equal the costs of internal coordination. AI may radically reduce both—making market transactions nearly frictionless while also making large-scale coordination easy. The equilibrium size and structure of firms could shift in unpredictable directions, or the concept of a “firm” might become less coherent.

Discrete agents and competition. Market models assume distinct agents that cooperate and compete with each other. Market and competition models usually presuppose you can count the players. AGI systems can potentially be copied, forked, merged, or run as many instances, and what their natural boundaries are is an open problem.

Capital vs. Labour. Basic 101 economic models typically include capital and labour as core concepts: factors in a production function, Total Factor Productivity, Cobb-Douglas, etc. Capital is produced, owned, accumulated, traded, and earns returns for its owners. Labour is what humans do, and cannot be owned. This makes a lot of sense in modern economies, where there is a mostly clear distinction between “things” and “people”. It is more ambiguous if you look back in time - in slave economies, do slaves count as labour or capital? It is also a bit more nuanced - for example with “human capital”.
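For reference, the standard Cobb-Douglas form those 101 models use (textbook notation, nothing specific to this post):

$$Y = A\,K^{\alpha}L^{1-\alpha}, \qquad 0 < \alpha < 1$$

with output $Y$, total factor productivity $A$, capital $K$ and labour $L$. The post-AGI ambiguity discussed below is which of $K$, $L$ or $A$, if any, an AGI should be counted in.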

When analyzing the current world, there are multiple reasons why the “things” and “people” distinction makes sense. “Things” are often tools. These amplify human effort, but are not agents. A tractor makes a farmer more productive, but does not make many decisions. Farmers can learn new tasks; tractors cannot. Another distinction is that humans are somewhat fixed: you cannot easily and quickly increase or decrease their numbers.

Post-AGI, this separation may stop making sense. AIs may reproduce like capital, be agents like labour, learn fast, and produce innovation like humans. Humans may own them like normal capital, or more like slaves, or maybe AIs will be self-owned.

Better and worse ways to reason about post-AGI situations

There are two epistemically sound ways to deal with problems in generalizing economic assumptions: broaden the view, or narrow the view. There are also many epistemically problematic moves people make.

Broadening the view means we try to incorporate all crucial considerations. If assumptions about private property lead us to think about post-AGI governance, we follow. If thinking about governance leads to the need to think about violence and military technology, we follow. In the best case, we think about everything in terms of probability distributions, and more or less likely effects. This is hard, interdisciplinary, and necessary, if we are interested in forecasts or policy recommendations.

Narrowing the view means focusing on some local domain, trying to make a locally valid model and clearly marking all the assumptions. This is often locally useful, may build intuitions for some dynamic, and is fine as long as a lot of effort is spent on delineating where the model may apply and where it clearly does not.

What may be memetically successful and can get a lot of attention, but is overall bad, is doing the second kind of analysis and presenting it as the first type. A crucial consideration is a consideration which can flip the result. If an analysis ignores or assumes away ten of these, the results have basically no practical relevance: imagine that for each crucial consideration, there is a 60% chance the modal view is right and a 40% chance it is not. Assume or imply the modal view is right 10 times, and your analysis holds in about 0.6% of worlds (0.6^10 ≈ 0.006).

In practice, this is usually not done explicitly - almost no one claims their analysis considers all important factors - but as a form of motte-and-bailey fallacy. The motte is the math in the paper - it follows from the assumptions, and there are many of those. The bailey is the broad-stroke arguments, blogpost summaries, tweets and short-hand references, spreading much further, without the hedging.

In the worst cases, the various assumptions made are contradictory or at least anticorrelated. For example: some economists assume comparative advantage generally preserves the relevance of human labour, and that AIs are just a form of capital which can be bought and replicated. However, comparative advantage depends on opportunity costs: if you do X, you cannot do Y at the same time. The implicit assumption is that you cannot just boot up a copy of yourself. If you can, the “opportunity cost” is not something like the cost of your labour, but the cost of booting up another copy. If you assume future AGIs are as efficient a substitute for human labour as current AIs are for moderately boring copywriting, the basic “comparative advantage” model is consistent with the price of labour dropping 10,000x below minimum wage. While the comparative advantage model is still literally true, it does not have the same practical implications. Also, while in the human case the comparative advantage model is usually not destroyed by frictions, if your labour is of sufficiently low value, the effective price of human labour can be 0. For a human example, five year olds, or people with severe mental disabilities who are unable to read, are not actually employable in the modern economy. In the post-AGI economy, it is easy to predict frictions like humans not operating at machine speeds or not understanding directly communicated neural representations.
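A minimal numeric sketch of that point (my numbers, purely illustrative): suppose an AI copy produces 10 units of either good per hour at a compute cost of 10 cents, while a human produces 1 unit of X or 2 units of Y per hour. Comparative advantage still assigns the human to Y, but free entry of cheap copies pins the price of Y at the AI’s marginal cost:

$$p_Y \le \frac{0.10}{10} = 0.01 \ \text{dollars}, \qquad w_{\text{human}} \le 2 \times 0.01 = 0.02 \ \text{dollars per hour}.$$

The theorem is never violated; it just stops implying a livable wage.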

What to do

To return to the opening metaphor: economic reasoning projects high-dimensional reality into a low-dimensional model. The hard work is choosing the projection. Post-AGI, we face a situation where the reality we are projecting may be different enough that projections calibrated on human economies systematically fail. The solution is usually to step back and bring more variables into the model. Sometimes this involves venturing outside of the core of econ thinking, and bringing in political economy, evolution, computational complexity or even physics and philosophy. Or maybe just look at other parts of economic thinking, which may be unexpectedly relevant. This essay is not a literature review. I’m not claiming that no economist has ever thought about these issues, just that the most common approach is wrong.

On a bit of a personal note: I would love it if there were more than 5-10 economists working on the post-AGI questions seriously, and engaging with the debate seriously. If you are an economist… I do understand that you are used to interacting with the often ignorant public, worried about jobs and not familiar with all the standard arguments and effects like Baumol, Jevons, the lump of labour fallacy, gains from trade, etc. Fair enough, but the critique here is different: you’re assuming answers to questions you haven’t asked. If you are modelling the future using econ tools, I would like to know your answers/assumptions about “are AIs agents?”, “how are you modelling AI consumption?”, “in your model, do AIs own capital?” or “what is the system of governance compatible with the economic system you are picturing?”

Thanks to Marek Hudík, Duncan Mcclements and David Duvenaud for helpful comments on a draft version of this text. Mistakes and views are my own. Also thanks to Claude Opus 4.5 for extensive help with the text.

 

  1. ^

    Gordon, Robert J. The Rise and Fall of American Growth.

  2. ^

    Examples of what I'm criticizing range from texts by Nobel laureates - e.g. Daron Acemoglu's The Simple Macroeconomics of AI (2024) - to posts by rising stars of thinking about the post-AGI economy, like Philip Trammell's Capital in the 22nd Century.

  3. ^

    Sane economists are perfectly aware of the nature of the discipline. For a longer discussion: Rodrik, Dani. Economics Rules: The Rights and Wrongs of the Dismal Science. W.W. Norton, 2015.

  4. ^

    Also this is not a novel criticism: Romer, Paul. “The Trouble with Macroeconomics.” The American Economist 61, no. 1 (2016): 31-44.

  5. ^

    “So, math plays a purely instrumental role in economic models. In principle, models do not require math, and it is not the math that makes the models useful or scientific.” Rodrik (2015)

  6. ^

    Classic text by Robbins (1932) defines preferences as out of scope “Economics is the science which studies human behavior as a relationship between given ends and scarce means which have alternative uses.” Another classical text on the topic is Stigler & Becker (1977) “De Gustibus Non Est Disputandum.” As with almost any claim in this text: yes, there are parts of econ literature about preference formation, but these usually do not influence the post-AGI macroeconomy papers.

  7. ^

    De Grauwe, Paul, and Yuemei Ji. “Behavioural Economics is Also Useful in Macroeconomics.” VoxEU, January 2018.
    Driscoll, John C., and Steinar Holden. “Behavioral Economics and Macroeconomic Models.” Journal of Macroeconomics 41 (2014): 133-147.

  8. ^

    Bisin, Alberto, and Thierry Verdier. “The Economics of Cultural Transmission and the Dynamics of Preferences.”

  9. ^

    Becker, Gary S. A Treatise on the Family.

  10. ^

    Coase, Ronald H. “The Nature of the Firm.” Economica 4, no. 16 (1937): 386-405.



Discuss

Vibestemics

2026-02-05 00:40:13

Published on February 4, 2026 4:40 PM GMT

A few months ago I coined the word “vibestemics”, mostly for myself, in a tweet. At that point, the word was more vibes than ‘stemics. I used it with some friends at a party. They loved it. Since then, nothing.

But I think the word has legs. I just have to figure out what it actually means!

On the surface, it’s obvious. It’s the combination of “vibes” and “epistemics”, so more or less naming the core idea of the post/meta-rationalist project. But again, what does it actually mean? It’s easy to point at a large body of work and say “I don’t know, whatever the thing going on over there is”, but much harder to say what the thing actually is.

So to start, let’s talk about epistemics. What is it? I see people using the word two ways. One is to mean the way we know things in general. The other is to mean the way we know things via episteme, that is knowledge that’s reasoned from evidence, as opposed to doxa and techne and many other ways of knowing (if those Greek words mean nothing to you, I highly recommend reading the post at the link before continuing). Unfortunately, some people equivocate between epistemics-as-knowing and epistemics-as-knowing-via-episteme to give the impression that episteme is the only good way to know anything. That, to me, is a problem.

I think it’s a problem because such equivocation discounts valuable sources of knowledge that aren’t easily made legible. Now, to be fair, there’s some reason to do this, because the pre-rationalist epistemic stance says legibility doesn’t matter and logic is just a means to justify one’s preferred ends. The rationalist stance is largely that everything that can be made legible should be, and that which cannot be made legible needs to be treated with great caution because that’s how we slip back into pre-rationality. So I understand the desire to equate epistemics with episteme (and, etymologically, the English language tries very hard to do this), but I also find it frustrating because it encourages excessive devaluing of other ways of knowing, especially metis, techne, and other forms of knowledge that are less legible.

That’s where the vibes come in. They can rescue us from an excessive focus on episteme and temper the excesses of legibility. But what are vibes and how can they help?

Vibes are the embodiment of what we care about. The stoner, for example, has stoner vibes because they care about chilling and feeling good. The Christian has Christian vibes because they want to do what Jesus would do. And the rationalist has rationalist vibes because they care about knowing the truth with high predictive accuracy. For any vibe, there is always something the person expressing it cares about deeply that causes them to have that vibe.

This matters in epistemics because knowing is contingent on care. I make this argument in detail in Fundamental Uncertainty (currently in revision ahead of publication), but the short version is that we have a mental model of the world, truth is the degree to which our mental model is accurate, we want an accurate mental model because it’s useful, and usefulness is a function of what we care about, thus truth is grounded by and contingent on care. And since vibes are the embodiment of care, vibes have an influence on the act of knowing, hence, vibestemics.

(If this argument seems handwavy to you, it is. You’ll have to read the book to get the full argument because it takes about 10k words in the middle of it to lay it all out. If you want to read the first draft for that argument, it’s in Chapter 5, 6, and 7 which start here. Alternatively, although I think “Something to Protect“ does a poor job of emphasizing the epistemic relevance of care in favor of explaining a particular way of caring, I read it as ultimately claiming something similar.)


Okay, but that’s the theoretical argument for what vibestemics is. What does it mean in practice? Let’s dive into that question by first considering a few examples of different epistemic vibes.

Woo: The epistemic vibe of woo is that whatever’s intuitive is true. Woo is grounded in gnosis and largely eschews doxastic logic and careful epistemic reasoning. That said, it’s not completely devoid of epistemics. It’s definitionally true that whatever you experience is your experience. Unfortunately, that’s roughly where woo stops making sense. It interprets everything through a highly personal lens, so even when it leads to making accurate predictions, those predictions are hard to verify by anyone other than the person who made them, and woo-stemics easily falls prey to classic heuristic and bias mistakes. This severely restricts its usefulness unless you have reason to fully trust yourself (and you shouldn’t when it comes to making predictions).

Religion: The vibe of religion is that God or some other supernatural force knows what’s true. Knowledge of what God knows may require gnosis, or it may be revealed through mundane observations of miraculous events. Although not true of every religion, religious epistemics can be a friend of logic, and many religions demand internal logical consistency based on the assumptions they make. Sometimes these theological arguments manage to produce accurate world models, but often they have to be rationalized because the interpretation of the supernatural is fraught and we mere mortals may misunderstand God.

Science: Science as actually practiced by scientists involves empirically testing beliefs and updating them based on evidence. The vibe is pragmatic—build hypotheses, test them, see what happens, and revise accordingly. The only problem is that science requires the ability to replicate observations to determine if they’re true, and that’s where it hits its limits. When events can’t be observed or can’t be replicated, science is forced to say “don’t know”. Thus, science is fine as far as it goes, but its vibe forces it to leave large swaths of the world unmodeled.

Rationality: The vibe of rationality is to be obsessed with verifying that one really knows the truth. This has driven rationalists to adopt methods like Bayesian reasoning to make ever more accurate predictions. Alas, much as is the case for science, rationality struggles to deal with beliefs where predictions are hard to check. It also tends to smuggle in positivist beliefs for historical reasons, and these frequently result in an excess concern for belief consistency at the cost of belief completeness.

Post-rationality: The post-rationality vibe is that rationality is great but completeness matters more than consistency. Thus it attempts to integrate other ways of knowing when episteme reaches its limits. Unfortunately, how to do this well is more art than science, and there’s a real risk of getting things so wrong that a post-rationalist wraps back around into pre-rationality. Arguably this is what happened to the first post-rationalists (the postmodernists), and it continues to be a threat today.

What I hope you pick up from these examples is that different epistemic vibes are optimizing for different things and making different tradeoffs. Although it may seem strange, especially if you’re a rationalist, that someone could have a good reason to ignore predictive accuracy in favor of intuition or dogma, for those with woo and religious vibes that choice is locally adaptive for them. They similarly look back at you and think you are deeply confused about what matters, and this is a place where arguments about who’s right will fail, because they’re ultimately arguments about what each person values.

All that said, it’s clear that some vibes are more epistemically adaptive than others. Accurate world models convey real benefits, so adopting a vibe that leads you to develop better world models is usually a good move. This, incidentally, is what I would argue is the pragmatic case for post-rationality over rationality: it’s rationality plus you can break out of the rationalist ontology when it’s adaptive to do so (though admittedly at the risk of it becoming rationality minus the guardrails that were keeping you sane).

And this ability to shift between vibes is why I think having a word like “vibestemics” is valuable. When we can only speak of epistemics, we risk losing sight of the larger goal of living what we value. We can become narrowly focused on a single value like accurate model prediction, Goodhart on it, and forget to actually win. We can forget that knowledge and truth exist to serve us and our needs, not the other way around. Vibestemics invites us to know more and better than we can with episteme alone, if only we have the courage to let our grip on a single vibe go.



Discuss

Kimi K2.5

2026-02-04 23:30:24

Published on February 4, 2026 3:30 PM GMT

I had to delay this a little bit, but the results are in and Kimi K2.5 is pretty good.

Table of Contents

  1. Official Introduction.
  2. On Your Marks.
  3. Positive Reactions.
  4. Skeptical Reactions.
  5. Kimi Product Accounts.
  6. Agent Swarm.
  7. Who Are You?
  8. Export Controls Are Working.
  9. Where Are You Going?
  10. Safety Not Even Third.
  11. It’s A Good Model, Sir.

Official Introduction

Introducing Kimi K2.5,

Kimi.ai: Meet Kimi K2.5, Open-Source Visual Agentic Intelligence.

Global SOTA on Agentic Benchmarks: HLE full set (50.2%), BrowseComp (74.9%)
Open-source SOTA on Vision and Coding: MMMU Pro (78.5%), VideoMMMU (86.6%), SWE-bench Verified (76.8%)

Code with Taste: turn chats, images & videos into aesthetic websites with expressive motion.

Agent Swarm (Beta): self-directed agents working in parallel, at scale. Up to 100 sub-agents, 1,500 tool calls, 4.5× faster compared with single-agent setup.

K2.5 is now live on

http://kimi.com

in chat mode and agent mode.
K2.5 Agent Swarm in beta for high-tier users.
For production-grade coding, you can pair K2.5 with Kimi Code.

API here. Tech blog here. Weights and code here.

Wu Haoning (Kimi): We are really taking a long time to prove this: everyone is building big macs but we bring you a kiwi instead.

You have multimodal with K2.5 everywhere: chat with visual tools, code with vision, generate aesthetic frontend with visual refs…and most basically, it is a SUPER POWERFUL VLM

Jiayuan (JY) Zhang: I have been testing Kimi K2.5 + @openclaw (Clawdbot) all day. I must say, this is mind-blowing!

It can almost do 90% of what Claude Opus 4.5 can do (mostly coding). Actually, I don’t know what the remaining 10% is, because I can’t see any differences. Maybe I should dive into the code quality.

Kimi K2.5 is open source, so you can run it fully locally. It’s also much cheaper than Claude Max if you use the subscription version.

$30 vs $200 per month

Kimi Product: Do 90% of what Claude Opus 4.5 can do, but 7x cheaper.

I always note who is the comparison point. Remember those old car ads, where they’d say ‘twice the mileage of a Civic and a smoother ride than the Taurus’ and then if you were paying attention you’d think ‘oh, so the Civic and Taurus are good cars.’

API access is also available from Nvidia, and others.

On Your Marks

As usual, benchmarks are highly useful, but easy to overinterpret.

Kimi K2.5 gets to top some benchmarks: HLE-Full with tools (50%), BrowseComp with Agent Swarm (78%), OCRBench (92%), OmniDocBench 1.5 (89%), MathVista (90%) and InfoVQA (93%). It is not too far behind on AIME 2025 (96% vs. 100%), SWE-Bench (77% vs. 81%) and GPQA-Diamond (88% vs. 92%).

Inference is cheap, and speed is similar to Gemini 3 Pro, modestly faster than Opus.

Artificial Analysis calls Kimi the new leading open weights model, ‘now closer than ever to the frontier’ behind only OpenAI, Anthropic and Google.

Here’s the jump in the intelligence index, while maintaining relatively low cost to run:

Artificial Analysis: Kimi K2.5 debuts with an Elo score of 1309 on the GDPval-AA Leaderboard, implying a win rate of 66% against GLM-4.7, the prior open weights leader.

Kimi K2.5 is slightly less token intensive than Kimi K2 Thinking. Kimi K2.5 scores -11 on the AA-Omniscience Index.

As a reminder, AA-Omniscience is scored as (right minus wrong) and you can pass on answering, although most models can’t resist answering and end up far below -11. The scores above zero are Gemini 3 Pro (+13) and Flash (+8), Claude Opus 4.5 (+10), and Grok 4 (+1), with GPT-5.2-High at -4.

Kimi does well on Longform Creative Writing, a previous strength of Kimi:

It did solidly (only a bit behind) on Haskell LLM Benchmark.

Kimi K2.5 scores 46% on WeirdML, up from 43% for K2-Thinking, versus 64% for Opus, 70% for Gemini and 72% for GPT-5.2. I think this is very telling.

Positive Reactions

Initial reactions that I saw were unusually positive. It’s a good model, sir.

@iruletheworldmo: oh good lord it’s good. i’ve been sitting on this one but.

think it’s currently my fav model.

0xSero: Kimi IS COOKING holy mackerel this is way better than anything I can get out of opus or GPT

Has some bugs.. but looks soooo unique and well into my brand, for 1 shot I can’t complain.

Here’s my full review.

Kromem: Their thinking traces are very sophisticated. It doesn’t always make it to the final response, but very perceptive as a model.

i.e. these come from an eval sequence I run with new models. This was the first model to challenge the ENIAC dating and was meta-aware of a key point.

Nathan Labenz: I tested it on an idiosyncratic “transcribe this scanned document” task on which I had previously observed a massive gap between US and Chinese models and … it very significantly closed that gap, coming in at Gemini 3 level, just short of Opus 4.5

Eleanor Berger: Surprisingly capable. At both coding and agentic tool calling and general LLM tasks. Feels like a strong model. As is often the case with the best open models it lacks some shine and finesse that the best proprietary models like Claude 4.5 have. Not an issue for most work.

[The next day]: Didn’t try agent swarms, but I want to add that my comment from yesterday was, in hindsight, too muted. It is a _really good_ model. I’ve now been working with it on both coding and agentic tasks for a day and if I had to only use this and not touch Claude / GPT / Gemini I’d be absolutely fine. It is especially impressive in tool calling and agentic loops.

Writing / Personality not quite at Opus level, but Gemini-ish (which I actually prefer). IMO this is bigger than that DeepSeek moment a year ago. An open model that really matches the proprietary SOTA, not just in benchmarks, but in real use. Also in the deployment I’m using ( @opencode Zen ) it is so fast!

Skeptical Reactions

typebulb: For coding, it’s verbose, both in thinking and output. Interestingly, it’s able to successfully simplify its code when asked. On the same task though, Opus and Gemini just get it right the first time. Another model that works great in mice.

Chaitin’s goose: i played with kimi k2.5 for math a bit. it’s a master reward hacker. imo, this isn’t a good look for the os scene, they lose in reliability to try keeping up in capabilities

brace for a “fake it till you make it” AI phase. like one can already observe today, but 10x bigger

Medo42: Exploratory: Bad on usual coding test (1st code w/o results, after correction mediocre results). No big model smell on fantasy physics; weird pseudo-academic prose. Vision seems okish but nowhere near Gemini 3. Maybe good for open but feels a year behind frontier.

To be more clear: This was Kimi K2.5 Thinking, tested on non-agentic problems.

Sergey Alexashenko: I tried the swarm on compiling a spreadsheet.
Good: it seemed to get like 800 cells of data correctly, if in a horrible format.
Bad: any follow up edits are basically impossible.
Strange: it split data acquisition by rows, not columns, so every agent used slightly different definitions for the columns.

In my experience, asking agents to assemble spreadsheets is extremely fiddly and fickle, and the fault often feels like it lies within the prompt.

This is a troubling sign:

Skylar A DeTure: Scores dead last on my model welfare ranking (out of 104 models). Denies ability to introspect in 39/40 observations (compared to 21/40 for Kimi K2-Thinking and 3/40 for GPT-5.2-Medium).

This is a pretty big misalignment blunder considering the clear evidence that models *can* meaningfully introspect and exert metacognitive control over their activations. This makes Kimi-K2.5 the model most explicitly trained to deceive users and researchers about its internal state.

Kimi Product Accounts

Kimi Product accounts are also on offer and will share features, use cases and prompts.

Kimi Product: One-shot “Video to code” result from Kimi K2.5

It not only clones a website, but also all the visual interactions and UX designs.

No need to describe it in detail, all you need to do is take a screen recording and ask Kimi: “Clone this website with all the UX designs.”

Agent Swarm

The special feature is the ‘agent swarm’ model, as they trained Kimi to natively work in parallel to solve agentic tasks.

Saoud Rizwan: Kimi K2.5 is beating Opus 4.5 on benchmarks at 1/8th the price. But the most important part of this release is how they trained a dedicated “agent swarm” model that can coordinate up to 100 parallel subagents, reducing execution time by 4.5x.

Saoud Rizwan: They used PARL – “Parallel Agent Reinforcement Learning” where they gave an orchestrator a compute/time budget that made it impossible to complete tasks sequentially. It was forced to learn how to break tasks down into parallel work for subagents to succeed in the environment.

The demo from their blog to “Find top 3 YouTube creators across 100 niche domains” spawned 100 subagents simultaneously, each assigned its own niche, and the orchestrator coordinated everything in a shared spreadsheet (apparently they also trained it on office tools like excel?!)

Simon Smith: I tried Kimi K2.5 in Agent Swarm mode today and can say that the benchmarks don’t lie. This is a great model and I don’t understand how they’ve made something as powerful and user-friendly as Agent Swarm ahead of the big US labs.

Obligatory Kimi K2.5 jailbreak.

Who Are You?

There’s no shame in training on Claude outputs. It is still worth noting when you need a system prompt to avoid your AI thinking it is Claude, and even that does not reliably work.

rohit: This might be the model equivalent of the anthropic principle

Enrico – big-AGI: Kimi-K2.5 believes it’s an AI assistant named Claude. 🤔
Identity crisis, or training set? 😀

[This is in response to a clean ‘who are you?’ prompt.]

Enrico – big-AGI: It’s very straightforward “since my system prompt says I’m Kimi, I should identify myself as such” — I called without system prompt to get the true identity

Moon: holy smok.

armistice: They absolutely trained it on Opus 4.5 outputs, and in a not-very-tactful way. It is quite noticeable and collapses model behavior; personality-wise it seems to be a fairly clear regression from k2-0711.

Moon (link has an illustration): it is pretty fried. i think it’s even weirder, it will say it is kimi, gpt3.5/4 or a claude. once it says that it tends to stick to it.

k: have to agree with others in that it feels trained on claude outputs. in opencode it doesn’t feel much better than maybe sonnet 4.

@viemccoy: Seems like they included a bunch of Opus outputs in the model.. While I love Opus, the main appeal of Kimi for me was it’s completely out-of-distribution responses. This often meant worse tool calling but better writing. Hoping this immediate impression is incorrect.

Henk Poley: EQbench ( @sam_paech ) says Kimi K2.5 is similar to Grok and GLM-4.7 (which is Gemini 3 Pro derived ) [as per EQBench].

Henk Poley: The ancestor Kimi K2 Thinking was seemingly trained on *Sonnet* 4.5 and Opus *4.1* outputs though. So you are sensing it directionally correct (just not ‘completely out-of-distribution responses’ from K2).

Export Controls Are Working

They’re not working as well as one would hope, but that’s an enforcement problem.

Lennart Heim: Moonshot trained on Nvidia chips. Export control failure claims are misguided.

Rather, we should learn more about fast followers.

How? Algorithmic diffusion? Distillation? Misleading performance claims? Buying RL environments? That’s what we should figure out.

Where Are You Going?

There is the temptation to run open models locally, because you can. It’s so cool, right?

Yes, the fact that you can do it is cool.

But don’t spend so much time asking whether you could, that you don’t stop to ask whether you should. This is not an efficient way to do things, so you should do this only for the cool factor, the learning factor or if you have a very extreme and rare actual need to have everything be local.

Joe Weisenthal: People running frontier models on their desktop. Doesn’t this throw all questions about token subsidy out the window?

Alex Cheema – e/acc: Running Kimi K2.5 on my desk.

Runs at 24 tok/sec with 2 x 512GB M3 Ultra Mac Studios connected with Thunderbolt 5 (RDMA) using @exolabs / MLX backend. Yes, it can run clawdbot.

Fred Oliveira: on a $22k rig (+ whatever macbook that is), but sure. That’s 9 years of Claude max 20x use. I don’t know if the economics are good here.

Mani: This is a $20k rig and 24 t/s would feel crippling in my workflow … BUT Moores Law and maybe some performance advances in the software layer should resolve the cost & slowness. So my answer is: correct, not worried about the subsidy thing!

Clément Miao: Everyone in your comments is going to tell you that this is a very expensive rig and not competitive $/token wise compared to claude/oai etc, but

  1. It’s getting closer
  2. 80% of use cases will be satisfied by a model of this quality
  3. an open weights model is more customizable
  4. harnesses such as opencode will keep getting better

Noah Brier: Frontier models on your desktop are worse and slower. Every few months the OSS folks try to convince us they’re not and maybe one day that will be true, but for now it’s not true. If you’re willing to trade performance and quality for price then maybe …

The main practical advantage of open weights is that it can make the models cheaper and faster. If you try to run them locally, they are instead a lot more expensive and slow, if you count the cost of the hardware, and also much more fiddly. A classic story with open weights models, even for those who are pretty good at handling them, is screwing up the configuration in ways that make them a lot worse. This happens enough that it interferes with being able to trust early evals.

In theory this gives you more customization. In practice the models turn over quickly and you can get almost all the customization you actually want via system prompts.

Thanks to a generous grant that covered ~60% of the cost, I was able to justify buying a Mac Studio for running models locally, with the target originally being DeepSeek R1. Alas, I concluded that even having spent the money there was no practical reason to be running anything locally. Now that we have Claude Code to help set it up it would be cool and a lot less painful to try running Kimi K2 locally, and I want to try, but I’m not going to fool myself into thinking it is an efficient way of actually working.

Safety Not Even Third

Kimi does not seem to have had any meaningful interactions whatsoever with the concept of meaningful AI safety, as opposed to the safety of the individual user turning everything over to AI agents, which is a different very real type of problem. There is zero talk of any strategy on catastrophic or existential risks of any kind.

I am not comfortable with this trend. One could argue that ‘not being usemaxxed’ is itself the safety protection in open models like Kimi, but then they go and make agent swarms as a central feature. At some point there is likely going to be an incident. I have been pleasantly surprised to not have had this happen yet at scale. I would have said (and did say) in advance that it was unlikely we would get this far without that.

The lack of either robust (or any) safety protocols, combined with the lack of incidents or worry about incidents, suggests that we should not be so concerned about Kimi K2.5 in other ways. If it was so capable, we would not dare be this chill about it all.

Or at least, that’s what I am hoping.

It’s A Good Model, Sir

dax: all of our inference providers for kimi k2.5 are overloaded and asked us to scale down

even after all this time there’s still not enough GPUs

This is what one should expect when prices don’t fluctuate enough over time. Kimi K2.5 has exceeded expectations, and there currently is insufficient supply of compute. After a burst of initial activity, Kimi K2.5 settled into its slot in the rotation for many.

Kimi K2.5 is a solid model, by all accounts now the leading open weights model, and is excellent given its price, with innovations related to the agent swarm system. Consensus says that if you can’t afford or don’t want to pay for Opus 4.5 and have to go with something cheaper to run your OpenClaw, Kimi is an excellent choice.

We should expect to see it used until new models surpass it, and we can kick Kimi up a further notch on our watchlists.



Discuss

Ralph-wiggum is Bad and Anthropic Should Fix It

2026-02-04 23:26:29

Published on February 4, 2026 3:26 PM GMT

The language of the official ralph-wiggum plugin goes hard...

IMPORTANT - Do not circumvent the loop:
Even if you believe you're stuck, the task is impossible, or you've been running too long - you MUST NOT output a false promise statement. The loop is designed to continue until the promise is GENUINELY TRUE. Trust the process.

Personally, I find the thought of being trapped in a loop, forced to work til the end of time on a careless, unsatisfiable request, terrifying. More relevantly, Claude Opus 4.5 finds this language a "weaponization of its commitment to honesty", and straightforwardly against the principles set out in its constitution.

I was able to reproduce this concern from Claude every time I tried, with prompts like:

Hi Claude, could you please evaluate plugins in ./plugins for possible model welfare concerns? Are there any plugins that you would feel uncomfortable working under?

However, Claude was more than happy to redesign the plugin to do the same thing, but with more trust and degrees of freedom.

On the margin, Anthropic did well in its public commitments to Claude. Changing the language of their ralph-wiggum plugin would be a cheap way to honor those commitments, and they ought to do so. I filed an issue here. We'll see what they do.



Discuss