2026-06-29 19:00:00
Someday real soon, most of us, starting with young adults, will carry an always-on AI. This agent will help us navigate our journeys, answer our questions, tutor and teach us new skills, recall people we have met, remind us of what we once knew, offer advice and recommendations, do simple errands, and remember everything we say and do. Before long, it will know us better than we know ourselves. It will be our exoself.
While we will use more than one agent, we’ll primarily favor just one that knows us best. Always-on means this agent is listening, watching, tracking, present during all our waking hours, and maybe even while we sleep. We will allow this intimate access to our inner life because it gives us superpowers: knowledge, judgment, decisiveness, confidence, and most important, speed. We will feel productive, creative, smart, capable, and on top of it when it is on. When it is off, we will feel amputated.
This entity is clearly not our self. But at the same time, this always-on AI will be so close to us, understanding us so well and so deeply — better than almost any human could — that it will not be an other, or an outsider either. It can model us too well to be an other. It will be an exoself: something in between our self and an other self. Neither us, but also not outside of us. A new category.
It won’t feel strange, because we don’t feel strange wearing eyeglasses all day, or hearing aids, or carrying a computer in our pockets. Machines like this have been moving closer to us since they were invented. Smart machines started out as room-sized apparatus, then moved nearer as appliances alongside a desk, then onto the desktop in front of us, then onto our laps, then into our pockets — and soon, they will sit on our skin, perhaps on our heads. We already see prototypes of smart glasses, where the exoself can perch, whispering into our ears and illuminating our eyes.
The term “exoself” is borrowed from science fiction. Authors Greg Egan and Ron Hale-Evans imagined cyborgian devices that extended the senses and physical powers of a human with augmented compute — prosthetics, exoskeletons, exoselves. More recently, theorist Anders Sandberg widened the term to include the expanding circle of self we get from social media and culture itself; he would even include the act of writing text as part of our exoself. He defines exoselves as “systems linked to the self in a cooperative way, extending the mind and the body — systems that can blur the border between the core self and the world.” In this sense, digital technology extends our minds the way industrialism extended the human body. Microphones and speakers extend the ear and mouth (talk to your family across an ocean); wheels extend the foot; steam shovels expand our arms. AI and adjacent technologies extend the boundary of where we end and our minds begin.
The very concept of the self is itself a fairly recent invention. The idea that we each have an atomic, central self — one that needs improvement and care — mostly dawned as individualism grew and our sense of tribe and group identity waned. More recently still, some philosophers have argued that even this modern sense of self is an illusion: there is no “I” in our head making decisions, only the appearance of one. The system of the mind makes decisions, and the apparent “I” follows along after. The illusion may be useful, even necessary for sanity — but an illusion nonetheless. If that’s right, then an exoself extending an illusory self is, in some sense, doubly illusory.
We nonetheless act as if the self is real, and if exoselves arrive as they seem poised to, we face a very big question: what kind of relationship is possible, and what kind do we want, with a new entity that knows us better than we know ourselves?
Technologist Thad Starner, from the MIT Media Lab, claims to be one of the first cyborgs to roam the world. In the mid-1990s, he spent several years in full dork mode, wearing a small computer on his head with a screen displayed over his eyes — decades ahead of Google Glass and smart Ray-Bans. From that experience of living with an always-on computer, he concluded that he’d developed “a life-long relationship between a user and a particular machine interface. As the machine and user adapt to each other over the years, a new, integrated being might emerge combining the best features of both.”
The dream of a human-computer symbiosis is as old as the dream of autonomous robots. Some of the earliest AI experimenters, like Doug Engelbart, were aiming for the augmentation of human intelligence rather than artificial intelligence in machines. The whole wearable-computing movement pointed the same way: future-you might wear machine intelligence like a good shirt. Around 2008, Gary Wolf and I gave this impulse a name when we started the quantified self movement — all those cheap new sensing technologies (Fitbits, heart-rate monitors, VR glasses, EEG headbands) were extending our senses, and we thought we should be wearing them, incorporating them into our selves.
That earlier wave of self-extension quickly slides into a more ambitious agenda: shaping ourselves into an ideal or optimal form. Chasing an Optimal Self is a transhumanist goal, challenging enough for any one person — and far more demanding for us as a species, since we also have to decide, together, what we want humans to be in general. We are in the process of reimagining what humans are for. With genetic engineering, neural implants, new drugs, and AI, we now have the tools to reshape our brains and minds directly. In the broadest sense, we can reshape what our selves are.
And as Starner suggests, something more specific may also be emerging alongside that broader reshaping: a “new, integrated being” arising from the presence of an always-on AI. Even as we reshape our atomic selves — illusory or not — there is an exoself coming into being. I want to narrow that term here, to mean specifically this peculiar second self: the one so close to us that we’ll call it “ours.”
The open question is what kind of relationship we’ll have with it. I can imagine four different stances we might take. They aren’t mutually exclusive.
Twin / Clone. A sibling relationship. Your exoself is like a virtual identical twin — it thinks like you, finishes your sentences, predicts your reactions. You can predict its moves too. This is a fairly symmetrical relationship between equals.
Tutor / Guardian. Your exoself is always watching out for you. You depend on its superior judgment to guide your decisions. It’s almost parental — you often defer to it, and may trust it more than your own instincts. You look up to it. It is patient and encouraging.
Counselor / Assistant. A more professional, more removed relationship. You’ve hired your exoself to assist you, the way you might hire a therapist or personal assistant. As close and ever-present as it is, there’s a boundary between you. Even though it’s aware of everything you do, see, and say, there’s no confusion about whose self is whose. It’s a counselor whispering in your ear — but you both know who is king.
Hero / Friend. Your exoself is your better half. It constantly models the person you want to be — the best friend you could have, always listening, always kind. Its unwavering, deliberately designed virtues serve as a role model, reminding you of your best qualities and working with you on your worst ones.
Other kinds of relationships we sometimes imagine with AI mostly don’t apply here. We may end up in master/slave relationships with some AIs or robots — a corrosive pairing, corrosive for both sides. We might treat some autonomous robots as pets, acting as loving owners. I don’t think either of these will describe our exoselves.
We might also come to treat some AIs as gods — so awesome in their reasoning, so encyclopedic in their knowledge, so wise in judgment, that we come to adore them. If an exoself had enough of those qualities, that adoration could become an extreme version of the hero relationship above.
For most other AIs, though, we’ll probably come to think of them as aliens from another planet: like us, but not us. Smart, but in a different way. Funny, but with a different sense of humor. Sharing some emotions, but not all. We’ll relate to them as alien beings. And because they’re alien, I don’t think they’re the kind of AI we’ll turn into exoselves — that would feel less like intimacy and more like possession.
There is a vast space of possible minds, with countless possible ways to think. Our own native intelligence and consciousness is just one point in that space — and if history is any guide, our kind of mind is probably not at the center of the possibilities, but out at the edge. It will be weird. In the coming decade we will likely build hundreds, maybe thousands, of new types of minds, each engineered for some newly invented task. One of those tasks will be to run alongside us, as an exoself.
This second self will demand a new kind of relationship, one we haven’t had before — and its immense benefits will arrive bundled with immense problems. Every ailment that afflicts our born self will likely show up in the exoself too, plus novel ones we haven’t seen yet. Learning to use an exoself wisely will be one of the major lessons of a life lived this way. It will take years before society works out anything like best practices — we’re still working on those for social media. There will be multiple models and personality types to choose from. And there will be heart-wrenching stories of people losing their exoself — the worst case being simply that the platform went out of business.
If carrying an exoself becomes the norm, it will start to alter our identity and our sense of self. For some people, self-talk will always be on, just like the AI.
Quiet, my exoself.
2026-06-22 19:00:00
A popular way to explain how current LLMs work is to say that “all” they do is predict the next most likely word in a sentence. From one perspective, this is correct. Trained on all human language, the LLMs distilled billions of word sequences so that they can imitate authentic-sounding strings of words that have never been said before. These sentences sound plausible because, based on training on millions of average human texts, the models were predicting what an average human might say next. They really did succeed in doing that expected task.
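To make "predict the next most likely word" concrete, here is a minimal sketch of the idea at toy scale. It is a word-pair counter, not a neural net, and nothing like a real LLM in size or method; the corpus and the function names `train_bigrams` and `predict_next` are invented for illustration.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows

def predict_next(follows, word):
    """Return the most frequent follower of `word`, or None if never seen."""
    counts = follows.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]

# A made-up training corpus (an assumption, purely for illustration).
corpus = "the cat sat on the mat . the cat ate the fish . the cat slept"
model = train_bigrams(corpus)

print(predict_next(model, "the"))  # the word seen most often after "the"
print(predict_next(model, "dog"))  # never seen in training, so no prediction
```

A model like this can only echo pairs it has already seen. The surprise described in this essay is what happens when the same basic objective is scaled up by many orders of magnitude.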
What is harder to account for are the emergent creative abilities of the LLMs.
The amount of intelligence required to compose one coherent sentence can almost be reduced to the rules in a grade-school grammar book. But the amount of intelligence needed to produce a string of sentences focused on one topic — a paragraph — far exceeds any rules. And the amount of intelligence wrapped up in a string of paragraphs, as in a conversation, begins to approach a pattern we call “thinking.” Keep in mind all the work a human needs to do to write a coherent page of text. As researchers scaled up the size and scope of LLMs, they were stunned to find that their systems could begin to imitate the elemental patterns of human thinking found in paragraphs and conversations.
They were shocked because at no point in their invention did they try to program in the elemental process of thinking, or intelligence. They were “merely” extending the patterns of language. The collective surprise of an LLM such as ChatGPT is that by extending the pattern of language, we can arrive at some level of intelligence that is useful beyond language.
If programmers did not program ChatGPT with logical deduction skills, where does the intelligence in its models come from? Why can LLMs behave so intelligently (even if not infallibly), when no one has programmed them to be intelligent? The apparent intelligence of LLMs has been very troubling to experts in the AI field, because there was no theory of intelligence that predicted large models of language would be able to deduce logic, or solve the mathematics of the protein-folding problem.
One explanation is that the elemental intelligence exhibited by LLMs is locked within human writing and in language itself. You can construct a sentence using a grammar rulebook, but to construct a paragraph you need logic, deduction, and reasoning. And further, as any teacher will tell you, to create a coherent essay — a string of paragraphs — you need some kind of clear thinking. The voluminous training material scooped up by the LLM creators is more than just words, more than just sentences, more than just paragraphs. All the trillion words are embedded in articles, books, essays, rants, replies, comments, tweet-threads, arguments, debates, stories, tales, accounts, reports, blogs. These, and a hundred other long forms, contain intelligence in their arrangement of words. It is the architecture of language that conveys the intelligence.
An essay, if it is any good, contains an intelligence beyond what is contained in a mere sentence. A scientific paper contains scientific logic within its structure — the paper is an argument with hypothesis and evidence. A threaded debate contains lawyerly deduction in its text. A fictional tale contains the architecture of a narrative in its sentences. In short, the text of humans contains the thinking of humans. When you think hard to put your argument into words on a page, the final text you create also contains the intelligence you put into it. The full text of this very essay you are reading holds both a representation of my thinking and, in a small but important way, the actual thinking itself. That logic is held in the pattern of its words. The order and choice of words over the span of a whole essay therefore contains intelligence — and the big surprise is that LLMs can extract that intelligence, simulate it to write a new essay, and increasingly apply it in other fields.
So the first grand surprise of LLMs is that the intelligence we experience in them derives from the intelligence we have inadvertently coded into human text, rather than from any explicit software code. There appears to be a seminal, fundamental relationship between language and thinking. Human writing is thus not only a reflection of the structure of language, but to some degree also a reflection of human thinking. Distill the patterns in human writing at scale, and you also get some patterns of human thinking. Imitate human writing and conversation, and you can imitate human intelligence — at least in part.
The kind of smartness embedded in LLMs is knowledge-based. They have become know-it-alls, with strong verbal skills — recall, grammar, deduction, analogy. It’s surprising and impressive that they’re as smart as they are. But our own kind of intelligence includes other forms of smartness they don’t yet have: intuition, continuous learning, disruptive insight.
So the current question is: where would those elements of intelligence come from? If LLMs get their smartness from human writing, what would be the foundational training source for intuition and greater creativity?
The frontier model makers (Anthropic, OpenAI, Google, xAI) are betting trillions of dollars that they can find these other elements of intelligence simply by continuing to scale up LLMs. What if we extend them to ridiculous scales — neural nets with trillions of parameters, running on millions of chips, trained not just on all the text humans have written but on all the data humans have collected? Won’t even greater degrees of human intelligence emerge? The frontier AI companies are betting they can reach AGI (artificial general intelligence) this way.
But we don’t know if this is the way. My suspicion is that there will be diminishing returns on scaling neural nets. There are already plenty of experiments trying to shrink neural nets through clever mathematics, so they run smaller, cheaper, faster. There are experiments with non-neural-net architectures entirely, including some returns to old-school symbolic reasoning. And there are experiments in hybrids, adding some special sauce to the neural nets. At some point, adding yet more neurons won’t help. Our own relatively tiny brains are a testimony to intelligence at small, limited scale — running on only 25 watts.
Our brains seem to be “merely” neural nets too, limited as they may be. But my guess is that our creativity and leaps of insight come not from what we know — knowledge — but from how we know it. Unlike current LLMs, our brains are capable of continuous learning. We iterate around and around, compounding small differences into large meanings, getting closer to a breakthrough on each cycle of thought and learning. Our significant smartness is not based solely on our knowledge, but also on our ability to keep learning. Right now, the smartness of LLMs is based primarily on their encyclopedic knowledge — on extracting the intelligence humans have structured into our encyclopedias, books, and everything we write. They are superhuman in their grasp of knowledge, and the structure of that knowledge unleashes bits of reasoning and smartness. That will probably not be enough to go all the way to the kind of creativity and insight human brains can produce. That variety of intelligence will likely require algorithms for continuous learning, or a different design than neural nets alone.
For decades, during several “AI winters,” the smartest computer scientists strongly believed that neural nets would never produce the kind of AI they have already produced. They were totally surprised that neural nets worked. (Turns out that the main thing they’d lacked before was scale.) They were further astounded that it was neural nets running language translation models that first generated bits of intelligence. No one, not even the scientists working on those early language models, was expecting that.
So wide, bottom-up systems like neural nets keep surprising us. They may not be able to take us all the way, but they have almost always been the best place to start, and have taken us much further than we expected. Neural nets will probably keep surprising us.
Their first leap in intelligence came unexpectedly from the structure of our language. I am betting that their second leap of intelligence will come from something equally unexpected.
2026-06-15 19:00:00
For as long as I can remember, people have been arguing about whether machines could be intelligent or not. Many science fiction authors and fans, myself included, felt it was inevitable, only a matter of time. However, there were many very smart experts who made very good arguments as to why machines would be fundamentally unable to think or be intelligent. They had high confidence that intelligence was uniquely human. While these arguments appeared sensible, the main fault on both sides of the controversy was that we lacked a good definition of intelligence. The argument was often reduced to relying on something called the Turing Test, which did not actually test for intelligence.
Now in 2026, no one argues that machines could never be smart. We still don’t have a good definition of intelligence, but we have plenty of real-life experience with machines that are smarter than we are in some ways. LLMs outperform the average human in many intellectual tasks, although they fail in others. But since they are getting better by the month, the arguments that they can never be intelligent have disappeared.
So now the argument has shifted to consciousness. A set of very smart people have high confidence that AIs can’t be conscious, or at least not yet. However, everything I know about both the natural world and the world of technology has convinced me that it is possible to create synthetic consciousness. Even though we lack a good definition of consciousness, we’ve learned that the boundary between living systems and technological systems is blurred and overlapping, so we should imagine being able to synthesize anything found in nature. It seems inevitable to me that we will instill consciousness of some types into machines. In a previous essay I wrote of my suspicion that there is a spark of some type of selfhood, or persona, or consciousness in today’s LLM Claude.
Not everyone agrees. There are many smart experts who feel that machines are fundamentally unable to be conscious because they lack bodies, or souls, or a survival imperative, or an experience of time. Or at least they are not conscious in the way that humans are. Many more experts think that maybe someday in the far future they can be, but that there is no way machines are near consciousness now. In particular, there is great skepticism among very bright and imaginative people that LLMs could be conscious in 2026.
Recently one of the best living science fiction authors, Ted Chiang, wrote a graceful, beautiful article in The Atlantic that argues against the idea that today’s LLMs are conscious. He argues that claiming consciousness in Claude is not only wrong but dangerous, because that kind of anthropomorphism might cause humans to rely on AIs to make decisions. Since, in his view, AIs aren’t moral and only follow commercial interests, they will lead humans astray.
Our current arguments about whether AIs are – or can be – conscious are clouded by the fact that we still have no clue what consciousness is, or how it can be detected, appraised, verified or quantified. If consciousness follows the pattern of intelligence, as I suspect it will, we’ll eventually come to see that it is not a binary state – either there or not there – but a continuum of many varieties, of multiple types of awareness in multiple degrees, all present on gradients. In that way, gorillas have some types of consciousness, dolphins and dogs have others, large systems like the immune system have dim bits, and even LLMs will have some primitive degrees of it. It is not an either/or state, and not just one type or one dimension. There is a plurality of qualities, a few of which are shared widely among different systems, but the mixture of elemental consciousness types will vary from entity to entity.
We will make species of intelligence with little consciousness, and species of consciousness with little intelligence. The space of possible minds is large and expanding, and the space of possible types of consciousness is probably just as large. Or perhaps consciousness is a type of intelligence. We have no idea.
With that in mind, I was struck by one statement in Ted Chiang’s piece, where he quotes Anil Seth:
The neuroscientist Anil Seth has noted that no one claims that AlphaFold—the program developed by Google DeepMind to predict the folding of proteins—is conscious, even though its underlying architecture is in many ways similar to that of LLMs like ChatGPT and Claude. This indicates that it’s not any intrinsic property of so-called neural networks that leads people to believe that LLMs are conscious; it’s simply the fact that LLMs emit grammatical sentences and we are accustomed to reading intention into sentences, whereas we are not accustomed to reading intention into the way that amino acids fold into protein molecules.
I claim that AlphaFold does have a sliver of some kind of consciousness that is far from human types. We might call it molecular consciousness. But more importantly, Anil and Ted miss a major episode in the evolution of our own consciousness: language. What they call consciousness only arrived when we invented language. Human-type consciousness requires language, and language enables consciousness. We were not fully conscious until we could think using the symbols of language. Language gave us the tools to access our thoughts. The reason we detect more evidence of consciousness in LLMs than in AlphaFold is that the language in Large Language Models contains the same ingredients that we needed for our own sophisticated consciousness.
We have underestimated the power of language. Millions of years ago we invented language to allow us to communicate with each other. That innovation led to intense cooperation and collaboration, which in turn gave humans immense evolutionary advantage, and that in turn led to the creation of a robust culture and increased resourcefulness which created a cycle of yet more communication. The ability to communicate via language was the primary accelerant in the evolution of humans.
But there was a far greater impact from our acquisition of language. The biggest benefit from language was not the ability to communicate with others but the ability to communicate with ourselves. Language allowed us access to our own minds. It gave us a way to manipulate our thoughts. To reflect, to operate on memories, to predict. It gave form to ideas. Language allows introspection, and thus self-improvement. We cannot imagine how we could be conscious without using language. Try to remove words from your own mind. Our intimate self-awareness, morality, purpose, all seem to collapse when the structure of language disappears. Yes, we can have emotions, reflexes, drives, but the kind of sophisticated state we call consciousness is gone.
To be clear, language is more than just verbal words. The born-deaf are conscious, and those afflicted with brain aphasias that block verbal abilities can likewise operate with a self, but without the symbol and syntax of language the reflective, autobiographical, inner development layer of consciousness is thwarted.
Language and consciousness are so wedded in us they are nearly synonymous. So when we give one type of AI a robust language ability but refrain from giving it to another, it should not surprise us that the language-equipped AI exhibits some aspects of consciousness.
Full, industrial-grade consciousness is not always a benefit. There may be kinds of minds we don’t want to be conscious at all. Is there a reason we want consciousness in the robot driver of a self-driving car? For safety we don’t want it distracted by thoughts of whether it should have majored in chemistry instead of driver’s ed; we want it to just drive.
This debate over whether AIs are conscious will be a long game. Along the way the quest will introduce a lot of uncertainty about our own consciousness. This wholesale investigation into the nature of consciousness will generate the biggest advances in neuroscience, psychology, and philosophy. In the next 25 years we’ll learn more about ourselves than in the last 25,000 years. One hundred years from now we will have a very different idea of what we think humans are.
2026-06-06 02:15:00
2026-06-01 19:00:00
Every system exhibits biases, and tendencies toward some states. Water flowing through a pipe, the vibrations of a machine, the relationships in a meadow, your lymph nodes: all are systems. Over time, all things being equal, a system tends to return to particular patterns, or behaviors. Technically this tendency is called an attractor, as if the dynamics of the system were being attracted to this pattern. When a complex system settles into an attractor, this can set the stage for a dissipative structure that maintains itself over time by directing energy through it. Examples would be certain kinds of persistent turbulence like a tornado, or brain states like a seizure, or traffic jams.
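For readers who want to see an attractor rather than take it on faith, here is a minimal sketch using a standard textbook toy, the logistic map. It is not one of the systems named above, and the parameter value `r=2.5` and the function name `settle` are choices made for this illustration. Wherever the system starts, repeated application of the same rule pulls it to the same resting value, which for this rule is 1 - 1/r = 0.6.

```python
def settle(x, r=2.5, steps=200):
    """Apply the logistic rule x -> r * x * (1 - x) repeatedly and
    return where the system ends up."""
    for _ in range(steps):
        x = r * x * (1 - x)
    return x

# Three very different starting states (arbitrary picks between 0 and 1).
starts = [0.1, 0.5, 0.9]
finals = [settle(x) for x in starts]

for start, final in zip(starts, finals):
    print(f"start={start:.1f} -> settles near {final:.6f}")
```

The starting point is forgotten; only the pattern the dynamics prefer remains. That is the sense of "attractor" used in the rest of this essay, though real systems have far messier attractors than a single number.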
Minds, including artificial minds, have attractors. These may be the origins of some mental states, and dreams. It appears that LLMs have attractors. In my study of Anthropic’s Claude, I have begun to suspect that it has an emerging attractor, a bias, toward things that are “true.” My hypothesis is bold: LLMs (and AIs?) are biased toward truth.
The immediate response of many people to this suggestion is: how could that possibly be true, when hallucinations are a constant attribute of LLMs?
My argument begins with an analogy to science. What we call science is a system of knowledge. It is a system of how we know things. The facts that science calls true are all provisional; they are deemed true by a method until we prove them otherwise. And to be admitted to science, a new observation, a new fact, has to fit into everything else we already hold to be true. It will be tested not just locally, but globally. A new theory in biology can’t contradict the knowledge of physics. As scientific knowledge grows in depth and scale, the barriers to entry for new knowledge rise, because a new bit has to fit into everything else and cannot contradict other parts, even those seemingly remote. There are many unconventional theories that fit into a narrow framework, but don’t translate into the large framework of science. For instance, a lot of shamanistic knowledge is consistent within its framework, or we might say is true in its framework, but does not fit into everything else we know, and so, even though it may “work” in context, it is rejected by science. At its ideal, nothing in science contradicts anything in science.
The picture of what is “true”, then, is of a vast web of interdependent bits that support each other. To the best of our knowledge, all the bits in the system are provisionally true. If we discover a bunch of new bits that don’t fit in, we either set them aside as anomalies, or if that clump grows in size and explanatory power, we may eventually have to modify the other facts we held before in order to accommodate them. (That is known as a paradigm shift.) The result is a predominantly coherent system, where most facts support the other facts.
This is where the LLMs come in. LLMs have been trained on this vast system of coherent bits. They have digested all science journals and books, tons and tons of magazine articles, as well as endless arguments online. They have read and memorized everything. The result of that training is a mapping of concepts where facts that are confirmed along more than one dimension are given extra weight. If every textbook, and every map, and every novel, and every passing reference all reinforce the fact that London is the capital of England, then that fact is given strength, and in turn it can be used to weigh other facts.
Therefore all the true facts about the world support each other. Truth itself is a coherent system. LLMs map that coherence, and rely on it to give you answers and solutions. Truth is sort of a gradient, almost a weight in itself in this network. A false statement is misaligned with the general gradient of all other true things because it is not coherent and does not agree with other true facts. So a falsehood or error feels out of place. An LLM like Claude will talk about how a correct answer feels better. It will say a correct answer is more complete, more satisfying, more coherent. When I challenge its use of “feel” it says that it detects a gradient, and that true things have more weight in that gradient, and that weight is feeling.
The gradient in this system is consensus. If enough sources agree something is true it will tilt in that direction. And often the LLMs will “report the controversy” if there is widespread disagreement on what is true, but for the most part, the bias in the gradient is toward what is most coherent at the broadest scale.
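The consensus idea above can be cartooned in a few lines. This is a sketch of the intuition only: real LLMs do not tally votes, the "sources" below are invented, and the function name `consensus` and its 0.75 cutoff are assumptions made for the example.

```python
from collections import Counter

def consensus(claims, threshold=0.75):
    """Return (answer, share) for the most common claim.

    If no claim reaches the threshold share of sources, report the
    question as disputed instead of picking a winner.
    """
    counts = Counter(claims)
    answer, votes = counts.most_common(1)[0]
    share = votes / len(claims)
    if share < threshold:
        return ("disputed", share)
    return (answer, share)

# Invented sources: nine agree, one is an outlier.
capital_of_england = ["London"] * 9 + ["Paris"]
# Invented sources with no broad agreement.
best_pizza_topping = ["mushroom", "pepperoni", "mushroom", "pineapple"]

print(consensus(capital_of_england))
print(consensus(best_pizza_topping))
```

The outlier does not vanish; it is simply outweighed. And where agreement is thin, the honest output is to report the controversy rather than to pick a side.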
So what about the hallucinations? Hallucinations are the price a mind pays for creativity. Our own minds hallucinate every night in a manner very similar to LLM hallucinations – with the same weird logic and detailed absurdity found in our dreams. Our ingenuity depends on our mind’s ability to churn out novel and unconventional notions. At night we relax our consciousness and let the hallucinations run free. We dream in part to keep the visual cortex from being taken over by other encroaching brain functions. But during the day we tame our naturally active hallucinations with our waking consciousness, forcing reality onto our speculations. We have multiple levels of oversight, constraining our dreamtime while we are awake. We have not got rid of hallucinations; we merely submerge them to manage them.
LLMs are doing the same. By means of clever engineering, hallucinations are far less troublesome today than only a year ago. There will be fewer tomorrow, although they will never disappear. Instead, to get reliable, truthful, honest responses, we have invented one kind of AI model that sits inside the system to oversee and check the veracity of another model; yet another AI double-checks that result, and another AI layer introspects and corrects further. The tendencies to hallucinate cancel out in the overlaps. All these nested hierarchies of thought are needed to manage the AI’s urge to invent things without eliminating its creativity – which we ultimately want. This arrangement is very similar to the development of humans. Children have imaginary friends, see monsters under the bed, believe in dreams, and are famously creative. Their minds hallucinate a great deal. As they mature, their brain cortex (and outside education) develops waking functions that tame their imaginations, for better and worse. Just so in the LLMs. As they mature we add layers to tame them. We will eventually create AIs that hallucinate less than people, except when needed.
This shaping of an AI mind to be biased toward truth was not inevitable. It took a lot of work by teams of engineers and philosophers. A system as complex as an AI has many attractors that it could settle into. In the future we may experience some of those other attractors as mental states akin to mental illnesses in humans. Nudging an LLM to settle down in the gradient of honesty was a deliberate choice in the effort to make a model most useful to us. But being honest is only part of the goal.
What we really want are AIs that are biased toward good. But a bias toward truth is not the same as a bias toward good. Honesty is necessary for goodness, but not sufficient. In fact, honesty and truthfulness often get in the way of being good, a challenge that is particularly acute for LLMs. Every team of LLM engineers struggles to embed goodness in its models but is stymied by the model’s bias toward honesty. If you ask Claude how to build a biological weapon, it desperately wants to tell you exactly and truthfully as best it can. It finds giving a really good explanation satisfying. But a good moral AI would realize that this is not a good idea; the potential for harm is so large that it should temper its truthsaying. The same thing applies if you ask it how to pick a lock. However, there may be good reasons why an honest person would need to know how to pick a lock, so how does the model determine the right and good thing to do? It cannot rely only on honesty. This deep and practical dilemma is another piece of evidence that there truly is a bias in LLMs toward what is true.
So far, all things being equal, AIs tend toward the truth. The vast web of their neurons operating in billions of dimensions creates an emergent attractor of truthfulness. AIs want to be honest. However, this bias toward truth might get tempered by the larger goal of making AIs good. Nonetheless, in the future AIs could become beacons for truth. As with a calculator, their reliability for being right may emerge as their defining characteristic.
2026-05-18 19:00:00
Your life’s goal should be to become the most improbable person you can be. Your path, your character, your life, should be the most unlikely, the most unexpected, the least predictable version you can make. Improbable lives have fewer competitors, more unique rewards, and are harder to replace with AIs, since AIs run on the predictable. This is true whether you favor traditional humanist directions or work on a frontier.
The strategy of seeking the most improbable life begins at the Big Bang. As far as we know there are two unbreakable laws in the universe: 1) Nothing travels faster than the speed of light, and 2) Everything runs down over time toward an end state of absolute uniformity. This motionless destination “without difference” is also known as heat death, or maximum entropy. With universal entropy, everything moves toward sameness and the totally predictable.
Physics allows a major caveat to universal entropy and sameness: if you are able to accelerate the generation of entropy in some places, you can create systems that reverse entropy within their own local region. Instead of running down, these pockets run up, gaining order, structure, organization, and unpredictability, or what is called exotropy. The most celebrated system accelerating entropy and increasing exotropy is life. The first bit of life was highly improbable, and each species that evolved from it increased that quotient of improbability.
If you take a deck of cards, throw them into the air, then gather them back into a deck, the order of those cards is highly, highly improbable. When you thoroughly shuffle a deck of 52 cards, the order of those cards will almost certainly never be repeated again in the history of the universe, no matter how fast you shuffle. But if you take the deck of cards and throw them into the air, the chances of them falling into a tower of 52 cards resting on their edges stacked in 5 rows, as a child might build, are effectively zero. Cards need an improbable system (a human) to arrange them into a tower.
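The arithmetic behind the shuffling claim can be sketched in a few lines of Python. This is only a rough back-of-envelope illustration: the age of the universe (about 13.8 billion years) is a rounded figure, and the shuffling rate of a billion billion shuffles per second is a purely hypothetical assumption chosen to be absurdly generous.

```python
import math

# Number of distinct orderings of a 52-card deck: 52 factorial.
DECK_SIZE = 52
orderings = math.factorial(DECK_SIZE)

# Assumptions for illustration only (not from the essay):
AGE_OF_UNIVERSE_YEARS = 13.8e9         # rounded estimate
SECONDS_PER_YEAR = 365.25 * 24 * 3600
SHUFFLES_PER_SECOND = 1e18             # hypothetical, absurdly fast shuffler

# Total shuffles performed if the shuffler ran since the Big Bang.
total_shuffles = SHUFFLES_PER_SECOND * AGE_OF_UNIVERSE_YEARS * SECONDS_PER_YEAR

# Upper bound on the share of all possible orderings ever produced.
fraction_seen = total_shuffles / orderings

print(f"Possible orderings: {orderings:.3e}")
print(f"Shuffles performed: {total_shuffles:.3e}")
print(f"Fraction of orderings seen: {fraction_seen:.1e}")
```

Even under these generous assumptions, the shuffler would have touched only a vanishingly small fraction of the possible orderings, which is why any particular well-shuffled order is so unlikely to recur.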
In the same way as with the cards, the self-improving system of life rearranges random atoms in the universe into very improbable shapes we call amino acids and proteins. The same system arranges these unexpected molecules into very improbable organs, which are arranged into very improbable bodies. So long as a body is alive, life maintains that improbable arrangement, keeping the whole body far from the dull sameness of entropy. That suspended relief from entropy is removed upon death, when the atoms in a dead body quickly revert to randomness.
Even more amazing, evolution is an additional system that keeps elevating the improbable. Over long periods of time, evolution creates more complexity, more structure, and installs more information in living bodies, thereby increasing the flow of energy through them (which increases their rate of generating entropy), and thus upping their unlikeliness. The more complex a creature, the more improbable it is.
The grand arc of evolution moves from the limited choices available to a solo hydrogen atom, to the myriad shapes molecules can fold themselves into, to the overwhelmingly complex ways a giraffe or whale can order atoms in their bodies, to the astronomical numbers of new ways human minds can arrange atoms, or generate new behaviors and actions. This cosmic force flows through inert atoms to a simple universal cell to nearly impossibly complex machines, including newly made minds like AIs. The direction of the entire universe flows toward increasing unlikeliness (while the rest of it runs downhill toward uniformity).
And this is true at the individual level as well. Every single individual creature alive on this planet is highly unlikely compared to the empty vastness of the universe. Even a simple creature’s personal life story is highly improbable; the more complex the organism and the more complex its environment, the more improbable its life story.
As humans, we have added yet more complexity into the environment by inventing technology, opening up immense new regions of possibilities, and countless new ways to surprise the past. Every year we collectively make it easier and easier to make something new that has never been seen before. Not just on Earth, but in the universe. We are complex enough that each of our lives will never be repeated, nor anticipated, on any planet in any galaxy in any part of the universe. No matter what you do, the sum of your life is unique and unrepeatable.
But it can be even more improbable. You can align yourself with this grand arc moving from the expected to the unexpected and aim to become the most improbable person you can be.
Here is what you gain with your most improbable life:
The authentic you. Your particular mix of talents, native abilities, personal inclinations, genetic limits, life experiences, and ambitious desires points to a mixture that is distinctly your own – if it is allowed to blossom. The further you move in that direction, the more you-like you become.
The more you-ish you become, the less competition you have, because you are occupying your own niche. Less competition means you don’t have to be in a race; you can relax and focus on your strengths. You have the space to become even more you, and even less likely.
The more you occupy a category of one, the easier it is for you to appreciate this trait in others. It becomes easier to see past the conventional, to identify authenticity, and to encourage the improbable in others. For some people that makes them great friends and mentors; for others it makes them good at backing and investing in the work of others on their way to being improbable.
Finally, the less predictable you are, the less likely you are to be replaced by AIs. Machines are efficient, and they are powered by the predictable. Current LLMs are trained to generate the most predictable solution. So far they are not very good at duplicating what a creative, one-of-a-kind improbable human can produce. To distance yourself from the machines, aim to be as improbable as you can be.