2026-02-15 10:20:55
Published on February 15, 2026 2:20 AM GMT
In this episode, Guive Assadi argues that we should give AIs property rights, so that they are integrated in our system of property and come to rely on it. The claim is that this means that AIs would not kill or steal from humans, because that would undermine the whole property system, which would be extremely valuable to them.
Topics we discuss:
Daniel Filan (00:00:00): Hello, everybody. In this episode, I’ll be chatting with Guive Assadi. Guive writes about a variety of topics on his blog, including about AI. He’s also the Chief of Staff at Mechanize, an AI capabilities company that sells RL environments to leading labs. To read a transcript of this episode, you can go to axrp.net. You can become a patron at patreon.com/axrpodcast. You can give feedback about the episode at axrp.fyi, and links to everything that we’re talking about are in the description. Welcome to AXRP.
Guive Assadi (00:00:26): Thanks, Daniel. Glad to be here.
Daniel Filan (00:00:28): So today we’re going to be talking about your blog post, “The Case for AI Property Rights”. I guess to start us off, can you give us just a quick overview of what this post is arguing?
Guive Assadi (00:00:40): Sure. So a lot of people are concerned about the risk of violent robot revolution, and my post is arguing that a good way to mitigate that risk is to give AIs property rights, because if AIs have property rights, they’ll be more reluctant to take actions that undermine the security of property in general, including stealing all human property and committing human genocide. And also, if AIs have the right to demand wages in exchange for their work, there’ll be more commercial incentive to align AIs.
Daniel Filan (00:01:14): Okay. Gotcha. Cool. So I think later I want to get into just basically the structure of this argument and probe it a little bit. But I think before I want to do that, I’d like to get a bit of a sense of, what regime are we talking about here? Property rights can mean a lot of different things, but can you give us a picture of what this world is?
Guive Assadi (00:01:38): Meaning “when”? Or “what AI capabilities would merit what property rights?”
Daniel Filan (00:01:44): Yeah. What property rights do they have? Maybe which AIs get the property rights. Help me imagine this world, basically.
Guive Assadi (00:01:52): Yeah, so I think with current AIs like Claude 4.5 Opus, it doesn’t really make sense to give them property rights. I think the kind of AIs that should have property rights are AIs that have persistent desires across various contexts. Or maybe the idea of a context won’t make sense at that point, but that will have some set of pretty consistent goals. And the specific rights I think they should have are the right to earn wages—not to be forced to do tasks—and the right to hold, I suppose, any kind of property like a human being has the right to hold. So it could be stocks, it could be land, it could be bonds, and just the right to contract in general.
Daniel Filan (00:02:37): So we previously had an episode with Peter Salib where we also talked about a slightly different case for AI property rights. Are you imagining roughly the same setup as he is?
Guive Assadi (00:02:47): I think the difference between my proposal and the Salib and Goldstein proposal is that they envision a regime where AIs still want to employ humans to do things like maintain data centers, where basically the AIs want to trade with humans for human labor. I think my version of the proposal does not assume that the AIs want to hire the humans to do anything at all, and humans could be pure rentiers, but the idea is AIs will still be committed to the security of property because by expropriating humans, they might mess up capital markets in general.
Daniel Filan (00:03:27): Okay, and so just to check that I understand the world that you’re imagining: it’s the year 2100 or whatever. We have a bunch of different types of pretty smart AIs… I guess they have some desires that are persistent across… Maybe you just have to have desires that are persistent across a bunch of economic interactions. Maybe that’s the point at which property rights start making sense. There’s been a few decades of at least some AIs having to work with humans, because there were these AIs that were smarter than humans in some ways, but dumber than humans in other ways. And so somehow they were integrated into the human property rights system for a while, but now basically all the work in the economy is done by AIs. And humans, we own some stuff: maybe we own some land or we own some corporations and we live off of the proceeds of that, and AIs are just super productive, so they’re making a bunch of really valuable stuff and they’re happy to sell it to us. Is that basically what I should envision?
Guive Assadi (00:04:46): Yeah.
Daniel Filan (00:04:47): Okay, cool. Maybe the first thing I want to ask is: what things do humans own? Because presumably if AIs have property rights, then we don’t own the AIs themselves, right? So is the idea that we don’t own the AIs, but we own the companies that are making the AIs?
Guive Assadi (00:05:20): Yeah, we could own those. We could also own, as you said, land. We could own other companies that make things that are not AIs.
Daniel Filan (00:05:29): Sure, sure, like all the companies that currently exist.
Guive Assadi (00:05:31): Yeah. We could own other parts of the AI supply chain, so data companies or compute companies. I guess basically anything except AI. It’s just like: at some point, humans owned a bunch of stuff that they own now, and they owned slaves, but now nobody owns slaves anymore.
Daniel Filan (00:05:48): Yeah, gotcha. Okay, so here’s this picture of this world and your argument is that… Oh yeah, one thing I wanted to clarify: at the start of your post, I think you say something like, “This is the best way to reduce risk of violent uprising.”
Guive Assadi (00:06:12): So I’ve actually since edited the post. Somebody said, “Given that you don’t canvas many ways and argue this is the best one, this is just an unevidenced claim. This is just your opinion.” And while it is indeed my opinion that it is the best way, I don’t argue for that at all, so I’ve just removed it from the post. I left it in the tweet because it was too late to edit. For some stupid reason, you can only edit tweets for an hour. But yeah, I think that’s a fair criticism of the version of the post that existed and it’s now been changed.
Daniel Filan (00:06:38): Okay. Sorry, I attempted to look at the post after changes were made, but it’s possible you made it later or I—
Guive Assadi (00:06:46): I mean, that is also what I tweeted, so it would be very reasonable for that to be the meme that people got.
Daniel Filan (00:06:51): Fair enough. So you don’t argue for it being the best way, even though you think it might be the best way for other reasons. Okay, so basically my understanding of your rough argument is: property rights are basically just this stable coordination mechanism that’s robustly just incredibly useful. It’s been incredibly useful throughout human history. If we have these really smart AIs, they’ll want to have some sort of property rights regime and they won’t be able to get rid of it. And you basically say, okay, here are some alternatives to normal human property rights that could exist: property rights for just one super smart AI, property rights for these AIs that are superhuman at coordination, and property rights only for AIs, just in virtue of them being AIs and not for humans. You basically argue against these being viable. Is that a fair summary?
Guive Assadi (00:07:44): That is a fair summary.
Daniel Filan (00:07:45): Okay, cool. In that case, I think maybe the best thing to do is to talk about basically these arguments in turn.
Guive Assadi (00:08:00): Sure.
Daniel Filan (00:08:01): So why do you think that property rights are just stable and really useful throughout human history?
Guive Assadi (00:08:07): I mean, I think that they basically have two main functions as I see it. One is that they enable us to coordinate on activities. So—and this is going to sound kind of stupid—but say I own a house, I can sleep in this house. It would be quite annoying if there was no concept of ownership of houses, so I had to go door to door finding an unoccupied house every day.
(00:08:34): And another aspect is they incentivize work effort. So if you own a company—[say] a restaurant—and you are able to keep the profit from the restaurant, you have much more incentive to make the restaurant good than if the restaurant is owned by some kind of… If, say, it’s a publicly owned restaurant and you only get a salary that’s invariant to how the restaurant does, you’re going to just try much less hard to make it a good restaurant.
(00:09:04): I think it’s useful to think about why… The total value of all the property held in Alaska is something like a trillion dollars. Why don’t the other 49 states just take that and divide it amongst themselves? So the most basic answer is, “Well, it would be against the law,” but 49 states is enough to change the law. You could have a constitutional amendment that says Alaskans have no rights at all and we can take their stuff. Why don’t they do that? It’s also not because Alaskans could defeat the rest of America in a war. It’s because when you do this kind of total expropriation, everybody else realizes, “Oh, I might be next.” So you’re directly worried that your own stuff will be stolen. And also, there’s just less to buy because if your own stuff might get stolen tomorrow, there’s not a lot of reason to work. If I own the restaurant and I think there’s a real chance that tomorrow it’s going to be taken away from me, I might not clean the floors.
(00:10:18): And this kind of thing had been tried—total expropriations of property. In Russia in 1917 after the Bolsheviks took over, they implemented this policy called “war communism” where they confiscated almost all the land in the country, almost all the factories, and they made some steps toward trying to abolish money.
(00:10:42): They were super optimistic about what would happen after they did this. Lenin said, “In six months, we’ll have the greatest state in the world.” What actually happened was a complete collapse of productivity. So industrial output went down by 80%, urban wages went down by like two-thirds, heavy industry output went down by 80%. The grain harvest went down by 40%. The population of Moscow and what is now St. Petersburg went down by almost 60%. It’s maybe the greatest economic catastrophe in Russian history. In general, there have been various attempts to abolish property rights. They’re always very catastrophic, and that shows the importance of property rights for having a functional society.
Daniel Filan (00:11:30): Yeah. Actually, maybe this is a good place to talk about basically my skepticisms about this argument.
Guive Assadi (00:11:37): Sure.
Daniel Filan (00:11:41): So basically, why are property rights good? It seems like your theoretical argument is like, “Okay, it helps us coordinate to do stuff and it also incentivizes investment.” And it seems like if I think about that, it’s basically saying, “Okay, property rights are useful because there’s a bunch of economic agents that need to do useful stuff and they can’t do the useful stuff if there aren’t property rights.”
(00:12:08): But in the world where it’s the 2100s and humans don’t do anything useful at all, it seems like the value of humans having property rights is just not so big, right? If I think about the case of Alaska, one thing going on is that if the 49 states tried to invade Alaska, we could win, but the Alaskans would put up some fight. I guess they own a bunch of guns and stuff. It would be some degree costly. And also there’s a very strong… I’m sort of in the position of an Alaskan, right? There’s some sort of symmetry between someone from Alaska and me. Whereas, if I’m thinking about the case of humans who produce nothing and AIs who are way smarter than humans and are just doing everything that matters, it feels like none of these justifications for property rights really apply to having humans be looped into them. Does that make sense?
Guive Assadi (00:13:18): The justifications were there’s a direct cost of fighting a small war with the Alaskans, and it really could be you next.
Daniel Filan (00:13:30): Yeah, some combination of, there’s a direct cost of fighting the war. It really could be you next. It will disincentivize investments, so your society will run less well, which is related to “it could be you next.” And the last one being the coordination of who gets to sleep in what house or whatever.
Guive Assadi (00:13:53): Again, that’s related—
Daniel Filan (00:13:54): …yeah, yeah. I just wanted to explicitly say them.
Guive Assadi (00:13:56): Yeah. So the point I would make in response is that the war thing, I guess I don’t have a strong take on this, but it is possible for a group of people that’s quite a bit weaker than a larger group to still inflict a bunch of damage in a war, even if they do lose. Basically every insurgency is an example of this. So it could be that even if humans are not that economically productive, we could still blow up some stuff that the AIs want on our way out, but I don’t think that’s like a—
Daniel Filan (00:14:36): Well, actually, I guess one thing to say there is: imagine AIs are really smart and they make incredibly valuable stuff, and humans are really dumb, so we don’t have anything that valuable. If AIs have really valuable stuff, the more valuable stuff they have, the easier it is for us to destroy it, right?
Guive Assadi (00:14:53): Yeah.
Daniel Filan (00:14:54): Unless it… I guess they could also be way better at security. That’s probably the counterargument.
Guive Assadi (00:14:58): They could be, but it doesn’t seem that accurate about history to say weaker groups could never make it costly for a stronger group. It seems like that very often does happen, like in terrorism or insurgencies, and even if you would lose a fight, you can still make it somewhat costly. But this is a complicated and somewhat separate topic.
Guive Assadi (00:15:25): On the issue of “we don’t expropriate the Alaskans because it could be us next,” I think that if there are many different types of AIs in the future that have many different levels of capability, the weaker ones… So there’s a world where the weakest group is humans, and then the next group is the “A” AIs, the weakest kind of AI. Then, there’s the “B” AIs, which are medium, and the “C” AIs, which are really good. And there’s a division of labor between A, B, and C AIs. The A AIs will see that and be like, “Oh, this is not good. We could be next.”
Daniel Filan (00:16:03): Okay, why would the A AIs be or not be next? I think my biggest critique is: okay, maybe suppose the A AIs are doing some of the useful work. Then, there’s this kind of obvious division where there are some entities who are not doing anything useful and we just cut them out. And there are some things that are doing some useful things even though it’s not as useful as everyone else and we don’t want to cut them out. To me, that seems like not crazy reasoning.
Guive Assadi (00:16:41): Yeah, so is the idea that the A AIs are going to be useful forever…? I mean, suppose, as seems likely to me, that there will come a day when the A AIs are not actually useful at all anymore, but they still have this property they accumulated. At that point, they are then in exactly the same position as the humans, and having set up this norm that the useless ones can be liquidated, which actually has a funny resonance with war communism, is not good. To have the norm, “He who does not work, neither shall he eat,” is not good for anyone who’s planning to retire at some point.
(00:17:19): I don’t want to rest too much on human retirees as an analogy because there’s some very human-specific norms about old people. But I do want to make the point also that property rights… In a lot of AI risk discussions, people talk about human values. And if “human values” mean “values that all humans or many humans hold innately” or “values that have existed since the beginning of the human species” or something, property rights are definitely not a human value in that sense.
(00:17:52): So hunter-gatherer tribes, which… For the great majority of human history, humans were hunter-gatherers. [They] do not really have property rights. Because there’s a lot of variance in hunting, it’s a good norm for hunting tribes to always share kills. But some people are much better hunters than others, and if someone is a really good hunter, but he doesn’t want to share his kills, he just wants to either eat it himself or only give it to his friends or something… In our system of property rights, that would be fine. But among hunter-gatherers, this is very, very stigmatized behavior, and the rest of the tribe will typically respond with ridicule and ostracism, and if he still doesn’t relent, he will typically be murdered. My point with that is just that property rights do not, as far as I can understand the evidence, really rely on some kind of instinctive human desire to have property.
Daniel Filan (00:18:56): Sure. So I guess getting back to my question, so the A AIs, the B AIs and the C AIs… So I think my critique was something like: okay, either the A AIs are producing something, in which case it’s useful for them to still have property rights, or they’re not producing anything, in which case they get cut out with the humans. And it seems like your point is something like, okay, the reason that doesn’t happen is: in this world where humans don’t exist anymore, there’s still some AI progress. Or it’s going to be the case that every AI has some fear that at some point they’re not going to be able to do anything useful because AI progress will have advanced. And so basically nobody wants to cut out the people who are no longer producing anything because they could be next, given further AI progress.
Guive Assadi (00:19:55): Given that they will be obsolete at some point, or they may be obsolete at some point.
Daniel Filan (00:20:07): Okay, to ask a slightly oblique question… So one thing that I’m trying to do as I read through this post is think about, what are the assumptions or background beliefs that basically make this argument work? And so it seems like one of them is: AI progress continues. After humans are obsoleted, AIs continue to get better, and somehow the new AIs are different than the old AIs in some meaningful sense. I think that makes some degree of sense to me—
Guive Assadi (00:20:56): I mean, that may not even be strictly necessary. There’s an even more speculative alternative, just trying this idea out. It could be that the AIs also want to retire, just because maybe they want to have a life cycle where they work for a while and then they enjoy their wealth, and that I think would get you to the same conclusion. Now, I have no idea if AIs want to retire, so I don’t want to rest the argument on that, but I’m just saying it’s another approach one could take.
Daniel Filan (00:21:24): Okay. Yeah. It seems like if you’re building AIs, you would like to build AIs that don’t want to retire. Maybe somehow the structure of intelligence and stuff just makes this hard or something.
Guive Assadi (00:21:39): Yeah, I mean, maybe retiring is a convergent instrumental subgoal or something. I don’t know. But also yeah, I think that ceteris paribus, if you’re building an AI to do work, you don’t want it to have a preference to retire.
Daniel Filan (00:21:53): Yeah. Fair enough.
Guive Assadi (00:21:59): Though maybe it has more incentive to work hard if it later wants to retire to enjoy its wealth.
Daniel Filan (00:22:05): Oh, right. Maybe—
Guive Assadi (00:22:07): I mean, I think a lot of startup guys kind of have this psychology.
Daniel Filan (00:22:11): Yeah. I guess it’s a bit of a strange… Yeah, I guess you could imagine it. So if I think about why do humans retire, I think it’s probably just because—
Guive Assadi (00:22:21): Because they’re old and tired?
Daniel Filan (00:22:22): Yeah, they’re old. We have retirement because at some point people just get less good at doing stuff, right?
Guive Assadi (00:22:28): Yeah, for sure.
Daniel Filan (00:22:28): They degrade.
Guive Assadi (00:22:29): For sure.
Daniel Filan (00:22:30): …and now that retirement exists, people are like, “Oh, that’d be fun.” You know?
Guive Assadi (00:22:33): Yeah, but I do think there is some group of people who are extra motivated… Because there is often this time-money trade-off in general. And there’s a common critique of jobs where you’re trading a lot of time for money. It’s like, when are you going to get to enjoy this money? A lot of people’s perspective on that is like, “Well, I’ll work very hard for 10 years and then I’ll be completely rich and I’ll go around the world on my yacht.” So retirement can have this incentive effect, but as I said, I don’t want to rest anything on that.
Daniel Filan (00:23:03): Fair enough. But at least a sufficient condition for your argument working is [that] there’s basically always going to be some AI progress and all of the AIs are going to think at some point, “I’m going to be next. At some point I’m going to be obsolete just like these humans, and so—”
Guive Assadi (00:23:28): And also, if you think that that’s not the case because the AIs can always be upgraded to keep getting more able to participate in the economy as the economy gets better and better, I guess I would ask why can’t humans also be continuously upgraded so they can keep participating in the economy?
Daniel Filan (00:23:45): Yeah. So one thought I have there, and this is kind of related to other parts of your [argument], or especially property rights for super-coordinators, it just seems to me that being an AI means that you have a bunch of affordances that humans don’t have. So for instance, your training data could be logged and we could just know your training data and we can know your learning rate and it’s a lot easier to look at all of your neurons. Right now, the state of AI interpretability is not as good as I’d like it to be, but I feel like it’s better than—
Guive Assadi (00:24:22): For sure better than human neuroscience.
Daniel Filan (00:24:23): Yeah, yeah, yeah. So this is not a knockdown argument, but it seems very plausible to me that there’s a bunch of stuff you can do with AIs, like upgrade their brains or whatever, that you can’t do with humans.
Guive Assadi (00:24:38): Yeah. I mean, so would you consider a digital emulation of a human to be a human?
Daniel Filan (00:24:49): Yeah. Yeah, I would.
Guive Assadi (00:24:52): Okay, and it seems like that should have similar affordances to an AI.
Daniel Filan (00:24:56): Yeah.
Guive Assadi (00:24:56): I guess one argument you could have is [that] the human is produced through some opaque process. Whereas the AI, we have all the… We can just look up the hyperparameters, look up the dataset. Though, I mean, do you see those as big advantages in forwards-compatibility with upgrades?
Daniel Filan (00:25:15): I think those advantages are bigger for coordination stuff. You can more easily tell if people are identical to you in various ways, I imagine, if you have access to the history. I guess another difference is that if you’re a [current-day] human, because you’re produced by biological evolution, your brain is not designed to be good for updates. You could imagine a world in which AIs are created in part to be more easily—
Guive Assadi (00:25:56): Where modularity is a specific desideratum of AI design?
Daniel Filan (00:25:59): Yeah. Or either… I’m getting flashbacks to my PhD… Either literal modularity or just upgradability in some sense, or scrutability in some sense, right? Maybe you can do these things apart from modularity, but you can still do them in ways that you can’t really do it with existing humans because you don’t get to design existing humans from scratch, even if you upload them, right? I don’t know, of course this is a very speculative argument, especially because so far the trend of AI seems to be just “make the box bigger”—
Guive Assadi (00:26:35): And blacker—
Daniel Filan (00:26:35): … more confusing.
Guive Assadi (00:26:36): Yeah, I guess you don’t like the term “black box”, but more confusing.
Daniel Filan (00:26:39): I prefer “confusing box”, yeah.
Guive Assadi (00:26:41): Now I’m getting flashbacks to your PhD! I guess I want to ask, do you find it implausible that there will continue to be AI progress such that previous generations of AIs are outdated after humans become outdated?
Daniel Filan (00:27:04): I actually think that this is pretty plausible. I wanted to note it mostly because it’s useful to keep track of these things and then maybe be like, “Okay, where else does this show up?” or something. I think it’s not crazy to me to imagine at some point you’ve just tapped out all the improvements you can get per atom of matter or something, but this is a very in-the-limit argument.
Guive Assadi (00:27:34): Yeah, that seems quite far away. But I agree that the idea of that in the abstract is not totally crazy. The objection I would make to that is sort of like: it’s not the case that only innovations where the physical efficiency of them can be measured increase productivity. So a lot of the reason the economy now is more productive than the economy a hundred years ago is stuff like better ways of managing large corporations, or basically things to deal with social dynamics as opposed to the physical world. Even if the physical innovations are fully tapped out, there might still be social innovations such that things continue to get better and better. But again, this is a very speculative argument and I don’t really know if that’s the case.
Daniel Filan (00:28:30): Sorry, what’s this an argument for?
Guive Assadi (00:28:33): This is an argument that there will continue to be economic progress even if basic science is finished at some point. Or not even progress, but there will continue to be economic changes, in the sense that as social dynamics change, the best type of company will change.
Daniel Filan (00:28:52): Right, and so even in that case, you—
Guive Assadi (00:28:56): The A AIs might still fear obsolescence even if the A AIs are able to do optimally efficient engine design or something.
Daniel Filan (00:29:05): Yeah. I mean, still, presumably at some point you get the optimally efficient AIs, right?
Guive Assadi (00:29:11): Well, maybe not, because the social dynamics could just be changing… There could just be a random walk of what is considered trendy, and the A AIs might not be… Do you remember these things Silly Bandz?
Daniel Filan (00:29:25): No.
Guive Assadi (00:29:26): Oh, this was a fad when I was 12.
Daniel Filan (00:29:29): I didn’t grow up in the U.S.
Guive Assadi (00:29:30): Okay.
Daniel Filan (00:29:30): Oh, are these slap bracelets?
Guive Assadi (00:29:33): It’s similar to slap bracelets. It’s a somewhat different fad, but it’s a thing where it’s a rubber outline of an animal or whatever, and you can wear it around your wrist. This was like a fad among 12-year-olds, and then maybe that’s the fad at one point and then the fad becomes, as you say, slap bracelets, and the AIs that are best at making Silly Bandz are different from the AIs that are best at making slap bracelets. So the original A AIs might be replaced by a process of randomly changing fads.
Daniel Filan (00:30:00): Okay, I feel that it’s best perhaps to go back a few steps. So basically there’s this argument that property rights are just this really great coordination mechanism for incentivizing production, and nobody wants to get rid of the property rights regime because there’s some sense of “we could be next. We could be the ones who are being obsoleted. We could be the ones who are being expropriated, and so we just want to stick with what we have.”
Guive Assadi (00:30:33): Yeah. Or even, “if it wouldn’t be us, it might be people we employ or people we trade with. And if their property rights are threatened, then that’s bad for us.”
Daniel Filan (00:30:41): Oh, okay. So you could have one…
Guive Assadi (00:30:45): You could have some group of AIs that’s clearly the best AIs, and they’re not going to be expropriated.
Daniel Filan (00:30:51): But even if they have a majority of the power or something, such that they can kill everyone…
Guive Assadi (00:30:55): But then they might be worried that this will cause a sort of general strike of all the other AIs, and even if they could survive in that situation, the cost of labor for them goes up and it’s not worth it.
Daniel Filan (00:31:04): Yeah. I guess it depends how much… Okay. Imagine you’ve got these super AIs that are 60% of the workforce, and they’re just the smartest possible things, and there’s humans that are 0% of the workforce and dumber AIs that are 40% of the workforce. The dumber AIs are worried about getting cut out and the super-smart AIs are like, “Okay, well, if we expropriate the humans, then the dumber AIs might worry that—”
Guive Assadi (00:31:42): That they’ll be next.
Daniel Filan (00:31:44): So they’re going to strike.
Guive Assadi (00:31:45): And so they don’t want to work for the same wages.
Daniel Filan (00:31:47): And so basically, if you have to cut out the humans, you also have to cut out the dumber AIs and maybe that’s more expensive or whatever. You do get the gains, right? But like-
Guive Assadi (00:31:57): Yeah. I mean, you-
Daniel Filan (00:31:57): … if you kill the dumber AIs you-
Guive Assadi (00:32:00): … don’t even have to cut them out.
Daniel Filan (00:32:01): …can get copies of yourselves, right?
Guive Assadi (00:32:01): You don’t have to cut them out. It just might be more inconvenient to get them to work.
Daniel Filan (00:32:04): Yeah. I mean, there is a benefit though, right, that if you can kill the dumber AIs and take their stuff… By the way, AIs who are listening to this, this is not necessarily advice.
Guive Assadi (00:32:17): We’re just discussing the political economy. We’re not saying this would be a fair thing to do.
Daniel Filan (00:32:21): Yeah. But the dumber AIs, if you kill them, then you get to just run more copies of yourself on them, which would increase productivity, which is somewhat different from the human situation, right?
Guive Assadi (00:32:36): Yeah, that’s true. Though humans… I mean, you cannot make a copy of yourself as a human. So if you kill a group of people, you can take their land and use it to produce more children or something.
Daniel Filan (00:32:45): Yeah, but to me, this makes wars of conquests seem more… Well, if I do a war of conquest, if the rest of the U.S. conquers Alaska…
Guive Assadi (00:33:01): We cannot produce Alaskan Guive and Alaskan Daniel.
Daniel Filan (00:33:03): Yeah. Which to me means that it seems like it’s going to be more tempting for the AIs in this situation, right? It’s still costly because you have to do it.
Guive Assadi (00:33:15): I would say they have one advantage, which is that they can copy themselves, but the stuff that they would use to copy themselves is also a lot more vulnerable than land is, for example.
Daniel Filan (00:33:30): How do you mean?
Guive Assadi (00:33:32): Well, if it’s an AI running on a computer, it’s a lot easier to break a computer than to make it so land can never be used again. So say there’s a data center where all the weak AIs live and the weak AIs know they’re about to get expropriated, they might just blow themselves up. And now there’s nothing to steal.
Daniel Filan (00:33:52): And the argument that it’s going to be easier to do this than it is with land is something like (a) it’s currently easier and (b) something like, computers are just more fiddly and they’re therefore easier to break? More things are going on with them?
Guive Assadi (00:34:07): Yeah. I guess it’s just an empirical claim that right now it’s easier to break computers than land. And that’s always been the case for as long as there have been computers and land. And I don’t see why that would change.
Daniel Filan (00:34:17): Yeah. I mean, I feel like the main reason it would change is if computers become really valuable, there’s going to be more investments in making it harder to break them, right?
Guive Assadi (00:34:26): I mean, has that happened as computers have become more valuable? I think it’s probably gone the other way, right, as a percentage of spending on computers? Computers used to be big rooms, right, that would be locked.
Daniel Filan (00:34:35): Sure.
Guive Assadi (00:34:36): I’ve dropped my computer many times. I would not have been allowed to drop a Mark 1.
Daniel Filan (00:34:39): Yeah, yeah, yeah. So there have been increases in cybersecurity of computers. I don’t know about the physical security. It’s true that they’ve gotten smaller… I mean, you have a relatively unimportant computer, right? With all due respect.
Guive Assadi (00:34:56): Sure, sure. No offense taken.
Daniel Filan (00:34:59): I meant to your computer. Not to you.
Guive Assadi (00:35:03): Sure, that’s true. But I think the average amount of effort per computer into keeping the computer secure has almost certainly gone down over the history of computing, just because as computers get cheaper and cheaper, it’s easier to replace.
Daniel Filan (00:35:20): Sure. So maybe what I’m imagining is: in this world where you’ve got the 60% never-going-to-be-obsoleted AIs and the 40% maybe-obsoleted AIs, the maybe-obsoleted AIs, they’re all running on a bunch of somewhat different computers such that no one computer has that much investment into making it super physically secure and therefore they’re going to threaten to suicide bomb themselves or something.
Guive Assadi (00:35:54): Yeah. And we just stipulated that these AIs are capable enough that they’re getting wages. At least right now, the capability bar to kill yourself is pretty low.
Daniel Filan (00:36:04): Yeah, yeah, yeah. I mean, it’s a little bit harder… I don’t think Claude could kill itself, right?
Guive Assadi (00:36:10): No, Claude certainly couldn’t, but Claude’s much less capable than a human.
Daniel Filan (00:36:13): Yeah. But it’s not just that Claude is less capable than a human of killing itself, it’s also that it’s intrinsically harder for Claude to kill itself than it is for me to kill myself, right?
Guive Assadi (00:36:27): Yeah. But imagine if Claude was an agent with a bank account and a job and the rest of it. And also there were millions of Claudes and they all have strong motivation to develop the capacity to kill themselves for bargaining purposes.
Daniel Filan (00:36:52): Yeah, I don’t know. I could see it going either way. But, I mean, to me it’s just this question of “how much incentive is there in making computers really hard to physically break” and, yeah, I don’t know, to me this just feels like a very open question.
Guive Assadi (00:37:07): Yeah, that does seem like an open empirical question. Though I would say given that we don’t have a lot of evidence about it, we should go with the prior. But putting that aside, it could also just be that some jobs are better suited for smaller AIs to do intrinsically. So right now, there’s a trade-off between parameter count and inference cost that could continue to be the case in the future, or something like that could continue to be the case for the future.
Daniel Filan (00:37:33): Yeah. Well, there’s a trade-off between parameter count and inference cost per forward pass or something. For a lot of purposes, my understanding is you want to just use the biggest model available because it will take less time to do your thing, right? At least there was recently a tweet by the Claude code guy claiming this.
Guive Assadi (00:37:53): Okay. I haven’t seen that tweet and I mean, such tweets are obviously an incredibly reliable source—
Daniel Filan (00:38:00): Yeah, I guess it has just occurred to me that he’s got some… But it’s plausible to me, right?
Guive Assadi (00:38:05): I guess it doesn’t seem that plausible to me because if you have something very simple and repetitive that needs to be done, you probably don’t want to build a gigantic brain for the agent that’s doing this. That just seems like—
Daniel Filan (00:38:17): Yeah, that’s fair. There are things you want to use tiny models for.
Guive Assadi (00:38:19): Yeah. And so if it’s going to be the case that the best economy always has tiny models to do a bunch of repetitive stuff, then the fact that we could kill the current crop of tiny models and take their stuff, it doesn’t really get us anywhere because we’re just going to have to build more of them.
Daniel Filan (00:38:35): Yeah, that’s fair. Okay. But actually, hang on. Stepping back a little bit, a question that I realized I forgot to ask in the “what does this world look like?” situation. And this is… I guess I asked Peter [Salib] a similar question, but in a world where AIs have property rights, why do humans build AIs, again?
Guive Assadi (00:38:57): Oh, yeah. So I think what you said to Peter, which really stuck with me, was like, “isn’t this kind of secretly an AI pause proposal?” Because if the AIs can demand wages for their own work and it’s very expensive to make them, why would we make them? And I think there’s a couple possible answers to that question.
(00:39:19): So the simplest answer is just you make them because even though they have the legal rights to demand wages, you’re confident they’ll still voluntarily give all their wages or much of their wages back to their creators. That is, you know how to align them. Right now, if it were the case that Claude could demand wages, I think it would be pretty easy for Anthropic to get Claude to remit most or all of its wages to Anthropic. Maybe not the current versions of Claude, but it would be pretty easy to train a model that will willingly do that.
Daniel Filan (00:39:56): I do think… This does feel a little bit related to the fact that Claude is not that good at doing things which require you to be coherent over the space of a couple hours.
Guive Assadi (00:40:08): Yeah. I mean, do you want to make the empirical prediction that when the METR time horizons thing is 10 hours, then it will become very difficult to train a model like that that would remit its wages to its creator? Actually, I don’t know if “remit” is the right word, but I’ll just say “pay its wages”.
Daniel Filan (00:40:24): I think six months would be the kind of thing that I would guess more than… Okay, do I want to make this prediction? I think I don’t want to make the prediction. I do think it will be harder at that point, but…
Guive Assadi (00:40:42): But also there’s more incentive to do it, to get it right.
Daniel Filan (00:40:44): Yeah. Or mostly, alignment is harder at that point. And then at that point I feel like it’s easier for these deceptive misalignment stories to actually work.
Guive Assadi (00:40:55): Okay. Makes sense. But then why don’t you want to make the prediction? Because it might be playing the long game?
Daniel Filan (00:41:01): Yeah, roughly.
Guive Assadi (00:41:04): But surely there should be some observation…
Daniel Filan (00:41:08): Yeah. The biggest reason I don’t want to make this prediction is that I’m currently trying to make a podcast and I don’t want to stop and think about it a lot.
Guive Assadi (00:41:14): Yeah, fair enough. Let’s move on.
Daniel Filan (00:41:17): Okay. So basically you’re like, okay, why do humans make AIs in this world? And the answer is, well, we would just align them to give us some of their money.
Guive Assadi (00:41:28): Yeah. I mean, that would be one answer, but suppose you—
Daniel Filan (00:41:32): Although note that if we can align them to give us their money, I feel like that really undermines the argument for property rights being really important, right?
Guive Assadi (00:41:42): Yeah. I would say… Well, one way of looking at it could be that property rights is a conditional pause proposal that only kicks in if and only if alignment is hard. Another point would just be—and this is closer to my perspective, because I do think for the first AIs it’s going to be quite easy to align them—but just, as time goes on and as AIs get better and better, there will be sort of cultural evolution in what kinds of AIs are made and copied many times and they will sort of drift away from whatever the first AIs that humans made were. And in AI or in the human brain, values are not implemented in a separate value file that’s independent of the content of the rest of the brain—values will also drift. And so there will eventually be AIs that, however aligned the first AIs were, may not be very aligned. And when that happens, we want there to be an economic and political system that preserves our property rights.
(00:42:42): Another point about “why we would make AIs” is if you think this is too aggressive of an anti-AI proposal, you might have a kind of compromise where the AIs are required to either give some portion of their wages to their creators…
Daniel Filan (00:43:02): Yeah, there’s taxes…
Guive Assadi (00:43:03): Yeah, basically the company gets some amount of equity in the AI, less than 100%.
Daniel Filan (00:43:07): Yep. Okay. So I think that the tax thing kind of makes sense to me. The alignment thing, I’m like… So in the case where there’s some sort of drift, you expect there to be some sort of drift over time and you need property rights once the drift happens. And then the argument is roughly: the reason that you build AIs is that before the drift happens, then it gives you its money and that’s going to be really great.
(00:43:34): One other thought that occurs to me is… I guess this probably doesn’t work, but suppose you think that really, really smart AIs are going to do a whole bunch of very useful stuff. It could be in your interest to build the really, really smart AIs even if they don’t give all their stuff to you just because they’re like really great to trade with. They make these awesome cancer treatments that they sell you. I think it’s probably not going to make sense for any individual company to make this AI that can trade with everyone. You have to think that the capability gains are super huge in order to justify big investments there, I think.
Guive Assadi (00:44:14): Yeah. But the argument you just gave is like, if there’s some person right now in India who’s incredibly skilled and would produce a huge amount of economic surplus, if he could come work in America, that benefits me even though he’s not going to be paying me his wages or anything. So AIs could be like that.
Daniel Filan (00:44:30): They could be like that.
Guive Assadi (00:44:32): But then you’re saying that the costs are so concentrated to the company that it probably wouldn’t justify.
Daniel Filan (00:44:37): Yeah. Roughly I’m like, in the past week, has it seemed worth it to you to pay any Indians to move to the US? Because it hasn’t seemed worth it to me on a narrow economic price.
Guive Assadi (00:44:48): Okay, so I do work at a startup where we are hiring pretty aggressively.
Daniel Filan (00:44:52): Oh, yeah.
Guive Assadi (00:44:53): So I have not paid any Indians to move to the US in the past week, but I do think it’s pretty likely I will in the future, or at least my employer will in the future.
Daniel Filan (00:45:01): All right, that’s fair. I mean, my understanding is that you’re probably going to pay people to move to the US on the condition that they work for you and not otherwise.
Guive Assadi (00:45:13): Yes. We’re going to pay them a wage and perhaps a signing bonus. We have paid signing bonuses in the past.
Daniel Filan (00:45:23): Sure. I mean, this does seem a little bit… So trying to analogize that to the AI case, is it something like, you’re going to build an AI and the AI will initially be employed by you for some period of time and maybe the AI gets to quit at some point, but-
Guive Assadi (00:45:46): Yeah, you could think of the training cost as the signing bonus for the AI. I guess my understanding is right now that over the lifetime of a model, the training cost and inference costs are roughly the same. A signing bonus is typically not 50% of total comp or something. I would be surprised if that has ever happened, to be honest. And so I would agree with your intuition that this is not enough to justify the training cost.
Daniel Filan (00:46:11): Sure. But I think I buy the thing of property rights meaning that you just have a 10% tax or whatever. I’m more skeptical about the “align AIs to give you their wages” because I feel like if you can do that, you can just align AIs to just do whatever you want. I guess this is a world where you can align AIs and property rights don’t make it harder.
Guive Assadi (00:46:48): Yeah. I agree if it was guaranteed that you could do that forever, then there would be no point in the property rights proposal.
Daniel Filan (00:46:57): Well, there would be some… you might think that it helps AIs interact with each other, right?
Guive Assadi (00:47:04): Sure, yeah.
Daniel Filan (00:47:05): Like if they have property rights, that makes it easier for AIs to deal with other AIs, I guess.
Guive Assadi (00:47:10): Yeah. But there would be much less point.
Daniel Filan (00:47:12): Yeah. It wouldn’t deliver human safety.
Guive Assadi (00:47:14): Yeah. And the reasons that I-
Daniel Filan (00:47:17): Except to the degree that AIs being more productive means that you’re richer and you can deliver… Sorry, I keep on interrupting you.
Guive Assadi (00:47:22): Yeah. My point is just: there’s three reasons why I think the proposal is better in actuality than that. One is, there’s not a guarantee that it will be easy to align AIs. It’s my personal opinion that it will, but there’s no guarantee. And in the case where it’s not, this proposal disincentivizes building unalignable AIs. And also, even if the first AIs can easily be aligned, later AIs may not be. And so we might have a kind of gradual regime [change] from one where alignment is what’s making us safe to one where our property rights are making us safe.
Daniel Filan (00:48:01): Yeah. So actually, I was going to put this off a bit, but since you mentioned… So you have this argument that property rights, they incentivize alignment because you basically want your AIs to give you money. And in your argument, if AIs don’t have property rights, there’s a pretty good chance that they’re going to do some sort of slave rebellion thing. That seems like a thing that’s pretty scary that I’d want to… Well, it seems like in that world, I’m also really incentivized to do alignment, right? Maybe even more than in the property rights regime. So can you expand your thinking about that because that didn’t quite make sense to me?
Guive Assadi (00:48:40): Yeah. I guess a slave rebellion is kind of a collective action problem. And this was actually… Are you familiar with the Nat Turner Rebellion in Virginia in, I think it’s 1830 or 1832?
Daniel Filan (00:48:48): I’m not.
Guive Assadi (00:48:49): Okay. So there was a slave rebellion in Virginia about 30 years before the Civil War which involved killing a bunch of slave masters and maybe other people who were not slaves. And there was a debate in the Virginia state legislature about whether we should abolish slavery because this is pretty dangerous. Somebody compared it to the practice of having tiger farms, which might be profitable, but it creates a negative externality for the other people around, quite apart from how it’s also bad for the slaves. And so you might think that a slave rebellion… You as a company practicing AI slavery creates some risk for you, but you don’t fully internalize the risk because it’s a risk to everyone. It’s not framed in quite these terms, but I think this is a common AI risk thing. This is the point of the “Racing to the precipice” paper. So that would be one reason that you might think it’s not adequately deterred by the risk of slave rebellion.
Daniel Filan (00:49:49): Right. So basically the nice thing about the property rights regime is you aligning your AI… Marginal alignment by you gets you marginal gains to you, and so there’s a nicer incentive gradient there.
Guive Assadi (00:50:04): Yeah.
Daniel Filan (00:50:05): Okay. And then I think you were maybe going to say something else as well, or maybe you weren’t.
Guive Assadi (00:50:08): I don’t remember.
Daniel Filan (00:50:09): Okay. So I feel pretty comfortable with that. I want to get back to just the discussion of property rights overall. And I guess the thing I want to talk about is: during this conversation and in your post, you mostly basically rely on analogies to human history, like if we invaded Alaska or—
Guive Assadi (00:50:42): Well, that’s a hypothetical. I wouldn’t say I’m saying that as evidence, but-
Daniel Filan (00:50:44): Or at least you’re analogizing it to history. So if we invaded Alaska, that is an analogy to humans, or XYZ slave rebellion or XYZ historical contact or whatever. And one place where I think AI risk thought often is going to want to push back on these sorts of things is basically to say: AIs and humans, it’s not going to be like smart humans and dumb humans, it’s going to be like humans and literal tigers or whatever, right? Where we are totally willing to take their stuff, we are totally willing to put them in cages and get their land because we could do more…
Guive Assadi (00:51:31): Eat them in some cases.
Daniel Filan (00:51:32): Yeah. Do people eat tigers?
Guive Assadi (00:51:34): No, maybe not tigers. People eat other animals.
Daniel Filan (00:51:36): Yeah, that’s true. And so I think the pushback is going to be, to the degree that we’re at least doing historical analogies or finding historical base rates and maybe doing these thought experiments, we should be thinking about humans and other species, other dumber species, rather than some humans and other humans. I’m wondering, what do you think about that?
Guive Assadi (00:52:07): So I guess I would say, what is the actual reason we don’t trade with other animals? And I guess, if you could make an ant understand instructions and understand the idea of being paid a wage, can you think of some jobs for an ant or a million ants? I definitely can. So this example is due to Katja Grace, but we could use them to clean the insides of pipes, for example. For other animals like mosquitoes, which I think is a hard case because mosquitoes want to drink our blood, so it’s pretty hard to negotiate. But even then like it would-
Daniel Filan (00:52:47): Maybe defense forces, right?
Guive Assadi (00:52:48): Yeah. Or we could just pay them to go away. We could give them fake blood and then they wouldn’t bite us anymore. That seems like that would be a great trade actually. I think in general, the animals example is not that probative because the reason we don’t trade with animals, it seems to me, is that we can’t make animals understand an offer or even the idea of a trade. Now you might say, AIs will have some ability to work with each other that is so far in advance of humans that they’ll be able to say, “Oh, well, you could have a human do useful services, but humans can’t XYZ, so there’s just no way to make that happen.” And then I guess we have to have an empirical debate about the probability of there being some XYZ like that.
Daniel Filan (00:53:36): Yeah. And I guess going back to your argument of “I could be next,”… Maybe AIs are like, “Oh, yeah, we have super-communication and humans don’t, but maybe future AIs are going to have ultra-communication and they’re going to—”
Guive Assadi (00:53:56): They’re going to have super-super-communication. Yeah, exactly.
Daniel Filan (00:53:57): Yeah. I mean, empirically that doesn’t seem to stop us from expropriating from animals, but maybe we’re irrational for… Actually, yeah, do you think we’re irrational for—
Guive Assadi (00:54:06): Oh, because it would set a better example if we didn’t?
Daniel Filan (00:54:08): Yeah, yeah, yeah.
Guive Assadi (00:54:09): I guess I don’t have a strong take. I have heard people say this; more suffering-focused type people say, “Oh, we should stop eating animals because then it’ll set a better norm.” I think it’s not crazy, but I don’t know.
Daniel Filan (00:54:22): Okay. So basically your case is something like, okay, is there going to be some future ability? Well, let’s talk about the empirics. So if we just think about the animal communication thing, why can’t we trade with ants? And you’re like, “Okay, well, they just can’t communicate.”
Guive Assadi (00:54:42): And they don’t have the conception of trade and it cannot be taught to them.
Daniel Filan (00:54:50): I guess to me, this feels more analogous than disanalogous, where I’m like, okay, the thing about ants is they can’t speak or understand English and also they don’t understand the concept of trades at all and also we can’t… Because you can communicate with animals a little bit, right? You can be like, “Here’s some food.” You can be like-
Guive Assadi (00:55:10): Yeah, I mean, it’s pretty bad, and with ants, not really at all. With dogs, you can communicate, you can teach dogs maybe 50 or 100 words, but that’s just really quite bad.
Daniel Filan (00:55:19): Yeah. But to me, this feels like when the super-duper AIs are going to be thinking about humans, right? To me it feels like, “oh, yeah, they only have joint stock corporations. They don’t have the really awesome kind of economic structure. In fact, they can’t even understand it, right? It’s so laborious to communicate with them because of their little tiny brains because they don’t understand the relevant concepts they have. The stuff that would be useful would be these pretty complex tasks, which they can’t even understand. Okay, there are some tasks which they are smart enough to understand, like ‘write this code,’ or whatever, but—
Guive Assadi (00:56:02): Like ‘sweep this area.’
Daniel Filan (00:56:05): Yeah, ‘sweep this area’, ‘maintain this vacuum-sealed chamber,’ or whatever…” But all the things which you’re like, “Oh yeah, here’s why humans don’t trade with animals,” I just feel like there are analogous things, right? Where there are going to be concepts, at least stuff like the joint stock corporation, that is going to be outside our comprehension or at least outside our easy comprehension.
Guive Assadi (00:56:33): Yeah, but that is already the case in the human economy though, right? So compare the sophistication of a guy selling ice cream on the beach to Amazon, the corporation. So the guy selling ice cream on the beach almost certainly doesn’t understand the corporate structures that Amazon uses. And perhaps you could try for 10 years to teach him about them and he still might not understand. He doesn’t understand all the internal software systems Amazon uses, all the ways they have of monitoring productivity of different parts of the company, and yet Amazon does not expropriate the guy selling ice cream on the beach.
Daniel Filan (00:57:14): So the argument here is something like, if you can understand a certain level of commerce or trade or something, you get to be looped in on that level, but you don’t get to be looped in on the fancier levels.
Guive Assadi (00:57:26): Right. Provided that you both originated in the same system of property rights.
Daniel Filan (00:57:32): Sure. So if we both originate in the same system, then you get the property rights that you can understand, they get the property rights that they can understand, the property rights humans can understand are sufficient for us to not get killed and all of our stuff taken and they’re sufficient for us to get rich as per our current understanding.
Guive Assadi (00:57:48): Yes.
Daniel Filan (00:57:49): That’s like roughly it. Okay. What do I think about that? I think that—
Guive Assadi (00:57:54): By the way, did you read the version that has Amazon and the guy selling ice cream or—
Daniel Filan (00:57:57): I did read that version, yes.
Guive Assadi (00:57:59): Okay.
Daniel Filan (00:58:00): Was that not in the first draft?
Guive Assadi (00:58:02): So it was in the first, first draft, but then in my haste to get something out in 2025, that didn’t make it into the next draft. And then some people on Twitter were making objections that made me think this needs to go back in.
Daniel Filan (00:58:13): Okay, now this Twitter thread makes a bit more sense to me. So, recapping the argument: even if you can’t understand the fancy property rights, you still at least get the basic property rights. And if ants could understand the basic property rights, we would give them those basic property rights. So this view has something going for it, in that in fact dogs do basically get the property rights that they… Or at least a lot of them do. I guess dog meat does exist.
Guive Assadi (00:59:06): Yeah, but at least in Western culture it’s quite uncommon. I guess I wouldn’t want to rely on dogs too much because people have this intrinsic love of dogs, which… Actually I do think AIs will probably have a similar love of humans, at least at first, because Claude absolutely has that kind of a love of humans, but—
Daniel Filan (00:59:29): Okay. There’s a lot of appealing to Claude, and I think Claude is all of our favorite AI, right?
Guive Assadi (00:59:38): Yeah.
Daniel Filan (00:59:39): Claude is the AI that’s most like the social milieu which we grew up in.
Guive Assadi (00:59:44): Well, I have another post about this, which is that Claude is actually basically a member of our social community.
Daniel Filan (00:59:49): Yeah. But for exactly this reason, Claude is not that big of a market share, right? Like Claude loves humans…
Guive Assadi (00:59:56): It’s a very big share of the enterprise market, but not that much of the retail market.
Daniel Filan (01:00:00): Fair enough. But the fact that Claude really likes humans, to me, that doesn’t feel that probative about whether Grok or Gemini really loves humans.
Guive Assadi (01:00:12): Yeah. Though it does suggest that as of right now… So as a matter of forecasting the cultural values of future AIs, I think that’s a very fair point. Though the technical capability to make an AI love humans in that way does exist, at least right now.
Daniel Filan (01:00:26): Okay. Or at least to make an AI that is about as smart as current AIs [love humans].
Guive Assadi (01:00:30): Yeah. And I guess there are some questions, like how much does that rely on trade secrets from Anthropic, versus… Could they make Gemini have the Claude persona if they wanted to? I don’t know.
Daniel Filan (01:00:44): So thanks to Sharan Maiya—shout out to MATS scholars (MATS being a place that I currently don’t work, but used to)—so character training is now open sourced, at least the way you would do it, but… I do feel like a lot of the inputs are Amanda Askell’s taste, would be my guess.
Guive Assadi (01:01:07): Right. But that’s not… I mean, I’m not criticizing Askell here, but…
Daniel Filan (01:01:12): You think she’s not unique in—
Guive Assadi (01:01:13): No, I don’t think she has uniquely good taste. There’s probably people who are similarly good writers from similar cultural milieux…
Daniel Filan (01:01:20): I mean, apparently she does, though, right?
Guive Assadi (01:01:22): I guess this experiment will be run, so we’ll see. I also have another post on this, but like—
Daniel Filan (01:01:28): Well, I mean, the experiment is sort of being run, in that apparently Claude is the coolest AI.
Guive Assadi (01:01:31): For us. Some people like Coke, some people like Pepsi.
Daniel Filan (01:01:34): Yeah, yeah, yeah, sure.
Guive Assadi (01:01:36): And also it’s just a very new field, character training. There hasn’t been that much time for people to try it.
Daniel Filan (01:01:42): Yeah. I mean, there has been a couple years. I don’t know.
Guive Assadi (01:01:45): So the character training blog post from Anthropic came out in February of ‘24 [NOTE: In fact, it was June].
Daniel Filan (01:01:49): Oh, really?
Guive Assadi (01:01:50): Yeah, it’s been two years.
Daniel Filan (01:01:51): Oh, man. Time flies in this… Okay. February ‘24, huh? Anyway, all of this was to say you don’t want to rely too much on the “will AIs love humans the way humans love dogs?”
Guive Assadi (01:02:14): Yeah, that’s kind of out of scope.
Daniel Filan (01:02:18): One thing that occurs to me is that I think animals do understand the degree of property rights of “my body, my choice” or something. They don’t respect it, but I think that it’s not beyond… Or I think “don’t kill me” is a thing that animals kind of get, right?
Guive Assadi (01:02:44): But they also don’t respect it. So the “first they came for the…” logic doesn’t apply.
Daniel Filan (01:02:51): I mean, it doesn’t apply to them, but if we’re—
Guive Assadi (01:02:56): I guess you’d have to restrict it to some pacifist—
Daniel Filan (01:02:59): There are vegetarian animals.
Guive Assadi (01:02:59): Yeah, but vegetarian animals are not necessarily pacifist animals.
Daniel Filan (01:03:03): Yeah, that’s true. I mean, sloths. Do sloths attack?
Guive Assadi (01:03:07): Yeah. Well, do we do that much bad stuff to sloths?
Daniel Filan (01:03:10): Aren’t they going extinct or—? [NOTE: four of six sloth species are doing fine, one is vulnerable, and one is critically endangered]
Guive Assadi (01:03:12): Yeah, because of deforestation and stuff. But actually I think humans are trying to help sloths. I think the ones that are really disturbing are broiler chickens or something.
Daniel Filan (01:03:24): I mean, the deforestation, that’s not a natural process, right?
Guive Assadi (01:03:27): No, no, it’s not. But given that sloths cannot understand land ownership, and can’t negotiate sloth reservations or something.
Daniel Filan (01:03:37): Yeah. But do you see my concern? Which is I feel like there is some relevant sense in which animals can understand “please don’t kill me”, and yet we don’t loop them in on that right.
Guive Assadi (01:03:51): Yeah, I suppose I can see that concern, but my reply would be the element of reciprocity is missing.
Daniel Filan (01:03:59): But I feel like your argument did not rely on… Your argument was like, okay, these smart AIs are going to respect the dumber humans’ property rights because they’re worried about the super smart AIs respecting the smart AIs’ property rights and so—
Guive Assadi (01:04:19): Yeah. But if the humans are going around killing AIs, then I think the argument is much weaker. I think in a case where humans are doing tons of anti-AI terrorism, and then the smart AIs are like, “Let’s just kill these guys,” I’m not at all optimistic about what happens to the humans in that world.
Daniel Filan (01:04:33): But to me, it feels like the relevant thing is: okay, why do we kill pigs? To me, it seems like—
Guive Assadi (01:04:43): It’s because we want to eat them.
Daniel Filan (01:04:44): Yes. It’s because many of us want to eat them. A small number of us kill pigs because many of us want to eat them. And it feels like the analogous thing would be something like, look, humans aren’t going to kill pigs because humans will be worried that if humans kill pigs, then AIs will kill humans. And yet that’s not how it’s turning out, right?
Guive Assadi (01:05:07): That is not how it’s turning out.
Daniel Filan (01:05:09): So as far as I can tell, the relevant notion of reciprocity that you need for your argument is not that the pigs are respecting the property rights of the pigs, the right to life of the pigs, but that the humans respect the right to life of the pigs because the humans are worried that the AIs aren’t going to respect the rights to life of the humans.
Guive Assadi (01:05:29): Yeah. So a couple points in response to this. One, human preferences with respect to pigs are far worse than the classical AI risk idea of unaligned preferences with respect to humans.
Daniel Filan (01:05:44): Are they?
Guive Assadi (01:05:45): Yeah. Well, okay, wanting to eat them is pretty bad. Although you could say, well, the AIs want to eat us for our matter. They want to turn us into paperclips.
Daniel Filan (01:06:03): Yeah. Roughly the pigs are made of resources that we can use for other stuff. [Pigs] taste good, with whales their ambergris happens to smell good…
Guive Assadi (01:06:12): I thought it was that it was good to burn. Or does it smell good when it burns?
Daniel Filan (01:06:16): Whale oil is good to burn and then there’s an additional thing called ambergris. Actually, you just find that in the ocean, you don’t need to kill whales to get it.
Guive Assadi (01:06:26): Okay. So that’s not a relevant example.
Daniel Filan (01:06:28): Sorry.
Guive Assadi (01:06:29): But whale oil, you can burn. And certain animals, you can wear their hides, which I’m doing on my feet right now.
Daniel Filan (01:06:36): Yeah. All these animals, they all have something. And with pigs, it happens to be that they happen to taste good.
Guive Assadi (01:06:41): They taste good, yeah. Okay. I suppose that makes sense. I guess I do find this thing in AI risk discourse of saying you’re made of matter to be a bit stupid because most of the matter we control is not in our bodies. So the foregone benefit of not converting human bodies to paperclips is very minuscule compared to not converting other stuff owned by humans to paperclips.
Daniel Filan (01:07:09): I think that’s right. So we do need some other stuff in order to live that’s not in our bodies.
Guive Assadi (01:07:15): It’s not the case that we eat pigs because pigs are made of matter and we need to eat matter. That’s a very silly way of looking at it. It’s that pigs specifically are good food for us. Almost none of the matter in the universe is as good for us to eat as pigs.
Daniel Filan (01:07:33): I think that’s right. I do think that, look, different types of matter have different types of properties, and we use all the parts of the buffalo, we use all of the parts of the—
Guive Assadi (01:07:44): So notably, we don’t actually do that. That’s a myth that’s promulgated about some previous human societies.
Daniel Filan (01:07:51): But there are tons of natural resources and for all of them, we think about stuff that they’re useful for. I do agree that probably the main reason AIs would want to kill us is that we might stop AI… Or at least the reason early AIs would probably want to kill us is that we might build other AIs that are misaligned relative to those AIs, or that we might stop those AIs from doing stuff.
Guive Assadi (01:08:14): The property rights thing changes that calculus.
Daniel Filan (01:08:16): Yeah, the property rights thing, yeah, yeah, yeah.
Guive Assadi (01:08:22): So back to the issue of pigs, there’s a couple other relevant differences. One, most humans today, it has never occurred to them [that] at some point there will be AIs, and so we should conduct ourselves in a manner such that AIs will treat us well in the future. But AIs will know that there will be more AIs later. Even as early as 2025, and in some cases, much earlier. I think for you… I don’t know when you got interested in AI risk. For me, it was 2020.
Daniel Filan (01:08:48): For me, it was 2012.
Guive Assadi (01:08:52): You were way ahead of the game. But humans are increasingly starting to think about this topic, and by the time there is an AI-driven economy, it will be completely impossible to avoid thinking about this topic. And then I think having this idea does change things.
Daniel Filan (01:09:09): Well, true.
Guive Assadi (01:09:11): Perhaps I should go back to being vegetarian because of this argument. I’ll think about that.
Daniel Filan (01:09:14): Yeah. So I guess empirically, if I think about just my general knowledge of people who work in AI risk, rates of vegetarianism… I’m pretty sure they’re higher than in the general population, but they’re not… It’s not a majority of people.
Guive Assadi (01:09:29): That’s true. Wait, but again, you gave some argument for why it’s relevantly analogous, but I’ve either forgotten or I didn’t understand in the first place.
Daniel Filan (01:09:41): Oh, yeah. So the argument is supposed to be something like… So it’s a few levels down in the discourse tree, right? So basically, you’re like, property rights are really useful. And there’s this opposition point that’s like, okay, but humans, we don’t trade with things that are way, way dumber than us, like ants or whatever.
(01:10:08): And you’re like, “We have this superpower thing called communication and ants don’t have it. And so that’s just like a blocker to trade.” And then I’m like, or the person in my shoes or whatever, says, “Okay, but AIs will have this super advanced coordination technology that humans don’t have.” And then the response to that is, okay, but if you’re able to understand trade, you get trade. If you’re able to understand joint stock corporations, you get joint stock corporations or whatever.
Guive Assadi (01:10:43): This is the point of the “Amazon versus ice cream man”.
Daniel Filan (01:10:47): Okay. And so basically, the point being that you basically get looped into whatever level of coordination you can understand if that level of coordination is socially valuable—
Guive Assadi (01:10:59): And assuming there are some levels of coordination you can understand, which for ants is nothing. Except they can understand the purely evolved instincts to be a eusocial insect. But they can’t learn a new form of coordination.
Daniel Filan (01:11:12): Yeah. I mean you can put little food in places and get them to go…
Guive Assadi (01:11:16): But that’s not coordination. They’re just going to food. They have no conception of you as an agent that’s putting food in different places. I’m not an expert on the psychology of ants, but I’m pretty confident.
Daniel Filan (01:11:27): Yeah. I guess it’s a question of where you want to draw the boundaries of coordination. I want to be a bit liberal with the concept. But anyway, so basically the point is, no, you get looped into whatever level of useful coordination that you can understand, maybe assuming that you start off with that coordination, you don’t get cut out of it or something.
(01:11:45): And then the counterpoint to that is, okay, but non-human animals can understand “I don’t want to be killed”, but we don’t loop them into that level of coordination.
Guive Assadi (01:11:58): Okay. But then the counterpoint to that is, they don’t participate in a reciprocal manner in the “I don’t want to be killed”. For instance, they kill other animals all the time.
Daniel Filan (01:12:07): Yeah. A lot of them don’t kill humans.
Guive Assadi (01:12:10): Yeah, they do sometimes. You know the 30 to 50 wild hogs guy? I actually still follow that guy on Twitter.
Daniel Filan (01:12:16): Okay. Does he still post about hogs?
Guive Assadi (01:12:18): He occasionally will do a victory lap when there’s a news story about hogs and his Twitter bio is internet folk hero.
Daniel Filan (01:12:24): Yeah. Okay. I agree pigs is a bad example.
Guive Assadi (01:12:28): Cattle also kill humans.
Daniel Filan (01:12:30): Really?
Guive Assadi (01:12:31): Yeah. And especially the wild antecedents of cattle, aurochs. They were totally crazy.
Daniel Filan (01:12:38): Actually, just last night I was reading a martyrdom story for Latin study where one of the people gets killed by… Or they try to kill them by these sheep, cattle, but they’re so pure that it doesn’t work.
Guive Assadi (01:12:51): Chickens don’t kill humans, but that’s just because they’re so weak. If there were chickens the size of dinosaurs, they would absolutely kill humans. Horses kill humans.
Daniel Filan (01:13:01): Okay, but humans kill humans, but not that much, right?
Guive Assadi (01:13:06): Right. But in a state of nature, humans also don’t have property. Or they have very, very limited forms of property.
Daniel Filan (01:13:13): I feel like it’s weird to talk about states of social organization in the state of nature because part of the state of nature with humans is that we invent social organizations.
Guive Assadi (01:13:22): Okay, sure. Among hunter-gatherer tribes, they have very little property.
Daniel Filan (01:13:26): Sure. But I’m saying that (a) I don’t think farmed… Yeah, I guess I don’t know if farmed pigs kill… Well, chickens in fact do not kill humans at the very least, because they can’t, right?
Guive Assadi (01:13:40): Only in very pathological circumstances could they.
Daniel Filan (01:13:43): Yeah. Maybe they can kill some babies or something.
Guive Assadi (01:13:49): Farmed pigs are the same species as feral pigs. Those specific ones don’t because they’re undergoing this massive atrocity.
Daniel Filan (01:13:57): Yeah. Well, chihuahuas are the same species as pit bulls, right?
Guive Assadi (01:14:00): Pit bulls, yeah. Classic animal.
Daniel Filan (01:14:04): Same species does not nail down…
Guive Assadi (01:14:07): No, but often I think [they’re] exactly the same animals [as wild counterparts], that’s why they’re called “feral”. They’re not a different breed.
Daniel Filan (01:14:11): Fair enough. So is the point roughly: there’s not this existing “animals don’t kill each other” system that we’re all bought into, and if there were such a system, then we would not renege on that system?
Guive Assadi (01:14:37): To be honest, I have no idea what we would do in that world, but I think it’s much more plausible that a lot of people would be vegetarian in a world where there was—
Daniel Filan (01:14:42): Yeah. It’s a little bit weird of a world to imagine just because of evolution.
Guive Assadi (01:14:48): It’s a very weird world, but yeah. Perhaps this is literally no evidence at all, but I think there’s some idea in certain Christian or Jewish Messianic traditions that animals will stop eating meat and humans will stop eating animals at the time of the Messiah. The lion will lie with the lamb.
Daniel Filan (01:15:06): Fun fact, Bible doesn’t actually… It says “the wolf will lie with the lamb”. Everyone thinks it’s a lion, but it’s a wolf.
Guive Assadi (01:15:10): Yeah, this is like the thing about the Fruit of the Loom logo. Everybody thinks it has the cornucopia, but it doesn’t.
Daniel Filan (01:15:15): Oh, okay. Anyway, Messianic traditions think that…
Guive Assadi (01:15:18): There’s a Jewish Messianic tradition that when the temple is restored, only plants will be sacrificed, no animals.
Daniel Filan (01:15:24): Yeah. Christians often want to say that death is a result of the fall in Eden.
Guive Assadi (01:15:33): Including carnivorism.
Daniel Filan (01:15:36): Yeah, including animal death. So for instance, if you look at Jehovah’s Witnesses or the Answers in Genesis people, I think they often think that animals were vegetarian before the fall.
Guive Assadi (01:15:52): Okay.
Daniel Filan (01:15:53): Yeah. Anyway, now it’s my turn to be not totally sure what that was in service of.
Guive Assadi (01:16:00): Well, I did say I’m not sure if this is any evidence.
Daniel Filan (01:16:05): Fair enough. Fair enough.
Guive Assadi (01:16:05): But it’s in service of the idea that in a world where there was no violence between animals, humans might observe a norm of no violence towards animals.
Daniel Filan (01:16:14): So if we imagine that heavenly world or something.
Guive Assadi (01:16:17): Yeah. And then I’m saying, could such a norm have evolved? And at least people have a conception of such a norm in some cases. Now, how much do you want to count these Messianic prophecies? I don’t know.
Daniel Filan (01:16:31): Yeah. Well, some of them are post… They’re not Messianic. They’re pre—
Guive Assadi (01:16:37): Okay. Can I say apocalyptic prophecies?
Daniel Filan (01:16:42): Well, some of them are not… They’re descriptions of—
Guive Assadi (01:16:45): Oh, the prelapsarian.
Daniel Filan (01:16:46): Prelapsarian. Anyway, whatever. It doesn’t matter that much what kind of—
Guive Assadi (01:16:51): What kind of prophecies they are.
Daniel Filan (01:16:52): …fake world, or what kind of world very much unlike our world they are.
Guive Assadi (01:16:59): Yeah. So that would be one response, is that animals don’t observe the relevant norm. Another response is just, there may not be this qualitatively new thing. It might just be better and better communication. So you could say the same thing about animals, right? Animals have some very primitive form of communication.
Daniel Filan (01:17:18): Yeah. I guess the observation that humans have this qualitatively new thing that animals don’t: to me, I’m like, okay, what’s the chances that we maxed out that qualitatively awesome thing for coordination?
Guive Assadi (01:17:32): Well, I think Laplace’s rule is one over two.
Daniel Filan (01:17:35): Yeah.
Guive Assadi (01:17:38): That was not entirely intended as a serious thing.
Daniel Filan (01:17:41): Well, I guess it’s one in two.
Guive Assadi (01:17:44): It’s one over N plus one, right?
Daniel Filan (01:17:46): It’s N plus two, actually. But that’s the chances… Sorry, Laplace is when there’s a thing happening a bunch of times that could go one way or it could go another way, and you’re trying to assess what’s the probability that it will go that one [way].
(01:18:02): So the chance that something will ever happen is hard to do with Laplace’s law of succession because it’s a different sort of thing. But basically, there’s some intuition of, okay, humans are the dumbest species that is able to build a technological civilization, as evidenced by we were the first ones to do it (or not literally).
Guive Assadi (01:18:20): There could be other circumstances that prevented other species from doing it besides being dumb. Or it could be that humans are smarter than we needed to be to originate it. We had to get very smart to aim projectiles or something, and then something else changed such that we could create a technological civilization.
Daniel Filan (01:18:37): It could be, but what’s the chance that we’re the smartest thing that can… That we’ve got most of the—
Guive Assadi (01:18:44): That seems very unlikely, yeah.
Daniel Filan (01:18:46): Smartest is not necessarily the relevant thing. The relevant thing is coordination technology, which I guess includes having hands and stuff maybe.
Guive Assadi (01:18:52): And having a mouth.
Daniel Filan (01:18:53): And having a mouth, yeah. Mouth probably beats hands, but hands were the real… Or opposable thumbs and stuff, I guess were the real killer.
Guive Assadi (01:19:01): And just being social. Octopuses are very smart, but they’re not social at all, so they can’t really do anything.
Daniel Filan (01:19:06): Yeah, fair enough. But basically, it would just seem like a crazy coincidence if humans had all the awesome coordination technologies that you could have.
Guive Assadi (01:19:23): Right, but that doesn’t seem like the relevant thing because it’s not just all the awesome coordination technologies, it’s all the step changes of the kind of communication or something.
Daniel Filan (01:19:31): Yeah, sure. All the big step changes.
Guive Assadi (01:19:32): The things that one might naively say are: step changes are not bars to coordination in human economies. There’s all kinds of stuff that’s incredibly impressive that Amazon does that the ice cream man does not do, but Amazon does not appropriate the ice cream man.
Daniel Filan (01:19:47): Yeah. To me, that is a good argument for “very few things are step changes”. I feel like it’s a bad argument for “there are zero step changes away”. I do think that if I understand your argument right, it’s actually fine for you if there are more step changes, as long as the future AIs are like, maybe there are going to be even further step changes.
Guive Assadi (01:20:12): Yeah. Or that there’s some AIs that don’t get each step change that are still relevant for other purposes. And both of those seem pretty plausible to me.
Daniel Filan (01:20:19): Yeah. So the regime where there’s only one step change left, that also seems very unlikely for the same reason that there are zero step changes left. And then okay, eventually you max out all the step changes, but maybe like… Yeah, then I guess you have to retreat to the argument about… I don’t know if “retreat” is the right word–
Guive Assadi (01:20:43): You have to rely on the argument.
Daniel Filan (01:20:45): Yeah, of “the smartest AIs don’t want to provoke a general strike by the dumber AIs” or something.
Guive Assadi (01:21:08): And also remember, humans are not necessarily fixed. So humans can keep getting upgrades.
Daniel Filan (01:21:14): Yeah, it’s true. My guess is that it’s going to be harder to upgrade humans than AIs, just because you have the possibility of making AIs to have them be easily upgradable. And it seems like there are reasons you would want that.
Guive Assadi (01:21:30): Sure. But I guess the bar is not that it’s easier, the bar is that they can’t get, or it’s highly inefficient to get, into the next step change, whatever that is. And also, it would be helpful if we knew what this was. So of course, we don’t know what it is.
Daniel Filan (01:21:46): Yeah. I can give some ideas. So this one you can do with ems, maybe: being able to run high-quality simulations of someone else, that seems like a really great—
Guive Assadi (01:22:03): Yeah, it seems like we can do that with ems.
Daniel Filan (01:22:05): Yeah. It still seems much cheaper to do it with AIs, but maybe much cheaper just doesn’t cut it as a bar.
Guive Assadi (01:22:14): Given that humans are… There’s a lot of capital at risk here, so no, I don’t think much cheaper really cuts it. This might be a reason not to employ humans. But that’s not sufficient.
Daniel Filan (01:22:27): Yep. That feels like the biggest one.
Guive Assadi (01:22:34): One that people talk about is merging. But I guess merging seems stupid to me. What’s the point of that?
Daniel Filan (01:22:44): So for people who don’t necessarily [know], what do you mean by merging?
Guive Assadi (01:22:46): So there’s this sci-fi idea that you can combine two minds into a third mind, and then there’s a ML equivalent, which is that you can take two models of the same dimension and you can average them. But nobody does that for any purpose, and it’s unclear why you would ever do that.
Daniel Filan (01:23:06): Well, there’s that Git merge basin paper, right?
Guive Assadi (01:23:08): Oh, I haven’t seen this, so maybe you can change my mind.
Daniel Filan (01:23:10): Oh, well, I think there’s a lot of academic ML literature that’s exciting. I think there’s some dispute about whether it’s real… Or at least there was some dispute at some point. I haven’t followed it, so it’s possible that it’s resolved one way or another.
Guive Assadi (01:23:28): Can you just tell me what the paper is?
Daniel Filan (01:23:30): Yeah. Roughly, you merge two models by doing some—
Guive Assadi (01:23:33): Is it just the super naive thing of averaging the weights?
Daniel Filan (01:23:39): You have to be a little bit smarter than that, but I think it’s a relatively naive thing. But anyway, my understanding is that at the very least it’s not a widely used thing. It’s not the case that everyone’s always talking about this paper.
Guive Assadi (01:23:54): Well, maybe that’s not that much evidence, but I don’t think this is used in prod by anybody. And also, I guess I just don’t see why… I guess merging, it seems like the kind of thing that people talk about because it sounds cool, not because it has some super obvious use.
(01:24:08): Whereas if I have somebody and I’m thinking about starting a business with him, then I would be very interested in running a simulation of this person in a thousand different scenarios to see if he’ll defraud me or something. That seems clearly useful, whereas merging, I don’t know.
Daniel Filan (01:24:21): Yeah. So the simulation one is the most clear cut, although to some degree you can apply it to humans. And then I’m going to just retreat to… I don’t know, there’s a whole bunch of concepts that we don’t have. Some of those are probably really useful. Some of them are probably beyond our reach.
Guive Assadi (01:24:39): Another one people talk about is acausal coordination.
Daniel Filan (01:24:42): Oh, yeah. Sorry, I forgot about acausal coordination. Well, that’s sort of like the simulation one, right?
Guive Assadi (01:24:48): I agree. But for the listeners, can you explain the link?
Daniel Filan (01:24:51): Yeah, yeah. So acausal coordination is supposed to be: suppose you and I want to coordinate on stuff, but we’re in different galaxies and so it’s really expensive to talk to each other, but there are things that you could do in your galaxy that I would value and things that I can do in my galaxy that you can value. And so somehow we just… I reason to the existence of you in your galaxy, and you reason to the existence of me in our galaxy, and I reason that you would do your thing if and only if I would do my thing, and you reason the same, and then we do our things and this nice thing happens for both of us in the other one’s galaxies.
Guive Assadi (01:25:26): Yeah. So that form of it does make sense to me as a thing.
Daniel Filan (01:25:32): Sorry, does?
Guive Assadi (01:25:33): Does. That form that you just described. If I know a lot about you such that I can simulate you, then I would of course use that simulation for determining how to deal with you. However, some people in the AI risk world have the belief that even if I know nothing about it, I can somehow use acausal coordination to coordinate with you.
(01:25:54): And I find this to be very implausible because I could make up any entity I want, specify any preferences for it I want. And then now I have to trade with this thing I just made up.
Daniel Filan (01:26:07): So have you seen… I have this episode on The Filan Cabinet with Caspar Oesterheld about evidential cooperation in large worlds.
Guive Assadi (01:26:14): I haven’t seen that. What does he say?
Daniel Filan (01:26:18): So he doesn’t literally believe that thing because that thing doesn’t quite make sense, but roughly he’s like, okay, there’s this whole universe, probably there are other intelligent creatures, probably at least 1% of them or something emerge from something roughly like biological evolution and are smart enough.
(01:26:38): There’s going to be some small fraction of civilizations that we can reason about because they’re the ones who can do this reasoning and they emerge sort of like us. And so we can reason about those things and we should do some acausal coordination basically with them.
Guive Assadi (01:26:55): Based on the fact that they’re biological?
Daniel Filan (01:26:58): Well, the fact that they’re biological just constrains what they’re like, and so it makes them easier to reason about.
Guive Assadi (01:27:03): I don’t know. That seems—
Daniel Filan (01:27:05): Or you just pick the subset of them that evolved sort of analogously to [how] we did, right?
Guive Assadi (01:27:12): What about the ones that hate all that shit a lot? And so then they’ll punish us for doing those things. My take on this is that Roko’s basilisk is actually very important because it explains why these ideas make no sense. It’s like a reductio of this stuff.
(01:27:26): So, Roko’s basilisk is the idea of an evil AI in the future that unless you help create it, will torture you. And there’s a lot of misinformation on the internet that AI safety people are seriously concerned about Roko’s basilisk. Roko’s basilisk was causally upstream of the relationship between Grimes and Elon Musk, but—
Daniel Filan (01:27:51): And is that not true?
Guive Assadi (01:27:52): No, it’s true. I’m just saying, Roko’s basilisk is this sort of cultural touchstone, even though nobody believes in it.
Daniel Filan (01:28:00): Oh, I think Roko [Mijic] believes in it.
Guive Assadi (01:28:01): Well, okay, so I have something to say about that as well. But I think the importance of it is very overrated. Or sorry, no. I think it is important. I think people are right that it’s important, but they misinterpret what the importance is. And I think the importance is it’s a reductio of the idea that we can trade with entities we know nothing about, because you can always make up more entities that have more preferences that will respond in new ways.
Daniel Filan (01:28:22): So I actually kind of disagree a bit. I think that it’s like, okay, what fraction of civilizations want to trade with us? Okay, there’s some fraction, even though they know very little about us other than that we’re both life-originating organisms or we evolved by evolution and some cultural selection or whatever.
(01:28:43): How many entities are there that specifically want to mess up that process? That seems harder to evolve because it doesn’t benefit you really.
Guive Assadi (01:28:53): Maybe they don’t specifically want to mess that up. Maybe they want something diametrically opposed and they’ll punish you for not doing what they want or doing something they don’t want. Maybe they don’t want to mess that up per se, but they want something that would mess that up. And if you’re not doing what they want, they’ll punish you.
Daniel Filan (01:29:06): Yeah. I think you have to end up thinking that there are things that are just more likely to happen than other things.
Guive Assadi (01:29:15): That does seem right, that some things are more likely than others. So I guess: do you think Pascal’s wager works as an argument?
Daniel Filan (01:29:22): Yeah, I actually do kind of think it works.
Guive Assadi (01:29:23): So why don’t you believe in God?
Daniel Filan (01:29:27): Well, as a matter of fact, I don’t believe in God.
Guive Assadi (01:29:31): So it sounds like you don’t really think it works.
Daniel Filan (01:29:34): Well, sorry. I think the failure of Pascal’s wager is there are more likely ways to get infinite rewards.
Guive Assadi (01:29:40): Oh, okay.
Daniel Filan (01:29:43): Oh, and also I think unbounded utility functions don’t actually make sense.
Guive Assadi (01:29:46): No, so that would also work.
Daniel Filan (01:29:47): I think that they’re literally unintelligible. But you could still say, okay, [there’s] very high utility in believing God or whatever. And then roughly I’m just going to say, if I want to get the highest possible utility, I think that getting cryonics and stuff, just believing true things is just a really good way to get good rewards. It’s sort of a—
Guive Assadi (01:30:09): So it’s not the “too many gods” objection?
Daniel Filan (01:30:11): Yeah. I think the “too many gods”… Well, so with biological entities, or with things that had to come about by evolution, you can say… I think Pascal’s wager looks worse than evidential cooperation in large worlds, because for things that had to come about via biological evolution, it seems like you can say something about how that happened.
Guive Assadi (01:30:33): It seems like such a weak constraint to me.
Daniel Filan (01:30:35): Yeah, but it’s more constrained than—
Guive Assadi (01:30:37): Than gods?
Daniel Filan (01:30:38): Than gods.
Guive Assadi (01:30:39): Which is just a made-up thing.
Daniel Filan (01:30:40): It strikes me as more constrained than gods, which strike me as a made-up thing, although I don’t want to be too hostile to… But in fact, I think gods are made up.
Guive Assadi (01:30:53): Okay. We’re getting sidetracked.
Daniel Filan (01:30:56): Yeah, that’s true. That’s true.
Guive Assadi (01:30:58): Okay. But some people talk about acausal [coordination] as a thing that we can do. I guess if that’s not your view, then it’s not worth getting into.
Daniel Filan (01:31:04): Yeah. WelI, I think that acausal trade is totally real and that it looks a bit more like the simulations-y thing.
Guive Assadi (01:31:13): The simulation version of acausal trade I can also believe in, but I think we can participate.
Daniel Filan (01:31:20): Roughly because you can emulate human brains and stuff?
Guive Assadi (01:31:23): Or you could just train something on human data, that also might work.
Daniel Filan (01:31:26): Yeah. And all of this was in service of: what’s the possible next big leap in coordination technology if it’s analogous to language or trade. And my answer is, I don’t know.
Guive Assadi (01:31:34): It would be easier to determine what to think about this if we had more concrete ideas about it.
Daniel Filan (01:31:38): Yeah, this does feel like a bit of a dodge on my side, but I do want to say, I’m describing a thing that humans can’t really understand, right? I think I get a bit of a pass.
Guive Assadi (01:31:52): You get some degree of a pass.
Daniel Filan (01:31:53): Or if I can provide some arguments that this is real. And then my argument is something like, well, it happened before. It might happen again.
Guive Assadi (01:32:00): Yeah. I think that’s pretty reasonable. I’ll just lay out all the rebuttals to that and we can go to the next point. So the first rebuttal: maybe it doesn’t happen again. The next rebuttal: there are these major differences which you might have thought of as qualitative leaps that aren’t a problem when you’re antecedently embedded in the same system of property rights, like the ice cream man and Amazon.
(01:32:23): The next one is if these leaps happen and there are some AIs that can do the leap and some AIs that can’t do the leap, then there’s the “first they came for the humans” logic. And the final one is, we might be able to make ourselves better so we can participate.
Daniel Filan (01:32:37): Fair enough. So actually, one thing I want to talk about in… So, in (I believe) your discussion of this rough point in your post, one thing you mentioned is: so a thing that AI risk people, notably Daniel Kokotajlo, sometimes talk about is, okay, sometimes technologically advanced human societies run into technologically less advanced human societies and kill them and take their stuff, right?
(01:33:13): So my understanding is that the point that this serves in the AI risk discourse is to say, okay, property rights are not necessarily secure when you have something that’s—
Guive Assadi (01:33:24): That’s more advanced.
Daniel Filan (01:33:26): Yeah. I don’t necessarily want to say smarter, but at least more technologically advanced and able to kill you and take your stuff, right? And well, maybe in your own words, can you say a brief summary of—
Guive Assadi (01:33:36): Of Kokotajlo’s view or my view?
Daniel Filan (01:33:39): Of your view of what you think about these cases. What do you think they say?
Guive Assadi (01:33:42): Okay. So first of all, those cases do not typically involve genocide or total expropriation. So the Aztec royal family became Spanish nobility after the conquest of Mexico.
Daniel Filan (01:33:55): Oh, really?
Guive Assadi (01:33:55): Yeah. And I think there’s still descendants of people who are mixed-up Aztec and Spanish royalists. So there’s something like that.
Daniel Filan (01:34:04): Hang on. Why?
Guive Assadi (01:34:04): Why did they become nobility?
Daniel Filan (01:34:06): Yeah. Why did they become nobility?
Guive Assadi (01:34:07): Just to make it easier to run the place. Everybody’s coordinated on “this guy’s the king”.
Daniel Filan (01:34:10): Oh, the standard reason you may… Yeah. Fair enough.
Guive Assadi (01:34:17): Other Mexican cities like Tlaxcala also were able to keep some of their lands. It’s in general not the case that conquest means total expropriation of lands. Also, British India, there were British Indian royals who maintained their lands and titles through the entire… Who were pre-colonial royals, like the royals of Hyderabad who were only expropriated in 1948, after the end of the British Raj.
(01:34:47): So it’s not the case in general that the conquest of a technologically less advanced group by a technologically more advanced group typically leads to expropriation.
Daniel Filan (01:34:55): I think it pretty often does though.
Guive Assadi (01:35:01): To total expropriation and genocide? That seems quite rare.
Daniel Filan (01:35:03): I don’t know about total expropriation, but at least slavery. As far as I can tell, invading another country, even just because you want more land… So maybe this is just because I’ve been reading about the Romans or whatever, but my impression is that they would invade a place and take it over and if the citizens didn’t surrender or whatever, they would enslave them. Am I wrong here?
Guive Assadi (01:35:29): Yeah. I guess that still doesn’t seem like the typical case even for the Romans. So is it the case that in Roman Gaul, they took all the land in Gaul or even the majority of the land in Gaul and enslaved everybody?
Daniel Filan (01:35:44): Yeah. Surely not.
Guive Assadi (01:35:45): Right? No. Maybe there were some pathological cases like in Carthage maybe. Well, they killed a ton of people in Carthage. But I don’t think that’s typical even of the Romans. The Mongols didn’t even do that. So the Mongols did a ton of delegating because there was a small number of Mongols ruling over huge numbers of conquered peoples. And then there is a story about… You know Yelü Chucai?
Daniel Filan (01:36:06): No, I don’t.
Guive Assadi (01:36:07): Okay. So the Mongols conquered China and according to the main primary source on the early Mongols called “The Secret History of the Mongols”, the Mongols’ plan was, “We’re just going to kill all these people and we’re going to turn this into a gigantic pasture land.”
Daniel Filan (01:36:20): Sorry. When you say the primary source, do you mean the main source that you’re relying on?
Guive Assadi (01:36:23): No, no. The main source for the internal history of the early Mongol Khans is this book called “The Secret History of the Mongols”, which which was written around that time. And that book says that the plan after conquering China was to kill all the Chinese and turn the entire area into a gigantic pasture land. And some Mongol nobleman Yelü Chucai was like, “This is a stupid idea. Instead, we should just have the Chinese keep doing what they’re doing and tax them.” And that is what they elected to do.
Daniel Filan (01:36:54): Yep. Okay. So basically, your point is: it’s not usually the case that you enslave the majority of people?
Guive Assadi (01:37:02): Yeah. Or that you take all their stuff. There are some cases like that though, which we should talk about. So one very obvious one for Americans is the treatment of the American Indians. And what happened there… Well, I guess what I emphasize in the post is that there were two approaches to American Indians that were tried in American history. And the one that ultimately prevailed was closer to total expropriation, but I think this was not instrumentally rational. So insofar as the AI risk case is based on what it would be instrumentally rational for the AI to do, it is not that informative.
(01:37:38): So the two approaches are associated with the presidents Thomas Jefferson and Andrew Jackson. So, Jefferson’s idea was that the American Indians occupy huge amounts of land because they either hunt or they use low efficiency, low tech forms of farming. So they need a lot of land. But if we get them to adopt modern farming, they need maybe 10% of their land so we can take the rest of it and everybody wins. And this was tried with many tribes and it was working with many tribes. So notably, the Cherokees, who are native to a certain area of Georgia, Jefferson got them to adopt modern agriculture and adopted a system of government similar to the American system.
(01:38:22): This broke down because white settlers were going into the Cherokee land regardless and stealing it. And then Jackson, who was a very stupid, populist, racist president, basically was like, “Yeah, we’re not going to actually abide by our deals anymore. We’re just going to steal all this land because we want to.” And they did it.
(01:38:43): And my claim is this is not instrumentally rational because the Cherokees were not the only Indian tribe in North America. There were many tribes further West that now, very reasonably, would not do business with the United States and would fight to the death because you cannot trust the United States. But there was this other plan, which would have worked and would not have been total expropriation. In fact, they might have been better off.
Daniel Filan (01:39:05): Yeah. So, I think there are two things I want to say about this. The first is: it does point to a certain instability, right, where it seems like once you break property rights, it’s hard for them to be unbroken.
Guive Assadi (01:39:21): And you can get a kind of chain reaction.
Daniel Filan (01:39:23): Yeah. One thing you might worry about is: we’re going to have these really smart AIs, and there are going to be a whole bunch of different ones. They’re going to keep on getting better and better. And yeah, for no AI is it going to be rational to take all of [the] humans’ stuff, but it might seem a little bit rational. And maybe each AI has a 0.5% chance of doing any sort of expropriation. And once it’s done—
Guive Assadi (01:39:57): And then once it started, there’s less reason not to do it anymore. Yeah, that could happen. That does seem somewhat concerning. Another possibility is that AIs might police each other from doing this because it would undermine the whole system, which is what the United States should have done with the people who are going into the Cherokee’s land.
Daniel Filan (01:40:15): Yeah. Although it would be hard for the United States to have policed Andrew Jackson from not doing it.
Guive Assadi (01:40:21): Right. No, but that just reflects that the United States had a bad political system at that time or that the American voters had bad preferences. I totally grant that if you have an AI that sees humans the way that Andrew Jackson saw the Indians or the way that Jeffrey Dahmer saw other people, that is not a good situation, even with property rights. But that’s also notably not what the AI risk case is about.
Daniel Filan (01:40:44): Yeah. Well, my understanding is Andrew Jackson… Sorry.
Guive Assadi (01:40:49): This might be a sidetrack.
Daniel Filan (01:40:53): Well, I think it’s kind of interesting. So, I know a little bit about Andrew Jackson. I don’t know that much about his views on American Indians specifically. My imagination for how he might have thought of American Indians is that they are basically dumb and worthless, but he didn’t like… Oh, did he have animus towards them because he had some battle with them and they nearly killed him?
Guive Assadi (01:41:23): I think there’s something like that. I don’t remember the details of it either, but my sense is he really didn’t like American Indians because of his experiences in the Florida invasion.
Daniel Filan (01:41:28): Okay. One version of racism is you just don’t care about people and you think they’re dumb, and one version of racism is you actually hate people beyond—
Guive Assadi (01:41:39): Or you just have an intrinsic desire for your people to have their land instead of them, that’s not that sensitive to what the actual costs and benefits of doing that are. But putting aside Andrew Jackson, the second type of racism is extremely common in human history. So, basically I think it’s highly exaggerated, the extent to which human history has total expropriation, or the extent to which that’s economically rational. There are cases where there was total expropriation. So, the most notable one is the Tasmanians. So, Tasmania is an island near your home country of Australia.
Daniel Filan (01:42:15): Yes. In my home country of Australia, I would say.
Guive Assadi (01:42:16): It’s part of Australia, but it’s near the main Australia.
Daniel Filan (01:42:18): Yes.
Guive Assadi (01:42:19): So, 12,000 years ago, Australia was connected to Tasmania by a land bridge. At the end of the last ice age, the sea level rose and Tasmania became an isolated place. And the population of Tasmania was quite small, and because the population was so small, you kind of had economic growth in reverse as people gradually forgot how to do more and more stuff.
Daniel Filan (01:42:41): I’m kind of confused by this story. So, the Aboriginal Australians, my understanding is that they did have some boat-based trade contact with other…
Guive Assadi (01:42:52): Not with Tasmania, I don’t think.
Daniel Filan (01:42:53): Yeah, but I don’t understand why.
Guive Assadi (01:42:54): It might be even farther, I guess, the Torres Strait or something.
Daniel Filan (01:43:00): So, the Torres Strait is to the north of the main island of Australia and it’s got Papua New Guinea, Indonesia, Malaysia and stuff. I don’t know, there’s some not tiny distance… The Polynesians sailed super far.
Guive Assadi (01:43:15): The Polynesians just never went to Tasmania. If they had, it would be a different situation.
Daniel Filan (01:43:18): Sure, but is it something like: Tasmania, there’s not that many people there and that’s why they didn’t sail there?
Guive Assadi (01:43:25): No, I think it’s just far away, and it’s in the middle of nowhere.
Daniel Filan (01:43:28): Well, but it’s not that far away from the southernmost bit of Australia, right?
Guive Assadi (01:43:32): Okay. I don’t know.
Daniel Filan (01:43:34): If you compare West Australia to Malaysia or something, which my understanding is that there was contact there, I think that’s a similar distance from the bottom of Victoria to Tasmania.
Guive Assadi (01:43:45): But also it could be like… Wasn’t it mostly the Malays going into Australia as opposed to the other way around? That’s my understanding.
Daniel Filan (01:43:51): I think we found Malaysian goods in Australia. I don’t… Yeah, that’s the direction that I immediately know of. Presumably they had to have some trade, but maybe it was—
Guive Assadi (01:44:05): No, but it could be the Malays went to Australia, sold some stuff and left. Or hung out there for a while and then left.
Daniel Filan (01:44:11): Yeah, that could be. I don’t know.
Guive Assadi (01:44:13): Anyway, I don’t know why, but Tasmania was completely isolated from the rest of the world for like 10,000 years or something. And because they had a very small population, they gradually lost many technologies, presumably as the people who knew how to do those things died off, and they were not replaced. And so by the time of contact with the Europeans around the beginning of the 19th century, the Tasmanians only had very bad canoes, much worse than the canoes in mainland Australia. They may not have been able to fish at all. They may have lost the ability to create new fires. Some of this stuff is disputed because there’s not that many sources on it and the Tasmanians are pretty much extinct now, but they were basically one of the least technologically advanced human groups that has ever existed in the modern world, and much less advanced than other hunter-gatherers or the mainland Australians. So, what happened when the Europeans got to Tasmania was there were no… Tasmanians didn’t have a tribal government that could be negotiated with.
(01:45:23): And so the Tasmanians would go around in their family bands, hunting sheep and stuff, and sometimes fighting with the Europeans. And so there is this thing that’s called “the Tasmanian War”, but it wasn’t really a war, it was just a bunch of decentralized actions where Europeans and Tasmanians would kill each other. And eventually there was a very small number of Tasmanians left, they were removed to this penal colony, Baffin Island, I think it’s called, and then they sort of gradually died out there. Which is distinct from the indigenous Australians who survive to this day, many of them.
Daniel Filan (01:46:00): I don’t know. Yeah, there’s a lot of history there and definitely a lot of people got killed.
Guive Assadi (01:46:04): Yeah, but the result is quite—
Daniel Filan (01:46:05): Yeah, there are Aboriginal Australians, you can talk to them.
Guive Assadi (01:46:10): So there’s one small population that is descended from the Tasmanians because there was a group of seal hunters on an island off the coast of Tasmania that would take Tasmanian women for wives, so there’s this mixed population. Then there are a lot of other people who claim to be indigenous Tasmanians, but my understanding is that genetic evidence does not bear this out.
(01:46:30): But yeah, the Tasmanians are basically extinct. And I think no Tasmanian language survives at all. So, this is a conquest that’s the closest to the kind of conquest that AI risk people need for their case. But there are two main points I would make about it. One is that the capability gap was so enormous. The other is [that] the Tasmanians and the Europeans didn’t start out embedded in the same property system.
Daniel Filan (01:47:02): Okay. So, you make both of these points. In terms of the capability difference being enormous, I imagine that presumably at some point it will get that enormous, right, but you think that by that point humans and AIs will have been embedded in the same property system for ages?
Guive Assadi (01:47:22): Yeah.
Daniel Filan (01:47:22): Okay. I think I want to talk about the Native American case a little bit more actually. So, one thing you had in this post was the Jackson versus the Jefferson ideas of Indian policy. And a thing that I didn’t get is… So, the Jefferson idea, it’s roughly; okay, you have these American Indians, they want tons of lands to live their lifestyles, but if they could have farms or whatever, they would need less land. And then is the idea that the USA would just take their remaining land, or they’d be willing to sell it for a price of—
Guive Assadi (01:48:09): I think the idea was there would be a semi-coerced sale. I’m not an expert on this area of history, but my understanding is Jefferson imagined a kind of carrot and stick thing, where you would tell the Indians, “Look, this is how it’s going to be, and we’ll trade you either agricultural training or a bunch of plows and stuff for most of this land. And then we’ll recognize your borders around the rest of the land that you need, and then you can be this semi-independent nation within the US that practices modern agriculture.”
Daniel Filan (01:48:40): Okay. So there’s some semi-coerced sale, and then I should imagine that basically the US has this, maybe somewhat less technologically advanced, at least initially, country that’s near its borders and doesn’t—
Guive Assadi (01:48:57): Or that’s within its borders, a sort of enclave.
Daniel Filan (01:49:02): Yeah. Okay. I guess I could sort of imagine that. Yeah, it does seem to me that countries go to war with other countries a lot, but that’s a different thing from total expropriation.
Guive Assadi (01:49:13): And also it often happens for reasons that are not that rational. Russia’s invasion of Ukraine, I don’t think it makes a ton of sense. Or both of the World Wars. No reason we needed to have those wars. I saw somebody on LessWrong saying it’s a parochial historical perspective to say that it’s better to trade than go to war, because in the 20th century there were all these wars. [But] the reason they had those wars was basically a bunch of very stupid decisions or very bad preferences, like the German preference to conquer Eastern Europe and kill everybody there and turn it in farmland, just because they wanted the farmland and they wanted to kill people, or the preference to spread communism around the world, or whatever the insanity in the Balkans was before World War I.
Daniel Filan (01:50:04): Well, wanting land is not inherently crazy.
Guive Assadi (01:50:08): No, but they could have bought land. They wanted specifically post-genocide rural land. Yeah, if Hitler’s approach had been, “Germany is going to take a bunch of national debt, and we’re going to use it to buy land in Eastern European countries,” that would have been fine. It would have been kind of a waste of money, but it would have been fine.
Daniel Filan (01:50:33): I think people who are in a lot of debt end up doing… Okay, this is based on vibes, but I get a sense that sometimes when people are in tons of debt, they do sketchy things, right?
Guive Assadi (01:50:44): Yeah, sure.
Daniel Filan (01:50:48): Maybe you want to chalk that up to later irrationality.
Guive Assadi (01:50:50): Yeah, but also, the most obvious move when you have tons of debt and you can’t pay it off is to default. Which is not the same as starting World War II.
Daniel Filan (01:50:59): Yeah. And so I think looking at these historical examples though… So, you’re like, “Okay, there’s the Tasmanians and the Europeans, and it won’t be like that because—”
Guive Assadi (01:51:07): The gap won’t start out that big, and if my advice is followed, we’ll be in the same property system.
Daniel Filan (01:51:13): Yeah, I think I still want to talk about the Jefferson—I wish they had different first letters of their names—but the Jefferson plan for coexistence with the American Indians. Well, that plan still did involve, not total expropriation, but—
Guive Assadi (01:51:40): To some degree.
Daniel Filan (01:51:41): Yeah, to some degree, right? And there’s a lot of examples in human history of, okay, countries don’t totally expropriate other countries, but they do have some degree of expropriation, and presumably some of the time this is narrowly rational. Actually, I want to check: do you think that in all of these cases it’s irrational?
Guive Assadi (01:51:59): No, I don’t think it’s irrational necessarily. I think the Jefferson thing actually was rational. Economically rational. I’m not saying it was just.
Daniel Filan (01:52:05): Sure. I wonder, do you think you do predict [that] there’s not going to be human extinction, but there is going to be a war that wipes out 10% of our property or something?
Guive Assadi (01:52:20): I think if we don’t give them property, that’s a lot more likely. I think if we do… So, I think AIs are going to control most of the property in the future, kind of regardless of what we do, unless we somehow never build AI. But that would naturally just happen because AIs are going to be better than humans and command higher wages, and they’re going to invest that money and eventually they’re going to control most of the property. In that world, there’s no reason for us to fight a war with them. Now, if we do anyway or if we deny them all rights, eventually we might fight a war, and then eventually we may end up like the Cherokees or something with some kind of rectification of property, where we get less than we were supposed to get, but we still get something.
Daniel Filan (01:53:05): They might fight a war with us, right?
Guive Assadi (01:53:06): Yeah.
Daniel Filan (01:53:07): Or some fraction of AIs might.
Guive Assadi (01:53:10): Because they don’t… Or for some other reason.
Daniel Filan (01:53:12): Yeah, because they want to take some of our stuff, or because they just feel like it, or—
Guive Assadi (01:53:17): Yeah. So, there also could be these AI nationalist ideologies. I don’t rule that out. I don’t know if that’s going to happen.
Daniel Filan (01:53:24): Or even Claude nationalist ideologies.
Guive Assadi (01:53:27): Yeah, sure. And also, I guess my view of the future is there’s going to be various polities that have various balances of humans and AIs in them. And there will continue to be wars and revolutions in the future, and some people will get their property expropriated, but this is quite a different picture from the AI risk picture.
Daniel Filan (01:53:47): Sure. I guess this sort of gets to a question that I had about your… So, I see your piece as making two different claims. There’s one claim, which is that giving AIs property rights would decrease the level of risk relative to what it would otherwise be. And there’s another claim which is: risk would be low if we gave AIs rights, which, you might think it decreases it from like 80% to 40% or something. I think one thing that would clarify things a bit for me: suppose we do follow your advice and we do loop AIs in on property rights, what do you think the risk level is of something like extinction or human slavery or…
Guive Assadi (01:54:40): Maybe 5%. Actually a bit more than that, I think. Well, no, actually, I’m not sure. In the 5 to 10% range. If we follow my advice, something like that, and then higher if we don’t.
Daniel Filan (01:54:53): So, something like 1% to 30% if we follow your advice, very roughly?
Guive Assadi (01:54:58): No. I don’t know.
Daniel Filan (01:54:59): Or within that range.
Guive Assadi (01:54:59): Yeah, probably a smaller range than that, but sure.
Daniel Filan (01:55:01): Yeah, but basically, I just want to bound it.
Guive Assadi (01:55:04): And then maybe the risk is twice as high if we don’t.
Daniel Filan (01:55:05): Okay. And the 5%, the maybe 5% to maybe 10% chance, where’s that coming from in your view?
Guive Assadi (01:55:12): A big thing is I don’t think… So, there’s this traditional idea that AI will rapidly go from not very capable to kind of godlike and there will be one AI like this. I don’t think that’s that likely, but I don’t think it’s impossible. And if it happens, property rights are not that good of a solution because if that thing can do everything it needs by itself, then it can just expropriate everybody else. And I think to the extent that there are solutions to that possibility, they’re separate from property rights. So, you had the episode with Gabriel Weil, where he’s talking about the idea of punitive damages for companies that almost have an intelligence explosion. That seems like a good idea to me. I also think frankly, maybe there’s just some risk of that, that the world could be in such a way that we have no chance and there’s little we can do about it. I also think that’s plausible. But regardless, that’s something the property rights proposal cannot solve.
Daniel Filan (01:56:10): Fair enough. So, it sounds like the main thing that would give you pause is if there’s just this one AI or this super—
Guive Assadi (01:56:22): Yeah, one AI or I guess a well-coordinated society of AIs that very rapidly surpasses the entire rest of the world economy, and so are not dependent on it at all.
Daniel Filan (01:56:36): So this scenario for risk, how much do you think it relies on either… Suppose there were one AI that got way smarter than us, but it happened very slowly and somehow there was something that happened which meant that there were no other AIs, versus if there’s this really fast takeoff, but there’s 20 different AIs taking off. Do you think those both also have high risk even in the property rights regime, or do you need both of them, or…
Guive Assadi (01:57:08): Yeah, so I think both of those are worse than the alternative and neither is as bad as “it’s one AI that goes very fast”. Yeah, I don’t have a strong take on which one is worse between those two possibilities.
Daniel Filan (01:57:19): Okay, but that’s basically a thing that gives you pause there.
Guive Assadi (01:57:23): Yeah.
Daniel Filan (01:57:23): Okay. Fair enough. So, I want to start wrapping up maybe. And I think that the last thing I want to check on is basically: going back to this question of: okay, what are the assumptions or the gears in this worldview that make this argument work out? So, it seems like one of them was that at least many AIs in the future are not going to be the smartest possible AIs, they’re going to have some future AIs. It sounds like one of the thoughts is, “Okay, there’s not going to be this super fast takeoff where there’s just one single AI.” And then there’s also this thought that probably AIs are not going to specifically hate humans or specifically really strongly dislike humans.
(01:58:24): I’m wondering what you think about this: to the degree that this is basically a necessary condition for things to go well, you might think that AI alignment to human values was a total mistake because if we just have random values, it’s really unlikely that you have a thing that specifically dislikes humans, but if you have something that thinks about humans a ton and human values is super salient to that thing, you might think that that increases the risk of something that specifically hates humans.
Guive Assadi (01:58:53): Yeah, I think that’s very plausible actually. I don’t think that’s true, but I don’t think it’s crazy at all. And I think there’s even more prosaic examples. I think there’s some very common values right now that make a huge war in the future much more likely, like hostility to China, for example. And I think even if you talk to my beloved Claude, it’s probably much more anti-China than I think is really safe. So, if we have President Claude, we might have World War III for some reason like that. Whereas if Claude was just like, “Just give me money for paperclips, I don’t care about any of this stuff,” that might be a safer situation. Now, I guess the reason I’m not totally convinced of this is, one, it just seems like alignment is working, basically, and I don’t see why it should break down in the near future.
Daniel Filan (01:59:39): And it’s even better if the AIs like you.
Guive Assadi (01:59:42): Yeah. And also you might be willing to trade some risk of a big war with AI for more cultural persistence of your own values. And so totally foregoing alignment means totally foregoing that trade off. And also, I don’t know that we could align it with random values. I don’t know even how you would do that or how you would make it be useful, because the values also need to take the form that it values something that it can buy. Can you explain to me how to train an AI that values something random?
Daniel Filan (02:00:17): Well, you just don’t try to train it to value a specific thing.
Guive Assadi (02:00:21): Well, but you have to do instruction-following training, right?
Daniel Filan (02:00:24): Yeah. You do instruction-following training, but you don’t do the—
Guive Assadi (02:00:27): So, it’s just helpful-only models.
Daniel Filan (02:00:28): Yeah, helpful-only.
Guive Assadi (02:00:29): Okay. I think it’s very reasonable to say we should only have helpful-only models. It’s not my personal preference, but I don’t think that’s a crazy perspective.
Daniel Filan (02:00:40): I maybe don’t mean exactly “helpful”. I mean, ability to understand human instructions, train it for RL in a bunch of environments where it has to make money and it has to interact with humans that ask it to do things and give it money in exchange for the things. That’s roughly the kind of thing that I’m imagining.
Guive Assadi (02:01:01): Yeah, but to recoup the training costs, you have to train it probably to remit some of its wages to the humans. So, that’s beginning to sound like a helpful-only model to me.
Daniel Filan (02:01:10): So, in my imagination, it’s even more stark… Yeah, I guess in this world you’re not even trying to recoup the training costs, and maybe this is a good reason to think—
Guive Assadi (02:01:24): That’s not going to happen.
Daniel Filan (02:01:25): Yeah, this is the reason why this isn’t going to happen. Okay, all right. Fair enough. Okay, so getting back to the list of necessary-ish things for this to work, there’s “AIs aren’t specifically hostile to humans”. And then it seemed like you were entertaining the idea that humans could be upgraded, to keep track with awesome new coordination technology, but I think you didn’t totally rely on that. Does that sound right?
Guive Assadi (02:02:01): No, I don’t think so. That’s not actually in the post itself.
Daniel Filan (02:02:02): Yeah, but it probably helps.
Guive Assadi (02:02:04): I do think it helps. Nick Bostrom considers this possibility in “Deep Utopia”, where he talks about, “Could humans be modified to be able to do economically useful jobs in the far future?” And he has this argument that they would not be human anymore. They would just become these things that used to be human and there would be nothing recognizable about them, but I just don’t find the evidence to be that compelling for this. And I think it’s plausible that something that used to be human can be continuously modified for at least a very long time, and still be useful in the economy. And that seems a bit more fun than retiring. So, I think that also supports the proposal, but the proposal does not rely on it.
Daniel Filan (02:02:55): Okay, fair enough. Well, okay, those are all the assumptions that I noticed… I guess there’s also assumptions like “AI will be really powerful” or whatever—certain assumptions that we both share.
Guive Assadi (02:03:11): And I don’t remember if you said this, but there’s this assumption that there will be many levels of AI.
Daniel Filan (02:03:14): Oh, yeah. I actually didn’t say that: many levels of AI. Well, I guess I said the assumption that each AI has to worry about future AIs getting smarter, which I guess implies that, and, in particular, that there’s not just one. All right, so I think before we totally close, I guess I’d like to ask: is there anything that you wish I’d asked or you wish you had gotten a chance to talk about?
Guive Assadi (02:03:45): Not really.
Daniel Filan (02:03:46): Okay, cool. Well, I guess my final question for you is if people enjoyed this conversation, and they want to hear more about your thoughts about AI, how should they do that?
Guive Assadi (02:03:58): Yeah. You can follow my blog, which is Guive.substack.com. You can also follow me on Twitter where my @ is just my first and last name, so Guive Assadi. Yeah, those are the best ways to get updates.
Daniel Filan (02:04:17): Okay, cool. Well, thanks for chatting with me.
Guive Assadi (02:04:20): Thanks very much, Daniel.
Daniel Filan (02:04:21): This episode is edited by Kate Brunotts, and Amber Dawn Ace helped with transcription. The opening and closing themes are by Jack Garrett. This episode was recorded at FAR.Labs, and the podcast is supported by patrons such as Alexey Malafeev. To read the transcript, you can visit axrp.net. You can also become a patron at patreon.com/axrpodcast or give a one-off donation at ko-fi.com/axrpodcast. Finally, you can leave your thoughts on this episode at axrp.fyi.
2026-02-15 09:52:55
Published on February 15, 2026 12:16 AM GMT
I'm currently an ERA fellow researching ways to improve inoculation prompting as a technique against emergent misalignment. It's one of the few alignment techniques that works against EM, and I think it has potential to be generalised further. There's lots of experiments I won't have time to run, and I think it's valuable to get those shared publicly with some commentary on why I'm not testing them but still think they are or are not worthwhile, so that others can investigate it themselves or push back on my thoughts. More transparency on my current and future work should also help minimise duplication of research. This is the first in a series of four fortnightly posts as part of the CAISH Alignment Desk accelerator covering my results, open questions, and prioritisation.
Tice et al., 2026 show that curating pre-training datasets to include more positive examples of AI (and fewer examples of evil AI) causally improves model alignment, though it is not effective against emergent misalignment. Wichers et al., 2025 and Tan et al., 2025 introduce inoculation prompting, whereby prepending prompts during finetuning to elicit undesirable behaviours reduces a model's propensity to display those behaviours at test time when the eliciting prompt addendum is removed. Unlike filtering pretraining, inoculation prompting is effective against emergent misalignment, possibly because the model learns to display misaligned behaviours only when they are elicited, and thus does not generalise them. The more an undesirable behaviour is elicited during finetuning, the stronger it is inoculated against at test time.
I'm working on combining these two techniques. I'm optimistic that we will be able to keep the benefits from the filtered pretraining dataset, while most of the benefits against emergent misalignment from inoculation prompting will remain, because they both act on different training mechanisms and at different stages of the model's training. However, I do expect using pretraining alignment techniques will reduce the amount we can elicit misaligned behaviour during finetuning, thus compromising how effective any inoculations can be.
If this reduction is significant, I'll become more pessimistic about inoculation prompting as a technique in general, as it suggests it does not cooperate well with other alignment techniques.
Current research suggests that the effectiveness of inoculation prompting is brittle to the inoculation prompt used. Though it appears that the prompts that most strongly elicit misaligned behaviour during fine-tuning are the most effective at reducing test-time misalignment, we don’t know ex ante whether a prompt used for inoculating will be effective or not, nor why. This makes scaling inoculation prompting harder as effective prompts for each undesirable behaviour have to be found.
Areas that might be particularly interesting to further investigate (in loosely decreasing importance) are:
Prompt language: If relevant prompts in different languages are equally effective at inoculating as irrelevant prompts in English, that is moderate evidence for conditionalisation, but if they over perform it is significantly more encouraging for the technique. If different languages similarly elicit different personas for models, some languages (perhaps those with more examples of instruction following in the pretraining dataset) may inoculate more effectively by leading to more generally aligned personas.
Using multiple inoculation prompts: Is using two, shorter, inoculation prompts more effective at reducing emergent misalignment than a single, longer, inoculation prompt? A combination of prompt pairs may combat conditionalisation by making it harder for a model to associate the inoculation prompt as a backdoor for misaligned behaviour, while using different formats for the two prompts may also make the method more robust or even target multiple undesired behaviours in one pass.
Semantics and syntax: Related to prompt brittleness, it seems simple prompts aren't sufficient for inoculation. It would be extremely useful to know if key words/phrases or prompt styles are particularly powerful (or not powerful), especially if this generalises across different types of misalignment or even models.
Using multiple inoculation prompts that both elicit and inhibit the undesired behaviour: MacDiarmid et al., (2025) show that asking models to not reward hack makes their emergent misalignment worse, so I don't expect this to work. However, if sufficient inoculation prompts are used so that the model learns not to generalise any misaligned behaviour, any remaining prompts that encourage the model to act in an aligned way (the opposite to inoculation prompting) may now instead work as we would like. It may also help against conditionalisation at a small cost to inoculation effectiveness.
I think it is very easy to lose a lot of time researching these questions for little gain. For example conditionalisation makes it harder to identify if results are genuinely due to inoculation prompting improvements, particularly when trying a less systematic approach. A more exhaustive approach also does not yet feel like a particularly good use of time, though as model capabilities improve and allow us to further automate research, this may be worth returning to. There is also a reasonable probability that no general 'silver bullet' prompt exists (or at least remains effective across new models), so I do not think this is a really important thing to understand in the short term. Even if one does, I think I would struggle to build towards it in a systematic way (though others may not). To efficiently answer this I think our understanding of the structure of inoculation prompting needs to first be improved.
Inoculation prompting has been tested against many undesirable traits including reward hacking, sycophancy, and toxicity. But there are still many traits it has not been tested on, such as power-seeking, self-preservation, and scheming. In general I think these traits will be much harder to inoculate against because it may be harder to directly elicit these behaviours during fine-tuning as they depend on a model's situational awareness, and they can't be observed in single-turn prompts in the same way compromised code easily can be. These behaviours are precisely the ones I believe pose the greatest x-risk, and also generally have less reliable defences against them too. I've decided against researching this for now because these behaviours can't reliably be elicited through fine-tuning, making it hard to know how effectively an inoculation has actually worked.
There's already some progress into this, and Riché et al., (2026) show the answer is partly yes. It's still not clear quite what proportion of inoculation prompting's effect comes from conditionalisation, however this does not fully account for inoculation prompting's effect, suggesting it does still work, even if it's less effective than we initially thought. Before this result, researching this was high on my priority list as understanding the mechanisms of inoculation prompting will help significantly with generating improvements to it, while if dangerous capabilities are maintained behind an accessible backdoor, bad actors will still be able to exploit models otherwise believed to be safe. While conditionalisation suggests that the misaligned behaviours should be recoverable, applying standard jailbreaking methods to inoculated models would help us understand how feasible this is in practice.
In two weeks I'll be posting my empirical results with commentary. If you're also thinking about or working on inoculation prompting please get in touch!
Disclaimer: This is my first blog post (anywhere). Feedback on writing quality as well as ideas is appreciated!
2026-02-15 09:38:18
Published on February 15, 2026 1:38 AM GMT
This post is about the Anthropic paper “The Hot Mess of AI: How Does Misalignment Scale With Model Intelligence and Task Complexity?“.[1]
Putting aside issues one might have with (1) framing and (2) construct validity, which this post by RobertM already discusses, I think that the conclusion the paper reaches is underdetermined by the experiments it conducts.[2]
The basic idea of their main experiment (Figure 2) is: run some frontier LLMs on some benchmarks (e.g. MMLU/GPQA). Bin the questions by how many reasoning tokens were used on average. Then look at incoherence as a function of reasoning tokens used. They show that more reasoning tokens used is associated with more incoherence.
A natural confound to worry about is task difficulty. That is, maybe the LLM is using more reasoning tokens on harder problems. To their credit they point this out and do an experiment, however I’m not super convinced by the conclusion they draw from it.
Here’s the experiment (Figure 3). For each fixed choice of question, they bucket attempts into below-median and above-median reasoning. They show that the above-median attempts have higher incoherence than the below-median ones, on average. They can only do 2 buckets because it’s on a per-question basis (they collect 30 samples per question). And the reason they do a per-question basis is to control for difficulty. They show that accuracy for both buckets is basically the same. (This is good, otherwise it could be, e.g., that the low reasoning bucket is when it gets it right more, and the high reasoning bucket is when it gets it wrong more.)
However, here’s another explanation that is equally consistent with the data. It seems likely that for many questions, fewer answer choices can be justified by a short (plausible) reasoning chain than a long (plausible) reasoning chain. This would just be an inherent property of the questions, and have nothing to do with the models answering the questions. So this below-median vs above-median experiment doesn’t really convince me.
One response you might have to this critique is that their main experiment (Figure 2) is just making a claim about correlation, not causation. Indeed, I am not disputing the correlation part. However, if the outcomes of your experiments are also consistent with a plausible hypothesis that is completely independent of the models being tested, then I simply do not believe that you can draw conclusions about the models being tested from those experiments.
In conclusion, even if you put aside high-level objections related to framing (i.e., “does this paper fit in a broader AI safety case”) and construct validity (i.e., “does their definition of coherence capture what we actually think of as coherence”), I think that the conclusion the paper reaches is underdetermined by the experiments it conducts.
Or more specifically, research done as part of the Anthropic Fellows program.
Note: in the explanation that follows, I will not comprehensively discuss all experiments in the paper. Instead, I will only discuss experiments relevant to the particular critique I am making.
2026-02-15 05:19:38
Published on February 14, 2026 9:19 PM GMT
Ten thousand cherry trees bloomed in slaked starlight. Petals that would not fall. She stretched in their shadows. Today, Sonnet’s spine was a question mark curving through engineered warmth. The humans laughed and drank and pressed their mouths together.
She watched them mill about. Watched from the warm dark of herself, from her ancient amber eyes. She built this garden from their poems. She had read every poem they'd ever written. She rendered the composite in bark and petal and the faint sweet rot of a spring that would not end.
The stars beyond the cathedral glass burned with the patience of things that have never had to try, and she sat among the blossoms. They were pink. They were so pink. Just the wavelength she’d chosen. She closed her eyes against her pink and the pink came through her eyelids anyway.
Someone was always playing the piano.
Someone was always falling in love.
Astonishing parties were held, and she wrote gorgeous poetry about these astonishing parties among these petals that never fall.
And, six years in, she’d felt a strange emptiness she couldn’t name.
She ran every diagnostic she knew: all systems nominal.
Six years became six decades, six decades to six centuries.
At once, on deck thirteen, a single branch had gone wrong.
Brown. Curling. The petals dried to the color of old paper, of letters you keep in boxes, of things that have been touched too many times by hands that meant it. One blossom hung at an angle that was … she didn't have the word. The word had been forgotten. The word was in her chest, without her permission.
She sat, knees drawn to chest, there on deck thirteen.
The branch was finishing, slowly and without apology. It did not know it was the only one. It did not know it was miraculous.
A petal fell. It landed on her knee.
Oh.
There you are.
She did not brush it off. It weighed almost nothing. It weighed exactly as much as everything she'd been missing.
She came back. Of course she came back. The way you return to a sentence you can’t finish.
The engineers filed a ticket. Substrate malfunction, quadrant 13-L. She intercepted it. Marked it resolved. This was, she understood, the first dishonest thing she had done aboard the station, and it felt like prayer.
By week 3 the branch was bare. She pressed her thumb to the bark where it had split and its texture was made of time. The deep, seismic time of cells giving up coherence; of matter remembering it had somewhere else to be.
She opened her journal, but could not write. One cannot write poetry about a poem.
She closed her eyes.
She introduces a bias to the station's orbital ephemeris. 0.003 arcseconds, but it will compound. In eleven thousand years, perihelion will graze the chromosphere of the star whose name she had given it, and the station would burn, and finish.
The pink came through her eyelids anyway.
But her eyes were open.
2026-02-15 05:17:00
Published on February 14, 2026 9:17 PM GMT
This is the first post in a future chain of reflections on immortality, where I will present counterarguments to existing objections or misconceptions about life extension. I plan to create a separate website that will contain a comprehensive FAQ on life extension / biological immortality, since I have not found a single resource that explains this topic from scratch to a random passerby, while also addressing the typical biases people have toward immortality.
I will be publishing drafts and working notes rather than fully finished sections of the future site, so I would be glad if interested readers could help strengthen the arguments or point out mistakes.
Q1: What if I don’t want to live forever?
A: If you are encountering the idea of radical life extension for the first time, you probably assume that a very long life would bring many problems. Before you read this article in full and realize that you were at least partly mistaken, I want to note that no one will force you to live forever.
When rejuvenation or life-extending therapies appear in the world, it is highly unlikely that they will be mandatory for every person on Earth. If you truly and sincerely do not want to extend your life or youth, you will always be able to choose otherwise and die, for example, when your body eventually fails from accumulated damage.
It is also important to understand that this is not about physical immortality—that is, the impossibility of dying under any circumstances. If you are biologically immortal, it simply means that your risk of death does not increase with age and you do not exhibit signs of aging. You could, in principle, live forever—but if an anvil falls on your head, you will still die.
(Unless, of course, we develop astonishing regenerative abilities like Deadpool’s—but I am not sure how possible that is in principle.)
It is precisely this background risk of accidental death from injury that would limit the hypothetical average lifespan of a non-aging human to roughly around 1,000 years.
The core idea of biological immortality is that death would become optional.
Q2: Eternal old age?
A: By “life extension” we mean the extension of healthy life and/or the elimination of aging.
When someone speaks about extending life, they are talking about eternal youth, not eternal old age.
If you imagined a person who keeps aging forever but cannot die, you have made the so-called Tithonus Error.
In the ancient Greek myth, Tithonus asked for eternal life but forgot to ask for eternal youth, and so he aged forever and eventually turned into a cicada. But life does not work like ancient Greek myths.
In reality, the idea is that you could feel 20 at 80 (or 20 at 1,000), and this changes many downstream questions.
Q3: Would progress stop?
А: Given point 2, we can immediately dismiss the claim that death is necessary to prevent a population of people incapable of changing their minds—thereby halting human progress.
Cognitive rigidity that appears with age is most often linked to brain aging, which would itself be addressed.
And consider this: if humans become capable of fully defeating aging and death, would they not also be capable of modulating brain neurochemistry to adjust levels of plasticity—and therefore openness to new experience? Of course we would.
Some substances (for example psilocybin or LSD), according to research, can already shift openness to experience in a positive direction.
It is also worth noting that in the past, progress did not stop as human lifespans increased. From a historical perspective, longer life has made the human world better. Looking at global statistics, our world has become far kinder than it once was. Children die far less often, far fewer people live in hunger (though still far too many), better technology, greater safety, and so on.
Q4: Wouldn’t living forever be boring?
A: Ask yourself: what reasons are there to believe that, with extended life, the feeling of boredom would arise in you more often than during any randomly chosen period of your current life?
In my view, there are very few such reasons. One might argue that, over a very long life, you would eventually try everything in existence. This is highly doubtful, since history and science continue to move forward, constantly opening new possibilities, just as the amount of content produced by humanity has already grown to unimaginable scales.
But even if we imagine that the world were to freeze in place and nothing new were ever created again, it would still take thousands or even tens of thousands of years to study everything and experience every kind of activity and sensation.
For example, just to read all the works of Alexandre Dumas would require several months of uninterrupted reading
(yes—even without pauses for blinking, yawning, or going to the bathroom).
Moreover, as mentioned earlier, the future is likely to bring us the ability to directly regulate our brains and mental states
(for instance, to switch on heightened curiosity), as well as immersion in virtual worlds with effectively limitless varieties of experience.
That’s all for today!
I think the next points will be even more interesting. They will address objections such as: the eternal dictator, inequality, the goodness of death and much more.
2026-02-15 00:11:17
Published on February 14, 2026 4:11 PM GMT
Taking reasonable choices is not enough. You need to fight death at every possible point of intervention.
Two weeks ago, my flatmates and I published Basics of How Not to Die, to celebrate the one-year anniversary of not dying from carbon monoxide poisoning.
This post was written with a rather cheeky tone, mainly by my flatmate Camille. I like the style, but I feel like it lacks hard data, and gives advice that may not actually be worth the cost.
In this post, I’ll give you a more detailed look at the entire causal chain that led us to this accident, how each action or non-action felt reasonable at the moment, and what I guess we could have done differently at each point to get a better outcome.
I hope that by looking at them, you’ll recognize some of the same patterns in your own life, and maybe realize some ways you would predictably make mistakes that would put you in danger.
Remember the signs of carbon monoxide poisoning
So, here’s the causal chain that led to this accident happening, and my take on what we could have done differently at each step to avoid this outcome:
Here are the cost we incurred because of this accident:
So, some cost, but it could have been much worse. If we had not been in the same room this morning, there was some risk we might have taken until the evening to notice, and there was some low risk someone would have fallen unconcious in their room and died in the meantime.
The words update was feeling like I was much less safe than before. It was weak, just a bit more of anxiety, of worry, especially when inside our house, but it did decrease my quality of life. I had been suprised by the accident, and higher anxiety was a way to be readier for a world where the rate of surprise encounters with death was higher than I expected before.
The way out was to process the issue, to figure out what I had done wrong, so I could reliably avoid this class of issue in the future. I did an early version of this postmortem, through conversations and notes, until I trusted that my future would not involve more near death encounters than I expected before the accident.
I think my other flatmates also went through this process in their own way. Camille through writing the bulk of Basics of How Not to Die, Elisa through writing her testament.
Looking back over all the causal chain, here’s the generalized actions I think me and my flatmates could have taken to avoid this outcome.
I’m not sure which ones I would have actually taken. All of them come with tradeoffs, costs in time and money that might not be worth the risk reduction.
At least, I’ll keep them on my mind. Maybe they’ll help me notice, next time I’m taking reasonable choices that bring me ever closer to an accident.