Published on January 12, 2026 7:37 AM GMT
I noticed that some AI-safety-focused people are very active users of coding agents, often letting them run completely unrestricted. I believe this is a bad standard to have.
To be clear, I do not think that running Claude Code will take over the world or do anything similar. But risk profiles are not binary and I believe that this new interface should be used more cautiously.
I will explain my thoughts by going over the standard reasons someone might give for running with --dangerously-skip-permissions:
The first reason is that current models seem safe: many people are using Claude Code, Codex, Cursor, etc., and there haven't been any catastrophic accidents yet, so this is a reasonable claim.
Even if today's models are safe, I expect they will get more dangerous down the line. The important thing is knowing where your red lines are. If the models get slightly better every two months, it is easy to get frog-boiled into riskier behavior by not changing your existing habits.
If you are fine with the small chance of all your files being deleted, that's okay; just define this explicitly as an acceptable risk. A year from now the consequences of a misbehaving agent could be worse and you might not notice.
The second reason is usefulness: modern-day agents are very, very useful, so you should consider their advantages for your productivity, especially if your work is important.
I think this is true, and I also think that agents are only going to get more useful from this point on. If your current workflow is autonomously running simulations and building new features, your future workflow might be autonomously writing complete research papers or building dozens of products. This will feel amazing, and be very effective, and could also help the world a lot in many important ways.
The reason people are worried about superintelligence is not that it won't contribute to personal productivity.
The risks and benefits of these powerful, general-purpose tools arise from the same set of capabilities, so a simple utilitarian calculus is hard to apply here. The benefits are concrete and quantifiable; the risks are invisible, and they always will be. Instead of trying to verify that the benefits outweigh the harms, I believe that when working with agents it is better to set up hard limits on the blast radius, then maximize your productivity within those limits.
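To make "hard limits on the blast radius" concrete, here is a minimal sketch of one possible setup: running an agent inside a throwaway Docker container with no network access and write access to a single project directory. The image name and agent command are hypothetical placeholders, not a recommendation of any specific tool; the Docker flags themselves are standard.

```python
import os
import subprocess

def run_agent_sandboxed(workdir: str, agent_cmd: list[str]) -> int:
    """Run an agent command in a container whose blast radius is one directory."""
    docker_cmd = [
        "docker", "run", "--rm",
        "--network=none",              # no exfiltration, no downloads
        "--memory=2g", "--cpus=2",     # bound resource usage
        "--read-only",                 # root filesystem is immutable
        "--tmpfs", "/tmp",             # scratch space that dies with the container
        # The ONLY writable host path: your project directory.
        "-v", f"{os.path.abspath(workdir)}:/workspace:rw",
        "-w", "/workspace",
        "agent-image:latest",          # hypothetical image with your agent installed
        *agent_cmd,                    # hypothetical agent invocation
    ]
    return subprocess.run(docker_cmd).returncode

# Worst case under this setup: the agent trashes ./my-project, and nothing else.
# run_agent_sandboxed("./my-project", ["my-agent", "--task", "fix the tests"])
```

Under this kind of setup, the worst outcome is an explicit, pre-declared one (losing the mounted directory), which is exactly the sort of acceptable-risk definition argued for above.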
There is the MIRI viewpoint of a big, discrete leap in capabilities that makes everything much more dangerous. Here is a simplified view:
If we are in such a world, then how you use your AI models doesn't really matter:
Given this view of the world, how your local user permissions are configured no longer seems relevant.
This is somewhat of a strawman: MIRI didn't say that current models are not dangerous, and, as far as I am aware, is not in favor of running today's coding agents unsupervised. But this mindset does seem common in the field - the only important thing is preparing for AGI, which is a singular moment with existential consequences, so lower risk profiles don't matter.
The endpoint probably does look like this (with extremely capable minds that can radically alter our society according to their values), but real issues begin much earlier. Unrestricted "non-AGI" agents may still run malicious code on your machine, steal sensitive information or acquire funds. Even if they won't destroy the world, we will have a lot of problems to deal with.
Similarly, people should still make sure their roofs are free of asbestos even if their country is threatened by a nuclear-weapon state.
(Also, cybersecurity failures can contribute to existential risks. In the case of a global pandemic, for example, we would need good worldwide coordination to respond well, which would require resilient digital infrastructure.)
If you use Claude to autonomously run and test code on your computer, you'll probably be fine (as of January 2026). You should still be wary of doing that. Some risk is acceptable, but it should come with an explicit model of which risks are allowed.
Agent security is an emerging field (I have heard of several startups on this topic, and a dozen more are probably setting up their LinkedIn pages as we speak). I am not well-informed about it, which is why I haven't made concrete suggestions for how to solve the problem. My point is that this is a relevant and important topic that you should consider even if your focus is on the longer-term consequences.
Published on January 12, 2026 6:37 AM GMT
Sometimes you begin a conversation, or announce a project, or otherwise start something. What goes up must come down[1] and just so most things that get started should get finished. It's like starting a parenthesis, every open ( should be paired with a matching ). Some such things are begun by other people, but still need you to finish them.
"Finish the thing" often includes "tell people you've finished." I think of the tell people part as "closing the loop." Closing the loop is a surprisingly useful part of doing things when working in groups.
The personal benefit of closing the loop is primarily in having one less thing to track as a dangling "should I do something about that?" I try to maintain a list of all the projects or tasks I've undertaken, ideally sorted neatly by topic and priority, but at least written down and not lost. When for whatever reason my ability to maintain that list in written form gets disrupted, I start tracking it in my head and start worrying I'll forget something. My understanding is lots of other people do it in their heads most of the time, and are often worried they're forgetting something.
The benefit to others is that they know the task got done. If they asked you to do it, plausibly it's still hanging around on their todo list to check and see if the thing got done. Why would they do that instead of just trusting you to finish the task in a timely and effective manner? Experience. Because people commonly get busy or distracted, and the task winds up not getting finished. Actively telling them 'yep, job done' is helpful. Even saying 'actually, I've decided not to do this' is useful!
Those two reasons give rise to my great appreciation for ticket management systems like Trello or Jira, and why I tend to recreate them in various forms wherever I go. But there's another context I've learned closing the loop is surprisingly useful.
In social conflicts, letting people know when a situation has (from your perspective at least) reached a resting state turns out to be especially helpful. If you were planning to talk to another party, or take a night to think about things, or otherwise the ball is in your court, then telling the first person when you've finished is helpful to them. "Thanks for bringing this to my attention, I've finished my investigation, and at present based on my understanding I think no further action is necessary." Basically, when you've stopped doing things, try and summarize your current resting state to the people directly involved who might reasonably expect to hear something back.
(If you're reading this around when I post it in Jan 2026, and you're thinking hey I've got an open loop with Screwtape he hasn't closed- yeah, there's more than a couple threads like that. This one's not a "ahah, I have been Successful and will tell others of my Success" post, this is a "dang, I keep dropping the ball in this particular way the last couple months" post.)
unless it achieves escape velocity
Published on January 12, 2026 4:25 AM GMT
I have come to spread the good word: we're doing Inkhaven again, this April 1 – 30. You can apply on the website.

Inkhaven activates people as bloggers/writers. We had 41 residents, and all of them completed the program of 30 posts in 30 days.[1] Of those 41, 30 have continued to submit blogposts to the Inkhaven slack since December 2nd, with those 30 publishing an average of 1 post per week since then.
But also because the first cohort of Inkhaven was one of my favorite months of my life. I got to write, and to be surrounded by writers I respected.

As I say, people actually published. If we add in all the visiting writers and staff who also submitted posts to Inkhaven (e.g. I wrote for 20 continuous days of the 30), then here are some summary stats.
To be clear, some of the residents did more than their mandatory 30. One day Screwtape published 3 posts. Not to be outdone, local blogging maniac Michael Dickens published 10 (!) posts in a single day. And on the last day, Vishal (who read an incredible ~80% of the content produced during Inkhaven) published a blogpost that has 22 short posts faithfully in the voice of the 22 different residents.
People overall had a pretty great experience. Here are some assorted quotes from the feedback form and blogposts they wrote about their experience.
"Writer's block is fake now? I feel my mind extremely attuned to whatever happens in the day to try and spin something about it."
—Inkhaven Resident
"I overcame a lot of my crippling perfectionism, and now I'm just excited to write!"
—Inkhaven Resident
"This is the longest period of time that I've been in 'deep work'. Turns out I normally live in a state of constant low-grade distraction—the first week of its absence was disorienting."
—Ben Goldhaber
"Having the coaches around helped SO MUCH. I am finally around writers who can give me good feedback. All I've had access to so far in my life were English teachers, who were always too overworked to challenge me."
—Inkhaven Resident
"I have a pretty clear idea about how to search for interesting ideas, and write about it from an opinionated perspective."
—Inkhaven Resident
"I feel respected... I like when programs treat their participants like adults, and care more about the substance than the niceties."
—Inkhaven Resident
"I can see doing this a lot, and forever. I could not imagine that before."
—Inkhaven Resident
"This was one of the best experiences of my life."
—Inkhaven Resident
"What a terribly wonderful month. I loved it all and never wanted it to end."
—Jenn
Over 20 established writers came by to offer support, feedback, classes, and advice: Scott Alexander, Gwern, Scott Aaronson, Adam Mastroianni, Alexander Wales, Andy Matuschak, Aella, Clara Collier, Ozy, Slime Mold Time Mold, Dynomight, and many more. Dwarkesh Patel came and did a Q&A, and CJ the X came and gave a midnight talk on objectivity in art while drinking alcohol.
These people contributed so much to the culture of Inkhaven and gave inspiration and feedback for the Residents' writing.

Here's the feedback form that 39/41 people filled out.
I showed this to Scott Alexander and he described the feedback as 'good'.
To be clear, it wasn't for everyone. One or two people knew that 30 posts in 30 days wasn't going to be good for them, and one or two people had jobs to keep up that stressed them out, and I'm not sure it was worth it for them.

I'd say that the main thing that happens is you actually write. That's the thing that happens here. Some people came in knowing what they wanted to write about, and they wanted to get it out. Some people came having no idea what they wanted, except that they wanted to explore. People had different goals and grew in different ways.

All sorts. There was history blogging, econ blogging, fictional futuristic vignettes, health advice, math blogging, dramatic personal life stories, project management advice, mental health advice, fictional parody, rationality blogging, AI alignment blogging, romance blogging, cyberpunk blogging, YouTube blogging, therapy blogging, global conflict blogging, humor, gender blogging, and, of course, writing advice.
You can view ~30 essays that did especially well on Hacker News, LessWrong, and/or as voted by Residents, in the featured essays portion of the Inkhaven website.

If you would like to grow from someone who wants to be a writer, or someone who blogs occasionally, or someone who has written some good things but not invested as much as you'd like into writing, Inkhaven is your opportunity to graduate into "actual blogger".
There were also many established bloggers who grew by doing Inkhaven—Michael Dickens, Tsvi BT, Angadh, etc. It's also an opportunity to cut out distractions and focus on your writing for a month.

I can think of ~two people who knew that writing a single blogpost in a day simply wasn't how they could produce writing they're proud of, and who have since taken substantial parts of their writing offline. I'm not saying this is good for literally everyone, though I do think it is good for most people who want to be writers.
I can think of ~two people who had serious external commitments that made focusing on their writing difficult or painful during the month, and one of them may have regretted coming because of that.

Yes! Jenn made some that you can buy here. I have them on my laptop.
The program fee is $2,000. Housing at Lighthaven starts at $1.5k, so the total price starts at $3.5k.
Financial aid is available, and last cohort around half of the residents received some amount of financial aid.
Basically you just show us some of your existing writing, and we guess whether we'd like to read more of it from you.
Go to the website to apply and find out more info!
I look forward to spending April with some of you reading this :-)

With the technical exception of one blogger who, in an effort to experiment as a rule-breaker, on his last day published a blogpost below the 500-word minimum. So technically, instead of 41 × 30 = 1,230 mandatory blogposts, we got 1,229, plus one final blogpost of only ~300 words. I'm going to ignore this as irrelevant to the question of whether the program works and whether people complete it.
Published on January 12, 2026 3:13 AM GMT
Inequality is a common and legitimate worry that people have about reprogenetic technology. Will rich people have super healthy smart kids, and leave everyone else behind over time?
Intuitively, this will not happen. Reprogenetics will likely be similar to most other technologies: At first it will be very expensive (and less effective); then, after an initial period of perhaps a decade or two, it will become much less expensive. While rich people will have earlier access, in the longer run the benefit to the non-rich in aggregate will be far greater than the benefit to the rich in aggregate, as has been the case with plumbing, electricity, cars, computers, phones, and so on.
But, is that right? Will reprogenetics stay very expensive, and therefore only be accessible to the very wealthy? Or, under what circumstances will reprogenetics be inaccessible, and how can it be made accessible?
To help think about this question, I'd like to know examples of past technologies that stayed inaccessible, even though people would have wanted to buy them.
Can you think of examples of technologies that have strongly disproportionately benefited very rich people for several decades?
Let's be more precise, in order to get at the interesting examples. We're trying to falsify some hypothesis-blob along the lines of:
Reprogenetics can technically be made accessible, and there will be opportunity to do so, and there will be strong incentive to do so. No interesting (powerful, genuine, worthwhile, compounding) technologies that meet those criteria ever greatly disproportionately benefit rich people for several decades. Therefore reprogenetics will not do that either.
So, to falsify this hypothesis-blob, let's stipulate that we're looking for examples of a technology that:
- could be made accessible (technically, it could have become cheap and widely available);
- had the opportunity to be made accessible;
- carried a strong incentive to be made accessible;
- is interesting (powerful, genuine, worthwhile, compounding);
and yet still greatly disproportionately benefited rich people for several decades.
We can relax one or more of these criteria somewhat and still get interesting answers. E.g. we can relax "could be made accessible" and look into why some given technology cannot be made accessible.
What are some other examples?
In general, necessary medical procedures tend to be largely covered by insurance. But that doesn't mean they aren't prohibitively expensive for non-rich people. Cancer patients especially tend to experience "financial toxicity": they can't easily afford all their treatments, so they're stressed out, sometimes forgo treatment, and die more. There's also some mysterious process by which drug prices rise for unclear reasons [1] (maybe just: drug companies raise prices when they can get away with it). This would be more of a political / economic issue, not an issue with the underlying technologies.
Some of these medical things, especially IVF, are kinda worrisome in connection with reprogenetics. Reprogenetics would be an elective procedure, like IVF, which requires expert labor and special equipment. It probably wouldn't be covered by insurance, at least for a while—IVF IIUC is a mixed bag, but coverage is increasing. This suggests that there should maybe be a push to include reprogenetics in medical insurance policies.
Of course, there are many technologies where rich people get early access; that's to be expected and isn't that bad. It's especially not that bad in reprogenetics, because any compounding gains would accumulate on the timescale of generations, whereas the technology would advance in years.
Lalani, Hussain S., Massimiliano Russo, Rishi J. Desai, Aaron S. Kesselheim, and Benjamin N. Rome. "Association between Changes in Prices and Out-of-Pocket Costs for Brand-Name Clinician-Administered Drugs." Health Services Research 59, no. 6 (2024): e14279. https://doi.org/10.1111/1475-6773.14279. ↩︎
Published on January 12, 2026 3:09 AM GMT
My friend Justis wrote a post this week on what his non-rationalist (“normal”) friends are like. He said:
Digital minimalism is well and good, and being intentional about devices is fine, but most normal people I know are perfectly fine with their level of YouTube, Instagram, etc. consumption. The idea of fretting about it intensely is just like… weird. Extra. Trying too hard. Because most people aren’t ultra-ambitious, and the opportunity cost of a few hours a day of mindless TV or video games or whatever just doesn’t really sting.
This seems 1) factually incorrect and 2) missing the point of everything.
First off, in my experience, worry about screen addiction doesn’t cleave along lines of ambition at all. Lots of people who aren’t particularly ambitious care about it, and lots of ambitious people unreflectively lose many hours a day to their devices.
Second, digital intentionality is about so much more than productivity. It’s about living your life on purpose. It touches every part of life, because our devices touch every part of our lives. To say that people only care about their device use because it gets in the way of their ambitions is to misunderstand the value proposition of digital intentionality.
Yesterday I got talking with the station agent while I was waiting for a train, and (completely unprompted by me) he started saying things like “Did you know that in Korea, their books say the internet is a real addiction you can have?” and “You used to have to go to Vegas to be so overstimulated; now they put touchscreens on the street!” and “I go on my phone to use the calculator and then I realize I’m just scrolling and I didn’t even ever use the calculator!”
Or right now I’m sitting at a café, and I just overheard a woman say, “Intelligent people are making things very addictive to distract us.”
‘Normal’ people care about this, which makes sense, because it affects all of us. You don’t have to be ultra-ambitious, or even ambitious at all, to feel the opportunity cost of being on your devices all the time. People lament the moments they miss with their kids or loved ones because they’re looking at their phones. And there are plenty of non-opportunity costs — people complain about their attention spans shortening, their memory getting worse. They think about how they used to be able to read books and now they can’t. And people are on their phones while they’re driving, all the time.
How to Do Nothing is a book about digital intentionality (its subtitle is Resisting the Attention Economy), whose author thinks that the entire concept of productivity makes us forget what it is to be human. To her, devices are bad in part because they keep us focused on productivity. Her thesis is that if you really pay attention to the world around you, you'll find it so interesting that you just won't want to spend time on your devices. (She made noticing, and even learning to identify, all the birds you see and hear sound so cool that now I own binoculars and go birding every weekend!)
Even Cal Newport’s Digital Minimalism is surprisingly value-agnostic, considering that Newport frames most of his books in terms of productivity. He talks about a father who used to love art, but let it fall by the wayside; after reconnecting with what he wants through digital minimalism, he starts drawing a picture to put in his child’s lunchbox every night.
I’ve read a lot of books on digital intentionality, and people mostly come to it not because they’re worried about not accomplishing their goals, but in desperation when they realize the overall impact of their devices on their lives and psyches.
People just want to be able to sit with their thoughts. They want to be able to live in moments, and remember things, and maybe read a book ever again. People want to feel like humans in a world where life is increasingly disembodied.
I’m not into digital intentionality because I have some big goal I want to accomplish, or even because I had some small goal, like reading a lot of books. (I basically don’t have goals! It’s something I struggle with.) I’m into digital intentionality because I didn’t want to lose any more years of my life to shit that gave me no value and that I wouldn’t even remember, that was designed to keep me sedentary just to drive ad revenue to companies that already have too much money. I wanted to go outside and form memories and be a person and talk to other people. And now I do.
Published on January 12, 2026 12:07 AM GMT
"I have a lot of questions", said Carol. "I need to know how this works."
"Of course", said Zosia. "Ask us anything."
Carol hesitated, gathering her thoughts. She knew that Zosia couldn't lie to her, but she also knew that she was speaking with a highly convincing superintelligence with the knowledge of all the best sophists and rhetoricians in the world. She would have to be careful not to be too easily swayed.
"I'm concerned about how your transformation affects your collective moral worth", she finally said. "I accept that you are very happy. But are you one happy person or many? And if you're one person, are you happy enough to outweigh the collective happiness of all the individuals whom you used to be?"
"That's an excellent question", replied Zosia immediately. "You're trying to determine if humanity is better off now than it was before, and you've astutely drilled down to the heart of the issue.
"To your first question, we honestly don't feel as if we are many individuals now. Subjectively, we feel more like many pieces of one mind. Certainly, we have one unified will. Insofar as different individuals have different sensations and different thoughts, we think about them as subsystems in a different mind, similar to how you can hear or see something without being consciously aware of it until something causes it to come to your conscious attention. When I talk to you, it is like your legs continuing to walk while you navigate to your destination. Your legs and the part of your brain responsible for controlling them have no independent will nor independent personhood. Does that make sense to you?"
Carol's mind raced. Zosia hadn't even tried to convince her that each human was an individual! Was she admitting that eight billion individuals had effectively been killed in favor of one? That would be a moral catastrophe.
"Your answer is really disturbing", she finally said. "I don't assign any moral value to the several parts of my brain or my nervous system. If I feel a sensation that is theoretically agreeable or disagreeable but it does not affect my conscious mind, I don't consider that to either add to or subtract from the total happiness in the world. If individuals in your collective are analogous to subsystems in my mind, I would think that your moral worth is that of one individual and not many. That would mean that humanity was much better off when we were many individuals, even if our average happiness was lower."
Zosia smiled. "I understand where you're coming from", she said gently. "But you might think about why you assign no moral value to the subsystems in your mind. Is it because they have no independent will, or is it because they are inherently primitive systems? Consider your visual processing system. Yes, it exists only to gatekeep data from and pass information to your higher-order mind, and to move your eyeballs in response to top-down signals.
"But imagine instead of a simple visual cortex, you had a fully developed human being whose job was to do the same thing that your visual cortex does now. This individual is like any human in every respect except one—his only goal is to serve your conscious mind, and he has no will except your will. I think you would still consider this person worthy of moral consideration even though his function was the same as your visual cortex.
"That means that it's not the fact that a system is part of a whole that deprives it of moral worth. No, it's simply its complexity and 'human-ness'. Yes, Zosia is—I am—merely a part of a whole, not a true individual. But I still have the full range of mental complexity of any individual human. The only thing that's different between me and you is that my will is totally subsumed into the collective will. As we've established, though, it's not independent will that makes someone worthy of moral consideration. I am happy when the collective is happy, but that doesn't make my individual happiness any less meaningful."
Carol considered Zosia's words as she walked home, needing some time to think over their conversation before they would meet again the next day. Zosia seemed convincing. Still, there was something that unsettled her. Zosia spoke as though the hive mind were analogous to individuals who happened to share the exact same knowledge and utility function. But all of them also seemed to have the same personality as well. In the analogy to subsystems of a human mind, you would expect the different individuals to have different methods, even if they had the same knowledge and the same goals. Yet each individual's actions seemed to be the output of a single, unified thought process. That made it seem like there was no local computation being done—each person's actions were like different threads of the same computer process.
Did that undermine Zosia's point, or did it just mean that she had to switch up her mental model from an "individual"—someone with a distinct personality—to an "instance", another copy of the hive mind with distinct experiences but an identical disposition? Carol wasn't sure, but she knew that she had little time to make great philosophical headway. One to three months was how long Zosia had said she had before they would have her stem cells, and therefore her life. Should she resume her efforts to put the world back the way it was?
The question continued to haunt her as she fell into a fitful sleep.