
LessWrong

An online forum and community dedicated to improving human reasoning and decision-making.

RSS preview of Blog of LessWrong

Living on a ball of hair

2025-12-10 15:38:06

Published on December 10, 2025 7:38 AM GMT

Epistemic status: Intuitive world model presented in a visual form

[Post content: a series of images, not reproduced in this feed.]



Discuss

The funding conversation we left unfinished

2025-12-10 10:17:32

Published on December 10, 2025 2:17 AM GMT

People working in the AI industry are making stupid amounts of money, and word on the street is that Anthropic is going to have some sort of liquidity event soon (for example possibly IPOing sometime next year). A lot of people working in AI are familiar with EA, and are intending to direct donations our way (if they haven't started already). People are starting to discuss what this might mean for their own personal donations and for the ecosystem, and this is encouraging to see.

It also has me thinking about 2022. Immediately before the FTX collapse, we were just starting to reckon, as a community, with the pretty significant vibe shift in EA that came from having a lot more money to throw around.

CitizenTen, in "The Vultures Are Circling" (April 2022), puts it this way:

The message is out. There’s easy money to be had. And the vultures are coming. On many internet circles, there’s been a worrying tone. “You should apply for [insert EA grant], all I had to do was pretend to care about x, and I got $$!” Or, “I’m not even an EA, but I can pretend, as getting a 10k grant is a good instrumental goal towards [insert-poor-life-goals-here]” Or, “Did you hear that a 16 year old got x amount of money? That’s ridiculous! I thought EA’s were supposed to be effective!” Or, “All you have to do is mouth the words community building and you get thrown bags of money.” 

Basically, the sharp increase in rewards has led the number of people who are optimizing for the wrong thing to go up. Hello Goodhart. Instead of the intrinsically motivated EA, we’re beginning to get the resume padders, the career optimizers, and the type of person that cheats on the entry test for preschool in the hopes of getting their child into a better college. I’ve already heard of discord servers springing up centered around gaming the admission process for grants. And it’s not without reason. The Atlas Fellowship is offering a 50k, no strings attached scholarship. If you want people to throw out any hesitation around cheating the system, having a carrot that’s larger than most adult’s yearly income will do that.

Other highly upvoted posts from that era:

I wish, for many reasons, that FTX hadn't committed fraud and collapsed, but one reason feels especially salient right now: we never finished processing how abundant funding affects a high-trust altruistic community. The conversation had barely started.

I would say that I'm worried about these dynamics emerging again, but there's something a little more complicated here. Ozy actually calls out a similar strand of dysfunction in (parts of) EA in early 2024:

Effective altruist culture ought to be about spending resources in the most efficient way possible to do good. Sure, sometimes the most efficient way to spend resources to do good doesn’t look frugal. I’ve long advocated for effective altruist charities paying their workers well more than average for nonprofits. And a wise investor might make 99 bets that don’t pay off to get one that pays big. But effective altruist culture should have a laser focus on getting the most we can out of every single dollar, because dollars are denominated in lives.
...
It’s cool and high-status to travel the world. It’s cool and high-status to go on adventures. It’s cool and high-status to spend time with famous and influential people. And, God help us, it’s cool and high-status to save the world.

I think something like this is the root of a lot of discomfort with showy effective altruist spending. It’s not that yachting is expensive. It’s that if your idea of what effective altruists should be doing is yachting, a reasonable person might worry that you’ve lost the plot.

So these dynamics are not "emerging again". They haven't left. And I'm worried that they might get turbocharged when money comes knocking again.



Discuss

Do you expect the first AI to cross NY's RAISE Act's "Critical Harm" threshold to be contained?

2025-12-10 09:04:40

Published on December 10, 2025 1:04 AM GMT

I expect a lot of the benefit of the RAISE Act to come from the required safety and security protocols, but I wanted to get a sense of what scenarios people imagine when they think about the "Critical Harm" threshold. To my mind there are two main scenarios. In the first, a somewhat-strong AI is given more authority than it should be, fumbles, and is then contained. In the second, a genuinely strong AGI/ASI has decided that killing its enemies is more important than being stealthy, and humanity won't be able to contain it. A third scenario, in which an AGI launches a takeover attempt and fails, seems unlikely to me. The first scenario seems like it will inevitably start happening soon and will recur every so often until the second scenario happens, but my timelines are short enough that the second scenario might happen first.

What other scenarios are people thinking of and what odds would they put on them being recoverable failures?



Discuss

TT Self Study Journal #5

2025-12-10 06:16:38

Published on December 9, 2025 10:16 PM GMT

[Epistemic Status: This is an artifact of my self-study, which I am using to help manage my focus. As such, I don't expect anyone to read it in full. If you have particular interest or expertise, skip to the relevant sections, and please leave a comment, even just to say "good work/good luck". I'm hoping for a feeling of accountability and would like input from peers and mentors. This may also serve as a guide for others who wish to study in a similar way to me.]

Previous Entry: SSJ #4

Highlights

Review of 4th Sprint

My goals for this sprint were:

  • SSJ--1 -- Write
    • Make an article or doc to contain and organize articles I would like to write.
    • Theory of Change
    • OIS explainer
    • MAAT
    • AIA Terminology Review
  • SSJ--2 -- Read
    • Search and read various articles for AIA Terminology Review.
    • Spend some time reading and commenting on one random LW article 4 days/week.
  • SSJ--3 -- Math
    • Low priority: Continue reading C. Kosniowski's "Algebraic Topology"
  • SSJ--4 -- Experimentation (copied from last sprint)
  • SSJ--5 -- Tooling (copied from last sprint)
  • SSJ--6 -- Social
    • Develop my networking plan
      • Create a list of people I respect who may be worth reaching out to for mentorship or networking.
      • Research and reach out to people where possible and pragmatic.
      • Clarify the problems I am interested in focusing on and the capacity in which I am interested in focusing on them. (High overlap with SSJ--1 "Theory of Change" )

So how did I do?

Daily Worklog

Th, Nov 6
  • Finished and published SSJ #4
  • Set up SSJ #5 doc
  • Read and commented on Legible vs. Illegible AI Safety Problems
  • Skimmed some popular AI books while at the library. May post some thoughts on them sometime.
  • Started concept mapping for my "Theory of Change".
Fr, Nov 7
Mo, Nov 10
  • Read and commented on Ontology for AI Cults and Cyborg Egregores.
  • Transformers from Scratch (TfS): Went through and made sense of the structure and read through the overviews of each section. I will at least go through 1.1 and 1.2 in order, and then very likely continue through 1.3.1. I'd like to continue through the rest of the sections in order, but I'll judge based on how long it's taking me. I may skim the rest and only do the exercises later.
  • Started the Map articulating all talking (Maat) sequence.
  • Started looking at books and papers on high-dimensional spaces for SSJ--5.
Tu, Nov 11
  • No progress because of commuting and errands.
Wd, Nov 12
  • No progress because of 🎉 my BSc Convocation Ceremony! Computer Science Honours with Math Minor : )
Th, Nov 13 - Fr, Nov 14
  • Pretty burned out from convocation-related socializing, so not much progress. Did some work ideating and reading and commenting on some LW posts.
Mo, Nov 17
  • Planned and started writing the next post in the Maat sequence.
Tu, Nov 18
  • No progress. Distracted by interpersonal "other life stuff".
Wd, Nov 19
Th, Nov 20

Sprint Summary

Overview

I have my BSc now 🎉

I'm also feeling good about the amount of focus I had during the first two weeks of the sprint, but I then got very distracted and depressed by other things in my life and didn't manage to record any progress from then until now.

So I definitely still want to improve, but it feels like I'm moving in the right direction. I also want to get better at prioritizing things that are worth working on, and sticking to my plan. This also means putting more realistic amounts of work in each sprint. I got similar advice from @Roman Malov and from reading A Pragmatic Vision for Interpretability, so I'm hyped up on that.

More object level, I have started the Transformers from Scratch (TfS) series, written a few posts, and read and commented on several posts.

SSJ--1 -- Write

I had five items on my write list. I'm realizing that writing about my theory of change is probably more involved than I thought, so I ended up putting more focus into Maat, for which I roughly know everything I want to write; I just need to write it down.

Also, I published a post to make my planned writing publicly visible. I may change the format later, but I like the idea of the list being public, so I'm committing to that.

SSJ--2 -- Read

I didn't do any of the reading for the AIA terminology review. I still think it's a good idea, but it's quite involved, so I should probably make it a main focus if I'm going to do it.

I did manage to read and comment on LW posts. I think this is good practice, so I plan to continue. Looking at the list of all posts is a good way to engage with what other people are currently focusing on; however, there is also value in reading and commenting on older posts, and I'm not sure how to prioritize that well. One goal could be to figure out how to prioritize things to read, but that sounds dangerously meta.

SSJ--3 -- Math

I think I might have picked the book up once. I'm not getting much traction staying engaged with math study now that I'm not in classes. It is difficult enough that it might be better to treat it like the AIA terminology review: produce a description of my understanding and some worked examples as a visible output, and don't attempt it unless it is a main focus.

SSJ--4 -- Experimentation

I actually started going through Transformers from Scratch! I haven't made it very far but I've finally actually started so that's nice.

SSJ--5 -- Tooling

I started looking at literature focused on high-dimensional spaces. There's so much content out there that it's difficult to get a sense of it all. One noteworthy resource I found is "Understanding High-Dimensional Spaces" by David B. Skillicorn. I definitely want to at least skim this.

SSJ--6 -- Social

I've been engaging with people here on LW which feels like movement in the right direction, but I definitely want to be more conscientious in my planning and execution in this domain. Clarifying my goals here would probably be good.

Goals for 5th Sprint

I think trying to focus on all 6 of my pursuit categories at the same time has made me too scattered. So going forward I will instead keep them as guides, but pick fewer specific goals from within each for each sprint.

The Goals:

  • Every day spend some time on each of the following:
    • Read some LW post or other relevant material. (SSJ--2)
    • Spend some time writing or developing ideas to write (SSJ--1)
    • Work on Transformers from Scratch course (SSJ--2&4)
  • By the end of the sprint:
    • Have clarified my SSJ--6 (social networking) goals and strategy, and written a post describing them.

I'm hoping that having fewer goals will make them easier to focus on. The "every day" goals make for an easy quantification: either I worked on it that day or I didn't. I would like to be using better metrics, but I don't want to make up numbers that don't actually indicate anything useful. Numbers are good if they mean something; if they don't, it's better to speak qualitatively.


List of common acronyms:



Discuss

Lorxus Does Halfhaven: 11/29, 11/30, Highlights, Postmortem

2025-12-10 05:00:59

Published on December 9, 2025 9:00 PM GMT

I've decided to post these in weekly batches. This is the fifth of five. I'm posting these here because Blogspot's comment apparatus sucks and also because no one will comment otherwise. Because this is the last week and because there are only two posts in it, I'm also going to mention a few highlights and talk about what went well and what missed the mark.

 

29. My PhD Thesis: Part 1: Preliminaries - Algebra, Topology, Algebraic Topology

Surprise! We've actually been thinking about the basics of algebraic topology this whole time, skipping right past the horrors of point-set topology...

Every topological space has a fundamental group. Some of them are extremely weird. (Your search term here to start off down that rabbit hole is "Hawaiian earring space".)
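For anyone who wants the formal statement behind that teaser, here is a minimal sketch in standard notation (textbook material, not taken from the thesis post itself): for a pointed space \((X, x_0)\), the fundamental group is

\[
\pi_1(X, x_0) \;=\; \{\text{loops } \gamma : [0,1] \to X \text{ with } \gamma(0) = \gamma(1) = x_0\} \,\big/\, \text{homotopy rel endpoints},
\qquad [\gamma]\cdot[\delta] = [\gamma * \delta].
\]

The classic first computation is \(\pi_1(S^1) \cong \mathbb{Z}\) (loops on the circle are classified by winding number), while the Hawaiian earring mentioned above is a standard example whose fundamental group is uncountable and much stranger than any free group.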

 

30. Atoms to Agents As Filtered Through Some Tame Research-Creature

Tools are not agents, but they're almost agents. They have no goal, but seem to be clearly designed for some purpose, or have been subjected to some kind of selection pressure comparable in optimization strength to intentional design. They are always composed of other tools, all the way down to atoms; they are the bedrock of our theory. I claim no constraint on their actionspace, or the actionspace expansion that they afford to an agent wielding them. Hammers, bones, pens, ribosomes, and spoons are all tools; viruses are marginal.

 

Highlights! I've picked out the handful of posts with the ~most reads after a little while, and the handful that I was most proud of. I'll crosspost the best couple of each category at some point.

Highlights (as judged by estimated readers/day, measured on 12/9):

All others had <1.34 readers/day.

Highlights (as judged by me and what I thought were my best posts):

...OK I'd thought these were going to be substantially different from the reader's choice highlights but they mostly overlap. A few notable posts that I really liked but don't seem to have got a ton of attention: Seven-ish Evidentials From My Thought-Language, The Ultimate Sylow Theorem Guide for Algebra Quals, A Partial Theory of Flavor Pairing in Foodcraft, What's the Type of an Ontological Mismatch?, and In Defense of Boring Doom: A Pessimist's Case for Better Pessimism.

Ultimately, though, I was pretty happy with like half of these posts.

 

Postmortem! What was I hoping to get out of this? What went well, and what could have gone better? What could I have done to get more out of Budget Inkhaven Halfhaven?

  • I think it went pretty well, given that I was doing this alongside both a research grant and substantial surprise problems in my personal life.
  • I wish I'd started writing up some of the higher-context ideas earlier on, but then again there's a handful of possible posts I didn't end up writing that I'm glad I didn't.
  • I think that keeping up on writing at a comparable pace to my previous posting project was valuable, and definitely moved the needle a bit on my ability to express ideas, write quickly, and talk about high-context topics I'd otherwise be too timid to discuss. I'm still not happy with how much/little of that I did, but it's something to work off of.
  • I regret not going to Inkhaven, a little, but personal commitments, scheduling constraints, and budget constraints all conspired to make that impossible.
  • All the same, this was a good opportunity to keep emptying out my "post ideas" file; that's even with having generated a fair few extra ideas in the meantime!
  • A few ideas didn't quite make the cut this time around, just like last time.
    • Here are some of the working titles and descriptions of those varyingly dead-but-dreaming posts: A Model of Kakonomics and Low Equilibria, Three Sketches in Overendurance, Things To Know About The Bay Area Ratsphere, Go Hard And Go Weird, Attractor Basin vs Filtering vs Cultivation, Why Conceptual Engineering Is So Important, Three Thumbs Up: The Cultural Event of Blaseball, Homology Chains of Leftovers, Anxiety Is Not the Unit of Effort Either, Seven Affixes From My Thought-Language, Comprehensive Zendo Rules, and The Con Badge Problem.
    • As before, if you really want to see one of these written, reach out to me! Maybe it'll even change my mind.

As ever, I accept public feedback here, and private feedback at https://admonymous.co/lorxus .



Discuss

Tristan's list of things to write

2025-12-10 04:28:17

Published on December 9, 2025 8:28 PM GMT

The comments section of this post contains topics I want to write posts or sequences about. Each idea for a writing topic starts with "ToW". Feel free to vote or comment on any of my ideas. You may also suggest topics you would like me to write about by adding your own comment starting with "ToW".

Once a topic is written, I'll add a comment starting with "WRITTEN" that includes a link to the post or sequence.



Discuss