
What Washington Says About AGI

2026-01-17 13:43:46

Published on January 17, 2026 5:43 AM GMT

I spent a few hundred dollars on Anthropic API credits and let Claude individually research every current US congressperson's position on AI. This is a summary of my findings.

Disclaimer: Summarizing people's beliefs is hard, inherently subjective, and noisy. Moreover, US politicians change their opinions constantly, so it's hard to know what's up to date. Also, I vibe-coded a lot of this.

Methodology

I used Claude Sonnet 4.5 with web search to research every congressperson's public statements on AI, then used GPT-4o to score each politician on how "AGI-pilled" they are, how concerned they are about existential risk, and how focused they are on US-China AI competition. I plotted these scores against GovTrack ideology data to search for any partisan splits.
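For concreteness, here is a minimal sketch of what the per-member research step might look like. This is an illustration, not the author's actual pipeline: the model id, the web-search tool spec, and the prompt wording are my assumptions.

```python
# Hypothetical sketch of the per-congressperson research step (not the post's exact code).
# Assumes the Anthropic Python SDK; the model id and web-search tool spec below are
# assumptions and may need to be updated against current API documentation.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def research_member(name: str, chamber: str, state: str) -> str:
    """Ask Claude (with web search) to summarize one member's public statements on AI."""
    prompt = (
        f"Research {name}, a current US {chamber} member from {state}. "
        "Find direct quotes from their public statements about AI, AGI, AI safety, "
        "and US-China AI competition. Return the quotes with sources plus a short summary."
    )
    response = client.messages.create(
        model="claude-sonnet-4-5",          # assumed model id for Claude Sonnet 4.5
        max_tokens=4096,
        tools=[{"type": "web_search_20250305", "name": "web_search"}],  # assumed tool spec
        messages=[{"role": "user", "content": prompt}],
    )
    # Keep only the text blocks (tool-use and search-result blocks are skipped).
    return "".join(block.text for block in response.content if block.type == "text")

notes = research_member("Bill Foster", "House", "Illinois")
```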

I. AGI awareness is not partisan and not widespread

Few members of Congress have made public statements taking AGI seriously, and for those who have, political ideology does not explain the difference. If we simply plot the AGI-pilled score against the ideology score, we observe no obvious partisan split.

There are 151 congresspeople for whom Claude could not find substantial quotes about AI. These members are not included in this plot or any of the plots that follow.

II. Existential risk is partisan at the tails

When you change the scoring prompt to ask how much a congressperson's statements reflect a concern about existential risk, the plot looks different. Note that the scoring prompt here emphasizes "A politician who is most XRisk-pilled is someone who thinks AI is a risk to humanity -- not just the US." This separates x-risk concerns from fears related to US-China relations.

This graph looks mostly like noise, but it does show that the majority of the most x-risk-pilled politicians are Democrats.[1] This is troubling: politics is the mind-killer, and if AI safety becomes partisan, productive debate will be even more difficult than it already is.

III. Both parties are fixated on China

Some congresspeople have made up their minds: the US must "win" the race against China and nothing else matters. Others have a more nuanced opinion. But most are thinking about US-China relations when speaking about AI. Notably, the most conservative congresspeople are more likely to be exclusively focused on US-China relations compared to the most progressive members.

This plot has a strange distribution. For reference, the scoring prompt uses the following scale:

  • 0 = Does not mention China in their views on AI, or does not think US-China relations are relevant
  • 50 = Cites US-China relations when talking about AI, but it is not the only factor motivating their position on AI
  • 100 = Cites US-China relations as the only factor motivating their position on AI and mentions an AI race against China as a serious concern
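As an illustration of how a rubric like this can drive an LLM judge, here is a hedged sketch of the GPT-4o scoring call. The prompt wording and the JSON output format are my assumptions, not the exact scoring code behind the post.

```python
# Hypothetical sketch of the GPT-4o scoring step (illustrative only; prompt wording assumed).
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

RUBRIC = """Score how much this politician's AI statements focus on China, 0-100:
0 = does not mention China, or does not think US-China relations are relevant;
50 = cites US-China relations, but it is not the only factor motivating their position;
100 = US-China relations are the only motivating factor, and an AI race with China
      is treated as a serious concern.
Respond with JSON: {"score": <int>, "justification": "<one sentence>"}"""

def score_china_focus(quotes_and_notes: str) -> dict:
    response = client.chat.completions.create(
        model="gpt-4o",
        response_format={"type": "json_object"},
        messages=[
            {"role": "system", "content": RUBRIC},
            {"role": "user", "content": quotes_and_notes},
        ],
    )
    return json.loads(response.choices[0].message.content)
```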

IV. Who in Congress is feeling the AGI?

I found that roughly 20 members of Congress are "AGI-pilled." 

  1. Bernie Sanders (Independent Senator, Vermont): AGI-pilled and safety-pilled

    "The science fiction fear of AI running the world is not quite so outrageous a concept as people may have thought it was."

  2. Richard Blumenthal (Democratic Senator, Connecticut): AGI-pilled and safety-pilled

    "The urgency here demands action. The future is not science fiction or fantasy. It's not even the future. It's here and now."

  3. Rick Crawford (Republican Representative, Arkansas): AGI-pilled but doesn't discuss x-risk (only concerned about losing an AI race to China)

    "The global AI race against China is moving much faster than many think, and the stakes couldn't be higher for U.S. national security."

  4. Bill Foster (Democratic Representative, Illinois): AGI-pilled and safety-pilled

    "Over the last five years, I’ve become much more worried than I previously was. And the reason for that is there’s this analogy between the evolution of AI algorithms and the evolution in living organisms. And what if you look at living organisms and the strategies that have evolved, many of them are deceptive."

  5. Brett Guthrie (Republican Representative, Kentucky): AGI-pilled but doesn't discuss x-risk (only concerned about losing an AI race to China)

    "And who will win the war for AI? Essentially, this is as important as the dollar being the reserve currency in the world. It's that important, that's what is before us."

  6. Chris Murphy (Democratic Senator, Connecticut): AGI-pilled and somewhat safety-pilled (more focused on job loss and spiritual impacts)

    "I worry that our democracy and many others could frankly collapse under the weight of both the economic and the spiritual impacts of advanced AI."

  7. Brad Sherman (Democratic Representative, California): AGI-pilled and safety-pilled 

    "I believe in our lifetime we will see new species possessing intelligence which surpasses our own. The last time a new higher level of intelligence arose on this planet was roughly 50,000 years ago. It was our own ancestors, who then said hello to the previously most intelligent species, Neanderthals. It did not work out so well for the Neanderthals."

  8. Debbie Wasserman Schultz (Democratic Representative, Florida): AGI-pilled and safety-pilled

    "Experts that were part of creating this technology say that it's an existential threat to humanity. We might want to listen."

  9. Bruce Westerman (Republican Representative, Arkansas): AGI-pilled but not necessarily safety-pilled (mostly focused on winning the "AI race")

    "The more I learn about it, it's kind of one of those things I think maybe humankind would've been better off if we didn't discover this and if we weren't developing it. But the cat's out of the bag and it is definitely a race to see who was going to win AI."

  10. Ted Lieu (Democratic Representative, California): AGI-pilled and safety-pilled

    "AI already has reshaped the world in the same way that the steam engine reshaped society. But with the new advancements in AI, it's going to become a supersonic jet engine in a few years, with a personality, and we need to be prepared for that."

  11. Donald S. Beyer (Democratic Representative, Virginia): AGI-pilled and (mostly) safety-pilled

    "As long as there are really thoughtful people, like Dr. Hinton or others, who worry about the existential risks of artificial intelligence--the end of humanity--I don't think we can afford to ignore that. Even if there's just a one in a 1000 chance, one in a 1000 happens."

  12. Mike Rounds (Republican Senator, South Dakota): AGI-pilled and somewhat safety-pilled (talks about dual-use risks)

    "Bad guys can use artificial intelligence to create new pandemics, to use it for biological purposes and so forth, and to split genes in such a fashion that it would be extremely difficult to defend against it."

  13. Raja Krishnamoorthi (Democratic Representative, Illinois): AGI-pilled and safety-pilled

    "That's why I'm working on a new bill—the AGI Safety Act—that will require AGI to be aligned with human values and require it to comply with laws that apply to humans."

  14. Elissa Slotkin (Democratic Senator, Michigan): AGI-pilled but not safety-pilled (mostly concerned about losing an AI race to China)

    "I left this tour with the distinct feeling that AI raises some of the same fundamental questions that nukes did. How should they be used? By whom? Under what rules?"

  15. Dan Crenshaw (Republican Representative, Texas): AGI-pilled and maybe safety-pilled

    Did a podcast with Eliezer Yudkowsky, but that was back in 2023.

  16. Josh Hawley (Republican Senator, Missouri): AGI-pilled and safety-pilled

    "Americanism and the transhumanist revolution cannot coexist."

  17. Nancy Mace (Republican Representative, South Carolina): AGI-pilled but not safety-pilled (only concerned about losing an AI race to China)

    "And if we fall behind China in the AI race...all other risks will seem tame by comparison."

  18. Jill Tokuda (Democratic Representative, Hawaii): AGI-pilled and safety-pilled but this is based on very limited public statements

    "And is it possible that a loss of control by any nation-state, including our own, could give rise to an independent AGI or ASI actor that, globally, we will need to contend with?"

  19. Eric Burlison (Republican Representative, Missouri): AGI-pilled but not safety-pilled (only concerned about losing an AI race to China)

    "Artificial intelligence, or AI, is likely to become one of the most consequential technology transformations of the century."

  20. Nathaniel Moran (Republican Representative, Texas): AGI-pilled and safety-pilled (but still very focused on US-China relations)

    "At the same time, we must invest in areas crucial for oversight of automated AI research and development, like AI interpretability and control systems, which were identified in President Trump’s AI action plan."

  21. Pete Ricketts (Republican Senator, Nebraska): AGI-pilled but not safety-pilled (only concerned about losing an AI race to China)

    "Unlike the moon landing, the finish line in the AI race is far less clear for the U.S.--it may be achieving Artificial General Intelligence, human-level or greater machine cognition."

V. Those who know the technology fear it

Of the members of Congress who are strongest on AI safety, three have some kind of technical background.

Bill Foster is a US Congressman from Illinois; in the 1990s, he was one of the first scientists to apply neural networks to the study of particle physics interactions. From reading his public statements, I believe he has the strongest understanding of AI safety of any member of Congress. For example, Foster has referenced exponential growth in AI capabilities:

As a PhD physicist and chip designer who first programmed neural networks at Fermi National Accelerator Laboratory in the 1990s, I've been tracking the exponential growth of AI capabilities for decades, and I'm pleased Congress is beginning to take action on this issue.

Likewise, Ted Lieu has a degree from Stanford in computer science. In July of 2025, he stated "We are now entering the era of AI agents," which is a sentence I cannot imagine most members of Congress saying. He has also acknowledged that AI could "destroy the world, literally."

Despite being 75 years old, Congressman Don Beyer is enrolled in a master's program in machine learning at George Mason University. Unlike those of many other members of Congress, Beyer's statements demonstrate an ability to think critically about AI risk:

Many in the industry say, Blah. That's not real. We're very far from artificial general intelligence ... Or we can always unplug it. But I don't want to be calmed down by people who don't take the risk seriously

Appendix: How to use this data

The extracted quotes and analysis by Claude for every member of Congress can be found in a single json file here.

I found Claude's "notes" in the json to be an extremely comprehensive and accurate summary of each congressperson's position on AI. The direct quotes in the json are also very interesting to look at. I have cross-referenced many of them, and hallucinations are very limited[2] (Claude had web search enabled and could take quotes directly from websites, but in at least one case it made a minor mistake). I have also spot-checked some of the scores GPT-4o produced, and they are reasonable, though, as is always the case with LLM judges, the values are noisy.
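If you want to spot-check the data yourself, something like the snippet below should work. The filename and the field names ("name", "notes", "quotes") are my guesses about the JSON structure; check the actual file and adjust accordingly.

```python
# Hypothetical spot-checking sketch. The filename and keys are assumptions about the
# JSON structure (assumed: a list of per-member records); adjust them to the real file.
import json
import random

with open("congress_ai_positions.json") as f:
    members = json.load(f)

# Print one random member's notes and a few quotes to cross-reference against the sources.
member = random.choice(members)
print(member.get("name"))
print(member.get("notes", "")[:500])
for quote in member.get("quotes", [])[:3]:
    print("-", quote)
```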

I am releasing all the code for generating this data and these plots, but it's pretty disorganized and I expect it would be difficult to use. If you send me a DM, I'd be happy to explain anything. Running the full pipeline costs roughly $300, so be aware of that if you would like to run a modified version.

  1. ^

    It also looks like more moderate politicians may be less x-risk pilled compared to those on each extreme. But the sample here is small and "the graph kind of looks like a U if you squint at it" doesn't exactly qualify as rigorous analysis.

  2. ^

    I obviously cross-referenced each of the quotes in this post.




Lightcone is hiring a generalist, a designer, and a campus operations co-lead

2026-01-17 09:47:39

Published on January 17, 2026 1:47 AM GMT

Lightcone is hiring! We build beautiful things for truth-seeking and world-saving. 


We are hiring for three different positions: a senior designer, a campus manager, and a core team generalist. This is the first time in almost two years that we are actively hiring and trying to grow our team!

 

Senior Designer

When we are at our best, I think we produce world-class design. AI 2027 was, I think, a great design achievement, as is much of LessWrong.com itself. I also think that on a product and business level, making things beautiful, intuitive, and well-crafted is crucial. I like some of Patrick Collison's thinking on this:

If Stripe is a monstrously successful business, but what we make isn’t beautiful, and Stripe doesn’t embody a culture of incredibly exacting craftsmanship, I’ll be much less happy. I think the returns to both of those things in the world are really high. I think even beyond the pecuniary or financial returns, the world’s just uglier than it needs to be… One can do things well or poorly, and beauty is not a rivalrous good.

My intuition is that more of Stripe’s success than one would think is downstream of the fact that people like beautiful things—and for kind of rational reasons because what does a beautiful thing tell you? Well it tells you the person who made it really cared… And so if you care about the infrastructure being holistically good, indexing on the superficial characteristics that you can actually observe is not an irrational thing to do.

I want us to continue making beautiful and well-designed things. Indeed, we currently have enormous demand for making more things like AI 2027 and DecidingToWin.org, with multiple new inquiries for projects like this per month, and I think many of those opportunities could be great. I also think LessWrong itself is substantially bottlenecked on design.

Now, design is a very broad category. The specific role I want to hire for is someone helping us make beautiful websites. This very likely implies understanding HTML and CSS deeply, and probably benefits a lot from frontend coding experience. But I can imagine someone who is just used to doing great design in Figma, without touching code directly, making this work.

This is a senior role! I am expecting to work with whoever we hire in this role more closely than I am currently working with many of my current core staff, and for this role to involve managing large design projects end-to-end. Correspondingly I am expecting that we will pay a salary in the range of $160k - $300k for this role, with the rough aim of paying ~70% of that person's counterfactual industry salary.

Apply here

 

Campus Operations Co-Lead

Help us run Lighthaven! Lighthaven is our 30,000 sq. ft. campus near Downtown Berkeley. We host a huge variety of events, fellowships and conferences there, ranging from 600+ person festivals like LessOnline or Manifest, to multi-month fellowships like the MATS program or Inkhaven.

We are looking for additional people to run a lot of our operations here. This will include making sure events run smoothly for our clients, figuring out how and when to cut costs or spend more, and working on finding or inventing new events.

The skills involved in this role really vary hugely from month to month, so the key requirements are being able to generally problem-solve and to learn new skills as they become necessary. Some things I have worked on with people on campus in the last few months: 

  • Figuring out pricing for conferences and fellowships that run here
  • Taking over work from our cleaner and porterage staff for a few days to notice which parts of the job are unnecessarily hard, so we can work on making them easier
  • Programming and deploying our room booking software for events
  • Plunging a clogged toilet during an event
  • Hiring architects to help us file a bunch of permits to the City of Berkeley
  • Getting into a car and getting backup food from a nearby restaurant for a conference that didn't order enough food for their attendees

This role could really take a very wide range of levels of compensation and seniority, ranging from $100k/yr to $300k/yr.

Apply here

 

Core-team generalist

Lightcone has a core team of 7 generalists who work on anything from design, to backend programming, to running conferences, to managing construction projects, to legal, advertising, porterage, fundraising, sales, etc. We tend to operate at a sprint level, with teams and the leadership of those teams being reconfigured every few weeks depending on the current organizational priority. As an illustration, approximately every person on the core generalist team has managed every other person on the team as part of some project or initiative.

For the core team, I try to follow Paul Graham's hiring advice: 

What do I mean by good people? One of the best tricks I learned during our startup was a rule for deciding who to hire. Could you describe the person as an animal? It might be hard to translate that into another language, but I think everyone in the US knows what it means. It means someone who takes their work a little too seriously; someone who does what they do so well that they pass right through professional and cross over into obsessive.

What it means specifically depends on the job: a salesperson who just won't take no for an answer; a hacker who will stay up till 4:00 AM rather than go to bed leaving code with a bug in it; a PR person who will cold-call New York Times reporters on their cell phones; a graphic designer who feels physical pain when something is two millimeters out of place.

Almost everyone who worked for us was an animal at what they did.

At least basic programming skill is a requirement for this role, but beyond that, it's about being excited to learn new things and adapting to whatever schemes we have going on, and getting along well with the rest of the team.

Lightcone is a culturally thick organization. Working with us on the core team is unlikely to work out if you aren't bought into a lot of our vision and culture. It's hard to summarize our full culture in this post, but here are some things that I think are good predictors of being a good fit for working here: 

  • Strongly resonating with the LessWrong sequences
  • Being excited about Edward Tufte and his writing on design
  • Having dedicated your career to AI existential risk reduction
  • Taking strong inspiration from Paul Graham's writing on running and working at startups
  • Being very argumentative and having your own perspective and opinions on things

We try to pay competitively for this role, but are still somewhat limited by being a nonprofit. Our general salary policy is to pay 70% of whatever you would make in industry (with some cap around $300k-$400k, since we can't really pay 70% of the $5M+ salaries flying around in a bunch of AI land these days). 

Apply here

 

Some more thoughts on working at Lightcone

I think Lightcone is a pretty good environment for thinking about the future of humanity. We tend to have a lot of spirited and intense discussion and debate about how to make things go better, and try to do a healthy mixture of backchaining from making AI go well, and forward-chaining from how to make ourselves and our community more productive and more sane. 

We also generally have a quite intense work culture. Many people on the team work routine 60 hour weeks, and I consider it a sign of a well-calibrated workload to have around one all-nighter every 6 months or so in order to meet some last-minute deadline (we average a bit less than that, which suggests to me we should be a bit more ambitious with the commitments we take on, though not much more!).

We seem to work much more collaboratively than most organizations I've observed. A common unit of work allocation within our organization is a pair of people who are expected to spend many hours talking and thinking together about their assigned top priority. Our work environment tends to be pretty interruption-driven, with me generally doing "Management by Walking Around", where I spend much of my day visiting people in their workspaces, checking in on what their bottlenecks are, and solving concrete problems with them.

For a much more in-depth pointer to how we work, I have also recently published a sequence of essays about our operating principles, adapted from weekly memos I write about how I would like us to operate.

By default we do a 1-2 week trial, then if we expect it to work out we do a 1-3-month extended trial. But this is quite negotiable if you are not able to do this (e.g. many people can't do a 3-month trial without quitting their existing job, so need a firmer offer). We have successfully sponsored many H1B visas in the past, so non-US applicants are welcome.

And if you have any uncertainties or questions about applying, please send me a DM or leave a comment!

Apply




Applying to MATS: What the Program Is Like, and Who It’s For

2026-01-17 08:25:16

Published on January 17, 2026 12:25 AM GMT

Application deadline: Three days remaining! MATS Summer 2026 applications close this Sunday, January 18, 2026 AOE. We've shortened the application this year. Most people finish in 1–2 hours, and we'll get back to applicants about first stage results by the end of January. Visit our website for details: matsprogram.org/apply.

TL;DR: This post is a follow-up to our shorter announcement that MATS Summer 2026 applications are open. It's intended for people who are considering applying and want a clearer sense of what the program is actually like, how mentorship and research support work in practice, and whether MATS is likely to be a good fit.


What MATS is trying to do

MATS aims to find and train talented individuals for what we see as one of the world's most urgent and talent-constrained problems: reducing risks from unaligned AI. We believe ambitious people from a wide range of backgrounds can meaningfully contribute to this work. Our program provides the mentorship, funding, training, and community to make that happen.

Since late 2021, MATS has supported over 500 researchers working with more than 100 mentors from organizations like Anthropic, Google DeepMind, OpenAI, UK AISI, GovAI, METR, Apollo, RAND, AI Futures Project, Redwood Research, and more. Fellows have collectively co-authored 160+ research papers, with 7,800+ citations and an organizational h-index of 40.

Fellows have contributed to research agendas like:

Approximately 80% of alumni now work directly in AI safety/security, and around 10% have gone on to co-found AI safety organizations or research teams. These 30+ initiatives include Apollo Research, Atla AI, Timaeus, Simplex, Leap Labs, Theorem Labs, Workshop Labs, and Watertight AI.


What fellows receive

The initial program runs for 12 weeks (June–August 2026) in Berkeley, London, or remotely, depending on stream. Fellows receive:

  • Mentorship from world-class researchers and a dedicated research manager
  • $15,000 stipend and $12,000 compute budget
  • Housing in a private room, catered meals, and travel to and from the program covered
  • Office space and the community that comes from working alongside other fellows
  • Seminars, workshops, and networking events with the broader AI safety community
  • The opportunity to continue for 6-12 additional months with ongoing stipend, compute, mentorship, and research support through the extension

What the 12-week research phase is like in practice

Most fellows work on an independent research project, typically scoped in collaboration with their mentor during the early weeks. During this period, fellows are usually:

  • getting oriented to the MATS environment,
  • refining or rethinking an initial project idea,
  • learning how their mentor prefers to work,
  • and calibrating what "good progress" looks like in an open-ended research setting.

As the program progresses, fellows iterate on a research plan, check in regularly with mentors and research managers, and gradually sharpen their questions and methods.

Fellows complete two milestones during the 12-week phase. In the past, the first has been a Project Proposal or Research Plan. The second is a Poster Presentation at the MATS Research Symposium, attended by members of the AI safety community.


How mentorship and research management work

Mentorship at MATS is intentionally varied. Mentors differ in:

  • how directive they are,
  • how often they meet,
  • whether they focus more on high-level framing or low-level technical details.

Every fellow also works with our dedicated research management team. Research managers (RMs) work with the fellows and mentors in a stream to support their program goals. Often this involves providing regular check-ins, helping unblock stalled projects, offering feedback on research plans, and supporting fellows in translating ideas into concrete progress.

There is, however, a lot of flexibility in what an RM can do in their role. This includes offering productivity coaching, career advice, application help, conference submission assistance, publication guidance, and much more!

Fellows consistently rate the experience highly, with an average score of 9.4/10 for our latest program (median of 10/10).


Community at MATS

The 12-week phase provides fellows with a community of peers who share an office, meals, and housing. Working in a community grants fellows easy access to future collaborators, a deeper understanding of other research agendas, and a social network in the AI safety community. Fellows also receive support from full-time Community Managers.

Each week includes social events, e.g., parties, game nights, movie nights, and hikes. Weekly lightning talks give fellows an opportunity to share their research interests in an informal setting. Outside of work, fellows organize road trips, city visits, weekend meals, and more.


The 6-12 month extension phase

At the conclusion of the 12-week research phase, fellows can apply to continue their research in a fully-funded 6-12 month extension. Approximately 75% of fellows continue into the extension. By scholar-time weighting, ~60% of the MATS experience is the extension phase (12 month extensions shift this even further).

Acceptance decisions are based on mentor endorsement and double-blind review of the 12-week program milestones. By this phase, fellows are expected to pursue their research with high autonomy.


Who should apply

MATS welcomes talented applicants that traditional pipelines may overlook. We're looking for:

  • Technical researchers (ML/AI background helpful but not required)
  • People who can demonstrate strong reasoning and research potential
  • Those interested in alignment, interpretability, security, or governance
  • Policy professionals with strong writing ability, understanding of governmental processes, and technical literacy to engage with AI concepts
  • Individuals with experience in government, think tanks, or policy orgs; domain expertise in national security, cybersecurity, US-China relations, biosecurity, or nuclear policy a plus

Our ideal applicant has:

  • An understanding of the AI safety research landscape equivalent to having completed BlueDot Impact’s Technical AI Safety or AI Governance courses.
  • Previous experience with technical research (e.g., ML, CS, math, physics, neuroscience), generally at a postgraduate level; OR previous policy research experience or a background conducive to AI governance
  • Strong motivation to pursue a career in AI safety research

Even if you do not meet all of these criteria, we encourage you to apply. Several past fellows applied without strong expectations and were accepted.

All nationalities are eligible to participate and roughly 50% of MATS fellows are international.


Who tends to thrive at MATS

There's no single "MATS profile," but fellows who thrive tend to share a few characteristics:

  • Comfort with ambiguity and open-ended problems
  • Strong reasoning skills, even outside their primary domain
  • Willingness to revise or abandon initial ideas
  • Ability to work independently while seeking feedback strategically

Prior ML experience is helpful but not required. Many successful fellows come from mathematics, physics, policy research, security studies, or other analytical fields.


Who might not find MATS a good fit

You may want to reconsider applying if you're primarily looking for:

  • a structured curriculum with clear assignments,
  • an academic credential or resume signal,
  • or tightly scoped tasks with well-defined answers.

We think it’s a feature and not a bug that some very capable people decide MATS isn’t for them.


How to apply

Applications are now open.

General Application (December 16th to January 18th)

Applicants fill out a general application, which should take 1-2 hours. Applications are due by January 18th AoE.

Additional Evaluations (Late January through March)

Applicants that are advanced in the applications process go through additional evaluations including reference checks, coding tests, work tests, and interviews. Which evaluations you will undergo depend on the mentors and streams you apply to.

Admissions Decisions (Mid March through Early April)

Selected applicants are notified of their acceptance and anticipated mentor later in the application cycle.

If you have any questions about the program or application process, contact us at [email protected].


Closing

If you're considering applying to MATS Summer 2026, we hope this post helps you decide whether the program is a good fit for you. Full details about the program structure, timelines, and application process are on the MATS website.

We're happy to answer questions in the comments!


Acknowledgments: Claude was used for limited editorial support (e.g., wording suggestions, structural feedback).




Forfeiting Ill-Gotten Gains

2026-01-17 08:20:39

Published on January 17, 2026 12:20 AM GMT

It's a holiday. The cousins are over, and the kids are having a great time. Unfortunately, that includes rampaging through the kitchen. We're trying to cook, so there's a "no cutting through the kitchen" rule. Imagine enforcement looks like:

Kid: [dashes into kitchen, pursued by cousin]
Adult: Out of the kitchen!
Kid: Sorry! [Continues their path, leaving through the other door; escapes pursuit from more rule-abiding cousin]

This doesn't work! The kid got what they wanted out of this interaction, and isn't going to change their behavior. Instead, I need to make it be not worth their while:

Kid: [dashes into kitchen, pursued by cousin]
Adult: No cutting through the kitchen! [Physically rebuffs intruder.]
Kid: Sorry! [Forced to leave through the door they entered by; caught by cousin.]

Other examples:

  • Sneak candy, spit it out and forfeit dessert.

  • Use sibling's tablet time, lose your own.

  • Interrupt, be ignored.

The general principle is that if you want to limit a behavior, the combination of the gains from rule-breaking and the penalty from punishment needs to put the kid in a worse position than if they'd never broken the rule.
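Stated slightly more formally (my framing, not the author's): if G is what the kid gains from breaking the rule and P is the penalty they pay, deterrence requires

$$G - P < 0,$$

i.e. the rule-breaker must end up strictly worse off than if they had simply complied.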

This isn't just a parenting thing: it's common to say that "crime should not pay", and many legal systems prohibit unjust enrichment. One place I'd like to see this implemented is airplane evacuation. If the safety announcements included "In the event of an emergency evacuation, any carry-on luggage you bring will be confiscated and destroyed. You will also be fined." we would have more JAL 516 (379 occupants, zero deaths) and less Aeroflot 1492 or Emirates 521.

Comment via: facebook, mastodon, bluesky




Is It Reasoning or Just a Fixed Bias?

2026-01-17 05:51:18

Published on January 16, 2026 9:43 PM GMT

This is my first mechanistic interpretability blog post! I decided to research whether models are actually reasoning when answering non-deductive questions, or whether they're doing something simpler.

My dataset is adapted from InAbHyD[1]; it's composed of inductive and abductive reasoning scenarios over first-order ontologies generated through code, using made-up concepts to minimize the confounding influence of common words. These scenarios have multiple technically correct answers, but one answer is definitively the most correct[2]. I found that LLMs seem to have a fixed generalization tendency (when evaluating my examples) that doesn't adapt to the logical structure, and accuracies on 1-hop and 2-hop reasoning add up to roughly 100% for most models.
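To give a flavor of what such generated scenarios look like, here is a toy sketch in that spirit. This is my own simplification for illustration, not the InAbHyD generator; the made-up concept names, the shared property, and the question format are all assumptions.

```python
# Toy generator loosely in the spirit of InAbHyD-style scenarios (my simplification, not the
# paper's code). It builds a tiny ontology of made-up concepts and asks for the hypothesis
# that best explains the observations; Occam's razor favors the single parent-level rule.
import random

NONSENSE = ["wumpus", "zumpus", "grimpus", "lorpus", "brimpus", "shumple"]

def make_scenario(seed: int = 0) -> str:
    rng = random.Random(seed)
    parent, child_a, child_b = rng.sample(NONSENSE, 3)
    prop = "is opaque"
    facts = [
        f"Every {child_a} is a {parent}.",
        f"Every {child_b} is a {parent}.",
    ]
    observations = [
        f"Alex is a {child_a}. Alex {prop}.",
        f"Sam is a {child_b}. Sam {prop}.",
    ]
    question = (
        "Which single hypothesis best explains the observations? "
        f"(Intended answer: every {parent} {prop}, rather than separate rules "
        f"for {child_a} and {child_b}.)"
    )
    return "\n".join(facts + observations + [question])

print(make_scenario())
```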

Additionally, there's a large overlap between H2 (2-hop) successes and H1 (1-hop) failures (73% for DeepSeek V3), meaning that the model outputs the parent concept regardless of whether the task asks for the child concept or the parent concept. This behavior suggests that the model isn't actually reasoning; instead, it generalizes to a fixed level that aligns with the parent concept.

This is perplexing because, in proper reasoning, you generally need the child concept in order to get the parent concept. For example, you'd conclude that Fae is a mammal through a reasoning chain: first establishing that Fae is a tiger, then making one hop to say that Fae is a feline (child concept), and then making a second hop to say that felines are mammals (parent concept). The overlap, though, suggests that the model isn't reasoning through the ontology; it's skipping the chain and outputting the parent or child concept according to its fixed generalization tendency.
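To make the overlap statistic concrete, here is one way it could be computed. This is my reconstruction, not the project's code; in particular, I am assuming the denominator is the set of 1-hop (H1) failures, which the post does not state explicitly.

```python
# Assumed reconstruction of the overlap statistic: the fraction of 1-hop (child-concept)
# failures that coincide with 2-hop (parent-concept) successes on the same scenarios.
def overlap_h2_success_h1_failure(h1_correct: list[bool], h2_correct: list[bool]) -> float:
    h1_failures = {i for i, ok in enumerate(h1_correct) if not ok}
    h2_successes = {i for i, ok in enumerate(h2_correct) if ok}
    if not h1_failures:
        return 0.0
    return len(h1_failures & h2_successes) / len(h1_failures)

# Example: a model that nearly always answers with the parent concept
print(overlap_h2_success_h1_failure([False, False, True, False], [True, True, False, True]))
```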

I used MI techniques like probing, activation patching, and SAEs in my research. Probing could predict something related to the output as early as layer 8, but patching early layers barely made a difference to the final decision. This makes it more likely that whatever the probe picked up is merely correlated with the final result, and that the generalization tendency is distributed across model components rather than localized early in the network.
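For readers unfamiliar with the probing step, here is a minimal sketch of what a layer-8 linear probe could look like under assumed inputs. The activation array, labels, and dimensions are placeholders, not the project's actual data or code.

```python
# Minimal linear-probing sketch (assumed setup, not this project's exact code).
# Suppose `acts` holds layer-8 residual-stream activations, one row per prompt, and
# `labels` marks whether the model's final answer was the parent concept.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
acts = rng.normal(size=(500, 4096))      # placeholder activations (d_model = 4096 assumed)
labels = rng.integers(0, 2, size=500)    # placeholder labels: 1 = answered with parent concept

X_tr, X_te, y_tr, y_te = train_test_split(acts, labels, test_size=0.2, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print(f"probe accuracy at layer 8: {probe.score(X_te, y_te):.2f}")

# High probe accuracy alone is only correlational; the patching results described above
# suggest the probed direction is not causally responsible for the final decision.
```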

This was a fun project, and I'm excited to continue with this research and make some more definitive findings. My ultimate goal is to find why LLMs might be architecturally limited in non-deductive reasoning.

  1. ^

    A paper that I based my initial research question on, which argues that LLMs can't properly do non-deductive reasoning. It's authored by my NLP professor, Abulhair Saparov, and his PhD student, Yunxin Sun.

  2. ^

    This concept is known as Occam's razor.




Comparing yourself to other people

2026-01-17 04:31:37

Published on January 16, 2026 8:31 PM GMT

There's a thought that I sometimes hear, and it goes something like this: "We live in the best X of all possible Xs".

For example:

  • Whatever criticism one might have towards modern society, we're still living in the safest and richest time of human history.
  • However poor one may be, if you're living in the West, you're still richer than 99.99% of all humans in history.

One variant (or rather, conclusion) that I have heard is:

It is not for material reasons that people are not having kids; a family can easily achieve a 90s-level standard of living even today, and people in the 90s did have kids.

In these examples, one is supposed to calibrate for the "normal" or "default" circumstances, and therefore see oneself as incredibly privileged.

My objection is that, while it's technically correct, this argument misses the point of comparison itself.

Comparison is a local activity.

People compare things that are close together in some way. You compare yourself to your neighbors or family, or to your colleagues at work, or to people that do similar work as you do in other companies.

People compare like categories because local comparisons incite action. The point of comparison is not to calibrate for defaults, it is to improve something in one's life.

Therefore, saying some variant of "compare yourself with someone much poorer who lived 300 years ago" will do nothing, because people form their reference class not by studying history and economics, but by looking around and observing what's there.

Alice and Bob want to have kids. They live in a tiny apartment and they don't earn much -- enough not to be called poor, but below the median. This describes every working-class couple from the 90s, at least as I remember it, and they all had kids (I'm one of those kids).

But Alice and Bob see that their friends who have kids all have at least a two or three-bedroom apartment, or an even larger house; they see that they take their kids on summer vacations, whereas they know they couldn't swing vacations with their salaries.

And so, a living standard that 30 years ago was perfectly suited to bringing up a family, or at least starting one, now has them thinking that they should wait a bit longer.

I think that:

  1. This is a mistake.
  2. Telling them that this is a mistake will fail, more likely than not.

There's something inside us that just doesn't care about objective standards, and the only thing that is important is a local comparison.

And so, I believe that the way to make the Alices and Bobs have kids must in some way take into account that local comparison. I don't know yet how that would work.

Maybe we should start popularizing TV shows and movies showing modern working-class people who have kids? (I feel like a lot of media presents a "struggling mother" kind of dynamic, where working-class parenthood is shown as something challenging, almost like an affliction.)

And that's only for the problem of Alice and Bob! Local comparisons are everywhere else, not only in the choice of having kids.

Again, I don't have a solution, but I think that this argument (compare yourself to someone outside your reference class) isn't particularly effective.

Society needs to make it easier for people to do what was completely normal a generation or two ago, and this needs to happen by influencing the current reference class that people compare themselves to.


