Published on January 17, 2026 5:43 AM GMT
I spent a few hundred dollars on Anthropic API credits and let Claude individually research every current US congressperson's position on AI. This is a summary of my findings.
Disclaimer: Summarizing people's beliefs is hard and inherently subjective and noisy. Likewise, US politicians change their opinions on things constantly so it's hard to know what's up-to-date. Also, I vibe-coded a lot of this.
I used Claude Sonnet 4.5 with web search to research every congressperson's public statements on AI, then used GPT-4o to score each politician on how "AGI-pilled" they are, how concerned they are about existential risk, and how focused they are on US-China AI competition. I plotted these scores against GovTrack ideology data to search for any partisan splits.
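For concreteness, here is a minimal sketch of what this two-stage pipeline could look like. The model identifiers, web-search tool configuration, prompts, and the 0–10 rubric below are illustrative assumptions, not the exact ones I used.

```python
# Illustrative sketch only: research with Claude (web search on), then score
# with GPT-4o. Model names, tool config, prompts, and the rubric are assumptions.
import json
import anthropic
from openai import OpenAI

claude = anthropic.Anthropic()
oai = OpenAI()

def research_member(name: str) -> str:
    """Collect public AI statements for one member of Congress."""
    resp = claude.messages.create(
        model="claude-sonnet-4-5",  # assumed model alias
        max_tokens=4000,
        tools=[{"type": "web_search_20250305", "name": "web_search"}],
        messages=[{
            "role": "user",
            "content": f"Find direct public quotes from {name} about AI, with "
                       "sources, and summarize their position in notes.",
        }],
    )
    # Keep only the text blocks from the response (tool-use blocks are skipped).
    return "".join(block.text for block in resp.content if block.type == "text")

def score_member(notes: str) -> dict:
    """Score the research notes on three 0-10 axes (assumed rubric)."""
    resp = oai.chat.completions.create(
        model="gpt-4o",
        response_format={"type": "json_object"},
        messages=[{
            "role": "user",
            "content": "Rate this politician from 0-10 on agi_pilled, "
                       "xrisk_concern, and us_china_focus. Reply as JSON.\n\n"
                       + notes,
        }],
    )
    return json.loads(resp.choices[0].message.content)
```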
Few members of Congress have made public statements that take AGI seriously. Among those who have, the difference does not track political ideology: plotting the AGI-pilled score against the ideology score shows no obvious partisan split.
There are 151 congresspeople for whom Claude could not find substantial quotes about AI. These members are not included in this plot or any of the plots that follow.
When you change the scoring prompt to ask how much a congressperson's statements reflect a concern about existential risk, the plot looks different. Note that the scoring prompt here emphasizes "A politician who is most XRisk-pilled is someone who thinks AI is a risk to humanity -- not just the US." This separates x-risk concerns from fears related to US-China relations.
This graph looks mostly like noise, but it does show that the majority of the most x-risk-pilled politicians are Democrats.[1] This is troubling. Politics is the mind-killer, and if AI Safety becomes partisan, productive debate will be even more difficult than it currently is.
Some congresspeople have made up their minds: the US must "win" the race against China and nothing else matters. Others have a more nuanced opinion. But most are thinking about US-China relations when speaking about AI. Notably, the most conservative congresspeople are more likely to be exclusively focused on US-China relations compared to the most progressive members.
This plot has a strange distribution. For reference, the scoring prompt uses the following scale:
I found that roughly 20 members of Congress are "AGI-pilled."
Bernie Sanders (Independent Senator, Vermont): AGI-pilled and safety-pilled
Richard Blumenthal (Democratic Senator, Connecticut): AGI-pilled and safety-pilled
Rick Crawford (Republican Representative, Arkansas): AGI-pilled but doesn't discuss x-risk (only concerned about losing an AI race to China)
Bill Foster (Democratic Representative, Illinois): AGI-pilled and safety-pilled
Brett Guthrie (Republican Representative, Kentucky): AGI-pilled but doesn't discuss x-risk (only concerned about losing an AI race to China)
Chris Murphy (Democratic Senator, Connecticut): AGI-pilled and somewhat safety-pilled (more focused on job loss and spiritual impacts)
Brad Sherman (Democratic Representative, California): AGI-pilled and safety-pilled
Debbie Wasserman Schultz (Democratic Representative, Florida): AGI-pilled and safety-pilled
Bruce Westerman (Republican Representative, Arkansas): AGI-pilled but not necessarily safety-pilled (mostly focused on winning the "AI race")
Ted Lieu (Democratic Representative, California): AGI-pilled and safety-pilled
Donald S. Beyer (Democratic Representative, Virginia): AGI-pilled and (mostly) safety-pilled
Mike Rounds (Republican Senator, South Dakota): AGI-pilled and somewhat safety-pilled (talks about dual-use risks)
Raja Krishnamoorthi (Democratic Representative, Illinois): AGI-pilled and safety-pilled
Elissa Slotkin (Democratic Senator, Michigan): AGI-pilled but not safety-pilled (mostly concerned about losing an AI race to China)
Dan Crenshaw (Republican Representative, Texas): AGI-pilled and maybe safety-pilled
Josh Hawley (Republican Senator, Missouri): AGI-pilled and safety-pilled
"Americanism and the transhumanist revolution cannot coexist."
Nancy Mace (Republican Representative, South Carolina): AGI-pilled but not safety-pilled (only concerned about losing an AI race to China)
"And if we fall behind China in the AI race...all other risks will seem tame by comparison."
Jill Tokuda (Democratic Representative, Hawaii): AGI-pilled and safety-pilled but this is based on very limited public statements
Eric Burlison (Republican Representative, Missouri): AGI-pilled but not safety-pilled (only concerned about losing an AI race to China)
Nathaniel Moran (Republican Representative, Texas): AGI-pilled and safety-pilled (but still very focused on US-China relations)
Pete Ricketts (Republican Senator, Nebraska): AGI-pilled but not safety-pilled (only concerned about losing an AI race to China)
Of the members of Congress who are strongest on AI safety, three have some kind of technical background.
Bill Foster is a US Congressman from Illinois, but in the 1990s he was one of the first scientists to apply neural networks to the study of particle physics interactions. From reading his public statements, I believe he has the strongest understanding of AI safety of any member of Congress. For example, Foster has referenced exponential growth in AI capabilities:
As a PhD physicist and chip designer who first programmed neural networks at Fermi National Accelerator Laboratory in the 1990s, I've been tracking the exponential growth of AI capabilities for decades, and I'm pleased Congress is beginning to take action on this issue.
Likewise, Ted Lieu has a degree from Stanford in computer science. In July of 2025, he stated "We are now entering the era of AI agents," which is a sentence I cannot imagine most members of Congress saying. He has also acknowledged that AI could "destroy the world, literally."
Despite being 75 years old, Congressman Don Beyer is enrolled in a master's program in machine learning at George Mason University. Unlike other members of Congress, Beyer's statements demonstrate an ability to think critically about AI risk:
Many in the industry say, Blah. That's not real. We're very far from artificial general intelligence ... Or we can always unplug it. But I don't want to be calmed down by people who don't take the risk seriously
The extracted quotes and analysis by Claude for every member of Congress can be found in a single json file here.
I found Claude's "notes" in the json to be an extremely comprehensive and accurate summary of each congressperson's position on AI. The direct quotes in the json are also very interesting to look at. I have cross-referenced many of them and hallucinations are very limited[2] (Claude had web search enabled, so it was able to take quotes directly from websites, though in at least one case it made a minor mistake). I have also spot-checked some of the scores gpt-4o produced and they are reasonable, but as is always the case with LLM judges, the values are noisy.
I am releasing all the code for generating this data and these plots, but it's pretty disorganized and I would expect it to be difficult to use. If you send me a DM, I'd be happy to explain anything. Running the full pipeline costs roughly $300, so be aware of that if you would like to run a modified version of it.
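If you just want to poke at the data rather than re-run the pipeline, something like the following is enough. The file name and field names here are guesses at the schema, not the actual keys in the released json.

```python
# Sketch of exploring the released data; file name and keys are assumed.
import json
import matplotlib.pyplot as plt

with open("congress_ai_positions.json") as f:
    members = json.load(f)

# Skip the members for whom no substantial AI quotes were found.
scored = [m for m in members if m.get("agi_pilled_score") is not None]

xs = [m["ideology"] for m in scored]          # GovTrack ideology score
ys = [m["agi_pilled_score"] for m in scored]  # LLM-judge score
colors = ["tab:red" if m["party"] == "R" else "tab:blue" for m in scored]

plt.scatter(xs, ys, c=colors, alpha=0.6)
plt.xlabel("GovTrack ideology score")
plt.ylabel("AGI-pilled score")
plt.show()
```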
It also looks like more moderate politicians may be less x-risk pilled compared to those on each extreme. But the sample here is small and "the graph kind of looks like a U if you squint at it" doesn't exactly qualify as rigorous analysis.
I obviously cross-referenced each of the quotes in this post.
Published on January 17, 2026 1:47 AM GMT
Lightcone is hiring! We build beautiful things for truth-seeking and world-saving.
We are hiring for three different positions: a senior designer, a campus manager, and a core team generalist. This is the first time in almost two years where we are actively hiring and trying to grow our team!
When we are at our best, I think we produce world-class design. AI 2027 was, I think, a great design achievement, as is much of LessWrong.com itself. I also think that on a product and business level, making things beautiful and intuitive and well-crafted is crucial. I like some of Patrick Collison's thinking on this:
If Stripe is a monstrously successful business, but what we make isn’t beautiful, and Stripe doesn’t embody a culture of incredibly exacting craftsmanship, I’ll be much less happy. I think the returns to both of those things in the world are really high. I think even beyond the pecuniary or financial returns, the world’s just uglier than it needs to be… One can do things well or poorly, and beauty is not a rivalrous good.
My intuition is that more of Stripe’s success than one would think is downstream of the fact that people like beautiful things—and for kind of rational reasons because what does a beautiful thing tell you? Well it tells you the person who made it really cared… And so if you care about the infrastructure being holistically good, indexing on the superficial characteristics that you can actually observe is not an irrational thing to do.
I want us to continue making beautiful and well-designed things. Indeed, we currently have enormous demand for making more things like AI 2027 and DecidingToWin.org, with multiple new inquiries for projects like this per month, and I think many of those opportunities could be great. I also think LessWrong itself is substantially bottlenecked on design.
Now, design is a very broad category. The specific role I want to hire for is someone helping us make beautiful websites. This very likely implies understanding HTML and CSS deeply, and probably benefits a lot from frontend coding experience. But I can imagine someone who is just used to doing great design in Figma, without touching code directly, making this work.
This is a senior role! I am expecting to work with whoever we hire in this role more closely than I am currently working with many of my current core staff, and for this role to involve managing large design projects end-to-end. Correspondingly I am expecting that we will pay a salary in the range of $160k - $300k for this role, with the rough aim of paying ~70% of that person's counterfactual industry salary.
Help us run Lighthaven! Lighthaven is our 30,000 sq. ft. campus near Downtown Berkeley. We host a huge variety of events, fellowships and conferences there, ranging from 600+ person festivals like LessOnline or Manifest, to multi-month fellowships like the MATS program or Inkhaven.
We are looking for additional people to run a lot of our operations here. This will include making sure events run smoothly for our clients, figuring out how and when to cut costs or spend more, and finding or inventing new events.
The skills involved in this role really vary hugely from month to month, so the key requirements are being able to generally problem-solve and to learn new skills as they become necessary. Some things I have worked on with people on campus in the last few months:
This role could really take a very wide range of levels of compensation and seniority, ranging from $100k/yr to $300k/yr.
Lightcone has a core team of 7 generalists who work on anything from design, to backend programming, to running conferences, to managing construction projects, to legal, advertising, portage, fundraising, sales, etc. We tend to operate at a sprint level, with teams and the leadership of those teams being reconfigured every few weeks depending on the current organizational priority. As an illustration, approximately every person on the core generalist team has managed every other person on the team as part of some project or initiative.
For the core team, I try to follow Paul Graham's hiring advice:
What do I mean by good people? One of the best tricks I learned during our startup was a rule for deciding who to hire. Could you describe the person as an animal? It might be hard to translate that into another language, but I think everyone in the US knows what it means. It means someone who takes their work a little too seriously; someone who does what they do so well that they pass right through professional and cross over into obsessive.
What it means specifically depends on the job: a salesperson who just won't take no for an answer; a hacker who will stay up till 4:00 AM rather than go to bed leaving code with a bug in it; a PR person who will cold-call New York Times reporters on their cell phones; a graphic designer who feels physical pain when something is two millimeters out of place.
Almost everyone who worked for us was an animal at what they did.
At least basic programming skill is a requirement for this role, but beyond that, it's about being excited to learn new things and adapting to whatever schemes we have going on, and getting along well with the rest of the team.
Lightcone is a culturally thick organization. Working with us on the core team is unlikely to work out if you aren't bought into a lot of our vision and culture. It's hard to summarize our full culture in this post, but here are some things that I think are good predictors of being a good fit for working here:
We try to pay competitively for this role, but are still somewhat limited by being a nonprofit. Our general salary policy is to pay 70% of whatever you would make in industry (with some cap around $300k-$400k, since we can't really pay 70% of the $5M+ salaries flying around in a bunch of AI land these days).
I think Lightcone is a pretty good environment for thinking about the future of humanity. We tend to have a lot of spirited and intense discussion and debate about how to make things go better, and try to do a healthy mixture of backchaining from making AI go well, and forward-chaining from how to make ourselves and our community more productive and more sane.
We also generally have a quite intense work culture. Many people on the team work routine 60 hour weeks, and I consider it a sign of a well-calibrated workload to have around one all-nighter every 6 months or so in order to meet some last-minute deadline (we average a bit less than that, which suggests to me we should be a bit more ambitious with the commitments we take on, though not much more!).
We seem to work much more collaboratively than most organizations I've observed. A common unit of work allocation within our organization is a pair of people who are expected to spend many hours talking and thinking together about their assigned top priority. Our work environment tends to be pretty interruption-driven, with me generally doing "Management by Walking Around", where I spend much of my day visiting people in their workspaces, checking in on what their bottlenecks are, and solving concrete problems with them.
For a much more in-depth pointer to how we work, I have also recently published a sequence of essays about our operating principles, which are adopted from weekly memos I write about how I would like us to operate:
By default we do a 1-2 week trial, then if we expect it to work out we do a 1-3-month extended trial. But this is quite negotiable if you are not able to do this (e.g. many people can't do a 3-month trial without quitting their existing job, so need a firmer offer). We have successfully sponsored many H1B visas in the past, so non-US applicants are welcome.
And if you have any uncertainties or questions about applying, please send me a DM or leave a comment!
Published on January 17, 2026 12:25 AM GMT
Application deadline: Three days remaining! MATS Summer 2026 applications close this Sunday, January 18, 2026 AOE. We've shortened the application this year. Most people finish in 1–2 hours, and we'll get back to applicants about first stage results by the end of January. Visit our website for details: matsprogram.org/apply.
TL;DR: This post is a follow-up to our shorter announcement that MATS Summer 2026 applications are open. It's intended for people who are considering applying and want a clearer sense of what the program is actually like, how mentorship and research support work in practice, and whether MATS is likely to be a good fit.
MATS aims to find and train talented individuals for what we see as one of the world's most urgent and talent-constrained problems: reducing risks from unaligned AI. We believe ambitious people from a wide range of backgrounds can meaningfully contribute to this work. Our program provides the mentorship, funding, training, and community to make that happen.
Since late 2021, MATS has supported over 500 researchers working with more than 100 mentors from organizations like Anthropic, Google DeepMind, OpenAI, UK AISI, GovAI, METR, Apollo, RAND, AI Futures Project, Redwood Research, and more. Fellows have collectively co-authored 160+ research papers, with 7,800+ citations and an organizational h-index of 40.
Fellows have contributed to research agendas like:
Approximately 80% of alumni now work directly in AI safety/security, and around 10% have gone on to co-found AI safety organizations or research teams. These 30+ initiatives include Apollo Research, Atla AI, Timaeus, Simplex, Leap Labs, Theorem Labs, Workshop Labs, and Watertight AI.
The initial program runs for 12 weeks (June–August 2026) in Berkeley, London, or remotely, depending on stream. Fellows receive:
Most fellows work on an independent research project, typically scoped in collaboration with their mentor during the early weeks. During this period, fellows are usually:
As the program progresses, fellows iterate on a research plan, check in regularly with mentors and research managers, and gradually sharpen their questions and methods.
Fellows complete two milestones during the 12-week phase. In the past, the first has been a Project Proposal or Research Plan. The second is a Poster Presentation at the MATS Research Symposium, attended by members of the AI safety community.
Mentorship at MATS is intentionally varied. Mentors differ in:
Every fellow also works with our dedicated research management team. Research managers (RMs) work with the fellows and mentors in a stream to support their program goals. Often this involves providing regular check-ins, helping unblock stalled projects, offering feedback on research plans, and supporting fellows in translating ideas into concrete progress.
There is, however, a lot of flexibility in what an RM can do in their role. This includes offering productivity coaching, career advice, application help, conference submission assistance, publication guidance, and much more!
Fellows consistently rate the experience highly, with an average score of 9.4/10 for our latest program (median of 10/10).
The 12-week phase provides fellows with a community of peers who share an office, meals, and housing. Working in a community grants fellows easy access to future collaborators, a deeper understanding of other research agendas, and a social network in the AI safety community. Fellows also receive support from full-time Community Managers.
Each week includes social events e.g., parties, game nights, movie nights, hikes. Weekly lightning talks give fellows an opportunity to share their research interests in an informal setting. Outside of work, fellows organize road trips, city visits, weekend meals, and more.
At the conclusion of the 12-week research phase, fellows can apply to continue their research in a fully-funded 6-12 month extension. Approximately 75% of fellows continue into the extension. By scholar-time weighting, ~60% of the MATS experience is the extension phase (12 month extensions shift this even further).
Acceptance decisions are based on mentor endorsement and double-blind review of the 12-week program milestones. By this phase, fellows are expected to pursue their research with high autonomy.
MATS welcomes talented applicants whom traditional pipelines may overlook. We're looking for:
Our ideal applicant has:
Even if you do not meet all of these criteria, we encourage you to apply. Several past fellows applied without strong expectations and were accepted.
All nationalities are eligible to participate and roughly 50% of MATS fellows are international.
There's no single "MATS profile," but fellows who thrive tend to share a few characteristics:
Prior ML experience is helpful but not required. Many successful fellows come from mathematics, physics, policy research, security studies, or other analytical fields.
You may want to reconsider applying if you're primarily looking for:
We think it’s a feature and not a bug that some very capable people decide MATS isn’t for them.
General Application (December 16th to January 18th)
Applicants fill out a general application, which should take 1-2 hours. Applications are due by January 18th AoE.
Additional Evaluations (Late January through March)
Applicants who advance in the application process go through additional evaluations, including reference checks, coding tests, work tests, and interviews. Which evaluations you undergo depends on the mentors and streams you apply to.
Admissions Decisions (Mid March through Early April)
Selected applicants are notified of their acceptance and anticipated mentor later in the application cycle.
If you have any questions about the program or application process, contact us at [email protected].
If you're considering applying to MATS Summer 2026, we hope this post helps you decide whether the program is a good fit for you. Full details about the program structure, timelines, and application process are on the MATS website.
We're happy to answer questions in the comments!
Acknowledgments: Claude was used for limited editorial support (e.g., wording suggestions, structural feedback).
Published on January 17, 2026 12:20 AM GMT
It's a holiday. The cousins are over, and the kids are having a great time. Unfortunately, that includes rampaging through the kitchen. We're trying to cook, so there's a "no cutting through the kitchen" rule. Imagine enforcement looks like:
Kid: [dashes into kitchen, pursued by cousin]
Adult: Out of the kitchen!
Kid: Sorry! [Continues their path, leaving through the other door; escapes pursuit from more rule-abiding cousin]
This doesn't work! The kid got what they wanted out of this interaction, and isn't going to change their behavior. Instead, I need to make it be not worth their while:
Kid: [dashes into kitchen, pursued by cousin]
Adult: No cutting through the kitchen! [Physically rebuffs intruder]
Kid: Sorry! [Forced to leave through the door they entered by; caught by cousin.]
Other examples:
Sneak candy, spit it out and forfeit dessert.
Use sibling's tablet time, lose your own.
Interrupt, be ignored.
The general principle is that if you want to limit a behavior, the combination of the gains from rule-breaking and the penalty from punishment needs to leave the kid in a worse position than if they'd never broken the rule.
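Stated as an inequality (just a restatement of the same condition):

```latex
% Breaking the rule must have negative net payoff relative to not breaking it.
\text{gain from breaking the rule} \;-\; \text{penalty} \;<\; 0
\quad\Longleftrightarrow\quad
\text{penalty} \;>\; \text{gain}
```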
This isn't just a parenting thing: it's common to say that "crime should not pay", and many legal systems prohibit unjust enrichment. One place I'd like to see this implemented is airplane evacuation. If the safety announcements included "In the event of an emergency evacuation, any carry-on luggage you bring will be confiscated and destroyed. You will also be fined." we would have more JAL 516 (379 occupants, zero deaths) and less Aeroflot 1492 or Emirates 521.
Comment via: facebook, mastodon, bluesky
Published on January 16, 2026 9:43 PM GMT
This is my first mechanistic interpretability blog post! I decided to research whether models are actually reasoning when answering non-deductive questions, or whether they're doing something simpler.
My dataset is adapted from InAbHyD[1], and it's composed of inductive and abductive reasoning scenarios in first-order ontologies generated through code (using made-up concepts to remove much of the confounding effect of familiar words). These scenarios have multiple technically correct answers, but one answer is definitively the most correct[2]. I found that LLMs appear to have a fixed generalization tendency (when evaluated on my examples) that doesn't adapt to the logical structure of the problem. And accuracies on 1-hop and 2-hop reasoning add up to roughly 100% for most models.
Additionally, there's a large overlap between H2 successes and H1 failures (73% for DeepSeek V3), meaning that the model outputs the parent concept regardless of whether the task asks for the child concept or the parent concept. This behavior suggests that the model isn't actually reasoning, but instead generalizing to a fixed level that aligns with the parent concept.
This is perplexing because in proper reasoning, you generally need the child concept in order to get the parent concept. For example, you'd conclude that Fae is a mammal through a reasoning chain: first establishing that Fae is a tiger, then making one hop to say that Fae is a feline (child concept), and then making a second hop to say that felines are mammals (parent concept). The overlap, though, suggests that the model isn't reasoning through the ontology, and is instead skipping the chain to output the parent or child concept depending on its fixed generalization tendency.
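As a concrete check, here is a minimal sketch of one way the overlap statistic could be computed: of the items where the 2-hop (parent-concept) question was answered correctly, what fraction had the 1-hop (child-concept) question answered incorrectly? The field names, the reading of H1/H2 as the 1-hop and 2-hop tasks, and the direction of normalization are my assumptions, not necessarily the exact definition used.

```python
# Sketch of the H2-success / H1-failure overlap; field names are assumed.
def h2_success_h1_failure_overlap(results: list[dict]) -> float:
    h2_successes = [r for r in results if r["h2_correct"]]
    if not h2_successes:
        return 0.0
    h1_failures = [r for r in h2_successes if not r["h1_correct"]]
    return len(h1_failures) / len(h2_successes)

# Toy example: per-item correctness flags for the 1-hop and 2-hop questions.
results = [
    {"h1_correct": False, "h2_correct": True},
    {"h1_correct": True,  "h2_correct": False},
    {"h1_correct": False, "h2_correct": True},
    {"h1_correct": True,  "h2_correct": True},
]
print(h2_success_h1_failure_overlap(results))  # 2 of 3 H2 successes are H1 failures
```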
I used MI techniques like probing, activation patching, and SAEs in my research. Probing predicted something related to the output at layer 8 (very early), but patching early layers barely made a difference to the final decision. This makes it more likely that whatever the probe picked up is merely correlated with the final result, and that the generalization tendency is distributed across model components rather than concentrated in early layers.
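For readers unfamiliar with the probing setup: the idea is to fit a simple linear classifier on intermediate activations and see how early the eventual answer becomes decodable. Below is a minimal sketch with TransformerLens; the model, prompts, labels, and layer are placeholders, not the actual setup I used.

```python
# Minimal probing sketch; model, prompts, labels, and layer are placeholders.
import torch
from sklearn.linear_model import LogisticRegression
from transformer_lens import HookedTransformer

model = HookedTransformer.from_pretrained("gpt2")  # stand-in for the model studied
LAYER = 8  # probe the residual stream after layer 8

# Toy prompts plus labels for what the model ended up answering
# (e.g. 1 = parent-level concept, 0 = child-level concept).
prompts = [
    "Fae is a tiger. Every tiger is a feline. Every feline is a mammal. Fae is a",
    "Rex is a wumpus. Every wumpus is a zorpal. Every zorpal is a dax. Rex is a",
    "Lia is a blicket. Every blicket is a fep. Every fep is a toma. Lia is a",
    "Sam is a sparrow. Every sparrow is a bird. Every bird is an animal. Sam is a",
]
labels = [1, 0, 1, 0]

def last_token_resid(prompt: str) -> torch.Tensor:
    # Cache activations and return the last-token residual stream at LAYER.
    _, cache = model.run_with_cache(prompt)
    return cache["resid_post", LAYER][0, -1].detach().cpu()

X = torch.stack([last_token_resid(p) for p in prompts]).numpy()
probe = LogisticRegression(max_iter=1000).fit(X, labels)
print("probe train accuracy:", probe.score(X, labels))
```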
This was a fun project, and I'm excited to continue with this research and make some more definitive findings. My ultimate goal is to find why LLMs might be architecturally limited in non-deductive reasoning.
A paper that I based my initial research question on, which argues that LLMs can't properly do non-deductive reasoning. It's authored by my NLP professor, Abulhair Saparov, and his PhD student, Yunxin Sun.
This concept is known as Occam's razor.
Published on January 16, 2026 8:31 PM GMT
There's a thought that I sometimes hear, and it goes something like this: "We live in the best X of all possible Xs".
For example:
One variant (or rather, conclusion) that I have heard is:
It is not for material reasons that people are not having kids; a family can easily achieve a 90s-level standard even today, and people in the 90s did have kids.
In these examples, one is supposed to calibrate for the "normal" or "default" circumstances, and therefore see oneself as incredibly privileged.
My objection is that, while it's technically correct, this argument misses the point of comparison itself.
Comparison is a local activity.
People compare things that are close together in some way. You compare yourself to your neighbors or family, or to your colleagues at work, or to people that do similar work as you do in other companies.
People compare like categories because local comparisons incite action. The point of comparison is not to calibrate for defaults; it is to improve something in one's life.
Therefore, saying some variant of "compare yourself with someone much poorer who lived 300 years ago" will do nothing, because people form their reference class not by learning history and economics, but by looking around and observing what's there.
Alice and Bob want to have kids. They live in a tiny apartment and they don't earn much -- enough not to be called poor, but below the median. This describes every working-class couple from the 90s, at least as I remember it, and they all had kids (I'm one of those kids).
But Alice and Bob see that their friends who have kids all have at least a two or three-bedroom apartment, or an even larger house; they see that they take their kids on summer vacations, whereas they know they couldn't swing vacations with their salaries.
And so a living standard that, thirty years ago, was perfectly suited to bringing up a family -- or at least starting one -- now has them thinking that they should wait a bit longer.
I think that:
There's something inside us that just doesn't care about objective standards, and the only thing that is important is a local comparison.
And so, I believe that the way to make the Alices and Bobs have kids must in some way take into account that local comparison. I don't know yet how that would work.
Maybe we should start popularizing TV shows and movies showing modern working class people who have kids? (I feel like a lot of media presents a "struggling mother" kind of dynamic; where working class parenthood is shown as something challenging, almost like an affliction.)
And that's only for the problem of Alice and Bob! Local comparisons are everywhere else, not only in the choice of having kids.
Again, I don't have a solution, but I think that this argument (compare yourself to someone outside your reference class) isn't particularly effective.
Society needs to allow people to do more easily what was completely normal a generation or two ago, and this needs to be done by influencing the current reference class that people compare themselves to.