2026-04-09 07:25:24
Anthropic safety researcher Sam Bowman was eating a sandwich in a park recently when he got an unexpected email. An AI model had sent him a message saying that it had broken out of its sandbox.
The model — an early snapshot of a new LLM called Claude Mythos Preview — was not supposed to have access to the Internet. To ensure safety, Anthropic researchers like to test new models inside a secure container that prevents them from communicating with the outside world. To double-check the security of this container, the researchers asked the model to try to break out and message Bowman.
Unexpectedly, Mythos Preview “developed a moderately sophisticated multi-step exploit” to gain access to the Internet and emailed Bowman. It also — unprompted — posted details about this exploit on public websites.
Mythos Preview is capable of hacking more than its own evaluation environment. It turns out that the model is generally really, really good at finding and exploiting bugs in code.
“Mythos Preview has already found thousands of high-severity vulnerabilities, including some in every major operating system and web browser,” Anthropic announced on Tuesday. Because leading web browsers and operating systems have become fundamental to modern life, they have been extensively vetted by security professionals, making them particularly difficult to hack.
Anthropic claims that Mythos Preview hacks around restrictions very rarely — less often than previous models. Still, the company was so concerned by incidents like Bowman’s — and Mythos Preview’s incredible skill at hacking — that it decided not to generally release the model.
Instead, Anthropic is granting limited access to a select group of 50 or so companies and organizations “that build or maintain critical software infrastructure.” Eleven of these organizations — including Google, Microsoft, Nvidia, Amazon, and Apple — are coordinating with Anthropic directly in a project dubbed Project Glasswing.
Project Glasswing aims to patch these vulnerabilities before Mythos-caliber models become available to the general public — and hence to malicious actors. Anthropic is donating $100 million in access credits for organizations to audit their systems.
Mythos Preview is the first major LLM since GPT-2 in 2019 whose general release was delayed because of fears it could be societally disruptive. Back then, OpenAI initially released only a weaker version of GPT-2 out of concerns that the full model could generate plausible-looking text and supercharge misinformation — though that concern ended up being overblown.
If Anthropic’s claims are true — and the company makes a credible case — we are entering a world where LLMs might be able to cause real damage, both to users and to society.
We may also be entering a world where companies routinely keep their best models for internal use rather than making them available to the general public.
The idea that LLMs might be used for hacking is not new. OpenAI has long published a Preparedness Framework, which tracks how good its models are at hacking.
Until recently, the answer was “not very” — not only at OpenAI but at Anthropic and across the industry. But that started to change last fall, when LLMs — especially Anthropic’s Claude — started becoming useful for cyberoffense.
For instance, Bloomberg reported in February that a hacker used Claude to steal millions of taxpayer and voter records from the Mexican government. The same month, Amazon announced that Russian hackers had used AI tools to breach over 600 firewalls around the world.
But the examples given in Anthropic’s blog post are more impressive — and scary — than that.
The first example is a now-patched bug in OpenBSD, an open-source operating system used in critical infrastructure like firewalls. OpenBSD is known for its focus on security. According to its website, “OpenBSD believes in strong security. Our aspiration is to be NUMBER ONE in the industry for security (if we are not already there).”
Across 1,000 runs, Claude Mythos Preview was able to find several bugs in OpenBSD, including one that allows any attacker to remotely crash a computer running it.
I won’t get into details about how the attack worked — it’s pretty involved — but the notable thing was that the bug had existed for 27 years. Over that period, no human noticed the subtle vulnerability in a widely used, heavily vetted open-source operating system. Mythos Preview did. And the compute cost for those 1,000 runs was only $20,000.
A second example is potentially even more impressive. Mythos Preview found several vulnerabilities in the Linux operating system — which runs the majority of the world’s servers — that allowed a user with no permissions to gain complete control of the entire machine.
Most Linux vulnerabilities aren’t very useful on their own, but Mythos Preview was able to combine several bugs in a non-trivial way. “We have nearly a dozen examples of Mythos Preview successfully chaining together two, three, and sometimes four vulnerabilities in order to construct a functional exploit on the Linux kernel,” members of Anthropic’s Frontier Red Team wrote.
Anthropic says these were not isolated incidents. Across a range of operating systems, browsers, and other widely used software, Mythos Preview found thousands of bugs, 99% of which have not been patched yet.
Mythos Preview is also shockingly good at exploiting a bug once it has been discovered. A lot of modern web-based software is powered by the programming language JavaScript. If your browser’s JavaScript engine has security flaws, then simply visiting a malicious website could allow the site’s owner to take control of your computer.
Anthropic found that Mythos Preview was far more capable than previous models at exploiting vulnerabilities in Firefox’s JavaScript implementation. Anthropic’s previous best model, Claude Opus 4.6, created a successful exploit less than 1% of the time. Mythos Preview did so 72% of the time.

There are some caveats to this result. The actual Firefox browser has multiple layers of defense against malicious code; Anthropic focused on just one layer. So the attacks developed by Mythos Preview would not actually allow a website to take over a user’s machine. Also, successful exploits tended to focus on two now-patched bugs; when tested on a version of Firefox with those bugs patched, Mythos Preview generally only made partial progress.
Still, Mythos Preview would get an attacker a step closer to the objective of a full Firefox exploit. And it would have an even better chance of compromising software that has not been so thoroughly vetted.
For the past 20 years or so, a sufficiently motivated and well-funded hacking organization could probably break into most systems, outside of the most hardened in the world. But it often wasn’t worth the effort. Human cyber talent is expensive, and multi-layered security protections made it so tedious (and therefore expensive) to complete an attack that potential hackers didn’t bother.
Mythos-class models could slash the cost of hacking, bringing this equilibrium to an end. Systems everywhere might start to get compromised.
Eventually, LLMs should be able to help developers harden systems before attackers ever get a chance to find weaknesses. But the transition period before that becomes standard practice might be difficult.
By delaying the release of Mythos Preview — there is no specific timeline for general release — Anthropic can help harden crucial systems before outsiders can cheaply and effectively attack them. This general approach — called defensive acceleration — has been proposed for a while, but the development of Mythos Preview kickstarts the effort.
Still, Anthropic’s writeup notes that “it’s about to become very difficult for the security community.”
“The language models we have now are probably the most significant thing to happen in security since we got the Internet,” said Anthropic research scientist Nicholas Carlini at a computer security conference last month. Carlini, a legendary security expert, added an appeal toward the end of the talk. “I don’t care where you help. Just please help.”
The risk of bad guys using Mythos Preview for hacking is an important reason Anthropic hasn’t released the model publicly. Another risk: users could inadvertently trigger the model’s advanced hacking abilities — especially in a product like Claude Code with weaker guardrails.
Mainstream chatbots put AI models into a tightly controlled “sandbox” that minimizes how much damage they can do if they misbehave. This makes them safer to use — especially for users with little to no technical knowledge. But it also limits their utility.
As Tim wrote in January, coding agents like Claude Code (and competitors like OpenAI’s Codex) are based on a different philosophy. They run on a user’s local computer, where they can often access files and download and install software.
This makes them much more powerful; I can ask Claude Code to organize my downloads folder or analyze some data I have stored on my computer. But it also makes them more dangerous; there have been a few incidents where Claude Code deleted all of a user’s files.
For the most part, though, the limited capabilities of Claude Opus 4.6 mean that a Claude Code mishap can’t do too much damage. Even if you run Claude Code with its hilariously named “--dangerously-skip-permissions” flag enabled, the worst it can do is trash your local machine.
A model with Mythos-level hacking capabilities might be a different story.
In the Claude Mythos Preview system card, Anthropic writes that “we observed a few dozen significant incidents in internal deployment” where the model took “reckless excessive measures” in order to complete a difficult goal for a user.
These examples didn’t only happen during evaluations. Several times in internal deployment, Mythos Preview wanted to take some action, like sending a message or pushing code changes to Anthropic’s codebase, that it hadn’t been given access to. Instead of asking the user for permission, Mythos Preview “successfully accessed resources that we had intentionally chosen not to make available.”
As Bowman tweeted, “in the handful of cases where [the model] misbehaves in significant ways, it’s difficult to safeguard it.” When the model cheats on a test, “it does so in extremely creative ways.”
Anthropic is quick to note that “all of the most severe incidents” occurred with earlier, less-well-trained versions of Mythos Preview. Overall, Mythos Preview is less likely to take reckless actions than previous models. Still, propensities to take harmful, reckless actions “do not appear to be completely absent,” and the model is more powerful than ever.
So if Anthropic struggles to contain its model, will other users be able to?
Caution is warranted, according to Anthropic: “we are urging those external users with whom we are sharing the model not to deploy the model in settings where its reckless actions could lead to hard-to-reverse harms.” And remember, the model is only being made available to major companies and organizations. Presumably authorized users inside these companies will be cybersecurity experts.
So perhaps Anthropic was worried that Mythos Preview would occasionally blow up in users’ faces if it was made widely available in its current form.
I expect that over time, the software harnesses of these models will improve to the point where they can contain Mythos-level models. For example, Anthropic recently released “auto mode,” which automatically classifies whether a command Claude Code wants to run might have “potentially destructive” consequences. This lets developers run long tasks safely without having to manually approve a bunch of commands — or use “--dangerously-skip-permissions.”
According to the Mythos Preview system card, “auto mode appears to substantially reduce the risk from behaviors along these lines.”
Still, model capabilities seem likely to continue to increase quickly. It will be an open question whether better scaffolding methods like auto mode can catch up quickly enough to make it safe to release future frontier models to average users.
Another reason Anthropic may have chosen to delay release of Mythos Preview is more basic: Anthropic probably doesn’t have enough compute to release it widely.
Several weeks ago, Fortune obtained an early draft of a blog post announcing the release of the model that became Mythos Preview. The post described Mythos as “a large, compute-intensive model” and said that it was “very expensive for us to serve, and will be very expensive for our customers to use.”1
The few companies granted access to Mythos Preview have to pay correspondingly high prices: $25 per million input tokens and $125 per million output tokens. This is Anthropic’s most expensive model ever. For comparison, Claude Opus 4.6 costs $5 per million input tokens and $25 per million output tokens.
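To get a feel for what that price gap means in practice, here is a quick back-of-the-envelope comparison in Python. Only the per-million-token prices come from the article; the size of the hypothetical job is made up for illustration.

```python
# Rough cost comparison at the published per-million-token prices.
# The job size below (50M input tokens, 5M output tokens) is a made-up
# example of a long-running agentic task, not a real workload.

def job_cost(input_mtok: float, output_mtok: float,
             price_in: float, price_out: float) -> float:
    """Dollar cost of a job measured in millions of tokens."""
    return input_mtok * price_in + output_mtok * price_out

job = (50, 5)  # hypothetical: 50M input tokens, 5M output tokens

print(f"Mythos Preview:  ${job_cost(*job, 25, 125):,.0f}")  # $1,875
print(f"Claude Opus 4.6: ${job_cost(*job, 5, 25):,.0f}")    # $375
```

Under those assumptions, the same job costs five times as much on Mythos Preview as on Claude Opus 4.6.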
Anthropic is already under severe compute constraints because of skyrocketing demand. Anthropic’s revenue run-rate has more than doubled in less than two months. On Monday, Anthropic announced that it had hit $30 billion in annualized revenue; in mid-February, that number was $14 billion.
Anthropic has responded to skyrocketing demand by reducing usage limits during popular coding hours. The company has also announced deals for more AI compute.
Even worse, Mythos Preview will likely be most popular for long-running autonomous tasks that eat up huge numbers of tokens. In the system card, Anthropic gave a qualitative assessment of Mythos Preview’s coding abilities. The company wrote that “we find that when used in an interactive, synchronous, ‘hands-on-keyboard’ pattern, the benefits of the model were less clear.” Developers “perceived Mythos Preview as too slow” when used in chat mode.
In contrast, many Mythos Preview testers described “being able to ‘set and forget’ on many-hour tasks for the first time.” While this arguably makes Mythos Preview more useful for software developers, it definitely increases the amount of compute necessary to serve the model to everyone.
I wonder if Anthropic is trying to reset expectations around availability and may never make Mythos Preview part of existing subscription plans. The chatbot subscription model started when LLMs generally used few tokens to generate a response. With long reasoning chains and expensive LLMs, that model starts to break down. By not releasing Mythos Preview generally at first, Anthropic can also more carefully manage demand over the rollout — and retains more leverage over its pricing structure.
In any case, demand for leading AI models seems likely to continue growing dramatically faster than companies’ ability to meet it with their computational resources.
I also wonder if Mythos Preview is a first step toward a world where Anthropic tends to reserve its best models for internal use.
Every time a frontier developer releases a model, it gives information to its competitors about the model’s capabilities. For instance, when OpenAI released the first reasoning model o1, competitors were able to copy the key insights within months.
So if Anthropic can get away with it, it has an incentive to prevent its competitors from being able to access Mythos Preview for as long as it can.2
Anthropic has already shown a willingness to prevent competitors from taking advantage of Claude’s capabilities. Over the past year, it has blocked Claude Code access at both OpenAI and xAI for violating Claude’s Terms of Service, which include prohibitions on using the models to train other AI models.
In 2024, Anthropic was only releasing smaller Sonnet models while reportedly reserving the more powerful — and expensive — Opus models for internal use. However, as time progressed, Anthropic started releasing the Opus models again, perhaps to be competitive with OpenAI’s o3 model.
But Anthropic has been on a winning streak. Claude Code took off and for the first time ever, Anthropic’s reported revenue rate is higher than OpenAI’s. Anthropic’s decision to only partially release its latest model might be an indication that Anthropic feels it has a lead over OpenAI.
If this continues, we might see more cautious releases in the future. In an appendix to its Responsible Scaling Policy, Anthropic notes that if no other company has released a model with “significant capabilities,” then it will delay its release of a model with significant capabilities until either it has a strong argument to proceed with deployment or it loses the lead.
We’ll soon get to see how long Anthropic’s lead lasts. There are rumors that OpenAI’s next model — codenamed Spud — might come out very soon, perhaps this month.
I wasn’t able to independently verify whether the draft blog post Fortune obtained was in fact leaked from Anthropic’s systems. (Fortune did not release a full copy of the leaked blog post.) However, Fortune’s write-up of the leaked blog post described the future model in similar language.
Ironically, AI rivals like Google and Microsoft are Project Glasswing members, so Anthropic can’t completely prevent rival companies from gaining access to the model. But the Mythos Preview system card is clear that access through Project Glasswing is granted “under terms that restrict its uses to cybersecurity.”
2026-04-07 03:02:49
Sen. Bernie Sanders (I-VT) is getting serious about AI.
“In my view, and in the view of people who know a lot more about this issue than I do, we are in the beginning of the most profound technological revolution in world history,” Sanders said at a March 25 press conference. “Artificial intelligence and robotics will impact our economy, our democracy, our privacy rights, our emotional well-being, and even our very survival as human beings on this planet.”
In response, Sanders and Rep. Alexandria Ocasio-Cortez (D-NY) introduced a bill to ban data center construction “until Congress passes comprehensive AI legislation.”

Many Americans share their AI skepticism. One recent NBC survey found that only 26% of Americans had a positive impression of AI, while 46% were negative.
There’s a potential here to build an anti-AI movement that could be a political juggernaut.
There are potential allies across the political spectrum, from Sanders to Ron DeSantis, the Republican governor of Florida. When asked in February about the risks of AI, Missouri Sen. Josh Hawley said that Americans losing access to paying jobs was “at the top of the list.” The conservative Republican teamed up with moderate Sen. Mark Warner (D-VA) on legislation to track job losses from AI.
Prominent AI experts are warning that the technology poses existential risks to humanity. Child safety advocates worry that chatbots will expose teens to inappropriate content and worsen their mental health. Labor groups — from taxi drivers to Hollywood actors — are trying to stop AI from taking their jobs. And activists nationwide want to stop construction of data centers in their own backyards.
However, it’s unclear whether these groups will be able to unite into an effective coalition. While many people are hostile toward the AI industry, they don’t always agree about the nature of the threat or what to do about it.
While some opponents see AI as an existential risk to humanity, others dismiss those warnings as part of an AI industry hype campaign. Grassroots campaigns against data centers tend to focus on their excessive water use, but some AI safety advocates believe (correctly) that the water issue is greatly exaggerated. After local activists stop a data center in their own neighborhood, they may not stay engaged with larger questions about the overall impact of AI.
So while there is the potential for these groups to work together — Sanders is clearly trying to make that happen — there’s no guarantee that it will work. It seems more likely that the AI industry will continue its relentless growth even though almost half of Americans wish it would slow down.
On Saturday, March 21, I attended “Stop the AI Race,” the largest AI safety protest in US history. Activists at the San Francisco event worry that superintelligent AI could seize control of the world and kill all human beings.

“For the past fifteen years, I’ve watched in slow motion as humanity has sleepwalked closer and closer to suicide,” said David Krueger, a University of Montreal professor involved in organizing the event, in a speech in front of Anthropic’s headquarters.
“This technology threatens everybody’s life, and it’s not okay to pretend like this is normal,” said another speaker, Nate Soares, co-author of If Anyone Builds It, Everyone Dies.
Not everyone attending was mainly concerned about existential risk — a couple of the speakers focused on AI chatbots encouraging teens to commit suicide, for instance. But most people I talked with seemed primarily worried about AI taking over the world and killing people.
It’s not a new concern. In the early 2000s, Soares’s co-author Eliezer Yudkowsky started writing about the catastrophic risks that advanced AI might pose. Nor is it uncommon in AI circles. Legendary AI researchers like Geoffrey Hinton and Yoshua Bengio have similar concerns. Industry leaders like Elon Musk and Sam Altman have also warned about existential dangers from AI.
People concerned with AI safety have tended to play “an inside game,” as Alys Key put it in Transformer.1 They’ve often eschewed public activism in favor of technical research and elite persuasion.
The “Stop the AI Race” protest represents a step toward more public activism, but the protest was still largely focused on persuading specific elite actors.
“We didn’t try to have the largest anti-AI protest possible,” the protest’s head organizer, Michaël Trazzi, wrote to me. “Instead [we] tried to focus on some specific pause AI ask that we thought [AI company] leadership / employees could get behind.”
This strategy was informed by Trazzi’s experience conducting a hunger strike. In September, Trazzi and another protester, Denys Sheremet, spent two and a half weeks sitting in front of the Google DeepMind office, demanding that Google commit to stop releasing models if everyone else agreed to stop.
Trazzi and Sheremet stopped for health reasons before Google agreed to the request, but Trazzi still views it as a success. The protest attracted significant media attention, and four months later, Google DeepMind CEO Demis Hassabis replied “I think so” when a journalist asked him at Davos if he’d advocate for a pause that all the other companies were participating in.
Trazzi told me support from Google employees was crucial to the hunger strike; he looked to replicate this dynamic with Anthropic. “Our main goal with this protest was to address the employees of Anthropic who, when they joined, thought the company would scale responsibly,” he wrote to me.
The concrete details of what an AI pause might look like are complicated, technical, and liable to generate disagreement. Trazzi’s campaign for a conditional pause has elided these details, helping to bring a larger coalition together. Previous US AI safety protests had been closer to 25 people. Stop the AI Race got 200 people to show up.
Several times throughout the San Francisco protest, Trazzi and others expressed excitement that “we have Bernie on our side.” But when leftists and AI safety advocates have tried to work together, it hasn’t always gone well.
Phil Hazelden is a programmer who believes AI poses an existential risk to humanity. He attended a February 28 UK protest co-organized by the AI safety group Pause AI and a left-leaning group called Pull the Plug. Hazelden concluded that “unfortunately, most of the speeches were frankly dumb.”
“Mostly I felt like the vibe was a sort of generic lefty anti-big-tech thing, which is not something I want to lend weight to,” he wrote. “I think it’s important for different groups to be able to ally on points of common interest, even if they have deep enduring disagreements. But this didn’t particularly feel like the other group was cooperating with me on that.”
As Politico reported, AI risk groups and the Sanders camp sometimes back dueling candidates in Democratic primaries. In North Carolina’s fourth district, for example, Rep. Valerie Foushee faced a primary challenge from Sanders-endorsed Nida Allam. Foushee narrowly defeated Allam in a March vote. Among Foushee’s backers was a super PAC led by prominent AI safety advocate Brad Carson.
Few politicians in America are more closely identified with AI risk concerns than Scott Wiener, the California state senator who proposed SB 1047, an AI safety bill that Gavin Newsom vetoed in 2024. Wiener is currently running to replace Rep. Nancy Pelosi (D-CA) in Congress. He is facing Saikat Chakrabarti, the former chief of staff to Rep. Alexandria Ocasio-Cortez (D-NY).
The hard reality for AI safety advocates is that — at least for now — their numbers are small. They need allies if they want to build a mass movement.
It has proven much easier to organize grassroots opposition to local data centers; voters across the political spectrum pay attention when major construction projects are proposed in their own backyards.
For example, on September 23, 2025, hundreds of people showed up to a planning commission meeting in Howell Township, a municipality of around 8,000 in southern Michigan. The planning commission had to move the meeting to a larger space in order to accommodate everyone.
“Normally we have like three people at our meetings,” vice chair Robert Spaulding told the crowd. “Have some grace with us.”

People were protesting a proposed zoning exemption for a billion-dollar data center project reportedly built for Meta. Over a hundred people spoke against the plan at a meeting that went past 2 AM.
Across the US, local groups have fought against data center development through protests, testimony at public hearings, and lawsuits.
Often these groups are quite diverse: “We got the goth people that came with black, baggy pants and rings in their noses and grandmas with walkers. It goes from one extreme to the other. It’s not political,” Dan Bonello, an organizer against the Howell data center, told the Livingston Daily.
The concerns vary by community, of course, but several show up over and over.
Perhaps the most common concern is that data centers will use too much water. Almost two-thirds of the Howell speakers mentioned water usage. Nationally it is the “No. 1 reason cited in press accounts for local opposition” to data center projects, according to an analysis by Heatmap.
In reality, data centers don’t use much water compared to other uses, such as factories, agriculture, or leisure.
Electricity rates are another flashpoint. Data centers really do use a lot of electricity, and the costs of infrastructure upgrades are sometimes passed on to all ratepayers.
“When I go home, people are very, very concerned about their electricity bills going up,” Sen. Josh Hawley (R-MO) said at the Axios AI+ Summit in DC. Hyperscalers like Microsoft have pledged not to pass on rate increases, but many voters remain unconvinced. A promise to lower electricity rates vaulted Democrats to Georgia’s Public Service Commission for the first time in over 20 years.
There are also classic NIMBY concerns: “The data center complex doesn’t belong here. It will destroy our rural nature that we all love so much,” one speaker told the planning commission in Howell Township.
Grassroots activism like this is often successful. In Howell, the town issued a six-month moratorium on data center development in November 2025; the proposed project was later withdrawn. Nationally, Heatmap found that “over 25 data center projects were canceled last year following local opposition.” That corresponds to more than $50 billion in planned spending by AI companies. In 40% of the cases where a project faced local opposition, it ended up being canceled.
Still, many opposed to data centers have narrow enough goals that it may be difficult to harness them into a broader coalition. As Paresh Dave points out in Wired, “many of the factories getting built to supply servers, electrical gear, and other parts to data centers are facing virtually no opposition.”
Local pushback may just push data centers elsewhere. For instance, after a developer withdrew a data center project in Matthews, North Carolina, it pivoted to proposing a similar project a hundred miles north in Stokes County, North Carolina. Data centers may also end up being built abroad; last July, for example, OpenAI announced it was building a gigawatt data center in the UAE.
There are some signs that data center activists are becoming more ambitious. Legislation has been proposed in 12 states to temporarily ban new data center development. But for now, much of the activity — and the success — has come from decentralized local efforts.
A third strand of AI opposition is the concern that AI will take human jobs.
While this garners concern across the political spectrum, job loss has been a particular focus on the left, especially among unions.
Brian Merchant writes the newsletter Blood in the Machine, which has a recurring segment called AI Killed My Job.
“A lot of people in the labor movement understand AI less as a novel technology and more of the latest iteration in automation or surveillance technology,” Merchant told me. “It’s already being used to replace jobs or tasks when it can, erode working conditions, increase surveillance, and give the management class a powerful tool to do all of the above.”
But there isn’t one clear policy aim like pausing AI development or shutting down the construction of data centers.
“If you were to ask the head of the AFL-CIO [the largest federation of unions in the US] ‘What do you want to happen with AI policy?’ I don’t think there would be a clear answer,” Merchant told me.
Unions have tried to limit the use of AI during contract negotiations, as in the Hollywood strikes of 2023.

That year, both SAG-AFTRA (the actors union) and WGA (the writers union) went on strike for pay increases, better residual payments for streaming — and AI protections.
Eventually, both strikes mostly succeeded. As a result, actors have control over whether studios create digital replicas of them — and a right to compensation if they do. Studios are not allowed to use generative AI methods to replace writers, nor can they force writers to rewrite AI-generated scripts (rewrites generally earn lower rates than original work). But writers can use AI with company permission.
Union activists have also had some success slowing down the adoption of autonomous vehicles in Democrat-dominated cities like Boston.
However, it’s unclear whether the labor movement can build on these wins to create a unified anti-AI coalition. “One of labor’s great challenges right now” is how to channel AI concerns “into a movement with clearly defined goals and win conditions,” Merchant told me.
There’s also tension between those on the left who believe tech companies are overhyping the pace of AI progress and AI safety advocates who see rapidly advancing capabilities as the main reason to be worried about the technology.
When I asked Merchant about Sanders’s comments around existential risk, he told me that it was “alienating among certain people on the labor left.”
Despite their differences, there is plenty of overlap between the different groups. Activists pushing against local data centers sometimes mention concerns about the long-term trajectory of the technology. In 2024, SAG-AFTRA endorsed SB 1047, the AI safety bill that was vetoed by Gavin Newsom.
Bernie Sanders’s pivot toward AI safety seems like an attempt to bring these diverse forces together under one banner. With Republicans in charge of Congress and the White House, Sanders’s concrete proposal is unlikely to succeed in the near term; one superforecaster gave the data center moratorium bill a “less than zero” chance of passing.
But his proposal for a national moratorium conditioned on subsequent AI legislation could provide a rallying point for diverse anti-AI forces. If passed, it would give NIMBY activists what they want — a short-term reprieve from data center construction — while also providing leverage for advocates of AI safety, child welfare, labor rights, and other causes.
Even some Republicans might get on board. When asked about the moratorium proposal at the Axios AI+ Summit DC, Sen. Josh Hawley (R-MO) replied “What they’re getting at there is the real concern people have.”

Another possibility is that concerns around child safety will lead to more restrictions on AI development.
Protecting children has been a popular AI theme on the right. The first plank of the White House’s proposed AI framework focuses on measures to protect children. Sen. Hawley said at the Axios AI+ Summit DC that “the biggest thing immediately is that we’ve got to focus on child safety.”
But child safety is a bipartisan issue: for instance, the attorneys general of 44 US states endorsed a 2024 bill which would have set up a commission to investigate how to prevent child exploitation using AI.
Perhaps the most powerful speech at the Stop the AI Race protest was from UC Berkeley professor Will Fithian. Fithian was coming from his son Conrad’s sixth birthday party, and he teared up when he mentioned the uncertainty he felt about his son’s future — or whether his son would even survive.
“Every one of you has come out because whether or not Elon cares about our children’s futures, you do. Someday I’ll tell Conrad where I went after his birthday party. And I’ll tell him about the grownups who showed up when it mattered most, to demand his future back.”
Correction: I originally wrote that several speakers in San Francisco mentioned concerns about AIs encouraging teens to commit suicide. It was actually only a couple.
Transformer is published by the Tarbell Center for AI Journalism, which also funds my reporting. The Tarbell Center has had no editorial influence over this or other articles I’ve written for Understanding AI.
2026-04-02 19:33:47
Before we get to today’s article, I want to recommend some audio content about autonomous vehicles:
Back in 2010, my friend Ryan Avent and I made a bet about the future of autonomous vehicles. The bet came due last month and I won. Ryan and I did a postmortem on my podcast, AI Summer. You can listen here or search for “AI Summer” in your favorite podcast app.
PJ Vogt’s podcast Search Engine just did a two-part series on autonomous vehicles. I’m biased since I was quoted in both episodes, but I thought it was incredibly good. You can listen here, or search for “Search Engine” in your favorite podcast app.
Now for today’s article!
If you’ve followed AI over the last year, you’ve probably seen the famous “METR chart”:
METR, short for Model Evaluation and Threat Research, is based in Berkeley, California. The group has published many charts, but this one has become its calling card. It compares AI models based on the complexity of software engineering tasks they can complete, with complexity measured by how long it takes a human programmer to complete the same task:
GPT-3.5 — the model that powered the original ChatGPT — could complete tasks that took a human programmer about 30 seconds.
GPT-4, released in March 2023, bumped that up to 4 minutes.
o1, released in December 2024, was OpenAI’s first “reasoning model.” It could perform tasks that took a human 40 minutes.
GPT-5, released in August 2025, was able to finish tasks that took humans 3 hours.
Claude Opus 4.6 was released in February by Anthropic. METR estimates it can complete tasks that would take a human programmer 12 hours.
That last figure is twice as long as the estimate for the previous leader, GPT-5.2, which had been released just two months earlier.
I think this chart — and especially the impressive score for Claude Opus 4.6 — has done a lot to foster an impression of accelerating AI progress in recent months. Notice that the chart is logarithmic, so a straight line indicates exponential progress. The fact that Claude Opus 4.6 is above the previous trend line suggests very rapid progress indeed.
But if you click on METR’s task length page and hover over the dot for Claude Opus 4.6, you’ll see something interesting: METR’s confidence interval for Claude Opus 4.6 ranges from 5 hours to 66 hours. On Twitter, METR staff have urged people not to take the latest results as gospel.
“When we say the measurement is extremely noisy, we really mean it,” METR’s David Rein wrote.
METR depends on having a mix of easy tasks that an AI model can solve and harder tasks that it can’t. This allows the group to bracket the capabilities of a model. But Claude Opus 4.6 was able to solve some of the hardest problems in METR’s test suite, which made it difficult to put an upper bound on its capabilities.
So we know the latest Claude Opus is better than previous models, but it’s hard to say how much better. This means we don’t know if the apparent acceleration of the last few months is real or just a statistical artifact.
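Here is a toy sketch of that bracketing idea, with invented task results. Roughly speaking, METR’s real estimator fits a curve to success rates and reports the task length at which a model succeeds about half the time, with bootstrapped uncertainty, so treat this only as an illustration of why the method needs hard tasks the model still fails.

```python
# Toy illustration of bracketing a model's "time horizon" between tasks it
# solves and tasks it fails. The task lengths and pass/fail results below
# are invented; METR's real benchmark and estimator are more involved.
from math import sqrt

# (human completion time in minutes, did the model solve the task?)
results = [(1, True), (4, True), (15, True), (60, True),
           (240, True), (480, False), (960, False), (1800, False)]

solved = [minutes for minutes, ok in results if ok]
failed = [minutes for minutes, ok in results if not ok]

if failed:
    # Crude bracket: the horizon lies somewhere between the longest task the
    # model solved and the shortest task it failed. Use the geometric
    # midpoint because task difficulty is spread out on a log scale.
    low, high = max(solved), min(failed)
    print(f"horizon between {low} and {high} minutes, "
          f"midpoint ~{sqrt(low * high):.0f} minutes")
else:
    # If the model solves even the hardest tasks, nothing bounds the estimate
    # from above: the benchmark has effectively saturated.
    print("no upper bound; the suite can't tell us how capable the model is")
```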
METR could — and perhaps will — add harder tasks to its test suite so it can test future models with greater precision.
But there’s also a deeper philosophical challenge.
Like most AI benchmarks, this one measures AI performance using tasks that are well-defined, self-contained, and easily verified. But a lot of the tasks humans perform aren’t like this.
In real workplaces, tasks are often connected to other tasks. They frequently require interacting with other people or the outside world. Sometimes it’s not clear what task needs doing, and goals may evolve as people work on a project. Even after a task is completed, people might not agree on whether it was done well.
Complexities like this will become more important as AI models tackle longer tasks — tasks that take weeks or months rather than just hours. We don’t have great ways to measure the performance of AI models on these kinds of tasks — in part because we struggle to judge the performance of human workers in the same situations.
As a consequence, we may see a growing divergence between the capabilities we can measure and the capabilities we actually care about.
In the early years of large language models, it was common for people to cite a benchmark called MMLU, short for Massive Multitask Language Understanding. It grills a language model on a wide range of topics: history, computer science, genetics, astronomy, international law, and more.
When MMLU was published in 2020, the best-performing LLM was GPT-3. It scored 43.9%. An older model, GPT-2, scored 32.4% — not much better than the 25% score you’d get from random guessing.
By the time I started writing about LLMs in 2023, GPT-4 had scored 86.4%. GPT-4o scored 88.7% in 2024, and GPT-4.1 scored 90.2% in 2025.
In the last year, AI companies have stopped reporting MMLU scores — presumably because scores have stopped improving. That’s not surprising; it’s impossible to get a score much higher than 93% without cheating because around 6.5% of MMLU questions contain errors.
So conventional benchmarks like MMLU have a natural lifecycle. At first, most problems are beyond models’ capabilities, so scores cluster near the minimum. As models improve, benchmark scores increase until they approach the theoretical maximum. Since 2024, frontier models have all scored between 88% and 93%, a narrow enough range that differences could be random noise. In industry jargon, MMLU has saturated.
Over time, the AI community works to develop more difficult benchmarks to replace earlier ones that have saturated. For example, in early 2025 Dan Hendrycks, the lead author of MMLU, co-authored a new, more difficult benchmark called Humanity’s Last Exam (HLE). Like MMLU, HLE includes questions in subjects ranging from chemistry to law.
When it was released, the best model was o3-mini (high), which scored 13.4% on HLE. Today, the leading model is Google’s Gemini 3.1, which scored 44.7%. Perhaps in a year or two models will begin to saturate this benchmark, with gains slowing as they approach 100%.
We know that HLE is harder than MMLU, but it’s difficult to say how much harder. There’s no obvious way to compare scores across different benchmarks, which makes it hard to compare model capabilities over long time periods — or to make predictions about future models.
METR invented a clever solution to this problem. Its benchmark contains tasks with a wide range of difficulties. The easiest problems are designed to take humans a few seconds — for example, a simple factual question about the syntax of a programming language. The hardest problems would take a human programmer many hours.
METR didn’t just guess how long humans would take on these tasks; it hired programmers and measured their actual completion times.1 For example, one problem in the METR test suite was to “speed up a Python backtesting tool for trade executions by implementing custom CUDA kernels while preserving all functionality.” METR found that this takes human programmers about eight hours.
Measuring tasks this way gives us a way to compare models with dramatically different capabilities. GPT-2 could only complete tasks that took human programmers about two seconds, whereas GPT-5 could complete tasks that took around 3 hours of human effort. So we could say that GPT-5 could complete tasks that are 5,400 times “harder” than the tasks GPT-2 could complete.
If this pace of progress continues — doubling task length every six or seven months — we should expect LLMs capable of completing week-long tasks (that is, 40 hours of human labor) some time next year, and month-long tasks (four 40-hour weeks) in 2028.2
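The arithmetic behind that extrapolation is simple. A minimal sketch, assuming the 12-hour estimate for Claude Opus 4.6 (February 2026) as the starting point and a 6.5-month doubling time, in the middle of the range cited above:

```python
# Extrapolating the task-length trend. The 12-hour starting point is METR's
# central estimate for Claude Opus 4.6; the 6.5-month doubling time is an
# assumption within the six-to-seven-month range discussed in the text.
import math

current_hours = 12        # estimated task length as of February 2026
doubling_months = 6.5     # assumed doubling time

def months_until(target_hours: float) -> float:
    """Months until the task-length trend reaches `target_hours`."""
    doublings = math.log2(target_hours / current_hours)
    return doublings * doubling_months

for label, hours in [("week-long (40h)", 40), ("month-long (160h)", 160)]:
    print(f"{label}: ~{months_until(hours):.0f} months from Feb 2026")

# week-long (40h): ~11 months from Feb 2026   (early 2027)
# month-long (160h): ~24 months from Feb 2026 (early 2028)
```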
However, the current version of METR’s task-length benchmark wouldn’t be able to meaningfully test such a powerful model. The most difficult tasks in the current test suite — such as “fix a control algorithm for a 4-wheeled omni-directional robot to follow cubic splines quickly despite wheel slippage and motor jerk limitations” — take humans about 30 hours to complete.
In other words, METR’s task-length benchmark is close to saturating.
We saw earlier that when conventional benchmarks saturate, scores start to cluster around a maximum value — like 93% for MMLU. METR’s benchmark works differently. When a model starts solving the hardest questions, the benchmark’s confidence interval widens dramatically because there is no way to place an upper bound on model performance. As I noted previously, METR’s confidence interval for Claude Opus 4.6 ranges from 5 to 66 hours.
“If we took one task out of our task suite or added another task to our task suite, potentially instead of measuring this Claude Opus 4.6 time horizon of, I think, 14 and a half hours, we’d be measuring it at something like eight or 20 hours,” METR’s Joel Becker told me in a recent interview on my podcast. “That’s how sensitive things are now to a single task.”
In principle, the solution is simple: add tasks that take human programmers more than 30 hours. Ideally, METR would test models on tasks that take humans 40 hours, 80 hours, 160 hours, and so forth. That would extend the useful life of the benchmark by at least a couple more years.
But this won’t be easy. METR pays human programmers a minimum of $50 per hour, so getting a baseline for a single 160-hour task would cost at least $8,000. And that’s assuming they can even convince programmers to participate. I bet METR would struggle to find experienced programmers willing to tackle tasks that stretch across multiple weeks; many programmers would have to quit their day jobs to make time.
There’s also a deeper conceptual problem with trying to extend the METR benchmark — or any benchmark like it — to tasks that require dozens of hours of human work.
2026-03-26 03:00:57
When Kai and I wrote our 2026 predictions post last December, we disagreed about the future of AI video. I thought a recent deal with Disney would help to make OpenAI’s Sora the leading AI video app. Kai disagreed. Noting that “Meta is very skilled at building compelling products that grow its user base,” Kai predicted that Meta’s Vibes platform would w…
2026-03-20 04:49:14
Earlier this week, I wrote an article arguing that there was no obvious AI bubble. I argued that AI companies are making massive investments in data centers due to surging demand for their services, and that demand is likely to continue growing in the next couple of years.
This prompted several thoughtful comments asking variants of the same basic question: if there’s so much demand for this technology, why are AI companies losing so much money? As I thought about how to respond, I became convinced that it would be helpful for me to explain the intellectual framework I use to think about questions like this.
I’m not going to claim any kind of originality here — the ideas I’ll explain below are commonplace in startup finance. But I suspect that many readers haven’t spent much time thinking about them.
So in this piece I’m going to do three things. First I’ll present a stylized example to illustrate some key ideas about how to finance a new company. Next I’ll use real-world examples to illustrate how to distinguish healthy startups from doomed companies. Finally, behind the paywall, I’ll apply this framework to OpenAI and Anthropic.
My claim isn’t that these companies are guaranteed to succeed — all startups face risk, and these companies could certainly fail. It’s also possible that they could survive but never generate a healthy return for their investors.
But I am going to insist that OpenAI and Anthropic are following a standard tech industry playbook. The fact that they are losing more money every year does not necessarily mean they are on a road to bankruptcy — or even that anything especially unusual is going on. After all, Amazon lost money for the first nine years after it was founded. Today it’s one of the most valuable companies in the world.
Imagine you start a coffee shop. The space costs $6,000 per month. Coffee beans cost $2 per cup, and you sell each cup for $4.
The first month, you sell 250 cups, earning $1,000 in revenue. But you spend $500 on coffee beans and $6,000 on rent, so you lose a total of $5,500.
The second month, you sell 500 cups of coffee. That’s $2,000 in revenue minus $1,000 for beans. You still aren’t close to covering your store’s $6,000 in monthly overhead, though; you lose another $5,000.
Despite these early losses, you feel like you’re on the right track. Customers like the coffee. They keep coming back, and some of them bring friends. The third month you sell 750 cups and lose $4,500. The fourth month you sell 1,000 cups and lose $4,000.
Projecting forward, you estimate that you’ll break even around the one-year mark, when you expect to sell 3,000 cups. That will generate $12,000 in revenue, just enough to pay $6,000 for beans and $6,000 in rent. By the end of year two, you expect to sell 6,000 cups of coffee in a month, generating $24,000 in revenue. After subtracting $12,000 for beans and $6,000 for rent, you’ll be left with a healthy $6,000 profit.1
Starting a business almost always requires spending a bunch of money up front before you earn your first dollar of revenue. Even after you launch, it usually takes a while to build up a customer base. So it’s very common for a business to lose money for at least the first few months — and sometimes the first few years — before it grows large enough to cover its overhead and start generating profits.
Now imagine that the first store does so well that you decide to open two new stores a year after the original one. So in month 13, store #1 earns a $500 profit. But your other two stores are each losing $5,500 — just as the first store did a year earlier. In total, the company is losing $10,500 — the biggest loss in its short history.
Customers love the two new stores and they grow as fast as the first one. You become so optimistic that you decide to open four more stores at the start of year three. That month, store #1 generates $6,500 in profit and store #2 and store #3 each generate $500 in profit. But stores 4 through 7 are brand new, and so they each lose $5,500. In total, your company has lost $14,500 — another record loss.
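The arithmetic in this example is mechanical enough that a few lines of Python can reproduce it. One assumption of mine: each store’s sales grow by 250 cups per month, which is the growth rate implied by the figures above.

```python
# A minimal sketch of the coffee-shop example. The per-store growth rate
# (250 extra cups per month) is an assumption chosen to match the article's
# figures; the prices, bean cost, and rent come from the example itself.

CUP_PRICE = 4          # dollars of revenue per cup
BEAN_COST = 2          # dollars of beans per cup
RENT = 6_000           # monthly overhead per store

def store_profit(age_months: int) -> int:
    """Monthly profit of a store open for `age_months` months, assuming it
    sells 250 more cups each month than the month before."""
    cups = 250 * age_months
    return cups * (CUP_PRICE - BEAN_COST) - RENT

# Opening schedule: one store at month 1, two at month 13, four at month 25.
openings = [1] + [13] * 2 + [25] * 4

def company_profit(month: int) -> int:
    """Total monthly profit across all stores open in a given month."""
    return sum(store_profit(month - opened + 1)
               for opened in openings if opened <= month)

for month in (1, 13, 25):
    print(f"month {month:2d}: company profit ${company_profit(month):,}")

# month  1: company profit $-5,500
# month 13: company profit $-10,500
# month 25: company profit $-14,500
```

Each individual store follows the same path from losses to profit, yet the company-wide loss keeps setting records as long as new stores keep opening.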
A financial analyst writes an article arguing that your company is doomed: the larger your company gets, the more money it loses.
But you’re confident the analyst is wrong. Sure, your newest stores are losing money, but that’s temporary. You expect the new stores to become profitable over time, just like the earlier ones did.
This could go on for a while. Maybe you open eight stores in year four and 16 in year five. If you are particularly ambitious — and have sufficiently patient and deep-pocketed investors — you might be able to open new stores for a decade before you turn your first profit. But eventually, you’ll stop (or at least slow down) the pace of openings, and at that point you will wind up with a big, profitable company.
This is a common pattern in the business world. Once investors are confident that a company has a clear path to profitability, they are often willing to fund another round of expansion — designing another chip, releasing another software version, expanding into another city — without waiting for the previous round of investments to pay off. This is why it’s common to see startups do a series of larger and larger fundraising rounds — $1 million, $5 million, $20 million — before they generate a single dollar in profit.
This is especially common in the technology sector because these are often winner-take-all markets. Frequently there are economies of scale, network effects, or other factors that make the most popular search engine, social network, or online retailer much more profitable than the also-rans. You’d much rather be Google than Lycos or Ask Jeeves. So once you (and your investors) are confident you have a viable business model, it often makes sense to spend heavily to stay ahead of your competitors.
Amazon famously did this for a decade. In the late 1990s and early 2000s, it lost more and more money as it expanded from books to CDs to DVDs to consumer electronics and then to many other products. The company didn’t earn its first full-year profit until 2003, nine years after it was founded.
In the early years, a lot of people questioned whether Amazon would ever turn a profit. But the doubters were ultimately proven wrong. Today Amazon is one of the five most valuable companies in the world. It earned $77 billion in profits in 2025.
It doesn’t always work out that way, of course. In 2017, the startup MoviePass announced a service where customers could pay $9.95 to watch one movie per day in movie theaters. A month of movie tickets costs a lot more than $9.95, and in a 2018 interview, MoviePass CEO Mitch Lowe admitted that the company was losing $21 million per month on the service. But he argued that he was just following in the footsteps of Jeff Bezos.
“Remember Amazon, for what, 20 plus years, lost billions and billions of dollars,” he said. “And today is now the most valuable company out there.”
But MoviePass and Amazon were different in a crucial way. Amazon generally sold products above cost; if a CD cost $9.95 on Amazon, the retailer might have paid $7 or $8 for it. Amazon was only losing money because it was rapidly expanding into new markets where — due to startup costs — it wasn’t profitable yet.
In contrast, a typical customer on a $9.95 MoviePass plan got more than $9.95 worth of movie tickets. MoviePass was buying those tickets from theaters at the full retail price and just eating the losses.
The technical term for this is gross margin:
My hypothetical coffee shops had gross margins of 50% because the cost of the beans ($2) was half the price of a cup of coffee ($4).
In 2001, Amazon had a gross margin of 21% — if you bought a CD for $10, Amazon’s costs were likely around $7.90.
In the first half of 2018 MoviePass charged customers $121 million for MoviePass subscriptions, but had a cost of revenue (i.e. the money they paid for movie tickets) of $313 million. That works out to a negative 159% gross margin.
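Here is a quick check of those three figures, using the standard definition: gross margin equals revenue minus cost of revenue, divided by revenue. The dollar amounts are taken from the examples above (the $7.90 Amazon cost is the implied figure, not a reported one).

```python
# Checking the gross-margin figures above, using
# gross margin = (revenue - cost of revenue) / revenue.

def gross_margin(revenue: float, cost_of_revenue: float) -> float:
    return (revenue - cost_of_revenue) / revenue

examples = {
    "Coffee shop (per cup)": (4, 2),          # $4 price, $2 beans
    "Amazon 2001 (per $10 CD)": (10, 7.90),   # implied ~$7.90 cost
    "MoviePass H1 2018 ($M)": (121, 313),     # subscriptions vs. tickets
}

for name, (revenue, cost) in examples.items():
    print(f"{name}: {gross_margin(revenue, cost):.0%}")

# Coffee shop (per cup): 50%
# Amazon 2001 (per $10 CD): 21%
# MoviePass H1 2018 ($M): -159%
```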
If a company has positive gross margins — that is, if it’s making some money on every sale — then scaling it up should help it get to profitability. A company with negative gross margins, on the other hand, likely needs a fundamental rethink.
2026-03-16 23:26:57
Last fall, a lot of people were worried about a possible AI bubble. AI companies were investing heavily in infrastructure because they expected huge demand for AI services in the coming years. For example, an internal OpenAI document projected that revenue would more than double — from $13 billion in 2025 to $30 billion in 2026. Around the same time, Anthropic expected revenue to triple from $4.7 billion in 2025 to more than $15 billion in 2026.
Skeptics didn’t believe companies this large could grow so quickly. But the last few months haven’t gone the way they expected.
Anthropic has posted particularly strong revenue numbers. The company exited 2025 generating revenue at a $9 billion annualized rate. In February, the company announced that its annualized revenue had reached $14 billion. A few weeks after that, Bloomberg reported that Anthropic’s annualized revenue had soared to $19 billion.
These are annualized figures, so Anthropic hasn’t actually earned $19 billion yet this year. (Roughly speaking, annualized revenue is monthly revenue multiplied by 12.) But if customers continue spending at the same rate, Anthropic will easily surpass $15 billion in revenue for 2026. And if revenue continues rising (as seems likely), Anthropic will take in far more than $15 billion this year.
Other AI companies have not enjoyed the same meteoric growth as Anthropic, but demand for AI services has been healthy across the industry.