2026-04-23 01:24:17
Documents viewed by Where’s Your Ed At shed additional light on Microsoft’s transition to token-based billing for GitHub Copilot, as the company grapples with spiraling costs of AI compute.
As reported on Monday (and as announced soon after by Microsoft), the company has suspended new sign-ups for individual and student accounts, removed Anthropic’s Opus models from the cheapest $10-a-month plan, and plans to further tighten usage limits.
According to the documents, the token-based billing announcement will come tomorrow (April 23), with changes to GitHub Copilot rolling out at the beginning of June.
Explainer: At present, GitHub Copilot users have a certain amount of “requests” — interactions where you ask the model to do something, with Pro ($10-a-month) accounts getting 300 a month, and Pro+ ($39-a-month) getting 1500. More-expensive models use more requests, cheaper ones use fewer (I’ll explain in a bit).
Moving to “token-based billing” means that instead of using “requests,” GitHub Copilot users will pay for the actual cost of tokens. For example, Claude Opus 4.7 costs $5 per million input tokens (stuff you feed in) and $25 per million output tokens (stuff the model outputs, including tokens for chain-of-thought reasoning).
Users will pay a monthly subscription to access GitHub Copilot, and receive a certain allotment of AI tokens based on their subscription level. Organizations paying for GitHub Copilot will have “pooled” AI credits, meaning that tokens are shared across the entire organization.
GitHub Copilot Business customers will pay $19 per user per month and receive $30 of pooled AI credits, and Copilot Enterprise customers will pay $39 per user per month and receive $70 of pooled AI credits.
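For a sense of how fast those credits go, here’s a minimal sketch in Python. The token counts are entirely hypothetical (real agentic coding sessions vary wildly), but the rates are the Opus 4.7 prices listed above:

```python
# Rough sketch of token-based billing at Claude Opus 4.7's listed API rates.
# The token counts below are hypothetical; real agentic sessions vary wildly.

INPUT_RATE = 5 / 1_000_000    # $5 per million input tokens
OUTPUT_RATE = 25 / 1_000_000  # $25 per million output tokens

def session_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one interaction: tokens fed in plus tokens generated."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Hypothetical session: 200k tokens of code and context in, 50k tokens out
# (chain-of-thought reasoning bills as output too).
cost = session_cost(200_000, 50_000)
print(f"${cost:.2f} per session")                      # $2.25 per session
print(f"{30 / cost:.0f} sessions on $30 of credits")   # 13 sessions on $30 of credits
```

Burn through a dozen or so sessions like that and a Business seat’s entire monthly allotment of pooled credits is gone.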
While the documents refer to moving “all” GitHub Copilot users to token-based billing, it’s unclear at this time how Microsoft will be handling individual Pro or Pro+ subscribers.
If you liked this news hit and want to support my independent reporting and analysis, why not subscribe to my premium newsletter?
It’s $70 a year, or $7 a month, and in return you get a weekly newsletter that’s usually anywhere from 5,000 to 18,000 words, including vast, detailed analyses of NVIDIA, Anthropic and OpenAI’s finances, and the AI bubble writ large. I recently put out the timely and important Hater’s Guide To The SaaSpocalypse, another on How AI Isn't Too Big To Fail, a deep (17,500 word) Hater’s Guide To OpenAI, and just last week put out the massive Hater’s Guide To Private Credit.
Subscribing to premium is both great value and makes it possible to write these large, deeply-researched free pieces every week.
2026-04-22 06:44:29
The following exists as a record of what happened previously; see above for the full story.
In developing news, Anthropic appears to have removed access to AI coding tool Claude Code from its $20-a-month "Pro" accounts. This is likely another cost-cutting move, following a recent change (per The Information) that forced enterprise users to pay a per-million-token rate rather than getting rate limits that, based on researchers' findings, were often worth much more than the cost of the subscription.
Update: Anthropic's Amol Avasare claims that it is "...running a small test on ~2% of new prosumer signups. Existing Pro and Max subscribers aren't affected." This does not really make sense given that all support documents and the Claude website reflect that Pro users do not have access to Claude Code.
I am waiting for further comment.
Previously, users could access Claude Code with their Pro subscriptions via a command-line interface and both the web and desktop Claude apps. Instead of paying on a per-million-token basis, they could use their subscription allowance to run Claude Code; they will now likely have to pay for API access.
Anthropic's Claude Code support documents (as recently as this April 10th archived page) previously read "Using Claude Code with your Pro or Max plan." The page now reads "Using Claude Code with your Max plan."
Pricing on Anthropic's website reflects the removal of Claude Code on both mobile and desktop.


Some Pro users report that they are still able to access Claude Code via the web app and Command-Line Interface.
It is unclear at this time whether this change is retroactive or applies only to new Pro subscribers, or whether Anthropic intends to entirely remove access to Claude Code (without paying for API tokens) from every Pro customer.
I have requested a comment from Anthropic, and will update this piece when I receive it, or if Anthropic confirms this move otherwise.
If you liked this news hit and want to support my independent reporting and analysis, why not subscribe to my premium newsletter?
It’s $70 a year, or $7 a month, and in return you get a weekly newsletter that’s usually anywhere from 5,000 to 18,000 words, including vast, detailed analyses of NVIDIA, Anthropic and OpenAI’s finances, and the AI bubble writ large. I recently put out the timely and important Hater’s Guide To The SaaSpocalypse, another on How AI Isn't Too Big To Fail, a deep (17,500 word) Hater’s Guide To OpenAI, and just last week put out the massive Hater’s Guide To Private Credit.
Subscribing to premium is both great value and makes it possible to write these large, deeply-researched free pieces every week.
2026-04-22 00:28:59
If you liked this piece, please subscribe to my premium newsletter. It’s $70 a year, or $7 a month, and in return you get a weekly newsletter that’s usually anywhere from 5,000 to 18,000 words, including vast, detailed analyses of NVIDIA, Anthropic and OpenAI’s finances, and the AI bubble writ large. I recently put out the timely and important Hater’s Guide To The SaaSpocalypse, another on How AI Isn't Too Big To Fail, a deep (17,500 word) Hater’s Guide To OpenAI, and just last week put out the massive Hater’s Guide To Private Credit.
Subscribing to premium is both great value and makes it possible to write these large, deeply-researched free pieces every week.
Soundtrack — Megadeth — Hangar 18 (Eb Tuning)
For the best part of four years I’ve been wrapped up in writing these massive, sprawling narratives about the AI bubble and the tech industry at large. I still intend to write them, but today I’m going to do what I do best — explaining all the odd shit that’s happening in the tech industry and why it concerns me.
And because I love a good bit, I’m tying these stories to my pale horses of the AIpocalypse — signs that things are beginning to unwind in the most annoying bubble in history.
Anyway, considering that the newsletter and the podcast are now my main form of income, I’m going to be experimenting with formats across the free and premium newsletters to keep things interesting and varied.
Pale Horse: Any further price increases or service degradations from Anthropic and OpenAI are a sign that they’re running low on cash.
Let’s start with a fairly direct statement: Anthropic should stop taking on new customers until it works out its capacity issues.
So, generally, any service you use with any regularity — Netflix, for example — has the “four nines” of availability, meaning that it’s up 99.99% of the time. Once a company grows beyond a certain scale, four nines is considered standard business practice…
…unless you’re Anthropic!
As of writing this sentence, Anthropic’s Claude chatbot has had 98.79% uptime over the last 90 days, its platform/console 99.14%, its API 99.09%, and Claude Code 99.25%.
Let me put this into context. When you have 99.99% uptime, a service is only down for a minute (and 0.48 of a second) each week. If you’re hitting 98.79% uptime, as with the Claude chatbot, your downtime jumps to two hours, one minute, and 58 seconds.
Or, put another way, 98.79% uptime equates to nearly four-and-a-half days in a calendar year where the service is unavailable.
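If you want to check the math yourself (or plug in other uptime numbers), here’s a quick sketch in Python:

```python
# Convert an uptime percentage into downtime per week and per year.

def downtime_hours(uptime_pct: float, period_hours: float) -> float:
    """Hours of downtime over a period, given an uptime percentage."""
    return period_hours * (1 - uptime_pct / 100)

WEEK_HOURS = 7 * 24     # 168
YEAR_HOURS = 365 * 24   # 8,760

for pct in (99.99, 98.79):
    minutes_per_week = downtime_hours(pct, WEEK_HOURS) * 60
    days_per_year = downtime_hours(pct, YEAR_HOURS) / 24
    print(f"{pct}% uptime: {minutes_per_week:.1f} minutes down per week, "
          f"{days_per_year:.2f} days down per year")

# 99.99% uptime: 1.0 minutes down per week, 0.04 days down per year
# 98.79% uptime: 122.0 minutes down per week, 4.42 days down per year
```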
More astonishingly, Claude for Government sits at 99.91%. Government services are generally expected to hit four nines at minimum, or five nines (99.999%) for more important systems underlying things like emergency services.
This is a company that recently raised $30 billion and gets talked about like somebody’s gifted child, yet Anthropic’s services seem to have constant uptime issues linked to a lack of capacity.
Since mid-February, outages for systems across Anthropic have become so common that some of its enterprise clients are switching to other AI model players.
David Hsu, founder and CEO of software development platform Retool, said he prefers to use Anthropic’s Opus 4.6 model to power his company’s AI agent tool because he believes it is the best model for enterprise. He recently changed to OpenAI’s model to power his company’s agent. “Anthropic has just been going down all the time,” he said.
The reliability of core services on the internet is often measured in nines. Four nines means 99.99% of uptime—a typical percentage that a software company commits to customers. As of April 8, Anthropic’s Claude API had a 98.95% uptime rate in the last 90 days.
Yet Anthropic’s problems go far beyond simple downtime (as I discussed last week), leading to (deliberately or otherwise) severe performance issues with Opus 4.6:
One of the most detailed public complaints originated as a GitHub issue filed by Stella Laurenzo on April 2, 2026, whose LinkedIn profile identifies her as Senior Director in AMD’s AI group.
In that post, Laurenzo wrote that Claude Code had regressed to the point that it could not be trusted for complex engineering work, then backed that claim with a sprawling analysis of 6,852 Claude Code session files, 17,871 thinking blocks and 234,760 tool calls.
The complaint argued that, starting in February, Claude’s estimated reasoning depth fell sharply while signs of poorer performance rose alongside it, including more premature stopping, more “simplest fix” behavior, more reasoning loops, and a measurable shift from research-first behavior to edit-first behavior.
While Anthropic claims that it doesn’t degrade models to better serve demand, that doesn’t really square with the many, many users complaining about the problem. Anthropic’s response has, for the most part, been to pretend like nothing is wrong, with a spokesperson waving off Carl Franzen of VentureBeat (who has a great article on the situation here) by pointing him to two different Twitter posts, neither of which actually explain what’s going on.
Things only got worse with last week’s launch of Opus 4.7, which appears to perform worse and burn more tokens.
One Reddit post titled, "Claude Opus 4.7 is a serious regression, not an upgrade," has 2,300 upvotes. An X user's suggestion that Opus 4.7 wasn't really an improvement over Opus 4.6 got 14,000 likes. In one informal but popular test of AI intelligence, Opus 4.7 appears to say that there were two Ps in "strawberry." Another user screenshot shows it saying that it didn't cross reference because it was "being lazy." Some Redditors found that Opus 4.7 was rewriting their résumés with new schools and last names. Multiple X users posited that Opus 4.7 had simply gotten dumber.
Some X users have suggested the culprit is the AI model's reasoning times. Anthropic says the new "adaptive reasoning" function lets the model decide when to think for longer or shorter periods. One user wrote that they couldn't "get Opus 4.7 to think." Another wrote that it "nerfs performance."
"Not accurate," Anthropic's Boris Cherny, the creator of Claude Code, responded. "Adaptive thinking lets the model decide when to think, which performs better."
I think it’s deeply bizarre that a huge company allegedly worth hundreds of billions of dollars A) can’t seem to keep its services online with any level of consistency, B) appears to be making its products worse, and C) refuses to actually address or discuss the problem. Users have been complaining about Claude models getting “dumber” going back as far as 2024, each time faced with tepid gaslighting from a company whose CEO loves to talk about his AI products wiping out half of white collar labor.
Some might frame this as Anthropic having “insatiable demand for its products,” but what I see is a terrible business with awful infrastructure run in an unethical way. It is blatantly, alarmingly obvious that Anthropic cannot afford to provide a stable and reliable service to its customers. Its plans to expand capacity amount to deals with Broadcom that will come online “starting in 2027,” near-theoretical capacity with Hut8, which does not appear to have ever built an AI data center, and capacity with CoreWeave, a company that has yet to build the full capacity for its 2025 deals with OpenAI and that only has around 850MW of “active power capacity” — so around 653MW of actual compute capacity — as of the end of 2025, up from 360MW at the end of 2024.
Remember: data centers take forever to build, and there’s only a limited amount of global capacity, most of which is taken up by Microsoft, Google, Amazon, Meta and OpenAI, with the first three of those already providing capacity to both Anthropic and OpenAI.
We’re likely hitting the absolute physical limits of available AI compute capacity, if we haven’t already done so, and even if other data centers are coming online, is the plan to just hand them over to OpenAI or Anthropic in perpetuity?
It’s also unclear what the goal of that additional capacity might be, as I discussed last week:
Yet it’s unclear whether “more capacity” means that things will be cheaper, or better, or just a way of Anthropic scaling an increasingly-shittier experience.
To explain: when an AI lab like Anthropic or OpenAI “hits capacity limits,” it doesn’t mean that it starts turning away business or stops accepting subscribers, but that current (and new) subscribers will face randomized downtime and model issues, along with increasingly-punishing rate limits.
Neither company is facing a financial shortfall as a result of being unable to provide their services (rather, they’re facing financial shortfalls because they’re providing their services to customers), and the only ones paying that price because of these “capacity limits” are the customers.
What’s the goal, exactly? Providing a better experience to its current customers? Securing enough capacity to keep adding customers? Securing enough capacity to support larger models like Mythos? When, exactly, does Anthropic hit equilibrium, and what does that look like?
There’s also the issue of cost.
Anthropic is currently losing billions of dollars a year offering a service with amateurish availability and oscillating quality, and continues to accept new subscribers, meaning that capacity issues are not affecting its growth. As a result, adding more capacity simply makes the product work better at a much higher cost.
Anthropic’s growth story is a sham built on selling subscriptions that let users burn anywhere from $8 to $13.50 for every dollar of subscription revenue and providing a brittle, inconsistent service, made possible only through a near-infinite stream of venture capital money and infrastructure providers footing the bill for data center construction.
Put another way, Anthropic doesn’t have to play by the rules. Venture capital funding allows it to massively subsidize its services. The endless, breathless support from the media runs cover for the deterioration of its services. A lack of any true regulation of tech, let alone AI, means that it can rugpull its customers with varying rate limits whenever it feels like it.
If Anthropic were forced to charge its actual costs — and no, I don’t believe its API is profitable no matter how many people misread Dario Amodei’s interview — its growth would quickly fall apart as customers faced the real costs of AI (which I’ll get to in a bit). If Anthropic were forced to provide a stable service, it would have to stop accepting new customers or massively increase its inference spending.
Anthropic is a con, and said con is only made possible through endless, specious hype. Everybody who blindly applauded everything this company did is a mark.
Congratulations to all the current winners of the “Fell For It Again Award.” Per the Financial Times:
Anthropic has said it will hold off on a wider release of the model until it is reassured that it is safe and cannot be abused by bad actors. The company also has a finite amount of computing power and has suffered outages in recent weeks.
Multiple people with knowledge of the matter suggested Anthropic was holding back from a wider release until it could reliably serve the model to customers.
So, yeah, anyone in the media who bought the line of shit from Dario Amodei that this was “too dangerous to release” is a mark. Cal Newport has an excellent piece debunking the hype, but my general feeling is this: if Mythos was so powerful, how did Claude Code’s source code leak?
Did… Anthropic not bother to use its super-powerful Mythos model to check? Or did it not find anything? Either way, very embarrassing for all involved.
Pale Horse: data center collapses, misc.
As I’ve discussed in the past, only 5GW of AI compute capacity is currently under construction worldwide (based on research from Sightline Climate), with “under construction” meaning everything from a scaffolding yard with a fence (as is the case with Nscale’s Loughton-based data center) to a building nearing handoff to the client.
I reached out to Sightline to get some clarity, and they told me that of the 114GW of capacity due to come online by the end of 2028, only 15.2GW is under construction, including the 5GW due in 2026.
That’s…very bad.
It gets worse when you realize that the majority of that construction is for just two companies: OpenAI and Anthropic.
Sidenote: I’ll also add that Anthropic has agreed to spend $100 billion on Amazon Web Services over the next decade as part of its $5 billion (with “up to $20 billion” more in the future, and no, there are no more details than that) investment deal with Amazon, with Anthropic apparently securing 5GW of capacity and bringing “nearly 1GW of Trainium2 and 3 capacity online by the end of the year,” which I do not believe, but whatever. These deals shouldn’t be legal.
So, to summarize, at least 4.6GW of the 15.2GW of data center capacity under construction is for OpenAI, with at least another 4GW of that reserved for Anthropic through partners like Microsoft, Google and Amazon. In truth, the number could be much higher.
This is a fundamentally insane situation. OpenAI and Anthropic both burn billions of dollars a year, with The Information reporting that Anthropic expects to burn at least $11 billion and OpenAI $25 billion in 2026. The only way that these companies can continue to exist is by raising endless venture capital funding or, assuming they make it to IPO, endless debt offerings or at-the-market stock sales.
It’s also very concerning that only such a small percentage of announced compute capacity is being built, especially when you run the numbers against NVIDIA’s actual sales.
Last year, Jerome Darling of TD Cowen estimated that it cost around $30 million per megawatt in critical IT (GPUs, servers, storage, and so on) and $12 million to $14 million per megawatt to build a data center, making critical IT around 68% (at the higher end of construction) of the total cost-per-megawatt.
Now, to be clear, those gigawatt and megawatt numbers for data centers refer to power rather than critical IT, and if we take an average PUE (power usage effectiveness, a measure of how efficiently a data center uses power) of 1.35, we get around 11.2GW of critical IT hardware, with the majority (I’d say 90%) being GPUs, bringing us down to around 10.1GW of GPUs.
If we then cut that up into GB200 or GB300 NVL72 racks with a power draw of around 140kW, that’s around 71,429 racks’ worth of hardware at an average of $4 million each, which gives us around $285.7 billion in revenue for NVIDIA.
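To show my working, here’s the whole chain in one place. The PUE, the 90% GPU share, the 140kW rack draw and the $4 million average rack price are all assumptions (and the figures above round the intermediate steps slightly harder), so treat the output as an order-of-magnitude figure rather than a precise one:

```python
# Rough chain of estimates: data center power under construction -> implied NVIDIA revenue.
# Every constant here is an assumption or an average, not a reported figure.

UNDER_CONSTRUCTION_GW = 15.2   # Sightline Climate: capacity under construction through 2028
PUE = 1.35                     # assumed average power usage effectiveness
GPU_SHARE = 0.90               # assumed share of critical IT power that is GPUs
RACK_KW = 140                  # approximate draw of one GB200/GB300 NVL72 rack
RACK_PRICE = 4_000_000         # assumed average price per rack, in dollars

critical_it_gw = UNDER_CONSTRUCTION_GW / PUE   # ~11.2-11.3 GW of critical IT
gpu_gw = critical_it_gw * GPU_SHARE            # ~10.1 GW of GPUs
racks = gpu_gw * 1_000_000 / RACK_KW           # GW -> kW, then divide by rack draw
implied_revenue = racks * RACK_PRICE

print(f"{gpu_gw:.1f} GW of GPUs across {racks:,.0f} racks")
print(f"Implied NVIDIA revenue: ~${implied_revenue / 1e9:.0f} billion")
# 10.1 GW of GPUs across 72,381 racks
# Implied NVIDIA revenue: ~$290 billion
```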
NVIDIA claims it had a combined $500 billion in orders between 2025 and 2026, and $1 trillion of sales through 2027, and it’s unclear where any of those orders are meant to go other than a warehouse in Taiwan.
At this point, I think it’s fair to ask why anyone is buying more GPUs, as there’s nowhere to fucking put them. Every beat-and-raise earnings from NVIDIA is now deeply suspicious.
New Pale Horse: Any and all signs that companies are facing the economic realities of AI, including any complaints around or adaptations to deal with the increasing costs of AI.
Last week, a report from Goldman Sachs revealed that (and I quote) “...companies are overrunning their initial budgets for inference by orders of magnitude (we heard one industry datapoint on inference costs in engineering now approaching about 10% of headcount cost, but could be on track to be on par with headcounts costs in the next several quarters based on current trajectories.”
To simplify, this means that some companies are spending as much as 10% of the cost of their employees on generative AI services, all without appearing to provide any stability, quality or efficiency gains, or (not that I want this) justification to lay people off.
The Information’s Laura Bratton also reported last week that Uber had managed to blow through its entire AI budget for the year a few months into 2026:
Uber’s surging use of AI coding tools, particularly Anthropic’s Claude Code, has maxed out its full year AI budget just a few months into 2026, according to chief technology officer Praveen Neppalli Naga.
“I'm back to the drawing board because the budget I thought I would need is blown away already,” Neppalli Naga said in an interview.
…
He wouldn’t disclose exact figures of the company’s software budget or what it spends on AI coding tools. Uber’s research and development expenses, which typically reflect companies’ costs of developing new AI products, rose 9% to $3.4 billion in 2025 from the previous year, and the firm said in a recent securities filing it expects that cost will continue rising on an absolute dollar basis.
Uber’s CTO also added that about “...11% of real, live updates to the code in its backend systems are being written by AI agents primarily built with Claude Code, up from just a fraction of a percent three months ago.” Anyone who has ever used Uber’s app in the last year can see how well that’s going, especially if they’ve had to file any kind of support ticket.
Honestly, I find this all completely fucking insane. The whole sales pitch for generative AI is that it’s meant to be this magical, efficiency-driving panacea, yet whenever you ask somebody about it the answer is either “yeah, we’re writing all the code with it!” without any described benefits or “it costs so much fucking money, man.”
Let’s get practical about these economics, and use Spotify as an example because its CEO proudly said that its “top engineers” are barely writing code anymore, though to be clear, the Goldman Sachs example didn’t specifically name any one company.
For the sake of argument, let’s say that the company has 3,000 engineers — one of its sites claims it has 2,700, but I’ve seen reports as high as 3,500. Let’s also assume, based on Spotify salaries posted to Blind (an anonymous social network for tech workers), that these engineers make a median salary of $192,000 a year.
In the event that Spotify spent 10% of its engineering headcount cost (around $576 million in total) on AI inference, it would be spending roughly $57.6 million, or approximately 4.1% of the $1.393 billion in research and development costs from its FY2025 annual report. Eager math-doers in the audience will note that 100% of headcount cost would be nearly half of the R&D budget, or around a quarter of its $2.2 billion in net income for the year.
Now, to be clear, these numbers likely already include some AI inference spend, but I’m just trying to illustrate the sheer scale of the cost.
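Here’s the back-of-the-envelope version, with the headcount and salary figures above standing in as assumptions:

```python
# Back-of-the-envelope: what "inference at 10% of engineering headcount cost" would
# mean for Spotify. Engineer count and salary are the assumptions discussed above.

ENGINEERS = 3_000
MEDIAN_SALARY = 192_000          # dollars a year, per Blind-reported figures
R_AND_D = 1_393_000_000          # FY2025 research and development spend
NET_INCOME = 2_200_000_000       # FY2025 net income

headcount_cost = ENGINEERS * MEDIAN_SALARY     # ~$576 million
inference_at_10pct = 0.10 * headcount_cost     # ~$57.6 million

print(f"Engineering headcount cost: ${headcount_cost / 1e6:.0f}M")
print(f"10% of that on inference: ${inference_at_10pct / 1e6:.1f}M "
      f"({inference_at_10pct / R_AND_D:.1%} of R&D)")
print(f"100% of that on inference: {headcount_cost / R_AND_D:.0%} of R&D, "
      f"{headcount_cost / NET_INCOME:.0%} of net income")
# Engineering headcount cost: $576M
# 10% of that on inference: $57.6M (4.1% of R&D)
# 100% of that on inference: 41% of R&D, 26% of net income
```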
While this is great for Anthropic (and to a lesser extent OpenAI), I don’t see how it works out for any of its customers. A flat 10% bump on the cost of software engineering is the direct opposite of what AI was meant to do, and in the event that costs continue to rise, I’m not sure how anybody justifies the expense much further.
And we’re going to find out fairly quickly, because the world of token subsidies is going away.
Pale Horse: Any further price increases or service degradations from AI startups, and yes, that’s what I’d call GitHub Copilot, in the sense that it loses hundreds of millions of dollars and makes fuck-all revenue.
As I reported yesterday, internal documents have revealed that Microsoft plans to temporarily suspend individual account signups to its GitHub Copilot coding product, tighten rate limits across the board, remove Opus models from its $10-a-month Pro subscription, and transition from requests (single interactions with GitHub Copilot) towards token-based billing some time later this year, with Microsoft confirming some of these details (but not token-based billing) in a blog post.
This is a significant move, driven by (per my own reporting) Microsoft’s week-over-week costs of running GitHub Copilot nearly doubling since January.
An aside/explainer: if you’re confused as to what “token-based billing” means, know that the vast majority of AI services currently subsidize their subscriptions, using another measure (such as “requests” or “rate limits”) to meter out how much a user can use the service. Nevertheless, these services still pay full price for the tokens their users burn — for example, $5 per million input and $25 per million output for Opus 4.7, as I mentioned previously — meaning that the company almost always loses money unless a person doesn’t use the subscription very much.
Companies did this to grow their subscriber numbers, and I think they assumed things would get cheaper somehow. Great job, everyone!
The move to token-based billing will see GitHub users charged based on their usage of the platform, and how many tokens their prompts consume — and thus, how much compute they use. It’s unclear at this time when this will begin, but it significantly changes the value of the product.
I’ll also say that the fact that Microsoft has stopped signing up new paid individual GitHub Copilot subscribers entirely is one of the most shocking moves in the history of software. I’ve literally never seen a company do this outside of products it intended to kill entirely, and that’s likely because — per my source — it intends to move paid customers over to token-based billing, though it’s unclear what these tiers would look like, as the $10-a-month and $39-a-month subscriptions are mostly differentiated by the number of requests you can use.
What’s remarkable about this story is that Microsoft is one of the few players capable of bankrolling AI in perpetuity, with over $20 billion a quarter in profits since the middle of 2023.
Its decision to start cutting costs around AI suggests that said costs have become unbearable — The Information reported back in January that Microsoft was on pace to spend $500 million a year with Anthropic alone, and if that amount has doubled, it likely means that Microsoft is spending upwards of ten times its GitHub Copilot revenue. I can report today that at the end of 2025, GitHub Copilot was at around $1.08 billion in annualized revenue, with the majority of that coming from its Copilot Business and Enterprise subscriptions.
The Information also reported a few weeks ago that GitHub had recently seen a surge of outages attributed to “spiking traffic as well as its effort to move its applications from its own servers to Microsoft’s Azure cloud”:
“Since January, every month, every week almost now has some new peak stat for the highest [usage] rate ever,” [GitHub COO Kyle] Daigle said. He attributed the growth to “both agents and humans,” and also noted that the rise of AI coding tools has led to a rise in humans without deep coding knowledge starting to use GitHub’s platform more.
“Agents” in this case could refer to just about anything — OpenAI’s Codex, Anthropic’s Claude Code, or even people plugging the wasteful, questionably-useful OpenClaw into their GitHub Copilot account, and if that’s what happened, it’s very likely behind the move to token-based billing and tighter rate limits.
In any case, if Microsoft’s making this move, it means that CFO Amy Hood — the woman behind last year’s pullback on data center construction — has decided that the subsidy party is over. Though Microsoft has yet to formally announce the move to token-based billing, I imagine it’ll rip off the bandage sometime this week.
Two weeks ago, Anthropic did the same with its enterprise customers, shifting them to a flat $20-a-seat fee and otherwise charging the per-token rate for whatever models they wanted to use.
I’m making the call that by the end of 2026, a majority of AI services will move some or all of their customers to token-based billing as they reckon with the true costs of running AI models.
I kept things simple today both to give myself a bit of a break and because these were stories I felt needed telling.
Nevertheless, I do have to remark on how ridiculous everything has become.
Everywhere you turn, somebody is talking about “agents” in a way that doesn’t remotely match with reality, like Aaron Levie’s epic screeds about how “AI agents make it so every other company on the planet starts to create software for bringing automation to their workflows in a way that would be either infeasible technically or unaffordable economically,” a statement that may as well be about fucking unicorns and manticores as far as its connections to reality.
I feel bad picking on Aaron, as he doesn’t seem like a bad guy. He is, however, increasingly indicative of the brainrot of executive AI hysteria, where the only way to discuss the industry is in vaguely futuristic-sounding terms about “agents” and “inference” and “tokens as a commodity,” all with the intent of obfuscating the ugly, simple truth: that generative AI is deeply unprofitable, doesn’t seem to provide tangible productivity benefits, and appears to only lose both the business and the customer money.
Though my arguments might be verbose, they’re ultimately pretty simple: AI does not provide even an iota of the benefits — economic or otherwise — to justify its ruinous costs. Every new story that runs about cost-cutting or horrible burnrates increasingly validates my position, and for the most part, boosters respond by saying “well LOOK at how BIG the REVENUES are.”
They aren’t! AI revenues are dogshit. They’re awful. They’re pathetic. The entire industry — including OpenAI and Anthropic’s theoretical revenues of $13.1 billion and $4.5 billion — hit around $65 billion last year, and that includes the revenue that neoclouds like CoreWeave and hyperscalers like Microsoft generate from providing compute.
I’m also just gonna come out and say it: I think the AI startups are misleading their investors and the general public about their revenues. My reporting from last year had OpenAI’s revenues at somewhere in the region of $4.3 billion in the first three quarters of 2025, and Anthropic CFO Krishna Rao said in an affidavit that the company had made revenue “exceeding” (sigh) $5 billion through March 9, 2026, which does not make sense when you add up all the annualized revenue figures reported about this company.
Cursor is also reportedly at $6 billion in annualized revenue (or around $500 million a month) and “gross margin positive” — which I also doubt given that it had to raise over $3 billion last year and is apparently raising another $2 billion this year.
Even if said numbers were real, the majority of OpenAI, Cursor and Anthropic’s revenues come from subsidized software subscriptions. Things have gotten so dire that even Deidre Bosa of CNBC agrees with me that AI demand is inflated by token-maxxing and subsidized services.
Otherwise, everybody else is making single or double-digit millions of dollars and losing hundreds of millions of dollars to get there. And per founder Scott Stevenson, overstating annualized revenues is extremely common, with AI startups booking “three-year-long” enterprise deals with the first year discounted and a twelve-month opt-out:
The reason many AI startups are crushing revenue records is because they are using a dishonest metric
The biggest funds in the world are supporting this and misleading journalists for PR coverage.
The setup: Company signs 3-year enterprise deals. Year 1 is discounted (say $1M), Year 2 steps up ($2M), Year 3 is full price ($3M).
They report $3M as “ARR” — even though they’re only collecting $1M right now.
The worst part: The customer has an opt-out option at 12 months! It’s not actually a 3 year contract.
While it’s hard to say how widespread this potential act of fraud might be, Stevenson estimates that more than 50% of enterprise AI startups are using “contracted ARR” to pump their values. One (honest) founder responded to Stevenson saying that his company has $350,000 in contracted ARR but only $42,000 of ARR, adding that “next year is gonna be awesome though,” which I don’t think will be the case for what appears to be a chatbot for finding investors.
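To spell out how that gap shows up in the numbers, here’s a tiny sketch using the hypothetical three-year deal from Stevenson’s example:

```python
# "Contracted ARR" vs. what's actually being collected, per the hypothetical
# three-year deal in Stevenson's example: year 1 discounted, stepping up each year.

year_values = [1_000_000, 2_000_000, 3_000_000]

reported_arr = max(year_values)   # the full-price final year gets reported as "ARR"
collected_now = year_values[0]    # what the customer is actually paying today
# ...and with a 12-month opt-out, years 2 and 3 may never happen at all.

print(f"Reported 'ARR':  ${reported_arr:,}")                    # $3,000,000
print(f"Collected now:   ${collected_now:,}")                   # $1,000,000
print(f"Overstatement:   {reported_arr / collected_now:.0f}x")  # 3x
```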
This industry’s future is predicated entirely on the existence of infinite resources, and most AI companies are effectively front-ends for models owned by Anthropic and OpenAI, two other companies that rely on infinite resources to run their services and fund their infrastructure.
And at the top of the pile sits NVIDIA, the largest company on the stock market, which is selling more GPUs than can be possibly installed, and very few people seem to notice or care.
I’m talking about hundreds of billions of dollars of GPUs sitting in warehouses that aren’t being installed, with it taking six months to install a single quarter’s worth of GPU sales. The assumption, based on every financial publication I’ve read, appears to be “it will keep selling GPUs forever, and it will all be so great.”
Where are you going to put them, Jensen? Where do the fucking GPUs go? There isn’t enough capacity under construction! If, in fact, NVIDIA is actually selling as many GPUs as it says, it’s likely taking liberties with “transfers of ownership,” where NVIDIA marks a product as “sold” to somebody who has yet to actually take delivery of it.
Sidenote: There’re already signs that GPUs are beginning to pile up.
You see, when a hyperscaler buys an AI server, what actually happens is an ODM — original design manufacturer — buys the GPUs from NVIDIA, builds the server, and then ships it to the data center, which, to be clear, is all above board and normal. These ODMs also book the entire value of the NVIDIA GPUs as revenue, which is why revenues for companies like Foxconn, Wistron and Quanta Computer have all spiked during the AI bubble.
Oh, right, the signs. Per Quanta Computer’s fourth quarter financial results, inventory — as in stuff that’s sitting waiting to go somewhere — has spiked from $10.54 billion in Q3 2025 to $16.3 billion in Q4 2025, and nearly doubled year-over-year (from $8.33 billion), as gross margin dropped from 7.9% in Q4 2024 to 7% in Q4 2025. While this isn’t an across-the-board problem (Wistron’s inventories dropped quarter-over-quarter, for example), Taiwanese ODMs are going to be one of the first places to watch for inventory accumulation.
In any case, I keep coming back to the word “hysteria,” because it’s hard to find another word to describe this hype cycle. The way that the media, the markets, analysts, executives, and venture capitalists discuss AI is totally divorced from reality, describing “agents” in terms that don’t match what the products can actually do and AI data centers in terms of “gigawatts” that are entirely fucking theoretical, all with a terrifying certainty that makes me wonder what it is I’m missing.
But every sign points to me being right, and if I’m right at the scale I think I’m right, I think we’re about to have a legitimacy crisis in investing and mainstream media, because regular people are keenly aware that something isn’t right, and in many cases, it’s because they’re able to count.
2026-04-21 01:11:58
Note: Microsoft has now confirmed some of these details in a blog post.
Leaked internal documents viewed by Where’s Your Ed At reveal that Microsoft intends to pause new signups for the student and paid individual tiers of AI coding product GitHub Copilot, tighten rate limits, and eventually move users to “token-based billing,” charging them based on the actual cost of their token burn.
Explainer: At present, GitHub Copilot users have a certain amount of “requests” — interactions where you ask the model to do something, with Pro ($10-a-month) accounts getting 300 a month, and Pro+ ($39-a-month) getting 1500. More-expensive models use more requests, cheaper ones use fewer (I’ll explain in a bit).
Moving to “token-based billing” would mean that instead of using “requests,” GitHub Copilot users would pay for the actual cost of tokens. For example, Claude Opus 4.7 costs $5 per million input tokens (stuff you feed in) and $25 per million output tokens (stuff the model outputs, including tokens for chain-of-thought reasoning).
The document says that although token-based billing has been a top priority for Microsoft, it became more urgent in recent months, with the week-over-week cost of running GitHub Copilot nearly doubling since January.
The move to token-based billing will see GitHub users charged based on their usage of the platform, and how many tokens their prompts consume — and thus, how much compute they use. It’s unclear at this time when this will begin.
This is a significant move, reflecting the significant cost of running models on any AI product. Much like Anthropic, OpenAI, Cursor, and every other AI company, Microsoft has been subsidizing the cost of compute, allowing users to burn way, way more in tokens than their subscriptions cost.
The party appears to be ending for subsidized AI products, with Microsoft’s upcoming move following Anthropic’s (per The Information) recent changes shifting enterprise users to token-based billing as a means of reducing its costs.
GitHub Copilot currently has two tiers for individual developers — a $10-per-month package called GitHub Copilot Pro, and a $39-a-month subscription called GitHub Copilot Pro+.
According to the leaked documents, both of these tiers will be impacted by the signup pause, as will the GitHub Copilot Student product, which is included within the free GitHub Education package.
According to the documents, Microsoft also intends to tighten rate limits on some Copilot Business and Enterprise plans, as well as on individual plans, where limits have already been squeezed, and plans to suspend trials of paid individual plans as it attempts to “fight abuse.”
Although Microsoft has regularly tweaked the rate limits for individual GitHub Copilot accounts, most recently at the start of April, the document notes that these changes weren’t enough, and that more rate limit changes are to come in the next few weeks.
As part of this cost-cutting exercise, Microsoft intends to remove Anthropic’s Opus family of AI models from the $10-per-month GitHub Copilot Pro package altogether.
Microsoft most recently retired Opus 4.6 Fast at the start of April for GitHub Copilot Pro+ users, although this decision was framed as a way to “further improve service reliability” and “[streamline] our model offerings and focusing resources on the models our users use the most.”
Other Opus models — namely Opus 4.6 and Opus 4.5 — will be removed from the GitHub Copilot Pro+ tier in the coming weeks, as Microsoft transitions to Anthropic’s latest Opus 4.7 model.
The move towards Opus 4.7 will likely see GitHub Copilot Pro+ users reach their usage limits faster.
Microsoft is offering a 7.5x request multiplier until April 30 — although it’s unclear what the multiplier will be after this date. This might sound like a good thing, but it means that each request using Opus 4.7 counts as 7.5 requests. Redditors immediately worked that out and are a little bit worried.
Premium request multipliers allow GitHub to reflect the cost of compute for different models. LLMs that require the most compute will have higher premium request multipliers compared to those that are comparatively more lightweight.
For example, the GPT-5.4 Mini model has a premium request multiplier of 0.33 — meaning that every prompt is treated as one-third of a premium request — whereas the now-retired Claude Opus 4.6 Fast had a 30x multiplier, meaning each request was treated as thirty of them.
The standard version of Claude Opus 4.6 has a premium request multiplier of three — meaning that, even with the promotional pricing, Claude Opus 4.7 is around 2.5 times as expensive to use.
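To see what these multipliers do to a monthly allotment, here’s a quick sketch. The 1500-request figure is the Pro+ allotment mentioned earlier; the multipliers are the ones described above:

```python
# How premium request multipliers eat into a GitHub Copilot Pro+ monthly allotment.
# The multipliers are those described above; everything else is straight division.

MONTHLY_REQUESTS = 1_500   # GitHub Copilot Pro+ ($39-a-month) allotment

MULTIPLIERS = {
    "GPT-5.4 Mini": 0.33,
    "Claude Opus 4.6 (standard)": 3,
    "Claude Opus 4.7 (7.5x promo until April 30)": 7.5,
    "Claude Opus 4.6 Fast (retired)": 30,
}

for model, multiplier in MULTIPLIERS.items():
    interactions = MONTHLY_REQUESTS / multiplier
    print(f"{model}: ~{interactions:,.0f} interactions before hitting the cap")

# GPT-5.4 Mini: ~4,545 interactions before hitting the cap
# Claude Opus 4.6 (standard): ~500 interactions before hitting the cap
# Claude Opus 4.7 (7.5x promo until April 30): ~200 interactions before hitting the cap
# Claude Opus 4.6 Fast (retired): ~50 interactions before hitting the cap
```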
The announcements for all of these changes are scheduled to take place throughout the week.
If you liked this news hit and want to support my independent reporting and analysis, why not subscribe to my premium newsletter?
It’s $70 a year, or $7 a month, and in return you get a weekly newsletter that’s usually anywhere from 5,000 to 18,000 words, including vast, detailed analyses of NVIDIA, Anthropic and OpenAI’s finances, and the AI bubble writ large. I recently put out the timely and important Hater’s Guide To The SaaSpocalypse, another on How AI Isn't Too Big To Fail, a deep (17,500 word) Hater’s Guide To OpenAI, and just last week put out the massive Hater’s Guide To Private Credit.
Subscribing to premium is both great value and makes it possible to write these large, deeply-researched free pieces every week.
2026-04-18 00:57:30
A few years ago, I made the mistake of filling out a form to look into a business loan, one that I never ended up getting. Since then I’ve received no fewer than three texts a day offering me lines of credit ranging from $150,000 to as much as $10 million, each one boasting about how quickly they could fund me and how easy said funding would be. Some claim that they’ve been “looking over my file” (I’ve never provided any actual information), others say that they’re “already talking to underwriting,” and some straight up say that they can get me the money in the next 24 hours.
Some of the texts begin with a name (“Hey Ed, It’s Zack”) or sternly say “Edward, it’s time to raise capital.” Others cut straight to the chase and tell me that they have been “arranged for five hundred and fourty (sic) thousand,” and others send the entire terms of a loan that I assume will be harder to get than responding “yes.” While many of them are obvious, blatant scams, others lead to complaint-filled Better Business Bureau pages that show that, somehow, these entities have sent them real money, albeit under terms that piss off their customers and occasionally lead to them getting sued by the government.
That’s because right now, anybody with the right lawyers, accountants and financial backing can create their own fund and start issuing loans to virtually anyone they deem worthy.
And while they’ll all say that they use “industry-standard” underwriting, no regulatory standard exists.
This, my friends, is the world of private credit — a giant, barely-regulated time bomb of indeterminate (but most certainly trillions of dollars) size that has become a load-bearing pillar of pension and insurance funds. According to Federal Reserve data, private credit firms have borrowed around $300 billion (as of 2023) from big banks, representing around 14% of those banks’ total loans.
Sidenote: while there are some strict “private credit” firms — such as software specialist Hercules Capital — many of the “private credit” firms I’ll discuss are really asset managers. These asset managers create and raise specialist private credit funds that either extend debt directly to a party (such as Apollo’s involvement in xAI’s $5.4 billion compute deal), or as part of a leveraged buyout, where a private equity firm buys another company and raises the debt using the company’s own assets and cashflow as collateral, putting the debt on the company’s balance sheet.
The eager, aggressive growth of private credit has even led it to start targeting individual investors, per the Financial Times:
Last year, a retired doctor in France’s southern region of Provence received a brochure in the mail from his bank touting a new investment opportunity.
A New York asset manager called Blackstone was offering the 77-year-old the chance to invest €25,000 into its flagship private debt fund. The former doctor called his son to ask: had he ever heard of Blackstone, or private debt?
His son Mathieu Chabran, co-founder of alternative investment group Tikehau Capital, had indeed heard of the powerful pioneer of private markets. But he was floored to discover that a company with $1tn in assets, which has minted over half a dozen billionaires, was seeking new business from novice investors such as his father.
The FT also neatly summarizes the problem of having regular investors involving themselves in the world of private credit:
He believes people like his father do not fully understand the risks of investing in funds that are harder to sell out of but which offer the opportunity to invest in private loans, property deals and corporate takeovers, with the allure of high returns.
And those high returns come with a cost: a lack of flexibility ranging from “you can only redeem your funds every quarter, and only a small percentage of your funds,” to “you can’t redeem your funds if everybody else tries to at the same time,” to “we make the rules here, shithead.” When an asset manager sets up a private credit fund, it often sets terms around how often — or how much — investors can pull out at once, usually capped at around 5%, because in most cases private credit funds are highly illiquid: despite acting like financial institutions, they more often than not don’t have very much money on hand for investors.
Why? Because the “private” part of private credit means that the lender directly negotiates with the borrower and values the loans based on their own internal models. Said loans generally have little or no secondary market, and private credit wants to hold them to maturity so that it can continue to provide ongoing yield (which I’ll explain in a little bit).
Sidenote: When you read about a “private credit fund,” it’s often a fund owned by an asset manager. For example, Blackstone recently raised “Blackstone Capital Opportunities Fund V,” a $10 billion “opportunistic” credit fund that incorporates as a special purpose vehicle that holds and invests the capital, and eventually sends out disbursements. Investors include New York State’s Common Retirement Fund ($250 million), Texas’ Municipal Retirement System ($200 million), and Louisiana Teachers’ Retirement System ($125 million), per Private Debt Investor.
Funds tend to have a life-cycle of somewhere between five and 10 years, which only really works if everybody keeps paying their loans.
Things were going great for private credit for the longest time, but late last year, some buzzkills at the Financial Times discovered that auto parts manufacturer First Brands and subprime auto loan company Tricolor had taken on billions of dollars of loans under dodgy circumstances, double-pledging collateral (i.e. giving the same stuff as collateral on different loans) and outright falsifying lending documents, allowing both of them to borrow upwards of $10 billion from private credit firms, including billions from North Carolina-based firm Onset Capital, which nearly collapsed but was eventually rescued by Silver Point Capital.
After the collapse of First Brands and Tricolor, JP Morgan’s Jamie Dimon said that “when you see cockroaches, there are probably more,” the kind of sinister quote crafted specifically to lead off a movie about a financial crisis.
Seemingly inspired to start freaking people out, on November 5, software-focused asset manager Blue Owl announced it would merge its publicly-traded OBDC fund with its privately-traded OBDC II fund, and, well, it didn’t go well, per my Hater’s Guide To Private Equity:
Blue Owl tried to merge a private fund (OBDC II, which allowed quarterly payouts) into another, publicly-traded fund (OBDC), but OBDC II’s value (as judged by Blue Owl itself) was 20% lower than that of OBDC, all to try and hide what are clearly problems with the economics of the fund itself. The FT has a great story about it.
Two weeks later, on November 18, 2025, Blue Owl said it would freeze redemptions on OBDC II until after the merger closed, then canceled the merger a day later, citing “market conditions.” Two months later, in February 2026, Blue Owl announced that it was permanently halting redemptions from OBDC II, and sold $1.4 billion in assets from OBDC II and two other funds. The buyers of the assets? Several large pension funds that had a vested interest in keeping the value of the assets high, and Kuvare, an insurance company with $20 billion of assets under management that Blue Owl bought in 2024. This is perfectly legal, extremely normal, and very good.
Private credit is also the principal funding source for private equity’s leveraged buyouts, accounting for over 70% of all leveraged buyout funding for the last decade, which means that private credit — and anyone unfortunate enough to fund it! — is existentially tied to portfolio companies’ ability to pay, and their continued ability to refinance their debt.
This is a problem when your assets are decaying in value. As I discussed in the Hater’s Guide To Private Equity, PE firms massively over-invested between 2017 and 2021, leaving them with a backlog of 31,000 companies valued at $3.7 trillion that they can’t sell or take public, likely because many of these acquisitions were vastly overvalued.
You see, when things were really good, asset managers raised hundreds of billions of dollars from pension funds, insurance funds (some of which they owned), and institutional investors, and then issued hundreds of billions of dollars more (at times using leverage from banks to do so) in loans to private equity firms that went on to buy everything from software companies to restaurant franchises. Said debt would immediately go on the balance sheet of the acquired company, creating a “reliable,” “consistent” yield with every loan payment that the fund could then send on to its investors, on a quarterly or monthly basis.
The problem is that these investments were made under very different economic circumstances, when money was easy to raise and exits were straightforward, leading to many assets being massively overvalued, and holding debt that was issued under revenue and growth projections that only made sense in a low-interest environment. In simple terms, these loans were given to companies assuming they’d be able to pay them long term, and assuming that the sunny economic conditions would continue indefinitely, making them tough to refinance or, in some cases, for the debtor to continue paying.
And nowhere is that problem more pronounced than the world of software.
The jitters caused by First Brands and Tricolor eventually turned into full-on tremors thanks to the SaaSpocalypse (covered in the Hater’s Guide a month ago):
Before 2018, Software As A Service (SaaS) companies had had an incredible run of growth, and it appeared basically any industry could have a massive hypergrowth SaaS company, at least in theory. As a result, venture capital and private equity have spent years piling into SaaS companies, because they all had very straightforward growth stories and replicable, reliable, and recurring revenue streams.
Between 2018 and 2022, 30% to 40% of private equity deals (as I’ll talk about later) were in software companies, with firms taking on debt to buy them and then lending them money in the hopes that they’d all become the next Salesforce, even if none of them will. Even VC remains SaaS-obsessed — for example, about 33% of venture funding went into SaaS in Q3 2025, per Carta.
The Zero Interest Rate Policy (ZIRP) era drove private equity into fits of SaaS madness, with SaaS PE acquisitions hitting $250bn in 2021. Too much easy access to debt and too many Business Idiots believing that every single software company would grow in perpetuity led to the accumulation of some of the most-overvalued software companies in history.
The SaaSpocalypse is often (incorrectly) described as a result of AI “disrupting incumbent software companies,” when the reality is that private equity (and private credit) made the mistaken bet that every single software company would grow in perpetuity.
The larger software industry is in decline, with a McKinsey study of 116 public software companies with over $500 million in revenue from 2024 showing that growth efficiency had halved since 2021 as sales and marketing spend exploded, and BDO’s annual SaaS report from 2025 saying that SaaS company growth ranged from flat to active declines, which is why there’s now $46.9 billion in distressed software loans as of February 2026.
And to be clear, it’s not just private equity’s victims that are taking out loans. Over $62 billion in venture debt was issued in 2025, with established companies like Databricks ($5.2 billion in credit per the Wall Street Journal in 2024) and Dropbox ($2.7 billion from Blackstone in 2025) raising debt just as the overall software industry slows, with AI failing to pick up the pace.
This is a big fucking problem for private credit. Per the Wall Street Journal, asset managers are massively exposed to software companies, and have deliberately mislabeled some assets (such as saying a healthcare software company is just a “healthcare company”) to obfuscate the scale of the problem:
The Blue Owl Credit Income Corp. fund said that 11.6% of its portfolio consisted of loans to “internet software and services” companies at the end of the fourth quarter. The Journal found its software exposure to be around 21%.
The Blackstone Private Credit Fund, known as Bcred, reported 25.7% in software at the end of the third quarter, while the Journal found roughly 33% exposure.
Ares Capital Corp. reported 23.8% in “software and services” at the end of the fourth quarter, while the Journal found nearly 30% exposure.
The Apollo Debt Solutions fund reported 13.6% in software in the fourth quarter, while the Journal found a roughly 16% exposure.
And as I’ll explain, “obfuscation” is a big part of the private credit business model.
If I’m honest, preparing this week’s premium has been remarkably difficult, both in the amount of information I’ve had to pull together and how deeply worried it’s made me.
In the aftermath of the great financial crisis, insurance and pension funds found themselves desperate for yield — regular returns — to meet their payment obligations. Private credit has become the yield-bearer of choice, feeding over a trillion dollars of these funds’ investments into leveraged buyouts, AI data centers, loans to software companies, and failing restaurant franchises.
In some cases, asset managers have purchased insurance companies with the explicit intention of using them as funders for future private credit investments, such as Apollo’s acquisition of Athene, KKR’s acquisition of Global Atlantic, and Blue Owl’s acquisition of Kuvare. More on this later, as it fucking sucks.
Asset managers offering private credit market themselves as bank-like stewards of capital, but lack many, if not all, of the restrictions that make you actually trust a bank. They self-deal, investing their insurance affiliates’ funds in their own equity investments (such as when KKR used Global Atlantic to invest in data center developer CyrusOne, a company it acquired in 2022). They value and revalue assets based on mysterious and undocumented private models. And they account for (as I mentioned) 70% of all funding of leveraged buyouts in the last decade, of which 30 to 40% were software companies purchased between 2018 and 2022, meaning that hundreds of billions of dollars of retirement and insurance funds are dependent on overvalued software companies paying back loans issued during the zero interest rate era.
While a market crash feels scary, what’s far scarier is that the ability of many retirement and insurance funds to meet their obligations, now and in the future, depends on whether private equity-owned entities, software companies, and AI data center firms are able to keep paying their debts. If private credit fund returns begin to lag, the retirement and insurance industry lacks a viable replacement, and I don’t know how to fix that.
Fuck it, I’ll level with you. I think asset managers are scumbags, and I think the way that they do business is fucking disgraceful. The unbelievable amount of risk that asset managers have passed onto people’s fucking retirements is enough to turn my stomach, and if I’m honest, I don’t understand how this entire thing hasn’t broken already.
If I had to guess, it’s one of two reasons: that private credit funds have yet to escalate their risk enough, or we’re yet to see said risk’s consequences, with First Brands and Tricolor being just the beginning.
And Wall Street is prepared to profit, with S&P Dow Jones launching a credit default swap derivatives product to bet against a collection of 25 different banks, insurers, REITs, and business development companies. Bank of America, Deutsche Bank, Barclays and Goldman Sachs will start selling the derivatives next week, per Reuters, and I’d argue that enough demand could spark a genuine panic across publicly-traded asset managers.
In any case, this is a situation where I fear not one massive catastrophe, but a series of smaller calamities caused by decades of hubris and questionable risk management resulting from the unbelievably stupid decision to let private entities act like banks.
This is the Hater’s Guide To Private Credit, or The Big Shart.
2026-04-15 00:22:59
If you like this piece and want to support my independent reporting and analysis, why not subscribe to my premium newsletter?
It’s $70 a year, or $7 a month, and in return you get a weekly newsletter that’s usually anywhere from 5,000 to 18,000 words, including vast, detailed analyses of NVIDIA, Anthropic and OpenAI’s finances, and the AI bubble writ large. I recently put out the timely and important Hater’s Guide To The SaaSpocalypse, another on How AI Isn't Too Big To Fail, and a deep (17,500 word) Hater’s Guide To OpenAI.
Subscribing to premium is both great value and makes it possible to write these large, deeply-researched free pieces every week.
Soundtrack: Muse — Stockholm Syndrome
I think the most enlightening thing about AI is that it shows you how even the most mediocre text inspires some sort of emotion. Soulless LinkedIn slop makes you feel frustration with a person for their lack of authenticity, but you can still imagine how they forced it out of their heads. You still connect with them, even if it’s in a bad way.
AI copy is dead. It is inert. The reason you can spot it is that it sounds hollow. I don’t care if a website says stuff on it because I typed something in, just like I don’t care if it responds in a way that sounds human, because it all feels like nothing to me. I am not here to give a website respect, I will not be impressed by a website, nor will I grant a website any extra credit if it can’t do the right thing every time. The computer is meant to work for me. If the computer doesn’t do what I want, I change the kind of computer I use. LLMs will always hallucinate, their outputs are not trustworthy as a result, they cannot be deterministic, and any chance of any mistake of any kind is unforgivable. I don’t care how the website made you feel: it’s a machine that doesn’t always work, and that’s not a very good machine.
I feel nothing when I see an LLM’s output. Tell me thank you or whatever, I don’t care. You’re a website. Oh you can spit out code? Amazing. Still a website.
Perhaps you’ve found value in LLMs. Congratulations! You should feel no compulsion to have to convince me, nor should you feel any pride in using a particular website. And if you feel you’re being judged for using AI, perhaps you should ask why you feel so vilified? Did the industry do something to somehow warrant judgment? Is there something weird or embarrassing about the product, such as it famously having a propensity to get things wrong? Perhaps it loses billions of dollars? Oh, it’s damaging to the environment too? And people are telling outright lies about it and constantly saying it’ll replace people’s jobs? And the CEOs are all greedy oafish sociopaths? Did you try being cloying, judgmental, condescending, and aggressive to those who don’t like AI? Oh, that didn’t work? I can’t imagine why.
Sounds embarrassing! You must really like that website.
ChatGPT is a website. Claude is a website. While I guess Claude Code runs in a terminal window, that just means it’s an app, which I put in exactly the same mental box as I do a website.
Yet everything you read or hear or see about AI does everything it can to make you think that AI is something other than a website or an app. People that “discover the power of AI” immediately stop discussing it in the same terms as Microsoft Word, Google, or any other app or website. It’s never just about what AI can do today, but always about some theoretical “AGI” or vague shit about “AI agents” that are some sort of indeterminate level of “valuable” without anyone being able to describe why.
Truly useful technology isn’t described in oblique or hyperbolic terms. For example, last week, IBM’s Dave McCann described using a series of “AI agents” to Business Insider:
The agent — it's actually a collection of AI agents and assistants — scans McCann's calendar for client meetings and drafts a list of 10 things he needs to know for each one. The goal, McCann told Business Insider, was to free up time he and his staff spent preparing for the meetings.
Sounds like a website to me.
The agent reviews in-house data, what IBM and the client are doing in the market, external data, and account details — such as project status and services sold and purchased, McCann said. It can also identify industry trends and client needs by, for example, reviewing a firm's annual report and identifying a corresponding service IBM could provide.
Sounds like a website using an LLM to summarize stuff to me. Why are we making all this effort to talk about what a website does?
Digital Dave also saves McCann's team time, he said, because the three or four staffers who used to spend hours pulling together insights for the prep calls are now free to do other work.
"It's not just about driving efficiencies, but it's really about transforming how work gets done," McCann said.
My friend, this isn’t a “series of agents.” It’s an LLM that looks at stuff and spits out an answer. Chatbots have done this kind of thing forever. These aren’t “agents.” “Agents” makes it sound like there’s some sort of futuristic autonomous presence rather than a chatbot that’s looking at documents using technology that’s guaranteed to hallucinate incorrect information.
One benefit of building agents, McCann said, is that IBMers who develop them can share them with others on their team or more broadly within the company, "so it immediately creates that multiplier effect."
Many of the people who report to him have created agents, he said. There's a healthy competition, McCann said, to engineer the most robust digital sidekicks, especially because workers can build off of what their colleagues created.
Here’s a fun exercise: replace the word “agent” with “app,” and replace “AI” with “application.” In fact, let’s try that with the next quote:
Apps can handle a range of functions, including gathering information, processing paperwork, drafting communications, taking meeting minutes, and pulling research. It's still early, but these systems are quickly becoming a major focus of corporate application efforts as companies look to turn applications into something that can actually take work off employees' plates.
A variety of functions including searching for stuff, looking at stuff, generating stuff, transcribing a meeting, and searching for stuff. Wow! Who gives a fuck. Every “AI agent” story is either about code generation, summarizing some sort of information source, or generating something based on an information source that you may or may not be able to trust.
“Agent” is an intentional act of deception, and even “modern” agents like OpenClaw and its various ripoffs ultimately boil down to “I can send you a reminder” or “I can transcribe a text you send me.”
Yet everybody seems to want to believe these things are “valuable” or “useful” without ever explaining why. A page of OpenClaw integrations claiming to share “real projects, real automations [and] real magic” includes such incredible, magical use cases as “reads my X bookmarks and discusses them with me,” “check incoming mail and remove spam,” “researches people before meetings and creates briefing docs,” “schedule reminders,” “tracking who visits a website” (summarizing information), and “using voice notes to tell OpenClaw what to do,” which includes “distilling market research” (searching for stuff) and “tightening a proposal” (generating stuff after looking at it).
I’d have no quarrel with any of this if it wasn’t literally described as magical and innovative. This is exactly the shit that software has always done — automations, shortcuts, reminders, and document work. Boring, potentially useful stuff done in an inefficient way requiring a Mac Mini and hundreds of dollars a day of API calls.
Even Stephen Fry’s effusive review of the iPad from 2010, in referring to it as a “magical object,” still referred to it as “class,” “a different order of experience,” remarking on its speed, its responsiveness, its “smooth glide,” and its sheer simplicity. Even Fry, a writer beloved for his effervescence and sophisticated lexicon, was still able to point at the things he liked (such as the design and simplicity) in clear terms. Even in couching it in terms of the future, Fry was still able to cogently explain why he was excited about the present.
Conversely, articles about Large Language Models and their associated products often describe them in one of three ways:
This simply doesn’t happen outside of bubbles. The original CNET review of the iPhone — a technology I’d argue literally changed the way that human beings live their lives — still described it in terms that mirrored the reality we live in:
THE GOOD The Apple iPhone has a stunning display, a sleek design and an innovative multitouch user interface. Its Safari browser makes for a superb web surfing experience, and it offers easy-to-use apps. As an iPod, it shines.
THE BAD The Apple iPhone has variable call quality and lacks some basic features found in many cellphones, including stereo Bluetooth support and a faster data network. Integrated memory is stingy for an iPod, and you have to sync the iPhone to manage music content.
THE BOTTOM LINE Despite some important missing features, a slow data network and call quality that doesn't always deliver, the Apple iPhone sets a new benchmark for an integrated cellphone and MP3 player.
I’d argue that technologies like cloud storage, contactless payments, streaming music, and video and digital photography have transformed our societies in ways that were obvious from the very beginning. Nobody sat around cajoling us to accept that we’d need to sunset our Nokia 3210s and get used to touchscreens, because the moment you used the first iPhone it was blatantly obvious that it was better.
Nobody ostracized you for not being sufficiently excited about iPhone apps. Git, launched in 2005, is arguably one of the single-most transformational technologies in tech history, changing how software engineers built all kinds of software. And I’d argue that Github, which came a few years later, was equally transformational.
Editor’s note: If you used SourceForge or Microsoft Visual SourceSafe, which earned the nickname Microsoft Visual SourceShredder due to the catastrophic (and potentially career-ending) ways it failed, you know.
I can’t find a single example of somebody being shamed for not being sufficiently excited, other than people arguing over whether Git was the superior version control software, or saying that Github, a cloud-based repository for code and collaboration, was obvious in its utility. Those that liked it didn’t feel particularly defensive. Even articles about GitHub’s growth spoke entirely in terms rooted in the present.
I realize this was before the hyper-polarized world of post-Musk Twitter, one where venture capital and the tech industry in general was a fraction of the size, but it’s really weird how different it feels when you read about how the stuff that actually mattered was covered.
I must repeat that this was a very different world with very different incentives. Today’s tech industry is a series of giant group chats across various social networks and physical locations, with a much-larger startup community (yCombinator’s last batch had 199 people — the first had 8) influenced heavily by the whims of investors and the various cults of personality in the valley. While social pressure absolutely existed, the speed at which it could manifest and mutate was minute in comparison to the rabid dogs of Twitter or the current state of Hackernews. There were fewer VCs, too.
In any case, no previous real or imagined tech revolution has ever inspired such eager defensiveness, tribalism or outright aggression toward dissenters, nor such ridiculous attempts to obfuscate the truth about a product outside of cryptocurrency, an industry with obvious corruption and financial incentives.
We’ve never had a cult of personality around a specific technology at this scale. There is something that AI does to people — in both the way it functions and the way people react to it — that inspires them to act defensively, weirdly, tribally.
I think it starts with LLMs themselves, and the feeling they create within a user.
We all love prompts. We love to be asked questions about ourselves. We feel important when somebody takes interest in what we’re doing, and even more so when they remember things about it and seem to be paying attention. LLMs are built to completely focus themselves on us and do so while affirming every single interaction.
Human beings also naturally crave order and structure, which means we’ve created frameworks in our heads about what authoritative-sounding or authoritative-looking information looks like, and the language that engenders trust in it. We trust Wikipedia both because it’s an incredibly well-maintained library of information riddled with citations and because it tonally and structurally resembles an authoritative source. Large Language Models have been explicitly trained (on much of the internet, including Wikipedia) to deliver information in a structured manner that makes us trust it like we would any other authoritative source, massaged with the language we’d expect from a trusted friend or endlessly-patient teacher.
All of this is done with the intention of making you forget that you’re using a website. And that deception is what starts to make people act strangely.
The fact that an LLM can maybe do something is enough to make people try it, along with the constant pressure from social media, peers and the mainstream media.
Some people — such as myself — have used LLMs to do things, seen that making them do said things isn’t going to happen very easily, and walked away because I am not going to use a website that doesn’t do what it says.
As I’ve previously said, technology is a tool to do stuff. Some technology requires you to “get used to it” — iPhones and iPads were both novel (and weird) in their time, as was learning to use the Moonlander ZSK — but basically none of it involves tolerating the inherent failings of the underlying product under the auspices of it “one day being better.” Nowhere else in the world of technology does someone gaslight you into believing that the problems don’t exist or will magically disappear.
It’s not like the iPhone only occasionally allowed you to successfully take a photo, with reliable photography being something you’d have to wait until the iPhone 3GS to enjoy. While the picture quality improved over time, every generation of iPhone did the same basic things successfully, reliably, and consistently.
I also think that the challenge of making an LLM do something useful is addictive and transformative. When people say they’ve “learned to use AI,” often they mean that they’ve worked out ways to fudge their prompts, navigate its failures, mitigate its hallucinations, and connect it to various different APIs and systems of record in such a way that it now, on a prompt, does something, and because they’re the ones that built this messy little process, they feel superior — because the model has repeatedly told them that they were smart for doing it and celebrated with them when they “succeeded.”
The term “AI agent” exists as both a marketing term and a way to ingratiate the user. Saying “yeah I used a chatbot to do some stuff” sounds boring, like you’re talking to an app or a website, but “using an AI agent” makes you sound like a futuristic cyber-warrior, even though you’re doing exactly the same thing.
LLMs are excellent digital busyboxes for those who want to come up with a way to work differently rather than actually doing work. In WIRED’s article about journalists using AI, Alex Heath boasts that he “feels like he’s cheating in a way that feels amazing”:
When technology reporter Alex Heath has a scoop, he sits down at his computer and speaks into a microphone. He’s not talking to a human colleague—Heath went independent on Substack last year—he’s talking to Claude. Using the AI-powered voice-to-text service Wispr Flow, Heath transmits his ideas to an AI agent, then lets it write his first draft.
Heath sat down with me last week to showcase how he’s integrated Anthropic’s Claude Cowork into his journalistic process. The AI tool is connected to his Gmail, Google Calendar, Granola AI transcription service, and Notion notes. He’s also built a detailed skill—a custom set of instructions—to help Claude write in his style, including the “10 commandments” of writing like Alex Heath. The skill includes previous articles he’s written, instructions on how he likes his newsletters to be structured, and notes on his voice and writing style.
Claude Cowork then automates the drafting process that used to take place in Heath’s head. After the agent finishes its first draft, Heath goes back and forth with it for up to 30 minutes, suggesting revisions. It’s quite an involved process, and he still writes some parts of the story himself. But Heath says this workflow saves him hours every week, and he now spends 30 to 40 percent less time writing.
The linguistics of “transmitting an idea to an AI agent” misrepresent what is a deeply boring and soulless experience. Alex speaks into a microphone, his words are transcribed, then an LLM burps out a draft. A bunch of different services connect to Claude Cowork and a text document (that’s what the “custom set of instructions” is) that says how to write like him, and then it writes like him, and then he talks to it and then sometimes writes bits of the story himself.
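Strip away the branding and the whole pipeline fits in a handful of lines. Here’s a purely illustrative sketch of that workflow; transcribe_audio(), generate_draft(), and the style document are hypothetical stubs, not anyone’s actual product or API:

```python
# Illustrative only: stand-in stubs for a transcription service and an LLM.
# Nothing here is anyone's real API; it's just the shape of the workflow.

def transcribe_audio(audio_file: str) -> str:
    """Stub for a voice-to-text step (speaking into a microphone)."""
    return f"[transcript of {audio_file}]"

def generate_draft(prompt: str) -> str:
    """Stub for a single call to a large language model."""
    return f"[model-generated draft from {len(prompt)} characters of prompt]"

def write_newsletter(audio_file: str, style_doc_path: str) -> str:
    notes = transcribe_audio(audio_file)           # 1. dictate the idea
    with open(style_doc_path) as f:                # 2. the "custom instructions"
        style = f.read()                           #    are just a text document
    draft = generate_draft(style + "\n\nNotes:\n" + notes)  # 3. first draft
    return draft                                   # 4. a human still edits it
```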
This is also most decidedly not automation. Heath still must sit and prompt a model again and again. He must still maintain connections to various services and make sure the associated documents in Notion are correct. He must make sure that Granola actually gets the transcriptions from his interview. He must (I would hope) still check both the AI transcription and the output from the model to make sure quotes are accurate. He must make sure his calendar reflects accurate information. He must make sure that Claude still follows his “voice and writing style” — if you can call it that given the amount of distance between him and the product.
Per Heath:
“I never did this because I liked being a writer. I like reporting, learning new things, having an edge, and telling people things that will make them feel smart six months from now.”
Well, Alex, you’re not telling anybody anything, your ideas and words come out of a Large Language Model that has convinced you that you’re writing them.
In any case, Heath’s process is a great example of what makes people think they’re “using powerful AI.” Large Language Models are extremely adept at convincing human beings to do most of the work and then credit “AI” with the outcomes. Alex’s process sounds convoluted and, if I’m honest, a lot more work than the old way of doing things. It’s like writing a blog using a machine from Pee-wee’s Playhouse.
I couldn’t eat breakfast that way every morning. I bet it would get old pretty quick.
This is the reality of the Large Language Model era. LLMs are not “artificial intelligence” at all. They do not think, they do not have knowledge, they are conjuring up their own training data (or reflecting post-training instructions from those developing them or documents instructing them to act a certain way), and any time you try and make them do something more-complicated, they begin to fall apart, and/or become exponentially more-expensive.
You’ll notice that most AI boosters have some sort of bizarre, overly-complicated way of explaining how they use AI. They spin up “multiple agents” (chatbots) that each have their own “skills document” (a text document) and connect “harnesses” (Python scripts, text files that tell it what to do, a search engine, an API) that “let it run agentic workflows” (query various tools to get an outcome).
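For reference, here is roughly what one of these “harnesses” amounts to once you strip the jargon: a loop that sends a conversation to a model and runs whatever “tool” the model asks for. This is an illustrative sketch only; call_model() and the toy tools are hypothetical stand-ins, not any vendor’s actual API:

```python
# A deliberately minimal "agent harness": a loop that feeds a conversation to
# a model and runs whatever "tool" the model asks for. call_model() is a stub,
# not any vendor's real API.

def call_model(messages: list[dict]) -> dict:
    """Stub for an LLM call. A real one returns text or a tool request."""
    return {"tool": "search", "args": {"query": "example"}, "done": True}

TOOLS = {
    "search": lambda query: f"[search results for {query!r}]",
    "read_file": lambda path: f"[contents of {path}]",
}

def run_agent(task: str, skills_doc: str, max_steps: int = 5) -> list[dict]:
    # The "skills document" is just text prepended to the conversation.
    messages = [{"role": "system", "content": skills_doc},
                {"role": "user", "content": task}]
    for _ in range(max_steps):
        reply = call_model(messages)
        if "tool" in reply:
            # "Query various tools to get an outcome."
            messages.append({"role": "tool",
                             "content": TOOLS[reply["tool"]](**reply["args"])})
        if reply.get("done"):
            break
    return messages
```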
The so-called “agentic AI” that is supposedly powerful and autonomous is actually incredibly demanding of its human users — you must set it up in so many different ways and connect it to so many different services and check that every “agent” (different chatbot) is instructed in exactly the right way, and that none of these agents cause any problems (they will) with each other. Oh, don’t forget to set certain ones to “high-thinking” for certain tasks and make sure that other tasks that are “easier” are given to cheaper models, and make sure that those models are prompted as necessary so they don’t burn tokens.
But the process of setting up all those agents is so satisfying, and when they actually succeed in doing something — even if it took fucking forever and costs a bunch and is incredibly inefficient — you feel like a god! And because you can “spin up multiple agents,” each one ready and waiting for you to give them commands (and ready to affirm each and every one of them), you feel powerful, like you’re commanding an army that also requires you to monitor whatever it does.
Sidebar: the psychological reward of building convoluted systems (which you can call “complex” if you want to feel fancy) is enough to drive somebody mad. OpenAI co-founder Andrej Karpathy recently described “building personal knowledge bases for various topics of research interest,” describing a dramatic and contrived process through which he has, by the sounds of it, created some sort of half-assed Wikipedia clone he can ask questions of using an LLM, with the results (and the content) also generated by AI. A user responded saying that he’d been doing a “less pro version of this using OpenClaw and Obsidian.”
It’s a very Silicon Valley way of looking at the world — a private Wikipedia that you use to…search…things you already know? Or want to know? You could just read a book, I guess. Then again, in another recent tweet, Karpathy described drafting a blog post, using an LLM to “meticulously improve the argument over four hours,” then watching as the LLM “demolished the entire argument and convinced him the opposite was in fact true,” suggesting he didn’t really do much thinking about it in the first place.
God, these people sound like lunatics! I’m sorry! What’re you talking about man? You argued with a website for hours until it convinced you of something then it manipulated you into believing you were wrong? Why do you respect it? It’s a website! It doesn’t have opinions or thoughts or feelings. You are arguing with a calculator trained to sound human.
The reason that LLMs have become so interesting for software engineers is that this is already how they lived. Writing software is often a case of taping together different systems and creating little scripts and automations that make them all work, and the satisfaction of building functional software is incredible, even at the early stages.
Large Language Models perform an impression of automating that process, but for the most part force you, the user, to do the shit that matters, even if that means “be responsible for the code that it puts out.” Heath’s process does not appear to take less time than his previous one — he’s just moved stuff around a bit and found a website to tell him he’s smart for doing so.
They are Language Models interpreting language without any knowledge or thoughts or feelings or ability to learn, and each time they read something they interpret meaning based on their training data, which means they can (and will!) make mistakes, and when they’re, say, talking to another chatbot to tell it what to do next, that little mistake might build a fundamental flaw in the software, or just break the process entirely.
And Large Language Models — using the media — exist to try and convince you that these mistakes are acceptable. Take Anthropic’s Claude For Finance tool, which claims to “automate financial modeling” with “pre-built agents” (chatbots), but really appears to just be able to create questionably-useful models via Excel spreadsheets and do “financial research” based on connecting to documents in your various systems, I imagine with a specific system prompt. When Anthropic launched it, the company proudly announced that it had scored a 55.3% on the Finance Agent Test.
I hate to repeat myself, but I will not respect a website, and I will not tolerate something being “55% good” at something if its alleged use case is that it’s an artificial intelligence.
Yet that’s the other remarkable thing about the LLM era — that there are people who are extremely tolerant of potential failures because they believe they’re either A) smart enough to catch them or B) smart enough to build systems that do so for them, with a little sprinkle of “humans make mistakes too,” conflating “an LLM that doesn’t know anything fucking up by definition” with “a human being with experiences and the capacity for adaptation making a mistake.”
Sidenote: I also believe that there is a contingent of people who are very impressed with LLMs who are really just impressed with the coding language Python. Python is awesome! It can organize your files, scrape websites, extract text from PDFs, manage your inbox, and send emails. Anyone you read talking about how LLMs “allowed them to look through a massive dataset” is likely using Python. Many of the associated tools that LLMs rely on are themselves built on Python. Manus, the so-called “intelligent agent” firm that Meta bought last year, daisy-chains Python and Java in an incredibly-inefficient way to sometimes get things right, almost.
I truly have no beef with people using LLMs to speed up Python scripts to do fun little automations or to dig through big datasets, but please don’t try and convince me they’re being futuristic by doing so. If you want to learn Python, I recommend reading Al Sweigart’s Automate The Boring Stuff.
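To be concrete about just how boring this is, here’s the kind of thing I mean: a dozen lines of standard-library Python that sort a folder’s files into subfolders by extension. No agents, no tokens, no subscription. (The folder path is just an example.)

```python
# Sort every file in a folder into subfolders named after its extension.
# Plain standard-library Python: no model, no API calls, no monthly fee.
from pathlib import Path
import shutil

def organize_by_extension(folder: str) -> None:
    root = Path(folder)
    for item in root.iterdir():
        if item.is_file():
            # "report.pdf" goes to "pdf/"; files with no extension go to "misc/"
            destination = root / (item.suffix.lstrip(".").lower() or "misc")
            destination.mkdir(exist_ok=True)
            shutil.move(str(item), str(destination / item.name))

if __name__ == "__main__":
    organize_by_extension("Downloads")  # example path; point it wherever you like
```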
Anybody who sneers at you and says you are being “left behind” because you’re not using AI should be forced to show you what it is they’ve created or done, and the specific system they used to do so. They should have to show you how much work it took to prepare the system, and why it’s superior to just doing it themselves.
Karpathy also had a recent (and very long) tweet about “the growing gap in understanding of AI capability,” involving more word salad than a fucking SweetGreen:
So that brings me to the second group of people, who *both* 1) pay for and use the state of the art frontier agentic models (OpenAI Codex / Claude Code) and 2) do so professionally in technical domains like programming, math and research. This group of people is subject to the highest amount of "AI Psychosis" because the recent improvements in these domains as of this year have been nothing short of staggering. When you hand a computer terminal to one of these models, you can now watch them melt programming problems that you'd normally expect to take days/weeks of work. It's this second group of people that assigns a much greater gravity to the capabilities, their slope, and various cyber-related repercussions.
Wondering what those “staggering improvements” are?
TLDR the people in these two groups are speaking past each other. It really is simultaneously the case that OpenAI's free and I think slightly orphaned (?) "Advanced Voice Mode" will fumble the dumbest questions in your Instagram's reels and *at the same time*, OpenAI's highest-tier and paid Codex model will go off for 1 hour to coherently restructure an entire code base, or find and exploit vulnerabilities in computer systems. This part really works and has made dramatic strides because 2 properties: 1) these domains offer explicit reward functions that are verifiable meaning they are easily amenable to reinforcement learning training (e.g. unit tests passed yes or no, in contrast to writing, which is much harder to explicitly judge), but also 2) they are a lot more valuable in b2b settings, meaning that the biggest fraction of the team is focused on improving them. So here we are.
The one tangible (and theoretical!) example Karpathy gives shows how hard people work to overstate the capabilities of LLMs. “Coherently restructuring” a codebase might happen when you feed it to an LLM (while also costing a shit-ton of tokens, but putting that aside), or it might not understand at all because Claude Opus is acting funny that day, or it might sort-of fix it but mess something subtle up that breaks things in the future. This is an LLM doing exactly what an LLM does — it looks at a block of text, sees whether it matches up with what a user said, sees how that matches with its training data, and then either tells you things to do or generates new code, much like it would do if you had a paragraph of text you needed to fact-check. Perhaps it would get some of the facts right if connected to the right system. Perhaps it might make a subtle error. Perhaps it might get everything wrong.
This is the core problem with the “checkmate, boosters — AI can write code!” argument. AI can write code. We knew that already. It gets “better” as measured by benchmarks that don’t really compare to real-world success, and even with the supposedly meteoric improvements over the last few months, nobody can actually explain what the result of it being better is, nor does it appear to extend to any domain outside of coding.
You’ll also notice that Karpathy’s language is as ingratiating to true believers as it is vague. Other domains are left unexplained other than references to “research” and “math.” I’m in a research-heavy business, and I have tried the most-powerful LLMs and highest-priced RAG/post-RAG research tools, and every time find them bereft of any unique analysis or suggestions.
I don’t dispute that LLMs are useful for generating code, nor do I question whether or not they’re being used by software developers at scale. I just think that they would be used dramatically less if there weren’t an industrial-scale publicity campaign run through the media and the majority of corporate America both incentivizing and forcing them to do so.
Similarly, I’m not sure anybody would’ve been anywhere near as excited if OpenAI and Anthropic hadn’t intentionally sold them a product that was impossible to support long-term.
This entire industry has been sold on a lie, and as capacity becomes an issue, even true believers are turning on the AI labs.
About a year ago, I warned you that Anthropic and OpenAI had begun the Subprime AI Crisis, where both companies created “priority processing tiers” for enterprise customers (read: AI startups like Replit and Cursor), dramatically increasing the cost of running their services to the point that both had to dramatically change their features as a result. A few weeks later, I wrote another piece about how Anthropic was allowing its subscribers to burn thousands of dollars’ worth of tokens on its $100 and $200-a-month subscriptions, and asked the following question at the end:
…do you think that the current version of Claude Code is going to be what you get? Anthropic has proven it’ll rate limit their business customers, what's stopping it from doing the same to you and charging more, just like Cursor?
I was right to ask: a few weeks ago (as I wrote in the Subprime AI Crisis Is Here), Anthropic added “peak hours” to its rate limits, and users found across the board that they were burning through their limits, in some cases in only a few prompts. Anthropic’s response, after saying it was looking into why rate limits were being hit so fast, was to say that users were ineffectively utilizing the 1-million-token context window and failing to adjust Claude’s “thinking effort level” based on whatever task it is they were doing.
Anthropic’s customers were (and remain) furious, as you can see in the replies of its thread on the r/Anthropic Subreddit.
To make matters worse, it appears that — deliberately or otherwise — Anthropic has been degrading the performance of both Claude Opus 4.6 and Claude Code itself, with developers, including AMD Senior AI Director Stella Laurenzo, documenting the problem at length (per VentureBeat):
One of the most detailed public complaints originated as a GitHub issue filed by Stella Laurenzo on April 2, 2026, whose LinkedIn profile identifies her as Senior Director in AMD’s AI group.
In that post, Laurenzo wrote that Claude Code had regressed to the point that it could not be trusted for complex engineering work, then backed that claim with a sprawling analysis of 6,852 Claude Code session files, 17,871 thinking blocks and 234,760 tool calls.
The complaint argued that, starting in February, Claude’s estimated reasoning depth fell sharply while signs of poorer performance rose alongside it, including more premature stopping, more “simplest fix” behavior, more reasoning loops, and a measurable shift from research-first behavior to edit-first behavior.
Think that Anthropic cares? Think again:
Anthropic’s public response focused on separating perceived changes from actual model degradation. In a pinned follow-up on the same GitHub issue posted a week ago, Claude Code lead Boris Cherny thanked Laurenzo for the care and depth of the analysis but disputed its main conclusion.
Cherny said the “redact-thinking-2026-02-12” header cited in the complaint is a UI-only change that hides thinking from the interface and reduces latency, but “does not impact thinking itself,” “thinking budgets,” or how extended reasoning works under the hood.
He also said two other product changes likely affected what users were seeing: Opus 4.6’s move to adaptive thinking by default on Feb. 9, and a March 3 shift to medium effort, or effort level 85, as the default for Opus 4.6, which he said Anthropic viewed as the best balance across intelligence, latency and cost for most users.
Cherny added that users who want more extended reasoning can manually switch effort higher by typing /effort high in Claude Code terminal sessions.
Another developer found that Claude Opus 4.6 was “thinking 67% less than it used to,” though Anthropic didn’t even bother to respond. In fact, Anthropic has done very little to explain what’s actually happening, other than to say that it doesn’t degrade its models to better serve demand.
To be clear, this is far from the only time that I’ve seen people complain about these models “getting dumber” — users on basically every AI Subreddit will say, at some point, that models randomly can’t do things they used to be able to, with nobody really having an answer other than “yeah dude, same.”
Back in September 2025, developer Theo Browne complained that Claude had got dumber, but Anthropic near-immediately responded to say that the degraded responses were a result of bugs that “intermittently degraded responses from Claude,” adding the following:
To state it plainly: We never reduce model quality due to demand, time of day, or server load. The problems our users reported were due to infrastructure bugs alone.
Which begs the question: is Anthropic accidentally making its models worse? Because it’s obvious it’s happening, it’s obvious they know something is happening, and its response, at least so far, has been to say that either users need to tweak their settings or nothing is wrong at all. Yet these complaints have happened for years, and have reached a crescendo with the latest ones that involve, in some cases, Claude Code burning way more tokens for absolutely no reason, hitting rate limits earlier than expected or wasting actual dollars spent on API calls.
Some suggest that the problems are a result of capacity issues over at Anthropic, which have led to a stunning (at least for software used by millions of people) amount of downtime, per the Wall Street Journal:
The reliability of core services on the internet is often measured in nines. Four nines means 99.99% of uptime—a typical percentage that a software company commits to customers. As of April 8, Anthropic’s Claude API had a 98.95% uptime rate in the last 90 days.
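To translate those nines into something tangible, here’s the arithmetic over the same 90-day window the Journal uses, assuming nothing beyond the percentages in that quote:

```python
# Downtime implied by an uptime percentage over a 90-day window.
HOURS_IN_90_DAYS = 90 * 24  # 2,160 hours

for label, uptime in [("Four nines (99.99%)", 0.9999),
                      ("Claude API per the Journal (98.95%)", 0.9895)]:
    downtime = (1 - uptime) * HOURS_IN_90_DAYS
    print(f"{label}: ~{downtime:.1f} hours of downtime")

# Four nines works out to roughly 13 minutes of downtime across 90 days.
# 98.95% works out to roughly 22.7 hours.
```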
This naturally led to boosters (and, for that matter, the Wall Street Journal) immediately saying that this was a sign of the “insatiable demand for AI compute”:
Spot-market prices to access Nvidia’s GPUs, or graphics processing units, in data-center clouds have risen sharply in recent months across the company’s entire product line, according to Ornn, a New York-based data provider that publishes market data and structures financial products around GPU pricing.
Renting one of Nvidia’s most-advanced Blackwell generation of chips for one hour costs $4.08, up 48% from the $2.75 it cost two months ago, according to the Ornn Compute Price Index.
“There’s a massive capacity crunch that’s unlike anything I’ve seen in the more than five years I’ve been running this business,” said J.J. Kardwell, chief executive of Vultr, a cloud infrastructure company. “The question is, why don’t we just deploy more gear? The lead times are too long. Data center build times are long, the power that’s available through 2026 is already all spoken for.”
Before I go any further: if anyone has been taking $2.75-per-hour-per-GPU for any kind of Blackwell GPU, they are losing money. Shit, I think they’re losing money at $4.08 too. While these are examples of on-demand pricing (versus paid-up, years-long contracts like Anthropic buys), if they’re indicative of wider pricing on Blackwell, this is an economic catastrophe.
In any case, Anthropic’s compute constraints are a convenient excuse to start fucking over its customers at scale. Rate limits that were initially believed to be a “bug” are now the standard operating limits of using Anthropic’s services, and its models are absolutely, fundamentally worse than they were even a month ago.
It’s January 14 2026, and you just read The Atlantic’s breathless hype-slop about Claude Code, believing that it was “bigger than the ChatGPT moment,” that it was an “inflection point for AI progress,” and that it could build whatever software you imagined. While you’re not exactly sure what it is you’re meant to be excited about, your boss has been going on and on about how “those who don’t use AI will be left behind,” and your boss allows you to pay $200 for a year’s access to Claude Pro.
You, as a customer, no longer have access to the product you purchased. Your rate limits are entirely different, service uptime is measurably worse, and model performance has, for some reason, taken a massive dip. You hit your rate limits in minutes rather than hours. Prompts that previously allowed you a healthy back-and-forth over a project are now either impractical or impossible.
Your boss now has you vibe-coding barely-functional apps as a means of “integrating you with the development stack,” but every time you feed it a screenshot of what’s going wrong with the app you seem to hit your rate limits again. You ask your boss if he’ll upgrade you to the $100-a-month subscription, and he says that “you’ve got to make do, times are tough.” You sit at your desk trying to work out what the fuck to do for the next four hours, as you do not know how to code and what little you’ve been able to do is now impossible.
This is the reality for a lot of AI subscribers, though in many cases they’ll simply subscribe to OpenAI Codex or another service that hasn’t brought the hammer down on their rate limits.
…for now, at least.
The con of the Large Language Model era is that any subscription you pay for is massively subsidized, and that any product you use can and will see its service degraded as these companies desperately try to either ease their capacity issues or lower their burn rate.
Yet it’s unclear whether “more capacity” means that things will be cheaper, or better, or whether it’s just a way for Anthropic to scale an increasingly-shitty experience.
To explain, when an AI lab like Anthropic or OpenAI “hits capacity limits,” it doesn’t mean that they start turning away business or stop accepting subscribers, but that current (and new) subscribers will face randomized downtime and model issues, along with increasingly-punishing rate limits.
Neither company is facing a financial shortfall as a result of being unable to provide their services (rather, they’re facing financial shortfalls because they’re providing their services to customers). And yet, the only people paying the price for these “capacity limits” are the customers.
This is because AI labs must, when planning capacity, make arbitrary guesses about how large the company will get, and in the event that they acquire too much capacity, they’ll find themselves in dire financial straits, as Anthropic CEO Dario Amodei told Dwarkesh Patel back in February:
So when we go to buying data centers, again, the curve I’m looking at is: we’ve had a 10x a year increase every year. At the beginning of this year, we’re looking at $10 billion in annualized revenue. We have to decide how much compute to buy. It takes a year or two to actually build out the data centers, to reserve the data center.
Basically I’m saying, “In 2027, how much compute do I get?” I could assume that the revenue will continue growing 10x a year, so it’ll be $100 billion at the end of 2026 and $1 trillion at the end of 2027. Actually it would be $5 trillion dollars of compute because it would be $1 trillion a year for five years. I could buy $1 trillion of compute that starts at the end of 2027. If my revenue is not $1 trillion dollars, if it’s even $800 billion, there’s no force on earth, there’s no hedge on earth that could stop me from going bankrupt if I buy that much compute.
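To make the arithmetic in that quote explicit (these are Amodei’s own figures, not mine):

```python
# Reproducing the arithmetic in the quote above, using only Amodei's figures.
revenue = 10e9  # ~$10B annualized at the start of the year
for year in (2026, 2027):
    revenue *= 10  # "the revenue will continue growing 10x a year"
    print(f"End of {year}: ${revenue / 1e9:,.0f}B annualized")

compute_commitment = 1e12 * 5   # "$1 trillion a year for five years" = $5T
downside_revenue = 800e9        # his own miss scenario
annual_gap = 1e12 - downside_revenue
print(f"Compute committed: ${compute_commitment / 1e12:.0f}T")
print(f"Annual gap if revenue lands at $800B instead of $1T: ${annual_gap / 1e9:,.0f}B")
```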
What happens if you don’t buy enough compute? Well, you find yourself having to buy it last-minute, which costs more money, which further erodes your margins, per The Information:
In another sign of its financial pressures, OpenAI told investors that its gross profit margins last year were lower than projected due to the company having to buy more expensive compute at the last minute in response to higher than expected demand for its chatbots and models, according to a person with knowledge of the presentation. (Anthropic has experienced similar problems.)
In other words, compute capacity is a knife-catching game. Ordering compute in advance lets you lock in a better rate, but having to buy compute at the last-minute spikes those prices, eating any potential margin that might have been saved as a result of serving that extra demand.
Order too little compute and you’ll find yourself unable to run stable and reliable services, spiking your costs as you rush to find more capacity. Order too much capacity and you’ll have too little revenue to pay for it.
It’s important to note that the “demand” in question here isn’t revenue waiting in the wings, but customers that are already paying you that want to do more with the product they paid for. More capacity allows you to potentially onboard new customers, but they too face the same problems as your capacity fills.
This also begs the question: how much capacity is “enough”? It’s clear that current capacity issues are a result of the inference (the creation of outputs) demands of Anthropic’s users. What does adding more capacity do, other than potentially bringing that under control?
This also suggests that Anthropic’s (and OpenAI’s by extension) business model is fundamentally flawed. At its current infrastructure scale, Anthropic cannot satisfactorily serve its current paying customer base, and even with this questionably-stable farce of a product, Anthropic still expects to burn $14 billion. While adding more capacity might potentially allow new customers to subscribe, said new customers would also add more strain on capacity, which would likely mean that nobody’s service improves but Anthropic still makes money.
It ultimately comes down to the definition of the word “demand.”
Let me explain.
Data center development is very slow. Only 5GW of capacity is under construction worldwide (and “construction” can mean anything from a single steel beam to a near-complete building). As a result, both Anthropic and OpenAI are planning and paying for capacity years in advance based on “demand.”
“Demand” in this case doesn’t just mean “people who want to pay for services,” but “the amount of compute that the people who pay us now and may pay us in the future will need for whatever it is they do.”
The amount of compute that a user may use varies wildly based on the model they choose and the task in question — a source at Microsoft once told me in the middle of last year that a single user could take up as many as 12 GPUs with a coding task using OpenAI’s o4-mini — which means that in a very real sense these guys are guessing and hoping for the best.
It also means that their natural choice will be to fuck over their current users to ease their capacity issues, especially when those users are paying on a monthly or — ideally — annual basis. OpenAI and Anthropic need to show continued revenue growth, which means that they must have capacity available for new customers, which means that old customers will always be the first to be punished.
We’re already seeing this with OpenAI’s new $100-a-month subscription, a kind of middle ground between its $20 and $200-a-month ChatGPT subscriptions that appears to have immediately reduced rate limits for $20-a-month subscribers.
To obfuscate the changes further, OpenAI also launched a bonus rate limit period through May 31 2026, telling users that they will have “10x or 20x higher rate limits than plus” on its pricing page, while also featuring a tiny little note that’s very easy for somebody to miss.
This is a fundamentally insane and deceptive way to run a business, and I believe things will only get worse as capacity issues continue. Not only must Anthropic and OpenAI find a way to make their unsustainable and unprofitable services burn less money, but they must also constantly dance with metering out whatever capacity they have to their customers, because the more extra capacity they buy, the more money they lose.
However you feel about what LLMs can do, it’s impossible to ignore the incredible abuse and deception happening to just about every customer of an AI service.
As I’ve said for years, AI companies are inherently unsustainable due to the unreliable and inconsistent outputs of Large Language Models and the incredible costs of providing the services. It’s also clear, at this point, that Anthropic and OpenAI have both offered subscriptions that were impossible to provide at scale at the prices and availability they offered leading up to 2026, and that they did so with the intention of growing their revenue to acquire more customers, equity investment and attention.
As a result, customers of AI services have built workflows and habits based on an act of deceit. While some will say “this is just what tech companies do, they get you in when it’s cheap then jack up the price,” doing so is an act of cowardice and allegiance with the rich and powerful.
To be clear, Anthropic and OpenAI need to do this. They’ve always needed to do this. In fact, the ethical thing to do would’ve been to charge for and restrict the services in line with their actual costs so that users could have reliable and consistent access to the services in question. As of now, anyone that purchases any kind of AI subscription is subject to the whims of both the AI labs and their ability to successfully manage their capacity, which may or may not involve making the product that a user pays for worse.
The “demand” for AI as it stands is an act of fiction, as much of that demand was conjured up using products that were either cheaper or more-available. Every one of those effusive, breathless hype-screeds about Claude Code from January or February 2026 is discussing a product that no longer exists. On June 1 2026, any article or post about Codex’s efficacy must be rewritten, as rate limits will be halved.
While for legal reasons I’ll stop short of the most obvious word, Anthropic and OpenAI are running — intentionally or otherwise — deeply deceitful businesses where their customers cannot realistically judge the quality or availability of the service long-term. These companies also are clearly aware that their services are deeply unpopular and capacity-constrained, yet aggressively court and market toward new customers, guaranteeing further service degradations and potential issues with models.
This applies even to API customers, who face exactly the same downtime and model quality issues, all with the indignity of paying on a per-million-token basis, even when Claude Opus 4.6 decides to crap itself while refactoring something, runs token-intensive “agents” to fix simple bugs, or fails to abide by a user’s guidelines.
This is not a dignified way to use software, nor is it an ethical way to sell it.
How can you plan around this technology? Every month some new bullshit pops up. While incremental model gains may seem like a boon, how do you actually say “ok, let’s plan ahead” for a technology that CHANGES, for better or for worse, at random intervals? You’re constantly reevaluating model choices and harnesses and prompts and all kinds of other bullshit that also breaks in random ways because “that’s how large language models work.” Is that fun? Is that exciting? Do you like this? It seems exhausting to me, and nobody seems to be able to explain what’s good about it.
How, exactly, does this change?
Right now, I’d guess that OpenAI has access to around 2GW of capacity (as of the end of 2025), and Anthropic around 1GW based on discussions with sources. OpenAI is already building out around 10GW of capacity with Oracle, as well as locking in deals with CoreWeave ($22.4 billion), Amazon Web Services ($138 billion), Microsoft Azure ($250 billion), and Cerebras (“750MW”).
Meanwhile, Anthropic is now bringing on “multiple gigawatts of Google’s next-generation TPU capacity” on top of deals with Microsoft, Hut8, CoreWeave and Amazon Web Services.
Both of these companies are making extremely large bets that their growth will continue at an astonishing, near-impossible rate. If OpenAI has reached “$2 billion a month” (which I doubt it can pay for) with around 2GW of capacity, this means that it has pre-ordered compute assuming it will make $10 billion or $20 billion a month in a few short years, which fits with The Information’s reporting that OpenAI projects it will make $113 billion in revenue in 2028.
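Assuming, as that reasoning implies, that revenue has to scale roughly in line with the compute serving it (a big, generous assumption), the back-of-the-envelope looks like this, using only the figures cited above:

```python
# Back-of-the-envelope: if ~2GW of capacity supports ~$2B a month in revenue,
# what does a much larger buildout imply under the same (generous) linear scaling?
revenue_per_gw_per_month = 2e9 / 2  # ~$1B per GW per month

for planned_gw in (10, 20):  # the Oracle buildout alone is ~10GW
    monthly = revenue_per_gw_per_month * planned_gw
    print(f"{planned_gw}GW implies ~${monthly / 1e9:.0f}B/month "
          f"(~${monthly * 12 / 1e9:.0f}B/year) to keep the same economics")
```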
And if it doesn’t make that much revenue — and also doesn’t get funding or debt to support it — OpenAI will run out of money, much as Anthropic will if that capacity gets built and it doesn’t make tens of billions of dollars a month to pay for it.
I see no scenario where costs come down, or where rate limits are eased. In fact, I think that as capacity limits get hit, both Anthropic and OpenAI will degrade the experience for the user (either through model degradation or rate limit decay) as much as they can.
I imagine that at some point enterprise customers will be able to pay for an even higher priority tier, and that Anthropic’s “Teams” subscription (which allows you to use the same subsidized subscriptions as everyone else) will be killed off, forcing any organization paying for Claude Code (and eventually Codex) onto the API, as has already happened for Anthropic’s enterprise users.
Anyone integrating generative AI is part of a very large and randomized beta test. The product you pay for today will be materially different in its quality and availability in mere months. I told you this would happen in September 2024. I have been trying to warn you this would happen, and I will repeat myself: these companies are losing so much more money than you can think of, and they are going to twist the knife in and take as many liberties with their users and the media as they can on the way down.
It is fundamentally insane that we are treating these companies as real businesses, either in their economics or in the consistency of the product they offer.
These are unethical products sold in deceptive ways, both in their functionality and availability, and to defend them is to help assist in a society-wide con with very few winners.
And even if you like this, mark my words — your current way of life is unsustainable, and these companies have already made it clear they will make the service worse without warning, if they even directly acknowledge that they’ve done so at all. The thing you pay for is not sustainable at its current price and they have no way to fix that problem.
Do you not see you are being had? Do you not see that you are being used?
Do any of you think this is good? Does any of this actually feel like progress?
I think it’s miserable, joyless and corrosive to the human soul, at least in the way that so many people talk about AI. It isn’t even intelligent. It’s just more software that is built to make you defend it, to support it, to do the work it can’t so you can present the work as your own but also give it all the credit.
And to be clear, these companies absolutely fucking loathe you. They’ll make your service worse at a moment’s notice and then tell you nothing is wrong.
Anyone using a subscription to OpenAI or Anthropic’s services needs to wake up and realize that their way of life is going away — that rate limits will make current workflows impossible, that prices will increase, and that the product they’re selling even today is not one that makes any economic sense.
Every single LLM product is being sold under false pretenses about what’s actually sustainable and possible long term.
With AI, you’re not just the product, you’re a beta tester that pays for the privilege.
And you’re a mark for untrustworthy con men selling software using deceptive and dangerous rhetoric.
I will be abundantly clear for legal reasons that it is illegal to throw a Molotov cocktail at anyone, as it is morally objectionable to do so. I explicitly and fundamentally object to the recent acts of violence against Sam Altman.
It is also morally repugnant for Sam Altman to somehow suggest that the careful, thoughtful, determined, and eagerly fair work of Ronan Farrow and Andrew Marantz is in any way responsible for these acts of violence. Doing so is a deliberate attempt to chill the air around criticism of AI and its associated companies. Altman has since walked back the comments, claiming he “wishes he hadn’t used” a non-specific amount of the following words:
A lot of the criticism of our industry comes from sincere concern about the incredibly high stakes of this technology. This is quite valid, and we welcome good-faith criticism and debate. I empathize with anti-technology sentiments and clearly technology isn’t always good for everyone. But overall, I believe technological progress can make the future unbelievably good, for your family and mine.
While we have that debate, we should de-escalate the rhetoric and tactics and try to have fewer explosions in fewer homes, figuratively and literally.
These words remain on his blog, which suggests that Altman doesn’t regret them enough to remove them.
I do, however, agree with Mr. Altman that the rhetoric around AI does need to change.
Both he and Mr. Amodei need to immediately stop overstating the capabilities of Large Language Models. Mr. Altman and Mr. Amodei should not discuss being “scared” of their models, or being “uncomfortable” that men such as they are in control unless they wish to shut down their services, or that they “don’t know if models are conscious.”
They should immediately stop misleading people through company documentation that models are “blackmailing” people or, as Anthropic did in its Mythos system card, suggest a model has “broken containment and sent a message” when it A) was instructed to do so and B) did not actually break out of any container.
They must stop discussing threats to jobs without actual meaningful data that is significantly more sound than “jobs that might be affected someday but for now we’ve got a chatbot.” Mr. Amodei should immediately cease any and all discussions of AI potentially or otherwise eliminating 50% of white collar jobs, Mr. Altman should cease predicting when Superintelligence might arrive, and Mr. Amodei should actively reject and denounce any suggestion of AI “creating a white collar bloodbath.”
Those who defend AI labs will claim that these are “difficult conversations that need to be had,” when in actuality these men engage in dangerous and frightening rhetoric as a means of boosting a company’s valuation and garnering attention. If either of these men truly believed these things, they would do something about it other than saying “you should be scared of us and the things we’re making, and I’m the only one brave enough to say anything.”
These conversations are also nonsensical and misleading when you compare them to what Large Language Models can do, and this rhetoric is a blatant attempt to scare people into paying for software today based on what it absolutely cannot and will not do in the future. It is an attempt to obfuscate the actual efficacy of a technology as a means of deceiving investors, the media and the general public.
Both Altman and Amodei engage in the language of AI doomerism as a means of generating attention, revenue and investment capital, actively selling their software and future investment potential based on their ownership of a technology that they say (disingenuously) is potentially going to take everybody’s jobs.
Based on posts from his Instagram, the man who threw the Molotov cocktail at Sam Altman’s house was at least partially inspired by If Anyone Builds It, Everyone Dies, a doomer porn fantasy written by a pair of overly-verbose dunces spreading fearful language about the power of AI, itself inspired by the fearmongering of Altman himself. Altman suggested in 2023 that one of the authors might deserve the Nobel Peace Prize.
I only see one side engaged in dangerous rhetoric, and it’s the ones that have the most to gain from spreading it.
I need to be clear that this act of violence is not something I endorse in any way. I am also glad that nobody was hurt.
I also think we need to be clear about the circumstances — and the rhetoric — that led somebody to do this, and why the AI industry needs to be well aware that the society they’re continually threatening with job loss is one full of people that are very, very close to the edge. This is not about anybody being “deserving” of anything, but a frank evaluation of cause and effect.
People feel like they’re being fucking tortured every time they load social media. Their money doesn’t go as far. Their financial situation has never been worse. Every time they read something, it’s a story about ICE patrols, or a near-nuclear war in Iran, or that gas is more expensive, or that there are worrying things happening in private credit. Nobody can afford a house and layoffs are constant.
One group, however, appears to exist in an alternative world where anything they want is possible. They can raise as much money as they want. They can build as big a building as they want, anywhere in the world. Everything they do is taken so seriously that the government will call a meeting about it. Every single media outlet talks about everything they do. Your boss forces you to use it. Every piece of software forces you to at least acknowledge that it uses it too. Everyone talks about it with complete certainty, despite it never being completely clear why. As many people writhe in continual agony and fear, AI promises — but never quite delivers — some sort of vague utopia at the highest cost known to man.
And these companies are, in no uncertain terms, coming for your job.
That’s what they want to do. They all say it. They use deceptively-worded studies that talk about “AI-exposed” careers to scare and mislead people into believing LLMs are coming for their jobs, all while spreading vague proclamations about how said job loss is imminent but also always 12 months away. Altman has even said that the jobs that will vanish weren’t real work to begin with, much as former OpenAI CTO Mira Murati said that some creative jobs shouldn’t have existed in the first place.
These people who sell a product with no benefit comparable on any level to its ruinous, trillion-dollar cost are able to get anything they want at a time when those who work hard are given a kick in the fucking teeth, sneered at for not “using AI” that doesn’t actually seem to make their lives easier, and then told that their labor doesn’t constitute “real work.”
At a time when nobody living a normal life feels like they have enough, the AI industry always seems to get more. There’s not enough money for free college or housing or healthcare or daycare but there’s always more money for AI compute.
Regular people face the harshest credit market in generations but private credit and specifically data centers can always get more money and more land.
AI can never fail — it can only be failed. If it doesn’t work, you simply don’t know how to “use AI” properly and will be “at a huge disadvantage,” despite the sales pitch being “this is intelligent software that just does stuff.” AI companies can get as much attention as they need, their failings explained away, their meager successes celebrated like the ball dropping on New Year’s Eve, their half-assed sub-War Of The Worlds “Mythos” horseshit treated like they’ve opened the gates of Hell.
Regular people feel ignored and like they’re not taken seriously, and the people being given the most money and attention are the ones loudly saying “we’re richer than anyone has ever been, we intend to spend more than anyone has ever spent, and we intend to take your job.”
Why are they surprised that somebody mentally unstable took them seriously? Did they not think that people would be angry? Constantly talking about how your company will make an indeterminate number of people jobless, while also raising over $162 billion in the space of two years and taking up as much space on Earth as you please, is something that could send somebody over the edge.
Every day the news reminds you that everything sucks and is more expensive unless you’re in AI, where you’ll be given as much money as you want and told you’re the most special person alive. I can imagine it tearing at a person’s soul as the world beats them down. What they did was still a disgraceful act of violence.
Unstable people in various stages of torment act in erratic and dangerous ways. The suspect in the Molotov cocktail incident apparently had a manifesto in which he listed the names and addresses of Altman and multiple other AI executives, and, per CNBC, discussed the threat of AI to humanity as a justification for his actions. I am genuinely happy to hear that this person was apprehended without anyone being hurt.
These actions are morally wrong, and they are also the direct result of the AI industry’s deceptive and manipulative scare campaign, one promoted by men like Altman and Amodei, as well as doomer fanfiction writers like Yudkowsky and, of course, Daniel Kokotajlo of AI 2027 — both of whom have had their work validated and propagated via the New York Times.
On the subject of “dangerous rhetoric,” I think we need to reckon with the fact that the mainstream media has helped spread harmful propaganda, and that a lack of scrutiny of said propaganda is causing genuine harm.
I also do not see any attempt by Mr. Altman to deal with the actual, documented threat of AI psychosis, and the people who have been twisted by Large Language Models into taking their own lives and those of others. These are acts of violence that could have been stopped had ChatGPT and similar applications not been anthropomorphized by design and trained to be “friendly.”
These dangerous acts of violence were not inspired by Ronan Farrow publishing a piece about Sam Altman. They were caused by a years-long publicity campaign that has, since the beginning, been about how scary the technology is and how much money its owners make.
I separately believe that these executives and their cohort are intentionally scaring people as a means of growing their companies, and that these continual statements of “we’re making something to take your job and we need more money and space to do it” could be construed as a threat by somebody who is already on edge.
I agree that the dangerous rhetoric around AI must stop. Dario Amodei and Sam Altman must immediately cease their manipulative and disingenuous scare tactics, and begin describing Large Language Models in terms that match their actual abilities, dispensing with any further attempts to extrapolate their future capabilities. Enough with the fluff. Enough with the bullshit. Stop talking about AGI. Start talking about this like regular old software, because that’s all that ChatGPT is.
In the end, if Altman wants to engage with “good-faith criticism,” he should start acting in good faith.
That starts with taking ownership of his role in a global disinformation campaign. It starts with recognizing how the AI industry has sold itself based on spreading mythology with the intent of creating unrest and fear.
And it starts with Altman and his ilk accepting any kind of responsibility for their actions.
I’m not holding my breath.