In
The Selling Of AI I compared the market strategy behind the AI bubble to the drug-dealer's algorithm, "the first one's free". As the drugs take hold of an addict, three things happen:
- Their price rises.
- The addict needs bigger doses for the same effect.
- Their deleterious effects kick in.
As expected, this is what is happening to AI. Follow me below the fold for the details.
The price rises
Ethan Ding's
tokens are getting more expensive starts thus:
imagine you start a company knowing that consumers won't pay more than $20/month. fine, you think, classic vc playbook - charge at cost, sacrifice margins for growth. you've done the math on cac, ltv, all that. but here's where it gets interesting: you've seen the a16z chart showing llm costs dropping 10x every year.
so you think: i'll break even today at $20/month, and when models get 10x cheaper next year, boom - 90% margins. the losses are temporary. the profits are inevitable.
it’s so simple a VC associate could understand it:
year 1: break even at $20/month
year 2: 90% margins as compute drops 10x
year 3: yacht shopping
it’s an understandable strategy: "the cost of LLM inference has dropped by a factor of 3 every 6 months, we’ll be fine”
The first problem with this is that only
8% of the users will pay the $20/month; the other 92% use it for free (
Menlo Ventures thinks it is only 3%). Indeed it turns out that an entire government agency only pays $1/year, as Samuel Axon reports in
US executive branch agencies will use ChatGPT Enterprise for just $1 per agency:
The workers will have access to ChatGPT Enterprise, a type of account that includes access to frontier models and cutting-edge features with relatively high token limits, alongside a more robust commitment to data privacy than general consumers of ChatGPT get. ChatGPT Enterprise has been trialed over the past several months at several corporations and other types of large organizations.
The workers will also have unlimited access to advanced features like Deep Research and Advanced Voice Mode for a 60-day period. After the one-year trial period, the agencies are under no obligation to renew.
Did I mention the drug-dealer's algorithm?
But that's not one of the two problems Ding is discussing. He is wondering why instead of yacht shopping,
this happened:
but after 18 months, margins are about as negative as they’ve ever been… windsurf’s been sold for parts, and claude code has had to roll back their original unlimited $200/mo tier this week.
companies are still bleeding. the models got cheaper - gpt-3.5 costs 10x less than it used to. but somehow the margins got worse, not better.
What the A16Z graph shows is the rapid reduction in cost per token of
each specific model, but also the rapid pace at which each specific model is supplanted by a better successor. Ding notes that users want the
current best model:
gpt-3.5 is 10x cheaper than it was. it's also as desirable as a flip phone at an iphone launch.
when a new model is released as the SOTA, 99% of the demand immediately shifts over to it. consumers expect this of their products as well.
Which causes the first of the two problems Ding is describing. His graph shows that the cost per token of the model users actually want is
approximately constant:
the 10x cost reduction is real, but only for models that might as well be running on a commodore 64.
so this is the first faulty pillar of the “costs will drop” strategy: demand exists for "the best language model," period. and the best model always costs about the same, because that's what the edge of inference costs today.
...
when you're spending time with an ai—whether coding, writing, or thinking—you always max out on quality. nobody opens claude and thinks, "you know what? let me use the shitty version to save my boss some money." we're cognitively greedy creatures. we want the best brain we can get, especially if we’re balancing the other side with our time.
So the business model based on the cost of inference dropping 10x per year doesn't work. But that isn't the worst of the two problems. While it is true that the cost in dollars of a set number of tokens is roughly constant, the number of tokens a user needs
is not:
while it's true each generation of frontier model didn't get more expensive per token, something else happened. something worse. the number of tokens they consumed went absolutely nuclear.
chatgpt used to reply to a one sentence question with a one sentence reply. now deep research will spend 3 minutes planning, and 20 minutes reading, and another 5 minutes re-writing a report for you while o3 will just run for 20 minutes to answer “hello there”.
the explosion of rl and test-time compute has resulted in something nobody saw coming: the length of a task that ai can complete has been doubling every six months. what used to return 1,000 tokens is now returning 100,000.
Users started by trying fairly simple tasks on fairly simple models. The power users, the ones in the 8%, were happy with the results and graduated to trying complex questions on frontier models. So their consumption of tokens
exploded:
today, a 20-minute "deep research" run costs about $1. by 2027, we'll have agents that can run for 24 hours straight without losing the plot… combine that with the static price of the frontier? that’s a ~$72 run. per day. per user. with the ability to run multiple asynchronously.
once we can deploy agents to run workloads for 24 hours asynchronously, we won't be giving them one instruction and waiting for feedback. we'll be scheduling them in batches. entire fleets of ai workers, attacking problems in parallel, burning tokens like it's 1999.
obviously - and i cannot stress this enough - a $20/month subscription cannot even support a user making a single $1 deep research run a day. but that's exactly what we're racing toward. every improvement in model capability is an improvement in how much compute they can meaningfully consume at a time.
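Ding's arithmetic is easy to check. A minimal sketch using the figures quoted above (the 30-day month and one run per day are my assumptions):

```python
# Back-of-envelope check of Ding's subscription economics,
# using the per-run costs and subscription price quoted above.

subscription = 20.0        # $/month, the consumer price ceiling
deep_research_run = 1.0    # $ per 20-minute "deep research" run today
runs_per_day = 1           # assumption: a single run per day
days_per_month = 30        # assumption: a 30-day month

monthly_cost = deep_research_run * runs_per_day * days_per_month
print(monthly_cost)        # 30.0 -- already 1.5x the $20 subscription

# Projected 2027: a 24-hour agent run at today's frontier token prices
agent_run_2027 = 72.0      # $/day, per Ding's estimate
print(agent_run_2027 * days_per_month)  # 2160.0 per user per month
```

Even a single daily deep-research run exceeds the subscription price today; the projected 24-hour agent runs blow past it by two orders of magnitude.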
The power users were on Anthropic's unlimited plan, so
this happened:
users became api orchestrators running 24/7 code transformation engines on anthropic's dime. the evolution from chat to agent happened overnight. 1000x increase in consumption. phase transition, not gradual change.
so anthropic rolled back unlimited. they could've tried $2000/month, but the lesson isn't that they didn't charge enough, it’s that there’s no way to offer unlimited usage in this new world under any subscription model.
it's that there is no flat subscription price that works in this new world.
Ed Zitron's
AI Is A Money Trap looks at the effect of Anthropic figuring this out on Cursor:
the single-highest earning generative AI company that isn’t called OpenAI or Anthropic, and the highest-earning company built on top of (primarily) Anthropic’s technology.
When Anthropic decided to reduce the rate at which they were losing money, Cursor's
business model collapsed:
In mid-June — a few weeks after Anthropic introduced “priority tiers” that required companies to pay up-front and guarantee a certain throughput of tokens and increased costs on using prompt caching, a big part of AI coding — Cursor massively changed the amount its users could use the product, and introduced a $200-a-month subscription.
Cursor's customers
weren't happy:
Cursor’s product is now worse. People are going to cancel their subscriptions. Its annualized revenue will drop, and its ability to raise capital will suffer as a direct result. It will, regardless of this drop in revenue, have to pay the cloud companies what it owes them, as if it had the business it used to. I have spoken to a few different people, including a company with an enterprise contract, that are either planning to cancel or trying to find a way out of their agreements with Cursor.
So Cursor, which was already losing money, will have less income and higher costs. They are the largest company built on the AI majors' platforms, despite only earning "around $42 million a month", and Anthropic just showed that their business model doesn't work. This isn't a good sign for the generative AI industry and thus, as
Zitron explains in detail, for the persistence of the AI bubble.
Ding explains what OpenAI's $1/year/agency deal is all about by comparing it with similar deals at the
big banks:
this is what devin's all in on. they’ve recently announced their citi and goldman sachs partnerships, deploying devin to 40,000 software engineers at each company. at $20/mo this is a $10M project, but here’s a question: would you rather have $10M of ARR from goldman sachs or $500m from prosumer developers?
the answer is obvious: six-month implementations, compliance reviews, security audits, procurement hell mean that that goldman sachs revenue is hard to win — but once you win it it’s impossible to churn. you only get those contracts if the singular decision maker at the bank is staking their reputation on you — and everyone will do everything they can to make it work.
Once the organization is hooked on the drug, they don't care what it costs because both real and political switching costs are intolerable.
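Ding's "$10M" checks out as annualized revenue per bank (the seat count and price are from the quote above; the annualization is simple multiplication):

```python
# Annualized revenue from one Devin bank deal, per the figures quoted above.
engineers_per_bank = 40_000   # seats at Citi or Goldman Sachs
price_per_seat = 20.0         # $/month

arr_per_bank = engineers_per_bank * price_per_seat * 12
print(arr_per_bank)  # 9600000.0 -- about $10M of ARR per bank, matching Ding
```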
Bigger doses are needed
Anjli Raval reports that
The AI job cuts are accelerating:
Even as business leaders claim AI is “redesigning” jobs rather than cutting them, the headlines tell another story. It is not just Microsoft but Intel and BT that are among a host of major companies announcing thousands of lay-offs explicitly linked to AI. Previously when job cuts were announced, there was a sense that these were regrettable choices. Now executives consider them a sign of progress. Companies are pursuing greater profits with fewer people.
For the tech industry, revenue per employee has become a prized performance metric. Y Combinator start-ups brag about building companies with skeleton teams. A website called the “Tiny Teams Hall of Fame” lists companies bringing in tens or hundreds of millions of dollars in revenue with just a handful of employees.
Brandon Vigliarolo's
IT firing spree: Shrinking job market looks even worse after BLS revisions has the latest data:
The US IT jobs market hasn't exactly been robust thus far in 2025, and downward revisions to May and June's Bureau of Labor Statistics data mean IT jobs lost in July are part of an even deeper sector slowdown than previously believed.
The Bureau of Labor Statistics reported relatively flat job growth last month, but unimpressive payroll growth numbers hid an even deeper reason to be worried: Most of the job growth reported (across all employment sectors) in May and June was incorrect.
According to the BLS, May needed to be revised down by 125,000 jobs to just 19,000 added jobs; June had to be revised down by even more, with 133,000 erroneous new jobs added to company payrolls that month. That meant just 14,000 new jobs were added in June.
...
Against that backdrop, Janco reports that BLS data peg the IT-sector unemployment rate at 5.5 percent in July - well above the national rate of 4.2 percent. Meanwhile, the broader tech occupation unemployment rate was just 2.9 percent, as reported by CompTIA.
Note these points from Janco's table:
- The huge spike of 107,100 IT jobs lost last November.
- The loss of 26,500 IT jobs so far this year.
- That losses so far this year are 327% of those in the same period last year.
The doses are increasing but their effect in pumping the stock isn't: the
NDXT index of tech stocks hasn't been heading moonwards over the last year.
CEOs have been enthusiastically laying off expensive workers and replacing them with much cheaper indentured servants on H-1B visas, as Dan Gooding reports in
H-1B Visas Under Scrutiny as Big Tech Accelerates Layoffs:
The ongoing reliance on the H-1B comes as some of these same large companies have announced sweeping layoffs, with mid-level and senior roles often hit hardest. Some 80,000 tech jobs have been eliminated so far this year, according to the tracker Layoffs.fyi.
Gooding
notes that:
In 2023, U.S. colleges graduated 134,153 citizens or green card holders with bachelor's or master's degrees in computer science. But the same year, the federal government also issued over 110,000 work visas for those in that same field, according to the Institute for Sound Public Policy (IFSPP).
"The story of the H-1B program is that it's for the best and the brightest," said Jeremy Beck, co-president of NumbersUSA, a think tank calling for immigration reform. "The reality, however, is that most H-1B workers are classified and paid as 'entry level.' Either they are not the best and brightest or they are underpaid, or both."
While it is highly likely that most CEOs have drunk the Kool-Aid and actually believe that AI will replace the workers they fired,
Liz Fong-Jones believes that:
the megacorps use AI as pretext for layoffs, but actually rooted in end of 0% interest, changes to R&D tax credit (S174, h/t @pragmaticengineer.com for their reporting), & herd mentality/labour market fixing. they want investors to believe AI is driving cost efficiency.
AI today is literally not capable of replacing the senior engineers they are laying off. corps are in fact getting less done, but they're banking on making an example of enough people that survivors put their heads down and help them implement AI in exchange for keeping their jobs... for now.
Note that the megacorps are monopolies, so "getting less done" and delivering worse product by using AI isn't a problem for them — they won't lose business. It is just more enshittification.
Presumably, most CEOs think they have been laying off the fat, and replacing it with cheaper workers whose muscle is enhanced by AI, thereby pumping the stock. But they can't keep doing this; they'd end up with a C-suite surrounded by short-termers on H-1Bs with no institutional memory of how the company actually functions. This information would have fallen off the end of the AIs' context.
The deleterious effects kick in
The deleterious effects come in three forms.
Within the companies, as the hype about AI's capabilities meets reality. For the workers, and not just those who were laid off. And in the broader economy, as the rush to build AI data centers meets limited resources.
The companies
But Raval sees the
weakening starting:
But are leaner organisations necessarily better ones? I am not convinced these companies are more resilient even if they perform better financially. Faster decision making and lower overheads are great, but does this mean fewer resources for R&D, legal functions or compliance? What about a company’s ability to withstand shocks — from supply chain disruptions to employee turnover and dare I say it, runaway robots?
Some companies such as Klarna have reversed tack, realising that firing hundreds of staff and relying on AI resulted in a poorer customer service experience. Now the payments group wants them back.
Of course, the tech majors have already enshittified their customer experience, so they can impose AI on their customers without fear. But AI is enshittifying the customer experience of smaller companies who have actual competitors.
The workers
Shannon Pettypiece reports that
'A black hole': New graduates discover a dismal job market:
NBC News asked people who recently finished technical school, college or graduate school how their job application process was going, and in more than 100 responses, the graduates described months spent searching for a job, hundreds of applications and zero responses from employers — even with degrees once thought to be in high demand, like computer science or engineering. Some said they struggled to get an hourly retail position or are making salaries well below what they had been expecting in fields they hadn’t planned to work in.
And Anjli Raval notes that
The AI job cuts are accelerating:
Younger workers should be particularly concerned about this trend. Entire rungs on the career ladder are taking a hit, undermining traditional job pathways. This is not only about AI of course. Offshoring, post-Covid budget discipline, and years of underwhelming growth have made entry-level hiring an easy thing to cut. But AI is adding to pressures.
...
The consequences are cultural as well as economic. If jobs aren’t readily available, will a university degree retain its value? Careers already are increasingly “squiggly” and not linear. The rise of freelancing and hiring of contractors has already fragmented the nature of work in many cases. AI will only propel this.
...
The tech bros touting people-light companies underestimate the complexity of business operations and corporate cultures that are built on very human relationships and interactions. In fact, while AI can indeed handle the tedium, there should be a new premium on the human — from creativity and emotional intelligence to complex judgment. But that can only happen if we invest in those who bring those qualities and teach the next generation of workers — and right now, the door is closing on many of them.
In
Rising Young Worker Despair in the United States, David G. Blanchflower & Alex Bryson describe some of the consequences:
Between the early 1990s and 2015 the relationship between mental despair and age was hump-shaped in the United States: it rose to middle-age, then declined later in life. That relationship has now changed: mental despair declines monotonically with age due to a rise in despair among the young. However, the relationship between age and mental despair differs by labor market status. The hump-shape in age still exists for those who are unable to work and the unemployed. The relation between mental despair and age is broadly flat, and has remained so, for homemakers, students and the retired. The change in the age-despair profile over time is due to increasing despair among young workers. Whilst the relationship between mental despair and age has always been downward sloping among workers, this relationship has become more pronounced due to a rise in mental despair among young workers. We find broad-based evidence for this finding in the Behavioral Risk Factor Surveillance System (BRFSS) of 1993-2023, the National Survey on Drug Use and Health (NSDUH), 2008-2023, and in surveys by Pew, the Conference Board and Johns Hopkins University.
History tends to show that large numbers of jobless young people despairing of their prospects for the future is a pre-revolutionary situation.
The economy
Bryce Elder's
What’ll happen if we spend nearly $3tn on data centres no one needs? points out the huge size of the AI bubble:
The entire high-yield bond market is only valued at about $1.4tn, so private credit investors putting in $800bn for data centre construction would be huge. A predicted $150bn of ABS and CMBS issuance backed by data centre cash flows would triple those markets’ current size. Hyperscaler funding of $300bn to $400bn a year compares with annual capex last year for all S&P 500 companies of about $950bn.
It’s also worth breaking down where the money would be spent. Morgan Stanley estimates that $1.3tn of data centre capex will pay for land, buildings and fit-out expenses. The remaining $1.6tn is to buy GPUs from Nvidia and others. Smarter people than us can work out how to securitise an asset that loses 30 per cent of its value every year, and good luck to them.
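The Morgan Stanley figures quoted above, combined with the 30%-a-year depreciation, make the securitization problem concrete. A sketch (the five-year horizon is my choice, not Morgan Stanley's):

```python
# Capex split per Morgan Stanley, as quoted above (in $tn).
land_and_buildings = 1.3
gpus = 1.6
print(round(land_and_buildings + gpus, 1))  # 2.9 -- the "nearly $3tn" headline

# Value of the GPU tranche after each year at 30%/year depreciation.
for year in range(1, 6):
    print(year, round(gpus * 0.7 ** year, 2))
# After 5 years the $1.6tn of GPUs is worth about $0.27tn
```

Losing more than 80% of the collateral's value inside five years is why securitizing GPUs is such a hard sell.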
Brian Merchant argues that this spending is so big it is offsetting the impact of the tariffs in
The AI bubble is so big it's propping up the US economy (for now):
Over the last six months, capital expenditures on AI—counting just information processing equipment and software, by the way—added more to the growth of the US economy than all consumer spending combined. You can just pull any of those quotes out—spending on IT for AI is so big it might be making up for economic losses from the tariffs, serving as a private sector stimulus program.
Noah Smith's
Will data centers crash the economy? focuses on the incredible amounts the big four — Google, Meta, Microsoft, and Amazon — are spending:
For Microsoft and Meta, this capital expenditure is now more than a third of their total sales.
Smith notes that, as a proportion of GDP, this roughly matches the peak of the
telecom boom:
That would have been around 1.2% of U.S. GDP at the time — about where the data center boom is now. But the data center boom is still ramping up, and there’s no obvious reason to think 2025 is the peak,
The fiber optic networks that, a quarter-century later, are bringing you this post were the result of the telecom boom.
Over-investment is back, but might this be a
good thing?
I think it’s important to look at the telecom boom of the 1990s rather than the one in the 2010s, because the former led to a gigantic crash. The railroad boom led to a gigantic crash too, in 1873 ... In both cases, companies built too much infrastructure, outrunning growth in demand for that infrastructure, and suffered a devastating bust as expectations reset and loans couldn’t be paid back.
In both cases, though, the big capex spenders weren’t wrong, they were just early. Eventually, we ended up using all of those railroads and all of those telecom fibers, and much more. This has led a lot of people to speculate that big investment bubbles might actually be beneficial to the economy, since manias leave behind a surplus of cheap infrastructure that can be used to power future technological advances and new business models.
But for anyone who gets caught up in the crash, the future benefits to society are of cold comfort.
How likely is the bubble to burst? Elder notes just
one reason:
Morgan Stanley estimates that more than half of the new data centres will be in the US, where there’s no obvious way yet to switch them on:
America needs to find an extra 45GW for its data farms, says Morgan Stanley. That’s equivalent to about 10 per cent of all current US generation capacity, or “23 Hoover Dams”, it says. Proposed workarounds to meet the shortfall include scrapping crypto mining, putting data centres “behind the meter” in nuclear power plants, and building a new fleet of gas-fired generators.
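The "23 Hoover Dams" figure is easy to sanity-check (the Hoover Dam capacity of roughly 2.08GW is my number, not Morgan Stanley's):

```python
shortfall_gw = 45.0      # extra generation needed, per Morgan Stanley
hoover_dam_gw = 2.08     # approximate nameplate capacity; my assumption

print(round(shortfall_gw / hoover_dam_gw))  # 22 -- close to the quoted 23
```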
Good luck with that! It is worth noting that the crash has
already happened in China, as Caiwei Chen reports in
China built hundreds of AI data centers to catch the AI boom. Now many stand unused.:
Just months ago, a boom in data center construction was at its height, fueled by both government and private investors. However, many newly built facilities are now sitting empty. According to people on the ground who spoke to MIT Technology Review—including contractors, an executive at a GPU server company, and project managers—most of the companies running these data centers are struggling to stay afloat. The local Chinese outlets Jiazi Guangnian and 36Kr report that up to 80% of China’s newly built computing resources remain unused.
Elder also uses the analogy with the late 90s
telecom bubble:
In 2000, at the telecoms bubble’s peak, communications equipment spending topped out at $135bn annualised. The internet hasn’t disappeared, but most of the money did. All those 3G licences and fibre-optic city loops provided zero insulation from default:
Peak data centre spend this time around might be 10 times higher, very approximately, with public credit investors sharing the burden more equally with corporates. The broader spread of capital might mean a slower unwind should GenAI’s return on investment fail to meet expectations, as Morgan Stanley says. But it’s still not obvious why creditors would be coveting a server shed full of obsolete GPUs that’s downwind of a proposed power plant.
When the bubble bursts, who will
lose money?
A data center bust would mean that Big Tech shareholders would lose a lot of money, like dot-com shareholders in 2000. It would also slow the economy directly, because Big Tech companies would stop investing. But the scariest possibility is that it would cause a financial crisis.
Financial crises tend to involve bank debt. When a financial bubble and crash is mostly a fall in the value of stocks and bonds, everyone takes losses and then just sort of walks away, a bit poorer — like in 2000. Jorda, Schularick, and Taylor (2015) survey the history of bubbles and crashes, and they find that debt (also called “credit” and “leverage”) is a key predictor of whether a bubble ends up hurting the real economy.
The Jorda
et al paper is
When Credit Bites Back: Leverage, Business Cycles, and Crises, and what they mean by "credit" and "leverage" is
bank loans.
Smith looks at whether
the banks are lending:
So if we believe this basic story of when to be afraid of capex busts, it means that we have to care about who is lending money to these Big Tech companies to build all these data centers. That way, we can figure out whether we’re worried about what happens to those lenders if Big Tech can’t pay the money back.
And so does
The Economist:
During the first half of the year investment-grade borrowing by tech firms was 70% higher than in the first six months of 2024. In April Alphabet issued bonds for the first time since 2020. Microsoft has reduced its cash pile but its finance leases—a type of debt mostly related to data centres—nearly tripled since 2023, to $46bn (a further $93bn of such liabilities are not yet on its balance-sheet). Meta is in talks to borrow around $30bn from private-credit lenders including Apollo, Brookfield and Carlyle. The market for debt securities backed by borrowing related to data centres, where liabilities are pooled and sliced up in a way similar to mortgage bonds, has grown from almost nothing in 2018 to around $50bn today.
The rush to borrow is more furious among big tech’s challengers. CoreWeave, an AI cloud firm, has borrowed liberally from private-credit funds and bond investors to buy chips from Nvidia. Fluidstack, another cloud-computing startup, is also borrowing heavily, using its chips as collateral. SoftBank, a Japanese firm, is financing its share of a giant partnership with OpenAI, the maker of ChatGPT, with debt. “They don’t actually have the money,” wrote Elon Musk when the partnership was announced in January. After raising $5bn of debt earlier this year xAI, Mr Musk’s own startup, is reportedly borrowing $12bn to buy chips.
Smith focuses on
private credit:
These are the potentially scary part. Private credit funds are basically companies that take investment, borrow money, and then lend that money out in private (i.e. opaque) markets. They’re the debt version of private equity, and in recent years they’ve grown rapidly to become one of the U.S.’ economy’s major categories of debt:
Are the banks vulnerable to
private credit?
Private credit funds take some of their financing as equity, but they also borrow money. Some of this money is borrowed from banks. In 2013, only 1% of U.S. banks’ total loans to non-bank financial institutions was to private equity and private credit firms; today, it’s 14%.
BDCs are “Business Development Companies”, which are a type of private credit fund. If there’s a bust in private credit, that’s an acronym you’ll be hearing a lot.
And I believe the graph above does not include bank purchases of bonds (CLOs) issued by private credit companies. If private credit goes bust, those bank assets will go bust too, making banks’ balance sheets weaker.
The fundamental problem here is that an AI bust would cause losses that would be both very large and very highly correlated, and thus very likely to be a tail risk not adequately accounted for by the banks' risk models, just as the large, highly correlated losses caused the banks to need a bail-out in the
Global Financial Crisis of 2008.