Thursday, June 12, 2025

The Back Of The AI Envelope

Source
The rise of the technology industry over the last few decades has been powered by its very strong economies of scale. Once you have invested in developing and deploying a technology, the benefit of adding each additional customer greatly exceeds the additional cost of doing so. This led to the concept of "blitzscaling", that it makes sense to delay actually making a profit and devote these benefits to adding more customers. That way you follow the example of Amazon and Uber on the path to a monopoly laid out by Brian Arthur's Increasing Returns and Path Dependence in the Economy. Eventually you can extract monopoly rents and make excess profits, but in the meantime blitzscale believers will pump your stock price.

This is what the VCs behind OpenAI and Anthropic are doing, and what Google, Microsoft and Oracle are trying to emulate. Is it going to work? Below the fold I report on some back-of-the-envelope calculations, which I did without using AI.

David Gerard notes that:
Microsoft is forecast to spend $80 billion on AI in 2025.
Let's try to figure out the return on this investment. We will assume that the $80B is split two ways, $40B to Nvidia for hardware and $40B on building data centers to put it in. Depreciating the $40B of hardware over five years is very optimistic; it is likely to be uneconomic to run after 2-3 years. But that's what we'll do. So that is minus $8B/year on the bottom line over the next five years. Similarly, depreciating the data centers over 20 years is likely optimistic, given the rate at which AI power demand is increasing. But that's what we'll do, giving another minus $2B/year on the bottom line.
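The depreciation arithmetic can be checked with a short script. The even split and the straight-line schedules are this post's assumptions, not Microsoft's actual accounting:

```python
# Back-of-envelope depreciation of the assumed $80B AI spend, split
# evenly between hardware and data centers (this post's assumption).
hardware_spend_b = 40.0       # $B to Nvidia for hardware
datacenter_spend_b = 40.0     # $B on data-center construction

hardware_life_years = 5       # optimistic; 2-3 years may be more realistic
datacenter_life_years = 20    # also optimistic given AI power-demand growth

# Straight-line depreciation charge per year for each asset class
hardware_dep = hardware_spend_b / hardware_life_years        # $8B/year
datacenter_dep = datacenter_spend_b / datacenter_life_years  # $2B/year
total_dep = hardware_dep + datacenter_dep                    # $10B/year

print(f"Hardware: ${hardware_dep:.0f}B/yr, data centers: "
      f"${datacenter_dep:.0f}B/yr, total: ${total_dep:.0f}B/yr")
```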

Microsoft could argue that some of the $80B is the cost of training the models. But since the models will depreciate even faster than the hardware used to train them, this doesn't make things look better.

Microsoft's gross margin for cloud services is about 70%, so they will be expecting this $10B/year cost to generate $33B/year in revenue, or about 13% of Microsoft's total. Of course, there will be some ramp up in the revenue, but Microsoft is planning to keep investing, so next year's investment will need to generate a return too. We will thus ignore the ramp.
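At a 70% gross margin, costs are the remaining 30% of revenue, so the revenue needed to carry a given cost is cost divided by 0.3. A one-liner makes the $33B figure explicit:

```python
# Revenue required to cover a $10B/year cost at a ~70% gross margin:
# cost is the remaining 30% of revenue, so revenue = cost / (1 - margin).
annual_cost_b = 10.0   # $B/year depreciation from above
gross_margin = 0.70    # Microsoft's approximate cloud gross margin
required_revenue_b = annual_cost_b / (1 - gross_margin)  # ~$33B/year

print(f"Required revenue: ${required_revenue_b:.1f}B/year")
```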

Source
Jukka Niiranen notes that:
Microsoft is today promoting the pay-as-you-go pricing model of Copilot Studio as the preferred sales motion. The list price of one message is $0.01. While enterprise clients may get discounts, there’s also the chance of prepaid message capacity being unused, so things may even out. With this price point, Copilot Studio usage generates $2.5M revenue per month, and $30M per year.
So Microsoft is processing about 3B messages/year. It needs adoption to be so fast that next year's revenue will be around 1,100 times its current rate. They will need next year's customers to generate about 3.3T messages/year.
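Working only from the $0.01 list price and the $30M/year run rate quoted above, the implied message volume and required growth fall out directly:

```python
# Implied Copilot Studio message volume and required growth,
# from the quoted list price and revenue figures.
price_per_message = 0.01   # $ list price per message
annual_revenue = 30e6      # $30M/year current run rate
target_revenue = 33e9      # $33B/year needed to cover the $10B/year cost

messages_per_year = annual_revenue / price_per_message  # ~3 billion
growth_factor = target_revenue / annual_revenue         # ~1,100x
target_messages = messages_per_year * growth_factor     # ~3.3 trillion

print(f"{messages_per_year:.1e} messages/yr now; need {growth_factor:.0f}x "
      f"growth to {target_messages:.1e} messages/yr")
```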

How is adoption going? Jukka Niiranen notes that:
160k organizations using Copilot, this translates to around 1.5K messages per org per month. Or 52 messages per day. Now, we have to remember that one action in a Copilot Studio agent often consumes more than one message. ...

If those 52 messages were only about regular GenAI usage without any business process logic, that would mean 26 responses from Copilot Studio agents per day. If they were to include things like agent actions (meaning, AI does something more than just chatting back at you) or AI tools, we’re quickly at a point where the average Copilot Studio customer organization does a couple of agent runs per day.

This is shockingly low. It is plain and obvious that most customers are merely experimenting with trying to build agents. Hardly anyone is running it in production yet. Which wouldn’t be that bad if this was a new 2025 product. But Copilot Studio has been out since November 2023.
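Niiranen's per-organization figures can be sanity-checked against the roughly 3B messages/year implied by the revenue numbers:

```python
# Sanity-check the per-organization usage figures against the
# ~3B messages/year implied by Copilot Studio's revenue run rate.
messages_per_year = 3e9
organizations = 160_000

per_org_per_month = messages_per_year / organizations / 12  # ~1,560
per_org_per_day = per_org_per_month / 30                    # ~52

print(f"{per_org_per_month:.0f} messages/org/month, "
      f"{per_org_per_day:.0f} messages/org/day")
```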
The back of my envelope says that Microsoft's AI business needs to grow customers faster than any business (even OpenAI) ever has if it is not to be a huge drag on the bottom line.

If this were a traditional technology business with very strong economies of scale, growing customers incredibly fast would be good, because the incremental revenue from each new customer vastly outweighs the incremental cost of supporting them. This is where Microsoft's 70% gross margin comes from.

OpenAI lost $5B on $4B in revenue, so each $1 of revenue cost them $2.25. Ed Zitron had a more detailed estimate:
To be abundantly clear, as it stands, OpenAI currently spends $2.35 to make $1.
Let's assume Microsoft is doing better, with each $1 in revenue costing $1.50. But, as James O'Donnell and Casey Crownhart report in We did the math on AI’s energy footprint. Here’s the story you haven’t heard:
As conversations with experts and AI companies made clear, inference, not training, represents an increasing majority of AI’s energy demands and will continue to do so in the near future. It’s now estimated that 80–90% of computing power for AI is used for inference.
If we assume, unrealistically, that training is a one-time cost and they don't need to retrain for next year, training cost them, say, 15% of $45M, or about $6.75M, and answering the 3B messages cost them $38.25M. Scaling up by a factor of 1,100 means answering the messages would cost them $42B plus the $10B depreciation, so $52B. But it would only generate $33B in revenue, so each $1 of revenue would cost about $1.58. Scaling up would make the losses worse.
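The scaling argument in the paragraph above can be written out. The 15/85 training/inference split follows the MIT Technology Review estimate, and treating training as a one-time cost is the deliberately unrealistic assumption stated above:

```python
# Scaling the assumed cost structure: $1.50 of cost per $1 of revenue
# today, with ~15% of compute for training (treated as one-time) and
# ~85% for inference, which must scale with message volume.
current_revenue = 30e6
cost_ratio = 1.50
total_cost = current_revenue * cost_ratio  # $45M
training_cost = 0.15 * total_cost          # ~$6.75M, not repeated at scale
inference_cost = 0.85 * total_cost         # ~$38.25M for ~3B messages

growth_factor = 1100
scaled_inference_b = inference_cost * growth_factor / 1e9  # ~$42B
depreciation_b = 10.0                                      # $10B/year from above
scaled_cost_b = scaled_inference_b + depreciation_b        # ~$52B
scaled_revenue_b = 33.0                                    # $33B/year
cost_per_dollar = scaled_cost_b / scaled_revenue_b         # ~$1.58

print(f"Scaled cost: ${scaled_cost_b:.0f}B vs ${scaled_revenue_b:.0f}B "
      f"revenue -> ${cost_per_dollar:.2f} per $1 of revenue")
```

Note that the loss ratio worsens from $1.50 to about $1.58 per $1 of revenue even with training costs excluded entirely.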

There are only two possibilities. Either inference gets at least an order of magnitude cheaper than training instead of around 6 times more expensive, or the price of using AI goes up by at least an order of magnitude. Now you see why Sam Altman et al are so desperate to run the "drug-dealer's algorithm" (the first one's free) and get the world hooked on this drug so they can supply a world of addicts.

2 comments:

Arthur said...

I don't see why a 10x reduction in per token/message inference costs is in any way unlikely given the reduction in costs seen so far?

David. said...

What matters is the cost of inference relative to the cost of training. Technology improvements reduce the absolute cost of both, not necessarily the relative cost.