Thursday, June 12, 2025

The Back Of The AI Envelope

Source
The rise of the technology industry over the last few decades has been powered by its very strong economies of scale. Once you have invested in developing and deploying a technology, the benefit of adding each additional customer greatly exceeds the additional cost of doing so. This led to the concept of "blitzscaling", that it makes sense to delay actually making a profit and devote these benefits to adding more customers. That way you follow the example of Amazon and Uber on the path to a monopoly laid out by Brian Arthur's Increasing Returns and Path Dependence in the Economy. Eventually you can extract monopoly rents and make excess profits, but in the meantime blitzscale believers will pump your stock price.

This is what the VCs behind OpenAI and Anthropic are doing, and what Google, Microsoft and Oracle are trying to emulate. Is it going to work? Below the fold I report on some back-of-the-envelope calculations, which I did without using AI.

David Gerard notes that:
Microsoft is forecast to spend $80 billion on AI in 2025.
Let's try to figure out the return on this investment. We will assume that the $80B is split two ways, $40B to Nvidia for hardware and $40B on building data centers to put it in. Depreciating the $40B of hardware over five years is very optimistic; it is likely to be uneconomic to run after 2-3 years. But that's what we'll do. So that is minus $8B/year on the bottom line over the next five years. Similarly, depreciating the data centers over 20 years is likely optimistic, given the rate at which AI power demand is increasing. But that's what we'll do, giving another minus $2B/year on the bottom line.
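The depreciation arithmetic can be checked with a short script. The even split and the straight-line schedules are this post's assumptions, not Microsoft's actual accounting:

```python
# Back-of-envelope depreciation of the assumed $80B AI spend, split
# evenly between hardware and data centers (this post's assumption).
hardware_spend_b = 40.0       # $B to Nvidia for hardware
datacenter_spend_b = 40.0     # $B on data-center construction

hardware_life_years = 5       # optimistic; 2-3 years may be more realistic
datacenter_life_years = 20    # also optimistic given AI power-demand growth

# Straight-line depreciation charge per year for each asset class
hardware_dep = hardware_spend_b / hardware_life_years        # $8B/year
datacenter_dep = datacenter_spend_b / datacenter_life_years  # $2B/year
total_dep = hardware_dep + datacenter_dep                    # $10B/year

print(f"Hardware: ${hardware_dep:.0f}B/yr, data centers: "
      f"${datacenter_dep:.0f}B/yr, total: ${total_dep:.0f}B/yr")
```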

Microsoft could argue that some of the $80B is the cost of training the models. But since the models will depreciate even faster than the hardware used to train them, this doesn't make things look better.

Microsoft's gross margin for cloud services is about 70%, so they will be expecting this $10B/year cost to generate $33B/year in revenue, or about 13% of Microsoft's total. Of course, there will be some ramp up in the revenue, but Microsoft is planning to keep investing, so next year's investment will need to generate a return too. We will thus ignore the ramp.
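At a 70% gross margin, costs are the remaining 30% of revenue, so the revenue needed to carry a given cost is cost divided by 0.3. A one-liner makes the $33B figure explicit:

```python
# Revenue required to cover a $10B/year cost at a ~70% gross margin:
# cost is the remaining 30% of revenue, so revenue = cost / (1 - margin).
annual_cost_b = 10.0   # $B/year depreciation from above
gross_margin = 0.70    # Microsoft's approximate cloud gross margin
required_revenue_b = annual_cost_b / (1 - gross_margin)  # ~$33B/year

print(f"Required revenue: ${required_revenue_b:.1f}B/year")
```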

Source
Jukka Niiranen notes that:
Microsoft is today promoting the pay-as-you-go pricing model of Copilot Studio as the preferred sales motion. The list price of one message is $0.01. While enterprise clients may get discounts, there’s also the chance of prepaid message capacity being unused, so things may even out. With this price point, Copilot Studio usage generates $2.5M revenue per month, and $30M per year.
So Microsoft is processing about 3B messages/year. It needs adoption to be so fast that next year's revenue will be around 1,100 times its current rate. They will need next year's customers to generate about 3.3T messages/year.
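Working only from the $0.01 list price and the $30M/year run rate quoted above, the implied message volume and required growth fall out directly:

```python
# Implied Copilot Studio message volume and required growth,
# from the quoted list price and revenue figures.
price_per_message = 0.01   # $ list price per message
annual_revenue = 30e6      # $30M/year current run rate
target_revenue = 33e9      # $33B/year needed to cover the $10B/year cost

messages_per_year = annual_revenue / price_per_message  # ~3 billion
growth_factor = target_revenue / annual_revenue         # ~1,100x
target_messages = messages_per_year * growth_factor     # ~3.3 trillion

print(f"{messages_per_year:.1e} messages/yr now; need {growth_factor:.0f}x "
      f"growth to {target_messages:.1e} messages/yr")
```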

How is adoption going? Jukka Niiranen notes that:
160k organizations using Copilot, this translates to around 1.5K messages per org per month. Or 52 messages per day. Now, we have to remember that one action in a Copilot Studio agent often consumes more than one message. ...

If those 52 messages were only about regular GenAI usage without any business process logic, that would mean 26 responses from Copilot Studio agents per day. If they were to include things like agent actions (meaning, AI does something more than just chatting back at you) or AI tools, we’re quickly at a point where the average Copilot Studio customer organization does a couple of agent runs per day.

This is shockingly low. It is plain and obvious that most customers are merely experimenting with trying to build agents. Hardly anyone is running it in production yet. Which wouldn’t be that bad if this was a new 2025 product. But Copilot Studio has been out since November 2023.
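Niiranen's per-organization figures can be sanity-checked against the roughly 3B messages/year implied by the revenue numbers:

```python
# Sanity-check the per-organization usage figures against the
# ~3B messages/year implied by Copilot Studio's revenue run rate.
messages_per_year = 3e9
organizations = 160_000

per_org_per_month = messages_per_year / organizations / 12  # ~1,560
per_org_per_day = per_org_per_month / 30                    # ~52

print(f"{per_org_per_month:.0f} messages/org/month, "
      f"{per_org_per_day:.0f} messages/org/day")
```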
The back of my envelope says that Microsoft's AI business needs to grow customers faster than any business (even OpenAI) ever has if it is not to be a huge drag on the bottom line.

If this were a traditional technology business with very strong economies of scale, growing customers incredibly fast would be good, because the incremental revenue from each new customer vastly outweighs the incremental cost of supporting them. This is where Microsoft's 70% gross margin comes from.

OpenAI lost $5B on $4B in revenue, so each $1 of revenue cost them $2.25. Ed Zitron had a more detailed estimate:
To be abundantly clear, as it stands, OpenAI currently spends $2.35 to make $1.
Let's assume Microsoft is doing better, with each $1 in revenue costing $1.50. But, as James O'Donnell and Casey Crownhart report in We did the math on AI’s energy footprint. Here’s the story you haven’t heard:
As conversations with experts and AI companies made clear, inference, not training, represents an increasing majority of AI’s energy demands and will continue to do so in the near future. It’s now estimated that 80–90% of computing power for AI is used for inference.
If we assume, unrealistically, that training is a one-time cost and they don't need to retrain for next year, training cost them, say, 15% of $45M, or about $6.75M, and answering the 3B messages cost them $38.25M. Scaling up by a factor of 1,100 means answering the messages would cost them $42B plus the $10B depreciation, so $52B. But it would only generate $33B in revenue, so each $1 of revenue would cost about $1.58. Scaling up would make the losses worse.
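The scaling argument in the paragraph above can be written out. The 15/85 training/inference split follows the MIT Technology Review estimate, and treating training as a one-time cost is the deliberately unrealistic assumption stated above:

```python
# Scaling the assumed cost structure: $1.50 of cost per $1 of revenue
# today, with ~15% of compute for training (treated as one-time) and
# ~85% for inference, which must scale with message volume.
current_revenue = 30e6
cost_ratio = 1.50
total_cost = current_revenue * cost_ratio  # $45M
training_cost = 0.15 * total_cost          # ~$6.75M, not repeated at scale
inference_cost = 0.85 * total_cost         # ~$38.25M for ~3B messages

growth_factor = 1100
scaled_inference_b = inference_cost * growth_factor / 1e9  # ~$42B
depreciation_b = 10.0                                      # $10B/year from above
scaled_cost_b = scaled_inference_b + depreciation_b        # ~$52B
scaled_revenue_b = 33.0                                    # $33B/year
cost_per_dollar = scaled_cost_b / scaled_revenue_b         # ~$1.58

print(f"Scaled cost: ${scaled_cost_b:.0f}B vs ${scaled_revenue_b:.0f}B "
      f"revenue -> ${cost_per_dollar:.2f} per $1 of revenue")
```

Note that the loss ratio worsens from $1.50 to about $1.58 per $1 of revenue even with training costs excluded entirely.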

There are only two possibilities. Either inference gets at least an order of magnitude cheaper than training instead of around 6 times more expensive, or the price of using AI goes up by at least an order of magnitude. Now you see why Sam Altman et al are so desperate to run the "drug-dealer's algorithm" (the first one's free) and get the world hooked on this drug so they can supply a world of addicts.

2 comments:

Arthur said...

I don't see why a 10x reduction in per token/message inference costs is in any way unlikely given the reduction in costs seen so far?

David. said...

What matters is the cost of inference relative to the cost of training. Technology improvements reduce the absolute cost of both, not necessarily the relative cost.