Thursday, July 20, 2017

Patting Myself On The Back

Cost vs. Kryder rate
I started working on economic models of long-term storage six years ago, and quickly discovered the effect shown in this graph. It plots the endowment, the money which, deposited with the data and invested at interest, pays for the data to be stored "forever", as a function of the Kryder rate, the rate at which $/GB drops with time. As the rate slows below about 20%,  the endowment needed rises rapidly. Back in early 2011 it was widely believed that 30-40% Kryder rates were a law of nature, they had been that way for 30 years. Thus, if you could afford to store data for the next few years you could afford to store it forever

2014 cost/byte projection
As it turned out, 2011 was a good time to work on this issue. That October floods in Thailand destroyed 40% of the world's disk manufacturing capacity, and disk prices spiked. Preeti Gupta at UC Santa Cruz reviewed disk pricing in 2014 and we produced this graph. I wrote at the time:
The red lines are projections at the industry roadmap's 20% and a less optimistic 10%. [The graph] shows three things:
  • The slowing started in 2010, before the floods hit Thailand.
  • Disk storage costs in 2014, two and a half years after the floods, were more than 7 times higher than they would have been had Kryder's Law continued at its usual pace from 2010, as shown by the green line.
  • If the industry projections pan out, as shown by the red lines, by 2020 disk costs per byte will be between 130 and 300 times higher than they would have been had Kryder's Law continued.
Backblaze average $/GB
Thanks to Backblaze's admirable transparency, we have 3 years more data. Their blog reports on their view of disk pricing as a bulk purchaser over many years. It is far more detailed than the data Preeti was able to work with. Eyeballing the graph, we see a 2013 price around 5c/GB and a 2017 price around half that. A 10% Kryder rate would have meant a 2017 price of 3.2c/GB, and a 20% rate would have meant 2c/GB, so the out-turn lies between the two red lines on our graph. It is difficult to make predictions, especially about the future. But Preeti and I nailed this one.

This is a big deal. As I've said many times:
Storage will be
Much less free
Than it used to be
The real cost of a commitment to store data for the long term is much greater than most people believe, and there is no realistic prospect of a technological discontinuity that would change this.

1 comment:

blissex said...

As usual many thanks for the interesting data and insights.

Most "hockey stick" curves are actually the initial segment of an S-curve. And indeed for cost-per-TB that is becoming clear.

As to me, I also deal with archives, but also with warmer data, and my pet-peeve is with cost-per-IOPS-per-TB, as my curse is often to have to decatastrophize storage systems built with ridiculously low IOPS-per-TB of total capacity.
Unfortunately cost-per-IOPS-per-TB has not been falling anything like Kryder's Law...