A year ago at CNI I described the early stages of the effort to develop an economic model of long-term storage. We haven't made as much progress in the past year as we hoped, primarily because the more we study the problem the more complex it becomes. However, we have managed to take a good hard look at "affordable cloud storage", which has been a hot topic for that whole time. There have been some major developments in the cloud storage market in the last few months, and our model provides some significant insights into them.
This talk describes joint work with Daniel Rosenthal (no relation), Ethan Miller and Ian Adams of UC Santa Cruz, Erez Zadok of Stony Brook University and Mark Storer of NetApp.
I'll start by reviewing the model we are building, looking at the part storage costs are thought to play in the overall cost of digital preservation. The assumption underlying that thought is that storage costs will continue to drop exponentially. A year ago I expressed skepticism that the drop would be as fast in the future as in the past; since then this skepticism has become mainstream in the industry.
Many people believe that cloud storage, through the magic of economies of scale and competition, will ensure that storage prices continue to drop rapidly. We used our model to compare S3, Amazon's cloud storage service, to the cost of do-it-yourself storage. We studied the pricing history of cloud storage. We ran an experimental LOCKSS box in Amazon's cloud and carefully tracked the costs we incurred. In August Amazon introduced Glacier, a product aimed specifically at long-term storage. Two weeks ago Google cut the price of their storage service, and Amazon immediately responded by cutting their prices. Last week Microsoft matched Amazon's price cut. We used our model to analyze these developments.
One thing that most people actually trying to preserve large amounts of digital content agree on is that the number one problem they face is not technical but economic. Unlike paper, bits are very vulnerable to interruptions in the money supply. To survive, or in the current jargon to "be sustainable", a digital collection needs an assured stream of funds for the long term. Very few have it. You can tell people are worried about a topic when they appoint a "blue ribbon task force" to study it. We had such a task force. It reported two years ago that, yes, sustainable economics was a big problem, but it conspicuously failed to come up with credible solutions. This concern has motivated a good deal of research into the costs of digital preservation, efforts such as CMDP, LIFE, KRDS, PrestoPrime, ENSURE, and others. Their conclusions differ, but broadly we can say that typically about half the total cost is ingest, about one-third is preservation, mostly storage, and about one-sixth is dissemination.
It is easy to understand why ingesting content is expensive, at least it is easy if you have ever tried to do it on a production scale. There is a lot of stuff to ingest. In the real world it is diverse and messy. People want not just the content, but also metadata. This has to be either manually generated, which is expensive, or extracted automatically, which is a great way of revealing the messy nature of the real world. It is easy to understand why disseminating content is a small part of the total, because preserved content is, on average, very rarely disseminated. Why is storage, an on-going cost that must be paid for the life of the collection, such a small part of the total?
Kryder's Law says that the areal density of bits on disk platters has increased 30-40%/year for the last 30 years. Areal density doesn't have a one-to-one relationship with the cost per GB of disk, but they are closely correlated. The effect has been, for the last 30 years, that consumers got roughly double the storage at the same price every two years or so.
If something goes on steadily for 30 years or so it gets built into people's models of the world. For digital preservation, the model of the world into which it gets built is that, if you can afford to store something for a few years, you can afford to store it forever. The price per byte of the storage will have become negligible. Thus, the breakdown that has storage costs being one-third of the total has built into it the idea that storage media costs drop so fast that the one-third has only to pay for a few years of storage.
There are three different business models for long-term storage:
- It can be rented, as for example with Amazon's S3 which charges an amount per GB per month.
- It can be monetized, as with Google's Gmail, which sells ads against your accesses to your e-mail.
- Or it can be endowed, as with Princeton's DataSpace, which requires data to be deposited together with a capital sum thought to be enough to fund its storage "for ever".
If you believe that storage media costs will continue to drop at 30-40%/year, you need only project these costs a few years out. The slower the rate of storage media cost drop, the longer ahead you need to project things like storage technology, power costs, staff costs and interest rates.
We are working on building an economic model of long-term storage. First, we built two prototype models, one short-term and one long-term. The first models a unit of storage capacity, as it might be a shelf of a filer, through which media of increasing capacity flow. The second models a chunk of data through time as it migrates from one generation of storage media to its successors. The goal is to compute the endowment, the capital needed to fund the chunk's preservation for, in our case, 100 years.
The price per byte of each media generation is set by a Kryder's Law parameter. Each technology also has running costs, and costs for moving in and moving out. Interest rates are set each year using a model based on the last 20 years of inflation-protected Treasuries. I should add the caveat that this is a still a prototype, so the numbers it generates should not be relied on. But the shapes of the graphs seem highly plausible.
As expected, it is an S-curve. If the endowment is too low, running out of money is certain. If it is large enough, survival is certain. One insight from this graph is that the transition from 0% to 100% survival happens over about 10% of the endowment value. The 25% Kryder rate dominates the much lower interest rates.
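The long-term model can be sketched in a few lines of code. This is an illustrative simplification, not our actual prototype: it assumes media are replaced every few years, each generation's purchase price drops at a constant Kryder rate, annual running costs are a fixed multiple of the purchase price, and the endowment earns a constant real interest rate. All the specific numbers below are placeholders.

```python
# Minimal sketch of a deterministic endowment calculation for one chunk
# of data over 100 years. All parameter values are illustrative
# assumptions, not figures from our prototype.

def endowment(price_per_tb=100.0, kryder=0.25, interest=0.01,
              running_frac=2.0, media_life=4, years=100):
    """Present value of all future storage costs for 1 TB."""
    total = 0.0
    for year in range(years):
        price = price_per_tb * (1 - kryder) ** year   # this year's media price
        cost = 0.0
        if year % media_life == 0:
            cost += price                             # buy a new media generation
        cost += price * running_frac / media_life     # running costs, spread out
        total += cost / (1 + interest) ** year        # discount to present value
    return total

print(f"Endowment at 25% Kryder: ${endowment():.0f}/TB")
print(f"Endowment at  7% Kryder: ${endowment(kryder=0.07):.0f}/TB")
```

Even this toy version shows the key behavior: because the 25% Kryder rate dominates the ~1% real interest rate, lowering the Kryder rate sharply raises the endowment.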
The first talk I gave with a detailed description of our research into the economics of long-term storage was at the Library of Congress in September 2011. It raised the possibility that the 30-year history of media price drops might not continue, and this got a lot of push-back. People pointed to the poor history of predictions that Moore's Law would slow down.
About 6 weeks later I spoke at CNI. Thailand was struggling to recover from the fourth most costly natural disaster in history, the floods that overwhelmed much of the country and in particular took out about 40% of the world's capacity to manufacture disk drives. This natural disaster showed that Kryder's Law was not a law of nature. Cost per byte roughly doubled and, a year later, has still not returned to its pre-flood level. Fortunately, this disaster meant that I no longer had to argue with people saying that the cost of storage was guaranteed to drop 40%/year.
In the good old days when Kryder's Law ruled the land, the disk drive industry was very competitive. There were a large number of manufacturers, competing fiercely on price. This kept their margins low, which meant that over time the weaker competitors could not sustain themselves and were taken over by the stronger. Even before the floods, there were only two and a half competitors left. Seagate and Western Digital had almost half the market each, with a small percentage left for Toshiba.
Despite this, their margins were miserable; 3% and 6% respectively for the quarter before the floods. Two quarters later, their margins were 37% and 16%. Another effect of the floods and the consolidation of the industry was that warranties for the kinds of disks used for long-term storage were drastically reduced, from 3 to 2 years by WD and from 5 to 1 by Seagate. Complex accounting effects mean that reducing warranties can have a big effect on a company's bottom line. In effect, it transfers risk and the associated cost to the customer.
It is easy to dismiss the floods as a temporary glitch in the Kryder's Law exponential curve. It is a bit less easy to argue that a market with only two suppliers will be as competitive, and thus have as low margins, as one with many suppliers. But despite the skeptics in the Library of Congress audience, it was already true by September 2011 that the disk industry had fallen off the Kryder curve. There is a very fundamental reason why it was doomed to fall off the curve.
It is always tempting to think that exponential curves will continue, but in the real world they are always just the steep part of an S-curve.
Note how Dave's graph shows PMR being replaced by Heat Assisted Magnetic Recording (HAMR) starting in 2009. No-one has yet shipped HAMR drives. If we had stayed on the Kryder's Law curve we should have had 4TB 3.5" SATA drives in 2010. Instead, in late 2012 the very first 4TB drives are just hitting the market. What happened?
The transition from PMR to HAMR turned out to be far harder and much more expensive than the industry expected. HAMR works by using a tiny laser to heat up a very small spot on the disk so that a larger magnetic head will write only the heated spot. Manufacturing the lasers and getting the laser and the magnet to work together is very, very difficult. So the industry turned to desperate measures to stretch PMR into a 6th generation. They came up with the idea of shingled writes, in which the tracks on the disk are so close together that writing one track partly overlays the adjacent one. Then fearsomely complex signal processing is used on a read to disentangle the overlapping tracks. The problem with this, and the reason why shingled writes haven't been widely adopted, is that operating systems expect to be able to write a disk randomly. With shingled writes a disk becomes, in effect, an append-only device.
Tom Coughlin, a respected storage industry analyst, had a blog post on Forbes recently entitled Peak Disk commenting on the fact that unit shipments of disk have been decreasing for at least a couple of quarters. This triggered a lot of discussion on a storage mail list I'm on. Summarizing the discussion, there appear to be six segments of the hard disk business, of which four are definitely in decline:
- Enterprise drives: Flash is killing off the storage systems that enterprises used to use for performance-critical data. These were based around buying a lot of very fast, relatively small capacity and expensive disks and using only a small proportion of the tracks on them. This reduced the time spent waiting for the head to arrive at the right track, and thus kept performance high albeit at an extortionate cost. Flash is faster and cheaper.
- Desktop drives: Laptops and tablets are killing off desktop systems, so the market for the 3.5" consumer drives that went into them is going away.
- Consumer embedded: Flash and the shift of video recorder functions to the network have killed this market for hard disks.
- Mobile: Flash has killed this market for hard disks.
- Enterprise SATA: Public and private cloud storage systems are growing and hungry for the highest GB per $ drives available, but the spread of deduplication and the arrival of 4TB drives will likely slow the growth of unit shipments somewhat.
- Branded drives: This market is mostly high-capacity external drives and SOHO NAS boxes. Cloud storage constrains its growth, but the bandwidth gap means that it has a viable niche.
Although disk accounts for about 70% of all bytes of storage produced each year, both tape and solid state are alternatives for preservation:
- Tape's recording technology lags about 8 years behind disk; it is unlikely to run into the problems plaguing disk for some years. We can expect its relative cost advantage over disk to grow in the medium term.
- Flash memory's advantages, including low power, physical robustness and low access latency, have overcome its higher cost per byte in many markets, such as tablets and servers. Properly exploited, these advantages could lower running costs enough to justify its use for long-term storage too. But analysis by Mark Kryder and Chang Soo Kim at Carnegie Mellon is not encouraging about the prospects for flash and the range of alternative solid state technologies beyond the end of the decade.
Actually, there is a much simpler argument than all this stuff about margins and technologies and market segments to show that the current mindset about storing data has been overtaken by events. Bill McKibben's Rolling Stone article Global Warming's Terrifying New Math uses three numbers to illustrate the looming crisis. Here are three numbers that illustrate the looming crisis in the cost of long-term storage:
- According to IDC, the demand for storage each year grows about 60%.
- According to IHS iSuppli, the bit density on the platters of disk drives will grow no more than 20%/year for the next 5 years.
- According to computereconomics.com, IT budgets in recent years have grown between 0%/year and 2%/year.
If you're in the digital preservation business, storage is already way more than 5% of your IT budget. It's going to consume 100% of the budget in much less than 10 years. We need to close the widening gap between storage demand and storage budgets through some combination of an increase in the IT budget by orders of magnitude, a radical reduction in the rate at which we store new data, or a radical reduction in the cost of storage.
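The crunch follows directly from compounding the three numbers. Here is a rough projection under the assumptions that storage starts at 10% of the IT budget (an illustrative figure, not from any survey) and that the budget is flat:

```python
# Rough projection from the three numbers above: demand grows 60%/yr,
# cost per byte falls 20%/yr, the IT budget is flat, so the storage
# share of the budget grows by a factor of 1.6/1.2 each year.
# The 10% starting share is an illustrative assumption.

demand_growth, kryder, share = 1.60, 1.20, 0.10
years = 0
while share < 1.0:
    share *= demand_growth / kryder
    years += 1
print(f"Storage consumes the whole IT budget in {years} years")
```

Starting from a 10% share, the whole budget is consumed in nine years; starting from a larger share, as many preservation operations do, the crunch arrives correspondingly sooner.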
A number of the sites that run LOCKSS boxes under the auspices of the Library of Congress' NDIIPP program asked whether they could use "affordable cloud storage" instead of local disk. At first I thought this was a technical question, and spent some time analyzing the various possible ways of using cloud storage for a LOCKSS box, and prototyping them.
But pretty soon I realized I was asking the wrong question. The question should have been "does using affordable cloud storage for my LOCKSS box save money?" So Daniel Vargas and I ran a LOCKSS box at Amazon for 14 weeks and carefully logged the costs. Under our contract with the Library we had neither the time nor the money to do so at a fully realistic scale or duration, so our results had to be scaled up on both axes. But to summarize our experience, we believe that running the median-sized LOCKSS box at Amazon for 3 years would incur charges between $9K and $19K, significantly higher than the purchase cost of the corresponding LOCKSS box (about $1.5K) plus the power, cooling and network costs it would incur. Our experience suggests that staff costs would be roughly the same between local and cloud-hosted boxes. The details are in a report to the Library of Congress, which funded the work.
This is but one data point, we need the more comprehensive view of our model. We need a baseline, the cost of local storage, to compare with cloud costs. It should lean over backwards to be fair to cloud storage. I don't know of a lot of good data to base this on; I use numbers from Backblaze, a PC backup service which publishes detailed build and ownership costs for their 4U storage pods, which using 3TB drives and RAID6 can store about 117TB. I take their 2011 build cost, and current one-off retail prices for 3TB drives. Based on numbers from San Diego Supercomputer Center and Google, I add running costs so that the hardware cost is only 1/3 of the total 3-year cost of ownership. Note that this is much more expensive than Backblaze's published running cost. I add move-in and move-out costs of 20% of the purchase price in each generation. Then I multiply the total by three to reflect three geographically separate replicas.
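The baseline build-up can be made concrete with a short calculation. The chassis and drive prices below are illustrative placeholders, not Backblaze's actual figures; the structure (running costs making hardware one-third of the 3-year total, 20% migration costs in and out, three replicas) follows the description above.

```python
# Sketch of the local-storage baseline for one 117TB storage pod.
# Chassis and drive prices are assumed placeholder values.
n_drives = 45                            # drives per Backblaze-style 4U pod
drive_price = 130.0                      # assumed one-off retail 3TB price
chassis = 2000.0                         # assumed non-drive pod build cost

hardware = chassis + n_drives * drive_price
tco_3yr = hardware * 3                   # running costs make hardware 1/3 of TCO
migration = 0.20 * hardware * 2          # 20% move-in plus 20% move-out
per_replica = tco_3yr + migration
total = 3 * per_replica                  # three geographically separate replicas
print(f"3-year cost for 3 replicas of 117TB: ${total:,.0f}")
```

Note how little of the final figure is raw drive cost; the deliberately generous running-cost and replication assumptions are what lean over backwards to be fair to cloud storage.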
In the past, with Kryder rates in the 30-40% range, we were in the flatter part of the graph, where the precise Kryder rate wasn't that important in predicting the long-term cost. As Kryder rates decrease, we move into the steep part of the graph, which has two effects:
- The cost increases sharply.
- The cost becomes harder to predict, because it depends strongly on the precise Kryder rate.
Here is a table showing the history of the prices for several major cloud storage services since their launch. Their price-drop rates are way below the Kryder rate of the underlying disks over the same period, except for Google, who introduced their service at an unsustainable price point way above Amazon's, and had to cut it quickly.
[Table: for each major cloud storage service, its December 2012 price (c/GB/mo) and the annual percentage price drop since launch.]
If local storage's Kryder rate matches IHS' projected 20% the 117TB needs an endowment of $128K. If S3's rate is their historic 7% it needs an endowment of $1007K. The endowment needed in S3 is more than 7 times larger than in local storage, and depends much more strongly on the Kryder rate. This raises two obvious questions.
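The strong dependence on the Kryder rate has a simple explanation: the endowment is a geometric sum whose ratio is set by the Kryder rate against the interest rate. A minimal sketch, assuming a constant annual cost that drops at rate k and a 1% real interest rate:

```python
# Why the endowment depends so strongly on the Kryder rate: with annual
# cost c dropping at rate k and real interest rate r, the endowment is
# c * sum(((1-k)/(1+r))**n), which grows sharply as k approaches zero.
# The 1% interest rate and 100-year horizon are assumptions.

def endowment_per_dollar(kryder, interest=0.01, years=100):
    return sum(((1 - kryder) / (1 + interest)) ** n for n in range(years))

for k in (0.40, 0.20, 0.07, 0.0):
    print(f"Kryder {k:4.0%}: endowment = "
          f"{endowment_per_dollar(k):5.1f}x first-year cost")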
First, why don't S3's prices drop as the cost of the underlying storage drops? The answer is that they don't need to, because their customers are locked in. Bandwidth charges make it very expensive to move data from S3 to a competitor. To get our 117TB out of S3's RRS in a month, the bandwidth charges alone would come to about six weeks' storage, plus three weeks' storage for the data that hadn't yet been moved, plus three weeks' storage at the competitor. In practice, storage charges during the move would be significantly higher, since you wouldn't delete anything from S3 until you had confirmed that it had arrived safely. The compute charges for the on-arrival integrity checks need to be added in, as do the staff costs for the move and the service disruption it would cause.
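The exit-cost arithmetic is easy to reproduce. The prices below are placeholder assumptions in the spirit of late-2012 list prices, not quoted Amazon rates:

```python
# Back-of-the-envelope for the lock-in claim. Both prices are assumed
# placeholder values, not Amazon's published rates.
tb = 117
storage_per_gb_mo = 0.08      # assumed RRS storage price, $/GB/mo
egress_per_gb = 0.12          # assumed outbound bandwidth price, $/GB

gb = tb * 1000
monthly_storage = gb * storage_per_gb_mo
egress = gb * egress_per_gb
weeks_of_storage = egress / (monthly_storage / 4.33)
print(f"Monthly storage bill: ${monthly_storage:,.0f}")
print(f"One-time egress bill: ${egress:,.0f} "
      f"(~{weeks_of_storage:.1f} weeks' storage)")
```

Under these assumptions the bandwidth bill alone is around six weeks' worth of storage charges, before counting double-billed storage during the move, integrity-check compute, and staff time.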
And for what? No major competitor charges less than S3; I don't know of any that is enough cheaper to make moving an economic win. Why is this? Amazon has the vast majority of the market; Marc Andreessen estimated some time ago that it was over 90%. Thus it has the industry's lowest cost base.
Earlier this year Rackspace lowered prices so that for small users they were a bit cheaper than Amazon. Amazon didn't budge; they didn't see Rackspace as a significant competitor. For large users Amazon was still cheaper. Two weeks ago Google, who had initially priced their service well above Amazon, decided to get serious about competing and undercut S3 significantly. Two days later Amazon matched Google's prices. This signals that Google is a significant competitor because their pockets are deep. So is Microsoft, which last week cut prices to match Amazon. No-one smaller is significant. There's no reason for Amazon, Google and Microsoft to fight to the death to monopolize storage, so don't expect these initial shots to turn into a full-scale war.
Second, why is S3 so much more expensive than local storage? After all, even using S3's Reduced Redundancy Storage to store 117TB, you would pay in the first month almost enough to buy the hardware for one of Backblaze's storage pods.
The answer is that, for the vast majority of S3's customers, it isn't that expensive. First, they are not in the business of long-term storage. Their data has a shelf-life much shorter than the life of the drives, so they cannot amortize across the full life of the media. Second, their demand for storage has spikes. By using S3, they avoid paying for the unused capacity to cover the spikes.
Long-term storage has neither of these characteristics, and this makes S3's business model inappropriate for long-term storage. Amazon recently admitted as much when they introduced Glacier, a product aimed specifically at long-term storage, with headline pricing between 5 and 10 times cheaper than S3.
To make sure that Glacier doesn't compete with S3, Amazon gave it two distinguishing characteristics. First, there is an unpredictable delay between requesting data and getting it. Amazon says this will average about 4 hours, but they don't commit to either an average or a maximum time. Second, the pricing for access to the data is designed to discourage access:
- There is a significant per-request charge, to motivate access in large chunks.
- Although you are allowed to access 5% of your data each month with no per-byte charge, the details are complex and hard to model, and the cost of going above your allowance is high.
To estimate what Glacier would cost for long-term preservation, our model makes access assumptions as favorable to Glacier as possible:
- No accesses to the content other than for integrity checks.
- Accesses to the content for integrity checking are generated at a precisely uniform rate.
- Each request is for 1GB of content.
- One reserved AWS instance used for integrity checks.
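Under those assumptions, the integrity-check schedule that stays inside Glacier's free retrieval allowance is easy to derive. The per-request price below is a placeholder assumption, not Amazon's published rate:

```python
# Sketch of the integrity-check schedule under the assumptions above.
# The per-request price is an assumed placeholder, not a quoted rate.
tb = 117
free_fraction_per_month = 0.05            # free retrieval allowance per month
request_chunk_gb = 1                      # each request fetches 1GB
request_price = 0.05 / 1000               # assumed charge per request

months_per_full_check = 1 / free_fraction_per_month   # full check every 20 months
gb_per_month = tb * 1000 * free_fraction_per_month    # stay inside the allowance
requests_per_month = gb_per_month / request_chunk_gb
print(f"Full check every {months_per_full_check:.0f} months, "
      f"{requests_per_month:,.0f} requests/mo "
      f"(${requests_per_month * request_price:.2f}/mo in request charges)")
```

Retrieving only the free 5% each month stretches a complete integrity check of the collection out to 20 months, far less frequent than local storage allows.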
But this is not an apples-to-apples comparison. Both local storage and S3 provide adequate access to the data. Glacier's long latency and severe penalties for unplanned access mean that, except for truly dark archives, it isn't feasible to use Glacier as the only repository. Even for dark archives, Glacier's access charges provide a very powerful lock-in. Getting data out of Glacier to move it to a competitor in any reasonable time-frame would be very expensive, easily as much as a year's storage.
Providing adequate access to justify preserving the content, and avoiding getting locked-in to Amazon, requires maintaining at least one copy outside Glacier. If we maintain one copy of our 117TB example in Glacier with 20-month integrity checks experiencing a 7% Kryder rate, and one copy in local storage experiencing a 20% Kryder rate (instead of the three in our earlier local storage examples), the endowment needed would be $223K. The endowment needed for three copies in local storage at a 20% Kryder rate would be $128K.
Replacing two copies in local storage with one copy in Glacier would thus increase costs substantially, from $128K to $223K. Its effect on robustness would be mixed, with four versus three total copies (effectively triplicated in Glacier, plus local storage) and greater system diversity, but at the cost of less frequent integrity checks.
Many people assume that Glacier is tape. I believe that Amazon designed Glacier so that they could implement it using tape, but that they don't need to. Glacier is priced so that they can use the same storage infrastructure as S3. Consider a 3TB drive that is in service for 4 years. It generates $480 in income. Suppose Amazon buys drives at 20% less than retail, or $80, adds $45 in server cost per drive, and spends $80/year per drive on power, cooling, space, and other costs. This would be well above what Backblaze reports. Then over 4 years the drive makes $35 in gross profit. Not much, but Amazon is willing, as they were with S3, to price low initially to capture the bulk of the market and wait for Kryder's Law to grow their margins through time.
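The drive economics above can be spelled out. The assumption linking the figures (which the text's $480 is consistent with) is that triple replication leaves each 3TB drive holding about 1TB of billed data at Glacier's $0.01/GB/mo:

```python
# The drive-economics arithmetic from the paragraph above. The 1TB
# billed-data-per-drive figure assumes triple replication of customer
# data across 3TB drives; cost figures follow the text.
billed_gb = 1000
price_per_gb_mo = 0.01
months = 48
income = billed_gb * price_per_gb_mo * months   # $480 over 4 years

drive = 80.0            # drive bought at 20% below an assumed $100 retail
server = 45.0           # server cost per drive
running = 80.0 * 4      # $80/year for power, cooling, space, etc.
profit = income - (drive + server + running)
print(f"Income ${income:.0f}, costs ${drive + server + running:.0f}, "
      f"gross profit ${profit:.0f} per drive")
```

A thin margin today, but one that Kryder's Law (even a slowed one) would widen with each cheaper drive generation while Glacier's price stays put.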
In conclusion, it is clear that going forward the cost of storage will be a much bigger part of the overall cost of digital preservation than it used to be. This is partly because we are accumulating more and more stuff to store, but also that the comfortable assumption that the cost per byte of storage would drop rapidly is no longer tenable.
It is pretty clear that commercial storage services like S3 are simply too expensive to use for digital preservation. Their prices don't drop as fast as the underlying cost of hardware. The reasons for this are mostly business rather than technical. This may be an argument for private cloud storage services, but only if they can be organized to exploit Kryder's Law fully.
Access and lock-in considerations make it very difficult for a digital preservation system to use Glacier as its only repository. Even with generous assumptions, it isn't clear that using Glacier to replace all but one local store reduces costs or enhances overall reliability. Systems that combine Glacier with local storage, or with other cloud storage systems, will need to manage accesses to the Glacier copy very carefully if they are not to run up large access costs.
I'll leave you with this thought. You, or at least I, often hear people say something similar to what Dr. Fader of the Wharton School Customer Analytics Initiative attributes to Big Data zealots: "Save it all - you never know when it might come in handy for a future data-mining expedition." Clearly, the value that could be extracted from the data in the future is non-zero, but even the Big Data zealot believes it is probably small. The reason the Big Data zealot gets away with saying things like this is because he, and his audience, believe that this small value outweighs the cost of keeping the data indefinitely. They believe that because they believe Kryder's Law will continue.
Let's imagine that everyone thought that way, and decided to keep everything forever. The natural place to put it would be in S3's "affordable cloud storage". According to IDC, in 2011 the world stored 1.8 Zettabytes (billion TB) of data. If we decided to keep it all for the long term in the cloud, we would be effectively endowing it. How big would the endowment be?
Applying our model, starting with S3's current highest-volume price of $0.055/GB/mo and assuming that price continues to drop at the 10%/yr historic rate for S3's largest tier, we need an endowment of about $6.3K/TB. So the net present value of the cost of keeping all the world's 2011 data in S3 would be about $11.4 trillion. The 2011 Gross World Product (GWP) at purchasing power parity is almost $80 trillion. So keeping 2011's data would consume 14% of 2011's GWP. The world would be writing S3 a check each month of the first year for almost $100 billion, unless the world got a volume discount.
IDC estimates that 2011's data was 50% larger than 2010's; I believe their figure for the long-run annual growth of data is 57%/yr. Even if it is only 50%, compare that with even the most optimistic Kryder's Law projections of around 30%. But we're using S3, and a 10% rate of cost decrease. So 2012's endowment will be (50-10)=40% bigger than 2011, and so on into the future. The World Bank estimates that in 2010 GWP grew 5.1%. Assuming this growth continues, endowing 2012's data will consume 19% of GWP.
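Compounding these trends forward is a one-liner. Using the paragraph's own approximations (endowment growth of roughly 40%/yr, GWP growth of 5.1%/yr, starting from 14% of GWP in 2011):

```python
# Projection of the trend from the last two paragraphs: the endowment
# for each year's data grows ~40%/yr (50% data growth less S3's 10%
# price drop) while GWP grows 5.1%/yr, from a 14%-of-GWP start in 2011.
share, year = 0.14, 2011
while share <= 1.0:
    year += 1
    share *= 1.40 / 1.051
print(f"Endowing {year}'s data would consume {share:.0%} of that year's GWP")
```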
On these trends, endowing 2018's data will consume more than the entire GWP for the year. So, we're going to have to throw stuff away. Even if we believe keeping stuff is really cheap, it's still too expensive. The bad news is that deciding what to keep and what to throw away isn't free either. Ignoring the problem incurs the costs of keeping the data; dealing with the problem incurs the costs of deciding what to throw away. We may be in the bad situation of being unable to afford either to keep or to throw away the data we generate. Perhaps we should think more carefully before generating it in the first place. Of course, thought of that kind isn't free either ...