Friday, November 30, 2012

Updating "Cloud vs. Local Storage Costs"

A number of things have changed since I wrote my "Cloud vs. Local Storage Costs" post in June that impact the results:
  • Amazon reduced S3 prices somewhat. As of December 1st, in response to a cut by Google, there will be a further cut.
  • Disk prices have continued their slow recovery from the Thai floods.
  • 4TB SATA drives are in stores now, albeit at high prices.
  • Michael Factor pointed out that I hadn't correctly accounted for RAID overhead in my original calculation.
Below the fold is some discussion of the Amazon vs. Google price war, and a recalculated graph.
For some time I've been pointing out the lack of competition in the storage service market. S3 has had both dominant market share and price leadership among the major players. This may be changing. Earlier this year, Rackspace cut prices so that, for small customers, they were less than S3's. S3 didn't respond, which indicates that they don't see Rackspace as a serious threat. But when Google cut prices, S3 instantly responded. Comparing the main services in the Eastern US:
Up to (TB)   S3 (c/GB/mo)   Google (c/GB/mo)   Rackspace (c/GB/mo)
1            9.5            9.5                10
10           8              8.5                10
50           8              7.5                10
100          7              7.5                10
400          7              7                  10
500          7              ask                10
1000         6.5            ask                10
5000         6              ask                10
For the first time this year, Amazon has major competitors undercutting its prices, if only marginally. As we have documented, access costs and bandwidths provide powerful lock-in effects, so these price differentials are unlikely to cost S3 much market share. Comparing access charges up to 10TB/mo:
  • Google: 11c/GB plus 1c/10,000 requests
  • S3: 12c/GB plus 1c/10,000 requests
  • Rackspace: 18c/GB
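To give a feel for the size of these access charges, here is a rough Python sketch that prices pulling 10TB out of each service in a month. The rates are the ones listed above. Everything else is an assumption for illustration: decimal terabytes (1TB = 1000GB), one million requests, and no per-request charge for Rackspace, since none is listed.

```python
# Access charges listed above, in cents (up to 10TB/mo).
ACCESS_RATES = {
    "Google": {"per_gb": 11.0, "per_10k_requests": 1.0},
    "S3": {"per_gb": 12.0, "per_10k_requests": 1.0},
    # Assumption: no per-request charge, because none is listed above.
    "Rackspace": {"per_gb": 18.0, "per_10k_requests": 0.0},
}

GB_PER_TB = 1000  # assumption: decimal terabytes


def access_cost_dollars(service, terabytes, requests):
    """Cost in dollars of reading `terabytes` of data using `requests` requests."""
    rates = ACCESS_RATES[service]
    transfer_cents = terabytes * GB_PER_TB * rates["per_gb"]
    request_cents = requests / 10_000 * rates["per_10k_requests"]
    return (transfer_cents + request_cents) / 100


if __name__ == "__main__":
    # Hypothetical workload: 10TB read in one month via one million requests.
    for name in ACCESS_RATES:
        print(f"{name}: ${access_cost_dollars(name, 10, 1_000_000):,.2f}")
```

At these rates the per-GB transfer charge dominates; the request charge only matters for very large numbers of small objects.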
Google also introduced Durable Reduced Availability storage (DRA). Comparing this with S3's Reduced Redundancy Storage (S3-RRS):
Up to (TB)   S3 (c/GB/mo)   Google (c/GB/mo)
1            7.6            7
10           6.4            6
50           6.4            5.5
100          5.6            5.5
400          5.6            5
500          5.6            ask
1000         5.2            ask
5000         4.8            ask
Note that these services are not directly comparable. Google describes DRA as:
DRA storage lowers prices by trading off some data availability while maintaining the same latency performance and durability as standard Google Cloud Storage. ... DRA achieves cost savings by keeping fewer redundant replicas of data. Unlike other reduced redundancy cloud storage offerings, DRA is implemented in a manner that maintains data durability so you don't have to worry about losing your data in the cloud.
It isn't clear how Google can implement DRA in "a manner that maintains data durability" while keeping "fewer redundant replicas of data".

Amazon describes RRS as:
Reduced Redundancy Storage (RRS) is a storage option within Amazon S3 that enables customers to reduce their costs by storing non-critical, reproducible data at lower levels of redundancy than Amazon S3’s standard storage. ... The RRS option stores objects on multiple devices across multiple facilities, providing 400 times the durability of a typical disk drive, but does not replicate objects as many times as standard Amazon S3 storage, and thus is even more cost effective. Reduced Redundancy Storage is: ... Designed to provide 99.99% durability and 99.99% availability of objects over a given year. This durability level corresponds to an average annual expected loss of 0.01% of objects.
Since DRA reduces availability where RRS reduces durability, DRA may be appropriate for preservation where RRS on its own is not. Right now DRA is slightly cheaper, but as our simulations have shown, small differences in starting price are far less important than how closely the prices follow Kryder's Law.
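To see why RRS's durability matters for preservation, here is a small Python sketch of what Amazon's stated 0.01% average annual loss implies for a collection kept for many years. It assumes, purely for illustration, that each object is lost independently and at the same rate every year; Amazon's description does not say this. The collection size is hypothetical.

```python
# From Amazon's RRS description: average annual expected loss of 0.01% of objects.
ANNUAL_LOSS_RATE = 0.0001


def expected_objects_lost(num_objects, years=1):
    """Expected number of objects lost after `years` years.

    Assumption: losses are independent, with a constant annual rate.
    """
    survival_probability = (1 - ANNUAL_LOSS_RATE) ** years
    return num_objects * (1 - survival_probability)


if __name__ == "__main__":
    # Hypothetical collection of one million objects.
    for years in (1, 10, 50):
        lost = expected_objects_lost(1_000_000, years)
        print(f"{years:>2} years: about {lost:.0f} objects lost")
```

Under these assumptions the loss is small in any one year but accumulates steadily over the decades a preservation system has to cover.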

The recalculated graph is based on Amazon's S3 prices for December. It assumes the Backblaze hardware with 45 3TB Seagate drives at today's $99.99 one-off price from CDW. They are arranged in three 15-drive RAID-6 arrays, for a total of about 117TB of usable capacity. A real order for 45 would probably get a quantity discount on the drives, so this is again leaning over backwards to be fair. Since we are still working on the technology replacement policy issues, the local graph uses a baseline technology replacement policy that keeps the disks in service for their specified life. We show curves for disk service lives between 1 and 5 years.
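The RAID overhead arithmetic behind the 117TB figure can be sketched in a few lines of Python. RAID-6 devotes two drives' worth of capacity per array to parity, so each 15-drive array holds 13 drives' worth of data. The drive count, size, and price are the ones given above; the cost per usable TB covers the drives only, and filesystem overhead is ignored, which is why the post says "about".

```python
DRIVES = 45                # drives in the Backblaze chassis
DRIVE_TB = 3               # capacity of each drive, in TB
DRIVE_PRICE = 99.99        # one-off price per drive, in dollars
ARRAYS = 3                 # number of RAID-6 arrays
PARITY_DRIVES_PER_ARRAY = 2  # RAID-6 uses two drives' worth of parity


def usable_tb():
    """Usable capacity after RAID-6 parity overhead, ignoring filesystem overhead."""
    drives_per_array = DRIVES // ARRAYS
    data_drives = ARRAYS * (drives_per_array - PARITY_DRIVES_PER_ARRAY)
    return data_drives * DRIVE_TB


def drive_cost():
    """Total cost of the drives alone (no chassis, power, or staff)."""
    return DRIVES * DRIVE_PRICE


if __name__ == "__main__":
    raw = DRIVES * DRIVE_TB
    usable = usable_tb()
    print(f"raw capacity:    {raw}TB")
    print(f"usable capacity: {usable}TB")
    print(f"drive cost:      ${drive_cost():,.2f}")
    print(f"drive cost per usable TB: ${drive_cost() / usable:.2f}")
```

Pricing the drives per usable TB rather than per raw TB is the correction for RAID overhead that Michael Factor pointed out.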

The graph shows that the storage service and disk price cuts have roughly canceled each other out.

5 comments:

madsquirrel said...

just to point out that it looks like these Backblaze drives are consumer/home HDD's not server class. Also, Backblaze recommends Hitachi over Seagate in the article.
Great article though!

David. said...

Microsoft just matched Amazon's price cut. Basically, I think what is going on is the big players signaling to each other that (a) they are players and (b) they don't want a price war.

The Register's article includes some quotes from Jerome Lecat of Scality making the same point I've been making about the slow price drop in Amazon's (and other suppliers') prices: "That’s a 10 per cent annual decrease. It is serious decrease, but not quite as impressive as they make it sound."

David. said...

Rackspace just cut prices for both bandwidth and storage in their S3 competitor. It looks like they're still more expensive than Amazon.

Steve Greene said...

Sorry to be reviving a "zombie" comment thread, but I was wondering if you could point me to more recent work in this vein? I'm finding I have to combat the glib assumption that digitization is always the "golden ticket" to saving long-term storage costs. I work with a non-profit group (https://archivalresearchers.org) that presents our POV to lawmakers. I'm particularly interested in very large archival document collections, many of which normally have rather low usage. Also concerned with the costs associated with "rich media" digital storage, such as stills, audio and moving images.

David. said...

Steve, since I retired in 2017 I'm a full-time grandparent so I haven't been writing much on storage cost issues. You can find everything I have managed to write with this search.