Thursday, December 6, 2012

Updating "More on Glacier Pricing"

In September I posted "More on Glacier Pricing", including a comparison with our baseline local storage model. Last week I posted Updating "Cloud vs. Local Storage Costs", which among other things updated and corrected the baseline local storage model. Thus I needed to update the comparison with Glacier too. Below the fold is this updated comparison, together with a back-of-the-envelope calculation to support the claim I've been making that, although Glacier may look like tape, it could just be using S3's disk storage infrastructure.

As I said in last week's post, we are still working on the technology replacement policy issues. For the red line in this graph I chose the local storage curve from last week that is based on the baseline technology replacement policy with a 3-year service life for the media. The green line is the S3 curve from last week. The blue and purple lines are Glacier with integrity checks every 4 and 20 months respectively.

The reduction in disk prices from the earlier simulations has made local storage more competitive with Glacier at the same Kryder rates. Two important caveats:
  • As I explained here, Glacier alone is not a viable preservation storage system. The apples-to-apples comparison is between 3 local copies and 1 local copy plus a copy in Glacier. It is now more favorable to the 3 local copies: at a 20% Kryder rate they need an endowment of $128K, as against an endowment of $223K for the local copy at 20% plus Glacier at Amazon's now 7%.
  • The Kryder rate for Glacier is likely to be zero for a long time; 1c/GB/mo is a price point that Amazon will be reluctant to move away from. To the extent that they are reluctant, it makes local storage look even better.
Note that the earlier simulations were for a larger collection, 135TB as against 117TB.

I've been saying that my guess is that, although Amazon designed Glacier so that they could implement it using tape, they priced it so that they won't lose money if it runs atop the same storage infrastructure as S3. Here is a back-of-the-envelope calculation to support my guess. Consider a 3TB drive which, after Amazon's 3-fold replication, effectively holds 1TB of Glacier data (ignoring RAID overhead) for a 4-year service life, during which we assume Amazon sticks at 1c/GB/mo:
  • Income: $480
  • Purchase cost: Assuming Amazon gets a 20% discount from retail, the drive costs $80 and, using Backblaze's numbers, the server costs about $45 per drive. Total $125.
  • Running cost: Backblaze's numbers suggest that running costs are about 1/9 of the purchase cost per year, so over 4 years the total is $56.
Thus over 4 years the drive costs Amazon $181 and earns $480, or a gross profit of $299. That's a long way from losing money. Incidentally, this also gives some idea how large the margins for S3 are likely to be. Running the same calculation assuming the drive holds S3 data at 5.5c/GB/mo for the largest tier:
  • Income: $2640
  • Purchase cost: $125
  • Running cost: $56
Thus over 4 years the drive costs Amazon $181 and earns $2640, or a gross profit of $2459. There's gold in them there drives!
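
For anyone who wants to vary the assumptions, the two calculations above can be sketched in a few lines of Python. This is only a rough sketch: the function name drive_economics and its parameter names are hypothetical, and the defaults are simply the figures used above ($80 drive, $45 server share, running costs of 1/9 of the purchase cost per year, 1TB of customer data per 3TB drive, 4-year service life).

```python
def drive_economics(cents_per_gb_month,
                    usable_gb=1000,            # 3TB drive / 3-fold replication (assumption from the post)
                    service_years=4,           # assumed service life
                    drive_cost=80.0,           # assumed 20% discount from retail
                    server_cost=45.0,          # per-drive server share, from Backblaze's numbers
                    running_fraction=1 / 9):   # yearly running cost as a fraction of purchase cost
    """Back-of-the-envelope income, cost and gross profit for one drive, in dollars."""
    months = service_years * 12
    income = cents_per_gb_month * usable_gb * months / 100.0
    purchase = drive_cost + server_cost
    running = purchase * running_fraction * service_years
    cost = purchase + running
    return {
        "income": income,
        "purchase": purchase,
        "running": running,
        "cost": cost,
        "profit": income - cost,
    }


if __name__ == "__main__":
    # Glacier at 1c/GB/mo and S3's largest tier at 5.5c/GB/mo
    for name, price in (("Glacier", 1.0), ("S3", 5.5)):
        r = drive_economics(price)
        print(f"{name}: income ${r['income']:.0f}, "
              f"cost ${r['cost']:.0f}, gross profit ${r['profit']:.0f}")
```

Rounded to the nearest dollar, this reproduces the figures above. Changing service_years or running_fraction shows how sensitive the margin is to those two guesses.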
