Glacier's pricing model is complex, so I had to make some fairly heroic assumptions:
- No accesses to the content other than for integrity checks.
- Accesses to the content for integrity checking are generated at a precisely uniform rate. This is important because Glacier's data access charges for each month are based on the peak hourly rate during that month.
- Each request is for 1GB of content. This is important because Glacier charges for each request in addition to the charge for the amount of data requested.
- In each month no more than 5% of the content may be accessed without an access charge, but the requests to do so are charged the normal request fee.
- Glacier with no integrity checks.
- Glacier with an integrity check of each object every 20 months.
- Glacier with an integrity check of each object every 4 months.
this assumes that they both experience the same Kryder rate. If Glacier experiences Amazon's historic 3% rate and local storage the industry's projection of 20%, Glacier is nearly 2.5 times more expensive.
If my surmise that Glacier's pricing will follow S3's is correct, then the only way to make Glacier competitive with local storage is to extend the interval between integrity checks enough that all accesses to data are covered by the 5% monthly free allowance.
The shortest interval that can possibly achieve this is 20 months, although in practice some margin for error would be needed and thus a more practical interval would be 24 months. The 20 months line in the graph suggests that this makes Glacier at a 3% Kryder rate somewhat cheaper than local storage at a 20% rate, but even if the assumptions above were to be true, this is not an apples-to-apples comparison.
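The 20-month figure follows directly from the 5% allowance. Here is a minimal sketch of the arithmetic, assuming the allowance is a flat percentage of the stored content per month and that checks are spread perfectly evenly. The function names are illustrative only, and no actual prices are modeled.

```python
from fractions import Fraction
import math


def min_check_interval_months(free_percent_per_month):
    """Shortest interval (in whole months) at which checking every object
    once per interval stays within the monthly free allowance.

    Assumes checks are spread perfectly evenly, so each month touches
    exactly 1/interval of the content.
    """
    free = Fraction(str(free_percent_per_month)) / 100
    if not 0 < free <= 1:
        raise ValueError("allowance must be in (0, 100] percent")
    # Need 1/interval <= free, i.e. interval >= 1/free.
    return math.ceil(1 / free)


def monthly_fraction_checked(interval_months):
    """Fraction of the content read each month for a given check interval."""
    return Fraction(1, interval_months)


if __name__ == "__main__":
    allowance = Fraction(5, 100)
    print(min_check_interval_months(5))
    for months in (4, 20, 24):
        frac = monthly_fraction_checked(months)
        print(months, float(frac), frac <= allowance)
```

At a 4-month interval a quarter of the content is read each month, five times the allowance. At 24 months the monthly fraction is about 4.2%, which leaves the margin for error mentioned above.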
The Blue Ribbon Task Force and other investigations of the sustainability of digital preservation emphasize that preservation cannot be justified as an end in itself, only as a way to provide access to the preserved content. The local disk case provides practical access; the Glacier case does not. The long latency between requesting access and obtaining it, and the severe economic penalties for unpredictable or high-rate accesses, mean that Glacier cannot alone be a practical digital preservation system. At least one copy of the content must be in a system that is capable of:
- Providing low-latency access for users of the content. Otherwise the preservation of the content cannot be justified.
- Being a source for bulk transfer of the content, for example to a Glacier competitor. Getting bulk data out of Glacier quickly is expensive, costing the equivalent of between 5 months and a year of storage, which provides a powerful lock-in effect.
The cost penalties for peak access to storage and for small requests are large (see the difference between the 4-month and 20-month lines). If Glacier is not to be significantly more expensive than local storage in the long term, preservation systems that use it will need to be carefully designed to rate-limit accesses and to request data in large chunks.
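As a rough sketch of what such rate-limiting means in practice, the following computes a perfectly uniform integrity-check schedule. It assumes a 30-day (720-hour) month and takes the request size as a parameter. The function and field names are hypothetical and do not correspond to any Amazon API, and no prices are modeled.

```python
import math


def plan_integrity_checks(archive_gb, interval_months,
                          chunk_gb=1.0, hours_per_month=720):
    """Spread one full pass over the archive evenly across the interval.

    With a perfectly uniform schedule the peak hourly rate equals the
    average hourly rate, which is the best case when charges are based
    on the monthly peak.
    """
    if archive_gb <= 0 or interval_months <= 0 or chunk_gb <= 0:
        raise ValueError("sizes and interval must be positive")
    gb_per_month = archive_gb / interval_months
    return {
        "gb_per_month": gb_per_month,
        # Each request carries a fee, so fewer, larger requests are cheaper.
        "requests_per_month": math.ceil(gb_per_month / chunk_gb),
        "gb_per_hour": gb_per_month / hours_per_month,
    }


if __name__ == "__main__":
    # Hypothetical 7,200 GB archive: compare check intervals and chunk sizes.
    for months, chunk in ((20, 1.0), (4, 1.0), (20, 0.5)):
        print(months, chunk, plan_integrity_checks(7200, months, chunk))
```

Shortening the interval from 20 to 4 months multiplies the hourly rate by five, and halving the request size doubles the number of billable requests for the same data.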