Spinning the drives down, as Facebook's cold storage does, would save more power but increase the spin-up time to 20s. The system provides only 10% (actually somewhat less) of the bandwidth per unit content, which sets the touch rate limit.
Steve looked at the fine print of the drive specifications. He found two significant restrictions:
- The drives have a lifetime limit of 50K start/stop cycles.
- For reasons that are totally opaque, the drives are limited to a total transfer of 180TB/yr.
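To get a feel for how tight these two limits are, here is a rough back-of-the-envelope sketch. The 180TB/yr and 50K figures come from the specifications quoted above; the 5-year service life and the 4TB drive capacity are purely my assumptions for illustration and do not come from Steve's talk.

```python
# Rough arithmetic on the two drive limits quoted above.
SECONDS_PER_YEAR = 365 * 24 * 3600  # 31,536,000 s

TRANSFER_LIMIT_TB_PER_YEAR = 180    # from the drive specification
START_STOP_LIMIT = 50_000           # from the drive specification

ASSUMED_SERVICE_LIFE_YEARS = 5      # assumption, not from the source
ASSUMED_CAPACITY_TB = 4             # assumption, not from the source


def sustained_mb_per_s(tb_per_year):
    """Average transfer rate (decimal MB/s) that uses up a yearly TB budget."""
    return tb_per_year * 1e12 / SECONDS_PER_YEAR / 1e6


def cycles_per_day(cycle_limit, service_life_years):
    """Start/stop cycles per day that exhaust the limit over the service life."""
    return cycle_limit / (service_life_years * 365)


def touches_per_year(tb_per_year, capacity_tb):
    """How many times per year the whole drive could be transferred."""
    return tb_per_year / capacity_tb


if __name__ == "__main__":
    print(f"{sustained_mb_per_s(TRANSFER_LIMIT_TB_PER_YEAR):.2f} MB/s sustained")
    print(f"{cycles_per_day(START_STOP_LIMIT, ASSUMED_SERVICE_LIFE_YEARS):.1f} cycles/day")
    print(f"{touches_per_year(TRANSFER_LIMIT_TB_PER_YEAR, ASSUMED_CAPACITY_TB):.0f} touches/yr")
```

Averaged over a year, 180TB works out to about 5.7MB/s, a small fraction of what a drive can stream while it is spinning. Under the assumed 5-year life, the cycle limit allows roughly 27 spin-ups a day, which is not many for a system serving random requests.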
The reason that Facebook's disk-based cold storage doesn't suffer from the same limits as traditional MAID is that it isn't doing random I/O. Facebook's system schedules I/Os so that it uses the full bandwidth of the disk array, raising the touch rate limit to that of the drives, and reducing the number of start/stop cycles. Admittedly, the response time for a random data object is now, in the worst case, 7 times the time for which a group of drives is active, but this is not a critical parameter for Facebook's application.
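One way to see where a figure like "7 times" could come from is a simple round-robin model. This is only a sketch under my own assumptions, not a description of Facebook's actual scheduler: I assume 8 groups of drives, powered on one at a time in a fixed rotation, each for the same window. The function name, the group count and the one-hour window are all hypothetical.

```python
def wait_until_active(t, group, n_groups, active_s):
    """Seconds from time t until `group` is next powered on (0 if it is on now).

    Assumed model: groups 0..n_groups-1 take turns in a fixed rotation,
    each powered for `active_s` seconds, starting with group 0 at t = 0.
    """
    cycle = n_groups * active_s   # length of one full rotation
    start = group * active_s      # offset of this group's window in the rotation
    phase = t % cycle             # where t falls within the current rotation
    if start <= phase < start + active_s:
        return 0
    return (start - phase) % cycle


if __name__ == "__main__":
    N_GROUPS = 8     # assumption: chosen so the worst case is 7 windows
    ACTIVE_S = 3600  # assumption: one-hour window per group

    # Scan one full rotation at 1-second steps for the longest wait on group 0.
    worst = max(wait_until_active(t, 0, N_GROUPS, ACTIVE_S)
                for t in range(N_GROUPS * ACTIVE_S))
    print(worst / ACTIVE_S)  # worst-case wait, in units of the active window
```

In this model a request that arrives just after its group has spun down has to wait for the other seven groups to take their turns, so the worst-case wait is (n_groups - 1) windows. Each drive also starts and stops only once per rotation, however many requests arrive in the meantime.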
Steve's metric seems to be a major contribution to the analysis of storage systems.