Tuesday, October 8, 2013

Hybrid Disk Drives

Dave Anderson gave an interesting talk at the Library of Congress' Designing Storage Architectures meeting on Seagate's enterprise hybrid disk drives. Details below the fold.

Most discussion of hybrid drives, those with both spinning platters and solid state memory in the same package, has assumed that they would be consumer drives for laptops. The solid state memory would add only a little cost but give two significant advantages:
  • The system would boot or wake up as fast as if all its mass storage were solid state, because the files needed would be in the solid state part of the drive.
  • The disk would use less power, because the solid state part of the drive would behave as a cache, so many reads would be satisfied from it without the drive needing to be spun up.
But Dave was talking about enterprise drives, for use in data centers. These drives have 128MB of DRAM and 32GB of flash sitting between the interface and a normal enterprise hard disk:
  • Data read from the hard disk goes to the DRAM to be buffered outbound on its way to the bus, just as with the cache on a non-hybrid drive. Frequently read data is copied to the flash, boosting read performance.
  • Data written to the drive is cached in DRAM, just as with a non-hybrid drive. A normal drive should not acknowledge the write at this point (although many do) because a power failure could result in the DRAM contents being lost. But the hybrid drive uses the small amount of residual power available after the failure to copy the writes still pending in DRAM into the flash. When power is restored, the pending writes are written from flash to the hard disk. Thus it is safe for the drive to acknowledge the write as soon as it is in DRAM, improving performance.
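To make the write path concrete, here is a minimal Python sketch of it. This is a toy model, not Seagate's firmware: the class name, method names and the `flash_writes` counter are all made up for illustration, and the read cache side of the flash is left out entirely.

```python
from collections import OrderedDict


class HybridDriveModel:
    """Toy model of the hybrid write path (hypothetical names throughout)."""

    def __init__(self):
        self.dram = OrderedDict()   # volatile: pending writes, lba -> data
        self.flash_pending = {}     # non-volatile: writes saved at power loss
        self.disk = {}              # non-volatile: the platters
        self.flash_writes = 0       # how many blocks were ever written to flash

    def write(self, lba, data):
        # The write is acknowledged as soon as it is in DRAM.
        self.dram[lba] = data
        return "ack"

    def destage(self):
        # Normal operation: pending writes go from DRAM straight to disk,
        # oldest first, without touching the flash.
        while self.dram:
            lba, data = self.dram.popitem(last=False)
            self.disk[lba] = data

    def power_fail(self):
        # Residual power is used to copy pending writes into flash.
        for lba, data in self.dram.items():
            self.flash_pending[lba] = data
            self.flash_writes += 1
        self.dram.clear()           # DRAM contents are lost

    def power_restore(self):
        # Replay the saved writes from flash to the hard disk.
        for lba, data in self.flash_pending.items():
            self.disk[lba] = data
        self.flash_pending.clear()
```

In this model a write that is destaged normally never costs a flash write; only writes caught in DRAM by a power failure do.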
But there are more subtle effects as well:
  • First, enterprise drives are often short-stroked, using only a small part of the drive to get high performance. The hybrid cache gets almost all of that performance without the loss in capacity, and at much lower cost than the same capacity in an all-flash drive. This works because real applications have higher data locality than synthetic benchmarks.
  • Second, it is likely that data written to the drive will flow through the DRAM to the hard disk without ever being written to flash. Reducing the write traffic to the flash increases its service life, and there is another effect in the same direction. The endurance of flash memory is a function of temperature, write activity and retention time. If data sits in the flash for a long time, it needs to be re-written. But in this architecture data doesn't sit in the flash for a long time, so not only is it written infrequently, it rarely needs to be re-written.
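The locality point in the first bullet can be illustrated with a small simulation. This is only a sketch under assumptions of my own: the cache is modelled as plain LRU (the talk did not say what policy the drive uses), and the block counts and the 90% hot fraction are invented numbers, not measurements of any real workload.

```python
import random
from collections import OrderedDict


def lru_hit_rate(accesses, cache_blocks):
    """Fraction of accesses served from an LRU cache of cache_blocks entries."""
    cache = OrderedDict()
    hits = 0
    total = 0
    for block in accesses:
        total += 1
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            cache[block] = True
            if len(cache) > cache_blocks:
                cache.popitem(last=False)  # evict least recently used
    return hits / total if total else 0.0


def uniform_trace(rng, n, total_blocks):
    """Benchmark-like trace: every block is equally likely."""
    return [rng.randrange(total_blocks) for _ in range(n)]


def skewed_trace(rng, n, total_blocks, hot_blocks, hot_fraction):
    """Application-like trace: hot_fraction of accesses hit a small hot set."""
    trace = []
    for _ in range(n):
        if rng.random() < hot_fraction:
            trace.append(rng.randrange(hot_blocks))
        else:
            trace.append(rng.randrange(total_blocks))
    return trace


if __name__ == "__main__":
    rng = random.Random(2013)
    total, cache = 10_000, 500             # cache holds 5% of the blocks
    uniform = lru_hit_rate(uniform_trace(rng, 50_000, total), cache)
    skewed = lru_hit_rate(skewed_trace(rng, 50_000, total, 300, 0.9), cache)
    print(f"uniform: {uniform:.2f}  skewed: {skewed:.2f}")
```

With a cache holding 5% of the blocks, the uniform trace hits the cache only about as often as the cache's share of the capacity, while the skewed trace is served from the cache most of the time. That gap is why a small flash cache can stand in for short-stroking on real workloads.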
You gotta love synergistic effects like these.
