Thursday, April 17, 2014

Henry Newman on HD vs. SSD Economics

Henry Newman has an excellent post entitled SSD vs. HDD Pricing: Seven Myths That Need Correcting. His seven myths are:
  • First, some assume that the price of MLC NAND flash will continue to decrease at a rapid and predictable rate that will make it competitive with HDDs for bandwidth, and nearly for capacity, by 2014 or 2015. This downward trend, it is assumed, will make flash a viable alternative for large storage and to act as a memory or “buffer” to improve performance.
  • Second, there is a general assumption that prices for bandwidth ($/GB/s) for SSDs is much lower than for HDDs, and that enterprises will measure costs in these terms instead of capacity.
  • Third, there is no distinction made between flash in general, such as consumer SSDs, and enterprise storage SSDs. It is assumed that MLC NAND will not only reduce in price ($/GB) but also that it will increase in density and larger capacity drives will be developed.
  • Fourth, it is assumed that the quality of MLC NAND will either remain constant or increase as prices decrease and densities increase, allowing it to improve not only performance, but also reliability and power consumption of the systems it is used in.
  • Fifth, it is assumed that power consumption for SSDs is, or will shortly be, significantly lower than that of HDDs overall, on a per GB basis and on a per GB/s basis.
  • Sixth, they assume disk performance will grow at a constant rate of about 20 percent per generation and not improve.
  • Seventh, they assume file system data layout will not improve to allow better disk utilization.
Henry is looking at the market for performance storage, not for long-term storage, but given that limitation I agree with nearly everything he writes. However, I think there is a simpler argument that ends up in the same place as Henry's:
  • Flash can do everything that hard disk can, but there are many markets where hard disk cannot do what flash can do.
  • The supply of both flash and hard disk is constrained. Flash is constrained because investing in new flash fabs would not be profitable, especially given the obviously limited scope for shrinking flash cells. Hard disk is constrained because the market is effectively a duopoly, and both players are struggling to transition from the current PMR technology to HAMR.
  • Thus flash will command a premium over hard disk prices so that the market directs the limited supply of flash to those applications, such as tablets, smartphones, and high-performance servers, where its added value is highest.

Monday, April 7, 2014

What Could Possibly Go Wrong?

I gave a talk at UC Berkeley's Swarm Lab entitled "What Could Possibly Go Wrong?" It was an initial attempt to summarize for non-preservationistas what we have learnt so far about the problem of preserving digital information for the long term in the more than 15 years of the LOCKSS Program. Follow me below the fold for an edited text with links to the sources.

Wednesday, April 2, 2014

EverCloud workshop

I was invited to a workshop sponsored by ISAT/DARPA, entitled The EverCloud: Anticipating and Countering Cloud-Rot, that arose from Yale's EverCloud project. I gave a brief statement on an initial panel; an edited text with links to the sources is below the fold.

Monday, March 31, 2014

The Half-Empty Archive

Cliff Lynch invited me to give one of UC Berkeley iSchool's "Information Access Seminars" entitled The Half-Empty Archive. It was based on my brief introductory talk at ANADP II last November, an expanded version given as a staff talk at the British Library last January, and the discussions following both. An edited text with links to the sources is below the fold.

Friday, March 28, 2014


We were asked if the CLOCKSS Archive uses PREMIS metadata. The answer is no, and a detailed explanation is below the fold.

Tuesday, March 18, 2014

Krste Asanović Keynote at FAST14

The standout presentation at Usenix's FAST conference this year was Krste Asanović's keynote on UC Berkeley's ASPIRE Project. His introduction was:
The first generation of Warehouse-Scale Computers (WSC) built everything from commercial off-the-shelf (COTS) components: computers, switches, and racks. The second generation, which is being deployed today, uses custom computers, custom switches, and even custom racks, albeit all built using COTS chips. We believe the third generation of WSC in 2020 will be built from custom chips. If WSC architects are free to design custom chips, what should they do differently?
There is much to think about in the talk, which stands out because it treats the entire stack, from hardware to applications, in a holistic way. It is well worth your time to read the slides and watch the video. Below the fold, I have some comments.

Monday, March 17, 2014

Seagate's Kinetic hard drives

I was impressed by Seagate's announcement last October of their Kinetic Open Storage Platform and blogged about it at the time. I should have paid more attention. My ex-colleague at Sun, Geoff Arnold, who knows far more than I do about scale-out systems (he worked at Amazon) also blogged about the announcement, and his post is so worth reading that it got over 40K hits in the first two weeks! You should join the crowd.

And this week Seagate and the German storage firm Rausch announced at CeBIT the first storage system product I've seen based on Kinetic drives, the BigFoot Object Storage Solution. A rack of these 4U units would hold 2.8PB.