Monday, March 31, 2014

The Half-Empty Archive

Cliff Lynch invited me to give one of UC Berkeley iSchool's "Information Access Seminars" entitled The Half-Empty Archive. It was based on my brief introductory talk at ANADP II last November, an expanded version given as a staff talk at the British Library last January, and the discussions following both. An edited text with links to the sources is below the fold.

Friday, March 28, 2014


We were asked if the CLOCKSS Archive uses PREMIS metadata. The answer is no, and a detailed explanation is below the fold.

Tuesday, March 18, 2014

Krste Asanović Keynote at FAST14

The standout presentation at Usenix's FAST conference this year was Krste Asanović's keynote on UC Berkeley's ASPIRE Project. His introduction was:
The first generation of Warehouse-Scale Computers (WSC) built everything from commercial off-the-shelf (COTS) components: computers, switches, and racks. The second generation, which is being deployed today, uses custom computers, custom switches, and even custom racks, albeit all built using COTS chips. We believe the third generation of WSC in 2020 will be built from custom chips. If WSC architects are free to design custom chips, what should they do differently?
There is much to think about in the talk, which stands out because it treats the entire stack, from hardware to applications, in a holistic way. It is well worth your time to read the slides and watch the video. Below the fold, I have some comments.

Monday, March 17, 2014

Seagate's Kinetic hard drives

I was impressed by Seagate's announcement last October of their Kinetic Open Storage Platform and blogged about it at the time. I should have paid more attention. My ex-colleague at Sun, Geoff Arnold, who knows far more than I do about scale-out systems (he worked at Amazon), also blogged about the announcement, and his post is so worth reading that it got over 40K hits in the first two weeks! You should join the crowd.

And this week Seagate and the German storage firm Rausch announced at CeBIT the first storage system product I've seen based on Kinetic drives, the BigFoot Object Storage Solution. A rack of these 4U units would hold 2.8PB.

Wednesday, March 12, 2014

Dan Geer at RSA

Dan Geer gave a must-read talk at the recent RSA conference. Dan is especially strong on the fragility of the systems upon which society is coming to depend, a theme that also ran through my friend Dewayne Hendricks' recent talk at Stanford's EE380 seminar series. Below the fold, I quibble with one part of Dan's talk to show how persistent projections based on exponential growth are in the face of facts.

Wednesday, March 5, 2014

Windows XP

The idea that format migration is integral to digital preservation was for a long time reinforced by people's experience of format incompatibility in Microsoft's Office suite. Microsoft's business model used to depend on driving the upgrade cycle by introducing gratuitous forward incompatibility, new versions of the software being set up to write formats that older versions could not render. But what matters for digital preservation is backwards incompatibility; newer versions of the software being unable to render content written by older versions. Six years ago the limits of Microsoft's ability to introduce backwards incompatibility were dramatically illustrated when they tried to remove support for some really old formats.

The reason for this fiasco was that Microsoft greatly over-estimated its ability to impose the costs of migrating old content on its customers, and under-estimated those customers' ability to resist. Old habits die hard. Microsoft is trying to end support of Windows XP and Office 2003 on April 8, but it isn't providing cost-effective upgrade paths for what is now Microsoft's fastest-growing installed base. Joel Hruska writes:
Microsoft has come under serious fire for some significant missteps in this process, including a total lack of actual upgrade options. What Microsoft calls an upgrade involves completely wiping the PC and reinstalling a fresh OS copy on it — or ideally, buying a new device. Microsoft has misjudged how strong its relationship is with consumers and failed to acknowledge its own shortcomings. Not providing an upgrade utility is one example — but so is the general lack of attractive upgrade prices or even the most basic understanding of why users haven't upgraded.
This resistance to change has obvious implications for digital preservation.