Thursday, October 23, 2014

Facebook's Warm Storage

Last month I was finally able to post about Facebook's cold storage technology. Now, Subramanian Muralidhar and a team from Facebook, USC and Princeton have a paper at OSDI that describes the warm layer between the two cold storage layers and Haystack, the hot storage layer. f4: Facebook's Warm BLOB Storage System is perhaps less directly aimed at long-term preservation, but the paper is full of interesting information. You should read it, but below the fold I relate some details.

Monday, October 20, 2014

Journal "quality"

Anurag Acharya and co-authors from Google Scholar have a pre-print at arxiv.org entitled Rise of the Rest: The Growing Impact of Non-Elite Journals in which they use article-level metrics to track the decreasing importance of the top-ranked journals in their respective fields from 1995 to 2013. I've long argued that the value that even the globally top-ranked journals add is barely measurable and may even be negative; this research shows that the message is gradually getting out. Authors of papers subsequently found to be "good" (in the sense of attracting citations) are slowly but steadily choosing to publish away from the top-ranked journals in their field. You should read the paper, but below the fold I have some details.

Wednesday, October 15, 2014

The Internet of Things

In 1996, my friend Steven McGeady gave a fascinating and rather prophetic keynote address to the Harvard Conference on the Internet and Society. In his introduction, Steven said:
I was worried about speaking here, but I'm even more worried about some of the pronouncements that I have heard over the last few days, ... about the future of the Internet. I am worried about pronouncements of the sort: "In the future, we will do electronic banking at virtual ATMs!," "In the future, my car will have an IP address!," "In the future, I'll be able to get all the old I Love Lucy reruns - over the Internet!" or "In the future, everyone will be a Java programmer!"

This is bunk. I'm worried that our imagination about the way that the 'Net changes our lives, our work and our society is limited to taking current institutions and dialling them forward - the "more, better" school of vision for the future.
I have the same worries that Steven did about discussions of the Internet of Things that looms so large in our future. They focus on the incidental effects, not on the fundamental changes. Barry Ritholtz points me to a post by Jon Evans at TechCrunch entitled The Internet of Someone Else's Things that is an exception. Jon points out that the idea that you own the Smart Things you buy is obsolete:
They say “possession is nine-tenths of the law,” but even if you physically and legally own a Smart Thing, you won’t actually control it. Ownership will become a three-legged stool: who physically owns a thing; who legally owns it; …and who has the ultimate power to command it. Who, in short, has root.
What does this have to do with digital preservation? Follow me below the fold.

Tuesday, October 7, 2014

Economies of Scale in Peer-to-Peer Networks

In a recent IEEE Spectrum article entitled Escape From the Data Center: The Promise of Peer-to-Peer Cloud Computing, Ozalp Babaoglu and Moreno Marzolla (BM) wax enthusiastic about the potential for Peer-to-Peer (P2P) technology to eliminate the need for massive data centers. Even more exuberance can be found in Natasha Lomas' TechCrunch piece The Server Needs To Die To Save The Internet (LM) about the MaidSafe P2P storage network. I've been working on P2P technology for more than 16 years, and although I believe it can be very useful in some specific cases, I'm far less enthusiastic about its potential to take over the Internet.

Below the fold I look at some of the fundamental problems standing in the way of a P2P revolution, and in particular at the issue of economies of scale. After all, I've just written a post about the huge economies that Facebook's cold storage technology achieves by operating at data center scale.

Tuesday, September 30, 2014

More on Facebook's "Cold Storage"

So far this year I've attended two talks that were really revelatory: Krste Asanović's keynote at FAST 13, which I blogged about earlier, and Kestutis Patiejunas' talk about Facebook's cold storage systems. Unfortunately, Kestutis' talk was off-the-record, so I couldn't blog about it at the time. But he just gave a shorter version at the Library of Congress' Designing Storage Architectures workshop, so now I can blog about this fascinating and important system. Below the fold, the details.

Thursday, September 25, 2014

Plenary Talk at 3rd EUDAT Conference

I gave a plenary talk entitled Economic Sustainability of Digital Preservation at the 3rd EUDAT Conference's session on sustainability. Below the fold is an edited text with links to the sources.

Tuesday, September 23, 2014

A Challenge to the Storage Industry

I gave a brief talk at the Library of Congress Storage Architecture meeting, pulling together themes from a number of recent blog posts. My goal was twofold:
  • to outline the way in which current storage architectures fail to meet the needs of long-term archives,
  • and to set out what an architecture that met those needs would look like.
Below the fold is an edited text with links to the earlier posts here that I was condensing.