Friday, December 26, 2014

Crypto-currency as a basis for preservation

Although I have great respect for the technology underlying crypto-currencies such as Bitcoin, I've long been skeptical of their viability in the market, both as currencies and as the basis for peer-to-peer storage proposals such as Permacoin and MaidSafe. The attraction of crypto-currencies is their decentralized nature, but if they become successful enough to be generally useful, economies of scale lead to their centralization. It was easy to get caught up in the enthusiasm as Bitcoin grew rapidly, but:
  • Bitcoin was the worst investment of 2014, as its value halved.
  • Bitcoin's hash rate had been growing exponentially since the start of 2013 but has been approximately flat for the last quarter, indicating that investment in new mining hardware has dried up.
  • The reason for investment drying up is likely that the revenue from mining is less than a third of what it was.
  • The Bitcoin market capitalization dropped from $11B to $4.4B.
Even if you don't accept my economies of scale arguments, these numbers should temper your enthusiasm for basing peer-to-peer storage on a crypto-currency.

Thursday, December 18, 2014

Economic Failures of HTTPS

Bruce Schneier points me to Assessing legal and technical solutions to secure HTTPS, a fascinating, must-read analysis of the (lack of) security on the Web from an economic rather than a technical perspective by Axel Arnbak and co-authors from Amsterdam and Delft universities. Do read the whole paper, but below the fold I provide some choice snippets.

Tuesday, December 16, 2014

Hardware I/O Virtualization

Timothy Prickett Morgan has an interesting post entitled A Rare Peek Into The Massive Scale Of AWS. It is based on a talk by Amazon's James Hamilton at the re:Invent conference. Morgan's post provides a hierarchical, network-centric view of the AWS infrastructure:
  • Regions, 11 of them around the world, contain Availability Zones (AZ).
  • The 28 AZs are arranged so that each Region contains at least 2 and up to 6 datacenters.
  • Morgan estimates that there are close to 90 datacenters in total, each with 2000 racks, burning 25-30MW.
  • Each rack holds 25 to 40 servers.
AZs are no more than 2ms apart measured in network latency, allowing for synchronous replication. This means the AZs in a region are only a couple of kilometres apart, which is less geographic diversity than one might want, but a disaster still has to have a pretty big radius to take out more than one AZ. The datacenters in an AZ are not more than 250us apart in latency terms, close enough that a disaster might take all the datacenters in one AZ out.
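The figures in the list above imply a rough fleet size, which is easy to work out. The sketch below is only back-of-envelope arithmetic on Morgan's estimates. The function names are mine, and the power-per-server figure assumes, unrealistically, that all of a datacenter's power goes to servers, so treat it as an upper bound.

```python
# Back-of-envelope arithmetic on the estimates quoted from Morgan's post.
# The constants come from the list above; everything derived from them
# inherits the uncertainty of those estimates.
DATACENTERS = 90               # "close to 90" in total
RACKS_PER_DATACENTER = 2000
SERVERS_PER_RACK = (25, 40)    # low and high ends of the quoted range
DATACENTER_POWER_MW = (25, 30) # low and high ends of the quoted range


def fleet_estimate():
    """Return (total racks, low server count, high server count)."""
    racks = DATACENTERS * RACKS_PER_DATACENTER
    low = racks * SERVERS_PER_RACK[0]
    high = racks * SERVERS_PER_RACK[1]
    return racks, low, high


def watts_per_server_bounds():
    """Return (low, high) watts per server.

    Assumption (not from the post): all datacenter power is attributed
    to servers, ignoring cooling, networking and storage overheads.
    """
    low = DATACENTER_POWER_MW[0] * 1e6 / (RACKS_PER_DATACENTER * SERVERS_PER_RACK[1])
    high = DATACENTER_POWER_MW[1] * 1e6 / (RACKS_PER_DATACENTER * SERVERS_PER_RACK[0])
    return low, high


racks, low, high = fleet_estimate()
print(f"racks: {racks:,}; servers: {low:,} to {high:,}")
w_low, w_high = watts_per_server_bounds()
print(f"power budget per server: {w_low:.0f}W to {w_high:.0f}W")
```

On these numbers the fleet is about 180,000 racks holding somewhere between 4.5 and 7.2 million servers.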

Below the fold, some details and the connection between what Amazon is doing now, and what we did in the early days of NVIDIA.

Thursday, December 11, 2014

"Official" Senate CIA Torture Report

Please go and read James Jacobs' post The Official Senate CIA Torture Report to understand the challenges government documents librarians face. You would think that a document generating such worldwide interest would be easy to find and preserve. In your dreams, as it turns out.

Tuesday, December 9, 2014

Talk at Fall CNI

I gave a talk at the Fall CNI meeting entitled Improving the Odds of Preservation, with the following abstract:
Attempts have been made, for various types of digital content, to measure the probability of preservation. The consensus is about 50%. Thus the rate of loss to future readers from "never preserved" vastly exceeds that from all other causes, such as bit rot and format obsolescence. Will persisting with current preservation technologies improve the odds of preservation? If not, what changes are needed to improve them?
It covered much of the same material as Costs: Why Do We Care, with some differences in emphasis. Below the fold, the text with links to the sources.

Thursday, December 4, 2014

A Note of Thanks

I have a top-of-the-line MacBook Air, which is truly a work of art, but I discovered fairly quickly that subjecting a machine that cost almost $2000 to the vicissitudes of today's travel is worrying. So for years now the machine I've travelled with is a netbook, an Asus Seashell 1005PE. It is small, light, has almost all-day battery life and runs Ubuntu just fine. It cost me about $250, and with both full-disk encryption and an encrypted home directory, I just don't care if it gets lost, broken or seized.

But at last the signs of the hard life of a travelling laptop are showing. I looked around for a replacement and settled on the Acer C720 Chromebook. This cost me $387 including tax and same-day delivery from Amazon. Actually, same-day isn't accurate. It took less than 9 hours from order to arrival! If I'd waited until Black Friday to order, it would have been more than $40 cheaper.

For that price, the specification is amazing:
  • 1.7GHz 4-core Intel Core i3
  • 4GB RAM
  • 32GB SSD
  • 11.6" 1366x768 screen
Thanks to these basic instructions from Jack Wallen and the fine work of HugeGreenBug in assembling a version of Ubuntu for the C720, 24 hours after ordering I had a light, thin, powerful laptop with a great display running a full 64-bit installation of Ubuntu 14.04. I'm really grateful to everyone who contributed to getting Linux running on Chromebooks in general and on the C720 in particular. Open source is wonderful.

Of course, there are some negatives. The bigger screen is great, but it makes the machine about an inch bigger in width and depth. Like the Seashell and unlike full-size laptops, it will be usable in economy seats on the plane even if the passenger in front reclines their seat. But it'll be harder than it was with the Seashell to claim that the computer and the drink can co-exist on the economy seat-back table.

Below the fold, some details for anyone who wants to follow in my footsteps.

Tuesday, December 2, 2014

Henry Newman's Farewell Column

Henry Newman has been writing a monthly column on storage technology for Enterprise Storage Forum for 12 years, and he's decided to call it a day. His farewell column is entitled Follow the Money: Picking Technology Winners and Losers and it starts:
I want to leave you with a single thought about our industry and how to consistently pick technology winners and losers. This is one of the biggest lessons I’ve learned in my 34 years in the IT industry: follow the money.
It's an interesting read. Although Henry has been a consistent advocate for tape for "almost three decades", he uses tape as an example of the money drying up. He has a table showing that the LTO media market is less than half the size it was in 2008. He estimates that the total tape technology market is currently about $1.85 billion, whereas the disk technology market is around $35 billion.
Following the money also requires looking at the flip side and following the de-investment in a technology. If customers are reducing their purchases of a technology, how can companies justify increasing their spending on R&D? Companies do not throw good money after bad forever, and at some point they just stop investing.
Go read the whole thing and understand why Henry's regular column will be missed, and how perceptive the late Jim Gray was when in 2006 he stated that Tape is Dead, Disk is Tape, Flash is Disk.