I'm naturally happy when someone cites my blog and uses my data, as Alex Teu did in his post Cloud Storage Is Eating The World Alive on TechCrunch. I'm less happy with some of the conclusions Alex drew. Below the fold, I argue with him.
I'm David Rosenthal, and this is a place to discuss the work I'm doing in Digital Preservation.
Friday, August 22, 2014
Thursday, August 21, 2014
Is This The Dawn of DAWN?
More than three years ago, Ian Adams, Ethan Miller and I were inspired by a 2009 paper FAWN: A Fast Array of Wimpy Nodes from David Andersen et al at C-MU. They showed how a fabric of nodes, each with a small amount of flash memory and a very low-power processor, could process key-value queries as fast as a network of beefy servers using two orders of magnitude less power.
We put forward a storage architecture called DAWN: Durable Array of Wimpy Nodes, similar hardware but optimized for long-term storage. Its advantages were small form factor, durability, and very low running costs. We argued that these would outweigh the price premium for flash over disk. Recent developments are starting to make us look prophetic - details below the fold.
Tuesday, August 19, 2014
TRAC Audit: Do-It-Yourself Demos
In my post TRAC Audit: Process I explained how we demonstrated the LOCKSS Polling and Repair Protocol to the auditors, and linked to the annotated logs we showed them. These demos have been included in the latest release of the LOCKSS software. Below the fold, and now in the documentation, are step-by-step instructions allowing you to replicate this demo.
Thursday, August 14, 2014
"National Hosting" of archives
The LOCKSS team are working with some countries to build in-country Private LOCKSS Networks (PLNs) to preserve the content such as e-journals and e-books that they pay for. Other countries are considering outsourcing their national archive of this content to foreign providers. One of the questions that countries ask about these efforts is "where is the data stored?" Recent developments in the US and the UK mean that this is no longer the right question to ask. Follow me below the fold to find out what the right question has become.
Tuesday, August 12, 2014
TRAC Audit: Lessons
This is the third in a series of posts about CRL's TRAC audit of the CLOCKSS Archive. Previous posts announced the release of the certification report, and recounted the audit process. Below the fold I look at the lessons we and others can learn from our experiences during the audit.
Tuesday, August 5, 2014
TRAC Audit: Process
This is the second in a series of posts about CRL's audit of the CLOCKSS Archive. In the first, I announced the release of the certification report. In this one I recount the process of being audited and what we did during it. Follow me below the fold for a long story, but not as long as the audit process.
Update: the third post discussing the lessons to be drawn is here.
Monday, August 4, 2014
Post-Flash Solid State Storage Gets Real-er
HGST announced today that they are demonstrating an SSD that is based on Phase-Change Memory (PCM), one of the technologies competing to take over as flash runs out of steam. The selling point of the SSD is that it is extremely fast:
The demonstration shows unprecedented SSD performance levels that are achieved by utilizing a combination of HGST's new, latency-optimized interface protocols with next-generation non-volatile memory components.
The SSD is based on 1Gb PCM chips. The new protocols that are needed to squeeze this performance out of PCIe were described by Dejan Vučinić et al in their paper DC Express: Shortest Latency Protocol for Reading Phase Change Memory over PCI Express at this year's FAST conference.
The SSD demonstration utilizes a PCIe interface and delivers three million random read IOs per second of 512 bytes each when operating in a queued environment and a random read access latency of 1.5 microseconds (us) in non-queued settings, delivering results that cannot be achieved with existing SSD architectures and NAND Flash memories. This performance is orders of magnitude faster than existing Flash based SSDs, resulting in a new class of block storage devices.
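To put the quoted figures in perspective, here is a rough back-of-the-envelope sketch in Python. The three constants come straight from the announcement; the function names are my own, and the queue-depth estimate assumes (probably optimistically) that per-request latency under load stays at the unqueued 1.5 microsecond figure, which the announcement does not claim.

```python
# Figures quoted in the HGST announcement.
IOPS = 3_000_000        # random reads per second, queued
IO_SIZE_BYTES = 512     # bytes per read
LATENCY_S = 1.5e-6      # random read latency, non-queued


def throughput_bytes_per_s(iops, io_size_bytes):
    """Aggregate read bandwidth implied by an IOPS figure."""
    return iops * io_size_bytes


def unqueued_iops(latency_s):
    """Reads per second with only one request outstanding at a time."""
    return 1.0 / latency_s


def implied_queue_depth(iops, latency_s):
    """Little's law: requests in flight = arrival rate * latency.

    Assumption: latency under load equals the unqueued latency,
    so this is a lower bound on the real queue depth.
    """
    return iops * latency_s


if __name__ == "__main__":
    bw = throughput_bytes_per_s(IOPS, IO_SIZE_BYTES)
    print(f"bandwidth: {bw / 1e9:.3f} GB/s")
    print(f"unqueued IOPS: {unqueued_iops(LATENCY_S):,.0f}")
    print(f"implied queue depth: {implied_queue_depth(IOPS, LATENCY_S):.1f}")
```

In other words, three million 512-byte reads per second is only about 1.5 GB/s of bandwidth. The headline is the latency and the small-read rate, not the raw throughput.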