Thursday, March 28, 2019

The 47 Links Mystery

Nearly a year ago, in All Your Tweets Are Belong To Kannada, I blogged about Cookies Are Why Your Archived Twitter Page Is Not in English. It describes some fascinating research by Sawood Alam and Plinio Vargas into the effect of cookies on the archiving of multi-lingual web-sites.

Sawood Alam just followed up with Cookie Violations Cause Archived Twitter Pages to Simultaneously Replay In Multiple Languages, another fascinating exploration of these effects. Follow me below the fold for some commentary.

Tuesday, March 26, 2019

FAST 2019

I wasn't able to attend this year's FAST conference in Boston and, reading through the papers, I didn't miss much relevant to long-term storage. Below the fold, a couple of quick notes and a look at the one really relevant paper.

Thursday, March 21, 2019

Cost-Reducing Writing DNA Data

In DNA's Niche in the Storage Market, I addressed a hypothetical DNA storage company's engineers and posed this challenge:
increase the speed of synthesis by a factor of a quarter of a trillion, while reducing the cost by a factor of fifty trillion, in less than 10 years while spending no more than $24M/yr.
Now, a company called Catalog plans to demo a significant step in the right direction:
The goal of the demonstration, says Park, is to store 125 gigabytes, ... in 24 hours, on less than 1 cubic centimeter of DNA. And to do it for $7,000.
That would be 1E12 bits for $7E3. At the theoretical maximum 2 bits/base, that is 5E11 bases, so it would be $1.4E-8 per base, versus last year's estimate of 1E-4, or around 7,000 times better.
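
The per-base arithmetic for the Catalog demo can be checked with a short sketch. It assumes decimal gigabytes (1E9 bytes), the theoretical maximum of 2 bits/base, and last year's estimate of 1E-4 dollars per base; the variable names are illustrative only.

```python
# Back-of-envelope check of the demo's cost per base.
# Assumptions: 1 GB = 1e9 bytes, and the theoretical 2 bits per base.
GIGABYTE = 1e9

data_bytes = 125 * GIGABYTE      # demo goal: 125 gigabytes
demo_cost_usd = 7_000            # demo goal: $7,000
bits_per_base = 2                # theoretical maximum
prior_cost_per_base = 1e-4       # last year's estimate, dollars per base

bits = data_bytes * 8                         # total bits to store
bases = bits / bits_per_base                  # bases needed at 2 bits/base
cost_per_base = demo_cost_usd / bases         # dollars per base
improvement = prior_cost_per_base / cost_per_base

print(f"bits:          {bits:.1e}")
print(f"bases:         {bases:.1e}")
print(f"cost per base: ${cost_per_base:.1e}")
print(f"improvement:   {improvement:,.0f}x")
```

Any practical encoding stores fewer than 2 bits/base, so the real cost per base would be lower and the cost per bit unchanged; the sketch only reproduces the best-case comparison.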

If the demo succeeds, it marks a major achievement. But below the fold I continue to throw cold water on the medium-term prospects for DNA storage.

Tuesday, March 19, 2019

Compression vs. Preservation

An archive is in a hardware refresh cycle and they have asked me to comment on concerns arising because their favored storage hardware uses data compression, which may not be possible to disable even if doing so were a good idea. This is an issue I wrote about two years ago in Threats to stored data.

Because similar concerns keep re-appearing in discussions of digital preservation, I decided this time to discuss it in the same way as Cloud for Preservation, writing a post with a general discussion of the issues without referring to a specific institution. Below the fold, the details.

Thursday, March 14, 2019

It's The Enforcement, Stupid!

Kim Stanley Robinson is a remarkable author. In 1990 he concluded his Wild Shore triptych of novels describing alternate futures for California with Pacific Edge:
Pacific Edge (1990) can be compared to Ernest Callenbach's Ecotopia, and also to Ursula K. Le Guin's The Dispossessed. This book's Californian future is set in the El Modena neighborhood of Orange in 2065. It depicts a realistic utopia as it describes a possible transformation process from our present status, to a more ecologically-focused future.
Why am I writing about this now, nearly three decades later? Follow me below the fold for an explanation.

Thursday, March 7, 2019

It Isn't Just Cryptocurrency Mining

Izabella Kaminska's Just because it's digital doesn't mean it's green reports on:
A new report by the carbon emission think-tank The Shift Project out this week highlights that not much has changed since [2014]. ICT still contributes to about 4 per cent of global greenhouse gas emissions, which is still twice that of civil aviation. What is worse, its contribution is growing more quickly than that of civil aviation.
Cryptocurrency mining is definitely a problem, but how big a part of the problem isn't clear. It could be quite big. Follow me below the fold for some surprising details.

Tuesday, March 5, 2019

Demand Is Far From Insatiable

Based on numbers that IDC conjures from thin air, pundits believe that demand for storage is insatiable because everyone says Let's Just Keep Everything Forever In The Cloud. That idea assumes storage is free, but Storage Will Be Much Less Free Than It Used To Be. (Both links are from 2012.) Below the fold I look at some real-world numbers showing how much storage actual customers are buying.