On Wednesday, Argonne National Labs announced that it would shut down the NEWTON Ask A Scientist service, on-line since 1991, this Sunday. I was one of the crowd of people who reacted to the news by alerting Jason Scott's ArchiveTeam. Jason did what I should have done before flashing the bat-signal: he fed the URL into the Internet Archive's Save Page Now, only to be told "relax, we're all over it". The site has been captured since 1996, and the most recent capture before the announcement was Feb 7th. Jason arranged for captures Thursday and today.
As you can see by these examples, the Wayback Machine has a pretty good copy of the final state of the service and, as the use of Memento spreads, it will even remain accessible via its original URL.
I'm David Rosenthal, and this is a place to discuss the work I'm doing in Digital Preservation.
Saturday, February 28, 2015
Tuesday, February 24, 2015
Using the official Linux overlayfs
I realize that it may not be obvious exactly how to use the newly-official Linux overlayfs implementation. Below the fold, some example shell scripts that may help clarify things.
Friday, February 20, 2015
Report from FAST15
I spent most of last week at Usenix's File and Storage Technologies conference. Below the fold, notes on the most interesting talks from my perspective.
Tuesday, February 17, 2015
Vint Cerf's talk at AAAS
Vint Cerf gave a talk entitled Digital Vellum at the AAAS meeting last Friday that has received a lot of attention in the media, including follow-up pieces by other writers, and even drew the attention of Dave Farber's famed IP list. I have some doubts about how accurately the press has reported his talk, which isn't available via the AAAS meeting website. I am commenting on the reports, not the talk. But, as The Register points out, Cerf has been making similar points for some time. I did find a TEDx talk he titled Bit Rot on YouTube, uploaded a year ago. Below the fold is my take.
Tuesday, February 10, 2015
The Evanescent Web
Papers drawing attention to the decay of links in academic papers have quite a history; I blogged about three relatively early ones six years ago. Now Martin Klein and a team from the Hiberlink project have taken the genre to a whole new level with a paper in PLoS One entitled Scholarly Context Not Found: One in Five Articles Suffers from Reference Rot. Their dataset is 2-3 orders of magnitude bigger than those of previous studies, their methods are far more sophisticated, and they study both link rot (links that no longer resolve) and content drift (links that now point to different content). There's a summary on the LSE's blog.
Below the fold, some thoughts on the Klein et al paper.
Saturday, February 7, 2015
It takes longer than it takes
I hope it is permissible to blow my own horn on my own blog. Two concepts recently received official blessing after a good long while, for one of which I'm responsible, and for the other of which I'm partly responsible. The mysteries are revealed below the fold.
Thursday, February 5, 2015
Disk reliability
Two recent publications about disk reliability are of considerable interest. Continuing their exemplary tradition of transparency, Backblaze updated their 2013 report on their experience of disk failures with a report on 2014, together with the raw data and a set of FAQs. And J-F Paris et al published Self-Repairing Disk Arrays. Below the fold, thoughts on the relationship between these two.