I'd like you all to read a piece by danah boyd entitled There was a bomb on my block. Go read it, then follow me below the fold for an explanation of why I think you should have done so.
I'm David Rosenthal, and this is a place to discuss the work I'm doing in Digital Preservation.
Wednesday, September 28, 2016
Monday, September 26, 2016
The Things Are Winning
More than three years ago my friend Jim Gettys, who worked on One Laptop Per Child, and on the OpenWrt router software, started warning that the Internet of Things was a looming security disaster. Bruce Schneier's January 2014 article The Internet of Things Is Wildly Insecure — And Often Unpatchable and Dan Geer's April 2014 Heartbleed as Metaphor were inspired by Jim's warnings. That June Jim gave a talk at Harvard's Berkman Center entitled (In)Security in Home Embedded Devices. That September Vint Cerf published Bufferbloat and Other Internet Challenges, and Jim blogged about it. That Christmas a botnet running on home routers took down the gaming networks of Microsoft's Xbox and Sony's PlayStation. That wasn't enough to motivate action to fix the problem.
As I write this on 9/24/16 the preceding link doesn't work, although the Wayback Machine has copies. To find out why the link isn't working and what it has to do with the IoT, follow me below the fold.
Thursday, September 22, 2016
Where Did All Those Bits Go?
Lay people reading the press about storage, and even some "experts" writing in the press about storage, believe two things:
- per byte, storage media are getting cheaper very rapidly (Kryder's Law), and
- the demand for storage greatly exceeds the supply.
Tuesday, September 20, 2016
Brief Talk at the Storage Architecture Meeting
I was asked to give a brief summary of the discussions at the "Future of Storage" workshop to the Library of Congress' Storage Architecture meeting. Below the fold, the text of the talk with links to the sources.
Thursday, September 15, 2016
Nature's DNA storage clickbait
Andy Extance at Nature has a news article that illustrates rather nicely the downside of Marcia McNutt's (editor-in-chief of Science) claim that one reason to pay the subscription to top journals is that:
Our news reporters are constantly searching the globe for issues and events of interest to the research and nonscience communities.
Follow me below the fold for an analysis of why no-one should be paying Nature to publish this kind of stuff.
Tuesday, September 13, 2016
Scary Monsters Under The Bed
So don't look there!
I sometimes hear about archives which scan for and remove malware from the content they ingest. It is true that archives contain malware, but this isn't a good idea:
- Most content in archives is never accessed by a reader who might be a target for malware, so most of the malware scan effort is wasted. It is true that increasingly these days data mining accesses much of an archive's content, but it does so in ways that are unlikely to activate malware.
- At ingest time, the archive doesn't know what it is about the content that future scholars will be interested in. In particular, it doesn't know that the scholars won't be studying the history of malware. By modifying the content during ingest it may be destroying the content's usefulness to future scholars.
- Scanning and removing malware during ingest doesn't guarantee that the archive contains no malware, just that it doesn't contain any malware known at the time of ingest. If an archive wants to protect readers from malware, it should scan and remove it as the preserved content is being disseminated, creating a safe surrogate for the reader. This will guarantee that the reader sees no malware known at access time, likely to be a much more comprehensive set.
See, for example, the Internet Archive's Malware Museum, which contains access surrogates of malware which has been defanged.
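The difference between scanning at ingest and scanning at dissemination can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the Signature class, the disseminate function, the byte patterns and the "[removed]" placeholder are all hypothetical, and real malware detection is far more involved than substring matching.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signature:
    """A hypothetical malware signature: a name and a byte pattern."""
    name: str
    pattern: bytes


def disseminate(preserved: bytes, known_now: list[Signature]) -> tuple[bytes, list[str]]:
    """Build a safe access surrogate from preserved content.

    The preserved bytes are never modified; only the copy handed to the
    reader is cleaned, using whatever signatures are known at access time.
    """
    surrogate = preserved
    found = []
    for sig in known_now:
        if sig.pattern in surrogate:
            found.append(sig.name)
            surrogate = surrogate.replace(sig.pattern, b"[removed]")
    return surrogate, found


# Hypothetical content containing one old and one newer piece of "malware".
preserved = b"header OLDWORM body NEWWORM footer"

# Signatures known when the content was ingested, and later at access time.
at_ingest = [Signature("old-worm", b"OLDWORM")]
at_access = at_ingest + [Signature("new-worm", b"NEWWORM")]

# Cleaning with ingest-time knowledge would have missed the newer pattern.
ingest_view, _ = disseminate(preserved, at_ingest)

# Cleaning at access time catches both, and the preserved copy is untouched.
access_view, found = disseminate(preserved, at_access)
```

In this sketch the archive keeps the original bytes for scholars who may want them, while each reader gets a surrogate cleaned against the signature set current at the moment of access.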
Tuesday, September 6, 2016
Memento at W3C
Herbert van de Sompel's post at the W3C's blog Memento and the W3C announces that both the W3C's specifications and their Wiki now support Memento (RFC7089):
The Memento protocol is a straightforward extension of HTTP that adds a time dimension to the Web. It supports integrating live web resources, resources in versioning systems, and archived resources in web archives into an interoperable, distributed, machine-accessible versioning system for the entire web. The protocol is broadly supported by web archives. Recently, its use was recommended in the W3C Data on the Web Best Practices, when data versioning is concerned. But resource versioning systems have been slow to adopt. Hopefully, the investment made by the W3C will convince others to follow suit.
This is a very significant step towards broad adoption of Memento. Below the fold, some details.
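To give a flavor of how Memento adds a time dimension to HTTP: RFC 7089 has the client send an Accept-Datetime request header, and the archive's response carries a Memento-Datetime header, both using the standard HTTP date format. The Python sketch below only builds and parses those header values; the function names are my own, and it makes no network request and assumes nothing about any particular archive's URLs.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def accept_datetime_header(when: datetime) -> dict[str, str]:
    """Build the Accept-Datetime request header for the desired time.

    The datetime must be timezone-aware; it is converted to GMT, which is
    what the HTTP date format requires.
    """
    if when.tzinfo is None:
        raise ValueError("a timezone-aware datetime is required")
    as_utc = when.astimezone(timezone.utc)
    return {"Accept-Datetime": format_datetime(as_utc, usegmt=True)}


def parse_memento_datetime(value: str) -> datetime:
    """Parse a Memento-Datetime response header value into a datetime."""
    return parsedate_to_datetime(value)


# Ask for the state of a resource as of noon GMT on September 6, 2016.
wanted = datetime(2016, 9, 6, 12, 0, 0, tzinfo=timezone.utc)
headers = accept_datetime_header(wanted)
print(headers["Accept-Datetime"])  # Tue, 06 Sep 2016 12:00:00 GMT
```

A client would send these headers with an ordinary HTTP GET to a server that supports Memento, then read Memento-Datetime from the response to learn which archived version it was given.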
Thursday, September 1, 2016
CrossRef on "fake DOIs"
I should have pointed to Geoff Bilder's post to the CrossRef blog, DOI-like strings and fake DOIs when it appeared at the end of June. It responds to the phenomena described in Eric Hellman's Wiley's Fake Journal of Constructive Metaphysics and the War on Automated Downloading, to which I linked in the comments on Improving e-Journal Ingest (among other things). Details below the fold.