We interrupt our regularly scheduled blogging for this special announcement. Go read Michael Nelson's post Why we need multiple web archives: the case of blog.reidreport.com right now! It's a detailed account, in several updates, of the forensic analysis of Joy-Ann Reid's claim that either her blog or the Internet Archive was hacked. Michael's work landed him a spot on CNN at 0930 on April 29th. He did an excellent job of explanation. Half an hour later Reid walked back her claims.
Michael is right about the importance of multiple independent Web archives; once again the Lots Of Copies Keep Stuff Safe principle. But the economics of this multiplicity are problematic.
I'm David Rosenthal, and this is a place to discuss the work I'm doing in Digital Preservation.
Monday, April 30, 2018
Thursday, April 26, 2018
Cryptographers On Blockchains
David Gerard's April 21st blog post is a real linkfest. Below the fold, commentary on four of the links.
Tuesday, April 24, 2018
All Your Tweets Are Belong To Kannada
Gerd Badur CC BY-SA 3.0, Source
47% of mementos of Barack Obama's Twitter page were in non-English languages, almost half of which were in Kannada alone. While language diversity in web archives is generally a good thing, in this case though, it is disconcerting and counter-intuitive.
Kannada is an Indian language spoken by only about 38 million people. Below the fold, some commentary.
Thursday, April 12, 2018
Your Tax Dollars At Work
When I was writing Pre-publication Peer Review Subtracts Value, Springer wanted to charge me $39.95 for access to Comparing Published Scientific Journal Articles to Their Pre-print Versions by Martin Klein et al. This despite the fact that the copyright notice said:
This is a U.S. government work and its text is not subject to copyright protection in the United States
Fortunately, you can now follow the link to the final version at arXiv.org. I'm not the only one annoyed by the publishers charging for access to papers not subject to copyright. Below the fold, some more on this scam.
Tuesday, April 10, 2018
Natural Redundancy
Most uncompressed files contain significant redundancy, which is why a compression algorithm can make them smaller; such algorithms work by reducing redundancy. The better the algorithm, the less redundancy is left in the output. If the files are then stored for the long term, they need to be protected, for example by erasure coding, which adds some redundancy back. In Exploiting Source Redundancy to Improve the Rate of Polar Codes, Ying Wang, Krishna R. Narayanan and Anxiao (Andrew) Jiang of Texas A&M explore using the original redundancy to reduce the amount of protection redundancy needed for a given level of reliability. Below the fold, some commentary.
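To make the compress-then-protect pipeline concrete, here is a minimal sketch, assuming zlib for compression and a single XOR parity shard as the simplest possible erasure code. It is not the polar-code scheme from the paper, and the shard count, sample text and helper names are all hypothetical, chosen only for illustration.

```python
import zlib
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split_into_shards(data: bytes, k: int) -> list[bytes]:
    """Split data into k equal-length shards, zero-padding the tail."""
    shard_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(shard_len * k, b"\0")
    return [padded[i * shard_len:(i + 1) * shard_len] for i in range(k)]


# Hypothetical, deliberately redundant sample input.
text = b"Lots of copies keep stuff safe. " * 100

# Step 1: compression removes the natural redundancy.
compressed = zlib.compress(text, 9)

# Step 2: erasure coding adds protection redundancy back.
# Four data shards plus one parity shard survive the loss of any one shard.
shards = split_into_shards(compressed, 4)
parity = reduce(xor_bytes, shards)

# Simulate losing shard 2, then rebuild it from the survivors and the parity.
survivors = [s for i, s in enumerate(shards) if i != 2]
recovered = reduce(xor_bytes, survivors + [parity])

# Reassemble, drop the padding, and decompress.
restored = b"".join(shards[:2] + [recovered] + shards[3:])[:len(compressed)]
assert zlib.decompress(restored) == text

stored = sum(len(s) for s in shards) + len(parity)
print(f"original: {len(text)} bytes, compressed: {len(compressed)} bytes, "
      f"stored with parity: {stored} bytes")
```

The parity shard costs one extra shard of storage, which is the "redundancy added back" in this toy setting. The paper's question, loosely, is whether redundancy left in the source could do some of that protective work instead.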
Monday, April 9, 2018
John Perry Barlow RIP
By Mohamed Nanabhay from Qatar CC BY 2.0
The Economist, The Guardian and the New York Times had good obituaries, but they mentioned only his Declaration of the Independence of Cyberspace among his writings. It was undoubtedly an important rallying-cry at the time, but it should not be allowed to overshadow his other cyberspace-related writings, thankfully collected by the EFF in the John Perry Barlow Library. Below the fold, the one I would have chosen.
Thursday, April 5, 2018
Emulating Stephen Hawking's Voice
Jason Fagone at the San Francisco Chronicle has a fascinating story of heroic, successful (and timely) emulation in The Silicon Valley quest to preserve Stephen Hawking’s voice. It's the story of a small team which started work in 2009 trying to replace Hawking's voice synthesizer with more modern technology. Below the fold, some details to get you to read the whole article.
Tuesday, April 3, 2018
Falling Research Productivity
Are Ideas Getting Harder to Find? by Nicholas Bloom et al. looks at the history of investment in R&D and its effect on the product across several industries. Their main example is Moore's Law, and they show that [page 19]:
research effort has risen by a factor of 18 since 1971. This increase occurs while the growth rate of chip density is more or less stable: the constant exponential growth implied by Moore’s Law has been achieved only by a massive increase in the amount of resources devoted to pushing the frontier forward.
Assuming a constant growth rate for Moore’s Law, the implication is that research productivity has fallen by this same factor of 18, an average rate of 6.8 percent per year.
If the null hypothesis of constant research productivity were correct, the growth rate underlying Moore’s Law should have increased by a factor of 18 as well. Instead, it was remarkably stable. Put differently, because of declining research productivity, it is around 18 times harder today to generate the exponential growth behind Moore’s Law than it was in 1971.
Below the fold, some commentary on this and other relevant research.
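The quoted figures can be sanity-checked with a few lines of arithmetic. This sketch assumes continuous compounding, which is my assumption; the excerpt does not state the paper's convention or the end year of its data. The function names are hypothetical.

```python
import math


def average_annual_rate(factor: float, years: float) -> float:
    """Continuously compounded annual rate for a total change of `factor`."""
    return math.log(factor) / years


def implied_years(factor: float, rate: float) -> float:
    """Years needed for a total change of `factor` at a continuous `rate`."""
    return math.log(factor) / rate


# A factor-of-18 fall at 6.8 percent per year implies a span of about 42.5 years.
years = implied_years(18, 0.068)
print(f"implied span: {years:.1f} years")

# Going the other way: a factor of 18 over that span gives back 6.8 percent.
rate = average_annual_rate(18, years)
print(f"average rate: {rate:.1%} per year")
```

Under this assumption, a span of about 42.5 years starting in 1971 ends in the mid-2010s, so the factor of 18 and the 6.8 percent rate are mutually consistent.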