Tuesday, March 31, 2020

Archival Cloud Storage Pricing

Although there are significant technological risks to data stored for the long term, its most important vulnerability is to interruptions in the money supply. The current pandemic is likely to subject archives to exactly such interruptions.

In Cloud For Preservation I described how much of the motivation for using cloud services was their month-by-month pay-for-what-you-use billing, which transforms capital expenditures (CapEx) into operational expenditures (OpEx). Organizations typically find OpEx much easier to justify than CapEx because:
  • The numbers they look at are smaller, even if what they add up to over time is greater.
  • OpEx is less of a commitment, since it can be decreased if circumstances change.
Unfortunately, the lower the commitment the higher the risk to long-term preservation. Since preservation doesn't deliver immediate returns, it is likely to be first on the chopping block. Thus both reducing storage cost and increasing its predictability are important for sustainable digital preservation. Below the fold I revisit this issue.
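
The first bullet above, smaller numbers that add up to more over time, can be made concrete with a rough sketch. Every figure here is hypothetical and chosen only for illustration (a one-off purchase of $100,000 versus a monthly bill of $2,500); the function names are likewise my own, and the comparison ignores discounting, price changes and the cost of operating purchased hardware.

```python
# Illustrative comparison of a one-off purchase (CapEx) with a monthly bill (OpEx).
# ASSUMPTION: all dollar figures are made up for this example, not taken from
# any real cloud price list. Amounts are whole dollars.

HYPOTHETICAL_CAPEX_USD = 100_000      # assumed up-front purchase price
HYPOTHETICAL_MONTHLY_FEE_USD = 2_500  # assumed monthly cloud bill


def cumulative_opex(monthly_fee: int, months: int) -> int:
    """Total paid after `months` months of a flat monthly fee."""
    return monthly_fee * months


def breakeven_month(capex: int, monthly_fee: int) -> int:
    """First whole month in which cumulative OpEx strictly exceeds the CapEx."""
    return capex // monthly_fee + 1


if __name__ == "__main__":
    month = breakeven_month(HYPOTHETICAL_CAPEX_USD, HYPOTHETICAL_MONTHLY_FEE_USD)
    total = cumulative_opex(HYPOTHETICAL_MONTHLY_FEE_USD, month)
    print(f"Monthly bills pass the purchase price in month {month} (${total:,} paid)")
```

With these assumed numbers each monthly bill is 2.5% of the purchase price, yet the running total passes it in the fourth year. The same flat fee can, of course, simply stop being paid, which is the risk described above.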

Tuesday, March 24, 2020

More On Failures From FAST 2020

A Study of SSD Reliability in Large Scale Enterprise Storage Deployments by Stathis Maneas et al., which I discussed in Enterprise SSD Reliability, wasn't the only paper at this year's USENIX FAST conference about storage failures. Below the fold I comment on one specifically about hard drives rather than SSDs, making it more relevant to archival storage.

Tuesday, March 17, 2020

Proof-of-Stake In Practice

At the most abstract level, the work of Eric Budish, Raphael Auer, Joshua Gans and Neil Gandal is obvious. A blockchain is secure only if the value to be gained by an attack is less than the cost of mounting it. These papers all assume that actors are "economically rational", driven by the immediate monetary bottom line, but this isn't always true in the real world. As I wrote when commenting on Gans and Gandal:
As we see with Bitcoin's Lightning Network, true members of the cryptocurrency cult are not concerned that the foregone interest on capital they devote to making the system work is vastly greater than the fees they receive for doing so. The reason is that, as David Gerard writes, they believe that "number go up". In other words, they are convinced that the finite supply of their favorite coin guarantees that its value will in the future "go to the moon", providing capital gains that vastly outweigh the foregone interest.
Follow me below the fold for a discussion of a recent attack on a Proof-of-Stake blockchain that wasn't motivated by the immediate monetary bottom line.

Tuesday, March 10, 2020

Enterprise SSD Reliability

I couldn't attend this year's USENIX FAST conference. Because of the COVID-19 outbreak the normally high level of participation from Asia was greatly reduced, with many registrants and even some presenters unable to make it. But I've been reading the papers, and below the fold I have commentary on an extremely interesting one about the reliability of SSD media in enterprise applications.

Saturday, March 7, 2020

Guest Post: Michael Nelson's Response

Back last June I posted a three-part series on Michael Nelson's CNI keynote Web Archives at the Nexus of Good Fakes and Flawed Originals and offered him a guest post to respond. Now, I owe Nelson a profound apology. He e-mailed me in January, but I completely misunderstood his e-mail and missed the attachment containing the HTML of the guest post. It is no real excuse that I was on painkillers and extremely short of sleep at the time.

So, below the fold, greatly delayed through my failure, is Michael Nelson's response, which is also available here.

Tuesday, March 3, 2020

Falling Research Productivity Revisited

Last year, in Falling Research Productivity, I commented on Are Ideas Getting Harder to Find? by Nicholas Bloom et al. Now, The Economist's current issue has a Free Exchange column entitled How to get more innovation bang for the research buck that takes off from the same paper:
In a paper by Nicholas Bloom, Charles Jones and Michael Webb of Stanford University, and John Van Reenen of the Massachusetts Institute of Technology (MIT), the authors note that even as discovery has disappointed, real investment in new ideas has grown by more than 4% per year since the 1930s. Digging into particular targets of research—to increase computer processing power, crop yields and life expectancy—they find that in each case maintaining the pace of innovation takes ever more money and people.
Follow me below the fold for some commentary on a number of the other papers they cite.

Thursday, February 27, 2020

Ludwig Siegele On Data

Ludwig Siegele's latest Special report for The Economist is entitled A deluge of data is giving rise to a new economy. He provides an excellent overview of the impact the availability of vast amounts of data is having on business. But follow me below the fold for my two quibbles.

Tuesday, February 18, 2020

The Scholarly Record At The Internet Archive

The Internet Archive has been working on a Mellon-funded grant aimed at collecting, preserving and providing persistent access to as much of the open-access academic literature as possible. The motivation is that much of the "long tail" of academic literature comes from smaller publishers whose business model is fragile, and who are at risk of financial failure or takeover by the legacy oligopoly publishers. This is particularly true if their content is open access, since they don't have subscription income. This "long tail" content is thus at risk of loss or vanishing behind a paywall.

The project takes two opposite but synergistic approaches:
  • Top-Down: Using the bibliographic metadata from sources like CrossRef to ask whether each article is in the Wayback Machine and, if it isn't, trying to get it from the live Web. Then, if a copy exists, adding the metadata to an index.
  • Bottom-up: Asking whether each of the PDFs in the Wayback Machine is an academic article, and if so extracting the bibliographic metadata and adding it to an index.
Below the fold, a discussion of the progress that has been made so far.

Thursday, February 13, 2020

Economic Limits Of Proof-of-Stake Blockchains

In 2018's Cryptocurrencies Have Limits I discussed Eric Budish's The Economic Limits Of Bitcoin And The Blockchain, an important analysis of the economics of two kinds of "51% attack" on Bitcoin and other cryptocurrencies based on "Proof-of-Work" (PoW) blockchains. Among other things, Budish shows that, for safety, the value of transactions in a block must be low relative to the fees in the block plus the reward for mining the block. In last year's The Economics Of Bitcoin Transactions I discussed Raphael Auer's Beyond the doomsday economics of “proof-of-work” in cryptocurrencies, in which Auer shows that:
proof-of-work can only achieve payment security if mining income is high, but the transaction market cannot generate an adequate level of income. ... the economic design of the transaction market fails to generate high enough fees.
Follow me below the fold for a discussion of a fascinating recent paper that extends Budish's analysis.

Tuesday, February 11, 2020

More On The Ad Bubble

Google UI Timeline
Two weeks ago a firestorm erupted over a seemingly insignificant change to the UI of Google's search engine. It was enough to get Google to backtrack. A week later Daisuke Wakabayashi and Tiffany Hsu had the details in Why Google Backtracked on Its New Search Results Look, including this informative timeline graphic of the history of such changes since 2007. Their explanation for why Google made the change was:
Users complained that Google was trying to trick people into clicking on more paid results, while marketing executives said it was yet another step in blurring the line between ads and unpaid search results, forcing them to spend more money with the internet company.
Well, yes, but follow me below the fold for the bigger picture.

Thursday, February 6, 2020

Meta: Slow Blogging

Blogging is slow right now because my physical therapist wants me standing up and moving around at least every 15 minutes. Long-form blogging in 15-minute increments is hard.

Thursday, January 30, 2020

Regulating Social Media: Part 1

It has become obvious that self-regulated social media are a threat to pretty much every country's national security. This is intended to be the start of a series looking at the range of suggestions as to how, at least in the United States, they might be regulated, including (I hope) at least these:
Below the fold, I start with the first of them.

Tuesday, January 14, 2020

Advertising Is A Bubble

The surveillance economy, and thus the stratospheric valuations of:
Facebook and Alphabet (Google’s parent), which rely on advertising for, respectively, 97% and 88% of their sales.
depend on the idea that targeted advertising, exploiting as much personal information about users as possible, results in enough increased sales to justify its cost. This is despite the fact that both experimental research and the experience of major publishers and advertisers show the opposite. Now, The new dot com bubble is here: it’s called online advertising by Jesse Frederik and Maurits Martijn provides an explanation for this disconnect. Follow me below the fold to find out about it and enjoy some wonderful quotes from them.

Thursday, January 9, 2020

Library of Congress Storage Architecture Meeting

The Library of Congress has finally posted the presentations from the 2019 Designing Storage Architectures for Digital Collections workshop that took place in early September. I've greatly enjoyed the earlier editions of this meeting, so I was sorry I couldn't make it this time. Below the fold, I look at some of the presentations.

Tuesday, January 7, 2020

Bitcoin's Lightning Network (updated)

Discussions of cryptocurrencies and other blockchain technologies are bedeviled by a nearly universal assumption that attributes that are possible to achieve in theory are guaranteed to be realized in practice. Examples include decentralization and anonymity.

Back in June David Gerard asked:
How good a business is running a Lightning Network node? LNBig provides 49.6% ($3.7 million in bitcoins) of the Lightning Network’s total channel liquidity funding — that just sits there, locked in the channels until they’re closed. They see 300 transactions a day, for total earnings on that $3.7 million of … $20 a month. They also spent $1000 in channel-opening fees.
Even if the Lightning Network worked (which it doesn't), and were decentralized (which it isn't), Gerard's point was that the transaction fees were woefully inadequate to cover the costs of running a node. Now, A Cryptoeconomic Traffic Analysis of Bitcoin’s Lightning Network by the Hungarian team of Ferenc Béres, István A. Seres, and András A. Benczúr supports Gerard's conclusion with a detailed analysis.
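
Gerard's figures can be turned into a rough back-of-the-envelope calculation. The sketch below uses only the numbers in the quote above ($3.7 million locked, $20 a month in fees, $1000 in channel-opening fees). The function names are hypothetical, and the 1% comparison interest rate is an assumed figure for illustration, not one from the source.

```python
# Back-of-the-envelope economics of the LNBig node, using Gerard's quoted figures.

LOCKED_CAPITAL_USD = 3_700_000    # channel liquidity, from the quote
FEE_INCOME_PER_MONTH_USD = 20     # total earnings, from the quote
CHANNEL_OPENING_FEES_USD = 1_000  # one-off cost, from the quote

# ASSUMPTION: an illustrative annual interest rate, NOT from the source.
ASSUMED_INTEREST_RATE = 0.01


def annual_fee_income(monthly_income: float) -> float:
    """Fee income over twelve months at a constant monthly rate."""
    return monthly_income * 12


def annual_yield(monthly_income: float, capital: float) -> float:
    """Annual fee income as a fraction of the capital locked up."""
    return annual_fee_income(monthly_income) / capital


def months_to_recoup(one_off_cost: float, monthly_income: float) -> float:
    """Months of fee income needed to pay back a one-off cost."""
    return one_off_cost / monthly_income


if __name__ == "__main__":
    income = annual_fee_income(FEE_INCOME_PER_MONTH_USD)
    fee_yield = annual_yield(FEE_INCOME_PER_MONTH_USD, LOCKED_CAPITAL_USD)
    payback = months_to_recoup(CHANNEL_OPENING_FEES_USD, FEE_INCOME_PER_MONTH_USD)
    foregone = LOCKED_CAPITAL_USD * ASSUMED_INTEREST_RATE
    print(f"Annual fee income: ${income:,.0f}")
    print(f"Annual yield on locked capital: {fee_yield:.4%}")
    print(f"Months of fees to recoup channel-opening costs: {payback:.0f}")
    print(f"Foregone interest at the assumed rate: ${foregone:,.0f} per year")
```

On these numbers the fees amount to $240 a year, well under a hundredth of a percent of the capital, and it takes more than four years of fees just to cover the channel-opening costs, before counting any cost of running the node itself.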

Below the fold, some commentary.

Thursday, January 2, 2020

Bunnie Huang's Betrusted Project

The awesome Bunnie Huang asks Can We Build Trustable Hardware? It is a fascinating approach to the problem I discussed in Securing The Hardware Supply Chain:
how we can know that the hardware the software we secured is running on is doing what we expect it to?
Bunnie's experience has made him very skeptical of the integrity of the hardware supply chain:
In the process of making chips, I’ve also edited masks for chips; chips are surprisingly malleable, even post tape-out. I’ve also spent a decade wrangling supply chains, dealing with fakes, shoddy workmanship, undisclosed part substitutions – there are so many opportunities and motivations to swap out “good” chips for “bad” ones. Even if a factory could push out a perfectly vetted computer, you’ve got couriers, customs officials, and warehouse workers who can tamper the machine before it reaches the user.
Below the fold, some discussion of Bunnie's current project.