Thursday, July 27, 2017

Decentralized Long-Term Preservation

Lambert Heller is correct to point out that:
"name allocation using IPFS or a blockchain is not necessarily linked to the guarantee of permanent availability, the latter must be offered as a separate service."
Storage isn't free, and thus the "separate services" need to have a viable business model. I have demonstrated that increasing returns to scale mean that the "separate service" market will end up being dominated by a few large providers just as, for example, the Bitcoin mining market is. People who don't like this conclusion often argue that, at least for long-term preservation of scholarly resources, the service will be provided by a consortium of libraries, museums and archives. Below the fold I look into how this might work.

These institutions would act in the public interest rather than for profit, and thus somehow be exempt from the effects of increasing returns to scale. Given the budget pressures these institutions are under, I'm skeptical. But let's assume that they are magically exempt.

The whole point of truly decentralized peer-to-peer systems is that they cannot be centrally managed, for example by a consortium of libraries. A system of this kind needs management that arises spontaneously from the effect of its built-in incentives on each individual participant. Among the functions that this spontaneous management needs to perform for a long-term storage service is to ensure that:
  • the storage resources needed to meet the demand are provided,
  • they are replaced as they fail or become obsolete,
  • each object is adequately replicated to ensure its long-term viability,
  • the replicas maintain suitable geographic and organizational diversity,
  • the software is maintained to fix the inevitable vulnerabilities, and
  • the software is upgraded as the computing infrastructure evolves through time.

Note that these are mostly requirements on the network as a whole rather than on individual peers. The SEC's report on Initial Coin Offerings recognizes similar needs:
"Investors in The DAO reasonably expected Slock.it and its co-founders, and The DAO’s Curators, to provide significant managerial efforts after The DAO’s launch. The expertise of The DAO’s creators and Curators was critical in monitoring the operation of The DAO, safeguarding investor funds, and determining whether proposed contracts should be put for a vote. Investors had little choice but to rely on their expertise.

"By contract and in reality, DAO Token holders relied on the significant managerial efforts provided by Slock.it and its co-founders, and The DAO’s Curators, as described above."
Even in the profit-driven world of crypto-currencies, the incentive from profit doesn't always lead to consensus (see the issue of increasing the Bitcoin block size, and the DAO heist), or to the provision of resources to meet the demand (see Bitcoin's backlog of unconfirmed transactions). Since we have assumed away the profit motive, and all we have left is a vague sense of the public interest, the built-in incentives powering the necessary functions will be weak.

This lack of effective governance is a problem in the short-term world of crypto-currency speculation (see the surplus GPUs flooding the market as Ethereum miners drop out). It is a disaster in digital preservation, where the requirement is to perform continuously and correctly over a time-scale of many technology generations. Human organizations can survive over much longer time-scales; 8 years ago my University celebrated its 800th birthday. Does anybody believe we'll be using Bitcoin or Ethereum 80 years from now, when it celebrates its 880th?

We have experience in these matters. Seventeen years ago we published the first paper describing the LOCKSS peer-to-peer digital preservation system. At the software level it was, and has remained through its subsequent evolution, a truly decentralized system. All peers are equal, no peer trusts any other, and peers discover others through gossip-style communication. At the management and organizational level, however, formal structures arose such as the LOCKSS Alliance, the MetaArchive and the CLOCKSS Archive to meet real-world demand for the functions above to be performed in a reliable and timely fashion.
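To make concrete why functions like adequate replication and replica diversity are requirements on the network as a whole, here is a minimal sketch of the kind of audit that someone has to run across all peers. It is purely illustrative and is not how LOCKSS or any other system works; the `Replica` record, the `audit` function and the thresholds (3 copies, 2 regions, 2 organizations) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    """One copy of an object; all fields are hypothetical labels."""
    peer: str
    region: str
    organization: str


def audit(replicas, min_copies=3, min_regions=2, min_orgs=2):
    """Return a list of problems; an empty list means the object passes.

    The default thresholds are illustrative assumptions, not recommendations.
    """
    problems = []
    if len(replicas) < min_copies:
        problems.append("too few replicas")
    if len({r.region for r in replicas}) < min_regions:
        problems.append("insufficient geographic diversity")
    if len({r.organization for r in replicas}) < min_orgs:
        problems.append("insufficient organizational diversity")
    return problems


# Three copies, but all held by one organization in one region.
copies = [
    Replica("peer-a", "us-west", "Library X"),
    Replica("peer-b", "us-west", "Library X"),
    Replica("peer-c", "us-west", "Library X"),
]
print(audit(copies))
```

No single peer can answer this question from its own state. The check needs a view across peers, and acting on a failure needs someone with the authority and budget to add or move replicas.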

Trying by technical means to remove the need to have viable economics and governance is doomed to fail in the medium- let alone the long-term. What is needed is a solution to the economic and governance problems. Then a technology can be designed to work in that framework. Blockchain is a technology in search of a problem to solve, being pushed by ideology into areas where the unsolved problems are not technological.

1 comment:

Chris Aldrich said...

That last sentence defining the blockchain is fantastic.

If you hadn't heard of it yet, I attended a conference last year at UCLA entitled Dodging the Memory Hole, which I suspect is right up your alley. I know they're gearing up for another installment later this year at the Internet Archive in San Francisco. I suspect you'll find lots of friends there, and they're still accepting talks. https://www.rjionline.org/events/dodging-the-memory-hole-2017