I'm David Rosenthal, and this is a place to discuss the work I'm doing in Digital Preservation.
Thursday, July 27, 2017
Decentralized Long-Term Preservation
Lambert Heller is correct to point out that:
name allocation using IPFS or a blockchain is not necessarily linked to the guarantee of permanent availability, the latter must be offered as a separate service.
Storage isn't free, and thus the "separate services" need to have a viable business model. I have demonstrated that increasing returns to scale mean that the "separate service" market will end up being dominated by a few large providers just as, for example, the Bitcoin mining market is. People who don't like this conclusion often argue that, at least for long-term preservation of scholarly resources, the service will be provided by a consortium of libraries, museums and archives. Below the fold I look into how this might work.
Tuesday, July 25, 2017
Initial Coin Offerings
The FT's Alphaville blog has started a new series, called ICOmedy, looking at the insanity surrounding Initial Coin Offerings (ICOs). The blockchain hype has created an even bigger opportunity to separate the fools from their money than the dot-com era did. To motivate you to follow the series, below the fold there are some extracts and related links.
Thursday, July 20, 2017
Patting Myself On The Back
Cost vs. Kryder rate
2014 cost/byte projection
The red lines are projections at the industry roadmap's 20% and a less optimistic 10%. [The graph] shows three things:
- The slowing started in 2010, before the floods hit Thailand.
- Disk storage costs in 2014, two and a half years after the floods, were more than 7 times higher than they would have been had Kryder's Law continued at its usual pace from 2010, as shown by the green line.
- If the industry projections pan out, as shown by the red lines, by 2020 disk costs per byte will be between 130 and 300 times higher than they would have been had Kryder's Law continued.
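The compounding behind ratios like these can be sketched in a few lines. This is a minimal illustration, not the model behind the graph: the function name `cost_ratio` and the 40% "usual pace" figure are assumptions of the sketch, and it starts both trajectories from the same cost in the same year, so its output should not be expected to reproduce the 7x or 130 to 300x figures quoted above.

```python
def cost_ratio(fast_rate: float, slow_rate: float, years: int) -> float:
    """Return how many times higher cost per byte ends up after `years`
    if it falls by `slow_rate` per year instead of `fast_rate` per year.

    Both trajectories are assumed to start from the same cost per byte.
    Rates are fractions, e.g. 0.20 for a 20% annual decrease.
    """
    return ((1.0 - slow_rate) / (1.0 - fast_rate)) ** years


# Hypothetical rates, for illustration only.
ASSUMED_USUAL_PACE = 0.40  # assumption: not a figure taken from the graph

for slow in (0.20, 0.10):
    ratio = cost_ratio(ASSUMED_USUAL_PACE, slow, years=10)
    print(f"{slow:.0%} instead of {ASSUMED_USUAL_PACE:.0%} for 10 years: "
          f"{ratio:.1f}x higher cost per byte")
```

The point of the sketch is only that a modest difference in the annual rate compounds into a large difference in cost per byte within a decade.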
Backblaze average $/GB
This is a big deal. As I've said many times:
Storage will be
Much less free
Than it used to be
The real cost of a commitment to store data for the long term is much greater than most people believe, and there is no realistic prospect of a technological discontinuity that would change this.
Tuesday, July 11, 2017
Is Decentralized Storage Sustainable?
There are many reasons to dislike centralized storage services. They include business risk, as we see in le petit musée des projets Google abandonnés, monoculture vulnerability and rent extraction. There is thus naturally a lot of enthusiasm for decentralized storage systems, such as MaidSafe, DAT and IPFS. In 2013 I wrote about one of their advantages in Moving vs. Copying. Among the enthusiasts is Lambert Heller. Since I posted Blockchain as the Infrastructure for Science, Heller and I have been talking past each other. Heller is talking technology; I have some problems with the technology but they aren't that important. My main problem is an economic one that applies to decentralized storage irrespective of the details of the technology.
Below the fold is an attempt to clarify my argument. It is a re-statement of part of the argument in my 2014 post Economies of Scale in Peer-to-Peer Networks, specifically in the context of decentralized storage networks.
Thursday, July 6, 2017
Archive vs. Ransomware
Archives perennially ask the question "how few copies can we get away with?"
This is a question I've blogged about in 2016 and 2011 and 2010, when I concluded:
- The number of copies needed cannot be discussed except in the context of a specific threat model.
- The important threats are not amenable to quantitative modeling.
- Defense against the important threats requires many more copies than against the simple threats, to allow for the "anonymity of crowds".
I've also written before about the immensely profitable business of ransomware. Recent events, such as WannaCrypt, NotPetya and the details of NSA's ability to infect air-gapped computers should convince anyone that ransomware is a threat to which archives are exposed. Below the fold I look into how archives should be designed to resist this credible threat.
Labels: digital preservation, fault tolerance, human error, malware