Tuesday, September 25, 2018

Web Archives As Evidence

In Blockchain Solves Preservation! I critiqued John Collomosse et al's ARCHANGEL: Trusted Archives of Digital Public Documents. They argue that
integrity validation via hashes is needed because:
Document integrity is fundamental to public trust in archives. Yet currently that trust is built upon institutional reputation — trust at face value in a centralised authority, like a national government archive or University.
But they also write that:
acceptance of content evidence might eventually become similar to acceptance of DNA evidence in court, but that establishing that level of confidence would require strong public engaged to explain Blockchain in an accessible manner particularly explaining why one could trust the cryptographic assurances inherent in a DLT solution.
At least as far as courts are concerned, they're wrong about both "face value" and how trust is established. Below the fold, an explanation.

Tuesday, September 18, 2018

Vint Cerf on Traceability

Vint Cerf's Traceability addresses a significant problem:
how to preserve the freedom and openness of the Internet while protecting against the harmful behaviors that have emerged in this global medium. That this is a significant challenge cannot be overstated. The bad behaviors range from social network bullying and misinformation to email spam, distributed denial of service attacks, direct cyberattacks against infrastructure, malware propagation, identity theft, and a host of other ills
Cerf's proposed solution is:
differential traceability. The ability to trace bad actors to bring them to justice seems to me an important goal in a civilized society. The tension with privacy protection leads to the idea that only under appropriate conditions can privacy be violated. By way of example, consider license plates on cars. They are usually arbitrary identifiers and special authority is needed to match them with the car owners ... This is an example of differential traceability; the police department has the authority to demand ownership information from the Department of Motor Vehicles that issues the license plates. Ordinary citizens do not have this authority.
Below the fold I examine this proposal and one of the responses.

Thursday, September 13, 2018

Blockchain Solves Preservation!

We're in a period when blockchain or "Distributed Ledger Technology" is the Solution to Everything™, so it is inevitable that it will be proposed as the solution to the problems of digital preservation. John Collomosse et al's abstract for ARCHANGEL: Trusted Archives of Digital Public Documents states:
We present ARCHANGEL; a de-centralised platform for ensuring the long-term integrity of digital documents stored within public archives. Document integrity is fundamental to public trust in archives. Yet currently that trust is built upon institutional reputation --- trust at face value in a centralised authority, like a national government archive or University. ARCHANGEL proposes a shift to a technological underscoring of that trust, using distributed ledger technology (DLT) to cryptographically guarantee the provenance, immutability and so the integrity of archived documents. We describe the ARCHANGEL architecture, and report on a prototype of that architecture build over the Ethereum infrastructure. We report early evaluation and feedback of ARCHANGEL from stakeholders in the research data archives space.
This is a wonderful example of the way people blithely assume that the claimed properties of blockchain systems are actually delivered in the real world. Below the fold I ask whether Collomosse et al have applied appropriate skepticism to blockchain's claims, and whether they have considered the sustainability of their proposal.

Tuesday, September 11, 2018

What Does Data "Durability" Mean

In What Does 11 Nines of Durability Really Mean? David Friend writes:
No amount of nines can prevent data loss.

There is one very important and inconvenient truth about reliability: Two-thirds of all data loss has nothing to do with hardware failure.

The real culprits are a combination of human error, viruses, bugs in application software, and malicious employees or intruders. Almost everyone has accidentally erased or overwritten a file. Even if your cloud storage had one million nines of durability, it can’t protect you from human error.
Friend may be right that these are the top 5 causes of data loss, but over the timescale of preservation as opposed to storage they are far from the only ones. In Requirements for Digital Preservation Systems: A Bottom-Up Approach we listed 13 of them. Below the fold, some discussion of the meaning and usefulness of durability claims.

Tuesday, September 4, 2018

Chia Network

Back in March I wrote Proofs of Space, analyzing Bram Cohen's fascinating EE380 talk. I've now learned more about Chia Network, the company that is implementing a network using his methods. Below the fold I look into their prospects.