Tuesday, March 27, 2018

Bad Blockchain Content

A Quantitative Analysis of the Impact of Arbitrary Blockchain Content on Bitcoin by Roman Matzutt et al. examines the stuff in the Bitcoin blockchain that isn't a monetary transaction. They:
provide the first systematic analysis of the benefits and threats of arbitrary blockchain content. Our analysis shows that certain content, e.g., illegal pornography, can render the mere possession of a blockchain illegal. Based on these insights, we conduct a thorough quantitative and qualitative analysis of unintended content on Bitcoin's blockchain. Although most data originates from benign extensions to Bitcoin's protocol, our analysis reveals more than 1600 files on the blockchain, over 99% of which are texts or images.
Below the fold, some details.

The authors found a small but significant amount of non-financial data in the Bitcoin blockchain:
In total, our detectors found 3,535,855 transactions carrying a total payload of 118.53 MiB, i.e., only 1.4 % of Bitcoin transactions contain non-financial data.
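As a rough sanity check on those figures, the quoted numbers imply both the total transaction count the authors scanned and the average payload per data-carrying transaction. This is only back-of-the-envelope arithmetic on the three numbers in the quote; the function names are mine, and because the 1.4 % is rounded the implied total is approximate.

```python
# Figures taken from the quoted passage of the paper.
DATA_TXS = 3_535_855      # transactions carrying non-financial data
PAYLOAD_MIB = 118.53      # total non-financial payload, in MiB
DATA_TX_SHARE = 0.014     # "only 1.4 %" of all Bitcoin transactions (rounded)


def implied_total_transactions(data_txs: int, share: float) -> float:
    """Total transaction count implied by the data-carrying count and its share."""
    return data_txs / share


def average_payload_bytes(payload_mib: float, data_txs: int) -> float:
    """Mean payload per data-carrying transaction, in bytes (1 MiB = 1024 * 1024 bytes)."""
    return payload_mib * 1024 * 1024 / data_txs


total = implied_total_transactions(DATA_TXS, DATA_TX_SHARE)
avg = average_payload_bytes(PAYLOAD_MIB, DATA_TXS)

print(f"implied total transactions: roughly {total / 1e6:.0f} million")
print(f"average payload per data-carrying transaction: roughly {avg:.0f} bytes")
```

So the typical data-carrying transaction holds only a few tens of bytes, which fits the authors' point that most of this data comes from benign protocol extensions rather than from stored files.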
They list some useful forms of non-financial content on the Bitcoin blockchain:
digital notary services, secure releases of cryptographic commitments, or non-equivocation schemes
A range of services provide the ability to inject arbitrary non-financial content into the Bitcoin blockchain. The most popular of the services:
are infrequently but steadily used; since 2016 we recognize on average 23.65 data items being added per month using these services.
The authors identified the traces the services leave, and extracted the arbitrary content:
content insertion services account for 16.12 MiB of non-financial data.
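To give a feel for what extracting embedded content involves, here is a minimal sketch for the simplest insertion method that I know of, an OP_RETURN output, whose script is the OP_RETURN opcode (0x6a) followed by a data push. This is not the authors' detector, which covers many more insertion techniques; the function name and the example message are my own assumptions, and only single direct pushes and OP_PUSHDATA1 pushes are handled.

```python
from typing import Optional

OP_RETURN = 0x6A      # marks an output as provably unspendable
OP_PUSHDATA1 = 0x4C   # next byte holds the length of the pushed data


def extract_op_return_payload(script_hex: str) -> Optional[bytes]:
    """Return the data pushed by an OP_RETURN output script, or None.

    Handles a single push only: either a direct push (opcodes 1..75,
    where the opcode itself is the length) or an OP_PUSHDATA1 push.
    """
    script = bytes.fromhex(script_hex)
    if len(script) < 2 or script[0] != OP_RETURN:
        return None
    opcode = script[1]
    if 1 <= opcode <= 75:
        start, length = 2, opcode
    elif opcode == OP_PUSHDATA1 and len(script) >= 3:
        start, length = 3, script[2]
    else:
        return None
    payload = script[start:start + length]
    # Reject scripts that are truncated relative to the declared length.
    return payload if len(payload) == length else None


# Hypothetical example: a 16-byte text message embedded in an output script.
message = b"hello blockchain"
example_script = bytes([OP_RETURN, len(message)]) + message
print(extract_op_return_payload(example_script.hex()))
```

An ordinary pay-to-public-key-hash output script does not start with OP_RETURN, so the function returns None for it. Content hidden in fake addresses or non-standard scripts would need different, heuristic detectors.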
They provide a taxonomy of the techniques used to inject non-financial data, and analyze their effectiveness. They then point out the unavoidable risk that arbitrary content carries:
since all Bitcoin participants maintain a complete local copy of the blockchain (e.g., to ensure correctness of blockchain updates and to bootstrap new users), these desired and vital features put all users at risk when objectionable content is irrevocably stored on the blockchain.
The risk arises because under the law in Germany and most other countries:
a person is culpable for possession of illegal content if she knowingly possesses an accessible document holding said content. It is critical here that German law perceives the hard disk holding the blockchain as a document and that users can easily reassemble any illegal content within the blockchain. Furthermore, users can be assumed to knowingly maintain control over such illegal content w.r.t. German law if sufficient media coverage causes the content's existence to become public knowledge among Bitcoin users
Their examination of the Bitcoin blockchain found, among the 16.12 MiB of arbitrary content, instances of most of the types of objectionable content that they listed:
  • Copyright violations:
    We found seven files that publish (intellectual) property and showcase Bitcoin's potential to aid copyright violations. Engraved are the text of a book, a copy of the original Bitcoin paper, and two short textual white papers. Furthermore, we found two leaked cryptographic keys: one RSA private key and a firmware secret key. Finally, the blockchain contains a so-called illegal prime, encoding software to break the copy protection of DVDs.
  • Malware:
    We could not find actual malware in Bitcoin's blockchain. However, an individual non-standard transaction contains a non-malicious cross-site scripting detector. A security researcher inserted this small piece of code which, if interpreted by an online blockchain parser, notifies the author about the vulnerability. Such malicious code could become a threat for users as most websites offering an online blockchain parser also offer online Bitcoin accounts.
  • Privacy violations:
    Users store memorable private moments on the blockchain. We extracted six wedding-related images and one image showing a group of people, labeled with their online pseudonyms. Furthermore, 609 transactions contain online public chat logs, emails, and forum posts discussing Bitcoin, including topics such as money laundering. Storing private chat logs on the blockchain can, e.g., leak single user's private information irrevocably. Moreover, third parties can release information without knowledge nor consent of affected users. Most notably, we found at least two instances of doxing, i.e., the complete disclosure of another individual's personal information. This data includes phone numbers, addresses, bank accounts, passwords, and multiple online identities. Recently, jurisdictions such as the European Union began to punish such serious privacy violations, including the distribution of doxing data. Again, carrying out such assaults via blockchains fortifies the problem due to their immutability.
  • Politically sensitive content:
    The blockchain has been used by whistleblowers as a censorship-resistant permanent storage for leaked information. We found backups of the WikiLeaks Cablegate data as well as an online news article concerning pro-democracy demonstrations in Hong Kong in 2014.
  • Illegal and condemned content:
    Bitcoin's blockchain contains at least eight files with sexual content. While five files only show, describe, or link to mildly pornographic content, we consider the remaining three instances objectionable for almost all jurisdictions: Two of them are backups of link lists to child pornography, containing 274 links to websites, 142 of which refer to Tor hidden services. The remaining instance is an image depicting mild nudity of a young woman. In an online forum this image is claimed to show child pornography, albeit this claim cannot be verified (due to ethical concerns we refrain from providing a citation). Notably, two of the explicit images were only detected by our suspicious-transaction detector, i.e., they were not inserted via known services.
By analogy with authorities' seemingly convenient ability to find child pornography in the computers of those they wish to prosecute for other offences, there is a theory that some government seeded the Bitcoin blockchain with this content to provide a convenient basis for suppressing the currency. Interpol's warning can be interpreted in this context. These ideas fit neatly into the libertarian, anti-government philosophy underlying much cryptocurrency advocacy.

Non-blockchain content sharing services have a wider range of techniques available to handle objectionable content:
The trade-off between enabling open systems for data distribution and risking that unwanted or even illegal content is being shared is already known from peer-to-peer networks. Peer-to-peer-based file-sharing protocols typically limit the spreading of objectionable public content by tracking the reputation of users offering files or assigning a reputation to files themselves. This way, users can reject objectionable content or content from untrustworthy sources. Contrarily, distributed content stores usually resort to encrypt private files before outsourcing them to other peers. By storing only encrypted files, users can plausibly deny possessing any content of others and can thus obliviously store it on their hard disk. Unfortunately, these protection mechanisms are not applicable to blockchains, as content cannot be deleted once it has been added to the blockchain and the utilization of encryption cannot be enforced reliably.
The whole point of a public blockchain such as Bitcoin's is that it eliminates the need to trust a central authority, replacing it with trust in the consensus of the mining power. But if there is any way for miners to confirm blocks containing non-financial data, everyone storing a copy of the blockchain must also trust the sources of the non-financial data not to place them at risk. Since all methods of inserting non-financial data into the Bitcoin blockchain require payment in Bitcoin, the source is visible in the blockchain, and can likely be de-anonymized after the fact. But even de-anonymizing and punishing the miscreant would not remove the objectionable content or mitigate the risk to others.

This problem seems endemic to any immutable public ledger that allows the insertion of arbitrary data. Last week it became much more pressing:
In the wake of this week’s passage of the Allow States and Victims to Fight Online Sex Trafficking Act (FOSTA) bill in both houses of Congress on Wednesday, Craigslist has removed its "Personals" section entirely, and Reddit has removed some related subreddits, likely out of fear of future lawsuits.
Even more pressing is the effect of the EU's General Data Protection Regulation. David Meyer writes in Blockchain is on a collision course with EU privacy law:
The bloc’s General Data Protection law, which will come into effect in a few months’ time, says people must be able to demand that their personal data is rectified or deleted under many circumstances.
...
And with sanctions for flouting the GDPR including fines of up to €20 million or 4 percent of global revenues, many businesses may find the ultra-buzzy blockchain trend a lot less palatable than they first thought.
...
Altering data “just doesn’t work on a blockchain,” said John Mathews, the chief finance officer for Bitnation, a project that aims to provide blockchain-based identity and governance services, as well as document storage.
But not to worry. Because blockchain, the EU will just have to tear up the GDPR and start again:
“From a blockchain point of view, the GDPR is already out of date,” Mathews said. “Regulation plays catch-up with technology. The GDPR was written on the assumption that you have centralized services controlling access rights to the user’s data, which is the opposite of what a permissionless blockchain does.”

Jutta Steiner is the founder of Parity.io, a startup that develops decentralized technologies, and the former security chief for the Ethereum Foundation. She agrees with Mathews that “the GDPR needs a proper review.”

“From a practitioner’s perspective, it sounds to me that it was drafted by trying to implement a certain perspective of how the world should be without taking into account how technology actually works,” Steiner said.
...
“I can’t see the regulators being so stubborn as to not adjust the regulation. … They’ll just see the other countries will use the technology and Europe is at a disadvantage.”
So that's all right then. The EU's citizens will learn to love Garbage In, Garbage Out because otherwise they can't blockchain!

6 comments:

David. said...

"In An Empirical Analysis of Traceability in the Monero Blockchain, a group of eminent computer scientists analyze a longstanding privacy defect in the Monero cryptocurrency, and reveal a new, subtle flaw, both of which can be used to potentially reveal the details of transactions and identify their parties. ... But the Monero problems exemplify a special problem of blockchain anonymity. By design, every transaction in the blockchain is irrevocably, universally, permanently public. That means that when new defects are discovered in a blockchain-based anonymity tool, attackers can download all the transactions that ever took place under the flawed anonymity protocol and go to work de-anonymizing them. ... But one of the defenses against future disclosures of defects in encryption techniques is to throw away the old messages once they're done with, to reduce the availability of decryptable ciphertexts. And that's not possible on the blockchain, because the blockchain only works if you can't delete things from it." reports Cory Doctorow at Boing Boing. Yet another reason why immutability is more of a bug than a feature.

David. said...

"Another day, ... initial coin offering (ICO).

This time it's the turn of - wait for it - the “GDPR Cash token”, which promises access to “a community of GDPR experts” who can help businesses find their way around all 11 chapters and 99 articles of the EU's General Data Protection Regulation, due to come into force on May 25." from Immutable ledgers meet European data protection by Jemima Kelly in Alphaville's Someone is wrong on the Internet series.

David. said...

David Gerard reports, not for the first time, that the Ethereum blockchain has another kind of bad content, bad code in so-called "smart contracts". To be precise, two integer overflow bugs in code that has been widely copy-and-pasted into immutable smart contracts. Gerard's book has a whole chapter on smart contract bugs. He writes:

"Remember that not even Gavin Wood — the Ph.D computer scientist who wrote the Ethereum protocol specification — could write a smart contract safely enough not to lose hundreds of millions of dollars of his startup’s ICO funds in the Parity wallet disaster last November. What makes you sufficiently sure that you can?"

Emin Gun Sirer points out that he called out the integer overflow problem last July.
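For readers unfamiliar with this class of bug, here is a minimal sketch of how an unchecked 256-bit multiplication can defeat a balance check. It simulates the arithmetic in Python and is not the code of any actual contract Gerard describes; the token, the function names and the addresses are all hypothetical.

```python
UINT256_MAX = 2**256 - 1


def unchecked_mul(a: int, b: int) -> int:
    """Multiply the way unchecked 256-bit unsigned arithmetic does: wrap modulo 2**256."""
    return (a * b) & UINT256_MAX


def checked_mul(a: int, b: int) -> int:
    """Multiply, but refuse to continue if the result does not fit in 256 bits."""
    product = a * b
    if product > UINT256_MAX:
        raise OverflowError("uint256 multiplication overflow")
    return product


def batch_transfer(balances, sender, receivers, value, mul=unchecked_mul):
    """Hypothetical token method: send `value` tokens to each receiver."""
    amount = mul(len(receivers), value)   # total to debit; may silently wrap
    if balances.get(sender, 0) < amount:
        raise ValueError("insufficient balance")
    balances[sender] = balances.get(sender, 0) - amount
    for receiver in receivers:
        balances[receiver] = balances.get(receiver, 0) + value
    return balances


# An attacker holding no tokens picks a value so that 2 * value wraps to 0.
huge = 2**255
result = batch_transfer({"attacker": 0}, "attacker", ["a", "b"], huge)
print(result["a"] == huge, result["attacker"])

# With a checked multiply, the same call is rejected.
try:
    batch_transfer({"attacker": 0}, "attacker", ["a", "b"], huge, mul=checked_mul)
except OverflowError as err:
    print("rejected:", err)
```

The fix is a one-line overflow check, but a contract that has already been deployed without it cannot be patched in place, which is why this counts as another kind of bad content.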

David. said...

"security researcher Omer Zohar demonstrated proof-of-concept code for a fully functional [botnet] command-and-control infrastructure built on top of the Ethereum network. Zohar was exploring the scope for potential misuse of blockchain in a bid to keep one step ahead of hackers and develop potential mitigation strategies." from John Leyden at The Register.

David. said...

A year after this post, the BBC reported in Child abuse images hidden in crypto-currency blockchain that the increased blocksize of Bitcoin Satoshi Vision (BSV) had provided much greater scope for immutably publishing illegal content.

David. said...

Ernesto Van der Sar's Cloudflare Blocks Abusive Content on its Ethereum Gateway illustrates a problem with immutable blockchain storage that I discussed in this post 5 years ago:

"One of Cloudflare’s main aims is to make the Internet more secure while respecting the privacy of its users. This laudable goal is broadly respected but in common with other internet services, abuse of Cloudflare’s services can lead to conflicting situations.
...
In its most recent transparency report, Cloudflare further notes that it has implemented access restrictions on its public Ethereum gateway. The company doesn’t store any content on the Ethereum network, nor can it remove any. However, it can block access through its service.

If Cloudflare receives valid abuse reports or copyright infringement complaints, it will take appropriate action. The same applies to the gateway for the decentralized IPFS network."