provide the first systematic analysis of the benefits and threats of arbitrary blockchain content. Our analysis shows that certain content, e.g., illegal pornography, can render the mere possession of a blockchain illegal. Based on these insights, we conduct a thorough quantitative and qualitative analysis of unintended content on Bitcoin's blockchain. Although most data originates from benign extensions to Bitcoin's protocol, our analysis reveals more than 1600 files on the blockchain, over 99% of which are texts or images.
Below the fold, some details.
In total, our detectors found 3,535,855 transactions carrying a total payload of 118.53 MiB, i.e., only 1.4 % of Bitcoin transactions contain non-financial data.
They list some useful forms of non-financial content on the Bitcoin blockchain:
are infrequently but steadily used; since 2016 we recognize on average 23.65 data items being added per month using these services.
The authors identified the traces the services leave, and extracted the arbitrary content:
content insertion services account for 16.12 MiB of non-financial data.
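As a rough illustration of what extracting such content involves, here is a minimal sketch that recovers the payload from one common carrier of non-financial data, an OP_RETURN output script. This is an assumption for illustration only, not the authors' detector: the quoted text does not describe their detectors, and content can also be hidden in other parts of a transaction. The function name `op_return_payload` is hypothetical; the opcode values are those of Bitcoin Script.

```python
from typing import Optional

# Bitcoin Script opcodes relevant to data-carrying outputs.
OP_RETURN = 0x6A
# OP_PUSHDATA1/2/4 are followed by a 1-, 2- or 4-byte little-endian length.
PUSHDATA_WIDTH = {0x4C: 1, 0x4D: 2, 0x4E: 4}


def op_return_payload(script_hex: str) -> Optional[bytes]:
    """Return the bytes pushed after OP_RETURN, or None if the script
    is not an OP_RETURN script. Illustrative sketch only."""
    script = bytes.fromhex(script_hex)
    if not script or script[0] != OP_RETURN:
        return None

    payload = bytearray()
    i = 1
    while i < len(script):
        op = script[i]
        i += 1
        if 0x01 <= op <= 0x4B:
            # Opcodes 0x01..0x4b push that many bytes directly.
            size = op
        elif op in PUSHDATA_WIDTH:
            width = PUSHDATA_WIDTH[op]
            size = int.from_bytes(script[i:i + width], "little")
            i += width
        else:
            # Any other opcode ends the data pushes we care about.
            break
        # Slicing tolerates truncated scripts; we keep what is present.
        payload += script[i:i + size]
        i += size
    return bytes(payload)
```

For example, `op_return_payload("6a0b68656c6c6f20776f726c64")` yields the bytes of the text "hello world", while an ordinary payment script that does not start with OP_RETURN yields `None`.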
since all Bitcoin participants maintain a complete local copy of the blockchain (e.g., to ensure correctness of blockchain updates and to bootstrap new users), these desired and vital features put all users at risk when objectionable content is irrevocably stored on the blockchain.
The risk arises because under the law in Germany and most other countries:
a person is culpable for possession of illegal content if she knowingly possesses an accessible document holding said content. It is critical here that German law perceives the hard disk holding the blockchain as a document and that users can easily reassemble any illegal content within the blockchain. Furthermore, users can be assumed to knowingly maintain control over such illegal content w.r.t. German law if sufficient media coverage causes the content's existence to become public knowledge among Bitcoin users.
Their examination of the Bitcoin blockchain found, among the 16.12 MiB of arbitrary content, instances of most of the types of objectionable content that they listed:
- Copyright violations:
We found seven files that publish (intellectual) property and showcase Bitcoin's potential to aid copyright violations. Engraved are the text of a book, a copy of the original Bitcoin paper, and two short textual white papers. Furthermore, we found two leaked cryptographic keys: one RSA private key and a firmware secret key. Finally, the blockchain contains a so-called illegal prime, encoding software to break the copy protection of DVDs.
- Malware:
We could not find actual malware in Bitcoin's blockchain. However, an individual non-standard transaction contains a non-malicious cross-site scripting detector. A security researcher inserted this small piece of code which, if interpreted by an online blockchain parser, notifies the author about the vulnerability. Such malicious code could become a threat for users as most websites offering an online blockchain parser also offer online Bitcoin accounts.
- Privacy violations:
Users store memorable private moments on the blockchain. We extracted six wedding-related images and one image showing a group of people, labeled with their online pseudonyms. Furthermore, 609 transactions contain online public chat logs, emails, and forum posts discussing Bitcoin, including topics such as money laundering. Storing private chat logs on the blockchain can, e.g., leak single user's private information irrevocably. Moreover, third parties can release information without knowledge nor consent of affected users. Most notably, we found at least two instances of doxing, i.e., the complete disclosure of another individual's personal information. This data includes phone numbers, addresses, bank accounts, passwords, and multiple online identities. Recently, jurisdictions such as the European Union began to punish such serious privacy violations, including the distribution of doxing data. Again, carrying out such assaults via blockchains fortifies the problem due to their immutability.
- Politically sensitive content:
The blockchain has been used by whistleblowers as a censorship-resistant permanent storage for leaked information. We found backups of the WikiLeaks Cablegate data as well as an online news article concerning pro-democracy demonstrations in Hong Kong in 2014.
- Illegal and condemned content:
Bitcoin's blockchain contains at least eight files with sexual content. While five files only show, describe, or link to mildly pornographic content, we consider the remaining three instances objectionable for almost all jurisdictions: Two of them are backups of link lists to child pornography, containing 274 links to websites, 142 of which refer to Tor hidden services. The remaining instance is an image depicting mild nudity of a young woman. In an online forum this image is claimed to show child pornography, albeit this claim cannot be verified (due to ethical concerns we refrain from providing a citation). Notably, two of the explicit images were only detected by our suspicious-transaction detector, i.e., they were not inserted via known services.
Non-blockchain content sharing services have a wider range of techniques available to handle objectionable content:
The trade-off between enabling open systems for data distribution and risking that unwanted or even illegal content is being shared is already known from peer-to-peer networks. Peer-to-peer-based file-sharing protocols typically limit the spreading of objectionable public content by tracking the reputation of users offering files or assigning a reputation to files themselves. This way, users can reject objectionable content or content from untrustworthy sources. Contrarily, distributed content stores usually resort to encrypt private files before outsourcing them to other peers. By storing only encrypted files, users can plausibly deny possessing any content of others and can thus obliviously store it on their hard disk. Unfortunately, these protection mechanisms are not applicable to blockchains, as content cannot be deleted once it has been added to the blockchain and the utilization of encryption cannot be enforced reliably.
The whole point of a public blockchain such as Bitcoin's is that it eliminates the need to trust a central authority, replacing it with trust in the consensus of the mining power. But if there is any way for miners to confirm blocks containing non-financial data, everyone storing a copy of the blockchain trusts the sources of the non-financial data not to place them at risk. Since all methods of inserting non-financial data into the Bitcoin blockchain require payment in Bitcoin, the source is visible in the blockchain, and can likely be de-anonymized after the fact. But even de-anonymizing and punishing the miscreant would not remove the objectionable content or mitigate the risk to others.
This problem seems endemic to any immutable public ledger that allows the insertion of arbitrary data. Last week it became much more pressing:
In the wake of this week’s passage of the Allow States and Victims to Fight Online Sex Trafficking Act (FOSTA) bill in both houses of Congress on Wednesday, Craigslist has removed its "Personals" section entirely, and Reddit has removed some related subreddits, likely out of fear of future lawsuits.
Even more pressing is the effect of the EU's General Data Protection Regulation. David Meyer writes in Blockchain is on a collision course with EU privacy law:
The bloc’s General Data Protection law, which will come into effect in a few months’ time, says people must be able to demand that their personal data is rectified or deleted under many circumstances.
But not to worry. Because blockchain, the EU will just have to tear up the GDPR and start again:
And with sanctions for flouting the GDPR including fines of up to €20 million or 4 percent of global revenues, many businesses may find the ultra-buzzy blockchain trend a lot less palatable than they first thought.
Altering data “just doesn’t work on a blockchain,” said John Mathews, the chief finance officer for Bitnation, a project that aims to provide blockchain-based identity and governance services, as well as document storage.
“From a blockchain point of view, the GDPR is already out of date,” Mathews said. “Regulation plays catch-up with technology. The GDPR was written on the assumption that you have centralized services controlling access rights to the user’s data, which is the opposite of what a permissionless blockchain does.”
So that's all right then. The EU's citizens will learn to love Garbage In, Garbage Out because otherwise they can't blockchain!
Jutta Steiner is the founder of Parity.io, a startup that develops decentralized technologies, and the former security chief for the Ethereum Foundation. She agrees with Mathews that “the GDPR needs a proper review.”
“From a practitioner’s perspective, it sounds to me that it was drafted by trying to implement a certain perspective of how the world should be without taking into account how technology actually works,” Steiner said.
“I can’t see the regulators being so stubborn as to not adjust the regulation. … They’ll just see the other countries will use the technology and Europe is at a disadvantage.”