Friday, October 8, 2021

Talk At "Blockchain for Business" Conference

I was invited to be on a panel at the University of Arkansas' "Blockchain for Business" conference together with John Ryan and Dan Geer. Below the fold are my introductory remarks.

I'd like to thank Dan Conway for inviting me to talk about the security of blockchains. You don't need to take notes; the text of my remarks with links to the sources is at blog.dshr.org.

"Blockchain" is unfortunately a term used to describe two completely different technologies, which have in common only that they both use a data structure called a Merkle Tree, commonly in the form patented by Stuart Haber and Scott Stornetta in 1991. This is a linear chain of blocks, each including the hash of the previous block. Even more unfortunately, the more secure way to implement trustworthy public databases using Merkle Trees isn't called a blockchain, so it doesn't benefit from the tsunami of hype that surrounds the term.
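
To make "a linear chain of blocks, each including the hash of the previous block" concrete, here is a minimal Python sketch. The block layout (a dict with `index`, `prev_hash` and `payload`) and the function names are assumptions made for illustration; they are not taken from any particular blockchain.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the block (illustrative format)."""
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_block(chain: list, payload: str) -> None:
    """Append a block that commits to its predecessor via prev_hash."""
    prev = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"index": len(chain), "prev_hash": prev, "payload": payload})


def chain_is_valid(chain: list) -> bool:
    """Check that every block's prev_hash matches the hash of the block before it."""
    for i in range(1, len(chain)):
        if chain[i]["prev_hash"] != block_hash(chain[i - 1]):
            return False
    return True


chain = []
for payload in ["alpha", "beta", "gamma"]:
    append_block(chain, payload)
```

Changing the payload of any block other than the last breaks the link stored in its successor, so `chain_is_valid` returns False. The last block has no successor to protect it, which is why a real system also needs some way to agree on, or publish, the hash of the head of the chain.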

Permissioned blockchains have a central authority controlling which network nodes can add blocks to the chain, whereas permissionless blockchains such as Bitcoin's do not; this difference is fundamental:
  • Permissioned blockchains can use well-established and relatively efficient techniques such as Byzantine Fault Tolerance to ensure that each node in the network has performed the same computation on the same data to arrive at the same state for the next block in the chain. This is a consensus mechanism.
  • In principle each node in a permissionless blockchain's network can perform a different computation on different data to arrive at a different state for the next block in the chain. Which of these blocks ends up in the chain is determined by a randomized, biased election mechanism. For example, in Proof-of-Work blockchains such as Bitcoin's, a node wins election by being the first to solve a puzzle. The length of time it takes to solve the puzzle is random, but the probability of being first is biased: it is proportional to the compute power the node uses.
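
The "randomized, biased election" can be illustrated with a small simulation. This is a sketch under a modelling assumption that is not in the remarks above: each node's time to solve the puzzle is drawn from an exponential distribution whose rate is the node's compute power, and the node with the shortest time wins. The node names and power values are hypothetical.

```python
import random


def elect_winner(hash_power: dict, rng: random.Random) -> str:
    """Return the node that solves the puzzle first in one simulated round."""
    # Assumption: solve time is exponential with rate equal to compute power.
    solve_times = {node: rng.expovariate(power) for node, power in hash_power.items()}
    return min(solve_times, key=solve_times.get)


def win_shares(hash_power: dict, rounds: int, seed: int = 0) -> dict:
    """Fraction of rounds won by each node over many simulated elections."""
    rng = random.Random(seed)
    wins = {node: 0 for node in hash_power}
    for _ in range(rounds):
        wins[elect_winner(hash_power, rng)] += 1
    return {node: count / rounds for node, count in wins.items()}


# Hypothetical network: "large" has three times the compute power of "small".
shares = win_shares({"small": 1.0, "large": 3.0}, rounds=20000)
```

Under this model any individual round is unpredictable, but over many rounds the share of wins for "large" should settle near 3/4, its share of the total compute power.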
This fundamental difference means that the problems of securing the two blockchains are quite different:
  • A permissioned blockchain is a way to implement a distributed database. Securing it is a conventional problem. You need to ensure the central authority doesn't admit bad actors. You need to ensure each node is under separate administration with no shared credentials, to guard against compromise. Ideally, each node should run different software to guard against supply chain attacks, and so on.
  • Securing a permissionless blockchain is an unconventional problem. Because anyone, even bad guys, can take part, its security depends primarily on ensuring that the cost of a successful attack greatly exceeds the rewards to be obtained from it.
To succeed, the attacker of a permissionless blockchain needs a high probability of being elected, which typically means they need to control a majority of the electorate. Making this control more expensive than the potential reward for an attack requires that being a voter be expensive. This has a number of consequences:
  • There is no central authority to collect funds to pay the voters, so they need to be reimbursed by the system itself via either inflation of a cryptocurrency, or transaction fees, or both. Currently, Bitcoin miners' income is around 90% rewards (i.e. inflation). Research shows a fee-only system, as Bitcoin is intended to become, is insecure.
  • Imposing costs with Proof-of-Work, as most cryptocurrencies do, leads to catastrophic carbon footprints. Alternatives to Proof-of-Work are vastly more complex, and extremely difficult to get right. Ethereum has been trying to implement Proof-of-Stake for seven years.
  • Information technologies have strong economies of scale. The more resource a voter has, the better its margins. Thus successful permissionless blockchains are centralized. 3-4 mining pools have controlled the majority of Bitcoin mining power for at least 7 years.
The advantage of permissionless over permissioned blockchains is claimed to be decentralization. But in practice this is an illusion, and the enormous costs of attempting to avoid centralization are wasted:
a Byzantine quorum system of size 20 could achieve better decentralization than proof-of-work mining at a much lower resource cost.
Ethereum has been even more centralized than Bitcoin, and because it is a programming environment its attack surface is exponentially greater. In particular, as we see in the recent $600M attack on Poly Network, it is much more vulnerable to supply chain attacks of the kind that Munin defends against in other environments.

Actually, centralization is often a good thing. Mistakes are inevitable, as we see with the recent $90M oopsie at Compound and the subsequent $67M oopsie, or the $23M fee Bitfinex paid for a $100K transaction. Centralization of Ethereum allowed Poly Network to convince miners to make most transfers of the $600M loot very difficult, and persuade the thief to return most of it. Immutability sounds like a great idea until you're the victim of a theft.

The less vulnerable way to implement a trustworthy decentralized database is shown by the Certificate Transparency system described in RFC6962. It is a trust-but-verify system, typically a much more appropriate model for business than immutability. It allows for real-time verification that the certificates that secure HTTPS were issued by the appropriate Certificate Authority (CA) and are current. In essence it is a network with three types of node:
  • Logs, to which CAs report their current certificates, and from which they obtain attestations, called Signed Certificate Timestamps (SCTs), that owners can attach to their certificates. Clients can verify the signature on the SCT, then verify that the hash it contains matches the certificate. If it does, the certificate was the one that the CA reported to the log, and the owner validated. Each log maintains a Merkle tree data structure of the certificates for which it has issued SCTs.
  • Monitors, which periodically download all newly added entries from the logs that they monitor, verify that they have in fact been added to the log, and perform a series of validity checks on them. They also thus act as backups for the logs they monitor.
  • Auditors, which use the Merkle tree of the logs they audit to verify that certificates have been correctly appended to the log, and that no retroactive insertions, deletions or modifications of the certificates in the log have taken place. Clients can use auditors to determine whether a certificate appears in a log. If it doesn't, they can use the SCT to prove that the log misbehaved.
As with permissioned blockchains, a few tens of nodes provide adequate decentralization. A key point is that clients verify certificates against a random subset of the tens of nodes they trust, which for each node is a different subset of the whole set of nodes. Thus an attacker has to compromise the vast majority of the nodes to avoid detection. This aids efficiency, by optimizing for the common case when no attack is taking place, while still providing a very high probability of unambiguous detection while an attack is underway.
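
The way an auditor can check that a certificate appears in a log can be sketched in Python. The leaf and node hashing (a 0x00 or 0x01 prefix before SHA-256) and the split at the largest power of two smaller than the number of entries follow the Merkle tree definition in RFC 6962. Tagging each sibling hash with "L" or "R" is a simplification assumed here for readability; in the RFC the verifier works out the side from the leaf index and tree size. The function names and sample entries are hypothetical.

```python
import hashlib


def leaf_hash(data: bytes) -> bytes:
    """Hash of a log entry, with the 0x00 leaf prefix."""
    return hashlib.sha256(b"\x00" + data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    """Hash of an interior node, with the 0x01 node prefix."""
    return hashlib.sha256(b"\x01" + left + right).digest()


def split_point(n: int) -> int:
    """Largest power of two strictly less than n, for n >= 2."""
    k = 1
    while k * 2 < n:
        k *= 2
    return k


def merkle_root(entries: list) -> bytes:
    """Root hash of the tree over the given entries."""
    if not entries:
        return hashlib.sha256(b"").digest()
    if len(entries) == 1:
        return leaf_hash(entries[0])
    k = split_point(len(entries))
    return node_hash(merkle_root(entries[:k]), merkle_root(entries[k:]))


def audit_path(index: int, entries: list) -> list:
    """Sibling hashes from the leaf up to the root, each tagged with its side."""
    if len(entries) == 1:
        return []
    k = split_point(len(entries))
    if index < k:
        return audit_path(index, entries[:k]) + [("R", merkle_root(entries[k:]))]
    return audit_path(index - k, entries[k:]) + [("L", merkle_root(entries[:k]))]


def verify_inclusion(entry: bytes, path: list, root: bytes) -> bool:
    """Recompute the root from one entry and its audit path, then compare."""
    h = leaf_hash(entry)
    for side, sibling in path:
        h = node_hash(h, sibling) if side == "R" else node_hash(sibling, h)
    return h == root


# Hypothetical log contents standing in for certificates.
entries = [b"cert-a", b"cert-b", b"cert-c", b"cert-d", b"cert-e"]
root = merkle_root(entries)
```

A client that holds only `root` can confirm that a given entry is in the log from a path whose length grows with the logarithm of the log size, without downloading the other entries. Any change to an entry changes the recomputed root, so the check fails.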

Note that unlike a blockchain, this is not a consensus or an election mechanism. It is a mechanism for ensuring that none of the actors in the network can escape responsibility for their actions, which in many cases is what is needed. For example, Hof and Carle show how the same mechanism can be applied to securing the software supply chain.
