Thursday, December 28, 2017

Why Decentralize?

In Blockchain: Hype or Hope? (paywalled until June '18) Radia Perlman asks what exactly you get from the decentralization that blockchain technologies buy at such enormous resource cost. Her answer is:
a ledger agreed upon by consensus of thousands of anonymous entities, none of which can be held responsible or be shut down by some malevolent government ... [but] most applications would not require or even want this property.
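Two important essays published last February by pioneers in the field provide different answers to Perlman's question:
  • Vitalik Buterin's The Meaning of Decentralization
  • Nick Szabo's Money, blockchains, and social scalability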
Below the fold I try to apply our experience with the decentralized LOCKSS technology to ask whether their arguments hold up. I'm working on a follow-up post based on Chelsea Barabas, Neha Narula and Ethan Zuckerman's Defending Internet Freedom through Decentralization from last August, which asks the question specifically about the decentralized Web and thus the idea of decentralized storage.

Buterin

The Meaning of Decentralization is the shorter and more accessible of the two essays. Vitalik Buterin is a co-founder of Ethereum, as one can tell from the links in his essay, which starts by discussing what decentralization means:
When people talk about software decentralization, there are actually three separate axes of centralization/decentralization that they may be talking about. While in some cases it is difficult to see how you can have one without the other, in general they are quite independent of each other. The axes are as follows:
  • Architectural (de)centralization — how many physical computers is a system made up of? How many of those computers can it tolerate breaking down at any single time?
  • Political (de)centralization — how many individuals or organizations ultimately control the computers that the system is made up of?
  • Logical (de)centralization— does the interface and data structures that the system presents and maintains look more like a single monolithic object, or an amorphous swarm? One simple heuristic is: if you cut the system in half, including both providers and users, will both halves continue to fully operate as independent units?
He notes that:
Blockchains are politically decentralized (no one controls them) and architecturally decentralized (no infrastructural central point of failure) but they are logically centralized (there is one commonly agreed state and the system behaves like a single computer)
The Global LOCKSS network (GLN) is decentralized on all three axes. Individual libraries control their own network node; nodes cooperate but do not trust each other; no network operation involves more than a small proportion of the nodes. The CLOCKSS network, built from the same technology, is decentralized on the architectural and logical axes, but is centralized on the political axis since all the nodes are owned by the CLOCKSS archive.

Buterin then asks:
why is decentralization useful in the first place? There are generally several arguments raised:
  • Fault tolerance— decentralized systems are less likely to fail accidentally because they rely on many separate components that are not likely.
  • Attack resistance— decentralized systems are more expensive to attack and destroy or manipulate because they lack sensitive central points that can be attacked at much lower cost than the economic size of the surrounding system.
  • Collusion resistance — it is much harder for participants in decentralized systems to collude to act in ways that benefit them at the expense of other participants, whereas the leaderships of corporations and governments collude in ways that benefit themselves but harm less well-coordinated citizens, customers, employees and the general public all the time.
As regards fault tolerance, I think what Buterin meant by "that are not likely" is "that are not likely to suffer common-mode failures", because he goes on to ask:
Do blockchains as they are today manage to protect against common mode failure? Not necessarily. Consider the following scenarios:
  • All nodes in a blockchain run the same client software, and this client software turns out to have a bug.
  • All nodes in a blockchain run the same client software, and the development team of this software turns out to be socially corrupted.
  • The research team that is proposing protocol upgrades turns out to be socially corrupted.
  • In a proof of work blockchain, 70% of miners are in the same country, and the government of this country decides to seize all mining farms for national security purposes.
  • The majority of mining hardware is built by the same company, and this company gets bribed or coerced into implementing a backdoor that allows this hardware to be shut down at will.
  • In a proof of stake blockchain, 70% of the coins at stake are held at one exchange.
His recommendations for improving fault tolerance, in his words, "are fairly obvious".
They may be "fairly obvious" but some of them are very hard to achieve in the real world. For example:
  • What matters isn't that there are multiple competing implementations, but rather what fraction of the network's resources use the most common implementation. Pointing, as Buterin does, to a list of implementations of the Ethereum protocol (some of which are apparently abandoned) is interesting but if the majority of mining power runs one of them the network is vulnerable. Since one of them is likely to be more efficient than the others, a monoculture is likely to arise.
  • Similarly, the employment or volunteer status of the "core developers and researchers" isn't very important if they are vulnerable to the kind of group-think that we see in the Bitcoin community.
  • While it is true that the Ethereum mining algorithm is designed to enable smaller miners to function, it doesn't address the centralizing force I described in Economies of Scale in Peer-to-Peer Networks. If smaller mining nodes are more cost-effective, economies of scale will drive the network to be dominated by large collections of smaller mining nodes under unified control. If they aren't more cost-effective, the network will be dominated by collections of larger mining nodes under unified control. Either way, you get political centralization and lose attack resistance.
The result is, as Buterin points out, that the vaunted attack resistance of proof-of-work blockchains like Bitcoin's is less than credible:
In the case of blockchain protocols, the mathematical and economic reasoning behind the safety of the consensus often relies crucially on the uncoordinated choice model, or the assumption that the game consists of many small actors that make decisions independently. If any one actor gets more than 1/3 of the mining power in a proof of work system, they can gain outsized profits by selfish-mining. However, can we really say that the uncoordinated choice model is realistic when 90% of the Bitcoin network’s mining power is well-coordinated enough to show up together at the same conference?
But it turns out that coordination is a double-edged sword:
Many communities, including Ethereum’s, are often praised for having a strong community spirit and being able to coordinate quickly on implementing, releasing and activating a hard fork to fix denial-of-service issues in the protocol within six days. But how can we foster and improve this good kind of coordination, but at the same time prevent “bad coordination” that consists of miners trying to screw everyone else over by repeatedly coordinating 51% attacks?
And this makes resisting collusion hard:
Collusion is difficult to define; perhaps the only truly valid way to put it is to simply say that collusion is “coordination that we don’t like”. There are many situations in real life where even though having perfect coordination between everyone would be ideal, one sub-group being able to coordinate while the others cannot is dangerous.
As Tonto said to the Lone Ranger, "What do you mean we, white man?"

Our SOSP paper showed how, given a large surplus of replicas, the LOCKSS polling protocol made it hard for even a very large collusion among the peers to modify the consensus of the non-colluding peers without detection. The large surplus of replicas allowed each peer to involve a random sample of other peers in each operation. Absent an attacker, the result of each operation would be landslide agreement or landslide disagreement. The random sample of peers made it hard for an attacker to ensure that all operations resulted in landslides.

Alas, this technique has proven difficult to apply in other contexts, which in any case (except for cryptocurrencies) find it difficult to provide a sufficient surplus of replicas.
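The sampling argument can be illustrated with a toy calculation. This is a minimal sketch, not the LOCKSS polling protocol itself: it assumes every colluding peer votes one way, every honest peer votes the other, and a poll counts as a landslide only when at least `landslide_votes` of the sampled peers agree. The function names, the population size and the threshold are all hypothetical choices made for the illustration.

```python
from math import comb


def pmf_colluders(k, population, colluders, sample):
    """P(exactly k colluders in a uniform random sample drawn without replacement)."""
    return (comb(colluders, k) * comb(population - colluders, sample - k)
            / comb(population, sample))


def poll_outcomes(population, colluders, sample, landslide_votes):
    """Return (p_honest_landslide, p_colluder_landslide, p_inconclusive).

    landslide_votes is the (hypothetical) number of sampled peers that must
    agree for a poll to count as a landslide; it must exceed sample / 2 so
    the two landslide outcomes cannot overlap.
    """
    if not sample / 2 < landslide_votes <= sample:
        raise ValueError("landslide_votes must be in (sample / 2, sample]")
    # Colluders win a landslide when they fill at least landslide_votes seats.
    p_colluder = sum(pmf_colluders(k, population, colluders, sample)
                     for k in range(landslide_votes, sample + 1))
    # Honest peers win a landslide when colluders hold few enough seats.
    p_honest = sum(pmf_colluders(k, population, colluders, sample)
                   for k in range(0, sample - landslide_votes + 1))
    return p_honest, p_colluder, 1.0 - p_honest - p_colluder


if __name__ == "__main__":
    # Illustrative numbers only: 1000 peers, polls of 20, landslide at 16 votes.
    for colluders in (100, 400, 700):
        honest, bad, unclear = poll_outcomes(1000, colluders, 20, 16)
        print(f"{colluders} colluders: honest={honest:.3f} "
              f"colluder={bad:.3f} inconclusive={unclear:.3f}")
```

In this simplified model an inconclusive poll is the detectable event. Unless the colluders are either a small minority or an overwhelming majority, most random samples land in the inconclusive middle, which is why the attacker cannot make every operation look like a landslide. The real protocol has further defenses that this sketch omits.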

Szabo

Nick Szabo was a pioneer of digital currency, sometimes even suspected of being Satoshi Nakamoto. His essay Money, blockchains, and social scalability starts by agreeing with Perlman that blockchains are extremely wasteful of resources:
Blockchains are all the rage. The oldest and biggest blockchain of them all is Bitcoin, ... Running non-stop for eight years, with almost no financial loss on the chain itself, it is now in important ways the most reliable and secure financial network in the world.

The secret to Bitcoin’s success is certainly not its computational efficiency or its scalability in the consumption of resources. ... Bitcoin’s puzzle-solving hardware probably consumes in total over 500 megawatts of electricity. ... Rather than reduce its protocol messages to be as few as possible, each Bitcoin-running computer sprays the Internet with a redundantly large number of “inventory vector” packets to make very sure that all messages get accurately through to as many other Bitcoin computers as possible. As a result, the Bitcoin blockchain cannot process as many transactions per second as a traditional payment network such as PayPal or Visa.
Szabo then provides a different answer than Perlman's to the question "what does Bitcoin get in return for this profligate expenditure of resources?"
the secret to Bitcoin’s success is that its prolific resource consumption and poor computational scalability is buying something even more valuable: social scalability. ... Social scalability is about the ways and extents to which participants can think about and respond to institutions and fellow participants as the variety and numbers of participants in those institutions or relationships grow. It's about human limitations, not about technological limitations or physical resource constraints.
He measures social scalability thus:
One way to estimate the social scalability of an institutional technology is by the number of people who can beneficially participate in the institution. ... blockchains, and in particular public blockchains that implement cryptocurrencies, increase social scalability, even at a dreadful reduction in computational efficiency and scalability.
People participate in cryptocurrencies in three ways, by mining, transacting, and HODL-ing. In practice most miners simply passively contribute resources to a few large mining pools in return for a consistent cash flow. Chinese day-traders generate the vast majority of Bitcoin transactions. 40% of Bitcoin are HODL-ed by a small number of early adopters. None of these is a great advertisement for social scalability. Bitcoin is a scheme to transfer money from many later adopters to a few earlier ones:
Bitcoin was substantially mined early on - early adopters have most of the coins. The design was such that early users would get vastly better rewards than later users for the same effort.

Cashing in these early coins involves pumping up the price, then selling to later adopters, particularly in the bubbles. Thus Bitcoin was not a Ponzi or pyramid scheme, but a pump-and-dump. Anyone who bought in after the earliest days is functionally the sucker in the relationship.
Szabo goes on to discuss the desirability of trust minimization and the impossibility of eliminating the need for trust:
Trust minimization is reducing the vulnerability of participants to each other’s and to outsiders’ and intermediaries’ potential for harmful behavior. ... In most cases an often trusted and sufficiently trustworthy institution (such as a market) depends on its participants trusting, usually implicitly, another sufficiently trustworthy institution (such as contract law). ... An innovation can only partially take away some kinds of vulnerability, i.e. reduce the need for or risk of trust in other people. There is no such thing as a fully trustless institution or technology. ... The historically recent breakthroughs of computer science can reduce vulnerabilities, often dramatically so, but they are far from eliminating all kinds of vulnerabilities to the harmful behavior of any potential attacker.
Szabo plausibly argues that the difference between conventional Internet services and blockchains is that between matchmaking and trust-minimization:
Matchmaking is facilitating the mutual discovery of mutually beneficial participants. Matchmaking is probably the kind of social scalability at which the Internet has most excelled. ... Whereas the main social scalability benefit of the Internet has been matchmaking, the predominant direct social scalability benefit of blockchains is trust minimization. ... Trust in the secret and arbitrarily mutable activities of a private computation can be replaced by verifiable confidence in the behavior of a generally immutable public computation. This essay will focus on such vulnerability reduction and its benefit in facilitating a standard performance beneficial to a wide variety of potential counterparties, namely trust-minimized money.
Szabo then describes his vision of "trust-minimized money" and its advantages thus:
A new centralized financial entity, a trusted third party without a “human blockchain” of the kind employed by traditional finance, is at high risk of becoming the next Mt. Gox; it is not going to become a trustworthy financial intermediary without that bureaucracy.

Computers and networks are cheap. Scaling computational resources requires cheap additional resources. Scaling human traditional institutions in a reliable and secure manner requires increasing amounts accountants, lawyers, regulators, and police, along with the increase in bureaucracy, risk, and stress that such institutions entail. Lawyers are costly. Regulation is to the moon. Computer science secures money far better than accountants, police, and lawyers.
Given the routine heists from exchanges it is clear that the current Bitcoin ecosystem is much less secure than traditional financial institutions. And imagine if the huge resources devoted to running the Bitcoin blockchain were instead devoted to additional security for the financial institutions!

Szabo is correct that:
In computer science there are fundamental security versus performance tradeoffs. Bitcoin's automated integrity comes at high costs in its performance and resource usage. Nobody has discovered any way to greatly increase the computational scalability of the Bitcoin blockchain, for example its transaction throughput, and demonstrated that this improvement does not compromise Bitcoin’s security.
The LOCKSS technology's automated security also comes from using a lot of computational resources, although by doing so it avoids expensive and time-consuming copyright negotiations. But then Szabo argues that because of the resource cost and the limited transaction throughput, the best that can be delivered is a reduced level of security for most transactions:
Instead, a less trust-minimized peripheral payment network (possibly Lightning) will be needed to bear a larger number of lower-value bitcoin-denominated transactions than Bitcoin blockchain is capable of, using the Bitcoin blockchain to periodically settle with one high-value transaction batches of peripheral network transactions.
Despite the need for peripheral payment networks, Szabo argues:
Anybody with a decent Internet connection and a smart phone who can pay $0.20-$2 transaction fees – substantially lower than current remitance fees -- can access Bitcoin any where on the globe.
[Figure: Transaction Fees]
That was then. Current transaction fees are in the region of $50, with a median transaction size of about $4K, so the social scalability of Bitcoin transactions no longer extends to "Anybody with a decent Internet connection and a smart phone". As I wrote:
To oversimplify, the argument for Bitcoin and its analogs is the argument for gold, that because the supply is limited the price will go up. The history of the block size increase shows that the supply of Bitcoin transactions is limited to something around 4 per second. So by the same argument that leads to HODL-ing, the cost of getting out when you decide you can't HODL any more will always go up. And, in particular, it will go up the most when you need it the most, when the bubble bursts.
Szabo's outdated optimism continues:
When it comes to small-b bitcoin, the currency, there is nothing impossible about paying retail with bitcoin the way you’d pay with a fiat currency. ... Gold can have value anywhere in the world and is immune from hyperinflation because its value doesn’t depend on a central authority. Bitcoin excels at both these factors and runs online, enabling somebody in Albania to use Bitcoin to pay somebody in Zimbabwe with minimal trust in or and no payment of quasi-monopoly profits to intermediaries, and with minimum vulnerability to third parties.
[Figure: Mining Pools 12/25/17]
They'd better be paying many tens of thousands of dollars to make the transaction fees to the quasi-monopoly mining pools (top 6 pools = 79.8% of the mining power) worth the candle. Bitcoin just lost 25% of its "value" in a day, which would count as hyperinflation if it hadn't recently gained 25% in a day. In practice, they need to trust exchanges. And, as David Gerard recounts in Chapter 7 of Attack of the 50 Foot Blockchain, retailers who have tried accepting Bitcoin have found the volatility, the uncertainty of transactions succeeding and the delays impossible to live with.

Szabo's discussion of blockchains has worn better than his discussion of cryptocurrencies. It starts with a useful definition:
It is a blockchain if it has blocks and it has chains. The “chains” should be Merkle trees or other cryptographic structures with ... post-unforgeable integrity. Also the transactions and any other data whose integrity is protected by a blockchain should be replicated in a way objectively tolerant to worst-case malicious problems and actors to as high a degree as possible (typically the system can behave as previously specified up to a fraction of 1/3 to 1/2 of the servers maliciously trying to subvert it to behave differently).
and defines the benefit blockchains provide thus:
To say that data is post-unforgeable or immutable means that it can’t be undetectably altered after being committed to the blockchain. Contrary to some hype this doesn’t guarantee anything about a datum’s provenance, or its truth or falsity, before it was committed to the blockchain.
but this doesn't eliminate the need for (trusted) governance because attacks by holders of 51% or less of the mining power are possible:
and because of the (hopefully very rare) need to update software in a manner that renders prior blocks invalid – an even riskier situation called a hard fork -- blockchains also need a human governance layer that is vulnerable to fork politics.
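Szabo's notion of post-unforgeable data can be sketched with a simple hash chain. This is an illustration under stated assumptions, not Bitcoin's block format and not a Merkle tree: each block stores the SHA-256 hash of its predecessor, and the helper names (`block_hash`, `append_block`, `first_invalid`) are hypothetical.

```python
import hashlib
import json

GENESIS_PREV = "0" * 64  # placeholder predecessor hash for the first block


def block_hash(index, prev_hash, data):
    """SHA-256 over a canonical JSON encoding of the block's contents."""
    payload = json.dumps({"index": index, "prev": prev_hash, "data": data},
                         sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain, data):
    """Append a block that commits to the hash of the previous block."""
    index = len(chain)
    prev_hash = chain[-1]["hash"] if chain else GENESIS_PREV
    chain.append({"index": index, "prev": prev_hash, "data": data,
                  "hash": block_hash(index, prev_hash, data)})


def first_invalid(chain):
    """Return the index of the first block that fails verification, or None."""
    prev_hash = GENESIS_PREV
    for i, block in enumerate(chain):
        if (block["index"] != i
                or block["prev"] != prev_hash
                or block["hash"] != block_hash(i, prev_hash, block["data"])):
            return i
        prev_hash = block["hash"]
    return None


if __name__ == "__main__":
    chain = []
    for entry in ("alice pays bob 1", "bob pays carol 1", "carol pays dave 1"):
        append_block(chain, entry)
    print(first_invalid(chain))       # chain verifies as built
    chain[1]["data"] = "bob pays mallory 1"
    print(first_invalid(chain))       # the altered block no longer verifies
```

Two limits of the sketch match Szabo's caveats. A false entry verifies just as well as a true one, because the chain says nothing about a datum before it was committed. And someone who can rewrite every later block can recompute all the hashes, so detection depends on other parties holding independent copies, which is where replication and the governance questions come in.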
The possibility of 51% attacks means that it is important to identify who is behind the powerful miners. Szabo's earlier "bit gold" was based on his "secure property titles":
Also like today’s private blockchains, secure property titles assumed and required securely distinguishable and countable nodes.
Given the objective 51% hashrate attack limit to some important security goals of public blockchains like Bitcoin and Ethereum, we actually do care about the distinguishable identity of the most powerful miners to answer the question “can somebody convince and coordinate the 51%?”
Or the 49% of the top three pools. Identification of the nodes is the basic difference between public and private blockchains:
So I think some of the “private blockchains” qualify as bona fide blockchains; others should go under the broader rubric of “distributed ledger” or “shared database” or similar. They are all very different from and not nearly as socially scalable as public and permissionless blockchains like Bitcoin and Ethereum.

All of the following are very similar in requiring an securely identified (distinguishable and countable) group of servers rather than the arbitrary anonymous membership of miners in public blockchains. In other words, they require some other, usually far less socially scalable, solution to the Sybil (sockpuppet) attack problem:
  • Private blockchains
  • The “federated” model of sidechains (Alas, nobody has figured out how to do sidechains with any lesser degree of required trust, despite previous hopes or claims). Sidechains can also be private chains, and it’s a nice fit because their architectures and external dependencies (e.g. on a PKI) are similar.
  • Multisig-based schemes, even when done with blockchain-based smart contracts
  • Threshold-based “oracle” architectures for moving off-blockchain data onto blockchains
Like blockchains, the LOCKSS technology can be used in public (as in the Global LOCKSS Network) or private (as in the CLOCKSS Archive) networks. The CLOCKSS network identifies its nodes using a Public Key Infrastructure (PKI):
The dominant, but usually not very socially scalable, way to identify a group of servers is with a PKI based on trusted certificate authorities (CAs). To avoid the problem that merely trusted third parties are security holes, reliable CAs themselves must be expensive, labor-intensive bureaucracies that often do extensive background checks themselves or rely on others (e.g. Dun and Bradstreet for businesses) to do so.
Public certificate authorities have proven untrustworthy, but private CAs are within the trust border of the sponsoring organization.

Szabo is right that:
We need more socially scalable ways to securely count nodes, or to put it another way to with as much robustness against corruption as possible, assess contributions to securing the integrity of a blockchain.
But in practice, the ideal picture of blockchains hasn't worked out for Bitcoin:
That is what proof-of-work and broadcast-replication are about: greatly sacrificing computational scalability in order to improve social scalability. That is Satoshi’s brilliant tradeoff. It is brilliant because humans are far more expensive than computers and that gap widens further each year. And it is brilliant because it allows one to seamlessly and securely work across human trust boundaries (e.g. national borders), in contrast to “call-the-cop” architectures like PayPal and Visa that continually depend on expensive, error-prone, and sometimes corruptible bureaucracies to function with a reasonable amount of integrity.
[Figure: Total Daily Transaction Fees]
With the overhead cost of transactions currently running at well over $10M/day, it's not clear that "humans are far more expensive than computers". With almost daily reports of thefts over $10M, Bitcoin lacks "a reasonable amount of integrity" at the level most people interact with it.
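As a rough cross-check, the figures quoted earlier in this post (fees around $50, a median transaction of about $4K, and a ceiling of around 4 transactions per second) can be combined in a few lines. The sketch assumes, as a simplification, that the network runs at that ceiling all day and that every transaction pays the quoted fee; the variable and function names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Figures quoted in the post; treated here as round-number assumptions.
TX_PER_SECOND = 4        # approximate supply limit on Bitcoin transactions
FEE_USD = 50             # approximate current fee per transaction
MEDIAN_TX_USD = 4000     # approximate median transaction size


def daily_tx_capacity(tx_per_second):
    """Transactions per day if the network runs at its ceiling all day."""
    return tx_per_second * SECONDS_PER_DAY


def daily_fee_total(tx_per_second, fee_usd):
    """Total fees per day if every transaction pays fee_usd."""
    return daily_tx_capacity(tx_per_second) * fee_usd


def fee_fraction(fee_usd, tx_usd):
    """Fee as a fraction of the amount transferred."""
    return fee_usd / tx_usd


if __name__ == "__main__":
    print(daily_tx_capacity(TX_PER_SECOND))          # 345600
    print(daily_fee_total(TX_PER_SECOND, FEE_USD))   # 17280000
    print(fee_fraction(FEE_USD, MEDIAN_TX_USD))      # 0.0125
```

Under these assumptions the fee total comes to about $17M/day, consistent with "well over $10M/day", and the fee on a median-sized transaction is 1.25% of the amount moved.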

It is possible that other public blockchain applications might not suffer these problems. But mining blocks needs to be costly for the chain to deter Sybil attacks, and these costs need to be covered. So, as I argued in Economies of Scale in Peer-to-Peer Networks, there has to be an exchange rate between the chain's "coins" and the fiat currencies that equipment and power vendors accept. Economies of scale will apply, and drive centralization of the network. If the "coins" become, as Bitcoins did, channels for flight capital and speculation, the network will also become a target for crime. Private blockchains escape these problems, but they lack social scalability and have single points of failure; their advantages over more conventional and efficient systems are unclear.

3 comments:

Anonymous said...

«Szabo's discussion of blockchains has worn better than his discussion of cryptocurrencies. It starts with a useful definition:»

I dunno why techies pay so much attention to "blockchain" coins, the issues within etc. have been thoroughly discussed for decades. The only big deal is that a lot of "greater fools" have rushed into pump-and-dump schemes.

As to "blockchains" techies are routinely familiar with the Linux kernel 'git' crypto blockchain ledger, which was designed precisely to ensure that source code deposits and withdrawals into contributors' accounts were cryptographically secured in a peer-to-peer way to ensure malicious servers could not subvert the kernel source.

David. said...

One-stop counterfeit certificate shops for all your malware-signing needs by Dan Goodin is an example of why treating Certificate Authorities as "Trusted Third Parties" is problematic:

"A report published by threat intelligence provider Recorded Future ... identified four such sellers of counterfeit certificates since 2011. Two of them remain in business today. The sellers offered a variety of options. In 2014, one provider calling himself C@T advertised certificates that used a Microsoft technology known as Authenticode for signing executable files and programming scripts that can install software. C@T offered code-signing certificates for macOS apps as well. His fee: upwards of $1,000 per certificate."

Note that these certificates are not counterfeit; they are real certificates "registered under legitimate corporations and issued by Comodo, Thawte, and Symantec—the largest and most respected issuers". They are the result of corporate identity theft and failures of the verification processes of the issuers.

David. said...

"Over 23,000 users will have their SSL certificates revoked by tomorrow morning, March 1, in an incident between two companies —Trustico and DigiCert— that is likely to have a huge impact on the CA (Certificate Authority) industry as a whole in the coming months." is the start of a Catalin Cimpanu post.

It is a complicated story of Certificate Authorities behaving badly (who could have imagined?). Cimpanu has a useful timeline. The gist is that Trustico used to resell certificates from DigiCert but was switching to resell certificates from Comodo. During this spat with DigiCert, it became obvious that:

A) Trustico's on-line certificate generation process captured and stored the user's private keys, which is a complete no-no. Dan Goodin writes:

"private keys for TLS certificates should never be archived by resellers, and, even in the rare cases where such storage is permissible, they should be tightly safeguarded. A CEO being able to attach the keys for 23,000 certificates to an email raises troubling concerns that those types of best practices weren't followed. (There's no indication the email was encrypted, either, although neither Trustico nor DigiCert provided that detail when responding to questions.)"

B) Trustico's approach to website security was inadequate. They had to take their website down:

"shortly after a website security expert disclosed a critical vulnerability on Twitter that appeared to make it possible for outsiders to run malicious code on Trustico servers. The vulnerability, in a trustico.com website feature that allowed customers to confirm certificates were properly installed on their sites, appeared to run as root. By inserting commands into the validation form, attackers could call code of their choice and get it to run on Trustico servers with unfettered "root" privileges, the tweet indicated."