|Bitcoin pools 9/1/18|
Gradually, the economies of scale you need to make money mining Bitcoin are concentrating mining power in fewer and fewer hands. I believe this centralizing tendency is a fundamental problem for all incentive-compatible P2P networks. ... After all, the decentralized, distributed nature of Bitcoin was supposed to be its most attractive feature.
That October I expanded the comment into Economies of Scale in Peer-to-Peer Networks, in which I wrote:
The simplistic version of the problem is this:
- The income to a participant in a P2P network of this kind should be linear in their contribution of resources to the network.
- The costs a participant incurs by contributing resources to the network will be less than linear in their resource contribution, because of the economies of scale.
- Thus the proportional profit margin a participant obtains will increase with increasing resource contribution.
- Thus the effects described in Brian Arthur's Increasing Returns and Path Dependence in the Economy will apply, and the network will be dominated by a few, perhaps just one, large participant.
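The four steps above can be illustrated numerically. The sketch below is only an illustration under assumed numbers: the power-law cost curve, the `unit_cost` value and the `scale_exponent` value are hypothetical choices, not figures from the post or the report. Any cost function that grows less than linearly would show the same trend.

```python
def profit_margin(resources: float, price: float = 1.0,
                  unit_cost: float = 0.9, scale_exponent: float = 0.9) -> float:
    """Proportional profit margin (profit / income) for one participant.

    Income is linear in the resources contributed. Cost is assumed to follow
    a power law with exponent < 1, a stand-in for economies of scale.
    """
    income = price * resources
    cost = unit_cost * resources ** scale_exponent
    return (income - cost) / income


# The margin rises with contribution, so larger participants out-earn
# smaller ones per unit of resource.
for r in (1, 10, 100, 1000):
    print(f"resources={r:>5}  margin={profit_margin(r):.3f}")
```

Setting `scale_exponent=1.0` removes the economies of scale, and the margin becomes the same for every participant size, which is the case the argument says does not hold in practice.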
|Ethereum miners 11/07/21|
Now, a DARPA-sponsored report entitled Are Blockchains Decentralized? by a large team from the Trail of Bits security company conforms to Betteridge's Law. They examine this and many other centralizing forces affecting a wide range of blockchain implementations and conclude that the answer to their question is "No". Below the fold I comment on each of their findings (in italic), then discuss Professor Angela Walch's analysis of the problems of using "decentralized" in a legal context.
The challenge with using a blockchain is that one has to either (a) accept its immutability and trust that its programmers did not introduce a bug, or (b) permit upgradeable contracts or off-chain code that share the same trust issues as a centralized approach.
Although the report is focused on Bitcoin's blockchain, it includes some of Ethereum's problems:
For example, Alice can submit a transaction to a contract and, before the transaction is mined, the contract could be upgraded to have completely different semantics. The transaction would be executed against the new contract. Upgradeable contract patterns have become incredibly popular in Ethereum as they allow developers to circumvent immutability to patch bugs after deployment. But they also allow developers to patch in backdoors that would allow them to abscond with a contract’s assets. The challenge with using a blockchain is that one has to either (a) accept its immutability and trust that the programmers did not introduce a bug, or (b) permit upgradeable contracts or off-chain code that share the same trust issues as a centralized approach.
Given that it is impossible to predict how long any given transaction will be delayed, the risk Alice runs can be significant. I first discussed the problems that "upgradeable" contracts both cure and create in 2018's DINO and IINO, linking to Udi Wertheimer's 2017 Bancor Unchained: All Your Token Are Belong To Us:
Bancor’s contracts are “upgradeable”, meaning they can replace them with new functionality, giving them more power, or removing power from themselves. They promise on some communications they will gradually remove their control over the system.
Bancor's contracts had administrative backdoors that allowed, for example, taking anyone's tokens. An attacker exploited them to take complete control of the contract and steal $23M.
Every widely used blockchain has a privileged set of entities that can modify the semantics of the blockchain to potentially change past transactions.
By which the authors mean the developers and maintainers of the software. They note that:
In some cases, the developers or maintainers of a blockchain intentionally modify its software to mutate the blockchain’s state to revert or mitigate an attack—this was Ethereum’s response to the 2016 DAO hack. But in most other cases, changes to a blockchain are an unintentional or unexpected consequence of another change. For example, Ethereum’s Constantinople hard fork reduced the gas costs of certain operations. However, some immutable contracts that were deployed before the hard fork relied on the old costs to prevent a certain class of attack called “reentrancy.” Constantinople’s semantic changes caused these once secure contracts to become vulnerable.
|Report page 23|
We generated software bills of materials (SBOMs) and dependency graphs for the major clients for Bitcoin, Bitcoin Cash, Bitcoin Gold, Ethereum, Zcash, Iota, Dash, Dogecoin, Monero, and Litecoin. We then compared two dependency graphs based on the clients’ normalized edit distance.
The table shows that almost all the clients are at least 90% identical. This commonality makes various blockchains vulnerable to supply chain attacks on other blockchains:
While software bugs can lead to consensus errors, we demonstrated that overt software changes can also modify the state of the blockchain. Therefore, the core developers and maintainers of blockchain software are a centralized point of trust in the system, susceptible to targeted attack. There are currently four active contributors with access to modify the Bitcoin Core codebase, the compromise of any of whom would allow for arbitrary modification of the codebase. Recently, the lead developer of the $8 billion Polygon network, Jordi Baylina, was recently targeted in an attack with the Pegasus malware, which could have been used to steal his wallet or deployment credentials.
Many recent software supply chain attacks have used compromised developer credentials. Thus the security of the blockchain depends upon the operational security of the developers.
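The report compares clients by "normalized edit distance". The excerpt quoted here does not say exactly how that distance was computed over dependency graphs, so the following is only a sketch of the general idea, assuming plain Levenshtein distance over two sequences, divided by the length of the longer one. The function names and the dependency names in the example are hypothetical.

```python
from typing import Sequence


def levenshtein(a: Sequence, b: Sequence) -> int:
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter sequence in the inner loop
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (item_a != item_b)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def similarity(a: Sequence, b: Sequence) -> float:
    """1.0 for identical sequences, 0.0 when every position differs."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


# Hypothetical, made-up dependency lists for two clients.
client_a = ["lib_crypto", "lib_net", "lib_db", "lib_json"]
client_b = ["lib_crypto", "lib_net", "lib_kv", "lib_json"]
print(similarity(client_a, client_b))
```

The same functions accept `bytes`, so they could also be applied to two pieces of compiled bytecode, although the quadratic running time makes this slow for large inputs.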
The number of entities sufficient to disrupt a blockchain is relatively low: four for Bitcoin, two for Ethereum, and less than a dozen for most PoS networks.
This is the so-called "Nakamoto coefficient". The authors make the same point that I have been making for the last eight years:
It is well known that Bitcoin is economically centralized: in 2020, 4.5% of Bitcoin holders controlled 85% of the currency. But what about Bitcoin’s systemic or authoritative centralization? As we saw in the last section, Bitcoin’s Nakamoto coefficient is four, because taking control of the four largest mining pools would provide a hashrate sufficient to execute a 51% attack. In January of 2021, the Nakamoto coefficient for Ethereum was only two. As of April 2022, it is three.
The authors explain this lack of decentralization:
Each mining pool operates its own, proprietary, centralized protocol and interacts with the public Bitcoin network only through a gateway node. In other words, there are really only a handful of nodes that participate in the consensus network on behalf of the majority of the network’s hashrate. Controlling those nodes provides the means to, at a minimum, deny service to their constituent hashrate.
They perform the same analysis for a set of Proof-of-Stake blockchains, which are naturally centralized because of the extreme Gini coefficients of cryptocurrencies:
Most PoS blockchain’s consensus protocols (Avalanche’s Snowflake, Solana’s Tower BFT, etc.) break down if the validators associated with at least one-third of the staked assets are malicious, effectively pausing the network. Therefore, the Nakamoto coefficient of most PoS blockchains is equal to the smallest number of validators that have collectively staked at least a third of all of the staked assets.
And they point out that for Ethereum2, the most consequential PoS blockchain, the Nakamoto coefficient is currently 12 because:
According to Nansen, the four biggest depositors have more than a third of the stake, and those depositors have 12 nodes.
There are many problems with Proof-of-Stake, but obvious ones include that the depositors are pseudonymous, and could thus actually be one person, that much of the currency is held by exchanges on behalf of their customers (see Justin Sun's takeover of the Steem blockchain), and that it is possible to borrow large amounts of cryptocurrencies. These issues were aptly illustrated by Molly White's report Solend DAO passes proposal to take over the account of a large holder with a position that poses systemic risk:
The proposal succeeded hours after it was proposed, with one whale providing 1 million votes out of the 1.15 million votes in favor.
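The Nakamoto coefficient discussed above is simple to compute once the shares are known. The sketch below assumes the entities are independent, which cannot be verified for pseudonymous participants. The function name and the share values are hypothetical, chosen only to show the two thresholds the report uses: more than half of the hashrate for Proof-of-Work, and at least one-third of the stake for most Proof-of-Stake protocols.

```python
from typing import Iterable, Optional


def nakamoto_coefficient(shares: Iterable[float], threshold: float,
                         strict: bool = True) -> Optional[int]:
    """Smallest number of largest entities whose combined share passes threshold.

    strict=True  requires strictly more than the threshold (majority hashrate).
    strict=False requires at least the threshold (one-third of stake).
    Returns None if the threshold can never be passed.
    """
    ordered = sorted(shares, reverse=True)
    target = threshold * sum(ordered)
    running = 0.0
    for count, share in enumerate(ordered, start=1):
        running += share
        if running > target or (not strict and running >= target):
            return count
    return None


# Made-up percentage shares, for illustration only.
hypothetical_shares = [20, 18, 15, 12, 10, 10, 8, 7]
print(nakamoto_coefficient(hypothetical_shares, 0.5))               # PoW-style
print(nakamoto_coefficient(hypothetical_shares, 1 / 3, strict=False))  # PoS-style
```

If several of the listed entities are in fact controlled by one party, the true coefficient is lower than the computed one, so the result is best read as an upper bound.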
The standard protocol for coordination within blockchain mining pools, Stratum, is unencrypted and, effectively, unauthenticated.
The Stratum protocol is how the mining pool operator divides up the work on its proposed block and assigns each fragment to pool members. The authors point out that:
We have discovered that, today, all of the mining pools we tested either assign a hard-coded password for all accounts or simply do not validate the password provided during authentication. For example, all ViaBTC accounts appear to be assigned the password “123.” Poolin seems not to validate authentication credentials at all. Slushpool explicitly instructs its users to ignore the password field as, “It is a legacy Stratum protocol parameter that has no use nowadays.” We discovered this by registering multiple accounts with the mining pools, and examining their server code, when available. These three mining pools alone account for roughly 25% of the Bitcoin hashrate.
The Stratum protocol was enhanced with passwords in order to mitigate a denial-of-service attack, but clearly the pools no longer care about it.
For a blockchain to be optimally distributed, there must be a so-called Sybil cost. There is currently no known way to implement Sybil costs in a permissionless blockchain like Bitcoin or Ethereum without employing a centralized trusted third party (TTP). Until a mechanism for enforcing Sybil costs without a TTP is discovered, it will be almost impossible for permissionless blockchains to achieve satisfactory decentralization.
As regards the economic forces driving centralization, in the section on Sybil and Eclipse Attacks: The “Other” 51%, the authors note that (Pg. 15):
A recent impossibility result for the decentralization of permissionless blockchains like Bitcoin and Ethereum was discovered by Kwon et al. It indicates that for a blockchain to be optimally distributed, there must be a so-called Sybil cost. That is, the cost of a single participant operating multiple nodes must be greater than the cost of operating one node.
In 2019's Impossibility of Full Decentralization in Permissionless Blockchains Kwon et al formalize and extend the mechanism I described in 2014. My argument for centralization was that economies of scale meant that the cost per Sybil would decrease with N; their argument is that unless the cost per Sybil increases with N the system will not be decentralized. And, as they point out, no-one has any idea how to push back against economies of scale, much less make the cost per Sybil go up with N:
Because there is no central membership register, permissionless blockchains have to defend against Sybil attacks. But they also face three other related problems:
- Maintaining the connectivity of the network as nodes join and leave without a central register of node addresses.
- Maintaining the "mempool" of pending transactions as they are created, chosen by miners to include in blocks, and eventually finalized with no central database.
- Maintaining the state of the blockchain as miners propose newly mined blocks to be appended to it with no central database.
All three are handled by a gossip protocol, in which each node tells its neighbors:
- The identities of its other neighbors. Thus a node wishing to join the network need only communicate with one member node, which will propagate its identity to the other nodes.
- Any transactions it has received. Thus, similarly, a node wishing to transact need only communicate with its neighbors to update the "mempool".
- Its idea of the head of the chain. Thus network-wide consensus on the longest chain is accelerated.
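The neighbor-to-neighbor propagation in the list above is a gossip protocol. A minimal sketch of it, assuming simple flooding in which every node forwards a new item to all of its neighbors exactly once, is shown below. The topology and the function name are hypothetical, and real clients add peer selection, validation and rate limiting that this sketch omits.

```python
from collections import deque
from typing import Dict, Hashable, Iterable


def gossip_rounds(neighbors: Dict[Hashable, Iterable[Hashable]],
                  origin: Hashable) -> Dict[Hashable, int]:
    """Return, for each reachable node, the hop count at which it first hears an item.

    neighbors maps a node to the peers it will forward to.
    """
    heard = {origin: 0}
    frontier = deque([origin])
    while frontier:
        node = frontier.popleft()
        for peer in neighbors.get(node, ()):
            if peer not in heard:  # each node accepts an item only once
                heard[peer] = heard[node] + 1
                frontier.append(peer)
    return heard


# Hypothetical five-node network; "E" is reachable only through "D".
topology = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
print(gossip_rounds(topology, "A"))
```

Removing "D" from this topology leaves "E" unreachable, which illustrates why the small set of well-connected nodes matters far more than the total node count.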
|Report page 18|
By crawling the Bitcoin network and querying nodes for known peers, we can estimate the number of public Bitcoin nodes (i.e., nodes actively accepting incoming connections). From crawling the Bitcoin network throughout 2021, we estimate that the public Bitcoin nodes constitute only 6–11% of the total number of nodes. Therefore, the vast majority of Bitcoin nodes do not meaningfully contribute to the health of the Bitcoin network. We have extended the Barabási–Albert random graph model to capture the behavior of Bitcoin peering. This model suggests that at the current size of the Bitcoin network, at least 10% of nodes must be public to ensure that new nodes are able to maximize their number of peers (and, therefore, maximize the health and connectivity of the network). As the total number of nodes increases, this bound approaches 40%.
The authors observe that only 6-11% of Bitcoin nodes accept incoming connections, and assume that the others are behind network address translators and can only listen to the gossip protocol. For the purpose of maintaining connectivity the Bitcoin network is much smaller than it appears. The authors conclude:
A dense, possibly non-scale-free, subnetwork of Bitcoin nodes appears to be largely responsible for reaching consensus and communicating with miners—the vast majority of nodes do not meaningfully contribute to the health of the network.Thus the target for attacks on the Bitcoin network is not the whole network, but only this subnetwork.
When nodes have an out-of-date or incorrect view of the network, this lowers the percentage of the hashrate necessary to execute a standard 51% attack. Moreover, only the nodes operated by mining pools need to be degraded to carry out such an attack. For example, during the first half of 2021 the actual cost of a 51% attack on Bitcoin was closer to 49% of the hashrate.
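One way to see why degraded nodes lower the attack threshold is a toy model, which is my simplification and not the report's own derivation. Assume an attacker controls a fraction a of the total hashrate, and a fraction d of the honest hashrate is wasted on a stale view of the chain. The attacker out-mines the effective honest hashrate when a > (1 - a)(1 - d), which rearranges to a > (1 - d) / (2 - d).

```python
def attack_threshold(degraded_honest_fraction: float) -> float:
    """Attacker share of total hashrate needed to out-mine the honest miners.

    Toy model: a fraction d of honest hashrate is wasted, so the attacker
    needs a > (1 - a) * (1 - d), i.e. a > (1 - d) / (2 - d).
    """
    d = degraded_honest_fraction
    if not 0.0 <= d < 1.0:
        raise ValueError("degraded fraction must be in [0, 1)")
    return (1.0 - d) / (2.0 - d)


for d in (0.0, 0.04, 0.10, 0.25):
    print(f"wasted honest hashrate={d:.2f}  threshold={attack_threshold(d):.3f}")
```

With no degradation the threshold is the familiar one half. Under this model, wasting about 4% of honest hashrate would bring it to roughly 49%, the order of magnitude the report quotes, though the report's figure comes from its own measurements.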
|Report page 14|
The vast majority of Bitcoin nodes appear to not participate in mining and node operators face no explicit penalty for dishonesty.
Since the vast majority of blocks are mined by the large pools, which each appear as a single node, this is inevitable. Miners in a pool that submit invalid blocks will be excluded and penalized by the pool.
Bitcoin traffic is unencrypted—any third party on the network route between nodes (e.g., ISPs, Wi-Fi access point operators, or governments) can observe and choose to drop any messages they wish.
And:
Of all Bitcoin traffic, 60% traverses just three ISPs.
As we see, for example, with bufferbloat, ISPs need not explicitly drop entire messages; they can introduce delays that cause packets to be dropped.
As of July 2021, about half of all public Bitcoin nodes were operating from IP addresses in German, French, and US ASes, the top four of which are hosting providers (Hetzner, OVH, Digital Ocean, and Amazon AWS). The country hosting the most nodes is the United States (roughly one-third), followed by Germany (one-quarter), France (10%), The Netherlands (5%), and China (3%). ... This is yet another potential surface on which to execute an eclipse attack, since the ISPs and hosting providers have the ability to arbitrarily degrade or deny service to any node. Traditional Border Gateway Protocol (BGP) routing attacks have also been identified as threats.
The effect of dropping messages and introducing delays is to reduce the threshold for a 51% attack.
The underlying network infrastructure is particularly important for Bitcoin and its derivatives, since all Bitcoin protocol traffic is unencrypted. Unencrypted traffic is fine for transactional and block data, since they are cryptographically signed and, therefore, impervious to tampering. However, any third party on the network route between nodes (e.g., ISPs, Wi-Fi access point operators, or governments) can observe and choose to drop any messages they wish.
Tor is now the largest network provider in Bitcoin, routing traffic for about half of Bitcoin’s nodes. Half of these nodes are routed through the Tor network, and the other half are reachable through .onion addresses. The next largest autonomous system (AS)—or network provider—is AS24940 from Germany, constituting only 10% of nodes. A malicious Tor exit node can modify or drop traffic similarly to an ISP.Malicious Tor exit nodes are a long-running problem.
Of Bitcoin’s nodes, 21% were running an old version of the Bitcoin Core client that is known to be vulnerable in June of 2021.
The security of a blockchain depends on the software, which will inevitably have vulnerabilities, which will need timely patches.
The Ethereum ecosystem has a significant amount of code reuse: 90% of recently deployed Ethereum smart contracts are at least 56% similar to each other.
The potential for "smart contracts" acquiring vulnerabilities through their software supply chain is very significant, because once again it is highly centralized. The authors sampled:
1,586 smart contracts deployed to the Ethereum blockchain in October 2021, and compared their bytecode similarity, using Levenshtein distance as a metric. One would expect such a metric to underestimate the similarity between contracts, since it compares low-level bytecode that has already been transformed, organized, and optimized by the compiler, rather than the original high-level source code. This metric was chosen both to act as a lower bound on similarity and to enable comparison between contracts for which we do not have the original source code. We discovered that 90% of the Ethereum smart contracts were at least 56% similar to each other. About 7% were completely identical.
However, the authors don't discuss the major cause of Ethereum's "decentralized apps" not being decentralized. Ethereum nodes need far more resource than a mobile device or a desktop browser can supply. But on a mobile device or in a desktop browser is where a "decentralized app" needs to run if it is going to interact with a human. So, as Moxie Marlinspike discovered:
companies have emerged that sell API access to an ethereum node they run as a service, along with providing analytics, enhanced APIs they’ve built on top of the default ethereum APIs, and access to historical transactions. Which sounds… familiar. At this point, there are basically two companies. Almost all dApps use either Infura or Alchemy in order to interact with the blockchain. In fact, even when you connect a wallet like MetaMask to a dApp, and the dApp interacts with the blockchain via your wallet, MetaMask is just making calls to Infura!
Law professor Angela Walch's Deconstructing ‘Decentralization’: Exploring the Core Claim of Crypto Systems examines the legal issues created by the false assertion that permissionless blockchains are decentralized. She takes off from here:
On June 14, 2018, William Hinman, Director of the SEC’s Division of Corporation Finance, seized the crypto world’s attention when he stated that “current offers and sales of Ether are not securities transactions” and linked this conclusion to the “sufficiently decentralized” structure of the Ethereum network.
She concludes:
Like many other descriptors of blockchain technology (e.g., immutable, trustless, reflects truth), the adjective ‘decentralized’ as an inevitable characteristic of blockchain technology proves to be an overstatement, and we know that making decisions based on overstatements rather than reality can lead to bad consequences.
Her argument is in four parts:
- Walch analyzes how people use the word "decentralized":
For example, in Arizona’s statute that uses the term ‘decentralized’ to define ‘blockchain technology,’ there is no definition of ‘decentralized’ to be found. Most mainstream descriptions of blockchain technologies or cryptoassets state simply that blockchains are decentralized. End of story. Decentralized is just something that blockchains are. An inherent characteristic. An essential and identifying feature.
She identifies two senses in which it is used:
First, it is used to describe the network of computers (often referred to as ‘nodes’) that comprise a permissionless blockchain, as these systems operate through peer‐to‐peer connections between computers, rather than on a central server.
The second way ‘decentralized’ is commonly used is to describe how power or agency works within permissionless blockchain systems. If there is not a single, central party keeping the record, that means that no single party has responsibility for it, and thus no single party is accountable for it.
She then dissects Hinman's speech using these senses, showing how it conflates the two meanings, and how Hinman asserts that the Bitcoin and Ethereum blockchains are "sufficiently decentralized" without having a clear definition of the term or detailed evidence to support the assertion.
- Walch observes that:
It turns out I am far from alone in critiquing the use of ‘decentralized’ to describe blockchain systems. In fact, in the past few years, exploring the concept of “decentralization” has become a trend for thought leaders and academics in the crypto space. Venture capitalists, Ethereum creator Vitalik Buterin, and others have attempted to articulate what “decentralization” means.
Despite these efforts, we are no nearer to an agreed definition. Among the difficulties is that these systems are layered, and the way they are centralized is different at each layer. Walch makes seven points:
- No One Knows What “Decentralization” Means because it cannot be measured. Although it is possible to estimate a "Nakamoto coefficient" assuming that the entities involved are independent, there is no way to know if this assumption is true. And in most cases this computation ignores the concentration of power in the software development team, who cannot be considered independent actors.
- Satoshi Didn’t Invent Decentralization. Walch means here that it has a long history in politics but, as Arvind Narayanan and Jeremy Clark show in Bitcoin's Academic Pedigree, it also has a long history in software engineering.
- Decentralized Does Not Equal Distributed. Walch notes that these terms are often misused in the discourse. Decentralized refers to multiple independent loci of power, where distributed refers to multiple coordinated loci of execution.
- Decentralization Exists on a Spectrum, which, since it cannot be measured, should be a given.
- Decentralization is Dynamic rather than Static, because both the technology and the way it is used evolve over time. Walch observes:
The critical takeaway here is that any measurement of decentralization is obsolete immediately after it has been calculated. In a permissionless system, anyone can join, and no one has to stay, so the system’s composition is, in theory, always in flux.
- Decentralization is Aspirational, Not Actual, or as I have argued, the use of the term is essentially gaslighting.
- Decentralization Can Be Used to Hide Power or Enable Rule‐Breaking. The whole purpose of cryptocurrencies is to evade regulation, so Walch is too kind when she writes:
the term ‘decentralized’ is being used to hide actions by participants in the system in a fog of supposedly “freely floating authority,” and we must be vigilant not to overlook pockets of authority and power within these systems.
Walch fails to appreciate that, as Dave Troy explains, cryptocurrencies are rooted in the idea that government and thus regulation are evil.
- Calls to Action. Walch writes:
The status quo usage of the terms “decentralized” and “decentralization” is deemed untenable by many commentators, and there are a variety of calls to action in the literature. ... The rationale behind these calls to action is that current usage of the term is creating misunderstandings about the capabilities of the technology. Further, it is clearly creating misunderstandings about how power works in these systems, with the potential for error in how law or regulation treats these systems and the people who act within them.
Well, yes, but "creating misunderstandings about how power works" is the goal the crypto-bros are trying to achieve.
- Walch then provides:
examples of actions within the Bitcoin and Ethereum blockchain systems that undermine claims that either system is particularly decentralized.
These include two accounts of secretive actions by Bitcoin developers, Critical Bug Discovery and Fix in Bitcoin Software in Fall 2018 and Bitcoin’s March 2013 Hard Fork, and two accounts of similar secretive actions by Ethereum developers, Secret Meetings of Ethereum Core Developers in Fall 2018 and Ethereum’s July 2016 Hard Fork. These descriptions of the response of the systems to crises dovetail nicely with the DARPA report's more analytical view of developer power.
In Hashing Power Concentration and 51% Attacks Walch briefly covers the concentration of mining power, but she omits many of the other aspects of concentration detailed in the DARPA report.
- Finally, Walch discusses the problems the issues she identifies (not to mention those in the DARPA report) pose for the use of "decentralized" as a legal concept in four areas:
- Decentralization’s Uncertain Meaning Makes It Ill‐Suited for A Legal Standard, in which Walch writes:
it is relatively easy to count nodes in a network, but much harder to identify and understand how miners, nodes, and software developers interact in governing a blockchain. As Sarah Jamie Lewis, a privacy advocate and crypto systems expert, has explained, “We need to move beyond naïve conceptions of decentralization (like the % of nodes owned by an entity), and instead, holistically, understand how trust and power are given, distributed and interact...Hidden centralization is the curse of protocol design of our age. Many people have become very good at obfuscating and rationalizing away power concentration.”
The system has more layers than just the miners and the developers, and each of them is centralized in a different, and hard-to-measure way. This obviously makes "decentralization" useless as a basis for regulation.
- Decentralization’s Dynamic Nature Complicates Its Use as a Legal Standard, in which Walch writes:
if the measurement and determination of a decentralization level is done periodically to mark the moment when a particular legal status is achieved, then participants in blockchain systems (nodes, miners, developers) may game the standard by taking actions to move along the decentralization spectrum. If the prize is large (as non‐security status would be), then anything gameable (including a level of decentralization) will be gamed.
Any technique for measuring the decentralization of a layer in the system will give answers that change over time, and that will be subject to being gamed. Again, this isn't a viable basis for regulation.
- If Actual Decentralization is Now Just a Dream, Wait Till It Comes True, in which Walch writes:
In Part III, I provided examples of events in Bitcoin and Ethereum that belie claims that they are decentralized, while in Part II, I noted the largely aspirational nature of ‘decentralization’ in permissionless blockchains. If this is the case, it is premature to use ‘decentralization’ as a way to make legal decisions. However noble the goals are for a given blockchain system to reach decentralization nirvana, the law must deal with present‐day realities rather than hopes or dreams.
It has been more than 13 years since Bitcoin launched and in all that time it has never been effectively decentralized at the layers of hash rate or developers. Kwon et al show that it will never be decentralized. So dealing with "present‐day realities" involves rejecting the idea that these blockchains are decentralized and focusing on applying existing laws to the loci of power in the system, such as the organizers of mining pools, the developers, and the exchanges.
- Decentralization Veils and Malleable Tokens, in which Walch finally gets to the most important point, laying out the function the claim of "decentralization" is intended to perform:
the common meaning of ‘decentralized’ as applied to blockchain systems functions as a veil that covers over and prevents many from seeing the actions of key actors within the system. Hence, Hinman’s (and others’) inability to see the small groups of people who wield concentrated power in operating the blockchain protocol. In essence, if it’s decentralized, well, no particular people are doing things of consequence.
And:
Going further, if one believes that no particular people are doing things of consequence, and power is diffuse, then there is effectively no human agency within the system to hold accountable for anything.
The consequence of casting a veil over the people’s actions is that they may not be held accountable for those actions – in effect, that a Veil of Decentralization functions as a liability shield akin to the famed corporate veil.
Moreover, being protected by a Veil of Decentralization may even be better than what blockchain participants could get if they actually formed a limited liability entity together. In entities, people making significant decisions that affect others (like directors, officers, or managers) generally owe fiduciary duties, but, despite my urging, no one has yet decided to treat the core developers or significant miners of blockchain protocols as fiduciaries. What’s more, the Veil of Decentralization is helpful to participants in the blockchain because it provides a liability shield without making the blockchain system a legal person that could be sued. With a limited liability entity, the corporation or LLC provides the site of legal personhood, but with a decentralized blockchain system, there is no such site. Thus, if we misapply the term “decentralized,” people within “decentralized” blockchain systems get the benefit of limited liability without the cost of certain duties and responsibilities.
Update: 5th September 2022
Hetzner, a private centralized cloud provider, stepped in on a discussion around running blockchain nodes, highlighting its terms of services that prohibit customers from using the services for crypto activities. However, the Ethereum community perceived the revelation as a threat to the ecosystem as Hetzner’s cloud services host nearly 16% of the Ethereum nodes
Note that 54% of the hash rate is hosted on Amazon Web Services.