DSHR's Blog: Pseudonymity And Cooperation

Ever since I explained the reasons why in 2014's Economies of Scale in Peer-to-Peer Networks, I have been pointing out that Bitcoin isn't decentralized; it is centralized around five or fewer large mining pools. Ethereum is even more centralized; last November two pools controlled the majority of Ethereum mining. On 13th June 2014 GHash controlled 51% of the Bitcoin mining power. The miners understood that this looked bad, so they split into a few large pools. But there is nothing to stop these pools coordinating their activities. As Vitalik Buterin wrote:

can we really say that the uncoordinated choice model is realistic when 90% of the Bitcoin network’s mining power is well-coordinated enough to show up together at the same conference?

and Makarov and Schoar wrote:

Six out of the largest mining pools are registered in China and have strong ties to Bitmain Techonologies, which is the largest producer of Bitcoin mining hardware

Source

Although as I write it is still true that 5 pools control the majority of Bitcoin mining (and 3 pools control the majority of Ethereum mining), over the last 18 months there has been a significant change in the traceability of Bitcoin mining pools. The graph shows that the proportion of pools actively obfuscating their identities has increased, so that "unknown" has been close to and occasionally above 50% of the Bitcoin mining power. It was bad enough that "trustless" meant trusting 4-5 pools, mostly in cahoots with Bitmain. But now "trustless" means trusting a group of miners who are actively hiding their identities and, for all you know, could be one large confederation. Alternatively, they could fear attacks from other miners!

Earlier this month How ‘Trustless’ Is Bitcoin, Really? by Siobhan Roberts drew attention to Cooperation among an anonymous group protected Bitcoin during failures of decentralization by Alyssa Blackburn et al that pushed Bitcoin's centralization problem back to its earliest days. Below the fold I discuss the details.

Roberts writes:

Ms. Blackburn said her project was agnostic to Bitcoin’s pros and cons. Her goal was to pierce the scrim of anonymity, track the transaction flow from Day 1 and study how the world’s largest cryptoeconomy emerged.

Blackburn and her co-authors examined the Bitcoin blockchain using newly developed address-linking techniques:

to achieve both high specificity (>99%) and high sensitivity (>99%), enabling us to study the bitcoin community in detail between launch (January 3rd 2009) and parity with the US Dollar (February 9th 2011). The end of this period was also punctuated by the launch of the Silk Road, an online, bitcoin-based, black market. This interval captures bitcoin’s transition from a digital object with no value to a functional monetary system.

Using these address-linking techniques, we show that the bitcoin blockchain makes it possible to explore the socioeconomic behavior of bitcoin’s participants. We find that, in line with the findings of Vilfredo Pareto (20) in 1896 (and subsequent studies of many national economies), wealth, income, and resources in the bitcoin community were highly centralized. This threatened bitcoin’s security, which relies on decentralization, routinely enabling agents to perform a 51% attack that would allow double-spending of the same bitcoins. The result was a social dilemma for bitcoin’s participants: whether to benefit unilaterally from attacks on the currency, or to act in the interest of the collective. Strikingly, participants declined to perform a 51% attack in every case, instead choosing to cooperate.

They showed that early mining was concentrated:

Between launch and dollar parity, most of the bitcoin was mined by only 64 agents, collectively accounting for ₿2,676,800 (PV: $84 billion). This is 1000-fold smaller than prior estimates of the size of the early Bitcoin community (75,000) (13). In total, our list included 210 agents with a substantial economic interest in bitcoin during this period (defined as agents that mined bitcoin worth >$2,000 at the time.) It is striking that the early bitcoin community created a functional medium of exchange despite having very few core participants.

And that, as a result, inequality was rampant:

We plotted the distribution of bitcoin mining income on log-log axes during six time intervals between the launch of bitcoin (January, 3, 2009) and when it achieved parity with the US dollar (February, 9, 2011). A power law was visible in each interval. The presence of this distribution in Interval 1 implies the emergence of Pareto distributions within four months of bitcoin’s launch. (See Fig 2.) This underlines the degree of centralization of the bitcoin blockchain, and highlights that Pareto distributions can emerge extremely rapidly.

Blackburn and her co-authors have shown that two current aspects of the cryptocurrency ecosystem, the lack of decentralization and the extreme Gini coefficients*, are not later emergent phenomena, but were present right from the start. These results are not surprising. Consider that, before the source code was published, Nakamoto was the sole miner and owner of Bitcoin. The Gini coefficient was 1 and one miner controlled 100% of the mining power. The interesting question is not why early mining and ownership were concentrated, but rather what forces drove them to become somewhat less concentrated, and how effective these forces were.
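To make the "Gini coefficient was 1" starting point concrete, here is a minimal sketch of the standard Gini calculation. The holdings are invented for illustration and are not data from Blackburn et al. Note the assumption that the population includes potential holders with zero balances; under that assumption a single owner among n participants gives (n-1)/n, which approaches 1 as n grows.

```python
def gini(holdings):
    """Gini coefficient of non-negative holdings.

    0 means perfectly equal; values near 1 mean one holder owns nearly everything.
    """
    values = sorted(holdings)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        raise ValueError("need at least one non-zero holding")
    # Standard formula on ascending data, with ranks i = 1..n:
    #   G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    weighted = sum(rank * x for rank, x in enumerate(values, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical populations: one holder with 50 BTC, everyone else with nothing.
sole_owner_small = [0, 0, 0, 50]
sole_owner_large = [0] * 999 + [50]
equal_split = [10, 10, 10, 10]

print(gini(sole_owner_small), gini(sole_owner_large), gini(equal_split))
```

The more would-be participants with empty wallets are counted, the closer the sole-owner case gets to 1, which is the sense in which Bitcoin started at maximal inequality.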

As regards mining, the observation that there were few initial miners and many mined only intermittently is easy to understand. The community of cypherpunks that was Nakamoto's audience was small and tight-knit. No-one could make an immediate profit mining, so at first only the true believers mined. Mining consumed 100% of a computer that was needed for other uses, so most were not mining 24/7. Note also that there may have been many who tried mining, but discovered that their computer wasn't fast enough to compete, leaving no trace in the blockchain for later researchers to see.

As regards inequality, at first there were few transactions. The only way to obtain significant Bitcoin holdings was to mine them, and there was nothing on which to spend them, so HODL-ing was the order of the day. Newly mined bitcoins accrued to the few true enthusiasts willing to devote a powerful computer to mining.

The concentration of mining power led to many opportunities for 51% attacks. Figure 5 shows that from January:

Until December 2009, the agent with the most computational power (at the time this was Agent #1, Satoshi Nakamoto) had sufficient resources to perform a 51% attack.

Figure 5b

In particular:

During a weeklong period from September 29, 2010 and October 4, 2010, the agent with the most computational power (no longer Satoshi Nakamoto, but instead Agent #2) has enough resources to perform a 51% attack during several 6+ hour long windows. Agent #2 declines to perform a 51% attack, and instead continues to exhibit cooperative behavior.

Figure 5c

This allowed the authors to:

estimate the effective population size of the decentralized bitcoin network by counting the frequency of streaks in which all blocks are mined by one agent (bottom-left) or two agents (bottom-right). These are compared to the expected values for idealized networks comprising P agents with identical resources. The comparisons suggest an effective population size of roughly 5, a tiny fraction of the total number of participants. The grayed out region corresponds to an expected value of less than one streak; for instance, given an effective population of 25 agents, a unilateral streak of length 6 should never be observed in the two-year interval we studied. In fact, we observe 21 such streaks.
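The comparison against "idealized networks comprising P agents with identical resources" can be sketched roughly as follows. This is not the authors' code. It assumes each block is won independently and uniformly at random, counts overlapping windows rather than maximal streaks, and uses a nominal block count for two years at ten-minute spacing instead of the true chain height.

```python
# Assumption: two years of blocks at the nominal 10-minute spacing.
NOMINAL_BLOCKS_TWO_YEARS = 2 * 365 * 24 * 6


def expected_unilateral_streaks(num_blocks, population, streak_len):
    """Expected number of windows of `streak_len` consecutive blocks that
    were all mined by the same agent, given `population` identical agents."""
    windows = num_blocks - streak_len + 1
    # Whoever mines the first block of a window, each of the remaining
    # streak_len - 1 blocks matches that agent with probability 1 / population.
    return windows * (1.0 / population) ** (streak_len - 1)


for p in (5, 25):
    print(p, expected_unilateral_streaks(NOMINAL_BLOCKS_TWO_YEARS, p, 6))
```

Under these assumptions a population of 25 yields an expectation far below one streak of length 6, while a population of 5 yields a few dozen, the same order of magnitude as the 21 streaks the authors report.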

Again, this is not a surprise. The fact that consistently productive mining required a dedicated, powerful computer explains why the total productive mining population was about 64 and the effective mining population was about 5. The same fact explains why mining was restricted to Bitcoin enthusiasts, dedicated to making the system work and thus unwilling to subvert it for what were, at the time, paltry financial rewards compared to the prospect of owning a significant fraction of the total future Internet currency. During the period to 9th February 2011 the block reward was 50 BTC; the first "halvening" was on 28th November 2012, by which time about half of all possible Bitcoin had been mined. Thus the theoretical cost of a 6-block 51% attack was 300 BTC (six blocks at 50 BTC each), or less than $300 while Bitcoin traded below parity with the dollar. But the potential benefit was limited by the difficulty of spending Bitcoin. Even if one of the 64 agents did not care about the progress of the coin "to the moon", the reward for malfeasance was inadequate.
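The arithmetic behind these figures is simple enough to check directly. In this sketch the 210,000-block halving interval and the 21 million BTC cap are well-known protocol parameters that are not stated in the text above, and the function names are mine.

```python
BLOCK_REWARD_BTC = 50           # block reward throughout the period studied
BLOCKS_PER_HALVING = 210_000    # protocol parameter, not stated in the post
MAX_SUPPLY_BTC = 21_000_000     # nominal supply cap, not stated in the post


def attack_cost_btc(block_reward, blocks):
    """Block rewards at stake over a run of `blocks` consecutive blocks."""
    return block_reward * blocks


def fraction_mined_by_first_halving():
    """Share of the nominal supply issued before the first halving."""
    return BLOCK_REWARD_BTC * BLOCKS_PER_HALVING / MAX_SUPPLY_BTC


cost = attack_cost_btc(BLOCK_REWARD_BTC, 6)
# Before dollar parity each BTC was worth under $1, so the dollar value of
# the cost was below the same number expressed in dollars.
print(cost, fraction_mined_by_first_halving())
```

Six blocks at 50 BTC gives the 300 BTC figure, and 210,000 blocks at 50 BTC is 10.5 million BTC, half of the nominal cap.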

It is important to note that most analyses of mining behavior assume a relatively homogeneous set of miners. This is not true, and Blackburn et al provide a striking example:

The most protracted vulnerability was in early October 2010, when Agent #2 could have performed a 51% attack during five six-hour periods ... Agent #2 was among the first users to accelerate the bitcoin-transaction-validation process (mining) by employing general-purpose graphical processing units

The most interesting of Blackburn et al's results is:

We show that our top 64 agents are extremely central to the contemporary bitcoin transaction network, such that nearly all addresses (>99%) can be linked to a top agent via a chain of less than 6 transactions (31-33). These network properties have unintended privacy consequences, because they make the network much more vulnerable to deanonymization using a “follow-the-money” approach. In this approach, the identity of a target bitcoin address can be ascertained by identifying a short transaction path linking it to an address whose identity is known, and then using off-chain data sources (ranging from public data to subpoenas) to walk along the path, determining who-paid-whom to de-identify addresses until the target address is identified.

A key limitation of the follow-the-money approach is the need to identify a known agent who is connected to the target address via a short path. Our results imply that, were the identities of the 64 top agents to become known, it would become easy to identify short transaction paths linking any target address to an already de-identified top agent address. This could adversely affect the privacy of bitcoin transactions. Similar vulnerabilities were identified in a recent preprint studying the Ethereum transaction network (2 million addresses), suggesting that many cryptocurrencies may be susceptible to follow-the-money attacks
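The core of the "follow-the-money" approach is a shortest-path search from the target address to any address whose owner is already known. A minimal sketch, assuming the transaction network has been flattened into an undirected adjacency map; the addresses and the `shortest_path_to_known` helper are hypothetical, and the off-chain work of identifying each hop is not modeled.

```python
from collections import deque


def shortest_path_to_known(graph, target, known):
    """Breadth-first search from `target` to the nearest address in `known`.

    `graph` maps each address to the addresses it has transacted with.
    Returns the list of addresses on the path, or None if none is reachable.
    """
    if target in known:
        return [target]
    parents = {target: None}
    queue = deque([target])
    while queue:
        addr = queue.popleft()
        for neighbor in graph.get(addr, ()):
            if neighbor in parents:
                continue  # already visited
            parents[neighbor] = addr
            if neighbor in known:
                # Walk the parent links back to the target, then reverse.
                path = [neighbor]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None


# Toy network: the target is three transactions away from a known top agent.
toy_graph = {
    "target": ["a"],
    "a": ["target", "b"],
    "b": ["a", "agent_7"],
    "agent_7": ["b"],
    "isolated": [],
}
print(shortest_path_to_known(toy_graph, "target", {"agent_7"}))
```

Each hop on the returned path is a who-paid-whom question for an investigator to answer off-chain. The authors' point is that with the 64 top agents as the known set, such a path of fewer than 6 transactions exists for nearly every address.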

The "recent preprint" is Percolation framework reveals limits of privacy in Conspiracy, Dark Web, and Blockchain networks by Louis M. Shekhtman et al who:

apply our framework to three real-world networks: (1) a blockchain transaction network, (2) a network of interactions on the dark web, and (3) a political conspiracy network. We find that in all three networks, beginning from one compromised individual, it is possible to deanonymize a significant fraction of the network (> 50%) within less than 5 steps. Overall these results provide guidelines for investigators seeking to identify actors in anonymous networks, as well as for users seeking to maintain their privacy.

Their framework shows:

that the question of anonymity between network actors, and the corresponding ability of a party seeking to deanonymize the individuals based on information from their neighbors, can be solved using tools and methods of percolation from statistical physics ... Furthermore, we demonstrate that classical quantities from statistical physics have important meanings and provide crucial information on the scope to which anonymity can be maintained among individuals in real hidden networks.

Shekhtman et al write:

While some approaches have considered deanonymizing the individuals behind nodes in a network ..., these have typically related to specific encrypted protocols and users have found ways to overcome these issues. In contrast, our approach is fundamental to the nature of privacy when interacting in a network i.e., when interacting with another party, a user often must reveal some identifying information. Using our general framework demonstrated in Fig 1 and based on the topologies of three real hidden networks, we are able to quantify the extent to which this information can be exploited and thus reveal how individuals can be unwittingly identified by their neighbors who failed to remain anonymous.

As applied to cryptocurrencies, the problem for the user is that their pseudonymity depends not just on their own operational security, over which they have some control, but also on the operational security of every other wallet with which their wallets transact, over which they have no control. In other words, pseudonymity requires cooperation. In the same way that informal cooperation was needed to maintain the security of the blockchain, informal cooperation in maintaining operational security is necessary. Unfortunately, it is too hard for ordinary mortals.

Cory Doctorow made the same point in the context of cryptography generally in The Best Defense Against Rubber-Hose Cryptanalysis:

First, there is the attacker’s advantage. For you to perfectly defend your cryptographic privacy, you must make no mistakes. You must have perfect math, implemented in perfect code, on perfect hardware. You must choose a robust passphrase and never expose it to a third party (say, by keying it in within sight of a hidden camera, or where a sneaky keylogger can capture it).

Not only that, but everyone you communicate with has to be perfect, too — security is a team sport, and if your fellow dissident has a weak passphrase that reveals the contents of your group chat, it doesn’t matter if everyone else in your cell practiced better secrecy hygiene.

The defender has to be perfect, but the attacker need only find a single imperfection. For a spy agency to attack you successfully, they need only wait, and wait, and wait, until you slip up. You will be tired, hunted, demoralized. They will have well-paid operatives who rotate off shift every eight hours and can rewind and review their intercepts when their attention wavers.

Doctorow continues:

Can cryptocurrency resist tyranny? Sure. Of course it can. It’s not hugely practical for this purpose, but cryptocurrency has some utility in defeating financial censorship.
...
But any accounting of the peripheral role cryptocurrency plays in fighting despotism has to also include the central role that financial secrecy plays in promoting despotism.

Cryptocurrency — and other unregulated financial products —have decentralized many bank-like functions, but they have only increased the centralization of wealth.

When it comes to the rule of law, that is the only centralization that matters. For governments to be accountable to the public, they need to be reliant on the public for their legitimacy.

The best defense against rubber-hose cryptanalysis is a political process that answers to voters, not donors. Every billionaire isn’t merely a policy failure: every billionaire is an engine for producing policy failures.

* Note that the Pareto distribution describes the inequality of income whereas the Gini coefficient describes the inequality of wealth. But the difficulty of spending Bitcoin in this early period means that unequal income causes unequal wealth.

Thursday, June 23, 2022
