Tuesday, June 7, 2016

The Need For Black Hats

I was asked to provide some background for a panel on "Security" at the Decentralized Web Summit held at the Internet Archive. Below the fold is a somewhat expanded version.

Nearly 13 years ago my co-authors and I won Best Paper at SOSP for the peer-to-peer anti-entropy protocol that nodes in a LOCKSS network use to detect and repair damage to their contents. The award was for showing a P2P network that failed gradually and gracefully under attack from a very powerful adversary. Its use of proof-of-work under time constraints is related to ideas underlying blockchains.

The paper was based on a series of simulations of 1000-node networks, so we had to implement both sides, defence and attack. In our design discussions we explicitly switched between wearing white and black hats; we probably spent more time on the dark side. This meant that we ended up with a very explicit and very pessimistic threat model, which was very helpful in driving the design.

The decentralized Web will be attacked, in non-obvious ways. Who would have thought that IP's strength, the end-to-end model, would also bring one of its biggest problems, pervasive surveillance? Or that advertising would be the death of Tim Berners-Lee's Web?

I'd like to challenge the panelists to follow our example, and to role-play wearing black hats in two scenarios:
  • Scenario 1. We are the NSA. We have an enormous budget, no effective oversight, taps into all the major fiber links, and a good supply of zero-days. How do we collect everyone's history of browsing the decentralized Web? (I guarantee there is a team at NSA/GCHQ asking this question).
  • Scenario 2. We are the Chinese government. We have an enormous budget, an enormous workforce, a good supply of zero-days, total control over our country's servers and its connections to the outside world. How do we upgrade the Great Firewall of China to handle the decentralized Web, and how do we censor our citizens' use of it? (I guarantee there is a team in China asking these questions).
I'll kick things off by pointing out one common factor between the two scenarios, that the adversaries have massive resources. Massive resources are an inescapable problem for decentralized systems, and the cause is increasing returns to scale or network effects. Increasing returns are the reason why the initially decentralized Web is now dominated by a few huge companies like Google and Facebook. They are the reason that Bitcoin's initially decentralized blockchain recently caused Mike Hearn to write this:
the block chain is controlled by Chinese miners, just two of whom control more than 50% of the hash power. At a recent conference over 95% of hashing power was controlled by a handful of guys sitting on a single stage.
One necessary design goal for networks such as Bitcoin is that the protocol be incentive-compatible, or as Ittay Eyal and Emin Gun Sirer express it:
the best strategy of a rational minority pool is to be honest, and a minority of colluding miners cannot earn disproportionate benefits by deviating from the protocol
They show that the Bitcoin protocol was, and still is, not incentive-compatible. More recently, Sirer and others have shown that the Decentralized Autonomous Organization based on Ethereum isn't incentive-compatible either. Even if these protocols were, increasing returns to scale would drive centralization and thus ensure attacks with massive resources, whether from governments or large corporations. And let's not forget that attacks can be mounted using botnets.

Massive resources enable Sybil attacks. The $1M attack CMU mounted in 2014 against the Tor network used both traffic confirmation and Sybil attacks:
The particular confirmation attack they used was an active attack where the relay on one end injects a signal into the Tor protocol headers, and then the relay on the other end reads the signal. These attacking relays were stable enough to get the HSDir ("suitable for hidden service directory") and Guard ("suitable for being an entry guard") consensus flags. Then they injected the signal whenever they were used as a hidden service directory, and looked for an injected signal whenever they were used as an entry guard.
Traffic confirmation attacks don't need to inject signals; they can be based on statistical correlation. Correlations in the time domain are particularly hard for interactive services, such as Tor and the decentralized Web, to disguise.
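As a rough illustration of why time-domain correlation is so hard to disguise, here is a minimal sketch. It bins the packet timestamps seen at two observation points into fixed windows and computes the Pearson correlation of the per-window counts. Everything in it is an assumption made for illustration: the function names, the bin width, the synthetic bursty traffic, and the small delay and jitter between the two observation points. It is not a description of any tool used against Tor.

```python
import math
import random


def bin_counts(timestamps, bin_width, duration):
    """Count packets falling into each fixed-width time bin."""
    n_bins = int(duration / bin_width)
    counts = [0] * n_bins
    for t in timestamps:
        idx = int(t / bin_width)
        if 0 <= idx < n_bins:
            counts[idx] += 1
    return counts


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def bursty_flow(rng, duration, n_bursts, packets_per_burst):
    """Synthetic interactive traffic: short bursts at random start times."""
    times = []
    for _ in range(n_bursts):
        start = rng.uniform(0, duration - 1)
        times.extend(start + rng.uniform(0, 0.5) for _ in range(packets_per_burst))
    return sorted(times)


rng = random.Random(1)   # fixed seed so the sketch is repeatable
DURATION = 300.0         # seconds observed (assumed)
BIN_WIDTH = 1.0          # seconds per bin (assumed)

# Flow as seen entering the network.
entry_flow = bursty_flow(rng, DURATION, 40, 20)
# The same flow as seen leaving it, after an assumed small delay plus jitter.
exit_flow = [t + 0.05 + rng.uniform(0, 0.05) for t in entry_flow]
# A different user's flow with the same overall shape.
unrelated_flow = bursty_flow(rng, DURATION, 40, 20)

entry_bins = bin_counts(entry_flow, BIN_WIDTH, DURATION)
same = pearson(entry_bins, bin_counts(exit_flow, BIN_WIDTH, DURATION))
other = pearson(entry_bins, bin_counts(unrelated_flow, BIN_WIDTH, DURATION))
print(f"same flow: {same:.2f}  unrelated flow: {other:.2f}")
```

The point of the sketch is that an interactive service cannot delay or reshape traffic much without becoming unusable, so the burst pattern survives the trip and the matching flow stands out from the unrelated one.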
Then the second class of attack they used, in conjunction with their traffic confirmation attack, was a standard Sybil attack — they signed up around 115 fast non-exit relays, all running on or Together these relays summed to about 6.4% of the Guard capacity in the network. Then, in part because of our current guard rotation parameters, these relays became entry guards for a significant chunk of users over their five months of operation.
Sybil attacks are very hard for truly decentralized networks to defend against, since no-one is in a position to do what the Tor project did to CMU's Sybils:
1) Removed the attacking relays from the network.
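To get a feel for how 6.4% of Guard capacity could turn into "a significant chunk of users", here is a back-of-the-envelope sketch. It assumes each guard pick is independent and lands on an attacking relay with probability equal to the attackers' share of capacity. The number of picks a client makes over the attack period is a hypothetical parameter, not a figure from the Tor project's post, and real guard selection is more involved than this.

```python
def p_at_least_one_sybil_guard(sybil_share, guard_picks):
    """Chance that at least one of `guard_picks` independent picks is a Sybil.

    Assumes each pick hits an attacking relay with probability `sybil_share`.
    """
    if not 0.0 <= sybil_share <= 1.0:
        raise ValueError("sybil_share must be between 0 and 1")
    if guard_picks < 0:
        raise ValueError("guard_picks must be non-negative")
    return 1.0 - (1.0 - sybil_share) ** guard_picks


SYBIL_SHARE = 0.064  # the 6.4% of Guard capacity quoted above

# Hypothetical numbers of guard picks per client during the attack.
for picks in (1, 3, 6, 12):
    p = p_at_least_one_sybil_guard(SYBIL_SHARE, picks)
    print(f"{picks:2d} picks: {p:.1%} chance of at least one Sybil guard")
```

Under these assumptions the exposure grows quickly with every extra pick, which is why guard rotation parameters matter so much to this kind of attack.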
Richard Chirgwin at The Register reports on Philip Winter et al's Identifying and characterizing Sybils in the Tor network. Their sybilhunter program found the following kinds of Sybils:
  • Rewrite Sybils – these hijacked Bitcoin transactions by rewriting their Bitcoin addresses;
  • Redirect Sybils – these also attacked Bitcoin users, by redirecting them to an impersonation site;
  • FDCservers Sybils – associated with the CMU deanonymisation research later subpoenaed by the FBI;
  • Botnets of Sybils – possibly misguided attempts to help drive up usage;
  • Academic Sybils – they observed the Amazon EC2-hosted nodes operated by Biryukov, Pustogarov, and Weinmann for this 2013 paper; and
  • The LizardNSA attack on Tor.
The Yale/UT-Austin Dissent project is an attempt to use cryptographic techniques to provide anonymity while defending against both Sybil and traffic analysis attacks, but they believe there are costs in doing so:
We believe the vulnerabilities and measurability limitations of onion routing may stem from an attempt to achieve an impossible set of goals and to defend an ultimately indefensible position. Current tools offer a general-purpose, unconstrained, and individualistic form of anonymous Internet access. However, there are many ways for unconstrained, individualistic uses of the Internet to be fingerprinted and tied to individual users. We suspect that the only way to achieve measurable and provable levels of anonymity, and to stake out a position defensible in the long term, is to develop more collective anonymity protocols and tools. It may be necessary to constrain the normally individualistic behaviors of participating nodes, the expectations of users, and possibly the set of applications and usage models to which these protocols and tools apply.
They note:
Because anonymity protocols alone cannot address risks such as software exploits or accidental self-identification, the Dissent project also includes Nymix, a prototype operating system that hardens the user’s computing platform against such attacks.
Getting to a shared view of the threats the decentralized Web is intended to combat before implementations are widely deployed is vital. The lack of such a view in the design of TCP/IP and the Web is the reason we're in the mess we're in. Unless the decentralized Web does a significantly better job handling the threats than the current one, there's no point in doing it. Without a "black hat" view during the design, there's no chance that it will do a better job.


David. said...

I've written before about the difference between the decentralized Web, and the decentralized Internet implemented by the Named Data Networking (NDN) project. During the Decentralized Web Summit I've been thinking about privacy in NDN.

In the TCP/IP end-to-end world, traffic between the endpoints is very likely to transit the "backbone", so it is very likely to be observed by the NSA and others who have taps into the backbone fiber networks, and everyone legitimately along the path between the endpoints. And routing can be manipulated, for example to ensure that the path between US endpoints includes segments overseas so that it is legal for the NSA to intercept it.

In the NDN, in effect each "router" or cache satisfies the requests from its downstream clients (hosts or caches) from its cache, and only forwards cache misses upstream. Thus in order to be sure you see all the traffic for a particular endpoint, you have to be between the end-point and its closest "router". This is mostly the case examined in Privacy Risks in Named Data Networking. The more "routers" there are between the end-point and your observation point, the less you will know about what is being asked for and which endpoint asked for it. And in order to manipulate the "routing" you need to disable all the caches between the endpoint and your observation point.

Thus it appears that NDN doesn't do much to inhibit targeted surveillance of individual endpoints, but it does much to reduce the value of taps into the backbone for enabling dragnet surveillance.
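A toy model makes the argument concrete. The sketch below is an assumption-laden simplification, not NDN itself: the class and names are hypothetical, caches are unbounded and never expire, and each hop records who it received a request from (real NDN Interests carry no source address, so a router only knows the incoming face).

```python
class CachingRouter:
    """Toy cache that forwards only cache misses to its upstream."""

    def __init__(self, name, upstream=None):
        self.name = name
        self.upstream = upstream
        self.cache = set()
        self.seen = []  # (requester, content_name) pairs visible at this hop

    def request(self, requester, content_name):
        # An observer at this hop sees every request that reaches it.
        self.seen.append((requester, content_name))
        if content_name in self.cache:
            return  # cache hit: nothing travels further upstream
        if self.upstream is not None:
            # Upstream sees this router, not the original endpoint.
            self.upstream.request(self.name, content_name)
        self.cache.add(content_name)


backbone = CachingRouter("backbone")
edge = CachingRouter("edge", upstream=backbone)

# Two hypothetical endpoints behind the same edge router.
edge.request("alice", "/news/a")
edge.request("alice", "/news/b")
edge.request("bob", "/news/a")    # already cached at the edge
edge.request("bob", "/news/c")
edge.request("alice", "/news/a")  # already cached at the edge

print("edge sees:    ", edge.seen)
print("backbone sees:", backbone.seen)
```

In this model the observer next to the endpoints sees all five requests and who made them, while the backbone observer sees only three cache misses, all attributed to the edge router.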

David. said...

As was predictable, the DAO is under attack and the fix:

"The development community is proposing a soft fork, (with NO ROLLBACK; no transactions or blocks will be “reversed”) which will make any transactions that make any calls/callcodes/delegatecalls that reduce the balance of an account with code hash 0x7278d050619a624f84f51987149ddb439cdaadfba5966f7cfaea7ad44340a4ba (ie. the DAO and children) lead to the transaction (not just the call, the transaction) being invalid, starting from block 1760000 (precise block number subject to change up until the point the code is released), preventing the ether from being withdrawn by the attacker past the 27-day window. This will later be followed up by a hard fork which will give token holders the ability to recover their ether."

illustrates that the DAO is neither completely decentralized nor completely autonomous.
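The rule in the quoted proposal can be read as a simple validity predicate. The sketch below is only my reading of that quote, under loud assumptions: the `Call` record and function name are hypothetical, the balance change caused by each call is taken as given, and the block number is the provisional one from the quote. It is not code from any Ethereum client.

```python
from dataclasses import dataclass

# Code hash and provisional fork block, copied from the quoted proposal.
DAO_CODE_HASH = "0x7278d050619a624f84f51987149ddb439cdaadfba5966f7cfaea7ad44340a4ba"
FORK_BLOCK = 1760000


@dataclass
class Call:
    """Hypothetical record of one call/callcode/delegatecall in a transaction."""
    affected_code_hash: str  # code hash of the account whose balance changes
    balance_delta: int       # change in that account's balance, in wei


def transaction_is_valid(block_number, calls):
    """Reject the whole transaction if any call drains a DAO-like account."""
    if block_number < FORK_BLOCK:
        return True  # the rule only applies from the fork block onwards
    return not any(
        c.affected_code_hash == DAO_CODE_HASH and c.balance_delta < 0
        for c in calls
    )


drain = [Call(DAO_CODE_HASH, -10**18)]
print(transaction_is_valid(FORK_BLOCK - 1, drain))  # before the fork block
print(transaction_is_valid(FORK_BLOCK, drain))      # from the fork block on
```

Note that a rule like this only takes effect if miners choose to run software that enforces it, which is the sense in which the fix depends on coordinated human intervention.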

David. said...

With a second similar attack under way, Vitalik Buterin has posted Thinking About Smart Contract Security, with a crowd-sourced list of known security issues in Ethereum contracts. He writes:

"A final note is that while all of the concerns so far have been about accidental bugs, malicious bugs are an additional concern. How confident can we really be that the MakerDAO decentralized exchange does not have a loophole that lets them take out all of the funds? Some of us in the community may know the MakerDAO team and consider them to be nice people, but the entire purpose of the smart contract security model is to provide guarantees that are strong enough to survive even if that is not the case, so that entities that are not well-connected and established enough for people to trust them automatically and do not have the resources to establish their trustworthiness via a multimillion-dollar licensing process are free to innovate, and have consumers use their services feeling confident about their safety. Hence, any checks or highlights should not just exist at the level of the development environment, they should also exist at the level of block explorers and other tools where independent observers can verify the source code."

The comments are interesting.

A side note. The value of Ether plunged as soon as the attack was known. Bitcoin advocates often say "there is a bounty of [insert current market cap of BTC] on finding a bug in Bitcoin's code". The result of a bug would be to transfer BTC to the attacker, as with the DAO attack. An attack that transferred all BTC to the attacker would be hard to hide, and would result in the value of BTC plunging. So while it is true that finding a bug in the Bitcoin code might make the finder a lot of money, it is a wild exaggeration to suggest that the lot of money would equal the BTC market cap. The losses to everyone else would be far greater than the gain to the attacker, who would have killed the goose that laid the golden egg.

David. said...

Kevin Marks explains the last point better than I can:

"This was my point at the summit - when people say 'Bitcoin is secure because if you could hack it you'd steal $7bn' it's more like 'you could destroy $7bn' Digital currencies are like stock in a conglomerate of all the companies using the currency."

David. said...

Zach Graves at Techdirt's Lessons From The Downfall Of A $150M Crowdfunded Experiment In Decentralized Governance is well worth a read about the implications of the DAO attacks.

David. said...

Nathaniel Popper's How China Took Center Stage in Bitcoin’s Civil War is a good read, especially the diagram showing the top 3 Chinese pools controlling 59% of the mining power in the last month, and the top 4 over 70%:

"Big pool operators have become the kingmakers of the Bitcoin world: Running the pools confers the right to vote on changes to Bitcoin’s software, and the bigger the pool, the more voting power. If members of a pool disagree, they can switch to another pool. But most miners choose a pool based on its payout structure, not its Bitcoin politics."