Tuesday, September 22, 2020

Moxie Marlinspike On Decentralization

The Ecosystem Is Moving: Challenges For Distributed And Decentralized Technology is a talk by Moxie Marlinspike that anyone interested in the movement to re-decentralize the Internet should watch and think about. Marlinspike concludes "I'm not entirely optimistic about the future of decentralized systems, but I'd also love to be proven wrong".

I spent nearly two decades building the LOCKSS system and running it in production. It is a small-ish system that was intended, but never quite managed, to be completely decentralized. I agree with Marlinspike's conclusion, and have been writing with this attitude since at least 2014's Economies Of Scale In Peer-to-Peer Networks. It is always comforting to find someone coming to the same conclusion via a completely different route, as with scalability expert Todd Hoff in 2018 and now Moxie Marlinspike based on his experience building the Signal encrypted messaging system. Below the fold I contrast his reasons for skepticism with mine.

Marlinspike's talk is in two parts. The theme of the first is that [4:33] "user expectations of software are evolving rapidly, and evolving rapidly is in conflict with decentralization". He uses a raft of examples of centralized systems that have out-evolved their decentralized competitors, including Slack vs. IRC, Facebook vs. e-mail, WhatsApp vs. XMPP. The [6:04] decentralized protocols are stuck in time, whereas the centralized protocols are constantly iterating.

What he doesn't say, but that reinforces his point, is that many of the techniques routinely used by centralized systems to improve the user experience, such as A/B testing, are difficult if not impossible to apply to decentralized systems. Further, decentralization imposes significant overheads compared to a centralized version of the same system. The idea that, for example, "Twitter but decentralized" would take market share away from Twitter is implausible. It would lack Twitter's ways of finding out what its users want, it would be much slower than Twitter at implementing and deploying those features, and once deployed they would be slower.

One major reason that it would be slower was pointed out by Paul Vixie in his 2014 article Rate-limiting State. Unless decentralized systems implement rate limits, as Bitcoin does, they are vulnerable to what are, in effect, DDoS attacks of various kinds. I discussed this problem in Rate-Limits. The result is that decentralized systems will be slowed not merely by the overheads of decentralization, but by the need to artificially slow the system as a defense measure.
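
To make the cost of that defense concrete, here is a minimal token-bucket sketch in Python. It is an illustration only, not the mechanism used by Bitcoin or described in Vixie's article; the class name `TokenBucket` and its parameters are hypothetical, and the clock is passed in explicitly so the behaviour is deterministic.

```python
class TokenBucket:
    """Hypothetical limiter: about `rate` requests/second, bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: float, now: float = 0.0):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum tokens the bucket can hold
        self.tokens = capacity    # start with a full bucket
        self.last = now           # time of the most recent refill

    def allow(self, now: float) -> bool:
        """Return True if a request arriving at time `now` may proceed."""
        elapsed = max(0.0, now - self.last)
        # Refill for the time that has passed, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = max(self.last, now)
        if self.tokens >= 1.0:
            self.tokens -= 1.0    # spend one token on this request
            return True
        return False              # over the limit: the request is refused


bucket = TokenBucket(rate=1.0, capacity=2.0)
burst = [bucket.allow(0.0) for _ in range(3)]  # third request exceeds the burst
```

Note that the limiter cannot tell an attacker from an eager legitimate user: any request beyond the configured rate is refused, which is exactly the artificial slowdown described above.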

Marlinspike's second theme starts when he asks [7:22] "why do we want decentralization anyway?". He lays out [7:32] four goals touted by advocates of decentralization:
  • Privacy
  • Censorship resistance
  • Availability
  • Control
He examines each in turn, making the case that centralized systems deliver a better user experience than decentralized systems along each of these axes. I can only summarize his arguments; you should watch each segment carefully.

Privacy [8:01]

Marlinspike starts by pointing out that "Most decentralized systems in the world are not encrypted by default". This is because the key and certificate management problems are significantly harder in a P2P system, which needs to implement something like PGP's Web of Trust, than in a centralized system.

But what advocates of decentralization mean by privacy is both data privacy implemented by encryption, and metadata privacy implemented by "data ownership". This implies that each user owns and operates her own service which contains her data. Marlinspike comments that this seems antiquated, "left over from a time when computers were for computer people". The vast majority of users lack the skills necessary to do this. Although Marlinspike does have the necessary skills, and [9:57] does run his own e-mail server, this doesn't provide meaningful data or metadata privacy because "every e-mail I send or receive has gmail on the other end of it". Thus Google has a copy of (almost) every one of his e-mails.

As he says, real data protection requires end-to-end encryption, but metadata protection requires innovation. Both will happen faster in centralized systems because they can change faster. Signal provides metadata protection in the form of private groups, private contact discovery and sealed sender, so the centralized service has no visibility into group state or membership, or who is talking to whom. Marlinspike provides [10:52] a fascinating description of the cryptography behind private groups.

He says [16:01] "P2P is not necessarily privacy-preserving". Originally, Signal's voice and video calls operated on a P2P basis, with direct contact between the parties. But users said "do you mean someone can just call me and get my IP address?" So now they are routed via the service but, since they are end-to-end encrypted, it cannot see the content. It does know the parties' IP addresses, but an attacker would have to compromise the server to identify their IP addresses unambiguously.

Censorship resistance [17:09]

Marlinspike's model of censorship is that the censor, for example the Great Firewall of China, blocks access to services of which it disapproves. The problem is that if a service can be discovered by the user, it can be discovered by the censor. And, given automation, even if there are multiple providers of the service, it is likely that the censor can discover them at least as fast as the users, leading to a game of whack-a-mole. But if users are identified by each provider, whenever they are forced by the censor to switch they have to reconstruct their social network. This is an asymmetric game, where the cost to the censor is much less than the cost to the users.

Centralized services such as WhatsApp and Signal use techniques such as proxy sharding (each user can discover only a small subset of the access points) to make it hard for the censor to discover all the service access points quickly, and domain fronting to make it costly to block the access points the censor does discover. But the basic requirement for defending against this kind of censorship is [21:07] rapid response, which is difficult in a decentralized system.
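
The idea behind proxy sharding can be sketched in a few lines of Python. This is a guess at the general shape, not the scheme WhatsApp or Signal actually use: the function names, the `proxy-N.example` hostnames and the choice of hash-based ranking are all assumptions made for illustration.

```python
import hashlib


def proxies_for_user(user_id: str, proxies: list[str], k: int) -> list[str]:
    """Return the k access points this (hypothetical) user is allowed to learn.

    Each proxy is ranked by a hash of the user and proxy names, so the
    subset is stable for a given user but differs between users.
    """
    def score(proxy: str) -> bytes:
        return hashlib.sha256(f"{user_id}|{proxy}".encode()).digest()

    return sorted(proxies, key=score)[:k]


def discovered(user_ids: list[str], proxies: list[str], k: int) -> set[str]:
    """Access points a censor learns by controlling the given accounts."""
    seen: set[str] = set()
    for user_id in user_ids:
        seen.update(proxies_for_user(user_id, proxies, k))
    return seen


proxies = [f"proxy-{i}.example" for i in range(100)]
alice = proxies_for_user("alice", proxies, k=3)
censor_view = discovered(["censor-1", "censor-2"], proxies, k=3)
```

Under these assumptions a censor controlling two accounts learns at most six of the hundred access points, so enumerating them all requires many accounts. The assignment still has to be made by something the censor cannot simply query for every identity, which is easier to arrange when a single operator controls it.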

Marlinspike doesn't discuss the other type of censorship resistance, resistance to data being deleted from decentralized systems, such as blockchains.

Availability [21:31]

In his brief discussion, Marlinspike uses the example of sharding a database between two data centers, which halves the mean time between failures for the system as a whole.

This is somewhat misleading. In his example, the system has gone from a binary failure model, it is either up or down, to a model where failures degrade the system rather than cause complete failure. In many cases this is preferable, especially if there are large numbers of shards so a failure degrades the system only slightly. Fault-tolerance can be an important feature of decentralized systems (e.g. LOCKSS). But the fault-tolerance comes at two kinds of cost, the cost of replication, and the cost of coordinating between the replicas. Done right, decentralization can improve both fault-tolerance and resilience against attack, but only at significant cost in resources and performance.
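
The arithmetic behind both views can be sketched as follows. The model is an assumption of mine, not something from the talk: sites fail independently with exponentially distributed times between failures, and the figures (1000 hours between failures, 10 hours to repair) are hypothetical.

```python
def mtbf_any_failure(mtbf_per_site: float, sites: int) -> float:
    """Mean time until the first of `sites` independent sites fails."""
    return mtbf_per_site / sites


def site_availability(mtbf: float, mttr: float) -> float:
    """Long-run fraction of time a single site is up."""
    return mtbf / (mtbf + mttr)


def prob_all_up(mtbf: float, mttr: float, sites: int) -> float:
    """Probability that every shard is up at once (the binary view)."""
    return site_availability(mtbf, mttr) ** sites


def expected_fraction_available(mtbf: float, mttr: float, sites: int) -> float:
    """Expected share of the data reachable when each site holds 1/sites of it."""
    a = site_availability(mtbf, mttr)
    return sum(a / sites for _ in range(sites))


MTBF_HOURS, MTTR_HOURS = 1000.0, 10.0  # hypothetical figures
one_site = mtbf_any_failure(MTBF_HOURS, 1)
two_sites = mtbf_any_failure(MTBF_HOURS, 2)
```

With two sites the time to the first failure halves, as Marlinspike says, and the chance that everything is up at once falls. But the expected fraction of the data that is reachable is the same as for a single site, because each failure now takes out only half of it. Sharding alone changes the shape of failures, not the average; improving the average takes replication, with the costs noted above.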

Control [22:26]

Marlinspike starts "People feel the Internet is this terrible place, in ways I don't think people used to feel, ... and a lot of this comes down to a feeling that we have a lack of control." He continues by discussing two ways decentralization advocates suggest users can exert control, switching among federated services, and extensibility.

In a federated environment, different services can behave differently, so when one no longer satisfies a user's need, she can switch to another. Marlinspike assumes that the user's identity is per-service, as it is for example with e-mail (user@example.com). This does make switching difficult as doing so requires the user to rebuild their social graph. He observes that many people still use Yahoo mail!

His assumption does cover many cases, but it is possible for decentralized systems to share a single user-generated identity (Self-sovereign identity). An example is the use of a public key as an identity.

His example of a [25:24] "protocol that's designed to be extended, so that people can modify the technology in ways that meet their needs" is XMPP, which as he says ended up as a morass of XEPs. The result was a lot of uncertainty in the user experience - "you want to send a video, there's a XEP for that, does the recipient support that?". And despite its extensibility, it couldn't adapt to major changes like mobile environments. The result wasn't control, since XEP extensions provided little value unless they were adopted everywhere. Similarly, he points to Bitcoin, where extensibility takes the form of forks, leading to fragmentation. This has more to do with open source than decentralization, which the cryptocurrency world has failed at.


Marlinspike concludes that the problem is that developing and deploying technology involves "buildings full or rooms full of people sitting in front of computers 8 hours/day every day forever". To change technology so it serves our needs better, what is needed is to make developing and deploying technology easier, which isn't what decentralization does.

Marlinspike vs. Me

My skepticism was laid out in, among others, It Isn't About The Technology, Decentralized Web Summit2018: Quick Takes and Special Report on Decentralizing the Internet. Then I was asked to summarize what would be needed for success apart from working technology (which we pretty much have). My answer, in What Does The Decentralized Web Need?, was four things:
  • A sustainable business model. A decentralized system in which all nodes run the same software isn't decentralized. A truly decentralized system needs to be supported by an ecosystem with multiple suppliers, each with a viable business model, and none big enough to dominate the market. As W. Brian Arthur demonstrated in 1994, increasing returns to scale make this hard to achieve in technology markets. And almost the only semi-viable business model for small Web companies is advertising, with really strong increasing returns to scale.
  • Anti-trust enforcement. As Steve Faktor wrote:
    It turns out that startups are Trojan horses. We think of them as revolutionaries when in fact, they’re the farm team for the establishment.
    These days startups get bought by the incumbent giants before they can become big and very profitable, and thus pose a threat to the incumbents. Without a return to effective anti-trust enforcement, this is what would happen if, despite the odds, a decentralized system succeeded.
  • The killer app. I wrote:
    The killer app will not be "[centralized app] but decentralized", because it won't be as good as [centralized app]. Even if it were, these days who needs Second Life, let alone "Second Life, but on the blockchain"? It has to be something that users need, but that cannot be implemented by a centralized system.
    It is really hard to find an application that can't be implemented on a centralized system, and even harder to find one of them that users would actually want.
  • A way to remove content. I wrote:
    Unfortunately, politicians love to pose as defending their constituents from bad people by passing laws censoring content on the Web, preferably by forcing the incumbent platforms to do it for them. Laws against child porn and "terrorism", and for the "right to be forgotten", "protection" of personal information, and "protection" of intellectual property all require Web publishing systems to implement some means for removing content.
    In the absence of mechanisms that enable censorship, it won't just be the incumbent platforms trying to kill our new, small companies, it will be governments.
    Removing content from a well-designed decentralized system is hard to implement, which is why the advocates believe they are censorship-resistant. But succeeding in the face of both the incumbent platforms and governments is unlikely.

[Figure: Ether miners, 07/09/19]

Finally, if the decentralized system is implemented, deployed and becomes successful, it needs to stay decentralized. As we see with by far the most prominent decentralized technology, blockchains, this never happens. As I described in 2014's Economies of Scale in Peer-to-Peer Networks, very powerful economic forces drive centralization of a successful decentralized system.

As you can see, Marlinspike's arguments are based largely on technical issues, whereas mine are based largely on economic issues. But we agree that the fundamental problem is that decentralized systems inherently provide users a worse experience than centralized systems along the axes that the vast majority of users care about. We each place stress on a different set of factors causing this. Marlinspike makes a strong case that they provide a worse experience even along the axes that the decentralized advocates claim to care about. I make the case that even if they defeat the odds and succeed, like blockchains they will not remain actually decentralized.


Dragan Espenschied said...

Perhaps a less "puristic" view on decentralization can be helpful. For instance, Mastodon and Nextcloud are pretty successful in providing great options to lots of users. Neither of them is built on p2p protocols, nor have they replaced their centralized competition. They are based on standardized protocols and can happily co-exist with centralized services, even integrate with them, but also are different or "decentralized" exactly in areas that their users want. Economically, their user bases are tiny compared to for instance Dropbox or Twitter, but enough to keep enough developers working on improvements, or even forks and slightly different implementations.

Of course these projects are based on very established patterns: an online office and a micro blogging network. They didn't have to do the ground design work of researching how users should work together on an office file, share a calendar, exchange messages, or technically manage communities. Basically, the "AB testing" or whatever other expensive design process, has already been done a decade ago, and most users are familiar enough with the established patterns to dream up their own, smaller innovative features that actually make sense for them. This can actually be pleasant since fully centralized systems oftentimes economically require "constant innovation" leading to stuff nobody has asked for and that only looks good in a powerpoint deck—see for instance the latest Dropbox update.

Clearly Nextcloud has benefited from (pretty reasonable) EU regulations, but also both Nextcloud and Mastodon mainly seem to thrive on reasonable community processes and management, and integration with like-minded projects—which is cheaper than acquiring other companies, or a general mission that includes some portion of world-domination.

David. said...

Thanks, Dragan.

Mastodon has been pretty successful, claiming about 4.4M users. So has Nextcloud, claiming about 0.25M servers. Neither Marlinspike nor I are arguing that decentralized systems "left over from a time when computers were for computer people" can't be successful for a geeky audience.

But the goal of the movement is to "re-decentralize the Internet". At that scale 4.4M users and 0.25M servers are insignificant. Marlinspike's Signal app has around 20M downloads, and is a little less insignificant (and centralized).

Marlinspike and I are arguing that if your goal is to re-decentralize the Internet, you are starting with some very significant hurdles to overcome.

Dragan Espenschied said...

Yes I guess what I was trying to get at was that "the whole internet" seems to be the wrong goal for that movement, which seems to borrow its narrative from "classic" Silicon Valley entrepreneurship. Decentralization projects that actually deliver, as in, provide value to their users, just operate in a different space altogether. That's not necessarily "geeky" (I guess the government employees that have to work with Nextcloud are not any more geeky than their colleagues who have to use Office365), but a focus on different goals.

David. said...

Anna Wiener's Taking back Our Privacy is subtitled:

"Moxie Marlinspike, the founder of the end-to-end encrypted messaging service Signal, is “trying to bring normality to the Internet.”"

It is a fascinating look at Marlinspike and encryption:

"Enforcing laws, Marlinspike believes, should be difficult. He likes to say that “we should all have something to hide,” a statement that he intends not as a blanket endorsement of criminal activity but as an acknowledgment that the legal system can be manipulated, and that even the most banal activities or text messages can be incriminating. In his view, frequent lawbreaking points to systemic rot. He often cites the legalization of same-sex marriage and, in some states, marijuana as evidence that people sometimes need to challenge laws or engage in nominally criminal activity for years before progress can be made. “Before, it was inconceivable,” he said. “After, it was inconceivable that it was ever inconceivable.” Privacy, he says, is a necessary condition for experimentation, and for social change. He compares the need for a secure digital space to the need for a private domestic one—where, for instance, a child might safely experiment with gender identity or expression."

Cobey Williamson said...

I believe Dragan is precisely correct. P2P and decentralization have zero in common with the “whole internet”; in fact they are antithetical to it. The “whole internet” refers to a virtual experience that exists entirely on centralized servers, the Matrix of digital life. P2P and decentralization are not network endeavors, but rather the utilization of network infrastructure and standardized protocols to achieve individual exchanges.

Unknown said...

I think the whole internet is probably the wrong goal. The positives of decentralisation for me, acknowledging the costs (which I think everyone recognises), are:

Creating enduring structures - so sometimes you have a lot of data/creative effort in an internet place that you really like, then the Company that owns it is sold and the place you really liked may change or even be shut down. Decentralised communities are maintained by those who are interested - they may die if no one is interested but if people want to keep them going that is their choice, there is no authority (and you can talk about the clustering of power in decentralised projects but it can be mitigated) which can change things suddenly, without consent or warning.

Secondly decentralised blockchain projects particularly are great for getting people to collaborate to build something and fractionally reward them for doing so and allowing them a stake in the project. For instance you could have a platform that rewarded both creators and users of information internationally. It's not really feasible to have 100,000 shareholders on a Company Register that a central authority has to mediate for a small project but they can hold their crypto and have a stake in a project no problem. Like cooperatives but on a larger, more international scale with trust among those who don't know each other.

Lack of regulation - people are raising funds, building things, messing around with far less oversight than traditional structures and that allows them the possibility of creating things that the current structures would abhor. So it allows us to think in ways that are heretical to current hierarchies and raise funding far more easily than finding that one VC that shares your vision. This is a negative too of course with scams everywhere, but maybe this is the opportunity to reintroduce Caveat Emptor - certainly crypto is helping to increase my antifragility in respect of scams (the hard way)!

So I would say there are positives around building communities and different types of organisations. Also, functionality and solutions have increasing capability. There are some use-cases I think are pretty good - for instance real-time financial information for public companies or land registry or company records - I think all benefit from an immutable blockchain where all amendments are public.

So not the whole internet but maybe a subset?