Monday, November 10, 2014

Gossip protocols: a clarification

a subtype of “gossip” protocols" and cites LOCKSS as an example, saying:
Not coincidentally, LOCKSS “consists of a large number of independent, low-cost, persistent Web caches that cooperate to detect and repair damage to their content by voting in “opinion polls” (PDF). In other words, gossip and anti-entropy.
The main use for gossip protocols is to disseminate information in a robust, randomized way, by having each peer forward information it receives from other peers to a random selection of other peers. As the function of LOCKSS boxes is to act as custodians of copyright information, this would be a very bad thing for them to do.

It is true that LOCKSS peers communicate via an anti-entropy protocol, and it is even true that the first such protocol they used, the one I implemented for the LOCKSS prototype, was a gossip protocol in the sense that peers forwarded hashes of content to each other. Alas, that protocol was very insecure. Some of the ways in which it was insecure related directly to its being a gossip protocol.

An intensive multi-year research effort in cooperation with Stanford's CS department to create a more secure anti-entropy protocol led to the current  protocol, which won "Best Paper" at the 2003 Symposium on Operating System Principles. It is not a gossip protocol in any meaningful sense (see below the fold for details). Peers never forward information they receive from other peers, all interactions are strictly pair-wise and private.

For the TRAC audit of the CLOCKSS Archive we provided an overview of the operation of the LOCKSS anti-entropy protocol; if you are interested in the details of the protocol this, rather than the long and very detailed paper in ACM Transactions on Computer Systems (PDF), is the place to start.

According to Wikipedia:
a gossip protocol is one that satisfies the following conditions:
  • The core of the protocol involves periodic, pairwise, inter-process interactions.
  • The information exchanged during these interactions is of bounded size.
  • When agents interact, the state of at least one agent changes to reflect the state of the other.
  • Reliable communication is not assumed.
  • The frequency of the interactions is low compared to typical message latencies so that the protocol costs are negligible.
  • There is some form of randomness in the peer selection. Peers might be selected from the full set of nodes or from a smaller set of neighbors.
  • Due to the replication there is an implicit redundancy of the delivered information.
The current LOCKSS anti-entropy protocol does not meet this definition. Peer communications are periodic and pairwise, but each pairwise communication forms part of a poll (anti-entropy operation) not the whole of one. When peers communicate, their state may change but the new state may not be a reflection of the state of the other. There is no implicit redundancy of the delivered information, the information delivered between two peers is specific to that pair of peers and is never shared with any other peer.

The redundancy of preserved content in a LOCKSS network is a higher-level concept than the details of individual peer communication. The current protocol is a peer-to-peer consensus protocol.

No comments: