Gossip Protocol - AiCassindra

How Gossip Works

Imagine a cocktail party. You tell a rumor to 3 random friends. In the next round, they tell 3 random friends. Exponentially, the entire party knows the rumor very quickly. Distributed systems use this mechanism to propagate state updates.

Every second (or configurable interval), a node selects a few random peer nodes and exchanges information. They compare their knowledge of the cluster state (using version numbers or vector clocks) and merge the diffs.

Advertisement

Key Attributes

Peer-to-Peer: No central bottleneck.
Robust: Can tolerate node failures; the message finds a way.
Eventually Consistent: It takes (\log N)$ rounds for information to propagate to all $ nodes.

Advertisement

Phi Accrual Failure Detector

Cassandra uses Gossip not just for state but for failure detection. Instead of a binary "up/down", it calculates a probability ($\Phi$) that a node is down based on the history of heartbeat intervals. This adapts to network latency automatically.

Visualization

Below is a simulation of Gossip infection. Click "Start Gossip" to see how a single piece of information spreads through the cluster.

← Back to Distributed Systems