Review on “Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications”

Tharmakulasingham Inthirakumaaran
3 min readDec 23, 2021

It was introduced in 2001 by Ion Stoica, Robert Morris, David Karger, Frans Kaashoek, and Hari Balakrishnan, and was developed at MIT.

This paper is about a new scalable distributed protocol for lookup in a dynamic peer-to-peer system with frequent node arrivals and departures. According to the authors, the main operation of the chord protocol is “given a key, it maps the key onto a node”. The applications that use the chord are responsible for storing the data in a node. The paper mentioned that the 3 features that distinguish Chord from other lookup protocols are its simplicity, provable correctness, and provable performance. They have achieved this goal by having a routing table with the information of fewer other nodes and consistent hashing to create identifiers for keys & nodes. Chord’s design has overcome the following issues in peer-to-peer systems: load balancing, decentralization, scalability, availability, and flexible naming conventions.

As per the paper, they have used the consistent hash function to assign each node and key an m-bit identifier by using a base hash function such as SHA-1. According to the authors, this has made the keys & nodes uniformly distributed which contributed to minimal load balancing in nodes as well as made the Chord protocol deterministic. According to protocol, nodes & keys are arranged in an identifier circle that has at most 2m nodes, ranging from 0 to 2m — 1 in a m-bit identifier space. Key k is assigned to the first node whose identifier is equal to or follows (the identifier of) k in the identifier space which is denoted as successor. The lookup table used in the Chord protocol is called the “finger table” which contains the information about m out of N nodes(m < N, At most m will be log N). Thus each node maintains information only about O(log N) other nodes, and resolves all lookups via O(logN) messages to other nodes.

According to this paper, in a dynamic network, nodes can join and leave at any time. Whenever a node joins or leaves, Chord needs to preserve two invariants to ensure correctness. First, each node’s successor should be correctly maintained and the second one is for every key k, node successor(k) is responsible for k. In order for lookups to be fast, it is also desirable for the finger tables to be correct. The following three tasks should be done for a newly joined node n:

  1. Initialize node n,
  2. Notify other nodes to update their predecessors and finger tables
  3. The new node takes over its responsible keys from its successor.

In practice, Chord needs to handle nodes joining the system concurrently and nodes that fail or leave voluntarily. To handle these situations, a stabilization protocol should be running periodically in the background which updates finger tables and successor pointers. When node n leaves the network, all of its assigned key-value pairs are reassigned to n’s successor.

According to this paper, the experiments were evaluated by simulations. Load balancing, path length, simultaneous node failures, and lookups during stabilization were experimented. There are some research gaps identified and recommended for future works by authors such as no specific mechanism to heal partitioned rings, the possibility for an incorrect view of the Chord ring , and incorrect identifiers due to a malicious or buggy set of Chord participants, and improving lookup latency.

In this paper, the authors claim that Chord will be a valuable component for peer-to-peer, large-scale distributed applications such as cooperative file sharing, time-shared available storage systems, distributed indices for document and service discovery, and large-scale distributed computing platforms.

Some facts about the Chord

  • Chord offers a powerful primitive: given a key, it determines the node responsible for storing the key’s value, and does so efficiently.
  • In the steady-state, in an -node network, each node maintains routing information for only about O(log N) other nodes, and resolves all lookups via O(log N) messages to other nodes.
  • Updates to the routing information for nodes leaving and joining require only O(log2 N) messages.
  • Chord scales well with the number of nodes, recover from large numbers of simultaneous node failures and joins, and answers most lookups correctly even during recovery.

--

--