Consistent Hashing

How distributed systems minimize data movement when nodes change — consistent hashing maps both keys and nodes to a ring.

Read this as How much key movement happens when capacity changes?
Failure Trap
The ring reduces remapping, but hot keys and uneven virtual nodes can still overload shards.
Decision Rule
Use virtual nodes, replicas, and load signals so minimal movement does not become uneven movement.
Consistent hashing on a hash ring A six-step explainer. First, modulo hashing forces most keys to move when a server is added. Then keys and nodes are mapped onto a ring; each key is owned by the first node found walking clockwise. Adding or removing a node moves only the keys in one arc, not all of them. Modulo hashing scatters % 3 S1 S2 S3 add S4 % 4 S1 S2 S3 S4 6 of 9 keys move Changing N rehashes almost every key The hash ring 90° 180° 270° Hash space bent into a ring Nodes sit on the ring N1 45° N2 150° N3 280° Node owns its arc clockwise A key walks clockwise N1 N2 N3 100° 100° → first node clockwise: N2 Add N4: one arc moves N1 N2 N3 N4 NEW Only ~K/(N+1) keys move Remove N2: keys hand off N1 N3 N2 N3 inherits only N2's keys
1 / ?

Traditional Hashing: The Problem

Simple hashing uses hash(key) % num_servers. With 3 servers, adding a 4th means hash(key) % 4 instead of % 3 — almost every key maps to a different server!

This causes a redistribution storm: massive data movement on any topology change.

  • Modulo hashing: key_server = hash(key) % N
  • Change N → almost all keys move
  • Expensive in distributed systems
  • Caches invalidated, data copied

The Consistent Hashing Ring

Consistent hashing imagines the hash space as a ring. Instead of modulo, we map both keys AND nodes onto this ring, then assign each key to the nearest node clockwise.

The ring is just a conceptual view of the hash range (e.g., 0 to 2³²-1) wrapped into a circle.

  • Hash space forms a circular continuum
  • Both nodes and keys are hashed
  • Position determines ownership

Nodes Positioned on the Ring

Each node is placed on the ring based on its hash (e.g., hash of IP address). A node "owns" all positions from the previous node up to itself.

N2 at 150° owns everything from 46° to 150°.

  • Node position = hash(node_id)
  • Ownership range = (prev_node, this_node]
  • Even distribution is ideal
  • Keys go to first node clockwise

Keys Find Their Node

To find a key's owner: hash the key, then walk clockwise until you hit a node. That node stores the key.

Key "user-123" hashes to 100°. Walking clockwise, the first node is N2 at 150°. N2 owns this key.

  • Key position = hash(key)
  • Owner = first node clockwise
  • Lookup is O(log N) with sorted node list

Add Node: Minimal Movement

When N4 joins at 200°, it takes ownership of keys between 150° and 200° — keys that WERE going to N3. Only those keys move.

On average, roughly 1/(N+1) of keys move when adding a node (about K/N for large N). Much better than modulo hashing!

  • Only keys in new range move
  • Average: ~K/(N+1) keys affected
  • Other nodes untouched

Remove Node: Graceful Handoff

When N2 leaves, its keys simply extend their clockwise journey to the next node (N3). Only N2's keys move.

This is why consistent hashing is used in caching, distributed databases, and load balancers — it enables graceful scaling.

  • Removed node's keys go to successor
  • Only 1/N keys affected
  • Enables graceful scaling up/down