Consistent hashing maps both keys and nodes to a circular hash space (ring). Each key is assigned to the first node clockwise from its hash position. When a node joins or leaves, only keys near that node need to move—not the entire dataset. This makes it ideal for distributed caches and databases where nodes change frequently.
Visual Overview
Consistent Hashing Ring
THE HASH RING
```
                    0°
               N1 ──●
                  /   \     ● K1
                 /     \
         270° ●         ● 90°
              N3        N2
                \       /
                 \ K2 ●
                  \   /
                    ●
                   180°

 Hash(K1) = 80°  → walks clockwise → lands on N2
 Hash(K2) = 200° → walks clockwise → lands on N3

 Rule: a key belongs to the first node clockwise
```
WHY NOT MODULO HASHING?
```
Modulo: node = hash(key) % N

With N=3 nodes:
  Key "user_1" (hash=10): 10 % 3 = 1 → Node 1
  Key "user_2" (hash=15): 15 % 3 = 0 → Node 0
  Key "user_3" (hash=23): 23 % 3 = 2 → Node 2

Add a 4th node (N=4):
  Key "user_1": 10 % 4 = 2 → MOVED to Node 2!
  Key "user_2": 15 % 4 = 3 → MOVED to Node 3!
  Key "user_3": 23 % 4 = 3 → MOVED to Node 3!

→ ALMOST ALL keys remapped! Cache miss storm!
```
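You can see the remapping problem directly: hash many keys, place them with modulo, add one node, and count how many keys change homes. This is a minimal sketch (the key names and node counts are illustrative); the moved fraction typically lands near 75%, since only keys whose hash gives the same remainder mod 3 and mod 4 stay put.

```python
import hashlib

def node_for(key: str, num_nodes: int) -> int:
    """Modulo placement: hash the key, take it mod the node count."""
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return h % num_nodes

keys = [f"user_{i}" for i in range(10_000)]
moved = sum(1 for k in keys if node_for(k, 3) != node_for(k, 4))
print(f"{moved / len(keys):.0%} of keys remapped going from 3 to 4 nodes")
```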
Core Explanation
What is Consistent Hashing?
Real-World Analogy: Imagine a circular running track with mile markers (0-100). Runners (nodes) stand at certain positions on the track. When a new item arrives, you calculate which mile marker it corresponds to, then walk clockwise until you hit the first runner—that runner is responsible for the item.
If a runner leaves, only their items need to be picked up by the next runner clockwise. If a new runner joins, they only take items between them and the previous runner.
How It Works
Consistent Hashing Algorithm
ALGORITHM STEPS
```
1. CREATE THE RING
   └─ Ring positions: 0 to 2^32 (or 360°)

2. PLACE NODES ON RING
   └─ For each node: position = hash(node_id)
      Node A: hash("A") = 100
      Node B: hash("B") = 200
      Node C: hash("C") = 300

3. ASSIGN KEY TO NODE
   └─ key_pos = hash(key)
      Walk clockwise from key_pos
      First node you hit owns this key

EXAMPLE:
  hash("user_123") = 150
  150 → walk clockwise → first node is B (200)
  Result: "user_123" lives on Node B
```
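The three steps above fit in a few lines of Python. This is a minimal sketch, assuming one ring position per node, MD5 truncated to 32 bits, and illustrative node names; the fuller implementation at the end of this page adds virtual nodes on top of the same idea.

```python
import hashlib
from bisect import bisect_left

def ring_hash(s: str) -> int:
    """Step 1: ring positions are 32-bit integers (0 to 2^32 - 1)."""
    return int(hashlib.md5(s.encode()).hexdigest()[:8], 16)

# Step 2: place each node on the ring, sorted by position
nodes = ["node-A", "node-B", "node-C"]
ring = sorted((ring_hash(n), n) for n in nodes)
positions = [pos for pos, _ in ring]

def owner(key: str) -> str:
    """Step 3: hash the key, binary-search clockwise to the first node."""
    idx = bisect_left(positions, ring_hash(key))
    return ring[idx % len(ring)][1]   # wrap past the top back around

print(owner("user_123"))   # deterministic: same key always maps the same way
```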
Adding a Node
BEFORE: 3 Nodes (A, B, C)
```
                 0°
            A ───●── 100°
               /   \
              /     \  ●●● Keys 101-200
      300° ●         ● 200°
           C         B
              \     /
               \   /
                ●●● Keys 201-300

 A owns: 301-100 (wraps around)
 B owns: 101-200
 C owns: 201-300
```
AFTER: Add Node D at 150°
```
                 0°
            A ───●── 100°
               /   \
              /     D ──●── 150°
      300° ●         ● 200°
           C         B

 A owns: 301-100 (unchanged)
 D owns: 101-150 (NEW - taken from B)
 B owns: 151-200 (reduced range)
 C owns: 201-300 (unchanged)

 → Only keys 101-150 moved! (half of B's old range)
 → A and C completely unaffected
```
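The claim that only the new node's slice of keys moves is easy to verify empirically. A sketch with one ring position per node (node and key names are illustrative):

```python
import hashlib
from bisect import bisect_left

def ring_hash(s: str) -> int:
    return int(hashlib.md5(s.encode()).hexdigest()[:8], 16)

def make_owner(nodes):
    """Build a lookup: key → first node clockwise on the ring."""
    ring = sorted((ring_hash(n), n) for n in nodes)
    positions = [pos for pos, _ in ring]
    return lambda key: ring[bisect_left(positions, ring_hash(key)) % len(ring)][1]

keys = [f"key_{i}" for i in range(10_000)]
before = make_owner(["A", "B", "C"])
after = make_owner(["A", "B", "C", "D"])

moved = [k for k in keys if before(k) != after(k)]
# Every key that moved went to the new node; nothing else was reshuffled.
assert all(after(k) == "D" for k in moved)
print(f"{len(moved)} of {len(keys)} keys moved, all to the new node D")
```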
Removing a Node
BEFORE: 4 Nodes (A, B, C, D)
```
 D at 150° owns keys 101-150
 B at 200° owns keys 151-200
```
AFTER: Remove Node D
```
 D removed at 150°
 B (next clockwise) absorbs D's keys
 B now owns keys 101-200

 → Only D's keys moved
 → A and C completely unaffected
```
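Like the add case, removal disturbs only one node's keys, and all of them land on a single successor. A sketch with one ring position per node (names illustrative):

```python
import hashlib
from bisect import bisect_left

def ring_hash(s: str) -> int:
    return int(hashlib.md5(s.encode()).hexdigest()[:8], 16)

def make_owner(nodes):
    """Build a lookup: key → first node clockwise on the ring."""
    ring = sorted((ring_hash(n), n) for n in nodes)
    positions = [pos for pos, _ in ring]
    return lambda key: ring[bisect_left(positions, ring_hash(key)) % len(ring)][1]

keys = [f"key_{i}" for i in range(10_000)]
before = make_owner(["A", "B", "C", "D"])
after = make_owner(["A", "B", "C"])

moved = [k for k in keys if before(k) != after(k)]
# Only keys that lived on D changed owner, and they all went to one successor.
assert all(before(k) == "D" for k in moved)
assert len({after(k) for k in moved}) <= 1
print(f"{len(moved)} keys moved off the removed node D")
```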
KEY INSIGHT
```
 With N nodes and K total keys:

   Node added:   ~K/N keys move (only the new node's)
   Node removed: ~K/N keys move (to successor)

 Compare to modulo: ~K keys move (almost all!)
```
The Problem: Uneven Distribution
Uneven Distribution Problem
PROBLEM: NODES DON'T HASH EVENLY
```
 Node positions: A at 50°, B at 250°, C at 300°

   A owns 301°-50°  (110° of arc) → 31% of ring
   B owns  51°-250° (200° of arc) → 55% of ring  ← HOTSPOT!
   C owns 251°-300° ( 50° of arc) → 14% of ring

 This is just bad luck with hash positions
```
SOLUTION: VIRTUAL NODES
```
 Give each physical node MULTIPLE ring positions

   Physical Node A → Virtual: A1, A2, A3, A4
   Physical Node B → Virtual: B1, B2, B3, B4
   Physical Node C → Virtual: C1, C2, C3, C4

 12 virtual nodes spread more evenly on the ring
 Each physical node gets ~33% of keys

 More vnodes = better balance (typically 100-256)
```
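The effect of virtual nodes is measurable: compute how much of the ring each physical node owns with one position per server versus 150. A sketch (the hostnames and vnode counts are illustrative):

```python
import hashlib

RING_SIZE = 2**32

def ring_hash(s: str) -> int:
    return int(hashlib.md5(s.encode()).hexdigest()[:8], 16)

def ownership(nodes, vnodes_per_node):
    """Percent of the ring each physical node owns."""
    ring = sorted(
        (ring_hash(f"{n}:vnode:{i}"), n)
        for n in nodes
        for i in range(vnodes_per_node)
    )
    share = dict.fromkeys(nodes, 0)
    prev = ring[-1][0] - RING_SIZE      # the first arc wraps past zero
    for pos, node in ring:
        share[node] += pos - prev       # arc (prev, pos] goes to this vnode
        prev = pos
    return {n: 100 * s / RING_SIZE for n, s in share.items()}

servers = ["cache-0", "cache-1", "cache-2"]
for count in (1, 150):
    pct = ownership(servers, count)
    print(f"{count:>3} vnode(s): " +
          ", ".join(f"{n}={p:.1f}%" for n, p in pct.items()))
```

With a single position per server the split is whatever the hash happened to give; with 150 vnodes each server's share converges toward an even ~33%.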
Real Systems Using Consistent Hashing
| System | Use Case | Virtual Nodes | Notes |
|--------|----------|---------------|-------|
| Amazon DynamoDB | Partitioning | Yes (256 default) | Design descends from the Dynamo paper |
| Apache Cassandra | Data distribution | Yes (configurable) | Default 256 vnodes per node |
| Memcached (ketama) | Cache routing | Yes | Client-side hashing |
| Redis Cluster | Slot assignment | Fixed 16384 slots | Slots, not pure consistent hashing |
| Riak | Key-value storage | Yes | Pioneered virtual nodes |
| Discord | Guild/server routing | Yes | Routes users to servers |
Case Study: Distributed Cache
Distributed Cache with Consistent Hashing
DISTRIBUTED CACHE ARCHITECTURE
```
  Application
      ↓
  CacheClient (consistent hashing)
      ├──► hash("user:123") → Cache Server A
      ├──► hash("user:456") → Cache Server B
      └──► hash("user:789") → Cache Server C

  SCENARIO: Server B dies
    Without consistent hashing:
      All keys rehash → 66% cache miss rate!
    With consistent hashing:
      Only B's keys (33%) miss → 33% miss rate
      A and C keys still hit!

  SCENARIO: Add Server D
    Only ~25% of keys move to D (cold)
    Other 75% still hit existing caches
```
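The routing layer in this architecture can be sketched as a small client-side class that hashes keys onto a ring of servers and, when the chosen server is marked dead, falls back to the next one clockwise. This is illustrative rather than a production client — real ketama-style clients add health checks, connection pooling, and actual network I/O; the server names here are placeholders.

```python
import hashlib
from bisect import bisect_left

def ring_hash(s: str) -> int:
    return int(hashlib.md5(s.encode()).hexdigest()[:8], 16)

class CacheClient:
    """Routes keys to cache servers via a consistent hash ring."""

    def __init__(self, servers):
        self.ring = sorted((ring_hash(s), s) for s in servers)
        self.positions = [pos for pos, _ in self.ring]
        self.dead = set()   # servers the client has marked as failed

    def server_for(self, key: str) -> str:
        """First live server clockwise from the key's ring position."""
        idx = bisect_left(self.positions, ring_hash(key))
        for i in range(len(self.ring)):
            server = self.ring[(idx + i) % len(self.ring)][1]
            if server not in self.dead:
                return server
        raise RuntimeError("no live cache servers")

client = CacheClient(["cache-a", "cache-b", "cache-c"])
primary = client.server_for("user:123")
client.dead.add(primary)               # simulate the primary failing
fallback = client.server_for("user:123")
assert fallback != primary             # routed to the next clockwise server
```

Keys whose primary server is still alive keep routing exactly as before, which is the whole point: a single failure only turns that server's share of keys into cache misses.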
When to Use Consistent Hashing
✓ Perfect Use Cases
Consistent Hashing Use Cases
DISTRIBUTED CACHING
Scenario: Memcached cluster with frequent node changes
Requirement: Minimize cache miss storms during scaling
Configuration: 100+ vnodes per physical node
Trade-off: Client needs consistent hashing library
DATABASE SHARDING
Scenario: Horizontal scaling of user data
Requirement: Predictable data location, minimal resharding
Configuration: Vnodes based on disk capacity
Trade-off: Range queries span multiple nodes
LOAD BALANCING (Session Affinity)
Scenario: Route users to same backend server
Requirement: Same user always hits same server
Configuration: Hash on user ID or session token
Trade-off: Uneven load if user activity varies
CDN EDGE ROUTING
Scenario: Route requests to nearest/best edge server
Requirement: Consistent routing, graceful failover
Configuration: Geographic + hash-based assignment
Trade-off: May need to consider latency too
✕ When NOT to Use
When Consistent Hashing May Not Fit
SMALL FIXED CLUSTERS
Problem: 3-5 nodes that rarely change
Alternative: Simple modulo or explicit assignment
Why: Complexity not worth it for rare rebalancing
RANGE QUERIES REQUIRED
Problem: Need to query ranges of keys
Alternative: B-tree partitioning, range-based sharding
Why: Consistent hashing scatters adjacent keys
CENTRALIZED COORDINATOR EXISTS
Problem: Already have a master that tracks assignments
Alternative: Let coordinator manage key-node mapping
Why: Consistent hashing is for decentralized routing
Interview Application
Common Interview Question
Q: “Design a distributed cache like Memcached. How would you route requests to cache servers?”
Strong Answer:
“I’d use consistent hashing for routing. Here’s my design:
Why Consistent Hashing:
Cache servers will fail and scale frequently
Modulo hashing causes massive cache invalidation on any change
Consistent hashing limits key movement to ~K/N keys
Implementation:
Hash ring: Map 0 to 2^32 onto a circle
Place servers: Hash each server hostname to ring position
Route keys: Hash key, walk clockwise to first server
Virtual nodes: 150 vnodes per physical server for balance
Client-side routing:
```python
server = ring.get_node(key)   # the ring hashes the key internally
response = server.get(key)
```
Handling server failures:
Client detects failure (timeout/error)
Mark server dead, route to next clockwise
Failed server’s keys become cache misses (expected)
When server recovers, gradually warm up
Replication for availability:
Store key on N servers (walk clockwise, pick N)
Read from any replica
Write to all N (or quorum)
Real-world example:
Consistent hashing was introduced by Karger et al. in 1997; Amazon’s Dynamo paper (2007) popularized it for storage systems. Most modern distributed caches and databases use it.”
Follow-up: How do virtual nodes improve the system?
“Virtual nodes solve the uneven distribution problem:
Without vnodes:
3 servers might hash to positions that give one server 60% of keys
That server becomes a hotspot
With vnodes (e.g., 150 per server):
Each physical server has 150 ring positions
450 total positions spread more evenly
Statistical balancing: each server gets ~33% ± small variance
Additional benefits:
Heterogeneous hardware: More powerful servers get more vnodes
Faster rebalancing: When server dies, load spreads across many servers (not just one successor)
Gradual migration: Can move vnodes one at a time during maintenance”
Code Example
Consistent Hash Ring (Python)
```python
import hashlib
from bisect import bisect_left
from typing import List, Optional
from dataclasses import dataclass


@dataclass
class VirtualNode:
    """A virtual node on the hash ring."""
    position: int
    physical_node: str
    vnode_id: int


class ConsistentHashRing:
    """
    Consistent hash ring with virtual nodes.

    Usage:
        ring = ConsistentHashRing(vnodes_per_node=150)
        ring.add_node("cache-server-1")
        ring.add_node("cache-server-2")
        server = ring.get_node("user:12345")  # Returns assigned server

    Note: This implementation maintains a sorted positions list for
    O(log N) lookups. Production implementations may use more
    sophisticated data structures (e.g., skip lists, balanced trees).
    """

    def __init__(self, vnodes_per_node: int = 150):
        self.vnodes_per_node = vnodes_per_node
        self.ring: List[VirtualNode] = []   # Sorted by position
        self.positions: List[int] = []      # Parallel list for O(log N) bisect
        self.nodes: set = set()

    def _hash(self, key: str) -> int:
        """Hash a key to a position on the ring (0 to 2^32)."""
        digest = hashlib.md5(key.encode()).hexdigest()
        return int(digest[:8], 16)

    def add_node(self, node: str) -> List[str]:
        """
        Add a physical node with multiple virtual nodes.
        Returns list of keys that would need to move to this node.
        """
        if node in self.nodes:
            return []
        self.nodes.add(node)
        # Add virtual nodes at different ring positions
        for i in range(self.vnodes_per_node):
            vnode_key = f"{node}:vnode:{i}"
            position = self._hash(vnode_key)
            vnode = VirtualNode(position, node, i)
            # Insert at the correct index to maintain sorted order
            idx = bisect_left(self.positions, position)
            self.ring.insert(idx, vnode)
            self.positions.insert(idx, position)
        return []  # In production, return keys that need migration

    def remove_node(self, node: str) -> List[str]:
        """
        Remove a physical node and all its virtual nodes.
        Returns list of keys that need to move to successor nodes.
        """
        if node not in self.nodes:
            return []
        self.nodes.remove(node)
        # Rebuild ring and positions without the removed node
        new_ring = []
        new_positions = []
        for vn, pos in zip(self.ring, self.positions):
            if vn.physical_node != node:
                new_ring.append(vn)
                new_positions.append(pos)
        self.ring = new_ring
        self.positions = new_positions
        return []  # In production, return keys that need migration

    def get_node(self, key: str) -> Optional[str]:
        """
        Find the node responsible for a key. O(log N) lookup.
        Returns the first node clockwise from the key's position.
        """
        if not self.ring:
            return None
        position = self._hash(key)
        # O(log N) bisect on the pre-computed positions list
        idx = bisect_left(self.positions, position)
        # Wrap around if past the end
        if idx >= len(self.ring):
            idx = 0
        return self.ring[idx].physical_node

    def get_nodes(self, key: str, n: int = 3) -> List[str]:
        """
        Get N unique nodes for a key (for replication).
        Walks clockwise and collects unique physical nodes.
        """
        if not self.ring:
            return []
        position = self._hash(key)
        idx = bisect_left(self.positions, position)
        nodes = []
        seen = set()
        for i in range(len(self.ring)):
            vnode = self.ring[(idx + i) % len(self.ring)]
            if vnode.physical_node not in seen:
                nodes.append(vnode.physical_node)
                seen.add(vnode.physical_node)
                if len(nodes) >= n:
                    break
        return nodes

    def get_distribution(self) -> dict:
        """Get the ring-space distribution across nodes (for monitoring)."""
        if not self.ring:
            return {}
        # Count ring space owned by each node
        distribution = {node: 0 for node in self.nodes}
        prev_pos = self.ring[-1].position
        for vnode in self.ring:
            # Space from the previous position to this position
            if vnode.position > prev_pos:
                space = vnode.position - prev_pos
            else:
                # Wrapped around past zero
                space = (2**32 - prev_pos) + vnode.position
            # This space belongs to this vnode's physical node
            distribution[vnode.physical_node] += space
            prev_pos = vnode.position
        # Convert to percentages
        total = sum(distribution.values())
        return {node: (space / total) * 100
                for node, space in distribution.items()}


# Usage example
if __name__ == "__main__":
    ring = ConsistentHashRing(vnodes_per_node=150)

    # Add cache servers
    for i in range(3):
        ring.add_node(f"cache-{i}.example.com")

    print("Distribution with 3 nodes:")
    for node, pct in ring.get_distribution().items():
        print(f"  {node}: {pct:.1f}%")

    # Route some keys
    print("\nKey routing:")
    for key in ["user:123", "user:456", "session:abc", "product:789"]:
        node = ring.get_node(key)
        replicas = ring.get_nodes(key, n=2)
        print(f"  {key} → primary: {node}, replicas: {replicas}")

    # Add a fourth server
    print("\nAdding cache-3.example.com...")
    ring.add_node("cache-3.example.com")
    print("Distribution with 4 nodes:")
    for node, pct in ring.get_distribution().items():
        print(f"  {node}: {pct:.1f}%")

    # Show the same keys (most should stay on the same servers)
    print("\nKey routing after adding node:")
    for key in ["user:123", "user:456", "session:abc", "product:789"]:
        node = ring.get_node(key)
        print(f"  {key} → {node}")
```