The fundamental trade-off in distributed systems: during a network partition, you must choose between consistency and availability
TL;DR
The CAP theorem states that during a network partition, a distributed system must choose between consistency (all nodes see the same data) and availability (every request gets a response). You cannot have both simultaneously when the network splits. This isn’t a limitation to overcome—it’s a fundamental property of distributed systems proven mathematically.
Core Explanation
What is the CAP Theorem?
Real-World Analogy: Imagine you’re running a chain of coffee shops with a shared inventory system. You sell the last bag of a special blend at Shop A.
- With strong consistency: Shop B can’t sell that blend until they confirm Shop A’s sale—but if the network is down between shops, Shop B is stuck (unavailable).
- With high availability: Shop B continues taking orders, potentially selling a bag they don’t have—the system stays up but might be wrong (inconsistent).
You can’t have both during a network problem. That’s CAP.
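The coffee-shop trade-off can be sketched in a few lines. This is a toy model (the shop names and `sell` helper are illustrative, not any real API): each shop holds a local view of the single remaining bag, and the link between shops is down.

```python
stock = {"shop_a": 1, "shop_b": 1}  # each shop's local view of one shared bag
link_up = False                     # network partition between the shops

def sell(shop: str, mode: str) -> str:
    if mode == "CP" and not link_up:
        return "unavailable"        # refuse rather than risk overselling
    if stock[shop] > 0:
        stock[shop] -= 1            # local write; cannot replicate during partition
        return "sold"
    return "out of stock"

assert sell("shop_b", "CP") == "unavailable"  # consistent, but shop B is stuck
assert sell("shop_a", "AP") == "sold"
assert sell("shop_b", "AP") == "sold"         # same last bag sold twice: inconsistent
```

Under CP the partitioned shop refuses the sale; under AP both shops sell the same last bag and the inventory diverges.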
The Key Insight: Partitions Happen
The theorem only constrains behavior during network partitions. When the network is healthy, CAP doesn’t force any trade-offs—systems can provide strong consistency and high availability simultaneously. The critical insight:
Partition tolerance is not optional. Networks will fail. The real question is: when they do, do you sacrifice C or A?
Making the Decision
The right choice depends on your domain. Ask: “What’s worse for my users—seeing old data or seeing nothing?”
| Domain | Worse Outcome | CAP Choice |
|---|---|---|
| Bank balance | Showing wrong balance → overdraft | CP |
| Inventory count | Overselling → angry customers | CP |
| Social media feed | Slight delay in seeing posts | AP |
| Shopping cart | Cart contents briefly out of sync | AP |
| Distributed lock | Two processes think they hold lock | CP |
| DNS lookup | Slight delay in DNS propagation | AP |
Real Systems and Their CAP Behaviors
These classifications describe default or typical configurations. Most systems are tunable—the same system can behave as CP or AP depending on consistency level settings.
| System | Typical Behavior | Configuration Notes | Use Case |
|---|---|---|---|
| ZooKeeper | CP-leaning | Requires majority for writes; minority partitions read-only | Distributed coordination |
| etcd | CP-leaning | Raft consensus requires majority | Kubernetes config store |
| Consul | Tunable | Default CP; can configure for AP with stale reads | Service discovery |
| Cassandra | Tunable | CL=ONE is AP; CL=QUORUM is CP-leaning | Multi-datacenter database |
| DynamoDB | Tunable | Eventually consistent reads (AP); strongly consistent reads (CP) | High-availability storage |
| CockroachDB | CP-leaning | Serializable by default; unavailable without quorum | Distributed SQL |
| MongoDB | Tunable | Depends on write/read concern settings | Document database |
| Riak | AP-leaning | Sibling values, application-level conflict resolution | Key-value store |
Interview note: Avoid saying “X is a CP system” without qualification. Real systems offer configuration knobs. Focus on describing behavior under specific configurations.
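Most of the "tunable" rows in the table reduce to quorum arithmetic: with N replicas, a read quorum of size R and a write quorum of size W are guaranteed to overlap (so every read sees the latest write) exactly when R + W > N. A minimal sketch of that condition, with Cassandra-style level names used only as examples:

```python
def is_strongly_consistent(n: int, r: int, w: int) -> bool:
    """Read and write quorums overlap - so every read hits at least one
    replica holding the latest write - exactly when R + W > N."""
    return r + w > n

# N=3 with QUORUM reads and writes: 2 + 2 > 3, quorums overlap (CP-leaning)
assert is_strongly_consistent(3, 2, 2)
# N=3 with ONE/ONE: 1 + 1 <= 3, a read may miss the latest write (AP-leaning)
assert not is_strongly_consistent(3, 1, 1)
```

This is why the same system lands in different CAP rows depending on configuration: changing R and W moves it across the boundary.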
When to Use Each
✓ Choose CP (Consistency) When
- Acting on wrong data is costly or dangerous: bank balances, inventory counts, distributed locks
- Refusing requests during a partition is preferable to serving stale state
✓ Choose AP (Availability) When
- Briefly stale data is acceptable: social media feeds, shopping carts, DNS
- Users expect the system to keep responding, and conflicts can be reconciled after the partition heals
Interview Application
Common Interview Question
Q: “Explain the CAP theorem. How would you design a distributed database that needs to handle network partitions?”
Strong Answer:
“The CAP theorem, proven by Gilbert and Lynch in 2002, states that a distributed system cannot simultaneously provide all three guarantees during a network partition:
- Consistency: Every read receives the most recent write
- Availability: Every request receives a response
- Partition Tolerance: System operates despite network failures
Key insight: Partition tolerance isn’t optional in distributed systems—networks will fail. So the real choice is between C and A when a partition occurs.
For my database design, I’d first ask: ‘What’s the cost of inconsistency vs unavailability?’
For a banking system: I’d choose CP. Wrong balances can cause overdrafts, legal issues. During a partition, the minority side becomes unavailable for writes. I’d implement this with consensus protocols like Raft, requiring majority quorum for commits.
For a social media feed: I’d choose AP. Users expect the app to work. Seeing a post 30 seconds late is acceptable. During partition, all nodes accept writes. After healing, I’d use CRDTs or last-write-wins to merge conflicts.
Important nuance: CAP is about the partition moment. Most of the time there’s no partition, and you can have all three. Also, different parts of a system can make different choices—payments might be CP while user profiles are AP.”
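The last-write-wins merge mentioned in the answer can be sketched as a tiny register. This is a simplified illustration (the `LWWRegister` class is hypothetical, not a library type); ties are broken by node id so the merge is deterministic and commutative:

```python
from dataclasses import dataclass

@dataclass
class LWWRegister:
    """Last-write-wins register: both sides of a partition keep writing
    locally; on heal, the value with the newest timestamp wins."""
    value: str
    timestamp: float
    node_id: str  # tie-breaker, so merge is deterministic

    def merge(self, other: "LWWRegister") -> "LWWRegister":
        if (other.timestamp, other.node_id) > (self.timestamp, self.node_id):
            return other
        return self

a = LWWRegister("post-from-a", 100.0, "node-a")
b = LWWRegister("post-from-b", 105.0, "node-b")
assert a.merge(b).value == "post-from-b"  # newer write wins
assert a.merge(b) == b.merge(a)           # merge order doesn't matter
```

Note the trade-off: LWW silently discards the older write, which is acceptable for a feed but not for a balance — another reason different parts of a system make different CAP choices.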
Follow-up: What is PACELC and how does it extend CAP?
“PACELC extends CAP to consider behavior when there’s no partition. It says:
- PAC: During Partition, choose Availability or Consistency
- ELC: Else (no partition), choose Latency or Consistency
Even without partitions, strong consistency requires coordination (more latency). So systems like DynamoDB offer eventual consistency for lower latency (EL) or strong consistency with higher latency (EC).
Example: Cassandra is PA/EL—it chooses availability during partitions and low latency normally. Spanner is PC/EC—it always chooses consistency, accepting latency for synchronous replication.”
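The PACELC classifications from the answer can be written down directly. The labels below restate the examples above as typical defaults — real systems are tunable, so treat these as rough characterizations rather than fixed properties:

```python
# PACELC labels: (during Partition, Else) — typical defaults, not fixed properties
PACELC = {
    "Cassandra": ("PA", "EL"),  # available during partition; low latency otherwise
    "DynamoDB":  ("PA", "EL"),
    "Spanner":   ("PC", "EC"),  # consistent during partition; consistent otherwise
}

def describe(system: str) -> str:
    p, e = PACELC[system]
    during = "availability" if p == "PA" else "consistency"
    otherwise = "latency" if e == "EL" else "consistency"
    return f"{system}: during partition choose {during}; else choose {otherwise}"

print(describe("Cassandra"))
print(describe("Spanner"))
```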
Code Example
Demonstrating CP vs AP Behavior
```python
"""
Simulating CAP theorem trade-offs in a distributed key-value store.
"""
from enum import Enum
from dataclasses import dataclass
import time


class CAPMode(Enum):
    CP = "consistency"   # Prioritize consistency over availability
    AP = "availability"  # Prioritize availability over consistency


@dataclass
class Node:
    name: str
    data: dict
    is_connected: bool = True  # Simulates network partition


class DistributedStore:
    """A simplified distributed store demonstrating CAP trade-offs."""

    def __init__(self, mode: CAPMode, nodes: list[Node]):
        self.mode = mode
        self.nodes = {n.name: n for n in nodes}
        self.primary = nodes[0].name

    def write(self, key: str, value: str) -> dict:
        """Write a key-value pair. Behavior differs by CAP mode during a partition."""
        primary = self.nodes[self.primary]
        # Write to primary
        primary.data[key] = {"value": value, "timestamp": time.time()}
        # Try to replicate to the other nodes
        replication_results = []
        for name, node in self.nodes.items():
            if name == self.primary:
                continue
            if node.is_connected:
                # Node reachable - replicate synchronously
                node.data[key] = primary.data[key].copy()
                replication_results.append((name, "synced"))
            else:
                # Node unreachable - partition!
                replication_results.append((name, "unreachable"))
        # Check whether we achieved quorum (majority)
        synced_count = sum(1 for _, status in replication_results if status == "synced")
        total_nodes = len(self.nodes)
        has_quorum = (synced_count + 1) > total_nodes / 2  # +1 for primary
        if self.mode == CAPMode.CP:
            # CP: Require quorum for the write to succeed
            if not has_quorum:
                # Roll back the primary write
                del primary.data[key]
                return {
                    "success": False,
                    "error": "Quorum not reached - write rejected for consistency",
                    "mode": "CP",
                }
            return {
                "success": True,
                "message": f"Write committed with quorum ({synced_count + 1}/{total_nodes})",
                "mode": "CP",
            }
        else:  # AP mode
            # AP: Accept the write even without quorum
            return {
                "success": True,
                "message": f"Write accepted (synced to {synced_count + 1}/{total_nodes})",
                "warning": "Some nodes may have stale data" if not has_quorum else None,
                "mode": "AP",
            }

    def read(self, key: str, from_node: str) -> dict:
        """Read a key from a specific node. Behavior differs by CAP mode during a partition."""
        node = self.nodes.get(from_node)
        if not node:
            return {"success": False, "error": "Node not found"}
        if self.mode == CAPMode.CP:
            # CP: Only return a value if we can verify it's current
            # (simplified: a real CP system would also check that the
            # primary itself still holds a majority)
            if not node.is_connected and from_node != self.primary:
                return {
                    "success": False,
                    "error": "Node partitioned - cannot guarantee consistency",
                    "mode": "CP",
                }
        # AP mode or connected node: return whatever we have
        if key in node.data:
            return {
                "success": True,
                "value": node.data[key]["value"],
                "timestamp": node.data[key]["timestamp"],
                "node": from_node,
                "mode": self.mode.value,
                "warning": "Value may be stale" if not node.is_connected else None,
            }
        return {"success": False, "error": "Key not found"}


# Demonstration
if __name__ == "__main__":
    # Create 3 nodes
    nodes = [
        Node("node-a", {}),
        Node("node-b", {}),
        Node("node-c", {}),
    ]
    print("=" * 60)
    print("SCENARIO: Network partition isolates node-c")
    print("=" * 60)
    # Simulate partition - node-c is unreachable
    nodes[2].is_connected = False

    # Test CP mode
    print("\n--- CP MODE (Consistency Priority) ---")
    cp_store = DistributedStore(CAPMode.CP, nodes)
    result = cp_store.write("balance", "1000")
    print(f"Write result: {result}")  # CP requires 2/3 nodes (quorum) - this succeeds

    # Read from the partitioned node
    result = cp_store.read("balance", "node-c")
    print(f"Read from partitioned node-c: {result}")  # CP refuses - can't guarantee consistency

    # Reset for AP test
    for n in nodes:
        n.data = {}
    nodes[2].is_connected = False

    # Test AP mode
    print("\n--- AP MODE (Availability Priority) ---")
    ap_store = DistributedStore(CAPMode.AP, nodes)
    result = ap_store.write("balance", "1000")
    print(f"Write result: {result}")  # AP accepts - warns about partial sync

    # Now partition node-a and node-b, and reconnect node-c
    nodes[0].is_connected = False
    nodes[1].is_connected = False
    nodes[2].is_connected = True

    # Read from node-c (which missed the write)
    result = ap_store.read("balance", "node-c")
    print(f"Read from node-c (missed write): {result}")  # AP returns stale/missing data - but it responds!
```
Common Misconceptions
- "Pick any two of three": partition tolerance isn't optional in a distributed system, so the real choice is C vs A during a partition.
- "CAP applies all the time": it only constrains behavior while the network is partitioned; a healthy system can provide both C and A.
- "Systems are strictly CP or AP": most real systems are tunable, and different operations within one system can make different choices.
Related Content
See It In Action:
- CAP Theorem Explainer - Visual walkthrough of partition scenarios
Related Concepts:
- Eventual Consistency - The AP consistency model
- Quorum - Tuning the C-A trade-off
- Consensus - Strong consistency protocols
Quick Self-Check
- Can explain CAP theorem in 60 seconds?
- Understand why partition tolerance is non-optional?
- Can give examples of CP vs AP system choices?
- Know what happens to each side of a partition in CP vs AP?
- Understand that CAP only applies during partitions?
- Can explain why the same system might choose CP for some operations and AP for others?
Interview Notes
- Frequency: comes up in roughly 85% of distributed systems interviews
- Relevance: underpins every distributed database
- Applies to: architecture decisions, partition handling