The fundamental concepts of distributed computing: how multiple machines coordinate to appear as a single coherent system, navigating network partitions, failures, and the CAP theorem
TL;DR
A distributed system is a collection of independent computers that appear to users as a single coherent system. The fundamental challenge: coordinating multiple machines over an unreliable network while handling partial failures. Key constraints are captured by the CAP theorem—during network partitions, you must choose between consistency and availability.
Why Distributed Systems?
The Single Machine Limit
Every system eventually hits the limits of a single machine:
| Constraint | Single Machine Limit | Distributed Solution |
|---|---|---|
| CPU | ~128 cores | 1000s of machines |
| Memory | ~12 TB | Petabytes across cluster |
| Storage | ~100 TB | Exabytes across cluster |
| Availability | Machine fails = down | Redundancy, failover |
| Latency | Geography bound | Edge locations worldwide |
Core Motivations
- Scalability: Handle more load than one machine can
- Availability: Continue operating despite failures
- Latency: Serve users from nearby locations
- Cost: Commodity hardware vs expensive mainframes
The CAP Theorem
Proven by Seth Gilbert and Nancy Lynch (2002), building on Eric Brewer’s conjecture:
During a network partition, a distributed system must choose between Consistency and Availability.
The Three Properties
Consistency (C): Every read returns the most recent write or an error. All nodes see the same data at the same time.
Availability (A): Every request to a non-failing node receives a response (no timeouts, no errors).
Partition Tolerance (P): The system continues operating despite arbitrary network partitions.
Why You Can’t Have All Three
Consider two nodes during a network partition:
        Node A                         Node B
          │                              │
          │      Network Partition       │
          │ ═══════════X═══════════      │
          │                              │
Client writes X=1 to A                   │
          │                              │
          │   Cannot replicate           │
          │   to B (partition)           │
          │                              │
          │                  Client reads X from B
          │                              │
          │          CHOICE:             │
          │   Return X=0 (stale)    → AP: Available but inconsistent
          │   Return an error       → CP: Consistent but unavailable
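A minimal sketch of that choice, assuming a hypothetical `Replica` class with a `policy` flag (the names are illustrative, not from any real database): a CP replica refuses possibly stale reads during a partition, while an AP replica answers with whatever value it last saw.

```python
class PartitionError(Exception):
    """Raised when a CP replica refuses to serve a possibly stale read."""

class Replica:
    def __init__(self, policy: str):
        self.policy = policy          # "CP" or "AP"
        self.data = {"X": 0}          # last value this replica has seen
        self.partitioned = False      # True while cut off from the other node

    def read(self, key: str) -> int:
        if self.partitioned and self.policy == "CP":
            # Consistency first: refuse to answer rather than risk staleness.
            raise PartitionError(f"cannot guarantee latest value of {key}")
        # Availability first: answer with whatever we have, possibly stale.
        return self.data[key]

# Node B never received the write X=1, then the partition hits.
b_ap = Replica("AP"); b_ap.partitioned = True
b_cp = Replica("CP"); b_cp.partitioned = True

print(b_ap.read("X"))   # 0 -> available but inconsistent
try:
    b_cp.read("X")
except PartitionError as e:
    print("error:", e)  # consistent but unavailable
```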
Real-World CAP Choices
| System | Choice | Behavior During Partition |
|---|---|---|
| ZooKeeper | CP | Minority partition stops serving |
| etcd | CP | Requires quorum for reads/writes |
| Cassandra | AP | Continues serving, eventual consistency |
| DynamoDB | AP (default) | Eventually consistent reads |
| Spanner | CP | Sacrifices availability for consistency |
Fallacies of Distributed Computing
Classic misconceptions that lead to system failures:
- The network is reliable — Packets drop, connections break
- Latency is zero — Cross-continent: 100-200ms RTT
- Bandwidth is infinite — Congestion, throttling
- The network is secure — Man-in-the-middle attacks, spoofing
- Topology doesn’t change — Nodes join, leave, fail
- There is one administrator — Multi-team, multi-org
- Transport cost is zero — Serialization, bandwidth costs
- The network is homogeneous — Different protocols, versions
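Defensive client code follows directly from the first two fallacies. A rough sketch, assuming a hypothetical `fetch` callable that performs the actual network call: bound every request with a timeout and retry transient failures with exponential backoff and jitter, rather than assuming the network will cooperate.

```python
import random
import time

def call_with_retries(fetch, retries=3, timeout=2.0):
    """fetch(timeout=...) is any network call that may raise OSError/TimeoutError."""
    for attempt in range(retries):
        try:
            return fetch(timeout=timeout)      # never assume zero latency
        except (OSError, TimeoutError):
            if attempt == retries - 1:
                raise                          # surface the failure to the caller
            # Back off exponentially, with jitter to avoid thundering herds.
            time.sleep((2 ** attempt) * 0.1 + random.uniform(0, 0.1))
```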
Key Concepts
Network Partitions
A partition occurs when network failure divides the cluster:
Before Partition:
[N1] ←→ [N2] ←→ [N3] ←→ [N4] ←→ [N5]
After Partition:
[N1] ←→ [N2] ═══X═══ [N3] ←→ [N4] ←→ [N5]
└─ Minority ─┘ └─── Majority ───┘
Split-brain: Both sides think they’re the leader → data divergence.
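The standard defense against split-brain is to let a node act as leader only while it can reach a strict majority of the cluster. A small sketch under that assumption (the `reachable_peers` count stands in for a real membership or heartbeat mechanism):

```python
def has_quorum(reachable_peers: int, cluster_size: int) -> bool:
    # Count ourselves, then require a strict majority of the full cluster.
    return (reachable_peers + 1) > cluster_size // 2

def leader_step(reachable_peers: int, cluster_size: int) -> str:
    if has_quorum(reachable_peers, cluster_size):
        return "serve writes"
    return "step down"          # the minority side stops accepting writes

# 5-node cluster split 2 / 3: only the majority side keeps a leader.
print(leader_step(reachable_peers=1, cluster_size=5))  # step down    (N1, N2)
print(leader_step(reachable_peers=2, cluster_size=5))  # serve writes (N3-N5)
```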
Failure Detection
How do you know if a node is dead or just slow?
| Method | Mechanism | Trade-off |
|---|---|---|
| Heartbeats | Periodic “I’m alive” messages | Short timeouts → false positives; long timeouts → slow detection |
| Phi Accrual | Suspicion level computed from heartbeat arrival history | More accurate, more complex |
| Gossip | Nodes share failure info | Eventually consistent |
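A bare-bones heartbeat detector, as a sketch: the 3-second timeout is an arbitrary illustration, and real detectors such as phi accrual adapt the threshold to observed heartbeat intervals instead of fixing it.

```python
import time

class HeartbeatDetector:
    def __init__(self, timeout_s: float = 3.0):
        self.timeout_s = timeout_s
        self.last_seen: dict[str, float] = {}

    def heartbeat(self, node: str) -> None:
        """Record an 'I'm alive' message from a node."""
        self.last_seen[node] = time.monotonic()

    def suspected(self, node: str) -> bool:
        """A node is suspected if its last heartbeat is too old.
        Note: 'suspected' is not 'dead' -- the node may only be slow."""
        last = self.last_seen.get(node)
        return last is None or (time.monotonic() - last) > self.timeout_s
```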
Ordering and Time
In distributed systems, “happened before” is ambiguous:
| Clock Type | Mechanism | Use Case |
|---|---|---|
| Lamport Clocks | Logical counters | Causal ordering |
| Vector Clocks | Per-node counters | Detect conflicts |
| TrueTime | Atomic clocks + GPS | Google Spanner |
| HLC | Hybrid logical/physical | CockroachDB |
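Lamport clocks are small enough to sketch in full: a counter that ticks on every local event and jumps forward on message receipt, so that if event a happened before event b, then L(a) < L(b) (the converse does not hold).

```python
class LamportClock:
    def __init__(self):
        self.time = 0

    def tick(self) -> int:
        """Local event: advance the counter."""
        self.time += 1
        return self.time

    def send(self) -> int:
        """Attach the current timestamp to an outgoing message."""
        return self.tick()

    def receive(self, message_time: int) -> int:
        """Merge: jump past the sender's clock, then tick."""
        self.time = max(self.time, message_time)
        return self.tick()

a, b = LamportClock(), LamportClock()
t = a.send()            # a's clock: 1
print(b.receive(t))     # b's clock: 2 -> the receive is ordered after the send
```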
Patterns for Handling Failures
1. Replication
Copy data to multiple nodes:
- Leader-Follower: One writes, others follow
- Multi-Leader: Multiple writers, conflict resolution
- Leaderless: All nodes equal, quorum writes
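A toy leader-follower sketch (illustrative classes, not a real replication protocol): the leader appends each write to a log and followers replay it in order, with a single flag standing in for the synchronous vs. asynchronous replication choice.

```python
class Node:
    def __init__(self):
        self.log: list[tuple[str, int]] = []
        self.data: dict[str, int] = {}

    def apply(self, entry: tuple[str, int]) -> None:
        key, value = entry
        self.log.append(entry)
        self.data[key] = value

class Leader(Node):
    def __init__(self, followers: list[Node], synchronous: bool = True):
        super().__init__()
        self.followers = followers
        self.synchronous = synchronous

    def write(self, key: str, value: int) -> None:
        self.apply((key, value))
        if self.synchronous:
            # Block the write until every follower has applied the entry.
            for follower in self.followers:
                follower.apply((key, value))
        # Asynchronous mode would ship entries in the background instead,
        # trading durability for lower write latency.

f1, f2 = Node(), Node()
leader = Leader([f1, f2])
leader.write("X", 1)
print(f1.data["X"], f2.data["X"])   # 1 1
```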
2. Quorum Systems
Require overlapping read and write sets: W + R > N
N = 5 replicas
W = 3 (write to 3 nodes)
R = 3 (read from 3 nodes)
At least one of the R nodes read holds the latest acknowledged write
→ With version numbers, the reader can always pick the newest value
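The same arithmetic as a sketch (N, W, R values from above; the replica slicing is a stand-in for “any W acknowledgers” and “any R responders”): because the read and write sets overlap in at least one replica, a versioned read always sees the latest acknowledged write.

```python
N, W, R = 5, 3, 3
replicas = [{"version": 0, "value": 0} for _ in range(N)]

def quorum_write(value: int, version: int) -> None:
    for replica in replicas[:W]:               # any W replicas acknowledge
        replica.update(version=version, value=value)

def quorum_read() -> int:
    contacted = replicas[N - R:]               # any R replicas; overlaps the W set
    newest = max(contacted, key=lambda r: r["version"])
    return newest["value"]

quorum_write(value=42, version=1)
print(quorum_read())   # 42 -- the overlap guarantees the newest write is seen
```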
3. Consensus Algorithms
Agree on a single value despite failures:
- Paxos: Theoretical foundation, complex
- Raft: Understandable consensus
- Zab: ZooKeeper’s algorithm
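To make “agree on a single value” concrete, here is a heavily simplified, Raft-flavored election sketch: a candidate wins a term only if a strict majority of nodes grants their single vote for that term. Real Raft adds log-freshness checks, randomized election timeouts, and actual RPCs.

```python
def request_votes(candidate: str, term: int, nodes: dict[str, dict]) -> bool:
    votes = 0
    for state in nodes.values():
        # Each node votes at most once per term.
        if term > state["voted_term"]:
            state["voted_term"] = term
            state["voted_for"] = candidate
            votes += 1
    return votes > len(nodes) // 2      # strict majority wins the term

nodes = {n: {"voted_term": 0, "voted_for": None} for n in "ABCDE"}
print(request_votes("A", term=1, nodes=nodes))  # True  -> A becomes leader
print(request_votes("B", term=1, nodes=nodes))  # False -> votes for term 1 already spent
```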
Related Content
Related Concepts:
- Consensus - How nodes agree on values
- Eventual Consistency - Relaxed consistency model
- Quorum - Majority-based decisions
- Leader-Follower Replication - Primary-replica pattern
Used In Systems:
- Every cloud-native application
- Databases (PostgreSQL, MySQL with replication)
- Message queues (Kafka, RabbitMQ)
- Coordination services (ZooKeeper, etcd, Consul)
Next Recommended: Consensus - Learn how distributed systems agree on values despite failures
Interview Notes
- Appears in roughly 90% of system design interviews
- Underpins every cloud-native system
- Core trade-offs: network latency and coordination overhead
- Foundation for horizontal scaling