Write-Ahead Log (WAL)
How databases and streaming systems ensure durability and crash recovery using write-ahead logging.
The Durability Challenge
Databases keep data in memory for fast access. But memory is volatile — a power failure, crash, or restart erases everything.
How do we guarantee that committed data survives failures?
- Memory is fast but volatile
- Disk is slow but durable
- We need both speed AND safety
Write-Ahead: Log Before Memory
The solution: before updating memory, write the operation to a Write-Ahead Log on disk. This is an append-only file that records every change.
The key word is "ahead" — the log write happens BEFORE the memory write.
- Append-only log on durable storage
- fsync ensures bits hit disk
- Sequential writes are fast
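A minimal sketch of the append step in Python. The function name `append_entry` and the one-JSON-object-per-line record format are assumptions made for illustration, not something a particular database uses. How strong the guarantee behind `os.fsync` is depends on the operating system and the storage hardware.

```python
import json
import os


def append_entry(log_path: str, entry: dict) -> None:
    """Append one record to the log and force it to durable storage."""
    line = json.dumps(entry) + "\n"
    # Mode "a" always writes at the end of the file: the log is append-only.
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(line)
        log.flush()             # move the bytes from Python's buffer to the OS
        os.fsync(log.fileno())  # ask the OS to push its cache to the device
```

Without the `flush` and `fsync` pair, the record could still be sitting in a buffer when the power fails, so the write would not yet be durable.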
Then Update Memory
Only after the log entry is durably on disk do we update the in-memory data structures. If we crash after logging but before updating memory, we can replay the log to recover the change.
The log is the source of truth.
- Memory update comes second
- If crash after log: replay recovers
- If crash before log: operation never happened
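The ordering can be sketched as a tiny key-value store. `WalStore`, its `put` method, and the JSON-lines record format are hypothetical names chosen for this example. The point is only that the in-memory dictionary is touched after the log write has been fsynced, never before.

```python
import json
import os


class WalStore:
    """Toy key-value store that logs every write before applying it."""

    def __init__(self, log_path: str) -> None:
        self.data = {}  # volatile in-memory state
        self._log = open(log_path, "a", encoding="utf-8")

    def put(self, key: str, value) -> None:
        record = json.dumps({"op": "put", "key": key, "value": value})
        # Step 1: make the operation durable.
        self._log.write(record + "\n")
        self._log.flush()
        os.fsync(self._log.fileno())
        # Step 2: only now apply it in memory. A crash between the two
        # steps loses nothing, because the record is already in the log.
        self.data[key] = value

    def close(self) -> None:
        self._log.close()
```

If the process dies before step 1 completes, the caller never got a successful return from `put`, so the operation counts as never having happened.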
Crash! But Data is Safe
A crash wipes all memory. But the WAL on disk survives. Every committed operation is recorded there, waiting to be replayed.
This is why "committed" means "logged" — not "in memory."
- Memory loss is expected
- Log survives crashes
- Durability = in the log
Replay WAL to Recover
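On restart, the system loads the last checkpoint, then reads the WAL from that point and replays each entry in order. The memory state is reconstructed exactly as it was after the last committed operation.
Periodic checkpoints reduce replay time by flushing memory to data files, so only the log entries written after the checkpoint need to be replayed.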
- Replay is deterministic
- Checkpoints reduce recovery time
- Same log = same state
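Recovery with a checkpoint might look like the sketch below. Everything here is an assumption for illustration: the names `apply`, `write_checkpoint`, and `recover`, the JSON-lines log format, and the choice to store the checkpoint as a JSON snapshot together with the byte offset in the log that it covers. Real systems use more compact formats and also handle details this sketch skips, such as a partially written final record.

```python
import json
import os


def apply(state: dict, entry: dict) -> None:
    """Apply one log entry to the in-memory state."""
    if entry["op"] == "put":
        state[entry["key"]] = entry["value"]
    elif entry["op"] == "delete":
        state.pop(entry["key"], None)


def write_checkpoint(path: str, state: dict, log_offset: int) -> None:
    """Persist a snapshot plus the log byte offset it already includes."""
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as f:
        json.dump({"state": state, "log_offset": log_offset}, f)
        f.flush()
        os.fsync(f.fileno())
    # Swap the new checkpoint in with a single rename, so a crash while
    # writing leaves the previous checkpoint untouched.
    os.replace(tmp, path)


def recover(log_path: str, checkpoint_path: str) -> dict:
    """Rebuild state: load the checkpoint, then replay the log tail."""
    state, offset = {}, 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path, encoding="utf-8") as f:
            snapshot = json.load(f)
        state, offset = snapshot["state"], snapshot["log_offset"]
    if os.path.exists(log_path):
        with open(log_path, "rb") as log:
            log.seek(offset)  # skip entries the checkpoint already covers
            for raw in log:
                apply(state, json.loads(raw))
    return state
```

Because `apply` depends only on the current state and the entry, replaying the same log from the same checkpoint always produces the same result, whether or not a checkpoint exists.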