Read this as: What must be durable before a change is acknowledged?
- Failure Trap: Acknowledging state that exists only in memory and cannot be replayed after a crash.
- Decision Rule: Append and flush the log before mutating durable state; recover by replaying from the last checkpoint.
The Durability Challenge
Databases keep data in memory for fast access. But memory is volatile — a power failure, crash, or restart erases everything.
How do we guarantee that committed data survives failures?
- Memory is fast but volatile
- Disk is slow but durable
- We need both speed AND safety
Write-Ahead: Log Before Memory
The solution: before updating memory, write the operation to a Write-Ahead Log (WAL) on disk. This is an append-only file that records every change.
The key word is "ahead" — the log write happens BEFORE the memory write.
- Append-only log on durable storage
- fsync ensures bits hit disk
- Sequential writes are fast
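The append-and-flush step can be sketched in Python as follows. This is a minimal illustration, not a production log writer: the function name append_entry and the one-record-per-line text format are assumptions made for the example.

```python
import os

def append_entry(log_path: str, entry: str) -> None:
    """Append one record to the log and force it to stable storage.

    Hypothetical helper: the name and the line-per-record format are
    assumptions for illustration only.
    """
    # "a" opens in append mode, so existing records are never overwritten.
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(entry + "\n")
        log.flush()             # move the data from Python's buffer to the OS
        os.fsync(log.fileno())  # ask the OS to write its cache to the device
```

Both calls are needed: flush() alone only hands the bytes to the operating system, which may still hold them in its page cache when power is lost.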
Then Update Memory
Only after the log entry is durably on disk do we update the in-memory data structures. If we crash after logging but before memory update, we can replay the log.
The log is the source of truth.
- Memory update comes second
- If crash after log: replay recovers
- If crash before log: operation never happened
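The ordering above (log first, memory second) might look like this in a toy key-value store. The class WalStore, its method names, and the "SET key=value" / "DEL key" record format are all hypothetical, chosen only to make the ordering visible.

```python
import os

class WalStore:
    """Hypothetical key-value store that logs each change before applying it."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.memory = {}  # volatile state, lost on a crash

    def _log(self, record):
        # Append the record and force it to disk before returning.
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(record + "\n")
            log.flush()
            os.fsync(log.fileno())

    def set(self, key, value):
        self._log(f"SET {key}={value}")  # step 1: durable log entry
        self.memory[key] = value         # step 2: in-memory update

    def delete(self, key):
        self._log(f"DEL {key}")          # step 1: durable log entry
        self.memory.pop(key, None)       # step 2: in-memory update
```

If the process dies between step 1 and step 2, the record is already in the log and replay can reapply it. If it dies before step 1 completes, the caller never received an acknowledgment, so the operation can be treated as never having happened.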
Crash! But Data is Safe
A crash wipes all memory. But the WAL on disk survives. Every committed operation is recorded there, waiting to be replayed.
This is why "committed" means "logged" — not "in memory."
- Memory loss is expected
- Log survives crashes
- Durability = in the log
Replay WAL to Recover
On restart, the system reads the WAL from the last checkpoint and replays each entry in order. Replaying [1] SET x=5 and [2] SET y=10 restores those keys; replaying [3] DEL z then removes z, so a deleted key stays gone rather than being resurrected.
Periodic checkpoints reduce replay time by flushing memory to data files.
- Replay is deterministic
- Deletes replay too: z stays gone
- Same log = same state
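Recovery can be sketched as a pure function over the checkpointed state and the log records written after it. This is an illustrative sketch under stated assumptions: the function name replay is hypothetical, records are assumed to be "SET key=value" or "DEL key" lines with the sequence numbers omitted, and the checkpoint is modeled as a plain dictionary.

```python
def replay(log_lines, checkpoint_state=None):
    """Rebuild in-memory state from a checkpoint plus later log records.

    Hypothetical helper: record format and checkpoint shape are assumptions.
    """
    # Start from a copy so the checkpoint itself is never mutated.
    state = dict(checkpoint_state or {})
    for line in log_lines:
        op, _, arg = line.strip().partition(" ")
        if op == "SET":
            key, _, value = arg.partition("=")
            state[key] = value
        elif op == "DEL":
            state.pop(arg, None)  # deleting a missing key is a no-op
    return state


# The checkpoint held z; the log after it sets x and y, then deletes z.
recovered = replay(["SET x=5", "SET y=10", "DEL z"], {"z": "1"})
print(recovered)  # {'x': '5', 'y': '10'}
```

Because the function depends only on its inputs and applies records in order, running it twice on the same checkpoint and log yields the same state, which is the determinism the bullets above describe.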