Write-Ahead Log (WAL)

How databases and streaming systems ensure durability and crash recovery using write-ahead logging.

Read this as
What must be durable before a change is acknowledged?
Failure Trap
Acknowledging state that exists only in memory and cannot be replayed after a crash.
Decision Rule
Append and flush the log before mutating durable state; recover by replaying from the last checkpoint.
[Figure: Write-Ahead Log durability and crash recovery, in five panels: (1) Memory is fast but volatile; (2) Log to disk BEFORE memory; (3) Then update memory; (4) Crash, but the log survives; (5) Replay the log to recover.]

The Durability Challenge

Databases keep data in memory for fast access. But memory is volatile — a power failure, crash, or restart erases everything.

How do we guarantee that committed data survives failures?

  • Memory is fast but volatile
  • Disk is slow but durable
  • We need both speed AND safety

Write-Ahead: Log Before Memory

The solution: before updating memory, write the operation to a Write-Ahead Log on disk. This is an append-only file that records every change.

The key word is "ahead" — the log write happens BEFORE the memory write.

  • Append-only log on durable storage
  • fsync forces the entry out of OS buffers and onto disk
  • Sequential writes are fast
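
The append-then-fsync step can be sketched as follows. This is a minimal illustration, not any particular database's implementation: the function name `append_entry`, the JSON-lines record format, and the `seq`/`op`/`key`/`value` field names are all assumptions made for the example.

```python
import json
import os


def append_entry(log_path, entry):
    """Append one record to the log and force it to durable storage.

    The record layout (one JSON object per line) is an assumption
    chosen for readability, not a standard WAL format.
    """
    line = json.dumps(entry) + "\n"
    # Mode "a" always writes at the end of the file: append-only.
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(line)
        log.flush()             # move data from Python's buffer to the OS
        os.fsync(log.fileno())  # ask the OS to write its cache to the device
```

A call such as `append_entry("wal.log", {"seq": 1, "op": "SET", "key": "x", "value": 5})` returns only after `os.fsync` has returned, which is the point at which the entry can reasonably be treated as durable. Without the `flush` and `fsync` calls the write may still be sitting in a buffer when the process or machine dies.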

Then Update Memory

Only after the log entry is durably on disk do we update the in-memory data structures. If we crash after logging but before memory update, we can replay the log.

The log is the source of truth.

  • Memory update comes second
  • If crash after log: replay recovers
  • If crash before log: operation never happened
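
The ordering described above (log first, memory second) might look like this in a toy key-value store. The class `WalStore`, its method names, and the JSON-lines record format are hypothetical, introduced only to make the ordering concrete.

```python
import json
import os


class WalStore:
    """Toy key-value store: each change is logged durably, then applied."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}       # volatile in-memory state
        self.next_seq = 1    # sequence number for the next log entry

    def _log(self, record):
        # Append the record and fsync before returning to the caller.
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps(record) + "\n")
            log.flush()
            os.fsync(log.fileno())

    def set(self, key, value):
        # Step 1: make the change durable in the log.
        self._log({"seq": self.next_seq, "op": "SET", "key": key, "value": value})
        # A crash at this point is safe: the entry is on disk and replay applies it.
        # Step 2: only now touch the in-memory state.
        self.data[key] = value
        self.next_seq += 1

    def delete(self, key):
        # Deletes are logged too, so replay does not resurrect the key.
        self._log({"seq": self.next_seq, "op": "DEL", "key": key})
        self.data.pop(key, None)
        self.next_seq += 1
```

If the process dies inside `_log` before the entry is fully written, the caller never receives a success response, so from the client's point of view the operation never happened. This sketch does not handle a partially written final line; a real log would typically add a checksum or length prefix per record to detect that case.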

Crash! But Data is Safe

A crash wipes all memory. But the WAL on disk survives. Every committed operation is recorded there, waiting to be replayed.

This is why "committed" means "logged" — not "in memory."

  • Memory loss is expected
  • Log survives crashes
  • Durability = in the log

Replay WAL to Recover

On restart, the system reads the WAL from the last checkpoint and replays each entry in order. Replaying [1] SET x=5 and [2] SET y=10 restores those keys; replaying [3] DEL z then removes z — so a deleted key is gone, not resurrected.

Periodic checkpoints reduce replay time by flushing memory to data files, so recovery only has to replay the entries written after the latest checkpoint.

  • Replay is deterministic
  • Deletes replay too: z stays gone
  • Same log = same state
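
Replay from a checkpoint can be sketched as a pure function over the log. The function name `replay`, the `snapshot` and `checkpoint_seq` parameters, and the JSON-lines record format are assumptions for illustration; here the snapshot stands in for the data files written at the last checkpoint, taken while z was still present.

```python
import json


def replay(log_lines, snapshot=None, checkpoint_seq=0):
    """Rebuild state by applying entries newer than the checkpoint, in order."""
    state = dict(snapshot or {})
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        if entry["seq"] <= checkpoint_seq:
            continue  # already reflected in the snapshot
        if entry["op"] == "SET":
            state[entry["key"]] = entry["value"]
        elif entry["op"] == "DEL":
            state.pop(entry["key"], None)  # the delete is honored on replay
    return state


log = [
    '{"seq": 1, "op": "SET", "key": "x", "value": 5}',
    '{"seq": 2, "op": "SET", "key": "y", "value": 10}',
    '{"seq": 3, "op": "DEL", "key": "z"}',
]
recovered = replay(log, snapshot={"z": 3})
print(recovered)  # {'x': 5, 'y': 10}
```

Because `replay` depends only on its inputs, running it twice on the same snapshot and log yields the same state. Passing a later snapshot with a higher `checkpoint_seq` skips the entries that snapshot already covers, which is how checkpoints shorten recovery.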