Write-Ahead Log (WAL)

How databases and streaming systems ensure durability and crash recovery using write-ahead logging.

Based on: Write-Ahead Log, Checkpointing, Log-Based Storage

The Durability Challenge

Databases keep data in memory for fast access. But memory is volatile — a power failure, crash, or restart erases everything.

How do we guarantee that committed data survives failures?

Memory is fast but volatile
Disk is slow but durable
We need both speed AND safety

Write-Ahead: Log Before Memory

The solution: before updating memory, write the operation to a Write-Ahead Log on disk. This is an append-only file that records every change.

The key word is "ahead" — the log write happens BEFORE the memory write.

Append-only log on durable storage
fsync flushes buffered writes to the storage device
Sequential writes are fast
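
The append-and-fsync step can be sketched in a few lines of Python. This is a minimal illustration, not how any particular database does it: the function name `append_entry` and the one-JSON-object-per-line record format are assumptions made for the example.

```python
import json
import os


def append_entry(log_path: str, entry: dict) -> None:
    """Append one record to the log and force it to stable storage.

    Hypothetical helper: the name and the JSON-lines format are
    illustrative choices, not taken from a real system.
    """
    line = json.dumps(entry) + "\n"
    # Append mode: every write lands at the end of the file.
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(line)
        log.flush()             # move data from Python's buffer to the OS
        os.fsync(log.fileno())  # ask the OS to flush its cache to the device
```

Calling `append_entry("wal.log", {"op": "put", "key": "a", "value": "1"})` returns only after the fsync call completes. A production log would typically also sync the containing directory when the file is first created and add checksums to detect partially written records; both are left out here for brevity.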

Then Update Memory

Only after the log entry is durably on disk do we update the in-memory data structures. If we crash after logging but before the memory update, we can replay the log to redo the operation.

The log is the source of truth.

Memory update comes second
If crash after log: replay recovers
If crash before log: operation never happened
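
The ordering can be made concrete with a toy key-value store. The sketch below assumes a hypothetical `TinyStore` class and a JSON-lines log format; the only point it tries to show is that the log write completes before the dictionary is touched.

```python
import json
import os


class TinyStore:
    """Hypothetical in-memory key-value store backed by a write-ahead log."""

    def __init__(self, log_path: str) -> None:
        self.log_path = log_path
        self.data = {}  # volatile state: lost on crash

    def put(self, key: str, value: str) -> None:
        # Step 1: make the operation durable in the log.
        record = {"op": "put", "key": key, "value": value}
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps(record) + "\n")
            log.flush()
            os.fsync(log.fileno())
        # Step 2: only now update memory. A crash between the two steps
        # leaves the record in the log, where replay can find it.
        self.data[key] = value
```

If the process dies before step 1 finishes, neither the log nor memory reflects the operation, so the caller never receives a successful return. If it dies between steps 1 and 2, the record is already on disk and can be reapplied on restart.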

Crash! But Data is Safe

A crash wipes all memory. But the WAL on disk survives. Every committed operation is recorded there, waiting to be replayed.

This is why "committed" means "logged" — not "in memory."

Memory loss is expected
Log survives crashes
Durability = in the log

Replay WAL to Recover

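On restart, the system loads the last checkpoint, then reads the WAL from that point and replays each entry in order. The in-memory state is reconstructed as it stood after the last logged operation.

Periodic checkpoints flush memory to data files, so recovery only has to replay the entries written after the most recent checkpoint.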

Replay is deterministic
Checkpoints reduce recovery time
Same log = same state
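
Checkpointing and replay can be sketched as two small functions. Everything here is an assumption made for illustration: the names `checkpoint` and `recover`, the JSON-lines log with `put` and `delete` operations, and the choice to record a byte offset into the log as the marker for how much the checkpoint already covers.

```python
import json
import os


def checkpoint(data: dict, log_path: str, ckpt_path: str) -> None:
    """Flush in-memory state to a data file and note how much log it covers."""
    covered = os.path.getsize(log_path) if os.path.exists(log_path) else 0
    tmp_path = ckpt_path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as f:
        json.dump({"log_offset": covered, "data": data}, f)
        f.flush()
        os.fsync(f.fileno())
    # Rename into place so a reader sees the old or the new checkpoint,
    # never a half-written one.
    os.replace(tmp_path, ckpt_path)


def recover(log_path: str, ckpt_path: str) -> dict:
    """Rebuild state: load the last checkpoint, then replay newer entries."""
    data, offset = {}, 0
    if os.path.exists(ckpt_path):
        with open(ckpt_path, encoding="utf-8") as f:
            snapshot = json.load(f)
        data, offset = snapshot["data"], snapshot["log_offset"]
    if os.path.exists(log_path):
        with open(log_path, "rb") as log:
            log.seek(offset)  # skip entries the checkpoint already contains
            for raw in log:   # replay the rest in their original order
                entry = json.loads(raw)
                if entry["op"] == "put":
                    data[entry["key"]] = entry["value"]
                elif entry["op"] == "delete":
                    data.pop(entry["key"], None)
    return data
```

Because replay applies the same entries in the same order, calling `recover` twice on unchanged files yields equal dictionaries. The sketch assumes no writes arrive while `checkpoint` runs; a real system has to coordinate checkpoints with concurrent writers and usually truncates or recycles the log once a checkpoint covers it.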

What's Next?

WAL is the foundation of durability in databases like PostgreSQL and MySQL, and the same append-only log idea underpins streaming systems like Kafka. Understanding WAL helps you grasp related concepts like checkpointing, log compaction, and point-in-time recovery.