Immutability

Design principle where data structures cannot be modified after creation, simplifying distributed systems by eliminating update conflicts and race conditions

TL;DR

Immutability means data cannot be changed after creation. In distributed systems, immutable data structures eliminate entire classes of concurrency bugs, enable caching without invalidation, simplify replication, and power systems like Kafka, Git, and event sourcing architectures.

Visual Overview

Mutable vs Immutable Data

Core Explanation

What is Immutability?

Immutability is a design principle where data structures cannot be modified after creation. Instead of updating existing data, you create new versions.

Programming Example:

// MUTABLE (traditional)
let user = { name: "Alice", age: 30 };
user.age = 31; // Original data modified ✕

// IMMUTABLE (functional)
const originalUser = { name: "Alice", age: 30 };
const updatedUser = { ...originalUser, age: 31 }; // New object created ✓
// 'originalUser' is unchanged (by convention: const only prevents reassignment)
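
One caveat about the example above: in JavaScript, `const` only stops the variable from being rebound, so the object it points to can still be changed. The sketch below shows one way to enforce immutability at runtime with `Object.freeze`. The helper names `deepFreeze` and `tryRename` are hypothetical, and the recursive freeze does not handle cyclic object graphs.

```javascript
// Recursively freeze an object graph. Minimal sketch: does not handle cycles.
function deepFreeze(value) {
  if (value === null || typeof value !== "object") return value;
  Object.values(value).forEach(deepFreeze); // freeze nested objects first
  return Object.freeze(value);
}

// Attempt a write in strict mode, where writing to a frozen object throws.
function tryRename(obj, newName) {
  "use strict";
  try {
    obj.name = newName;
    return true;
  } catch (err) {
    return false; // TypeError: the property is read-only
  }
}

const user = deepFreeze({ name: "Alice", address: { city: "Paris" } });
const renamed = { ...user, name: "Bob" }; // new object; original untouched

console.log(tryRename(user, "Eve"));        // false
console.log(user.name);                     // "Alice"
console.log(renamed.name);                  // "Bob"
console.log(Object.isFrozen(user.address)); // true
```

Outside strict mode the failed write is silently ignored rather than thrown, which is why the check lives inside a strict-mode function.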

In distributed systems, immutability typically means:

  1. Append-only writes: New records added, existing records never modified
  2. Versioned data: Each change creates a new version
  3. Event logs: Store changes as immutable events

Why Immutability Matters in Distributed Systems

1. Eliminates Concurrency Bugs

Concurrency: Mutable vs Immutable
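
As a rough illustration of the bug class involved, the sketch below replays a lost update by hand: two clients read the same mutable record, then each writes back a value based on its stale read. The append-only variant records each change as a frozen event instead. The function names and the interleaving are illustrative assumptions, not taken from any particular system.

```javascript
// Mutable record: two clients interleave read-modify-write steps.
function lostUpdateDemo() {
  const account = { balance: 100 };
  const seenByA = account.balance; // client A reads 100
  const seenByB = account.balance; // client B reads 100
  account.balance = seenByA + 10;  // A writes 110
  account.balance = seenByB + 5;   // B writes 105, silently discarding A's deposit
  return account.balance;
}

// Append-only log: each client appends its own frozen event.
function appendOnlyDemo() {
  const log = [Object.freeze({ type: "OPENED", amount: 100 })];
  log.push(Object.freeze({ type: "DEPOSIT", amount: 10 })); // client A
  log.push(Object.freeze({ type: "DEPOSIT", amount: 5 }));  // client B
  // Current state is derived by folding over every event.
  return log.reduce((balance, event) => balance + event.amount, 0);
}

console.log(lostUpdateDemo()); // 105 (A's deposit is lost)
console.log(appendOnlyDemo()); // 115 (both deposits survive)
```

In a real distributed log the appends still need an agreed order, but no writer can overwrite another writer's record.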

2. Enables Aggressive Caching

Caching: Mutable vs Immutable

3. Simplifies Replication

Replication: Mutable vs Immutable
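
A minimal sketch of why replication gets simpler, assuming an in-memory log and a hypothetical `syncFollower` helper: because existing entries never change, a follower only needs to know how many entries it already holds, and then copies the rest.

```javascript
// Copy only the entries the follower has not seen yet.
// Existing entries never change, so the follower's length is its sync position.
function syncFollower(leaderLog, followerLog) {
  const missing = leaderLog.slice(followerLog.length);
  followerLog.push(...missing);
  return missing.length; // number of entries copied
}

const leader = ["e0", "e1", "e2"].map((id) => Object.freeze({ id }));
const follower = [];

console.log(syncFollower(leader, follower)); // 3
console.log(syncFollower(leader, follower)); // 0 (repeating the sync is harmless)

leader.push(Object.freeze({ id: "e3" }));
console.log(syncFollower(leader, follower)); // 1
console.log(follower.map((e) => e.id).join(",")); // "e0,e1,e2,e3"
```

With mutable records, the follower would also have to detect and merge in-place changes to entries it had already copied.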

4. Time-Travel Debugging

Time-Travel Debugging

Append-Only Logs

The most common form of immutability in distributed systems:

Kafka Topic (Append-Only Log)
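
To make the idea concrete, here is a small in-memory sketch of an append-only log with offset-tracking consumers. It is not the Kafka client API; the `AppendOnlyLog` and `Consumer` classes are hypothetical and only model the concept of offsets and replay.

```javascript
class AppendOnlyLog {
  constructor() {
    this.records = [];
  }

  // Append a record and return its offset (its position in the log).
  append(value) {
    this.records.push(Object.freeze({ offset: this.records.length, value }));
    return this.records.length - 1;
  }

  // Read up to `max` records starting at `offset`; reading never removes anything.
  read(offset, max = 10) {
    return this.records.slice(offset, offset + max);
  }
}

class Consumer {
  constructor(log) {
    this.log = log;
    this.offset = 0; // each consumer tracks its own position
  }

  poll(max = 10) {
    const batch = this.log.read(this.offset, max);
    this.offset += batch.length;
    return batch.map((record) => record.value);
  }

  // Rewind (or skip ahead) to replay part of the log.
  seek(offset) {
    this.offset = offset;
  }
}

const log = new AppendOnlyLog();
["signup", "login", "purchase"].forEach((value) => log.append(value));

const analytics = new Consumer(log);
const billing = new Consumer(log);

console.log(analytics.poll());  // ["signup", "login", "purchase"]
console.log(billing.poll(1));   // ["signup"] (independent of analytics)

analytics.seek(0);              // replay from the beginning
console.log(analytics.poll(2)); // ["signup", "login"]
```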

Versioned Data

Alternative approach: Keep multiple versions of data

Database with Versioning
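
The versioning approach can be sketched as a key-value store in which every write adds a new version instead of overwriting the old one. The `VersionedStore` class below is a hypothetical in-memory illustration, not the API of any specific database.

```javascript
class VersionedStore {
  constructor() {
    this.versions = new Map(); // key -> array of frozen { version, value } entries
  }

  // Writing never overwrites: it adds a new version and returns its number.
  put(key, value) {
    const history = this.versions.get(key) ?? [];
    const entry = Object.freeze({ version: history.length + 1, value });
    this.versions.set(key, [...history, entry]);
    return entry.version;
  }

  // Read the latest version, or a specific older one.
  get(key, version) {
    const history = this.versions.get(key) ?? [];
    const entry = version === undefined
      ? history[history.length - 1]
      : history[version - 1];
    return entry ? entry.value : undefined;
  }
}

const store = new VersionedStore();
store.put("user:1", { name: "Alice", age: 30 }); // version 1
store.put("user:1", { name: "Alice", age: 31 }); // version 2

console.log(store.get("user:1").age);    // 31 (latest)
console.log(store.get("user:1", 1).age); // 30 (older version still readable)
```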

Real Systems Using Immutability

System | Immutability Model | Use Case | Benefits
--- | --- | --- | ---
Kafka | Append-only log | Message streaming | Replay, fault tolerance, high throughput
Git | Immutable commits | Version control | Complete history, branching, rollback
Blockchain | Immutable ledger | Cryptocurrency | Tamper-proof, audit trail
Event Sourcing | Event log | CQRS systems | Audit trail, time-travel, replay
S3 | Write-once objects | Object storage | Cache forever, versioning
Datomic | Immutable facts | Database | Query past states, time-travel

Case Study: Kafka Log Immutability

Kafka Design Decisions

Case Study: Git Commits

Git Commit Immutability
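
The core idea behind immutable commits is content addressing: a commit's ID is a hash of its content and its parent's ID, so changing anything produces a different commit rather than altering the existing one. The sketch below is a simplification. Real Git hashes a different byte format, and `makeCommit` is a hypothetical helper.

```javascript
const { createHash } = require("crypto");

// Simplified commit: the id is a hash of the content plus the parent id.
function makeCommit(parentId, message, snapshot) {
  const id = createHash("sha1")
    .update(JSON.stringify({ parentId, message, snapshot }))
    .digest("hex");
  return Object.freeze({ id, parentId, message, snapshot });
}

const root = makeCommit(null, "Initial commit", "v1 contents");
const child = makeCommit(root.id, "Fix typo", "v2 contents");

// "Amending" produces a new commit with a new id; the old one still exists.
const amended = makeCommit(root.id, "Fix typos", "v2 contents");

console.log(child.parentId === root.id); // true
console.log(amended.id !== child.id);    // true
// Identical content always hashes to the same id.
console.log(makeCommit(null, "Initial commit", "v1 contents").id === root.id); // true
```

Because each ID covers the parent ID, rewriting an old commit changes the ID of every commit that descends from it.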

When to Use Immutability

✓ Perfect Use Cases

Event Sourcing Architectures

Event Sourcing Use Case

Message Streaming

Message Streaming Use Case

Caching & CDN

Caching & CDN Use Case

Version Control

Version Control Use Case

✕ When NOT to Use (or Use Carefully)

Storage-Constrained Systems

Storage-Constrained Systems

GDPR Right to Delete

GDPR Right to Delete
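
One commonly cited way to reconcile an append-only log with deletion requests is crypto-shredding: personal data is encrypted with a per-user key held in a separate, mutable key store, and "deleting" the user means erasing that key. The sketch below is an assumption-laden illustration of the technique, with hypothetical function names. Whether it meets a given legal requirement is a separate question.

```javascript
const crypto = require("crypto");

const userKeys = new Map(); // mutable key store: userId -> 32-byte key
const eventLog = [];        // append-only log holding only ciphertext

function appendPersonalEvent(userId, payload) {
  if (!userKeys.has(userId)) userKeys.set(userId, crypto.randomBytes(32));
  const iv = crypto.randomBytes(12); // fresh nonce per event
  const cipher = crypto.createCipheriv("aes-256-gcm", userKeys.get(userId), iv);
  const data = Buffer.concat([
    cipher.update(JSON.stringify(payload), "utf8"),
    cipher.final(),
  ]);
  eventLog.push(Object.freeze({ userId, iv, data, tag: cipher.getAuthTag() }));
}

function readPersonalEvent(event) {
  const key = userKeys.get(event.userId);
  if (!key) return null; // key erased: payload is no longer recoverable
  const decipher = crypto.createDecipheriv("aes-256-gcm", key, event.iv);
  decipher.setAuthTag(event.tag);
  const plain = Buffer.concat([decipher.update(event.data), decipher.final()]);
  return JSON.parse(plain.toString("utf8"));
}

// "Delete" a user by erasing the key; the log itself is never rewritten.
function forgetUser(userId) {
  userKeys.delete(userId);
}

appendPersonalEvent("u1", { email: "alice@example.com" });
console.log(readPersonalEvent(eventLog[0]).email); // "alice@example.com"

forgetUser("u1");
console.log(readPersonalEvent(eventLog[0])); // null
console.log(eventLog.length);                // 1 (log untouched)
```

Note that the `userId` itself stays in the log in plaintext here, so a real design would also need to decide how identifiers are stored.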

Real-Time Updates with Small Changes

Real-Time Updates with Small Changes

Interview Application

Common Interview Question

Q: “Why does Kafka use immutable logs instead of a traditional database?”

Strong Answer:

“Kafka uses immutable append-only logs for several key reasons:

1. Performance:

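  • Sequential disk writes are far faster than random writes; the Kafka documentation cites roughly 600 MB/s for linear writes vs about 100 KB/s for random writes on a spinning-disk array
  • Append-only allows optimizing for sequential I/O
  • Result: Kafka can sustain millions of messages/second on suitable hardware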

2. Replayability:

  • Immutable messages can be read multiple times
  • Consumers can reset offset and replay historical data
  • Use cases: Recovery from consumer failures, backfilling data for new analytics

3. Simplifies Replication:

  • Replicas just copy log segments
  • No complex merge logic (events never change)
  • Idempotent replication (copying same event twice is safe)

4. Multiple Consumers:

  • Same log can be consumed by multiple independent consumers
  • Each consumer tracks own offset
  • Example: Real-time analytics + batch processing on same stream

5. Durability:

  • Consuming a message does not delete it; it stays in the log until retention or compaction removes it
  • Replicas hold identical copies of the log (same records, same order)
  • Contrast with traditional message queues that delete on consumption

Trade-offs:

  • Storage cost: Must retain logs (mitigated by log compaction + retention)
  • Cannot update: If message has error, must append correction event
  • But benefits far outweigh costs for streaming use cases”
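
The storage trade-off mentioned in the answer can be made concrete. Kafka's log compaction retains at least the latest record for each key; the sketch below models that idea on a plain array. The `compact` function is a hypothetical illustration, not Kafka's implementation.

```javascript
// Keep only the most recent record for each key, preserving log order.
// Returns a new array; the input log is left untouched.
function compact(records) {
  const lastOffsetByKey = new Map();
  records.forEach((r) => lastOffsetByKey.set(r.key, r.offset));
  return records.filter((r) => lastOffsetByKey.get(r.key) === r.offset);
}

const records = [
  { offset: 0, key: "user:1", value: "alice@old.example" },
  { offset: 1, key: "user:2", value: "bob@example.com" },
  { offset: 2, key: "user:1", value: "alice@new.example" },
];

console.log(compact(records).map((r) => r.offset)); // [1, 2]
console.log(records.length);                        // 3 (original log unchanged)
```

Compaction trades full history for bounded storage, so it suits changelog-style topics better than audit logs.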

Code Example

Immutable Event Sourcing Pattern

// MUTABLE APPROACH (traditional)
class BankAccount {
  constructor() {
    this.balance = 0; // Mutable state
  }

  deposit(amount) {
    this.balance += amount; // In-place update ✕
    // History lost!
  }

  withdraw(amount) {
    this.balance -= amount; // In-place update ✕
  }
}

// Problem: No audit trail, race conditions on concurrent updates

// IMMUTABLE APPROACH (event sourcing)
const { randomUUID } = require("crypto"); // Node.js built-in, used for event IDs

class BankAccountEventSourced {
  constructor() {
    this.events = []; // Append-only event log
  }

  // Commands: Append events (never modify existing)
  deposit(amount) {
    return this.append("DEPOSIT", amount);
  }

  withdraw(amount) {
    return this.append("WITHDRAW", amount);
  }

  // Shared helper: build a frozen event and append it to the log
  append(type, amount) {
    const event = Object.freeze({
      type,
      amount,
      timestamp: Date.now(),
      id: randomUUID(),
    });
    this.events.push(event); // Append-only ✓
    return event;
  }

  // Query: Compute current state from events
  getBalance() {
    return this.getBalanceAt(Infinity);
  }

  // Time-travel: Get balance at any point in history
  getBalanceAt(timestamp) {
    return this.events
      .filter(e => e.timestamp <= timestamp)
      .reduce((balance, event) => {
        if (event.type === "DEPOSIT") return balance + event.amount;
        if (event.type === "WITHDRAW") return balance - event.amount;
        return balance;
      }, 0);
  }

  // Audit: Get complete transaction history
  getAuditLog() {
    return this.events.map(e => ({
      type: e.type,
      amount: e.amount,
      timestamp: new Date(e.timestamp).toISOString(),
    }));
  }
}

// Usage
const account = new BankAccountEventSourced();
account.deposit(100);
account.withdraw(20);
account.deposit(50);

console.log(account.getBalance()); // 130
console.log(account.getBalanceAt(Date.now() - 1000)); // 0: all three events are less than 1 second old
console.log(account.getAuditLog()); // Complete history

Immutable Cache Keys (Versioned Assets)

<!-- MUTABLE (cache invalidation problem) -->
<script src="/bundle.js"></script>
<!-- Updated bundle.js → Must invalidate CDN cache (complex!) -->

<!-- IMMUTABLE (cache forever) -->
<script src="/bundle.abc12345.js"></script>
<!-- Updated bundle → New hash → New URL → Old cache unaffected ✓ -->

// Implementation (Node.js)
const crypto = require('crypto');
const fs = require('fs');
const path = require('path');

function generateImmutableAssetURL(filePath) {
  const content = fs.readFileSync(filePath);
  const hash = crypto.createHash('sha256')
    .update(content)
    .digest('hex')
    .substring(0, 8);

  // path.parse splits "dist/bundle.js" into dir "dist", name "bundle", ext ".js"
  const { dir, name, ext } = path.parse(filePath);

  // Immutable URL: content hash in filename
  const immutableURL = path.join(dir, `${name}.${hash}${ext}`);

  // HTTP headers for immutable cache
  // Cache-Control: public, max-age=31536000, immutable
  // Result: Browser skips revalidation while the entry is fresh (one year here)

  return immutableURL;
}

// Example (requires bundle.js to exist on disk)
generateImmutableAssetURL('bundle.js');  // e.g. bundle.abc12345.js
// Change one byte → Different hash → Different URL → New cache entry
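
On the serving side, the hashed filename is what makes a long cache lifetime safe. The sketch below picks a `Cache-Control` value from the URL path. The `cacheControlFor` function and the 8-character hex pattern are assumptions chosen to match the naming scheme above; they are not part of any framework.

```javascript
// Matches names like "bundle.abc12345.js": an 8-character hex hash before the extension.
const HASHED_ASSET = /\.[0-9a-f]{8}\.[a-z0-9]+$/i;

function cacheControlFor(urlPath) {
  if (HASHED_ASSET.test(urlPath)) {
    // Content can never change under this URL, so cache it for a year.
    return "public, max-age=31536000, immutable";
  }
  // Mutable entry points (e.g. index.html) must be revalidated before reuse.
  return "no-cache";
}

console.log(cacheControlFor("/assets/bundle.abc12345.js")); // "public, max-age=31536000, immutable"
console.log(cacheControlFor("/index.html"));                // "no-cache"
console.log(cacheControlFor("/bundle.js"));                 // "no-cache"
```

The HTML page that references the hashed assets stays mutable, which is why it gets the revalidating policy.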

Prerequisites: None - foundational concept

Used In Systems:

  • Kafka: Message streaming with immutable logs
  • Git: Version control with immutable commits
  • Blockchain: Immutable distributed ledger

Explained In Detail:

  • Kafka Deep Dive - Immutable log architecture in depth

Quick Self-Check

  • Can explain immutability in 60 seconds?
  • Understand difference between mutable and immutable data?
  • Know 3 benefits of immutability in distributed systems?
  • Can explain how Kafka uses immutability for performance?
  • Understand trade-offs (storage cost, GDPR)?
  • Can implement simple event sourcing pattern?

Interview Notes

Interview Relevance: 55% of system design interviews
Production Impact: Powers systems such as Kafka, Git, and blockchain
Performance: No race conditions
Scalability: Simplified replication