Skip to content

Logical Clocks

Mechanisms for ordering events in distributed systems based on causality rather than physical time

TL;DR

Logical clocks order events in distributed systems based on causality, not wall-clock time. They answer: “Could event A have caused event B?” without requiring synchronized physical clocks. The two main types are Lamport clocks (single counter, partial ordering) and vector clocks (one counter per process, detects concurrency). For production systems, Hybrid Logical Clocks (HLC) combine the best of both worlds.

Visual Overview

Logical vs Physical Time

Core Explanation

What are Logical Clocks?

Real-World Analogy: Imagine detectives reconstructing a crime timeline. They don’t have synchronized watches, but they know:

  • “The window was broken before the alarm went off” (same location, sequence)
  • “The alarm triggered the police dispatch” (cause and effect)
  • “The neighbor’s dog barked” (concurrent—maybe caused by crime, maybe not)

They’re building a causal graph, not a timeline. Logical clocks do the same for distributed events.

The Happens-Before Relation

Leslie Lamport defined the fundamental “happens-before” relation (→) in 1978:

Happens-Before Rules

The Family of Logical Clocks

Logical Clock Comparison

Vector Clocks Deep Dive

Vector Clock Execution

Comparison Table

AspectLamport ClockVector ClockHLC
Space per processO(1)O(N)O(1)
Message overhead1 integerN integers2 integers
Causality: A→B implies clock(A) < clock(B)
Detects concurrency
Bounded from physical time
Time-range queries

Real Systems Using Logical Clocks

SystemClock TypeUse CaseNotes
CockroachDBHLCTransaction orderingBounded clock skew
Google SpannerTrueTime (+ logical)Global transactionsGPS + atomic clocks
RiakDotted Version VectorsSibling detectionVector clock lineage
TiDBHLC variantMVCC timestampsSimilar to CockroachDB

Note: Some systems historically associated with vector clocks (e.g., Dynamo) have evolved. DynamoDB’s current implementation differs from the 2007 paper. Cassandra uses physical timestamps with last-write-wins, not Lamport clocks. Always verify current behavior in documentation.

Hybrid Logical Clocks (HLC)

Hybrid Logical Clock

When to Use What

✓ Decision Guide

Choosing a Logical Clock

✕ Common Mistakes

Logical Clock Anti-patterns

Interview Application

Common Interview Question

Q: “Compare Lamport clocks and vector clocks. When would you use each?”

Strong Answer:

“Both are logical clocks for ordering events without synchronized physical clocks, but they have different capabilities:

Lamport Clocks:

  • Single integer per process
  • Guarantee: If A→B (A caused B), then L(A) < L(B)
  • Limitation: L(A) < L(B) does NOT mean A→B (might be concurrent)
  • Space: O(1) per process

Vector Clocks:

  • Array of N integers (one per process)
  • Guarantee: If A→B then V(A) < V(B), AND vice versa
  • Capability: Can detect concurrent events (A ∥ B)
  • Space: O(N) per process

When to use Lamport:

  • Total order broadcast (just need consistent ordering)
  • Log correlation across services
  • Large-scale systems where O(N) is prohibitive

When to use Vector:

  • Conflict detection (shopping carts, collaborative editing)
  • Optimistic replication (Dynamo-style)
  • CRDTs (need to know what’s concurrent for merging)

Real-world examples:

  • Amazon DynamoDB uses vector clocks for conflict detection
  • CockroachDB uses Hybrid Logical Clocks (combines physical time + logical counter)”

Follow-up: How do you compare two vector clocks?

“Vector clock comparison determines causality or concurrency:

V1 < V2 (V1 happened before V2): All elements of V1 ≤ corresponding elements of V2, AND at least one is strictly less.

V1 = V2: All elements are equal (same event or same causal history).

V1 ∥ V2 (concurrent): Neither V1 < V2 nor V2 < V1. Some elements are larger in V1, some in V2.

Example:

[2, 3, 1] vs [2, 4, 2]
2≤2, 3≤4, 1≤2, and at least one <  [2,3,1] < [2,4,2] (causal)

[3, 1, 2] vs [2, 3, 1]
3>2 but 1<3  concurrent (neither is "before" the other)

This comparison is what enables conflict detection—if two writes are concurrent, you need application-level merge logic.”

Follow-up: What are Hybrid Logical Clocks and why do databases use them?

“HLC combines physical timestamps with logical counters:

Structure: (physical_time, logical_counter)

Why they exist: Lamport clocks can drift arbitrarily far from physical time. If you’re at Lamport time 1,000,000, you have no idea if that was 1 second or 1 hour ago. This makes time-range queries impossible.

HLC properties:

  1. Preserves causality like Lamport
  2. Physical component stays bounded from wall clock
  3. Logical counter handles ties at same physical time

Use case: ‘Give me all transactions from the last 5 minutes’ — only works if timestamps relate to real time.

Used by: CockroachDB, TiDB, YugabyteDB.

Trade-off: HLC doesn’t detect concurrency like vector clocks. For conflict detection, you still need vector clocks or explicit conflict markers.”

Code Example

Vector Clock Implementation (Python)

from typing import Dict, Optional
from copy import deepcopy

class VectorClock:
    """
    Vector clock for detecting causality and concurrency.

    Each process maintains a vector of counters, one per process.
    """

    def __init__(self, process_id: str, known_processes: list[str] = None):
        self.process_id = process_id
        self.clock: Dict[str, int] = {}

        if known_processes:
            for p in known_processes:
                self.clock[p] = 0
        self.clock[process_id] = 0

    def increment(self) -> 'VectorClock':
        """Increment own counter (local event or send)."""
        self.clock[self.process_id] = self.clock.get(self.process_id, 0) + 1
        return self

    def update(self, other: 'VectorClock') -> 'VectorClock':
        """
        Update clock on receive.

        Takes element-wise max, then increments own counter.
        """
        for process_id, count in other.clock.items():
            self.clock[process_id] = max(
                self.clock.get(process_id, 0),
                count
            )
        self.increment()
        return self

    def copy(self) -> 'VectorClock':
        """Create a copy of this clock."""
        new_clock = VectorClock(self.process_id)
        new_clock.clock = deepcopy(self.clock)
        return new_clock

    def __lt__(self, other: 'VectorClock') -> bool:
        """
        Check if self happened-before other.

        Returns True if all elements ≤ and at least one <.
        """
        all_processes = set(self.clock.keys()) | set(other.clock.keys())

        all_leq = True
        any_lt = False

        for p in all_processes:
            self_val = self.clock.get(p, 0)
            other_val = other.clock.get(p, 0)

            if self_val > other_val:
                all_leq = False
            if self_val < other_val:
                any_lt = True

        return all_leq and any_lt

    def __eq__(self, other: 'VectorClock') -> bool:
        """Check if clocks are equal."""
        all_processes = set(self.clock.keys()) | set(other.clock.keys())
        return all(
            self.clock.get(p, 0) == other.clock.get(p, 0)
            for p in all_processes
        )

    def is_concurrent(self, other: 'VectorClock') -> bool:
        """Check if self and other are concurrent (neither caused the other)."""
        return not (self < other) and not (other < self) and not (self == other)

    def compare(self, other: 'VectorClock') -> str:
        """Compare two clocks and describe relationship."""
        if self == other:
            return "equal"
        elif self < other:
            return "self happened-before other"
        elif other < self:
            return "other happened-before self"
        else:
            return "concurrent"

    def __repr__(self) -> str:
        return f"VC({dict(self.clock)})"


# Usage example
if __name__ == "__main__":
    print("=== Vector Clock Demo ===\n")

    # Three processes
    processes = ["A", "B", "C"]
    a = VectorClock("A", processes)
    b = VectorClock("B", processes)
    c = VectorClock("C", processes)

    # A does local event
    a.increment()
    print(f"A local event: {a}")

    # A sends to B
    msg_clock = a.copy()
    a.increment()
    print(f"A sends to B: {a}")

    # B receives from A
    b.update(msg_clock)
    print(f"B receives: {b}")

    # B does local event
    b.increment()
    print(f"B local event: {b}")

    # Meanwhile, C does local event (concurrent with A and B)
    c.increment()
    print(f"C local event: {c}")

    # Compare clocks
    print(f"\nComparing A and B: {a.compare(b)}")
    print(f"Comparing A and C: {a.compare(c)}")
    print(f"Comparing B and C: {b.compare(c)}")

    print("\n=== Conflict Detection Example ===\n")

    # Simulate two concurrent writes
    write1 = VectorClock("client1", ["client1", "client2"])
    write2 = VectorClock("client2", ["client1", "client2"])

    write1.increment()  # [1, 0]
    write2.increment()  # [0, 1]

    print(f"Write 1: {write1}")
    print(f"Write 2: {write2}")
    print(f"Relationship: {write1.compare(write2)}")

    if write1.is_concurrent(write2):
        print("CONFLICT DETECTED: Need application-level resolution")

Hybrid Logical Clock (Python)

import time
from dataclasses import dataclass

@dataclass
class HLC:
    """
    Hybrid Logical Clock.

    Combines physical time with logical counter.
    """
    pt: int  # Physical time (milliseconds)
    lc: int  # Logical counter

    def __lt__(self, other: 'HLC') -> bool:
        if self.pt != other.pt:
            return self.pt < other.pt
        return self.lc < other.lc

    def __eq__(self, other: 'HLC') -> bool:
        return self.pt == other.pt and self.lc == other.lc

    def __repr__(self) -> str:
        return f"HLC(pt={self.pt}, lc={self.lc})"


class HLCClock:
    """Hybrid Logical Clock implementation."""

    def __init__(self):
        self.hlc = HLC(pt=0, lc=0)

    def _now_ms(self) -> int:
        """Current wall clock time in milliseconds."""
        return int(time.time() * 1000)

    def send_or_local(self) -> HLC:
        """Generate timestamp for send or local event."""
        now = self._now_ms()
        pt = max(self.hlc.pt, now)

        if pt == self.hlc.pt:
            lc = self.hlc.lc + 1
        else:
            lc = 0

        self.hlc = HLC(pt=pt, lc=lc)
        return HLC(pt=pt, lc=lc)

    def receive(self, msg_hlc: HLC) -> HLC:
        """Update clock on receive."""
        now = self._now_ms()
        pt = max(self.hlc.pt, msg_hlc.pt, now)

        if pt == self.hlc.pt == msg_hlc.pt:
            lc = max(self.hlc.lc, msg_hlc.lc) + 1
        elif pt == self.hlc.pt:
            lc = self.hlc.lc + 1
        elif pt == msg_hlc.pt:
            lc = msg_hlc.lc + 1
        else:
            lc = 0

        self.hlc = HLC(pt=pt, lc=lc)
        return HLC(pt=pt, lc=lc)


# Usage
if __name__ == "__main__":
    print("\n=== HLC Demo ===\n")

    node1 = HLCClock()
    node2 = HLCClock()

    # Node 1 sends
    ts1 = node1.send_or_local()
    print(f"Node 1 sends: {ts1}")

    # Node 2 receives
    ts2 = node2.receive(ts1)
    print(f"Node 2 receives: {ts2}")

    # Both are close to physical time!
    print(f"\nActual time (ms): {int(time.time() * 1000)}")

See It In Action:

Related Concepts:

Quick Self-Check

  • Can explain the happens-before relation in 60 seconds?
  • Know the difference between Lamport and vector clocks?
  • Understand when vector clocks detect concurrency?
  • Can compare two vector clocks to determine causality?
  • Know what HLC is and why databases use it?
  • Understand the trade-off: space (O(N) for vector) vs capability (concurrency detection)?
Interview Notes
💼40% of distributed time discussions
Interview Relevance
40% of distributed time discussions
🏭Event ordering, conflict resolution
Production Impact
Powers systems at Event ordering, conflict resolution
Varies by algorithm
Performance
Varies by algorithm query improvement
📈Depends on implementation
Scalability
Depends on implementation