
Topic Partitioning

How distributed systems divide data into partitions for parallel processing, ordering guarantees, and horizontal scalability

TL;DR

Topic partitioning divides a message topic into multiple independent partitions, enabling parallel processing, ordering guarantees per partition, and horizontal scaling. Each partition is an ordered, immutable sequence of messages that can be processed independently.

Core Explanation

What is Topic Partitioning?

A topic is a logical category of messages (e.g., “user-events”, “payment-transactions”). Partitioning splits this topic into multiple ordered logs called partitions, where:

  • Each partition is an independent, ordered sequence
  • Messages in a partition maintain strict ordering
  • Partitions can be processed in parallel
  • Partitions are distributed across multiple servers (brokers)

How Partition Assignment Works

When a producer sends a message, it must decide which partition to send to:

// Method 1: Explicit partition (rare)
producer.send(new ProducerRecord<>("user-events",
    partition, // specify partition 0, 1, 2, etc.
    key,
    value
));

// Method 2: Key-based hashing (most common)
producer.send(new ProducerRecord<>("user-events",
    userId,  // key determines partition
    event
));
// partition = murmur2(serialized key) % numPartitions (Kafka's default partitioner)
// Same userId always goes to the same partition

// Method 3: No key (round-robin; sticky partitioning since Kafka 2.4)
producer.send(new ProducerRecord<>("user-events",
    event  // no key, records are spread across partitions
));

Key-based partitioning is most common because it provides:

  • Ordering per key: All events for user_123 arrive in order
  • Load distribution: Hash function spreads keys evenly
  • Stateful processing: Consumer can maintain per-user state
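As a sketch of why the same key always lands on the same partition, here is a simplified version of the hash-and-modulo step. Kafka's real default partitioner applies murmur2 to the serialized key bytes; String.hashCode() below is for illustration only.

```java
public class KeyPartitioner {
    // Simplified key-based routing: hash the key, mask the sign bit so
    // negative hash codes still map to a valid index, then take the modulo.
    static int partitionFor(String key, int numPartitions) {
        return (key.hashCode() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        // The same key always maps to the same partition...
        System.out.println(partitionFor("user_123", 4));
        System.out.println(partitionFor("user_123", 4));
        // ...while different keys spread across partitions
        System.out.println(partitionFor("user_456", 4));
    }
}
```

Note that the result depends on numPartitions, which is one reason changing the partition count later breaks key-to-partition stability.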

Why Partitioning Enables Scale

Single Partition Limits: a single partition is consumed by at most one consumer per group, so throughput is capped by that one consumer and by a single broker's disk and network.

Multiple Partitions Scale: with N partitions spread across brokers, up to N consumers in a group read in parallel, and write load is shared across brokers.

Scaling Pattern:

  • 1 partition = at most 1 active consumer (per consumer group)
  • 4 partitions = up to 4 parallel consumers
  • 100 partitions = up to 100 parallel consumers
  • Throughput scales roughly linearly with partition count, up to broker and network limits
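The scaling arithmetic above can be turned into a back-of-the-envelope sizing helper. This is a sketch; the method name and the throughput numbers are illustrative assumptions, not measurements.

```java
public class PartitionSizing {
    // Back-of-the-envelope: partitions needed to hit a target rate,
    // given what a single consumer can sustain. Numbers are illustrative.
    static int partitionsNeeded(long targetMsgsPerSec, long perConsumerMsgsPerSec) {
        return (int) Math.ceil((double) targetMsgsPerSec / perConsumerMsgsPerSec);
    }

    public static void main(String[] args) {
        // Target 100K msgs/sec; assume each consumer sustains ~5K msgs/sec
        System.out.println(partitionsNeeded(100_000, 5_000)); // 20 partitions
    }
}
```

In practice you would round up further to leave headroom, since repartitioning later is disruptive.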

Ordering Guarantees

Within Partition (Strong Ordering): messages in a single partition are appended and read in offset order, so a consumer sees them in exactly the order the producer wrote them.

Across Partitions (No Ordering): there is no ordering relationship between messages in different partitions; a message at offset 5 in partition 0 may be processed before or after a message at offset 2 in partition 1.
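The two guarantees above can be illustrated with a toy in-memory model, where each partition is just an ordered list. This is a sketch, not a real client API; the routing uses String.hashCode() for simplicity.

```java
import java.util.ArrayList;
import java.util.List;

public class OrderingDemo {
    // Toy model: route each (key, value) event to a partition list by hashing the key.
    static List<List<String>> route(String[][] events, int numPartitions) {
        List<List<String>> partitions = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) partitions.add(new ArrayList<>());
        for (String[] e : events) {
            int p = (e[0].hashCode() & 0x7fffffff) % numPartitions;
            partitions.get(p).add(e[0] + ":" + e[1]);
        }
        return partitions;
    }

    public static void main(String[] args) {
        String[][] events = {{"user_a", "login"}, {"user_b", "login"},
                             {"user_a", "click"}, {"user_a", "logout"}};
        // All of user_a's events land in one partition, in send order;
        // relative order between user_a's and user_b's events is undefined.
        System.out.println(route(events, 2));
    }
}
```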

Tradeoffs

Advantages:

  • ✓ Horizontal scalability (add more partitions/consumers)
  • ✓ High throughput (parallel processing)
  • ✓ Fault isolation (partition failure doesn’t affect others)
  • ✓ Ordered processing per partition

Disadvantages:

  • ✕ No global ordering across topic
  • ✕ Partition count hard to change later
  • ✕ Poor key distribution creates hot partitions
  • ✕ Increases operational complexity

Real Systems Using This

Kafka (Apache)

  • Implementation: Topic → Partitions → Segments on disk
  • Scale: LinkedIn processes 7+ trillion messages/day
  • Partition Strategy: Key-based hashing for ordering
  • Typical Setup: 10-100 partitions per topic

Amazon Kinesis

  • Implementation: Streams → Shards (similar to partitions)
  • Scale: Handles millions of events/second
  • Partition Strategy: Explicit shard keys
  • Typical Setup: Start with 1 shard, scale to thousands

Apache Pulsar

  • Implementation: Topics → Partitions → Segments
  • Scale: Handles petabytes of data
  • Partition Strategy: Key-based + custom routing
  • Typical Setup: 100-1000+ partitions for high-scale topics

Comparison Table

System    Term       Max Throughput/Partition   Partition Limit   Rebalancing
Kafka     Partition  ~100 MB/sec                Thousands         Consumer group
Kinesis   Shard      1 MB/sec write             Thousands         Manual split/merge
Pulsar    Partition  ~200 MB/sec                Thousands         Automatic

When to Use Topic Partitioning

✓ Perfect Use Cases

  • High-Volume Event Streams: clickstreams, logs, and telemetry where throughput demands many parallel consumers
  • Parallel Data Processing: independent work items that can be fanned out across consumers
  • Ordered Processing by Key: per-user or per-entity event sequences that must stay in order

✕ When NOT to Use

  • Need Total Ordering: a topic with more than one partition cannot guarantee global order
  • Highly Skewed Keys: a few dominant keys create hot partitions while other consumers sit idle
  • Small Message Volume: a single partition (or a simple queue) is operationally simpler

Interview Application

Common Interview Question 1

Q: “Design a chat system that handles 1 billion messages/day. How would you partition messages?”

Strong Answer:

“I’d partition by conversation_id to maintain message ordering within each conversation. This allows parallel processing across conversations while guaranteeing order within each chat. With 10 million active conversations, I’d start with 100 partitions (100K conversations per partition), giving us ~420K messages/sec aggregate throughput. As we scale, we can add more partitions and rebalance.”

Why this is good:

  • Identifies the key (conversation_id)
  • Explains ordering requirement
  • Does capacity math
  • Plans for scaling
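The capacity math behind that answer is easy to check. The peak-traffic multiplier below is an assumption added for illustration, not part of the quoted answer.

```java
public class ChatCapacity {
    // Average message rate from a daily total
    static long avgMsgsPerSec(long msgsPerDay) {
        return msgsPerDay / 86_400; // seconds in a day
    }

    public static void main(String[] args) {
        long avg = avgMsgsPerSec(1_000_000_000L); // ~11.6K msgs/sec average
        // Assumed peak factor: plan partitions for roughly 10x average
        long peak = avg * 10;
        System.out.println(avg + " avg msgs/sec, ~" + peak + " at assumed peak");
    }
}
```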

Common Interview Question 2

Q: “What happens when a partition gets hot (e.g., celebrity user gets 100x traffic)?”

Strong Answer:

“Hot partitions are a real problem. Solutions:

  1. Add random suffix to key: celebrity_123_<random> spreads across partitions
  2. Dedicated partition: Give celebrity their own partition with dedicated consumer
  3. Application-level caching: Reduce duplicate messages
  4. Re-partition: If possible, change partition key strategy

Trade-off: Losing per-key ordering if we split the key. For celebrities, eventual consistency might be acceptable since their feed already lags.”

Why this is good:

  • Shows awareness of real problem
  • Multiple solutions with tradeoffs
  • Production thinking (celebrity use case)
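Solution 1 from the answer (key salting) can be sketched as follows. The helper name and bucket count are illustrative; this is not a real client API.

```java
import java.util.concurrent.ThreadLocalRandom;

public class SaltedKey {
    // Append a random bucket suffix so one hot key spreads over several
    // partitions. Trade-off: ordering across the salted variants is lost.
    static String saltKey(String hotKey, int saltBuckets) {
        int salt = ThreadLocalRandom.current().nextInt(saltBuckets);
        return hotKey + "_" + salt;
    }

    public static void main(String[] args) {
        // Each send may pick a different bucket, e.g. celebrity_123_7
        System.out.println(saltKey("celebrity_123", 10));
    }
}
```

Consumers that need a complete view of the hot key must then read all of its salted variants and merge them.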

Red Flags to Avoid

  • ✕ Claiming you can have total ordering with partitions
  • ✕ Not considering hot partition problem
  • ✕ Choosing partition count without capacity reasoning
  • ✕ Forgetting that partition count is hard to change later

Quick Self-Check

Before moving on, can you:

  • Explain topic partitioning in 60 seconds?
  • Draw a diagram showing partitions and consumers?
  • Explain how key-based partitioning works?
  • Identify when to use vs NOT use partitioning?
  • Understand the ordering guarantees?
  • Calculate partition count for a given throughput?

See It In Action

Prerequisites

None - this is a foundational concept

Used In Systems

  • Real-Time Chat Systems - Message distribution per conversation
  • Analytics Pipelines - Event processing at scale

Explained In Detail

  • Kafka Architecture - Topics, Partitions & Segments section (7 minutes)
  • Deep dive into Kafka’s implementation of partitioning, segment storage, and replication across brokers

Next Recommended: Consumer Groups - Learn how consumers coordinate to process partitions in parallel

Interview Notes

  • ⭐ Must-Know: appears in ~85% of messaging interviews
  • 🏭 Production Impact: powers systems at LinkedIn, Uber, Netflix
  • Performance: millions of msgs/sec
  • 📈 Scalability: 1+ trillion msgs/day