Kafka Topic Partitioning

How Kafka distributes messages across partitions for parallelism and ordering guarantees.

Read this as
What partition key buys ordering without creating skew?
Failure Trap
Adding partitions cannot fix a bad key and can change future routing assumptions.
Decision Rule
Key by the entity that needs ordering, then size partitions for throughput and consumer parallelism.
Figure: Kafka topic partitioning. A topic named "orders" is split into three partitions P0, P1 and P2. A producer hashes each message key to pick a partition; messages with the same key always land in the same partition and stay strictly ordered by offset. A consumer group then assigns one partition to each consumer so the three partitions are read in parallel.
Panels: One topic, three partitions; Each message carries a key; Same key → same partition; Key A stays in P0, ordered by offset; One partition per consumer; 3 partitions read in parallel.

A Topic with Three Partitions

In Kafka, a topic is divided into partitions — ordered, immutable sequences of records. Each partition is an independent log that can be hosted on different brokers.

Think of partitions as parallel lanes on a highway. More lanes = more throughput.

  • Topics are logical groupings of related messages
  • Partitions enable horizontal scaling
  • Each partition maintains strict ordering

Messages Written by Key

When a producer sends a message, it includes an optional partition key. This key determines which partition receives the message.

Common keys include user IDs, order IDs, or session IDs — anything that groups related events together.

  • Keys are optional but recommended for ordering
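  • Messages without keys are spread across partitions (round-robin in older clients; since Kafka 2.4 the default producer sticks to one partition per batch, then moves on)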
  • Key choice affects both ordering and load distribution
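
To see how key choice affects load, the sketch below routes a batch of messages by key and counts how many land on each partition. It is a minimal illustration, not Kafka client code: `partition_for` and `load_per_partition` are hypothetical helpers, `zlib.crc32` stands in for Kafka's real hash function, and the customer IDs and message counts are made up.

```python
import zlib
from collections import Counter

NUM_PARTITIONS = 3  # matches the three-partition "orders" topic

def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition. crc32 is a stand-in hash, not Kafka's."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions

def load_per_partition(keys):
    """Count how many messages each partition would receive."""
    counts = Counter(partition_for(k) for k in keys)
    return [counts.get(p, 0) for p in range(NUM_PARTITIONS)]

# Hypothetical traffic: one very busy customer plus 300 ordinary ones.
hot_key_traffic = ["customer-42"] * 900 + [f"customer-{i}" for i in range(100, 400)]
# Same volume, but keyed by order ID, so every key is unique.
order_id_traffic = [f"order-{i}" for i in range(1200)]

skewed = load_per_partition(hot_key_traffic)
even = load_per_partition(order_id_traffic)
print("keyed by customer:", skewed)
print("keyed by order ID:", even)
```

Keying by customer keeps each customer's events in order but sends all 900 of the busy customer's messages to a single partition. Keying by order ID spreads the load, but ordering then holds only per order.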

Hash Function Determines Partition

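Kafka computes hash(key) % num_partitions to determine the target partition. This is deterministic: as long as the partition count stays the same, the same key always maps to the same partition.

This is why adding partitions later requires careful planning. The hash of a key does not change, but the modulus does, so existing keys may route to a different partition than the one holding their earlier messages. Kafka does not move existing records, so per-key ordering can break across the change.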

  • Default partitioner uses murmur2 hash
  • Custom partitioners possible for special routing
  • Partition count changes affect key distribution
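
The effect of changing the partition count can be sketched in a few lines. This is an illustration under stated assumptions: `partition_for` is a hypothetical helper, `zlib.crc32` is a stand-in for the murmur2 hash the default partitioner uses, and the keys are invented.

```python
import zlib

def partition_for(key: str, num_partitions: int) -> int:
    """hash(key) % num_partitions, with crc32 as a stand-in hash."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions

keys = [f"order-{i}" for i in range(1000)]  # hypothetical keys

# Deterministic: routing the same keys twice gives identical results.
first_pass = [partition_for(k, 3) for k in keys]
second_pass = [partition_for(k, 3) for k in keys]
print("stable with 3 partitions:", first_pass == second_pass)

# Growing the topic from 3 to 4 partitions changes the modulus,
# so many keys now map to a different partition than before.
moved = [k for k in keys if partition_for(k, 3) != partition_for(k, 4)]
print(f"{len(moved)} of {len(keys)} keys change partition")
```

A key keeps its partition only when its hash gives the same remainder under both moduli, so most keys move. New messages for a moved key land in a different partition from that key's older messages.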

Order Preserved Within Partitions

Messages within a single partition are strictly ordered by offset. If message A was written to a partition before message B, consumers of that partition will always see A before B.

However, there's no ordering guarantee across partitions. If you need total ordering, you need a single partition (sacrificing parallelism).

  • Offsets are sequential integers per partition
  • Same-key messages always land in same partition
  • Cross-partition ordering requires application logic
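
A toy in-memory model makes the offset behaviour concrete. This is a sketch, not how brokers store data: the `Topic` class and `partition_for` helper are hypothetical, and `zlib.crc32` stands in for Kafka's hash.

```python
import zlib

NUM_PARTITIONS = 3

def partition_for(key: str) -> int:
    """Stand-in for the producer's partitioner (crc32, not Kafka's hash)."""
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS

class Topic:
    """Toy topic: one append-only list per partition."""

    def __init__(self):
        self.partitions = [[] for _ in range(NUM_PARTITIONS)]

    def send(self, key: str, value: str):
        p = partition_for(key)
        log = self.partitions[p]
        offset = len(log)  # offsets count up from 0 within each partition
        log.append((key, value))
        return p, offset

topic = Topic()
sends = [("A", "A-1"), ("B", "B-1"), ("A", "A-2"), ("C", "C-1"), ("A", "A-3")]
for key, value in sends:
    partition, offset = topic.send(key, value)
    print(f"{value} -> P{partition} @ offset {offset}")

# Key A's messages, read back from its partition in offset order.
a_log = [v for k, v in topic.partitions[partition_for("A")] if k == "A"]
print(a_log)  # ['A-1', 'A-2', 'A-3']
```

Key A's messages come back in send order because they share one log. Nothing in the model relates the position of B-1 to A-2 unless both happen to share a partition.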

Consumers Claim Partitions

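In a consumer group, each partition is assigned to exactly one consumer at a time, so two consumers in the same group never read the same partition concurrently. A message can still be redelivered after a failure or rebalance, so this alone does not guarantee exactly-once processing.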

If you have 3 partitions and 3 consumers, each gets one partition. Add a 4th consumer? It sits idle until a partition becomes available.

  • One partition → one consumer (within a group)
  • More consumers than partitions = wasted consumers
  • Rebalancing redistributes partitions on changes
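
The assignment rule can be sketched as a simple round-robin. The `assign` function below is a hypothetical, simplified stand-in for a group assignor; real assignors also handle multiple topics, and some try to keep assignments stable across rebalances.

```python
def assign(partitions, consumers):
    """Give each partition to exactly one consumer, round-robin."""
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment

partitions = ["P0", "P1", "P2"]

three = assign(partitions, ["C1", "C2", "C3"])
four = assign(partitions, ["C1", "C2", "C3", "C4"])
after_failure = assign(partitions, ["C1", "C3", "C4"])  # C2 left the group

print(three)          # {'C1': ['P0'], 'C2': ['P1'], 'C3': ['P2']}
print(four)           # C4 gets an empty list: it is idle
print(after_failure)  # C4 now owns a partition
```

With four consumers and three partitions, C4 has nothing to read. It only receives work once a rebalance hands it a partition, here after C2 leaves.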

Maximum Parallelism

The maximum parallelism within a consumer group equals the number of partitions. Three partitions means at most three consumers in the group can work simultaneously.

This is why partition count is a critical design decision. Too few partitions limit throughput. Too many add overhead on brokers and clients, and raising the count later changes key routing, which complicates per-key ordering.

  • Parallelism upper bound = partition count
  • Plan partition count based on expected throughput
  • Typical: 3-12 partitions for moderate topics
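
One common rule of thumb for sizing, offered here as an assumption rather than something this page prescribes, is to take the target throughput and divide it by what a single partition can sustain on the producer side and on the consumer side, then use the larger result. The function name and all numbers below are hypothetical; measure your own per-partition rates before relying on them.

```python
import math

def partitions_needed(target_mb_s: float,
                      producer_mb_s_per_partition: float,
                      consumer_mb_s_per_partition: float) -> int:
    """Smallest partition count that meets the target on both sides."""
    for_producers = math.ceil(target_mb_s / producer_mb_s_per_partition)
    for_consumers = math.ceil(target_mb_s / consumer_mb_s_per_partition)
    return max(for_producers, for_consumers)

# Hypothetical measurements: 60 MB/s target, 20 MB/s per partition when
# producing, 10 MB/s per partition when consuming.
needed = partitions_needed(60, 20, 10)
print(needed)  # 6
```

In this example the consumer side is the bottleneck: six partitions allow six consumers in the group, each handling about 10 MB/s.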