TL;DR
Producer batching groups multiple messages together before sending them to the server, amortizing network overhead and maximizing throughput. Instead of sending each message immediately (1 message = 1 network request), batching collects messages for a short time window or until reaching a size threshold, then sends them together in a single request. This technique can improve throughput by 10-100x.
Visual Overview
WITHOUT BATCHING (Naive Approach):

T=0ms:  [Message A] ──▶ Network Request 1
T=5ms:  [Message B] ──▶ Network Request 2
T=8ms:  [Message C] ──▶ Network Request 3
T=12ms: [Message D] ──▶ Network Request 4

Result: 4 network requests, ~50ms total latency
Overhead: 4x network round-trips, 4x TCP overhead

WITH BATCHING (Optimized):

T=0ms:  [Message A] ──┐
T=5ms:  [Message B] ──┤
T=8ms:  [Message C] ──┼──── Batch Accumulation
T=12ms: [Message D] ──┘
T=20ms: [Batch: A,B,C,D] ──▶ Single Network Request

Result: 1 network request, ~30ms total latency
Overhead: 1x network round-trip, better compression on the combined payload

BATCH TRIGGERS:
├── Size Threshold: batch.size reached (e.g., 32 KB)
├── Time Threshold: linger.ms elapsed (e.g., 20 ms)
├── Memory Pressure: Buffer full, send immediately
└── Explicit Flush: Application calls flush()
Core Explanation
What is Producer Batching?
Producer batching is a performance optimization where a message producer accumulates multiple messages in memory before sending them to the server in a single network request.
BATCHING ARCHITECTURE:

Application Thread:
producer.send(message_1) ──┐
producer.send(message_2) ──┤
producer.send(message_3) ──┼──▶ Batch Buffer (per partition)
producer.send(message_4) ──┤            │
producer.send(message_5) ──┘            │
                                        ▼
Background Sender Thread:
┌───────────────────────────────┐
│ Wait for trigger:             │
│  - Size >= 32 KB              │
│  - Time >= linger.ms          │
│  - Buffer full                │
└───────────────┬───────────────┘
                ▼
          [Send Batch] ──▶ Server
Key Batching Parameters:
// Batch size threshold per partition (bytes)
batch.size = 16384       // 16 KB (Kafka's default); commonly raised to 32768+ for throughput

// Time to wait for a batch to fill (milliseconds)
linger.ms = 0            // Send as soon as possible (default)
linger.ms = 20           // Wait up to 20 ms for more messages

// Total memory available for all in-flight batches
buffer.memory = 33554432 // 32 MB (Kafka's default)
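To make these knobs concrete, here is a minimal sketch of a producer built with the batching settings above (the bootstrap address, topic name, and serializers are placeholders; 32 KB / 20 ms / 64 MB are this section's example values, not Kafka's defaults):

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class BatchingProducerExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        // Batching knobs from the snippet above
        props.put(ProducerConfig.BATCH_SIZE_CONFIG, 32768);        // 32 KB per-partition batch
        props.put(ProducerConfig.LINGER_MS_CONFIG, 20);            // wait up to 20 ms for a fuller batch
        props.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 67108864L); // 64 MB total buffer

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            for (int i = 0; i < 1000; i++) {
                // send() only appends the record to an in-memory batch; the background
                // sender thread ships batches when a trigger fires
                producer.send(new ProducerRecord<>("events", "key-" + i, "value-" + i));
            }
        } // close() flushes any remaining batches
    }
}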
Network Overhead Analysis:
SINGLE MESSAGE SEND:
┌──────────────────────────────────────┐
│ TCP/IP Header:           40 bytes    │
│ Kafka Protocol Header:  100 bytes    │
│ Message Overhead:        50 bytes    │
│ Actual Message Payload: 200 bytes    │
│ ──────────────────────────────────── │
│ Total:                  390 bytes    │
│ Efficiency: 200/390 ≈ 51%            │
└──────────────────────────────────────┘

BATCHED SEND (100 messages):
┌──────────────────────────────────────┐
│ TCP/IP Header:          40 bytes (1x)│
│ Kafka Protocol Header: 100 bytes (1x)│
│ Message Overhead:  50 × 100 =  5,000 │
│ Message Payload:  200 × 100 = 20,000 │
│ ──────────────────────────────────── │
│ Total:               25,140 bytes    │
│ Efficiency: 20,000/25,140 ≈ 80%      │
│ Network Savings: 100x fewer requests │
└──────────────────────────────────────┘

Result: 100x fewer network requests, with the fixed per-request header cost amortized across the entire batch.
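The efficiency figures above follow directly from the assumed header sizes; a small helper makes the arithmetic explicit (the 40/100/50-byte figures are this section's illustrative numbers, not protocol constants):

// Illustrative only: reproduces the arithmetic in the boxes above, assuming
// 40 B TCP/IP + 100 B protocol headers per request and 50 B of per-record overhead
static double wireEfficiency(int messagesPerBatch, int payloadBytesPerMessage) {
    double payload = (double) messagesPerBatch * payloadBytesPerMessage;
    double total = 140.0                                        // per-request headers, paid once
            + messagesPerBatch * (50.0 + payloadBytesPerMessage);
    return payload / total;
}

// wireEfficiency(1, 200)   ≈ 0.51  (matches the single-message box)
// wireEfficiency(100, 200) ≈ 0.80  (matches the 100-message batch box)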
Throughput Impact:
Scenario: Send 100,000 messages (200 bytes each)

NO BATCHING (one request in flight at a time):
├── Network RTT: 1 ms per request
├── Total time: 100,000 × 1 ms = 100 seconds
└── Throughput: 1,000 messages/sec

WITH BATCHING (100 msg/batch):
├── Network RTT: 1 ms per batch
├── Total time: 1,000 batches × 1 ms = 1 second
└── Throughput: 100,000 messages/sec

100x improvement!
Batching Triggers and Tradeoffs
Batch Completion Triggers:
TRIGGER 1: SIZE THRESHOLD REACHED
─────────────────────────────────
Current batch: 31 KB
New message: 2 KB
Total: 33 KB > batch.size (32 KB)
Action: Close the current batch for sending; start a new batch for the new message

TRIGGER 2: TIME THRESHOLD REACHED
─────────────────────────────────
Batch started: T=0ms
Current time: T=20ms >= linger.ms (20ms)
Action: Send batch (even if not full)

TRIGGER 3: MEMORY PRESSURE
─────────────────────────────────
Buffer memory: 64 MB
Used: 62 MB (97% full)
Action: Drain the oldest ready batches to free memory

TRIGGER 4: EXPLICIT FLUSH
─────────────────────────────────
Application calls: producer.flush() (see the sketch below)
Action: Send all pending batches immediately
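Trigger 4 is the one the application controls directly; a common pattern is to flush before committing offsets, checkpointing, or shutting down so nothing is left sitting in a partially filled batch. A minimal sketch, with producer construction omitted:

// Explicit flush: block until every record buffered so far has been sent
// and acknowledged (or has failed)
producer.flush();

// close() also flushes pending batches before releasing resources
producer.close(Duration.ofSeconds(10));   // java.time.Duration; bounds the shutdown wait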
The Latency-Throughput Tradeoff:
CONFIGURATION SPECTRUM:

Low Latency (Real-Time Systems):
┌────────────────────────────────────────┐
│ linger.ms = 0        ← Send immediately│
│ batch.size = 16384   (16 KB)           │
│                                        │
│ Latency: ~1-2 ms                       │
│ Throughput: ~10K msg/sec               │
│ Use case: Trading, alerts              │
└────────────────────────────────────────┘

Balanced (Most Applications):
┌────────────────────────────────────────┐
│ linger.ms = 10-20    ← Small wait      │
│ batch.size = 32768   (32 KB)           │
│                                        │
│ Latency: ~15-25 ms                     │
│ Throughput: ~50K msg/sec               │
│ Use case: Event streaming              │
└────────────────────────────────────────┘

High Throughput (Analytics):
┌────────────────────────────────────────┐
│ linger.ms = 50-100   ← Longer wait     │
│ batch.size = 131072  (128 KB)          │
│                                        │
│ Latency: ~60-120 ms                    │
│ Throughput: ~200K msg/sec              │
│ Use case: Log aggregation              │
└────────────────────────────────────────┘

(Throughput and latency figures are illustrative; real numbers depend on message size, compression, network RTT, and broker hardware.)
Production Configuration Examples
Example 1: High-Throughput Log Ingestion
Properties props = new Properties();

// Optimize for throughput
props.put(ProducerConfig.BATCH_SIZE_CONFIG, 131072);        // 128 KB
props.put(ProducerConfig.LINGER_MS_CONFIG, 50);             // Wait up to 50 ms
props.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 268435456);  // 256 MB

// Enable compression for better batching
props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");

// Allow more in-flight requests per connection
props.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, 5);

// Result: ~10x throughput improvement
// Tradeoff: ~60 ms added latency
Example 2: Low-Latency Real-Time Events
Properties props = new Properties();

// Optimize for latency
props.put(ProducerConfig.BATCH_SIZE_CONFIG, 16384);         // 16 KB
props.put(ProducerConfig.LINGER_MS_CONFIG, 0);              // No artificial wait
props.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 33554432);   // 32 MB

// Minimal compression overhead
props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");

// Limit in-flight requests to preserve ordering
props.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, 1);

// Result: <5 ms p99 latency
// Tradeoff: Lower throughput (~20K msg/sec)
Batch Compression and Efficiency
Compression with Batching:
WHY BATCHING IMPROVES COMPRESSION:

Single Message Compression:
Message 1: {"user_id": 123, "event": "click", "timestamp": 1234567890}
Compressed: 58 bytes → 52 bytes (~10% savings)

Batched Messages Compression (100 messages):
Original: 5,800 bytes
Compressed (lz4): 1,200 bytes (~80% savings!)

Why better compression?
├── Repeated keys: "user_id", "event", "timestamp" appear 100x
├── Similar values: Timestamps are sequential
├── Pattern recognition: Works better on larger inputs
└── Compression dictionary: More effective with more context

Combined Batching + Compression:
├── Requests: ~100x fewer (batching)
├── Payload size: ~5x smaller (compression)
└── Net effect: orders of magnitude less network overhead
Production Compression Strategy:
public class CompressionStrategy {

    /** Choose a compression codec based on the workload profile. */
    static void configure(Properties props, String profile) {
        switch (profile) {
            case "high-throughput":
                // LZ4: fast compression, low CPU
                // Best for: high-throughput systems with large batches
                // Typical: ~2:1 ratio at ~300 MB/sec
                props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");
                break;
            case "balanced":
                // Snappy: balanced speed vs. ratio
                // Best for: moderate throughput, balanced CPU usage
                // Typical: ~2.3:1 ratio at ~250 MB/sec
                props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "snappy");
                break;
            case "network-limited":
                // GZIP: best compression ratio, highest CPU cost
                // Best for: network-limited systems, lower volume
                // Typical: ~3.2:1 ratio at ~50 MB/sec
                props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "gzip");
                break;
            default:
                // None: for already-compressed data (images, video)
                props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "none");
        }
    }
}
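Usage would then look something like this (the profile labels are just the ones used in the sketch above):

Properties props = new Properties();
CompressionStrategy.configure(props, "high-throughput");   // sets compression.type = lz4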
Memory Management and Buffer Pool
Buffer Pool Architecture:
PRODUCER MEMORY LAYOUT:

Total Buffer: 64 MB (buffer.memory)
┌───────────────────────────────────────────────┐
│ Partition 0 Batch: 32 KB (ready)     ← Full   │
│ Partition 1 Batch: 28 KB (building)  ← Accum  │
│ Partition 2 Batch: 31 KB (ready)     ← Full   │
│ Partition 3 Batch: 15 KB (building)           │
│ ...                                           │
│ Free Memory: 10 MB                            │
└───────────────────────────────────────────────┘

Memory Exhaustion Behavior:
1. Buffer is full (free memory < size needed for a new batch)
2. send() blocks for up to max.block.ms (default 60s)
3. If memory is still unavailable, a BufferExhaustedException is thrown
4. As batches are sent, memory is freed for new batches

Monitoring:
kafka.producer:type=producer-metrics,name=buffer-available-bytes
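Beyond JMX, the same gauge can be read in-process from the producer's metrics map. A hedged sketch (the group and metric name are taken from the JMX reference above):

import java.util.Map;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.common.Metric;
import org.apache.kafka.common.MetricName;

final class BufferMonitor {
    /** Returns the buffer-available-bytes gauge, or -1 if it is not present. */
    static long bufferAvailableBytes(KafkaProducer<?, ?> producer) {
        for (Map.Entry<MetricName, ? extends Metric> e : producer.metrics().entrySet()) {
            MetricName name = e.getKey();
            if ("producer-metrics".equals(name.group())
                    && "buffer-available-bytes".equals(name.name())) {
                // metricValue() returns Object; this gauge reports a numeric value
                return ((Number) e.getValue().metricValue()).longValue();
            }
        }
        return -1; // metric not found (e.g., producer not yet started)
    }
}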
Tradeoffs
Advantages:
✓ Massively improved throughput (10-100x)
✓ Reduced network overhead (90%+ fewer requests)
✓ Better compression efficiency with larger batches
✓ Lower CPU usage per message (amortized overhead)
✓ Reduced server-side processing load
Disadvantages:
✕ Increased latency (messages wait in batch)
✕ Higher memory usage (buffering messages)
✕ Complexity in tuning (batch.size vs linger.ms)
✕ Risk of data loss if producer crashes before send
✕ Larger failure blast radius (entire batch fails together)
Real Systems Using This
Apache Kafka
Implementation: Per-partition batching with configurable size and time thresholds
Scale: 7+ trillion messages/day at LinkedIn with aggressive batching
Default Config: 16 KB batch.size, 0 ms linger.ms (conservative)
Production Config: 64-128 KB batch.size, 20-50ms linger.ms (optimized)
AWS Kinesis
Implementation: Automatic batching via PutRecords API (up to 500 records)
Limits: 1 MB/sec write throughput per shard; up to 500 records and 5 MB per PutRecords request
SDK Behavior: KPL (Kinesis Producer Library) batches automatically
Google Cloud Pub/Sub
Implementation: Client library batches messages automatically
Config: Max batch size (1000 messages), max batch bytes (10 MB)
Optimization: Batching + request compression for efficiency
RabbitMQ
Implementation: Optional publisher confirms batching
Config: Manual batching via application-level buffering
Performance: 10x improvement with batching enabled
When to Use Producer Batching
✓ Perfect Use Cases
High-Volume Event Streaming
Scenario: Ingesting millions of events per second
Why batching: Maximizes network and disk efficiency
Example: Clickstream analytics, IoT sensor data
Config: Large batches (128 KB), medium linger (20-50 ms)
Log Aggregation
Scenario: Centralized logging from 1000s of services
Why batching: Reduces load on logging infrastructure
Example: ELK stack ingestion, Splunk forwarding
Config: Large batches (128 KB), high linger (50-100 ms)
Bulk Data Migration
Scenario: Moving large datasets between systems
Why batching: Maximum throughput; latency not critical
Example: Database CDC, ETL pipelines
Config: Maximum batches (256 KB), high linger (100 ms)
✕ When NOT to Use (or Use Minimal Batching)
Real-Time Alerting
Problem: Critical alerts delayed by batching
Solution: linger.ms=0, small batches (16 KB)
Example: Security alerts, system monitoring
Trading Systems
Problem: Milliseconds matter; batching adds latency
Solution: No batching (linger.ms=0) or very small windows
Example: High-frequency trading, order execution
Request-Response Patterns
Problem: User waiting for an immediate response
Solution: Minimal batching, synchronous sends
Example: API calls, user-facing operations
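For the request-response case above, a synchronous send usually means blocking on the future returned by send(), as in this sketch (the topic, key/value variables, and timeout are illustrative):

// Synchronous send: wait for the broker's acknowledgment before responding
// to the caller; this trades throughput for an immediate success/failure signal
RecordMetadata metadata = producer
        .send(new ProducerRecord<>("user-actions", userId, payload))
        .get(2, TimeUnit.SECONDS);   // java.util.concurrent.TimeUnit; throws on timeout or failure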
Interview Application
Common Interview Question 1
Q: “How would you optimize a producer that’s sending 100,000 small messages per second, causing high CPU and network usage?”
Strong Answer:
“The issue is likely excessive network overhead from sending each message individually. I’d implement producer batching:
Diagnosis:
Current: 100K messages/sec at ~1 KB each → 100K network requests/sec
Network overhead: ~50% of bandwidth wasted on headers
CPU overhead: 100K serialize/send operations
Solution:
// Enable aggressive batching
props.put(ProducerConfig.BATCH_SIZE_CONFIG, 65536);        // 64 KB
props.put(ProducerConfig.LINGER_MS_CONFIG, 20);            // 20 ms window
props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");
Result:
Batching: 100K messages → ~2K batches (50x reduction)
Compression: 64 KB → ~15 KB per batch (4x savings)
Network: 98% reduction in requests
CPU: 95% reduction in overhead
Added latency: ~20ms (acceptable for most use cases)
Tradeoff: ~20 ms added latency vs. a 50x reduction in network requests. For log/event streaming, this is optimal.”
Why this is good:
Quantifies the problem
Provides specific configuration
Explains each parameter choice
Analyzes tradeoffs explicitly
Gives measurable results
Common Interview Question 2
Q: “Your Kafka producer is dropping messages under high load. How would you debug and fix this?”
Strong Answer:
“Message drops under load suggest buffer memory exhaustion. Here’s my approach:
Diagnosis Steps:
Check JMX metric: buffer-available-bytes → Likely near 0
Check logs for BufferExhaustedException
Check max.block.ms timeout (default 60s)
Root Cause Analysis:
Batches accumulating faster than sender thread can send
Possible causes:
Network slowness (broker response time)
Too small buffer.memory for traffic volume
Inefficient batching (small batches = more sends)
Solutions (in order):
1. Increase buffer memory:
props.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 268435456); // 256 MB
2. Optimize batching for throughput:
props.put(ProducerConfig.BATCH_SIZE_CONFIG, 131072);       // 128 KB
props.put(ProducerConfig.LINGER_MS_CONFIG, 50);            // Wait for fuller batches
props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");
3. Application-level backpressure:
try {
    producer.send(record, (metadata, exception) -> {
        if (exception != null) {
            // Handle delivery failures (retry, dead-letter, alert)
        }
    });
} catch (BufferExhaustedException e) {
    // Thrown by send() when the buffer is full and max.block.ms has elapsed:
    // retry with exponential backoff, or shed load (e.g., return 503 to clients)
}
Result: Larger buffer + more efficient batching = 10x capacity improvement”
Why this is good:
Systematic debugging approach
Multiple solution layers
Specific metrics to check
Code examples
Explains root cause clearly
Red Flags to Avoid
✕ Not understanding latency tradeoff of batching
✕ Setting linger.ms without understanding batch.size
✕ Not considering memory implications
✕ Ignoring compression benefits with batching
✕ Not knowing how to measure batching efficiency
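On the last point, one practical check is to read the producer's own batching metrics under load; a hedged sketch (metric names assumed from the standard producer-metrics group):

// Print average batch size (bytes) and records per request so you can verify
// that batches are actually filling up under production traffic
producer.metrics().forEach((name, metric) -> {
    if ("producer-metrics".equals(name.group())
            && ("batch-size-avg".equals(name.name())
                || "records-per-request-avg".equals(name.name()))) {
        System.out.printf("%s = %s%n", name.name(), metric.metricValue());
    }
});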
Quick Self-Check
Before moving on, can you:
Explain what producer batching is and why it improves throughput?
Name the four triggers that cause a batch to be sent?
Describe the latency-throughput tradeoff controlled by linger.ms and batch.size?
Explain why larger batches compress better?
Describe what happens when buffer.memory is exhausted?
Related Content
Prerequisites
None - this is a foundational performance concept
Used In Systems
Distributed Message Queues - Core performance technique
Event-Driven Architectures - Essential for high throughput
Explained In Detail
Next Recommended: Producer Acknowledgments - Understand reliability guarantees