Producer Batching

How message producers batch records to achieve high throughput by amortizing network overhead and maximizing sequential I/O

TL;DR

Producer batching groups multiple messages together before sending them to the server, amortizing network overhead and maximizing throughput. Instead of sending each message immediately (1 message = 1 network request), batching collects messages for a short time window or until reaching a size threshold, then sends them together in a single request. This technique can improve throughput by 10-100x.

Visual Overview

(Diagram: Batching Overview)

Core Explanation

What is Producer Batching?

Producer batching is a performance optimization where a message producer accumulates multiple messages in memory before sending them to the server in a single network request.

(Diagram: Batching Architecture)

Key Batching Parameters:

// Batch size threshold (bytes)
batch.size = 16384  // 16 KB default

// Time to wait for batch to fill (milliseconds)
linger.ms = 0       // Send immediately (default)
linger.ms = 20      // Wait up to 20ms for more messages

// Total memory for all batches
buffer.memory = 33554432  // 32 MB default
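
To see these parameters in context, here is a minimal producer setup that uses them; the broker address "localhost:9092" and topic "events" are placeholder assumptions:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
props.put(ProducerConfig.BATCH_SIZE_CONFIG, 32768); // close a batch at 32 KB
props.put(ProducerConfig.LINGER_MS_CONFIG, 20);     // or after 20 ms, whichever comes first
props.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 67108864); // 64 MB total buffer

try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
    for (int i = 0; i < 1000; i++) {
        // send() only appends to an in-memory batch; a background I/O thread
        // ships whole batches to the broker as single requests
        producer.send(new ProducerRecord<>("events", "key-" + i, "value-" + i));
    }
} // close() flushes any open batches before shutting down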

Why Batching Dramatically Improves Performance

Network Overhead Analysis:

(Diagram: Network Overhead Analysis)

Throughput Impact:

(Diagram: Throughput Impact)
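
As a rough illustration of where the savings come from, the request-count arithmetic can be written out directly. The ~300 bytes of per-request protocol overhead assumed below is an illustrative figure, not a measurement:

// 100,000 msgs/sec of 1 KB each; assume ~300 bytes of per-request overhead
long msgsPerSec = 100_000;
long overheadBytesPerRequest = 300;
long msgsPerBatch = 64;                     // 64 KB batch / 1 KB messages

// Unbatched: one request per message
long unbatchedRequests = msgsPerSec;                                  // 100,000 req/sec
long unbatchedOverhead = unbatchedRequests * overheadBytesPerRequest; // ~30 MB/sec wasted

// Batched: one request per 64-message batch
long batchedRequests = msgsPerSec / msgsPerBatch;                     // ~1,563 req/sec
long batchedOverhead = batchedRequests * overheadBytesPerRequest;     // ~0.47 MB/sec wasted

System.out.printf("requests/sec: %d -> %d (%dx fewer)%n",
        unbatchedRequests, batchedRequests, unbatchedRequests / batchedRequests);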

Batching Triggers and Tradeoffs

Batch Completion Triggers:

(Diagram: Batch Completion Triggers)
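
The four completion triggers commonly cited for a Kafka batch can be shown directly in code, assuming a producer built as in the earlier sketch:

// A batch is handed to the network sender when any of these fire:
//   1. Size:    the batch reaches batch.size bytes
//   2. Time:    linger.ms elapses while the batch is still open
//   3. flush(): the application forces all open batches out
//   4. close(): the producer flushes everything, then shuts down
producer.send(new ProducerRecord<>("events", "key", "value")); // appended to an open batch
producer.flush();  // blocks until every buffered record has been sent
producer.close();  // flushes remaining batches and releases buffer memory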

The Latency-Throughput Tradeoff:

(Diagram: Configuration Spectrum)

Production Configuration Examples

Example 1: High-Throughput Log Ingestion

Properties props = new Properties();

// Optimize for throughput
props.put(ProducerConfig.BATCH_SIZE_CONFIG, 131072);    // 128 KB
props.put(ProducerConfig.LINGER_MS_CONFIG, 50);         // Wait 50ms
props.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 268435456); // 256 MB

// Enable compression for better batching
props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");

// Allow more in-flight requests
props.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, 5);

// Result: 10x throughput improvement
// Tradeoff: ~60ms added latency

Example 2: Low-Latency Real-Time Events

Properties props = new Properties();

// Optimize for latency
props.put(ProducerConfig.BATCH_SIZE_CONFIG, 16384);     // 16 KB
props.put(ProducerConfig.LINGER_MS_CONFIG, 0);          // No wait
props.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 33554432); // 32 MB

// Minimal compression overhead
props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");

// Limit in-flight for ordering
props.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, 1);

// Result: <5ms p99 latency
// Tradeoff: Lower throughput (~20K msg/sec)

Batch Compression and Efficiency

Compression with Batching:

(Diagram: Compression with Batching)

Production Compression Strategy:

// Pick exactly one codec; the alternatives are shown commented out.
Properties props = new Properties();

// LZ4: fast compression, low CPU
// Best for: high-throughput systems with large batches
// Typical: ~2:1 ratio at ~300 MB/sec
props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");

// Snappy: balanced
// Best for: moderate throughput, balanced CPU usage
// Typical: ~2.3:1 ratio at ~250 MB/sec
// props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "snappy");

// GZIP: best compression
// Best for: network-limited systems, low volume
// Typical: ~3.2:1 ratio at ~50 MB/sec (high CPU)
// props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "gzip");

// None: no compression
// Best for: already-compressed data (images, video)
// props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "none");

Memory Management and Buffer Pool

Buffer Pool Architecture:

(Diagram: Producer Memory Layout)
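
The core idea, a fixed memory budget carved into reusable batch-sized buffers where allocation blocks when the pool is empty, can be sketched in a few lines. This is a simplified illustration, not Kafka's actual BufferPool (which also handles non-standard sizes and fair waiter ordering):

import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;

/** Simplified producer-style buffer pool: a fixed memory budget
 *  split into reusable batch-sized buffers. */
class SimpleBufferPool {
    private final Deque<ByteBuffer> free = new ArrayDeque<>();

    SimpleBufferPool(long totalMemory, int batchSize) {
        for (long i = 0; i < totalMemory / batchSize; i++) {
            free.add(ByteBuffer.allocate(batchSize));
        }
    }

    /** Blocks until a buffer is free. This blocking is the backpressure
     *  that makes send() stall (and eventually time out) under load. */
    synchronized ByteBuffer allocate() throws InterruptedException {
        while (free.isEmpty()) {
            wait(); // caller blocks, like send() waiting up to max.block.ms
        }
        return free.poll();
    }

    /** Returns a buffer to the pool after its batch has been acknowledged. */
    synchronized void release(ByteBuffer buffer) {
        buffer.clear();
        free.add(buffer);
        notifyAll();
    }
}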

Tradeoffs

Advantages:

  • ✓ Massively improved throughput (10-100x)
  • ✓ Reduced network overhead (90%+ fewer requests)
  • ✓ Better compression efficiency with larger batches
  • ✓ Lower CPU usage per message (amortized overhead)
  • ✓ Reduced server-side processing load

Disadvantages:

  • ✕ Increased latency (messages wait in batch)
  • ✕ Higher memory usage (buffering messages)
  • ✕ Complexity in tuning (batch.size vs linger.ms)
  • ✕ Risk of data loss if producer crashes before send
  • ✕ Larger failure blast radius (entire batch fails together)

Real Systems Using This

Apache Kafka

  • Implementation: Per-partition batching with configurable size and time thresholds
  • Scale: 7+ trillion messages/day at LinkedIn with aggressive batching
  • Default Config: 16 KB batch.size, 0ms linger.ms (conservative)
  • Production Config: 64-128 KB batch.size, 20-50ms linger.ms (optimized)

AWS Kinesis

  • Implementation: Automatic batching via PutRecords API (up to 500 records)
  • Limits: 1 MB/sec writes per shard; up to 5 MB per PutRecords request (1 MB per record)
  • SDK Behavior: KPL (Kinesis Producer Library) batches automatically
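
A sketch of a batched write with the AWS SDK for Java v2; the stream name "my-stream" is a placeholder:

import java.util.ArrayList;
import java.util.List;
import software.amazon.awssdk.core.SdkBytes;
import software.amazon.awssdk.services.kinesis.KinesisClient;
import software.amazon.awssdk.services.kinesis.model.PutRecordsRequest;
import software.amazon.awssdk.services.kinesis.model.PutRecordsRequestEntry;
import software.amazon.awssdk.services.kinesis.model.PutRecordsResponse;

KinesisClient kinesis = KinesisClient.create();

List<PutRecordsRequestEntry> entries = new ArrayList<>();
for (int i = 0; i < 500; i++) { // up to 500 records per PutRecords call
    entries.add(PutRecordsRequestEntry.builder()
            .partitionKey("key-" + i)
            .data(SdkBytes.fromUtf8String("event-" + i))
            .build());
}

PutRecordsResponse resp = kinesis.putRecords(PutRecordsRequest.builder()
        .streamName("my-stream")
        .records(entries)
        .build());

// Failures are per-record: retry only the failed subset, not the whole batch
System.out.println("failed records: " + resp.failedRecordCount());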

Google Cloud Pub/Sub

  • Implementation: Client library batches messages automatically
  • Config: Max batch size (1000 messages), max batch bytes (10 MB)
  • Optimization: Batching + request compression for efficiency
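
A sketch of configuring client-side batching with the Google Cloud Pub/Sub Java client; the project/topic names and thresholds are illustrative, and the exact builder API may vary by client version:

import com.google.api.gax.batching.BatchingSettings;
import com.google.cloud.pubsub.v1.Publisher;
import com.google.protobuf.ByteString;
import com.google.pubsub.v1.PubsubMessage;
import com.google.pubsub.v1.TopicName;
import org.threeten.bp.Duration;

Publisher publisher = Publisher.newBuilder(TopicName.of("my-project", "my-topic"))
        .setBatchingSettings(BatchingSettings.newBuilder()
                .setElementCountThreshold(100L)           // send after 100 messages...
                .setRequestByteThreshold(1_000_000L)      // ...or ~1 MB buffered...
                .setDelayThreshold(Duration.ofMillis(10)) // ...or 10 ms, whichever comes first
                .build())
        .build();

// publish() only buffers; the client library sends whole batches behind the scenes
publisher.publish(PubsubMessage.newBuilder()
        .setData(ByteString.copyFromUtf8("event payload"))
        .build());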

RabbitMQ

  • Implementation: Optional publisher confirms batching
  • Config: Manual batching via application-level buffering
  • Performance: 10x improvement with batching enabled
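
Since RabbitMQ leaves batching to the application, a minimal sketch of the pattern: buffer a window of publishes, then amortize one confirm wait over the whole window. The queue name and window size are illustrative; connection error handling is elided:

import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;

Connection conn = new ConnectionFactory().newConnection(); // error handling elided
Channel channel = conn.createChannel();
channel.confirmSelect(); // enable publisher confirms on this channel

// Publish a window of messages, then pay one confirm round trip for all of them
for (int i = 0; i < 100; i++) {
    channel.basicPublish("", "my-queue", null, ("msg-" + i).getBytes());
}
channel.waitForConfirmsOrDie(5_000); // single wait amortized over 100 publishes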

When to Use Producer Batching

✓ Perfect Use Cases

High-Volume Event Streaming

Log Aggregation

Bulk Data Migration

✕ When NOT to Use (or Use Minimal Batching)

Real-Time Alerting

Trading Systems

Request-Response Patterns

Interview Application

Common Interview Question 1

Q: “How would you optimize a producer that’s sending 100,000 small messages per second, causing high CPU and network usage?”

Strong Answer:

“The issue is likely excessive network overhead from sending each message individually. I’d implement producer batching:

Diagnosis:

  • Current: 100K messages/sec × 1 KB each = 100K individual network requests/sec
  • Network overhead: a large share of bandwidth and round trips spent on per-request headers
  • CPU overhead: 100K serialize/send operations per second

Solution:

// Enable aggressive batching
props.put(ProducerConfig.BATCH_SIZE_CONFIG, 65536);    // 64 KB
props.put(ProducerConfig.LINGER_MS_CONFIG, 20);         // 20ms window
props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");

Result:

  • Batching: 100K messages → ~2K batches (50x reduction)
  • Compression: 64 KB → ~15 KB per batch (4x savings)
  • Network: 98% reduction in requests
  • CPU: 95% reduction in overhead
  • Added latency: ~20ms (acceptable for most use cases)

Tradeoff: 20ms added latency vs 50x throughput improvement. For log/event streaming, this is optimal.”

Why this is good:

  • Quantifies the problem
  • Provides specific configuration
  • Explains each parameter choice
  • Analyzes tradeoffs explicitly
  • Gives measurable results

Common Interview Question 2

Q: “Your Kafka producer is dropping messages under high load. How would you debug and fix this?”

Strong Answer:

“Message drops under load suggest buffer memory exhaustion. Here’s my approach:

Diagnosis Steps:

  1. Check JMX metric: buffer-available-bytes → likely near 0 (a programmatic check is sketched after this list)
  2. Check logs for BufferExhaustedException
  3. Check max.block.ms timeout (default 60s)
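
For step 1, a minimal sketch of reading that metric without JMX, assuming a live producer instance:

// Read the same gauge programmatically from the producer's metrics map
producer.metrics().forEach((name, metric) -> {
    if (name.name().equals("buffer-available-bytes")) {
        System.out.println("buffer-available-bytes = " + metric.metricValue());
    }
});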

Root Cause Analysis:

  • Batches accumulating faster than sender thread can send
  • Possible causes:
    • Network slowness (broker response time)
    • Too small buffer.memory for traffic volume
    • Inefficient batching (small batches = more sends)

Solutions (in order):

1. Increase buffer memory:

props.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 268435456); // 256 MB

2. Optimize batching for throughput:

props.put(ProducerConfig.BATCH_SIZE_CONFIG, 131072);    // 128 KB
props.put(ProducerConfig.LINGER_MS_CONFIG, 50);         // Wait for fuller batches
props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");

3. Application-level backpressure:

try {
    producer.send(record, (metadata, exception) -> {
        if (exception != null) {
            // Async delivery failure (after retries): log, retry, or dead-letter
        }
    });
} catch (BufferExhaustedException e) {
    // Thrown from send() itself when the buffer stays full past max.block.ms.
    // Apply backpressure: retry with exponential backoff,
    // or shed load (e.g., return 503 to clients)
}

Result: Larger buffer + more efficient batching = 10x capacity improvement”

Why this is good:

  • Systematic debugging approach
  • Multiple solution layers
  • Specific metrics to check
  • Code examples
  • Explains root cause clearly

Red Flags to Avoid

  • ✕ Not understanding latency tradeoff of batching
  • ✕ Setting linger.ms without understanding batch.size
  • ✕ Not considering memory implications
  • ✕ Ignoring compression benefits with batching
  • ✕ Not knowing how to measure batching efficiency

Quick Self-Check

Before moving on, can you:

  • Explain producer batching in 60 seconds?
  • Draw the batching flow from send() to network?
  • List all 4 batch trigger conditions?
  • Explain the latency-throughput tradeoff?
  • Calculate network savings from batching?
  • Configure producer for high-throughput vs low-latency?

Prerequisites

None - this is a foundational performance concept

Used In Systems

  • Distributed Message Queues - Core performance technique
  • Event-Driven Architectures - Essential for high throughput

Next Recommended: Producer Acknowledgments - Understand reliability guarantees

Interview Notes

  • Interview Relevance: 70% of performance interviews
  • Production Impact: Powers systems at LinkedIn (7+ trillion msgs/day)
  • Performance: 10-100x throughput improvement
  • Scalability: 90%+ fewer network requests