Producer Acknowledgments

Mechanisms by which message producers receive confirmation that their messages were successfully persisted, enabling reliability tradeoffs between latency and durability

TL;DR

Producer acknowledgments (acks) control when a Kafka producer considers a message successfully written. The options are acks=0 (no confirmation), acks=1 (the leader confirms), and acks=all (all in-sync replicas confirm), each trading latency for durability. The setting is critical for balancing performance against data safety in message brokers.

Visual Overview

[Figure: Producer Acknowledgments Overview]

Core Explanation

What are Producer Acknowledgments?

Producer acknowledgments (acks) control when a Kafka producer considers a write operation successful. This determines:

  1. When the producer receives confirmation that a message is safe
  2. How many replicas must persist the message before that confirmation
  3. The trade-off between latency and durability

Three Levels:

[Figure: Three Acknowledgment Levels]
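As a rough illustration of the three levels, the sketch below is a toy model rather than Kafka client code. The function name `confirmationsRequired` and its parameters are hypothetical; it only encodes how many replicas must hold a record before the producer treats the send as successful.

```javascript
// Toy model, not Kafka client code: how many replicas must hold a record
// before the producer treats the send as successful.
function confirmationsRequired(acks, inSyncReplicaCount) {
  if (acks === 0) return 0; // fire and forget: no broker response awaited
  if (acks === 1) return 1; // the partition leader only
  if (acks === -1 || acks === "all") return inSyncReplicaCount; // every in-sync replica
  throw new RangeError(`unsupported acks value: ${acks}`);
}

for (const acks of [0, 1, "all"]) {
  const needed = confirmationsRequired(acks, 3);
  console.log(`acks=${acks}: waits for ${needed} replica(s)`);
}
```

Note that for acks=all the count follows the current size of the in-sync set, not the replication factor, which is why min.insync.replicas matters later on.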

acks=0: No Acknowledgment

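Behavior:

The producer sends the record and does not wait for any response from the broker, so it cannot tell whether the write succeeded, and retries do not apply.

[Figure: acks=0 Behavior]

When Message Can Be Lost:

A record is lost whenever the request never reaches the leader or the leader fails before persisting it, for example on a network error or a broker crash.

[Figure: acks=0 Message Loss Scenarios]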

Configuration:

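// kafkajs: acks/compression are send() options; CompressionTypes is exported by kafkajs
const producer = kafka.producer();
await producer.send({
  topic: "metrics",
  acks: 0, // No acknowledgment
  compression: CompressionTypes.GZIP, // Often used with acks=0 for max throughput
  messages: [{ value: "cpu=80" }],
});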

Use Cases:

[Figure: acks=0 Use Cases]

acks=1: Leader Acknowledgment

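Behavior:

The leader writes the record to its local log and responds to the producer without waiting for any follower.

[Figure: acks=1 Behavior]

When Message Can Be Lost:

A record is lost if the leader fails right after acknowledging it but before any follower has replicated it.

[Figure: acks=1 Message Loss Scenario]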
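The loss window for acks=1 can be sketched with a toy model of one partition that has a leader and a single follower. This is an illustration under stated assumptions, not broker code: `simulateLeaderCrash` is a hypothetical name, and the model replays only the worst-case ordering in which the leader crashes before the follower fetches the record.

```javascript
// Toy model of one partition with a leader and a single follower.
// It replays the worst-case ordering for acks=1: the leader crashes
// right after acknowledging, before the follower has fetched the record.
function simulateLeaderCrash(acks) {
  const leaderLog = [];
  const followerLog = [];
  let acknowledged = false;

  leaderLog.push("m1"); // leader appends the record locally

  if (acks === 1) {
    acknowledged = true; // ack is sent before any follower has the record
  } else if (acks === "all") {
    followerLog.push("m1"); // the follower replicates first...
    acknowledged = true; // ...and only then is the ack sent
  }

  // The leader crashes; the follower is promoted and its log becomes the truth.
  const newLeaderLog = followerLog;
  const lost = acknowledged && !newLeaderLog.includes("m1");
  return { acknowledged, lost };
}

console.log("acks=1:", simulateLeaderCrash(1));
console.log("acks=all:", simulateLeaderCrash("all"));
```

With acks=all, a leader crash before replication simply means no acknowledgment is sent, so the producer knows to retry. The dangerous case is an acknowledged record that no surviving replica holds, which this model produces only for acks=1.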

Configuration:

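const producer = kafka.producer({
  retry: { retries: 3 }, // Retry on failure
});
// In kafkajs, acks and timeout are send() options
await producer.send({
  topic: "user-activity",
  acks: 1, // Leader acknowledgment (kafkajs itself defaults to -1)
  timeout: 30000, // 30s timeout
  messages: [{ value: "click" }],
});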

Use Cases:

[Figure: acks=1 Use Cases]

acks=all: Full ISR Acknowledgment

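Behavior:

The leader responds only after every replica in the current in-sync set has the record.

[Figure: acks=all Behavior]

In-Sync Replicas (ISR):

The ISR is the set of replicas, leader included, that are currently caught up with the leader's log.

[Figure: In-Sync Replicas (ISR)]

min.insync.replicas:

The minimum number of in-sync replicas that must hold a write for an acks=all request to succeed. If the ISR is smaller than this, the broker rejects the write with a NotEnoughReplicas error.

[Figure: min.insync.replicas Configuration]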

Configuration:

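const producer = kafka.producer({
  retry: { retries: 5 },
});
// In kafkajs, acks and timeout are send() options
await producer.send({
  topic: "orders",
  acks: -1, // -1 means "all" (acks=all); this is the kafkajs default
  timeout: 30000,
  messages: [{ value: "order" }],
});

// Topic configuration (set on the topic or broker, not in the producer):
//   min.insync.replicas=2   At least 2 in-sync replicas must have the write
//   replication.factor=3    Total of 3 replicas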
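To make the interaction between acks and min.insync.replicas concrete, here is a minimal sketch of the acceptance check a leader applies to a produce request. It is a simplified model, not broker code; `checkProduceRequest` and its field names are hypothetical.

```javascript
// Simplified model of the leader's acceptance check for a produce request.
function checkProduceRequest({ acks, isrSize, minInsyncReplicas }) {
  const waitsForIsr = acks === -1 || acks === "all";
  // min.insync.replicas is only enforced for acks=all requests.
  if (waitsForIsr && isrSize < minInsyncReplicas) {
    return { accepted: false, error: "NOT_ENOUGH_REPLICAS" };
  }
  return { accepted: true, error: null };
}

// replication.factor=3, min.insync.replicas=2
console.log(checkProduceRequest({ acks: "all", isrSize: 3, minInsyncReplicas: 2 })); // healthy
console.log(checkProduceRequest({ acks: "all", isrSize: 1, minInsyncReplicas: 2 })); // degraded
console.log(checkProduceRequest({ acks: 1, isrSize: 1, minInsyncReplicas: 2 })); // not enforced
```

The last call shows why the topic setting alone is not enough: a producer using acks=1 still gets its write accepted on a degraded partition.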

Use Cases:

[Figure: acks=all Use Cases]

Real Systems Using Producer Acks

System             | Default acks | Typical Config                   | Rationale
Kafka Streams      | acks=all     | acks=all, min.insync.replicas=2  | State stores require durability
Netflix (Keystone) | acks=1       | acks=1, replication=3            | High throughput, tolerate rare loss
LinkedIn           | acks=all     | acks=all, min.insync.replicas=2  | Business-critical events
Uber               | acks=1       | acks=1 (logs), acks=all (trips)  | Mixed based on data criticality
Confluent Cloud    | acks=all     | acks=all, min.insync.replicas=2  | Default for safety

Case Study: Kafka at LinkedIn

[Figure: LinkedIn Kafka Acknowledgment Strategy]

When to Use Each Ack Level

acks=0: Fire and Forget

Use When:

[Figure: acks=0 When to Use]

acks=1: Leader Only

Use When:

[Figure: acks=1 When to Use]

acks=all: Full Replication

Use When:

[Figure: acks=all When to Use]

Hybrid Approach

Different Topics, Different Acks:

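// One producer can serve all three: in kafkajs, acks is chosen per send() call.
// orderMessages, analyticsMessages, and metricMessages are placeholder arrays.
const producer = kafka.producer();

// Critical orders: acks=all
await producer.send({
  topic: "orders",
  acks: -1,
  timeout: 30000,
  messages: orderMessages,
});

// Analytics events: acks=1
await producer.send({
  topic: "analytics",
  acks: 1,
  timeout: 10000,
  messages: analyticsMessages,
});

// Metrics: acks=0
await producer.send({
  topic: "metrics",
  acks: 0,
  compression: CompressionTypes.GZIP, // exported by kafkajs
  messages: metricMessages,
});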
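One way to keep the per-topic choice consistent is to derive the send options from a named durability tier instead of repeating literals. The sketch below is only an illustration: the tier names and the helpers `sendOptionsFor` and `buildSendRequest` are hypothetical, and the resulting object simply carries the same `topic`, `messages`, `acks`, and `timeout` fields used in this section's examples.

```javascript
// Hypothetical durability tiers mapped to acks/timeout settings.
const SEND_OPTIONS_BY_TIER = Object.freeze({
  critical: { acks: -1, timeout: 30000 }, // orders, payments
  standard: { acks: 1, timeout: 10000 }, // analytics events
  bestEffort: { acks: 0 }, // metrics
});

function sendOptionsFor(tier) {
  if (!Object.prototype.hasOwnProperty.call(SEND_OPTIONS_BY_TIER, tier)) {
    throw new Error(`unknown durability tier: ${tier}`);
  }
  return { ...SEND_OPTIONS_BY_TIER[tier] }; // copy so callers cannot mutate the table
}

// Builds the argument object for a send call; it performs no network I/O.
function buildSendRequest(topic, tier, messages) {
  return { topic, messages, ...sendOptionsFor(tier) };
}

console.log(buildSendRequest("orders", "critical", [{ value: "order-456" }]));
console.log(buildSendRequest("metrics", "bestEffort", [{ value: "cpu=80" }]));
```

Centralising the mapping makes it harder for a new topic to end up with a weaker setting than intended, since an unknown tier fails loudly.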

Interview Application

Common Interview Question

Q: “How would you ensure zero data loss in a Kafka-based order processing system?”

Strong Answer:

“To ensure zero data loss for orders, I’d configure producers with acks=all and proper ISR settings:

Configuration:

Producer:
acks=all (or acks=-1)
retries=MAX_INT (effectively infinite; bounded in practice by delivery.timeout.ms)
max.in.flight.requests.per.connection=1 (for ordering)

Topic:
min.insync.replicas=2
replication.factor=3

How This Prevents Loss:

  1. acks=all: Producer waits until every in-sync replica has the message before considering the write successful
  2. min.insync.replicas=2: Requires at least 2 in-sync replicas (leader + 1 follower) to have the write
  3. replication.factor=3: Total of 3 copies across brokers
  4. Result: Message on ≥2 replicas before ACK

Failure Scenarios:

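  • Network failure: Producer retries until successful
  • Leader failure: Message already on an in-sync follower, which is promoted to new leader
  • Follower failure: Still have leader + other follower (meets min ISR)
  • Leader + Follower fail: The third replica has the message only if it was in sync; otherwise the partition stays unavailable until an in-sync replica returns (assuming unclean leader election is disabled)

Only lose data if: Every replica holding the message fails at once, or an out-of-sync replica is elected leader (both rare)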

Trade-offs:

  • Latency: Higher than acks=1 (for example, 20-30ms vs 5-10ms, depending on the cluster)
  • Throughput: Lower (wait for replication)
  • Availability: May reject writes if ISR < 2

Worth It: For orders where data loss = lost revenue + angry customers

Monitoring: Alert if ISR falls below min.insync.replicas”
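The arithmetic behind this answer, and the monitoring check it ends with, can be sketched as two small helpers. Both are illustrative: `failureTolerance` and `partitionsBelowMinIsr` are hypothetical names, and the partition shape `{ partitionId, isr }` is an assumption made for the example rather than a specific client's metadata format.

```javascript
// Arithmetic behind the acks=all + min.insync.replicas pattern.
function failureTolerance(replicationFactor, minInsyncReplicas) {
  if (minInsyncReplicas < 1 || minInsyncReplicas > replicationFactor) {
    throw new RangeError("need 1 <= minInsyncReplicas <= replicationFactor");
  }
  return {
    // Replicas that can drop out while acks=all writes are still accepted.
    writesSurviveReplicasDown: replicationFactor - minInsyncReplicas,
    // Copies every acknowledged message is guaranteed to have at ack time.
    copiesAtAckTime: minInsyncReplicas,
  };
}

// Hypothetical partition shape: { partitionId, isr: [brokerId, ...] }.
// Returns the ids of partitions that would currently reject acks=all writes.
function partitionsBelowMinIsr(partitions, minInsyncReplicas) {
  return partitions
    .filter((p) => p.isr.length < minInsyncReplicas)
    .map((p) => p.partitionId);
}

console.log(failureTolerance(3, 2));
console.log(
  partitionsBelowMinIsr(
    [
      { partitionId: 0, isr: [1, 2, 3] },
      { partitionId: 1, isr: [2] },
    ],
    2
  )
);
```

With replication.factor=3 and min.insync.replicas=2, one replica can be down without blocking writes, and every acknowledged message exists on at least two brokers. Raising min.insync.replicas to 3 would add a copy but make any single broker outage stop writes.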

Code Example

Producer with Different Ack Levels

const { Kafka, CompressionTypes } = require("kafkajs");

const kafka = new Kafka({
  clientId: "my-producer",
  brokers: ["kafka1:9092", "kafka2:9092", "kafka3:9092"],
});

// Note: in kafkajs, acks, timeout, and compression are options of
// producer.send(), not of kafka.producer().

// Configuration 1: acks=0 (Fire and Forget)
async function sendMetrics() {
  const producer = kafka.producer();

  await producer.connect();

  const start = Date.now();
  await producer.send({
    topic: "metrics",
    acks: 0, // No acknowledgment
    compression: CompressionTypes.GZIP,
    messages: [{ value: JSON.stringify({ cpu: 80, mem: 60 }) }],
  });
  const latency = Date.now() - start;

  console.log(`Metrics sent (acks=0): ${latency}ms`);
  // Typical output: 1-2ms
  // Risk: Message may be lost
  await producer.disconnect(); // Release connections so the process can exit
}

// Configuration 2: acks=1 (Leader Acknowledgment)
async function sendUserActivity() {
  const producer = kafka.producer({
    retry: {
      retries: 3,
      initialRetryTime: 100,
    },
  });

  await producer.connect();

  const start = Date.now();
  await producer.send({
    topic: "user-activity",
    acks: 1, // Leader acknowledgment (set per send; kafkajs defaults to -1)
    timeout: 30000,
    messages: [
      {
        key: "user-123",
        value: JSON.stringify({ action: "click", page: "/products" }),
      },
    ],
  });
  const latency = Date.now() - start;

  console.log(`Activity sent (acks=1): ${latency}ms`);
  // Typical output: 5-10ms
  // Risk: Lost if leader fails before replication
  await producer.disconnect(); // Release connections so the process can exit
}

// Configuration 3: acks=all (Full ISR Acknowledgment)
async function sendOrder() {
  const producer = kafka.producer({
    retry: {
      retries: Number.MAX_SAFE_INTEGER, // Effectively retry forever
      initialRetryTime: 100,
      maxRetryTime: 30000,
    },
    idempotent: true, // Avoids duplicates from retried sends (experimental in kafkajs)
    maxInFlightRequests: 1, // Preserve ordering
  });

  await producer.connect();

  const start = Date.now();
  try {
    await producer.send({
      topic: "orders", // Topic config: min.insync.replicas=2, replication.factor=3
      acks: -1, // acks=all (wait for full ISR)
      timeout: 30000,
      messages: [
        {
          key: "order-456",
          value: JSON.stringify({
            orderId: "456",
            userId: "123",
            total: 99.99,
            items: [{ id: "product-1", qty: 2 }],
          }),
        },
      ],
    });
    const latency = Date.now() - start;

    console.log(`Order sent (acks=all): ${latency}ms`);
    // Typical output: 15-30ms
    // Guarantee: with min.insync.replicas=2, message is on ≥2 replicas before the ack
  } catch (error) {
    // After retries are exhausted kafkajs may wrap the broker error, so check
    // the wrapped error too (the property name can vary by kafkajs version).
    const cause = error.cause || error.originalError || error;
    if (cause.type === "NOT_ENOUGH_REPLICAS") {
      // ISR < min.insync.replicas (degraded cluster)
      console.error("Cluster degraded: Not enough in-sync replicas");
      // Alert operations team
      // Queue order for retry
    }
    throw error;
  } finally {
    await producer.disconnect(); // Release connections so the process can exit
  }
}

// Demonstrating latency differences
async function benchmark() {
  console.log("Benchmarking producer acknowledgments...\n");

  await sendMetrics(); // ~1-2ms
  await sendUserActivity(); // ~5-10ms
  await sendOrder(); // ~15-30ms

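  // Trade-off: Latency vs Durability
  // acks=0:   Fastest, least safe
  // acks=1:   Balanced (Java client default before Kafka 3.0)
  // acks=all: Slowest, safest (default in kafkajs and in the Java client since 3.0)
}

benchmark().catch(console.error);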

Error Handling with acks=all

async function sendCriticalData(data) {
  const producer = kafka.producer({
    retry: {
      retries: 5,
      initialRetryTime: 300,
    },
  });

  await producer.connect();

  try {
    await producer.send({
      topic: "critical-data",
      acks: -1, // acks=all; in kafkajs this is a send() option
      messages: [{ value: JSON.stringify(data) }],
    });

    console.log("Data persisted successfully (acks=all)");
  } catch (error) {
    // After retries are exhausted kafkajs may wrap the broker error, so check
    // the wrapped error too (the property name can vary by kafkajs version).
    const cause = error.cause || error.originalError || error;

    // Error types to handle:

    if (cause.type === "NOT_ENOUGH_REPLICAS") {
      // ISR < min.insync.replicas; the write was rejected before the append
      console.error("Not enough in-sync replicas");
      // Action: Alert operations, queue for retry
    }

    if (cause.type === "NOT_ENOUGH_REPLICAS_AFTER_APPEND") {
      // Message written to leader, but ISR shrank before replication
      console.error("Replication failed after append");
      // Action: Retry (may be duplicate, use idempotent producer)
    }

    if (cause.type === "REQUEST_TIMED_OUT") {
      // Replication took longer than timeout
      console.error("Acknowledgment timeout");
      // Action: Retry (may be duplicate)
    }

    // Store in dead letter queue for manual review.
    // storeInDLQ is an application-provided helper, not part of kafkajs.
    await storeInDLQ(data, error);
    throw error;
  } finally {
    await producer.disconnect();
  }
}
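The branching above can also be expressed as a small pure function, which is easier to unit test than logic buried in a catch block. This is a sketch under assumptions: `classifyProduceError` and the action labels are hypothetical, and the `cause`/`originalError` unwrapping is an assumption about how a client library might wrap the broker error after exhausting retries.

```javascript
// Maps a produce error to a follow-up action. Pure function: no I/O.
function classifyProduceError(error) {
  // Assumption: a wrapper error may expose the broker error as `cause`
  // or `originalError`; fall back to the error itself otherwise.
  const cause = error.cause || error.originalError || error;

  switch (cause.type) {
    case "NOT_ENOUGH_REPLICAS":
      // Rejected before the append, so a later retry cannot duplicate.
      return { action: "alert-and-retry-later", mayDuplicate: false };
    case "NOT_ENOUGH_REPLICAS_AFTER_APPEND":
      // The leader has the record; a retry may write it a second time.
      return { action: "retry", mayDuplicate: true };
    case "REQUEST_TIMED_OUT":
      // Outcome unknown: the write may or may not have been replicated.
      return { action: "retry", mayDuplicate: true };
    default:
      return { action: "dead-letter", mayDuplicate: false };
  }
}

console.log(classifyProduceError({ type: "NOT_ENOUGH_REPLICAS" }));
console.log(classifyProduceError({ cause: { type: "REQUEST_TIMED_OUT" } }));
console.log(classifyProduceError(new Error("connection refused")));
```

The `mayDuplicate` flag marks the cases where an idempotent producer or a consumer-side de-duplication key is worth having before retrying.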

Used In Systems:

  • Kafka (producer acknowledgments)
  • Pulsar (similar ack levels)
  • RabbitMQ (publisher confirms)

Explained In Detail:

  • Kafka Deep Dive - Producer mechanics and acknowledgments

Quick Self-Check

  • Can explain acks=0/1/all in 60 seconds?
  • Understand latency vs durability trade-offs?
  • Know when messages can be lost for each ack level?
  • Can explain min.insync.replicas and ISR?
  • Understand acks=all + min.insync.replicas=2 pattern?
  • Know which ack level to use for different use cases?

Interview Notes

  • Interview Relevance: 65% of messaging interviews
  • Production Impact: Durability control
  • Performance: Latency vs safety tradeoffs
  • Scalability: Data loss prevention