~100s Visual Explainer

Circuit Breaker

How to prevent cascade failures in microservices — the circuit breaker pattern for resilient service communication.


One Slow Service Kills Everything

When a downstream service becomes slow or unresponsive, callers keep sending requests. Threads block waiting for timeouts. Connection pools exhaust. Soon the caller itself becomes unresponsive, and the failure cascades upstream.

One broken service can take down your entire system.

  • Slow responses are worse than outright failures (they tie up resources longer)
  • Thread pool exhaustion
  • Connection pool exhaustion
  • Cascade failures through the call graph

Fail Fast, Protect Resources

A circuit breaker wraps calls to external services. When failures exceed a threshold, the breaker "trips open" — subsequent calls fail immediately without attempting the network call. This protects your resources and gives the downstream service time to recover.

Like an electrical circuit breaker that prevents fires.

  • Wrapper around external calls
  • Tracks success/failure metrics
  • Trips open when failures exceed threshold
  • Fails fast: immediate rejection, no network call
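A minimal sketch of such a wrapper in Python (class and method names here are illustrative, not from any specific library): it counts consecutive failures, trips open at a threshold, rejects calls immediately while open, and lets one test call through after the timeout.

```python
import time

class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without contacting downstream."""

class CircuitBreaker:
    """Minimal sketch: counts consecutive failures, trips open at a threshold,
    and allows a test call after the open timeout (half-open)."""

    def __init__(self, failure_threshold=5, open_timeout=30.0):
        self.failure_threshold = failure_threshold
        self.open_timeout = open_timeout
        self.failures = 0
        self.opened_at = None            # None means the breaker is CLOSED

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.open_timeout:
                raise CircuitOpenError("open: failing fast, no network call")
            # Timeout elapsed: fall through and let one test call proceed.
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = time.monotonic()   # trip open (or stay open)
            raise
        self.failures = 0
        self.opened_at = None            # any success closes the breaker
        return result
```

Note this sketch is not thread-safe, and during the half-open window it would let every concurrent caller through; a production implementation would serialize the test probe.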

Closed → Open → Half-Open → ...

The circuit breaker has three states:

  • CLOSED: Normal operation. Requests pass through. Failures are counted.
  • OPEN: Breaker has tripped. All requests fail immediately without calling downstream.
  • HALF-OPEN: After a timeout, allow one test request. If it succeeds, close the breaker. If it fails, stay open.
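The transitions above can be written down as a small pure function (state and event names are illustrative, not from any library):

```python
# Three-state circuit breaker transitions as a lookup table.
def next_state(state, event):
    transitions = {
        ("CLOSED", "failure_threshold_reached"): "OPEN",
        ("OPEN", "timeout_elapsed"): "HALF_OPEN",
        ("HALF_OPEN", "probe_succeeded"): "CLOSED",
        ("HALF_OPEN", "probe_failed"): "OPEN",
    }
    # Any event without an entry leaves the state unchanged
    # (e.g. a success while CLOSED just resets the failure counter).
    return transitions.get((state, event), state)
```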

Watching the Breaker Trip

Imagine calling a payment service. Five requests fail in a row (threshold=5). The breaker trips open. For the next 30 seconds, all payment calls return an error immediately — no network attempt.

After 30 seconds, one test request goes through. If the payment service is back, the breaker closes and normal operation resumes.

  • Failure counter increments on each failure
  • Threshold reached → trip to OPEN
  • OPEN state: fail immediately (protect resources)
  • Automatic recovery via HALF-OPEN test
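The payment-service scenario can be simulated deterministically with a fake clock (the tiny breaker below is an illustrative sketch, not a production class):

```python
# Walk-through of the example above: threshold=5, 30 s open timeout.
class FakeClock:
    def __init__(self): self.t = 0.0
    def now(self): return self.t
    def advance(self, s): self.t += s

class Breaker:
    def __init__(self, clock, threshold=5, timeout=30.0):
        self.clock, self.threshold, self.timeout = clock, threshold, timeout
        self.failures, self.opened_at = 0, None

    def call(self, fn):
        if self.opened_at is not None and self.clock.now() - self.opened_at < self.timeout:
            return "rejected"                    # OPEN: fail fast, no network call
        try:
            fn()
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.threshold:
                self.opened_at = self.clock.now()  # trip (or re-open after failed probe)
            return "failed"
        self.failures, self.opened_at = 0, None    # success closes the breaker
        return "ok"

clock = FakeClock()
b = Breaker(clock)
def down(): raise RuntimeError("payment service down")
def up(): pass

results = [b.call(down) for _ in range(5)]  # five real failures -> breaker trips
fast_fail = b.call(down)                    # rejected instantly, downstream never called
clock.advance(31)                           # wait out the 30 s open timeout
probe = b.call(up)                          # half-open test request succeeds
recovered = b.call(up)                      # breaker is closed again
```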

Tuning the Breaker

Circuit breaker parameters need tuning:

  • Failure threshold: Too low = false trips on transient errors. Too high = slow to protect.
  • Timeout: Too short = hammering a recovering service. Too long = slow recovery.
  • Window: count failures over a rolling time window, or only consecutive failures. Consecutive counting is simpler; a rolling window tolerates occasional isolated errors.

Monitor your breakers — an open breaker is a signal something is wrong.
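One way to collect these knobs is a small configuration object (field names and defaults below are illustrative, not from a specific library):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class BreakerConfig:
    failure_threshold: int = 5        # lower = faster protection, more false trips
    open_timeout_s: float = 30.0      # shorter = faster recovery, risks hammering
    half_open_probes: int = 1         # test requests allowed while HALF-OPEN
    rolling_window_s: Optional[float] = None  # None = count consecutive failures

# Example: a more tolerant breaker for a flaky but important dependency.
cfg = BreakerConfig(failure_threshold=10, open_timeout_s=60.0)
```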

What's Next?

Now that you understand circuit breakers, explore related patterns: Rate Limiting for protecting against overload, Heartbeat & Failure Detection for detecting failures, and Failover for handling detected failures.