Skip to content

Sliding Window

A rate limiting algorithm that smoothly enforces request limits by weighting the previous time window, preventing boundary burst problems

TL;DR

Sliding window counter is a rate limiting algorithm that fixes the boundary-burst problem of fixed windows. It weights the previous window’s count based on how far into the current window you are, providing smoother rate enforcement with minimal memory overhead (just two counters per key). Unlike token bucket, it doesn’t allow bursts—it enforces strict rate limits.

Visual Overview

Sliding Window Counter

Core Explanation

What is Sliding Window?

Real-World Analogy: Imagine a highway toll booth that tracks cars per hour. A fixed window system would reset at exactly each hour—so a convoy could race through at 11:59 and again at 12:00. A sliding window instead asks: “In the last 60 minutes from right now, how many cars passed?” It smoothly weights recent and older traffic.

Like looking backward through a sliding 1-hour window that moves with you through time, instead of fixed calendar hours.

How It Works

The algorithm maintains two counters per rate-limited key:

  1. prev_count: Requests in the previous complete window
  2. curr_count: Requests in the current ongoing window
Sliding Window State

The Algorithm

on request:
  window = floor(now / window_size)
  elapsed = (now % window_size) / window_size

  if window != current_window:
    # Rotate to new window
    prev_count = curr_count
    curr_count = 0
    current_window = window

  weighted_count = prev_count * (1 - elapsed) + curr_count

  if weighted_count < limit:
    curr_count += 1
    return ALLOW
  else:
    return REJECT

Key Properties

PropertyValueNotes
Time ComplexityO(1) per requestSimple arithmetic
Space ComplexityO(2) per keyTwo counters + window ID
Burst PreventionYesNo sudden spikes allowed
AccuracyApproximateWeighted estimate, not exact

Why Weighted Counting Works

The weighting formula estimates how many requests occurred in the “last window_size seconds” as if we were looking backward from now:

Weighted Count Visualization

Sliding Window vs Fixed Window vs Token Bucket

Algorithm Comparison

Real Systems Using Sliding Window

SystemImplementationConfigurationUse Case
Stripe APISliding window stylePer-endpoint limitsPayment processing
Cloudflare Rate LimitingConfigurable algorithmsRule-basedEdge protection
Redis Rate Limiterredis-cell moduleConfigurableGeneral rate limiting
EnvoyToken bucket + sliding windowFilter chainService mesh
HAProxystick-tablesConnection rate limitingLoad balancer

Note: Many systems offer configurable algorithms. Verify current behavior in documentation.

Configuration Patterns

Common Sliding Window Configurations

When to Use Sliding Window

✓ Perfect Use Cases

Sliding Window Use Cases

✕ When NOT to Use

When Sliding Window May Not Fit

Interview Application

Common Interview Question

Q: “Explain the boundary burst problem in rate limiting and how sliding window solves it.”

Strong Answer:

“The boundary burst problem occurs with fixed window rate limiting. If your window is 1 minute and resets at :00, a client can make 100 requests at 0:59 and 100 more at 1:00—effectively 200 requests in 2 seconds while technically respecting ‘100/minute’.

Sliding window counter solves this by weighting the previous window:

The Formula:

weighted = prev_window × (1 - elapsed_fraction) + curr_window

If we’re 25% into the current window, we count 75% of the previous window’s requests plus 100% of current. This approximates ‘requests in the last 60 seconds from right now.’

Example: At 1:15 (25% into minute 1):

  • Previous window (minute 0): 80 requests
  • Current window (minute 1): 25 requests so far
  • Weighted: 80 × 0.75 + 25 = 60 + 25 = 85
  • If limit is 100: ALLOWED

Trade-offs:

  • Pro: Prevents boundary gaming, smooth enforcement
  • Con: Approximate (not exact count), no burst tolerance

When I’d use it:

  • Billing/quotas where overage isn’t acceptable
  • SLA enforcement with partner contracts

When I’d use token bucket instead:

  • APIs where legitimate burst traffic is expected (mobile app startup)”

Follow-up: How would you implement sliding window for distributed rate limiting?

“For distributed sliding window, I’d use Redis with atomic operations:

  1. Key structure: ratelimit:{user}:{window} stores count per window
  2. Atomic increment: Use INCR with EXPIRE
  3. Two-key lookup: Fetch current and previous window counts

Lua script approach (atomic):

-- Get both windows atomically
local prev = redis.call('GET', prev_key) or 0
local curr = redis.call('GET', curr_key) or 0
local weighted = prev * (1 - elapsed) + curr
if weighted < limit then
  redis.call('INCR', curr_key)
  redis.call('EXPIRE', curr_key, window_size * 2)
  return 1
end
return 0

Considerations:

  • Redis failures: Fail open (allow) or fail closed (reject) based on use case
  • Multi-region: Local Redis per region with eventual sync, accept slight inaccuracy
  • Clock skew: Use Redis server time, not client time”

Follow-up: What’s the sliding window log variant?

Sliding window log stores the timestamp of each request instead of just a count. To check the limit, you count timestamps within the last window_size seconds.

Pros: Exact counting, no approximation Cons: O(N) space per key, O(log N) per request with cleanup

When to use: Regulatory requirements for exact counts, very low-volume endpoints where precision matters.

For high-throughput APIs, the counter variant’s approximation (typically within 1-5% of actual) is acceptable and much more efficient.”

Code Example

Sliding Window Counter (Python)

import time
from typing import Tuple, Dict
from dataclasses import dataclass

@dataclass
class WindowState:
    """State for sliding window rate limiter."""
    prev_count: int = 0
    curr_count: int = 0
    current_window: int = 0

class SlidingWindowRateLimiter:
    """
    Sliding window counter rate limiter.

    Prevents boundary burst problem by weighting previous window.
    """

    def __init__(self, limit: int, window_seconds: int):
        """
        Args:
            limit: Maximum requests per window
            window_seconds: Window size in seconds
        """
        self.limit = limit
        self.window_seconds = window_seconds
        self.states: Dict[str, WindowState] = {}

    def is_allowed(self, key: str) -> Tuple[bool, dict]:
        """
        Check if request is allowed under sliding window limit.

        Args:
            key: Rate limit key (user_id, ip, api_key)

        Returns:
            (allowed: bool, info: dict)
        """
        now = time.time()
        window = int(now // self.window_seconds)
        elapsed = (now % self.window_seconds) / self.window_seconds

        # Get or create state
        if key not in self.states:
            self.states[key] = WindowState()

        state = self.states[key]

        # Rotate windows if needed
        if window != state.current_window:
            if window == state.current_window + 1:
                # Normal rotation to next window
                state.prev_count = state.curr_count
            else:
                # Skipped windows (no activity), prev is 0
                state.prev_count = 0
            state.curr_count = 0
            state.current_window = window

        # Calculate weighted count
        weighted = state.prev_count * (1 - elapsed) + state.curr_count

        if weighted < self.limit:
            state.curr_count += 1
            allowed = True
            remaining = int(self.limit - weighted - 1)
        else:
            allowed = False
            remaining = 0

        return allowed, {
            "remaining": max(0, remaining),
            "limit": self.limit,
            "window_seconds": self.window_seconds,
            "weighted_count": round(weighted, 2)
        }


# Usage example
if __name__ == "__main__":
    # 5 requests per 10 seconds
    limiter = SlidingWindowRateLimiter(
        limit=5,
        window_seconds=10
    )

    user = "user_123"

    print("Testing sliding window rate limiter (5 req / 10 sec):\n")

    # Simulate requests
    for i in range(8):
        allowed, info = limiter.is_allowed(user)
        status = "✓ ALLOWED" if allowed else "✗ REJECTED"
        print(f"Request {i+1}: {status}")
        print(f"  Weighted count: {info['weighted_count']}")
        print(f"  Remaining: {info['remaining']}\n")

    # Wait for partial window rotation
    print("Waiting 5 seconds (half a window)...\n")
    time.sleep(5)

    # More requests - previous window still partially counted
    for i in range(3):
        allowed, info = limiter.is_allowed(user)
        status = "✓ ALLOWED" if allowed else "✗ REJECTED"
        print(f"Request {i+9}: {status}")
        print(f"  Weighted count: {info['weighted_count']}")
        print(f"  Remaining: {info['remaining']}\n")

Sliding Window with Redis (Production)

import time
import redis

class RedisSlidingWindow:
    """
    Distributed sliding window rate limiter using Redis.

    Uses two keys per client: previous and current window counts.
    """

    LUA_SCRIPT = """
    local curr_key = KEYS[1]
    local prev_key = KEYS[2]
    local limit = tonumber(ARGV[1])
    local window_size = tonumber(ARGV[2])
    local elapsed = tonumber(ARGV[3])

    -- Get counts (default 0)
    local prev_count = tonumber(redis.call('GET', prev_key) or 0)
    local curr_count = tonumber(redis.call('GET', curr_key) or 0)

    -- Calculate weighted count
    local weighted = prev_count * (1 - elapsed) + curr_count

    if weighted < limit then
        -- Increment current window
        redis.call('INCR', curr_key)
        -- Set expiry (2 windows to ensure prev is available)
        redis.call('EXPIRE', curr_key, window_size * 2)
        return {1, math.floor(limit - weighted - 1), weighted}
    else
        return {0, 0, weighted}
    end
    """

    def __init__(self, redis_client: redis.Redis, limit: int, window_seconds: int):
        self.redis = redis_client
        self.limit = limit
        self.window_seconds = window_seconds
        self.script = self.redis.register_script(self.LUA_SCRIPT)

    def is_allowed(self, key: str) -> tuple[bool, dict]:
        """Check and increment atomically in Redis."""
        now = time.time()
        window = int(now // self.window_seconds)
        elapsed = (now % self.window_seconds) / self.window_seconds

        curr_key = f"ratelimit:{key}:{window}"
        prev_key = f"ratelimit:{key}:{window - 1}"

        result = self.script(
            keys=[curr_key, prev_key],
            args=[self.limit, self.window_seconds, elapsed]
        )

        return result[0] == 1, {
            "remaining": result[1],
            "limit": self.limit,
            "weighted": round(result[2], 2)
        }

Express.js Middleware

const Redis = require('ioredis');

const redis = new Redis(process.env.REDIS_URL);

function slidingWindowLimiter({ limit, windowSeconds, keyGenerator }) {
  const luaScript = `
    local curr_key = KEYS[1]
    local prev_key = KEYS[2]
    local limit = tonumber(ARGV[1])
    local window_size = tonumber(ARGV[2])
    local elapsed = tonumber(ARGV[3])

    local prev_count = tonumber(redis.call('GET', prev_key) or 0)
    local curr_count = tonumber(redis.call('GET', curr_key) or 0)

    local weighted = prev_count * (1 - elapsed) + curr_count

    if weighted < limit then
      redis.call('INCR', curr_key)
      redis.call('EXPIRE', curr_key, window_size * 2)
      return {1, math.floor(limit - weighted - 1)}
    else
      return {0, 0}
    end
  `;

  return async (req, res, next) => {
    const key = keyGenerator(req);
    const now = Date.now() / 1000;
    const window = Math.floor(now / windowSeconds);
    const elapsed = (now % windowSeconds) / windowSeconds;

    const currKey = `ratelimit:${key}:${window}`;
    const prevKey = `ratelimit:${key}:${window - 1}`;

    try {
      const [allowed, remaining] = await redis.eval(
        luaScript,
        2,
        currKey,
        prevKey,
        limit,
        windowSeconds,
        elapsed
      );

      res.set({
        'X-RateLimit-Limit': limit,
        'X-RateLimit-Remaining': remaining,
        'X-RateLimit-Window': windowSeconds,
      });

      if (allowed) {
        next();
      } else {
        res.status(429).json({
          error: 'Rate limit exceeded',
          retryAfter: Math.ceil(windowSeconds * (1 - elapsed)),
        });
      }
    } catch (err) {
      console.error('Rate limiter error:', err);
      next(); // Fail open
    }
  };
}

// Usage: 100 requests per minute, smooth enforcement
app.use('/api/', slidingWindowLimiter({
  limit: 100,
  windowSeconds: 60,
  keyGenerator: (req) => req.user?.id || req.ip,
}));

See It In Action:

Related Concepts:

Quick Self-Check

  • Can explain the boundary burst problem in 30 seconds?
  • Understand the weighted count formula and why it works?
  • Know the trade-off between sliding window and token bucket?
  • Can implement sliding window counter with Redis?
  • Understand when approximation is acceptable vs when you need exact counts?
  • Know why sliding window is preferred for billing/quotas?
Interview Notes
💼55% of rate limiting discussions
Interview Relevance
55% of rate limiting discussions
🏭API rate limiters, CDNs
Production Impact
Powers systems at API rate limiters, CDNs
O(1) per request
Performance
O(1) per request query improvement
📈2 counters per key
Scalability
2 counters per key