A rate limiting algorithm that smoothly enforces request limits by weighting the previous time window, preventing boundary burst problems
TL;DR
Sliding window counter is a rate limiting algorithm that fixes the boundary-burst problem of fixed windows. It weights the previous window’s count based on how far into the current window you are, providing smoother rate enforcement with minimal memory overhead (just two counters per key). Unlike token bucket, it doesn’t allow bursts—it enforces strict rate limits.
Visual Overview
Core Explanation
What is Sliding Window?
Real-World Analogy: Imagine a highway toll booth that tracks cars per hour. A fixed window system would reset at exactly each hour—so a convoy could race through at 11:59 and again at 12:00. A sliding window instead asks: “In the last 60 minutes from right now, how many cars passed?” It smoothly weights recent and older traffic.
Like looking backward through a sliding 1-hour window that moves with you through time, instead of fixed calendar hours.
How It Works
The algorithm maintains two counters per rate-limited key:
- prev_count: Requests in the previous complete window
- curr_count: Requests in the current ongoing window
The Algorithm
on request:
window = floor(now / window_size)
elapsed = (now % window_size) / window_size
if window != current_window:
# Rotate to new window
prev_count = curr_count
curr_count = 0
current_window = window
weighted_count = prev_count * (1 - elapsed) + curr_count
if weighted_count < limit:
curr_count += 1
return ALLOW
else:
return REJECT
Key Properties
| Property | Value | Notes |
|---|---|---|
| Time Complexity | O(1) per request | Simple arithmetic |
| Space Complexity | O(1) per key | Two counters + window ID |
| Burst Prevention | Yes | No sudden spikes allowed |
| Accuracy | Approximate | Weighted estimate, not exact |
Why Weighted Counting Works
The weighting formula estimates how many requests occurred in the "last window_size seconds," as if looking backward from now:

weighted = prev_count * (1 - elapsed_fraction) + curr_count

It assumes the previous window's requests were spread evenly, so only the fraction of that window still inside the lookback period contributes to the estimate.
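As a minimal sketch (the function name is illustrative), the estimate is a one-liner; the numbers below match the worked example later on this page (80 requests in the previous window, 25 so far, 25% elapsed):

```python
def weighted_count(prev_count: int, curr_count: int, elapsed_fraction: float) -> float:
    """Estimate requests in the trailing window by pro-rating the previous window."""
    return prev_count * (1 - elapsed_fraction) + curr_count

# 25% into the current window: 75% of the previous window still "counts"
estimate = weighted_count(prev_count=80, curr_count=25, elapsed_fraction=0.25)
print(estimate)  # 85.0 -> under a limit of 100, the request is allowed
```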
Sliding Window vs Fixed Window vs Token Bucket
Real Systems Using Sliding Window
| System | Implementation | Configuration | Use Case |
|---|---|---|---|
| Stripe API | Sliding window style | Per-endpoint limits | Payment processing |
| Cloudflare Rate Limiting | Configurable algorithms | Rule-based | Edge protection |
| Redis Rate Limiter | redis-cell module | Configurable | General rate limiting |
| Envoy | Token bucket + sliding window | Filter chain | Service mesh |
| HAProxy | stick-tables | Connection rate limiting | Load balancer |
Note: Many systems offer configurable algorithms. Verify current behavior in documentation.
Configuration Patterns
When to Use Sliding Window
✓ Perfect Use Cases
✕ When NOT to Use
Interview Application
Common Interview Question
Q: “Explain the boundary burst problem in rate limiting and how sliding window solves it.”
Strong Answer:
“The boundary burst problem occurs with fixed window rate limiting. If your window is 1 minute and resets at :00, a client can make 100 requests at 0:59 and 100 more at 1:00—effectively 200 requests in 2 seconds while technically respecting ‘100/minute’.
Sliding window counter solves this by weighting the previous window:
The Formula:
weighted = prev_window × (1 - elapsed_fraction) + curr_window

If we're 25% into the current window, we count 75% of the previous window's requests plus 100% of the current. This approximates 'requests in the last 60 seconds from right now.'
Example: At 1:15 (25% into minute 1):
- Previous window (minute 0): 80 requests
- Current window (minute 1): 25 requests so far
- Weighted: 80 × 0.75 + 25 = 60 + 25 = 85
- If limit is 100: ALLOWED
Trade-offs:
- Pro: Prevents boundary gaming, smooth enforcement
- Con: Approximate (not exact count), no burst tolerance
When I’d use it:
- Billing/quotas where overage isn’t acceptable
- SLA enforcement with partner contracts
When I’d use token bucket instead:
- APIs where legitimate burst traffic is expected (mobile app startup)”
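The contrast can be shown with a short simulation. This is a minimal sketch, not a production limiter: a client fires 100 requests one second before a minute boundary and 100 more just after it, against a 100 requests/minute limit:

```python
LIMIT, WINDOW = 100, 60  # 100 requests per 60-second window

def fixed_allowed(counts: dict, t: float) -> bool:
    """Fixed window: the counter resets at every window boundary."""
    w = int(t // WINDOW)
    if counts.get(w, 0) < LIMIT:
        counts[w] = counts.get(w, 0) + 1
        return True
    return False

def sliding_allowed(state: dict, t: float) -> bool:
    """Sliding window counter: weight the previous window's count."""
    w = int(t // WINDOW)
    elapsed = (t % WINDOW) / WINDOW
    if w != state["window"]:
        state["prev"] = state["curr"] if w == state["window"] + 1 else 0
        state["curr"], state["window"] = 0, w
    if state["prev"] * (1 - elapsed) + state["curr"] < LIMIT:
        state["curr"] += 1
        return True
    return False

fixed_counts = {}
sliding_state = {"prev": 0, "curr": 0, "window": 0}

burst = [59.0] * 100 + [60.0] * 100  # two bursts, one second apart
fixed_ok = sum(fixed_allowed(fixed_counts, t) for t in burst)
sliding_ok = sum(sliding_allowed(sliding_state, t) for t in burst)

print(f"fixed window allowed:   {fixed_ok}")    # 200 -- the boundary burst
print(f"sliding window allowed: {sliding_ok}")  # 100 -- second burst rejected
```

At t=60 the sliding limiter computes weighted = 100 × (1 − 0) + 0 = 100, which is not below the limit, so the second burst is rejected entirely.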
Follow-up: How would you implement sliding window for distributed rate limiting?
“For distributed sliding window, I’d use Redis with atomic operations:
- Key structure: ratelimit:{user}:{window} stores the count per window
- Atomic increment: use INCR with EXPIRE
- Two-key lookup: Fetch current and previous window counts
Lua script approach (atomic):
-- Get both windows atomically
local prev = redis.call('GET', prev_key) or 0
local curr = redis.call('GET', curr_key) or 0
local weighted = prev * (1 - elapsed) + curr
if weighted < limit then
    redis.call('INCR', curr_key)
    redis.call('EXPIRE', curr_key, window_size * 2)
    return 1
end
return 0

Considerations:
- Redis failures: Fail open (allow) or fail closed (reject) based on use case
- Multi-region: Local Redis per region with eventual sync, accept slight inaccuracy
- Clock skew: Use Redis server time, not client time”
Follow-up: What’s the sliding window log variant?
“Sliding window log stores the timestamp of each request instead of just a count. To check the limit, you count timestamps within the last window_size seconds.
Pros: Exact counting, no approximation
Cons: O(N) space per key, O(log N) per request with cleanup
When to use: Regulatory requirements for exact counts, very low-volume endpoints where precision matters.
For high-throughput APIs, the counter variant’s approximation (typically within 1-5% of actual) is acceptable and much more efficient.”
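A minimal single-process sketch of the log variant (class and method names are illustrative); each request's timestamp is kept in a deque and stale entries are evicted on every check:

```python
import time
from collections import deque
from typing import Optional

class SlidingWindowLog:
    """Sliding window log: store each request's timestamp for exact counting."""

    def __init__(self, limit: int, window_seconds: float):
        self.limit = limit
        self.window_seconds = window_seconds
        self.timestamps: deque = deque()

    def is_allowed(self, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        # Evict timestamps that fell out of the trailing window
        cutoff = now - self.window_seconds
        while self.timestamps and self.timestamps[0] <= cutoff:
            self.timestamps.popleft()
        if len(self.timestamps) < self.limit:
            self.timestamps.append(now)
            return True
        return False

limiter = SlidingWindowLog(limit=3, window_seconds=60)
# Three requests at t=0,1,2 allowed; fourth at t=3 rejected;
# by t=61 the t=0 and t=1 entries have expired, so a new request passes.
print([limiter.is_allowed(t) for t in (0, 1, 2, 3, 61)])
# [True, True, True, False, True]
```

Note the cost: memory grows with the number of requests in the window, which is exactly the trade-off the counter variant avoids.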
Code Example
Sliding Window Counter (Python)
import time
from typing import Tuple, Dict
from dataclasses import dataclass
@dataclass
class WindowState:
"""State for sliding window rate limiter."""
prev_count: int = 0
curr_count: int = 0
current_window: int = 0
class SlidingWindowRateLimiter:
"""
Sliding window counter rate limiter.
Prevents boundary burst problem by weighting previous window.
"""
def __init__(self, limit: int, window_seconds: int):
"""
Args:
limit: Maximum requests per window
window_seconds: Window size in seconds
"""
self.limit = limit
self.window_seconds = window_seconds
self.states: Dict[str, WindowState] = {}
def is_allowed(self, key: str) -> Tuple[bool, dict]:
"""
Check if request is allowed under sliding window limit.
Args:
key: Rate limit key (user_id, ip, api_key)
Returns:
(allowed: bool, info: dict)
"""
now = time.time()
window = int(now // self.window_seconds)
elapsed = (now % self.window_seconds) / self.window_seconds
# Get or create state
if key not in self.states:
self.states[key] = WindowState()
state = self.states[key]
# Rotate windows if needed
if window != state.current_window:
if window == state.current_window + 1:
# Normal rotation to next window
state.prev_count = state.curr_count
else:
# Skipped windows (no activity), prev is 0
state.prev_count = 0
state.curr_count = 0
state.current_window = window
# Calculate weighted count
weighted = state.prev_count * (1 - elapsed) + state.curr_count
if weighted < self.limit:
state.curr_count += 1
allowed = True
remaining = int(self.limit - weighted - 1)
else:
allowed = False
remaining = 0
return allowed, {
"remaining": max(0, remaining),
"limit": self.limit,
"window_seconds": self.window_seconds,
"weighted_count": round(weighted, 2)
}
# Usage example
if __name__ == "__main__":
# 5 requests per 10 seconds
limiter = SlidingWindowRateLimiter(
limit=5,
window_seconds=10
)
user = "user_123"
print("Testing sliding window rate limiter (5 req / 10 sec):\n")
# Simulate requests
for i in range(8):
allowed, info = limiter.is_allowed(user)
status = "✓ ALLOWED" if allowed else "✗ REJECTED"
print(f"Request {i+1}: {status}")
print(f" Weighted count: {info['weighted_count']}")
print(f" Remaining: {info['remaining']}\n")
# Wait for partial window rotation
print("Waiting 5 seconds (half a window)...\n")
time.sleep(5)
# More requests - previous window still partially counted
for i in range(3):
allowed, info = limiter.is_allowed(user)
status = "✓ ALLOWED" if allowed else "✗ REJECTED"
print(f"Request {i+9}: {status}")
print(f" Weighted count: {info['weighted_count']}")
print(f" Remaining: {info['remaining']}\n")
Sliding Window with Redis (Production)
import time
import redis
class RedisSlidingWindow:
"""
Distributed sliding window rate limiter using Redis.
Uses two keys per client: previous and current window counts.
"""
LUA_SCRIPT = """
local curr_key = KEYS[1]
local prev_key = KEYS[2]
local limit = tonumber(ARGV[1])
local window_size = tonumber(ARGV[2])
local elapsed = tonumber(ARGV[3])
-- Get counts (default 0)
local prev_count = tonumber(redis.call('GET', prev_key) or 0)
local curr_count = tonumber(redis.call('GET', curr_key) or 0)
-- Calculate weighted count
local weighted = prev_count * (1 - elapsed) + curr_count
if weighted < limit then
-- Increment current window
redis.call('INCR', curr_key)
-- Set expiry (2 windows to ensure prev is available)
redis.call('EXPIRE', curr_key, window_size * 2)
-- Note: Redis converts Lua numbers to integers in replies, so weighted is truncated
return {1, math.floor(limit - weighted - 1), weighted}
else
return {0, 0, weighted}
end
"""
def __init__(self, redis_client: redis.Redis, limit: int, window_seconds: int):
self.redis = redis_client
self.limit = limit
self.window_seconds = window_seconds
self.script = self.redis.register_script(self.LUA_SCRIPT)
def is_allowed(self, key: str) -> tuple[bool, dict]:
"""Check and increment atomically in Redis."""
now = time.time()
window = int(now // self.window_seconds)
elapsed = (now % self.window_seconds) / self.window_seconds
curr_key = f"ratelimit:{key}:{window}"
prev_key = f"ratelimit:{key}:{window - 1}"
result = self.script(
keys=[curr_key, prev_key],
args=[self.limit, self.window_seconds, elapsed]
)
return result[0] == 1, {
"remaining": result[1],
"limit": self.limit,
"weighted": round(result[2], 2)
}
Express.js Middleware
const express = require('express');
const Redis = require('ioredis');

const app = express();
const redis = new Redis(process.env.REDIS_URL);
function slidingWindowLimiter({ limit, windowSeconds, keyGenerator }) {
const luaScript = `
local curr_key = KEYS[1]
local prev_key = KEYS[2]
local limit = tonumber(ARGV[1])
local window_size = tonumber(ARGV[2])
local elapsed = tonumber(ARGV[3])
local prev_count = tonumber(redis.call('GET', prev_key) or 0)
local curr_count = tonumber(redis.call('GET', curr_key) or 0)
local weighted = prev_count * (1 - elapsed) + curr_count
if weighted < limit then
redis.call('INCR', curr_key)
redis.call('EXPIRE', curr_key, window_size * 2)
return {1, math.floor(limit - weighted - 1)}
else
return {0, 0}
end
`;
return async (req, res, next) => {
const key = keyGenerator(req);
const now = Date.now() / 1000;
const window = Math.floor(now / windowSeconds);
const elapsed = (now % windowSeconds) / windowSeconds;
const currKey = `ratelimit:${key}:${window}`;
const prevKey = `ratelimit:${key}:${window - 1}`;
try {
const [allowed, remaining] = await redis.eval(
luaScript,
2,
currKey,
prevKey,
limit,
windowSeconds,
elapsed
);
res.set({
'X-RateLimit-Limit': limit,
'X-RateLimit-Remaining': remaining,
'X-RateLimit-Window': windowSeconds,
});
if (allowed) {
next();
} else {
res.status(429).json({
error: 'Rate limit exceeded',
retryAfter: Math.ceil(windowSeconds * (1 - elapsed)),
});
}
} catch (err) {
console.error('Rate limiter error:', err);
next(); // Fail open
}
};
}
// Usage: 100 requests per minute, smooth enforcement
app.use('/api/', slidingWindowLimiter({
limit: 100,
windowSeconds: 60,
keyGenerator: (req) => req.user?.id || req.ip,
}));
Related Content
See It In Action:
- Rate Limiting Explainer - Compares sliding window with token bucket
Related Concepts:
- Rate Limiting - Parent concept
- Token Bucket - Alternative algorithm (allows bursts)
Quick Self-Check
- Can explain the boundary burst problem in 30 seconds?
- Understand the weighted count formula and why it works?
- Know the trade-off between sliding window and token bucket?
- Can implement sliding window counter with Redis?
- Understand when approximation is acceptable vs when you need exact counts?
- Know why sliding window is preferred for billing/quotas?
Interview Notes
- Comes up in roughly 55% of rate limiting interview discussions
- Used in API rate limiters and CDNs
- O(1) work per request
- Two counters per key