Sliding-Window Rate Limiting for FastAPI with Redis
Protect your Python API from hammering and credential-stuffing with a per-IP and per-token rate limiter backed by Redis — before you ship to production.
What You'll Build
A reusable sliding-window rate limiter wired into FastAPI via Depends(), backed by a Redis sorted set and a single atomic Lua script — with stricter limits on auth routes to block credential-stuffing attempts.
Prerequisites
- Python 3.11+
- Docker (for Redis) or a running Redis 7 instance
- Familiarity with FastAPI Depends() and async/await
- A virtual environment activated
Why sliding window vs. fixed window? A fixed-window counter resets on a clock boundary, letting a burst of 2 × limit requests squeeze through the reset seam. A sliding window counts only requests within the last N seconds relative to now, eliminating that spike.
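The difference can be seen in a small in-memory simulation. This is a sketch only: the two helper functions and the burst timestamps are hypothetical and are not part of the limiter built in this article. Ten requests straddle a 60-second clock boundary with a limit of 5.

```python
from collections import deque


def fixed_window_admitted(timestamps, limit, window):
    """Count requests admitted by a counter that resets on clock boundaries."""
    counts = {}
    admitted = 0
    for t in timestamps:
        bucket = int(t // window)  # which clock-aligned window t falls in
        if counts.get(bucket, 0) < limit:
            counts[bucket] = counts.get(bucket, 0) + 1
            admitted += 1
    return admitted


def sliding_window_admitted(timestamps, limit, window):
    """Count requests admitted when only the last `window` seconds count."""
    recent = deque()
    admitted = 0
    for t in timestamps:
        # Drop admissions that have aged out of the window ending at t.
        while recent and recent[0] <= t - window:
            recent.popleft()
        if len(recent) < limit:
            recent.append(t)
            admitted += 1
    return admitted


# Hypothetical burst: 5 requests just before the 60 s boundary, 5 just after.
burst = [59.0 + i * 0.1 for i in range(5)] + [60.0 + i * 0.1 for i in range(5)]
fixed = fixed_window_admitted(burst, limit=5, window=60)
sliding = sliding_window_admitted(burst, limit=5, window=60)
print(fixed, sliding)  # 10 5
```

The fixed-window counter admits all ten requests because the burst spans two buckets, while the sliding window admits only the first five.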
Step 1 — Start Redis
# docker-compose.yml
services:
  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    command: redis-server --save "" --appendonly no
docker compose up -d
--save "" and --appendonly no disable persistence — appropriate for an ephemeral rate-limit store.
Step 2 — Install Dependencies
python -m venv .venv && source .venv/bin/activate # macOS/Linux
pip install "fastapi>=0.111.0" "uvicorn[standard]>=0.29.0" "redis>=5.0.1"
redis>=5.0.1 is required for the aclose() async teardown API, which was added to the asyncio client in 5.0.1.
Step 3 — Implement the Rate Limiter
Create limiter.py:
import hashlib
import time
import uuid

import redis.asyncio as aioredis
from fastapi import HTTPException, Request, status

# Atomic Lua script: prune stale entries, count, conditionally admit
_SCRIPT = """
local key = KEYS[1]
local now = tonumber(ARGV[1])
local win = tonumber(ARGV[2])
local limit = tonumber(ARGV[3])
local member = ARGV[4]
redis.call('ZREMRANGEBYSCORE', key, 0, now - win)
local count = redis.call('ZCARD', key)
if count < limit then
    redis.call('ZADD', key, now, member)
    redis.call('EXPIRE', key, win)
    return 1
end
redis.call('EXPIRE', key, win)
return 0
"""


def _identifier(request: Request) -> str:
    auth = request.headers.get("Authorization", "")
    if auth.startswith("Bearer "):
        # Hash the token — raw bearer values never reach Redis
        digest = hashlib.sha256(auth[7:].encode()).hexdigest()[:16]
        return f"token:{digest}"
    forwarded = request.headers.get("X-Forwarded-For", "")
    ip = (
        forwarded.split(",")[0].strip()
        if forwarded
        else (request.client.host if request.client else "unknown")
    )
    return f"ip:{ip}"


class RateLimiter:
    """Dependency-injectable sliding-window rate limiter."""

    def __init__(self, requests: int = 60, window: int = 60):
        self.requests = requests
        self.window = window  # seconds

    async def __call__(self, request: Request) -> None:
        r: aioredis.Redis = request.app.state.redis
        key = f"rl:{_identifier(request)}"
        now = int(time.time())
        member = str(uuid.uuid4())
        allowed = await r.eval(
            _SCRIPT, 1, key,
            str(now), str(self.window), str(self.requests), member,
        )
        if not allowed:
            raise HTTPException(
                status_code=status.HTTP_429_TOO_MANY_REQUESTS,
                detail="Rate limit exceeded.",
                headers={"Retry-After": str(self.window)},
            )
Key design decisions:
| Choice | Reason |
|---|---|
| Lua script | Prune + count + insert are atomic; no race conditions |
| Sorted set score = Unix timestamp | O(log N + M) range removal, where M is the number of pruned entries; no separate TTL per entry |
| Token hashing | Raw bearer tokens never stored in Redis |
| Retry-After header | RFC 6585 compliance; helps well-behaved clients back off |
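The Retry-After value in limiter.py is always the full window, which is a safe upper bound. A tighter value could be derived from the score of the oldest entry in the sorted set. The helper below is a minimal sketch under that assumption: retry_after_seconds is a hypothetical name, and it presumes the oldest score has already been fetched (for example with ZRANGE key 0 0 WITHSCORES), which the script in this article does not do.

```python
import math


def retry_after_seconds(oldest_score, now, window):
    """Seconds until the oldest counted request falls out of the window."""
    # The Lua script prunes scores <= now - window, so the oldest entry
    # is removed once now >= oldest_score + window.
    remaining = oldest_score + window - now
    # Never advertise less than one second.
    return max(1, math.ceil(remaining))


# Oldest admission at t=1000, rejected at t=1042, 60-second window.
wait = retry_after_seconds(oldest_score=1000, now=1042, window=60)
print(wait)  # 18
```

Whether the extra Redis read is worth it depends on how much clients actually rely on the header.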
Step 4 — Wire It Into FastAPI
# main.py
from contextlib import asynccontextmanager

import redis.asyncio as aioredis
from fastapi import Depends, FastAPI

from limiter import RateLimiter


@asynccontextmanager
async def lifespan(app: FastAPI):
    app.state.redis = aioredis.from_url(
        "redis://localhost:6379", decode_responses=False
    )
    yield
    await app.state.redis.aclose()


app = FastAPI(lifespan=lifespan)

# Tighter limit on auth routes prevents credential-stuffing
auth_limiter = RateLimiter(requests=5, window=60)
api_limiter = RateLimiter(requests=100, window=60)


@app.post("/auth/login", dependencies=[Depends(auth_limiter)])
async def login():
    return {"status": "authenticated"}


@app.get("/api/data", dependencies=[Depends(api_limiter)])
async def get_data():
    return {"data": "some payload"}
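The lifespan context manager creates the Redis client and its connection pool on startup and closes it on shutdown. Two separate RateLimiter instances let you tune the limit and window for each route class. Note that both instances build the same key, rl:<identifier>, so as written they count against one shared sorted set per client; give each limiter its own key prefix if the routes need independent counters.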
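Because limiter.py builds its key from the client identifier alone, requests admitted by one limiter are also counted by the other. The in-memory model below is a sketch of that behaviour and of one possible fix. InMemoryLimiter, its scope parameter, and login_allowed_after_api_calls are hypothetical names introduced for illustration; the model mirrors the prune, count, and admit steps of the Lua script without talking to Redis.

```python
class InMemoryLimiter:
    """In-memory stand-in for the Lua script: prune, count, conditionally admit."""

    def __init__(self, store, requests, window, scope=""):
        self.store = store        # dict: key -> list of admission timestamps
        self.requests = requests
        self.window = window
        self.scope = scope        # hypothetical per-limiter key prefix

    def allow(self, identifier, now):
        prefix = f"{self.scope}:" if self.scope else ""
        key = f"rl:{prefix}{identifier}"
        # Prune entries with score <= now - window, as ZREMRANGEBYSCORE does.
        live = [t for t in self.store.get(key, []) if t > now - self.window]
        admitted = len(live) < self.requests
        if admitted:
            live.append(now)
        self.store[key] = live
        return admitted


def login_allowed_after_api_calls(auth_scope, api_scope):
    """Make five API calls, then report whether one login is still admitted."""
    store = {}
    auth = InMemoryLimiter(store, requests=5, window=60, scope=auth_scope)
    api = InMemoryLimiter(store, requests=100, window=60, scope=api_scope)
    for i in range(5):
        api.allow("ip:127.0.0.1", now=1000 + i)   # five /api/data calls
    return auth.allow("ip:127.0.0.1", now=1005)   # then one /auth/login


unscoped = login_allowed_after_api_calls("", "")
scoped = login_allowed_after_api_calls("auth", "api")
print(unscoped, scoped)  # False True
```

With a shared key, five API calls use up the login budget. With a per-limiter prefix, the two counters are independent. If you adopt a prefix in limiter.py, remember that the key names you inspect with redis-cli change accordingly.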
Verify It Works
uvicorn main:app --reload
Test the auth endpoint. The 6th and 7th requests should return 429:
for i in $(seq 1 7); do
  curl -s -o /dev/null -w "%{http_code}\n" \
    -X POST http://localhost:8000/auth/login
done
Expected output:
200
200
200
200
200
429
429
Inspect the sorted set directly:
docker exec -it $(docker compose ps -q redis) \
  redis-cli ZRANGE "rl:ip:127.0.0.1" 0 -1 WITHSCORES
You should see up to 5 UUID members with Unix-timestamp scores.
Troubleshooting
AttributeError: 'NoneType' object has no attribute 'host'
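request.client can be None behind some reverse proxies. The _identifier helper in limiter.py already guards against this, so the error usually comes from other code that reads request.client.host directly. To have request.client populated from forwarded headers, add uvicorn's proxy-headers middleware, but only after locking down which upstream IPs you trust; otherwise clients can spoof X-Forwarded-For: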
from uvicorn.middleware.proxy_headers import ProxyHeadersMiddleware
app.add_middleware(ProxyHeadersMiddleware, trusted_hosts=["10.0.0.1"])
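The trust rule can also be expressed directly when picking the client address. The function below is a rough sketch, not uvicorn's implementation: client_ip and its parameters are hypothetical names, and the trusted list is assumed to hold the proxy addresses or CIDR ranges you control. It differs from the _identifier helper in limiter.py, which takes the leftmost X-Forwarded-For entry from any caller.

```python
import ipaddress


def client_ip(peer, forwarded_for, trusted_proxies):
    """Return the client IP, honoring X-Forwarded-For only from trusted peers."""
    trusted = [ipaddress.ip_network(p) for p in trusted_proxies]

    def is_trusted(addr):
        try:
            ip = ipaddress.ip_address(addr)
        except ValueError:
            return False  # not an IP literal, so never trusted
        return any(ip in net for net in trusted)

    # Ignore the header entirely unless the direct peer is a known proxy.
    if not forwarded_for or not is_trusted(peer):
        return peer
    # Walk the chain right to left; the first untrusted hop is the client.
    hops = [h.strip() for h in forwarded_for.split(",") if h.strip()]
    for hop in reversed(hops):
        if not is_trusted(hop):
            return hop
    return hops[0] if hops else peer


proxies = ["10.0.0.1/32"]
direct = client_ip("203.0.113.9", "1.2.3.4", proxies)
proxied = client_ip("10.0.0.1", "6.6.6.6, 198.51.100.7", proxies)
print(direct, proxied)  # 203.0.113.9 198.51.100.7
```

A header sent by an untrusted peer is ignored, and a forged leftmost entry is skipped because only the hop appended by the trusted proxy is used.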
redis.exceptions.ConnectionError: Error 111 connecting to localhost:6379
Redis isn't reachable. Confirm the container is running (docker compose ps) and port 6379 isn't already occupied by another process.
Rate limits reset unexpectedly under memory pressure
If Redis has a maxmemory-policy that evicts keys (e.g., allkeys-lru), rate-limit keys can disappear early. Use a dedicated Redis instance or set --maxmemory-policy noeviction for the rate-limit store.
All callers hit limits instantly
If request.client is None and X-Forwarded-For is absent, every request falls back to the identifier "ip:unknown" and shares one counter. Add proxy-headers middleware or verify your load balancer injects the real client IP.
Next Steps
- Add X-RateLimit-Remaining / X-RateLimit-Reset response headers by reading ZCARD after admission and returning them via a custom Response.
- Switch to evalsha: load the script once with SCRIPT LOAD and call it by SHA to avoid resending the Lua source on every request at high RPS.
- Authenticated per-user limits: after JWT validation, swap the identifier to the sub claim for per-account accounting instead of per-IP.
- Explore slowapi: a production-grade FastAPI rate-limiting library that packages these ideas with decorator syntax and pluggable key functions.
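As a starting point for the first item, the header values can be computed from numbers the sorted set already holds. This is a sketch under assumptions: rate_limit_headers is a hypothetical helper, the count and oldest score are presumed to have been read from Redis separately (ZCARD and the first entry's score), and X-RateLimit-Limit is an extra header added here for illustration.

```python
import math


def rate_limit_headers(limit, count, oldest_score, window):
    """Build informational headers from the set's size and oldest score."""
    return {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(max(0, limit - count)),
        # Unix time at which the oldest counted request leaves the window.
        "X-RateLimit-Reset": str(math.ceil(oldest_score + window)),
    }


# Three of five requests used; the oldest was admitted at t=1000.
headers = rate_limit_headers(limit=5, count=3, oldest_score=1000, window=60)
print(headers["X-RateLimit-Remaining"], headers["X-RateLimit-Reset"])  # 2 1060
```

These header names are a widely used convention rather than a standard, so check what your clients expect before settling on the exact set.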
Ji-ho covers the increasingly tangled overlap between cloud architecture and security, drawing on a background as a penetration tester to keep his reporting grounded in real-world attack paths. He never lets a vendor claim go unquestioned and insists that every buzzword come with a proof of concept.