API Rate Limiting: Protecting Self-Hosted APIs from Abuse
Without rate limiting, a single client can overwhelm your self-hosted API. Here's how to implement effective rate limiting.
Why Rate Limit?
Without rate limiting, your API is vulnerable to:
- Denial of service: a single misbehaving or malicious client can exhaust CPU, memory, or database connections.
- Brute-force attacks: unthrottled login and token endpoints invite credential stuffing.
- Scraping: bulk extraction of your data at machine speed.
- Runaway costs: bandwidth and compute spent serving traffic you never intended to allow.
Rate Limiting Strategies
Fixed Window
Allow N requests per fixed time window (e.g., 100 requests per minute). Simple to implement, but permits bursts at window boundaries: a client can send 100 requests at the very end of one window and 100 more at the start of the next, 200 requests in a few seconds.
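A minimal in-memory sketch of a fixed window counter (the class name and interface are ours, and this is single-process only; a real deployment would keep counters in something shared like Redis):

```python
import time


class FixedWindowLimiter:
    """Allow at most `limit` requests per `window_seconds` window."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window_seconds = window_seconds
        self.counts = {}  # window index -> request count

    def allow(self, now=None):
        now = time.time() if now is None else now
        # All timestamps in the same window share one counter.
        window = int(now // self.window_seconds)
        count = self.counts.get(window, 0)
        if count >= self.limit:
            return False
        self.counts[window] = count + 1
        return True
```

Note that the counter resets abruptly when a new window starts, which is exactly the boundary-burst weakness described above.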
Sliding Window
Counts requests over a rolling time period rather than fixed buckets, smoothing out the fixed window's boundary-burst problem.
Token Bucket
Tokens are added at a fixed rate. Each request consumes a token. If no tokens are available, the request is rejected. Allows short bursts while enforcing long-term limits.
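The refill logic can be computed lazily on each request instead of with a background timer. A sketch (class name and parameters are ours):

```python
import time


class TokenBucket:
    """Refill `rate` tokens per second up to `capacity`; one token per request."""

    def __init__(self, rate, capacity, now=None):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start full, so bursts are allowed immediately
        self.last = time.time() if now is None else now

    def allow(self, now=None):
        now = time.time() if now is None else now
        # Credit tokens for the time elapsed since the last request, capped.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

Capacity controls the burst size; rate controls the sustained long-term limit.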
Leaky Bucket
Requests are queued and processed at a fixed rate. Smooths traffic but adds latency.
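A simplified sketch of the queueing behavior (names are ours; real implementations drain the queue asynchronously, while this one drains lazily when a request arrives):

```python
import time
from collections import deque


class LeakyBucket:
    """Queue up to `capacity` requests and drain one every 1/rate seconds."""

    def __init__(self, rate, capacity, now=None):
        self.interval = 1.0 / rate  # seconds between processed requests
        self.capacity = capacity
        self.queue = deque()
        self.last_leak = time.time() if now is None else now

    def _drain(self, now):
        # "Process" queued requests at the fixed drain rate.
        while self.queue and now - self.last_leak >= self.interval:
            self.queue.popleft()
            self.last_leak += self.interval
        if not self.queue:
            # Nothing queued: restart the drain clock from now.
            self.last_leak = now

    def submit(self, request, now=None):
        now = time.time() if now is None else now
        self._drain(now)
        if len(self.queue) >= self.capacity:
            return False  # bucket overflow: reject the request
        self.queue.append(request)
        return True
```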
Implementation Levels
Reverse Proxy (Caddy/Nginx)
Rate limit at the proxy level before requests reach your application. Effective for DDoS and brute force protection.
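As a sketch in Nginx, using its built-in `limit_req` module (the zone name, rate, and upstream address here are illustrative):

```nginx
# In the http block: a 10 MB shared zone keyed by client IP,
# allowing a sustained 10 requests/second per address.
limit_req_zone $binary_remote_addr zone=api_limit:10m rate=10r/s;

server {
    location /api/ {
        # Permit short bursts of up to 20 requests, reject the excess
        # immediately (nodelay) instead of queueing it.
        limit_req zone=api_limit burst=20 nodelay;
        limit_req_status 429;
        proxy_pass http://127.0.0.1:8080;
    }
}
```

Nginx's `limit_req` implements the leaky bucket algorithm; `burst` sets the bucket depth.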
Application Middleware
Rate limit within your application code. More granular — different limits per endpoint, per user, per API key.
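A sketch of what that granularity can look like, assuming an in-memory, single-process service (the class, endpoint paths, and per-endpoint limits are all illustrative):

```python
import time
from collections import defaultdict, deque


class PerKeyLimiter:
    """A separate rolling window per (endpoint, API key) pair."""

    def __init__(self, limits):
        # limits: endpoint path -> (max_requests, window_seconds)
        self.limits = limits
        self.history = defaultdict(deque)  # (endpoint, key) -> timestamps

    def allow(self, endpoint, api_key, now=None):
        now = time.time() if now is None else now
        limit, window = self.limits[endpoint]
        log = self.history[(endpoint, api_key)]
        # Evict requests that have aged out of this endpoint's window.
        while log and log[0] <= now - window:
            log.popleft()
        if len(log) >= limit:
            return False
        log.append(now)
        return True
```

In a real service this check would run as middleware before the handler, returning 429 when `allow` is False.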
API Gateway
Dedicated rate limiting service. Best for complex APIs with multiple backends.
Recommended Limits
Public APIs
Start relatively generous, on the order of 60–100 requests per minute per client, and tighten based on observed usage.
Authentication Endpoints
Limit aggressively, e.g. a handful of attempts per minute per IP and per account, since login and token endpoints are prime brute-force targets.
Webhooks
Limit per sender, and prefer queueing over outright rejection where possible, since many webhook providers retry failed deliveries.
Response Headers
Always include rate limit headers so clients can self-regulate. The de facto standard is the X-RateLimit family:
- X-RateLimit-Limit: the maximum number of requests allowed in the current window
- X-RateLimit-Remaining: how many requests the client has left in the window
- X-RateLimit-Reset: when the window resets (typically a Unix timestamp)
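A rejected request might then look like this (all values illustrative):

```http
HTTP/1.1 429 Too Many Requests
Content-Type: application/json
Retry-After: 42
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1700000042

{"error": "rate limit exceeded"}
```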
HTTP Status Codes
Return 429 Too Many Requests when a client exceeds its limit, ideally with a Retry-After header indicating when it may try again. Never silently drop rate-limited requests; always inform the client.