Meta Description: Stop API abuse with rate limiting, throttling, and quotas. Learn algorithms, implementation strategies, and how to communicate limits to clients.
Keywords: api abuse prevention, rate limiting, api throttling, api quotas, abuse detection, api protection
Word Count: ~2,300 words
Your API is getting hammered. One client is making 100,000 requests per hour. Your servers are struggling. Other clients are getting slow responses.
You need to prevent API abuse.
Rate limiting, throttling, and quotas protect your API from excessive usage. Here's how to implement them correctly.
Rate Limiting vs Throttling vs Quotas
These terms are often confused. Here's the difference:
Rate Limiting: Maximum requests per time window (e.g., 100 requests per minute)
Throttling: Slowing down requests when limits are approached (delayed responses)
Quotas: Total usage limits over longer periods (e.g., 1 million requests per month)
Most APIs use all three.
Rate Limiting Strategies
Strategy 1: Fixed Window
Count requests in fixed time windows.
const rateLimits = new Map(); // userId -> { count, resetTime }

function checkRateLimit(userId, limit = 100, windowSeconds = 60) {
  const now = Date.now();
  const windowStart = Math.floor(now / (windowSeconds * 1000)) * windowSeconds * 1000;
  const resetAt = windowStart + (windowSeconds * 1000);
  const userLimit = rateLimits.get(userId);

  if (!userLimit || userLimit.resetTime !== windowStart) {
    // New window
    rateLimits.set(userId, { count: 1, resetTime: windowStart });
    return { allowed: true, remaining: limit - 1, resetAt };
  }

  if (userLimit.count >= limit) {
    return { allowed: false, remaining: 0, resetAt };
  }

  userLimit.count++;
  return { allowed: true, remaining: limit - userLimit.count, resetAt };
}
Pros: Simple, low memory
Cons: Bursts at window boundaries can briefly allow up to twice the limit
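To see the boundary problem concretely, here is a small self-contained sketch (`fixedWindowCount` is an illustrative helper, not part of the code above): a client sending 100 requests just before the minute boundary and 100 just after stays within every window, yet delivers 200 requests in about two seconds.

```javascript
// With a limit of 100 requests/minute, a client can send 100 requests
// at 0:59 and another 100 at 1:01 -- 200 requests in about two seconds --
// without exceeding either fixed window.
function fixedWindowCount(timestampsMs, windowSeconds = 60) {
  const counts = new Map();
  for (const t of timestampsMs) {
    const window = Math.floor(t / (windowSeconds * 1000));
    counts.set(window, (counts.get(window) || 0) + 1);
  }
  return counts;
}

// 100 requests just before the minute mark, 100 just after
const burstTimestamps = [
  ...Array.from({ length: 100 }, () => 59000),
  ...Array.from({ length: 100 }, () => 61000)
];
const perWindow = fixedWindowCount(burstTimestamps);
console.log([...perWindow.values()]); // each window sees only 100 requests
```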
Strategy 2: Sliding Window
Count requests in a rolling time window.
// Using Redis sorted sets (an ioredis-style client is assumed)
async function checkRateLimitSlidingWindow(userId, limit = 100, windowSeconds = 60) {
  const now = Date.now();
  const windowStart = now - (windowSeconds * 1000);
  const key = `rate_limit:${userId}`;

  // Remove entries older than the window
  await redis.zremrangebyscore(key, 0, windowStart);

  // Count requests in the window
  const count = await redis.zcard(key);
  if (count >= limit) {
    return { allowed: false, remaining: 0 };
  }

  // Add the current request. Note: this check-then-add is not atomic;
  // under heavy concurrency, wrap these calls in MULTI/EXEC or a Lua
  // script so parallel requests cannot race past the limit.
  await redis.zadd(key, now, `${now}-${Math.random()}`);
  await redis.expire(key, windowSeconds);

  return { allowed: true, remaining: limit - count - 1 };
}
Pros: No burst problem, accurate
Cons: Higher memory usage (one entry stored per request)
Strategy 3: Token Bucket
Tokens refill at a constant rate. Each request consumes a token.
class TokenBucket {
  constructor(capacity, refillRate) {
    this.capacity = capacity;
    this.tokens = capacity;
    this.refillRate = refillRate; // tokens per second
    this.lastRefill = Date.now();
  }

  refill() {
    const now = Date.now();
    const timePassed = (now - this.lastRefill) / 1000;
    const tokensToAdd = timePassed * this.refillRate;
    this.tokens = Math.min(this.capacity, this.tokens + tokensToAdd);
    this.lastRefill = now;
  }

  consume(tokens = 1) {
    this.refill();
    if (this.tokens >= tokens) {
      this.tokens -= tokens;
      return { allowed: true, remaining: Math.floor(this.tokens) };
    }
    return {
      allowed: false,
      remaining: 0,
      retryAfter: (tokens - this.tokens) / this.refillRate // seconds
    };
  }
}

const buckets = new Map();

function checkRateLimit(userId) {
  if (!buckets.has(userId)) {
    buckets.set(userId, new TokenBucket(100, 10)); // 100 capacity, 10/sec refill
  }
  return buckets.get(userId).consume();
}
Pros: Handles bursts gracefully, smooth limiting
Cons: More complex
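The burst behavior is easy to demonstrate with a condensed, clock-injected variant of the bucket above (`makeBucket` is an illustrative name; time is passed in explicitly so the run is deterministic):

```javascript
// Condensed token bucket with an injectable clock (a variation on the
// class above), so the burst behavior can be shown deterministically.
function makeBucket(capacity, refillRate) {
  let tokens = capacity;
  let last = 0;
  return function tryRequest(nowSeconds) {
    tokens = Math.min(capacity, tokens + (nowSeconds - last) * refillRate);
    last = nowSeconds;
    if (tokens >= 1) {
      tokens -= 1;
      return true;
    }
    return false;
  };
}

const tryRequest = makeBucket(10, 1); // capacity 10, refills 1 token/sec

// A burst of 10 requests at t=0 is allowed; the 11th is rejected
const burst = Array.from({ length: 11 }, () => tryRequest(0));
// After one second a token has refilled, so a request succeeds again
const afterRefill = tryRequest(1);
console.log(burst.filter(Boolean).length, afterRefill); // 10 true
```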
Strategy 4: Leaky Bucket
Requests enter a queue and are processed at a constant rate.
class LeakyBucket {
  constructor(capacity, leakRate) {
    this.capacity = capacity;
    this.queue = [];
    this.leakRate = leakRate; // requests per second
    this.lastLeak = Date.now();
  }

  leak() {
    const now = Date.now();
    const timePassed = (now - this.lastLeak) / 1000;
    const requestsToLeak = Math.floor(timePassed * this.leakRate);
    if (requestsToLeak > 0) {
      this.queue.splice(0, requestsToLeak);
      this.lastLeak = now;
    }
  }

  add(request) {
    this.leak();
    if (this.queue.length >= this.capacity) {
      return { allowed: false, queueSize: this.queue.length };
    }
    this.queue.push(request);
    return { allowed: true, queueSize: this.queue.length };
  }
}
Pros: Smooth traffic, prevents bursts
Cons: Adds latency
Implementing Rate Limiting
Middleware Approach
const rateLimit = require('express-rate-limit');

// Global rate limit
const globalLimiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 1000, // 1000 requests per window
  standardHeaders: true, // Return rate limit info in headers
  legacyHeaders: false,
  handler: (req, res) => {
    res.status(429).json({
      error: 'Too many requests',
      // resetTime is a Date; report seconds until the client may retry
      retryAfter: Math.ceil((req.rateLimit.resetTime - Date.now()) / 1000)
    });
  }
});

app.use('/v1/', globalLimiter);

// Endpoint-specific limits: only failed logins count toward this limit
const authLimiter = rateLimit({
  windowMs: 15 * 60 * 1000,
  max: 5,
  skipSuccessfulRequests: true
});

app.post('/v1/auth/login', authLimiter, async (req, res) => {
  // Handle login
});
Per-User Rate Limiting
async function userRateLimiter(req, res, next) {
  const userId = req.auth?.userId || req.ip;
  const limit = await checkRateLimit(userId);

  // Set headers (X-RateLimit-Reset is conventionally Unix seconds)
  res.setHeader('X-RateLimit-Limit', '100');
  res.setHeader('X-RateLimit-Remaining', limit.remaining.toString());
  if (limit.resetAt) {
    res.setHeader('X-RateLimit-Reset', Math.floor(limit.resetAt / 1000).toString());
  }

  if (!limit.allowed) {
    return res.status(429).json({
      error: 'Rate limit exceeded',
      retryAfter: Math.ceil((limit.resetAt - Date.now()) / 1000)
    });
  }

  next();
}
app.use(userRateLimiter);
Tiered Rate Limits
Different limits for different user tiers:
const RATE_LIMITS = {
  free: { requests: 100, window: 3600 }, // 100/hour
  pro: { requests: 1000, window: 3600 }, // 1000/hour
  enterprise: { requests: 10000, window: 3600 } // 10000/hour
};

async function tierBasedRateLimiter(req, res, next) {
  const user = await getUser(req.auth.userId);
  const tier = user.tier || 'free';
  const limits = RATE_LIMITS[tier];

  const result = await checkRateLimit(
    req.auth.userId,
    limits.requests,
    limits.window
  );

  res.setHeader('X-RateLimit-Limit', limits.requests.toString());
  res.setHeader('X-RateLimit-Remaining', result.remaining.toString());

  if (!result.allowed) {
    return res.status(429).json({
      error: 'Rate limit exceeded',
      tier: tier,
      limit: limits.requests,
      upgradeUrl: 'https://petstoreapi.com/pricing'
    });
  }

  next();
}
Quotas
Quotas limit total usage over longer periods (day, month, year).
Implementing Monthly Quotas
async function checkQuota(userId) {
  const month = new Date().toISOString().slice(0, 7); // "2026-03"
  const key = `quota:${userId}:${month}`;

  // First moment of the next month, when the quota resets
  const resetDate = new Date(month + '-01');
  resetDate.setUTCMonth(resetDate.getUTCMonth() + 1);

  const usage = parseInt(await redis.get(key), 10) || 0;
  const user = await getUser(userId);
  const quota = user.monthlyQuota || 10000;

  if (usage >= quota) {
    return { allowed: false, used: usage, quota: quota, resetDate: resetDate };
  }

  await redis.incr(key);
  await redis.expireat(key, Math.floor(resetDate.getTime() / 1000));

  return { allowed: true, used: usage + 1, quota: quota };
}
async function quotaMiddleware(req, res, next) {
  const quota = await checkQuota(req.auth.userId);

  res.setHeader('X-Quota-Used', quota.used.toString());
  res.setHeader('X-Quota-Limit', quota.quota.toString());
  res.setHeader('X-Quota-Remaining', (quota.quota - quota.used).toString());

  if (!quota.allowed) {
    return res.status(429).json({
      error: 'Monthly quota exceeded',
      used: quota.used,
      quota: quota.quota,
      resetDate: new Date(quota.resetDate).toISOString()
    });
  }

  next();
}
Throttling
Slow down requests instead of blocking them.
Delay-Based Throttling
async function throttleMiddleware(req, res, next) {
  const userId = req.auth.userId;
  const usage = await getUsageRate(userId);

  // If usage is high, add delay
  if (usage > 80) { // 80% of limit
    const delay = Math.min((usage - 80) * 100, 5000); // Max 5 second delay
    await new Promise(resolve => setTimeout(resolve, delay));
  }

  next();
}
Response Throttling
async function throttleResponse(req, res, next) {
  const userId = req.auth.userId;
  const usage = await getUsageRate(userId);

  if (usage > 90) {
    // Return partial data
    req.throttled = true;
    req.maxResults = 10; // Limit results
  }

  next();
}

app.get('/v1/pets', throttleResponse, async (req, res) => {
  const limit = req.throttled ? req.maxResults : 100;
  const pets = await db.pets.findMany({ limit });

  if (req.throttled) {
    res.setHeader('X-Throttled', 'true');
  }

  res.json({ data: pets });
});
Abuse Detection
Detect and block abusive patterns.
Pattern Detection
async function detectAbuse(userId) {
  const patterns = await analyzeRequestPatterns(userId);

  // Detect scraping
  if (patterns.sequentialIds > 100) {
    return { abuse: true, reason: 'Sequential ID scraping detected' };
  }

  // Detect brute force
  if (patterns.failedLogins > 10) {
    return { abuse: true, reason: 'Brute force attack detected' };
  }

  // Detect bot behavior
  if (patterns.requestsPerSecond > 50) {
    return { abuse: true, reason: 'Bot-like behavior detected' };
  }

  return { abuse: false };
}

async function abuseDetectionMiddleware(req, res, next) {
  const abuse = await detectAbuse(req.auth.userId);

  if (abuse.abuse) {
    await blockUser(req.auth.userId, abuse.reason);
    return res.status(403).json({
      error: 'Account suspended',
      reason: abuse.reason,
      contact: 'support@petstoreapi.com'
    });
  }

  next();
}
Communicating Limits
Standard Headers
res.setHeader('X-RateLimit-Limit', '100');
res.setHeader('X-RateLimit-Remaining', '95');
res.setHeader('X-RateLimit-Reset', '1678886400');
res.setHeader('Retry-After', '60'); // Seconds until retry
Error Responses
{
  "error": "Rate limit exceeded",
  "message": "You have exceeded the rate limit of 100 requests per hour",
  "limit": 100,
  "remaining": 0,
  "resetAt": "2026-03-13T11:00:00Z",
  "retryAfter": 3600,
  "documentation": "https://docs.petstoreapi.com/rate-limits"
}
Documentation
Document your limits clearly:
## Rate Limits
### Free Tier
- 100 requests per hour
- 10,000 requests per month
### Pro Tier
- 1,000 requests per hour
- 100,000 requests per month
### Enterprise Tier
- 10,000 requests per hour
- Unlimited monthly requests
### Headers
- `X-RateLimit-Limit`: Maximum requests per window
- `X-RateLimit-Remaining`: Requests remaining
- `X-RateLimit-Reset`: Unix timestamp when limit resets
### Handling Rate Limits
When you exceed the limit, you'll receive a 429 status code.
Wait for the time specified in `Retry-After` header before retrying.
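On the client side, the same advice translates into a retry helper. Here is a sketch (`fetchWithRetry` is an illustrative name, not part of any particular SDK; it assumes the global `fetch` available in Node 18+ and browsers):

```javascript
// Client-side sketch: honor Retry-After on 429 responses, falling back
// to exponential backoff when the header is missing.
async function fetchWithRetry(url, options = {}, maxRetries = 3) {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const res = await fetch(url, options);
    if (res.status !== 429) return res;
    if (attempt === maxRetries) break;

    // Retry-After is in seconds; back off exponentially if it is absent
    const retryAfter = parseInt(res.headers.get('Retry-After'), 10);
    const delayMs = Number.isFinite(retryAfter)
      ? retryAfter * 1000
      : Math.min(1000 * 2 ** attempt, 30000);
    await new Promise(resolve => setTimeout(resolve, delayMs));
  }
  throw new Error('Rate limit retries exhausted');
}
```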
Best Practices
1. Be Generous
Set limits high enough for legitimate use. Don't frustrate real users.
2. Provide Clear Feedback
Always include rate limit headers. Tell users when they can retry.
3. Tier Your Limits
Offer higher limits for paid tiers. This monetizes heavy usage.
4. Monitor and Adjust
Track limit violations. Adjust limits based on actual usage patterns.
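A minimal way to start is sketched below with an in-memory counter (`recordViolation` and `frequentViolators` are illustrative names; in production this would feed a real metrics system rather than a Map):

```javascript
// Sketch: count rate limit violations per user so limits can be tuned
// from data. recordViolation would be called wherever a 429 is returned.
const violationCounts = new Map();

function recordViolation(userId) {
  violationCounts.set(userId, (violationCounts.get(userId) || 0) + 1);
}

// Users over the threshold may need a higher tier, or closer scrutiny
function frequentViolators(threshold = 10) {
  return [...violationCounts.entries()]
    .filter(([, count]) => count > threshold)
    .map(([userId]) => userId);
}
```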
5. Whitelist Internal Services
Don't rate limit your own services:
function shouldRateLimit(req) {
  // Skip rate limiting for internal services
  // (compare secrets with crypto.timingSafeEqual in production)
  if (req.headers['x-internal-service'] === process.env.INTERNAL_SECRET) {
    return false;
  }
  return true;
}
6. Use Multiple Layers
Combine rate limiting, quotas, and abuse detection for comprehensive protection.
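How the layers fit together can be sketched without a framework. The `compose` helper and the three checks below are simplified stand-ins for the middleware shown earlier, ordered so the cheapest, most decisive checks run first:

```javascript
// Minimal pipeline showing the layering idea. Each stand-in either
// calls next() to pass the request to the following layer, or sets a
// status to block it.
function compose(middlewares) {
  return async (req, res) => {
    for (const mw of middlewares) {
      let passed = false;
      await mw(req, res, () => { passed = true; });
      if (!passed) return res; // a layer blocked the request
    }
    res.status = 200; // all layers passed
    return res;
  };
}

// Cheapest, most decisive checks first
const abuseCheck = (req, res, next) =>
  req.flaggedAbusive ? (res.status = 403) : next();
const rateLimitCheck = (req, res, next) =>
  req.overRateLimit ? (res.status = 429) : next();
const quotaCheck = (req, res, next) =>
  req.overQuota ? (res.status = 429) : next();

const protect = compose([abuseCheck, rateLimitCheck, quotaCheck]);
```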
Summary
Rate limiting protects your API from abuse while allowing legitimate usage. Implement multiple strategies:
- Rate limiting: Prevent short-term abuse
- Quotas: Control long-term usage
- Throttling: Gracefully degrade service
- Abuse detection: Block malicious patterns
Choose algorithms based on your needs:
- Fixed window: Simple, good enough for most cases
- Sliding window: More accurate, prevents bursts
- Token bucket: Handles bursts gracefully
- Leaky bucket: Smooths traffic
Always communicate limits clearly through headers, error messages, and documentation.