Meta Description: Stop API abuse with rate limiting, throttling, and quotas. Learn algorithms, implementation strategies, and how to communicate limits to clients.
Keywords: api abuse prevention, rate limiting, api throttling, api quotas, abuse detection, api protection
Word Count: ~2,300 words
Your API is getting hammered. One client is making 100,000 requests per hour. Your servers are struggling. Other clients are getting slow responses.
You need to prevent API abuse.
Rate limiting, throttling, and quotas protect your API from excessive usage. Here's how to implement them correctly.
Rate Limiting vs Throttling vs Quotas
These terms are often confused. Here's the difference:
Rate Limiting: Maximum requests per time window (e.g., 100 requests per minute)
Throttling: Slowing down requests when limits are approached (delayed responses)
Quotas: Total usage limits over longer periods (e.g., 1 million requests per month)
Most APIs use all three.
Rate Limiting Strategies
Strategy 1: Fixed Window
Count requests in fixed time windows.
const rateLimits = new Map(); // userId -> { count, resetTime }

function checkRateLimit(userId, limit = 100, windowSeconds = 60) {
  const now = Date.now();
  const windowStart = Math.floor(now / (windowSeconds * 1000)) * windowSeconds * 1000;
  const resetAt = windowStart + (windowSeconds * 1000);
  const userLimit = rateLimits.get(userId);

  if (!userLimit || userLimit.resetTime !== windowStart) {
    // New window
    rateLimits.set(userId, { count: 1, resetTime: windowStart });
    return { allowed: true, remaining: limit - 1, resetAt };
  }

  if (userLimit.count >= limit) {
    return { allowed: false, remaining: 0, resetAt };
  }

  userLimit.count++;
  return { allowed: true, remaining: limit - userLimit.count, resetAt };
}
Pros: Simple, low memory
Cons: Bursts at window boundaries can briefly allow up to twice the limit
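To see the boundary problem concretely, here is a small self-contained sketch (`fixedWindowCount` is an illustrative helper, not part of the code above): a client sending 100 requests just before the minute boundary and 100 just after stays within every window, yet delivers 200 requests in about two seconds.

```javascript
// With a limit of 100 requests/minute, a client can send 100 requests
// at 0:59 and another 100 at 1:01 -- 200 requests in about two seconds --
// without exceeding either fixed window.
function fixedWindowCount(timestampsMs, windowSeconds = 60) {
  const counts = new Map();
  for (const t of timestampsMs) {
    const window = Math.floor(t / (windowSeconds * 1000));
    counts.set(window, (counts.get(window) || 0) + 1);
  }
  return counts;
}

// 100 requests just before the minute mark, 100 just after
const burstTimestamps = [
  ...Array.from({ length: 100 }, () => 59000),
  ...Array.from({ length: 100 }, () => 61000)
];
const perWindow = fixedWindowCount(burstTimestamps);
console.log([...perWindow.values()]); // each window sees only 100 requests
```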
Strategy 2: Sliding Window
Count requests in a rolling time window.
// Using Redis sorted sets (an ioredis-style client is assumed)
async function checkRateLimitSlidingWindow(userId, limit = 100, windowSeconds = 60) {
  const now = Date.now();
  const windowStart = now - (windowSeconds * 1000);
  const key = `rate_limit:${userId}`;

  // Remove entries older than the window
  await redis.zremrangebyscore(key, 0, windowStart);

  // Count requests in the window
  const count = await redis.zcard(key);
  if (count >= limit) {
    return { allowed: false, remaining: 0 };
  }

  // Add the current request. Note: this check-then-add is not atomic;
  // under heavy concurrency, wrap these calls in MULTI/EXEC or a Lua
  // script so parallel requests cannot race past the limit.
  await redis.zadd(key, now, `${now}-${Math.random()}`);
  await redis.expire(key, windowSeconds);

  return { allowed: true, remaining: limit - count - 1 };
}
Pros: No burst problem, accurate
Cons: Higher memory usage (one entry stored per request)
Strategy 3: Token Bucket
Tokens refill at a constant rate. Each request consumes a token.
class TokenBucket {
  constructor(capacity, refillRate) {
    this.capacity = capacity;
    this.tokens = capacity;
    this.refillRate = refillRate; // tokens per second
    this.lastRefill = Date.now();
  }

  refill() {
    const now = Date.now();
    const timePassed = (now - this.lastRefill) / 1000;
    const tokensToAdd = timePassed * this.refillRate;
    this.tokens = Math.min(this.capacity, this.tokens + tokensToAdd);
    this.lastRefill = now;
  }

  consume(tokens = 1) {
    this.refill();
    if (this.tokens >= tokens) {
      this.tokens -= tokens;
      return { allowed: true, remaining: Math.floor(this.tokens) };
    }
    return {
      allowed: false,
      remaining: 0,
      retryAfter: (tokens - this.tokens) / this.refillRate // seconds
    };
  }
}

const buckets = new Map();

function checkRateLimit(userId) {
  if (!buckets.has(userId)) {
    buckets.set(userId, new TokenBucket(100, 10)); // 100 capacity, 10/sec refill
  }
  return buckets.get(userId).consume();
}
Pros: Handles bursts gracefully, smooth limiting
Cons: More complex
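The burst behavior is easy to demonstrate with a condensed, clock-injected variant of the bucket above (`makeBucket` is an illustrative name; time is passed in explicitly so the run is deterministic):

```javascript
// Condensed token bucket with an injectable clock (a variation on the
// class above), so the burst behavior can be shown deterministically.
function makeBucket(capacity, refillRate) {
  let tokens = capacity;
  let last = 0;
  return function tryRequest(nowSeconds) {
    tokens = Math.min(capacity, tokens + (nowSeconds - last) * refillRate);
    last = nowSeconds;
    if (tokens >= 1) {
      tokens -= 1;
      return true;
    }
    return false;
  };
}

const tryRequest = makeBucket(10, 1); // capacity 10, refills 1 token/sec

// A burst of 10 requests at t=0 is allowed; the 11th is rejected
const burst = Array.from({ length: 11 }, () => tryRequest(0));
// After one second a token has refilled, so a request succeeds again
const afterRefill = tryRequest(1);
console.log(burst.filter(Boolean).length, afterRefill); // 10 true
```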
Strategy 4: Leaky Bucket
Requests enter a queue and are processed at a constant rate.
class LeakyBucket {
  constructor(capacity, leakRate) {
    this.capacity = capacity;
    this.queue = [];
    this.leakRate = leakRate; // requests per second
    this.lastLeak = Date.now();
  }

  leak() {
    const now = Date.now();
    const timePassed = (now - this.lastLeak) / 1000;
    const requestsToLeak = Math.floor(timePassed * this.leakRate);
    if (requestsToLeak > 0) {
      this.queue.splice(0, requestsToLeak);
      this.lastLeak = now;
    }
  }

  add(request) {
    this.leak();
    if (this.queue.length >= this.capacity) {
      return { allowed: false, queueSize: this.queue.length };
    }
    this.queue.push(request);
    return { allowed: true, queueSize: this.queue.length };
  }
}
Pros: Smooth traffic, prevents bursts
Cons: Adds latency
Implementing Rate Limiting
Middleware Approach
const rateLimit = require('express-rate-limit');

// Global rate limit
const globalLimiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 1000, // 1000 requests per window
  standardHeaders: true, // Return rate limit info in headers
  legacyHeaders: false,
  handler: (req, res) => {
    res.status(429).json({
      error: 'Too many requests',
      // resetTime is a Date; report seconds until the client may retry
      retryAfter: Math.ceil((req.rateLimit.resetTime - Date.now()) / 1000)
    });
  }
});

app.use('/v1/', globalLimiter);

// Endpoint-specific limits: only failed logins count toward this limit
const authLimiter = rateLimit({
  windowMs: 15 * 60 * 1000,
  max: 5,
  skipSuccessfulRequests: true
});

app.post('/v1/auth/login', authLimiter, async (req, res) => {
  // Handle login
});
Per-User Rate Limiting
async function userRateLimiter(req, res, next) {
  const userId = req.auth?.userId || req.ip;
  const limit = await checkRateLimit(userId);

  // Set headers (X-RateLimit-Reset is conventionally Unix seconds)
  res.setHeader('X-RateLimit-Limit', '100');
  res.setHeader('X-RateLimit-Remaining', limit.remaining.toString());
  if (limit.resetAt) {
    res.setHeader('X-RateLimit-Reset', Math.floor(limit.resetAt / 1000).toString());
  }

  if (!limit.allowed) {
    return res.status(429).json({
      error: 'Rate limit exceeded',
      retryAfter: Math.ceil((limit.resetAt - Date.now()) / 1000)
    });
  }

  next();
}
app.use(userRateLimiter);
Tiered Rate Limits
Different limits for different user tiers:
const RATE_LIMITS = {
  free: { requests: 100, window: 3600 }, // 100/hour
  pro: { requests: 1000, window: 3600 }, // 1000/hour
  enterprise: { requests: 10000, window: 3600 } // 10000/hour
};

async function tierBasedRateLimiter(req, res, next) {
  const user = await getUser(req.auth.userId);
  const tier = user.tier || 'free';
  const limits = RATE_LIMITS[tier];

  const result = await checkRateLimit(
    req.auth.userId,
    limits.requests,
    limits.window
  );

  res.setHeader('X-RateLimit-Limit', limits.requests.toString());
  res.setHeader('X-RateLimit-Remaining', result.remaining.toString());

  if (!result.allowed) {
    return res.status(429).json({
      error: 'Rate limit exceeded',
      tier: tier,
      limit: limits.requests,
      upgradeUrl: 'https://petstoreapi.com/pricing'
    });
  }

  next();
}
Quotas
Quotas limit total usage over longer periods (day, month, year).
Implementing Monthly Quotas
async function checkQuota(userId) {
  const month = new Date().toISOString().slice(0, 7); // "2026-03"
  const key = `quota:${userId}:${month}`;

  // First moment of the next month, when the quota resets
  const resetDate = new Date(month + '-01');
  resetDate.setUTCMonth(resetDate.getUTCMonth() + 1);

  const usage = parseInt(await redis.get(key), 10) || 0;
  const user = await getUser(userId);
  const quota = user.monthlyQuota || 10000;

  if (usage >= quota) {
    return { allowed: false, used: usage, quota: quota, resetDate: resetDate };
  }

  await redis.incr(key);
  await redis.expireat(key, Math.floor(resetDate.getTime() / 1000));

  return { allowed: true, used: usage + 1, quota: quota };
}
async function quotaMiddleware(req, res, next) {
  const quota = await checkQuota(req.auth.userId);

  res.setHeader('X-Quota-Used', quota.used.toString());
  res.setHeader('X-Quota-Limit', quota.quota.toString());
  res.setHeader('X-Quota-Remaining', (quota.quota - quota.used).toString());

  if (!quota.allowed) {
    return res.status(429).json({
      error: 'Monthly quota exceeded',
      used: quota.used,
      quota: quota.quota,
      resetDate: new Date(quota.resetDate).toISOString()
    });
  }

  next();
}
Throttling
Slow down requests instead of blocking them.
Delay-Based Throttling
async function throttleMiddleware(req, res, next) {
  const userId = req.auth.userId;
  const usage = await getUsageRate(userId);

  // If usage is high, add delay
  if (usage > 80) { // 80% of limit
    const delay = Math.min((usage - 80) * 100, 5000); // Max 5 second delay
    await new Promise(resolve => setTimeout(resolve, delay));
  }

  next();
}
Response Throttling
async function throttleResponse(req, res, next) {
  const userId = req.auth.userId;
  const usage = await getUsageRate(userId);

  if (usage > 90) {
    // Return partial data
    req.throttled = true;
    req.maxResults = 10; // Limit results
  }

  next();
}

app.get('/v1/pets', throttleResponse, async (req, res) => {
  const limit = req.throttled ? req.maxResults : 100;
  const pets = await db.pets.findMany({ limit });

  if (req.throttled) {
    res.setHeader('X-Throttled', 'true');
  }

  res.json({ data: pets });
});
Abuse Detection
Detect and block abusive patterns.
Pattern Detection
async function detectAbuse(userId) {
  const patterns = await analyzeRequestPatterns(userId);

  // Detect scraping
  if (patterns.sequentialIds > 100) {
    return { abuse: true, reason: 'Sequential ID scraping detected' };
  }

  // Detect brute force
  if (patterns.failedLogins > 10) {
    return { abuse: true, reason: 'Brute force attack detected' };
  }

  // Detect bot behavior
  if (patterns.requestsPerSecond > 50) {
    return { abuse: true, reason: 'Bot-like behavior detected' };
  }

  return { abuse: false };
}

async function abuseDetectionMiddleware(req, res, next) {
  const abuse = await detectAbuse(req.auth.userId);

  if (abuse.abuse) {
    await blockUser(req.auth.userId, abuse.reason);
    return res.status(403).json({
      error: 'Account suspended',
      reason: abuse.reason,
      contact: 'support@petstoreapi.com'
    });
  }

  next();
}
Communicating Limits
Standard Headers
res.setHeader('X-RateLimit-Limit', '100');
res.setHeader('X-RateLimit-Remaining', '95');
res.setHeader('X-RateLimit-Reset', '1678886400');
res.setHeader('Retry-After', '60'); // Seconds until retry
Error Responses
{
  "error": "Rate limit exceeded",
  "message": "You have exceeded the rate limit of 100 requests per hour",
  "limit": 100,
  "remaining": 0,
  "resetAt": "2026-03-13T11:00:00Z",
  "retryAfter": 3600,
  "documentation": "https://docs.petstoreapi.com/rate-limits"
}
Documentation
Document your limits clearly:
## Rate Limits
### Free Tier
- 100 requests per hour
- 10,000 requests per month
### Pro Tier
- 1,000 requests per hour
- 100,000 requests per month
### Enterprise Tier
- 10,000 requests per hour
- Unlimited monthly requests
### Headers
- `X-RateLimit-Limit`: Maximum requests per window
- `X-RateLimit-Remaining`: Requests remaining
- `X-RateLimit-Reset`: Unix timestamp when limit resets
### Handling Rate Limits
When you exceed the limit, you'll receive a 429 status code.
Wait for the time specified in `Retry-After` header before retrying.
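On the client side, the same advice translates into a retry helper. Here is a sketch (`fetchWithRetry` is an illustrative name, not part of any particular SDK; it assumes the global `fetch` available in Node 18+ and browsers):

```javascript
// Client-side sketch: honor Retry-After on 429 responses, falling back
// to exponential backoff when the header is missing.
async function fetchWithRetry(url, options = {}, maxRetries = 3) {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const res = await fetch(url, options);
    if (res.status !== 429) return res;
    if (attempt === maxRetries) break;

    // Retry-After is in seconds; back off exponentially if it is absent
    const retryAfter = parseInt(res.headers.get('Retry-After'), 10);
    const delayMs = Number.isFinite(retryAfter)
      ? retryAfter * 1000
      : Math.min(1000 * 2 ** attempt, 30000);
    await new Promise(resolve => setTimeout(resolve, delayMs));
  }
  throw new Error('Rate limit retries exhausted');
}
```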
Best Practices
1. Be Generous
Set limits high enough for legitimate use. Don't frustrate real users.
2. Provide Clear Feedback
Always include rate limit headers. Tell users when they can retry.
3. Tier Your Limits
Offer higher limits for paid tiers. This monetizes heavy usage.
4. Monitor and Adjust
Track limit violations. Adjust limits based on actual usage patterns.
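A minimal way to start is sketched below with an in-memory counter (`recordViolation` and `frequentViolators` are illustrative names; in production this would feed a real metrics system rather than a Map):

```javascript
// Sketch: count rate limit violations per user so limits can be tuned
// from data. recordViolation would be called wherever a 429 is returned.
const violationCounts = new Map();

function recordViolation(userId) {
  violationCounts.set(userId, (violationCounts.get(userId) || 0) + 1);
}

// Users over the threshold may need a higher tier, or closer scrutiny
function frequentViolators(threshold = 10) {
  return [...violationCounts.entries()]
    .filter(([, count]) => count > threshold)
    .map(([userId]) => userId);
}
```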
5. Whitelist Internal Services
Don't rate limit your own services:
function shouldRateLimit(req) {
  // Skip rate limiting for internal services
  // (compare secrets with crypto.timingSafeEqual in production)
  if (req.headers['x-internal-service'] === process.env.INTERNAL_SECRET) {
    return false;
  }
  return true;
}
6. Use Multiple Layers
Combine rate limiting, quotas, and abuse detection for comprehensive protection.
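How the layers fit together can be sketched without a framework. The `compose` helper and the three checks below are simplified stand-ins for the middleware shown earlier, ordered so the cheapest, most decisive checks run first:

```javascript
// Minimal pipeline showing the layering idea. Each stand-in either
// calls next() to pass the request to the following layer, or sets a
// status to block it.
function compose(middlewares) {
  return async (req, res) => {
    for (const mw of middlewares) {
      let passed = false;
      await mw(req, res, () => { passed = true; });
      if (!passed) return res; // a layer blocked the request
    }
    res.status = 200; // all layers passed
    return res;
  };
}

// Cheapest, most decisive checks first
const abuseCheck = (req, res, next) =>
  req.flaggedAbusive ? (res.status = 403) : next();
const rateLimitCheck = (req, res, next) =>
  req.overRateLimit ? (res.status = 429) : next();
const quotaCheck = (req, res, next) =>
  req.overQuota ? (res.status = 429) : next();

const protect = compose([abuseCheck, rateLimitCheck, quotaCheck]);
```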
Summary
Rate limiting protects your API from abuse while allowing legitimate usage. Implement multiple strategies:
- Rate limiting: Prevent short-term abuse
- Quotas: Control long-term usage
- Throttling: Gracefully degrade service
- Abuse detection: Block malicious patterns
Choose algorithms based on your needs:
- Fixed window: Simple, good enough for most cases
- Sliding window: More accurate, prevents bursts
- Token bucket: Handles bursts gracefully
- Leaky bucket: Smooths traffic
Always communicate limits clearly through headers, error messages, and documentation.