Your API works. But it's slow. Users complain. Dashboards time out. Your server costs are climbing.
Performance optimization isn't about premature optimization or micro-benchmarks. It's about identifying real bottlenecks and fixing them systematically.
This guide walks through the most impactful performance improvements for REST APIs, with real code and measurable results. We'll optimize our PetStore API from 500ms average response time to under 50ms.
Measuring First: You Can't Improve What You Don't Measure
Before optimizing anything, establish a baseline. Here's a simple performance monitoring setup:
const express = require('express');
const app = express();
// Performance monitoring middleware
app.use((req, res, next) => {
const start = Date.now();
// Capture response
res.on('finish', () => {
const duration = Date.now() - start;
// Log slow requests
if (duration > 100) {
console.warn('Slow request:', {
method: req.method,
path: req.path,
duration: `${duration}ms`,
status: res.statusCode
});
}
// Track metrics (send to Prometheus, DataDog, etc.)
metrics.recordApiLatency(req.path, duration);
});
next();
});
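The `metrics` object in the middleware above is a placeholder. Before wiring up Prometheus or DataDog, a minimal in-memory stand-in (ours, not a real client API) is enough to start watching percentiles:

```javascript
// In-memory latency store: path -> array of observed durations (ms).
const latencies = new Map();

const metrics = {
  recordApiLatency(path, duration) {
    if (!latencies.has(path)) latencies.set(path, []);
    latencies.get(path).push(duration);
  },
  // Nearest-rank percentile over everything recorded so far.
  percentile(path, p) {
    const samples = [...(latencies.get(path) || [])].sort((a, b) => a - b);
    if (samples.length === 0) return null;
    const rank = Math.max(1, Math.ceil((p / 100) * samples.length));
    return samples[rank - 1];
  }
};
```

Swap this out for a real metrics client before production; an unbounded array per path grows forever.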
For more detailed timing, use Node.js's built-in perf_hooks module:
const { performance, PerformanceObserver } = require('perf_hooks');
// Mark important operations
app.get('/api/pets', async (req, res) => {
performance.mark('pets-start');
performance.mark('db-query-start');
const pets = await db.query('SELECT * FROM pets');
performance.mark('db-query-end');
performance.measure('db-query', 'db-query-start', 'db-query-end');
performance.mark('transform-start');
const transformed = pets.map(transformPet);
performance.mark('transform-end');
performance.measure('transform', 'transform-start', 'transform-end');
performance.mark('pets-end');
performance.measure('total', 'pets-start', 'pets-end');
res.json(transformed);
});
// Observe measurements
const obs = new PerformanceObserver((items) => {
items.getEntries().forEach((entry) => {
console.log(`${entry.name}: ${entry.duration.toFixed(2)}ms`);
});
});
obs.observe({ entryTypes: ['measure'] });
Now let's fix the bottlenecks.
Database Query Optimization
Database queries are usually the biggest bottleneck. Let's optimize them.
Problem: Missing Indexes
// Slow query (500ms for 100k pets)
app.get('/api/pets', async (req, res) => {
const { species, minAge, maxAge } = req.query;
const pets = await db.query(`
SELECT * FROM pets
WHERE species = $1
AND age >= $2
AND age <= $3
ORDER BY created_at DESC
LIMIT 20
`, [species, minAge, maxAge]);
res.json(pets);
});
Check your query plan:
EXPLAIN ANALYZE
SELECT * FROM pets
WHERE species = 'dog'
AND age >= 2
AND age <= 5
ORDER BY created_at DESC
LIMIT 20;
-- Result: Seq Scan on pets (cost=0.00..2500.00 rows=100)
-- Execution time: 487.23 ms
Sequential scan! Add indexes:
-- Composite index for common queries
CREATE INDEX idx_pets_species_age ON pets(species, age);
-- Index for sorting
CREATE INDEX idx_pets_created_at ON pets(created_at DESC);
-- Now check again
EXPLAIN ANALYZE
SELECT * FROM pets
WHERE species = 'dog'
AND age >= 2
AND age <= 5
ORDER BY created_at DESC
LIMIT 20;
-- Result: Index Scan using idx_pets_species_age (cost=0.42..45.67 rows=20)
-- Execution time: 12.34 ms
40x faster with proper indexes.
Problem: SELECT *
Don't fetch columns you don't need:
// Bad: fetches everything including large text fields
const pets = await db.query('SELECT * FROM pets');
// Good: only fetch what you need
const pets = await db.query(`
SELECT id, name, species, age, photo_url
FROM pets
WHERE status = 'available'
`);
For our PetStore, this reduced response size from 2.5MB to 180KB.
Problem: Inefficient Pagination
// Bad: OFFSET gets slower as you paginate deeper
app.get('/api/pets', async (req, res) => {
const page = parseInt(req.query.page) || 1;
const limit = 20;
const offset = (page - 1) * limit;
const pets = await db.query(`
SELECT * FROM pets
ORDER BY created_at DESC
LIMIT $1 OFFSET $2
`, [limit, offset]);
res.json(pets);
});
// Page 1: 15ms
// Page 100: 450ms (the database scans and discards 1,980 rows just to return 20)
Use cursor-based pagination instead:
// Good: cursor-based pagination (consistent speed)
app.get('/api/pets', async (req, res) => {
const { cursor, limit = 20 } = req.query;
let query = `
SELECT id, name, species, age, created_at
FROM pets
WHERE status = 'available'
`;
const params = [limit];
if (cursor) {
query += ` AND created_at < $2`;
params.push(cursor);
}
query += ` ORDER BY created_at DESC LIMIT $1`;
const pets = await db.query(query, params);
const nextCursor = pets.length === limit
? pets[pets.length - 1].created_at
: null;
res.json({
data: pets,
nextCursor,
hasMore: nextCursor !== null
});
});
// All pages: ~15ms (consistent)
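One subtlety with the timestamp cursor above: if two rows share the same `created_at`, pages can skip or duplicate them. A sketch (helper names are ours) that encodes `(created_at, id)` into an opaque cursor and uses a Postgres row-value comparison as the tie-breaker:

```javascript
// Encode/decode an opaque cursor carrying both sort keys.
function encodeCursor(createdAt, id) {
  return Buffer.from(JSON.stringify([createdAt, id])).toString('base64url');
}

function decodeCursor(cursor) {
  const [createdAt, id] = JSON.parse(Buffer.from(cursor, 'base64url').toString());
  return { createdAt, id };
}

// Build the keyset query; ORDER BY must list both columns so the
// row-value comparison and the sort order agree.
function buildPetsQuery({ cursor, limit = 20 }) {
  let where = `WHERE status = 'available'`;
  const params = [limit];
  if (cursor) {
    const { createdAt, id } = decodeCursor(cursor);
    where += ` AND (created_at, id) < ($2, $3)`;
    params.push(createdAt, id);
  }
  const text = `SELECT id, name, species, age, created_at FROM pets ${where} ORDER BY created_at DESC, id DESC LIMIT $1`;
  return { text, params };
}
```

An opaque cursor also stops clients from constructing their own pagination URLs against your internal column names.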
The N+1 Query Problem
This is the most common performance killer in APIs.
The Problem
// Fetches pets (1 query)
app.get('/api/pets', async (req, res) => {
const { rows: pets } = await db.query('SELECT * FROM pets LIMIT 20');
// For each pet, fetch its owner (20 more queries!)
for (const pet of pets) {
const owner = await db.query(
'SELECT * FROM users WHERE id = $1',
[pet.owner_id]
);
pet.owner = owner.rows[0];
}
res.json(pets);
});
// Total: 21 queries, 450ms
Solution 1: JOIN
// Single query with JOIN
app.get('/api/pets', async (req, res) => {
const result = await db.query(`
SELECT
p.id, p.name, p.species, p.age,
u.id as owner_id, u.name as owner_name, u.email as owner_email
FROM pets p
LEFT JOIN users u ON p.owner_id = u.id
WHERE p.status = 'available'
LIMIT 20
`);
const pets = result.rows.map(row => ({
id: row.id,
name: row.name,
species: row.species,
age: row.age,
owner: row.owner_id ? {
id: row.owner_id,
name: row.owner_name,
email: row.owner_email
} : null
}));
res.json(pets);
});
// Total: 1 query, 35ms
Solution 2: DataLoader (for GraphQL-style batching)
const DataLoader = require('dataloader');
// Batch load users
const userLoader = new DataLoader(async (userIds) => {
const users = await db.query(
'SELECT * FROM users WHERE id = ANY($1)',
[userIds]
);
// Return in same order as requested
const userMap = new Map(users.rows.map(u => [u.id, u]));
return userIds.map(id => userMap.get(id));
});
app.get('/api/pets', async (req, res) => {
const pets = await db.query('SELECT * FROM pets LIMIT 20');
// Batches all user fetches into a single query
await Promise.all(
pets.rows.map(async (pet) => {
if (pet.owner_id) {
pet.owner = await userLoader.load(pet.owner_id);
}
})
);
res.json(pets.rows);
});
// Total: 2 queries (pets + batched users), 45ms
Connection Pooling
Opening a new database connection for every request is expensive.
Without Pooling
// Bad: new connection per request
app.get('/api/pets', async (req, res) => {
const client = new pg.Client({
host: 'localhost',
database: 'petstore'
});
await client.connect();
const result = await client.query('SELECT * FROM pets');
await client.end();
res.json(result.rows);
});
// Connection overhead: ~50ms per request
With Pooling
const { Pool } = require('pg');
// Create connection pool
const pool = new Pool({
host: 'localhost',
database: 'petstore',
user: 'petstore_api',
password: process.env.DB_PASSWORD,
max: 20, // Maximum connections
idleTimeoutMillis: 30000, // Close idle connections after 30s
connectionTimeoutMillis: 2000
});
// Monitor pool health
pool.on('error', (err) => {
console.error('Unexpected pool error', err);
});
pool.on('connect', () => {
console.log('New client connected to pool');
});
// Use pooled connections
app.get('/api/pets', async (req, res) => {
const client = await pool.connect();
try {
const result = await client.query('SELECT * FROM pets');
res.json(result.rows);
} finally {
client.release(); // Return to pool
}
});
// Connection overhead: ~2ms per request
For even better performance, use a query helper:
// Automatically handles connection management
async function query(text, params) {
const start = Date.now();
const result = await pool.query(text, params);
const duration = Date.now() - start;
if (duration > 100) {
console.warn('Slow query:', { text, duration });
}
return result;
}
app.get('/api/pets', async (req, res) => {
const result = await query('SELECT * FROM pets');
res.json(result.rows);
});
Response Compression
Compressing responses can reduce bandwidth by 70-90%.
const compression = require('compression');
// Enable compression
app.use(compression({
level: 6, // Compression level (0-9)
threshold: 1024, // Only compress responses > 1KB
filter: (req, res) => {
// Honor an explicit client opt-out
if (req.headers['x-no-compression']) {
return false;
}
// Otherwise defer to compression's default filter (skips images, etc.)
return compression.filter(req, res);
}
}));
app.get('/api/pets', async (req, res) => {
const pets = await getPets();
res.json(pets);
});
// Before: 245KB
// After: 32KB (87% reduction)
For even better compression, use Brotli:
const express = require('express');
const shrinkRay = require('shrink-ray-current');
app.use(shrinkRay({
brotli: {
quality: 4 // Brotli compression quality (0-11)
},
zlib: {
level: 6 // Fallback to gzip for older clients
}
}));
// Brotli typically achieves 15-20% better compression than gzip
HTTP/2: Multiplexing and Server Push
HTTP/2 lets many requests share a single connection, and it can push resources proactively. Note that major browsers have since deprecated server push (Chrome removed support in 2022), so treat the push examples below as optional; multiplexing and header compression are the reliable wins.
const http2 = require('http2');
const fs = require('fs');
const server = http2.createSecureServer({
key: fs.readFileSync('server-key.pem'),
cert: fs.readFileSync('server-cert.pem')
});
server.on('stream', (stream, headers) => {
const path = headers[':path'];
if (path === '/api/pets') {
// Main response
stream.respond({
'content-type': 'application/json',
':status': 200
});
const pets = getPets();
stream.end(JSON.stringify(pets));
// Server push related resources
if (pets.length > 0) {
const firstPetId = pets[0].id;
stream.pushStream({ ':path': `/api/pets/${firstPetId}` }, (err, pushStream) => {
if (err) return;
pushStream.respond({
'content-type': 'application/json',
':status': 200
});
const petDetails = getPetDetails(firstPetId);
pushStream.end(JSON.stringify(petDetails));
});
}
}
});
server.listen(3000);
For Express apps, the spdy module adds HTTP/2 support (note that spdy is no longer actively maintained, so verify compatibility with your Node.js version):
const spdy = require('spdy');
const express = require('express');
const fs = require('fs');
const app = express();
app.get('/api/pets', (req, res) => {
const pets = getPets();
// Push related resources
if (res.push) {
const push = res.push('/api/pets/stats', {
request: { accept: 'application/json' },
response: { 'content-type': 'application/json' }
});
push.end(JSON.stringify(getStats()));
}
res.json(pets);
});
spdy.createServer({
key: fs.readFileSync('server-key.pem'),
cert: fs.readFileSync('server-cert.pem')
}, app).listen(3000);
HTTP/2 benefits:
- Multiplexing: multiple in-flight requests without HTTP-level head-of-line blocking
- Header compression (HPACK): reduces per-request overhead
- Server push: proactively send resources
- Binary protocol: more efficient parsing
Caching Strategies
Caching is the most effective performance optimization.
HTTP Caching Headers
app.get('/api/pets/:id', async (req, res) => {
const pet = await getPet(req.params.id);
if (!pet) {
return res.status(404).json({ error: 'Pet not found' });
}
// Cache for 5 minutes
res.set('Cache-Control', 'public, max-age=300');
// ETag for conditional requests
const etag = generateETag(pet);
res.set('ETag', etag);
// Check if client has current version
if (req.headers['if-none-match'] === etag) {
return res.status(304).end();
}
res.json(pet);
});
function generateETag(data) {
const crypto = require('crypto');
return crypto
.createHash('md5')
.update(JSON.stringify(data))
.digest('hex');
}
Application-Level Caching
const NodeCache = require('node-cache');
// Cache with 5-minute TTL
const cache = new NodeCache({ stdTTL: 300 });
app.get('/api/pets', async (req, res) => {
const cacheKey = `pets:${JSON.stringify(req.query)}`;
// Check cache first
const cached = cache.get(cacheKey);
if (cached) {
res.set('X-Cache', 'HIT');
return res.json(cached);
}
// Cache miss - fetch from database
const pets = await getPets(req.query);
// Store in cache
cache.set(cacheKey, pets);
res.set('X-Cache', 'MISS');
res.json(pets);
});
// Invalidate cache when data changes
app.post('/api/pets', async (req, res) => {
const pet = await createPet(req.body);
// Clear relevant caches
cache.del('pets:{}');
cache.del(`pets:{"species":"${pet.species}"}`);
res.json(pet);
});
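Deleting individual keys like this is brittle: every new query shape needs its own `del` call, and any you forget serves stale data. A safer sketch clears everything under a prefix via `node-cache`'s `keys()` (the helper name is ours):

```javascript
// Clear every cache entry whose key starts with the given prefix.
// Works with node-cache (keys() / del(array)) or any cache
// exposing the same two methods.
function invalidatePrefix(cache, prefix) {
  const stale = cache.keys().filter((key) => key.startsWith(prefix));
  if (stale.length > 0) {
    cache.del(stale);
  }
  return stale.length; // number of entries dropped
}
```

Calling `invalidatePrefix(cache, 'pets:')` after a write trades some cache efficiency for the guarantee that no pet listing is ever stale.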
Redis for Distributed Caching
const Redis = require('ioredis');
const redis = new Redis({
host: 'localhost',
port: 6379,
retryStrategy: (times) => {
return Math.min(times * 50, 2000);
}
});
async function getCachedPets(query) {
const cacheKey = `pets:${JSON.stringify(query)}`;
// Try cache first
const cached = await redis.get(cacheKey);
if (cached) {
return JSON.parse(cached);
}
// Cache miss - fetch from database
const pets = await db.query('SELECT * FROM pets WHERE species = $1', [query.species]);
// Cache for 5 minutes
await redis.setex(cacheKey, 300, JSON.stringify(pets.rows));
return pets.rows;
}
app.get('/api/pets', async (req, res) => {
const pets = await getCachedPets(req.query);
res.json(pets);
});
// Cache invalidation with pub/sub
// A Redis connection in subscriber mode can't run normal commands,
// so use a dedicated subscriber connection
const subscriber = new Redis({ host: 'localhost', port: 6379 });
subscriber.subscribe('pet-updates');
subscriber.on('message', (channel, message) => {
if (channel === 'pet-updates') {
const { action, petId } = JSON.parse(message);
// Invalidate relevant caches (prefer SCAN over KEYS on large keyspaces)
redis.del(`pet:${petId}`);
redis.keys('pets:*').then(keys => {
if (keys.length > 0) {
redis.del(...keys);
}
});
}
});
// Publish updates
app.post('/api/pets', async (req, res) => {
const pet = await createPet(req.body);
await redis.publish('pet-updates', JSON.stringify({
action: 'created',
petId: pet.id
}));
res.json(pet);
});
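One more caching hazard: when a hot key expires, every concurrent request misses at once and hammers the database (a cache stampede). A minimal sketch that shares a single in-flight promise per key; `dedupe` and `loader` are our names, and `loader` stands in for any async database fetch:

```javascript
// key -> promise for a fetch that is currently running
const inFlight = new Map();

// Concurrent callers for the same key await one shared promise,
// so a cache miss triggers exactly one database query.
async function dedupe(key, loader) {
  if (inFlight.has(key)) {
    return inFlight.get(key);
  }
  const promise = loader().finally(() => inFlight.delete(key));
  inFlight.set(key, promise);
  return promise;
}
```

Wrap the database fetch in `getCachedPets` with `dedupe(cacheKey, ...)` and a burst of identical misses collapses into a single query.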
CDN for APIs
CDNs aren't just for static assets. You can cache API responses at the edge.
Cloudflare Workers
// Cloudflare Worker (service worker syntax)
addEventListener('fetch', event => {
event.respondWith(handleRequest(event));
});
async function handleRequest(event) {
const request = event.request;
// Only cache GET requests
if (request.method !== 'GET') {
return fetch(request);
}
// Check cache
const cache = caches.default;
let response = await cache.match(request);
if (!response) {
// Cache miss - fetch from origin
response = await fetch(request);
// Clone response to cache it
const responseToCache = response.clone();
// Cache for 5 minutes
const headers = new Headers(responseToCache.headers);
headers.set('Cache-Control', 'public, max-age=300');
const cachedResponse = new Response(responseToCache.body, {
status: responseToCache.status,
statusText: responseToCache.statusText,
headers
});
event.waitUntil(cache.put(request, cachedResponse));
}
return response;
}
Fastly VCL
sub vcl_recv {
# Cache GET requests to /api/pets
if (req.method == "GET" && req.url ~ "^/api/pets") {
return(lookup);
}
# Don't cache other requests
return(pass);
}
sub vcl_fetch {
# Cache for 5 minutes
if (beresp.status == 200 && req.url ~ "^/api/pets") {
set beresp.ttl = 300s;
set beresp.http.Cache-Control = "public, max-age=300";
}
}
sub vcl_deliver {
# Add cache status header
if (obj.hits > 0) {
set resp.http.X-Cache = "HIT";
} else {
set resp.http.X-Cache = "MISS";
}
}
Benchmarking and Load Testing
Measure the impact of your optimizations.
Apache Bench (ab)
# Test baseline
ab -n 1000 -c 10 https://api.petstore.com/pets
# Results:
# Requests per second: 45.23
# Time per request: 221.1ms (mean)
# Time per request: 22.1ms (mean, across all concurrent requests)
# After optimization
ab -n 1000 -c 10 https://api.petstore.com/pets
# Results:
# Requests per second: 412.87
# Time per request: 24.2ms (mean)
# Time per request: 2.4ms (mean, across all concurrent requests)
k6 for Complex Scenarios
// load-test.js
import http from 'k6/http';
import { check, sleep } from 'k6';
export const options = {
stages: [
{ duration: '30s', target: 20 }, // Ramp up to 20 users
{ duration: '1m', target: 20 }, // Stay at 20 users
{ duration: '30s', target: 0 }, // Ramp down
],
thresholds: {
http_req_duration: ['p(95)<200'], // 95% of requests under 200ms
http_req_failed: ['rate<0.01'], // Less than 1% failures
},
};
export default function () {
// List pets
const listRes = http.get('https://api.petstore.com/pets?species=dog');
check(listRes, {
'list status is 200': (r) => r.status === 200,
'list response time < 100ms': (r) => r.timings.duration < 100,
});
sleep(1);
// Get specific pet
const pets = listRes.json();
if (pets.length > 0) {
const petRes = http.get(`https://api.petstore.com/pets/${pets[0].id}`);
check(petRes, {
'get status is 200': (r) => r.status === 200,
'get response time < 50ms': (r) => r.timings.duration < 50,
});
}
sleep(1);
}
Run the test:
k6 run load-test.js
# Output:
# ✓ list status is 200
# ✓ list response time < 100ms
# ✓ get status is 200
# ✓ get response time < 50ms
#
# http_req_duration..............: avg=42.3ms min=12.1ms med=38.2ms max=187.4ms p(95)=89.7ms
# http_reqs......................: 2400 40/s
Real-World Results
Here's what we achieved optimizing the PetStore API:
| Optimization | Before | After | Improvement |
|---|---|---|---|
| Added indexes | 487ms | 12ms | 40x faster |
| Fixed N+1 queries | 450ms | 35ms | 13x faster |
| Connection pooling | 50ms overhead | 2ms overhead | 25x faster |
| Response compression | 245KB | 32KB | 87% smaller |
| Redis caching | 35ms | 2ms (cache hit) | 17x faster |
| HTTP/2 | 6 round trips | 1 round trip | 6x fewer |
Combined result:
- Average response time: 500ms → 45ms (11x faster)
- P95 response time: 1200ms → 120ms (10x faster)
- Throughput: 45 req/s → 413 req/s (9x more)
- Server costs: $800/month → $200/month (75% reduction)
Performance Checklist
Before deploying your API:
- [ ] Database indexes on all WHERE, JOIN, and ORDER BY columns
- [ ] Connection pooling configured
- [ ] N+1 queries eliminated
- [ ] SELECT only needed columns
- [ ] Cursor-based pagination for large datasets
- [ ] Response compression enabled
- [ ] HTTP/2 enabled
- [ ] Caching strategy implemented
- [ ] Cache invalidation working correctly
- [ ] CDN configured for cacheable endpoints
- [ ] Load testing completed
- [ ] Performance monitoring in place
- [ ] Slow query logging enabled
Conclusion
API performance isn't magic. It's about identifying bottlenecks and applying the right techniques:
- Measure first - you can't improve what you don't measure
- Optimize database queries - usually the biggest bottleneck
- Eliminate N+1 queries - batch your data fetching
- Use connection pooling - don't waste time on connections
- Compress responses - save bandwidth
- Cache aggressively - but invalidate correctly
- Use HTTP/2 - multiplexing and server push
- Put a CDN in front - cache at the edge
- Load test - verify your improvements
- Monitor continuously - catch regressions early
Start with the biggest bottlenecks first. A single missing database index can have more impact than a dozen micro-optimizations.
Your users will notice the difference. And your server bill will thank you.