Serverless has been hyped to death. "No servers to manage!" "Infinite scale!" "Pay only for what you use!" The marketing is loud, but the reality is more nuanced. Serverless APIs are genuinely great for certain workloads and genuinely painful for others.
This guide cuts through the noise. We'll look at how serverless APIs actually work, how to build them on AWS Lambda, Azure Functions, and Cloudflare Workers, how to deal with cold starts, and when serverless is the wrong choice entirely.
## What Serverless Actually Means
"Serverless" doesn't mean no servers. It means you don't manage servers. The cloud provider handles provisioning, scaling, patching, and capacity planning. You write functions, deploy them, and pay per invocation.
For APIs, this means:
- Each HTTP request triggers a function execution
- Functions spin up, handle the request, and shut down
- You're billed for execution time, not idle time
- Scaling is automatic (within limits)
The PetStore API is a good example to think through. A traditional deployment runs a server 24/7 waiting for requests. A serverless deployment runs code only when someone actually calls GET /api/pets. At low traffic, serverless is dramatically cheaper. At high, sustained traffic, the math changes.
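To make "the math changes" concrete, here is a back-of-envelope sketch comparing an always-on server to Lambda at different traffic levels. The $30/month server price and the 128MB/100ms request profile are illustrative assumptions; the Lambda rates are the published per-request and per-GB-second prices discussed later in this guide.

```python
# Back-of-envelope: always-on server vs Lambda, by monthly request volume.
# Assumed: $30/month VM; Lambda at $0.20 per 1M requests plus
# $0.0000166667 per GB-second, for a 128MB function running 100ms.

def lambda_monthly_cost(requests, memory_gb=0.128, duration_s=0.1):
    request_cost = requests / 1_000_000 * 0.20
    compute_cost = requests * memory_gb * duration_s * 0.0000166667
    return request_cost + compute_cost

SERVER_COST = 30.0  # assumed flat monthly cost of a small always-on server

for monthly_requests in (100_000, 1_000_000, 10_000_000, 100_000_000):
    cost = lambda_monthly_cost(monthly_requests)
    cheaper = 'serverless' if cost < SERVER_COST else 'server'
    print(f'{monthly_requests:>11,} req/mo: Lambda ~${cost:,.2f} -> {cheaper} wins')
```

At 100K requests/month Lambda costs pennies; around 70-80M requests/month of this profile, the flat-rate server pulls ahead.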
## AWS Lambda: The Most Mature Option
AWS Lambda is the most widely used serverless platform. It supports Node.js, Python, Java, Go, Ruby, and .NET.
### Basic Lambda Function for a REST API

```python
import json
import uuid
import boto3
from datetime import datetime

# Initialize outside the handler so connections are reused across warm invocations
dynamodb = boto3.resource('dynamodb')
pets_table = dynamodb.Table('pets')

def handler(event, context):
    """Main Lambda handler: routes requests to the appropriate function."""
    http_method = event['httpMethod']
    path = event['path']
    path_params = event.get('pathParameters') or {}

    try:
        if path == '/api/pets' and http_method == 'GET':
            return list_pets(event)
        elif path.startswith('/api/pets/') and http_method == 'GET':
            return get_pet(path_params['id'])
        elif path == '/api/pets' and http_method == 'POST':
            return create_pet(event)
        else:
            return response(404, {'error': 'Not found'})
    except Exception as e:
        print(f'Error: {e}')
        return response(500, {'error': 'Internal server error'})

def list_pets(event):
    query_params = event.get('queryStringParameters') or {}
    limit = int(query_params.get('limit', 20))
    result = pets_table.scan(Limit=limit)
    return response(200, {'data': result['Items']})

def get_pet(pet_id):
    result = pets_table.get_item(Key={'id': pet_id})
    pet = result.get('Item')
    if not pet:
        return response(404, {'error': 'Pet not found'})
    return response(200, pet)

def create_pet(event):
    body = json.loads(event['body'])
    pet = {
        'id': str(uuid.uuid4()),
        'name': body['name'],
        'species': body['species'],
        'created_at': datetime.utcnow().isoformat()
    }
    pets_table.put_item(Item=pet)
    return response(201, pet)

def response(status_code, body):
    return {
        'statusCode': status_code,
        'headers': {
            'Content-Type': 'application/json',
            'Access-Control-Allow-Origin': '*'
        },
        'body': json.dumps(body)
    }
```
### Deploying with API Gateway

Lambda functions need an HTTP trigger. AWS API Gateway connects HTTP requests to Lambda:

```yaml
# serverless.yml (using Serverless Framework)
service: petstore-api

provider:
  name: aws
  runtime: python3.11
  region: us-east-1
  environment:
    PETS_TABLE: ${self:service}-pets-${sls:stage}
  iam:
    role:
      statements:
        - Effect: Allow
          Action:
            - dynamodb:GetItem
            - dynamodb:PutItem
            - dynamodb:Scan
            - dynamodb:UpdateItem
            - dynamodb:DeleteItem
          Resource: !GetAtt PetsTable.Arn

functions:
  api:
    handler: src/handler.handler
    events:
      - http:
          path: /api/{proxy+}
          method: ANY
          cors: true

resources:
  Resources:
    PetsTable:
      Type: AWS::DynamoDB::Table
      Properties:
        TableName: ${self:provider.environment.PETS_TABLE}
        BillingMode: PAY_PER_REQUEST
        AttributeDefinitions:
          - AttributeName: id
            AttributeType: S
        KeySchema:
          - AttributeName: id
            KeyType: HASH
```

Deploy with:

```bash
npm install -g serverless
serverless deploy --stage prod
```
### Using AWS SAM Instead

AWS SAM (Serverless Application Model) is AWS's native option:

```yaml
# template.yaml
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31

Globals:
  Function:
    Runtime: python3.11
    Timeout: 30
    Environment:
      Variables:
        PETS_TABLE: !Ref PetsTable

Resources:
  PetStoreFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: src/
      Handler: handler.handler
      Events:
        ApiEvent:
          Type: Api
          Properties:
            Path: /api/{proxy+}
            Method: ANY

  PetsTable:
    Type: AWS::DynamoDB::Table
    Properties:
      BillingMode: PAY_PER_REQUEST
      AttributeDefinitions:
        - AttributeName: id
          AttributeType: S
      KeySchema:
        - AttributeName: id
          KeyType: HASH
```

```bash
sam build && sam deploy --guided
```
## Azure Functions
Azure Functions is Microsoft's serverless platform. It integrates tightly with the Azure ecosystem.
```python
import azure.functions as func
import json
import logging

app = func.FunctionApp()

@app.route(route="pets", methods=["GET"])
def list_pets(req: func.HttpRequest) -> func.HttpResponse:
    limit = int(req.params.get('limit', 20))
    # Fetch from Cosmos DB or Azure SQL (fetch_pets_from_db is your data layer)
    pets = fetch_pets_from_db(limit)
    return func.HttpResponse(
        json.dumps({'data': pets}),
        status_code=200,
        mimetype='application/json'
    )

@app.route(route="pets/{pet_id}", methods=["GET"])
def get_pet(req: func.HttpRequest) -> func.HttpResponse:
    # Route parameters come from req.route_params in the v2 Python model
    pet_id = req.route_params.get('pet_id')
    pet = fetch_pet_by_id(pet_id)
    if not pet:
        return func.HttpResponse(
            json.dumps({'error': 'Pet not found'}),
            status_code=404,
            mimetype='application/json'
        )
    return func.HttpResponse(
        json.dumps(pet),
        status_code=200,
        mimetype='application/json'
    )

@app.route(route="pets", methods=["POST"])
def create_pet(req: func.HttpRequest) -> func.HttpResponse:
    try:
        body = req.get_json()
    except ValueError:
        return func.HttpResponse(
            json.dumps({'error': 'Invalid JSON'}),
            status_code=400,
            mimetype='application/json'
        )
    pet = save_pet(body)
    return func.HttpResponse(
        json.dumps(pet),
        status_code=201,
        mimetype='application/json'
    )
```
Deploy with Azure CLI:
```bash
az login
az functionapp create \
  --resource-group petstore-rg \
  --consumption-plan-location eastus \
  --runtime python \
  --runtime-version 3.11 \
  --functions-version 4 \
  --name petstore-api \
  --storage-account petstorage

func azure functionapp publish petstore-api
```
## Cloudflare Workers: Edge Serverless
Cloudflare Workers run at the edge — in data centers close to your users. This eliminates most cold start issues and provides very low latency globally.
```javascript
// worker.js
export default {
  async fetch(request, env, ctx) {
    const url = new URL(request.url);
    const path = url.pathname;
    const method = request.method;

    // Route requests
    if (path === '/api/pets' && method === 'GET') {
      return listPets(request, env);
    } else if (path.match(/^\/api\/pets\/\d+$/) && method === 'GET') {
      const id = path.split('/').pop();
      return getPet(id, env);
    } else if (path === '/api/pets' && method === 'POST') {
      return createPet(request, env);
    }

    return new Response(JSON.stringify({ error: 'Not found' }), {
      status: 404,
      headers: { 'Content-Type': 'application/json' }
    });
  }
};

async function listPets(request, env) {
  const url = new URL(request.url);
  const limit = parseInt(url.searchParams.get('limit') || '20');
  // Use Cloudflare KV or D1 (SQLite at the edge)
  const { results } = await env.DB.prepare(
    'SELECT * FROM pets LIMIT ?'
  ).bind(limit).all();
  return jsonResponse(200, { data: results });
}

async function getPet(id, env) {
  const pet = await env.DB.prepare(
    'SELECT * FROM pets WHERE id = ?'
  ).bind(id).first();
  if (!pet) {
    return jsonResponse(404, { error: 'Pet not found' });
  }
  return jsonResponse(200, pet);
}

async function createPet(request, env) {
  const body = await request.json();
  const result = await env.DB.prepare(
    'INSERT INTO pets (name, species, created_at) VALUES (?, ?, ?) RETURNING *'
  ).bind(body.name, body.species, new Date().toISOString()).first();
  return jsonResponse(201, result);
}

function jsonResponse(status, body) {
  return new Response(JSON.stringify(body), {
    status,
    headers: {
      'Content-Type': 'application/json',
      'Access-Control-Allow-Origin': '*'
    }
  });
}
```
Wrangler config:
```toml
# wrangler.toml
name = "petstore-api"
main = "src/worker.js"
compatibility_date = "2026-01-01"

[[d1_databases]]
binding = "DB"
database_name = "petstore"
database_id = "your-database-id"
```

Deploy:

```bash
npx wrangler deploy
```
## Cold Starts: The Real Problem
Cold starts are the biggest pain point in serverless. When a function hasn't been called recently, the cloud provider needs to spin up a new container, load your runtime, and initialize your code. This adds latency — sometimes hundreds of milliseconds.
### Cold Start Times by Platform
| Platform | Typical Cold Start | With Large Dependencies |
|---|---|---|
| Cloudflare Workers | < 5ms | < 5ms |
| AWS Lambda (Node.js) | 100-500ms | 500ms-2s |
| AWS Lambda (Python) | 200-800ms | 1-3s |
| AWS Lambda (Java) | 1-5s | 3-10s |
| Azure Functions | 200ms-2s | 1-4s |
Cloudflare Workers have virtually no cold starts because they use V8 isolates instead of containers.
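A quick back-of-envelope calculation shows why cold starts hurt tail latency far more than averages. The 50ms warm / 800ms cold figures below are illustrative assumptions, not measurements:

```python
# If a fraction of requests hit a cold container, the mean barely moves,
# but the tail (p99 and beyond) absorbs nearly the full cold-start penalty.

def mean_latency(warm_ms, cold_ms, cold_rate):
    """Expected latency when cold_rate of requests pay the cold-start cost."""
    return warm_ms * (1 - cold_rate) + cold_ms * cold_rate

warm, cold = 50, 800  # assumed warm latency and Lambda cold start, in ms
for rate in (0.001, 0.01, 0.05):
    print(f'cold rate {rate:.1%}: mean {mean_latency(warm, cold, rate):.1f}ms')

# With a 1% cold rate, roughly 1 request in 100 takes ~800ms instead of ~50ms,
# so p99 latency sits near the cold-start time even though the mean stays low.
```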
### Strategies to Reduce Cold Starts
**1. Keep functions warm with scheduled pings:**

```python
# Lambda function that pings your API function
import boto3

def warmer_handler(event, context):
    lambda_client = boto3.client('lambda')
    # Invoke the API function with a warm-up event
    lambda_client.invoke(
        FunctionName='petstore-api',
        InvocationType='Event',  # Async
        Payload='{"source": "warmer"}'
    )

# In your API handler, skip processing for warm-up events
def handler(event, context):
    if event.get('source') == 'warmer':
        return {'statusCode': 200, 'body': 'warm'}
    # Normal request handling...
```
Schedule with EventBridge:
```yaml
# serverless.yml
functions:
  warmer:
    handler: src/warmer.warmer_handler
    events:
      - schedule: rate(5 minutes)
```
**2. Use Provisioned Concurrency (AWS Lambda):**

```yaml
# serverless.yml
functions:
  api:
    handler: src/handler.handler
    provisionedConcurrency: 5  # Keep 5 instances warm
    events:
      - http:
          path: /api/{proxy+}
          method: ANY
```
This eliminates cold starts but costs more — you pay for the provisioned instances even when idle.
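How much more? A rough sketch: the per-GB-second provisioned concurrency price below is an assumption in the ballpark of published us-east-1 rates, so check the current AWS pricing page before budgeting from this.

```python
# Estimating the standing cost of provisioned concurrency.
# PC_PRICE_PER_GB_S is an assumed, illustrative price, not an official quote.

PC_PRICE_PER_GB_S = 0.0000041667
SECONDS_PER_MONTH = 30 * 24 * 3600

def provisioned_cost(instances, memory_gb):
    """Monthly cost of keeping `instances` warm, before any invocation charges."""
    return instances * memory_gb * SECONDS_PER_MONTH * PC_PRICE_PER_GB_S

print(f'5 x 128MB: ~${provisioned_cost(5, 0.128):.2f}/month')
print(f'5 x 1GB:   ~${provisioned_cost(5, 1.0):.2f}/month')
```

At small memory sizes the standing cost is modest; at 1GB and above it starts to rival a small always-on server, which undercuts the serverless cost advantage.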
**3. Minimize package size:**

```text
# requirements.txt — only include what you need
boto3==1.34.0
# Don't include pandas, numpy, or scipy unless you actually need them
```
Use Lambda layers for shared dependencies:
```yaml
layers:
  commonDeps:
    path: layer
    compatibleRuntimes:
      - python3.11

functions:
  api:
    handler: src/handler.handler
    layers:
      - !Ref CommonDepsLambdaLayer
```
**4. Move initialization outside the handler:**

```python
import boto3

# This runs once per container, not per request
dynamodb = boto3.resource('dynamodb')
pets_table = dynamodb.Table('pets')

# Pre-load config (load_config is a placeholder for your own setup code)
config = load_config()

def handler(event, context):
    # dynamodb and config are already initialized, so this path is fast
    result = pets_table.get_item(Key={'id': event['id']})
    return response(200, result['Item'])
```
## Stateless Design
Serverless functions are stateless by nature. Each invocation is independent — you can't store data in memory between requests.
### What This Means in Practice
```python
# WRONG: in-memory caching doesn't work reliably in serverless
cache = {}  # Per-container, not shared across instances

def handler(event, context):
    pet_id = event['pathParameters']['id']
    if pet_id in cache:  # Misses whenever a different container handles the request
        return response(200, cache[pet_id])
    pet = fetch_from_db(pet_id)
    cache[pet_id] = pet  # Only cached in this container
    return response(200, pet)
```

```python
# RIGHT: use an external cache (Redis/ElastiCache)
import json
import os
import redis

redis_client = redis.Redis(host=os.environ['REDIS_HOST'])

def handler(event, context):
    pet_id = event['pathParameters']['id']
    # Check the shared cache first
    cached = redis_client.get(f'pet:{pet_id}')
    if cached:
        return response(200, json.loads(cached))
    pet = fetch_from_db(pet_id)
    # Store in the shared cache with a 5-minute TTL
    redis_client.setex(f'pet:{pet_id}', 300, json.dumps(pet))
    return response(200, pet)
```
### Session Management
Traditional sessions (stored in memory or local files) don't work. Use JWTs or external session stores:
```python
import os
import jwt  # PyJWT

SECRET_KEY = os.environ['JWT_SECRET']

def authenticate(event):
    auth_header = event['headers'].get('Authorization', '')
    if not auth_header.startswith('Bearer '):
        return None
    token = auth_header[7:]
    try:
        payload = jwt.decode(token, SECRET_KEY, algorithms=['HS256'])
        return payload
    except jwt.ExpiredSignatureError:
        return None
    except jwt.InvalidTokenError:
        return None

def handler(event, context):
    user = authenticate(event)
    if not user:
        return response(401, {'error': 'Unauthorized'})
    # User data is in the JWT, so no session lookup is needed
    pets = fetch_pets_for_user(user['user_id'])
    return response(200, {'data': pets})
```
## Cost Optimization
Serverless can be very cheap or surprisingly expensive depending on how you use it.
### Understanding the Pricing Model
AWS Lambda pricing (as of early 2026):

- $0.20 per 1 million requests
- $0.0000166667 per GB-second of compute

For a function using 128MB of memory running for 100ms:

- Compute cost per request: 0.128 GB × 0.1 s × $0.0000166667 ≈ $0.000000213
- 1 million requests: ~$0.41 (compute plus the per-request charge)
That's cheap. But it adds up with high traffic or long-running functions.
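You can sanity-check these numbers, and the memory trade-off covered under the optimization strategies, with a few lines of arithmetic using the prices quoted above:

```python
# Lambda cost per request = flat request charge + memory x duration x GB-s rate.

REQUEST_PRICE = 0.20 / 1_000_000
GB_SECOND_PRICE = 0.0000166667

def cost_per_request(memory_gb, duration_s):
    return REQUEST_PRICE + memory_gb * duration_s * GB_SECOND_PRICE

# 128MB for 100ms, scaled to 1M requests: roughly $0.41
print(f'${cost_per_request(0.128, 0.1) * 1_000_000:.2f} per million requests')

# More memory also means more CPU, so duration usually drops.
# Compare 128MB at 500ms vs 512MB at 100ms:
print(cost_per_request(0.128, 0.5))  # slower, and slightly more expensive
print(cost_per_request(0.512, 0.1))  # faster, and slightly cheaper
```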
### Optimization Strategies
**Right-size memory allocation:**

```text
# Test different memory sizes.
# More memory = more CPU = faster execution = potentially lower cost.
# Example: 128MB at 500ms burns 0.064 GB-s per request; 512MB at 100ms
# burns 0.0512 GB-s, so the larger size is both faster and slightly cheaper.
# Use AWS Lambda Power Tuning to find the optimal setting:
# https://github.com/alexcasalboni/aws-lambda-power-tuning
```
**Reduce execution time:**

```python
# Use connection pooling with RDS Proxy.
# Direct RDS connections are slow to establish.
import psycopg2

# SLOW: new connection per invocation
def handler(event, context):
    conn = psycopg2.connect(DATABASE_URL)  # ~100ms
    # ...

# FAST: use RDS Proxy (connection pooling);
# configure RDS_PROXY_ENDPOINT in the environment
def handler(event, context):
    conn = psycopg2.connect(RDS_PROXY_URL)  # ~5ms via the proxy
    # ...
```
**Use async processing for non-critical work:**

```python
import json
import os
import boto3

sqs = boto3.client('sqs')

def handler(event, context):
    body = json.loads(event['body'])

    # Save the pet synchronously (the user needs this)
    pet = save_pet(body)

    # Queue non-critical work asynchronously
    sqs.send_message(
        QueueUrl=os.environ['NOTIFICATIONS_QUEUE'],
        MessageBody=json.dumps({
            'type': 'pet_created',
            'pet_id': pet['id'],
            'owner_email': body['owner_email']
        })
    )

    # Return immediately; don't wait for the email to send
    return response(201, pet)
```
**Cache aggressively at the API Gateway level:**

```yaml
# serverless.yml (this caching syntax comes from the
# serverless-api-gateway-caching plugin)
functions:
  api:
    handler: src/handler.handler
    events:
      - http:
          path: /api/pets
          method: GET
          caching:
            enabled: true
            ttlInSeconds: 300  # Cache the list for 5 minutes
```
## Serverless Frameworks Comparison
| Framework | Platforms | Language | Best For |
|---|---|---|---|
| Serverless Framework | AWS, Azure, GCP | Any | Multi-cloud, mature ecosystem |
| AWS SAM | AWS only | Any | AWS-native, CloudFormation integration |
| AWS CDK | AWS only | TypeScript/Python/Java | Infrastructure as code |
| Pulumi | Multi-cloud | TypeScript/Python/Go | Modern IaC |
| Wrangler | Cloudflare only | JS/TS | Edge computing |
For most teams starting out, Serverless Framework or AWS SAM are the pragmatic choices.
## Real Limitations of Serverless
Be honest about the downsides:
**Execution time limits:**

- AWS Lambda: 15 minutes max
- Azure Functions: 10 minutes (Consumption plan)
- Cloudflare Workers: 30 seconds (CPU time)
Long-running operations (video processing, large data exports) don't fit serverless well.
**Payload size limits:**

- API Gateway: 10MB request/response
- Lambda: 6MB synchronous, 256KB async
**Concurrency limits:**

- AWS Lambda: 1,000 concurrent executions per region (default; can be increased)
- Sudden traffic spikes can hit this limit
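The default limit is easy to reason about with Little's law: concurrent executions are roughly requests per second times average duration. A sketch with illustrative numbers:

```python
# Little's law applied to Lambda: concurrency ~= arrival rate x service time.

def concurrency(rps, avg_duration_s):
    """Approximate steady-state concurrent executions."""
    return rps * avg_duration_s

print(concurrency(500, 0.2))   # 100 concurrent executions: well within the default
print(concurrency(5000, 0.5))  # 2500: a spike like this blows past the 1,000 default
```

This is also why shaving execution time (faster code, connection pooling) buys concurrency headroom, not just lower cost.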
**Vendor lock-in:** Each platform has its own event format, SDK, and services. Migrating from Lambda to Azure Functions isn't trivial.
**Debugging is harder:** Local development requires emulation tools (SAM local, Wrangler dev). Distributed tracing is essential but adds complexity.
**Database connections:** Traditional databases (PostgreSQL, MySQL) have connection limits, and serverless functions can exhaust them quickly. Use connection poolers (RDS Proxy, PgBouncer) or serverless-native databases (DynamoDB, Cosmos DB, D1).
## When to Use Serverless (and When Not To)
**Good fit:**

- APIs with variable or unpredictable traffic
- Event-driven processing (webhooks, file uploads)
- Background jobs and scheduled tasks
- Prototypes and MVPs
- APIs with low sustained traffic
**Poor fit:**

- APIs with consistently high, sustained traffic (traditional servers are cheaper)
- Long-running operations (>15 minutes)
- Applications requiring persistent connections (WebSockets need special handling)
- Latency-sensitive applications where cold starts are unacceptable
- Teams without cloud expertise (operational complexity is real)
## Conclusion
Serverless APIs are a genuine shift in how you build and operate software. The operational benefits are real — no servers to patch, automatic scaling, pay-per-use pricing. But so are the limitations — cold starts, stateless constraints, vendor lock-in, and debugging complexity.
The PetStore API is a good candidate for serverless: it has variable traffic, simple CRUD operations, and doesn't need persistent connections. A high-frequency trading API or a video processing pipeline would be a poor fit.
Start with Cloudflare Workers if you need low latency globally and simple use cases. Use AWS Lambda with Serverless Framework for more complex applications that need the full AWS ecosystem. Avoid serverless for workloads with consistently high traffic or long-running operations.
The best architecture is the one that fits your actual workload, team skills, and budget — not the one with the best marketing.