Staff Prep 08: FastAPI Request Lifecycle — From TCP to Response
Back to Part 07: Zero-Downtime Migrations.
You've written FastAPI routes, but do you know what happens before your route function runs? The full path from network packet to Python function goes through Uvicorn, the ASGI interface, Starlette's middleware stack and router, and FastAPI's dependency injection. You don't strictly need this to ship code, but the moment something goes sideways in production, you'll be grateful you know it.
Layer 1: uvicorn — the ASGI server
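Uvicorn is an ASGI server. With the uvicorn[standard] extras installed it uses uvloop for the event loop and httptools for HTTP parsing; without them it falls back to asyncio and h11. It accepts the TCP connection, parses the HTTP request, and calls your ASGI application through a standardised interface.
In production, Uvicorn is usually run with multiple worker processes. One per CPU core is a common starting point.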
# Single worker (development)
uvicorn app.main:app --reload
# Multi-worker (production) via Gunicorn with Uvicorn workers
gunicorn app.main:app --workers 4 --worker-class uvicorn.workers.UvicornWorker --bind 0.0.0.0:8000 --timeout 30 --graceful-timeout 30
# Each worker is an independent Python process
# Each process has its own event loop, memory space, DB connection pool
# No shared state between workers: an in-process cache lives in one worker only,
# so consecutive requests can land on different workers and see different cached values
Layer 2: ASGI interface
ASGI is the contract between the server and your application. An ASGI app is a callable that
takes three things: scope (request metadata), receive (an async
function to pull request body chunks), and send (an async function to push response
parts back out).
from typing import Any

# This is what FastAPI IS at its core: an ASGI callable
async def app(scope: dict, receive: Any, send: Any) -> None:
    if scope["type"] == "http":
        # scope contains: method, path, headers, query_string, client IP
        await send({
            "type": "http.response.start",
            "status": 200,
            "headers": [(b"content-type", b"application/json")],
        })
        await send({
            "type": "http.response.body",
            "body": b'{"ok": true}',
        })
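To make the contract concrete, here is a minimal sketch that plays the server's role by hand: it builds a scope, feeds the body through receive, and collects whatever the app passes to send. The call_once helper and the echo-style app are my own illustration, not part of Uvicorn or FastAPI, and the scope is trimmed to the few keys the example reads.

```python
import asyncio
import json


async def app(scope, receive, send):
    """A bare ASGI app that reports the path and the request body size."""
    if scope["type"] != "http":
        return
    # Drain the request body; a server delivers it as one or more messages.
    body = b""
    while True:
        message = await receive()
        body += message.get("body", b"")
        if not message.get("more_body", False):
            break
    payload = json.dumps({"path": scope["path"], "received_bytes": len(body)}).encode()
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"application/json")],
    })
    await send({"type": "http.response.body", "body": payload})


async def call_once(asgi_app, method, path, body=b""):
    """Hypothetical stand-in for the server: one request in, sent messages out."""
    scope = {"type": "http", "method": method, "path": path,
             "headers": [], "query_string": b""}
    incoming = [{"type": "http.request", "body": body, "more_body": False}]
    sent = []

    async def receive():
        return incoming.pop(0)

    async def send(message):
        sent.append(message)

    await asgi_app(scope, receive, send)
    return sent


messages = asyncio.run(call_once(app, "POST", "/orders", b'{"a": 1}'))
print(messages[0]["status"], messages[1]["body"])
```

Nothing here touches a socket, which is the point: everything above the server only ever sees these three arguments.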
Layer 3: Starlette — FastAPI's foundation
FastAPI is built on Starlette. Starlette provides the routing, middleware plumbing, request and response objects, WebSocket handling, background tasks, and lifespan management. FastAPI layers OpenAPI schema generation, Pydantic validation, and dependency injection on top of it. If you strip away FastAPI's features, what you're left with is Starlette.
from contextlib import asynccontextmanager

from fastapi import FastAPI
from sqlalchemy.ext.asyncio import AsyncEngine, create_async_engine

engine: AsyncEngine | None = None

@asynccontextmanager
async def lifespan(app: FastAPI):
    # STARTUP: runs once when the worker starts
    global engine
    engine = create_async_engine("postgresql+asyncpg://...", pool_size=10)
    print("DB pool created")
    yield  # application runs here
    # SHUTDOWN: runs once when the worker stops (SIGTERM)
    await engine.dispose()
    print("DB pool closed")

app = FastAPI(lifespan=lifespan)

# Lifespan runs ONCE per worker process
# If you have 4 workers: 4 separate engines, 4 separate pools
# Steady-state connections = workers * pool_size
# Peak connections = workers * (pool_size + max_overflow); SQLAlchemy's default max_overflow is 10
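The connection arithmetic is worth writing down before you pick a worker count. This is a small sketch with helper names of my own; it assumes SQLAlchemy's default max_overflow of 10 and PostgreSQL's stock max_connections of 100 with 3 slots reserved for superusers, so substitute your own settings.

```python
def max_db_connections(workers: int, pool_size: int, max_overflow: int = 10) -> int:
    """Upper bound on connections opened by all workers together."""
    return workers * (pool_size + max_overflow)


def fits_postgres(total: int, max_connections: int = 100, reserved: int = 3) -> bool:
    """True if `total` fits under the server limit with the reserved slots left free."""
    return total <= max_connections - reserved


# 4 workers, pool_size=10: overflow disabled vs. the default overflow of 10
steady = max_db_connections(workers=4, pool_size=10, max_overflow=0)
peak = max_db_connections(workers=4, pool_size=10)
print(steady, peak, fits_postgres(peak))
```

Doubling the workers to 8 pushes the peak to 160, which no longer fits under a default Postgres limit. That is usually when a pooler such as PgBouncer enters the conversation.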
Layer 4: middleware execution order
Middleware in Starlette wraps the app in a stack, and the order surprises people. Each add_middleware() call wraps everything registered before it, so the last middleware added is the outermost wrapper: it runs first on requests and last on responses. The first one added sits tightest around the handler. I've debugged more than one production mystery that turned out to be middleware ordering.
import time

from fastapi import FastAPI, Request
from starlette.middleware.base import BaseHTTPMiddleware

class RequestTimingMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request: Request, call_next):
        start = time.perf_counter()
        response = await call_next(request)  # proceeds to next middleware/handler
        elapsed = time.perf_counter() - start
        response.headers["X-Response-Time"] = f"{elapsed:.3f}s"
        return response

class DatabaseSessionMiddleware(BaseHTTPMiddleware):
    # Placeholder so the example runs; a real one would manage a DB session
    async def dispatch(self, request: Request, call_next):
        return await call_next(request)

app = FastAPI(lifespan=lifespan)  # lifespan as defined in Layer 3

# Middleware added first = innermost = runs last on request, first on response
app.add_middleware(DatabaseSessionMiddleware)
# Middleware added last = outermost = runs first on request, last on response
app.add_middleware(RequestTimingMiddleware)

# Request flow:
# RequestTimingMiddleware.before -> DatabaseSessionMiddleware.before
# -> route handler
# -> DatabaseSessionMiddleware.after -> RequestTimingMiddleware.after

# CRITICAL: middleware added via app.add_middleware() runs in REVERSE order of registration
# The last add_middleware call is the outermost wrapper; the first wraps tightest around the app
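If the ordering still feels backwards, a toy model helps. The sketch below is not Starlette's code; MiniApp and tracing are names I made up, and the model only assumes that each registration is placed in front of the earlier ones and that the stack is then wrapped from the inside out.

```python
import asyncio


class MiniApp:
    """Toy model of middleware registration order (not Starlette's implementation)."""

    def __init__(self, handler):
        self.handler = handler
        self.user_middleware = []

    def add_middleware(self, factory):
        # The newest registration goes to the front of the list: outermost.
        self.user_middleware.insert(0, factory)

    def build(self):
        app = self.handler
        # Wrap from the innermost (end of the list) outwards.
        for factory in reversed(self.user_middleware):
            app = factory(app)
        return app


def tracing(name, log):
    """Build a middleware factory that records when each half of it runs."""
    def factory(inner):
        async def middleware(request):
            log.append(f"{name}.before")
            response = await inner(request)
            log.append(f"{name}.after")
            return response
        return middleware
    return factory


log = []


async def handler(request):
    log.append("handler")
    return "response"


mini = MiniApp(handler)
mini.add_middleware(tracing("DatabaseSession", log))  # added first -> innermost
mini.add_middleware(tracing("RequestTiming", log))    # added last -> outermost
asyncio.run(mini.build()("request"))
print(log)
```

The recorded order matches the request flow in the comments above: RequestTiming runs first on the way in and last on the way out.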
Layer 5: dependency injection
Dependency injection is the part of FastAPI I actually like. Dependencies are async or sync
callables declared with Depends(). They can yield resources for cleanup, chain into
other dependencies, and get cached for the duration of a single request.
from collections.abc import AsyncGenerator

from fastapi import Depends, Header, HTTPException
from sqlalchemy import select
from sqlalchemy.ext.asyncio import AsyncSession

# AsyncSessionLocal (a session factory), the User model, and
# verify_jwt_and_fetch_user are assumed to be defined elsewhere in the app

# Generator-style dependency: yield + cleanup
async def get_db() -> AsyncGenerator[AsyncSession, None]:
    async with AsyncSessionLocal() as session:
        try:
            yield session
            await session.commit()
        except Exception:
            await session.rollback()
            raise

# Dependency chaining
async def get_current_user(
    token: str = Header(alias="Authorization"),
    db: AsyncSession = Depends(get_db),
) -> User:
    user = await verify_jwt_and_fetch_user(token, db)
    if not user:
        raise HTTPException(status_code=401)
    return user

# Role-scoped dependency
async def require_admin(user: User = Depends(get_current_user)) -> User:
    if user.role != "admin":
        raise HTTPException(status_code=403)
    return user

@app.get("/admin/users")
async def list_users(
    admin: User = Depends(require_admin),
    db: AsyncSession = Depends(get_db),  # same instance as in get_current_user!
):
    # FastAPI caches dependencies per request by default
    # get_db() is called ONCE even though two places depend on it
    result = await db.execute(select(User))
    return result.scalars().all()  # return the User rows, not the raw Result object
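The per-request caching and the yield cleanup are easier to trust once you have seen them in miniature. What follows is a toy resolver I wrote for illustration, not FastAPI's implementation: the Depends class, resolve, and handle_request are stand-ins, and it only handles async callables. It assumes one cache and one exit stack per request.

```python
import asyncio
import inspect
from contextlib import AsyncExitStack, asynccontextmanager

calls = []


class Depends:
    """Toy marker playing the role of fastapi.Depends."""

    def __init__(self, dependency):
        self.dependency = dependency


async def get_db():
    calls.append("get_db:open")
    session = object()  # stand-in for a real database session
    yield session
    calls.append("get_db:close")


async def get_current_user(db=Depends(get_db)):
    calls.append("get_current_user")
    return {"name": "alice", "db": db}


async def list_users(user=Depends(get_current_user), db=Depends(get_db)):
    calls.append("handler")
    return user["db"] is db  # did both places receive the same session?


async def resolve(func, cache, stack):
    """Resolve func's Depends() parameters recursively, then call func."""
    kwargs = {}
    for name, param in inspect.signature(func).parameters.items():
        if isinstance(param.default, Depends):
            dep = param.default.dependency
            if dep not in cache:  # per-request cache keyed by the callable
                cache[dep] = await resolve(dep, cache, stack)
            kwargs[name] = cache[dep]
    if inspect.isasyncgenfunction(func):
        # Generator dependency: the code after `yield` runs when the stack closes.
        return await stack.enter_async_context(asynccontextmanager(func)(**kwargs))
    return await func(**kwargs)


async def handle_request():
    async with AsyncExitStack() as stack:  # one stack and one cache per request
        return await resolve(list_users, {}, stack)


same_session = asyncio.run(handle_request())
print(same_session, calls)
```

get_db opens once, both consumers share the session, and the close step runs only after the handler has returned.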
Layer 6: route matching and validation
from pydantic import BaseModel, Field

# Order is the ORM model, assumed to be defined elsewhere in the app

class OrderCreate(BaseModel):
    user_id: int = Field(gt=0)
    amount: float = Field(gt=0, le=100000)
    items: list[str] = Field(min_length=1)

@app.post("/orders", status_code=201)
async def create_order(
    order: OrderCreate,  # Pydantic validates the request body
    db: AsyncSession = Depends(get_db),
    current_user: User = Depends(get_current_user),
):
    # By the time this runs:
    # 1. The request side of every middleware has run (timing, auth headers, etc.)
    # 2. Pydantic has validated and type-coerced the request body (invalid input gets a 422)
    # 3. All dependencies have been resolved
    # 4. Any dependency that raised HTTPException has already stopped execution
    new_order = Order(user_id=order.user_id, amount=order.amount)
    db.add(new_order)
    await db.flush()  # send the INSERT now so the generated primary key is populated
    return new_order
Exception handling: where errors go
import traceback

from fastapi import Request
from fastapi.responses import JSONResponse

# Global exception handler: catches unhandled exceptions from any route
@app.exception_handler(Exception)
async def global_exception_handler(request: Request, exc: Exception):
    # Log the full traceback here
    traceback.print_exception(type(exc), exc, exc.__traceback__)
    # request_id only exists if a middleware set it; don't let the handler itself fail
    request_id = getattr(request.state, "request_id", None)
    return JSONResponse(
        status_code=500,
        content={"error": "Internal server error", "request_id": request_id},
    )

# Custom exception for domain errors
class DomainError(Exception):
    def __init__(self, message: str, code: int = 400):
        super().__init__(message)
        self.message = message
        self.code = code

@app.exception_handler(DomainError)
async def domain_error_handler(request: Request, exc: DomainError):
    return JSONResponse(status_code=exc.code, content={"error": exc.message})
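With two handlers registered, the obvious question is which one wins. As far as I understand it, the handler is chosen by walking the raised exception's class hierarchy from most specific to least, so registration order does not matter. The sketch below models that lookup with a plain dict. It is an assumption-laden illustration rather than Starlette's code, and OutOfStock is a hypothetical subclass.

```python
class DomainError(Exception):
    def __init__(self, message, code=400):
        super().__init__(message)
        self.message = message
        self.code = code


class OutOfStock(DomainError):  # hypothetical subclass, for illustration only
    pass


# Stand-ins for the registered handlers: each returns (status_code, error_message).
handlers = {
    Exception: lambda exc: (500, "Internal server error"),
    DomainError: lambda exc: (exc.code, exc.message),
}


def lookup_handler(exc):
    """Return the handler for the most specific registered class in exc's MRO."""
    for cls in type(exc).__mro__:
        if cls in handlers:
            return handlers[cls]
    return None


def respond(exc):
    handler = lookup_handler(exc)
    if handler is None:
        raise exc  # nothing registered for this type
    return handler(exc)


print(respond(OutOfStock("no stock left", code=409)))
print(respond(ValueError("boom")))
```

A subclass of DomainError is served by the DomainError handler even though the catch-all was registered first, and anything else falls through to the 500 handler.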
Quiz: test your understanding
Before moving on, answer these in your head (or out loud):
- You have 4 Uvicorn workers and your DB pool_size is 10. How many total Postgres connections does your app open? What happens when one worker crashes?
- A route and one of its dependencies both use Depends(get_db). In a single request, how many database sessions are created?
- You add three middlewares A, B, C with app.add_middleware() in that order. In what order do they execute for an incoming request? For the outgoing response?
- What is the difference between startup event handlers and lifespan? Why does Starlette recommend lifespan?
- A dependency raises an HTTPException(401). What happens to the other dependencies in the same request that have not yet run? What about cleanup in dependencies that already yielded?
Next up: Part 09: async vs sync in Python. When async actually helps, when it hurts, and the CPU-bound trap most developers walk straight into.