Staff Prep 08: FastAPI Request Lifecycle — From TCP to Response

April 4, 2026 · 9 min read · PART 06 / 18

Back to Part 07: Zero-Downtime Migrations. You've written FastAPI routes. But do you know what happens before your route function runs? The full path from network packet to Python function goes through Uvicorn, ASGI, Starlette's router, the middleware stack, and dependency injection. You don't strictly need this to ship code, but the moment something goes sideways in production, you'll be grateful you know it.

Layer 1: Uvicorn, the ASGI server

Uvicorn is an ASGI server; with the standard extras installed it runs on uvloop and httptools. It accepts the TCP connection, parses the HTTP request, and calls your ASGI application through a standardised interface.

In production, Uvicorn runs with multiple workers, usually one per CPU core.

bash

# Single worker (development)
uvicorn app.main:app --reload

# Multi-worker (production) via Gunicorn with Uvicorn workers
gunicorn app.main:app \
    --workers 4 \
    --worker-class uvicorn.workers.UvicornWorker \
    --bind 0.0.0.0:8000 \
    --timeout 30 \
    --graceful-timeout 30

# Each worker is an independent Python process
# Each process has its own event loop, memory space, DB connection pool
# No shared state between workers — this is why in-process caches break
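
The point about in-process caches is easy to gloss over, so here is a toy model of it. This is a sketch, not Gunicorn: the `Worker` class, the `simulate` helper, and the round-robin dispatch are all assumptions made for illustration, with each object standing in for one worker process and its private memory.

```python
from itertools import cycle


class Worker:
    """Stands in for one worker process: it owns a private, in-process cache."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.cache: dict[str, str] = {}

    def handle(self, key: str) -> str:
        # A hit is only possible if THIS worker has seen the key before.
        if key in self.cache:
            return f"{self.name}: hit"
        self.cache[key] = f"value-for-{key}"
        return f"{self.name}: miss"


def simulate(num_workers: int, requests: list[str]) -> list[str]:
    """Dispatch requests to workers in rotation (an illustrative assumption)."""
    workers = [Worker(f"w{i}") for i in range(num_workers)]
    rotation = cycle(workers)
    return [next(rotation).handle(key) for key in requests]


if __name__ == "__main__":
    # The same key is requested four times, and each worker misses once.
    for line in simulate(4, ["user:1"] * 4):
        print(line)
```

With four workers, the same key misses four times before every worker is warm, and an invalidation performed in one worker never reaches the other three. A store shared by all workers avoids both problems.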

Layer 2: ASGI interface

ASGI is the contract between the server and your application. An ASGI app is a callable that takes three things: scope (request metadata), receive (an async function to pull request body chunks), and send (an async function to push response parts back out).

python

from typing import Any

# This is what FastAPI IS at its core — an ASGI callable
async def app(scope: dict, receive: Any, send: Any) -> None:
    if scope["type"] == "http":
        # scope contains: method, path, headers, query_string, client IP
        await send({
            "type": "http.response.start",
            "status": 200,
            "headers": [(b"content-type", b"application/json")],
        })
        await send({
            "type": "http.response.body",
            "body": b'{"ok": true}',
        })
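
To see the contract from the server's side, you can call an ASGI app by hand. The sketch below repeats the minimal app and plays the server's role with a hand-built scope; `call_app` is a hypothetical helper, and the scope contains only a handful of the fields a real server would send.

```python
import asyncio


async def app(scope: dict, receive, send) -> None:
    """The same minimal ASGI app: always answers 200 with a JSON body."""
    if scope["type"] == "http":
        await send({
            "type": "http.response.start",
            "status": 200,
            "headers": [(b"content-type", b"application/json")],
        })
        await send({"type": "http.response.body", "body": b'{"ok": true}'})


async def call_app(method: str, path: str) -> list[dict]:
    """Play the server's role: build a scope, then record what the app sends."""
    scope = {
        "type": "http",
        "asgi": {"version": "3.0"},
        "http_version": "1.1",
        "method": method,
        "path": path,
        "query_string": b"",
        "headers": [],
    }
    sent: list[dict] = []

    async def receive() -> dict:
        # An empty request body, delivered in a single chunk.
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message: dict) -> None:
        sent.append(message)

    await app(scope, receive, send)
    return sent


if __name__ == "__main__":
    for message in asyncio.run(call_app("GET", "/")):
        print(message["type"])
```

The app never touches a socket. Everything it knows about the request arrives through `scope` and `receive`, and everything it says goes out through `send`, which is what lets servers, middleware, and test clients be swapped freely.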

Layer 3: Starlette — FastAPI's foundation

FastAPI is built on Starlette. Starlette provides the routing, middleware plumbing, request and response objects, WebSocket handling, background tasks, and lifespan management. FastAPI layers OpenAPI schema generation, Pydantic validation, and dependency injection on top of it. If you strip away FastAPI's features, what you're left with is Starlette.

python

from fastapi import FastAPI
from contextlib import asynccontextmanager
from sqlalchemy.ext.asyncio import AsyncEngine, create_async_engine

engine: AsyncEngine | None = None  # set in lifespan(); None until startup has run

@asynccontextmanager
async def lifespan(app: FastAPI):
    # STARTUP: runs once when the worker starts
    global engine
    engine = create_async_engine("postgresql+asyncpg://...", pool_size=10)
    print("DB pool created")

    yield  # application runs here

    # SHUTDOWN: runs once when the worker stops (SIGTERM)
    await engine.dispose()
    print("DB pool closed")

app = FastAPI(lifespan=lifespan)

# Lifespan runs ONCE per worker process
# If you have 4 workers: 4 separate engines, 4 separate pools
# Steady-state connections = workers * pool_size
# Peak connections = workers * (pool_size + max_overflow)
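
The connection arithmetic is worth doing before you pick a worker count. SQLAlchemy's queue pool lets each engine open up to `max_overflow` extra connections beyond `pool_size` under load, and that setting defaults to 10. The helper below is a hypothetical name of mine, just to make the upper bound explicit.

```python
def max_db_connections(workers: int, pool_size: int, max_overflow: int = 10) -> int:
    """Upper bound on connections opened across all workers.

    Each worker owns one pool, and a pool may grow to pool_size + max_overflow.
    """
    return workers * (pool_size + max_overflow)


if __name__ == "__main__":
    # 4 workers, pool_size=10, default overflow: up to 80 connections.
    print(max_db_connections(workers=4, pool_size=10))
    # With overflow disabled, the bound drops to workers * pool_size = 40.
    print(max_db_connections(workers=4, pool_size=10, max_overflow=0))
```

Compare the result against the `max_connections` setting of your Postgres server (100 by default), and remember that every additional replica of the service multiplies the number again.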

Layer 4: middleware execution order

Middleware in Starlette wraps the app in a stack, and the order surprises people. The last middleware added is the outermost wrapper, so it runs first on requests and last on responses. The first one added sits tightest around the handler. I've debugged more than one production mystery that turned out to be middleware ordering.

python

import time
from starlette.middleware.base import BaseHTTPMiddleware
from fastapi import FastAPI, Request

# Middleware classes must be defined before they are registered
class RequestTimingMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request: Request, call_next):
        start = time.perf_counter()
        response = await call_next(request)  # proceeds to next middleware/handler
        elapsed = time.perf_counter() - start
        response.headers["X-Response-Time"] = f"{elapsed:.3f}s"
        return response

app = FastAPI(lifespan=lifespan)

# DatabaseSessionMiddleware is assumed to be defined elsewhere in the project.
# Added first = innermost = runs last on request, first on response
app.add_middleware(DatabaseSessionMiddleware)

# Added last = outermost = runs first on request, last on response
app.add_middleware(RequestTimingMiddleware)

# Request flow:
# RequestTimingMiddleware.before -> DatabaseSessionMiddleware.before
# -> route handler
# -> DatabaseSessionMiddleware.after -> RequestTimingMiddleware.after

# CRITICAL: middleware added via app.add_middleware() runs in REVERSE order
# The last add_middleware call becomes the outermost wrapper around the app
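
The ordering is easier to trust once you have watched it happen. Starlette puts each newly added middleware at the front of its list and then wraps the app starting from the back of that list. The sketch below models that behaviour in plain Python: `MiniApp` and `make_middleware` are hypothetical names, and the real stack also contains Starlette's own error-handling layers, which are left out here.

```python
import asyncio


def make_middleware(name: str, log: list[str]):
    """Return a middleware factory that records when it runs."""
    def factory(app):
        async def wrapped(request: str) -> str:
            log.append(f"{name} before")
            response = await app(request)
            log.append(f"{name} after")
            return response
        return wrapped
    return factory


class MiniApp:
    """Toy model of the registration and wrapping order, not Starlette itself."""

    def __init__(self, handler) -> None:
        self.handler = handler
        self.user_middleware: list = []

    def add_middleware(self, factory) -> None:
        # New entries go to the FRONT of the list.
        self.user_middleware.insert(0, factory)

    def build_stack(self):
        app = self.handler
        # Wrap from the back of the list, so the front entry ends up outermost.
        for factory in reversed(self.user_middleware):
            app = factory(app)
        return app


def run_demo() -> list[str]:
    log: list[str] = []

    async def handler(request: str) -> str:
        log.append("handler")
        return "response"

    app = MiniApp(handler)
    for name in ("A", "B", "C"):
        app.add_middleware(make_middleware(name, log))
    asyncio.run(app.build_stack()("request"))
    return log


if __name__ == "__main__":
    print(" -> ".join(run_demo()))
```

Adding A, B, C in that order gives C, B, A on the way in and A, B, C on the way out. If timing or request-ID middleware has to see everything, register it last.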

Layer 5: dependency injection

Dependency injection is the part of FastAPI I actually like. Dependencies are async or sync callables declared with Depends(). They can yield resources for cleanup, chain into other dependencies, and get cached for the duration of a single request.

python

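from collections.abc import AsyncGenerator

from fastapi import Depends, Header, HTTPException
from sqlalchemy import select
from sqlalchemy.ext.asyncio import AsyncSession

# AsyncSessionLocal, User, and verify_jwt_and_fetch_user are assumed to be
# defined elsewhere in the project (session factory, ORM model, auth helper).

# Generator-style dependency: yield + cleanup
async def get_db() -> AsyncGenerator[AsyncSession, None]: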
    async with AsyncSessionLocal() as session:
        try:
            yield session
            await session.commit()
        except Exception:
            await session.rollback()
            raise

# Dependency chaining
async def get_current_user(
    token: str = Header(alias="Authorization"),
    db: AsyncSession = Depends(get_db),
) -> User:
    user = await verify_jwt_and_fetch_user(token, db)
    if not user:
        raise HTTPException(status_code=401)
    return user

# Scoped dependencies
async def require_admin(user: User = Depends(get_current_user)) -> User:
    if user.role != "admin":
        raise HTTPException(status_code=403)
    return user

@app.get("/admin/users")
async def list_users(
    admin: User = Depends(require_admin),
    db: AsyncSession = Depends(get_db),  # same instance as in get_current_user!
):
    # FastAPI caches dependencies per request by default
    # get_db() is called ONCE even though two places depend on it
    # execute() returns a Result; unwrap it into a list of User rows
    result = await db.execute(select(User))
    return result.scalars().all()
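
Per-request caching is the piece that makes the shared session work, so here is a toy resolver that shows the idea. It is a sketch under heavy simplification, not FastAPI's implementation: this `Depends` class and the `resolve` function are stand-ins I wrote for illustration, and yield-based cleanup is left out entirely.

```python
import asyncio
import inspect


class Depends:
    """Marker mimicking the shape of FastAPI's Depends (toy version)."""

    def __init__(self, dependency) -> None:
        self.dependency = dependency


async def resolve(func, cache: dict):
    """Resolve func's Depends() parameters, reusing results within one request."""
    kwargs = {}
    for name, param in inspect.signature(func).parameters.items():
        if isinstance(param.default, Depends):
            dep = param.default.dependency
            if dep not in cache:  # first use in this request: call it
                cache[dep] = await resolve(dep, cache)
            kwargs[name] = cache[dep]
    return await func(**kwargs)


calls: list[str] = []


async def get_db() -> object:
    calls.append("get_db")
    return object()  # stands in for a database session


async def get_current_user(db=Depends(get_db)) -> dict:
    calls.append("get_current_user")
    return {"role": "admin", "db": db}


async def list_users(user=Depends(get_current_user), db=Depends(get_db)) -> bool:
    # Both paths to get_db received the very same object.
    return user["db"] is db


async def handle_request() -> bool:
    return await resolve(list_users, cache={})  # fresh cache per request


if __name__ == "__main__":
    print(asyncio.run(handle_request()), calls)
```

The cache lives for exactly one call to `handle_request`, so a second request calls `get_db` again. In real FastAPI you can opt out of the reuse for a single parameter with `Depends(get_db, use_cache=False)`.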

Layer 6: route matching and validation

python

from pydantic import BaseModel, Field

class OrderCreate(BaseModel):
    user_id: int = Field(gt=0)
    amount: float = Field(gt=0, le=100000)
    items: list[str] = Field(min_length=1)

@app.post("/orders", status_code=201)
async def create_order(
    order: OrderCreate,  # Pydantic validates the request body
    db: AsyncSession = Depends(get_db),
    current_user: User = Depends(get_current_user),
):
    # By the time this runs:
    # 1. The request side of every middleware has run (timing, auth headers, etc.)
    # 2. Pydantic has validated and type-coerced the request body
    # 3. All dependencies have been resolved
    # 4. No dependency raised HTTPException (otherwise this function is never called)
    new_order = Order(user_id=order.user_id, amount=order.amount)
    db.add(new_order)
    await db.flush()  # send the INSERT so the primary key is populated; get_db commits later
    return new_order

Exception handling: where errors go

python

import traceback

from fastapi import Request
from fastapi.responses import JSONResponse

# Global exception handler: catches unhandled exceptions from any route
@app.exception_handler(Exception)
async def global_exception_handler(request: Request, exc: Exception):
    # Log the full traceback here
    traceback.print_exception(type(exc), exc, exc.__traceback__)
    # request_id is assumed to be set by an earlier middleware; fall back to None
    request_id = getattr(request.state, "request_id", None)
    return JSONResponse(
        status_code=500,
        content={"error": "Internal server error", "request_id": request_id},
    )

# Custom exception for domain errors
class DomainError(Exception):
    def __init__(self, message: str, code: int = 400):
        self.message = message
        self.code = code

@app.exception_handler(DomainError)
async def domain_error_handler(request: Request, exc: DomainError):
    return JSONResponse(status_code=exc.code, content={"error": exc.message})
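
With both handlers registered, which one fires? Starlette picks a handler by walking the raised exception's class hierarchy, so the most specific registered class wins regardless of registration order. The sketch below models that lookup in plain Python. `OutOfStock`, `HANDLERS`, and `dispatch` are hypothetical names, and the model glosses over the fact that the catch-all `Exception` handler actually lives in a separate, outermost error layer.

```python
class DomainError(Exception):
    def __init__(self, message: str, code: int = 400) -> None:
        super().__init__(message)
        self.message = message
        self.code = code


class OutOfStock(DomainError):
    """Hypothetical subclass, used only to show inherited handler lookup."""


def handle_domain_error(exc: DomainError) -> tuple[int, dict]:
    return exc.code, {"error": exc.message}


def handle_any(exc: Exception) -> tuple[int, dict]:
    return 500, {"error": "Internal server error"}


HANDLERS = {DomainError: handle_domain_error, Exception: handle_any}


def dispatch(exc: Exception) -> tuple[int, dict]:
    """Pick the handler registered for the nearest class in the exception's MRO."""
    for cls in type(exc).__mro__:
        if cls in HANDLERS:
            return HANDLERS[cls](exc)
    raise exc  # nothing registered: let it propagate


if __name__ == "__main__":
    print(dispatch(OutOfStock("no widgets left", code=409)))
    print(dispatch(ValueError("boom")))
```

A subclass of `DomainError` is served by the `DomainError` handler without any extra registration, while anything unrelated falls through to the generic 500 handler.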

Quiz: test your understanding

Before moving on, answer these in your head (or out loud):

1. You have 4 Uvicorn workers and your DB pool_size is 10. How many total Postgres connections does your app open? What happens when one worker crashes?
2. Two dependencies in the same request both declare Depends(get_db) (e.g., in a dependency chain). How many database sessions are created for that request?
3. You add three middlewares A, B, C with app.add_middleware() in that order. In what order do they execute for an incoming request? For the outgoing response?
4. What is the difference between startup event handlers and lifespan? Why does Starlette recommend lifespan?
5. A dependency raises an HTTPException(401). What happens to the other dependencies in the same request that have not yet run? What about cleanup in dependencies that already yielded?

Next up: Part 09: async vs sync in Python. When async actually helps, when it hurts, and the CPU-bound trap most developers walk straight into.
