FastAPI 102: Async vs Sync — What Actually Happens
In Part 1 we traced the full path a request
takes through FastAPI — ASGI server, middleware, dependency injection, handler. One of the key
questions left open: should your handler be async def or def? Most developers
pick one and stick with it. The actual answer depends on what your handler does — and getting it wrong
either kills your throughput silently or adds unnecessary overhead. This is Part 2: what actually
happens at the runtime level for each choice.
The event loop: one thread, cooperative multitasking
Uvicorn (FastAPI's ASGI server) runs each worker process on a single-threaded event loop — Python's asyncio
event loop. "Single-threaded" doesn't mean it can only handle one request at a time — it means it
handles all requests on one thread, switching between them cooperatively.
The key word is cooperatively. The event loop runs one coroutine at a time. A coroutine
"yields" control back to the event loop when it hits an await expression. While it's
waiting (for a network response, a DB query, a file read), the event loop runs other coroutines.
When the awaited operation completes, the coroutine is resumed.
Event Loop Thread:
T=0ms   Handle request A: start processing
T=5ms   Request A hits "await db.query()" — yields to event loop
T=5ms   Handle request B: start processing
T=8ms   Request B hits "await http_client.get()" — yields
T=12ms  DB query for A completes → resume request A
T=15ms  HTTP call for B completes → resume request B
T=16ms  Request A sends response
T=18ms  Request B sends response

All of this happened on ONE thread.
This is why async is powerful for I/O-bound work. While waiting for the database, you're not burning a thread — the thread is handling other requests.
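The interleaving in the timeline above can be reproduced with plain asyncio. A minimal sketch, using asyncio.sleep as a stand-in for real I/O (no DB or HTTP involved):

```python
import asyncio

async def handle(name: str, io_ms: int, log: list):
    log.append(f"{name}: start")
    await asyncio.sleep(io_ms / 1000)  # simulated I/O: yields to the event loop
    log.append(f"{name}: resume")

async def main():
    log = []
    # Both "requests" run concurrently on one thread
    await asyncio.gather(handle("A", 12, log), handle("B", 7, log))
    return log

log = asyncio.run(main())
print(log)  # A starts, B starts, B resumes first (shorter "I/O"), then A
```

A starts first but B finishes its wait sooner, so B resumes first, exactly as in the timeline.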
What actually happens with async def
When you define a route with async def, FastAPI calls it as a coroutine directly on
the event loop. No thread pool, no overhead — just the event loop running your code until it hits
an await, at which point it yields and handles other work.
import httpx
from fastapi import FastAPI
app = FastAPI()
# ✅ Correct async def — awaits an async I/O operation
@app.get("/users/{id}")
async def get_user(id: int):
    async with httpx.AsyncClient() as client:
        response = await client.get(f"https://api.example.com/users/{id}")
        # While waiting for HTTP response, event loop handles other requests
        return response.json()
When await client.get(...) is called, the coroutine suspends, the event loop runs
other coroutines, and when the HTTP response arrives, your coroutine resumes. No extra threads are
used — everything stays on the event loop thread. Thousands of concurrent requests are possible on one CPU core.
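That "one thread" claim is easy to verify. A sketch that runs a thousand fake request handlers concurrently and records which OS thread each one executed on:

```python
import asyncio
import threading

async def fake_request(i: int, thread_ids: set):
    thread_ids.add(threading.get_ident())  # record the executing thread
    await asyncio.sleep(0.01)              # simulated I/O
    return i

async def main():
    thread_ids = set()
    results = await asyncio.gather(*(fake_request(i, thread_ids) for i in range(1000)))
    return results, thread_ids

results, thread_ids = asyncio.run(main())
print(len(results), len(thread_ids))  # 1000 coroutines, 1 thread
```

All 1000 handlers overlap in time, yet every one of them ran on the same thread.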
The silent killer: blocking inside async def
Here's where developers get burned. If you call a blocking function inside async def,
you don't get an error — the code just runs. But blocking the event loop means nothing else can run
until your blocking call returns.
import requests # sync HTTP library
# ❌ WRONG — requests.get() is blocking. It blocks the entire event loop.
# ALL other concurrent requests wait while this network call is in progress.
@app.get("/users/{id}")
async def get_user_broken(id: int):
    response = requests.get(f"https://api.example.com/users/{id}")
    # While this takes 200ms, every other request is frozen
    return response.json()
# ✅ CORRECT — httpx.AsyncClient is async-native
@app.get("/users/{id}")
async def get_user_fixed(id: int):
    async with httpx.AsyncClient() as client:
        response = await client.get(f"https://api.example.com/users/{id}")
        return response.json()
The same trap exists with database drivers. Calling a synchronous SQLAlchemy session inside
async def blocks the event loop for every query.
from fastapi import Depends
from sqlalchemy import select
from sqlalchemy.orm import Session
from sqlalchemy.ext.asyncio import AsyncSession, create_async_engine
# ❌ Sync SQLAlchemy inside async def — blocks event loop
@app.get("/orders")
async def get_orders_broken(db: Session = Depends(get_sync_db)):
    return db.query(Order).all()  # Blocking DB call — freezes event loop
# ✅ Async SQLAlchemy — correct
@app.get("/orders")
async def get_orders_fixed(db: AsyncSession = Depends(get_async_db)):
    result = await db.execute(select(Order))
    return result.scalars().all()
What actually happens with def
When you define a route with def (no async), FastAPI does not call it on the event
loop. Instead, it offloads the call to a thread pool: Starlette's run_in_threadpool, which
delegates to anyio.to_thread.run_sync (older versions called loop.run_in_executor on the
asyncio default executor directly). Either way, your function runs in a worker thread backed
by a ThreadPoolExecutor-style pool.
# Conceptually, FastAPI does something like this for a def route:
import asyncio
from functools import partial

loop = asyncio.get_running_loop()
result = await loop.run_in_executor(None, partial(your_sync_handler, **kwargs))
This means:
- Your function runs in a separate thread — blocking I/O doesn't freeze the event loop
- The event loop is free to handle other requests while your thread is blocked
- But there's overhead: thread creation/pooling, context switching, GIL interaction
- The thread pool has a limited size — under high concurrency, requests queue waiting for a thread
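The difference is observable. In this sketch, a blocking call runs in the default executor while an async ticker keeps making progress on the event loop; if the blocking call ran on the loop itself, the ticks would stop:

```python
import asyncio
import time

def blocking_io():
    time.sleep(0.2)  # blocks its worker thread, not the event loop
    return "done"

async def ticker(log: list):
    for _ in range(4):
        log.append("tick")       # proof the loop is still running
        await asyncio.sleep(0.04)

async def main():
    log = []
    loop = asyncio.get_running_loop()
    result, _ = await asyncio.gather(
        loop.run_in_executor(None, blocking_io),  # default ThreadPoolExecutor
        ticker(log),
    )
    return result, log

result, log = asyncio.run(main())
print(result, log)
```

Swap the run_in_executor call for a bare blocking_io() and the four ticks would only appear after the 200ms sleep finishes.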
import requests
from sqlalchemy.orm import Session
# ✅ def with sync libraries — FastAPI runs this in a thread pool
# Blocking requests.get() is fine here — it blocks the thread, not the event loop
@app.get("/users/{id}")
def get_user_sync(id: int, db: Session = Depends(get_sync_db)):
    response = requests.get(f"https://api.example.com/users/{id}")
    user = db.query(User).filter(User.id == id).first()
    return {"user": user, "external": response.json()}
The decision framework
Here's the decision tree for every route and dependency you write:
What does my function do?
│
├─► I/O-bound work (DB, HTTP, file, cache)
│ │
│ ├─► Async library available (httpx, asyncpg, aioredis)?
│ │ └─► async def + await ✅ (best performance)
│ │
│ └─► Only sync library available (requests, psycopg2)?
│ └─► def (thread pool) ✅ (safe, not optimal)
│
├─► CPU-bound work (image processing, ML inference, heavy computation)
│ └─► async def + run_in_executor with ProcessPoolExecutor ✅
│ (thread pool won't help — Python GIL limits CPU parallelism)
│
└─► Fast/trivial computation (no I/O)
└─► Either works — async def if already in async context
import asyncio
import io
from concurrent.futures import ProcessPoolExecutor

from fastapi import UploadFile
from fastapi.responses import StreamingResponse

executor = ProcessPoolExecutor()

# CPU-bound: image resizing, ML inference, etc.
# (resize_sync is your blocking, CPU-heavy function)
@app.post("/resize")
async def resize_image(file: UploadFile):
    image_bytes = await file.read()
    # Offload CPU work to a process pool — sidesteps the GIL, event loop stays free
    loop = asyncio.get_running_loop()
    resized = await loop.run_in_executor(executor, resize_sync, image_bytes)
    return StreamingResponse(io.BytesIO(resized), media_type="image/jpeg")
Real code comparisons
Requests vs httpx
import requests
import httpx
# SYNC — use inside def routes or run_in_executor
def fetch_user_sync(user_id: int) -> dict:
    response = requests.get(f"https://api.example.com/users/{user_id}")
    response.raise_for_status()
    return response.json()
# ASYNC — use inside async def routes
async def fetch_user_async(user_id: int) -> dict:
    async with httpx.AsyncClient() as client:
        response = await client.get(f"https://api.example.com/users/{user_id}")
        response.raise_for_status()
        return response.json()
# Performance comparison under 100 concurrent requests:
# requests (in def, thread pool): ~2.1s total (thread overhead, GIL contention)
# httpx async (in async def): ~0.4s total (single thread, cooperative switching)
SQLAlchemy sync vs async
from sqlalchemy import create_engine, select
from sqlalchemy.orm import sessionmaker
from sqlalchemy.ext.asyncio import AsyncSession, create_async_engine, async_sessionmaker
# Sync engine — use with def routes
sync_engine = create_engine("postgresql://user:pass@localhost/db")
SyncSession = sessionmaker(bind=sync_engine)
def get_sync_db():
    db = SyncSession()
    try:
        yield db
    finally:
        db.close()
# Async engine — use with async def routes
async_engine = create_async_engine("postgresql+asyncpg://user:pass@localhost/db")
AsyncSessionLocal = async_sessionmaker(async_engine, class_=AsyncSession)
async def get_async_db():
    async with AsyncSessionLocal() as session:
        yield session
# Usage
@app.get("/orders/sync")
def list_orders_sync(db: Session = Depends(get_sync_db)):
    return db.execute(select(Order)).scalars().all()

@app.get("/orders/async")
async def list_orders_async(db: AsyncSession = Depends(get_async_db)):
    result = await db.execute(select(Order))
    return result.scalars().all()
asyncio.gather: concurrent vs sequential awaits
In async def, awaiting operations sequentially means waiting for each one to complete
before starting the next. asyncio.gather() runs them concurrently — all start immediately,
you wait for all to finish.
import asyncio
import httpx
import time
# ❌ Sequential — total time = sum of all call durations
@app.get("/dashboard/slow")
async def dashboard_slow(user_id: int):
    async with httpx.AsyncClient() as client:
        orders = await client.get(f"/api/orders/{user_id}")    # 80ms
        profile = await client.get(f"/api/profile/{user_id}")  # 60ms
        balance = await client.get(f"/api/balance/{user_id}")  # 50ms
        # Total: ~190ms — each call waits for the previous
        return {
            "orders": orders.json(),
            "profile": profile.json(),
            "balance": balance.json(),
        }
# ✅ Concurrent — total time = max of all call durations
@app.get("/dashboard/fast")
async def dashboard_fast(user_id: int):
    async with httpx.AsyncClient() as client:
        # These are coroutine objects — nothing runs until gather schedules them
        orders_coro = client.get(f"/api/orders/{user_id}")
        profile_coro = client.get(f"/api/profile/{user_id}")
        balance_coro = client.get(f"/api/balance/{user_id}")
        orders, profile, balance = await asyncio.gather(
            orders_coro, profile_coro, balance_coro
        )
        # Total: ~80ms — all calls run concurrently
        return {
            "orders": orders.json(),
            "profile": profile.json(),
            "balance": balance.json(),
        }
asyncio.gather() is one of the highest-use async patterns for APIs that aggregate
data from multiple sources.
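The timing claim is easy to verify without a network. A sketch using asyncio.sleep as a stand-in for the three backend calls (80, 60, and 50 ms):

```python
import asyncio
import time

async def call(ms: int) -> int:
    await asyncio.sleep(ms / 1000)  # stand-in for an HTTP call
    return ms

async def sequential():
    return [await call(80), await call(60), await call(50)]

async def concurrent():
    return list(await asyncio.gather(call(80), call(60), call(50)))

t0 = time.perf_counter()
asyncio.run(sequential())
seq_s = time.perf_counter() - t0   # roughly the SUM of durations: ~0.19s

t0 = time.perf_counter()
asyncio.run(concurrent())
con_s = time.perf_counter() - t0   # roughly the MAX of durations: ~0.08s

print(f"sequential={seq_s:.3f}s concurrent={con_s:.3f}s")
```

Sequential awaits pay the sum of all latencies; gather pays only the longest one.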
run_in_executor: escaping blocking code
Sometimes you have a sync library you can't replace — a legacy SDK, a C extension, a third-party
client with no async version. run_in_executor lets you call it from an async def
route without blocking the event loop.
import asyncio
from concurrent.futures import ThreadPoolExecutor
thread_pool = ThreadPoolExecutor(max_workers=20)
# Legacy sync function you can't change
def legacy_payment_process(amount: float, card_token: str) -> dict:
    return old_payment_sdk.charge(amount, card_token)  # Blocking

@app.post("/payment")
async def process_payment(payment: PaymentRequest):
    loop = asyncio.get_running_loop()
    # Run blocking function in thread pool — event loop stays free
    result = await loop.run_in_executor(
        thread_pool,
        legacy_payment_process,
        payment.amount,
        payment.card_token,
    )
    return result
# Cleaner with functools.partial for multiple args:
from functools import partial

result = await loop.run_in_executor(
    thread_pool,
    partial(legacy_payment_process, payment.amount, payment.card_token),
)
Thread pool size matters. ThreadPoolExecutor's default max_workers is min(32, os.cpu_count() + 4),
and the pool FastAPI uses for def routes (anyio's) defaults to 40 threads. For I/O-bound work
(most cases), you can set it higher. For CPU-bound work, use ProcessPoolExecutor instead.
Background tasks: what they don't do
A common misconception: background tasks run in parallel with the response. They don't — they run after the response is sent. And they share the event loop.
from fastapi import BackgroundTasks
import asyncio
async def send_analytics(user_id: int, action: str):
    # Runs AFTER response sent — but on the same event loop
    await analytics_client.track(user_id, action)  # async — OK
    # time.sleep(5) here would freeze the event loop for 5 seconds — very bad

@app.post("/purchase")
async def purchase(item_id: int, bg: BackgroundTasks, user: User = Depends(get_current_user)):
    order = await create_order(user.id, item_id)
    bg.add_task(send_analytics, user.id, "purchase")
    return order  # Client gets response; send_analytics runs after
# For fire-and-forget tasks that need true parallelism:
# Use a task queue (Celery, ARQ, Dramatiq) instead of BackgroundTasks
Quiz
Q1. A def route calls requests.get(). Does this block the event loop?
No. FastAPI runs def routes in a thread pool executor. The blocking requests.get() call blocks the thread it's running in, not the event loop. The event loop is free to handle other async requests while this thread waits for the HTTP response.
The cost: a thread is occupied for the duration of the call. Under high concurrency, you can exhaust the thread pool. But for moderate loads, def + sync library is perfectly fine.
Q2. You have an async def route that calls time.sleep(2). What happens to other requests during those 2 seconds?
They all freeze. time.sleep() is a blocking call. Inside async def, it runs on the event loop thread. For 2 seconds, the event loop is completely blocked — no other coroutines can run, no other requests can be handled.
Fix: use await asyncio.sleep(2) instead. This suspends the coroutine and yields control to the event loop, which can run other requests while waiting. If you need to sleep in a sync context, use def (runs in thread pool, sleep only blocks that thread).
Q3. You need to call a CPU-heavy image compression function inside a FastAPI route. Should you use async def or def, and with which executor?
async def with ProcessPoolExecutor. CPU-bound work is limited by Python's GIL — even in a thread pool, only one thread can execute Python bytecode at a time, so multiple threads don't parallelize CPU work.
A ProcessPoolExecutor spawns separate processes that bypass the GIL, enabling true parallel CPU execution. Use await loop.run_in_executor(process_executor, compress_image, data) from an async def route to run CPU work in a process while the event loop stays free.
Plain def runs in a thread pool — it would work (blocks a thread, not the event loop), but doesn't give you CPU parallelism.
Q4. You use asyncio.gather() to fire 5 HTTP requests simultaneously from a route. One of them raises an exception. What happens to the others?
By default, the first exception propagates immediately out of the await asyncio.gather(...) — but the other tasks are not cancelled. They keep running; their results are simply discarded. (gather() only cancels the remaining tasks if the gather call itself is cancelled.)
To handle errors per-task instead, use return_exceptions=True:
results = await asyncio.gather(
    task1(), task2(), task3(), task4(), task5(),
    return_exceptions=True  # Returns exceptions as values instead of raising
)
# results is a list — some may be exceptions, some may be actual values
for r in results:
    if isinstance(r, Exception):
        log.error(f"Task failed: {r}")
    else:
        process(r)
Use return_exceptions=True when you want best-effort behavior — get as many results as possible even if some calls fail.
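A runnable sketch of the best-effort behavior: with return_exceptions=True, the failure comes back as a value alongside the successful results instead of aborting the whole gather.

```python
import asyncio

async def ok(i: int) -> int:
    await asyncio.sleep(0.01)
    return i

async def boom():
    await asyncio.sleep(0.005)
    raise ValueError("upstream failed")

async def main():
    # Best-effort: exceptions are returned in-place, not raised
    return await asyncio.gather(ok(1), boom(), ok(3), return_exceptions=True)

results = asyncio.run(main())
print(results)  # [1, ValueError('upstream failed'), 3]
```

Drop return_exceptions=True and the same gather raises ValueError as soon as boom() fails, while ok(1) and ok(3) keep running to completion in the background.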
That covers the two biggest conceptual hurdles in FastAPI: the full request lifecycle (Part 1) and the async vs sync model (Part 2). Together they explain most of the production bugs and performance surprises you'll encounter. Next up — Part 3: API Design — Pagination, Filtering & Error Handling at Scale. Cursor vs offset pagination, Pydantic query param filtering, and enforcing one error shape across the entire API.