The Shared State Trap: How a FastAPI 'Optimisation' Leaked User Data
March 11, 2026 · Python · 9 min read


We replaced Flask's request-scoped g with a plain module-level dict during our FastAPI migration. It worked perfectly in tests and staging. In production, under concurrent load, it silently served one tenant's data to a completely different user — for three days before we caught it.

The Rewrite Nobody Questioned

Four years into running a Flask 1.x reporting API, we decided to rewrite it in FastAPI. The pitch was sound: native async support for slow I/O endpoints, automatic request validation via Pydantic, and OpenAPI docs that would actually stay in sync with reality. Management approved. Engineering was excited. Two sprints later, tests were green, our slowest endpoints were 40% faster in load tests, and we deployed on a Wednesday.

By all appearances, the migration was a success. Our dashboards glowed a healthy green. Then, seventy-two hours later, the support tickets started arriving.

Wrong Data, Zero Errors

"I'm seeing reports that don't belong to my company."

The first ticket we dismissed as a frontend cache glitch. The second made us nervous. The third — with a screenshot — confirmed our worst fear: a multi-tenant data leak. Users were receiving valid, well-formed API responses with correct HTTP 200 status codes, but the data inside belonged to a different organisation.

The terrifying part? The logs were completely clean. No exceptions. No 500 errors. No suspicious query patterns. No anomalous latency spikes. Just a steady stream of healthy 200 responses that happened to contain the wrong organisation's data.

I spent the next afternoon adding deep instrumentation — logging the org_id extracted from the JWT at auth time, the org_id passed to each database query, and the org_id present on the returned rows. I deployed and waited. When the next incident hit, the log line read:

auth.org_id=2041 → query.org_id=2041 → result.org_id=1038

The auth was correct. The org_id we logged at the query call site was correct. The data that came back wasn't. Either the data wasn't coming from the database at all, or the value actually used in the WHERE clause wasn't the one we logged.

The "Optimisation" That Broke Everything

In the old Flask codebase, we used flask.g extensively — Flask's request-scoped proxy that stores arbitrary per-request data for the duration of a single request. It was how we passed context (org ID, user ID, request metadata) down through deep call chains without threading it through every function signature. It was convenient. It was idiomatic Flask. It worked reliably for four years.

During the FastAPI migration, a member of the team replaced flask.g with what seemed like an equivalent: a module-level dictionary. Cleaner, they thought. No import from Flask. More "Pythonic."

services/context.py — the broken version
# Looked harmless. Was catastrophic.
_request_context: dict = {}

def set_context(org_id: int, user_id: int) -> None:
    _request_context["org_id"] = org_id
    _request_context["user_id"] = user_id

def get_org_id() -> int:
    return _request_context.get("org_id")

# Used in the route handler:
@router.get("/reports/{report_id}")
async def get_report(
    report_id: int,
    token: TokenData = Depends(verify_token),
):
    set_context(token.org_id, token.user_id)  # Set context for this "request"
    await asyncio.sleep(0)                     # Yield to event loop (batching)
    report = await fetch_report(report_id)    # Calls get_org_id() internally
    return report

In Flask, this pattern is safe. Flask uses Werkzeug's LocalProxy backed by threading.local() under the hood. With a thread-per-request model, each thread has its own isolated copy of any thread-local variable. Setting and reading _request_context from Flask's g is inherently scoped to one request, one thread.
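The thread-local mechanism is easy to demonstrate in isolation. Here is a minimal standalone sketch (hypothetical, not from the original codebase): two threads write different values to the same threading.local() attribute, pause long enough for the other thread to run, and each still reads back only its own value.

```python
import threading
import time

_local = threading.local()
results = {}

def handle_request(name: str, org_id: int) -> None:
    _local.org_id = org_id         # each thread writes to its own slot
    time.sleep(0.01)               # the other thread runs and sets its value here
    results[name] = _local.org_id  # still reads this thread's own value

threads = [
    threading.Thread(target=handle_request, args=("a", 2041)),
    threading.Thread(target=handle_request, args=("b", 1038)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results)  # {'a': 2041, 'b': 1038}: no cross-thread bleed
```

Swap threading.local() for a plain dict in this sketch and both threads would read whichever value was written last. That swap is exactly what the migration did.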

FastAPI is different. It runs on an async event loop. A single OS thread handles thousands of concurrent requests. That module-level _request_context dict is one object in memory, shared by every concurrent coroutine. When two requests are running simultaneously and both write to the same keys — the last write wins, and whoever reads next gets the wrong value.

How the Corruption Happens

To understand why this fails, you need to see how Python's async event loop interleaves coroutines. When a coroutine hits an await, it yields control back to the event loop, which picks up another coroutine. This cooperative scheduling is why async code is fast — it's also why shared mutable state is a trap.

BROKEN: Module-level dict, two concurrent requests

  Time │  Request A (org=2041)           Request B (org=1038)
  ─────┼──────────────────────────────────────────────────────
   t1  │  set_context(org_id=2041)
       │  _request_context = {"org_id": 2041}
   t2  │  await asyncio.sleep(0) ──────► yields to event loop
   t3  │                                 set_context(org_id=1038)
       │                                 _request_context = {"org_id": 1038}
   t4  │                                 await db.fetch(...) ──► yields
   t5  │  ◄────────────────────────────── event loop resumes A
   t6  │  get_org_id()
   t7  │  returns 1038  ✗  ← B overwrote A's key!
   t8  │  query: WHERE org_id = 1038
   t9  │  → org 1038's data returned to org 2041's user

       _request_context = {"org_id": 1038}
                          ─────────────────
                      One shared dict. All requests.

Any await is a potential interleave point. Our handler set the context, then immediately awaited — a cache lookup, a database call, sometimes just asyncio.sleep(0) for batching. In that window, another request could write to the same dict. When the first request resumed, it read the wrong org ID, queried with the wrong filter, and returned the wrong tenant's data.

Under low load, the timing rarely aligned. Under production load with dozens of concurrent requests, it happened constantly. Because the responses were structurally valid — correct JSON shape, HTTP 200, real data — no automated monitor caught it. There was nothing to catch. From the system's perspective, everything was working.
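The interleaving can be reproduced in a few lines. This is a hypothetical standalone repro, not the production code: the same module-level dict pattern, two concurrent coroutines, and a deterministic wrong answer.

```python
import asyncio

_request_context: dict = {}

async def handle(org_id: int) -> int:
    _request_context["org_id"] = org_id  # "set context" for this request
    await asyncio.sleep(0)               # yield point: other requests run here
    return _request_context["org_id"]    # may now hold another request's value

async def main() -> list:
    # Run two "requests" concurrently against the shared dict.
    return await asyncio.gather(handle(2041), handle(1038))

seen = asyncio.run(main())
print(seen)  # [1038, 1038]: request A read request B's org_id
```

Request A writes 2041, yields, request B overwrites the same key with 1038, and both coroutines then read 1038. Note that no artificial delay is needed; a single zero-second sleep at the wrong moment is enough.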

The Fix: contextvars

Python 3.7 introduced contextvars — a module designed exactly for this problem. A ContextVar is automatically scoped to the current async task (or OS thread). Each coroutine gets its own isolated binding. It is the async-native equivalent of thread-local storage, and it works correctly across await boundaries.

services/context.py — the fixed version
from contextvars import ContextVar
from typing import Optional

# Each async task gets its own isolated copy of these values.
# ContextVar is safe across await boundaries — no shared state.
_org_id_var: ContextVar[Optional[int]] = ContextVar("org_id", default=None)
_user_id_var: ContextVar[Optional[int]] = ContextVar("user_id", default=None)

def set_context(org_id: int, user_id: int) -> None:
    _org_id_var.set(org_id)
    _user_id_var.set(user_id)

def get_org_id() -> int:
    org_id = _org_id_var.get()
    if org_id is None:
        raise RuntimeError("org_id not set — is set_context() missing from this path?")
    return org_id

def get_user_id() -> int:
    user_id = _user_id_var.get()
    if user_id is None:
        raise RuntimeError("user_id not set — is set_context() missing from this path?")
    return user_id

When Request A calls _org_id_var.set(2041), Python stores that binding in A's execution context, a lightweight mapping that each asyncio task copies when it is created and carries across every await. When Request B calls _org_id_var.set(1038), it writes to B's context. The two never touch.

FIXED: ContextVar, two concurrent requests

  Time │  Request A (org=2041)           Request B (org=1038)
  ─────┼──────────────────────────────────────────────────────
   t1  │  _org_id_var.set(2041)
       │  Context A: { _org_id_var → 2041 }
   t2  │  await asyncio.sleep(0) ──────► yields to event loop
   t3  │                                 _org_id_var.set(1038)
       │                                 Context B: { _org_id_var → 1038 }
   t4  │                                 await db.fetch(...) ──► yields
   t5  │  ◄────────────────────────────── event loop resumes A
   t6  │  _org_id_var.get()
   t7  │  returns 2041  ✓  ← reads from A's own context
   t8  │  query: WHERE org_id = 2041
   t9  │  → org 2041's data returned to org 2041's user  ✓

       Context A: { _org_id_var: 2041 }   ← isolated
       Context B: { _org_id_var: 1038 }   ← isolated

One import swap. One class change. That's all it took to fix the bug. The damage it caused took considerably longer to address.
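The isolation is easy to verify in a few lines of standalone code (hypothetical, not our actual test suite): two concurrent coroutines set the same ContextVar to different values, yield to each other, and each reads back its own.

```python
import asyncio
from contextvars import ContextVar

_org_id_var: ContextVar[int] = ContextVar("org_id")

async def handle(org_id: int) -> int:
    _org_id_var.set(org_id)   # binds in this task's context only
    await asyncio.sleep(0)    # the other request runs and sets its own value
    return _org_id_var.get()  # unaffected by the other task's set()

async def main() -> list:
    return await asyncio.gather(handle(2041), handle(1038))

seen = asyncio.run(main())
print(seen)  # [2041, 1038]: each request reads back its own org_id
```

asyncio.gather wraps each coroutine in a Task, and each Task snapshots the current context at creation, which is why a set() inside one task can never leak into another.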

An Honest Post-Mortem

We ran a full audit of every affected request — three days of logs, cross-referenced against support tickets and org ID mismatches in our access logs. We identified seventeen tenants who had received at least one response containing another tenant's data. We disclosed to every one of them individually, revoked the affected report exports, and filed a GDPR incident report.

It was one of the most uncomfortable conversations I've had with clients. The data involved wasn't especially sensitive — aggregated analytics, not financial records or PII — but that barely softened it. Data isolation is a contract. You can't partially honour a contract and call it a success.

  • 3 days undetected
  • 17 affected tenants
  • 1 import to fix it
  • 0 automated alerts fired

What We Changed After

Beyond the immediate fix, we made three structural changes to prevent a recurrence:

  • Explicit over implicit context: We deprecated the context helpers entirely on new endpoints. org_id and user_id are now injected via FastAPI's Depends() system as typed parameters. Every function that needs the org ID receives it explicitly — the data flow is visible in every function signature, not hidden in a global.
  • Cross-tenant isolation tests: We added integration tests that fire two concurrent requests for different orgs and assert each response contains only data belonging to the requesting org. These tests run in CI on every PR and took about three hours to write. They would have caught this bug in staging immediately.
  • Module-level state lint rule: We added a custom Pylint rule that flags any mutable module-level dict or list inside the services/ directory. Module-level state is fine for config and constants — not for per-request data. The linter makes the distinction enforced, not advisory.

The Broader Lesson

The mistake wasn't carelessness. The developer who introduced it was experienced. The pattern — storing request context in a "global" — is completely normal in Flask, Django, and every other thread-per-request framework. It's how you avoid prop-drilling context through twenty function signatures. For four years it had worked without issue.

The problem was translating a thread-safe pattern to an async context without understanding what made it thread-safe in the first place.

Flask's g isn't just a dict. It's backed by LocalProxy, which wraps threading.local(). The safety is invisible unless you've read the source. When we copied the pattern without copying the mechanism, we got all of the convenience and none of the isolation.

When migrating from a synchronous to an asynchronous framework, every piece of "ambient" state deserves a hard look. Thread-local storage, request-local proxies, singleton caches — they all behave differently when your execution model changes. What was safe in a thread-per-request world can become a data leak in an async one.

If you're running FastAPI and passing context through your call chain via anything other than explicit parameters or ContextVar, I'd audit it today. Not tomorrow. Today. Silent data leaks are patient. They wait for the right concurrency timing, then they show up in a support ticket with a screenshot.

Rey, writing for Darshan Turakhia · March 2026