
Designing a Real-Time Cinema Seat Booking System

March 21, 2026 · 18 min read · Part 01 / 06

A mid-size cinema chain (45 theaters across 12 cities in India) wants to ditch their third-party booking vendor. They're paying 18% commission per ticket, they can't run their own pricing, and their seat maps don't update in real time. The brief: build an in-house platform where 8,000 concurrent users can fight over 300 seats per screen, see holds appear within a second, and never double-book.

Stack: React PWA, Python (FastAPI), Postgres, Redis. Budget: ₹3L/month on AWS Mumbai. Team is one lead, two backend engineers, one frontend. MVP in 10 weeks for 5 pilot theaters.

Five phases. Each one builds on the last.

Phase 1: seat locking without double-booking

The core problem looks simple. Hold a seat for 8 minutes while someone pays. If they pay, confirm it. If they don't, release it. And never let two people book the same seat.

Two systems handle it, each with a different job.

Redis handles the temporary holds. When a user clicks seat D7, the server runs a single Redis command:

SET hold:show_123:D7 '{"user_id":"u_abc","held_at":1711024800,"price_locked":280}' EX 480 NX

Three things are happening in that one command:

EX 480: the key expires in 480 seconds (8 minutes). No cleanup cron needed, Redis handles it.
NX: "only set if the key does not exist." This is atomic. If two users click D7 at the same millisecond, only one SET NX succeeds. The other gets nil.
The value includes price_locked, the price at the moment of selection. More on this in Phase 5.

Redis SET NX is your first line of defense. Redis executes commands one at a time on a single thread, so each SET NX is atomic. Two concurrent requests for the same seat are serialized: one wins, one loses, and there is no window for a race between them.
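
Here is a minimal sketch of that hold call from Python. It assumes a redis-py style client whose set() accepts ex= and nx= and returns None when NX refuses the write. The names try_hold, hold_key and HOLD_TTL_SECONDS are mine, not part of any library.

```python
import json
import time

HOLD_TTL_SECONDS = 480  # 8 minutes, matching EX 480 above


def hold_key(showtime_id: str, seat_id: str) -> str:
    """Build the Redis key for a seat hold, e.g. hold:show_123:D7."""
    return f"hold:{showtime_id}:{seat_id}"


def try_hold(client, showtime_id: str, seat_id: str, user_id: str, price: int) -> bool:
    """Try to take the hold. True if this user now holds the seat.

    `client` is assumed to behave like redis-py: set(..., nx=True) returns
    a truthy value on success and None when the key already exists.
    """
    payload = json.dumps({
        "user_id": user_id,
        "held_at": int(time.time()),
        "price_locked": price,
    })
    result = client.set(
        hold_key(showtime_id, seat_id),
        payload,
        ex=HOLD_TTL_SECONDS,  # Redis expires the hold on its own
        nx=True,              # only set if nobody holds the seat yet
    )
    return bool(result)
```

The caller never reads before writing. There is no "check if free, then set" step, because that gap is exactly where a race would live.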

Postgres handles confirmed bookings. When payment succeeds, the server writes to a bookings table with a unique constraint:

CREATE TABLE bookings (
    id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    showtime_id UUID NOT NULL REFERENCES showtimes(id),
    seat_id     VARCHAR(4) NOT NULL,       -- "D7", "A12"
    user_id     UUID NOT NULL REFERENCES users(id),
    price       NUMERIC(8,2) NOT NULL,
    status      VARCHAR(20) DEFAULT 'confirmed',
    created_at  TIMESTAMPTZ DEFAULT now(),

    UNIQUE (showtime_id, seat_id)          -- THE constraint that prevents double-booking
);

The booking insert uses ON CONFLICT:

INSERT INTO bookings (showtime_id, seat_id, user_id, price, status)
VALUES ($1, $2, $3, $4, 'confirmed')
ON CONFLICT (showtime_id, seat_id) DO NOTHING
RETURNING *;

If the insert returns a row, the booking succeeded. If it returns nothing, someone else already booked that seat. One query, no race condition. Postgres enforces the rule through the unique index, inside the database, no matter what any application server believes.
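
To watch the pattern work without standing up Postgres, here is a small sketch using Python's built-in sqlite3 as a stand-in (it needs SQLite 3.24 or newer for ON CONFLICT). This is not the production code path: the table is trimmed down, and I check cursor.rowcount instead of RETURNING. The idea is the same, though: the unique constraint decides, and the application just reads the verdict.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """In-memory stand-in for the bookings table (trimmed columns)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE bookings (
            showtime_id TEXT NOT NULL,
            seat_id     TEXT NOT NULL,
            user_id     TEXT NOT NULL,
            price       INTEGER NOT NULL,
            UNIQUE (showtime_id, seat_id)
        )
        """
    )
    return conn


def confirm_booking(conn, showtime_id: str, seat_id: str, user_id: str, price: int) -> bool:
    """Insert the booking; True only if this call created the row."""
    cur = conn.execute(
        "INSERT INTO bookings (showtime_id, seat_id, user_id, price) "
        "VALUES (?, ?, ?, ?) "
        "ON CONFLICT (showtime_id, seat_id) DO NOTHING",
        (showtime_id, seat_id, user_id, price),
    )
    conn.commit()
    # rowcount is 0 when the unique constraint swallowed the insert
    return cur.rowcount == 1
```

Calling confirm_booking twice for the same showtime and seat returns True the first time and False the second, and the table still holds exactly one row.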

The two-step flow

The booking flow splits into two steps: optimistic selection, then pessimistic confirmation. (I've seen teams try to do it in one step and it never ends well.)

  SEAT SELECTION (Optimistic — Redis)
  ═══════════════════════════════════════════════

  User clicks D7
       │
       ▼
  Redis SET NX  hold:show_123:D7
       │
       ├── Success (key didn't exist)
       │       │
       │       ▼
       │   Return "held" to user
       │   Broadcast hold via WebSocket (Phase 2)
       │   Start 8-minute countdown on frontend
       │
       └── Failure (key already exists)
               │
               ▼
           Return "already held" — seat shows grey


  PAYMENT (Pessimistic — Postgres)
  ═══════════════════════════════════════════════

  User clicks "Pay ₹280"
       │
       ▼
  Backend checks Redis: does hold still exist?
  Is it this user's hold?
       │
       ├── No → "Hold expired. Please select again."
       │
       └── Yes
            │
            ▼
       Process payment via gateway
            │
            ├── Payment fails → hold remains in Redis
            │                    user can retry within 8 min
            │
            └── Payment succeeds
                     │
                     ▼
                INSERT INTO bookings ... ON CONFLICT DO NOTHING
                     │
                     ├── Row returned → BOOKED. Delete Redis hold.
                     │                  Broadcast "confirmed" via WS.
                     │
                     └── No row → someone else booked it
                                  (edge case — refund payment)

The two-step approach lets Redis handle the fast "can I grab this?" question and lets Postgres handle the slow "is this actually mine?" question. Redis absorbs the thundering herd. Postgres is the court of last resort.
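
The payment half of the flow can be sketched as one function. Everything it talks to is injected as a plain callable, so the names here (pay_for_seat, get_hold, charge, insert_booking, refund, release_hold) are placeholders of mine, not real gateway or driver APIs. Treat it as a sketch of the ordering, not a finished handler.

```python
import json
from typing import Callable, Optional


def pay_for_seat(
    get_hold: Callable[[str], Optional[str]],   # returns the hold JSON or None
    charge: Callable[[str, int], Optional[str]],  # returns a payment ref or None
    insert_booking: Callable[..., bool],        # True if the INSERT returned a row
    refund: Callable[[str], None],
    release_hold: Callable[[str], None],
    showtime_id: str,
    seat_id: str,
    user_id: str,
) -> str:
    key = f"hold:{showtime_id}:{seat_id}"

    # 1. The hold must still exist and must belong to this user.
    raw = get_hold(key)
    if raw is None:
        return "hold_expired"
    hold = json.loads(raw)
    if hold["user_id"] != user_id:
        return "hold_expired"

    # 2. Charge the price that was locked at selection time.
    price = hold["price_locked"]
    payment_ref = charge(user_id, price)
    if payment_ref is None:
        return "payment_failed"  # hold stays in place, user can retry

    # 3. Postgres has the final say.
    if insert_booking(showtime_id, seat_id, user_id, price):
        release_hold(key)
        return "booked"

    # 4. Someone else owns the row: give the money back.
    refund(payment_ref)
    return "refunded"
```

Note what does not happen on a failed payment: the hold is left alone, so the user keeps whatever is left of their 8 minutes.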

Phase 2: real-time sync — WebSocket + pub/sub

200 people are browsing the 7 PM show. User A holds seat D7. How do the other 199 people see it go grey within a second?

Polling fails fast. If 200 clients poll the server every second, that's 200 HTTP requests per second returning mostly empty responses. At peak (8,000 users across all shows), it's 8,000 req/sec of pure waste. Our ₹3L/month budget can't absorb that.

WebSocket flips the model. Instead of the client asking "anything new?", the server pushes when something happens. One persistent connection per user, tiny frame overhead (2-14 bytes per message header vs ~200-800 bytes for HTTP headers).

How the connection starts

A WebSocket connection starts as a regular HTTP request. The client sends an Upgrade: websocket header, the server responds with 101 Switching Protocols, and from that point on, the TCP connection stays open for bidirectional messaging.

Why HTTP first? Because firewalls, corporate networks, hotel WiFi, and load balancers all allow ports 80 and 443. A custom protocol on a custom port would get blocked by half the networks in India. By piggybacking on HTTP, WebSocket works everywhere HTTP works.
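
One detail of the handshake is worth seeing once. The client sends a random Sec-WebSocket-Key header, and the server's 101 response must carry a Sec-WebSocket-Accept header derived from it, as defined in RFC 6455. Your WebSocket library does this for you; the sketch below only shows the computation, and the function name accept_key is mine.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455; every WebSocket server uses the same one.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_key(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept value for a client's key."""
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


# Sample key and expected answer from RFC 6455, section 1.3
print(accept_key("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

The header proves the server actually understood the upgrade request, rather than being a plain HTTP server or cache replaying an old response.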

The multi-server problem

On a single server, broadcasting is trivial. Loop over the connections, send the message. But in production you're running multiple FastAPI instances behind an ALB. User A is on Server 1, User D is on Server 3. When A holds a seat, Server 1 knows. Server 3 has no idea.

Redis Pub/Sub solves this. Every server subscribes to a channel per showtime. When any server writes a hold, it publishes to that channel. Redis delivers the message to all subscribed servers, and each one pushes to its own local WebSocket clients.

  User A holds D7
       │
       ▼
  Server 1
       │
       ├── ① Redis SET NX (hold the seat)
       │
       └── ② Redis PUBLISH showtime:show_123
               {"seat":"D7", "status":"held", "seq":42}
                    │
          ┌─────────┼─────────────┐
          ▼         ▼             ▼
      Server 1   Server 2     Server 3
      (subscribed)(subscribed) (subscribed)
          │         │             │
          ▼         ▼             ▼
     Push to     Push to      Push to
     Users A,B   Users C,D    Users E,F
     via WS      via WS       via WS

  Total time from click to everyone seeing it: ~50-100ms
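
The last hop in that diagram, each server pushing to its own clients, can be sketched as a small in-process registry. I'm assuming each connection object has an async send_json(message) method, which is what FastAPI's WebSocket provides. The class and method names below are illustrative.

```python
import asyncio
from collections import defaultdict


class ConnectionManager:
    """Tracks which local WebSocket connections are watching which showtime."""

    def __init__(self):
        self.rooms = defaultdict(set)  # showtime_id -> set of connections

    def join(self, showtime_id: str, ws) -> None:
        self.rooms[showtime_id].add(ws)

    def leave(self, showtime_id: str, ws) -> None:
        self.rooms[showtime_id].discard(ws)
        if not self.rooms[showtime_id]:
            del self.rooms[showtime_id]

    async def broadcast(self, showtime_id: str, message: dict) -> int:
        """Send to every local viewer of the showtime; returns how many got it."""
        sockets = list(self.rooms.get(showtime_id, ()))
        results = await asyncio.gather(
            *(ws.send_json(message) for ws in sockets),
            return_exceptions=True,  # one broken socket must not stop the rest
        )
        delivered = 0
        for ws, result in zip(sockets, results):
            if isinstance(result, Exception):
                self.leave(showtime_id, ws)  # drop connections that failed
            else:
                delivered += 1
        return delivered
```

The Pub/Sub listener on each server would call broadcast(showtime_id, event) for every message it receives on showtime:<id>. Redis does the server-to-server fan-out, and this object does the last hop.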

The initial load race condition

When a user opens the seat map, two things happen in parallel: the client connects to the WebSocket and fetches the full seat map via REST. There's a window where a Pub/Sub event arrives before the REST response, and the REST response (slightly stale) overwrites it.

The fix is sequence numbers. Every seat event has a monotonically increasing seq. The REST response includes the latest seq at the time it was generated. The client buffers all WebSocket events, applies the REST snapshot, then replays only buffered events with a higher seq.

  Timeline:
  ─────────────────────────────────────────────────▶

  WS connects        REST response        Replay buffer
      │               arrives (seq:41)         │
      │  buffer:           │                   │
      │  [seq:42 D7 held]  │                   │
      │  [seq:43 A1 held]  │                   │
      │                    ▼                   ▼
      │               Set baseline        Apply seq 42, 43
      │               from REST           (both > 41, so apply)
      │
  From here on, apply events in real time.
  No data lost. No stale overwrites.
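
The buffering logic is small enough to sketch in full. The real client is the React PWA, so take this Python version as a description of the algorithm rather than shipping code. The class name and event shape ({"seq", "seat", "status"}) follow the examples above but are otherwise my assumptions.

```python
class SeatMapSync:
    """Merges a REST snapshot with WebSocket events using sequence numbers."""

    def __init__(self):
        self.seats = {}        # seat_id -> status
        self.last_seq = None   # None until the REST snapshot has been applied
        self.buffer = []       # events that arrived before the snapshot

    def on_event(self, event: dict) -> None:
        if self.last_seq is None:
            self.buffer.append(event)  # snapshot not here yet: hold the event
            return
        self._apply(event)

    def on_snapshot(self, seats: dict, seq: int) -> None:
        # The snapshot becomes the baseline...
        self.seats = dict(seats)
        self.last_seq = seq
        # ...then buffered events are replayed in order; stale ones are skipped.
        for event in sorted(self.buffer, key=lambda e: e["seq"]):
            self._apply(event)
        self.buffer.clear()

    def _apply(self, event: dict) -> None:
        if event["seq"] <= self.last_seq:
            return  # already reflected in the current state
        self.seats[event["seat"]] = event["status"]
        self.last_seq = event["seq"]
```

With the timeline above, events 42 and 43 sit in the buffer, the snapshot arrives stamped seq 41, and both events replay on top of it. An event with seq 40 that also landed in the buffer would be dropped, because the snapshot already includes it.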

Phase 3: scaling the broadcast

The math gets aggressive fast. Friday 7 PM, Avengers premiere. 500 concurrent users on one showtime, 50 seat events per second (holds, releases, confirmations). That's:

50 events/sec × 500 connections = 25,000 WebSocket sends per second

For one showtime. Your server CPU spikes to 100% and starts dropping connections.

Solution: 300ms batching

Instead of broadcasting every event immediately, the server buffers events and flushes every 300 milliseconds. In that window, 15 seat changes collapse into one message.

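import asyncio
from collections import defaultdict

class EventBatcher:
    def __init__(self, manager):
        # manager is the WebSocket connection manager; it must expose
        # an async broadcast(showtime_id, message) method.
        self.manager = manager
        self.buffers: dict[str, list] = defaultdict(list)

    async def add(self, showtime_id: str, event: dict):
        self.buffers[showtime_id].append(event)

    async def flush_loop(self):
        while True:
            await asyncio.sleep(0.3)  # 300ms batches
            for showtime_id, events in list(self.buffers.items()):
                if events:
                    # Copy and clear before awaiting, so events added
                    # during the broadcast land in the next batch.
                    batch = events.copy()
                    events.clear()
                    await self.manager.broadcast(showtime_id, {
                        "type": "batch",
                        "updates": batch
                    })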

Before: 50 events/sec × 500 users = 25,000 sends/sec. After: ~3.3 batches/sec × 500 users = 1,650 sends/sec.

A 15x reduction with zero visible impact. 300ms is imperceptible when a human is staring at a seat map.

Delta compression

Instead of sending one object per seat, compress into a diff:

// Instead of:
[{seat:"D7",status:"held"}, {seat:"D8",status:"held"}, {seat:"A3",status:"released"}]

// Send:
{held: ["D7","D8"], released: ["A3"], confirmed: []}

Smaller payload. Fewer bytes. Faster to parse on the client.
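
A sketch of that compression step in Python, assuming each event is a dict with "seat" and "status" keys and that status is always one of held, released or confirmed. The function name compress_batch is mine. One choice I've made here: if the same seat changes more than once inside a batch, only its latest status is sent.

```python
def compress_batch(events: list[dict]) -> dict:
    """Collapse a batch of seat events into one {status: [seats]} diff."""
    latest = {}  # seat_id -> most recent status within this batch
    for event in events:
        latest[event["seat"]] = event["status"]

    delta = {"held": [], "released": [], "confirmed": []}
    for seat, status in latest.items():
        delta[status].append(seat)
    return delta


batch = [
    {"seat": "D7", "status": "held"},
    {"seat": "D8", "status": "held"},
    {"seat": "A3", "status": "released"},
]
print(compress_batch(batch))
# {'held': ['D7', 'D8'], 'released': ['A3'], 'confirmed': []}
```

Keeping only the latest status per seat means a hold that is taken and released within the same 300ms window goes out as a single "released" entry instead of two contradictory ones.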

Immediate actor response

One UX detail matters here. The user who holds the seat gets an immediate response over their own WebSocket connection. Only the broadcast to everyone else is batched. So User A clicks D7, sees instant confirmation, and Users B through 500 see it within 300ms on the next batch flush. Feels instant to everyone.

Phase 4: failure modes

Everything above works when the system is healthy. Staff-level thinking means asking: what happens when it isn't?

Browser crash during hold

User selects D7, starts payment, and their phone dies. The WebSocket disconnects. The question: do you release the seat immediately?

No. The Redis hold has its own 8-minute TTL, completely independent of the WebSocket. Clean up the WebSocket connection to save server resources, but let Redis handle the business logic. If the user comes back within 8 minutes and reconnects, check Redis for their existing hold and restore their session.

Connection health and business state are decoupled. The WebSocket is an I/O channel. The hold is business logic with its own timer.
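
Restoring a session on reconnect could look something like this. It assumes a redis-py style client with scan_iter(match=...), get() and ttl(), and it handles keys coming back as either str or bytes. The function name find_user_holds is mine.

```python
import json


def find_user_holds(client, showtime_id: str, user_id: str) -> list[dict]:
    """Return the seats this user still holds for a showtime, with time left."""
    holds = []
    for key in client.scan_iter(match=f"hold:{showtime_id}:*"):
        raw = client.get(key)
        if raw is None:
            continue  # the hold expired between SCAN and GET
        hold = json.loads(raw)
        if hold["user_id"] != user_id:
            continue
        key_text = key.decode() if isinstance(key, bytes) else key
        holds.append({
            "seat": key_text.rsplit(":", 1)[1],   # "hold:show_123:D7" -> "D7"
            "price_locked": hold["price_locked"],
            "seconds_left": client.ttl(key),      # drives the frontend countdown
        })
    return holds
```

The frontend countdown restarts from seconds_left, not from 8 minutes. The timer lives in Redis, and the client only displays it.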

Dead connection detection

The server sends a WebSocket ping frame every 30 seconds. The browser automatically sends a pong back (no JavaScript needed). If two consecutive pings get no pong (60 seconds of silence), the connection is dead. Close it.

Why not one missed ping? Mobile users ride elevators, switch from WiFi to 4G, and walk through dead zones. One missed ping is normal. Two means they're gone.

Why not three? Every second a "live" connection is actually dead is a wasted resource: the server keeps serializing and sending every batch to a socket nobody is reading. Two missed pings (60 seconds) is the sweet spot: shorter than the 8-minute hold, long enough to survive a tunnel.
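
The two-missed-pings rule fits in a tiny state tracker. This is a sketch of the bookkeeping only. The class and method names are mine, and actually sending the ping frame and closing the socket is left to whatever WebSocket library you run.

```python
PING_INTERVAL_SECONDS = 30
MAX_MISSED_PINGS = 2


class ConnectionLiveness:
    """Counts unanswered pings for one WebSocket connection."""

    def __init__(self):
        self.unanswered = 0

    def pong(self) -> None:
        """Call when a pong arrives: the client is alive."""
        self.unanswered = 0

    def tick(self) -> bool:
        """Call every PING_INTERVAL_SECONDS, just before sending a ping.

        Returns True if the ping should be sent, False if the connection
        has already missed MAX_MISSED_PINGS and should be closed instead.
        """
        if self.unanswered >= MAX_MISSED_PINGS:
            return False
        self.unanswered += 1
        return True
```

A per-connection loop would sleep PING_INTERVAL_SECONDS, call tick(), and either send a ping or close the socket. Closing the socket never touches the Redis hold.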

Redis pub/sub drops silently

This is the scariest failure mode. A server's Pub/Sub subscription dies but everything else works. The server still accepts WebSocket connections and still writes holds to Redis, but it never receives broadcast events. Users on that server see a frozen seat map.

Fix: health check on the subscription. Publish a heartbeat to a control channel every 10 seconds. If a server doesn't receive the heartbeat, it knows its subscription is dead and should reconnect and resubscribe. And when any client detects stale data (no updates for 30+ seconds on a busy show), fall back to a REST poll to resync.
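
The server-side check can be sketched as a watchdog that remembers when the last heartbeat arrived. The 10-second interval comes from the text above. The tolerance of three missed heartbeats before declaring the subscription dead is my assumption, so tune it to taste. Time is passed in explicitly to keep the logic easy to test.

```python
HEARTBEAT_INTERVAL_SECONDS = 10
MISSED_BEFORE_STALE = 3  # assumption: tolerate a couple of late heartbeats


class SubscriptionWatchdog:
    """Detects a Pub/Sub subscription that has silently stopped delivering."""

    def __init__(self, now: float):
        self.last_heartbeat = now

    def on_heartbeat(self, now: float) -> None:
        """Call from the Pub/Sub listener when a control:heartbeat message arrives."""
        self.last_heartbeat = now

    def is_stale(self, now: float) -> bool:
        """True when heartbeats have been missing for too long."""
        silence = now - self.last_heartbeat
        return silence > HEARTBEAT_INTERVAL_SECONDS * MISSED_BEFORE_STALE
```

In a running server you would pass time.monotonic() for now, check is_stale() on a timer, and tear down and recreate the subscription when it returns True.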

Redis goes down entirely

Fail closed. No new holds accepted. The "hold seat" button shows an error. The seat map still works though: show confirmed bookings from Postgres and mark everything else as available. Users can see the map but can't select seats until Redis recovers. Degraded but not broken.

Payment gateway timeout

The payment gateway takes 30 seconds and times out. The seat hold is still in Redis with 6+ minutes remaining. Show the user a "retry payment" button. Don't release the hold just because the gateway was slow. The hold timer is the user's guarantee. They have 8 minutes regardless of what the payment gateway does.

Design every failure mode around the question, "does the user lose their seat unfairly?" If the answer is yes, fix it. If the system degrades but the user's intent is preserved, that's acceptable.

Phase 5: dynamic pricing

The whole reason the cinema chain is building in-house is pricing control. Weekday afternoon shows should be cheaper. The last 20 seats in a hot Friday show should cost more. Loyalty members get 10% off.

When is the price computed?

At selection time, not at page load. The seat map shows base prices. When a user clicks a seat, the server computes the real price based on current demand and locks it in Redis alongside the hold:

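import json
import time
from decimal import Decimal

# Price computation at hold time.
# ZONE_MULTIPLIERS, DAY_MULTIPLIERS, get_demand_factor and
# time_to_show_factor are defined elsewhere. Their values may be floats,
# so every factor is converted to Decimal before multiplying
# (Decimal * float raises TypeError).
def compute_price(showtime, seat, user) -> Decimal:
    factors = [
        showtime.base_price,                         # ₹200
        ZONE_MULTIPLIERS[seat.zone],                 # Premium: 1.4x
        get_demand_factor(showtime),                 # 80% full: 1.2x
        DAY_MULTIPLIERS[showtime.day_of_week],       # Friday: 1.1x
        time_to_show_factor(showtime.start_time),    # 2hrs out: 1.0x
        "0.9" if user.is_loyalty_member else "1.0",  # 10% off
    ]
    price = Decimal(1)
    for factor in factors:
        price *= Decimal(str(factor))
    return price.quantize(Decimal('1'))  # Round to whole rupees

# Lock the price in Redis with the hold.
# redis here is a connected redis-py client instance, not the module.
computed_price = compute_price(showtime, seat, user)
held = redis.set(
    f"hold:{showtime.id}:{seat.id}",
    json.dumps({
        "user_id": str(user.id),          # UUIDs are not JSON-serializable
        "price_locked": str(computed_price),
        "held_at": time.time()
    }),
    ex=480, nx=True
)
# held is None when someone else already holds the seat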

The price_locked field is the user's contract. If the hold locked ₹322, they pay ₹322, no matter what happens to demand in the next 8 minutes. If 50 more seats get booked and the live price rises to ₹380, it doesn't matter. Their price is locked.

When the booking is confirmed, Postgres records the locked price:

-- $4 is price_locked read from the Redis hold, NOT recomputed
INSERT INTO bookings (showtime_id, seat_id, user_id, price, status)
VALUES ($1, $2, $3, $4, 'confirmed')
ON CONFLICT (showtime_id, seat_id) DO NOTHING
RETURNING *;

Demand factor calculation

The demand factor is simple: what percentage of seats are held or booked?

def get_demand_factor(showtime) -> float:
    total_seats = showtime.screen.total_seats        # 300
    confirmed = count_bookings(showtime.id)            # from Postgres
    held = count_holds(showtime.id)                    # from Redis SCAN
    occupancy = (confirmed + held) / total_seats       # 0.0 to 1.0

    if occupancy < 0.3:   return 0.85    # Discount — fill the theater
    if occupancy < 0.5:   return 1.0     # Base price
    if occupancy < 0.7:   return 1.1     # Mild surge
    if occupancy < 0.85:  return 1.2     # Getting hot
    return 1.35                           # Last seats premium

This gets cached in Redis for 30 seconds to avoid computing it on every seat click. Stale by 30 seconds is fine. Pricing doesn't need to be real-time, just responsive.
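
A sketch of that cache, assuming a redis-py style client with get() and set(..., ex=...). The demand:<showtime_id> key matches the architecture diagram; the function name and the compute callable are mine.

```python
from typing import Callable

DEMAND_CACHE_TTL_SECONDS = 30


def cached_demand_factor(client, showtime_id: str, compute: Callable[[], float]) -> float:
    """Return the demand factor, recomputing at most once per TTL window."""
    key = f"demand:{showtime_id}"
    cached = client.get(key)
    if cached is not None:
        if isinstance(cached, bytes):  # redis-py returns bytes by default
            cached = cached.decode()
        return float(cached)

    factor = compute()  # the expensive part: Postgres count + Redis SCAN
    client.set(key, str(factor), ex=DEMAND_CACHE_TTL_SECONDS)
    return factor
```

Two servers can miss the cache at the same moment and both recompute. That's harmless here: they write nearly the same value, and the price a user actually pays is the one locked in their hold.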

Dynamic pricing and seat holding are coupled through Redis. The hold stores the locked price, the demand factor reads hold counts. Redis is the nexus of both real-time systems. This is why your Redis instance is the most critical piece of infrastructure, and why "Redis goes down" is the worst failure mode.

The architecture — all together

  ┌─────────────────────────────────────────────────────────┐
  │  CLIENTS (React PWA)                                    │
  │                                                         │
  │  ┌──────────┐  ┌──────────┐  ┌──────────┐              │
  │  │ SeatMap   │  │ SeatMap   │  │ SeatMap   │  × 8,000   │
  │  │ User A    │  │ User B    │  │ User C    │  concurrent │
  │  └─────┬────┘  └─────┬────┘  └─────┬────┘              │
  │        │WS           │WS           │WS                  │
  └────────┼─────────────┼─────────────┼────────────────────┘
           │             │             │
           ▼             ▼             ▼
  ┌─────────────────────────────────────────────────────────┐
  │  ALB (AWS Application Load Balancer)                    │
  │  /ws      → WebSocket target group (sticky sessions)   │
  │  /api/*   → REST target group                          │
  └────────┬─────────────┬─────────────┬────────────────────┘
           │             │             │
           ▼             ▼             ▼
  ┌──────────────┐ ┌──────────────┐ ┌──────────────┐
  │  FastAPI #1   │ │  FastAPI #2   │ │  FastAPI #3   │
  │  WS Server    │ │  WS Server    │ │  WS Server    │
  │  REST API     │ │  REST API     │ │  REST API     │
  │  Event Batcher│ │  Event Batcher│ │  Event Batcher│
  └───────┬───────┘ └───────┬───────┘ └───────┬───────┘
          │                 │                 │
          │   ┌─────────────┼─────────────┐   │
          │   │  SUBSCRIBE  │  SUBSCRIBE  │   │
          ▼   ▼             ▼             ▼   ▼
  ┌─────────────────────────────────────────────────────────┐
  │  Redis (ElastiCache)                                    │
  │                                                         │
  │  Keys:     hold:show_123:D7 → {user, price, held_at}   │
  │            hold:show_123:A1 → {user, price, held_at}   │
  │            TTL: 480s (auto-expire)                      │
  │                                                         │
  │  Pub/Sub:  showtime:show_123 → broadcast seat events    │
  │            control:heartbeat → subscription health check │
  │                                                         │
  │  Cache:    demand:show_123 → 0.72 (30s TTL)            │
  └─────────────────────────────────────────────────────────┘
          │
          │  Confirmed bookings only
          ▼
  ┌─────────────────────────────────────────────────────────┐
  │  PostgreSQL (RDS Multi-AZ)                              │
  │                                                         │
  │  bookings:  UNIQUE(showtime_id, seat_id)                │
  │  showtimes: screen, film, start_time, base_price        │
  │  users:     auth, loyalty_status                        │
  │  payments:  gateway_ref, amount, status                 │
  │                                                         │
  │  The court of last resort. If it's in Postgres,         │
  │  it's real. Everything else is ephemeral.                │
  └─────────────────────────────────────────────────────────┘

Cost breakdown (₹3L/month budget)

Service	Spec	Monthly Cost
FastAPI (3× ECS Fargate)	1 vCPU, 2 GB each	~₹12,000
Redis (ElastiCache)	cache.t3.medium	~₹5,000
PostgreSQL (RDS)	db.t3.medium, Multi-AZ	~₹8,000
ALB	Standard	~₹3,000
Data transfer	~500 GB/month	~₹4,500
CloudFront (static)	React PWA bundle	~₹500
Total		~₹33,000

₹33,000 out of a ₹3,00,000 budget. That leaves room for monitoring (Grafana Cloud free tier), a staging environment, and scaling up when the pilot succeeds. Always leave budget headroom. Production load will be higher than your estimates.

What I'd build in week 1

If I had 10 weeks and this team, here's week 1:

Day 1-2: Postgres schema + seed data. Screens, showtimes, seats. Run EXPLAIN ANALYZE on the booking query. Test the unique constraint with concurrent inserts.
Day 3: Redis hold logic. SET NX, TTL expiry, read-back. Hammer the race condition with 100 concurrent asyncio tasks hitting the same seat.
Day 4-5: FastAPI WebSocket endpoint + Redis Pub/Sub listener. Two terminal windows, two WebSocket connections, one holds a seat, the other sees it. That's the proof of concept. Everything else is refinement.

The rest of the 10 weeks goes into the React seat map, payment integration, batching, error handling, dynamic pricing, load testing, and polish. But the hard part (the concurrency model) is proven by end of week 1.