Designing a Real-Time Cinema Seat Booking System
System Design · Production

March 21, 2026 · 18 min read

A mid-size cinema chain — 45 theaters across 12 cities in India — wants to ditch their third-party booking vendor. They're paying 18% commission per ticket, they can't run their own pricing, and their seat maps don't update in real time. The brief: build an in-house platform where 8,000 concurrent users can fight over 300 seats per screen, see holds appear within a second, and never — ever — double-book.

Stack: React PWA, Python (FastAPI), Postgres, Redis. Budget: ₹3L/month on AWS Mumbai. Team: one lead, two backend engineers, one frontend engineer. MVP in 10 weeks for 5 pilot theaters.

This is the full architecture breakdown in five phases. Each phase builds on the previous one.

Phase 1: Seat Locking Without Double-Booking

The core problem is deceptively simple: when someone selects a seat, hold it for 8 minutes while they pay. If they don't pay, release it. If they do pay, confirm it permanently. And make absolutely sure two people can't book the same seat.

Two systems handle this, each with a different job:

Redis handles temporary holds. When a user clicks seat D7, the server runs a single Redis command:

SET hold:show_123:D7 '{"user_id":"u_abc","held_at":1711024800,"price_locked":280}' EX 480 NX

Three things are happening in that one command:

  • EX 480 — the key expires in 480 seconds (8 minutes). No cleanup cron needed. Redis handles it.
  • NX — "only set if the key does not exist." This is atomic. If two users click D7 at the same millisecond, only one SET NX succeeds. The other gets nil.
  • The value includes price_locked — the price at the moment of selection. More on this in Phase 5.

Key insight: Redis SET NX is your first line of defense. It's single-threaded and atomic. Two concurrent requests for the same seat are physically serialized. One wins, one loses. No race condition possible.
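As a sketch, the hold attempt is a few lines of redis-py. The helper name and the injected client `r` are illustrative; the key layout follows the command above:

```python
import json
import time

def try_hold(r, showtime_id: str, seat_id: str, user_id: str, price: int) -> bool:
    """Attempt to hold a seat. Returns True only for the first caller."""
    payload = json.dumps({
        "user_id": user_id,
        "held_at": int(time.time()),
        "price_locked": price,
    })
    # nx=True: fail if the key already exists. ex=480: auto-expire in 8 minutes.
    return bool(r.set(f"hold:{showtime_id}:{seat_id}", payload, ex=480, nx=True))
```

With redis-py, `set(..., nx=True)` returns `True` on success and `None` when the key already exists, hence the `bool()`.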

Postgres handles confirmed bookings. When payment succeeds, the server writes to a bookings table with a unique constraint:

CREATE TABLE bookings (
    id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    showtime_id UUID NOT NULL REFERENCES showtimes(id),
    seat_id     VARCHAR(4) NOT NULL,       -- "D7", "A12"
    user_id     UUID NOT NULL REFERENCES users(id),
    price       NUMERIC(8,2) NOT NULL,
    status      VARCHAR(20) DEFAULT 'confirmed',
    created_at  TIMESTAMPTZ DEFAULT now(),

    UNIQUE (showtime_id, seat_id)          -- THE constraint that prevents double-booking
);

The booking insert uses ON CONFLICT:

INSERT INTO bookings (showtime_id, seat_id, user_id, price, status)
VALUES ($1, $2, $3, $4, 'confirmed')
ON CONFLICT (showtime_id, seat_id) DO NOTHING
RETURNING *;

If the insert returns a row — booking succeeded. If it returns nothing — someone else already booked that seat. One query. No race condition. Postgres enforces it at the storage engine level.
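Wrapped in a helper, the whole decision is one null check. This is a sketch against an injected connection with an asyncpg-style `fetchrow`; the helper name is illustrative:

```python
CONFIRM_SQL = """
INSERT INTO bookings (showtime_id, seat_id, user_id, price, status)
VALUES ($1, $2, $3, $4, 'confirmed')
ON CONFLICT (showtime_id, seat_id) DO NOTHING
RETURNING *;
"""

async def confirm_booking(conn, showtime_id, seat_id, user_id, price):
    """Returns the booking row, or None if someone else got there first."""
    row = await conn.fetchrow(CONFIRM_SQL, showtime_id, seat_id, user_id, price)
    return row  # None → seat already booked; caller must refund
```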

The Two-Step Flow

The booking flow is deliberately split into two steps: optimistic selection and pessimistic confirmation.

  SEAT SELECTION (Optimistic — Redis)
  ═══════════════════════════════════════════════

  User clicks D7
       │
       ▼
  Redis SET NX  hold:show_123:D7
       │
       ├── Success (key didn't exist)
       │       │
       │       ▼
       │   Return "held" to user
       │   Broadcast hold via WebSocket (Phase 2)
       │   Start 8-minute countdown on frontend
       │
       └── Failure (key already exists)
               │
               ▼
           Return "already held" — seat shows grey


  PAYMENT (Pessimistic — Postgres)
  ═══════════════════════════════════════════════

  User clicks "Pay ₹280"
       │
       ▼
  Backend checks Redis: does hold still exist?
  Is it this user's hold?
       │
       ├── No → "Hold expired. Please select again."
       │
       └── Yes
            │
            ▼
       Process payment via gateway
            │
            ├── Payment fails → hold remains in Redis
            │                    user can retry within 8 min
            │
            └── Payment succeeds
                     │
                     ▼
                INSERT INTO bookings ... ON CONFLICT DO NOTHING
                     │
                     ├── Row returned → BOOKED. Delete Redis hold.
                     │                  Broadcast "confirmed" via WS.
                     │
                     └── No row → someone else booked it
                                  (edge case — refund payment)

Key insight: The two-step approach means Redis handles the fast, high-frequency "can I grab this?" question, and Postgres handles the slow, critical "is this actually mine?" question. Redis absorbs the thundering herd. Postgres is the court of last resort.

Phase 2: Real-Time Sync — WebSocket + Pub/Sub

200 people are browsing the 7 PM show. User A holds seat D7. How do the other 199 people see it go grey within a second?

Why polling fails: If 200 clients poll the server every second, that's 200 HTTP requests per second returning mostly empty responses. At peak (8,000 users across all shows), that's 8,000 req/sec of pure waste. Our ₹3L/month budget can't absorb that.

WebSocket flips the model. Instead of the client asking "anything new?", the server pushes when something happens. One persistent connection per user, tiny frame overhead (2-14 bytes per message header vs ~200-800 bytes for HTTP headers).

How the Connection Starts

A WebSocket connection starts as a regular HTTP request. The client sends an Upgrade: websocket header, the server responds with 101 Switching Protocols, and from that point on, the TCP connection stays open for bidirectional messaging.

Why HTTP first? Because firewalls, corporate networks, hotel WiFi, and load balancers all allow ports 80 and 443. A custom protocol on a custom port would get blocked by half the networks in India. By piggybacking on HTTP, WebSocket works everywhere HTTP works.

The Multi-Server Problem

On a single server, broadcasting is trivial — loop over all connections, send the message. But in production you're running multiple FastAPI instances behind an ALB. User A is on Server 1. User D is on Server 3. When A holds a seat, Server 1 knows. Server 3 has no idea.

Redis Pub/Sub solves this. Every server subscribes to a channel per showtime. When any server writes a hold, it publishes to that channel. Redis delivers the message to all subscribed servers. Each server then pushes to its own local WebSocket clients.

  User A holds D7
       │
       ▼
  Server 1
       │
       ├── ① Redis SET NX (hold the seat)
       │
       └── ② Redis PUBLISH showtime:show_123
               {"seat":"D7", "status":"held", "seq":42}
                    │
          ┌─────────┼─────────────┐
          ▼         ▼             ▼
      Server 1   Server 2     Server 3
      (subscribed)(subscribed) (subscribed)
          │         │             │
          ▼         ▼             ▼
     Push to     Push to      Push to
     Users A,B   Users C,D    Users E,F
     via WS      via WS       via WS

  Total time from click to everyone seeing it: ~50-100ms
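The fan-out loop each server runs can be sketched like this, assuming an async Redis pub/sub object with a `listen()` iterator and a local connection `manager`, both injected (names illustrative):

```python
import json

async def pubsub_relay(pubsub, manager):
    """Forward every seat event from Redis Pub/Sub to this server's local WS clients."""
    async for message in pubsub.listen():
        if message["type"] != "message":
            continue  # skip subscribe/unsubscribe confirmations
        showtime_id = message["channel"].split(":", 1)[1]  # "showtime:show_123"
        await manager.broadcast(showtime_id, json.loads(message["data"]))
```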

The Initial Load Race Condition

When a user opens the seat map, two things happen in parallel: the client connects to the WebSocket and fetches the full seat map via REST. There's a window where a Pub/Sub event arrives before the REST response, and the REST response (slightly stale) overwrites it.

The fix: sequence numbers. Every seat event has a monotonically increasing seq. The REST response includes the latest seq at the time it was generated. The client buffers all WebSocket events, applies the REST snapshot, then replays only buffered events with a higher seq.

  Timeline:
  ─────────────────────────────────────────────────▶

  WS connects        REST response        Replay buffer
      │               arrives (seq:41)         │
      │  buffer:           │                   │
      │  [seq:42 D7 held]  │                   │
      │  [seq:43 A1 held]  │                   │
      │                    ▼                   ▼
      │               Set baseline        Apply seq 42, 43
      │               from REST           (both > 41, so apply)
      │
  From here on, apply events in real time.
  No data lost. No stale overwrites.
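The buffer-then-replay logic is shown below as a Python sketch of the algorithm; the real implementation would live in the React client, but the sequencing rules are identical:

```python
class SeatMapSync:
    """Buffer WS events until the REST snapshot lands, then replay newer ones."""

    def __init__(self):
        self.seats = {}          # seat_id -> status
        self.buffer = []         # events received before the snapshot
        self.baseline_seq = None

    def on_ws_event(self, event):
        if self.baseline_seq is None:
            self.buffer.append(event)       # snapshot not here yet; hold on to it
        elif event["seq"] > self.baseline_seq:
            self.seats[event["seat"]] = event["status"]

    def on_rest_snapshot(self, snapshot):
        self.seats = dict(snapshot["seats"])
        self.baseline_seq = snapshot["seq"]
        for event in self.buffer:           # replay only events newer than the snapshot
            if event["seq"] > self.baseline_seq:
                self.seats[event["seat"]] = event["status"]
        self.buffer.clear()
```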

Phase 3: Scaling the Broadcast

The math gets aggressive fast. Friday 7 PM, Avengers premiere. 500 concurrent users on one showtime. 50 seat events per second (holds, releases, confirmations). That's:

50 events/sec × 500 connections = 25,000 WebSocket sends per second

For one showtime. Your server CPU spikes to 100% and starts dropping connections.

Solution: 300ms Batching

Instead of broadcasting every event immediately, the server buffers events and flushes every 300 milliseconds. In that window, 15 seat changes collapse into one message.

import asyncio
from collections import defaultdict

class EventBatcher:
    def __init__(self):
        self.buffers: dict[str, list] = defaultdict(list)

    async def add(self, showtime_id: str, event: dict):
        self.buffers[showtime_id].append(event)

    async def flush_loop(self):
        while True:
            await asyncio.sleep(0.3)  # 300ms batches
            for showtime_id, events in list(self.buffers.items()):
                if events:
                    batch = events.copy()
                    events.clear()
                    # `manager` is the WebSocket connection manager that
                    # tracks this server's local clients per showtime
                    await manager.broadcast(showtime_id, {
                        "type": "batch",
                        "updates": batch
                    })

Before: 50 events/sec × 500 users = 25,000 sends/sec
After: ~3.3 batches/sec × 500 users = 1,650 sends/sec

That's a 15x reduction with zero visible impact. 300ms is imperceptible to a human staring at a seat map.

Delta Compression

Instead of sending one object per seat, compress into a diff:

// Instead of:
[{seat:"D7",status:"held"}, {seat:"D8",status:"held"}, {seat:"A3",status:"released"}]

// Send:
{held: ["D7","D8"], released: ["A3"], confirmed: []}

Smaller payload. Fewer bytes. Faster to parse on the client.
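A minimal version of that compression, assuming only the three statuses shown above:

```python
def compress_batch(events: list[dict]) -> dict:
    """Collapse per-seat events into one status -> [seats] diff."""
    diff = {"held": [], "released": [], "confirmed": []}
    for event in events:
        diff[event["status"]].append(event["seat"])
    return diff
```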

Immediate Actor Response

One critical UX detail: the user who holds the seat gets an immediate response over their WebSocket connection. Only the broadcast to everyone else is batched. So User A clicks D7, sees instant confirmation, and Users B through 500 see it within 300ms in the next batch flush. Feels instant to everyone.

Phase 4: Failure Modes

Everything above works when the system is healthy. Staff-level thinking means asking: what happens when it isn't?

Browser Crash During Hold

User selects D7, starts payment, and their phone dies. The WebSocket disconnects. The question: do you release the seat immediately?

No. The Redis hold has its own 8-minute TTL that's completely independent of the WebSocket. Clean up the WebSocket connection (save server resources), but let Redis handle the business logic. If the user comes back within 8 minutes and reconnects, they can resume — check Redis for their existing hold, restore their session.

This separation is important: connection health and business state are decoupled. The WebSocket is an I/O channel. The hold is business logic with its own timer.
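Resuming is a read, not a write. A sketch, assuming the client remembers which seat it held (helper name and injected client `r` are illustrative):

```python
import json

def resume_hold(r, showtime_id: str, seat_id: str, user_id: str):
    """Return (seat, seconds_remaining) if this user's hold is still alive, else None."""
    key = f"hold:{showtime_id}:{seat_id}"
    raw = r.get(key)
    if raw is None:
        return None                    # hold expired; user must reselect
    hold = json.loads(raw)
    if hold["user_id"] != user_id:
        return None                    # seat was re-held by someone else
    return seat_id, r.ttl(key)         # restore session with the remaining time
```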

Dead Connection Detection

The server sends a WebSocket ping frame every 30 seconds. The browser automatically sends a pong back (no JavaScript needed). If two consecutive pings get no pong (60 seconds of silence), the connection is dead. Close it.

Why not one missed ping? Mobile users ride elevators, switch from WiFi to 4G, and walk through dead zones. One missed ping is normal. Two means they're gone.

Why not three? Every second a dead connection is treated as alive is a wasted resource, and the server keeps pushing seat-change updates into the void. Two missed pings (60 seconds) is the sweet spot — shorter than the 8-minute hold, long enough to survive a tunnel.
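The two-strike policy in code. This is a sketch against a hypothetical connection object whose async `ping()` completes when the pong arrives (in the real `websockets` library, `ping()` returns a waiter you await separately):

```python
import asyncio

async def keepalive(ws, interval: float = 30.0, max_missed: int = 2):
    """Ping every `interval` seconds; close after `max_missed` consecutive silent pings."""
    missed = 0
    while missed < max_missed:
        await asyncio.sleep(interval)
        try:
            await asyncio.wait_for(ws.ping(), timeout=interval)
            missed = 0                 # pong arrived, connection is alive
        except asyncio.TimeoutError:
            missed += 1                # elevator, tunnel, or gone for good
    await ws.close()                   # two strikes: reclaim the resources
```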

Redis Pub/Sub Drops Silently

The scariest failure: a server's Pub/Sub subscription dies but everything else works. The server still accepts WebSocket connections, still writes holds to Redis, but never receives broadcast events. Users on that server see a frozen seat map.

The fix: health check on the subscription. Publish a heartbeat to a control channel every 10 seconds. If a server doesn't receive the heartbeat, it knows its subscription is dead — reconnect and resubscribe. And when any client detects stale data (no updates for 30+ seconds on a busy show), automatically fall back to a REST poll to resync.
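One way to sketch the watchdog side of that health check. Timestamps are injected so the logic is testable; the class name and threshold are illustrative:

```python
import time

class SubscriptionWatchdog:
    """Tracks heartbeats from the control channel; flags a dead subscription."""

    def __init__(self, max_silence: float = 25.0):
        # 10s heartbeat cadence: two misses plus slack before declaring death
        self.max_silence = max_silence
        self.last_beat = time.monotonic()

    def on_heartbeat(self, now=None):
        self.last_beat = now if now is not None else time.monotonic()

    def is_dead(self, now=None) -> bool:
        # if this returns True: drop the pubsub object, reconnect, resubscribe
        now = now if now is not None else time.monotonic()
        return (now - self.last_beat) > self.max_silence
```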

Redis Goes Down Entirely

Fail closed. No new holds accepted — the "hold seat" button shows an error. But the seat map still works: show confirmed bookings from Postgres and mark everything else as available. Users can see the map but can't select seats until Redis recovers. Degraded but not broken.

Payment Gateway Timeout

The payment gateway takes 30 seconds and times out. The seat hold is still in Redis with 6+ minutes remaining. Show the user a "retry payment" button. Don't release the hold just because the gateway was slow. The hold timer is the user's guarantee — they have 8 minutes regardless of what the payment gateway does.

Key insight: Design every failure mode around the question: "Does the user lose their seat unfairly?" If the answer is yes, fix it. If the system degrades but the user's intent is preserved, that's acceptable.

Phase 5: Dynamic Pricing

The whole reason the cinema chain is building in-house: they want to control pricing. Weekday afternoon shows should be cheaper. The last 20 seats in a hot Friday show should be more expensive. Loyalty members get 10% off.

When is the Price Computed?

At selection time, not at page load. The seat map shows base prices. When a user clicks a seat, the server computes the real price based on current demand and locks it in Redis alongside the hold:

from decimal import Decimal

# Price computation at hold time
def compute_price(showtime, seat, user) -> Decimal:
    base = showtime.base_price                           # ₹200
    zone_mult = ZONE_MULTIPLIERS[seat.zone]              # Premium: 1.4x
    demand = get_demand_factor(showtime)                 # 85% full: 1.15x
    day_mult = DAY_MULTIPLIERS[showtime.day_of_week]     # Friday: 1.1x
    time_mult = time_to_show_factor(showtime.start_time) # 2hrs out: 1.0x
    loyalty = 0.9 if user.is_loyalty_member else 1.0     # 10% off

    price = float(base) * zone_mult * demand * day_mult * time_mult * loyalty
    return Decimal(str(price)).quantize(Decimal("1"))    # Round to whole rupees

# Lock the price in Redis with the hold
redis.set(
    f"hold:{showtime.id}:{seat.id}",
    json.dumps({
        "user_id": user.id,
        "price_locked": str(computed_price),
        "held_at": time.time()
    }),
    ex=480, nx=True
)

The price_locked field means: no matter what happens to demand in the next 8 minutes, this user pays ₹322. If 50 more seats get booked and the price rises to ₹380 — doesn't matter. Their price is locked.

When the booking is confirmed, Postgres records the locked price:

INSERT INTO bookings (showtime_id, seat_id, user_id, price, status)
VALUES ($1, $2, $3, $locked_price, 'confirmed')
-- $locked_price comes from Redis hold, NOT recomputed

Demand Factor Calculation

The demand factor is simple: what percentage of seats are held or booked?

def get_demand_factor(showtime) -> float:
    total_seats = showtime.screen.total_seats          # 300
    confirmed = count_bookings(showtime.id)            # from Postgres
    held = count_holds(showtime.id)                    # from Redis SCAN
    occupancy = (confirmed + held) / total_seats       # 0.0 to 1.0

    if occupancy < 0.3:   return 0.85    # Discount — fill the theater
    if occupancy < 0.5:   return 1.0     # Base price
    if occupancy < 0.7:   return 1.1     # Mild surge
    if occupancy < 0.85:  return 1.2     # Getting hot
    return 1.35                          # Last seats premium

This is cached in Redis for 30 seconds to avoid computing it on every seat click. Stale by 30 seconds is fine — pricing doesn't need to be real-time, just responsive.
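The 30-second cache is a thin wrapper around the function above. A sketch with the Redis client injected; the key name and the `compute` parameter (injectable for testing) are illustrative:

```python
def cached_demand_factor(r, showtime, compute=None) -> float:
    """Serve the demand factor from a 30s Redis cache; recompute on miss."""
    key = f"demand:{showtime.id}"
    cached = r.get(key)
    if cached is not None:
        return float(cached)
    factor = (compute or get_demand_factor)(showtime)
    r.setex(key, 30, str(factor))      # stale by at most 30s, acceptable for pricing
    return factor
```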

Key insight: Dynamic pricing and seat holding are coupled through Redis. The hold stores the locked price. The demand factor reads hold counts. Redis is the nexus of both real-time systems. This is why your Redis instance is the most critical piece of infrastructure — and why "Redis goes down" is the worst failure mode.

The Architecture — All Together

  ┌─────────────────────────────────────────────────────────┐
  │  CLIENTS (React PWA)                                    │
  │                                                         │
  │  ┌──────────┐  ┌──────────┐  ┌──────────┐              │
  │  │ SeatMap  │  │ SeatMap  │  │ SeatMap  │  × 8,000     │
  │  │ User A   │  │ User B   │  │ User C   │  concurrent  │
  │  └─────┬────┘  └─────┬────┘  └─────┬────┘              │
  │        │WS           │WS           │WS                  │
  └────────┼─────────────┼─────────────┼────────────────────┘
           │             │             │
           ▼             ▼             ▼
  ┌─────────────────────────────────────────────────────────┐
  │  ALB (AWS Application Load Balancer)                    │
  │  /ws      → WebSocket target group (sticky sessions)   │
  │  /api/*   → REST target group                          │
  └────────┬─────────────┬─────────────┬────────────────────┘
           │             │             │
           ▼             ▼             ▼
  ┌──────────────┐ ┌──────────────┐ ┌──────────────┐
  │  FastAPI #1   │ │  FastAPI #2   │ │  FastAPI #3   │
  │  WS Server    │ │  WS Server    │ │  WS Server    │
  │  REST API     │ │  REST API     │ │  REST API     │
  │  Event Batcher│ │  Event Batcher│ │  Event Batcher│
  └───────┬───────┘ └───────┬───────┘ └───────┬───────┘
          │                 │                 │
          │   ┌─────────────┼─────────────┐   │
          │   │  SUBSCRIBE  │  SUBSCRIBE  │   │
          ▼   ▼             ▼             ▼   ▼
  ┌─────────────────────────────────────────────────────────┐
  │  Redis (ElastiCache)                                    │
  │                                                         │
  │  Keys:     hold:show_123:D7 → {user, price, held_at}   │
  │            hold:show_123:A1 → {user, price, held_at}   │
  │            TTL: 480s (auto-expire)                      │
  │                                                         │
  │  Pub/Sub:  showtime:show_123 → broadcast seat events    │
  │            control:heartbeat → subscription health check │
  │                                                         │
  │  Cache:    demand:show_123 → 0.72 (30s TTL)            │
  └─────────────────────────────────────────────────────────┘
          │
          │  Confirmed bookings only
          ▼
  ┌─────────────────────────────────────────────────────────┐
  │  PostgreSQL (RDS Multi-AZ)                              │
  │                                                         │
  │  bookings:  UNIQUE(showtime_id, seat_id)                │
  │  showtimes: screen, film, start_time, base_price        │
  │  users:     auth, loyalty_status                        │
  │  payments:  gateway_ref, amount, status                 │
  │                                                         │
  │  The court of last resort. If it's in Postgres,         │
  │  it's real. Everything else is ephemeral.                │
  └─────────────────────────────────────────────────────────┘

Cost Breakdown (₹3L/month budget)

  Service                    Spec                     Monthly Cost
  FastAPI (3× ECS Fargate)   1 vCPU, 2 GB each        ~₹12,000
  Redis (ElastiCache)        cache.t3.medium          ~₹5,000
  PostgreSQL (RDS)           db.t3.medium, Multi-AZ   ~₹8,000
  ALB                        Standard                 ~₹3,000
  Data transfer              ~500 GB/month            ~₹4,500
  CloudFront (static)        React PWA bundle         ~₹500
  Total                                               ~₹33,000

₹33,000 out of a ₹3,00,000 budget. That leaves room for monitoring (Grafana Cloud free tier), a staging environment, and scaling up when the pilot succeeds. Always leave budget headroom — the production load will be higher than your estimates.

What I'd Build in Week 1

If I had 10 weeks and this team, here's week 1:

  1. Day 1-2: Postgres schema + seed data. Screens, showtimes, seats. Run EXPLAIN ANALYZE on the booking query. Get the unique constraint tested with concurrent inserts.
  2. Day 3: Redis hold logic. SET NX, TTL expiry, read-back. Test the race condition with 100 concurrent asyncio tasks hitting the same seat.
  3. Day 4-5: FastAPI WebSocket endpoint + Redis Pub/Sub listener. Two terminal windows, two WebSocket connections, one holds a seat, the other sees it. That's the proof of concept. Everything else is refinement.
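The day-3 race test can be prototyped before Redis is even provisioned. The sketch below models SET NX with a plain dict (atomic within one event loop) purely to prove the shape of the test: of N concurrent claimants, exactly one wins. Against real Redis, only the `grab` body changes:

```python
import asyncio

async def main(n: int = 100) -> int:
    holds: dict[str, str] = {}           # stands in for Redis

    async def grab(user_id: str) -> bool:
        await asyncio.sleep(0)           # yield, so tasks genuinely interleave
        if "hold:show_123:D7" in holds:  # SET ... NX: only set if absent
            return False
        holds["hold:show_123:D7"] = user_id
        return True

    results = await asyncio.gather(*(grab(f"u_{i}") for i in range(n)))
    return sum(results)                  # number of winners: must be exactly 1

print(asyncio.run(main()))  # 1
```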

The rest of the 10 weeks: React seat map, payment integration, batching, error handling, dynamic pricing, load testing, and polish. But the hard part — the concurrency model — is proven by end of week 1.
