ALL POSTS

41 posts so far.

March 15, 2026 · Database · 10 min read

How a NOT NULL Column Migration Locked Our Users Table for 14 Minutes

A routine schema migration to add a NOT NULL column with a default value triggered a full table rewrite in Postgres, holding an exclusive lock on the 2.4-million-row users table and taking our entire platform offline for 14 minutes at 10 AM on a Monday.

March 15, 2026 · Python · 10 min read

How SQLAlchemy's Identity Map Served Stale Data to 23,000 API Requests

We managed SQLAlchemy sessions manually in Flask, skipping Flask-SQLAlchemy. Forgetting one line — Session.remove() — turned the ORM's per-thread identity map into a stale-data cache that silently returned outdated records for six hours.

March 14, 2026 · Architecture · 10 min read

How AWS SQS Visibility Timeout Caused the Same Order to Be Processed 847 Times

A production war story about how a 30-second SQS visibility timeout turned a slow order processor into a duplicate-charge machine — and how we fixed it with heartbeats and a distributed lock.

March 14, 2026 · Architecture · 10 min read

How a Race Condition in Our Cron Job Sent 2.3 Million Duplicate Emails in One Night

A nightly email digest cron job was running on two servers simultaneously without a distributed lock — what started as a minor scheduling overlap turned into a 2.3 million email catastrophe that got our domain blacklisted before sunrise.

March 14, 2026 · Architecture · 9 min read

How Next.js 15's Full Route Cache Served Stale Prices at Checkout for 3 Hours

After migrating a SaaS checkout flow to Next.js 15 App Router, our price display layer silently served cached values — not the live database prices — costing us 3 hours of confused customers and 19 manual refunds.

March 14, 2026 · Mobile · 9 min read

How a Single Power User's Post Triggered 45,000 DB Queries and Crashed Our Mobile API

A synchronous push notification fanout loop for a user with 45,000 followers exhausted our Flask database connection pool in 90 seconds, failing 62% of mobile requests for 3 hours.

March 13, 2026 · Security · 10 min read

How Rotating a JWT Secret Logged Out 34,000 Users and Exposed a Session Design Flaw

A routine security rotation invalidated every active session simultaneously, triggered a support flood, and revealed that our JWT architecture had no graceful degradation path whatsoever.

March 13, 2026 · Docker · 9 min read

How a DigitalOcean Firewall Rule Silently Dropped 23% of Production Traffic for 11 Days

Intermittent user timeouts, normal server metrics, and zero firewall logs — how a stateless firewall rule was killing TCP connections before they reached Nginx, and why it took eleven days to find it.


Blog: Page 4 | Darshan Turakhia