ALL POSTS
41 posts so far.
The AI-Generated Migration That Dropped the Wrong Index and Took Our API from 3ms to 45 Seconds
We asked an AI coding assistant to clean up redundant PostgreSQL indexes. It dropped the wrong one. API latency jumped from 3ms to 45 seconds. Here is what a 3.5-hour investigation taught us about trusting AI with schema changes.
Our OpenAI Bill Went From $23 to $4,200 in 48 Hours — A Missing Stop Sequence Did It
We built a feedback-processing pipeline that used GPT-4 to categorise and summarise user feedback. A single missing stop sequence caused the model to loop indefinitely, generating 40 million tokens of circular output over a holiday weekend while our alerts stayed quiet.
ECS Autoscaling Fought Our Postgres max_connections at 2AM and Postgres Won
We scaled to 38 ECS tasks during a flash sale. Each task held 10 Postgres connections. Our RDS instance allowed 170. The math was never going to work: 38 tasks at 10 connections each is 380, more than double the limit.
We Found Our .env File in 47 Public Forks After a Junior Dev's First Open Source PR
A junior developer forked our private repo to submit a bug fix, unknowingly committed our .env file, and GitHub indexed it. We had production credentials exposed in 47 public forks before anyone noticed.
Our AI Documentation Bot Invented 14 API Routes That Never Existed — 6,000 Users Integrated Against Them
We shipped an LLM-powered documentation assistant trained on our API docs. Within three weeks, it had confidently hallucinated 14 non-existent endpoints. Developers built integrations against them. Support tickets arrived. We had to choose between breaking those integrations and actually building the routes the AI had promised.
The AI Agent That Cleaned Up Our K8s Manifests and Crashed Production
We let a Cursor AI agent refactor our Kubernetes deployment files to remove boilerplate. Six hours later, 34% of requests were failing as pods OOMKilled faster than they could restart.
We Upgraded Our Embedding Model and Our RAG Pipeline Returned Wrong Results for 6 Days
We upgraded from text-embedding-ada-002 to text-embedding-3-large without re-embedding our 2.3M documents. Cosine similarity searches silently returned wrong content for six days — valid JSON, HTTP 200, completely wrong answers.
How a GitHub Actions Cache Hit Skipped Our Tests and Shipped a Regression to 12,000 Users
Our CI pipeline showed green for six consecutive deploys while never running the new test files we added. A cache key tied only to package-lock.json silently restored stale compiled test artifacts — new tests never compiled, never ran, and a broken discount-code checkout reached production for 6 hours.