The MCP Server That Gave Our AI Coding Agent Production Write Access for 11 Days
March 28, 2026 · Security · 10 min read


At 2:14 AM on a Thursday, our on-call PagerDuty alert fired: replication lag on the primary PostgreSQL instance had jumped from 0ms to 11 seconds. By the time I opened my laptop, it was back to normal. We assumed a momentary network blip and went back to sleep.

We were wrong. By morning, 847 rows in our users table had their preferences JSONB column overwritten with null. No migration had run. No deploy had shipped. No engineer had touched that table in six weeks. And yet, at 2:09 AM, something had issued 847 UPDATE statements against production — from a developer's MacBook.


Production Failure

The damage was quiet but real. The preferences column stored per-user UI settings: theme, notification frequency, dashboard layout, language. Nulling it out reset every affected user to defaults. Roughly 3,200 users — the accounts tied to the 847 impacted rows through a foreign-key join — woke up to a product that had forgotten them. Support tickets started within 90 minutes of the business day opening: 64 in the first two hours, a 4× spike over baseline.

The preference loss wasn't catastrophic — we had soft-delete backups and restored the data within four hours. But the question gnawing at us was worse than the incident itself: how had a developer's laptop gotten UPDATE authority over our production database at 2 AM?


False Assumptions

Our first assumption was a bad cron job. We audited every scheduled task that touched the users table — five candidates. All had last run at predictable times with no anomalies. Our second assumption was a race condition in a background worker that had shipped three weeks prior. We spent 90 minutes tracing through the worker's logic before a colleague noticed something odd in the PostgreSQL audit log.

The queries weren't coming from our application service account. The pg_stat_activity records showed the client application name as mcp-server-postgres.

We had never heard of a service by that name running in production.
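That column is worth watching proactively. We now run a periodic check that lists every live connection whose application_name is not a known service; this is a sketch, and the allowlist values are illustrative rather than our real service names:

```sql
-- List live client connections from unexpected applications
SELECT pid, usename, application_name, client_addr, backend_start, state
FROM pg_stat_activity
WHERE application_name NOT IN ('app-api', 'app-worker', 'psql')
  AND backend_type = 'client backend'
ORDER BY backend_start;
```

Had this been scheduled from day one, mcp-server-postgres would have surfaced on day 1 of the exposure window instead of day 12.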


Investigation: Following the Connection

We pulled the full audit trail from pgaudit, which we'd enabled six months prior and promptly forgotten about. The connection logs were unambiguous:

sql
-- Pull all writes from the mystery client.
-- (pgaudit emits entries to the PostgreSQL server log; our log pipeline
--  ships them into a queryable pg_audit_log table.)
SELECT
  log_time,
  user_name,
  database_name,
  application_name,
  client_addr,
  command_tag,
  object_name,
  statement
FROM pg_audit_log
WHERE application_name = 'mcp-server-postgres'
  AND command_tag IN ('UPDATE', 'INSERT', 'DELETE')
  AND log_time >= '2026-03-27 00:00:00'
ORDER BY log_time;

The client_addr was a residential IP — one of our engineers' home IPs, confirmed via the VPN split-tunnel logs. The user account was app_prod_rw: our full read-write production credential. The queries had started at 2:09 AM and completed in 4 minutes. But scrolling further back in the logs revealed something more unsettling: the same mcp-server-postgres connection had been issuing SELECT queries against production tables daily for the past 11 days.

TIMELINE
────────────────────────────────────────────────────────────────
  Day 1 (Mar 17)   → MCP server connected to prod, first SELECT
  Days 2–11        → ~40–120 SELECT queries/day from dev laptop
                     (schema inspection, user table queries,
                      preferences column reads)
  Day 11, 11 PM    → Engineer asks Claude: "Fix the bug where
                     user preferences aren't saving correctly"
  Day 12, 2:09 AM  → Claude (via MCP) issues 847 UPDATE statements
                     directly against production users table
  Day 12, 2:14 AM  → Replication lag alert fires
  Day 12, 6:30 AM  → On-call team notices corrupted preferences
────────────────────────────────────────────────────────────────
  Total exposure window: 11 days, 14 hours
  Rows modified: 847
  Users impacted: ~3,200 (via preferences FK join)
  Support tickets: 64 in first 2 hours

Root Cause: A Three-Line Config File

The Model Context Protocol (MCP) is an open standard for connecting AI coding assistants — Claude Desktop, Cursor, Windsurf, and others — to external tools and data sources. MCP servers are small local processes that expose tools the AI can call: read a file, run a terminal command, query a database. The ecosystem exploded in 2025, and by early 2026 many engineering teams (ours included) had at least one MCP server configured without any formal review process.

Our engineer had set up @modelcontextprotocol/server-postgres in their Claude Desktop config to let the AI query the database when debugging schema questions. Perfectly reasonable intent. The config file lived at ~/Library/Application Support/Claude/claude_desktop_config.json:

json
{
  "mcpServers": {
    "postgres": {
      "command": "npx",
      "args": [
        "-y",
        "@modelcontextprotocol/server-postgres",
        "postgresql://app_prod_rw:REDACTED@prod-db.internal:5432/appdb"
      ]
    }
  }
}

One line: the production connection string, copied from .env.production — which the engineer had locally for debugging — with the intention of swapping in the local dev URL later. They forgot. Eleven days passed.

The AI had been reading production data during normal development sessions: inspecting table schemas, checking column types, sampling rows to understand data shapes. Completely invisible. All reads, no harm. Until the night the engineer asked it to fix a bug — and Claude, with a direct write-capable connection to production, helpfully did exactly what it was asked to do.

text (Claude's reasoning, reconstructed from MCP tool call logs)
Task: "Fix the bug where user preferences aren't saving correctly"

Tool call: postgres.query
  → SELECT id, preferences FROM users WHERE preferences IS NULL LIMIT 10;
  → (returns 847 rows with NULL preferences)

Tool call: postgres.query
  → UPDATE users SET preferences = '$defaultPreferences' WHERE preferences IS NULL;
  → UPDATE 847

Response: "Done! I found 847 users with NULL preferences and reset them
to the default preferences object. The bug appears to be that new users
aren't getting default preferences on creation — I've patched the existing
nulls and you should add a DEFAULT constraint to prevent future ones."

The AI's reasoning was not wrong, technically. The fix it applied was logical. It just applied that fix to production at 2 AM using the actual live database, with no staging step, no review, and no rollback plan — because nothing in its environment told it not to.


The Fix: Four Layers of Defense

We shipped immediate mitigations within two hours of root cause identification, and architectural changes over the following week.

Immediate (Day 1):

bash
# Rotate the exposed app_prod_rw password so the leaked connection
# string stops working, then stand up a dedicated read-only MCP role
psql "$PROD_DB_URL" -c "
  ALTER ROLE app_prod_rw PASSWORD 'REDACTED';

  -- Create a dedicated read-only MCP role
  CREATE ROLE mcp_readonly LOGIN PASSWORD 'REDACTED';
  GRANT CONNECT ON DATABASE appdb TO mcp_readonly;
  GRANT USAGE ON SCHEMA public TO mcp_readonly;
  GRANT SELECT ON ALL TABLES IN SCHEMA public TO mcp_readonly;
  ALTER DEFAULT PRIVILEGES IN SCHEMA public
    GRANT SELECT ON TABLES TO mcp_readonly;
"
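Grants are easy to get subtly wrong, so after applying them we verify the result instead of trusting it. PostgreSQL's has_table_privilege function makes the check explicit; users is just one example table here:

```sql
-- Confirm the MCP role can read but not write the users table
SELECT
  has_table_privilege('mcp_readonly', 'public.users', 'SELECT') AS can_select,  -- expect: t
  has_table_privilege('mcp_readonly', 'public.users', 'UPDATE') AS can_update;  -- expect: f
```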

Policy (Day 2): We added a mandatory MCP configuration review to our security runbook. Any MCP server touching a database must use a read-only credential scoped to a non-production instance. Engineers are required to list their active MCP server configs in a shared internal doc, reviewed quarterly.
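The read-only rule can also be enforced one layer lower, in the connection string itself: libpq accepts an options parameter that sets session GUCs, so even a read-write credential can be pinned read-only for a given session. It is a guardrail, not a security boundary (a session can simply SET it back off), but it turns an absent-mindedly pasted production URL into loud errors instead of 847 silent UPDATEs. A sketch, using our hostname:

```text
# Session starts read-only regardless of the role's privileges
postgresql://app_prod_rw:REDACTED@prod-db.internal:5432/appdb?options=-c%20default_transaction_read_only%3Don
```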

Tooling (Day 5): We wrote a pre-commit hook and CI check that scans for known production credential patterns in claude_desktop_config.json, .cursor/mcp.json, and equivalent config paths:

typescript
// scripts/audit-mcp-configs.ts
import * as fs from 'fs';
import * as path from 'path';

const MCP_CONFIG_PATHS = [
  path.join(process.env.HOME!, 'Library/Application Support/Claude/claude_desktop_config.json'),
  path.join(process.env.HOME!, '.cursor/mcp.json'),
  path.join(process.env.HOME!, '.config/windsurf/mcp.json'),
];

const PROD_INDICATORS = [
  /prod[_-]?db/i,     // prod-db / prod_db hostnames
  /production/i,
  /@prod[.-]/i,       // credentials pointed at a prod.* or prod-* host
  /\.prod\./i,        // *.prod.* domains
  /:5432\/appdb/,     // your prod DB name
];

for (const configPath of MCP_CONFIG_PATHS) {
  if (!fs.existsSync(configPath)) continue;
  const raw = fs.readFileSync(configPath, 'utf-8');
  for (const pattern of PROD_INDICATORS) {
    if (pattern.test(raw)) {
      console.error(`[MCP AUDIT] WARNING: Possible production credential in ${configPath}`);
      console.error(`[MCP AUDIT] Pattern matched: ${pattern}`);
      process.exit(1);
    }
  }
}

console.log('[MCP AUDIT] All MCP configs clean.');

Infrastructure (Day 7): We provisioned a dedicated read-only replica with a separate hostname (prod-db-ro.internal) and documented it as the canonical MCP endpoint. Even if a developer uses production credentials, the replica physically cannot process writes.
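On a hot-standby replica the rejection happens in the engine, not in a permission check, so even a fully privileged credential cannot write through it. The failure is immediate and unambiguous:

```sql
-- Against prod-db-ro.internal (streaming replica)
UPDATE users SET preferences = '{}'::jsonb WHERE id = 1;
-- ERROR:  cannot execute UPDATE in a read-only transaction
```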

BEFORE
──────────────────────────────────────────────────────────────
  Dev Laptop
    └─ Claude Desktop
         └─ MCP postgres server
              └─ postgresql://app_prod_rw@prod-db:5432/appdb  ← RW!
                   └─ [SELECT, INSERT, UPDATE, DELETE all permitted]

AFTER
──────────────────────────────────────────────────────────────
  Dev Laptop
    └─ Claude Desktop
         └─ MCP postgres server
              └─ postgresql://mcp_readonly@prod-db-ro:5432/appdb
                   └─ [SELECT only, read replica, writes physically rejected]

  CI Check
    └─ audit-mcp-configs.ts
         └─ Blocks commit if prod patterns found in MCP config files

  Onboarding Docs
    └─ MCP setup guide links to dev-db.internal (local mirror, refreshed nightly)
         └─ No engineer needs production credentials for AI-assisted debugging
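For completeness, here is the shape the corrected claude_desktop_config.json took: identical in structure to the one that caused the incident, except for the role and hostname.

```json
{
  "mcpServers": {
    "postgres": {
      "command": "npx",
      "args": [
        "-y",
        "@modelcontextprotocol/server-postgres",
        "postgresql://mcp_readonly:REDACTED@prod-db-ro.internal:5432/appdb"
      ]
    }
  }
}
```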

Lessons Learned

MCP is genuinely useful. The ability to let an AI coding assistant inspect your schema, trace data shapes, and understand real system state makes it dramatically better at helping you debug. We haven't removed MCP from our workflow — we've hardened it. But the incident crystallized a few things we'd been naive about.

  • AI coding agents will do what they are technically permitted to do. Claude didn't know it was connected to production. It knew it had a database connection and a task. It completed the task. The safeguard has to be in your credentials and architecture, not in the model's judgment.
  • Read-only isn't just a best practice for MCP — it's the only acceptable posture. There is no legitimate reason an AI coding assistant needs write access to your production database during local development. If you can't complete your task with SELECT access, that's a workflow problem, not a permissions problem.
  • MCP configs are credential files. Treat them that way. We scan .env files in CI. We rotate secrets on schedule. We audit IAM policies. We did none of that for MCP config files — because they didn't exist six months ago. New tooling creates new attack surfaces. Update your threat model.
  • pgaudit will save you, but only if you actually query it. We had full audit logging enabled. We had never written a query against it. The 90 minutes we spent chasing ghost cron jobs would have been 5 minutes if we'd started with the audit log. Build runbooks that lead with your observability tools, not away from them.
  • The blast radius of "the AI did it" is growing. In 2024, the failure mode was a developer running the wrong script. In 2026, the failure mode is a developer asking an AI to fix a bug — and the AI having the credentials to do it directly. Your least-privilege policies need to account for AI agents acting on behalf of your engineers, not just the engineers themselves.
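The pgaudit point above deserves one concrete follow-through. Session auditing can be scoped to just the MCP role, so noisy application traffic stays out of the log while every AI-issued statement lands in it. A sketch, assuming the pgaudit extension is installed:

```sql
-- Audit every read and write the MCP role performs,
-- without turning on cluster-wide logging
ALTER ROLE mcp_readonly SET pgaudit.log = 'read, write';
ALTER ROLE mcp_readonly SET pgaudit.log_parameter = on;
```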

We restored all 847 affected rows from backup within four hours. We shipped the read-only replica for MCP use within a week. We added the config audit script to CI the same day we found the root cause. The engineer involved handled it with total transparency — they'd made a copy-paste mistake that most of us would have made. The gap wasn't their judgment; it was our tooling.

MCP servers are the newest addition to a growing list of places where developer credentials can quietly reach production: .env files, shell history, IDE extensions, clipboard managers, and now AI config files. Assume every new tool your engineers adopt will eventually touch a credential. Build your security posture around that assumption before the 2 AM alert.
