We Killed the PHP Monolith. It Took 18 Months and One Client's Data.
Seven years of PHP. Forty-three database tables. Six thousand lines of Blade templates. Three thousand paying clients. And a team that voted — unanimously — to rewrite it all in Next.js. I was the one who said it would take three months. Eighteen months later, I finally understood what software migration actually means.
This isn't a tutorial. It's a confession. The kind you write after the war is over and you want to save the next engineer from walking into the same minefield.
The Monolith That Ran the Company
The product was a B2B SaaS platform — client management, invoicing, job scheduling, reporting. Built on Laravel 5 back in 2017, when Laravel was the right answer. By 2023, it was still running. Reliably, even. But every new feature request sent a cold shiver through the team.
The codebase had accrued seven years of decisions, each one sensible in isolation, none of them sensible together. Business logic lived in controllers. Controllers called other controllers. There were 14 different "helper" files, three of which no one dared touch. A simple field rename on the clients table meant updating 23 Blade templates by hand.
┌─────────────────────────────────────────────────────┐
│                 PHP MONOLITH (2023)                 │
├─────────────────────────────────────────────────────┤
│  Browser → Apache → Laravel Router                  │
│                           │                         │
│               ┌───────────┼───────────┐             │
│               ↓           ↓           ↓             │
│          Controller  Controller  Controller         │
│               │           │           │             │
│     ┌─────────┼─────┐     │     ┌─────┴──────┐      │
│     ↓         ↓     ↓     ↓     ↓            ↓      │
│   Model    Helper Model Model Helper14     Model    │
│     │         │           │                  │      │
│     └────┬────┘           └────────┬─────────┘      │
│          ↓                         ↓                │
│      MySQL DB               Blade Templates         │
│  (43 tables, no migrations                          │
│   applied since 2021)                               │
└─────────────────────────────────────────────────────┘
The technical debt wasn't the real problem. The real problem was velocity. New engineers spent six weeks just orienting themselves. A feature that should take two days took two weeks. We were losing ground to newer competitors built on modern stacks.
The decision to migrate felt inevitable. The pitch was clean: Next.js for the frontend, a proper REST API layer, TypeScript everywhere. Component architecture. Tests that ran in under a minute. A codebase a new engineer could contribute to on day three.
What we underestimated: the monolith had seven years of implicit contracts baked into it. They weren't documented. They weren't tested. They just worked. Until we started pulling threads.
The Plan: Strangler Fig. The Reality: Strangling Ourselves.
We chose the strangler fig pattern — the industry-standard approach to big rewrites. Run the new system in parallel, route traffic to it piece by piece, and gradually strangle the old system until it's dead. Clean. Surgical. Beloved by conference talks.
Month one went beautifully. We stood up the Next.js app, connected it to the existing MySQL database, built the login screen, the dashboard skeleton. An Nginx reverse proxy sat in front of both systems: new routes went to Next.js, everything else fell through to PHP.
┌────────────────── STRANGLER FIG SETUP ───────────────────┐
│                                                          │
│  Browser                                                 │
│     │                                                    │
│     ↓                                                    │
│  Nginx Reverse Proxy                                     │
│     │                                                    │
│     ├─── /app/clients*  ──────→ Next.js (new)            │
│     ├─── /app/invoices* ──────→ Next.js (new)            │
│     │                                                    │
│     └─── /* (everything else) → PHP Laravel (old)        │
│                                                          │
│  ┌────────────────────────────────────────────────────┐  │
│  │               Shared MySQL Database                │  │
│  │          (both apps read/write directly)           │  │
│  └────────────────────────────────────────────────────┘  │
│                                                          │
│  THE HIDDEN PROBLEM: Sessions, cache, and auth live      │
│  in completely different worlds on each side.            │
└──────────────────────────────────────────────────────────┘
The first real problem hit us in week three: session management. PHP stored sessions server-side in Redis. Next.js was issuing JWTs. When a user logged in on the PHP side and clicked a link that routed them to a Next.js page, the Next.js app had no idea who they were. They were silently redirected to the login screen.
At first this only affected internal testers. Then we accidentally routed a client-facing page to Next.js without closing the auth gap. Forty-seven clients got logged out mid-session on a Thursday afternoon. Eleven emailed support. One called the CEO directly.
The Session Bridge: A Hack That Became Infrastructure
Our fix was inelegant but necessary. We built a PHP endpoint that accepted a valid PHP session cookie and returned a short-lived signed token. Next.js middleware would request this token on the first authenticated page load, then use it against a new Node API layer.
<?php
// PHP → Next.js session bridge
// Called by Next.js middleware on first authenticated load
session_start();

// Every response from this endpoint is JSON, including the 401 below
header('Content-Type: application/json');

if (!isset($_SESSION['user_id'])) {
    http_response_code(401);
    echo json_encode(['error' => 'No active session']);
    exit;
}

$userId  = $_SESSION['user_id'];
$role    = $_SESSION['user_role'];
$expires = time() + 300; // 5-minute bridge window

// The payload is base64-encoded JSON; the signature covers the encoded string
$payload = base64_encode(json_encode([
    'uid'  => $userId,
    'role' => $role,
    'exp'  => $expires,
]));

// Shared secret lives in .env on both the PHP and Node sides
$signature = hash_hmac('sha256', $payload, getenv('BRIDGE_SECRET'));

echo json_encode([
    'token'     => $payload . '.' . $signature,
    'expiresAt' => $expires,
]);
It worked. It also became a critical piece of infrastructure that we hadn't planned for and weren't testing properly. Worse, it failed silently whenever the PHP session expired between the moment Next.js loaded the page and the moment the middleware made the bridge request. We added retry logic. Then monitoring. Then alerts. A "temporary" bridge endpoint lived in production for eleven months.
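For anyone building the receiving end of a bridge like this, here is roughly what the Node side has to do with that token. This is a minimal sketch, not our production middleware: `verifyBridgeToken` and `BridgeClaims` are names made up for illustration, and it assumes the token format shown in the PHP above (base64 JSON payload, a dot, then the lowercase hex HMAC-SHA256 that `hash_hmac` produces by default).

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Payload shape the PHP bridge encodes: uid, role, exp (Unix seconds).
interface BridgeClaims {
  uid: number | string;
  role: string;
  exp: number;
}

// Hypothetical helper: returns the claims if the token is authentic and
// unexpired, otherwise null. `secret` must match BRIDGE_SECRET on the PHP side.
function verifyBridgeToken(
  token: string,
  secret: string,
  nowSeconds: number = Math.floor(Date.now() / 1000),
): BridgeClaims | null {
  const dot = token.lastIndexOf(".");
  if (dot <= 0) return null;

  const payload = token.slice(0, dot);
  const signature = token.slice(dot + 1);

  // Recompute the HMAC over the encoded payload string, exactly as PHP did.
  const expected = createHmac("sha256", secret).update(payload).digest("hex");
  const given = Buffer.from(signature, "utf8");
  const wanted = Buffer.from(expected, "utf8");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  if (given.length !== wanted.length || !timingSafeEqual(given, wanted)) {
    return null;
  }

  let claims: BridgeClaims;
  try {
    claims = JSON.parse(Buffer.from(payload, "base64").toString("utf8"));
  } catch {
    return null;
  }
  // Reject malformed payloads and anything past the bridge window.
  if (!claims || typeof claims.exp !== "number" || claims.exp <= nowSeconds) {
    return null;
  }
  return claims;
}
```

Returning `null` for every failure keeps the caller honest: the middleware has to decide explicitly whether to retry the bridge request or send the user to the login screen, rather than failing silently.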
The Database Problem Nobody Mentioned
Running two applications against one database sounds fine until you think about it for five minutes. Both apps were doing writes. Both had their own connection pools. The PHP app used Laravel's Eloquent ORM with soft deletes on almost every model: a deleted_at timestamp that marked a row as gone without physically removing it. The Next.js API layer, using Prisma, didn't know this convention existed.
For three months, when users deleted a client in the new Next.js interface, the record was hard-deleted from the database. The PHP app, expecting soft deletes, then threw null reference errors when related data tried to load a parent record that was now gone. The errors were silent — logged to a file no one was actively watching.
We found out when a client called to ask why six months of job history had disappeared. They'd deleted and re-added a client thinking it would "reset" the account. Instead, it permanently destroyed everything associated with that client's ID: jobs, invoices, notes, history. All of it. One hard DELETE, cascading to every related table, and it was gone.
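The convention we broke is small enough to fit on one screen. The sketch below is an in-memory stand-in, not our Prisma code: `ClientStore` and its method names are invented for illustration, and the only thing taken from the real schema is the deleted_at column. It shows the two rules the new API layer needed to honour: a delete stamps the row instead of removing it, and default reads hide stamped rows.

```typescript
// Row shape following the legacy convention: deleted_at is null while live.
interface ClientRow {
  id: number;
  name: string;
  deleted_at: Date | null;
}

// Hypothetical in-memory stand-in for the clients table.
class ClientStore {
  private rows = new Map<number, ClientRow>();

  insert(id: number, name: string): void {
    this.rows.set(id, { id, name, deleted_at: null });
  }

  // "Delete" stamps deleted_at instead of removing the row, so related
  // jobs and invoices still have a parent record to point at.
  softDelete(id: number, now: Date = new Date()): boolean {
    const row = this.rows.get(id);
    if (!row || row.deleted_at !== null) return false;
    row.deleted_at = now;
    return true;
  }

  // Default reads hide soft-deleted rows, which is what the legacy
  // application code expects from a normal query.
  findActive(id: number): ClientRow | undefined {
    const row = this.rows.get(id);
    return row && row.deleted_at === null ? row : undefined;
  }

  // Explicit opt-in for code that must see deleted rows (history, restores).
  findIncludingDeleted(id: number): ClientRow | undefined {
    return this.rows.get(id);
  }
}

const store = new ClientStore();
store.insert(1, "Acme");
store.softDelete(1);
console.log(store.findActive(1) === undefined);           // row is hidden
console.log(store.findIncludingDeleted(1) !== undefined); // but not destroyed
```

Had the new delete path looked like `softDelete` instead of a physical delete, the client's "reset" would have hidden the account rather than erased its history.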
What We Got Right (Eventually)
By month six, we stopped pretending the rewrite was on schedule and started treating it as the long migration it actually was. A few things finally turned the tide:
- API contracts first. We stopped letting both apps touch the database directly. Every module got a versioned API contract written before a single line of implementation. Both the PHP adapter and the new Next.js frontend talked exclusively through that API. No more shared-DB surprises.
- Feature flags over URL routing. Instead of routing by URL pattern at Nginx, we moved to a per-client feature flag system. Each account could be opted into specific Next.js features independently. Rollbacks became a config change, not a deployment.
- The database conventions document. We spent a week writing down every implicit convention in the legacy schema — soft deletes, JSON column formats, enum string values, composite key patterns, timestamp timezone assumptions. Every new API endpoint had a checklist item for each convention.
- An owner for the seam. One engineer owned the boundary between old and new full-time. That role shouldn't need to exist in a well-planned migration. In a real one, it's the most important seat on the team.
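The per-client flag idea from the list above can be sketched in a few lines. This is an illustration under assumptions, not our actual flag service: the feature names, `FlagTable`, `resolveBackend`, and `setFlag` are all hypothetical. The point it makes is that the routing decision becomes a lookup keyed by account and feature, with the legacy app as the default.

```typescript
// Hypothetical flag names, one per migrated module.
type Feature = "clients" | "invoices" | "scheduling" | "reporting";
type Backend = "nextjs" | "php";

// accountId -> set of features enabled for that account.
type FlagTable = Map<number, Set<Feature>>;

// Unknown accounts and unflagged features fall through to the legacy app,
// mirroring the old "everything else goes to PHP" behaviour.
function resolveBackend(
  flags: FlagTable,
  accountId: number,
  feature: Feature,
): Backend {
  return flags.get(accountId)?.has(feature) ? "nextjs" : "php";
}

// Opting an account in or out is a data change, not a deployment.
function setFlag(
  flags: FlagTable,
  accountId: number,
  feature: Feature,
  enabled: boolean,
): void {
  const current = flags.get(accountId) ?? new Set<Feature>();
  if (enabled) {
    current.add(feature);
  } else {
    current.delete(feature);
  }
  flags.set(accountId, current);
}

const flags: FlagTable = new Map();
setFlag(flags, 42, "invoices", true);
console.log(resolveBackend(flags, 42, "invoices")); // account 42 is opted in
setFlag(flags, 42, "invoices", false);              // rollback for one account
console.log(resolveBackend(flags, 42, "invoices")); // back on the legacy app
```

Defaulting to the old system matters more than it looks: a missing or corrupted flag record degrades to the behaviour clients already know, instead of stranding them on a half-migrated page.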
The Day We Flipped the Switch
Month seventeen. Every module had been migrated. The PHP app now served exactly one page: a legacy summary report that one enterprise client relied on and that we'd agreed to rebuild once the core migration was complete. Feature flags were at 100% for all other clients on all other routes.
We shut down the PHP server on a Tuesday at 2 PM. I watched the Nginx access logs for twenty minutes. No 404s. No 500s. No session bridge requests. The new system handled everything without a flicker.
There was no celebration. It felt more like exhaling after holding your breath for a year and a half — relief without elation, because we were too tired for elation. But the next feature request? Two engineers, three days, shipped clean. No Blade templates. No mystery helper files. TypeScript components, typed API contracts, tests that ran in twenty-eight seconds.
The monolith was dead. The strangler fig had finished the job. It just took eighteen months instead of three.
What I'd Tell Someone Starting This Tomorrow
Don't rewrite. Migrate. Those are different things. A rewrite is "start over and do it right." A migration is "move everything of value into a new home without breaking what already works." Migration is slower, messier, and far more likely to actually ship.
Triple your timeline estimate. Then add six months. Not because you're slow — because the old system will reveal new implicit contracts every single week. Budget for that discovery time or it will surprise you at the worst possible moment.
Document the seam obsessively. Every place the old and new systems touch is a bug waiting to happen. Auth handoffs, session formats, database conventions, file storage paths, cache key namespaces — write it all down. The day you assume the other system handles something the same way yours does is the day a client loses data.
Give someone ownership of the migration itself. Not as a side project. Not "when they have time." It is the most important engineering work happening in the organisation during that window, and it needs an engineer who wakes up every morning thinking exclusively about the seam.
Seven years of PHP didn't kill us. Eighteen months of migration didn't kill us. We came out the other side faster, cleaner, and a lot more humble about what the word "simple" means in a production codebase with real users depending on it.
Full Stack Engineer · March 2026