Node.js cluster module — use all your CPU cores
Node.js executes JavaScript on a single thread. A 32-core machine running a single Node.js process uses one core and leaves 31 idle. The cluster module fixes this by forking worker processes, typically one per CPU core, all sharing the same port. By default the primary process distributes incoming connections to the workers round-robin (except on Windows, where the OS kernel decides which worker accepts each connection). Here is how to implement it correctly, including graceful restarts and health monitoring.
Basic cluster setup
```ts
// cluster.ts
import cluster from 'node:cluster';
import { cpus } from 'node:os';
import { createServer } from './server';

const NUM_WORKERS = cpus().length;

if (cluster.isPrimary) {
  console.log(`Primary ${process.pid} starting ${NUM_WORKERS} workers`);

  // Fork workers
  for (let i = 0; i < NUM_WORKERS; i++) {
    cluster.fork();
  }

  // Replace dead workers
  cluster.on('exit', (worker, code, signal) => {
    console.log(`Worker ${worker.process.pid} died (code ${code}, signal ${signal})`);
    console.log('Forking replacement worker');
    cluster.fork();
  });
} else {
  // Worker process: start the HTTP server
  const app = createServer();
  const port = parseInt(process.env.PORT || '3000', 10);
  app.listen(port, () => {
    console.log(`Worker ${process.pid} listening on port ${port}`);
  });
}
```
Graceful restart — zero-downtime deployments
When deploying, you want to restart workers one at a time so the server never goes down:
```ts
// Graceful rolling restart on SIGUSR2
if (cluster.isPrimary) {
  for (let i = 0; i < NUM_WORKERS; i++) {
    cluster.fork();
  }

  process.on('SIGUSR2', async () => {
    console.log('Received SIGUSR2: starting rolling restart');
    // Snapshot the current workers; forking below mutates cluster.workers
    const oldWorkers = Object.values(cluster.workers ?? {});
    for (const worker of oldWorkers) {
      if (!worker) continue;
      await new Promise<void>((resolve) => {
        // Wait for the new worker to come up before retiring the old one
        const newWorker = cluster.fork();
        newWorker.once('listening', () => {
          console.log(`New worker ${newWorker.process.pid} is ready`);
          // Now gracefully shut down the old worker
          worker.send('shutdown');
          worker.disconnect();
          worker.once('exit', () => {
            console.log(`Old worker ${worker.process.pid} exited`);
            resolve();
          });
        });
      });
      // Small delay between worker replacements
      await new Promise((r) => setTimeout(r, 500));
    }
    console.log('Rolling restart complete');
  });

  cluster.on('exit', (worker, code, signal) => {
    // exitedAfterDisconnect is true only when we retired the worker
    // ourselves via disconnect(). Checking the signal is not enough:
    // a graceful exit has signal null, which would trigger a spurious
    // replacement fork during every rolling restart.
    if (!worker.exitedAfterDisconnect) {
      console.log(`Worker ${worker.process.pid} died unexpectedly (code ${code}, signal ${signal}); forking replacement`);
      cluster.fork();
    }
  });
}
```
```ts
// In the worker: handle the shutdown message.
// `server` is the http.Server returned by app.listen() in the worker.
process.on('message', (msg) => {
  if (msg === 'shutdown') {
    console.log(`Worker ${process.pid} shutting down gracefully`);
    server.close(() => {
      process.exit(0);
    });
  }
});
```
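One gap in the shutdown handler: `server.close()` waits for in-flight connections, so a long-lived keep-alive connection can stall a rolling restart indefinitely. A sketch of a close-with-deadline helper; the name `shutdownGracefully`, the 10-second default, and the injectable `exit` parameter are choices of this sketch, not part of the code above:

```typescript
import type { Server } from 'node:http';

// Close the server, but force-exit after a deadline so a stuck
// connection cannot hang the rolling restart forever.
export function shutdownGracefully(
  server: Server,
  timeoutMs = 10_000,
  exit: (code: number) => void = process.exit,
): void {
  const timer = setTimeout(() => exit(1), timeoutMs);
  timer.unref(); // do not keep the process alive just for this timer
  server.close(() => {
    clearTimeout(timer);
    exit(0); // all connections drained in time
  });
}
```

The worker's message handler would call `shutdownGracefully(server)` instead of `server.close()` directly.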
Triggering a rolling restart during deploy
```sh
# In your deploy script

# 1. Copy new code
rsync -az ./dist/ server:/app/dist/

# 2. Signal the primary to do a rolling restart
kill -SIGUSR2 $(cat /var/run/myapp.pid)

# 3. Watch the logs to confirm
tail -f /var/log/myapp.log | grep -E "ready|exited"
```
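The deploy script reads the primary's PID from /var/run/myapp.pid, which means the primary must write that file when it boots. A minimal sketch; the path comes from the script above, and the helper names are inventions of this example:

```typescript
import { writeFileSync, readFileSync } from 'node:fs';

// Write the current process's PID so external tooling (the deploy
// script's `kill -SIGUSR2 $(cat ...)`) can signal the primary.
export function writePidFile(path: string): void {
  writeFileSync(path, String(process.pid), 'utf8');
}

// Read a PID back, e.g. for a status check in tooling.
export function readPidFile(path: string): number {
  return parseInt(readFileSync(path, 'utf8'), 10);
}
```

Call `writePidFile('/var/run/myapp.pid')` in the primary before forking workers, and delete the file on clean exit so stale PIDs do not linger.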
Sharing state between workers
Workers do not share memory — each is a separate OS process. This has implications:
- In-memory rate limiting counts are per-worker, not global
- WebSocket connections handled by one worker cannot receive messages pushed by another
- Any state that needs to be shared must go in Redis, a database, or a message queue
```ts
// BAD: in-memory rate limiting with cluster
const requestCounts: Record<string, number> = {}; // per-worker only

// GOOD: Redis-backed rate limiting works across all workers
import Redis from 'ioredis';

const redis = new Redis();

async function isRateLimited(ip: string): Promise<boolean> {
  // Fixed one-minute window, keyed by IP
  const key = `rate:${ip}:${Math.floor(Date.now() / 60000)}`;
  const count = await redis.incr(key);
  if (count === 1) await redis.expire(key, 60);
  return count > 100;
}
```
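For the WebSocket case in the list above, Redis pub/sub is the usual answer, but on a single machine you can also relay through the primary: a worker sends a message up, and the primary fans it out to every worker. A sketch using only the cluster module's built-in IPC; the envelope shape and the `makeBroadcast` helper are inventions of this example:

```typescript
import cluster from 'node:cluster';

type Broadcast = { type: 'broadcast'; event: string; payload: unknown };

// Wrap an event in the envelope workers and primary exchange.
export function makeBroadcast(event: string, payload: unknown): Broadcast {
  return { type: 'broadcast', event, payload };
}

if (cluster.isPrimary) {
  // Relay any broadcast from one worker to every worker (sender included).
  cluster.on('message', (_sender, msg: Broadcast) => {
    if (msg?.type === 'broadcast') {
      for (const w of Object.values(cluster.workers ?? {})) {
        w?.send(msg);
      }
    }
  });
} else {
  // Worker: publish by sending to the primary...
  process.send?.(makeBroadcast('user-joined', { id: 42 }));
  // ...and deliver incoming broadcasts to local WebSocket clients.
  process.on('message', (msg: Broadcast) => {
    if (msg?.type === 'broadcast') {
      // e.g. for (const ws of localSockets) ws.send(JSON.stringify(msg));
    }
  });
}
```

This avoids a Redis dependency, but it only works within one machine; once you run multiple hosts behind a load balancer, you need Redis pub/sub or similar anyway.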
Health endpoint that reports all workers
```ts
import http from 'node:http'; // ESM imports must be top-level, not inside the if block
import cluster from 'node:cluster';

// Primary collects health reports from all workers
if (cluster.isPrimary) {
  const workerHealth: Record<number, object> = {};

  // Listen on the cluster, not on each worker individually: this also
  // covers workers forked later (e.g. restart replacements)
  cluster.on('message', (worker, msg) => {
    if (msg?.type === 'health' && worker.process.pid) {
      workerHealth[worker.process.pid] = msg.data;
    }
  });

  // Simple HTTP health endpoint served by the primary
  http.createServer((req, res) => {
    if (req.url === '/cluster-health') {
      res.writeHead(200, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({
        primaryPid: process.pid,
        workerCount: Object.keys(cluster.workers ?? {}).length,
        workers: workerHealth,
      }));
    } else {
      res.writeHead(404);
      res.end();
    }
  }).listen(9999);
}

// Workers report health periodically
if (cluster.isWorker) {
  setInterval(() => {
    process.send?.({
      type: 'health',
      data: {
        pid: process.pid,
        memory: process.memoryUsage().heapUsed,
        uptime: process.uptime(),
      },
    });
  }, 5000);
}
```
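One limitation of the endpoint above: `workerHealth` keeps a dead worker's last report forever, so the endpoint can serve healthy-looking stale data. A small helper that flags stale entries, assuming the primary stamps each report with a `reportedAt` timestamp when storing it; the 15-second default (three missed 5-second reports) is a choice of this sketch:

```typescript
type WorkerReport = {
  pid: number;
  memory: number;
  uptime: number;
  reportedAt: number; // Date.now() when the primary stored the report
};

// Return PIDs whose last health report is older than maxAgeMs, so the
// endpoint can mark them "stale" instead of silently serving a dead
// worker's last numbers.
export function staleWorkers(
  health: Record<number, WorkerReport>,
  now: number,
  maxAgeMs = 15_000,
): number[] {
  return Object.values(health)
    .filter((w) => now - w.reportedAt > maxAgeMs)
    .map((w) => w.pid);
}
```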
When cluster is the wrong tool
Cluster works well for CPU-bound tasks and high-concurrency I/O servers. It is the wrong tool when:
- You are already running multiple instances behind a load balancer — cluster adds complexity without benefit
- Your bottleneck is I/O (database, network) rather than CPU — a single Node process with async I/O handles this fine
- You need shared state between workers — Redis is simpler than inter-process communication
The sweet spot for cluster: a CPU-intensive Node.js server (data transformation, image processing, heavy computation) on a single machine that you want to scale before moving to Kubernetes.