
Node.js Worker Threads vs Cluster — When to Use Which

Worker threads and cluster mode solve different problems. This guide explains when to use each, how they interact with process managers, and how to get the most out of multi-core CPUs in Node.js.

Oxmgr Team

Node.js runs your JavaScript on a single thread. That’s not a bug — it’s what makes the event loop fast for I/O-heavy workloads. But modern servers have 8, 16, or 64 cores, and leaving them idle is wasteful.

Node.js offers two ways to use multiple cores: Cluster and Worker Threads. They solve different problems and shouldn’t be used interchangeably.

The Short Answer

| Use case | Solution |
|---|---|
| Multiple HTTP servers, each handling requests independently | Cluster |
| CPU-intensive computation (image processing, crypto, parsing) | Worker Threads |
| Parallel I/O (many database queries at once) | Neither — async/await handles this |
| Mixed: web server + heavy computation | Both |

Node.js Cluster

Cluster forks the Node.js process into multiple identical copies. Each copy runs your full application — its own event loop, memory space, and server. The primary process accepts incoming connections and distributes them across the workers (round-robin by default on every platform except Windows, where the OS does the distributing).

import cluster from 'node:cluster';
import { availableParallelism } from 'node:os';
import express from 'express';

const numCPUs = availableParallelism(); // number of logical CPUs

if (cluster.isPrimary) {
  console.log(`Primary ${process.pid} forking ${numCPUs} workers`);

  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }

  cluster.on('exit', (worker, code, signal) => {
    console.log(`Worker ${worker.process.pid} exited — forking replacement`);
    cluster.fork(); // keep the pool full
  });

} else {
  // Each worker runs this
  const app = express();

  app.get('/', (req, res) => {
    res.json({ pid: process.pid });
  });

  app.listen(3000, () => {
    console.log(`Worker ${process.pid} listening on :3000`);
  });
}

All workers share port 3000, and the primary distributes incoming connections among them. Simple.
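
The distribution policy is configurable, as long as you set it before the first fork(): the default (everywhere except Windows) is round-robin in the primary, and SCHED_NONE hands the job to the OS instead.

import cluster from 'node:cluster';

// Must be set before calling cluster.fork().
// Default is cluster.SCHED_RR: the primary accepts connections
// and hands them to workers round-robin. SCHED_NONE lets the OS
// decide which worker gets each connection.
cluster.schedulingPolicy = cluster.SCHED_NONE;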

What cluster gives you:

  • Full multi-core CPU utilization for request handling
  • Isolation: one worker crash doesn’t affect others
  • Zero-downtime restart: restart workers one at a time while others keep serving

What cluster doesn’t give you:

  • Shared memory between workers (each has its own heap; for coordination, use IPC messages, as sketched after this list)
  • Parallelism for a single CPU-intensive operation
  • Any advantage for I/O-bound workloads (async/await already handles those)
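
Since each worker has its own heap, coordination happens through message passing between the primary and its workers. A minimal sketch:

import cluster from 'node:cluster';

// In a worker: report a metric to the primary
if (!cluster.isPrimary) {
  process.send({ cmd: 'metrics', rss: process.memoryUsage().rss });
}

// In the primary: receive messages from all workers
cluster.on('message', (worker, msg) => {
  if (msg.cmd === 'metrics') {
    console.log(`Worker ${worker.process.pid} RSS: ${msg.rss} bytes`);
  }
});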

Cluster with a Process Manager

Running cluster yourself means your app is responsible for forking and replacing crashed workers. A simpler approach: let Oxmgr handle the instances, and write a single-process app.

# oxfile.toml
[processes.api]
command = "node dist/server.js"
instances = 4   # Oxmgr forks 4 instances, each on a different port
restart_on_exit = true
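
Each instance must listen on its own port. How Oxmgr assigns those ports isn't shown here; this sketch assumes the common pattern of a per-instance PORT environment variable:

// server.js: a plain single-process app, no cluster code.
// Assumes the process manager sets PORT per instance (an assumption
// for this sketch, not documented Oxmgr behavior).
import express from 'express';

const app = express();
const port = Number(process.env.PORT ?? 3000);

app.get('/', (req, res) => {
  res.json({ pid: process.pid, port });
});

app.listen(port, () => {
  console.log(`Instance ${process.pid} listening on :${port}`);
});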

With Nginx load-balancing across them:

upstream api {
    server 127.0.0.1:3000;
    server 127.0.0.1:3001;
    server 127.0.0.1:3002;
    server 127.0.0.1:3003;
}

server {
    listen 80;
    location / {
        proxy_pass http://api;
    }
}

Your app stays simple (no cluster code), and Oxmgr handles restarts, health checks, and rolling deploys across all 4 instances. This is the approach we recommend for most production setups. See the multi-process deployment guide for the full Nginx configuration.

Node.js Worker Threads

Worker Threads are for CPU-bound work — operations that would block the event loop if run on the main thread.

The problem with CPU-heavy code on the main thread:

// This blocks the event loop for ~500ms
// Every other request waits during this time
app.post('/process', (req, res) => {
  const result = expensiveComputation(req.body.data); // blocks!
  res.json({ result });
});

While expensiveComputation runs, Node.js can’t process any other requests. From the outside, your server appears frozen.

The fix — offload to a worker thread:

import { Worker, isMainThread, parentPort, workerData } from 'node:worker_threads';
import { fileURLToPath } from 'node:url';

// This block runs only in worker threads
if (!isMainThread) {
  const result = expensiveComputation(workerData.data);
  parentPort.postMessage(result);
  process.exit(0); // stop here so the worker never reaches the main-thread code below
}

// In the main thread
function runInWorker(data) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(fileURLToPath(import.meta.url), {
      workerData: { data }
    });

    worker.on('message', resolve);
    worker.on('error', reject);
    worker.on('exit', (code) => {
      if (code !== 0) reject(new Error(`Worker exited with code ${code}`));
    });
  });
}

app.post('/process', async (req, res) => {
  const result = await runInWorker(req.body.data); // non-blocking!
  res.json({ result });
});

The main thread immediately gets back to handling requests while the worker runs on a separate OS thread.
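
These examples assume an expensiveComputation function. Any synchronous CPU-bound function shows the effect; here is a placeholder that just burns cycles hashing (the iteration count is a rough guess at a few hundred milliseconds of work):

import { createHash } from 'node:crypto';

// Placeholder: repeated hashing to simulate CPU-bound work
function expensiveComputation(data) {
  let digest = String(data);
  for (let i = 0; i < 300_000; i++) {
    digest = createHash('sha256').update(digest).digest('hex');
  }
  return digest;
}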

Worker Thread Pool

Spawning a new thread per request is expensive. For frequent CPU-bound operations, use a thread pool:

import { Worker } from 'node:worker_threads';
import { EventEmitter } from 'node:events';

class WorkerPool extends EventEmitter {
  #queue = [];            // tasks waiting for a free worker
  #inFlight = new Map();  // id -> { resolve, reject } of dispatched tasks
  #available = [];

  constructor(workerScript, size = 4) {
    super();
    for (let i = 0; i < size; i++) {
      this.#addWorker(workerScript);
    }
  }

  #addWorker(script) {
    const worker = new Worker(script);

    worker.on('message', ({ result, error, id }) => {
      const pending = this.#inFlight.get(id);
      this.#inFlight.delete(id);

      if (pending) {
        if (error) pending.reject(new Error(error));
        else pending.resolve(result);
      }

      // Mark as available and process the next queued task
      this.#available.push(worker);
      this.#processNext();
    });

    this.#available.push(worker);
  }

  #processNext() {
    if (this.#queue.length === 0 || this.#available.length === 0) return;

    // Dequeue so the same task can never be dispatched to two workers
    const { task, id, resolve, reject } = this.#queue.shift();
    const worker = this.#available.shift();

    this.#inFlight.set(id, { resolve, reject });
    worker.postMessage({ task, id });
  }

  run(task) {
    return new Promise((resolve, reject) => {
      const id = Math.random().toString(36).slice(2);
      this.#queue.push({ id, task, resolve, reject });
      this.#processNext();
    });
  }
}

// Create pool of 4 workers
const pool = new WorkerPool('./compute-worker.js', 4);

app.post('/compress', async (req, res) => {
  const result = await pool.run({ type: 'compress', data: req.body.data });
  res.json(result);
});
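
The matching compute-worker.js receives { task, id } messages and answers with { result, id } or { error, id }. A sketch, with the actual computation left as a placeholder:

// compute-worker.js: counterpart to the pool above
import { parentPort } from 'node:worker_threads';

parentPort.on('message', ({ task, id }) => {
  try {
    const result = doWork(task); // placeholder for the real CPU-bound work
    parentPort.postMessage({ result, id });
  } catch (err) {
    parentPort.postMessage({ error: err.message, id });
  }
});

function doWork({ type, data }) {
  // Stand-in: swap in compression, parsing, hashing, etc.
  return { type, bytes: data?.length ?? 0 };
}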

Or use a battle-tested library:

npm install piscina

import Piscina from 'piscina';

const piscina = new Piscina({
  filename: new URL('./compute-worker.js', import.meta.url).href,
  maxThreads: 4
});

app.post('/process', async (req, res) => {
  const result = await piscina.run(req.body.data);
  res.json({ result });
});

piscina handles pool management, queuing, and worker lifecycle automatically. Note that the worker contract differs from the hand-rolled pool above: a Piscina worker exports a default function instead of listening on parentPort, as the next example shows.

Real-World Example: Image Processing

A typical use case — resize images on upload:

// image-worker.js
import sharp from 'sharp';

export default async function processImage({ buffer, width, height, format }) {
  const result = await sharp(Buffer.from(buffer))
    .resize(width, height, { fit: 'cover' })
    .toFormat(format)
    .toBuffer();

  return result;
}

// server.js
import express from 'express';
import multer from 'multer';
import Piscina from 'piscina';

const app = express();

const pool = new Piscina({
  filename: new URL('./image-worker.js', import.meta.url).href,
  maxThreads: 4
});

const upload = multer({ storage: multer.memoryStorage() });

app.post('/upload', upload.single('image'), async (req, res) => {
  const resized = await pool.run({
    buffer: req.file.buffer,
    width: 800,
    height: 600,
    format: 'webp'
  });

  res.set('Content-Type', 'image/webp');
  res.send(resized);
});

app.listen(3000);

The main thread handles HTTP. Worker threads handle Sharp (which is CPU-intensive). 4 images can be resized simultaneously without blocking any requests.

When You Need Both

Heavy traffic + CPU-intensive work requires both cluster and worker threads:

[Load Balancer / Nginx]
        │
   ┌────┴────┐
[Worker 0] [Worker 1]  ← Cluster instances (one per CPU core for HTTP)
   │            │
[Thread Pool] [Thread Pool]  ← Worker threads within each cluster worker

# oxfile.toml — 4 cluster instances
[processes.api]
command = "node dist/server.js"
instances = 4
restart_on_exit = true

// Each cluster instance runs its own thread pool
const pool = new Piscina({
  filename: new URL('./worker.js', import.meta.url).href,
  maxThreads: 2   // 4 instances × 2 threads = 8 total worker threads
});

Don’t over-thread. instances × maxThreads should not exceed your CPU count by much, or you’ll spend more time context-switching than computing.
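
A sketch of sizing the pool from the machine, assuming the app knows how many instances are running (OXMGR_INSTANCES here is hypothetical; use whatever your setup actually provides):

import { availableParallelism } from 'node:os';

// Hypothetical env var for the number of cluster instances running
const instances = Number(process.env.OXMGR_INSTANCES ?? 4);

// Split the logical CPUs across instances, with at least one thread each
const maxThreads = Math.max(1, Math.floor(availableParallelism() / instances));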

What Doesn’t Need Either

Most Node.js performance problems aren’t CPU-bound — they’re I/O waits. Database queries, HTTP calls, file reads. These are already handled efficiently by the event loop.

// This is fine — async I/O doesn't block the event loop
app.get('/data', async (req, res) => {
  const [users, orders] = await Promise.all([
    db.query('SELECT * FROM users WHERE active = true'),
    db.query('SELECT * FROM orders WHERE created_at > NOW() - INTERVAL 7 DAY')
  ]);
  res.json({ users, orders });
});

Two parallel database queries with zero threads. Adding cluster or worker threads here wouldn’t help.

The question to ask: is this work CPU-bound (it blocks the thread) or I/O-bound (it waits on external data)? If I/O-bound, async/await is enough. If CPU-bound, use worker threads. If you need to scale request volume beyond what a single process can handle, use cluster (or multiple Oxmgr instances).

Summary

  • Cluster / Oxmgr instances — scale request handling across CPU cores. Use for web servers, API servers, any app that needs horizontal scaling on a single machine.
  • Worker Threads — move CPU-heavy computation off the main thread. Use for image processing, video encoding, cryptography, heavy data parsing.
  • Neither — most I/O-bound work. Trust the event loop.

For process-level management of clustered Node.js apps, see the Oxmgr docs — the instances field handles multi-process deployment without cluster code in your app.