Linux Process Management Guide

Running applications reliably on Linux means understanding how the OS thinks about processes — not just which commands to run. This guide covers the fundamentals: what processes are, how they live and die, how the OS controls them, and how modern process managers fit into that picture.

What Is a Process?

From Linux’s perspective, a process is a running instance of a program. It has:

A PID (Process ID) — unique integer, assigned by the kernel
A PPID (Parent PID) — the process that created it
An owner — user and group that control access
File descriptors — open files, sockets, pipes
Memory maps — stack, heap, code segments
An exit status — 0 for success, non-zero for error

You can see all processes with:

ps aux
# or more readable:
ps -eo pid,ppid,user,stat,comm --sort=pid | head -30
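
A process can also inspect several of these attributes about itself. The sketch below uses Node.js's built-in process object on a POSIX system; the helper name describeSelf is illustrative and not part of any API.

```javascript
// Collect the kernel-assigned identity of the current process.
function describeSelf() {
  return {
    pid: process.pid,       // this process's ID
    ppid: process.ppid,     // ID of the process that started it
    uid: process.getuid(),  // numeric user ID of the owner (POSIX only)
    gid: process.getgid(),  // numeric group ID of the owner (POSIX only)
    cwd: process.cwd(),     // current working directory
  };
}

console.log(describeSelf());
```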

The Process Lifecycle

Every Linux process goes through the same lifecycle:

Created (fork/exec)
    ↓
Running (RUNNING or SLEEPING)
    ↓
Stopped (SIGSTOP) ← optional
    ↓
Zombie (waiting for parent to read exit status)
    ↓
Dead (removed from process table)

Running (R): executing on a CPU, or runnable and waiting for CPU time
Sleeping (S): waiting for an event (I/O, timer, lock)
Zombie (Z): process has exited but parent hasn’t called wait() yet
Stopped (T): process is paused via SIGSTOP (or SIGTSTP from Ctrl+Z)

The zombie state matters for process managers. When a managed process dies, the manager must call wait() to reap the zombie and free the PID. A leaky process manager that doesn’t reap children will accumulate zombie processes.
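
As a rough illustration of reaping, the sketch below starts two short-lived children from Node.js and reads how each one ended. Node calls wait() on your behalf, so neither child lingers as a zombie. The helper name runAndReap is made up for this example.

```javascript
const { spawnSync } = require('node:child_process');

// Run a child to completion. spawnSync blocks until the child exits, and Node
// collects its exit status, so no zombie is left behind.
function runAndReap(args) {
  const result = spawnSync(process.execPath, args);
  return { status: result.status, signal: result.signal };
}

// A child that exits on its own reports an exit code and no signal.
const exited = runAndReap(['-e', 'process.exit(3)']);

// A child killed by a signal reports the signal name and a null exit code.
const killed = runAndReap(['-e', "process.kill(process.pid, 'SIGKILL')"]);

console.log(exited, killed);
```

A process manager needs both fields: a non-zero status and a terminating signal are different kinds of failure, and restart policies often treat them differently.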

Process Signals

Signals are the OS’s mechanism for communicating with running processes. You’ve likely used kill — but kill sends any signal, not just termination.

Signal	Number	Default Action	Meaning
SIGHUP	1	Terminate	Terminal hangup / reload config
SIGINT	2	Terminate	Keyboard interrupt (Ctrl+C)
SIGQUIT	3	Core dump	Keyboard quit (Ctrl+\)
SIGKILL	9	Terminate	Forceful kill (cannot be caught)
SIGTERM	15	Terminate	Graceful termination request
SIGSTOP	19	Stop	Pause process (cannot be caught)
SIGCONT	18	Continue	Resume stopped process
SIGUSR1	10	Terminate	User-defined signal 1
SIGUSR2	12	Terminate	User-defined signal 2

SIGTERM vs SIGKILL

The most important distinction for production:

SIGTERM (15) — a polite request to shut down. The process can catch this signal, finish in-flight requests, flush buffers, close connections, and exit cleanly. This is what kill <pid> sends by default.

SIGKILL (9) — unconditional termination. The kernel kills the process immediately. No cleanup, no graceful shutdown, no chance to respond. This is what kill -9 <pid> sends.

Always try SIGTERM first and give the process time to respond. SIGKILL is the last resort.
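
Seen from the supervising side, that escalation might look like the following sketch. It assumes child is a ChildProcess returned by Node's child_process.spawn; the function name stopGracefully and the 5 second default are illustrative choices, not a standard.

```javascript
// Ask a child to stop with SIGTERM, then escalate to SIGKILL if it is still
// alive after graceMs. Resolves with how the child actually ended.
function stopGracefully(child, graceMs = 5000) {
  return new Promise((resolve) => {
    // Already gone: nothing to signal.
    if (child.exitCode !== null || child.signalCode !== null) {
      resolve({ code: child.exitCode, signal: child.signalCode });
      return;
    }

    // Last resort, only fires if the child outlives the grace period.
    const timer = setTimeout(() => child.kill('SIGKILL'), graceMs);

    child.once('exit', (code, signal) => {
      clearTimeout(timer);
      resolve({ code, signal });
    });

    child.kill('SIGTERM'); // polite request first
  });
}
```

A caller can inspect the resolved signal to see whether the child honored SIGTERM or had to be killed.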

In Node.js, handle SIGTERM for graceful shutdown:

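process.on('SIGTERM', () => {
  console.log('SIGTERM received, shutting down gracefully...');

  // Stop accepting new connections; the callback runs once in-flight requests finish
  server.close(async () => {
    try {
      // Close DB connections before exiting
      await db.close();
      process.exit(0);
    } catch (err) {
      console.error('Error during shutdown:', err);
      process.exit(1);
    }
  });

  // Force exit if graceful shutdown takes too long.
  // unref() keeps this timer from holding the event loop open on its own.
  setTimeout(() => {
    console.error('Forced shutdown after timeout');
    process.exit(1);
  }, 30000).unref();
});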

SIGHUP for Config Reloads

By convention, SIGHUP (kill -HUP <pid>) tells a daemon to reload its configuration without restarting. Nginx, Apache, and many other servers respect this convention.

# Reload nginx config without downtime
kill -HUP $(cat /var/run/nginx.pid)
# Or:
nginx -s reload

Some process managers build zero-downtime deploys on the same idea: they signal running processes to reload or hand over instead of stopping the service outright.
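
An application can follow the same convention. The sketch below re-reads a JSON config file whenever the process receives SIGHUP and keeps the previous config if the new file is invalid. The function name createReloadableConfig and the JSON format are assumptions made for this example.

```javascript
const fs = require('node:fs');

// Load a JSON config file and re-read it each time the process gets SIGHUP.
function createReloadableConfig(configPath) {
  let current = JSON.parse(fs.readFileSync(configPath, 'utf8'));

  process.on('SIGHUP', () => {
    try {
      current = JSON.parse(fs.readFileSync(configPath, 'utf8'));
      console.log('Config reloaded');
    } catch (err) {
      // A broken file must not take the service down: keep the old config.
      console.error(`Config reload failed, keeping previous config: ${err.message}`);
    }
  });

  // Callers read through get() so they always see the latest version.
  return { get: () => current };
}
```

With this in place, kill -HUP <pid> applies an edited config file without dropping connections.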

Daemons

A daemon is a process that runs in the background, detached from any terminal. The name is borrowed from Greek mythology, where a daemon is a helper spirit that works out of sight.

To daemonize itself, a process traditionally must:

Fork from its parent
Create a new session (setsid) to detach from the controlling terminal
Fork again to ensure it can’t reacquire a terminal
Close standard file descriptors (stdin, stdout, stderr)
Change working directory to / (so it doesn’t hold a mount point busy)
Write a PID file so other processes can find it

This is complex. In practice, you don’t write this code yourself — you let systemd or a process manager handle it.
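
For a sense of what those steps buy you, Node.js can approximate a few of them when launching a child. This is a partial sketch, not a full daemonization: it covers the new session, the detached standard streams, and the working directory, but performs no second fork and writes no PID file. The helper name startDetached is hypothetical.

```javascript
const { spawn } = require('node:child_process');

// Launch a command detached from the parent and its terminal.
function startDetached(command, args = []) {
  const child = spawn(command, args, {
    detached: true,  // on POSIX, the child leads a new session and process group
    stdio: 'ignore', // stdin, stdout, and stderr are attached to /dev/null
    cwd: '/',        // avoid pinning whatever directory the parent was in
  });
  child.unref();     // let the parent exit without waiting for the child
  return child.pid;  // undefined if the command could not be started
}
```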

Systemd: The Modern Way

Systemd is the init system (PID 1) on most major Linux distributions, including Debian, Ubuntu, Fedora, RHEL, Arch, and openSUSE, all of which had adopted it by about 2015. It starts all system daemons at boot and manages them throughout the system’s lifetime.

Unit Files

Systemd manages services via unit files — declarative configs that describe how to run a process:

# /etc/systemd/system/myapp.service
[Unit]
Description=My Node.js Application
Documentation=https://myapp.com/docs
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
User=nodeapp
Group=nodeapp
WorkingDirectory=/var/www/myapp
ExecStart=/usr/bin/node dist/server.js
ExecReload=/bin/kill -HUP $MAINPID
Restart=on-failure
RestartSec=5
TimeoutStopSec=30

# Environment
Environment=NODE_ENV=production
EnvironmentFile=-/var/www/myapp/.env

# Security hardening
NoNewPrivileges=true
PrivateTmp=true
ProtectSystem=strict
ReadWritePaths=/var/www/myapp/logs

# Resource limits
LimitNOFILE=65536
MemoryMax=512M

[Install]
WantedBy=multi-user.target

# Install and enable
sudo systemctl daemon-reload
sudo systemctl enable myapp
sudo systemctl start myapp

# Check status
sudo systemctl status myapp

# View logs (real-time)
journalctl -u myapp -f

# View last 100 lines
journalctl -u myapp -n 100

Restart Policies

Systemd’s Restart= key controls when to restart a service:

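Value	Restarts when…
`no`	Never
`on-success`	Clean exit (exit code 0, or SIGHUP, SIGINT, SIGTERM, SIGPIPE)
`on-failure`	Non-zero exit, unclean signal, timeout, watchdog
`on-abnormal`	Unclean signal, timeout, watchdog
`on-abort`	Unclean signal only
`on-watchdog`	Watchdog timeout only
`always`	Any exit, including clean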

For production apps, use Restart=on-failure or Restart=always.
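
The decision behind these policies can be modelled in a few lines. The sketch below is a simplified model, not systemd's implementation: it ignores timeouts, the watchdog, and overrides such as SuccessExitStatus=, and it assumes systemd's default rule that exit code 0 and the signals SIGHUP, SIGINT, SIGTERM, and SIGPIPE count as a clean exit. The function name shouldRestart is illustrative.

```javascript
// Signals that systemd treats as a clean exit by default.
const CLEAN_SIGNALS = new Set(['SIGHUP', 'SIGINT', 'SIGTERM', 'SIGPIPE']);

// Decide whether a service that ended with { code, signal } should restart.
// Exactly one of code or signal is expected to be non-null.
function shouldRestart(policy, { code, signal }) {
  const clean = signal ? CLEAN_SIGNALS.has(signal) : code === 0;
  switch (policy) {
    case 'no':
      return false;
    case 'always':
      return true;
    case 'on-success':
      return clean;
    case 'on-failure':
      return !clean;
    case 'on-abnormal':
      // Non-zero exit codes do not count; only unclean signals do.
      return Boolean(signal) && !clean;
    default:
      throw new Error(`Unknown restart policy: ${policy}`);
  }
}

console.log(shouldRestart('on-failure', { code: 1, signal: null }));  // true
console.log(shouldRestart('on-abnormal', { code: 1, signal: null })); // false
```

The second call shows the practical difference: an app that exits with code 1 is restarted under on-failure but not under on-abnormal.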

cgroups: Resource Control

Systemd places each service in a cgroup (control group), which allows the kernel to enforce resource limits:

[Service]
MemoryMax=512M
MemorySwapMax=0
CPUQuota=50%
TasksMax=100

This prevents a runaway process from consuming all available memory and crashing the entire system. Systemd’s integration with cgroups is one of its biggest advantages over standalone process managers.
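
An application can discover the limit it is running under, for example to size caches. The sketch below assumes cgroup v2 mounted at /sys/fs/cgroup and reads only the service's own cgroup, so a limit set on a parent slice would not be seen. All function names are illustrative.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Locate this process's cgroup v2 directory from the contents of
// /proc/self/cgroup, whose unified-hierarchy line looks like "0::/some/path".
function cgroupV2Dir(procSelfCgroup, mount = '/sys/fs/cgroup') {
  const line = procSelfCgroup.split('\n').find((l) => l.startsWith('0::'));
  return line ? path.posix.join(mount, line.slice(3).trim()) : null;
}

// memory.max holds either a byte count or the literal string "max" (no limit).
function parseMemoryMax(text) {
  const value = text.trim();
  return /^\d+$/.test(value) ? Number(value) : null;
}

// Returns the limit in bytes, or null when unlimited or unavailable.
function readMemoryLimit() {
  try {
    const dir = cgroupV2Dir(fs.readFileSync('/proc/self/cgroup', 'utf8'));
    if (!dir) return null;
    return parseMemoryMax(fs.readFileSync(path.join(dir, 'memory.max'), 'utf8'));
  } catch {
    return null; // file missing, e.g. cgroup v1 or a non-Linux system
  }
}

console.log(readMemoryLimit());
```

Under the MemoryMax=512M setting above, memory.max would contain 536870912.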

Process Managers vs Systemd

You might wonder: if systemd is already there, why use a separate process manager? The PM2 vs Systemd comparison covers this in depth. The short answer is that they solve different problems, and the best setups use both.

Need	Systemd	Process Manager
Boot persistence	Native	Via systemd unit
Crash recovery	✓	✓
Cluster mode	✗	✓
Config portability	✗ (Linux-only)	✓ (cross-platform)
Log management	Via journald	Built-in
Zero-downtime deploy	Complex	✓
Developer ergonomics	Low	High

The best production setup often combines both: systemd starts and manages the process manager as a system service, and the process manager handles application-level concerns like clustering, health checks, and rolling restarts.

# systemd unit for Oxmgr
[Service]
Type=forking
User=nodeapp
WorkingDirectory=/var/www/myapp
ExecStart=/usr/local/bin/oxmgr start --daemon
ExecStop=/usr/local/bin/oxmgr stop
ExecReload=/usr/local/bin/oxmgr reload
Restart=on-failure

Then Oxmgr handles the application-level complexity:

# oxfile.toml — version-controlled, cross-platform
[processes.api]
command = "node dist/server.js"
instances = 4
restart_on_exit = true

[processes.api.health_check]
endpoint = "http://localhost:3000/health"
interval_secs = 15
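
On the application side, the health check needs something to call. A minimal sketch of such an endpoint using only Node's built-in http module follows; the response body is an arbitrary choice, and a real check would usually also verify dependencies such as the database.

```javascript
const http = require('node:http');

// Answer GET /health with 200 so an external checker can tell the app is alive.
function healthHandler(req, res) {
  if (req.method === 'GET' && req.url === '/health') {
    res.writeHead(200, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify({ status: 'ok', uptimeSecs: Math.round(process.uptime()) }));
    return;
  }
  res.writeHead(404);
  res.end();
}

// Call server.listen(3000) to match the endpoint in the config above.
const server = http.createServer(healthHandler);
```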

Practical Commands Reference

Process Inspection

# List all processes (sorted by CPU)
ps aux --sort=-%cpu | head -20

# Process tree
pstree -p

# Real-time process monitor
top
htop   # better, install with: apt install htop

# Find processes by name
pgrep -l node
pgrep -a node

# Process file descriptors
lsof -p <pid>

# Process memory maps
cat /proc/<pid>/maps

# Full process status
cat /proc/<pid>/status
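
The same /proc data is easy to consume from code. The sketch below parses the status file into an object; the function names are illustrative, and the file read only works on Linux.

```javascript
const fs = require('node:fs');

// Parse the "Key:<tab>value" lines of /proc/<pid>/status into a plain object.
function parseProcStatus(text) {
  const fields = {};
  for (const line of text.split('\n')) {
    const sep = line.indexOf(':');
    if (sep === -1) continue; // skip blank or malformed lines
    fields[line.slice(0, sep)] = line.slice(sep + 1).trim();
  }
  return fields;
}

// /proc/self always refers to the calling process.
function readOwnStatus() {
  return parseProcStatus(fs.readFileSync('/proc/self/status', 'utf8'));
}

if (process.platform === 'linux') {
  const { Name, State, PPid, VmRSS } = readOwnStatus();
  console.log({ Name, State, PPid, VmRSS });
}
```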

Signal Sending

# Graceful shutdown
kill <pid>           # sends SIGTERM

# Force kill
kill -9 <pid>        # sends SIGKILL

# Reload config
kill -HUP <pid>      # sends SIGHUP

# Kill by name
pkill node
pkill -9 node

# Kill all processes of a user
pkill -u nodeapp
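
Signals can also be sent from code. One useful trick is signal 0, which delivers nothing but still runs the kernel's existence and permission checks. The sketch below uses it to probe whether a PID is alive; the helper name isRunning is illustrative.

```javascript
// Probe a PID with signal 0: no signal is delivered, but the call fails
// if the process does not exist.
function isRunning(pid) {
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    if (err.code === 'EPERM') return true;  // exists, but owned by another user
    if (err.code === 'ESRCH') return false; // no such process
    throw err;
  }
}

console.log(isRunning(process.pid)); // true: a process can always signal itself
```

Note that a zombie still counts as existing here, since its PID has not been released yet.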

Systemd Management

# Service control
systemctl start myapp
systemctl stop myapp
systemctl restart myapp
systemctl reload myapp    # runs ExecReload= (SIGHUP in the unit above)

# Status and logs
systemctl status myapp
journalctl -u myapp -f         # follow logs
journalctl -u myapp --since "1 hour ago"
journalctl -u myapp -n 100 --no-pager

# Boot management
systemctl enable myapp
systemctl disable myapp
systemctl is-enabled myapp

# List all services
systemctl list-units --type=service

Common Production Patterns

Graceful Shutdown Under Load

The proper shutdown sequence for a Node.js server:

Receive SIGTERM
Stop accepting new connections
Wait for in-flight requests to complete (with timeout)
Close database connections
Exit with code 0

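let isShuttingDown = false;

// Register this before any routes so it runs first. Keep-alive connections can
// still deliver requests after server.close(), so refuse them here.
app.use((req, res, next) => {
  if (!isShuttingDown) return next();
  res.setHeader('Connection', 'close');
  res.status(503).json({ error: 'Server shutting down' });
});

// ... application routes go here ...

const server = app.listen(3000);

process.on('SIGTERM', () => {
  isShuttingDown = true;

  // Stop accepting new connections; the callback runs once in-flight requests finish
  server.close(() => {
    // Close database connections, then exit cleanly
    pool.end(() => process.exit(0));
  });

  // Give up if in-flight requests do not finish in time
  setTimeout(() => process.exit(1), 30000).unref();
});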

Process Limits

Increase file descriptor limits for high-traffic apps:

# Check current limits
ulimit -n

# Increase for current session
ulimit -n 65536

# Persistent for login sessions: add to /etc/security/limits.conf
nodeapp soft nofile 65536
nodeapp hard nofile 65536

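Services started by systemd do not read limits.conf, which PAM applies only to login sessions. Set the limit in the unit file instead: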

[Service]
LimitNOFILE=65536
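
To confirm which limit a running service actually received, you can read /proc/<pid>/limits. The sketch below extracts the open-file limits from that file's text; the function name parseOpenFileLimit is illustrative, and the file itself exists only on Linux.

```javascript
const fs = require('node:fs');

// Extract the soft and hard "Max open files" limits from /proc/<pid>/limits.
function parseOpenFileLimit(text) {
  const match = text.match(/^Max open files\s+(\S+)\s+(\S+)/m);
  if (!match) return null;
  const toNumber = (v) => (v === 'unlimited' ? Infinity : Number(v));
  return { soft: toNumber(match[1]), hard: toNumber(match[2]) };
}

if (process.platform === 'linux') {
  console.log(parseOpenFileLimit(fs.readFileSync('/proc/self/limits', 'utf8')));
}
```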

Summary

Linux process management has multiple layers:

OS layer — the kernel manages PIDs, signals, cgroups
Init layer — systemd manages boot-time service startup and system-level lifecycle
Application layer — process managers (PM2, Oxmgr) handle app-level concerns: clustering, health checks, developer-friendly config, cross-platform portability

For most Node.js production deployments, the recommended stack is:

systemd → Oxmgr → [app:0, app:1, app:2, app:3]

Systemd ensures Oxmgr survives reboots. Oxmgr handles everything your application needs: crash recovery, rolling deploys, health checks, and log management.

See the Oxmgr documentation to get started, or the deployment guide for a step-by-step walkthrough.