Health Checks
Oxmgr health checks are command-based — the daemon runs a command on a polling interval and checks the exit code. Exit 0 = healthy. Any other exit code = failure.
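Before wiring a command into the config, it can help to run it by hand and see how its exit status would be read. The helper below is a minimal sketch of the exit-code rule only; probe is a hypothetical name, not an Oxmgr command, and it says nothing about how the daemon itself launches the check.

```shell
# probe CMD [ARGS...]: run a candidate health command and report how its
# exit code would be read. "probe" is a hypothetical helper name.
probe() {
    status=0
    # Discard the command's output; only the exit status matters.
    "$@" >/dev/null 2>&1 || status=$?
    if [ "$status" -eq 0 ]; then
        echo "healthy (exit 0)"
    else
        echo "failure (exit $status)"
    fi
}
```

For example, probe curl -fsS http://127.0.0.1:3000/health prints the healthy line only when curl itself exits 0.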
Basic Configuration
After health_max_failures consecutive failures, the process is automatically restarted.
[[apps]]
name = "api"
command = "node server.js"
health_cmd = "curl -fsS http://127.0.0.1:3000/health"
health_interval_secs = 15
health_timeout_secs = 3
health_max_failures = 3

- Polling interval: health_interval_secs (default: 30s)
- Check timeout: health_timeout_secs (default: 5s)
- Restart trigger: after health_max_failures consecutive failures (default: 3)
- Healthy signal: exit code 0 = healthy, non-zero = failure
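Because the restart rule counts consecutive failures, a single passing check resets the streak. The function below is a small simulation of that counting, assuming nothing about Oxmgr's internals; restart_after is a hypothetical helper that takes a failure limit and a sequence of check exit codes.

```shell
# restart_after MAX_FAILURES EXIT_CODE...
# Prints the 1-based number of the check at which a restart would fire,
# or "none" if the failure streak never reaches MAX_FAILURES.
# Hypothetical helper for illustration; not part of Oxmgr.
restart_after() {
    max=$1
    shift
    failures=0
    n=0
    for code in "$@"; do
        n=$((n + 1))
        if [ "$code" -eq 0 ]; then
            failures=0                  # a healthy check resets the streak
        else
            failures=$((failures + 1))  # any non-zero exit extends it
        fi
        if [ "$failures" -ge "$max" ]; then
            echo "$n"
            return 0
        fi
    done
    echo "none"
}
```

With a limit of 3, restart_after 3 1 1 0 1 1 1 prints 6: the pass on the third check clears the first two failures. With the example settings above (15s interval, 3 failures), the third consecutive failure would land roughly 30 to 45 seconds after the process stops responding, assuming checks fire once per interval.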
Any Command Works
The health command can be anything that returns an exit code — HTTP checks with curl,
database pings, custom scripts, or CLI health checks.
[[apps]]
name = "redis"
command = "redis-server"
health_cmd = "redis-cli ping"
health_interval_secs = 10
health_timeout_secs = 2

Common patterns:
- HTTP endpoint: curl -fsS http://127.0.0.1:3000/health (fails on connection errors and on HTTP 4xx/5xx responses)
- Redis connectivity: redis-cli ping
- PostgreSQL readiness: pg_isready -h localhost
- Custom logic: ./scripts/health-check.sh
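A custom script such as ./scripts/health-check.sh can combine several of these commands behind one exit code. What follows is a sketch under assumptions: the function names and the choice of checks are illustrative, and the 2-second curl limit is just one way to stay under health_timeout_secs.

```shell
#!/bin/sh
# Hypothetical scripts/health-check.sh. The function names and the set of
# checks are illustrative assumptions, not something Oxmgr prescribes.

# HTTP check: -f makes curl exit non-zero on HTTP 4xx/5xx responses,
# and --max-time bounds the whole request to 2 seconds.
check_http() {
    curl -fsS --max-time 2 http://127.0.0.1:3000/health >/dev/null
}

# Redis check: redis-cli exits non-zero when it cannot connect.
check_redis() {
    redis-cli ping >/dev/null
}

# PostgreSQL check: pg_isready exits 0 only when the server accepts connections.
check_pg() {
    pg_isready -h localhost >/dev/null
}

# run_checks NAME...: run each named check in order and stop at the first
# failure, so the overall status is 0 only if every check passed.
run_checks() {
    for check in "$@"; do
        "$check" || return 1
    done
    return 0
}
```

Ending the script with run_checks check_http check_redis check_pg as its last line makes that status the script's exit code, which is what health_cmd evaluates.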
Readiness Gating During Reload
With wait_ready = true, zero-downtime reloads wait for the new process instance to pass
the health check before the old one is stopped. If readiness times out, the reload is aborted — the old process keeps running
and serves traffic uninterrupted.
[[apps]]
name = "api"
command = "node server.js"
health_cmd = "curl -fsS http://127.0.0.1:3000/health"
wait_ready = true
ready_timeout_secs = 30

Reload sequence with wait_ready
1. Run pre_reload_cmd (if set). Abort on failure.
2. Start new process instance.
3. Poll health_cmd until it exits 0 or ready_timeout_secs is exceeded.
4. If ready: stop the old process; the new one takes over.
5. If timed out: abort the reload; the old process keeps running.
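Steps 3 through 5 amount to a poll loop with a deadline. The sketch below illustrates that shape only: wait_until_ready is a hypothetical function, and the one-second pause between attempts is an assumption, since the polling cadence during a reload is not stated here.

```shell
# wait_until_ready TIMEOUT_SECS CMD [ARGS...]
# Polls CMD until it exits 0 (returns 0) or TIMEOUT_SECS of waiting have
# passed (returns 1). Hypothetical sketch; the 1-second pause between
# attempts is an assumption, and time spent inside CMD is not counted.
wait_until_ready() {
    timeout_secs=$1
    shift
    elapsed=0
    while [ "$elapsed" -lt "$timeout_secs" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0    # ready: the old process can be stopped
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 1            # timed out: the old process keeps running
}
```

Calling wait_until_ready 30 curl -fsS http://127.0.0.1:3000/health mirrors the config above: a zero return corresponds to step 4, a non-zero return to step 5.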
Field Reference
| Field | Default | Description |
|---|---|---|
| health_cmd | — | Command to run. Exit 0 = healthy, non-zero = failure. |
| health_interval_secs | 30 | Seconds between checks. |
| health_timeout_secs | 5 | Max seconds for a single check to complete. |
| health_max_failures | 3 | Consecutive failures before auto-restart. |
| wait_ready | false | Gate reloads on health check readiness. Requires health_cmd. |
| ready_timeout_secs | 30 | Timeout for readiness during reload. Abort if exceeded. |