Overview
APM is Tom's Advanced Process Manager for Linux. One binary, copy and run — no config files required to get started. Complexity is APM's problem, not yours.
APM runs as a background daemon and manages worker processes. You interact with it through the apm CLI. The daemon auto-starts the first time you run any apm command.
Architecture
The daemon communicates with the CLI via an abstract Unix socket. Workers are child processes managed by the daemon. Each worker can have multiple parallel instances. The built-in reverse proxy routes incoming connections across instances using round-robin.
CLI ──(unix socket)── Daemon ── Worker [4 instances]
                             └── Worker [1 instance]
                             └── GUI server (port 6789)
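As a rough illustration of the round-robin routing described above, the following Python sketch hands each incoming connection to the next instance in turn. It is not APM's implementation; the `RoundRobin` class and the instance labels are hypothetical.

```python
from itertools import cycle


class RoundRobin:
    """Hypothetical selector: returns instances in a fixed rotating order."""

    def __init__(self, instances):
        if not instances:
            raise ValueError("at least one instance is required")
        self._next = cycle(list(instances))

    def pick(self):
        # Each call advances the rotation by one instance.
        return next(self._next)


rr = RoundRobin(["worker#0", "worker#1", "worker#2", "worker#3"])
picks = [rr.pick() for _ in range(6)]
print(picks)
```

After the fourth connection the rotation wraps around to the first instance again.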
Philosophy
- Zero config to start — defaults are correct for 90% of cases
- CLI-first, config files for persistence and power users
- Every error message answers "what do I do now"
- Never crash on a bad optional field — warn and continue
- Linux only. Windows is not supported.
Installation
Run the install script as root. It downloads the right binary for your architecture, sets up the system group and log file, installs the init service, and starts the daemon.
# One-liner install
$ curl -fsSL https://processmanager.dev/install.sh | sudo bash
# Or download first, review, then run
$ curl -fsSL https://processmanager.dev/install.sh -o install.sh
$ sudo bash install.sh
# Verify
$ apm --version
The installer sets up:
- /usr/sbin/apm: binary
- /var/log/apm.log: daemon log (group-readable by apm)
- /etc/apm/apm.conf: default config (created if absent)
- apm OS group: add users with usermod -aG apm <user>
- Startup service (systemd, OpenRC, or SysV, auto-detected)
The daemon auto-loads /etc/apm/apm.conf at startup — no separate boot step required.
Run as a systemd service
The installer registers APM with your init system automatically. The service runs the daemon at boot, supervised by systemd.
# Check status
$ systemctl status apm
# View logs
$ journalctl -u apm -f
Uninstall
# Remove binary and config
$ sudo apm uninstall
# Remove everything including logs, group, service
$ sudo apm uninstall --purge
Command reference
All commands communicate with the running daemon. If no daemon is running, APM starts one automatically.
apm [command] [options]
The CLI does not require sudo. The daemon listens on an abstract Unix socket (@apm), which has no filesystem permissions — any local user can run apm list, apm reload, apm restart, apm info, etc. sudo is only needed for apm install / apm uninstall (which write to /usr/sbin, /etc/apm, and the init system), and for reading worker log files directly when those files are owned by root. The apm OS group exists solely to grant read access to /var/log/apm.log — it is not required to use the CLI.
Process commands
| Command | Description |
|---|---|
| apm run <exec> [args...] [flags] | Create and immediately start a worker without a config file. All --flag options from Worker Options apply. Worker name defaults to the executable name; use --name to override. |
| apm start <name> | Start a registered worker that is currently stopped. |
| apm stop <name> | Gracefully stop a worker and all its instances. |
| apm restart <name> | Restart a worker (rolling if rolling was set). |
| apm update <name> [flags] | Update a running worker's config and reload it. Accepts the same flags as run. Add --no-restart to apply the new config without restarting. |
| apm list | List all workers with status, instance count, CPU, memory, and uptime. |
| apm remove <name> | Stop and remove a worker. Alias: apm rm |
| apm stopall | Stop all workers without stopping the daemon. |
| apm rename <name> <new> | Rename a worker. |
| apm copy <name> <new> | Duplicate a worker under a new name. |
Inspection commands
| Command | Description |
|---|---|
| apm info <name> | Show full configuration and live state for one worker, including its IPC listen channel. |
| apm log <name> | Stream a worker's stdout / stderr log. |
| apm grep <pattern> [name] | Search worker logs for a pattern. |
| apm env <name> | Print the environment a worker's children run with. |
| apm wait <name> | Block until the worker reaches running state (useful in scripts). |
Config commands
| Command | Description |
|---|---|
| apm boot | Load /etc/apm/apm.conf into the running daemon. Called automatically by startup scripts after the daemon starts. Safe to run manually: workers already running are skipped. |
| apm load <file> | Load a config file and start all workers defined in it. Workers already running are skipped. |
| apm unload <file> | Stop and remove every worker that was loaded from the given config file. |
| apm reload <file> [--force] | Smart reload: diff the config against running workers, start new ones, restart changed ones (all config fields synced live: watcher patterns, TLS, rolling settings, proxy flags, etc.), stop removed ones. --force restarts unchanged workers too. |
| apm saveconf | Write all workers back to their source config files (the file they were loaded from). Workers started via run without a file prompt for one. |
| apm saveconf <name> <file> | Save a specific worker to a file and set that as its config file going forward. |
| apm check <file> | Validate a config file: checks syntax and reports which workers it would create or reload, without applying anything. |
| apm convert <pm2-config> [output] | Convert a PM2 ecosystem file (.json, .js, .cjs, .mjs) into APM worker { } blocks. JavaScript ecosystems are evaluated via the local node. Unsupported PM2 fields surface as inline # WARN: comments above the block they apply to. With no output, writes to stdout for piping. |
GUI & Monitor commands
| Command | Description |
|---|---|
| apm gui | Start the web GUI and print its URL. No-op if already running. |
| apm gui stop | Stop the GUI server. |
| apm monitor | Live terminal dashboard: shows system CPU, RAM, load average, uptime, and per-worker/instance status, CPU%, memory, and restart counts. Updates every second. Press Ctrl+C to exit. |
Daemon commands
| Command | Description |
|---|---|
| apm exit daemon | Stop all workers and shut down the daemon. |
| apm install | Install APM to /usr/sbin/apm with group, log, and service setup. Requires root. |
| apm uninstall [--purge] | Remove APM from the system. --purge removes logs, config, and service files. |
| apm -v / --version | Print CLI and daemon version. |
| apm -h / --help [--full] | Show command help. --full includes all commands. |
Run flags
Flags for apm run and apm update. The same options are available as config file fields (see Worker options).
# Start a worker from the CLI; no config file needed
# --restart restarts on clean exit; --rolling enables rolling restart mode
$ apm run node server.js \
  --name myapp \
  --instances 4 \
  --server http://0.0.0.0:3000 \
  --watch "*.js" \
  --restart \
  --rolling
# Update a running worker's instance count without restarting
$ apm update myapp --instances 8 --no-restart
# Save it back to a conf file
$ apm saveconf myapp /etc/apm/apm.conf.d/myapp.conf
Signals
Any signal name can be used as a command to forward that signal to a worker's child processes — all instances, or a single one by index:
$ apm SIGHUP myworker # send to all instances
$ apm SIGUSR1 myworker#2 # send to instance index 2 only
Config file
Config files define workers and daemon settings. They're loaded with apm load or apm reload. The system config path is /etc/apm/apm.conf.
Syntax
- Key-value pairs end with ;
- Blocks use { }
- Comments: # or // to end of line
- Strings: unquoted, or single/double/backtick quoted (quotes are stripped)
- Multiple values: comma-separated on one line, or repeat the key
- The : suffix on keys is optional
- include <glob>; inlines another file at parse position
# Simple worker
worker {
name myapp;
exec node;
params server.js;
instances 4;
restart true;
watch *.js;
server http://0.0.0.0:3000;
}
Config hierarchy
APM's startup scripts call apm boot after the daemon starts, which loads /etc/apm/apm.conf. The main config is typically structured as:
- /etc/apm/apm.conf: main config (daemon block + includes)
- /etc/apm/apm.conf.d/*.conf: drop-in worker configs, sorted by filename
You can also load configs manually at any time with apm load <file> or do a live diff with apm reload <file>.
daemon {
gui_port 6789;
}
# Load drop-in worker configs
include apm.conf.d/*.conf;
worker {
name portal;
exec node;
params app.js;
}
Multiple values
# Comma-separated on one line
server http://0.0.0.0:3000, ws://0.0.0.0:3001;
# Or repeat the key
ban_path *.php;
ban_path *wp-*;
ban_path *.env;
Worker options
All options are available both as CLI flags to apm run and apm update, and as fields in a worker { } config block.
Identity
| Field | Default | Description |
|---|---|---|
| name | exec | Worker name. Used in all CLI output and log prefixes. |
| exec | required | Executable to run (looked up in PATH). |
| params | | Arguments passed to the executable. Multiple values supported. |
| path | cwd | Working directory for the child process. Env vars expanded. |
| instances | 1 | Number of parallel child processes to run. |
| user | | Run child processes as this OS user. Daemon must run as root. |
Environment
| Field | Default | Description |
|---|---|---|
| env | | Inject environment variables. Format: KEY=value. Multiple values supported. |
| env_index | | Inject the instance index as an env var. Specify the variable name. |
| env_file | | Path to a KEY=VALUE file. Read by APM before setuid drop; child inherits the env. |
Restart on clean exit
| Field | Default | Description |
|---|---|---|
| restart | false | Restart the process when it exits with code 0. |
| restart_delay | 250 | Milliseconds to wait before restarting after a clean exit. |
| max_restarts | 0 | Maximum clean-exit restarts. 0 = unlimited. |
Restart on error exit
| Field | Default | Description |
|---|---|---|
| restart_err | false | Restart the process when it exits with a non-zero code. |
| err_delay | 500 | Milliseconds to wait before restarting after an error exit. |
| max_err_restarts | 0 | Maximum error-exit restarts. 0 = unlimited. |
| err_grace | | Milliseconds of uptime required before a restart counts against the limit. |
| startup_grace | 2000 | Milliseconds the CLI waits on apm start / apm restart for a fast-exit failure to surface. If the child dies inside this window on its first attempt, captured stderr and pattern-matched hints are dumped to the terminal. Auto-restart paths skip the wait. |
| augur_full_scan | false | When true, augur runs a manifest + source-AST dependency scan before every fork (including watcher- and auto-restart-triggered ones), aborting the launch with a clear list if anything declared in package.json / requirements.txt / composer.json / Gemfile is not installed. When false (default), augur only runs reactively after a fast-exit failure. See the augur section. |
| restart_on_exit_codes | | Comma-separated exit codes. When set, the process is restarted only on these codes; this overrides both restart and restart_err. |
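The interaction between restart, restart_err, and restart_on_exit_codes can be sketched as a small decision function. This is only an illustration of the rules stated in the tables above, not APM's code; `should_restart` is a hypothetical name, and the sketch ignores the max_restarts / max_err_restarts limits and the delay fields.

```python
def should_restart(exit_code, restart=False, restart_err=False,
                   restart_on_exit_codes=None):
    """Return True if a child that exited with exit_code should be restarted."""
    if restart_on_exit_codes:
        # An explicit code list overrides both restart and restart_err.
        return exit_code in restart_on_exit_codes
    if exit_code == 0:
        return restart          # clean exit
    return restart_err          # error exit


print(should_restart(0, restart=True))                       # clean exit, restart on
print(should_restart(1, restart=True))                       # error exit, only restart on
print(should_restart(3, restart_on_exit_codes={2, 3}))       # listed code
```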
Shutdown
| Field | Default | Description |
|---|---|---|
| kill_timeout | 2000 | Milliseconds to wait for graceful shutdown (SIGTERM) before sending SIGKILL. |
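The SIGTERM-then-SIGKILL sequence behind kill_timeout follows a common pattern, sketched below with Python's subprocess module. This is an assumption-laden illustration rather than APM's implementation; `graceful_stop` is a hypothetical helper and the sleeping child stands in for a real worker.

```python
import subprocess
import sys


def graceful_stop(proc, kill_timeout_ms=2000):
    """Send SIGTERM, wait up to kill_timeout_ms, then fall back to SIGKILL."""
    proc.terminate()                      # SIGTERM on POSIX
    try:
        proc.wait(timeout=kill_timeout_ms / 1000)
    except subprocess.TimeoutExpired:
        proc.kill()                       # SIGKILL on POSIX
        proc.wait()
    return proc.returncode


# Stand-in child process that would otherwise run for a minute.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
code = graceful_stop(child, kill_timeout_ms=2000)
print("child exited with", code)
```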
Startup dependencies
| Field | Default | Description |
|---|---|---|
| depends_on | | Comma-separated worker names that must be running before this worker starts. |
| depends_timeout | 30000 | Milliseconds to wait for dependencies. After the timeout the worker starts anyway, with a warning. |
Health check
APM can probe a worker's health two ways. Pull mode — set health_check to a URL and APM sends periodic HTTP GETs (2xx/3xx = healthy). Push mode — set health_check to on and APM injects APM_HEALTH_URL into the child, which calls it to report in. Health status shows in apm list and the GUI.
| Field | Default | Description |
|---|---|---|
| health_check | | A URL to probe (pull mode), or on (push mode). Empty = disabled. |
| health_check_interval | 5000 | Milliseconds between probes. |
| health_check_timeout | 3000 | Milliseconds to wait for a single pull-mode probe response. |
| health_check_threshold | 3 | Consecutive failures before the worker is marked unhealthy. |
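The threshold logic can be pictured as a counter of consecutive failed probes. The sketch below is hypothetical (`HealthTracker` is not an APM name) and assumes that a single successful probe resets the counter and marks the worker healthy again, which the text above does not state.

```python
class HealthTracker:
    """Tracks consecutive probe failures against health_check_threshold."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.failures = 0
        self.healthy = True

    def record(self, status_code):
        """Record one probe; status_code is None when the probe timed out."""
        ok = status_code is not None and 200 <= status_code < 400  # 2xx/3xx
        if ok:
            # Assumption: one success clears the failure streak.
            self.failures = 0
            self.healthy = True
        else:
            self.failures += 1
            if self.failures >= self.threshold:
                self.healthy = False
        return self.healthy


tracker = HealthTracker(threshold=3)
results = [tracker.record(c) for c in (200, 500, None, 302, 500, 500, 503, 204)]
print(results)
```

Only the run of three failures in a row (500, 500, 503) flips the state to unhealthy; the earlier two failures are cleared by the 302.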
Connection drain
| Field | Default | Description |
|---|---|---|
| drain_timeout | 0 | Milliseconds to let active connections finish before a stop/restart. New connections are refused during the drain. 0 = stop immediately. |
Memory limit
| Field | Default | Description |
|---|---|---|
| memory_limit | | Per-child memory cap enforced via Linux cgroups v2 (e.g. 256M, 1G). The kernel OOM-kills a child that exceeds it; APM then applies the restart policy. Requires cgroups v2 and a root daemon. |
Logging
| Field | Default | Description |
|---|---|---|
| log | | Path to the stdout log file. |
| err_log | | Path to the stderr log file. Defaults to the same file as log. |
| prefix | | String prepended to each log line. |
| log_time_format | | Timestamp format for log lines. |
| strip_ansi | false | Strip ANSI escape codes from log output. |
| syslog | | Forward logs to syslog. Value is the destination (e.g. syslog://localhost:514). |
| syslog_tag | | Tag for syslog messages. |
| log_max_size | | Rotate the log file when it exceeds this size (e.g. 10M, 1G). Empty = no rotation. |
| log_max_files | 5 | Number of rotated log files to keep. |
Proxy / HTTP
| Field | Default | Description |
|---|---|---|
| server | | Bind address for the proxy server. See Server types. |
| lowercase_hdrs | false | Lowercase all HTTP header names before forwarding to the child. |
| trust_proxy | true | Trust X-Forwarded-For / X-Real-IP headers for client IP resolution. On by default; set trust_proxy false; when APM is directly internet-facing so spoofed headers can't bypass Vanguard. |
| keep_alive | 120000 | HTTP keep-alive idle timeout in milliseconds. |
| max_conns | 0 | Maximum concurrent connections per server. 0 = unlimited. |
| trace_header | | When set, APM injects this header (e.g. x-request-id) with a unique per-request ID into every forwarded request. |
| session_persist | false | Persist session state across rolling restarts. |
| session_wait | 5000 | Milliseconds to wait for a new instance to accept a migrated session. |
File watcher
| Field | Default | Description |
|---|---|---|
| watch | | Comma-separated glob patterns of file paths to watch. Restarts the worker when any match changes. See File watcher for pattern syntax. |
| watch_ignore | | Comma-separated glob patterns of paths to exclude from watching. |
| watch_delay | 200 | Debounce delay in milliseconds before triggering a restart. |
| watch_conf | false | Auto-reload this worker when its own source config file changes on disk. |
Rolling restart
| Field | Default | Description |
|---|---|---|
| rolling | false | Enable rolling restart mode (one instance at a time). |
| rolling_delay | 1000 | Milliseconds between restarting each instance. |
Stats
| Field | Default | Description |
|---|---|---|
| stats_interval | | Interval in milliseconds between stats collection cycles. |
Inter-worker IPC
| Field | Default | Description |
|---|---|---|
| listen | | Channel name this worker listens on for inter-worker IPC. Every running child receives channel messages and stream requests. Omit to disable. |
| ipc_timeout | 500 | Default timeout in milliseconds for request() calls. Per-call timeouts passed to request() override this value. |
Crash webhook (on_crash)
APM can POST a JSON payload (or send a GET request) to a URL of your choice whenever a child process crashes — i.e. exits with a non-zero code or is killed by a signal. Intentional stops (apm stop) are never reported.
worker {
name myapp;
exec node;
params server.js;
on_crash {
url https://hooks.example.com/apm-crash;
method POST; # POST (default) or GET
debounce 10000; # min ms between calls (floor: 5000)
log_lines 20; # tail lines to include in payload
log_source err; # "err" (default) or "out"
secret mysecret; # signs payload with HMAC-SHA256
}
}
| Field | Default | Description |
|---|---|---|
| url | | Destination URL. Required: the block is ignored without it. |
| method | POST | HTTP method. POST sends a JSON body; GET sends no body. |
| debounce | 5000 | Minimum milliseconds between webhook calls per worker. Minimum enforced value is 5000 — prevents flooding during a crash loop. |
| log_lines | 0 | Number of trailing lines from the log file to include in the log field of the payload. 0 = omit. |
| log_source | err | Which log to tail: err (stderr log) or out (stdout log). |
| secret | | When set, APM signs the raw POST body with HMAC-SHA256 and sends the result in the X-APM-Signature: sha256=… header. |
Request headers
| Header | Value |
|---|---|
| X-APM-Worker | Worker name |
| X-APM-Event | crash |
| X-APM-Signature | sha256=<hex> — only present when secret is set |
POST payload
{
"worker": "myapp",
"instance": 1,
"exit_code": 1,
"exit_signal": "SIGKILL", // omitted if process exited normally
"runtime_ms": 4821,
"error_restarts": 3,
"timestamp": "2025-06-01T12:00:00Z",
"log": "Error: cannot connect to DB\n..." // omitted if log_lines = 0
}
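On the receiving side, a webhook handler would typically recompute the HMAC over the raw request body and compare it with the X-APM-Signature header. The sketch below shows one way to do that with Python's standard library. It assumes the secret is used as UTF-8 bytes and that the hex digest is lowercase; `verify_signature` is a hypothetical helper, not part of APM.

```python
import hashlib
import hmac


def verify_signature(secret, raw_body, header_value):
    """Check an X-APM-Signature value of the form 'sha256=<hex>'."""
    scheme, _, received = header_value.partition("=")
    if scheme != "sha256" or not received:
        return False
    expected = hmac.new(secret.encode(), raw_body, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), received.lower().encode())


# Simulate what the sender would produce for a given body and secret.
body = b'{"worker": "myapp", "exit_code": 1}'
header = "sha256=" + hmac.new(b"mysecret", body, hashlib.sha256).hexdigest()

print(verify_signature("mysecret", body, header))          # genuine payload
print(verify_signature("mysecret", body + b" ", header))   # tampered body
```

Verify against the bytes exactly as received; re-serializing the parsed JSON can change whitespace or key order and break the comparison.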
Daemon config
Global APM settings live in a top-level daemon { } block. No daemon block = all defaults. Zero config still works.
daemon {
gui_port 6789; # web GUI port — 0 disables the GUI
gui_bind 127.0.0.1; # bind address — default is 0.0.0.0 (all interfaces)
gui_password secret; # GUI login password
auto_reload true; # reload config files when they change on disk
}
# telemetry is a top-level key — not inside daemon { }
telemetry false; # opt out of the anonymous usage ping
| Field | Default | Description |
|---|---|---|
| gui_port | 6789 | Port for the web GUI. The GUI starts automatically when this is set in the daemon block. Set to 0 (or omit) to disable — apm gui can still start it on demand. |
| gui_bind | 0.0.0.0 | Address the GUI binds to. Defaults to 0.0.0.0 — all interfaces. Set 127.0.0.1 to restrict it to localhost, or set a gui_password before exposing it on a network. |
| gui_password | | Password for GUI access. When empty the GUI is served without authentication, which is only safe on a localhost bind. |
| auto_reload | false | When true, the daemon watches the config files it loaded and reloads them automatically when they change on disk. |
telemetry is a top-level key, not a field inside daemon { }. APM sends an anonymous hourly ping — APM version, worker count, OS, and hardware class; no names, paths, or IPs. Opt out by placing telemetry false; at the top level of the config file.
Server types
APM's built-in proxy accepts connections and forwards them to worker instances via IPC. Specify servers with the server field. Multiple servers per worker are supported.
| Scheme | Description |
|---|---|
| http:// | HTTP reverse proxy. APM parses request headers and forwards the full request to a child instance. |
| ws:// | WebSocket proxy. Handles the upgrade handshake; bidirectional frame forwarding to child. |
| tcp:// | Raw TCP proxy. Bytes forwarded as-is. Use for databases, game servers, custom protocols. |
worker {
name api;
exec node;
params api.js;
# HTTP and WebSocket on separate ports
server http://0.0.0.0:3000;
server ws://0.0.0.0:3001;
# Or combined on one line
server http://0.0.0.0:3000, ws://0.0.0.0:3001;
}
Client IP resolution
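When APM sits behind a CDN or reverse proxy (e.g. nginx), trust_proxy lets it resolve the real client IP from the X-Forwarded-For / X-Real-IP headers. This affects Vanguard rate limiting and ban decisions. The option is on by default, so the line below only makes the setting explicit; set trust_proxy false; when APM is directly internet-facing.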
trust_proxy true;
Vanguard
Vanguard is APM's built-in request firewall. It runs before worker IPC — rejected connections never reach your app. Configure it with a vanguard { } sub-block inside a worker.
worker {
name api;
exec node;
params api.js;
server http://0.0.0.0:3000;
vanguard {
rate_limit 100; # requests/sec per IP
rate_burst 200; # burst capacity
ban_ttl 300000; # auto-ban for 5 minutes
ban_path *.php, *wp-*, *.env, /.git*;
ban_response Forbidden;
}
}
IP filtering
| Field | Description |
|---|---|
| allow_ip | CIDR allowlist. Only matching IPs are allowed. Multiple values supported. |
| ban_ip | CIDR blocklist. Matching IPs are rejected immediately (silent RST for TCP, 403 for HTTP). |
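CIDR allowlists and blocklists of this kind can be evaluated with Python's ipaddress module, as in the sketch below. The order of evaluation (blocklist first, then allowlist) is an assumption made for the example, and `ip_allowed` is a hypothetical function, not an APM API.

```python
import ipaddress


def ip_allowed(client_ip, allow_ip=(), ban_ip=()):
    """Return True if client_ip passes the ban_ip / allow_ip CIDR lists."""
    addr = ipaddress.ip_address(client_ip)

    def matches(cidrs):
        return any(addr in ipaddress.ip_network(c, strict=False) for c in cidrs)

    if matches(ban_ip):
        return False            # assumption: the blocklist is checked first
    if allow_ip:
        return matches(allow_ip)  # with an allowlist, only listed ranges pass
    return True                 # no allowlist configured: everything else passes


print(ip_allowed("10.1.2.3", allow_ip=["10.0.0.0/8"]))
print(ip_allowed("192.168.1.5", allow_ip=["10.0.0.0/8"]))
print(ip_allowed("10.1.2.3", allow_ip=["10.0.0.0/8"], ban_ip=["10.1.0.0/16"]))
```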
Path banning
| Field | Description |
|---|---|
| ban_path | Comma-separated pattern list. Matched against the request path (query string stripped). Same four modes as the file watcher: *.ext ends-with, prefix* starts-with, *word* contains, exact exact match. |
| ban_response | HTTP response body for blocked requests. Default: Forbidden. |
Method filtering
| Field | Description |
|---|---|
| allow_method | HTTP method allowlist. If set, only listed methods reach the worker; everything else returns 405 Method Not Allowed. Repeatable. Case-insensitive (normalized to uppercase). |
| ban_method | HTTP method blocklist. Listed methods are rejected with 405. Repeatable. Evaluated before allow_method. Skipped for raw TCP workers. |
vanguard {
allow_method GET;
allow_method POST; # everything else → 405
# or, blocklist style:
ban_method TRACE;
ban_method OPTIONS;
}
ban_path *.php; # ends-with — block all .php requests
ban_path *wp-*; # contains — block WordPress probes
ban_path /.git*; # starts-with — block .git exposure
ban_path *.env; # ends-with — block .env file reads
ban_path /admin/login; # exact match — block a specific path
Rate limiting
| Field | Description |
|---|---|
| rate_limit | Token bucket rate in requests per second per real client IP. |
| rate_burst | Burst capacity. Defaults to rate_limit if not set. |
| ban_ttl | Milliseconds to auto-ban an IP after rate limit is exceeded. 0 = soft block (no ban, just drop). |
Rate-limited requests receive 429 Too Many Requests. Path/IP bans return 403 Forbidden (or silent TCP RST).
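A token bucket with a refill rate (rate_limit) and a capacity (rate_burst) can be sketched as follows. This is a generic illustration of the technique, not APM's code; `TokenBucket` is a hypothetical class, time is passed in explicitly so the behaviour is easy to follow, and a real limiter would keep one bucket per client IP.

```python
class TokenBucket:
    """Generic token bucket: refills at `rate` tokens/sec up to `capacity`."""

    def __init__(self, rate_limit, rate_burst=None, now=0.0):
        self.rate = float(rate_limit)
        # Burst capacity falls back to the rate when not set.
        self.capacity = float(rate_burst if rate_burst is not None else rate_limit)
        self.tokens = self.capacity
        self.last = now

    def allow(self, now):
        # Refill in proportion to elapsed time, capped at capacity.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # the caller would answer 429 here


bucket = TokenBucket(rate_limit=2, rate_burst=3)
burst = [bucket.allow(0.0) for _ in range(4)]     # 3 tokens, 4 requests
later = [bucket.allow(0.5), bucket.allow(0.5)]    # half a second refills 1 token
print(burst, later)
```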
Logging
| Field | Description |
|---|---|
| log | Per-event block log lines. Set to off (also accepts false/no/0) to suppress. Default: on. |
| log_summary | Interval in seconds. When set, vanguard counts every block/drop and emits one summary line per interval — e.g. in last 60s, 412 events. Silent intervals are skipped. Default: 0 (disabled). Independent of log. |
vanguard {
rate_limit 200;
ban_path *.php, *.env;
log off; # silence per-event lines under sustained probing
log_summary 60; # one aggregate line per minute instead
}
CDN IP lists
APM's installer fetches Cloudflare's published egress IP ranges and writes them to /etc/apm/ips/ as ready-to-include partial configs. Include them inside a vanguard { } block to restrict direct access to CDN traffic only.
vanguard {
# Only allow Cloudflare egress IPs (IPv4 + IPv6)
include /etc/apm/ips/cloudflare-v4.part;
include /etc/apm/ips/cloudflare-v6.part;
rate_limit 500;
ban_path *.php, *wp-*, *.env, /.git*;
}
The IP lists are re-fetched automatically on every apm install or upgrade. To refresh them manually:
$ sudo apm install # re-runs the full installer, including IP fetch
Combine allow_ip with the CDN IP files to drop all non-CDN connections at the TCP level, before any HTTP parsing happens and before your app sees the request.
TLS
APM has first-class TLS support for all server types — HTTP, WebSocket, and TCP. Bring your own certificates.
worker {
name api;
exec node;
params api.js;
server https://0.0.0.0:443;
tls true;
tls_cert /etc/ssl/certs/myapp.crt;
tls_key /etc/ssl/private/myapp.key;
# tls_ca for mutual TLS (client cert verification)
tls_ca /etc/ssl/certs/ca.crt;
}
| Field | Description |
|---|---|
| tls | Enable TLS on all server listeners for this worker. |
| tls_cert | Path to the TLS certificate file (PEM). |
| tls_key | Path to the private key file (PEM). |
| tls_ca | Path to the CA certificate for mutual TLS. If set, client certificates are required and verified against this CA. |
File watcher
The file watcher monitors your source directory and triggers a worker restart when matching files change. Uses kernel file-watch events (inotify) — no polling.
worker {
name api;
exec node;
params server.js;
path /home/user/myapp;
watch *.js, *.json; # watch .js and .json files
watch_ignore *node_modules*; # ignore anything inside node_modules
watch_delay 200; # 200ms debounce
}
| Field | Description |
|---|---|
| watch | Comma-separated pattern list. Matched against the full path of each changed file. Worker restarts when any pattern matches. |
| watch_ignore | Comma-separated pattern list. Paths matching any of these are excluded from watch events even if they also match watch. |
| watch_delay | Debounce delay in milliseconds. Multiple rapid changes are batched into one restart. |
Pattern syntax
Watch patterns use a simple glob-style syntax — no regex needed. Four matching modes:
| Pattern | Mode | Example | Matches |
|---|---|---|---|
| *.ext | ends-with | *.js | Any file ending in .js |
| prefix* | starts-with | src/* | Any path starting with src/ |
| *word* | contains | *node_modules* | Any path containing node_modules |
| exact | exact match | config.json | Only that exact filename |
# Go source files, excluding generated code and vendor
watch *.go;
watch_ignore *_generated.go, *vendor*;
# JS/TS project — watch src/, ignore build output and deps
watch *.js, *.ts, *.json;
watch_ignore *node_modules*, *dist/*;
# Python — any .py file anywhere under path
watch *.py;
watch_ignore *__pycache__*;
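The four matching modes can be sketched in a few lines of Python. This is an interpretation of the table above rather than APM's matcher: the function names are hypothetical, a `*` is only treated as special at the start or end of a pattern, and the exact mode compares against the whole string passed in.

```python
def pattern_matches(pattern, path):
    """Apply the ends-with / starts-with / contains / exact modes."""
    starts = pattern.startswith("*")
    ends = pattern.endswith("*")
    if starts and ends and len(pattern) >= 2:
        return pattern[1:-1] in path          # *word*  -> contains
    if starts:
        return path.endswith(pattern[1:])     # *.ext   -> ends-with
    if ends:
        return path.startswith(pattern[:-1])  # prefix* -> starts-with
    return path == pattern                    # exact


def should_trigger(path, watch, watch_ignore=()):
    """A change triggers a restart if it matches watch and no watch_ignore."""
    if any(pattern_matches(p, path) for p in watch_ignore):
        return False
    return any(pattern_matches(p, path) for p in watch)


watch = ["*.js", "*.ts", "*.json"]
ignore = ["*node_modules*", "*dist/*"]
print(should_trigger("src/app.js", watch, ignore))
print(should_trigger("node_modules/x/index.js", watch, ignore))
```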
Keep watch_delay at 100–300 ms. Build tools often write multiple files in quick succession; the debounce ensures only one restart fires per save.
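To see how a debounce collapses a burst of file events into one restart, the sketch below computes when restarts would fire for a list of event timestamps. It assumes a trailing-edge debounce (the timer resets on every new event), which is one common design and not necessarily the one APM uses; `restart_times` is a hypothetical helper.

```python
def restart_times(event_times_ms, watch_delay=200):
    """Return the times (ms) at which a restart would fire for the given events."""
    fires = []
    pending = None  # time of the most recent event not yet followed by a restart
    for t in sorted(event_times_ms):
        if pending is not None and t - pending >= watch_delay:
            # The quiet period elapsed before this event arrived.
            fires.append(pending + watch_delay)
        pending = t
    if pending is not None:
        fires.append(pending + watch_delay)
    return fires


# Three writes within 120 ms, then two more about a second later.
fires = restart_times([0, 50, 120, 1000, 1010], watch_delay=200)
print(fires)
```

Five file events produce two restarts, one per burst.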
Rolling restart
Rolling restarts cycle through instances one at a time, keeping the rest running to serve traffic. Zero downtime for multi-instance workers.
worker {
instances 4;
rolling true;
rolling_delay 1000; # 1s between each instance restart
}
With session_persist true, open connections are migrated to a new instance before the old one is killed. Use session_wait to control how long APM waits for the new instance to become ready.
rolling true;
rolling_delay 500;
session_persist true;
session_wait 2000; # wait up to 2s for new instance
Logger
APM has a built-in logger for each worker. Every line written to a child process's stdout or stderr is intercepted, prefixed with a timestamp and worker name, and written to the configured destination. Coloring is applied by APM before writing — use strip_ansi to remove it when logging to files.
Destinations
| Field | Default | Description |
|---|---|---|
| log | | File path for stdout. If omitted, output goes to the daemon log. |
| err_log | same as log | File path for stderr. Defaults to the same file as log when not set. |
| syslog | | Syslog destination URL, e.g. syslog://localhost:514. ANSI is always stripped for syslog regardless of strip_ansi. |
| syslog_tag | | Tag string attached to every syslog message for this worker. |
Prefix
Each log line is prefixed with the worker name (or a custom string). The prefix field supports the color syntax described below. APM automatically appends the instance number in multi-instance workers:
| Field | Default | Description |
|---|---|---|
| prefix | name | String prepended to every log line. Supports çN- color escapes. The instance index is appended automatically for multi-instance workers. |
worker {
name api;
exec node;
params server.js;
# cyan name, reset after — instance # is appended automatically
prefix ç51-api-serverçR-;
log /var/log/myapp/out.log;
err_log /var/log/myapp/err.log;
}
For a worker with instances 3, the stdout prefix becomes api-server#1, api-server#2, api-server#3 — each in a distinct color so instances are visually distinct in the live GUI and in log files.
Timestamp format
The timestamp prepended to each line is controlled by log_time_format. The format string uses strftime-style tokens and supports color escapes. The default is ç214-%Y-%m-%d %Tç59-.%FçR- (orange date, dim fractional seconds).
| Field | Default | Description |
|---|---|---|
| log_time_format | ç214-%Y-%m-%d %Tç59-.%FçR- | Timestamp format. Supports strftime tokens and color escapes. |
Strftime tokens
| Token | Output |
|---|---|
| %Y | 4-digit year — 2026 |
| %y | 2-digit year — 26 |
| %m | Month, zero-padded — 03 |
| %d | Day, zero-padded — 07 |
| %H | Hour 24h, zero-padded — 14 |
| %M | Minute, zero-padded — 05 |
| %S | Second, zero-padded — 09 |
| %T | Shorthand for %H:%M:%S |
| %F | Fractional seconds (microseconds) |
# Default — orange date, dim microseconds
log_time_format ç214-%Y-%m-%d %Tç59-.%FçR-;
# Compact — just HH:MM:SS in gray
log_time_format ç59-%TçR-;
# No color — plain ISO timestamp
log_time_format %Y-%m-%d %T;
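Note that %F here means fractional seconds, whereas in C's strftime %F expands to the ISO date (%Y-%m-%d). The Python sketch below mimics the token table above by expanding %T and %F itself before delegating to strftime. It assumes six zero-padded microsecond digits and does not handle the color escapes; `format_timestamp` is a hypothetical helper.

```python
from datetime import datetime


def format_timestamp(fmt, when):
    """Render fmt using the token meanings from the table above."""
    # Expand the two tokens handled specially here, then defer to strftime.
    fmt = fmt.replace("%T", "%H:%M:%S")
    fmt = fmt.replace("%F", f"{when.microsecond:06d}")
    return when.strftime(fmt)


stamp = format_timestamp("%Y-%m-%d %T.%F", datetime(2026, 3, 7, 14, 5, 9, 123456))
print(stamp)  # 2026-03-07 14:05:09.123456
```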
Strip ANSI
| Field | Default | Description |
|---|---|---|
| strip_ansi | false | Strip ANSI color codes from all log output before writing to the file. Useful when you want clean logs on disk but colored output in the GUI. Always on for syslog destinations. |
Use strip_ansi false for local development (colors in the GUI look great), and set it to true for production log files so tools like grep, awk, and log shippers see clean text.
Color syntax — çN-
APM uses a compact color escape based on the 256-color terminal palette. The ç character (U+00E7) acts as the escape marker. This syntax works in prefix, log_time_format, and anywhere APM renders text to the terminal or log files.
| Syntax | ANSI equivalent | Description |
|---|---|---|
| çN- | \033[38;5;Nm | Set foreground to 256-color palette index N (0–255). |
| çN,BG- | \033[38;5;N;48;5;BGm | Foreground N, background BG. |
| çN,BG,ATTR- | \033[38;5;N;48;5;BG;ATTRm | Foreground, background, and an SGR attribute (1 bold, 2 dim, 4 underline, 9 strikethrough). |
| çR- | \033[0m | Reset all formatting. |
# Foreground only
prefix ç82-myappçR-; # bright green name
prefix ç196-myappçR-; # bright red name
prefix ç214-myappçR-; # orange name
# Foreground + background
prefix ç15,88-ERRORçR-; # white text on dark red background
# Bold foreground
prefix ç51,0,1-myappçR-; # bold cyan on a black (index 0) background
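The mapping from çN- escapes to ANSI sequences in the table above can be sketched with a regular expression. This is an illustrative translation only, not APM's renderer: `to_ansi` is a hypothetical function and it does not validate that palette indexes stay within 0–255.

```python
import re

# ç followed by R, or by N[,BG[,ATTR]], and terminated by '-'.
_COLOR = re.compile(r"ç(?:R|(\d{1,3})(?:,(\d{1,3}))?(?:,(\d))?)-")


def to_ansi(text):
    """Replace çN-, çN,BG-, çN,BG,ATTR- and çR- with ANSI escape sequences."""
    def repl(match):
        if match.group(0) == "çR-":
            return "\033[0m"
        fg, bg, attr = match.group(1), match.group(2), match.group(3)
        parts = ["38;5;" + fg]
        if bg is not None:
            parts.append("48;5;" + bg)
        if attr is not None:
            parts.append(attr)
        return "\033[" + ";".join(parts) + "m"

    return _COLOR.sub(repl, text)


print(repr(to_ansi("ç82-myappçR-")))
print(repr(to_ansi("ç51,0,1-myappçR-")))
```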
Useful color reference
| Code | Approximate color |
|---|---|
| ç1- | Dark red |
| ç2- | Dark green |
| ç6- | Cyan |
| ç51- | Bright cyan |
| ç80- | Green |
| ç82- | Bright green |
| ç88- | Dark red |
| ç124- | Medium red |
| ç165- | Magenta |
| ç196- | Bright red |
| ç202- | Orange-red |
| ç208- | Orange |
| ç214- | Amber / warm orange |
| ç244- | Mid gray |
| ç59- | Dark gray |
| ç15- | White |
| çR- | Reset |
StatsD
APM can forward worker metrics to any StatsD-compatible endpoint (StatsD, Graphite, Datadog Agent, Telegraf) over UDP. Add a statsd { } block to a worker.
worker {
name api;
exec node;
params server.js;
statsd {
host localhost:8125; # StatsD UDP endpoint
prefix apm.api; # metric namespace
interval 1; # flush interval, seconds
}
}
| Field | Default | Description |
|---|---|---|
| host | required | StatsD UDP endpoint as host:port. The block is inactive without it. |
| prefix | apm.<worker> | Namespace prefixed to every metric. Non-alphanumeric characters in the worker name are replaced with _. |
| interval | 1 | Flush interval in seconds. Metrics are batched into UDP packets kept under 1400 bytes. |
System metrics
Forwarded automatically every interval, as gauges:
| Metric | Description |
|---|---|
| <prefix>.cpu | Worker CPU usage (%) |
| <prefix>.rss | Resident memory (bytes) |
| <prefix>.instances | Running child count |
| <prefix>.active_conns | Current concurrent connections |
| <prefix>.total_conns | Lifetime total connections |
| <prefix>.restarts.normal / .error / .watch | Restart counts by cause |
| <prefix>.errors | Error-exit count |
Custom metrics
Metrics emitted from worker code with apm.metric(name, value, type) are aggregated across all of the worker's instances and forwarded with the same prefix — counters summed (|c), timings averaged (|ms), gauges last-value-wins (|g).
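The aggregation rules and packet batching described in this section can be sketched as below. The sketch uses the standard StatsD line format (name:value|type) and assumes the metric type is given as "c", "ms", or "g"; the exact values accepted by apm.metric are not shown in this document. `aggregate` and `pack` are hypothetical helpers.

```python
def aggregate(samples):
    """samples: iterable of (name, value, kind) with kind in {'c', 'ms', 'g'}."""
    counters, timings, gauges = {}, {}, {}
    for name, value, kind in samples:
        if kind == "c":
            counters[name] = counters.get(name, 0) + value   # counters are summed
        elif kind == "ms":
            timings.setdefault(name, []).append(value)       # timings are averaged
        elif kind == "g":
            gauges[name] = value                             # last value wins
        else:
            raise ValueError(f"unknown metric type: {kind}")
    lines = [f"{n}:{v}|c" for n, v in counters.items()]
    lines += [f"{n}:{sum(vs) / len(vs)}|ms" for n, vs in timings.items()]
    lines += [f"{n}:{v}|g" for n, v in gauges.items()]
    return lines


def pack(lines, prefix, max_bytes=1400):
    """Join prefixed lines into newline-separated packets kept under max_bytes."""
    packets, current = [], ""
    for line in (f"{prefix}.{item}" for item in lines):
        candidate = line if not current else current + "\n" + line
        if current and len(candidate.encode()) >= max_bytes:
            packets.append(current)
            current = line
        else:
            current = candidate
    if current:
        packets.append(current)
    return packets


# Samples as they might arrive from two instances of the same worker.
lines = aggregate([
    ("hits", 1, "c"), ("hits", 2, "c"),
    ("latency", 10, "ms"), ("latency", 20, "ms"),
    ("queue", 5, "g"), ("queue", 7, "g"),
])
packets = pack(lines, "apm.api")
print(packets)
```

Each packet string would then be sent as one UDP datagram to the configured host.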
Web GUI
APM ships a built-in real-time web dashboard. Enable it with gui_port in the daemon config:
daemon {
gui_port 6789; # GUI port — omit or set 0 to disable
gui_bind 127.0.0.1; # default is 0.0.0.0 — set 127.0.0.1 to keep it local
}
When the daemon starts with the GUI enabled it prints the access URL:
$ apm start
GUI: http://127.0.0.1:6789/
Views
| Tab | Description |
|---|---|
| Workers | Live table of all workers and instances — status, uptime, CPU sparkline, CPU%, RAM, restart count, error count. Per-worker stop / start / restart / reload-config buttons. |
| Dashboard | Custom metric panels (LED, counter, text, graph, gauge, heatmap) defined in the worker's dashboard { } config block. Workers without a dashboard block show a placeholder. |
| Live Logs | Per-worker log stream replayed from a 200-line ring buffer on connect, then live. Includes both stdout and stderr. Clear, download, and pause controls. |
| Server Info | CPU model, thread count, speed, RAM, OS, kernel, architecture, uptime, network interfaces (IP, MAC, speed, RX/TX totals), load averages, and latency probes. |
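The Live Logs behaviour (replay from a 200-line ring buffer on connect, then live) can be illustrated with a bounded deque. The `LogRing` class below is a hypothetical model for explanation only; how the daemon actually stores and streams log lines is not described here.

```python
from collections import deque


class LogRing:
    """Keep the newest `capacity` lines; replay them to each new subscriber."""

    def __init__(self, capacity=200):
        self._lines = deque(maxlen=capacity)  # old lines fall off the left
        self._subscribers = []

    def append(self, line):
        self._lines.append(line)
        for deliver in self._subscribers:     # live delivery
            deliver(line)

    def connect(self, deliver):
        for line in self._lines:              # replay buffered history first
            deliver(line)
        self._subscribers.append(deliver)
```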
Server Info — latency
The Server Info page has a Latency section with two cards:
| Card | How it works | Interval |
|---|---|---|
| Server → Internet | TCP connect time to Google (8.8.8.8:443), Cloudflare (1.1.1.1:443), and Quad9 (9.9.9.9:443). Measures server-side outbound connectivity. | On load, then every 30 s |
| Browser → Server | WebSocket round-trip time. The browser sends a ping frame; the server echoes it; the browser measures elapsed time. | On load, then every 30 s |
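A TCP connect-time probe of the kind the Server → Internet card describes can be sketched with the standard library. The function name and the 3-second default timeout are assumptions; the probe targets in the table are not hard-coded here.

```python
import socket
import time


def tcp_connect_ms(host, port, timeout=3.0):
    """Return the TCP connect time in milliseconds, or None if the connect fails."""
    start = time.perf_counter()
    try:
        # create_connection performs the TCP handshake; closing happens on exit
        with socket.create_connection((host, port), timeout=timeout):
            return (time.perf_counter() - start) * 1000.0
    except OSError:
        return None
```

For example, `tcp_connect_ms("1.1.1.1", 443)` measures outbound connectivity to one of the targets listed above.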
Disconnect behaviour
When the daemon stops, the WebSocket closes and the GUI immediately dims with a Session Ended overlay. Click Refresh Page to reconnect.
Dashboard
Each worker can expose a custom metric dashboard in the GUI. Define a dashboard { } block inside a worker config to create one. The dashboard is shown in the Dashboard tab when that worker is selected.
worker my-api {
exec node;
params server.js;
server http://127.0.0.1:3000;
dashboard {
name My API;
cols 6;
rows 4;
module {
type graph;
id 1;
name Requests/sec;
x 0; y 0; w 3; h 1;
}
module {
type gauge;
id 2;
name CPU %;
x 3; y 0; w 1; h 1;
min 0; max 100; unit %;
}
module {
type counter;
id 3;
name Total errors;
x 4; y 0; w 1; h 1;
}
module {
type text;
id 4;
name Last error;
x 5; y 0; w 1; h 1;
}
}
}
Dashboard block fields
| Field | Default | Description |
|---|---|---|
| name | worker name | Tab label shown in the GUI. |
| cols | 6 | Number of grid columns. |
| rows | 3 | Number of grid rows. |
Module fields
| Field | Required | Description |
|---|---|---|
| type | yes | Module type: led, counter, text, graph, gauge, heatmap. |
| id | yes | Integer ID. Must be unique within the dashboard. Used to route metrics from code to the right module. |
| x, y | yes | Grid position (0-based column, row). |
| w, h | yes | Width and height in grid cells. |
| name | no | Label shown inside the module. |
| unit | no | Unit suffix displayed next to the value (e.g. %, ms, req/s). |
| min, max | no | Value range. Used by gauge to scale the arc. Default 0–100. |
| color | no | Accent color (hex or CSS value). Used by led, graph, gauge. |
| base_color | no | Base / background color for heatmap cells. |
| source | no | Auto-feed a built-in metric without writing code: cpu (CPU%), ram (RAM MB), conn (active connections), ior / iow (disk I/O read/write). When set, setDashValue calls for this module are ignored. |
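Before loading a dashboard block it can help to check the layout by hand. The sketch below is a hypothetical pre-check, not something APM is documented to run: it verifies that ids are unique and that every module fits inside the cols × rows grid, and it flags overlapping cells as an extra assumption.

```python
def validate_dashboard(cols, rows, modules):
    """Return a list of problems; an empty list means the layout looks consistent."""
    problems = []
    seen_ids = set()
    occupied = {}  # (col, row) -> id of the module that claimed the cell first
    for m in modules:
        mid = m["id"]
        if mid in seen_ids:
            problems.append(f"duplicate id {mid}")
        seen_ids.add(mid)
        x, y, w, h = m["x"], m["y"], m["w"], m["h"]
        if x < 0 or y < 0 or w < 1 or h < 1 or x + w > cols or y + h > rows:
            problems.append(f"module {mid} does not fit the {cols}x{rows} grid")
            continue
        cells = [(c, r) for c in range(x, x + w) for r in range(y, y + h)]
        clashes = sorted({occupied[cell] for cell in cells if cell in occupied})
        for other in clashes:
            problems.append(f"module {mid} overlaps module {other}")
        for cell in cells:
            occupied.setdefault(cell, mid)
    return problems
```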
Module types
| Type | Description |
|---|---|
| led | Colored indicator light. Green when value > 0, configurable color. |
| counter | Large numeric display. Shows cumulative value. |
| text | Single-line text value. Good for status strings or last-event messages. |
| graph | Scrolling bar chart. Newest bar on the right, auto-scaling. |
| gauge | Arc gauge with min/max range and optional unit suffix. |
| heatmap | Grid of colored cells representing a 2-D value distribution. |
Sending metrics from code
Use apm.setDashValue(id, value, color?) in the Node.js connector to push a value to a dashboard module. This is distinct from apm.metric(), which is only for StatsD-style metrics (see Custom metrics).
const ApmModule = require('./apm_module.node.js')
const apm = new ApmModule(async (session) => { /* handle connections */ })
// Push a number to module id 1 (graph)
apm.setDashValue(1, requestsPerSecond)
// Push a number with a dynamic color
apm.setDashValue(2, cpuPercent, cpuPercent > 80 ? '#ff5a5a' : '#4f8cff')
// Push a string to a text module (id 4)
apm.setDashValue(4, lastErrorMessage)
// LED on/off (1 = on, 0 = off)
apm.setDashValue(5, isHealthy ? 1 : 0, isHealthy ? '#47d16c' : '#ff5a5a')
| Parameter | Description |
|---|---|
| id | Module ID as defined in the dashboard { } config block. |
| value | Number for gauge, graph, counter, led; string for text. |
| color | Optional CSS color string to override the module's configured color dynamically. |
Counter vs gauge vs graph
| Module type | How value is applied |
|---|---|
| counter | Value is added to the running total each call (delta). To reset, call setDashValue(id, -currentTotal). |
| gauge | Absolute value replaces the current reading. Arc fills from min to max. |
| graph | Absolute value; appended as the newest bar on the right each call. |
| led | Any non-zero value turns the LED on; 0 turns it off. |
| text | String value replaces the displayed text. |
| heatmap | Numeric value 0–100 appended as the next cell. |
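The table above boils down to a small state-update rule per module type. The following is a toy model of that rule, with a hypothetical `apply_value` function; it does not reflect how the GUI stores module state.

```python
def apply_value(kind, state, value):
    """Return the new state of a module of type `kind` after one setDashValue call."""
    if kind == "counter":
        return (state or 0) + value        # delta added to the running total
    if kind in ("gauge", "text"):
        return value                       # absolute value replaces the reading
    if kind == "led":
        return value != 0                  # any non-zero value turns it on
    if kind in ("graph", "heatmap"):
        return (state or []) + [value]     # appended as the newest bar / cell
    raise ValueError(f"unknown module type: {kind}")
```

Resetting a counter follows directly from the delta rule: applying `-total` brings the running total back to 0.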
Node.js connector
The Node.js connector ships in two equivalent forms — pick whichever matches your project:
- apm_module.node.js: CommonJS (require); Node 10+.
- apm_module.node.mjs: ES Modules (import); Node 14+, for "type": "module" packages.
Both files are line-for-line equivalent: same class, same API, same wire protocol. IPC happens over stdin/stdout using binary frames — no Unix sockets required in the child.
$ curl -fsSL https://processmanager.dev/connectors/apm_module.node.js -o apm_module.node.js
$ curl -fsSL https://processmanager.dev/connectors/apm_module.node.mjs -o apm_module.node.mjs
$ node apm_module.node.js -update
$ node apm_module.node.mjs -update
Pass the constructor an onConnect callback. The constructor sets up crash handlers and the stdin IPC listener immediately, so call it once at startup before doing anything else. For ESM, the first line below becomes import ApmModule from './apm_module.node.mjs'.
const ApmModule = require('./apm_module.node.js')
const apm = new ApmModule(async (session) => {
// session.protocol — 'http' | 'ws' | 'tcp'
// session.method — HTTP method
// session.path — full path + query
// session.headers — request headers
// session.remoteIp — real client IP (proxy-aware)
// session.cookies — parsed cookie map
session.write('Hello World', {
'content-type': 'text/plain',
'x-status': '200'
})
session.close()
})
If your worker doesn't handle sessions (e.g. a background job pushing dashboard metrics), pass an empty async function: new ApmModule(async () => {}).
Session API
| Property / Method | Description |
|---|---|
| session.protocol | 'http', 'ws', or 'tcp' |
| session.method | HTTP method (GET, POST, …) |
| session.path | Full path including query string |
| session.path_array | Decoded path segments as an array |
| session.query | Raw query string parts |
| session.query_object | Parsed { key: value | [values] } |
| session.cookies | Parsed cookie map |
| session.headers | Request headers object |
| session.remoteIp | Client IP. APM resolves from proxy headers when trust_proxy is set. |
| session.sessionId | Unique per-connection ID |
| session.instanceId | APM_INDEX of this instance (0-based) |
| session.sessionType | 'new' for fresh connections |
| session.sessionData | Free-form object. Persists across session callbacks. Use saveSessionData() to persist across rolling restarts. |
| session.active | true while the connection is open |
| session.onData | Set inside the callback. Called with (data, isBinary) for incoming data (WebSocket frames, TCP bytes). |
| session.onClose | Set inside the callback. Called when connection closes. |
| session.write(data, headers?) | Send HTTP response body / WebSocket frame. Pass headers object on first HTTP write to set status and headers. |
| session.close(code?, reason?) | Close the connection. Sends an HTTP status for HTTP sessions or a close frame for WebSocket sessions. |
| session.writeRaw(data) | Send raw bytes, bypassing HTTP/WebSocket framing. For TCP or low-level use. |
| session.saveSessionData() | Persist sessionData in the daemon. Survives rolling restart — new instance receives the same data. |
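To make the shape of path_array and query_object concrete, here is a rough equivalent built on the standard library. The exact decoding rules the connector applies are not specified above, so treat the function name and details (percent-decoding, handling of blank values) as assumptions.

```python
from urllib.parse import parse_qs, unquote, urlsplit


def split_request_path(full_path):
    """Split 'path?query' into decoded segments and a { key: value | [values] } map."""
    parts = urlsplit(full_path)
    path_array = [unquote(seg) for seg in parts.path.split("/") if seg]
    parsed = parse_qs(parts.query, keep_blank_values=True)
    # Single values are unwrapped; repeated keys stay as lists
    query_object = {k: (v[0] if len(v) == 1 else v) for k, v in parsed.items()}
    return path_array, query_object
```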
Instance methods
| Method | Description |
|---|---|
| apm.setDashValue(id, value, color?) | Push a value to a dashboard module. id is the integer module ID from the config. value is a number for gauge / graph / counter / led, or a string for text. color is an optional CSS color override. |
| apm.metric(name, value, type?) | Send a StatsD-style metric. name is a dot-separated string (e.g. 'req.ok'). type: 'counter' (default, summed per second), 'gauge' (last value), 'timing' (averaged). Visible in StatsD export. |
| apm.instanceId | APM_INDEX of this process instance (0-based string). |
Environment variables
APM injects the following into managed child processes:
| Variable | Description |
|---|---|
| APM | Set to 1. The connector checks for this and exits if not present. |
| APM_INDEX | 0-based instance index. Only injected when env_index is configured. |
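A worker script can read these variables itself, for example to pick a per-instance cache file. The helper below is a hypothetical sketch; the official connectors perform their own check of APM.

```python
import os


def apm_context(environ=os.environ):
    """Return None when not started by APM, else a dict with the instance index."""
    if environ.get("APM") != "1":
        return None
    index = environ.get("APM_INDEX")  # absent unless env_index is configured
    return {"managed": True, "index": int(index) if index is not None else None}
```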
WebSocket example
const ApmModule = require('./apm_module.node.js')
const apm = new ApmModule(async (session) => {
if (session.protocol !== 'ws') {
session.close(400)
return
}
session.onData = (data, isBinary) => {
// echo back
session.write(data)
}
session.onClose = () => {
console.log('disconnected', session.sessionId)
}
})
PHP / Python / Perl / Lua connectors
Connectors for other languages follow the same pattern as Node.js: drop a single file into your project, require / include it, construct an instance with an onConnect callback, then run the event loop. All connectors implement the full APM IPC protocol over stdin/stdout — no extra dependencies beyond what's noted on the connectors page.
$ curl -fsSL https://processmanager.dev/connectors/apm_module.php -o apm_module.php
$ curl -fsSL https://processmanager.dev/connectors/apm_module.py -o apm_module.py
$ curl -fsSL https://processmanager.dev/connectors/ApmModule.pm -o ApmModule.pm
$ curl -fsSL https://processmanager.dev/connectors/apm_module.lua -o apm_module.lua
Each connector file also supports self-update — run it with -update to fetch the latest version from the server (e.g. php apm_module.php -update). See the connectors page for version info, MD5 checksums, and per-language update commands. All four expose the same setDashValue(id, value, color?) and metric(name, value, type?) methods as the Node.js connector — see the Dashboard section.
Each subsection below shows the same three worked patterns per language: a hello-world HTTP session, an IPC channel handler (the worker listens on a channel and replies to request()), and a stream initiator (the worker opens a stream to a peer and exchanges data). The wire protocol is identical across languages — anything you can do from Node.js works from these. For a worked end-to-end example with workers in two different languages talking to each other, see cross-language IPC walkthrough.
Python
Python 3.6+, stdlib only. The connector exposes ApmSession and IpcStream classes and uses snake_case method names. The session.write() argument can be bytes or str.
from apm_module import ApmModule
def on_connect(s):
    s.write(b'Hello from Python', {
        'content-type': 'text/plain',
        'x-status': '200',
    })
    s.close()
apm = ApmModule(on_connect)
apm.run()
Pair with a worker block like worker { exec python3; params hello.py; path /opt/myapp; server :8080; }.
from apm_module import ApmModule
apm = ApmModule(lambda s: s.close())  # no HTTP sessions here
def on_channel(channel, data, reply):
    # `reply` is None for fire-and-forget send(); a callable for request()
    if reply is None:
        print(f'broadcast on {channel}: {data}')
        return
    reply({'pong': data.get('ping'), 'from': 'python'})
apm.on_channel = on_channel
apm.run()
Configure the listener in the worker block: listen "ping_service";. A peer (any language) can then call request("ping_service", {ping: 42}) and receive {pong: 42, from: 'python'}.
import threading
from apm_module import ApmModule
apm = ApmModule(lambda s: s.close())
def main():
    stream = apm.request_stream('iot', {'device': 'sensor-7'}, 5000)
    if stream is None:
        print("no peer accepted on 'iot'")
        return
    stream.on_data = lambda data, peer=None: print('in: ', data)
    stream.on_close = lambda: print('closed')
    stream.write(b'ready')
# Kick off the request after the connector loop is running
threading.Timer(0.1, main).start()
apm.run()
PHP
PHP 7.4+, no extra extensions. Methods use camelCase following PSR convention. Handlers are assigned to onChannel / onStream properties on the instance.
<?php
require_once __DIR__ . '/apm_module.php';
$apm = new ApmModule(function (ApmSession $s) {
$s->write('Hello from PHP', [
'content-type' => 'text/plain',
'x-status' => '200',
]);
$s->close();
});
$apm->run();
<?php
require_once __DIR__ . '/apm_module.php';
$apm = new ApmModule(function ($s) { $s->close(); });
$apm->onChannel = function ($channel, $data, $reply) {
// $reply is null for fire-and-forget send(); callable for request()
if ($reply === null) {
error_log("broadcast on $channel");
return;
}
$reply(['pong' => $data['ping'] ?? null, 'from' => 'php']);
};
$apm->run();
<?php
require_once __DIR__ . '/apm_module.php';
$apm = new ApmModule(function ($s) { $s->close(); });
// requestStream blocks until accepted/rejected/timeout; call after run() in
// a coroutine, or from a request handler. Returns IpcStream or null.
$apm->onChannel = function ($ch, $data, $reply) use ($apm) {
if ($ch !== 'kick' || $reply === null) return;
$stream = $apm->requestStream('iot', ['device' => 'sensor-7'], 5000);
if ($stream === null) { $reply(['ok' => false]); return; }
$stream->onData = function ($data) { error_log("in: $data"); };
$stream->onClose = function () { error_log('closed'); };
$stream->write('ready');
$reply(['ok' => true, 'stream' => $stream->id]);
};
$apm->run();
Perl
Perl 5.10+ with the JSON module (cpan install JSON or apt install libjson-perl). Object-oriented; handlers assigned via hash-element syntax ($apm->{on_channel} = sub { ... }).
use strict;
use warnings;
use ApmModule;
my $apm = ApmModule->new(sub {
my ($s) = @_;
$s->write('Hello from Perl', {
'content-type' => 'text/plain',
'x-status' => '200',
});
$s->close;
});
$apm->run;
use strict; use warnings;
use ApmModule;
my $apm = ApmModule->new(sub { $_[0]->close });
$apm->{on_channel} = sub {
my ($channel, $data, $reply) = @_;
if (!defined $reply) {
warn "broadcast on $channel\n";
return;
}
$reply->({ pong => $data->{ping}, from => 'perl' });
};
$apm->run;
use strict; use warnings;
use ApmModule;
my $apm = ApmModule->new(sub { $_[0]->close });
$apm->{on_channel} = sub {
my ($ch, $data, $reply) = @_;
return unless $ch eq 'kick' && $reply;
my $stream = $apm->request_stream('iot', { device => 'sensor-7' }, 5000);
if (!$stream) { $reply->({ ok => 0 }); return; }
$stream->{on_data} = sub { warn "in: $_[0]\n" };
$stream->{on_close} = sub { warn "closed\n" };
$stream->write('ready');
$reply->({ ok => 1, stream => $stream->{id} });
};
$apm->run;
Lua
Lua 5.3+ with lua-cjson (apt install lua-cjson or luarocks install lua-cjson). Method calls use : syntax (apm:run()); callbacks use . assignment (apm.on_channel = …).
local ApmModule = require('apm_module')
local apm = ApmModule.new(function(s)
s:write('Hello from Lua', {
['content-type'] = 'text/plain',
['x-status'] = '200',
})
s:close()
end)
apm:run()
local ApmModule = require('apm_module')
local apm = ApmModule.new(function(s) s:close() end)
apm.on_channel = function(channel, data, reply)
-- reply is nil for fire-and-forget send(); a function for request()
if not reply then
io.stderr:write('broadcast on ' .. channel .. '\n')
return
end
reply({ pong = data.ping, from = 'lua' })
end
apm:run()
local ApmModule = require('apm_module')
local apm = ApmModule.new(function(s) s:close() end)
apm.on_channel = function(ch, data, reply)
if ch ~= 'kick' or not reply then return end
local stream = apm:request_stream('iot', { device = 'sensor-7' }, 5000)
if not stream then reply({ ok = false }); return end
stream.on_data = function(d) io.stderr:write('in: ' .. d .. '\n') end
stream.on_close = function() io.stderr:write('closed\n') end
stream:write('ready')
reply({ ok = true, stream = stream.id })
end
apm:run()
Inter-worker IPC (new in v2.0)
Workers can now talk to each other through APM — across instances, across workers, and across languages. Two primitives: channels for stateless message passing, and streams for persistent bidirectional pipes. The daemon is the router; no extra sockets, no broker, no ports.
How it works
A worker declares a channel name with the listen config directive. The daemon maintains a registry mapping channel names → running workers. When any worker calls send(), request(), or requestStream(), the daemon routes the message to every running child of every worker listening on that channel. Workers never talk directly — all traffic flows over the existing stdin/stdout protocol that connectors already use, so no new dependencies and no new surface area.
┌────────────┐
│ APM daemon │ ← channel registry & router
└──┬──┬───┬───┘
│ │ │
┌─────────┘ │ └─────────┐
▼ ▼ ▼
┌─────────┐ ┌─────────┐ ┌─────────┐
│ sender │ │ listener│ │ listener│
└─────────┘ └─────────┘ └─────────┘
Why this matters
- Cross-language. A Python worker can request data from a Node.js worker and get a JSON reply. A PHP request handler can open a stream to a Go background worker. The connector API is identical across Node.js, Python, PHP, Perl, and Lua.
- Zero config. Add listen "channel_name"; to a worker block. That's it. The daemon rebuilds the registry automatically on every worker start/stop.
- Backwards compatible. Workers without listen behave exactly as before. The protocol is additive; old connectors are unaffected.
- Crash-safe. When any child dies, the daemon cleans up its streams and cancels its pending requests automatically. Writes to closed streams are silently dropped (no-scream policy).
Channels — fire-and-forget & request/reply
Channels are lightweight stateless messaging. A sender emits a message to a named channel; the daemon broadcasts it to every child of every worker listening on that channel.
send() — fire-and-forget
Broadcasts a message with no response expected. Returns immediately. If nobody is listening, the message is silently dropped.
// Push a device status update to everyone listening on "devices"
apm.send('devices', { deviceId: 'sensor-5', online: true })
// Worker config must contain: listen "devices";
apm.onChannel = (channel, data) => {
console.log('got', channel, data)
}
The sender is never echoed to itself, even if it also listens on the same channel. This keeps broadcasts loop-free.
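The broadcast rule for send() (deliver to every listening child, never echo to the sender, drop silently when nobody listens) can be modelled in-process. The `ChannelRegistry` class and the child ids are hypothetical; the real routing happens inside the daemon over stdin/stdout frames.

```python
from collections import defaultdict


class ChannelRegistry:
    """Toy model of the daemon's channel registry and send() fan-out."""

    def __init__(self):
        self._listeners = defaultdict(list)  # channel -> [(child_id, deliver)]

    def listen(self, channel, child_id, deliver):
        self._listeners[channel].append((child_id, deliver))

    def remove_child(self, child_id):
        # Called when a child stops, so the registry stays accurate
        for channel in list(self._listeners):
            self._listeners[channel] = [
                entry for entry in self._listeners[channel] if entry[0] != child_id
            ]

    def send(self, sender_id, channel, data):
        """Broadcast to every listener except the sender; return the delivery count."""
        delivered = 0
        for child_id, deliver in self._listeners.get(channel, []):
            if child_id == sender_id:
                continue  # the sender is never echoed to itself
            deliver(channel, data)
            delivered += 1
        return delivered
```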
request() — request / first-reply-wins
Sends a message and waits for the first reply from any listening child. Late replies are silently dropped. Returns null on timeout or when no listener is running.
// Ask any iot worker for the status of device sensor-5
const status = await apm.request('iot', { query: 'status', device: 'sensor-5' }, 2000)
if (status === null) { /* nobody answered in time */ }
apm.onChannel = (channel, data, reply) => {
if (data.query === 'status') {
reply({ temp: 42, online: true })
}
}
Timeout priority: per-call timeout > worker ipc_timeout > default 500ms.
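The priority order reads as a simple fallback chain. A minimal sketch, assuming a hypothetical `resolve_timeout` helper where None means "not set":

```python
DEFAULT_IPC_TIMEOUT_MS = 500  # documented default


def resolve_timeout(per_call_ms=None, worker_ipc_timeout_ms=None):
    """Per-call timeout wins, then the worker's ipc_timeout, then the default."""
    if per_call_ms is not None:
        return per_call_ms
    if worker_ipc_timeout_ms is not None:
        return worker_ipc_timeout_ms
    return DEFAULT_IPC_TIMEOUT_MS
```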
Streams — persistent bidirectional pipes
Streams are long-lived connections between workers. They follow a mediated star topology: one worker opens the stream (the mediator), others can attach as peers.
- Mediator writes fan out to all attached peers.
- Peer writes go to the mediator only, tagged with the peer ID so the mediator knows who spoke.
Think of it as a group trip organiser: the organiser can speak to everyone, but each participant only talks back to the organiser. This makes streams a natural fit for fan-out notification, shared live consoles, device control sessions, and worker-to-worker RPC sessions where one side coordinates many.
Typical flow
- Mediator calls requestStream('channel', header). All listeners see a stream request.
- One or more listeners call stream.accept(). Others can stream.reject() (or just ignore it; the timeout handles "nobody accepted").
- As soon as the first peer accepts, the mediator's Promise resolves with the stream object.
- Both sides call stream.write(data) and handle stream.onData. Either side can call stream.close().
- When the last peer detaches, the stream auto-closes. When any child dies, its streams are cleaned up automatically.
// Open a live console to whichever iot worker owns device sensor-5
const stream = await apm.requestStream('iot_console', { device: 'sensor-5' }, 5000)
if (!stream) { /* no peer accepted within timeout */ return }
stream.onData = (chunk, peer) => { console.log('from', peer, chunk.toString()) }
stream.onClose = () => { console.log('console closed') }
stream.write('help\n')
// Peer side: a worker whose config contains listen "iot_console";
apm.onStream = (stream) => {
// header contains whatever the mediator sent
if (stream.header.device !== myDeviceId) { stream.reject(); return }
stream.accept({ name: 'sensor-5' })
stream.onData = (chunk) => { runCommand(chunk.toString()) }
stream.onClose = () => { /* cleanup */ }
}
No-scream policy
Writes to a closed stream are silently dropped — no error, no crash. The onClose callback still fires normally. This prevents race conditions when both sides close simultaneously, which is very common in real traffic.
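The stream rules described in this section (mediator writes fan out, peer writes reach only the mediator tagged with the peer id, auto-close when the last peer detaches, silent drops after close) fit in a small in-process model. The `Stream` class below is a hypothetical illustration of the semantics, not the connector's API.

```python
class Stream:
    """Toy model of a mediated star stream."""

    def __init__(self, on_mediator_data, on_close=None):
        self._on_mediator_data = on_mediator_data  # callback(data, peer_id)
        self._on_close = on_close                  # callback()
        self._peers = {}                           # peer_id -> callback(data)
        self.closed = False

    def attach(self, peer_id, on_data):
        if not self.closed:
            self._peers[peer_id] = on_data

    def mediator_write(self, data):
        if self.closed:
            return  # no-scream: silently dropped
        for on_data in list(self._peers.values()):
            on_data(data)

    def peer_write(self, peer_id, data):
        if self.closed or peer_id not in self._peers:
            return  # no-scream: silently dropped
        self._on_mediator_data(data, peer_id)

    def detach(self, peer_id):
        self._peers.pop(peer_id, None)
        if not self._peers:
            self.close()  # last peer gone: auto-close

    def close(self):
        if self.closed:
            return
        self.closed = True
        self._peers.clear()
        if self._on_close:
            self._on_close()  # fires once, even if both sides close
```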
Config & connector API
Worker config fields
| Field | Description |
|---|---|
| listen | Channel name this worker listens on. Every running child of the worker will receive messages sent to this channel. Omit to disable (the default). |
| ipc_timeout | Default timeout in milliseconds for request() calls from this worker. Per-call timeouts passed to request() override it. Default: 500. |
# IoT WebSocket server — listens for control commands
worker {
name iot_ws;
exec node;
params iot_server.js;
server ws://0.0.0.0:9100;
instances 4;
listen "iot_control";
ipc_timeout 1000;
}
# Web control panel — sends commands to IoT workers
worker {
name web_panel;
exec node;
params panel.js;
server http://0.0.0.0:8080;
instances 2;
}
Cross-language API
Every connector exposes the same primitives. Method names are adapted to each language's conventions.
| Language | send | request | stream | receive |
|---|---|---|---|---|
| Node.js | apm.send(ch, data) | await apm.request(ch, data, t?) | await apm.requestStream(ch, hdr?, t?) | apm.onChannel = fn |
| Python | apm.send(ch, data) | apm.request(ch, data, t?) | apm.request_stream(ch, hdr?, t?) | apm.on_channel = fn |
| PHP | $apm->send(ch, data) | $apm->request(ch, data, t?) | $apm->requestStream(ch, hdr?, t?) | $apm->onChannel = fn |
| Perl | $apm->send(ch, data) | $apm->request(ch, data, t?) | $apm->request_stream(ch, hdr?, t?) | $apm->{on_channel} = sub |
| Lua | apm:send(ch, data) | apm:request(ch, data, t?) | apm:request_stream(ch, hdr?, t?) | apm.on_channel = fn |
Node.js request and requestStream return Promises. All other languages block internally (pumping their event loop where applicable) until a reply arrives or the timeout fires, so they can be used from straight-line code without async plumbing.
Cross-language IPC walkthrough
This section is a worked end-to-end example with workers in two different languages talking through APM. Same wire protocol on both sides, no extra plumbing — the daemon is the router.
Scenario: Node.js HTTP front-end calls a Python pricing service
An HTTP request hits the Node.js worker; the handler asks a separately-running Python worker for the current price of a product via an IPC request; the Node.js worker returns the answer to the client. Each side runs as its own APM worker (so each can scale, restart, and rolling-deploy independently).
worker {
name web;
exec node;
params front.js;
path /opt/shop;
server :8080;
instances 4;
# no `listen` here — this worker is a *caller*, not a service
}
worker {
name pricing;
exec python3;
params pricing.py;
path /opt/shop;
listen "pricing"; # become the receiver for the `pricing` channel
ipc_timeout 500; # default reply window in ms
instances 2;
}
The listen "pricing" directive on the Python worker registers its instances as receivers for the pricing channel. When the Node.js side calls request('pricing', …), APM picks one running Python instance and routes the message; the first reply wins.
from apm_module import ApmModule
PRICES = {'sku-1': 9.99, 'sku-2': 14.50, 'sku-3': 99.00}
apm = ApmModule(lambda s: s.close())  # no HTTP, IPC only
def on_channel(channel, data, reply):
    # channel == 'pricing'; data is whatever the caller sent
    sku = data.get('sku')
    price = PRICES.get(sku)
    if reply is not None:
        reply({'sku': sku, 'price': price, 'currency': 'GBP'})
apm.on_channel = on_channel
apm.run()
const ApmModule = require('./apm_module.node.js')
const apm = new ApmModule(async (session) => {
const sku = session.query_object.sku || 'sku-1'
// Cross-language IPC: send to the Python `pricing` service and await reply.
const reply = await apm.request('pricing', { sku }, 500)
if (!reply || reply.price == null) {
session.write(JSON.stringify({ error: 'unknown sku' }), {
'x-status': '404', 'content-type': 'application/json',
})
} else {
session.write(JSON.stringify(reply), {
'content-type': 'application/json',
})
}
session.close()
})
$ apm load /etc/apm/apm.conf.d/shop.conf
$ curl 'http://localhost:8080/?sku=sku-2'
{"sku":"sku-2","price":14.5,"currency":"GBP"}
What happened: the Node.js instance handling the request issued a request('pricing', …, 500). APM looked up registered listeners for the pricing channel, picked one Python instance (round-robin across the 2 instances), and delivered the message. The Python instance's on_channel fired with a non-null reply, which it invoked with a dict; APM serialised it back to the Node.js caller as the resolved value of the await. Total round-trip is one stdin/stdout frame per direction — no sockets opened, no JSON-over-HTTP, no broker.
Scaling notes
- Pick-one routing for request: if you scale pricing to N instances, APM picks one per request. The first reply wins. Use this for stateless services where any instance can answer.
- Fan-out for send: apm.send('pricing', …) broadcasts to all listener instances and ignores replies. Use this for cache invalidation or background notifications.
- Streams for stateful pipes: if you need a long-lived connection between two specific workers (e.g. a Lua mediator coordinating between two PHP peers, or a Node.js dashboard collector pulling continuous data from a Python sampler), use requestStream / onStream instead; see the Streams section.
- Mix and match: the same Python service can be called from a Node.js front-end, a Perl batch job, and a PHP cron simultaneously. APM doesn't care what language the caller is. The wire format is identical.
Any of the five connectors (Node.js, Python, PHP, Perl, Lua) can be substituted for either side of the example above using the per-language snippets from the connectors section; only the syntax changes, not the wiring.
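The difference between pick-one routing and fan-out in the walkthrough can be seen in a toy router. The `ChannelRouter` class is hypothetical, and the round-robin order is modelled with a simple cycle; the daemon's actual selection logic may differ in detail.

```python
import itertools


class ChannelRouter:
    """Toy model: round-robin pick-one for request(), broadcast for send()."""

    def __init__(self, instances):
        self._instances = list(instances)  # callables standing in for listener instances
        self._next = itertools.cycle(range(len(self._instances)))

    def request(self, message):
        """Route to a single instance and return its reply (None if no listener)."""
        if not self._instances:
            return None
        return self._instances[next(self._next)](message)

    def send(self, message):
        """Deliver to every instance and ignore the replies."""
        for handler in self._instances:
            handler(message)
```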
Augur: startup-failure diagnostician (new in v2.1)
When a worker crashes inside the startup_grace window (default 2 s) on its first attempt, APM's augur module takes over. Augur does three things, in order:
- Classifies the stderr by language. It recognises Node.js, Python, Perl, PHP, Ruby, Java, Go, and Bash failure patterns (MODULE_NOT_FOUND, ModuleNotFoundError, Can't locate ... in @INC, Class 'X' not found, LoadError, ClassNotFoundException, etc.) and emits a single-line hint telling the user what to do next.
- Surfaces universal errno conditions: EADDRINUSE / "Address already in use", ECONNREFUSED / "Connection refused", EACCES / "Permission denied", matched in both symbolic and English forms so they fire across every runtime.
- Deep-scans the worker's dependency manifest and source to list every missing dependency in one go, so a freshly-migrated worker that's missing five npm packages tells you all five at once, instead of crash → install → restart → next crash → install → restart …
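The classification step can be pictured as a table of regular expressions mapped to hints. The sketch below covers only a handful of the patterns named above; the hint wording, the `PATTERNS` table, and the `classify` function are hypothetical and are not augur's actual rules.

```python
import re

# (pattern, hint template); {0} is the first capture group when present
PATTERNS = [
    (re.compile(r"Cannot find module '([^']+)'"),
     "missing node module {0}: run npm install {0}"),
    (re.compile(r"ModuleNotFoundError: No module named '([^']+)'"),
     "missing python module {0}: run pip install {0}"),
    (re.compile(r"Can't locate (\S+) in @INC"),
     "missing perl module {0}: install it with cpan"),
    (re.compile(r"EADDRINUSE|Address already in use"),
     "port already in use: stop the other process or change the port"),
    (re.compile(r"ECONNREFUSED|Connection refused"),
     "connection refused: check that the upstream service is running"),
    (re.compile(r"EACCES|Permission denied"),
     "permission denied: check file ownership and the worker's user"),
]


def classify(stderr):
    """Return one hint per matching pattern, in table order."""
    hints = []
    for pattern, template in PATTERNS:
        match = pattern.search(stderr)
        if match:
            hints.append(template.format(*match.groups()))
    return hints
```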
Languages covered
| Language | Manifest scanned | Source fallback | Install hint |
|---|---|---|---|
| Node.js | package.json (dependencies + devDependencies), walks node_modules up the parent chain to handle hoisting | regex-extract require() / import paths; probe via node -e require.resolve(...) | npm install <list> |
| Perl | n/a (no universal manifest) | extract use X; / require X; from the entry script; probe with one perl -e invocation that tries each | cpan <list> |
| Python | requirements.txt (skips comments & pinned versions) | regex-extract import X / from X import; probe with python3 -c "importlib.import_module(...)" | pip install <list> |
| PHP | composer.json require + require-dev keys; checks vendor/<name>/ | n/a (PHP's autoloader makes source-extract unreliable) | composer require <list> |
| Ruby | Gemfile.lock (preferred, authoritative) or Gemfile | probe via ruby -e "gem(name)" | gem install <list> |
| Java | n/a yet — only the classifier (ClassNotFoundException, NoClassDefFoundError, UnsupportedClassVersionError) | n/a yet | n/a |
| Go | n/a (imports are resolved at compile time) | n/a | n/a |
| Bash | n/a | classifier matches : command not found | n/a |
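For the Python row, a manifest scan along these lines would list every missing dependency at once. This is a sketch under assumptions: the function names are hypothetical, version specifiers are stripped rather than checked, and a distribution name is assumed to match its import name (which is not always true, e.g. PyYAML imports as yaml).

```python
import importlib.util
import re


def parse_requirements(text):
    """Extract bare package names, skipping comments, options, and version pins."""
    names = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or line.startswith("-"):   # blank, comment, or option such as -r
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


def missing_modules(names):
    """Return the names that cannot be resolved to an importable module."""
    missing = []
    for name in names:
        module = name.replace("-", "_")
        try:
            found = importlib.util.find_spec(module) is not None
        except (ImportError, ValueError):
            found = False
        if not found:
            missing.append(name)
    return missing
```

Joining the result gives a single install hint, for example `"pip install " + " ".join(missing)`.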
Reactive mode (default)
Augur runs automatically when a child exits with non-zero or a signal within the startup-grace window on its first attempt. No configuration required — the captured stderr is dumped to the requesting CLI, classifier hints follow, then the deep-scan adds the augur — also missing (N): … summary if a manifest scan found additional gaps.
Pre-flight mode (opt-in)
Set augur_full_scan true on a worker and augur runs the manifest + source scan before every fork — including watcher- and auto-restart-triggered ones. If anything is missing the launch is aborted with a clear message; the child binary never starts. This costs one or two cheap probe sub-processes per restart but eliminates the "child crashes, child restarts, crashes again" loop on broken environments.
$ apm restart Portal4
● Starting Portal4
- Starting child #1
✗ Portal4#1 failed during startup (exit 1, 515 ms)
── child stderr ──────────────────────────────────────────────
Error: Cannot find module 'redis'
Require stack:
- /WEBZ/portal/v4/server.node.js
...
code: 'MODULE_NOT_FOUND',
──────────────────────────────────────────────────────────────
→ missing node module redis — run npm install
→ augur — also missing (5): ws axios dotenv mysql2 sharp → cd /WEBZ/portal/v4 && npm install ws axios dotenv mysql2 sharp
✗ Worker Portal4 failed to start — check logs
The deep-scan probes run as the worker's target user (same setuid + login-PATH mechanism as the worker launch itself), in the worker's path directory, with a 3-second timeout each. They never touch global system state and never write anything.
What's new in v2.1.0
Major — Augur startup-failure diagnostician
APM now reads worker crash output and tells you what to fix. When a child exits inside the startup-grace window on its first attempt, augur classifies the stderr by language (Node.js, Python, Perl, PHP, Ruby, Java, Go, Bash), surfaces POSIX errno conditions in plain English, and scans the worker's package.json / requirements.txt / composer.json / Gemfile.lock — or the source itself — to list every missing dependency in a single line. The migration scenario that previously took five restart cycles ("install redis, restart, install ws, restart, install mqtt, …") now takes one. See the augur section.
Opt-in pre-flight mode (augur_full_scan true) runs the scan before every fork, so the worker binary doesn't even attempt to start when the environment is broken.
Startup UX hardening (carried from v2.0.10)
Pre-flight validation of worker.path and the resolved exec binary; failed launches no longer stick in ◆ starting in apm list; apm restart with no arguments no longer crashes the daemon; the first 8 KB of child stderr is captured and dumped to the requesting CLI when a fast exit happens. The new startup_grace knob controls the synchronous wait.
Reload-by-worker-name
apm reload <workerName> reloads just that worker from its origin conf file, without touching siblings declared in the same file. Useful when a multi-worker config holds several blocks and only one was edited.
Reload preserves log_time_format
A reload no longer silently wipes the per-line timestamp prefix when the conf block doesn't carry an explicit log_time_format — the daemon default is restored, matching the boot-time behaviour.
What's new in v2.0.0
Major — Inter-worker IPC
Workers can now communicate with each other through the daemon using two new primitives: channels (fire-and-forget and request/reply) and streams (persistent bidirectional pipes with mediated star topology). All routing happens inside the APM daemon over the existing stdin/stdout frame protocol — no new sockets, no new dependencies, no broker to run.
The feature is configured per worker with listen "channel_name"; and an optional ipc_timeout. All five official connectors (Node.js, Python, PHP, Perl, Lua) bumped to v3.0.0 with identical cross-language APIs. See the Inter-worker IPC section for the full guide.
Bug fix — listen / ipc_timeout dropped on initial config load
On first-time worker creation from a config file, addWorker translated config fields to CLI flags through workerParamMap, which was missing the two new IPC fields. Reloads picked them up correctly (they go through a separate path), but a cold boot never did. Both fields are now registered in the param map, so IPC works on first boot exactly as it does after a reload.
Improvement — apm info shows IPC config
apm info <worker> now displays an IPC section listing the configured listen channel and ipc_timeout (when set). This makes it easy to verify from the CLI that a worker is actually registered as a listener in the running daemon.
From v1.3.0 — still relevant
apm reload applies all worker fields live before the rolling restart, including watcher patterns, restart settings, TLS, session, and proxy flags. The file watcher is reopened in place when watch / watch_ignore / watch_delay change. A watch_ignore with multiple patterns is no longer silently dropped. Rapid file changes can no longer spawn duplicate WatcherRestart goroutines.