# APM — Advanced Process Manager for Linux (Full Manual)

> APM is Tom's Advanced Process Manager for Linux. One binary, copy and run — no config files required to get started. Complexity is APM's problem, not yours.

Source: https://processmanager.dev/manual.html

Version: v2.1.3

Agent skill: https://processmanager.dev/SKILLS_apm.md — a drop-in skill file for AI coding assistants. Install it as a Claude Code skill (`~/.claude/skills/apm/SKILL.md`) or attach it as context in Cursor and other LLM editors to give the assistant a condensed, task-oriented APM reference.

---

## Overview

APM runs as a background daemon and manages worker processes. You interact with it through the `apm` CLI. The daemon auto-starts the first time you run any `apm` command.

### Architecture

The daemon communicates with the CLI via an abstract Unix socket. Workers are child processes managed by the daemon. Each worker can have multiple parallel instances. The built-in reverse proxy routes incoming connections across instances using round-robin.

```
CLI ──(unix socket)── Daemon ── Worker [4 instances]
                             └── Worker [1 instance]
                             └── GUI server (port 6789)
```

### Philosophy

- Zero config to start — defaults are correct for 90% of cases
- CLI-first, config files for persistence and power users
- Every error message answers "what do I do now"
- Never crash on a bad optional field — warn and continue
- Linux only. Windows is not supported.

---

## Installation

Run the install script as root. It downloads the right binary for your architecture, sets up the system group and log file, installs the init service, and starts the daemon.
```sh
# One-liner install
$ curl -fsSL https://processmanager.dev/install.sh | sudo bash

# Or download first, review, then run
$ curl -fsSL https://processmanager.dev/install.sh -o install.sh
$ sudo bash install.sh

# Verify
$ apm --version
```

The installer sets up:

- `/usr/sbin/apm` — binary
- `/var/log/apm.log` — daemon log (group-readable by `apm`)
- `/etc/apm/apm.conf` — default config (created if absent)
- `apm` OS group — add users with `usermod -aG apm `
- Startup service (systemd, OpenRC, or SysV — auto-detected)

The daemon auto-loads `/etc/apm/apm.conf` at startup — no separate boot step required.

### Run as a systemd service

The installer registers APM with your init system automatically. The service runs the daemon at boot, supervised by systemd.

```sh
# Check status
$ systemctl status apm

# View logs
$ journalctl -u apm -f
```

### Uninstall

```sh
# Remove binary and config
$ sudo apm uninstall

# Remove everything including logs, group, service
$ sudo apm uninstall --purge
```

---

## Command Reference

All commands communicate with the running daemon. If no daemon is running, APM starts one automatically.

```
apm [command] [options]
```

The CLI does **not** require `sudo`. The daemon listens on an abstract Unix socket (`@apm`), which has no filesystem permissions — any local user can run `apm list`, `apm reload`, `apm restart`, `apm info`, etc. `sudo` is only needed for `apm install` / `apm uninstall` (which write to `/usr/sbin`, `/etc/apm`, and the init system), and for reading worker log *files* directly (`tail /path/.log`) when those files are owned by root. The `apm` OS group exists solely to grant read access to `/var/log/apm.log` — it is not required to use the CLI.

### Process commands

| Command | Description |
|---------|-------------|
| `apm run [args...] [flags]` | Create and immediately start a worker without a config file. All `--flag` options from Worker Options apply. Worker name defaults to the executable name; use `--name` to override. |
| `apm start ` | Start a registered worker that is currently stopped. |
| `apm stop ` | Gracefully stop a worker and all its instances. |
| `apm restart ` | Restart a worker (rolling if `rolling` was set). |
| `apm update [flags]` | Update a running worker's config and reload it. Accepts the same flags as `run`. Add `--no-restart` to apply the new config without restarting. |
| `apm list` | List all workers with status, instance count, CPU, memory, and uptime. |
| `apm remove ` | Stop and remove a worker. Alias: `apm rm`. |
| `apm stopall` | Stop all workers without stopping the daemon. |
| `apm rename ` | Rename a worker. |
| `apm copy ` | Duplicate a worker under a new name. |

### Inspection commands

| Command | Description |
|---------|-------------|
| `apm list` | List all workers with status, instance count, CPU, memory, and uptime. Alias: `apm ls`. |
| `apm info ` | Show full configuration and live state for one worker, including its IPC `listen` channel and `ipc_timeout`. |
| `apm log ` | Stream a worker's stdout / stderr log. |
| `apm grep [name]` | Search worker logs for a pattern. |
| `apm env ` | Print the environment a worker's children run with. |
| `apm wait ` | Block until the worker reaches running state — useful in scripts. |
| `apm monitor` | Live terminal dashboard — system CPU, RAM, load average, uptime, and per-worker/instance status, CPU%, memory, restart counts. Updates every second; Ctrl+C to exit. |

### Config commands

| Command | Description |
|---------|-------------|
| `apm boot` | Load `/etc/apm/apm.conf` into the running daemon. Called automatically by startup scripts after the daemon starts. Safe to run manually — skips workers already running. |
| `apm load ` | Load a config file and start all workers defined in it. Workers already running are skipped. |
| `apm unload ` | Stop and remove every worker that was loaded from the given config file. |
| `apm reload [--force]` | Smart reload: diff the config against running workers, start new ones, restart changed ones (all config fields synced live — watcher patterns, TLS, rolling settings, proxy flags, etc.), stop removed ones. `--force` restarts unchanged workers too. |
| `apm saveconf` | Write all workers back to their source config files (the file they were loaded from). Workers started via `run` without a file prompt for one. |
| `apm saveconf ` | Save a specific worker to a file and set that as its config file going forward. |
| `apm check ` | Validate a config file — checks syntax and reports which workers it would create or reload, without applying anything. |

### GUI & daemon commands

| Command | Description |
|---------|-------------|
| `apm gui` | Start the web GUI and print its URL. No-op if already running. |
| `apm gui stop` | Stop the GUI server. |
| `apm exit daemon` | Stop all workers and shut down the daemon. |
| `apm install` | Install APM to `/usr/sbin/apm` with group, log, and service setup. Requires root. |
| `apm uninstall [--purge]` | Remove APM from the system. `--purge` removes logs, config, and service files. |
| `apm -v / --version` | Print CLI and daemon version. |
| `apm -h / --help [--full]` | Show command help. `--full` includes all commands. |

### Signals

Any signal name can be used as a command to forward that signal to a worker's child processes — all instances, or one by index:

```sh
$ apm SIGHUP myworker      # send to all instances
$ apm SIGUSR1 myworker#2   # send to instance index 2 only
```

### Run flags

Flags for `apm run` and `apm update`. The same options are available as config file fields (see Worker Options).
```sh
# Start a worker from the CLI — no config file needed
# --restart: restart on clean exit; --rolling: rolling restart mode
$ apm run node server.js \
    --name myapp \
    --instances 4 \
    --server http://0.0.0.0:3000 \
    --watch "*.js" \
    --restart \
    --rolling

# Update a running worker's instance count without restarting
$ apm update myapp --instances 8 --no-restart

# Save it back to a conf file
$ apm saveconf myapp /etc/apm/apm.conf.d/myapp.conf
```

Flags accept either `-flag` or `--flag`. Boolean flags (`--restart`) take no value.

---

## Config File

Config files define workers and daemon settings. They're loaded with `apm load` or `apm reload`. The system config path is `/etc/apm/apm.conf`.

### Syntax

- Key-value pairs end with `;`
- Blocks use `{ }`
- Comments: `#` or `//` to end of line
- Strings: unquoted, or single/double/backtick quoted (quotes are stripped)
- Multiple values: comma-separated on one line, or repeat the key
- The `:` suffix on keys is optional
- `include ;` inlines another file at parse position

```
# Simple worker
worker {
  name myapp;
  exec node;
  params server.js;
  instances 4;
  restart true;
  watch *.js;
  server http://0.0.0.0:3000;
}
```

### Config hierarchy

APM's startup scripts call `apm boot` after the daemon starts, which loads `/etc/apm/apm.conf`. The main config is typically structured as:

1. `/etc/apm/apm.conf` — main config (daemon block + includes)
2. `/etc/apm/apm.conf.d/*.conf` — drop-in worker configs, sorted by filename

You can also load configs manually at any time with `apm load ` or do a live diff with `apm reload `.

```
daemon {
  gui_port 6789;
}

# Load drop-in worker configs
include apm.conf.d/*.conf;

worker {
  name portal;
  exec node;
  params app.js;
}
```

Include paths are relative to the including file's directory. Glob patterns are supported. Circular includes are detected and rejected.
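The include rules above (paths relative to the including file's directory, glob matches visited in filename order, circular includes rejected) can be sketched in a few lines of Python. This illustrates the described behavior under stated assumptions and is not APM's actual parser: `resolve_includes`, `IncludeCycleError`, and the in-memory `files` mapping that stands in for the filesystem are all hypothetical names.

```python
import fnmatch
import posixpath


class IncludeCycleError(Exception):
    """Raised when a config file ends up including itself (hypothetical name)."""


def resolve_includes(path, files, _stack=()):
    """Return the config paths visited, depth-first, starting at `path`.

    `files` maps an absolute path to the list of include patterns found in
    that file. It replaces real file reads so the sketch is self-contained.
    """
    if path in _stack:
        chain = " -> ".join(_stack + (path,))
        raise IncludeCycleError(f"circular include: {chain}")

    order = [path]
    base_dir = posixpath.dirname(path)
    for pattern in files.get(path, []):
        # Include paths are relative to the including file's directory.
        full_pattern = posixpath.normpath(posixpath.join(base_dir, pattern))
        # Expand the glob against known files and visit matches in sorted order.
        matches = sorted(p for p in files if fnmatch.fnmatchcase(p, full_pattern))
        for match in matches:
            order.extend(resolve_includes(match, files, _stack + (path,)))
    return order


files = {
    "/etc/apm/apm.conf": ["apm.conf.d/*.conf"],
    "/etc/apm/apm.conf.d/20-web.conf": [],
    "/etc/apm/apm.conf.d/10-api.conf": [],
}
print(resolve_includes("/etc/apm/apm.conf", files))
```

Tracking the chain of files currently being parsed, rather than every file seen so far, lets the same file be included from two unrelated places while still catching a genuine cycle.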
### Multiple values

```
# Comma-separated on one line
server http://0.0.0.0:3000, ws://0.0.0.0:3001;

# Or repeat the key
ban_path *.php;
ban_path *wp-*;
ban_path *.env;
```

---

## Worker Options

All options are available both as CLI flags to `apm run` / `apm update` and as fields in a `worker { }` config block.

### Identity

| Field | Default | Description |
|-------|---------|-------------|
| `name` | exec name | Worker name. Used in all CLI output and log prefixes. |
| `exec` | required | Executable to run (looked up in PATH). |
| `params` | | Arguments passed to the executable. Multiple values supported. |
| `path` | cwd | Working directory for the child process. Env vars expanded. |
| `instances` | 1 | Number of parallel child processes to run. |
| `user` | | Run child processes as this OS user. Daemon must run as root. |

### Environment

| Field | Default | Description |
|-------|---------|-------------|
| `env` | | Inject environment variables. Format: `KEY=value`. Multiple values supported. |
| `env_index` | | Inject the instance index as an env var. Specify the variable name. |
| `env_file` | | Path to a `KEY=VALUE` file. Read by APM before the setuid drop; child inherits the env. |

### Restart on clean exit

| Field | Default | Description |
|-------|---------|-------------|
| `restart` | false | Restart the process when it exits with code 0. |
| `restart_delay` | 250 | Milliseconds to wait before restarting after a clean exit. |
| `max_restarts` | 0 | Maximum clean-exit restarts. 0 = unlimited. |

### Restart on error exit

| Field | Default | Description |
|-------|---------|-------------|
| `restart_err` | false | Restart the process when it exits with a non-zero code. |
| `err_delay` | 500 | Milliseconds to wait before restarting after an error exit. |
| `max_err_restarts` | 0 | Maximum error-exit restarts. 0 = unlimited. |
| `err_grace` | | Milliseconds of uptime required before a restart counts against the limit. |
| `restart_on_exit_codes` | | Comma-separated exit codes. When set, the process restarts *only* on these codes — overrides both `restart` and `restart_err`. |

### Shutdown

| Field | Default | Description |
|-------|---------|-------------|
| `kill_timeout` | 2000 | Milliseconds to wait for graceful shutdown (SIGTERM) before sending SIGKILL. |

### Startup dependencies

| Field | Default | Description |
|-------|---------|-------------|
| `depends_on` | | Comma-separated worker names that must be running before this worker starts. |
| `depends_timeout` | 30000 | Milliseconds to wait for dependencies. After the timeout the worker starts anyway, with a warning. |

### Health check

APM probes worker health two ways. **Pull mode** — set `health_check` to a URL; APM sends periodic HTTP GETs (2xx/3xx = healthy). **Push mode** — set `health_check` to `on`; APM injects `APM_HEALTH_URL` into the child, which calls it to report in. Health status shows in `apm list` and the GUI.

| Field | Default | Description |
|-------|---------|-------------|
| `health_check` | | A URL to probe (pull mode), or `on` (push mode). Empty = disabled. |
| `health_check_interval` | 5000 | Milliseconds between probes. |
| `health_check_timeout` | 3000 | Milliseconds to wait for a single pull-mode probe response. |
| `health_check_threshold` | 3 | Consecutive failures before the worker is marked unhealthy. |

### Connection drain

| Field | Default | Description |
|-------|---------|-------------|
| `drain_timeout` | 0 | Milliseconds to let active connections finish before a stop/restart. New connections are refused during the drain. 0 = stop immediately. |

### Memory limit

| Field | Default | Description |
|-------|---------|-------------|
| `memory_limit` | | Per-child memory cap enforced via Linux cgroups v2 (e.g. `256M`, `1G`). The kernel OOM-kills a child that exceeds it; APM then applies the restart policy. Requires cgroups v2 and a root daemon. |

### Logging

| Field | Default | Description |
|-------|---------|-------------|
| `log` | | Path to the stdout log file. |
| `err_log` | same as `log` | Path to the stderr log file. Defaults to the same file as `log`. |
| `prefix` | name | String prepended to each log line. |
| `log_time_format` | | Timestamp format for log lines. |
| `strip_ansi` | false | Strip ANSI escape codes from log output. |
| `syslog` | | Forward logs to syslog. Value is the destination (e.g. `syslog://localhost:514`). |
| `syslog_tag` | | Tag for syslog messages. |
| `log_max_size` | | Rotate the log file when it exceeds this size (e.g. `10M`, `1G`). Empty = no rotation. |
| `log_max_files` | 5 | Number of rotated log files to keep. |

### Proxy / HTTP

| Field | Default | Description |
|-------|---------|-------------|
| `server` | | Bind address for the proxy server. See Server Types. |
| `lowercase_hdrs` | false | Lowercase all HTTP header names before forwarding to the child. |
| `trust_proxy` | true | Trust `X-Forwarded-For` / `X-Real-IP` headers for client IP resolution. **On by default.** Set `trust_proxy false;` when APM is directly internet-facing, so spoofed headers can't fool Vanguard's rate-limit and ban decisions. |
| `keep_alive` | 120000 | HTTP keep-alive idle timeout in milliseconds. |
| `max_conns` | 0 | Maximum concurrent connections per server. 0 = unlimited. |
| `trace_header` | | When set, APM injects this header (e.g. `x-request-id`) with a unique per-request ID into every forwarded request. |
| `session_persist` | false | Persist session state across rolling restarts. |
| `session_wait` | 5000 | Milliseconds to wait for a new instance to accept a migrated session. |

### File watcher

| Field | Default | Description |
|-------|---------|-------------|
| `watch` | | Comma-separated glob patterns of file paths to watch. Restarts the worker when any match changes. See File Watcher for pattern syntax. |
| `watch_ignore` | | Comma-separated glob patterns of paths to exclude from watching. |
| `watch_delay` | 200 | Debounce delay in milliseconds before triggering a restart. |
| `watch_conf` | false | Auto-reload this worker when its own source config file changes on disk. |

### Rolling restart

| Field | Default | Description |
|-------|---------|-------------|
| `rolling` | false | Enable rolling restart mode (one instance at a time). |
| `rolling_delay` | 1000 | Milliseconds between restarting each instance. |

### Stats

| Field | Default | Description |
|-------|---------|-------------|
| `stats_interval` | | Interval in milliseconds between stats collection cycles. |

### Inter-worker IPC

| Field | Default | Description |
|-------|---------|-------------|
| `listen` | | Channel name this worker listens on for inter-worker IPC. Every running child receives channel messages and stream requests. Omit to disable. |
| `ipc_timeout` | 500 | Default timeout in milliseconds for `request()` calls. Per-call timeouts passed to `request()` override this value. |

### Crash webhook (`on_crash`)

APM can POST a JSON payload (or send a GET request) to a URL of your choice whenever a child process crashes — i.e. exits with a non-zero code or is killed by a signal. Intentional stops (`apm stop`) are never reported.

```
worker {
  name myapp;
  exec node;
  params server.js;

  on_crash {
    url https://hooks.example.com/apm-crash;
    method POST;      # POST (default) or GET
    debounce 10000;   # min ms between calls (floor: 5000)
    log_lines 20;     # tail lines to include in payload
    log_source err;   # "err" (default) or "out"
    secret mysecret;  # signs payload with HMAC-SHA256
  }
}
```

| Field | Default | Description |
|-------|---------|-------------|
| `url` | | Destination URL. Required — the block is ignored without it. |
| `method` | POST | HTTP method. `POST` sends a JSON body; `GET` sends no body. |
| `debounce` | 5000 | Minimum milliseconds between webhook calls per worker. Minimum enforced value is 5000 — prevents flooding during a crash loop. |
| `log_lines` | 0 | Number of trailing lines from the log file to include in the `log` field of the payload. 0 = omit. |
| `log_source` | err | Which log to tail: `err` (stderr log) or `out` (stdout log). |
| `secret` | | When set, APM signs the raw POST body with HMAC-SHA256 and sends the result in the `X-APM-Signature: sha256=…` header. |

Request headers:

| Header | Value |
|--------|-------|
| `X-APM-Worker` | Worker name |
| `X-APM-Event` | `crash` |
| `X-APM-Signature` | `sha256=` — only present when `secret` is set |

POST payload:

```json
{
  "worker": "myapp",
  "instance": 1,
  "exit_code": 1,
  "exit_signal": "SIGKILL",   // omitted if process exited normally
  "runtime_ms": 4821,
  "error_restarts": 3,
  "timestamp": "2025-06-01T12:00:00Z",
  "log": "Error: cannot connect to DB\n..."   // omitted if log_lines = 0
}
```

---

## Daemon Config

Global APM settings live in a top-level `daemon { }` block. No daemon block = all defaults. Zero config still works.

```
daemon {
  gui_port 6789;         # web GUI port — 0 disables the GUI
  gui_bind 127.0.0.1;    # bind address — default is 0.0.0.0 (all interfaces)
  gui_password secret;   # GUI login password
  auto_reload true;      # reload config files when they change on disk
}

# telemetry is a TOP-LEVEL key — not inside daemon { }
telemetry false;         # opt out of the anonymous usage ping
```

| Field | Default | Description |
|-------|---------|-------------|
| `gui_port` | 6789 | Port for the web GUI. The GUI starts automatically when this is set in the daemon block. Set to 0 (or omit) to disable — `apm gui` can still start it on demand. |
| `gui_bind` | 0.0.0.0 | Address the GUI binds to. Defaults to `0.0.0.0` — **all interfaces**. Set `127.0.0.1` to restrict it to localhost, or set a `gui_password` before exposing it on a network. |
| `gui_password` | | Password for GUI access. When empty the GUI is served without authentication — only safe on a localhost bind. |
| `auto_reload` | false | When true, the daemon watches the config files it loaded and reloads them automatically when they change on disk. |

**`telemetry`** is a *top-level* config key, not a field inside `daemon { }`. APM sends an anonymous hourly ping — APM version, worker count, OS, and hardware class; no names, paths, or IPs. Opt out by placing `telemetry false;` at the top level of the config file.

---

## Server Types

APM's built-in proxy accepts connections and forwards them to worker instances via IPC. Specify servers with the `server` field. Multiple servers per worker are supported.

| Scheme | Description |
|--------|-------------|
| `http://` | HTTP reverse proxy. APM parses request headers and forwards the full request to a child instance. |
| `ws://` | WebSocket proxy. Handles the upgrade handshake; bidirectional frame forwarding to child. |
| `tcp://` | Raw TCP proxy. Bytes forwarded as-is. Use for databases, game servers, custom protocols. |

```
worker {
  name api;
  exec node;
  params api.js;

  # HTTP and WebSocket on separate ports
  server http://0.0.0.0:3000;
  server ws://0.0.0.0:3001;

  # Or combined on one line
  server http://0.0.0.0:3000, ws://0.0.0.0:3001;
}
```

### Client IP resolution

When APM sits behind a CDN or reverse proxy (e.g. nginx), keep `trust_proxy` enabled (it is on by default) so APM resolves the real client IP from `X-Forwarded-For` headers. This affects Vanguard rate limiting and ban decisions.

```
trust_proxy true;
```

---

## Vanguard (Request Firewall)

Vanguard is APM's built-in request firewall. It runs before worker IPC — rejected connections never reach your app. Configure it with a `vanguard { }` sub-block inside a worker.
```
worker {
  name api;
  exec node;
  params api.js;
  server http://0.0.0.0:3000;

  vanguard {
    rate_limit 100;    # requests/sec per IP
    rate_burst 200;    # burst capacity
    ban_ttl 300000;    # auto-ban for 5 minutes
    ban_path *.php, *wp-*, *.env, /.git*;
    ban_response Forbidden;
  }
}
```

### IP filtering

| Field | Description |
|-------|-------------|
| `allow_ip` | CIDR allowlist. Only matching IPs are allowed. Multiple values supported. |
| `ban_ip` | CIDR blocklist. Matching IPs are rejected immediately (silent RST for TCP, 403 for HTTP). |

### Path banning

| Field | Description |
|-------|-------------|
| `ban_path` | Comma-separated pattern list. Matched against the request path (query string stripped). Same four modes as the file watcher: `*.ext` ends-with, `prefix*` starts-with, `*word*` contains, `exact` exact match. |
| `ban_response` | HTTP response body for blocked requests. Default: `Forbidden`. |

```
ban_path *.php;          # ends-with — block all .php requests
ban_path *wp-*;          # contains — block WordPress probes
ban_path /.git*;         # starts-with — block .git exposure
ban_path *.env;          # ends-with — block .env file reads
ban_path /admin/login;   # exact match — block a specific path
```

### Method filtering

| Field | Description |
|-------|-------------|
| `allow_method` | HTTP method allowlist. If set, only listed methods reach the worker; everything else returns `405 Method Not Allowed`. Repeatable. Case-insensitive (normalized to uppercase). |
| `ban_method` | HTTP method blocklist. Listed methods are rejected with `405`. Repeatable. Evaluated before `allow_method`. Skipped for raw TCP workers. |

```
vanguard {
  allow_method GET;
  allow_method POST;   # everything else → 405

  # or, blocklist style:
  ban_method TRACE;
  ban_method OPTIONS;
}
```

### Rate limiting

| Field | Description |
|-------|-------------|
| `rate_limit` | Token bucket rate in requests per second per real client IP. |
| `rate_burst` | Burst capacity. Defaults to `rate_limit` if not set. |
| `ban_ttl` | Milliseconds to auto-ban an IP after rate limit is exceeded. 0 = soft block (no ban, just drop). |

Rate-limited requests receive `429 Too Many Requests`. Path/IP bans return `403 Forbidden` (or silent TCP RST).

### Logging

| Field | Description |
|-------|-------------|
| `log` | Per-event block log lines. Set to `off` (also accepts `false`/`no`/`0`) to suppress. Default: `on`. |
| `log_summary` | Interval in seconds. When set, vanguard counts every block/drop and emits one summary line per interval — e.g. `in last 60s, 412 events`. Silent intervals are skipped. Default: `0` (disabled). Independent of `log`. |

```
vanguard {
  rate_limit 200;
  ban_path *.php, *.env;
  log off;          # silence per-event lines under sustained probing
  log_summary 60;   # one aggregate line per minute instead
}
```

### CDN IP lists

APM's installer fetches Cloudflare's published egress IP ranges and writes them to `/etc/apm/ips/` as ready-to-include partial configs. Include them inside a `vanguard { }` block to restrict direct access to CDN traffic only.

```
vanguard {
  # Only allow Cloudflare egress IPs (IPv4 + IPv6)
  include /etc/apm/ips/cloudflare-v4.part;
  include /etc/apm/ips/cloudflare-v6.part;

  rate_limit 500;
  ban_path *.php, *wp-*, *.env, /.git*;
}
```

The IP lists are re-fetched automatically on every `apm install` or upgrade. To refresh them manually: `sudo apm install`.

Tip: Combine `allow_ip` with CDN IP files to drop all non-CDN connections at the TCP level — before any HTTP parsing happens and before your app sees the request.

---

## TLS

APM has first-class TLS support for all server types — HTTP, WebSocket, and TCP. Bring your own certificates.
```
worker {
  name api;
  exec node;
  params api.js;
  server https://0.0.0.0:443;

  tls true;
  tls_cert /etc/ssl/certs/myapp.crt;
  tls_key /etc/ssl/private/myapp.key;

  # tls_ca for mutual TLS (client cert verification)
  tls_ca /etc/ssl/certs/ca.crt;
}
```

| Field | Description |
|-------|-------------|
| `tls` | Enable TLS on all server listeners for this worker. |
| `tls_cert` | Path to the TLS certificate file (PEM). |
| `tls_key` | Path to the private key file (PEM). |
| `tls_ca` | Path to the CA certificate for mutual TLS. If set, client certificates are required and verified against this CA. |

Testing without nginx: use TLS directly on APM to test HTTPS/WSS locally. For production, APM + nginx is the typical setup where nginx handles TLS termination.

---

## File Watcher

The file watcher monitors your source directory and triggers a worker restart when matching files change. Uses kernel file-watch events (inotify) — no polling.

```
worker {
  name api;
  exec node;
  params server.js;
  path /home/user/myapp;

  watch *.js, *.json;             # watch .js and .json files
  watch_ignore *node_modules*;    # ignore anything inside node_modules
  watch_delay 200;                # 200ms debounce
}
```

| Field | Description |
|-------|-------------|
| `watch` | Comma-separated pattern list. Matched against the full path of each changed file. Worker restarts when any pattern matches. |
| `watch_ignore` | Comma-separated pattern list. Paths matching any of these are excluded from watch events even if they also match `watch`. |
| `watch_delay` | Debounce delay in milliseconds. Multiple rapid changes are batched into one restart. |

### Pattern syntax

Watch patterns use a simple glob-style syntax — no regex needed.
Four matching modes:

| Pattern | Mode | Example | Matches |
|---------|------|---------|---------|
| `*.ext` | ends-with | `*.js` | Any file ending in `.js` |
| `prefix*` | starts-with | `src/*` | Any path starting with `src/` |
| `*word*` | contains | `*node_modules*` | Any path containing `node_modules` |
| `exact` | exact match | `config.json` | Only that exact filename |

```
# Go source files, excluding generated code and vendor
watch *.go;
watch_ignore *_generated.go, *vendor*;

# JS/TS project — watch src/, ignore build output and deps
watch *.js, *.ts, *.json;
watch_ignore *node_modules*, *dist/*;

# Python — any .py file anywhere under path
watch *.py;
watch_ignore *__pycache__*;
```

Tip: Keep `watch_delay` at 100–300 ms. Build tools often write multiple files in quick succession; the debounce ensures only one restart fires per save.

---

## Rolling Restart

Rolling restarts cycle through instances one at a time, keeping the rest running to serve traffic. Zero downtime for multi-instance workers.

```
worker {
  instances 4;
  rolling true;
  rolling_delay 1000;   # 1s between each instance restart
}
```

With `session_persist true`, open connections are migrated to a new instance before the old one is killed. Use `session_wait` to control how long APM waits for the new instance to become ready.

```
rolling true;
rolling_delay 500;
session_persist true;
session_wait 2000;   # wait up to 2s for new instance
```

---

## Logger

APM has a built-in logger for each worker. Every line written to a child process's stdout or stderr is intercepted, prefixed with a timestamp and worker name, and written to the configured destination. Coloring is applied by APM before writing — use `strip_ansi` to remove it when logging to files.

### Destinations

| Field | Default | Description |
|-------|---------|-------------|
| `log` | | File path for stdout. If omitted, output goes to the daemon log. |
| `err_log` | same as log | File path for stderr. Defaults to the same file as `log` when not set. |
| `syslog` | | Syslog destination URL, e.g. `syslog://localhost:514`. ANSI is always stripped for syslog regardless of `strip_ansi`. |
| `syslog_tag` | | Tag string attached to every syslog message for this worker. |

### Prefix

Each log line is prefixed with the worker name (or a custom string). The `prefix` field supports the color syntax described below. APM automatically appends the instance number in multi-instance workers.

| Field | Default | Description |
|-------|---------|-------------|
| `prefix` | name | String prepended to every log line. Supports `çN-` color escapes. The instance index is appended automatically for multi-instance workers. |

```
worker {
  name api;
  exec node;
  params server.js;

  # cyan name, reset after — instance # is appended automatically
  prefix ç51-api-serverçR-;

  log /var/log/myapp/out.log;
  err_log /var/log/myapp/err.log;
}
```

For a worker with `instances 3`, the stdout prefix becomes `api-server#1`, `api-server#2`, `api-server#3` — each in a distinct color so instances are visually distinct in the live GUI and in log files.

### Timestamp format

The timestamp prepended to each line is controlled by `log_time_format`. The format string uses strftime-style tokens and supports color escapes. The default is `ç214-%Y-%m-%d %Tç59-.%FçR-` (orange date, dim fractional seconds).

| Field | Default | Description |
|-------|---------|-------------|
| `log_time_format` | `ç214-%Y-%m-%d %Tç59-.%FçR-` | Timestamp format. Supports strftime tokens and color escapes. |

Strftime tokens:

| Token | Output |
|-------|--------|
| `%Y` | 4-digit year — `2026` |
| `%y` | 2-digit year — `26` |
| `%m` | Month, zero-padded — `03` |
| `%d` | Day, zero-padded — `07` |
| `%H` | Hour 24h, zero-padded — `14` |
| `%M` | Minute, zero-padded — `05` |
| `%S` | Second, zero-padded — `09` |
| `%T` | Shorthand for `%H:%M:%S` |
| `%F` | Fractional seconds (microseconds) |

```
# Default — orange date, dim microseconds
log_time_format ç214-%Y-%m-%d %Tç59-.%FçR-;

# Compact — just HH:MM:SS in gray
log_time_format ç59-%TçR-;

# No color — plain ISO timestamp
log_time_format %Y-%m-%d %T;
```

### Strip ANSI

| Field | Default | Description |
|-------|---------|-------------|
| `strip_ansi` | false | Strip ANSI color codes from all log output before writing to the file. Useful when you want clean logs on disk but colored output in the GUI. Always on for syslog destinations. |

Tip: Keep `strip_ansi false` for local development (colors in the GUI look great), and set it to `true` in production log files so tools like `grep`, `awk`, and log shippers see clean text.

### Color syntax — `çN-`

APM uses a compact color escape based on the 256-color terminal palette. The `ç` character (U+00E7) acts as the escape marker. This syntax works in `prefix`, `log_time_format`, and anywhere APM renders text to the terminal or log files.

| Syntax | ANSI equivalent | Description |
|--------|-----------------|-------------|
| `çN-` | `\033[38;5;Nm` | Set foreground to 256-color palette index N (0–255). |
| `çN,BG-` | `\033[38;5;N;48;5;BGm` | Foreground N, background BG. |
| `çN,BG,ATTR-` | `\033[38;5;N;48;5;BG;ATTRm` | Foreground, background, and an SGR attribute (1 bold, 2 dim, 4 underline, 9 strikethrough). |
| `çR-` | `\033[0m` | Reset all formatting. |

```
# Foreground only
prefix ç82-myappçR-;        # bright green name
prefix ç196-myappçR-;       # bright red name
prefix ç214-myappçR-;       # orange name

# Foreground + background
prefix ç15,88-ERRORçR-;     # white text on dark red background

# Bold foreground
prefix ç51,0,1-myappçR-;    # bold cyan
```

Useful color reference:

| Code | Approximate color |
|------|-------------------|
| `ç1-` | Dark red |
| `ç2-` | Dark green |
| `ç6-` | Cyan |
| `ç51-` | Bright cyan |
| `ç80-` | Green |
| `ç82-` | Bright green |
| `ç88-` | Dark red |
| `ç124-` | Medium red |
| `ç165-` | Magenta |
| `ç196-` | Bright red |
| `ç202-` | Orange-red |
| `ç208-` | Orange |
| `ç214-` | Amber / warm orange |
| `ç244-` | Mid gray |
| `ç59-` | Dark gray |
| `ç15-` | White |
| `çR-` | Reset |

256-color palette: any value from 0 to 255 is valid — use any standard xterm-256 chart to pick colors. The codes listed above are the ones used by APM's own output; they work well in most terminal themes.

---

## StatsD

APM can forward worker metrics to any StatsD-compatible endpoint (StatsD, Graphite, Datadog Agent, Telegraf) over UDP. Add a `statsd { }` block to a worker.

```
worker {
  name api;
  exec node;
  params server.js;

  statsd {
    host localhost:8125;   # StatsD UDP endpoint
    prefix apm.api;        # metric namespace
    interval 1;            # flush interval, seconds
  }
}
```

| Field | Default | Description |
|-------|---------|-------------|
| `host` | required | StatsD UDP endpoint as `host:port`. The block is inactive without it. |
| `prefix` | apm. | Namespace prefixed to every metric. Non-alphanumeric characters in the worker name are replaced with `_`. |
| `interval` | 1 | Flush interval in seconds. Metrics are batched into UDP packets kept under 1400 bytes. |

**System metrics** are forwarded automatically every interval, as gauges: `.cpu` (%), `.rss` (bytes), `.instances`, `.active_conns`, `.total_conns`, `.restarts.normal` / `.error` / `.watch`, `.errors`.
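For readers wiring up a collector, the sketch below shows roughly what the system metrics above could look like on the wire. It uses the standard StatsD line format (`name:value|g` for a gauge) and packs lines into newline-separated datagrams kept under 1400 bytes, matching the `interval` description. The exact metric names APM emits are an assumption here, and the helpers `gauge_lines`, `pack_datagrams`, and `send` are hypothetical names, not part of APM.

```python
import socket

MAX_DATAGRAM = 1400  # bytes, per the `interval` field description


def gauge_lines(prefix, values):
    """Format a dict of metric suffix -> number as StatsD gauge lines."""
    return [f"{prefix}.{name}:{value}|g" for name, value in values.items()]


def pack_datagrams(lines, limit=MAX_DATAGRAM):
    """Group lines into newline-joined payloads, each under `limit` bytes."""
    datagrams, current, size = [], [], 0
    for line in lines:
        encoded = line.encode("utf-8")
        # +1 accounts for the newline separator when the batch is non-empty.
        extra = len(encoded) + (1 if current else 0)
        if current and size + extra >= limit:
            # Adding this line would reach the limit: flush and start over.
            datagrams.append(b"\n".join(current))
            current, size = [], 0
            extra = len(encoded)
        current.append(encoded)
        size += extra
    if current:
        datagrams.append(b"\n".join(current))
    return datagrams


def send(datagrams, host="localhost", port=8125):
    """Fire-and-forget UDP send, one datagram per sendto() call."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for payload in datagrams:
            sock.sendto(payload, (host, port))


# Sample values are made up for illustration.
sample = {"cpu": 12.5, "rss": 104857600, "instances": 4, "restarts.error": 1}
lines = gauge_lines("apm.api", sample)
packets = pack_datagrams(lines)
print(packets[0].decode())
```

Calling `send(packets)` would deliver the batch to a local StatsD listener on port 8125, the endpoint used in the config example above.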
**Custom metrics** emitted from worker code with `apm.metric(name, value, type)` are aggregated across all of the worker's instances and forwarded with the same prefix — counters summed (`|c`), timings averaged (`|ms`), gauges last-value-wins (`|g`). --- ## Web GUI APM ships a built-in real-time web dashboard. Enable it with `gui_port` in the daemon config: ``` daemon { gui_port 6789; # GUI port — omit or set 0 to disable gui_bind 127.0.0.1; # default is 0.0.0.0 — set 127.0.0.1 to keep it local } ``` When the daemon starts with the GUI enabled it prints the access URL: ``` $ apm start GUI: http://127.0.0.1:6789/ ``` Control the GUI from the CLI: `apm gui` starts it and prints the URL; `apm gui stop` stops it. ### Views | Tab | Description | |-----|-------------| | Workers | Live table of all workers and instances — status, uptime, CPU sparkline, CPU%, RAM, restart count, error count. Per-worker stop / start / restart / reload-config buttons. | | Dashboard | Custom metric panels (LED, counter, text, graph, gauge, heatmap) defined in the worker's `dashboard { }` config block. Workers without a dashboard block show a placeholder. | | Live Logs | Per-worker log stream replayed from a 200-line ring buffer on connect, then live. Includes both stdout and stderr. Clear, download, and pause controls. | | Server Info | CPU model, thread count, speed, RAM, OS, kernel, architecture, uptime, network interfaces (IP, MAC, speed, RX/TX totals), load averages, and latency probes. | ### Server Info — latency The Server Info page has a Latency section with two cards: | Card | How it works | Interval | |------|--------------|----------| | Server → Internet | TCP connect time to Google (8.8.8.8:443), Cloudflare (1.1.1.1:443), and Quad9 (9.9.9.9:443). Measures server-side outbound connectivity. | On load, then every 30 s | | Browser → Server | WebSocket round-trip time. The browser sends a ping frame; the server echoes it; the browser measures elapsed time.
| On load, then every 30 s | ### Disconnect behaviour When the daemon stops, the WebSocket closes and the GUI immediately dims with a *Session Ended* overlay. Click *Refresh Page* to reconnect. --- ## Dashboard Each worker can expose a custom metric dashboard in the GUI. Define a `dashboard { }` block inside a worker config to create one. The dashboard is shown in the *Dashboard* tab when that worker is selected. ``` worker my-api { exec node; params server.js; server http://127.0.0.1:3000; dashboard { name My API; cols 6; rows 4; module { type graph; id 1; name Requests/sec; x 0; y 0; w 3; h 1; } module { type gauge; id 2; name CPU %; x 3; y 0; w 1; h 1; min 0; max 100; unit %; } module { type counter; id 3; name Total errors; x 4; y 0; w 1; h 1; } module { type text; id 4; name Last error; x 5; y 0; w 1; h 1; } } } ``` ### Dashboard block fields | Field | Default | Description | |-------|---------|-------------| | `name` | worker name | Tab label shown in the GUI. | | `cols` | 6 | Number of grid columns. | | `rows` | 3 | Number of grid rows. | ### Module fields | Field | Required | Description | |-------|----------|-------------| | `type` | yes | Module type: `led`, `counter`, `text`, `graph`, `gauge`, `heatmap`. | | `id` | yes | Integer ID. Must be unique within the dashboard. Used to route metrics from code to the right module. | | `x`, `y` | yes | Grid position (0-based column, row). | | `w`, `h` | yes | Width and height in grid cells. | | `name` | no | Label shown inside the module. | | `unit` | no | Unit suffix displayed next to the value (e.g. `%`, `ms`, `req/s`). | | `min`, `max` | no | Value range. Used by `gauge` to scale the arc. Default 0–100. | | `color` | no | Accent color (hex or CSS value). Used by `led`, `graph`, `gauge`. | | `base_color` | no | Base / background color for `heatmap` cells. 
| | `source` | no | Auto-feed a built-in metric without writing code: `cpu` (CPU%), `ram` (RAM MB), `conn` (active connections), `ior` / `iow` (disk I/O read/write). When set, `setDashValue` calls for this module are ignored. | ### Module types | Type | Description | |------|-------------| | `led` | Colored indicator light. Lit (green by default) when the value is non-zero; the color is configurable. | | `counter` | Large numeric display. Shows cumulative value. | | `text` | Single-line text value. Good for status strings or last-event messages. | | `graph` | Scrolling bar chart. Newest bar on the right, auto-scaling. | | `gauge` | Arc gauge with min/max range and optional unit suffix. | | `heatmap` | Grid of colored cells representing a 2-D value distribution. | ### Sending metrics from code Use `apm.setDashValue(id, value, color?)` in the connector to push a value to a dashboard module. This is distinct from `apm.metric()`, which sends StatsD-style custom metrics. ```js const ApmModule = require('./apm_module.node.js') const apm = new ApmModule(async (session) => { /* handle connections */ }) // Push a number to module id 1 (graph) apm.setDashValue(1, requestsPerSecond) // Push a number with a dynamic color apm.setDashValue(2, cpuPercent, cpuPercent > 80 ? '#ff5a5a' : '#4f8cff') // Push a string to a text module (id 4) apm.setDashValue(4, lastErrorMessage) // LED on/off (1 = on, 0 = off) apm.setDashValue(5, isHealthy ? 1 : 0, isHealthy ? '#47d16c' : '#ff5a5a') ``` | Parameter | Description | |-----------|-------------| | `id` | Module ID as defined in the `dashboard { }` config block. | | `value` | Number for `gauge`, `graph`, `counter`, `led`, `heatmap`; string for `text`. | | `color` | Optional CSS color string to override the module's configured color dynamically. | ### Counter vs gauge vs graph | Module type | How value is applied | |-------------|----------------------| | `counter` | Value is added to the running total each call (delta). To reset, call `setDashValue(id, -currentTotal)`.
| | `gauge` | Absolute value replaces the current reading. Arc fills from `min` to `max`. | | `graph` | Absolute value; appended as the newest bar on the right each call. | | `led` | Any non-zero value turns the LED on; 0 turns it off. | | `text` | String value replaces the displayed text. | | `heatmap` | Numeric value 0–100 appended as the next cell. | --- ## Node.js Connector The Node.js connector ships in two equivalent forms: - `apm_module.node.js` — CommonJS (`require`), Node 10+. - `apm_module.node.mjs` — ES Modules (`import`), Node 14+, for `"type":"module"` packages. Both files are line-for-line equivalent — same class, same API, same wire protocol. IPC happens over stdin/stdout using binary frames — no Unix sockets required in the child. ```sh # download (CommonJS) $ curl -fsSL https://processmanager.dev/connectors/apm_module.node.js -o apm_module.node.js # download (ESM) $ curl -fsSL https://processmanager.dev/connectors/apm_module.node.mjs -o apm_module.node.mjs # update in-place $ node apm_module.node.js -update $ node apm_module.node.mjs -update ``` The module exports a class. Require (or import) it, then construct an instance passing your `onConnect` callback. The constructor sets up crash handlers and the stdin IPC listener immediately — call it once at startup before doing anything else. For ESM, swap the `require` line below for `import ApmModule from './apm_module.node.mjs'`. ```js const ApmModule = require('./apm_module.node.js') const apm = new ApmModule(async (session) => { // session.protocol — 'http' | 'ws' | 'tcp' // session.method — HTTP method // session.path — full path + query // session.headers — request headers // session.remoteIp — real client IP (proxy-aware) // session.cookies — parsed cookie map session.write('Hello World', { 'content-type': 'text/plain', 'x-status': '200' }) session.close() }) ``` If your worker doesn't handle sessions (e.g. 
a background job pushing dashboard metrics), pass an empty async function: `new ApmModule(async () => {})`. ### Session API | Property / Method | Description | |-------------------|-------------| | `session.protocol` | `'http'`, `'ws'`, or `'tcp'` | | `session.method` | HTTP method (GET, POST, …) | | `session.path` | Full path including query string | | `session.path_array` | Decoded path segments as an array | | `session.query` | Raw query string parts | | `session.query_object` | Parsed `{ key: value | [values] }` | | `session.cookies` | Parsed cookie map | | `session.headers` | Request headers object | | `session.remoteIp` | Client IP. APM resolves from proxy headers when `trust_proxy` is set. | | `session.sessionId` | Unique per-connection ID | | `session.instanceId` | `APM_INDEX` of this instance (0-based) | | `session.sessionType` | `'new'` for fresh connections | | `session.sessionData` | Free-form object. Persists across session callbacks. Use `saveSessionData()` to persist across rolling restarts. | | `session.active` | `true` while the connection is open | | `session.onData` | Set inside the callback. Called with `(data, isBinary)` for incoming data (WebSocket frames, TCP bytes). | | `session.onClose` | Set inside the callback. Called when the connection closes. | | `session.write(data, headers?)` | Send HTTP response body / WebSocket frame. Pass headers object on first HTTP write to set status and headers. | | `session.close(code?, reason?)` | Close the connection. HTTP status close or WebSocket close frame. | | `session.writeRaw(data)` | Send raw bytes, bypassing HTTP/WebSocket framing. For TCP or low-level use. | | `session.saveSessionData()` | Persist `sessionData` in the daemon. Survives rolling restart — the new instance receives the same data. | ### Instance methods | Method | Description | |--------|-------------| | `apm.setDashValue(id, value, color?)` | Push a value to a dashboard module. `id` is the integer module ID from the config. 
`value` is a number for gauge / graph / counter / led, or a string for text. `color` is an optional CSS color override. | | `apm.metric(name, value, type?)` | Send a StatsD-style metric. `name` is a dot-separated string (e.g. `'req.ok'`). `type`: `'counter'` (default, summed per second), `'gauge'` (last value), `'timing'` (averaged). | | `apm.instanceId` | `APM_INDEX` of this process instance (0-based string). | ### Environment variables APM injects the following into managed child processes: | Variable | Description | |----------|-------------| | `APM` | Set to `1`. The connector checks for this and exits if not present. | | `APM_INDEX` | 0-based instance index. Only injected when `env_index` is configured. | ### WebSocket example ```js const ApmModule = require('./apm_module.node.js') const apm = new ApmModule(async (session) => { if (session.protocol !== 'ws') { session.close(400) return } session.onData = (data, isBinary) => { // echo back session.write(data) } session.onClose = () => { console.log('disconnected', session.sessionId) } }) ``` --- ## PHP / Python / Perl / Lua Connectors Connectors for other languages follow the same pattern: drop a single file into your project, require / include it, and pass an `onConnect` callback. All connectors implement the full APM IPC protocol over stdin/stdout — no extra dependencies beyond what's noted on the connectors page (https://processmanager.dev/connectors/). ```sh $ curl -fsSL https://processmanager.dev/connectors/apm_module.php -o apm_module.php $ curl -fsSL https://processmanager.dev/connectors/apm_module.py -o apm_module.py $ curl -fsSL https://processmanager.dev/connectors/ApmModule.pm -o ApmModule.pm $ curl -fsSL https://processmanager.dev/connectors/apm_module.lua -o apm_module.lua ``` Each connector file also supports self-update — run it with `-update` to fetch the latest version from the server (e.g. `php apm_module.php -update`). 
See the connectors page for version info, MD5 checksums, and per-language update commands. All connectors expose the same `setDashValue(id, value, color?)` and `metric(name, value, type?)` methods as the Node.js connector, plus the same inter-worker IPC primitives (see below). --- ## Inter-Worker IPC (new in v2.0) Workers can talk to each other through APM — across instances, across workers, and across languages. Two primitives: **channels** for stateless message passing, and **streams** for persistent bidirectional pipes. The daemon is the router; no extra sockets, no broker, no ports. ### How it works A worker declares a channel name with the `listen` config directive. The daemon maintains a registry mapping channel names → running workers. When any worker calls `send()`, `request()`, or `requestStream()`, the daemon routes the message to every running child of every worker listening on that channel. Workers never talk directly — all traffic flows over the existing stdin/stdout protocol that connectors already use, so no new dependencies and no new surface area. ``` ┌────────────┐ │ APM daemon │ ← channel registry & router └──┬──┬───┬──┘ │ │ │ ┌─────────┘ │ └─────────┐ ▼ ▼ ▼ ┌─────────┐ ┌─────────┐ ┌─────────┐ │ sender │ │ listener│ │ listener│ └─────────┘ └─────────┘ └─────────┘ ``` ### Why this matters - **Cross-language.** A Python worker can `request` data from a Node.js worker and get a JSON reply. A PHP request handler can open a stream to a Go background worker. The connector API is identical across Node.js, Python, PHP, Perl, and Lua. - **Zero config.** Add `listen "channel_name";` to a worker block. That's it. The daemon rebuilds the registry automatically on every worker start/stop. - **Backwards compatible.** Workers without `listen` behave exactly as before. The protocol is additive — old connectors are unaffected. - **Crash-safe.** When any child dies, the daemon cleans up its streams and cancels its pending requests automatically. 
Writes to closed streams are silently dropped (no-scream policy). ### Channels — fire-and-forget & request/reply Channels are lightweight stateless messaging. A sender emits a message to a named channel; the daemon broadcasts it to every child of every worker listening on that channel. **`send()` — fire-and-forget.** Broadcasts a message with no response expected. Returns immediately. If nobody is listening, the message is silently dropped. The sender is never echoed to itself, even if it also listens on the same channel. ```js // sender — Node.js apm.send('devices', { deviceId: 'sensor-5', online: true }) // listener — Node.js (worker config must contain: listen "devices";) apm.onChannel = (channel, data) => { console.log('got', channel, data) } ``` **`request()` — request / first-reply-wins.** Sends a message and waits for the *first* reply from any listening child. Late replies are silently dropped. Returns `null` on timeout or when no listener is running. ```js // sender — Node.js const status = await apm.request('iot', { query: 'status', device: 'sensor-5' }, 2000) if (status === null) { /* nobody answered in time */ } // listener — Node.js apm.onChannel = (channel, data, reply) => { if (data.query === 'status') { reply({ temp: 42, online: true }) } } ``` Timeout priority: per-call timeout > worker `ipc_timeout` > default 500ms. ### Streams — persistent bidirectional pipes Streams are long-lived connections between workers. They follow a **mediated star** topology: one worker opens the stream (the *mediator*), others can attach as *peers*. - **Mediator writes** fan out to **all** attached peers. - **Peer writes** go to the **mediator only**, tagged with the peer ID so the mediator knows who spoke. Think of it as a group trip organiser: the organiser can speak to everyone, but each participant only talks back to the organiser. 
This makes streams a natural fit for fan-out notification, shared live consoles, device control sessions, and worker-to-worker RPC sessions where one side coordinates many. Typical flow: 1. Mediator calls `requestStream('channel', header)`. All listeners see a stream request. 2. One or more listeners call `stream.accept()`. Others can `stream.reject()` (or just ignore — the timeout handles "nobody accepted"). 3. As soon as the first peer accepts, the mediator's Promise resolves with the stream object. 4. Both sides call `stream.write(data)` and handle `stream.onData`. Either side can call `stream.close()`. 5. When the last peer detaches, the stream auto-closes. When any child dies, its streams are cleaned up automatically. ```js // mediator — Node.js const stream = await apm.requestStream('iot_console', { device: 'sensor-5' }, 5000) if (!stream) { /* no peer accepted within timeout */ return } stream.onData = (chunk, peer) => { console.log('from', peer, chunk.toString()) } stream.onClose = () => { console.log('console closed') } stream.write('help\n') // peer — Node.js (worker listens on iot_console) apm.onStream = (stream) => { // header contains whatever the mediator sent if (stream.header.device !== myDeviceId) { stream.reject(); return } stream.accept({ name: 'sensor-5' }) stream.onData = (chunk) => { runCommand(chunk.toString()) } stream.onClose = () => { /* cleanup */ } } ``` **No-scream policy.** Writes to a closed stream are silently dropped — no error, no crash. The `onClose` callback still fires normally. This prevents race conditions when both sides close simultaneously, which is very common in real traffic. ### Config & connector API | Field | Description | |-------|-------------| | `listen` | Channel name this worker listens on. Every running child of the worker receives messages sent to this channel. Omit to disable (the default). | | `ipc_timeout` | Default timeout in milliseconds for `request()` calls from this worker. 
Per-call timeouts passed to `request()` override this value. Default: 500. | ``` # IoT WebSocket server — listens for control commands worker { name iot_ws; exec node; params iot_server.js; server ws://0.0.0.0:9100; instances 4; listen "iot_control"; ipc_timeout 1000; } # Web control panel — sends commands to IoT workers worker { name web_panel; exec node; params panel.js; server http://0.0.0.0:8080; instances 2; } ``` Cross-language API — every connector exposes the same primitives; method names are adapted to each language's conventions: | Language | send | request | stream | receive | |----------|------|---------|--------|---------| | Node.js | `apm.send(ch, data)` | `await apm.request(ch, data, t?)` | `await apm.requestStream(ch, hdr?, t?)` | `apm.onChannel = fn` | | Python | `apm.send(ch, data)` | `apm.request(ch, data, t?)` | `apm.request_stream(ch, hdr?, t?)` | `apm.on_channel = fn` | | PHP | `$apm->send(ch, data)` | `$apm->request(ch, data, t?)` | `$apm->requestStream(ch, hdr?, t?)` | `$apm->onChannel = fn` | | Perl | `$apm->send(ch, data)` | `$apm->request(ch, data, t?)` | `$apm->request_stream(ch, hdr?, t?)` | `$apm->{on_channel} = sub` | | Lua | `apm:send(ch, data)` | `apm:request(ch, data, t?)` | `apm:request_stream(ch, hdr?, t?)` | `apm.on_channel = fn` | Node.js `request` and `requestStream` return Promises. All other languages block internally (pumping their event loop where applicable) until a reply arrives or the timeout fires, so they can be used from straight-line code without async plumbing. Stream requests are received via `onStream` (Node.js/Python/Lua) or `{on_stream}` (Perl/PHP equivalent). --- ## What's New in v2.1.3 ### `env_file` (and a dozen other worker fields) now work on initial load A user deploying a Deno app reported that `env_file /abs/.env;` parsed cleanly but the child saw none of the variables, and `apm env ` showed no env-file section.
Same silent-drop affected `log_max_size`, `log_max_files`, `depends_on`, `depends_timeout`, `drain_timeout`, `memory_limit`, `trace_header`, `health_check`, `health_check_*`, `restart_on_exit_codes`, and `strip_ansi`. All only on first-time worker creation from a conf file; `apm reload` and `apm update` were unaffected. Root cause: the create-from-config path runs `addWorker` → emit flags as internal argv → `StartWorker`. `StartWorker`'s xtraParams switch had no case for any of those flags (and an incorrect single-dash case for `strip_ansi` where the emitter used double-dash), so the values vanished with no warning. Same root-cause class as the v2.1.2 `params` fix. Added the missing 13 case clauses, fixed the `strip_ansi` mismatch, and added a `default:` arm that prints `⚠ Unknown flag ignored` so future omissions surface immediately instead of silently dropping. ### `apm check` no longer cries wolf on valid execs `apm check` (and the corresponding line in `apm reload` output) printed `⚠ exec not found at /WEBZ/foo/home/user/.deno/bin/deno` for a config like ``` exec /home/user/.deno/bin/deno; path /WEBZ/foo; ``` because the check naively did `filepath.Join(path, exec)` — and on Linux, Join concatenates even when the second arg is absolute. PATH-resolvable bare names like `exec deno;` got the same wrong treatment. Meanwhile the runtime resolution (which uses `exec.LookPath` with the target user's login PATH) was correct, so the worker ran fine; users just got a scary warning every check/reload. Rewrote the check to mirror the runtime: absolute exec → stat as-is; contains a slash → join with `path`; bare name → PATH lookup, fall back to `path/exec` only if PATH lookup fails. 
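The three-way rule above can be sketched in Node.js as follows. This is an illustration of the described resolution order only, not APM's Go implementation: `resolveExec` is a hypothetical name, and the sketch searches whatever `PATH` string it is given rather than the target user's login PATH that the real runtime uses.

```javascript
const fs = require('fs')
const path = require('path')

// Hypothetical helper, not APM's code: returns the path a check would stat.
function resolveExec(exec, workerPath, envPath = process.env.PATH || '') {
  // 1. Absolute exec: use as-is, never join with the worker path.
  if (path.isAbsolute(exec)) return exec

  // 2. Relative path containing a slash: resolve against the worker path.
  if (exec.includes('/')) return path.join(workerPath, exec)

  // 3. Bare name: search PATH for an executable regular file.
  for (const dir of envPath.split(':')) {
    if (!dir) continue
    const candidate = path.join(dir, exec)
    try {
      fs.accessSync(candidate, fs.constants.X_OK)
      if (fs.statSync(candidate).isFile()) return candidate
    } catch {
      // not in this directory, keep looking
    }
  }

  // 4. PATH lookup failed: fall back to path/exec.
  return path.join(workerPath, exec)
}

console.log(resolveExec('/home/user/.deno/bin/deno', '/WEBZ/foo'))
```

Node's `path.join` behaves like the Go call described above: `path.join('/WEBZ/foo', '/home/user/.deno/bin/deno')` yields `/WEBZ/foo/home/user/.deno/bin/deno`, which is why the absolute case has to be handled before any join.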
--- ## What's New in v2.1.2 ### `params` values starting with `-` no longer silently dropped A config like ``` worker { exec /home/user/.deno/bin/deno; params run, -A, server.js; } ``` was previously exec'ing the child as `deno run` — the `-A` and `server.js` tokens were silently dropped because APM's internal argv parser treated the first token starting with `-` as the boundary between worker args and APM's own CLI flags. The documented examples like `params server.js, --port, 3000;` lost `--port 3000` the same way. Fixed by routing config-supplied params through an explicit internal `-args` flag so values that begin with `-` are preserved verbatim. Config syntax is unchanged. ### `apm wait` / `apm update` no longer hang the CLI on success Both commands emitted their final status line as a streaming-continuation frame (`0x04` terminator) and then returned without sending a closing `0x03` frame — the CLI sat in its read loop waiting for a terminator that never arrived, requiring Ctrl+C to escape. The daemon now sends a defensive final terminator after each command returns if none was sent. The fix is structural, not per-call: any current or future `Cli_*` handler that forgets the terminator will still get one written for it. Streaming commands (`apm monitor`, `apm log`) opt out via a new internal `Streaming` flag on the CLI struct so the safety net doesn't kill them prematurely. ### Docs: `apm` CLI does not need `sudo` `web/llms.txt`, `web/llms-full.txt`, `web/SKILLS_apm.md`, and `web/manual.html` were each updated to state explicitly that everyday `apm` commands run as any local user — the daemon listens on an abstract Unix socket (`@apm`) with no filesystem permissions, and the `apm` OS group exists only to grant read access to `/var/log/apm.log`. `SKILLS_apm.md` also had a wrong claim that the CLI socket was uid-restricted; that was corrected. 
`sudo` is needed only for `apm install` / `apm uninstall` and for tailing worker log *files* directly when the worker runs as root. --- ## What's New in v2.1.1 ### PM2 ecosystem converter The new `apm convert [output]` command translates a PM2 `ecosystem.json` / `ecosystem.config.js` / `.cjs` / `.mjs` into APM `worker { }` blocks. JSON is parsed natively; JavaScript is evaluated by shelling out to `node -e "console.log(JSON.stringify(require('...')))"` (anyone migrating from pm2 already has node installed). When no output path is given, the result is written to stdout for piping. **Field mappings** (full table in the manual): `name`, `script`+`interpreter` → `exec`+`params`, `cwd` → `path`, `instances` (numeric or `'max'`/`'auto'`/`0`/`-1` → `runtime.NumCPU()` with a WARN noting the host's CPU count), `watch`/`ignore_watch`/`watch_delay`, `max_memory_restart` → `cgroup_memory_max`, `restart_delay`, `max_restarts` → `max_err_restarts`, `min_uptime` → `err_grace`, `kill_timeout`, `error_file` → `err_log`, `out_file`/`log_file` → `log`, `log_date_format` (moment.js tokens translated to Go's reference time layout, `YYYY-MM-DD HH:mm:ss.SSS` → `2006-01-02 15:04:05.000`), `user`, `autorestart` → `restart_err`, `env` + `env_` merged with the most-likely-active profile inline and the rest emitted as commented alternatives the user can swap in, `env_file`, `instance_var` → `env_index`. **PM2 features without an APM equivalent** surface as inline `# WARN:` comments above the block they apply to, so the user can grep for them: `exec_mode: 'cluster'` (suggests adding a `server:` line for proxy load-balancing instead), `cron_restart` (suggests a systemd timer), `node_args` set when interpreter is not node (dropped; the WARN explains why), unrecognised moment.js tokens in `log_date_format`.
PM2-specific noise (`vizion`, `force`, `treekill`, `pid_file`, `merge_logs`, `combine_logs`, `time`, `wait_ready`, `listen_timeout`, `automation`, `pmx`) is silently ignored — those are either defaults APM already matches or features that don't apply outside pm2's runtime. The output is one `worker { }` block per pm2 app, in input order, prefixed with a header comment indicating the source file and a safer "cat this into your main conf and reload" hint than the destructive `apm reload ` (which would stop sibling workers absent from the new file). --- ## What's New in v2.1.0 ### Major — Augur: startup-failure diagnostician When a worker exits non-zero or by signal inside the `startup_grace` window on its first attempt, the new `modules/augur` package takes the captured stderr, classifies it, and (when a language manifest is present) deep-scans the worker's dependencies to list every missing one in a single line. The migration scenario that previously took five restart cycles — "install redis, restart, install ws, restart, install mqtt, …" — now takes one. **Classifier** covers Node.js (`Cannot find module`, `MODULE_NOT_FOUND`, `SyntaxError`), Python (`ModuleNotFoundError`, `ImportError`, `SyntaxError`, `IndentationError`, `NameError`), Perl (`Can't locate X.pm in @INC`, `syntax error at X line N`, `BEGIN failed`), PHP (`Class 'X' not found`, `Call to undefined function`, `Parse error`, `Failed opening required`), Ruby (`LoadError`, `(SyntaxError)`, `syntax error, unexpected`), Java (`Could not find or load main class`, `ClassNotFoundException`, `NoClassDefFoundError`, `UnsupportedClassVersionError`), Go (`panic:`, nil-pointer dereference), and Bash (`command not found`). POSIX errno hints — `EADDRINUSE`/"Address already in use", `ECONNREFUSED`/"Connection refused", `EACCES`/"Permission denied" — fire across every language. 
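A classifier of this kind can be pictured as a table of patterns applied to the captured stderr. The sketch below uses a handful of the messages listed above; the rule table, the `kind` labels, and the `classify` function are hypothetical and only illustrate the approach, not the `modules/augur` implementation.

```javascript
// Hypothetical rule table: the patterns come from the list above,
// the `lang` and `kind` labels are invented for this sketch.
const RULES = [
  { lang: 'node', re: /Cannot find module '([^']+)'/, kind: 'missing-module' },
  { lang: 'python', re: /ModuleNotFoundError: No module named '([^']+)'/, kind: 'missing-module' },
  { lang: 'perl', re: /Can't locate (\S+\.pm) in @INC/, kind: 'missing-module' },
  // errno hints apply regardless of the worker's language
  { lang: 'any', re: /EADDRINUSE|Address already in use/, kind: 'port-in-use' },
  { lang: 'any', re: /ECONNREFUSED|Connection refused/, kind: 'connection-refused' },
  { lang: 'any', re: /EACCES|Permission denied/, kind: 'permission-denied' },
]

// Returns every rule that matches, with the captured module name when present.
function classify(stderr) {
  const findings = []
  for (const rule of RULES) {
    const m = rule.re.exec(stderr)
    if (m) findings.push({ lang: rule.lang, kind: rule.kind, detail: m[1] || null })
  }
  return findings
}

console.log(classify("Error: Cannot find module 'redis'"))
```

Keeping the rules as data makes adding a language a matter of appending entries, which matches the "one stanza per language" design described for `classify.go`.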
**Deep-scan** is manifest-first when a manifest exists: | Language | Manifest | Source fallback | Install hint | |----------|--------------------------------------------|---------------------------------------------------------------------|-------------------------| | Node.js | `package.json` deps + devDeps; walks `node_modules` upward to handle hoisting | regex-extract `require()`/`import` paths, batch-probe via `node -e require.resolve(...)` | `npm install ` | | Perl | n/a (no universal manifest) | regex-extract `use X;`/`require X;`, batch-probe via `perl -e` | `cpan ` | | Python | `requirements.txt` | regex-extract `import X`/`from X import`, probe via `python3 -c` | `pip install ` | | PHP | `composer.json` `require`/`require-dev` | n/a (PHP's autoloader makes source-extract unreliable) | `composer require `| | Ruby | `Gemfile.lock` (preferred) or `Gemfile` | probe via `ruby -e "gem(name)"` | `gem install ` | | Java | classifier only — no deep-scan yet | n/a | n/a | | Go | n/a (compile-time) | n/a | n/a | Deep-scan sub-processes run as the worker's target user (same setuid + login-PATH machinery as the worker launch), in the worker's cwd, with a 3-second timeout per invocation. Augur never writes anything. **Pre-flight mode** is opt-in per worker: `augur_full_scan true` runs the manifest + source scan *before* every fork (including watcher- and exit-triggered auto-restarts) and aborts the launch when something is missing. Costs one or two cheap probe processes per restart in exchange for never starting a broken worker. The architecture keeps augur cleanly separated from the workers package — it imports nothing from workers, takes plain-data input plus an optional `Probe` closure for sub-process execution. Adding a new language is one new `xxx_scan.go` file plus a stanza in `classify.go`. ### Reload-by-worker-name `apm reload ` reloads just one worker from its origin `ConfFile` without touching siblings in the same file. 
Falls through to the existing file-path behaviour when the argument doesn't match a loaded worker. Workers started ad-hoc via `apm run` (no `ConfFile`) get a clean "use `apm restart` instead" message rather than a confusing file-not-found error. ### Bug fix — reload no longer wipes per-line timestamps `applyConfigToWorker` unconditionally overwrote `LogTimeFormat` with the value parsed from the reloaded conf — empty when the block doesn't carry `log_time_format`. The fix restores `global.LogTimeFormat` whenever the conf is silent on that key, matching the boot-time init at `workers.go`. Without this, the diff machinery saw `LogTimeFormat` "change" from the daemon default to empty string on every reload and rebuilt loggers with `TimeStampFormat: ""` — which the go-logger interprets as "emit no timestamp." ### Startup UX hardening (carried from v2.0.10) Pre-flight validation of `worker.path` and the resolved `exec` binary before fork; failed launches reset `Status` to Stopped instead of sticking at `◆ starting`; `apm restart` with no args no longer crashes the daemon; first 8 KB of child stderr captured and dumped to the requesting CLI synchronously when a fast-exit happens within `startup_grace` (default 2000 ms). --- ## What's New in v2.0.10 Startup UX hardening. Every failure mode that a fresh admin can produce — wrong `path`, missing `node_modules`, syntax error in the worker source, port already taken, etc. — now lands as one coherent terminal message with an actionable hint, instead of `Worker X started` followed by a silent crash visible only in the log file. ### Daemon panic on `apm restart` (no args) The `restart` command handler did `for _, p := range cc.Params[1:]` which panicked with `slice bounds out of range [1:0]` whenever the user ran a bare `apm restart`. The daemon crashed and systemd brought it back, but to the CLI it manifested as `! Daemon disconnected`. 
Replaced with an index-based loop so an empty params slice falls through to the proper `name is missing` hint without taking the daemon down. ### Pre-flight `path` and `exec` validation `Children.Start` now stats `worker.path` (cwd) **before** building the `exec.Cmd`. If it does not exist, the launch aborts with `✗ path 'X' does not exist or is not accessible` and a hint pointing at `apm.conf`. This catches the common post-migration / typo'd path case where Go's `os/exec` would otherwise produce a misleading `fork/exec /usr/bin/node: no such file or directory` — `cmd.Path` is always named in that error even when the actual ENOENT comes from the child's `chdir(cmd.Dir)` running after `setuid` and before `execve`. The resolved binary is also stat'd after the user-PATH override step (so a stale `cmd.Err` from the daemon's initial `LookPath` no longer leaks through), and `cmd.Err` is cleared when the user-switch block successfully repoints `cmd.Path` to an absolute path. ### Fast-exit stderr surfacing The stderr reader appends each line to a per-child capped buffer (8 KB). When a child exits within `StartupGrace` ms on its first attempt, the buffer is dumped to the requesting CLI along with pattern-matched hints: - `Cannot find module 'X'` → `missing node module 'X' — run npm install` - `SyntaxError` → `syntax error in worker source` - `EADDRINUSE` → `port already in use` - `EACCES` / `permission denied` → `permission denied — check ownership and the 'user' setting` - `ECONNREFUSED` → `a dependency (db, redis, …) refused the connection` ### Synchronous startup-grace wait When a launch is user-initiated (CLI attached, no prior restart attempts on this child), `Children.Start` blocks up to `StartupGrace` ms before returning — signalled by the child-exit goroutine via a buffered channel — so the rich error above reaches the user's terminal in the same `apm start`/`apm restart` invocation, not after the command has already printed "Worker X started". 
Auto-restart paths (watcher, exit-driven timers) skip the wait entirely; the restart loop is not slowed by 2 s per attempt.

New `startup_grace ` worker setting controls the window; default `2000`. Wired through the config parser, `-startup_grace` CLI flag (both `apm run` and the reload-update path), and `reload_actions` (no implicit restart — change applies on next start).

### Worker no longer stuck at `◆ starting`

When a child failed to launch (pipe error, cmd.Start error, our new pre-flight failures) the worker's `Status` was never reset from `StatusNew`/`StatusInit`, and `apm list` showed `◆ starting` forever. `Worker.Start` now resets to `StatusStopped` after `wsg.Wait()` if no child reached `ChildStarts`.

---

## What's New in v2.0.9

Hot fixes for issues that surfaced in 2.0.8 production use.

### Bug fix — file logs went silent after rotation / RebuildLoggers

`RebuildLoggers` and `rotateLogs` used a close-then-recreate pattern. If `logger.New` failed to recreate (e.g. transient FS error, permission, empty path), the close had already dropped the go-logger refcount to 0 — the OS fd closed, the writer goroutine exited, and `c.Log` / `c.ErrorLog` got rebound to a silent no-op function. The daemon's stdout/stderr reader goroutines kept running but their writes went nowhere. `LogHook` (the GUI WebSocket feed) is called separately on the next line of the reader so GUI live-logs kept working — which is what made this bug visible: file logs silent, GUI fine. `apm restart --force` recreated the worker from scratch and recovered.

Fix is test-then-swap: build the new logger first, only close the old one if the new came up successfully. `Children.Start` also gained a defensive `ensureChildLoggers` call that recreates `c.logInst` / `c.errLogInst` from current `Worker.LogFile` / `ErrorLogFile` if they are nil — auto-heals any dead-logger state before the next exec, regardless of how it got there.
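
The test-then-swap idea is independent of the logging library, so it can be illustrated in a few lines of Python. This is a minimal sketch, not APM's Go implementation: `SwappableLog` and its methods are hypothetical names, and a plain file handle stands in for a go-logger instance.

```python
import os
import tempfile


class SwappableLog:
    """Holds one open log file; rebuild() replaces it only if the new one opens.

    Hypothetical illustration of test-then-swap, not APM code.
    """

    def __init__(self, path):
        self.path = path
        self.fh = open(path, "a", encoding="utf-8")

    def write(self, line):
        self.fh.write(line + "\n")
        self.fh.flush()

    def rebuild(self, new_path):
        try:
            # Test: bring the replacement up before touching the old handle.
            new_fh = open(new_path, "a", encoding="utf-8")
        except OSError:
            return False  # old handle is untouched, logging continues
        # Swap, then close the old handle last.
        old_fh, self.fh, self.path = self.fh, new_fh, new_path
        old_fh.close()
        return True


tmp = tempfile.mkdtemp()
log = SwappableLog(os.path.join(tmp, "worker.log"))
log.write("before")
# The parent directory does not exist, so opening the new file fails.
ok = log.rebuild(os.path.join(tmp, "missing-dir", "worker.log"))
log.write("after failed rebuild")
```

With close-then-recreate, the second `write` would have had nowhere to go. Here the failed rebuild returns `False` and both lines end up in the original file.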
### Bug fix — `apm list` / `apm info` sometimes hung after printing

Unix domain sockets in stream mode have no message boundaries; the kernel may split a frame across reads at any byte boundary. When the daemon's final 0x03 frame got split such that the previous read ended with a partial header (0x01 + a few length digits, no 0x02 yet), the CLI's parser hit `pli == -1` and used `continue` instead of `break` — cycling the inner `for { }` loop on the same buffer forever. CPU pinned, cursor stuck. Earlier complete frames had already printed correctly, which is why "everything prints fine, then never returns."

Fix is one line: `continue` → `break` so the outer loop fetches more bytes from the socket, matching the pattern already used by the "have-header-missing-payload" branch below. Probability scaled with output size; small `apm info ` rarely hit it, but `apm list` with several workers and children hit it ~25%+ of the time.

### Bug fix — `apm upgrade` left old daemon running on old version

`cliUpgrade` called `killDaemon` which reads the PID from `/root/.apm/apm.pid`. If the user ran `apm upgrade` as a non-root user (the common case), that file is unreadable and `killDaemon` silently returned — but `cliUpgrade` still printed `"Stopping daemon... OK"`. The old daemon survived, held the abstract `@apm` socket, and the new daemon's bind failed silently; `install.sh`'s systemctl restart and manual-start fallback both failed quietly. User had to `apm shutdown` + `apm boot` manually to recover.

Fix is socket-based: new `daemonShutdown()` helper connects to the abstract socket, sends `exit daemon`, drains responses until 0x03 or socket close, then polls until the socket is actually released. Abstract sockets are kernel-managed and reachable from any user — no PID file required. `cliUpgrade` calls this first; falls back to `killDaemon` only if the socket path can't confirm exit.
`install.sh` got a parallel block that walks `$INSTALL_PATH`, `/usr/local/bin/apm`, and `$(command -v apm)` to find a working binary, then runs `timeout 6 "$APM_BIN" exit daemon` before the PID-file fallback — covers `bash install.sh` direct invocations.

### Bug fix — `Worker.Reload` raced with `WatcherRestart`

Carried forward from the 2.0.8 correctness pass: a concurrent `apm restart --force` during a file-watcher restart could observe an intermediate `StatusStopped` between children being cycled and bail with "already stopped" while the watcher was still bringing things back up. Reload now acquires `watcherRestartMu` for its full duration so the two operations queue cleanly instead of interleaving.

---

## What's New in v2.0.8

Reload correctness, info accuracy, env masking, include-aware saveconf, and per-language connector docs.

### Reload — diff-driven action map

`apm reload ` now figures out exactly which part of a worker needs touching, rather than always bouncing children. Every config key is mapped to one of: `noop`, `scale`, `loggers`, `statsd`, `vanguard`, `servers`, `watcher`, or `restart`. The daemon snapshots a per-action fingerprint, applies the new config, re-snapshots, and dispatches only the actions whose fingerprints changed. Adding a new key triggers a warning (`unknown key — no reload action mapped`) so silent-drop bugs can't reappear.

Practical effect: editing `log:` rebuilds loggers in place (no child restart). Editing `server:` rebinds the proxy listener in place (no child restart). Editing `vanguard {}` rules recompiles the IP/path filters in place. Editing `exec`, `path`, `env`, `user`, `listen`, `tls_*` triggers a full Stop+Start. `apm restart --force` and `apm shutdown && apm boot` still do the heavy-handed thing if you want it.

### Reload — bool/string clear-on-reload semantics

Removing a line from the conf now actually clears the field on reload.
Previously bool fields used `if w.GetBool(k) { nw.X = true }` — they could go false → true but never back. Same for many strings. Now: removing `tls true;` or `restart true;` disables them; removing `log /path;` reverts to the global log; removing `on_crash {}` clears the webhook. `exec`, `path`, `instances`, and the integer knobs (delays, timeouts, thresholds) stay conditional — clearing those would do more harm than good.

### Reload — `Reload` serialised with `WatcherRestart`

`apm restart --force` no longer races with a concurrent file-watcher restart. Both now acquire the same per-worker mutex, so they queue instead of interleaving (the "already stopped" spurious message is gone).

### Config parser — include-aware

`include apm.conf.d/*;` directives are now parsed recursively (separate `Load` per file) instead of byte-spliced into one stream. Every worker block is tagged with the include file it came from. `apm check` shows the origin per block; `apm saveconf` writes each worker back to its source include, not to the top-level conf; `apm reload` respects per-block origins when the watcher fires.

### `apm saveconf` — safer

`apm saveconf` (no-arg form) previously wrote serialised worker blocks to each `ConfFile` it had collected — wiping out any `daemon {}` blocks, `include` directives, comments, or anything else in those files. With include-aware ConfFile, workers from `apm.conf.d/*.conf` now correctly write back to their own files. As a defensive layer, `saveconf` re-scans the target file's literal top-level keywords before overwriting; if anything other than `worker` is present, it refuses with a hint. `--force` flag bypasses the check.

Also fixed: `saveconf` was writing `env` as `env K1=v1; K2=v2;` (broken — the parser stops at the first `;` and treats `K2=v2` as a new key with empty value, losing every entry but the first). Now emits one `env\tKEY=val;` line per entry; the parser merges repeated keys.
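
The difference between the two `env` encodings is easy to reproduce with a toy parser. The sketch below is an assumption-laden illustration: `serialize_env` and `load_env` are hypothetical helpers, and `load_env` only imitates the two behaviours described above (statements end at `;`, repeated `env` keys merge). It is not APM's config parser.

```python
def serialize_env(env):
    """Emit one `env<TAB>KEY=val;` line per entry (the fixed format)."""
    return [f"env\t{key}={value};" for key, value in env.items()]


def load_env(text):
    """Toy stand-in for a conf parser: statements end at ';', repeated env keys merge."""
    merged = {}
    for statement in text.split(";"):
        parts = statement.split(None, 1)  # first token is the key, rest is the value
        if len(parts) == 2 and parts[0] == "env":
            name, _, value = parts[1].partition("=")
            merged[name] = value
        # Anything else (for example a bare "K2=v2") is a different key, not env.
    return merged


env = {"PORT": "3000", "NODE_ENV": "production"}
fixed = "\n".join(serialize_env(env))
broken = "env PORT=3000; NODE_ENV=production;"

print(load_env(fixed))   # both entries survive the round trip
print(load_env(broken))  # only the first entry survives
```

The broken form loses `NODE_ENV` because the text after the first `;` is read as its own statement whose key is `NODE_ENV=production`.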
### `apm info` — actual runtime state

The Restart section now shows the actual restart counts (normal / error / watch), the time since the last restart, the total error count, and a "gave up" counter for stopped children that hit `max_restarts` or `max_err_restarts`. Per-child rows in the Children section show a trailing `restarts: 3n 1e 2w` summary and a `gave up (max N)` marker on stopped children that exhausted their budget.

### `apm info` / `apm env` — secret masking with `--full`

Env values are masked in CLI output when their key looks sensitive (PASSWORD, PASSWD, PASS, SECRET, TOKEN, KEY, PRIVATE, CREDENTIAL, AUTH, APIKEY, DSN — substring match, case-insensitive). Mask format: all but last 4 chars replaced with `*`; values under 8 chars fully masked. `apm info --full` and `apm env --full` show plain values. APM-injected env (`APM=1`, `APM_INDEX`, `APM_HEALTH_*`) is never masked.

### Statsd config picked up on reload

`apm reload` now parses the `statsd {}` block (previously only `addWorker` did, so the config was silently lost on first reload). New `RebuildStatsd()` method restarts the sender goroutine in place when only the statsd config changed — no child restart.

### `watch_conf` watches include files

When a worker has `watch_conf true;`, APM now starts a file watcher on the include file the worker actually came from (not just the top-level conf the user passed to `apm load`). The watcher set is reconciled after every `load` / `reload` / `unload`, so orphaned watchers stop and new ones start automatically.

### GUI — doubled `wgap` tbody fixed

The workers table sometimes showed a doubled `wgap` tbody separator before the last worker after restart cycles. Fixed: `buildWorker` now skips its leading gap if the table already ends with one, and emits its own trailing gap, so the bracket pattern stays `[gap][W][gap][W]...[gap][W][gap]` whether workers come from init or are added later via `_wsTick`.
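
The secret-masking rule from the `apm info` / `apm env` notes earlier in this release can be written down compactly. The following is a sketch of the rule as stated, not APM's code: `mask_env` and `is_apm_injected` are hypothetical names, and treating `APM_HEALTH_*` as a simple prefix match is an assumption.

```python
SENSITIVE = ("PASSWORD", "PASSWD", "PASS", "SECRET", "TOKEN", "KEY",
             "PRIVATE", "CREDENTIAL", "AUTH", "APIKEY", "DSN")


def is_apm_injected(key):
    # Assumption: APM_HEALTH_* is matched as a plain prefix.
    return key in ("APM", "APM_INDEX") or key.startswith("APM_HEALTH_")


def mask_env(key, value, full=False):
    """Return the value as CLI output would show it under the stated rule."""
    if full or is_apm_injected(key):
        return value
    if not any(word in key.upper() for word in SENSITIVE):
        return value  # key does not look sensitive
    if len(value) < 8:
        return "*" * len(value)  # short values are fully masked
    return "*" * (len(value) - 4) + value[-4:]  # keep only the last 4 chars


for key, value in [("DB_PASSWORD", "supersecret123"), ("api_token", "abc"), ("PORT", "3000")]:
    print(key, mask_env(key, value))
```

Because the match is a substring test, a key such as `MONKEY_COUNT` would also be masked (it contains `KEY`). `--full` is the way to see such values in plain text.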
### Manual — per-language connector examples + cross-language walkthrough

`#connectors` section expanded from "download URL + one-liner" to four per-language subsections (Python, PHP, Perl, Lua), each with three worked examples: hello-world HTTP, IPC channel listener with `send`/`request` distinction, and stream initiator. New `#ipc-crosslang` section walks through an end-to-end "Node.js HTTP front-end calls a Python pricing service" scenario with config, both source files, and a curl that exercises the round-trip.

---

## What's New in v2.0.0

### Major — Inter-worker IPC

Workers can now communicate with each other through the daemon using two new primitives: **channels** (fire-and-forget and request/reply) and **streams** (persistent bidirectional pipes with mediated star topology). All routing happens inside the APM daemon over the existing stdin/stdout frame protocol — no new sockets, no new dependencies, no broker to run.

The feature is configured per worker with `listen "channel_name";` and an optional `ipc_timeout`. All five official connectors (Node.js, Python, PHP, Perl, Lua) bumped to v3.0.0 with identical cross-language APIs. See the Inter-Worker IPC section for the full guide.

### Bug fix — `listen` / `ipc_timeout` dropped on initial config load

On first-time worker creation from a config file, `addWorker` translated config fields to CLI flags through `workerParamMap`, which was missing the two new IPC fields. Reloads picked them up correctly (they go through a separate path), but a cold boot never did. Both fields are now registered in the param map, so IPC works on first boot exactly as it does after a reload.

### Improvement — `apm info` shows IPC config

`apm info ` now displays an **IPC** section listing the configured `listen` channel and `ipc_timeout` (when set). This makes it easy to verify from the CLI that a worker is actually registered as a listener in the running daemon.
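
The cold-boot bug above is an instance of a general hazard: a translation table that silently skips keys it does not know. The sketch below illustrates that hazard and one way to make it loud. Everything in it is hypothetical: the flag spellings, `OLD_MAP`, `NEW_MAP`, and `config_to_flags` are invented for illustration and do not reflect APM's actual `workerParamMap`.

```python
import warnings

# Hypothetical flag names, for illustration only.
OLD_MAP = {"name": "--name", "instances": "--instances"}
NEW_MAP = {**OLD_MAP, "listen": "--listen", "ipc_timeout": "--ipc_timeout"}


def config_to_flags(conf, param_map):
    """Translate config fields to a flag list, warning on keys with no mapping."""
    flags = []
    for key, value in conf.items():
        flag = param_map.get(key)
        if flag is None:
            # Without this warning the field would simply vanish.
            warnings.warn(f"unknown key {key!r}: no flag mapped, value dropped")
            continue
        flags += [flag, str(value)]
    return flags
```

Calling `config_to_flags({"name": "pricing", "listen": "prices"}, OLD_MAP)` drops `listen` and warns about it. The same call with `NEW_MAP` carries the field through.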
### From v1.3.0 — still relevant

`apm reload` applies all worker fields live before the rolling restart, including watcher patterns, restart settings, TLS, session, and proxy flags. The file watcher is reopened in place when `watch` / `watch_ignore` / `watch_delay` change. `watch_ignore` with multiple patterns is no longer silently dropped. Rapid file changes can no longer spawn duplicate `WatcherRestart` goroutines.

---

## Socket & Runtime Files

APM's CLI and daemon talk over an abstract Unix domain socket. Abstract sockets are kernel-managed: they exist only while the daemon is alive and leave no file on disk.

| Item | Path / Name | Notes |
|------|-------------|-------|
| CLI socket | `@apm` | Abstract socket. Visible with `ss -xl \| grep apm`. No file — kernel cleans it up on daemon exit. |
| PID file | `~/.apm/apm.pid` | Removed on clean exit. If the daemon crashes it stays behind; APM detects and replaces it on next start. When installed as a system service, `/root/.apm/apm.pid`. |
| Config (user) | `~/.apm/config.conf` | Loaded automatically on daemon start if it exists. |
| Config (system) | `/etc/apm/apm.conf` | Also loaded automatically. Created by `apm install`. Drop worker configs into `/etc/apm/apm.conf.d/`. |
| Log (user) | `~/.apm/apm.log` | Written when APM is running as a regular user. |
| Log (service) | `/var/log/apm.log` | Written when running as the systemd service installed by `apm install`. |
| Runtime dir | `~/.apm/` | Created automatically on first run. |

### Socket access control

The CLI socket is restricted to the user that started the daemon, plus root. Other users cannot issue `apm` commands. Run the daemon as the user (or via the system service as root) that should own it; there is no config key to widen CLI access to an OS group.

---

## Nginx Integration

APM runs its own proxy layer, so Nginx sits in front as an SSL terminator and vhost router. Always set `trust_proxy true` so Vanguard sees real client IPs.
### HTTP reverse proxy

```nginx
server {
    listen 443 ssl;
    server_name myapp.example.com;

    ssl_certificate     /etc/letsencrypt/live/myapp.example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/myapp.example.com/privkey.pem;

    location / {
        proxy_pass http://127.0.0.1:3000;
        proxy_http_version 1.1;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}

# apm.conf
worker {
    name myapp;
    exec node;
    params server.js;
    server http://127.0.0.1:3000;
    trust_proxy true;
}
```

### WebSocket proxy

```nginx
location /ws {
    proxy_pass http://127.0.0.1:3001;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_read_timeout 3600s;
    proxy_send_timeout 3600s;
}
```

### HTTP + WebSocket on one domain

```nginx
# WebSocket endpoint: nginx picks the longest matching prefix, so /ws wins over / regardless of order
location /ws {
    proxy_pass http://127.0.0.1:3001;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_read_timeout 3600s;
    proxy_send_timeout 3600s;
}

location / {
    proxy_pass http://127.0.0.1:3000;
    proxy_http_version 1.1;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
}

# apm.conf
worker {
    name myapp;
    exec node;
    params server.js;
    server http://127.0.0.1:3000, ws://127.0.0.1:3001;
    trust_proxy true;
}
```

### Unix socket upstream

```nginx
upstream apm_myapp {
    server unix:/run/apm/myapp.sock;
    keepalive 32;
}

server {
    location / {
        proxy_pass http://apm_myapp;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}

# apm.conf
worker {
    name myapp;
    exec node;
    params server.js;
    server http://unix:/run/apm/myapp.sock;
    trust_proxy true;
}
```

---

## Apache Integration

Apache uses `mod_proxy`, `mod_proxy_http`, and `mod_proxy_wstunnel` for reverse proxying.

```sh
$ sudo a2enmod proxy proxy_http proxy_wstunnel rewrite headers ssl
$ sudo systemctl reload apache2
```

Always set `trust_proxy true` in APM worker config when behind Apache.

### HTTP reverse proxy

```apache
ServerName myapp.example.com

SSLEngine on
SSLCertificateFile    /etc/letsencrypt/live/myapp.example.com/fullchain.pem
SSLCertificateKeyFile /etc/letsencrypt/live/myapp.example.com/privkey.pem

ProxyPreserveHost On
ProxyPass        / http://127.0.0.1:3000/
ProxyPassReverse / http://127.0.0.1:3000/

RequestHeader set X-Forwarded-Proto "https"
RequestHeader set X-Real-IP "%{REMOTE_ADDR}e"
```

### WebSocket proxy

`mod_proxy_wstunnel` handles the `Upgrade` handshake. The `RewriteRule` rewrites the scheme to `ws://`.

```apache
RewriteEngine On
RewriteCond %{HTTP:Upgrade} websocket [NC]
RewriteRule ^/ws(/.*)?$ ws://127.0.0.1:3001/ws$1 [P,L]

ProxyPreserveHost On
ProxyPass        /ws ws://127.0.0.1:3001/ws
ProxyPassReverse /ws ws://127.0.0.1:3001/ws
```

### Unix socket upstream

Apache uses a pipe syntax: `unix:/path/to/sock|http://localhost/`.

```apache
ProxyPass        / "unix:/run/apm/myapp.sock|http://localhost/"
ProxyPassReverse / "unix:/run/apm/myapp.sock|http://localhost/"
```

Enable the site:

```sh
$ sudo a2ensite myapp
$ sudo apache2ctl configtest
$ sudo systemctl reload apache2
```

---

## IPC Wire Protocol

APM communicates with each managed worker process over stdin / stdout using a lightweight binary framing protocol — no Unix sockets, no network stack, just pipes.
The language connectors implement this for you; this section is for writing a connector for a language APM does not yet ship.

Every frame:

```
┌──────┬────────────────────┬──────────────────────┬──────┬─────────────────┐
│ 0x05 │ uint32 big-endian  │ JSON header          │ 0x03 │ binary payload  │
│ 1 B  │ 4 B                │ json_len bytes       │ 1 B  │ binary_len B    │
└──────┴────────────────────┴──────────────────────┴──────┴─────────────────┘
```

Key difference between directions:

- APM → worker: `uint32 = json_len + binary_len` (0x03 separator NOT counted)
- Worker → APM: `uint32 = json_len + 1 + binary_len` (0x03 separator IS counted)

`0x05` is the frame start marker (ENQ); `0x03` is the JSON/binary separator (ETX).

### Reading (APM → worker)

```python
import json
import struct

# buf holds the bytes read from stdin so far; resync() is left to the connector
if buf[0] != 0x05:
    resync()
payload_len = struct.unpack('>I', buf[1:5])[0]  # json_len + binary_len
frame_len = payload_len + 6                     # wait for this many bytes total
frame = buf[:frame_len]
sep = frame.index(0x03, 5)                      # first 0x03 after the length field
header = json.loads(frame[5:sep])
binary = frame[sep + 1:]
```

### Writing (worker → APM)

```python
import json
import struct
import sys

json_bytes = json.dumps(header).encode('utf-8')
length = len(json_bytes) + 1 + len(binary)  # +1 for 0x03
frame = b'\x05' + struct.pack('>I', length) + json_bytes + b'\x03' + binary
sys.stdout.buffer.write(frame)              # binary stdout, not the text wrapper
sys.stdout.buffer.flush()
```

### APM → worker JSON header fields

| Field | Type | Description |
|-------|------|-------------|
| `_sessionId` | string | Unique identifier for this connection. Present on every frame. |
| `_type` | string | `data` — body/WebSocket frame arrived. `chunk` — streaming chunk. `event` — lifecycle event. |
| `_event` | string | When `_type=event`: `connectionClosed` — peer disconnected. |
| `_sessionType` | string | `new` for a fresh connection; `moved` for a session migrated by a rolling restart. |
| `_sessionData` | object | Persisted data from a previous session (populated after rolling restart). |
| `protocol` | string | `http`, `ws`, or `tcp`. |
| `method` | string | HTTP method. Null for non-HTTP. |
| `path` | string | Request path (without query string). |
| `path_array` | array | Path split on `/`, decoded. |
| `query` | string | Raw query string. |
| `query_object` | object | Parsed query params. Multi-value keys become arrays. |
| `headers` | object | Request headers with lower-case keys. |
| `cookies` | object | Parsed cookie map. |
| `remoteAddress` | string | Comma-separated client IP chain (first value is the real client IP). |
| `dataType` | string | `text` or `binary` (on `data`/`chunk` frames). |

### Worker → APM `_command` values

Every outgoing frame must include `_session` (the session ID) and `_command`.

| Command | Extra fields | Description |
|---------|--------------|-------------|
| `write` | `dataType` (text\|binary), any HTTP response header, `x-status` (HTTP status as string) | Send HTTP response body or WebSocket frame. Headers only processed on the first `write` per session. |
| `writeRaw` | — | Send raw bytes directly to the socket, bypassing HTTP/WS framing. |
| `closeConnection` | `code` (integer), `_reason` (optional string) | Close the connection with an HTTP status or WS close frame. |
| `saveSessionData` | `_sessionData` — object to persist | Store data in the daemon; the replacement worker receives it in `_sessionData` after a rolling restart. |
| `metric` | `name` (string), `value` (number), `type` (`counter`\|`gauge`\|`timing`) | Emit a custom metric. Not tied to a session — `_session` can be empty. |

### Minimal custom connector skeleton

```python
# minimal skeleton (Python 3)
import json
import struct
import sys

stdin, stdout = sys.stdin.buffer, sys.stdout.buffer  # binary pipes
buf = b''  # bytes received from APM but not yet consumed

def send_frame(session_id, header, binary=b''):
    j = json.dumps({**header, '_session': session_id}).encode('utf-8')
    length = len(j) + 1 + len(binary)  # +1 for 0x03
    stdout.write(b'\x05' + struct.pack('>I', length) + j + b'\x03' + binary)
    stdout.flush()

def fill(n):
    """Block until buf holds at least n bytes."""
    global buf
    while len(buf) < n:
        chunk = stdin.read1(4096)  # returns as soon as some bytes arrive
        if not chunk:
            raise EOFError('stdin closed')
        buf += chunk

def read_frame():
    global buf
    fill(5)
    payload_len = struct.unpack('>I', buf[1:5])[0]
    frame_len = payload_len + 6
    fill(frame_len)
    frame, buf = buf[:frame_len], buf[frame_len:]
    sep = frame.index(0x03, 5)
    header = json.loads(frame[5:sep])
    binary = frame[sep + 1:]
    return header, binary
```
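
When developing a connector it helps to check the framing without a running daemon. The sketch below does that in memory, following the frame layout and the two length rules given in this section. `encode_from_apm` is a hypothetical test helper that imitates the daemon side; it is derived from the layout above, not taken from APM. The sample header values are made up.

```python
import json
import struct


def encode_from_apm(header, binary=b""):
    """Imitate the daemon side for tests: length excludes the 0x03 separator."""
    j = json.dumps(header).encode("utf-8")
    return b"\x05" + struct.pack(">I", len(j) + len(binary)) + j + b"\x03" + binary


def encode_from_worker(header, binary=b""):
    """Worker side: length includes the 0x03 separator."""
    j = json.dumps(header).encode("utf-8")
    return b"\x05" + struct.pack(">I", len(j) + 1 + len(binary)) + j + b"\x03" + binary


def decode_from_apm(buf):
    """Parse one APM -> worker frame.

    Returns (header, binary, rest), or None when buf does not yet hold a whole frame.
    """
    if len(buf) < 5:
        return None
    if buf[0] != 0x05:
        raise ValueError("missing 0x05 frame marker")
    payload_len = struct.unpack(">I", buf[1:5])[0]
    frame_len = payload_len + 6  # marker + 4 length bytes + separator
    if len(buf) < frame_len:
        return None
    frame, rest = buf[:frame_len], buf[frame_len:]
    sep = frame.index(0x03, 5)  # json.dumps escapes control chars, so this is the separator
    return json.loads(frame[5:sep]), frame[sep + 1:], rest


incoming = encode_from_apm(
    {"_sessionId": "s1", "_type": "data", "dataType": "binary"}, b"\x03\x00\x01")
# Two trailing bytes stand in for the start of the next frame.
header, binary, rest = decode_from_apm(incoming + b"\x05\x00")
reply = encode_from_worker(
    {"_session": header["_sessionId"], "_command": "write", "dataType": "text"}, b"ok")
```

The payload deliberately starts with a `0x03` byte: only the first `0x03` after the length field is treated as the separator, so binary data may contain that byte freely. Returning `None` for a short buffer lets the caller read more bytes and retry rather than spin on the same data.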