Source file: 07-operations-runbook.md
# 07. Operations Runbook
# Check service status
sudo systemctl status node-site@<domain>.service --no-pager
sudo journalctl -u node-site@<domain>.service -n 100 --no-pager
# Check local listener
sudo ss -lntp | grep <port>
# Check local health endpoint
curl -fsS http://127.0.0.1:<port>/healthz
# Check public HTTPS endpoint
curl -fsS https://<domain>/healthz
# Inspect release layout
find /var/www/nodeXX/<domain> -maxdepth 2 -mindepth 1 -print | sort
readlink -f /var/www/nodeXX/<domain>/current
# Re-seed shared env
sudo init-node-shared-env --domain <domain> --file .env.local --force --set KEY=VALUE
# Redeploy current repo
sudo cicd-deploy-node-site --domain <domain>
# Roll back to previous release
sudo rollback-node-release --domain <domain>
# Roll back to a specific release
sudo rollback-node-release --domain <domain> --release 20260602151802
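Before rolling back, it can help to see which release sits just before the active one. Timestamp-style names such as 20260602151802 sort chronologically, so a plain sort is enough. The sketch below is illustrative only: the `releases/` and `current` layout under the site root and the helper name `previous_release` are assumptions, and `rollback-node-release` may pick its target differently.

```shell
# previous_release SITE_ROOT
# Prints the release name that sorts immediately before the one `current`
# points to. Assumes SITE_ROOT/releases/<timestamp> and SITE_ROOT/current.
# Prints nothing if `current` is already the oldest release.
previous_release() {
  pr_current=$(basename "$(readlink -f "$1/current")")
  ls -1 "$1/releases" | sort | awk -v cur="$pr_current" '
    # Compare as strings so long numeric names are never rounded.
    ($0 "") == (cur "") { if (prev != "") print prev; exit }
    { prev = $0 }
  '
}
```

Called as `previous_release /var/www/node22/example.com` (a hypothetical site root), the printed name could then be passed to `--release` explicitly.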
# Manual fallback deploy
sudo deploy-node-release \
--site-root /var/www/node22/<domain> \
--repo /srv/git/<domain> \
--ref main \
--node-version 22 \
--service node-site@<domain>.service \
--shared-link .env.local \
--healthcheck-url http://127.0.0.1:<port>/healthz
# Common failure points
# Healthcheck fails right after restart
Common cause:
- app startup is slower than healthcheck attempts
Action:
- inspect `journalctl`
- increase app startup readiness or adjust wrapper retry window later if needed
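To confirm that the failure is only a timing issue, the health endpoint can be polled by hand with a longer retry window than the deploy wrapper uses. This is a minimal sketch; the helper name `retry` and its arguments are hypothetical and are not part of the deploy tooling.

```shell
# retry ATTEMPTS DELAY COMMAND...
# Reruns COMMAND until it succeeds or ATTEMPTS is exhausted.
# Returns 0 on the first success, 1 if every attempt failed.
retry() {
  retry_attempts=$1
  retry_delay=$2
  shift 2
  retry_i=1
  while [ "$retry_i" -le "$retry_attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Do not sleep after the final failed attempt.
    if [ "$retry_i" -lt "$retry_attempts" ]; then
      sleep "$retry_delay"
    fi
    retry_i=$((retry_i + 1))
  done
  return 1
}
```

For example, `retry 30 1 curl -fsS -o /dev/null "http://127.0.0.1:$PORT/healthz"` (with `PORT` set to the app's port) allows roughly 30 seconds of startup time. If this succeeds while the deploy healthcheck fails, the wrapper's window is probably too short for this app.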
# Rollback fails
Check:
- target release still exists under `releases/`
- service starts on the older release
- shared links are still valid
- old release still passes healthcheck
Note:
`rollback-node-release` automatically restores the previous `current` if the rollback target fails healthcheck
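The first and third checks can be scripted before attempting the rollback. The sketch below assumes that releases live in `<site-root>/releases/<id>` and that each release holds a `.env.local` symlink into `shared/`; the helper name `check_release` is hypothetical.

```shell
# check_release SITE_ROOT RELEASE_ID
# Verifies that the release directory exists and that its .env.local
# link, if present, is not dangling. Prints a one-line verdict.
check_release() {
  cr_dir="$1/releases/$2"
  if [ ! -d "$cr_dir" ]; then
    echo "missing release directory: $cr_dir"
    return 1
  fi
  # -L is true for any symlink; -e follows it, so a dangling link fails -e.
  if [ -L "$cr_dir/.env.local" ] && [ ! -e "$cr_dir/.env.local" ]; then
    echo "dangling shared link: $cr_dir/.env.local"
    return 1
  fi
  echo "release looks intact: $cr_dir"
}
```

This only covers what can be seen on disk. Whether the service starts and passes its healthcheck on the older release still has to be verified by running it.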
# Nginx proxy returns 502
Check:
- service is running
- port is listening
- SELinux port is labeled `http_port_t`
- `httpd_can_network_connect` is on
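The two SELinux checks can be read off the output of `semanage port -l` and `getsebool httpd_can_network_connect`. The filters below are a rough sketch that assumes the usual output formats of those tools (`http_port_t  tcp  80, 81, 443, ...` and `httpd_can_network_connect --> on`); the helper names are hypothetical.

```shell
# port_is_labeled PORT
# Reads `semanage port -l` output on stdin and succeeds if PORT is listed
# (directly or inside a range) on the tcp line for http_port_t.
port_is_labeled() {
  awk -v port="$1" '
    $1 == "http_port_t" && $2 == "tcp" {
      for (i = 3; i <= NF; i++) {
        entry = $i
        sub(/,$/, "", entry)               # drop the list separator
        n = split(entry, range, "-")
        if (n == 1 && entry + 0 == port + 0) found = 1
        if (n == 2 && port + 0 >= range[1] + 0 && port + 0 <= range[2] + 0) found = 1
      }
    }
    END { exit !found }
  '
}

# is_bool_on
# Reads `getsebool NAME` output on stdin and succeeds if the boolean is on.
is_bool_on() {
  grep -q -- '--> on$'
}
```

Typical use would be `sudo semanage port -l | port_is_labeled "$PORT"` and `getsebool httpd_can_network_connect | is_bool_on`, each followed by a check of `$?`.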
# Env values not visible in app
Check:
- `shared/.env.local` exists
- current release has symlink to `.env.local`
- app actually reads process env variables
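The file-level part of these checks can be combined into one helper. This sketch assumes the layout `<site-root>/shared/.env.local` and `<site-root>/current/.env.local`; the helper name `env_key_visible` is hypothetical.

```shell
# env_key_visible SITE_ROOT KEY
# Succeeds only if shared/.env.local exists, the current release links
# to that same file, and KEY is defined in it.
env_key_visible() {
  ev_shared="$1/shared/.env.local"
  ev_link="$1/current/.env.local"
  [ -f "$ev_shared" ] || { echo "missing $ev_shared"; return 1; }
  # The release link must resolve to the shared file, not a stale copy.
  [ "$(readlink -f "$ev_link")" = "$(readlink -f "$ev_shared")" ] ||
    { echo "$ev_link does not point at $ev_shared"; return 1; }
  grep -q "^$2=" "$ev_shared" ||
    { echo "$2 is not set in $ev_shared"; return 1; }
  echo "$2 is defined and linked"
}
```

A passing result only shows that the value is on disk and linked. Whether the app loads `.env.local` or reads only the process environment still has to be confirmed in the app itself.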
# CI deploy denied by sudo
Check:
sudo -u cicd sudo -n -l
sudo visudo -c