Health Checks
Monitoring the Keeper Gateway using health checks
Overview
This document describes the health check functionality implemented for the KeeperPAM Gateway. Health checks provide essential monitoring capabilities that solve several common operational challenges.
Health checks provide the following benefits:
Allow load balancers to automatically remove unhealthy instances from rotation and add them back when they recover.
Integrate with monitoring systems (Prometheus, Nagios, Datadog, etc.) to provide automated alerting and dashboards showing gateway health across your infrastructure.
Enable automated monitoring scripts and orchestration tools to detect failures and trigger recovery procedures without human intervention.
The health check service is disabled by default. Activate it as described in the following sections.
Simple Health Check Configuration
The configuration below enables a basic health check service for the binary and Docker installation methods. More advanced configuration options are described later in this document.
Activating Health Check - Binary Install method
Start the gateway with health check enabled:
gateway start --health-check
Once the gateway is running with the health check enabled, you can query its health:
gateway health-check
If you get an error like "Could not connect to health check server", the gateway was not started with the health check enabled.
If you see "Exception No such command 'keeper-gateway.exe'", you're using the wrong command syntax. Always use "gateway" as the command name.
Activating Health Check - Docker Install method
Modify the docker-compose.yml file to enable health checks:
Add the KEEPER_GATEWAY_HEALTH_CHECK_ENABLED environment variable.
Add the healthcheck section with the desired check intervals.
keeper-gateway:
  ...
  environment:
    ...
    KEEPER_GATEWAY_HEALTH_CHECK_ENABLED: 'true'
  healthcheck:
    test:
      - CMD
      - /usr/local/bin/keeper-gateway
      - health-check
    interval: 30s
    timeout: 10s
    retries: 3
    start_period: 60s
  ...
  restart: unless-stopped
After changing the docker-compose.yml file, apply the changes by recreating the container:
docker compose up -d
If the container name is keeper-gateway, the following one-line command reports the service status:
docker inspect --format='{{.State.Health.Status}}' keeper-gateway
If you don't know the container name, this command prints it:
docker ps --filter "status=running" --format "{{.Names}} {{.Image}}" | grep keeper-gateway | awk '{print $1}'
Here's an example of checking the health with a bash command:
$ docker inspect --format='{{.State.Health.Status}}' my-gateway-container
healthy
The complete bash script below can be added to your watchdog services. It checks the service status and automatically restarts the container if it is unhealthy:
#!/bin/bash
# Find the first running container whose name or image matches keeper-gateway
CONTAINER_NAME=$(docker ps --filter "status=running" --format "{{.Names}} {{.Image}}" | grep keeper-gateway | awk '{print $1}' | head -n 1)
if [ -z "$CONTAINER_NAME" ]; then
    echo "$(date): No running keeper-gateway container found."
    exit 1
fi
# Get health status (starting, healthy, or unhealthy)
HEALTH=$(docker inspect --format='{{.State.Health.Status}}' "$CONTAINER_NAME")
if [ "$HEALTH" = "unhealthy" ]; then
    echo "$(date): $CONTAINER_NAME is $HEALTH. Restarting..."
    docker restart "$CONTAINER_NAME"
else
    # Do not restart while the status is still "starting" (within start_period)
    echo "$(date): $CONTAINER_NAME is $HEALTH."
fi
To schedule this health check on a Linux system, add it to cron:
crontab -e
Add the following line to the crontab to run the watchdog every minute:
* * * * * /path/to/watchdog.sh >> /var/log/keeper-watchdog.log 2>&1
Create the log file and make it writable by your user:
sudo touch /var/log/keeper-watchdog.log
sudo chown $(whoami) /var/log/keeper-watchdog.log
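Advanced Health Check Configuration Examples
This section provides detailed configuration examples for customizing the health checks in different environments. Each scenario lists the gateway start command, the matching CLI health check, and the equivalent curl request.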
Basic HTTP
gateway start --health-check
gateway health-check
curl http://127.0.0.1:8099/health
HTTP with Auth
gateway start --health-check --health-check-auth-token mytoken
gateway health-check --token mytoken
curl -H "Authorization: Bearer mytoken" http://127.0.0.1:8099/health
HTTPS (SSL)
gateway start --health-check --health-check-ssl --health-check-ssl-cert /path/cert.pem --health-check-ssl-key /path/key.pem
gateway health-check --ssl
curl -k https://127.0.0.1:8099/health
HTTPS with Auth
gateway start --health-check --health-check-ssl --health-check-ssl-cert /path/cert.pem --health-check-ssl-key /path/key.pem --health-check-auth-token mytoken
gateway health-check --ssl --token mytoken
curl -k -H "Authorization: Bearer mytoken" https://127.0.0.1:8099/health
Custom Port
gateway start --health-check --health-check-port 8443
gateway health-check --port 8443
curl http://127.0.0.1:8443/health
Custom Host
gateway start --health-check --health-check-host 0.0.0.0
gateway health-check --host 0.0.0.0
curl http://0.0.0.0:8099/health
Production Setup
gateway start --health-check --health-check-host 0.0.0.0 --health-check-port 8443 --health-check-ssl --health-check-ssl-cert /etc/ssl/cert.pem --health-check-ssl-key /etc/ssl/key.pem --health-check-auth-token $(cat /etc/secrets/token)
gateway health-check --host 0.0.0.0 --port 8443 --ssl --token $(cat /etc/secrets/token)
curl -k -H "Authorization: Bearer $(cat /etc/secrets/token)" https://0.0.0.0:8443/health
Output Format Examples
Simple Status
Command: gateway health-check --ssl --token mytoken
Output: "OK: Gateway is running and connected" (exit code 0) or "CRITICAL: ..." (exit code 1)
Detailed Info
Command: gateway health-check --ssl --token mytoken --info
Output: key=value pairs suitable for monitoring scripts
JSON Format
Command: gateway health-check --ssl --token mytoken --json
Output: full JSON response matching the HTTP endpoint
Troubleshooting Commands
Check if server is running
Command: curl http://127.0.0.1:8099/health
Expected: connection success or "Connection refused"
Test SSL connectivity
Command: curl -k https://127.0.0.1:8099/health
Expected: SSL handshake success or SSL error
Test authentication
Command: curl -k -H "Authorization: Bearer wrongtoken" https://127.0.0.1:8099/health
Expected: {"error": "Invalid authentication token"}
Check server binding
Command: curl http://0.0.0.0:8099/health
Expected: success if bound to 0.0.0.0, failure if bound to 127.0.0.1
Error Messages and Troubleshooting
The CLI health check provides detailed error messages to help diagnose issues:
Authentication Errors (HTTP 401)
CRITICAL: Authentication failed when connecting to https://127.0.0.1:8099/health
ERROR: Invalid or missing authentication token.
Possible fixes:
1. Check if auth token is required:
curl -k https://127.0.0.1:8099/health
2. Provide the correct auth token:
gateway health-check --ssl --token YOUR_TOKEN
3. Check gateway startup logs for the configured token
Connection Errors
CRITICAL: Could not connect to health check server at http://127.0.0.1:8099/health
ERROR: Connection failed.
Possible causes:
1. Health check server is not running
2. Wrong host/port combination
3. Network connectivity issues
4. SSL/non-SSL mismatch
Troubleshooting steps:
1. Verify gateway is running with health check enabled:
gateway start --health-check
2. Check if server is using SSL:
gateway health-check --ssl
3. Verify host and port:
Current: 127.0.0.1:8099
4. Test with curl:
curl http://127.0.0.1:8099/health
SSL Certificate Errors
CRITICAL: SSL error connecting to health check server at https://127.0.0.1:8099/health
ERROR: SSL certificate validation failed.
Possible causes:
- Self-signed certificate (try curl with -k flag)
- Invalid certificate path
- Certificate expired
Implementation
The Gateway health check is implemented using Bottle, a lightweight WSGI micro web-framework for Python. Bottle was chosen for the following advantages:
Minimal dependency (single file, ~60KB in size)
Enhanced security over the built-in Python HTTP server
Proper request routing and handling
Better error management
Thread safety
Production-ready with minimal overhead
CLI Health Check
You can check the gateway's health from the command line:
gateway health-check
This command returns:
Exit code 0 if the gateway is healthy
Exit code 1 if the gateway is not running or not healthy
Text output indicating the status (OK/CRITICAL/WARNING)
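A monitoring wrapper can rely on the exit code and the status prefix listed above. The following Python sketch is illustrative only: the helper names parse_status_line and run_health_check are hypothetical, and it assumes the status is written to standard output as a first line of the form "LEVEL: message", as in the OK and CRITICAL examples in this document.

```python
import subprocess

KNOWN_LEVELS = ("OK", "WARNING", "CRITICAL")


def parse_status_line(line):
    """Split 'LEVEL: message' into (level, message); other input maps to 'UNKNOWN'."""
    level, sep, message = line.strip().partition(":")
    if sep and level in KNOWN_LEVELS:
        return level, message.strip()
    return "UNKNOWN", line.strip()


def run_health_check(extra_args=(), runner=subprocess.run):
    """Run `gateway health-check` and return (healthy, level, message).

    `runner` is injectable so the function can be exercised without a gateway.
    Assumption: the status line is the first line of standard output.
    """
    result = runner(
        ["gateway", "health-check", *extra_args],
        capture_output=True,
        text=True,
    )
    lines = result.stdout.strip().splitlines()
    first_line = lines[0] if lines else ""
    level, message = parse_status_line(first_line)
    # Exit code 0 means healthy; any other code is treated as not healthy.
    return result.returncode == 0, level, message
```

For example, run_health_check(["--ssl", "--token", "your_auth_token"]) passes the same options shown in the commands below to the CLI.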
For detailed output in a machine-parsable format (one key=value pair per line):
gateway health-check -i
For JSON format output (matching the HTTP endpoint format):
gateway health-check -j
If your health check server is using SSL:
gateway health-check --ssl
If your health check server requires authentication:
gateway health-check --ssl --token your_auth_token
If your health check server is running on a non-default port:
gateway health-check --port 8123
If your health check server is running on a different host:
gateway health-check --host 10.0.0.5
The detailed output includes:
Gateway version
Connection status
WebSocket metrics (when available)
Process information (in background mode)
This makes it suitable for monitoring scripts and cron jobs.
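As a rough illustration of consuming the detailed output in a script, the Python sketch below parses one key=value pair per line into a dictionary. The function name parse_info_output and the sample keys are assumptions made for this example; the actual key names emitted by the gateway may differ.

```python
def parse_info_output(text):
    """Parse one key=value pair per line into a dict, skipping blank or malformed lines."""
    pairs = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or "=" not in line:
            continue
        # Split on the first "=" only, so values may themselves contain "=".
        key, _, value = line.partition("=")
        pairs[key.strip()] = value.strip()
    return pairs


# Hypothetical sample; the real key names may differ.
sample = "status=healthy\nconnection_status=connected\nuptime_human=1m 25s\n"
info = parse_info_output(sample)
print(info["connection_status"])
```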
Note: The CLI health check command requires the HTTP health check server to be running. If the server is not running, the command returns an error message suggesting that you enable it.
HTTP Health Check
The gateway includes a secure HTTP health check endpoint that can be enabled with environment variables or command line arguments.
Configuration
The health check server can be configured using environment variables or command line arguments:
Environment Variables
KEEPER_GATEWAY_HEALTH_CHECK_ENABLED Enable HTTP health check (1, true, yes) (default: disabled)
KEEPER_GATEWAY_HEALTH_CHECK_PORT Port for HTTP server (default: 8099)
KEEPER_GATEWAY_HEALTH_CHECK_HOST Host address to bind to (default: 127.0.0.1)
KEEPER_GATEWAY_HEALTH_CHECK_AUTH_TOKEN Authentication token for requests (default: none)
KEEPER_GATEWAY_HEALTH_CHECK_USE_SSL Enable SSL (1, true, yes) (default: disabled)
KEEPER_GATEWAY_HEALTH_CHECK_SSL_CERT Path to SSL certificate (default: none)
KEEPER_GATEWAY_HEALTH_CHECK_SSL_KEY Path to SSL private key (default: none)
Command Line Arguments
When starting the gateway, you can also use these command line arguments:
--health-check Enable the health check server
--health-check-port INT Port for the health check server (default: 8099)
--health-check-host STRING Host address to bind to (default: 127.0.0.1)
--health-check-auth-token Auth token for the health check server
--health-check-ssl Enable SSL for the health check server
--health-check-ssl-cert Path to SSL certificate
--health-check-ssl-key Path to SSL private key
Command line arguments take precedence over environment variables when both are specified.
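The precedence rule can be pictured with a small Python sketch. This is not the gateway's actual code: resolve_settings is a hypothetical helper, only three settings are covered, and treating the truthy values 1, true, and yes as case-insensitive is an assumption.

```python
TRUTHY = {"1", "true", "yes"}

# Defaults as documented above.
DEFAULTS = {"enabled": False, "port": 8099, "host": "127.0.0.1"}

PREFIX = "KEEPER_GATEWAY_HEALTH_CHECK_"


def resolve_settings(cli_args, environ):
    """Merge settings with precedence: CLI argument > environment variable > default.

    `cli_args` holds only the options the user actually passed on the command line.
    """
    settings = dict(DEFAULTS)
    if PREFIX + "ENABLED" in environ:
        settings["enabled"] = environ[PREFIX + "ENABLED"].strip().lower() in TRUTHY
    if PREFIX + "PORT" in environ:
        settings["port"] = int(environ[PREFIX + "PORT"])
    if PREFIX + "HOST" in environ:
        settings["host"] = environ[PREFIX + "HOST"]
    # Command line values are applied last, so they win over the environment.
    settings.update(cli_args)
    return settings
```

With KEEPER_GATEWAY_HEALTH_CHECK_PORT=9000 in the environment and --health-check-port 8443 on the command line, this sketch resolves the port to 8443.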
Example Commands
Basic health check with default settings:
gateway start --health-check
Custom port and authentication token:
gateway start --health-check --health-check-port 9000 --health-check-auth-token mysecrettoken
Bind to all interfaces (only in secure environments):
gateway start --health-check --health-check-host 0.0.0.0
Enable SSL with certificate and key:
gateway start --health-check --health-check-ssl --health-check-ssl-cert /path/to/cert.pem --health-check-ssl-key /path/to/key.pem
Complete example with all options:
gateway start --health-check --health-check-port 8443 --health-check-host 10.0.0.5 --health-check-auth-token mysecrettoken --health-check-ssl --health-check-ssl-cert /path/to/cert.pem --health-check-ssl-key /path/to/key.pem
Usage
When enabled, the HTTP health check endpoint will be available at:
http://localhost:8099/health
Or with SSL:
https://localhost:8099/health
Response Format
The endpoint returns:
HTTP 200 if the gateway is healthy
HTTP 503 if the gateway is not healthy
JSON response with details:
{
  "status": "healthy",
  "message": "Gateway is running and connected",
  "details": {
    "timestamp": 1742849941,
    "version": 1,
    "connection_status": "connected",
    "websocket": {
      "uptime_seconds": 85,
      "uptime_human": "1m 25s",
      "last_ping_received_seconds_ago": 10,
      "latency_ms": 75,
      "last_ping_sent_timestamp": 1742850455,
      "last_pong_received_timestamp": 1742850455
    }
  }
}
The response includes:
status: Overall health status ("healthy" or "unhealthy")
message: Human-readable description of the status
details: Detailed information about the gateway
  timestamp: Current server timestamp
  version: API version
  connection_status: Current connection status ("connected", "disconnected", etc.)
  websocket: WebSocket connection metrics
    uptime_seconds: WebSocket connection uptime in seconds
    uptime_human: Human-readable uptime (e.g., "1m 25s")
    last_ping_received_seconds_ago: Seconds since the last ping was received
    latency_ms: Round-trip latency of the last ping-pong in milliseconds
    last_ping_sent_timestamp: Unix timestamp when the last ping was sent
    last_pong_received_timestamp: Unix timestamp when the last pong was received
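A client that polls the endpoint has to handle both the status code and the JSON body. The Python sketch below uses only the standard library; the function names are hypothetical. One detail worth noting is that urllib raises HTTPError for a 503 response, so the body is read from the exception in that case.

```python
import json
import urllib.error
import urllib.request


def build_request(url, token=None):
    """Create a GET request, adding the Bearer token when one is configured."""
    request = urllib.request.Request(url)
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    return request


def interpret(status_code, body):
    """Return (healthy, payload) from an HTTP status code and a response body."""
    try:
        payload = json.loads(body) if body else {}
    except ValueError:
        payload = {}  # body was not valid JSON
    if not isinstance(payload, dict):
        payload = {}
    healthy = status_code == 200 and payload.get("status") == "healthy"
    return healthy, payload


def fetch_health(url="http://127.0.0.1:8099/health", token=None, timeout=5):
    """Query the endpoint; non-2xx responses such as 503 arrive as HTTPError."""
    try:
        with urllib.request.urlopen(build_request(url, token), timeout=timeout) as resp:
            return interpret(resp.status, resp.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        return interpret(err.code, err.read().decode("utf-8"))
```

Connection failures (urllib.error.URLError) are deliberately left to propagate so the caller can distinguish "not reachable" from "reachable but unhealthy".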
Example Responses
Healthy Gateway:
{
  "status": "healthy",
  "message": "Gateway is running and connected",
  "details": {
    "timestamp": 1742849941,
    "version": 1,
    "connection_status": "connected",
    "websocket": {
      "uptime_seconds": 85,
      "uptime_human": "1m 25s",
      "last_ping_received_seconds_ago": 10,
      "latency_ms": 75,
      "last_ping_sent_timestamp": 1742850455,
      "last_pong_received_timestamp": 1742850455
    }
  }
}
Unhealthy Gateway:
{
  "status": "unhealthy",
  "message": "Gateway is not properly connected (status: reconnecting)",
  "details": {
    "timestamp": 1742850874,
    "version": 1,
    "connection_status": "reconnecting",
    "websocket": {
      "uptime_seconds": 1018,
      "uptime_human": "16m 58s",
      "last_ping_received_seconds_ago": 324,
      "latency_ms": 77
    }
  }
}
Note that some metrics, such as latency_ms, last_ping_sent_timestamp, and last_pong_received_timestamp, may not always be present in the response. The availability of these metrics depends on the current state of the WebSocket connection and the timing of ping/pong messages.
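Because some metrics can be missing, code that reads the response should not index into it directly. The following Python sketch shows one defensive approach; summarize_websocket is a hypothetical helper, and the 120-second staleness threshold is an arbitrary value chosen for illustration.

```python
def summarize_websocket(payload, max_ping_age_seconds=120):
    """Summarize WebSocket metrics, tolerating fields that are absent."""
    websocket = payload.get("details", {}).get("websocket", {})
    ping_age = websocket.get("last_ping_received_seconds_ago")
    return {
        "latency_ms": websocket.get("latency_ms"),  # None when not reported
        "ping_age_seconds": ping_age,
        # Only flag staleness when the metric is actually present.
        "ping_stale": ping_age is not None and ping_age > max_ping_age_seconds,
    }
```

Applied to the unhealthy example above, this yields a latency of 77 ms, a ping age of 324 seconds, and ping_stale set to True.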
Status Update Delays
The health check reflects the current state of the WebSocket connection, but there may be a delay in status updates.
Delayed Status Updates
When connectivity is lost, it may take up to 2 minutes for the health check to report an "unhealthy" status, as the gateway attempts to reconnect. Similarly, when connectivity is restored, it may take up to 2 minutes for the health check to reflect a "healthy" status.
This latency is intentional and allows the gateway to attempt reconnection without immediately reporting failures for transient connectivity issues.
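When tuning alerts, keep in mind that any debounce in your monitoring adds to this delay. The Python sketch below is one possible way to reason about it, not part of the gateway: FailureTracker and worst_case_detection_seconds are hypothetical names, and the estimate simply adds the 2-minute reporting delay to the time needed to observe a given number of consecutive failed polls.

```python
class FailureTracker:
    """Signal an alert only after `threshold` consecutive unhealthy polls."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.consecutive_failures = 0

    def record(self, healthy):
        """Record one poll result and return True when an alert should fire."""
        if healthy:
            self.consecutive_failures = 0
            return False
        self.consecutive_failures += 1
        # Fire exactly once, on the poll that reaches the threshold.
        return self.consecutive_failures == self.threshold


def worst_case_detection_seconds(poll_interval, threshold, gateway_delay=120):
    """Rough upper bound: reporting delay plus `threshold` polling intervals."""
    return gateway_delay + poll_interval * threshold
```

For instance, polling every 30 seconds with a threshold of 3 gives a rough worst case of 210 seconds between losing connectivity and raising an alert.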
Security
The HTTP health check includes the following security features:
Authentication: When KEEPER_GATEWAY_HEALTH_CHECK_AUTH_TOKEN is set, requests must include the token in the Authorization header: Authorization: Bearer <token>
SSL/TLS: When SSL is enabled, all communication is encrypted. You must provide a valid certificate and private key.
Localhost binding: The server binds to localhost only by default, not exposing the endpoint over the network.
Security Headers: The health check server adds the following security headers to responses:
X-Content-Type-Options: nosniff
X-Frame-Options: DENY
Content-Security-Policy: default-src 'none'
Rate Limiting: Automatic rate limiting is applied to non-localhost connections (60 requests per minute per IP).
Information Protection: When the server is bound to a non-localhost address, sensitive information is automatically redacted from the response.
Forced SSL: SSL is automatically enforced when binding to non-localhost interfaces.
TLS Compatibility
The health check server is configured to support a wide range of clients by:
Using secure TLS defaults (TLS 1.2+ minimum) for maximum security
Supporting modern cipher suites for strong encryption
Automatically handling protocol negotiation for HTTP and HTTPS
For clients that support modern TLS versions, use standard curl commands:
curl -k -H "Authorization: Bearer your_token" https://localhost:8099/health
Docker-Specific Configuration Requirements
When running Keeper Gateway inside Docker, special configuration may be required to expose the health check to the host or external systems:
Binding to 0.0.0.0
The health check server must bind to 0.0.0.0 to be reachable outside the container. Binding to 127.0.0.1 restricts access to within the container only.
SSL Enforcement
When using 0.0.0.0, Keeper Gateway forces SSL to protect health check data. You must provide a valid certificate and key or the server will not start.
Authentication Requirement
If binding to 0.0.0.0, you must also specify an AUTH_TOKEN to secure the endpoint.
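These three requirements can be checked before deploying a configuration. The Python sketch below is a hypothetical pre-flight check written for this document, not the gateway's own validation logic; validate_binding and the set of addresses treated as local are assumptions.

```python
LOCAL_HOSTS = {"127.0.0.1", "localhost", "::1"}


def validate_binding(host, ssl_cert=None, ssl_key=None, auth_token=None):
    """Return a list of problems for a health check configuration (empty if none)."""
    problems = []
    if host in LOCAL_HOSTS:
        # Localhost-only binding has no extra requirements in this sketch.
        return problems
    if not (ssl_cert and ssl_key):
        problems.append("non-localhost binding needs an SSL certificate and key")
    if not auth_token:
        problems.append("non-localhost binding needs an auth token")
    return problems
```

Calling validate_binding("0.0.0.0") reports both problems, while supplying a certificate, key, and token returns an empty list.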
Docker Compose Example
services:
  keeper-gateway:
    image: keeper/gateway:latest
    ports:
      - "8099:8099"
    volumes:
      - ./certs:/certs:ro
    environment:
      KEEPER_GATEWAY_HEALTH_CHECK_ENABLED: "true"
      KEEPER_GATEWAY_HEALTH_CHECK_HOST: "0.0.0.0"
      KEEPER_GATEWAY_HEALTH_CHECK_PORT: 8099
      KEEPER_GATEWAY_HEALTH_CHECK_USE_SSL: "true"
      KEEPER_GATEWAY_HEALTH_CHECK_SSL_CERT: /certs/healthcheck.crt
      KEEPER_GATEWAY_HEALTH_CHECK_SSL_KEY: /certs/healthcheck.key
      KEEPER_GATEWAY_HEALTH_CHECK_AUTH_TOKEN: mysecrettoken
Generate a Self-Signed Certificate
mkdir -p certs
openssl req -x509 -nodes -days 365 \
-newkey rsa:2048 \
-keyout certs/healthcheck.key \
-out certs/healthcheck.crt \
-subj "/CN=localhost"
Test the Endpoint from the Host
curl -k -H "Authorization: Bearer mysecrettoken" https://localhost:8099/health
Example Linux Configuration
# Enable HTTP health check
export KEEPER_GATEWAY_HEALTH_CHECK_ENABLED=true
export KEEPER_GATEWAY_HEALTH_CHECK_PORT=8099
export KEEPER_GATEWAY_HEALTH_CHECK_AUTH_TOKEN=mysecrettoken
# Start the gateway
gateway start
Or using command line arguments:
gateway start --health-check --health-check-port 8099 --health-check-auth-token mysecrettoken
Self-Signed SSL Certificates
For testing or internal use, you can generate self-signed certificates to enable SSL/TLS encryption:
# Generate a private key
openssl genrsa -out healthcheck.key 2048
# Generate a certificate signing request (CSR)
openssl req -new -key healthcheck.key -out healthcheck.csr -subj "/CN=localhost"
# Generate a self-signed certificate (valid for 365 days)
openssl x509 -req -days 365 -in healthcheck.csr -signkey healthcheck.key -out healthcheck.crt
# Set the environment variables
export KEEPER_GATEWAY_HEALTH_CHECK_ENABLED=true
export KEEPER_GATEWAY_HEALTH_CHECK_USE_SSL=true
export KEEPER_GATEWAY_HEALTH_CHECK_SSL_CERT=/path/to/healthcheck.crt
export KEEPER_GATEWAY_HEALTH_CHECK_SSL_KEY=/path/to/healthcheck.key
export KEEPER_GATEWAY_HEALTH_CHECK_PORT=8443 # Typical HTTPS port
export KEEPER_GATEWAY_HEALTH_CHECK_AUTH_TOKEN=mysecrettoken
# Start the gateway
gateway start
Or using command line arguments:
gateway start --health-check --health-check-port 8443 --health-check-ssl --health-check-ssl-cert /path/to/healthcheck.crt --health-check-ssl-key /path/to/healthcheck.key --health-check-auth-token mysecrettoken
When using self-signed certificates, your HTTP client will need to be configured to trust the certificate or ignore SSL verification (not recommended for production).
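For a Python client, the two options look roughly like the sketch below, which uses the standard ssl module. The function name make_ssl_context is hypothetical; trusting the certificate file is the preferred path, and the insecure mode mirrors curl -k for testing only.

```python
import ssl


def make_ssl_context(ca_file=None, insecure=False):
    """Build a client-side TLS context for the health check endpoint.

    ca_file:  path to the self-signed certificate to trust (preferred).
    insecure: skip verification entirely, like `curl -k` (testing only).
    """
    if insecure:
        context = ssl.create_default_context()
        context.check_hostname = False  # must be disabled before CERT_NONE
        context.verify_mode = ssl.CERT_NONE
        return context
    # With cafile set, the given certificate is used as the trust anchor;
    # with cafile=None, the system's default CA certificates are loaded.
    return ssl.create_default_context(cafile=ca_file)
```

The resulting context can be passed to urllib.request.urlopen(url, context=...). When verification is enabled, the host name in the URL must match the certificate.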
Monitoring Integration
This endpoint can be used with monitoring systems like:
Prometheus with blackbox exporter
Nagios/Icinga
Zabbix
Datadog
AWS CloudWatch
Any monitoring system that can perform HTTP checks
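As one example of integration, Nagios-style plugins use exit codes 0 (OK), 1 (WARNING), 2 (CRITICAL), and 3 (UNKNOWN). Since the gateway CLI exits with 1 when unhealthy, which such systems would read as a warning, a thin wrapper can remap the JSON status. The Python sketch below is a hypothetical adapter, not something shipped with the gateway.

```python
NAGIOS_OK, NAGIOS_WARNING, NAGIOS_CRITICAL, NAGIOS_UNKNOWN = 0, 1, 2, 3


def to_nagios(payload):
    """Map a health check JSON payload to (exit_code, status_line)."""
    status = payload.get("status")
    message = payload.get("message", "no message")
    if status == "healthy":
        return NAGIOS_OK, f"OK - {message}"
    if status == "unhealthy":
        return NAGIOS_CRITICAL, f"CRITICAL - {message}"
    # Anything else (missing field, unexpected value) is reported as UNKNOWN.
    return NAGIOS_UNKNOWN, f"UNKNOWN - unexpected status {status!r}"
```

A plugin script would print the status line and exit with the returned code.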