Health Checks
Monitoring the Keeper Gateway using health checks
Overview
This document describes the health check functionality implemented for the KeeperPAM Gateway. Health checks provide essential monitoring capabilities that solve several common operational challenges. Health checks require the Keeper Gateway 1.5.5 or newer.
Docker Container Lifecycle Management
Problem: When a gateway goes offline or loses connection, Docker containers continue running and report as "healthy" even though the gateway inside is not functioning properly. This creates misleading container status and prevents proper automated recovery.
Solution: With health checks enabled, Docker can properly monitor the gateway's actual operational status and automatically restart or terminate containers when the gateway is unhealthy.
Load Balancer Integration
Problem: Load balancers need to know which gateway instances are actively handling connections to route traffic appropriately.
Solution: Health check endpoints allow load balancers to automatically remove unhealthy instances from rotation and add them back when they recover.
Monitoring and Alerting
Problem: Operations teams need real-time visibility into gateway status across multiple deployments and environments.
Solution: Health checks integrate with monitoring systems (Prometheus, Nagios, Datadog, etc.) to provide automated alerting and dashboards showing gateway health across your infrastructure.
Service Orchestration
Problem: In Kubernetes or other orchestration platforms, services need to know when gateways are ready to accept connections and when they should be restarted.
Solution: Health checks provide the necessary endpoints for readiness probes, liveness probes, and automated restart policies.
Automated Recovery
Problem: Manual intervention is required to detect and restart failed gateway instances.
Solution: Health checks enable automated monitoring scripts and orchestration tools to detect failures and trigger recovery procedures without human intervention.
IMPORTANT: The health check server is disabled by default.
You must explicitly enable it with the --health-check
flag or by setting the environment variable KEEPER_GATEWAY_HEALTH_CHECK_ENABLED=true
when starting the gateway.
Quick Start / Common Issues
Step 1: Start the gateway with health check enabled:
gateway start --health-check
Step 2: Only after the gateway is running with health check enabled, you can check its health:
gateway health-check
If you get an error like "Could not connect to health check server", it means you haven't completed Step 1.
If you see "Exception No such command 'keeper-gateway.exe'", you're using the wrong command syntax. Always use gateway as the command name.
Complete Configuration Examples
Here are comprehensive examples showing how to start the gateway and check its health in different configurations:
Basic HTTP
gateway start --health-check
gateway health-check
curl http://127.0.0.1:8099/health
HTTP with Auth
gateway start --health-check --health-check-auth-token mytoken
gateway health-check --token mytoken
curl -H "Authorization: Bearer mytoken" http://127.0.0.1:8099/health
HTTPS (SSL)
gateway start --health-check --health-check-ssl --health-check-ssl-cert /path/cert.pem --health-check-ssl-key /path/key.pem
gateway health-check --ssl
curl -k https://127.0.0.1:8099/health
HTTPS with Auth
gateway start --health-check --health-check-ssl --health-check-ssl-cert /path/cert.pem --health-check-ssl-key /path/key.pem --health-check-auth-token mytoken
gateway health-check --ssl --token mytoken
curl -k -H "Authorization: Bearer mytoken" https://127.0.0.1:8099/health
Custom Port
gateway start --health-check --health-check-port 8443
gateway health-check --port 8443
curl http://127.0.0.1:8443/health
Custom Host
gateway start --health-check --health-check-host 0.0.0.0
gateway health-check --host 0.0.0.0
curl http://0.0.0.0:8099/health
Production Setup
gateway start --health-check --health-check-host 0.0.0.0 --health-check-port 8443 --health-check-ssl --health-check-ssl-cert /etc/ssl/cert.pem --health-check-ssl-key /etc/ssl/key.pem --health-check-auth-token $(cat /etc/secrets/token)
gateway health-check --host 0.0.0.0 --port 8443 --ssl --token $(cat /etc/secrets/token)
curl -k -H "Authorization: Bearer $(cat /etc/secrets/token)" https://0.0.0.0:8443/health
Output Format Examples
Simple Status
gateway health-check --ssl --token mytoken
Returns OK: Gateway is running and connected (exit code 0) or CRITICAL: ... (exit code 1)
Detailed Info
gateway health-check --ssl --token mytoken --info
Key=value pairs suitable for monitoring scripts
JSON Format
gateway health-check --ssl --token mytoken --json
Full JSON response matching HTTP endpoint
Troubleshooting Commands
Check if server is running
curl http://127.0.0.1:8099/health
Connection success or "Connection refused"
Test SSL connectivity
curl -k https://127.0.0.1:8099/health
SSL handshake success or SSL error
Test authentication
curl -k -H "Authorization: Bearer wrongtoken" https://127.0.0.1:8099/health
{"error": "Invalid authentication token"}
Check server binding
curl http://0.0.0.0:8099/health
Success if bound to 0.0.0.0, failure if bound to 127.0.0.1
Error Messages and Troubleshooting
The CLI health check provides detailed error messages to help diagnose issues:
Authentication Errors (HTTP 401)
CRITICAL: Authentication failed when connecting to https://127.0.0.1:8099/health
ERROR: Invalid or missing authentication token.
Possible fixes:
1. Check if auth token is required:
curl -k https://127.0.0.1:8099/health
2. Provide the correct auth token:
gateway health-check --ssl --token YOUR_TOKEN
3. Check gateway startup logs for the configured token
Connection Errors
CRITICAL: Could not connect to health check server at http://127.0.0.1:8099/health
ERROR: Connection failed.
Possible causes:
1. Health check server is not running
2. Wrong host/port combination
3. Network connectivity issues
4. SSL/non-SSL mismatch
Troubleshooting steps:
1. Verify gateway is running with health check enabled:
gateway start --health-check
2. Check if server is using SSL:
gateway health-check --ssl
3. Verify host and port:
Current: 127.0.0.1:8099
4. Test with curl:
curl http://127.0.0.1:8099/health
SSL Certificate Errors
CRITICAL: SSL error connecting to health check server at https://127.0.0.1:8099/health
ERROR: SSL certificate validation failed.
Possible causes:
- Self-signed certificate (try curl with -k flag)
- Invalid certificate path
- Certificate expired
Implementation
The Gateway health check is implemented using Bottle, a lightweight WSGI micro web-framework for Python. Bottle was chosen for the following advantages:
Minimal dependency (single file, ~60KB in size)
Enhanced security over the built-in Python HTTP server
Proper request routing and handling
Better error management
Thread safety
Production-ready with minimal overhead
CLI Health Check
You can check the gateway's health from the command line:
gateway health-check
This command returns:
Exit code 0 if the gateway is healthy
Exit code 1 if the gateway is not running or not healthy
Text output indicating the status (OK/CRITICAL/WARNING)
For detailed output in a machine-parsable format (one key=value pair per line):
gateway health-check -i
For JSON format output (matching the HTTP endpoint format):
gateway health-check -j
If your health check server is using SSL:
gateway health-check --ssl
If your health check server requires authentication:
gateway health-check --ssl --token your_auth_token
If your health check server is running on a non-default port:
gateway health-check --port 8123
If your health check server is running on a different host:
gateway health-check --host 10.0.0.5
The detailed output includes:
Gateway version
Connection status
WebSocket metrics (when available)
Process information (in background mode)
This makes it suitable for monitoring scripts and cron jobs.
Note: The CLI health check command requires the HTTP health check server to be running. If the health check server is not running, the command will return an error message suggesting to enable the health check server.
HTTP Health Check
The gateway includes a secure HTTP health check endpoint that can be enabled with environment variables or command line arguments.
Configuration
The health check server can be configured using environment variables or command line arguments:
Environment Variables
KEEPER_GATEWAY_HEALTH_CHECK_ENABLED
Enable HTTP health check (1, true, yes)
Disabled
KEEPER_GATEWAY_HEALTH_CHECK_PORT
Port for HTTP server
8099
KEEPER_GATEWAY_HEALTH_CHECK_HOST
Host address to bind to
127.0.0.1
KEEPER_GATEWAY_HEALTH_CHECK_AUTH_TOKEN
Authentication token for requests
None
KEEPER_GATEWAY_HEALTH_CHECK_USE_SSL
Enable SSL (1, true, yes)
Disabled
KEEPER_GATEWAY_HEALTH_CHECK_SSL_CERT
Path to SSL certificate
None
KEEPER_GATEWAY_HEALTH_CHECK_SSL_KEY
Path to SSL private key
None
Command Line Arguments
When starting the gateway, you can also use these command line arguments:
--health-check Enable the health check server
--health-check-port INT Port for the health check server (default: 8099)
--health-check-host STRING Host address to bind to (default: 127.0.0.1)
--health-check-auth-token Auth token for the health check server
--health-check-ssl Enable SSL for the health check server
--health-check-ssl-cert Path to SSL certificate
--health-check-ssl-key Path to SSL private key
Command line arguments take precedence over environment variables when both are specified.
Example Commands
Basic health check with default settings:
gateway start --health-check
Custom port and authentication token:
gateway start --health-check --health-check-port 9000 --health-check-auth-token mysecrettoken
Bind to all interfaces (only in secure environments):
gateway start --health-check --health-check-host 0.0.0.0
Enable SSL with certificate and key:
gateway start --health-check --health-check-ssl --health-check-ssl-cert /path/to/cert.pem --health-check-ssl-key /path/to/key.pem
Complete example with all options:
gateway start --health-check --health-check-port 8443 --health-check-host 10.0.0.5 --health-check-auth-token mysecrettoken --health-check-ssl --health-check-ssl-cert /path/to/cert.pem --health-check-ssl-key /path/to/key.pem
Usage
When enabled, the HTTP health check endpoint will be available at:
http://localhost:8099/health
Or with SSL:
https://localhost:8099/health
Response Format
The endpoint returns:
HTTP 200 if the gateway is healthy
HTTP 503 if the gateway is not healthy
JSON response with details:
{
"status": "healthy",
"message": "Gateway is running and connected",
"details": {
"timestamp": 1742849941,
"version": 1,
"connection_status": "connected",
"websocket": {
"uptime_seconds": 85,
"uptime_human": "1m 25s",
"last_ping_received_seconds_ago": 10,
"latency_ms": 75,
"last_ping_sent_timestamp": 1742850455,
"last_pong_received_timestamp": 1742850455
}
}
}
The response includes:
status: Overall health status ("healthy" or "unhealthy")
message: Human-readable description of the status
details: Detailed information about the gateway
timestamp: Current server timestamp
version: API version
connection_status: Current connection status ("connected", "disconnected", etc.)
websocket: WebSocket connection metrics
uptime_seconds: WebSocket connection uptime in seconds
uptime_human: Human-readable uptime (e.g., "1m 25s")
last_ping_received_seconds_ago: Seconds since the last ping was received
latency_ms: Round-trip latency of the last ping-pong in milliseconds
last_ping_sent_timestamp: Unix timestamp when the last ping was sent
last_pong_received_timestamp: Unix timestamp when the last pong was received
Example Responses
Healthy Gateway:
{
"status": "healthy",
"message": "Gateway is running and connected",
"details": {
"timestamp": 1742849941,
"version": 1,
"connection_status": "connected",
"websocket": {
"uptime_seconds": 85,
"uptime_human": "1m 25s",
"last_ping_received_seconds_ago": 10,
"latency_ms": 75,
"last_ping_sent_timestamp": 1742850455,
"last_pong_received_timestamp": 1742850455
}
}
}
Unhealthy Gateway:
{
"status": "unhealthy",
"message": "Gateway is not properly connected (status: reconnecting)",
"details": {
"timestamp": 1742850874,
"version": 1,
"connection_status": "reconnecting",
"websocket": {
"uptime_seconds": 1018,
"uptime_human": "16m 58s",
"last_ping_received_seconds_ago": 324,
"latency_ms": 77
}
}
}
Note that some metrics like latency_ms
, last_ping_sent_timestamp
, and last_pong_received_timestamp
may not always be present in the response. The availability of these metrics depends on the current state of the WebSocket connection and the timing of ping/pong messages.
Status Update Delays
The health check reflects the current state of the WebSocket connection, but there may be a delay in status updates.
Delayed Status Updates
When connectivity is lost, it may take up to 2 minutes for the health check to report an "unhealthy" status, as the gateway attempts to reconnect. Similarly, when connectivity is restored, it may take up to 2 minutes for the health check to reflect a "healthy" status.
This latency is intentional and allows the gateway to attempt reconnection without immediately reporting failures for transient connectivity issues.
Security
The HTTP health check includes the following security features:
Authentication: When
KEEPER_GATEWAY_HEALTH_CHECK_AUTH_TOKEN
is set, requests must include the token in the Authorization header:Authorization: Bearer <token>
SSL/TLS: When SSL is enabled, all communication is encrypted. You must provide a valid certificate and private key.
Localhost binding: The server binds to localhost only by default, not exposing the endpoint over the network.
Security Headers: The health check server adds the following security headers to responses:
X-Content-Type-Options: nosniff
X-Frame-Options: DENY
Content-Security-Policy: default-src 'none'
Rate Limiting: Automatic rate limiting is applied to non-localhost connections (60 requests per minute per IP).
Information Protection: When the server is bound to a non-localhost address, sensitive information is automatically redacted from the response.
Forced SSL: SSL is automatically enforced when binding to non-localhost interfaces.
TLS Compatibility
The health check server is configured to support a wide range of clients by:
Using secure TLS defaults (TLS 1.2+ minimum) for maximum security
Supporting modern cipher suites for strong encryption
Automatically handling protocol negotiation for HTTP and HTTPS
For clients that support modern TLS versions, use standard curl commands:
curl -k -H "Authorization: Bearer your_token" https://localhost:8099/health
Docker-Specific Configuration Requirements
When running Keeper Gateway inside Docker, special configuration is required to expose the health check to the host or external systems:
Binding to 0.0.0.0
The health check server must bind to
0.0.0.0
to be reachable outside the container.127.0.0.1
restricts access to within the container only.
SSL Enforcement
When using
0.0.0.0
, Keeper Gateway forces SSL to protect health check data.You must provide a valid certificate and key or the server will not start.
Authentication Requirement
If binding to
0.0.0.0
, you must also specify anAUTH_TOKEN
to secure the endpoint.
Docker Compose Example
services:
keeper-gateway:
image: keeper/gateway:latest
ports:
- "8099:8099"
volumes:
- ./certs:/certs:ro
environment:
KEEPER_GATEWAY_HEALTH_CHECK_ENABLED: true
KEEPER_GATEWAY_HEALTH_CHECK_HOST: "0.0.0.0"
KEEPER_GATEWAY_HEALTH_CHECK_PORT: 8099
KEEPER_GATEWAY_HEALTH_CHECK_USE_SSL: true
KEEPER_GATEWAY_HEALTH_CHECK_SSL_CERT: /certs/healthcheck.crt
KEEPER_GATEWAY_HEALTH_CHECK_SSL_KEY: /certs/healthcheck.key
KEEPER_GATEWAY_HEALTH_CHECK_AUTH_TOKEN: mysecrettoken
Generate a Self-Signed Certificate
mkdir -p certs
openssl req -x509 -nodes -days 365 \
-newkey rsa:2048 \
-keyout certs/healthcheck.key \
-out certs/healthcheck.crt \
-subj "/CN=localhost"
Test the Endpoint from the Host
curl -k -H "Authorization: Bearer mysecrettoken" https://localhost:8099/health
Example Linux Configuration
# Enable HTTP health check
export KEEPER_GATEWAY_HEALTH_CHECK_ENABLED=true
export KEEPER_GATEWAY_HEALTH_CHECK_PORT=8099
export KEEPER_GATEWAY_HEALTH_CHECK_AUTH_TOKEN=mysecrettoken
# Start the gateway
gateway start
Or using command line arguments:
gateway start --health-check --health-check-port 8099 --health-check-auth-token mysecrettoken
Self-Signed SSL Certificates
For testing or internal use, you can generate self-signed certificates to enable SSL/TLS encryption:
# Generate a private key
openssl genrsa -out healthcheck.key 2048
# Generate a certificate signing request (CSR)
openssl req -new -key healthcheck.key -out healthcheck.csr -subj "/CN=localhost"
# Generate a self-signed certificate (valid for 365 days)
openssl x509 -req -days 365 -in healthcheck.csr -signkey healthcheck.key -out healthcheck.crt
# Set the environment variables
export KEEPER_GATEWAY_HEALTH_CHECK_ENABLED=true
export KEEPER_GATEWAY_HEALTH_CHECK_USE_SSL=true
export KEEPER_GATEWAY_HEALTH_CHECK_SSL_CERT=/path/to/healthcheck.crt
export KEEPER_GATEWAY_HEALTH_CHECK_SSL_KEY=/path/to/healthcheck.key
export KEEPER_GATEWAY_HEALTH_CHECK_PORT=8443 # Typical HTTPS port
export KEEPER_GATEWAY_HEALTH_CHECK_AUTH_TOKEN=mysecrettoken
# Start the gateway
gateway start
Or using command line arguments:
gateway --health-check --health-check-port 8443 --health-check-ssl --health-check-ssl-cert /path/to/healthcheck.crt --health-check-ssl-key /path/to/healthcheck.key --health-check-auth-token mysecrettoken start
When using self-signed certificates, your HTTP client will need to be configured to trust the certificate or ignore SSL verification (not recommended for production).
Monitoring Integration
This endpoint can be used with monitoring systems like:
Prometheus with blackbox exporter
Nagios/Icinga
Zabbix
Datadog
AWS CloudWatch
Any monitoring system that can perform HTTP checks
Last updated
Was this helpful?