Health Checks
Monitoring the Keeper Gateway using health checks
Overview
This document describes the health check functionality of the KeeperPAM Gateway. Health checks let load balancers, monitoring systems, and automation tools observe the gateway's state and react to failures.
Health checks provide the following benefits:
Allow load balancers to automatically remove unhealthy instances from rotation and add them back when they recover.
Integrate with monitoring systems (Prometheus, Nagios, Datadog, etc.) to provide automated alerting and dashboards showing gateway health across your infrastructure.
Enable automated monitoring scripts and orchestration tools to detect failures and trigger recovery procedures without human intervention.
The health check service is disabled by default. You must activate it as documented in the next sections.
Simple Health Check Configuration
The configuration below enables a basic health check service for the binary and Docker installation methods. More advanced configuration is documented later on this page.
Activating Health Check - Binary Install method
Start the gateway with health check enabled:
gateway start --health-check
Only after the gateway is running with health check enabled, you can check its health:
gateway health-check
If you get an error like "Could not connect to health check server", it means you haven't enabled the health check properly.
If you see "Exception No such command 'keeper-gateway.exe'", you're using the wrong command syntax. Always use "gateway" as the command name.
Activating Health Check - Docker Install method
Modify the docker-compose.yml file to enable health checks:
Add the KEEPER_GATEWAY_HEALTH_CHECK_ENABLED environment variable
Add the healthcheck section with the desired check intervals
After changing the docker-compose file, pick up the changes and restart the container:
If the container name is keeper-gateway, this one-line bash command reports the service status:
If you don't know the container name, this script will give it to you:
Here's an example of checking the health with a bash command:
The below complete bash script can be added to your watchdog services to check service status and automatically restart the container if it's unhealthy. Replace /path/to/ with the proper path.
To schedule this health check on a Linux system, add it to cron.
Add an entry to the crontab to run the check every minute...
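The watchdog idea described above (check the container's health status and restart it when unhealthy) can also be sketched in Python. This is a minimal illustration, not the script shipped with the product: it assumes the container is named keeper-gateway, that the docker CLI is on the PATH, and that the container defines a healthcheck so that docker inspect can report .State.Health.Status. The function names are hypothetical.

```python
import subprocess

CONTAINER = "keeper-gateway"  # assumed container name; adjust to your deployment


def docker_health_status(container, run=subprocess.run):
    """Return Docker's health status for a container, or "unknown" on error."""
    result = run(
        ["docker", "inspect", "--format", "{{.State.Health.Status}}", container],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        # Container missing, or no healthcheck section defined for it.
        return "unknown"
    return result.stdout.strip() or "unknown"


def should_restart(status):
    """Restart only on an explicit "unhealthy" report, not while "starting"."""
    return status == "unhealthy"


def watchdog(container=CONTAINER, run=subprocess.run):
    """Check the container once; restart it if Docker reports it unhealthy."""
    status = docker_health_status(container, run=run)
    if should_restart(status):
        run(["docker", "restart", container], check=False)
        return "restarted"
    return status
```

Calling watchdog() once per cron run keeps the script stateless; the run parameter exists only so the logic can be exercised without a Docker daemon.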
Advanced Health Check Configuration Examples
The sections below provide detailed configuration examples for customizing the health checks in different environments.
Basic HTTP
gateway start --health-check
gateway health-check
curl http://127.0.0.1:8099/health
HTTP with Auth
gateway start --health-check --health-check-auth-token mytoken
gateway health-check --token mytoken
curl -H "Authorization: Bearer mytoken" http://127.0.0.1:8099/health
HTTPS (SSL)
gateway start --health-check --health-check-ssl --health-check-ssl-cert /path/cert.pem --health-check-ssl-key /path/key.pem
gateway health-check --ssl
curl -k https://127.0.0.1:8099/health
HTTPS with Auth
gateway start --health-check --health-check-ssl --health-check-ssl-cert /path/cert.pem --health-check-ssl-key /path/key.pem --health-check-auth-token mytoken
gateway health-check --ssl --token mytoken
curl -k -H "Authorization: Bearer mytoken" https://127.0.0.1:8099/health
Custom Port
gateway start --health-check --health-check-port 8443
gateway health-check --port 8443
curl http://127.0.0.1:8443/health
Custom Host
gateway start --health-check --health-check-host 0.0.0.0
gateway health-check --host 0.0.0.0
curl http://0.0.0.0:8099/health
Production Setup
gateway start --health-check --health-check-host 0.0.0.0 --health-check-port 8443 --health-check-ssl --health-check-ssl-cert /etc/ssl/cert.pem --health-check-ssl-key /etc/ssl/key.pem --health-check-auth-token $(cat /etc/secrets/token)
gateway health-check --host 0.0.0.0 --port 8443 --ssl --token $(cat /etc/secrets/token)
curl -k -H "Authorization: Bearer $(cat /etc/secrets/token)" https://0.0.0.0:8443/health
Output Format Examples
Simple Status
gateway health-check --ssl --token mytoken
Returns OK: Gateway is running and connected (exit code 0) or CRITICAL: ... (exit code 1)
Detailed Info
gateway health-check --ssl --token mytoken --info
Key=value pairs suitable for monitoring scripts
JSON Format
gateway health-check --ssl --token mytoken --json
Full JSON response matching HTTP endpoint
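For monitoring scripts that consume the --info output, a small parser is usually enough. The sketch below assumes only what is stated above, namely one key=value pair per line; the key names in the sample string are hypothetical and are not taken from real gateway output.

```python
def parse_info_output(text):
    """Parse one key=value pair per line into a dict, skipping other lines."""
    pairs = {}
    for line in text.splitlines():
        # Split on the first "=" only, so values may themselves contain "=".
        key, sep, value = line.partition("=")
        if sep and key.strip():
            pairs[key.strip()] = value.strip()
    return pairs


# Hypothetical sample; real key names may differ.
SAMPLE_INFO = "status=healthy\nconnection_status=connected\nuptime_seconds=85\n"
```

Lines without an "=" (for example a leading status line) are ignored rather than treated as errors, which keeps the parser tolerant of format additions.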
Troubleshooting Commands
Check if server is running
curl http://127.0.0.1:8099/health
Connection success or "Connection refused"
Test SSL connectivity
curl -k https://127.0.0.1:8099/health
SSL handshake success or SSL error
Test authentication
curl -k -H "Authorization: Bearer wrongtoken" https://127.0.0.1:8099/health
{"error": "Invalid authentication token"}
Check server binding
curl http://0.0.0.0:8099/health
Success if bound to 0.0.0.0, failure if bound to 127.0.0.1
Error Messages and Troubleshooting
The CLI health check provides detailed error messages to help diagnose issues:
Authentication Errors (HTTP 401)
Connection Errors
SSL Certificate Errors
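A client written against the HTTP endpoint can sort failures into the same three groups. The following is a sketch using Python's standard urllib and ssl exception types; the function name and the category labels are illustrative and do not reflect the messages printed by the gateway CLI.

```python
import ssl
import urllib.error


def classify_probe_error(exc):
    """Map an exception raised by urllib.request.urlopen to a coarse category."""
    if isinstance(exc, urllib.error.HTTPError):
        # HTTPError carries the status code; 401 means the token was rejected.
        return "authentication" if exc.code == 401 else "http"
    # URLError wraps the underlying socket or TLS exception in .reason.
    reason = exc.reason if isinstance(exc, urllib.error.URLError) else exc
    if isinstance(reason, ssl.SSLError):
        return "ssl"
    if isinstance(reason, OSError):
        # Includes ConnectionRefusedError and timeouts.
        return "connection"
    return "other"
```

HTTPError is checked first because it is a subclass of URLError, and ssl.SSLError is checked before OSError for the same reason.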
Implementation
The Gateway health check is implemented using Bottle, a lightweight WSGI micro web-framework for Python. Bottle was chosen for the following advantages:
Minimal dependency (single file, ~60KB in size)
Enhanced security over the built-in Python HTTP server
Proper request routing and handling
Better error management
Thread safety
Production-ready with minimal overhead
CLI Health Check
You can check the gateway's health from the command line:
gateway health-check
This command returns:
Exit code 0 if the gateway is healthy
Exit code 1 if the gateway is not running or not healthy
Text output indicating the status (OK/CRITICAL/WARNING)
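For detailed output in a machine-parsable format (one key=value pair per line):
gateway health-check --info
For JSON format output (matching the HTTP endpoint format):
gateway health-check --json
If your health check server is using SSL:
gateway health-check --ssl
If your health check server requires authentication:
gateway health-check --token mytoken
If your health check server is running on a non-default port:
gateway health-check --port 8443
If your health check server is running on a different host:
gateway health-check --host 0.0.0.0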
The detailed output includes:
Gateway version
Connection status
WebSocket metrics (when available)
Process information (in background mode)
This makes it suitable for monitoring scripts and cron jobs.
Note: The CLI health check command requires the HTTP health check server to be running. If the server is not running, the command returns an error message suggesting that you enable it.
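Because the CLI reports health through its exit code, a monitoring script can wrap it with very little logic. The sketch below assumes the gateway command is on the PATH and relies only on the exit codes documented above (0 for healthy, 1 otherwise); the function name and the shape of the returned dictionary are hypothetical.

```python
import subprocess


def run_cli_health_check(extra_args=(), command=("gateway", "health-check"),
                         run=subprocess.run):
    """Run the CLI health check and summarize its exit code and text output."""
    result = run([*command, *extra_args], capture_output=True, text=True)
    return {
        "healthy": result.returncode == 0,  # exit code 0 means healthy
        "exit_code": result.returncode,
        "output": result.stdout.strip(),
    }
```

For example, run_cli_health_check(["--ssl", "--token", "mytoken"]) passes the same flags shown earlier on this page.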
HTTP Health Check
The gateway includes a secure HTTP health check endpoint that can be enabled with environment variables or command line arguments.
Configuration
The health check server can be configured using environment variables or command line arguments:
Environment Variables
KEEPER_GATEWAY_HEALTH_CHECK_ENABLED: Enable HTTP health check (1, true, yes). Default: Disabled
KEEPER_GATEWAY_HEALTH_CHECK_PORT: Port for HTTP server. Default: 8099
KEEPER_GATEWAY_HEALTH_CHECK_HOST: Host address to bind to. Default: 127.0.0.1
KEEPER_GATEWAY_HEALTH_CHECK_AUTH_TOKEN: Authentication token for requests. Default: None
KEEPER_GATEWAY_HEALTH_CHECK_USE_SSL: Enable SSL (1, true, yes). Default: Disabled
KEEPER_GATEWAY_HEALTH_CHECK_SSL_CERT: Path to SSL certificate. Default: None
KEEPER_GATEWAY_HEALTH_CHECK_SSL_KEY: Path to SSL private key. Default: None
Command Line Arguments
When starting the gateway, you can also use these command line arguments:
Command line arguments take precedence over environment variables when both are specified.
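The precedence rule can be illustrated with a short sketch: a value given on the command line wins, then the environment variable, then the documented default. This is not the gateway's own code; the function names are hypothetical, and treating the accepted values (1, true, yes) case-insensitively is an assumption.

```python
import os

TRUTHY = {"1", "true", "yes"}  # values documented as enabling a flag


def env_flag(name, environ=os.environ):
    """Return True if the variable holds 1, true, or yes (case-insensitive: assumed)."""
    return environ.get(name, "").strip().lower() in TRUTHY


def resolve_port(cli_port=None, environ=os.environ, default=8099):
    """Pick the port: command line first, then environment, then default."""
    if cli_port is not None:
        return int(cli_port)
    return int(environ.get("KEEPER_GATEWAY_HEALTH_CHECK_PORT", default))
```

With KEEPER_GATEWAY_HEALTH_CHECK_PORT=9000 in the environment, resolve_port(8443) still returns 8443, matching the rule that command line arguments take precedence.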
Example Commands
Basic health check with default settings:
Custom port and authentication token:
Bind to all interfaces (only in secure environments):
Enable SSL with certificate and key:
Complete example with all options:
Usage
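When enabled, the HTTP health check endpoint is available at the following address (shown with the default host and port):
http://127.0.0.1:8099/health
Or with SSL:
https://127.0.0.1:8099/health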
Response Format
The endpoint returns:
HTTP 200 if the gateway is healthy
HTTP 503 if the gateway is not healthy
JSON response with details:
The response includes:
status: Overall health status ("healthy" or "unhealthy")
message: Human-readable description of the status
details: Detailed information about the gateway
timestamp: Current server timestamp
version: API version
connection_status: Current connection status ("connected", "disconnected", etc.)
websocket: WebSocket connection metrics
uptime_seconds: WebSocket connection uptime in seconds
uptime_human: Human-readable uptime (e.g., "1m 25s")
last_ping_received_seconds_ago: Seconds since the last ping was received
latency_ms: Round-trip latency of the last ping-pong in milliseconds
last_ping_sent_timestamp: Unix timestamp when the last ping was sent
last_pong_received_timestamp: Unix timestamp when the last pong was received
Example Responses
Healthy Gateway:
Unhealthy Gateway:
Note that some metrics like latency_ms, last_ping_sent_timestamp, and last_pong_received_timestamp may not always be present in the response. The availability of these metrics depends on the current state of the WebSocket connection and the timing of ping/pong messages.
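Since some metrics are optional, client code should read them defensively. The sketch below parses a response body and tolerates missing fields. The sample document is hypothetical: the field names come from the list above, but the nesting (connection_status and websocket inside details) is an assumption and should be checked against a real response.

```python
import json

# Hypothetical sample body; the nesting is assumed, not confirmed.
SAMPLE_BODY = """
{
  "status": "healthy",
  "message": "Gateway is running and connected",
  "details": {
    "connection_status": "connected",
    "websocket": {"uptime_seconds": 85, "uptime_human": "1m 25s"}
  }
}
"""


def summarize_health(body):
    """Extract a few fields from the JSON body, using None for absent metrics."""
    data = json.loads(body)
    details = data.get("details") or {}
    websocket = details.get("websocket") or {}
    return {
        "healthy": data.get("status") == "healthy",
        "connection_status": details.get("connection_status"),
        "latency_ms": websocket.get("latency_ms"),  # may be absent
    }
```

Using .get() throughout means a response without latency_ms or without a websocket section is reported with None values instead of raising KeyError.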
Status Update Delays
The health check reflects the current state of the WebSocket connection, but there may be a delay in status updates.
Delayed Status Updates
When connectivity is lost, it may take up to 2 minutes for the health check to report an "unhealthy" status, as the gateway attempts to reconnect. Similarly, when connectivity is restored, it may take up to 2 minutes for the health check to reflect a "healthy" status.
This latency is intentional and allows the gateway to attempt reconnection without immediately reporting failures for transient connectivity issues.
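Automation that restarts the gateway and then waits for recovery should allow for this delay. A minimal polling sketch follows; the 150-second timeout and 10-second interval are illustrative values chosen to exceed the 2-minute window, and probe stands for any callable that returns True when the health check reports healthy.

```python
import time


def wait_until_healthy(probe, timeout=150.0, interval=10.0,
                       clock=time.monotonic, sleep=time.sleep):
    """Poll probe() until it returns True or the timeout elapses."""
    deadline = clock() + timeout
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

The clock and sleep parameters default to the real time functions and exist so the loop can be tested without waiting.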
Security
The HTTP health check includes the following security features:
Authentication: When KEEPER_GATEWAY_HEALTH_CHECK_AUTH_TOKEN is set, requests must include the token in the Authorization header as a Bearer token (Authorization: Bearer mytoken).
SSL/TLS: When SSL is enabled, all communication is encrypted. You must provide a valid certificate and private key.
Localhost binding: The server binds to localhost only by default, not exposing the endpoint over the network.
Security Headers: The health check server adds the following security headers to responses:
X-Content-Type-Options: nosniff
X-Frame-Options: DENY
Content-Security-Policy: default-src 'none'
Rate Limiting: Automatic rate limiting is applied to non-localhost connections (60 requests per minute per IP).
Information Protection: When the server is bound to a non-localhost address, sensitive information is automatically redacted from the response.
Forced SSL: SSL is automatically enforced when binding to non-localhost interfaces.
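To make the rate limiting rule concrete, here is one common way to implement a limit of 60 requests per minute per IP with a localhost exemption: a sliding window of timestamps per client. This is an illustration of the technique only; the document does not say which algorithm the gateway uses, and the class name is hypothetical.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per `window` seconds for each client IP."""

    def __init__(self, limit=60, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self.hits = defaultdict(deque)  # ip -> timestamps of recent requests

    def allow(self, ip):
        if ip in ("127.0.0.1", "::1"):
            return True  # localhost connections are not rate limited
        now = self.clock()
        recent = self.hits[ip]
        # Drop timestamps that have aged out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.limit:
            return False
        recent.append(now)
        return True
```

A monitoring system polling from a remote address more often than once per second would exceed this limit, so polling intervals should be chosen accordingly.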
TLS Compatibility
The health check server is configured to support a wide range of clients by:
Using secure TLS defaults (TLS 1.2+ minimum) for maximum security
Supporting modern cipher suites for strong encryption
Automatically handling protocol negotiation for HTTP and HTTPS
For clients that support modern TLS versions, use standard curl commands:
Docker-Specific Configuration Requirements
When running Keeper Gateway inside Docker, special configuration may be required to expose the health check to the host or external systems:
Binding to 0.0.0.0
The health check server must bind to 0.0.0.0 to be reachable outside the container. 127.0.0.1 restricts access to within the container only.
SSL Enforcement
When using 0.0.0.0, Keeper Gateway forces SSL to protect health check data. You must provide a valid certificate and key or the server will not start.
Authentication Requirement
If binding to 0.0.0.0, you must also specify an AUTH_TOKEN to secure the endpoint.
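The three requirements above can be checked before deployment with a small pre-flight function. This sketch only restates the rules from this section (non-localhost binding requires a certificate, a key, and an auth token); the function name and message wording are hypothetical, and the gateway performs its own validation at startup.

```python
LOCAL_HOSTS = {"127.0.0.1", "localhost", "::1"}


def check_exposed_config(host, cert=None, key=None, auth_token=None):
    """Return a list of problems for a health check bound to `host`."""
    problems = []
    if host in LOCAL_HOSTS:
        return problems  # localhost binding has no extra requirements
    if not (cert and key):
        problems.append("SSL certificate and key are required for non-localhost binding")
    if not auth_token:
        problems.append("an auth token is required for non-localhost binding")
    return problems
```

An empty list means the settings satisfy the documented rules, for example check_exposed_config("0.0.0.0", "/etc/ssl/cert.pem", "/etc/ssl/key.pem", "mytoken").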
Docker Compose Example
Generate a Self-Signed Certificate
Test the Endpoint from the Host
Example Linux Configuration
Or using command line arguments:
Self-Signed SSL Certificates
For testing or internal use, you can generate self-signed certificates to enable SSL/TLS encryption:
Or using command line arguments:
When using self-signed certificates, your HTTP client will need to be configured to trust the certificate or ignore SSL verification (not recommended for production).
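In Python, trusting a self-signed certificate means passing it as the CA file when building the TLS context. The sketch below uses only the standard library; the function names are hypothetical, and the insecure option mirrors curl -k and is meant for testing only.

```python
import ssl
import urllib.request


def build_ssl_context(ca_file=None, insecure=False):
    """Build a TLS context that trusts ca_file, or skips verification if insecure."""
    if insecure:
        context = ssl.create_default_context()
        context.check_hostname = False  # must be disabled before CERT_NONE
        context.verify_mode = ssl.CERT_NONE
        return context
    # With ca_file=None this falls back to the system trust store.
    return ssl.create_default_context(cafile=ca_file)


def fetch_health(url, token=None, context=None, timeout=5):
    """GET the health endpoint and return (status code, body text)."""
    request = urllib.request.Request(url)
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(request, timeout=timeout, context=context) as response:
        return response.status, response.read().decode("utf-8")
```

Note that urlopen raises urllib.error.HTTPError for non-2xx responses such as 401 or 503, and that hostname verification succeeds only if the certificate lists the host or IP address used in the URL.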
Monitoring Integration
This endpoint can be used with monitoring systems like:
Prometheus with blackbox exporter
Nagios/Icinga
Zabbix
Datadog
AWS CloudWatch
Any monitoring system that can perform HTTP checks

