Health Checks

Monitoring the Keeper Gateway using health checks

Overview

This document describes the health check functionality implemented for the KeeperPAM Gateway. Health checks provide essential monitoring capabilities that solve several common operational challenges.

Health Checks provide the following benefits:

  • Allow load balancers to automatically remove unhealthy instances from rotation and add them back when they recover.

  • Integrate with monitoring systems (Prometheus, Nagios, Datadog, etc.) to provide automated alerting and dashboards showing gateway health across your infrastructure.

  • Enable automated monitoring scripts and orchestration tools to detect failures and trigger recovery procedures without human intervention.


Simple Health Check Configuration

The configuration below enables a basic health check service for the binary and Docker installation methods. More advanced configuration is documented later in this guide.

Activating Health Check - Binary Install method

Start the gateway with health check enabled:

gateway start --health-check

Once the gateway is running with the health check enabled, you can check its health:

gateway health-check

If you get an error like "Could not connect to health check server", the health check server is not running. Restart the gateway with the --health-check flag.

If you see "Exception No such command 'keeper-gateway.exe'", you're using the wrong command syntax. Always use "gateway" as the command name.


Activating Health Check - Docker Install method

Modify the docker-compose.yml file to enable health checks:

  • Add KEEPER_GATEWAY_HEALTH_CHECK_ENABLED

  • Add the healthcheck section with the desired check intervals

After changing the docker-compose file, pick up the changes and restart the container:

If the container name is keeper-gateway, a one-line bash command reports the service status:

If you don't know the container name, this script will give it to you:

Here's an example of checking the health with a bash command:

The complete bash script below can be added to your watchdog services to check the service status and automatically restart the container if it is unhealthy. Replace /path/to/ with the proper path.

To schedule this health check on a Linux system, add it to cron.

Add to the crontab to watch every minute...
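As a Python alternative to the bash watchdog described above, the following is a minimal sketch rather than the official script. It assumes the container is named keeper-gateway, that the healthcheck section is defined in docker-compose.yml (otherwise Docker has no health status to report), and that the standard docker inspect and docker restart commands are available. The function names container_health and watchdog are hypothetical.

```python
import subprocess

CONTAINER = "keeper-gateway"  # container name used in the text above; adjust as needed


def container_health(name, run=subprocess.run):
    """Return Docker's health status for a container, or 'unknown' on error."""
    result = run(
        ["docker", "inspect", "--format", "{{.State.Health.Status}}", name],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        # Container missing, or no healthcheck is defined for it.
        return "unknown"
    return result.stdout.strip() or "unknown"


def watchdog(name=CONTAINER, run=subprocess.run):
    """Restart the container if Docker reports it unhealthy; return the status seen."""
    status = container_health(name, run)
    if status == "unhealthy":
        run(["docker", "restart", name], capture_output=True, text=True)
    return status
```

To run this from cron every minute, save it to a file that ends with a call to watchdog() and schedule it with a "* * * * *" entry. The file location (for example /path/to/watchdog.py) is an assumption to adapt to your system.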


Advanced Healthcheck Configuration Examples

This section provides detailed configuration examples for customizing the health checks in different environments.

Each configuration below lists the start command, the matching CLI health check, and the equivalent curl health check.

Basic HTTP

  • Start command: gateway start --health-check

  • CLI health check: gateway health-check

  • Curl health check: curl http://127.0.0.1:8099/health

HTTP with Auth

  • Start command: gateway start --health-check --health-check-auth-token mytoken

  • CLI health check: gateway health-check --token mytoken

  • Curl health check: curl -H "Authorization: Bearer mytoken" http://127.0.0.1:8099/health

HTTPS (SSL)

  • Start command: gateway start --health-check --health-check-ssl --health-check-ssl-cert /path/cert.pem --health-check-ssl-key /path/key.pem

  • CLI health check: gateway health-check --ssl

  • Curl health check: curl -k https://127.0.0.1:8099/health

HTTPS with Auth

  • Start command: gateway start --health-check --health-check-ssl --health-check-ssl-cert /path/cert.pem --health-check-ssl-key /path/key.pem --health-check-auth-token mytoken

  • CLI health check: gateway health-check --ssl --token mytoken

  • Curl health check: curl -k -H "Authorization: Bearer mytoken" https://127.0.0.1:8099/health

Custom Port

  • Start command: gateway start --health-check --health-check-port 8443

  • CLI health check: gateway health-check --port 8443

  • Curl health check: curl http://127.0.0.1:8443/health

Custom Host

  • Start command: gateway start --health-check --health-check-host 0.0.0.0

  • CLI health check: gateway health-check --host 0.0.0.0

  • Curl health check: curl http://0.0.0.0:8099/health

Production Setup

  • Start command: gateway start --health-check --health-check-host 0.0.0.0 --health-check-port 8443 --health-check-ssl --health-check-ssl-cert /etc/ssl/cert.pem --health-check-ssl-key /etc/ssl/key.pem --health-check-auth-token $(cat /etc/secrets/token)

  • CLI health check: gateway health-check --host 0.0.0.0 --port 8443 --ssl --token $(cat /etc/secrets/token)

  • Curl health check: curl -k -H "Authorization: Bearer $(cat /etc/secrets/token)" https://0.0.0.0:8443/health

Output Format Examples

Simple Status

  • CLI command: gateway health-check --ssl --token mytoken

  • Description: Returns "OK: Gateway is running and connected" (exit code 0) or "CRITICAL: ..." (exit code 1)

Detailed Info

  • CLI command: gateway health-check --ssl --token mytoken --info

  • Description: Key=value pairs suitable for monitoring scripts

JSON Format

  • CLI command: gateway health-check --ssl --token mytoken --json

  • Description: Full JSON response matching the HTTP endpoint
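The key=value output of the --info option is straightforward to consume from a monitoring script. The sketch below shows one way to parse it into a dictionary. The function name parse_info is hypothetical, and the sample keys in the usage note are illustrative only, since the exact keys the gateway prints are not listed here.

```python
def parse_info(text):
    """Parse 'key=value' lines (one pair per line) into a dictionary."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blank or malformed lines
        # Split on the first '=' only, so values may themselves contain '='.
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values
```

For example, parse_info("status=healthy\nconnection_status=connected") returns a dictionary with those two entries, which a script can then compare against expected values.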

Troubleshooting Commands

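Each issue below lists a test command and the expected result.

Check if server is running

  • Test command: curl http://127.0.0.1:8099/health

  • Expected result: Connection success or "Connection refused"

Test SSL connectivity

  • Test command: curl -k https://127.0.0.1:8099/health

  • Expected result: SSL handshake success or SSL error

Test authentication

  • Test command: curl -k -H "Authorization: Bearer wrongtoken" https://127.0.0.1:8099/health

  • Expected result: {"error": "Invalid authentication token"}

Check server binding

  • Test command: curl http://<host-ip>:8099/health (run from another machine, replacing <host-ip> with the gateway host's network address; 0.0.0.0 is a bind address and is not a reliable destination for this test)

  • Expected result: Success if bound to 0.0.0.0, failure if bound to 127.0.0.1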

Error Messages and Troubleshooting

The CLI health check provides detailed error messages to help diagnose issues:

Authentication Errors (HTTP 401)

Connection Errors

SSL Certificate Errors
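The three error categories above (authentication, connection, and SSL certificate errors) can also be told apart by a script that calls the HTTP endpoint directly. The sketch below uses only the Python standard library. The function names classify_failure and probe are hypothetical, and the category strings are this example's own labels, not the gateway's error messages.

```python
import ssl
import urllib.error
import urllib.request


def classify_failure(exc):
    """Map an exception raised by urllib to one of the categories above."""
    # HTTPError is a subclass of URLError, so it must be checked first.
    if isinstance(exc, urllib.error.HTTPError):
        return "authentication" if exc.code == 401 else f"http {exc.code}"
    reason = exc.reason if isinstance(exc, urllib.error.URLError) else exc
    # SSLError is a subclass of OSError, so it must be checked before OSError.
    if isinstance(reason, ssl.SSLError):
        return "ssl"
    if isinstance(reason, OSError):
        return "connection"
    return "unknown"


def probe(target, timeout=5):
    """Request the health endpoint (URL or Request object) and report the outcome."""
    try:
        with urllib.request.urlopen(target, timeout=timeout) as response:
            return "ok" if response.status == 200 else f"http {response.status}"
    except (urllib.error.URLError, OSError) as exc:
        return classify_failure(exc)
```

Calling probe("http://127.0.0.1:8099/health") against a gateway without the health check enabled would be expected to return "connection", matching the "Connection refused" case in the troubleshooting commands.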

Implementation

The Gateway health check is implemented using Bottle, a lightweight WSGI micro web-framework for Python. Bottle was chosen for the following advantages:

  • Minimal dependency (single file, ~60KB in size)

  • Enhanced security over the built-in Python HTTP server

  • Proper request routing and handling

  • Better error management

  • Thread safety

  • Production-ready with minimal overhead

CLI Health Check

You can check the gateway's health from the command line:

gateway health-check

This command returns:

  • Exit code 0 if the gateway is healthy

  • Exit code 1 if the gateway is not running or not healthy

  • Text output indicating the status (OK/CRITICAL/WARNING)

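For detailed output in a machine-parsable format (one key=value pair per line):

gateway health-check --info

For JSON format output (matching the HTTP endpoint format):

gateway health-check --json

If your health check server is using SSL:

gateway health-check --ssl

If your health check server requires authentication:

gateway health-check --token mytoken

If your health check server is running on a non-default port:

gateway health-check --port 8443

If your health check server is running on a different host:

gateway health-check --host 0.0.0.0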

The detailed output includes:

  • Gateway version

  • Connection status

  • WebSocket metrics (when available)

  • Process information (in background mode)

This makes it suitable for monitoring scripts and cron jobs.

Note: The CLI health check command requires the HTTP health check server to be running. If the server is not running, the command returns an error message that suggests enabling it.
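Because the CLI reports health through its exit code, a monitoring script can wrap it with very little logic. The following sketch assumes the gateway binary is on the PATH. The function name gateway_is_healthy and the injectable run parameter (used to make the function testable) are hypothetical.

```python
import subprocess


def gateway_is_healthy(extra_args=(), run=subprocess.run):
    """Run `gateway health-check` and return (healthy, message)."""
    result = run(
        ["gateway", "health-check", *extra_args],
        capture_output=True,
        text=True,
    )
    # Exit code 0 means healthy; exit code 1 means not running or not healthy.
    message = (result.stdout or result.stderr).strip()
    return result.returncode == 0, message
```

Extra options documented above can be passed through, for example gateway_is_healthy(["--ssl", "--token", "mytoken"]).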


HTTP Health Check

The gateway includes a secure HTTP health check endpoint that can be enabled with environment variables or command line arguments.

Configuration

The health check server can be configured using environment variables or command line arguments:

Environment Variables

  • KEEPER_GATEWAY_HEALTH_CHECK_ENABLED: Enable HTTP health check (1, true, yes). Default: Disabled

  • KEEPER_GATEWAY_HEALTH_CHECK_PORT: Port for HTTP server. Default: 8099

  • KEEPER_GATEWAY_HEALTH_CHECK_HOST: Host address to bind to. Default: 127.0.0.1

  • KEEPER_GATEWAY_HEALTH_CHECK_AUTH_TOKEN: Authentication token for requests. Default: None

  • KEEPER_GATEWAY_HEALTH_CHECK_USE_SSL: Enable SSL (1, true, yes). Default: Disabled

  • KEEPER_GATEWAY_HEALTH_CHECK_SSL_CERT: Path to SSL certificate. Default: None

  • KEEPER_GATEWAY_HEALTH_CHECK_SSL_KEY: Path to SSL private key. Default: None

Command Line Arguments

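When starting the gateway, you can also use these command line arguments:

  • --health-check: Enable the HTTP health check

  • --health-check-port: Port for the HTTP server

  • --health-check-host: Host address to bind to

  • --health-check-auth-token: Authentication token for requests

  • --health-check-ssl: Enable SSL

  • --health-check-ssl-cert: Path to SSL certificate

  • --health-check-ssl-key: Path to SSL private key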

Command line arguments take precedence over environment variables when both are specified.
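The precedence rule (command line argument first, then environment variable, then default) can be illustrated with a short sketch. This is not the gateway's own code: the function names resolve and health_check_settings are hypothetical, and treating the enabling values (1, true, yes) as case-insensitive is an assumption.

```python
import os

TRUTHY = {"1", "true", "yes"}  # values documented as enabling a flag


def resolve(cli_value, env_name, default, env=os.environ):
    """Return the CLI value if given, else the environment value, else the default."""
    if cli_value is not None:
        return cli_value
    return env.get(env_name, default)


def health_check_settings(cli=None, env=os.environ):
    """Combine CLI arguments and environment variables into effective settings."""
    cli = cli or {}
    enabled = resolve(cli.get("enabled"), "KEEPER_GATEWAY_HEALTH_CHECK_ENABLED", "", env)
    return {
        # Assumption: the comparison ignores case and surrounding whitespace.
        "enabled": enabled is True or str(enabled).strip().lower() in TRUTHY,
        "host": resolve(cli.get("host"), "KEEPER_GATEWAY_HEALTH_CHECK_HOST", "127.0.0.1", env),
        "port": int(resolve(cli.get("port"), "KEEPER_GATEWAY_HEALTH_CHECK_PORT", 8099, env)),
    }
```

With KEEPER_GATEWAY_HEALTH_CHECK_PORT set to 9000 in the environment and a port of 8443 given on the command line, this sketch yields 8443, which mirrors the documented precedence.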

Example Commands

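Basic health check with default settings:

gateway start --health-check

Custom port and authentication token:

gateway start --health-check --health-check-port 8443 --health-check-auth-token mytoken

Bind to all interfaces (only in secure environments):

gateway start --health-check --health-check-host 0.0.0.0

Enable SSL with certificate and key:

gateway start --health-check --health-check-ssl --health-check-ssl-cert /path/cert.pem --health-check-ssl-key /path/key.pem

Complete example with all options:

gateway start --health-check --health-check-host 0.0.0.0 --health-check-port 8443 --health-check-ssl --health-check-ssl-cert /etc/ssl/cert.pem --health-check-ssl-key /etc/ssl/key.pem --health-check-auth-token $(cat /etc/secrets/token)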

Usage

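When enabled, the HTTP health check endpoint will be available at the following address (with the default host and port):

http://127.0.0.1:8099/health

Or with SSL:

https://127.0.0.1:8099/health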

Response Format

The endpoint returns:

  • HTTP 200 if the gateway is healthy

  • HTTP 503 if the gateway is not healthy

  • JSON response with details:

The response includes:

  • status: Overall health status ("healthy" or "unhealthy")

  • message: Human-readable description of the status

  • details: Detailed information about the gateway

    • timestamp: Current server timestamp

    • version: API version

    • connection_status: Current connection status ("connected", "disconnected", etc.)

    • websocket: WebSocket connection metrics

      • uptime_seconds: WebSocket connection uptime in seconds

      • uptime_human: Human-readable uptime (e.g., "1m 25s")

      • last_ping_received_seconds_ago: Seconds since the last ping was received

      • latency_ms: Round-trip latency of the last ping-pong in milliseconds

      • last_ping_sent_timestamp: Unix timestamp when the last ping was sent

      • last_pong_received_timestamp: Unix timestamp when the last pong was received

Example Responses

Healthy Gateway:

Unhealthy Gateway:

Note that some metrics like latency_ms, last_ping_sent_timestamp, and last_pong_received_timestamp may not always be present in the response. The availability of these metrics depends on the current state of the WebSocket connection and the timing of ping/pong messages.
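A client that consumes this response should treat the optional metrics as possibly missing. The sketch below fetches the endpoint and extracts the documented fields defensively. The function names fetch_health and summarize are hypothetical, and the sample body in the usage note uses made-up values purely for illustration.

```python
import json
import urllib.error
import urllib.request


def fetch_health(url, timeout=5):
    """Return (http_status, body_text) for the health endpoint."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8")
    except urllib.error.HTTPError as exc:
        # urllib raises on HTTP 503, but the error still carries the response body.
        return exc.code, exc.read().decode("utf-8")


def summarize(body):
    """Extract the documented fields from a health response body (JSON text)."""
    payload = json.loads(body)
    details = payload.get("details", {})
    websocket = details.get("websocket", {})
    return {
        "healthy": payload.get("status") == "healthy",
        "message": payload.get("message", ""),
        "connection_status": details.get("connection_status"),
        "uptime_seconds": websocket.get("uptime_seconds"),
        # May be absent depending on ping/pong timing, so this can be None.
        "latency_ms": websocket.get("latency_ms"),
    }
```

Given an illustrative body such as {"status": "healthy", "message": "ok", "details": {"connection_status": "connected", "websocket": {"uptime_seconds": 85}}}, summarize reports healthy as True and latency_ms as None.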

Status Update Delays

The health check reflects the current state of the WebSocket connection, but there may be a delay in status updates.

Delayed Status Updates

When connectivity is lost, it may take up to 2 minutes for the health check to report an "unhealthy" status, as the gateway attempts to reconnect. Similarly, when connectivity is restored, it may take up to 2 minutes for the health check to reflect a "healthy" status.

This latency is intentional and allows the gateway to attempt reconnection without immediately reporting failures for transient connectivity issues.
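Automation that restarts the gateway should account for this delay before concluding that recovery failed. The sketch below polls a health check until it passes or a grace period expires. The function name wait_until_healthy is hypothetical, and the 150-second default (the two-minute delay plus a margin) and 10-second interval are assumptions rather than recommended values.

```python
import time


def wait_until_healthy(check, timeout=150.0, interval=10.0,
                       clock=time.monotonic, sleep=time.sleep):
    """Poll check() until it returns True or `timeout` seconds have elapsed."""
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

The check argument can be any callable that returns True when the gateway is healthy, such as a small function that runs gateway health-check and tests for exit code 0.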

Security

The HTTP health check includes the following security features:

  1. Authentication: When KEEPER_GATEWAY_HEALTH_CHECK_AUTH_TOKEN is set, requests must include the token in the Authorization header, in the form "Authorization: Bearer <token>".

  2. SSL/TLS: When SSL is enabled, all communication is encrypted. You must provide a valid certificate and private key.

  3. Localhost binding: The server binds to localhost only by default, not exposing the endpoint over the network.

  4. Security Headers: The health check server adds the following security headers to responses:

    • X-Content-Type-Options: nosniff

    • X-Frame-Options: DENY

    • Content-Security-Policy: default-src 'none'

  5. Rate Limiting: Automatic rate limiting is applied to non-localhost connections (60 requests per minute per IP).

  6. Information Protection: When the server is bound to a non-localhost address, sensitive information is automatically redacted from the response.

  7. Forced SSL: SSL is automatically enforced when binding to non-localhost interfaces.
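From the client side, the authentication feature above amounts to sending a Bearer token in the Authorization header. The sketch below builds such a request with the Python standard library. The function name build_health_request is hypothetical.

```python
import urllib.request


def build_health_request(url, token=None):
    """Build a GET request, adding the Bearer token when one is configured."""
    request = urllib.request.Request(url, method="GET")
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    return request
```

The returned object can be passed to urllib.request.urlopen. When polling a gateway bound to a non-localhost address, keeping requests at least one second apart stays within the documented limit of 60 requests per minute per IP.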

TLS Compatibility

The health check server is configured to support a wide range of clients by:

  • Using secure TLS defaults (TLS 1.2+ minimum) for maximum security

  • Supporting modern cipher suites for strong encryption

  • Automatically handling protocol negotiation for HTTP and HTTPS

For clients that support modern TLS versions, use standard curl commands:


Docker-Specific Configuration Requirements

When running Keeper Gateway inside Docker, special configuration may be required to expose the health check to the host or external systems:

Binding to 0.0.0.0

  • The health check server must bind to 0.0.0.0 to be reachable outside the container.

  • 127.0.0.1 restricts access to within the container only.

SSL Enforcement

  • When using 0.0.0.0, Keeper Gateway forces SSL to protect health check data.

  • You must provide a valid certificate and key or the server will not start.

Authentication Requirement

  • If binding to 0.0.0.0, you must also set KEEPER_GATEWAY_HEALTH_CHECK_AUTH_TOKEN (or --health-check-auth-token) to secure the endpoint.
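The three requirements above can be checked before deployment with a small validation helper. The sketch below restates the documented rules and is not the gateway's own validation code. The function name binding_problems and its message strings are hypothetical.

```python
import ipaddress


def binding_problems(host, use_ssl, cert, key, token):
    """List violations of the non-localhost binding rules described above."""
    try:
        loopback = ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal; only the name "localhost" is treated as loopback here.
        loopback = host == "localhost"
    if loopback:
        return []
    problems = []
    if not (use_ssl and cert and key):
        problems.append("non-localhost binding requires SSL with a certificate and key")
    if not token:
        problems.append("non-localhost binding requires an auth token")
    return problems
```

An empty list means the settings satisfy the rules. For example, binding to 0.0.0.0 with no certificate and no token returns both problems.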

Docker Compose Example

Generate a Self-Signed Certificate

Test the Endpoint from the Host


Example Linux Configuration

Or using command line arguments:

Self-Signed SSL Certificates

For testing or internal use, you can generate self-signed certificates to enable SSL/TLS encryption:

Or using command line arguments:

When using self-signed certificates, your HTTP client will need to be configured to trust the certificate or ignore SSL verification (not recommended for production).
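For a Python client, trusting a specific self-signed certificate is preferable to disabling verification. The sketch below builds a TLS context either way, using the standard ssl module. The function name make_tls_context is hypothetical, and the certificate path in the usage note is a placeholder.

```python
import ssl


def make_tls_context(cafile=None, insecure=False):
    """Build a client TLS context for the health check endpoint."""
    # With cafile set, the given certificate is trusted in addition to nothing else;
    # with cafile=None, the system's default trust store is loaded.
    context = ssl.create_default_context(cafile=cafile)
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    if insecure:
        # Equivalent to curl -k: skips verification entirely. Testing only.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    return context
```

Pass the result to urllib.request.urlopen through its context argument, for example with make_tls_context(cafile="/path/cert.pem"). Hostname checking stays enabled in that mode, so the certificate must list the address used in the URL.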

Monitoring Integration

This endpoint can be used with monitoring systems like:

  • Prometheus with blackbox exporter

  • Nagios/Icinga

  • Zabbix

  • Datadog

  • AWS CloudWatch

  • Any monitoring system that can perform HTTP checks
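As one illustration for Nagios-style systems, the endpoint's response can be translated into the standard plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The sketch below is one possible mapping, not an official plugin. The function name nagios_result and the choice to treat any unexpected HTTP status as UNKNOWN are assumptions.

```python
NAGIOS_OK, NAGIOS_WARNING, NAGIOS_CRITICAL, NAGIOS_UNKNOWN = 0, 1, 2, 3


def nagios_result(http_status, payload):
    """Translate an endpoint response into an (exit_code, status_line) pair."""
    payload = payload or {}
    message = payload.get("message", "no message")
    if http_status == 200 and payload.get("status") == "healthy":
        return NAGIOS_OK, f"OK: {message}"
    if http_status == 503:
        return NAGIOS_CRITICAL, f"CRITICAL: {message}"
    return NAGIOS_UNKNOWN, f"UNKNOWN: unexpected HTTP status {http_status}"
```

A plugin script would print the status line and exit with the returned code, for example via sys.exit.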
