Health & readiness

Liveness and readiness probes for load balancers and orchestrators like Kubernetes.

The engine exposes two unauthenticated probes so an orchestrator or load balancer can tell whether to restart an instance and whether to route traffic to it.

| Method + path | Question it answers | Checks |
| --- | --- | --- |
| GET /healthz | Is the process alive? (liveness) | Nothing external. Always 200 while the listener serves. |
| GET /readyz | Is it ready to serve traffic? (readiness) | Every networked dependency is reachable and the engine is not shutting down. |

Both bypass authentication, since probes carry no API key, and neither is written to the request log.

Liveness: GET /healthz

Returns 200 with {"status":"ok"} as long as the HTTP listener is up. It deliberately checks no dependency: a transient database blip should not cause an orchestrator to kill and restart a healthy process. Use it for a Kubernetes livenessProbe.

curl -i $DURABLEX_ENGINE_URL/healthz
# 200 OK
# {"status":"ok"}
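
The same liveness check can be scripted, for example in a smoke test that runs outside Kubernetes. The following is a minimal sketch using only the Python standard library; the helper name `is_live` and the timeout value are illustrative assumptions, not part of the engine.

```python
import json
import urllib.error
import urllib.request


def is_live(base_url: str, timeout: float = 2.0) -> bool:
    """Return True when GET /healthz answers 200 with {"status": "ok"}.

    `is_live` is a hypothetical helper for illustration; only the path,
    status code, and body shape come from the documentation above.
    """
    try:
        with urllib.request.urlopen(f"{base_url}/healthz", timeout=timeout) as resp:
            body = json.load(resp)
            return (
                resp.status == 200
                and isinstance(body, dict)
                and body.get("status") == "ok"
            )
    except (urllib.error.URLError, OSError, ValueError):
        # Refused connection, timeout, non-2xx status, or a non-JSON body
        # are all treated as "not live".
        return False
```

Calling `is_live(os.environ["DURABLEX_ENGINE_URL"])` would mirror the curl command above, assuming the variable holds the engine's base URL without a trailing slash.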

Readiness: GET /readyz

Returns 200 with {"status":"ready"} only when every dependency probe passes and the engine is not draining; otherwise 503 with {"status":"not_ready"}. Use it for a Kubernetes readinessProbe so the load balancer stops routing to an instance that cannot serve.

The checks map reports each probed dependency:

curl -s $DURABLEX_ENGINE_URL/readyz
# {"status":"ready","checks":{"store":"ok"}}

Readiness probes only what is actually a network dependency:

  • store - the datastore, always probed.
  • queue / limiter - probed only when backed by a network service; in-process defaults have no network to check and never appear in checks.

A failed dependency is marked unavailable and flips the response to 503. The detailed connection error is written to the engine log, not the response body, since /readyz is unauthenticated:

# {"status":"not_ready","checks":{"store":"ok","queue":"unavailable"}}
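
A deploy script or dashboard may want to reduce a /readyz response to a few facts: is the instance ready, is it draining, and which dependencies failed. The sketch below shows one way to do that from the status code and body alone. The function name `summarize_readiness` and the shape of its return value are assumptions made for illustration; it treats any check value other than "ok" as failing.

```python
import json


def summarize_readiness(status_code: int, body: str) -> dict:
    """Reduce a /readyz response to ready / draining / failing dependencies.

    Hypothetical helper: only the response fields ("status", "checks",
    "draining") are taken from the documented examples.
    """
    payload = json.loads(body)
    checks = payload.get("checks", {})
    return {
        "ready": status_code == 200 and payload.get("status") == "ready",
        "draining": bool(payload.get("draining", False)),
        # Names of dependencies whose reported state is anything other than "ok".
        "failing": sorted(name for name, state in checks.items() if state != "ok"),
    }


if __name__ == "__main__":
    sample = '{"status":"not_ready","checks":{"store":"ok","queue":"unavailable"}}'
    print(summarize_readiness(503, sample))
    # {'ready': False, 'draining': False, 'failing': ['queue']}
```

Because the response body never carries the connection error, a tool built this way can name the failing dependency but has to point operators at the engine log for the cause.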

During shutdown

On SIGINT/SIGTERM the engine begins a graceful drain. For the whole drain window /readyz returns 503 with {"status":"not_ready","draining":true} so the load balancer removes the instance from rotation before in-flight requests finish, rather than sending it new traffic mid-shutdown.
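
The readiness rules described on this page can be summarized as a small decision function. This is a model of the documented contract, not the engine's actual code: the name `readyz_response` and its inputs are hypothetical, and it assumes the drain response omits the checks map, as in the example body above.

```python
from typing import Dict, Tuple


def readyz_response(checks: Dict[str, bool], draining: bool) -> Tuple[int, dict]:
    """Model the documented /readyz outcome for a set of dependency probes.

    `checks` maps a dependency name (e.g. "store") to whether its probe
    passed. Illustrative only; the real engine may order or shape this
    logic differently.
    """
    if draining:
        # The whole drain window answers 503, regardless of dependency state.
        return 503, {"status": "not_ready", "draining": True}

    report = {name: "ok" if ok else "unavailable" for name, ok in checks.items()}
    if all(checks.values()):
        return 200, {"status": "ready", "checks": report}
    # Any single failed dependency flips the response to 503.
    return 503, {"status": "not_ready", "checks": report}
```

In this model draining takes priority over dependency state, which matches the intent described above: the instance leaves rotation as soon as shutdown starts, even if every dependency is still reachable.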

Kubernetes example

livenessProbe:
  httpGet:
    path: /healthz
    port: 6770
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /readyz
    port: 6770
  periodSeconds: 10
