Health & Diagnostics¶
What you'll learn here: how to read /healthz, what 200 vs 503 means, and how to interpret degraded states.
Endpoint¶
GET /healthz
/healthz is always public, even when core.auth.enabled: true.
Response status:
- 200: overall healthy
- 503: one or more components degraded
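A client can tell the two outcomes apart from the status code alone. The following is a minimal sketch using only Python's standard library. The function names are illustrative, not part of the adapter, and the sketch assumes that a 503 response carries the same JSON payload as a 200 response.

```python
import json
import urllib.error
import urllib.request


def interpret_status(code: int) -> str:
    """Map a /healthz HTTP status code to a coarse health label."""
    if code == 200:
        return "healthy"
    if code == 503:
        return "degraded"
    # Other codes are not described on this page; treat them as unknown.
    return "unknown"


def fetch_health(base_url: str, timeout: float = 5.0) -> tuple[int, dict]:
    """Fetch /healthz and return (status_code, parsed_json_body).

    urllib raises HTTPError for a 503, so in that case the body is read
    from the error object instead of the normal response.
    """
    url = base_url.rstrip("/") + "/healthz"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, json.loads(resp.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        # Assumption: the degraded response body is JSON as well.
        return err.code, json.loads(err.read().decode("utf-8"))
```

For example, `fetch_health("http://localhost:8932")` would return the status code and payload for a locally running instance, assuming the port used in the deployment examples on this page.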
Payload shape¶
Top-level fields include:
- status: ok or degraded
- servers: per-upstream health snapshots
- persistence: backend health + configured/effective backend info
- adapter_wiring: readiness of configured adapter wiring
- startup and startup_reconciliation when available
- degraded_reason when status is degraded
When upstream ping is disabled for a server (core.upstream_ping.enabled: false), that server entry reports:
- status: "ok"
- detail: "upstream_ping_disabled"
It does not include the normal ping object.
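Code that reads server entries therefore has to tolerate a missing ping object. The helper below is a hypothetical sketch of one way to do so; `describe_ping` is not an adapter API, and the field names `last_latency_ms` and `last_error` are taken from the example payload on this page.

```python
def describe_ping(server: dict) -> str:
    """Summarise the ping state of one entry from the `servers` list."""
    if server.get("detail") == "upstream_ping_disabled":
        # Ping is turned off for this server, so no `ping` object is present.
        return "ping disabled"
    ping = server.get("ping")
    if ping is None:
        return "no ping data"
    error = ping.get("last_error")
    if error:
        return f"last ping failed: {error}"
    latency = ping.get("last_latency_ms")
    if latency is not None:
        return f"last ping {latency} ms"
    return "ping data present"
```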
degraded_reason¶
Common values:
- upstream_unhealthy
- adapter_wiring_incomplete
- persistence-policy reasons such as:
  - persistence_unavailable_during_<phase>
  - fallback_memory_activated_during_<phase>
  - persistence_unavailable_via_<component>
degraded_reason is therefore not limited to three fixed strings: the persistence-policy values vary with the phase or component involved.
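Because the persistence-policy values carry a variable suffix, consumers may find prefix matching more robust than exact comparison. A minimal sketch, assuming the prefixes listed above are the ones of interest; the category labels returned here are made up for illustration:

```python
# Prefixes of the persistence-policy reasons listed on this page.
PERSISTENCE_PREFIXES = (
    "persistence_unavailable_during_",
    "fallback_memory_activated_during_",
    "persistence_unavailable_via_",
)


def classify_degraded_reason(reason: str) -> str:
    """Group a degraded_reason value into a coarse, illustrative category."""
    if reason == "upstream_unhealthy":
        return "upstream"
    if reason == "adapter_wiring_incomplete":
        return "adapter_wiring"
    # str.startswith accepts a tuple and matches any of its prefixes.
    if reason.startswith(PERSISTENCE_PREFIXES):
        return "persistence"
    # The set of reasons is open, so keep a fallback for unlisted values.
    return "other"
```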
Example payload¶
{
  "status": "degraded",
  "degraded_reason": "upstream_unhealthy",
  "servers": [
    {
      "server_id": "playwright",
      "mount_path": "/mcp/playwright",
      "upstream_url": "http://playwright:8931/mcp",
      "status": "degraded",
      "breaker": {
        "state": "open"
      },
      "ping": {
        "last_latency_ms": 5012.1,
        "last_error": "timeout"
      },
      "detail": "upstream_unhealthy"
    }
  ],
  "persistence": {
    "status": "ok",
    "configured_type": "disk",
    "effective_type": "disk",
    "fallback_active": false
  },
  "adapter_wiring": {
    "ready": true
  }
}
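To show how a payload of this shape might be consumed, the sketch below lists the servers that are not ok together with their breaker state. `degraded_servers` is a hypothetical helper, and the embedded JSON is a trimmed copy of the example payload.

```python
import json


def degraded_servers(payload: dict) -> list[tuple[str, str, str]]:
    """Return (server_id, breaker_state, detail) for each server that is not ok."""
    rows = []
    for server in payload.get("servers", []):
        if server.get("status") == "ok":
            continue
        # `breaker` and `detail` are read defensively in case they are absent.
        breaker_state = server.get("breaker", {}).get("state", "unknown")
        rows.append((server.get("server_id", "?"), breaker_state, server.get("detail", "")))
    return rows


example = json.loads("""
{
  "status": "degraded",
  "degraded_reason": "upstream_unhealthy",
  "servers": [
    {
      "server_id": "playwright",
      "status": "degraded",
      "breaker": {"state": "open"},
      "detail": "upstream_unhealthy"
    }
  ]
}
""")

print(degraded_servers(example))
# prints [('playwright', 'open', 'upstream_unhealthy')]
```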
Deployment use¶
Compose healthcheck:
services:
  remote-mcp-adapter:
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8932/healthz"]
      interval: 15s
      timeout: 5s
      retries: 3
Kubernetes probe:
livenessProbe:
  httpGet:
    path: /healthz
    port: 8932
  initialDelaySeconds: 10
  periodSeconds: 15
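Outside an orchestrator, for instance in a CI script, the same check can be approximated with a small polling loop. The sketch below is an assumption-laden illustration rather than a shipped tool: `probe_status` and `wait_until_healthy` are hypothetical names, and the defaults simply mirror the Compose healthcheck values above (3 retries, 15-second interval, 5-second timeout).

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def probe_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code of a GET to `url`, or 0 if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # for example 503 when a component is degraded
    except OSError:
        return 0  # connection refused, DNS failure, or timeout


def wait_until_healthy(
    probe: Callable[[], int],
    retries: int = 3,
    interval: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `probe` up to `retries` times; return True on the first 200."""
    for attempt in range(retries):
        if probe() == 200:
            return True
        if attempt < retries - 1:
            sleep(interval)  # wait before the next attempt
    return False
```

A script could call `wait_until_healthy(lambda: probe_status("http://localhost:8932/healthz"))` and exit non-zero when it returns False. Passing the probe as a callable keeps the loop testable without a network.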
Next steps¶
- See also: Configuration - upstream ping and startup settings.
- See also: Security - public vs protected routes.