Healthcheck

eRPC has a built-in /healthcheck endpoint that can be used to check the health of the service within Kubernetes, Railway, etc.

Config

You can configure healthcheck on top-level in erpc.yaml file:

logLevel: debug
server:
  # ...
  # (OPTIONAL) For zero-downtime deployments, wait before shutting down the server.
  # During this period active requests are still being processed but no new requests are accepted.
  # Because readiness /healthcheck endpoint will start returning unhealthy after receiving SIGTERM,
  # so that Kubernetes (or any other orchestrator) removes the old pod from list of available endpoints.
  #
  # You usually need two separate delays:
  #   waitBeforeShutdown – after the pod receives SIGTERM it is marked **NotReady** (via healthcheck) but
  #                        the listener keeps running for this duration. Existing
  #                        requests can finish, new ones are rejected. Set it to at
  #                        least (readinessProbe.periodSeconds × readinessProbe.failureThreshold) + 1s.
  #   waitAfterShutdown  – once the HTTP server is fully stopped we keep the process
  #                        alive for this duration so load-balancers (Envoy, kube-proxy…)
  #                        can gracefully close any still-open TCP connections.
  waitBeforeShutdown: 30s
  waitAfterShutdown: 30s
  # ...
 
healthCheck:
  # (OPTIONAL) Mode can be "simple" (just returns OK/ERROR) or "verbose" (returns detailed JSON)
  mode: verbose
  
  # (OPTIONAL) Default evaluation strategy to use when one isn't specified in the request
  # See the "Evaluation Strategies" section below for options...
  defaultEval: "any:initializedUpstreams"
  
  # (OPTIONAL) Authentication for the healthcheck endpoint
  auth:
    strategies:
      - type: secret
        secret:
          value: <CHANGE-ME-OR-REMOVE-THIS-STRATEGY>
      - type: network
        network:
          # To allow requests coming from the same host (localhost, 127.0.0.1, ::1)
          allowLocalhost: true
          # To allow requests coming from private networks
          allowedCIDRs:
          - "10.0.0.0/8"
          - "172.16.0.0/12"
          - "192.168.0.0/16"

It is recommended to use healthcheck endpoint for readiness probe only. For liveness probe use TCP healthcheck on the port specified in server.httpPort (4000 by default).

Readiness and Liveness

For zero-downtime deployments and general health tracking, configure readiness and liveness probes in your orchestrator's deployment configuration.

For example in Kubernetes:

# Allow up to 1 minute to startup if there are too many upstreams or they are slow.
startupProbe:
  httpGet:
    path: /healthcheck
    port: 4000
  initialDelaySeconds: 10
  periodSeconds: 10
  timeoutSeconds: 5
  failureThreshold: 6
 
# Readiness fails after 20 seconds max, after receiving SIGTERM.
# Great for zero-downtime deployments, and good enough for general health tracking.
readinessProbe:
  httpGet:
    path: /healthcheck
    port: 4000
  initialDelaySeconds: 10
  periodSeconds: 5
  timeoutSeconds: 5
  failureThreshold: 2
  successThreshold: 1
 
# Liveness checks if http server is running, otherwise it means eRPC itself is dead.
livenessProbe:
  tcpSocket:
    port: 4000
  initialDelaySeconds: 30
  periodSeconds: 10
  timeoutSeconds: 1
  failureThreshold: 3
  successThreshold: 1

Evaluation Strategies

The healthcheck endpoint supports different strategies to evaluate the health of your upstreams. You can specify the strategy in the configuration or by adding ?eval=strategy_name to the URL.

Available strategies:

Strategy	Description
`any:initializedUpstreams`	Returns healthy if any upstreams are initialized (default)
`all:activeUpstreams`	Returns healthy if all configured upstreams are initialized AND not cordoned
`any:errorRateBelow90`	Returns healthy if any upstream has an error rate below 90%
`all:errorRateBelow90`	Returns healthy if all upstreams have an error rate below 90%
`any:errorRateBelow100`	Returns healthy if any upstream has an error rate below 100%
`all:errorRateBelow100`	Returns healthy if all upstreams have an error rate below 100%
`any:evm:eth_chainId`	Returns healthy if any EVM upstream reports the expected chain ID
`all:evm:eth_chainId`	Returns healthy if all EVM upstreams report the expected chain ID

Error rate is read from score tracking component of each Upstream and it is a fast memory-access operation.
The eth_chainId evals will send an actual request to the upstreams (in parallel), thus ensure proper timeout is set for the healthcheck (e.g. on Kubernetes readinessProbe.timeoutSeconds).
The all:activeUpstreams is an aggressive strategy that checks both initialization status and cordon status of ALL configured upstreams. An upstream is "cordoned" when selection policy exclude it from the list.

Endpoints

Global healthcheck

Check the health of all projects and their upstreams:

curl http://localhost:4000/healthcheck -v
# < HTTP/1.1 200 OK
# OK

The global healthcheck checks all active projects and all upstreams. For example even if 1 upstream (on any network) is healthy the any:initializedUpstreams strategy will return healthy.

Project-specific healthcheck

Check the health of a specific project and network:

curl http://localhost:4000/main/evm/1/healthcheck -v # OR http://localhost:4000/main/evm/1
# < HTTP/1.1 200 OK
# OK

For project-specific healthchecks, only the upstreams for the specified network are checked.

Using a custom evaluation strategy

You can specify which evaluation strategy to use via the query parameter:

# Check if any upstream has an error rate below 90%
curl http://localhost:4000/healthcheck?eval=any:errorRateBelow90
 
# Check if all EVM upstreams report the correct chain ID
curl http://localhost:4000/main/evm/1/healthcheck?eval=all:evm:eth_chainId

The evaluation strategy can be specified in the configuration as well, as shown above.

Response Modes

Simple Mode (default)

In simple mode, the healthcheck returns a plain text "OK" with a 200 status code if healthy, or an error JSON with a non-200 status code if unhealthy.

curl http://localhost:4000/healthcheck
# OK

Verbose Mode

In verbose mode, the healthcheck returns a detailed JSON response with information about the status of each project and upstream:

curl http://localhost:4000/healthcheck
# {
#   "status": "OK",
#   "message": "all systems operational",
#   "details": {
#     "main": {
#       "status": "OK",
#       "message": "3 / 3 upstreams have low error rates",
#       "config": {
#         "networks": 2,
#         "upstreams": 3,
#         "providers": 1
#       },
#       "upstreams": {
#         "alchemy-mainnet": {
#           "network": "evm:1",
#           "metrics": { ... },
#           "status": "OK"
#         },
#         ...
#       }
#     }
#   }
# }

Authentication

If you've configured authentication for the healthcheck endpoint, you'll need to include the appropriate credentials:

# Using a token in a query parameter
curl "http://localhost:4000/healthcheck?secret=CHANGE_ME"
 
# ... OR using a token in a header
curl http://localhost:4000/healthcheck -H "X-ERPC-Secret-Token: CHANGE_ME"

Aliasing healthcheck

If you have configured domain aliasing, you can append the /healthcheck to the URL:

# When aliasing is NOT used:
curl http://rpc.example.com/main/evm/42161/healthcheck -v
 
# When only project is aliased:
curl http://rpc.example.com/evm/42161/healthcheck -v
 
# When only project and network architecture is aliased:
curl http://evm-rpc.example.com/42161/healthcheck -v
 
# When all project, network architecture and chain are aliased:
curl http://eth-evm-rpc.example.com/healthcheck -v

URL Batching