Healthcheck
eRPC has a built-in /healthcheck endpoint that can be used to check the health of the service within Kubernetes, Railway, etc.
Config
You can configure the healthcheck at the top level of your erpc.yaml file:
logLevel: debug
server:
  # ...
  # (OPTIONAL) For zero-downtime deployments, wait before shutting down the server.
  # During this period active requests are still being processed but no new requests are accepted.
  # The readiness /healthcheck endpoint starts returning unhealthy after receiving SIGTERM,
  # so that Kubernetes (or any other orchestrator) removes the old pod from the list of available endpoints.
  #
  # You usually need two separate delays:
  #   waitBeforeShutdown – after the pod receives SIGTERM it is marked **NotReady** (via healthcheck) but
  #                        the listener keeps running for this duration. Existing
  #                        requests can finish, new ones are rejected. Set it to at
  #                        least (readinessProbe.periodSeconds × readinessProbe.failureThreshold) + 1s.
  #   waitAfterShutdown  – once the HTTP server is fully stopped we keep the process
  #                        alive for this duration so load-balancers (Envoy, kube-proxy…)
  #                        can gracefully close any still-open TCP connections.
  waitBeforeShutdown: 30s
  waitAfterShutdown: 30s
# ...
healthCheck:
  # (OPTIONAL) Mode can be "simple" (just returns OK/ERROR) or "verbose" (returns detailed JSON)
  mode: verbose
  # (OPTIONAL) Default evaluation strategy to use when one isn't specified in the request
  # See the "Evaluation Strategies" section below for options...
  defaultEval: "any:initializedUpstreams"
  # (OPTIONAL) Authentication for the healthcheck endpoint
  auth:
    strategies:
      - type: secret
        secret:
          value: <CHANGE-ME-OR-REMOVE-THIS-STRATEGY>
      - type: network
        network:
          # To allow requests coming from the same host (localhost, 127.0.0.1, ::1)
          allowLocalhost: true
          # To allow requests coming from private networks
          allowedCIDRs:
            - "10.0.0.0/8"
            - "172.16.0.0/12"
            - "192.168.0.0/16"
It is recommended to use the healthcheck endpoint for the readiness probe only. For the liveness probe, use a TCP check on the port specified in server.httpPort (4000 by default).
Readiness and Liveness
For zero-downtime deployments and general health tracking, configure readiness and liveness probes in your orchestrator's deployment configuration.
For example in Kubernetes:
# Allow up to 1 minute to start up if there are too many upstreams or they are slow.
startupProbe:
  httpGet:
    path: /healthcheck
    port: 4000
  initialDelaySeconds: 10
  periodSeconds: 10
  timeoutSeconds: 5
  failureThreshold: 6
# Readiness fails after 20 seconds max, after receiving SIGTERM.
# Great for zero-downtime deployments, and good enough for general health tracking.
readinessProbe:
  httpGet:
    path: /healthcheck
    port: 4000
  initialDelaySeconds: 10
  periodSeconds: 5
  timeoutSeconds: 5
  failureThreshold: 2
  successThreshold: 1
# Liveness checks if the HTTP server is running; if it is not, eRPC itself is dead.
livenessProbe:
  tcpSocket:
    port: 4000
  initialDelaySeconds: 30
  periodSeconds: 10
  timeoutSeconds: 1
  failureThreshold: 3
  successThreshold: 1
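The relationship between the readiness probe and server.waitBeforeShutdown can be checked with a little arithmetic. The Python sketch below applies the rule from the config comments, (readinessProbe.periodSeconds × readinessProbe.failureThreshold) + 1s, to the example values above. The function name is hypothetical and not part of eRPC; it only illustrates the calculation.

```python
def min_wait_before_shutdown(period_seconds: int, failure_threshold: int) -> int:
    """Lower bound (in seconds) for server.waitBeforeShutdown.

    Follows the rule stated in the config comments:
    (readinessProbe.periodSeconds x readinessProbe.failureThreshold) + 1s.
    """
    return period_seconds * failure_threshold + 1


# Values taken from the readinessProbe example above.
minimum = min_wait_before_shutdown(period_seconds=5, failure_threshold=2)
print(minimum)  # 11

# The example config uses waitBeforeShutdown: 30s, which satisfies the rule.
configured_seconds = 30
print(configured_seconds >= minimum)  # True
```

With these example values the 30s delay from the Config section leaves a comfortable margin over the 11s minimum.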
Evaluation Strategies
The healthcheck endpoint supports different strategies to evaluate the health of your upstreams. You can set the strategy in the configuration (healthCheck.defaultEval) or override it per request by adding ?eval=strategy_name to the URL.
Available strategies:
Strategy | Description |
---|---|
any:initializedUpstreams | Returns healthy if any upstreams are initialized (default) |
any:errorRateBelow90 | Returns healthy if any upstream has an error rate below 90% |
all:errorRateBelow90 | Returns healthy if all upstreams have an error rate below 90% |
any:errorRateBelow100 | Returns healthy if any upstream has an error rate below 100% |
all:errorRateBelow100 | Returns healthy if all upstreams have an error rate below 100% |
any:evm:eth_chainId | Returns healthy if any EVM upstream reports the expected chain ID |
all:evm:eth_chainId | Returns healthy if all EVM upstreams report the expected chain ID |
- Error rates are read from the score-tracking component of each upstream, which is a fast in-memory operation.
- The eth_chainId evals send an actual request to the upstreams (in parallel), so make sure a suitable timeout is set for the healthcheck (e.g. readinessProbe.timeoutSeconds on Kubernetes).
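To make the any/all semantics of the error-rate strategies concrete, here is a minimal Python sketch. It is not eRPC's implementation: the function, the input shape (a mapping of upstream id to an error rate between 0.0 and 1.0), and the behaviour for an empty set of upstreams are all assumptions made for illustration. It handles only the errorRateBelowN strategies from the table.

```python
def evaluate(strategy: str, error_rates: dict) -> bool:
    """Apply an any:/all: errorRateBelowN strategy to {upstream_id: error_rate}.

    Hypothetical helper: error rates are assumed to be fractions in 0.0..1.0.
    """
    quantifier, _, check = strategy.partition(":")
    if quantifier not in ("any", "all") or not check.startswith("errorRateBelow"):
        raise ValueError(f"unsupported strategy in this sketch: {strategy}")
    # "errorRateBelow90" -> 0.90
    threshold = int(check[len("errorRateBelow"):]) / 100
    results = [rate < threshold for rate in error_rates.values()]
    return any(results) if quantifier == "any" else all(results)


# Hypothetical upstream ids and error rates.
rates = {"upstream-a": 0.95, "upstream-b": 0.10}
print(evaluate("any:errorRateBelow90", rates))   # True: upstream-b is below 90%
print(evaluate("all:errorRateBelow90", rates))   # False: upstream-a is at 95%
print(evaluate("all:errorRateBelow100", rates))  # True: both are below 100%
```

The difference between the 90 and 100 thresholds shows up when an upstream fails most, but not all, of its requests: it fails errorRateBelow90 while still passing errorRateBelow100.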
Endpoints
Global healthcheck
Check the health of all projects and their upstreams:
curl http://localhost:4000/healthcheck -v
# < HTTP/1.1 200 OK
# OK
The global healthcheck checks all active projects and all of their upstreams. For example, with the any:initializedUpstreams strategy, a single healthy upstream (on any network) is enough for the endpoint to return healthy.
Project-specific healthcheck
Check the health of a specific project and network:
curl http://localhost:4000/main/evm/1/healthcheck -v # OR http://localhost:4000/main/evm/1
# < HTTP/1.1 200 OK
# OK
For project-specific healthchecks, only the upstreams for the specified network are checked.
Using a custom evaluation strategy
You can specify which evaluation strategy to use via the query parameter:
# Check if any upstream has an error rate below 90%
# (the URL is quoted so shells such as zsh do not treat "?" as a glob character)
curl "http://localhost:4000/healthcheck?eval=any:errorRateBelow90"
# Check if all EVM upstreams report the correct chain ID
curl "http://localhost:4000/main/evm/1/healthcheck?eval=all:evm:eth_chainId"
The default evaluation strategy can also be set in the configuration via healthCheck.defaultEval, as shown in the Config section.
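The URL patterns in this section can also be assembled programmatically. The following Python sketch builds global and project-specific healthcheck URLs and appends the eval query parameter. The helper name and its parameters are hypothetical; the set of strategies is copied from the table in the Evaluation Strategies section.

```python
from urllib.parse import urlencode

# Strategies listed in the Evaluation Strategies table.
KNOWN_STRATEGIES = {
    "any:initializedUpstreams",
    "any:errorRateBelow90",
    "all:errorRateBelow90",
    "any:errorRateBelow100",
    "all:errorRateBelow100",
    "any:evm:eth_chainId",
    "all:evm:eth_chainId",
}


def healthcheck_url(base_url, project=None, architecture=None, chain_id=None,
                    eval_strategy=None):
    """Build a global or project-specific healthcheck URL (hypothetical helper)."""
    parts = [base_url.rstrip("/")]
    if project is not None:
        if architecture is None or chain_id is None:
            raise ValueError("project-specific URLs need architecture and chain_id")
        # Project-specific form: /<project>/<architecture>/<chainId>/healthcheck
        parts += [project, architecture, str(chain_id)]
    parts.append("healthcheck")
    url = "/".join(parts)
    if eval_strategy is not None:
        if eval_strategy not in KNOWN_STRATEGIES:
            raise ValueError(f"unknown eval strategy: {eval_strategy}")
        # Keep ':' unescaped so the URL matches the curl examples.
        url += "?" + urlencode({"eval": eval_strategy}, safe=":")
    return url


print(healthcheck_url("http://localhost:4000"))
print(healthcheck_url("http://localhost:4000", "main", "evm", 1, "all:evm:eth_chainId"))
```

Validating the strategy name on the client side is a design choice of this sketch; it catches typos before a request is sent.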
Response Modes
Simple Mode (default)
In simple mode, the healthcheck returns a plain text "OK" with a 200 status code if healthy, or an error JSON with a non-200 status code if unhealthy.
curl http://localhost:4000/healthcheck
# OK
Verbose Mode
In verbose mode, the healthcheck returns a detailed JSON response with information about the status of each project and upstream:
curl http://localhost:4000/healthcheck
# {
# "status": "OK",
# "message": "all systems operational",
# "details": {
# "main": {
# "status": "OK",
# "message": "3 / 3 upstreams have low error rates",
# "config": {
# "networks": 2,
# "upstreams": 3,
# "providers": 1
# },
# "upstreams": {
# "alchemy-mainnet": {
# "network": "evm:1",
# "metrics": { ... },
# "status": "OK"
# },
# ...
# }
# }
# }
# }
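A verbose response can be inspected by a script, for example to list which upstreams are not reporting "OK". The Python sketch below relies only on the field names visible in the sample response above (details, upstreams, status); the real payload may contain more fields than the abbreviated sample shows. The function name and the "ERROR" status in the sample data are assumptions.

```python
import json


def unhealthy_upstreams(verbose_body: str) -> list:
    """Return "project/upstream" ids whose status is not "OK" (hypothetical helper)."""
    report = json.loads(verbose_body)
    failing = []
    for project_id, project in report.get("details", {}).items():
        for upstream_id, upstream in project.get("upstreams", {}).items():
            if upstream.get("status") != "OK":
                failing.append(f"{project_id}/{upstream_id}")
    return failing


# Hand-written sample shaped like the response above; "other-upstream" and its
# "ERROR" status are made up for illustration.
sample = json.dumps({
    "status": "OK",
    "details": {
        "main": {
            "status": "OK",
            "upstreams": {
                "alchemy-mainnet": {"network": "evm:1", "status": "OK"},
                "other-upstream": {"network": "evm:1", "status": "ERROR"},
            },
        }
    },
})
print(unhealthy_upstreams(sample))  # ['main/other-upstream']
```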
Authentication
If you've configured authentication for the healthcheck endpoint, you'll need to include the appropriate credentials:
# Using a token in a query parameter
curl "http://localhost:4000/healthcheck?secret=CHANGE_ME"
# ... OR using a token in a header
curl http://localhost:4000/healthcheck -H "X-ERPC-Secret-Token: CHANGE_ME"
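The header-based variant can be used from a script as well. The Python sketch below uses only the standard library and sends the secret in the X-ERPC-Secret-Token header shown above. The function names are hypothetical, and treating a 200 status as healthy follows the description of simple mode; whether verbose mode uses the same status codes is an assumption.

```python
import urllib.request


def build_healthcheck_request(url: str, secret: str) -> urllib.request.Request:
    """Prepare a GET request carrying the secret in the X-ERPC-Secret-Token header."""
    return urllib.request.Request(
        url, headers={"X-ERPC-Secret-Token": secret}, method="GET"
    )


def is_healthy(url: str, secret: str, timeout: float = 5.0) -> bool:
    """Return True when the endpoint answers 200 (assumed to mean healthy)."""
    request = build_healthcheck_request(url, secret)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # urllib.error.URLError and HTTPError (non-2xx responses) are OSError subclasses,
        # as are connection failures and socket timeouts.
        return False
```

A call such as is_healthy("http://localhost:4000/healthcheck", "CHANGE_ME") would then mirror the curl command above. Sending the secret in a header rather than a query parameter keeps it out of URLs that may end up in access logs.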
Aliasing healthcheck
If you have configured domain aliasing, append /healthcheck to the aliased URL:
# When aliasing is NOT used:
curl http://rpc.example.com/main/evm/42161/healthcheck -v
# When only project is aliased:
curl http://rpc.example.com/evm/42161/healthcheck -v
# When only project and network architecture are aliased:
curl http://evm-rpc.example.com/42161/healthcheck -v
# When project, network architecture, and chain are all aliased:
curl http://eth-evm-rpc.example.com/healthcheck -v