Failsafe

Failsafe policies handle intermittent upstream issues — timeouts, slowdowns, rate limits, transient errors, disagreement between upstreams. They live on networks (the whole request lifecycle, including failover across upstreams) and on upstreams (one attempt against one endpoint).

Six policies

Each one has its own page. This page covers what's common to all of them: scoping, defaults, the observability layer that records every attempt.

  • Timeout — bound how long a request may take. Fixed or quantile-adaptive.
  • Retry — replay transient failures with backoff. Empty-result and block-unavailable get separate knobs.
  • Hedge — race a backup request when the primary is slow.
  • Circuit breaker — temporarily remove an upstream after sustained failure.
  • Consensus — query multiple upstreams in parallel and require agreement.
  • Integrity — empty-response handling and data-correctness checks.

Scoping each policy

Every entry in failsafe[] can be scoped by method and by finality. Entries are evaluated in order — the first whose matchMethod + matchFinality matches the request wins.

  • matchMethod — matcher syntax: * (wildcard), | (OR), ! (NOT). E.g. "eth_call|trace_*", "!debug_*".
  • matchFinality — list of finality states. Omit to match every finality.
erpc.yaml: projects > networks[] > failsafe[]

projects:
  - id: main
    networks:
      - architecture: evm
        evm:
          chainId: 1
        failsafe:
          # Per-method scoping: heavy methods get a longer ceiling.
          - matchMethod: "trace_*|debug_*"
            timeout: { duration: 60s }
            retry:   { maxAttempts: 1 }
          # Finality scoping: realtime data deserves shorter timeouts and aggressive hedges.
          - matchMethod: "*"
            matchFinality: ["realtime", "unfinalized"]
            timeout: { duration: 5s }
            retry:   { maxAttempts: 3, delay: 100ms }
            hedge:   { quantile: 0.9, minDelay: 50ms, maxCount: 2 }
          # Catch-all (matched last).
          - matchMethod: "*"
            timeout: { duration: 30s }
            retry:   { maxAttempts: 3, delay: 0ms }
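The first-match-wins rule can be sketched in a few lines of Python. This is an illustrative model only, not eRPC's implementation: `method_matches` and `select_entry` are hypothetical helpers, and the matcher semantics (each `|` alternative is tried in turn, a leading `!` negates that alternative, `*` is a glob wildcard) are assumptions based on the syntax described above.

```python
from fnmatch import fnmatchcase


def method_matches(pattern: str, method: str) -> bool:
    """Hypothetical matcher: '|' separates alternatives, a leading '!' negates one."""
    for alt in pattern.split("|"):
        alt = alt.strip()
        if alt.startswith("!"):
            # Negated alternative: matches when the glob does NOT match.
            if not fnmatchcase(method, alt[1:]):
                return True
        elif fnmatchcase(method, alt):
            return True
    return False


def select_entry(entries, method, finality):
    """Return the first entry whose matchMethod and matchFinality both match."""
    for entry in entries:
        if not method_matches(entry.get("matchMethod", "*"), method):
            continue
        finalities = entry.get("matchFinality")  # omitted -> matches every finality
        if finalities is None or finality in finalities:
            return entry
    return None


# Simplified copy of the three entries from the config above (timeouts only).
FAILSAFE = [
    {"matchMethod": "trace_*|debug_*", "timeout": "60s"},
    {"matchMethod": "*", "matchFinality": ["realtime", "unfinalized"], "timeout": "5s"},
    {"matchMethod": "*", "timeout": "30s"},
]

print(select_entry(FAILSAFE, "trace_block", "unknown")["timeout"])       # 60s
print(select_entry(FAILSAFE, "eth_blockNumber", "realtime")["timeout"])  # 5s
print(select_entry(FAILSAFE, "eth_getLogs", "finalized")["timeout"])     # 30s
```

Because evaluation stops at the first match, the order of entries matters: a catch-all `"*"` entry placed first would shadow everything below it.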

Finality states

matchFinality accepts these four values. There is no latest value — that's a block tag, not a finality state. Using matchFinality: ["latest"] silently never matches.

| State | What it means |
| --- | --- |
| finalized | Block past the chain's finalization horizon. Safe from reorgs. E.g. eth_getBlockByNumber on an old block, finalized eth_getLogs ranges. Relaxed failsafe is fine. |
| unfinalized | Recent block that could still reorg. Pending-block data also counts. E.g. eth_getBlockByNumber("latest") on a fresh block. May need more aggressive retries and shorter timeouts. |
| realtime | Data that updates every block: eth_blockNumber, eth_gasPrice, eth_maxPriorityFeePerGas, net_peerCount. Short timeouts + hedge are common. |
| unknown | Block number not derivable from request/response: eth_getTransactionByHash, trace_transaction, debug_traceTransaction. Data is typically immutable once mined; block context just isn't surfaced. |
⚠️ matchFinality: ["latest"] is invalid. latest is a block tag, not a finality state. Use realtime or unfinalized instead.
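Since a bad matchFinality value never matches and produces no error, a small pre-deploy check can catch it. The sketch below is a hypothetical lint, not an eRPC feature; it assumes the failsafe entries have already been loaded into plain Python dicts.

```python
# The four finality states listed above; anything else never matches.
VALID_FINALITY = {"finalized", "unfinalized", "realtime", "unknown"}


def invalid_finality_values(entries):
    """Return (entry_index, value) pairs for unrecognized matchFinality values."""
    problems = []
    for index, entry in enumerate(entries):
        for value in entry.get("matchFinality", []):
            if value not in VALID_FINALITY:
                problems.append((index, value))
    return problems


entries = [
    {"matchMethod": "*", "matchFinality": ["latest"]},  # block tag, not a state
    {"matchMethod": "*", "matchFinality": ["realtime", "unfinalized"]},
]
print(invalid_finality_values(entries))  # [(0, 'latest')]
```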

Where each policy is valid

| Policy | Network level | Upstream level | Notes |
| --- | --- | --- | --- |
| timeout | yes | yes | Network timeout covers the full lifecycle (including every upstream retry). Upstream timeout bounds one attempt. |
| retry | yes | yes | Network-level retries rotate across upstreams. Upstream-level retries hit the same upstream. Empty-result retries (emptyResultAccept, etc.) only fire at the network level. |
| hedge | yes | (no-op) | Hedge races across upstreams; setting it at the upstream level is meaningless. |
| circuitBreaker | no | yes | Trips one upstream out of the rotation; the network's selection policy then routes elsewhere. |
| consensus | yes | yes (rare) | Most commonly network-level; per-upstream usage is rare. |

Disabling a policy

Set the policy's value to null (YAML), or leave it undefined / omit it (TypeScript), to opt out of a default that would otherwise apply.

failsafe:
  - matchMethod: "*"
    timeout: { duration: 30s }
    retry: null            # explicitly disable retry for requests matching this entry

Per-attempt observability

Every request carries a full execution trace exposed via trace spans, Prometheus metrics, and HTTP response headers. Useful for debugging retry/hedge/consensus decisions without server-side traces.

Trace span attributes

The Network.Forward span carries:

  • execution.attempts / execution.retries / execution.hedges (totals across all scopes)
  • execution.network_attempts / execution.network_retries / execution.network_hedges
  • upstreams.tried — ordered list of upstream IDs touched
  • upstreams.outcomes — per-attempt outcome: success / empty / transport_error / server_error / client_error / rate_limited / missing_data / exec_revert / block_unavailable / breaker_open / cancelled / timeout / skipped
  • upstreams.reasons — why each upstream was selected: primary / retry / hedge / consensus_slot / sweep
  • upstreams.durations_ms

Each individual attempt also produces Upstream.tryForward.SendRequest and Upstream.forwardAttempt child spans with upstream.id, request.method, attempt counters, and the per-attempt outcome classification.

HTTP response headers

The same trace is mirrored into HTTP response headers for client-side debugging. Headers are emitted on every response path (success, JSON-RPC error, validation reject, auth reject, rate-limit). Default mode is all.

| Header | Mode | Description |
| --- | --- | --- |
| X-ERPC-Cache | summary, all | HIT / MISS |
| X-ERPC-Upstream | summary, all | Winning upstream ID (single-winner case) |
| X-ERPC-Duration | summary, all | Wall-clock ms |
| X-ERPC-Attempts | summary, all | Total physical operations across all scopes (Upstream + Cache) |
| X-ERPC-Upstream-Attempts / -Retries / -Hedges | summary, all | Upstream-scope counters |
| X-ERPC-Network-Attempts / -Retries / -Hedges | summary, all | Network-scope rotation and retry counters |
| X-ERPC-Cache-Attempts / -Retries / -Hedges | summary, all (when non-zero) | Cache-scope counters |
| X-ERPC-Consensus-Slots / -Disputes / -Low-Participants | all (when non-zero) | Consensus participation counters |
| X-ERPC-Upstreams | all | Per-attempt participation log (see format below) |

X-ERPC-Upstreams format: each segment is <id>=<reason>:<outcome>:<duration>ms[:won], joined by ;:

X-ERPC-Upstreams: alchemy=primary:success:50ms:won;quicknode=hedge:timeout:5000ms;drpc=consensus_slot:exec_revert:20ms

:won is present when this attempt contributed to the final response. For single-winner requests exactly one segment carries :won; for consensus every participant in the winning agreement group does.
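For client-side debugging, the X-ERPC-Upstreams value can be split into structured records. The parser below is a minimal sketch based only on the segment format described above; `parse_upstreams_header` is a hypothetical helper, and it assumes upstream IDs contain no `=`, `:` or `;` characters and that durations are whole milliseconds.

```python
def parse_upstreams_header(value):
    """Parse '<id>=<reason>:<outcome>:<duration>ms[:won]' segments joined by ';'."""
    attempts = []
    for segment in value.split(";"):
        if not segment:
            continue  # tolerate a trailing ';'
        upstream_id, _, detail = segment.partition("=")
        parts = detail.split(":")
        reason, outcome, duration = parts[0], parts[1], parts[2]
        if duration.endswith("ms"):
            duration = duration[:-2]  # drop the "ms" unit suffix
        attempts.append({
            "id": upstream_id,
            "reason": reason,
            "outcome": outcome,
            "duration_ms": int(duration),
            "won": len(parts) > 3 and parts[3] == "won",
        })
    return attempts


header = (
    "alchemy=primary:success:50ms:won;"
    "quicknode=hedge:timeout:5000ms;"
    "drpc=consensus_slot:exec_revert:20ms"
)
parsed = parse_upstreams_header(header)
print([a["id"] for a in parsed if a["won"]])  # ['alchemy']
print(parsed[1]["outcome"], parsed[1]["duration_ms"])  # timeout 5000
```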

Toggle via server.executionHeaders:

server:
  executionHeaders: all        # default — full per-attempt trace
  # executionHeaders: summary  # counters only (no X-ERPC-Upstreams slice)
  # executionHeaders: off      # no X-ERPC-* headers at all

Common pitfalls

  • matchFinality: ["latest"] — silently never matches. Valid values are finalized, unfinalized, realtime, unknown.
  • Mixing upstream + network retry without thinking about the product: upstream.retry.maxAttempts: 3 × network.retry.maxAttempts: 3 = up to 9 attempts per request. Easy to accidentally 9× your upstream traffic.
  • Network timeout shorter than upstream.timeout × maxAttempts — the network gives up before the upstream's retry budget is exhausted. Set the network timeout generously.
  • circuitBreaker at network level — silently ignored; only valid at the upstream level.
  • emptyResultIgnore is deprecated — rename to emptyResultAccept. For network-wide empty-retry control, use directiveDefaults.retryEmpty: false (or per-request ?retryEmpty=false).
  • Single-object legacy failsafe: { ... } form — still accepted, but the array form with matchMethod: "*" is canonical. The single-object form is implicitly matchMethod: "*".
  • retry.delay: 0ms doesn't disable retry — it means "no wait between attempts". Use maxAttempts: 1 to disable retry entirely.
  • Write methods aren't retried even when retry is configured. Set network.evm.idempotentTransactionBroadcast: true if you want eth_sendRawTransaction to be safe under retry/hedge.
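The two retry-budget pitfalls above come down to simple arithmetic, sketched here in Python. Both helpers are hypothetical and exist only for illustration; the optional fixed `delay_s` between attempts is an added assumption, and backoff and jitter are ignored.

```python
def worst_case_attempts(network_max_attempts, upstream_max_attempts):
    """Upper bound on upstream calls when both retry layers are configured."""
    return network_max_attempts * upstream_max_attempts


def upstream_retry_budget_s(upstream_timeout_s, upstream_max_attempts, delay_s=0):
    """Time one upstream may consume if every attempt runs to its timeout."""
    waits = delay_s * (upstream_max_attempts - 1)  # pauses between attempts
    return upstream_timeout_s * upstream_max_attempts + waits


print(worst_case_attempts(3, 3))  # 9

budget = upstream_retry_budget_s(upstream_timeout_s=5, upstream_max_attempts=3)
network_timeout_s = 10
print(budget)                      # 15
print(network_timeout_s < budget)  # True: the network would give up first
```

In this example a 10s network timeout cuts off an upstream that is entitled to 15s of attempts, so either the network timeout should grow or the upstream budget should shrink.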
Failsafe scoping & observability reference

FailsafeConfig — top-level fields

| Field | Type | Notes |
| --- | --- | --- |
| matchMethod | string | Matcher pattern. Defaults to "*". Supports * (wildcard), \| (OR), ! (NOT). |
| matchFinality | ("finalized"\|"unfinalized"\|"realtime"\|"unknown")[] | When omitted, matches every finality. Do not use "latest" here: that's a block tag, not a state. |
| timeout | TimeoutPolicyConfig | See Timeout policy. |
| retry | RetryPolicyConfig | See Retry policy. |
| hedge | HedgePolicyConfig | See Hedge policy. |
| circuitBreaker | CircuitBreakerPolicyConfig | Upstream-only. See Circuit breaker. |
| consensus | ConsensusPolicyConfig | See Consensus. |

Each policy is independent — you can set any subset on a single failsafe[] entry. Evaluation order within failsafe[] is top-to-bottom; first match wins.

Where each policy lives — at a glance

| Policy | Network | Upstream | Cache (failsafeForGets/failsafeForSets) |
| --- | --- | --- | --- |
| timeout | | | |
| retry | | | |
| hedge | | (no-op) | |
| circuitBreaker | | | |
| consensus | | (rare) | |

Retryable vs non-retryable errors (canonical list)

Retryable (retry will replay these):

  • HTTP 5xx from the upstream
  • HTTP 408 (request timeout)
  • HTTP 429 (rate limit) — but prefer rateLimitAutoTune for sustained pressure
  • Network errors (TCP reset, DNS failure)
  • Empty responses for methods NOT in retry.emptyResultAccept, when the retryEmpty directive is set
  • Block-unavailable conditions where the request's block reference is beyond every upstream's known head

Non-retryable (single-attempt; never retried):

  • HTTP 4xx other than 408/429
  • MethodNotSupported from the upstream
  • Empty responses for methods in retry.emptyResultAccept, at or below the emptyResultConfidence horizon
  • Write methods (eth_sendRawTransaction, eth_sendTransaction) — unless evm.idempotentTransactionBroadcast is enabled on the network
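The HTTP-status rows of the two lists above can be expressed as a small predicate. This sketch covers only status codes; it does not model the empty-response, block-unavailable, MethodNotSupported or write-method rules, and `is_retryable_status` is a hypothetical name rather than an eRPC function.

```python
def is_retryable_status(status):
    """Classify an upstream HTTP status per the lists above (status codes only)."""
    if status in (408, 429):
        return True  # request timeout and rate limit are retryable 4xx exceptions
    if 500 <= status <= 599:
        return True  # any 5xx from the upstream
    return False     # every other status is treated as non-retryable here


for code in (500, 503, 408, 429, 400, 404):
    print(code, is_retryable_status(code))
```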

Per-method scoping recipes

Different policy per finality:

failsafe:
  - matchMethod: "*"
    matchFinality: ["realtime", "unfinalized"]
    timeout: { duration: 5s }
    retry:   { maxAttempts: 3, delay: 100ms }
  - matchMethod: "*"
    matchFinality: ["finalized"]
    timeout: { duration: 60s }     # tolerate long backfill reads
    retry:   { maxAttempts: 5, delay: 200ms }
  - matchMethod: "*"
    matchFinality: ["unknown"]      # tx-hash keyed (receipts, traces by hash)
    timeout: { duration: 30s }
    retry:   { maxAttempts: 3 }

Different policy per method group:

failsafe:
  - matchMethod: "trace_*|debug_*"   # expensive — don't multiply
    timeout: { duration: 60s }
    retry:   { maxAttempts: 1 }
  - matchMethod: "eth_getLogs"
    timeout: { duration: 30s }
    retry:   { maxAttempts: 3, delay: 100ms }
  - matchMethod: "*"
    timeout: { duration: 15s }
    retry:   { maxAttempts: 3 }

Real-world example — high-throughput DeFi with hedging

Aggressive hedge across upstreams at network level; per-method fine-tuning at upstream level.

projects:
  - id: defi-prod
    networks:
      - architecture: evm
        evm: { chainId: 1 }
        failsafe:
          - matchMethod: "*"
            hedge: { quantile: 0.9, minDelay: 50ms, maxCount: 2 }
            timeout: { duration: 10s }
    upstreams:
      - id: primary-node
        endpoint: https://primary.example
        failsafe:
          # Price-feed reads — fast and unforgiving
          - matchMethod: "eth_call"
            matchFinality: ["realtime", "unfinalized"]
            timeout: { duration: 1s }
            retry:   { maxAttempts: 1 }
          # Block lookups — slower but must succeed
          - matchMethod: "eth_getBlock*"
            timeout: { duration: 5s }
            retry:   { maxAttempts: 5, delay: 100ms }

Real-world example — indexer chasing tip with broad empty retries

Network-wide retry-empty (caches will catch up shortly), tight per-method scoping for the long-tail backfill.

projects:
  - id: indexer
    networks:
      - architecture: evm
        evm: { chainId: 1 }
        directiveDefaults:
          retryEmpty: true        # treat empty as retryable across the board
        failsafe:
          - matchMethod: "eth_getLogs|eth_call"
            retry:
              maxAttempts: 5
              delay: 100ms
              backoffFactor: 1.2
              jitter: 50ms
              emptyResultConfidence: finalizedBlock
              emptyResultMaxAttempts: 2
          - matchMethod: "*"
            retry: { maxAttempts: 3, delay: 200ms }
            timeout: { duration: 30s }
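To get a feel for what delay: 100ms with backoffFactor: 1.2 implies, the sketch below computes a nominal wait schedule. It rests on assumptions that should be checked against the Retry page: that the delay grows geometrically by backoffFactor after each retry, and that maxAttempts counts the initial attempt (consistent with maxAttempts: 1 disabling retry). Jitter and any maximum-delay cap are ignored, and `retry_delays_ms` is a hypothetical helper.

```python
def retry_delays_ms(max_attempts, delay_ms, backoff_factor=1.0):
    """Nominal wait before each retry, assuming geometric backoff and no jitter."""
    retries = max(max_attempts - 1, 0)  # the first attempt is not a retry
    return [delay_ms * backoff_factor ** n for n in range(retries)]


delays = retry_delays_ms(max_attempts=5, delay_ms=100, backoff_factor=1.2)
print([round(d, 1) for d in delays])  # [100.0, 120.0, 144.0, 172.8]
print(round(sum(delays), 1))          # 536.8
```

Under these assumptions the eth_getLogs|eth_call entry spends roughly half a second waiting between attempts in the worst case, before counting the attempts themselves.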

For policy-level field tables, defaults, gotchas, and metrics, see the dedicated pages: Timeout, Retry, Hedge, Circuit breaker, Consensus, Integrity.