# Timeout

> Source: https://docs.erpc.cloud/config/failsafe/timeout
> Give every request a hard latency budget: three nested layers keep stalled upstreams from tying up your connections indefinitely.

A single slow upstream can monopolize connections and ripple into your application's latency. eRPC solves this with three nested timeout layers: a hard HTTP ceiling, a lifecycle bound covering every retry and hedge, and a per-upstream limit that triggers failover instead of waiting forever. Set them once and never babysit stalled providers again.

**What you get**

- Predictable latency budgets for callers regardless of upstream behavior
- Automatic failover when a single upstream stalls, so other providers still win
- Adaptive timeouts that self-tune to per-method P50/P90/P99 as traffic flows
- Protection against feedback loops: a built-in floor prevents fast quantiles from collapsing the budget

## Quick taste

Illustrative, not a tuned production config: an adaptive network timeout keyed to per-method P99.

**Config path:** `projects[].networks[].failsafe[].timeout`

**YAML (`erpc.yaml`):**

```yaml
projects:
  - id: main
    networks:
      - architecture: evm
        evm: { chainId: 1 }
        failsafe:
          - matchMethod: "*"
            timeout:
              duration:
                # adaptive per-method P99: fast methods get tight budgets automatically
                quantile: 0.99
                base: 2s
                # floor prevents budget from collapsing after a run of fast responses
                min: 500ms
                max: 30s
```

**TypeScript (`erpc.ts`):**

```typescript
projects: [{
  id: "main",
  networks: [{
    architecture: "evm",
    evm: { chainId: 1 },
    failsafe: [{
      matchMethod: "*",
      timeout: {
        duration: {
          // adaptive per-method P99: fast methods get tight budgets automatically
          quantile: 0.99,
          base: "2s",
          // floor prevents budget from collapsing after a run of fast responses
          min: "500ms",
          max: "30s",
        },
      },
    }],
  }],
}]
```

## Agent reference

Copy one of these prompts into
your AI agent session (Claude Code, Cursor, …) — each one points the agent at this page's machine-readable reference so it can do the work correctly: **Prompt Example #1: set up timeout budgets from scratch** ```text Add timeout failsafe policies to my eRPC config from scratch so every request has a hard latency budget. Use adaptive (quantile) mode at the network scope for per-method self-tuning, and a tighter upstream-scope timeout so a single stalled provider triggers failover rather than burning the whole budget. Work with my existing eRPC config. Read the full reference first: https://docs.erpc.cloud/config/failsafe/timeout.llms.txt ``` **Prompt Example #2: audit existing timeouts for footguns** ```text Audit the timeout settings in my eRPC config: check that network timeouts are larger than upstream timeout × retry maxAttempts, that quantile mode always has a base or max (cold-start fail-open risk), that min is not set to 0 in quantile mode (feedback-loop collapse risk), and that all failsafe timeouts are shorter than server.maxTimeout. Show any violations with suggested fixes. Reference: https://docs.erpc.cloud/config/failsafe/timeout.llms.txt ``` **Prompt Example #3: debug requests timing out unexpectedly** ```text My eRPC instance is returning timeout errors on some requests. Help me figure out which timeout layer is firing (server.maxTimeout vs network-scope vs upstream-scope), what Prometheus metrics and response headers to check, and how to adjust the config in my eRPC config so the right layer fires at the right time. Reference: https://docs.erpc.cloud/config/failsafe/timeout.llms.txt ``` **Prompt Example #4: tune timeouts for archival / batch workloads** ```text I run both latency-sensitive realtime queries and heavy archival eth_getLogs scans through the same eRPC instance. Add per-method timeout failsafe entries in my eRPC config so archive methods get a generous ceiling while realtime reads stay tight, without raising server.maxTimeout for everyone. 
Reference: https://docs.erpc.cloud/config/failsafe/timeout.llms.txt ``` **Prompt Example #5: pair upstream timeouts with retry for automatic failover** ```text Configure eRPC so that when one upstream stalls it times out quickly and the network-level retry automatically routes to a different provider — without the caller ever seeing a failure unless ALL providers are down. Use a tight upstream-scope timeout and a generous network timeout in my eRPC config that covers the full retry budget. Reference: https://docs.erpc.cloud/config/failsafe/timeout.llms.txt ``` --- ### Timeout — full agent reference ### How it works **Three-layer hierarchy.** From outermost to innermost: 1. **HTTP server `maxTimeout`** (`server.maxTimeout`, default `150s`). Applied via `TimeoutHandler` wrapping the entire HTTP handler using `context.WithTimeoutCause(r.Context(), dt, ErrHandlerTimeout)`. This is an absolute ceiling — no request survives it regardless of what failsafe policies are set. Source: [`erpc/http_server.go:L65-67`](https://github.com/erpc/erpc/blob/main/erpc/http_server.go#L65-L67), [`erpc/http_timeout.go:L20-143`](https://github.com/erpc/erpc/blob/main/erpc/http_timeout.go#L20-L143). 2. **Network-scope failsafe timeout** (`networks[].failsafe[].timeout`). Applied as `context.WithTimeoutCause` wrapping the entire `networkExecutor.Run` call — it bounds ALL retries and hedges for that method match, not just a single attempt. Fires with `ErrDynamicTimeoutExceeded`, later classified as `ErrFailsafeTimeoutExceeded{scope: "network"}`. Source: [`erpc/network_executor.go:L164-171`](https://github.com/erpc/erpc/blob/main/erpc/network_executor.go#L164-L171). 3. **Upstream-scope failsafe timeout** (`upstreams[].failsafe[].timeout`). Applied inside the upstream's own failsafe executor, bounding one upstream's entire retry budget. Classification becomes `ErrFailsafeTimeoutExceeded{scope: "upstream"}`. When this fires, the network's retry or selection policy can still route elsewhere. 
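As a rough illustration of the nesting described above, the sketch below models each layer as a start offset plus a budget and reports which layer's deadline arrives first. It is a hypothetical TypeScript model, not eRPC code: the `Layer` type, the `earliestDeadline` function, and every number except the `150s` server default are assumptions chosen for the example.

```typescript
// Hypothetical model of the three nested layers; names and shapes are illustrative only.
type LayerName = "server" | "network" | "upstream";

interface Layer {
  name: LayerName;
  startMs: number;         // when this layer's clock starts, relative to request arrival
  budgetMs: number | null; // null = no timeout configured for this layer
}

interface Deadline {
  owner: LayerName;
  atMs: number;
}

// Returns the earliest deadline among the configured layers, or null if none is set.
// Layers are passed outermost-first; on an exact tie the outer layer is reported
// (a simplification made for this sketch).
function earliestDeadline(layers: Layer[]): Deadline | null {
  let best: Deadline | null = null;
  for (const layer of layers) {
    if (layer.budgetMs === null) continue;
    const atMs = layer.startMs + layer.budgetMs;
    if (best === null || atMs < best.atMs) {
      best = { owner: layer.name, atMs };
    }
  }
  return best;
}

// A stalled first attempt: the upstream limit is the first deadline to arrive.
const firstAttempt = earliestDeadline([
  { name: "server", startMs: 0, budgetMs: 150_000 },
  { name: "network", startMs: 0, budgetMs: 30_000 },
  { name: "upstream", startMs: 0, budgetMs: 5_000 },
]);
console.log(firstAttempt);

// A late attempt: the upstream clock starts 28s in, so the network deadline comes first.
const lateAttempt = earliestDeadline([
  { name: "server", startMs: 0, budgetMs: 150_000 },
  { name: "network", startMs: 0, budgetMs: 30_000 },
  { name: "upstream", startMs: 28_000, budgetMs: 5_000 },
]);
console.log(lateAttempt);
```

In the first call the upstream-scope limit is reached at 5s, which leaves the network layer time to route elsewhere. In the second call the upstream's own deadline would land at 33s, so the enclosing network deadline at 30s cuts it off first.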
**No per-attempt timeout.** There is no isolated timeout wrapping only a single RPC call. Both scopes are lifecycle-scoped. A 500ms network timeout with `maxAttempts: 3` gives a 500ms total budget shared across all three attempts — not 500ms each. This is intentional and test-locked (`erpc/networks_timeout_test.go:L378`). **`NewTimeoutFunc` construction path.** `common.NewTimeoutFunc(logger, cfg)` builds a `TimeoutFunc` from `TimeoutPolicyConfig`. The returned function is called per-request and returns `*time.Duration` (nil disables the timeout). Construction steps: 1. `cfg == nil || cfg.Duration.IsZero()` → return nil (no timeout for this scope) 2. `quantile <= 0` → pre-compute `dur = spec.Resolve(nil)` once at construction, return a constant function 3. `quantile > 0` and `min == 0` → apply the auto-floor to a local copy, then build a per-request closure 4. Per-request closure calls `ntw.GetMethodMetrics(method).GetResponseQuantiles()` then `resolved.Resolve(qt)`, and emits `MetricNetworkTimeoutDurationSeconds` on each call Source: [`common/timeout_func.go:L23-83`](https://github.com/erpc/erpc/blob/main/common/timeout_func.go#L23-L83). **`ErrDynamicTimeoutExceeded` sentinel type.** Declared as `var ErrDynamicTimeoutExceeded = errors.New("dynamic timeout exceeded")` at [`common/errors.go:L1981-1984`](https://github.com/erpc/erpc/blob/main/common/errors.go#L1981-L1984). It is a plain `*errors.errorString` — NOT a `StandardError` (has no `Code`, `Message`, or `Details` fields). This is intentional: it is a control-flow signal, not a client-facing domain error. Detection via `errors.Is(err, common.ErrDynamicTimeoutExceeded)` works through context propagation. After classification it is wrapped by `NewErrFailsafeTimeoutExceeded(scope, cause, &startTime)` into a `StandardError` with code `ErrCodeFailsafeTimeoutExceeded`. Source: [`common/errors.go:L1588-1603`](https://github.com/erpc/erpc/blob/main/common/errors.go#L1588-L1603). 
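To make the lifecycle-scoped budget from "No per-attempt timeout" above concrete, here is a small simulation of several attempts drawing from one shared budget. It is a hypothetical sketch rather than eRPC's implementation: `runWithSharedBudget`, its parameters, and the attempt durations are invented for illustration, and retry delays are ignored.

```typescript
// Hypothetical simulation of a lifecycle-scoped budget; not eRPC code.
type Outcome = "success" | "timeout" | "retries-exhausted";

interface BudgetResult {
  outcome: Outcome;
  attemptsStarted: number;
  elapsedMs: number;
}

// attemptMs[i] is how long attempt i runs before it returns (made-up inputs).
// successIndex is the attempt that succeeds, or null if every attempt fails.
// Retry delays are ignored to keep the arithmetic visible.
function runWithSharedBudget(
  budgetMs: number,
  maxAttempts: number,
  attemptMs: number[],
  successIndex: number | null,
): BudgetResult {
  let elapsedMs = 0;
  let attemptsStarted = 0;
  const limit = Math.min(maxAttempts, attemptMs.length);
  for (let i = 0; i < limit; i++) {
    attemptsStarted++;
    const remainingMs = budgetMs - elapsedMs;
    const durationMs = attemptMs[i] as number;
    if (durationMs >= remainingMs) {
      // The shared deadline fires during this attempt; later attempts never start.
      return { outcome: "timeout", attemptsStarted, elapsedMs: budgetMs };
    }
    elapsedMs += durationMs;
    if (successIndex === i) {
      return { outcome: "success", attemptsStarted, elapsedMs };
    }
  }
  return { outcome: "retries-exhausted", attemptsStarted, elapsedMs };
}

// 500ms budget, maxAttempts 3, each attempt fails after 200ms:
// the third attempt starts with only 100ms left and is cut off.
console.log(runWithSharedBudget(500, 3, [200, 200, 200], null));

// Same budget, 300ms per attempt: the second attempt is cut off and the third never starts.
console.log(runWithSharedBudget(500, 3, [300, 300, 300], null));
```

The point of the sketch is that the budget is not multiplied by `maxAttempts`: every attempt spends from the same 500ms, so slow early attempts shrink or eliminate the time available to later ones.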
**Per-method quantile tracking implementation.** The `QuantileTracker` maintained by the network's method-metrics registry uses a streaming quantile algorithm (not a fixed histogram). It accumulates actual wall-clock response durations per method and answers `GetQuantile(q)` queries. `eth_call` and `eth_getLogs` accumulate independent trackers. The `AdaptiveDuration.ResolveForRequest` utility bridges from a request context to the tracker. Source: [`common/network.go:L51-60`](https://github.com/erpc/erpc/blob/main/common/network.go#L51-L60), [`common/adaptive_duration.go:L270-290`](https://github.com/erpc/erpc/blob/main/common/adaptive_duration.go#L270-L290).

**`TimeoutHandler` response buffering.** `TimeoutHandler` buffers the entire response in a pooled `bytes.Buffer` via `timeoutWriter`. On timeout it discards the buffer and writes the fixed JSON-RPC error body. On normal completion within the deadline it copies buffered headers and body to the wire. Source: [`erpc/http_timeout.go:L36-143`](https://github.com/erpc/erpc/blob/main/erpc/http_timeout.go#L36-L143).

**Static mode.** Set `duration` to a scalar (`"30s"`) or an object with only `base`. The timeout is a fixed constant. `min` and `max` are ignored in static mode; `Resolve` returns `base` unchanged. Source: [`common/adaptive_duration.go:L88-89`](https://github.com/erpc/erpc/blob/main/common/adaptive_duration.go#L88-L89).

**Quantile-adaptive mode.** When `quantile > 0`, the effective timeout is computed per request as:

```
effective = clamp(base + P(q) of method latency, min, max)
```

Latency is tracked per *(network × method)* using the streaming `QuantileTracker` described above, so each method (for example `eth_call` versus `eth_getLogs`) is measured against its own latency distribution. The `base` offset adds a constant buffer on top of the quantile value.

**Cold-start behavior.** When no latency data exists yet, `AdaptiveDuration.Resolve` returns `Min` as the adaptive component (NOT `base`).
If that also resolves to zero, `coldStartFallback` in `NewTimeoutFunc` tries `base` first, then `max`, and returns nil (fail-open) if both are zero. **Cold-start priority order: `base > max > nil (fail-open)`.** Always set `base` or `max` when using quantile mode. Source: [`common/timeout_func.go:L85-93`](https://github.com/erpc/erpc/blob/main/common/timeout_func.go#L85-L93). **Feedback-loop prevention floor.** When `quantile > 0` and `min` is not configured, `NewTimeoutFunc` auto-populates `min = base/2` (or `500ms` when `base == 0`) on a local copy of the spec. Without this floor, fast quantiles could shrink the timeout so aggressively that all requests start timing out, keeping the quantile low in a self-reinforcing loop. The auto-floor is applied to a local copy — the original config is never mutated. Source: [`common/timeout_func.go:L29-37`](https://github.com/erpc/erpc/blob/main/common/timeout_func.go#L29-L37). **Retry interaction.** The network timeout is the outer boundary; retries run inside it. A common footgun is setting the network timeout shorter than `upstream.timeout × maxAttempts`. When retry exhausts and the last attempt timed out, `ErrFailsafeRetryExceeded` wins as the top-level error; `erpc_network_timeout_fired_total` does NOT increment in this case. **Hedge interaction.** All legs of a hedge race share the same deadline. If the network timeout fires before any leg wins, all legs are cancelled. **Error sentinel distinction.** `ErrHandlerTimeout` (server) and `ErrDynamicTimeoutExceeded` (failsafe) are different sentinels. The network executor checks for `ErrDynamicTimeoutExceeded` specifically — it will never misclassify an HTTP-server timeout as `ErrFailsafeTimeoutExceeded`. Source: [`erpc/networks.go:L1427-1432`](https://github.com/erpc/erpc/blob/main/erpc/networks.go#L1427-L1432). **Timeout attribution and counter guards.** `MetricNetworkTimeoutFiredTotal` is incremented by the scope that owns the policy. 
Four guards prevent double-counting at `erpc/networks.go:L1435-L1438`: 1. `failsafeExecutor.HasTimeout()` — this scope has a configured timeout policy. 2. `errors.Is(execErr, common.ErrDynamicTimeoutExceeded)` — the eRPC-specific sentinel is present; generic `context.DeadlineExceeded` from a parent context does not satisfy this. 3. `!HasErrorCode(execErr, ErrCodeFailsafeRetryExceeded)` — retry-exhausted suppression: when retry exhausts and the last attempt timed out, `ErrFailsafeRetryExceeded` wins and the timeout counter does not fire. 4. `!HasErrorCode(execErr, ErrCodeFailsafeTimeoutExceeded)` — cross-scope suppression: when an upstream-scope timeout already classified the error, the network-scope counter does NOT double-count it. Source: [`erpc/networks.go:L1424-1451`](https://github.com/erpc/erpc/blob/main/erpc/networks.go#L1424-L1451), [`erpc/networks_timeout_test.go:L495-538`](https://github.com/erpc/erpc/blob/main/erpc/networks_timeout_test.go#L495-L538). ### Config schema #### `server.maxTimeout` | Field | Type | Default | Behavior / footguns | |---|---|---|---| | `server.maxTimeout` | `*Duration` | `150s` ([`erpc/http_server.go:L65-67`](https://github.com/erpc/erpc/blob/main/erpc/http_server.go#L65-L67)) | Absolute HTTP-handler ceiling. POST → HTTP 200 + JSON-RPC `-32603`; non-POST → HTTP 504. No failsafe policy can exceed this. Always set failsafe timeouts shorter. | #### `TimeoutPolicyConfig` — network or upstream scope All fields live under `networks[].failsafe[].timeout` or `upstreams[].failsafe[].timeout`. Struct: [`common/config.go:L1406-1408`](https://github.com/erpc/erpc/blob/main/common/config.go#L1406-L1408). 

| Field | Type | Default | Behavior / footguns |
|---|---|---|---|
| `timeout.duration` | `Duration \| AdaptiveDuration` | Network: `120s` ([`common/defaults.go:L134-136`](https://github.com/erpc/erpc/blob/main/common/defaults.go#L134-L136)); Upstream: `60s` ([`common/defaults.go:L155-157`](https://github.com/erpc/erpc/blob/main/common/defaults.go#L155-L157)) | The timeout budget. Scalar sets `base` only (static mode). Object `{base, quantile, min, max}` enables adaptive mode. Zero/nil disables the timeout for this scope (fail-open). |
| `timeout.duration.base` | `Duration` | `0` | Static fallback and cold-start addend. When `quantile == 0`, this IS the effective timeout. When `quantile > 0`, added to the quantile value before clamping. `coldStartFallback` prefers `base` over `max`. |
| `timeout.duration.quantile` | `float64` | `0` (static mode) | Percentile of per-method response times, range `[0, 1]`. When `> 0`, enables adaptive mode. Requires `base` or `max` to avoid cold-start fail-open. Source: [`common/adaptive_duration.go:L83-109`](https://github.com/erpc/erpc/blob/main/common/adaptive_duration.go#L83-L109). |
| `timeout.duration.min` | `Duration` | **Auto-floor**: when `quantile > 0` and `min == 0`, auto-set to `base/2` (or `500ms` when `base == 0`) at build time. Source: [`common/timeout_func.go:L31-36`](https://github.com/erpc/erpc/blob/main/common/timeout_func.go#L31-L36). | Floor after quantile resolution. Prevents feedback-loop collapse. Cold-start: when quantile returns 0, `min` is used as the adaptive component inside `Resolve`. |
| `timeout.duration.max` | `Duration` | `0` (uncapped) | Ceiling after quantile resolution. Cold-start fallback of last resort when `base == 0`: `coldStartFallback` returns `max`. If both `base` and `max` are zero, cold start is fail-open. Source: [`common/timeout_func.go:L89-91`](https://github.com/erpc/erpc/blob/main/common/timeout_func.go#L89-L91). |

#### Legacy flat siblings (backward compatible)

| Field | Type | Folds into | Precedence |
|---|---|---|---|
| `timeout.quantile` | `float64` | `duration.quantile` | Object form wins; sibling fills only if `duration.quantile == 0`. Source: [`common/config.go:L1471-1473`](https://github.com/erpc/erpc/blob/main/common/config.go#L1471-L1473). |
| `timeout.minDuration` | `Duration` (string or int-as-ms) | `duration.min` | Same precedence rule. Source: [`common/config.go:L1474-1476`](https://github.com/erpc/erpc/blob/main/common/config.go#L1474-L1476). |
| `timeout.maxDuration` | `Duration` (string or int-as-ms) | `duration.max` | Same precedence rule. Source: [`common/config.go:L1477-1479`](https://github.com/erpc/erpc/blob/main/common/config.go#L1477-L1479). |

Legacy form still accepted:

```yaml
# Legacy flat form: still valid, folded into duration.* at parse time
timeout:
  duration: 5s
  quantile: 0.99
  minDuration: 200ms
  maxDuration: 30s
```

### Worked examples

All patterns below are distilled from real production fleets; comments explain the non-obvious choices.

**1. Finality-split network timeouts (recommended general shape).** Production uses separate failsafe entries per finality state so realtime/unfinalized data, where block-availability races make latency spiky, can be budgeted independently of finalized reads (the example below happens to use the same ceilings for both). P99 quantile mode means `eth_blockNumber` self-tunes to a tight budget while `eth_getLogs` breathes.
The `min` floor prevents the adaptive budget from collapsing after a run of fast responses:

**Config path:** `projects[].networks[].failsafe[]`

**YAML (`erpc.yaml`):**

```yaml
failsafe:
  # eth_call and eth_getLogs: realtime/unfinalized have block-availability races
  - matchMethod: "eth_call|eth_getLogs"
    matchFinality: [realtime, unfinalized]
    timeout:
      duration:
        quantile: 0.99
        # base doubles as cold-start fallback; always set it
        base: 30s
        # min floor prevents feedback-loop collapse after a run of fast responses
        min: 20s
        max: 30s
  # finalized/unknown: same methods but no block-availability race → same ceiling is fine
  - matchMethod: "eth_call|eth_getLogs"
    matchFinality: [finalized, unknown]
    timeout:
      duration:
        quantile: 0.99
        base: 30s
        min: 20s
        max: 30s
  # generic catch-all for realtime reads; wider because methods like trace_* are slow
  - matchMethod: "*"
    matchFinality: [realtime, unfinalized]
    timeout:
      duration:
        quantile: 0.99
        base: 60s
        min: 40s
        max: 60s
  - matchMethod: "*"
    matchFinality: [finalized, unknown]
    timeout:
      duration:
        quantile: 0.99
        base: 60s
        min: 40s
        max: 60s
```

**TypeScript (`erpc.ts`):**

```typescript
failsafe: [
  {
    matchMethod: "eth_call|eth_getLogs",
    matchFinality: ["realtime", "unfinalized"],
    timeout: { duration: { quantile: 0.99, base: "30s", min: "20s", max: "30s" } },
  },
  {
    matchMethod: "eth_call|eth_getLogs",
    matchFinality: ["finalized", "unknown"],
    timeout: { duration: { quantile: 0.99, base: "30s", min: "20s", max: "30s" } },
  },
  {
    matchMethod: "*",
    matchFinality: ["realtime", "unfinalized"],
    timeout: { duration: { quantile: 0.99, base: "60s", min: "40s", max: "60s" } },
  },
  {
    matchMethod: "*",
    matchFinality: ["finalized", "unknown"],
    timeout: { duration: { quantile: 0.99, base: "60s", min: "40s", max: "60s" } },
  },
]
```

**2. Per-method upstream timeouts with automatic failover.** Upstream-scope timeouts are the first line of defense: when one provider stalls, a tight per-method ceiling triggers failover to another provider without burning the full network budget. Critically, `hedge` and `retry` are both `null` at the upstream scope because the network layer owns those policies. Methods with variable latency (like `eth_getLogs`, `trace_*`) get wider ceilings so legitimate big queries don't time out prematurely:

**Config path:** `projects[].upstreams[].failsafe[]`

**YAML (`erpc.yaml`):**

```yaml
upstreams:
  - id: primary
    endpoint: https://rpc.example.com
    failsafe:
      # fee/nonce reads: extremely tight; these must be fast or skip the upstream
      - matchMethod: "eth_getTransactionCount|eth_gasPrice|eth_maxPriorityFeePerGas"
        timeout:
          duration:
            quantile: 0.8
            base: 500ms
            min: 100ms
            max: 500ms
        hedge: null # upstream scope owns no hedge/retry; the network layer does
        retry: null
      # heavy data: legitimate eth_getLogs on wide ranges routinely take seconds
      - matchMethod: "eth_getLogs|eth_getBlockReceipts"
        timeout:
          duration:
            quantile: 0.9
            # raised from default: subgraph backfills legitimately take 2-15s
            base: 15s
            min: 2s
            max: 15s
        hedge: null
        retry: null
      # light getters: small responses, tight cap encourages fast hedging on stuck upstreams
      - matchMethod: "eth_get*"
        timeout:
          duration:
            quantile: 0.9
            base: 5s
            min: 200ms
            max: 5s
        hedge: null
        retry: null
      # catch-all
      - matchMethod: "*"
        timeout:
          duration:
            quantile: 0.8
            base: 60s
            min: 500ms
            max: 60s
        hedge: null
        retry: null
```

**TypeScript (`erpc.ts`):**

```typescript
upstreams: [{
  id: "primary",
  endpoint: "https://rpc.example.com",
  failsafe: [
    {
      matchMethod: "eth_getTransactionCount|eth_gasPrice|eth_maxPriorityFeePerGas",
      timeout: { duration: { quantile: 0.8, base: "500ms", min: "100ms", max: "500ms" } },
      hedge: null,
      retry: null,
    },
    {
      matchMethod: "eth_getLogs|eth_getBlockReceipts",
      timeout: { duration: { quantile: 0.9, base: "15s", min: "2s", max: "15s" } },
      hedge: null,
      retry: null,
    },
    {
      matchMethod: "eth_get*",
      timeout: { duration: { quantile: 0.9, base: "5s", min: "200ms", max: "5s" } },
      hedge: null,
      retry: null,
    },
    {
      matchMethod: "*",
      timeout: { duration: { quantile: 0.8, base: "60s", min: "500ms", max: "60s" } },
      hedge: null,
      retry: null,
    },
  ],
}]
```

**3. Transaction hash lookups: tight network timeout, no hedge.** A null `eth_getTransactionReceipt` means the tx isn't indexed yet, so hedging or heavy retries only multiply load. The network timeout ceiling matches the `upstream.timeout × maxAttempts` budget so the last retry isn't silently cut short:

**Config path:** `projects[].networks[].failsafe[]`

**YAML (`erpc.yaml`):**

```yaml
failsafe:
  - matchMethod: "eth_getTransactionByHash|eth_getTransactionReceipt"
    timeout:
      duration:
        quantile: 0.99
        # 10s ceiling = upstream timeout (5s, the eth_get* cap from example 2) × 2 attempts
        base: 10s
        min: 6s
        max: 10s
    retry:
      maxAttempts: 2 # ~one block of patience: tx is either propagated or not; more retries waste budget
      delay: 500ms
    # no hedge key: parallel fan-out on a missing tx multiplies load for no gain
```

**TypeScript (`erpc.ts`):**

```typescript
failsafe: [{
  matchMethod: "eth_getTransactionByHash|eth_getTransactionReceipt",
  timeout: { duration: { quantile: 0.99, base: "10s", min: "6s", max: "10s" } },
  retry: { maxAttempts: 2, delay: "500ms" },
  // no hedge: parallel fan-out on a missing tx multiplies load for no gain
}]
```

**4. Cache connector timeout (static, not adaptive).** Cache connectors use `failsafeForGets` / `failsafeForSets`; quantile mode is rejected at cache scope.
A static timeout ensures a slow remote cache read is abandoned quickly so eRPC falls through to upstream; a cache read that takes longer than an upstream call is never worth it:

**Config path:** `database.evmJsonRpcCache.connectors[].failsafeForGets[]`

**YAML (`erpc.yaml`):**

```yaml
connectors:
  - id: remote-cache
    driver: grpc
    failsafeForGets:
      - matchMethod: "*"
        # static: quantile is rejected at cache scope; connector latency is stable enough
        timeout:
          # 400ms hard ceiling: a slow cache read is worse than a direct upstream call
          duration: 400ms
        retry: { maxAttempts: 2, delay: "0" }
        hedge: { delay: 100ms, maxCount: 1 }
```

**TypeScript (`erpc.ts`):**

```typescript
connectors: [{
  id: "remote-cache",
  driver: "grpc",
  failsafeForGets: [{
    matchMethod: "*",
    // static: quantile is rejected at cache scope; connector latency is stable enough
    timeout: {
      // 400ms hard ceiling: a slow cache read is worse than a direct upstream call
      duration: "400ms",
    },
    retry: { maxAttempts: 2, delay: "0" },
    hedge: { delay: "100ms", maxCount: 1 },
  }],
}]
```

**5. Quantile-only cold-start safe pattern.** A pure `{quantile: 0.95}` with no `base`/`max` is fail-open on cold start. Adding `max` provides the cold-start ceiling without hard-coding a `base` offset; useful when you want the budget to be purely data-driven once the tracker warms up:

**Config path:** `projects[].networks[].failsafe[]`

**YAML (`erpc.yaml`):**

```yaml
failsafe:
  - matchMethod: "*"
    timeout:
      duration:
        quantile: 0.95
        # max acts as cold-start ceiling; base would be added on top of the quantile value
        max: 20s
```

**TypeScript (`erpc.ts`):**

```typescript
failsafe: [{
  matchMethod: "*",
  timeout: {
    duration: { quantile: 0.95, max: "20s" },
  },
}]
```

### Request/response behavior

- When `server.maxTimeout` fires on a POST request, `TimeoutHandler` writes HTTP 200 with body `{"jsonrpc":"2.0","id":null,"error":{"code":-32603,"message":"http request handling timeout"}}`. Non-POST requests receive HTTP 504.
Source: [`erpc/http_timeout.go:L107-116`](https://github.com/erpc/erpc/blob/main/erpc/http_timeout.go#L107-L116). - When a failsafe timeout fires, the error propagates as `ErrFailsafeTimeoutExceeded` with a `scope` field (`"network"` or `"upstream"`), which maps to a JSON-RPC `-32603` body at the HTTP layer. - Cancelled in-flight upstream requests receive context cancellation; they do not contribute to response headers or metrics beyond what the timeout counter records. - The `ErrHandlerTimeout` (server) and `ErrDynamicTimeoutExceeded` (failsafe) sentinels are distinct; the network executor never misclassifies them. Source: [`erpc/networks.go:L1427-1432`](https://github.com/erpc/erpc/blob/main/erpc/networks.go#L1427-L1432). ### Best practices - Set the **network timeout larger than `upstream.timeout × maxAttempts`**. If upstream timeout is 10s and retry maxAttempts is 3, a network timeout below 30s silently drops the last retry. - Use **quantile mode with `max`** rather than a static ceiling for production APIs — static values go stale as provider latency shifts; quantile self-adjusts per method. - **Always set `base` or `max`** when using quantile mode. A spec with only `quantile` and no `base`/`max` is fail-open on cold start — no timeout is applied until the tracker warms up. - **Never set `min: 0` explicitly** in quantile mode. The auto-floor (`min = base/2` or `500ms`) is only applied when `min` is unset; manually setting `min: 0` bypasses the guard and allows the budget to collapse toward zero on a run of fast responses. - Keep `server.maxTimeout` as the last-resort ceiling, but size it generously (150–300s) and size failsafe timeouts tightly — callers get a controlled error from the failsafe layer rather than a raw HTTP 504. - Combine upstream-scope timeouts with network-scope retry so a single stalled provider triggers failover rather than consuming the entire network budget. 
- Use `erpc_network_timeout_duration_seconds` to observe what budget the adaptive policy is computing; it emits on every quantile request, not only on fires. ### Edge cases & gotchas 1. **`server.maxTimeout` is a hard ceiling.** A failsafe `duration: "200s"` will never run for 200s — the HTTP handler times out at `server.maxTimeout` (default 150s) first. Always set failsafe timeouts shorter than `server.maxTimeout`. Source: [`erpc/http_server.go:L65-67`](https://github.com/erpc/erpc/blob/main/erpc/http_server.go#L65-L67). 2. **Quantile cold-start fail-open when `base` and `max` are both zero.** `{quantile: 0.95}` with no `base` or `max` → `coldStartFallback` returns nil → no timeout on cold start. Always set `base` or `max` with quantile mode. Source: [`common/timeout_func.go:L85-93`](https://github.com/erpc/erpc/blob/main/common/timeout_func.go#L85-L93). 3. **Network timeout shorter than `upstream.timeout × maxAttempts`.** Upstream `timeout: 10s` + `retry.maxAttempts: 3` = 30s worst-case budget. A network timeout of 15s silently drops the third attempt. Set network timeout ≥ upstream timeout × retry attempts, or accept the tradeoff explicitly. 4. **Static `base` ignores `min`/`max`.** `duration: { base: "30s", min: "1s", max: "60s" }` with no `quantile` applies exactly `30s` — the clamp is never reached. Source: [`common/adaptive_duration.go:L88-89`](https://github.com/erpc/erpc/blob/main/common/adaptive_duration.go#L88-L89). 5. **Quantile cold-start adaptive value is `min`, not `base`.** Inside `Resolve`, when `qt.GetQuantile(q) == 0`, `adaptive = spec.Min` — NOT `base`. Then `v = base + min`. The `base`-first fallback only happens at the `coldStartFallback` level in `NewTimeoutFunc` when `Resolve` returns 0. Source: [`common/adaptive_duration.go:L96-98`](https://github.com/erpc/erpc/blob/main/common/adaptive_duration.go#L96-L98). 6. 
**Retry-exhausted error wins over timeout-on-last-attempt.** When retry exhausts and the last attempt timed out, the top-level error is `ErrFailsafeRetryExceeded`. The timeout counter does not fire; only the retry counter reflects the event. Source: [`erpc/networks_timeout_test.go:L429-493`](https://github.com/erpc/erpc/blob/main/erpc/networks_timeout_test.go#L429-L493). 7. **Upstream-scope timeout bubbles correctly.** When an upstream timeout fires, the error is wrapped as `ErrFailsafeTimeoutExceeded{scope: "upstream"}`. The network executor's `!HasErrorCode(execErr, ErrCodeFailsafeTimeoutExceeded)` guard sees this code and skips network-scope classification and counter increment. One event, one counter. Source: [`erpc/networks_timeout_test.go:L495-538`](https://github.com/erpc/erpc/blob/main/erpc/networks_timeout_test.go#L495-L538). 8. **POST timeout response is always HTTP 200.** The JSON-RPC spec requires transport-layer 200 for method errors. Non-POST gets HTTP 504. Source: [`erpc/http_timeout.go:L107-116`](https://github.com/erpc/erpc/blob/main/erpc/http_timeout.go#L107-L116). 9. **Legacy `quantile`-only flat form is fail-open on cold start.** A config with only `quantile: 0.7` and no `duration` has `base == 0` and `max == 0` → `coldStartFallback` returns nil. Add `maxDuration` to avoid this. Source: [`common/config.go:L1468-1473`](https://github.com/erpc/erpc/blob/main/common/config.go#L1468-L1473). 10. **Legacy flat siblings silently drop on field conflict.** If `duration: {quantile: 0.95}` is set and `quantile: 0.50` is also provided as a flat sibling, the result is `quantile = 0.95`. No error or warning is emitted. Source: [`common/adaptive_duration_compat_test.go:L75-88`](https://github.com/erpc/erpc/blob/main/common/adaptive_duration_compat_test.go#L75-L88). 11. **`erpc_network_timeout_duration_seconds` emits on every quantile request, not only timeouts.** The histogram reflects what duration was computed, not whether the request actually timed out. 
Source: [`common/timeout_func.go:L73-80`](https://github.com/erpc/erpc/blob/main/common/timeout_func.go#L73-L80). 12. **Auto-floor is applied at build time, not config-load time.** `NewTimeoutFunc` modifies a LOCAL copy of the spec. The original `TimeoutPolicyConfig` is not mutated. Manually setting `min: 0` bypasses the auto-floor because `== 0` check fails. Source: [`common/timeout_func.go:L27-36`](https://github.com/erpc/erpc/blob/main/common/timeout_func.go#L27-L36). 13. **Network-only timeout does not fire the upstream-scope counter.** When only the network scope has a timeout policy and the upstream scope has none, the context cancellation propagates into the upstream's attempt. Because `failsafeExecutor.timeout == nil` at the upstream level, the upstream does NOT increment `erpc_network_timeout_fired_total` and does NOT wrap the error as `ErrFailsafeTimeoutExceeded`. The network scope handles all classification. Source: [`erpc/networks_timeout_test.go:L540-586`](https://github.com/erpc/erpc/blob/main/erpc/networks_timeout_test.go#L540-L586). 14. **`applyLegacySiblings` early-exit: all-zero flat siblings are free.** When `quantile == 0` and `minDuration == 0` and `maxDuration == 0`, `applyLegacySiblings` returns immediately without allocating a `Duration` object. Configs that use only `duration: 5s` (no flat siblings) pay no allocation cost. Source: [`common/config.go:L1465-1467`](https://github.com/erpc/erpc/blob/main/common/config.go#L1465-L1467). 15. **Legacy `quantile`-only flat form (no `duration:` key at all) allocates `Duration` from nil.** A block containing only `quantile: 0.7` with no `duration:` field is valid: `applyLegacySiblings` allocates `c.Duration = &AdaptiveDuration{}` and sets only `Quantile = 0.7`, leaving `Base`, `Min`, and `Max` at zero. At `NewTimeoutFunc` build time `quantile > 0` and `base == 0` — the auto-floor sets `min = 500ms`, but on cold start `coldStartFallback` returns `max` (zero) → nil (fail-open). 
Add `maxDuration` to avoid cold-start fail-open when using the legacy flat form. Source: [`common/config.go:L1468-1473`](https://github.com/erpc/erpc/blob/main/common/config.go#L1468-L1473).
16. **`context.WithTimeoutCause` cause discrimination.** `errors.Is(ctx.Err(), context.DeadlineExceeded)` is true for both the HTTP-server timeout and a failsafe timeout, but `context.Cause(ctx)` returns different sentinels: `ErrHandlerTimeout` for the server level and `ErrDynamicTimeoutExceeded` for failsafe. The network executor uses this distinction to avoid misclassifying an HTTP-server deadline as `ErrFailsafeTimeoutExceeded`. Source: [`erpc/networks.go:L1427-1432`](https://github.com/erpc/erpc/blob/main/erpc/networks.go#L1427-L1432).

### Observability

| Metric | Type | Labels | When it fires |
|---|---|---|---|
| `erpc_network_timeout_fired_total` | counter | `project`, `network`, `category`, `finality`, `scope` | Timeout policy killed a request. `scope=network` or `scope=upstream`. Suppressed when `ErrFailsafeRetryExceeded` wins; suppressed when a lower scope already fired. Network-scope source: [`erpc/networks.go:L1440`](https://github.com/erpc/erpc/blob/main/erpc/networks.go#L1440); upstream-scope source: `upstream/upstream.go:L845`; metric declaration: `telemetry/metrics.go:L437-L441`. |
| `erpc_network_timeout_duration_seconds` | histogram | `project`, `network`, `category`, `finality` | Computed per request in quantile mode only; it reflects the budget computed, not whether it fired. Buckets: `[0.05, 0.1, 0.3, 0.5, 1, 3, 5, 10, 30]`. Source: [`common/timeout_func.go:L73-80`](https://github.com/erpc/erpc/blob/main/common/timeout_func.go#L73-L80); metric declaration: `telemetry/metrics.go:L820-L824`. |

```promql
# Rate of timeout fires by scope
rate(erpc_network_timeout_fired_total[5m])

# P99 of dynamically computed timeout budget per method
histogram_quantile(0.99,
  sum by (le, category) (
    rate(erpc_network_timeout_duration_seconds_bucket[5m])
  )
)
```

### Source code entry points

- [`erpc/http_timeout.go`](https://github.com/erpc/erpc/blob/main/erpc/http_timeout.go): `TimeoutHandler`, `timeoutWriter`; HTTP-server-level timeout and `ErrHandlerTimeout` sentinel
- [`erpc/http_server.go:L65-L67`](https://github.com/erpc/erpc/blob/main/erpc/http_server.go#L65-L67): `maxTimeout` retrieval and `TimeoutHandler` instantiation
- [`common/timeout_func.go:L23-L94`](https://github.com/erpc/erpc/blob/main/common/timeout_func.go#L23-L94): `NewTimeoutFunc`; auto-floor; `coldStartFallback`; quantile resolution; metric observation
- [`common/adaptive_duration.go:L83-L109`](https://github.com/erpc/erpc/blob/main/common/adaptive_duration.go#L83-L109): `AdaptiveDuration.Resolve`; full resolution algorithm
- [`erpc/network_executor.go:L164-L171`](https://github.com/erpc/erpc/blob/main/erpc/network_executor.go#L164-L171): network-scope `context.WithTimeoutCause` applied to `networkExecutor.Run`
- [`erpc/networks.go:L1424-L1451`](https://github.com/erpc/erpc/blob/main/erpc/networks.go#L1424-L1451): timeout error classification; `MetricNetworkTimeoutFiredTotal` with all four guards
- [`common/defaults.go:L134-L157`](https://github.com/erpc/erpc/blob/main/common/defaults.go#L134-L157): system-level defaults (network `120s`, upstream `60s`)
- [`common/config.go:L1398-L1480`](https://github.com/erpc/erpc/blob/main/common/config.go#L1398-L1480): `TimeoutPolicyConfig` struct; YAML/JSON unmarshal; `applyLegacySiblings`
- [`common/errors.go:L1588-L1603`](https://github.com/erpc/erpc/blob/main/common/errors.go#L1588-L1603): `NewErrFailsafeTimeoutExceeded`; `ErrCodeFailsafeTimeoutExceeded`
- [`common/errors.go:L1981-L1984`](https://github.com/erpc/erpc/blob/main/common/errors.go#L1981-L1984): `ErrDynamicTimeoutExceeded` sentinel declaration
- [`erpc/networks_timeout_test.go`](https://github.com/erpc/erpc/blob/main/erpc/networks_timeout_test.go): behavior-locking tests: lifecycle scoping, cold-start, scope attribution, double-count regression
- [`common/network.go:L51-L60`](https://github.com/erpc/erpc/blob/main/common/network.go#L51-L60): `QuantileTracker` and `TrackedMetrics` interfaces
- [`common/adaptive_duration_test.go`](https://github.com/erpc/erpc/blob/main/common/adaptive_duration_test.go): unit tests locking resolution semantics: static mode, quantile mode, clamp, nil spec, cold-start
- [`common/adaptive_duration_compat_test.go`](https://github.com/erpc/erpc/blob/main/common/adaptive_duration_compat_test.go): backward-compat tests for legacy flat-sibling folding and precedence rules
- [`erpc/network_executor.go:L86-L88`](https://github.com/erpc/erpc/blob/main/erpc/network_executor.go#L86-L88): `NewNetworkExecutor` wires `NewTimeoutFunc`; `HasTimeout` method
- [`telemetry/metrics.go:L437-L441`](https://github.com/erpc/erpc/blob/main/telemetry/metrics.go#L437-L441): `MetricNetworkTimeoutFiredTotal` declaration
- [`telemetry/metrics.go:L820-L824`](https://github.com/erpc/erpc/blob/main/telemetry/metrics.go#L820-L824): `MetricNetworkTimeoutDurationSeconds` declaration

### Related pages

- [Retry](/config/failsafe/retry.llms.txt): runs inside the network timeout; budget shared across all attempts.
- [Hedge](/config/failsafe/hedge.llms.txt): all hedge legs share the same network-scope deadline.
- [Selection policies](/config/projects/selection-policies.llms.txt): decides which upstream each failover attempt gets.
- [Rate limiters](/config/rate-limiters.llms.txt): pair with upstream timeouts to cap cost on expensive vendors.
- [Survive provider outages](/use-cases/survive-provider-outages.llms.txt) — the outcome upstream-scope timeouts serve.
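
As a minimal sketch of edge case 15 above, the legacy flat form below pairs `quantile` with `maxDuration` so that a cold start resolves to the ceiling instead of failing open. The `0.7` quantile is taken from that edge case; the `30s` ceiling and the surrounding project and network values are illustrative assumptions, not recommended settings.

```yaml
projects:
  - id: main
    networks:
      - architecture: evm
        evm: { chainId: 1 }
        failsafe:
          - matchMethod: "*"
            timeout:
              # legacy flat form: there is no `duration:` key in this block
              quantile: 0.7
              # assumed ceiling; per edge case 15, cold start falls back to `max`,
              # so a non-zero value here avoids the nil (fail-open) result
              maxDuration: 30s
```

Omitting `maxDuration` leaves `max` at zero, which is the cold-start fail-open case that edge case 15 describes.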