# Survive provider outages

> Source: https://docs.erpc.cloud/use-cases/survive-provider-outages
> Keep serving traffic when an RPC provider slows down, rate-limits you, or disappears entirely.
> Format: machine-readable markdown export of the docs page above.
> All collapsible AI sections are inlined and fully expanded.

Every RPC provider has a bad day. With eRPC in front, your users never find out: slow answers get raced against a second provider, errors fail over automatically, and an upstream that keeps misbehaving is quietly benched until it recovers. You ship one endpoint; eRPC turns a pool of imperfect providers into something that behaves like a perfect one.

- **[Retry](/config/failsafe/retry.llms.txt)** — Errors fail over to the next-best upstream, automatically.
- **[Hedge](/config/failsafe/hedge.llms.txt)** — Slow request? A backup race starts before the user notices.
- **[Timeout](/config/failsafe/timeout.llms.txt)** — Nothing hangs — every request has a hard ceiling.
- **[Circuit breaker](/config/failsafe/circuit-breaker.llms.txt)** — Repeat offenders get benched until they behave again.
- **[Cordoning](/operation/cordoning.llms.txt)** — See why an upstream was benched, or bench one yourself.
- **[Healthcheck](/operation/healthcheck.llms.txt)** — Tell your load balancer the truth about readiness.
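
As a rough mental model for the Retry and Hedge items above, the two behaviors can be sketched in a few lines of TypeScript. This is only an illustration of the general techniques, not eRPC's implementation: `Upstream`, `withFailover`, and `withHedge` are hypothetical names introduced here, and the sketch does not cancel the losing request.

```typescript
// Hypothetical type: an upstream is any async call that returns a result.
type Upstream<T> = () => Promise<T>;

// Failover: try upstreams in order and return the first successful answer.
async function withFailover<T>(upstreams: Upstream<T>[]): Promise<T> {
  let lastError: unknown = new Error("no upstreams configured");
  for (const call of upstreams) {
    try {
      return await call();
    } catch (err) {
      lastError = err; // remember the failure and move on to the next upstream
    }
  }
  throw lastError;
}

// Hedge: start the primary; if it has not answered after `delayMs`
// (or it fails earlier), start the backup and take the first success.
function withHedge<T>(
  primary: Upstream<T>,
  backup: Upstream<T>,
  delayMs: number,
): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    let failures = 0;
    let backupStarted = false;

    const startBackup = () => {
      if (backupStarted) return;
      backupStarted = true;
      clearTimeout(timer);
      backup().then(succeed, fail);
    };
    const succeed = (value: T) => {
      clearTimeout(timer); // no need for a backup once we have an answer
      resolve(value);
    };
    const fail = (err: unknown) => {
      failures += 1;
      if (failures === 2) reject(err); // both attempts failed
      else startBackup(); // primary failed early: do not wait for the timer
    };

    const timer = setTimeout(startBackup, delayMs);
    primary().then(succeed, fail);
  });
}
```

For example, `withHedge(slowProvider, fastProvider, 100)` would return the fast provider's answer whenever the slow one takes noticeably longer than 100 ms, at the cost of one extra upstream request.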

All of the above in one place — illustrative, not a tuned production config:

**Config path:** `projects[]`

**YAML — `erpc.yaml`:**

```yaml
projects:
  - id: main
    # applies to every chain in this project
    networkDefaults:
      failsafe:
        - matchMethod: "*"
          # nothing hangs: hard ceiling per request
          timeout:
            duration: 30s
          # errors fail over to the next-best upstream
          retry:
            maxAttempts: 3
          # slow answers get raced at their p70 latency
          hedge:
            delay: { quantile: 0.7, min: 100ms, max: 2s }
            maxCount: 1
    upstreamDefaults:
      failsafe:
        - matchMethod: "*"
          # one in-place retry per upstream before rotating away
          retry:
            maxAttempts: 1
          # bench repeat offenders, probe again after 5m
          circuitBreaker:
            failureThresholdCount: 20
            failureThresholdCapacity: 80
            halfOpenAfter: 5m
            successThresholdCount: 8
    # Cordoning is automatic (no config); /healthcheck reflects readiness.
```

**TypeScript — `erpc.ts`:**

```typescript
projects: [{
  id: "main",
  // applies to every chain in this project
  networkDefaults: {
    failsafe: [{
      matchMethod: "*",
      // nothing hangs: hard ceiling per request
      timeout: { duration: "30s" },
      // errors fail over to the next-best upstream
      retry: { maxAttempts: 3 },
      // slow answers get raced at their p70 latency
      hedge: {
        delay: { quantile: 0.7, min: "100ms", max: "2s" },
        maxCount: 1,
      },
    }],
  },
  upstreamDefaults: {
    failsafe: [{
      matchMethod: "*",
      // one in-place retry per upstream before rotating away
      retry: { maxAttempts: 1 },
      // bench repeat offenders, probe again after 5m
      circuitBreaker: {
        failureThresholdCount: 20,
        failureThresholdCapacity: 80,
        halfOpenAfter: "5m",
        successThresholdCount: 8,
      },
    }],
  },
}]
```

## Agent reference

Copy one of these prompts into your AI agent session (Claude Code, Cursor, …) — each one points the agent at this page's machine-readable reference so it can do the work correctly:

**Prompt Example #1: make my RPC layer survive provider outages**

```text
I want my app to keep serving traffic even when an RPC provider slows down, rate-limits me, or goes offline.
Configure eRPC with retry, hedge, timeout, and circuit breaker so outages are transparent to my users.

Read the reference and follow the child-page links inside it:
https://docs.erpc.cloud/use-cases/survive-provider-outages.llms.txt
```

**Prompt Example #2: add circuit breaking to bench a flaky provider automatically**

```text
One of my RPC providers is intermittently returning errors and degrading my app.
Add a circuit breaker to my eRPC config so eRPC benches that upstream after repeated failures and probes it again once it recovers, without any manual intervention.

Reference: https://docs.erpc.cloud/use-cases/survive-provider-outages.llms.txt
```

**Prompt Example #3: tune failover speed to minimize user-visible errors**

```text
My eRPC config already has retry and hedge configured, but users still see occasional errors during provider incidents.
Review the timeout, retry backoff, and hedge delay settings and tighten them so failover is as fast as possible without excessive upstream cost.

Reference: https://docs.erpc.cloud/use-cases/survive-provider-outages.llms.txt
```

---

### Survive outages — agent starting points

This page sells the outcome; implementation lives in the feature pages. Fetch their machine-readable companions and combine policies per method:

- [Retry](/config/failsafe/retry.llms.txt) — scopes (network vs upstream), empty-result handling, backoff math.
- [Hedge](/config/failsafe/hedge.llms.txt) — fixed vs quantile delays, clamps, loser cancellation.
- [Timeout](/config/failsafe/timeout.llms.txt) — the three-level timeout hierarchy and dynamic quantile mode.
- [Circuit breaker](/config/failsafe/circuit-breaker.llms.txt) — thresholds, half-open probing, and how it complements selection-policy exclusion (cordoning is a third, independent mechanism).
- [Cordoning](/operation/cordoning.llms.txt) — every cordon reason and how to inspect/override it.
- [Healthcheck](/operation/healthcheck.llms.txt) — eval strategies and load-balancer integration.

Composition rules an agent must know: failsafe entries are matched per request via `matchMethod`/`matchFinality`; at network scope the executor chain nests `timeout(consensus(retry(hedge(upstreams))))`; retries at network scope rotate across upstreams while upstream-scope retries re-attempt the same one. Full ordering and per-policy field tables are in each feature page's agent section.

---

## Navigation (machine-readable surface)

- Up: [All pages index](https://docs.erpc.cloud/llms.txt)
- Root index of every page: [llms.txt](https://docs.erpc.cloud/llms.txt) · everything in one file: [llms-full.txt](https://docs.erpc.cloud/llms-full.txt)

### Sibling pages

- [Cut RPC cost & latency](https://docs.erpc.cloud/use-cases/cut-costs-and-latency.llms.txt) — Serve repeated questions from cache, deduplicate identical requests, and stop paying providers for the same answer twice.
- [How eRPC works](https://docs.erpc.cloud/use-cases/how-it-works.llms.txt) — Every JSON-RPC call travels a battle-tested pipeline — auth, smart caching, parallel hedging, multi-upstream consensus — and arrives with full diagnostic headers. Zero glue code required.
- [Lock it down](https://docs.erpc.cloud/use-cases/lock-it-down.llms.txt) — Keys, JWTs, sign-in with Ethereum, per-user rate limits — your RPC endpoint stops being a free-for-all.
- [Scale chains & providers](https://docs.erpc.cloud/use-cases/scale-chains-and-providers.llms.txt) — One config line per provider, every chain they support — and the best upstream wins each request.
- [See everything](https://docs.erpc.cloud/use-cases/see-everything.llms.txt) — Per-request metrics, traces, and honest healthchecks — know about problems before your users do.
- [Trust the data](https://docs.erpc.cloud/use-cases/trust-the-data.llms.txt) — Don't let one misbehaving node feed your app a wrong answer — verify, cross-check, and enforce integrity automatically.