# Production guidelines

> Source: https://docs.erpc.cloud/operation/production
> Memory/GC tuning, healthcheck rollout, instance identification, error visibility, and IP forwarding recommendations for running eRPC in production.
> Format: machine-readable markdown export of the docs page above.
> All collapsible AI sections are inlined and fully expanded.

Practical recommendations for running eRPC in production, from container sizing through zero-downtime rollouts and instance identification.

**What this page covers:**

- Memory usage and Go GC tuning (`GOGC`, `GOMEMLIMIT`)
- Failsafe policies (retry, timeout, hedge)
- Caching database selection
- Horizontal scaling with shared state
- Explicit chain ID configuration
- Zero-downtime healthcheck rollout (Cilium/Envoy drain pattern)
- Custom response headers for instance identification
- `includeErrorDetails` in production
- `trustedIPForwarders` and `trustedIPHeaders` behind a load-balancer or CDN

## Memory and GC tuning

The largest memory contributor in eRPC is the size of RPC responses. Common calls like `eth_getBlockByNumber` or `eth_getTransactionReceipt` are typically under 1 MB; heavy calls like `debug_traceTransaction` can reach 50 MB. Most deployments see ~256 MB RSS at modest load. Start with a generous limit (e.g. 16 GB) while routing real traffic, then lower it once you know your p99 working set.

To prevent OOM-kills on Kubernetes, add both env vars to your container spec:

```bash
# Trigger GC when heap grows by 30 % (default is 100 %)
GOGC=30
# Trigger GC when RSS approaches 2 GiB; tune to ~80 % of your container memory limit
# WARNING: set this too low and GC will thrash; combine with GOGC for best results
GOMEMLIMIT=2GiB
```

Example Docker run:

```bash
docker run -e GOGC=30 -e GOMEMLIMIT=2GiB ghcr.io/erpc/erpc:latest \
  erpc start -c /etc/erpc/erpc.yaml
```

Kubernetes container spec snippet:

```yaml
env:
  - name: GOGC
    value: "30"
  - name: GOMEMLIMIT
    value: "2GiB"
resources:
  limits:
    memory: "2.5Gi"
  requests:
    memory: "512Mi"
```

## Failsafe policies

Configure [retry](/config/failsafe/retry.llms.txt) at both network and upstream scopes:

- **Network-level retry** rotates to a different upstream on a transient failure. Even with a single upstream it's worth enabling. Set `maxAttempts` ≈ number of upstreams.
- **Upstream-level retry** covers per-attempt flakiness within the same upstream. Use 2–5 `maxAttempts`.

Set a [timeout](/config/failsafe/timeout.llms.txt) that matches your request profile. For standard EVM calls a `3s` default is safe; for heavy trace or `getLogs` calls allow 10 s or more. Set `quantile: 0.99` on the upstream-scope timeout to auto-tune per method.

Enable the [hedge policy](/config/failsafe/hedge.llms.txt) for latency-sensitive reads. With `delay: 500ms`, eRPC races a second upstream once the primary has been quiet for 500 ms and returns the first kept response, at the cost of duplicate traffic for slow requests. Hedge attempts are excluded from per-upstream scoring and from the circuit breaker.

Use [consensus](/config/failsafe/consensus.llms.txt) for high-trust reads (gas price, nonce, contract calls during write paths). Set [`maxWaitOnResult`](/config/failsafe/consensus.llms.txt#tail-latency-caps-maxwaitonresult--maxwaitonempty) to bound tail latency when one participant lags.

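
To make the hedge trade-off described in this section concrete, here is a simplified latency model. It is only a sketch, not eRPC's implementation: the function `hedgedLatency` and the `HedgeOutcome` type are hypothetical, and the model assumes a single hedge attempt where both upstreams eventually succeed.

```typescript
// Simplified model of one hedge attempt (illustration only, not eRPC code).
interface HedgeOutcome {
  winner: "primary" | "hedge";
  latencyMs: number; // time until the first response is available
  hedgeFired: boolean; // whether a duplicate request was sent
}

function hedgedLatency(
  primaryMs: number, // assumed response time of the primary upstream
  hedgeMs: number, // assumed response time of the second upstream
  delayMs: number, // hedge delay, e.g. 500 for `delay: 500ms`
): HedgeOutcome {
  // The primary answered before the hedge delay elapsed: no duplicate traffic.
  if (primaryMs <= delayMs) {
    return { winner: "primary", latencyMs: primaryMs, hedgeFired: false };
  }
  // Otherwise the hedge starts at delayMs and completes at delayMs + hedgeMs.
  const hedgeDoneAt = delayMs + hedgeMs;
  return primaryMs <= hedgeDoneAt
    ? { winner: "primary", latencyMs: primaryMs, hedgeFired: true }
    : { winner: "hedge", latencyMs: hedgeDoneAt, hedgeFired: true };
}

// A slow primary (2000 ms) is overtaken by a 300 ms hedge fired at 500 ms.
console.log(hedgedLatency(2000, 300, 500));
// A fast primary (200 ms) never triggers the hedge.
console.log(hedgedLatency(200, 300, 500));
```

Under these assumptions, the hedge only costs extra upstream traffic for requests slower than the delay, which is why a delay near your typical p90 to p95 latency is a common starting point.
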
[Execution trace headers](/config/failsafe.llms.txt#http-response-headers) (`X-ERPC-Upstreams-Tried`, `X-ERPC-Upstreams-Outcomes`, `X-ERPC-Upstreams-Reasons`, `X-ERPC-Upstreams-Durations-Ms`, `X-ERPC-Upstreams-Flags`) ship by default, so clients can debug retry/hedge/consensus decisions without server-side traces. Disable with `server.executionHeaders: off` if you want zero diagnostic leakage.

## Caching database

Large read-heavy workloads (e.g. indexing 100 M Arbitrum blocks) require substantial cache storage. Start with Redis; switch to PostgreSQL when cached data exceeds available memory.

eRPC degrades gracefully if the cache backend is unavailable: it falls back to live upstream calls with no impact on availability. See [Database](/config/database.llms.txt) for connector configuration.

[eRPC Cloud](/deployment/cloud.llms.txt) offers the most cost-effective caching for multi-tenant deployments.

## Horizontal scaling

Run multiple eRPC replicas with a shared Redis connector to synchronize latest/finalized block numbers across instances. Without shared state, each replica polls independently, increasing upstream requests. See [Shared State](/config/database/shared-state.llms.txt).

Even when Redis is temporarily unavailable, eRPC continues serving requests using local state tracking.

## Explicitly configure chain ID

Auto-detected chain IDs add one upstream call per network at startup and slow rolling restarts. Configure them explicitly:

- `networks.*.evm.chainId`, under [Networks](/config/projects/networks.llms.txt)
- `upstreams.*.evm.chainId`, under [Upstreams](/config/projects/upstreams.llms.txt)

## Healthcheck and zero-downtime rollout

Configure a [Healthcheck](/operation/healthcheck.llms.txt) readiness probe so your orchestrator stops routing to a pod before it shuts down.

### Cilium / Envoy drain pattern

When using Cilium with Envoy (Ingress or Gateway API), set both shutdown wait fields to 30 s:

**Config path:** `server`

**YAML — `erpc.yaml`:**

```yaml
server:
  waitBeforeShutdown: 30s   # pod marked draining; readiness probe fails
  waitAfterShutdown: 30s    # process stays alive until Envoy drains its connections
```

**TypeScript — `erpc.ts`:**

```typescript
import { createConfig } from "@erpc-cloud/config";

export default createConfig({
  server: {
    waitBeforeShutdown: "30s",
    waitAfterShutdown: "30s",
  },
});
```

Shorter values allow Envoy to reuse a connection after the listener closes, or route to a pod that has already exited. Adjust to match your own probe intervals.

## Custom response headers

Use `server.responseHeaders` to stamp every HTTP response with instance metadata for quick debugging without opening a trace:

**Config path:** `server`

**YAML — `erpc.yaml`:**

```yaml
server:
  responseHeaders:
    X-ERPC-Region: ${FLY_REGION}        # Fly.io region
    X-ERPC-Machine: ${FLY_MACHINE_ID}   # Fly.io machine ID
    # Kubernetes:
    # X-ERPC-Pod: ${HOSTNAME}
```

**TypeScript — `erpc.ts`:**

```typescript
import { createConfig } from "@erpc-cloud/config";

export default createConfig({
  server: {
    responseHeaders: {
      "X-ERPC-Region": process.env.FLY_REGION,
      "X-ERPC-Machine": process.env.FLY_MACHINE_ID,
      // Kubernetes:
      // "X-ERPC-Pod": process.env.HOSTNAME,
    },
  },
});
```

Headers with empty values (after env-var expansion) are automatically omitted. Combine with [custom trace attributes](/operation/tracing.llms.txt#custom-resource-attributes) for full observability.

## Error detail visibility

By default eRPC includes upstream error details in responses.
In production, set `includeErrorDetails: false` to avoid leaking internal endpoint URLs, API key fragments, or upstream error messages to end-users:

**Config path:** `server`

**YAML — `erpc.yaml`:**

```yaml
server:
  includeErrorDetails: false
```

**TypeScript — `erpc.ts`:**

```typescript
import { createConfig } from "@erpc-cloud/config";

export default createConfig({
  server: {
    includeErrorDetails: false,
  },
});
```

## Trusted IP forwarding

When eRPC runs behind a load-balancer or CDN, the real client IP is in a forwarded header. Configure `trustedIPForwarders` (CIDR ranges of your LB/CDN) and `trustedIPHeaders` (the header name to read):

**Config path:** `server`

**YAML — `erpc.yaml`:**

```yaml
server:
  trustedIPForwarders:
    - "10.0.0.0/8"        # cluster-internal LB CIDR
    - "172.16.0.0/12"
  trustedIPHeaders:
    - "X-Forwarded-For"
    - "CF-Connecting-IP"  # Cloudflare
```

**TypeScript — `erpc.ts`:**

```typescript
import { createConfig } from "@erpc-cloud/config";

export default createConfig({
  server: {
    trustedIPForwarders: ["10.0.0.0/8", "172.16.0.0/12"],
    trustedIPHeaders: ["X-Forwarded-For", "CF-Connecting-IP"],
  },
});
```

Without this, IP-based rate limits and `network` auth strategies see the LB address rather than the real client.

### Memory / GC tuning

eRPC is a Go process. The runtime's default GC target (`GOGC=100`) is appropriate for development but often too loose for containers with hard memory limits.

**Recommended production pair:**

```bash
GOGC=30          # run GC after heap grows 30 %: smaller heap, more frequent collections
GOMEMLIMIT=2GiB  # soft ceiling: GC fires when RSS nears this value
```

Set `GOMEMLIMIT` to ~80 % of your container memory limit. For example: 2 GiB limit → `GOMEMLIMIT=1600MiB`. Setting it equal to the limit leaves no headroom and risks GC thrash or OOM from transient allocation bursts.

Caution: `GOGC < 10` causes GC thrashing: the runtime spends most CPU collecting, not serving requests. Values of 20–50 are the practical floor.

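
The ~80 % rule above is simple arithmetic, and a small helper can keep deployment manifests consistent. The sketch below is illustrative only: `recommendGoMemLimit` is a hypothetical function, not part of eRPC or its config package, and the 0.8 ratio is just the rule of thumb from this page.

```typescript
// Hypothetical helper: derive a GOMEMLIMIT value from a container memory
// limit (in MiB) using the ~80 % rule of thumb. Not part of eRPC.
function recommendGoMemLimit(containerLimitMiB: number, ratio = 0.8): string {
  if (!Number.isFinite(containerLimitMiB) || containerLimitMiB <= 0) {
    throw new RangeError("containerLimitMiB must be a positive number");
  }
  if (ratio <= 0 || ratio >= 1) {
    throw new RangeError("ratio must be between 0 and 1 (exclusive)");
  }
  // Round down so the soft limit never eats into the intended headroom.
  // The Go runtime accepts the MiB suffix in GOMEMLIMIT.
  return `${Math.floor(containerLimitMiB * ratio)}MiB`;
}

console.log(recommendGoMemLimit(2048)); // 2 GiB container limit
console.log(recommendGoMemLimit(2560)); // 2.5 GiB container limit
```

For a 2 GiB limit this yields a value slightly above the rounded `1600MiB` used in the text; either is within the spirit of the rule, since the exact ratio is a judgment call.
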
If you have abundant RAM and want to reduce CPU overhead, raise `GOGC` (e.g. 200). The heap will grow larger but GC runs less often.

### Healthcheck rollout pattern

eRPC's shutdown sequence:

1. Receive SIGTERM.
2. Stop accepting new connections (`waitBeforeShutdown` delay; readiness probe starts failing).
3. Drain in-flight requests.
4. Wait `waitAfterShutdown` (keeps the process alive so the LB/proxy can close open connections).
5. Exit 0.

For Kubernetes with Cilium/Envoy, both values should be at least 30 s:

```yaml
server:
  waitBeforeShutdown: 30s
  waitAfterShutdown: 30s
```

The readiness probe should return unhealthy within 10 s of SIGTERM (before `waitBeforeShutdown` expires) so the orchestrator removes the endpoint before connections are refused.

Kubernetes `terminationGracePeriodSeconds` must be greater than `waitBeforeShutdown + waitAfterShutdown + time to drain`. Set it to at least 90 s for the 30 s + 30 s pattern above.

### `responseHeaders` for instance identification

`server.responseHeaders` is a map of header name → value. Values support `${VAR}` env-var expansion. Headers with an empty value after expansion are silently omitted (safe to use with optional env vars).

Useful headers:

| Header | Env var | Platform |
|---|---|---|
| `X-ERPC-Region` | `${FLY_REGION}` | Fly.io |
| `X-ERPC-Machine` | `${FLY_MACHINE_ID}` | Fly.io |
| `X-ERPC-Pod` | `${HOSTNAME}` | Kubernetes (pod name) |
| `X-ERPC-Instance` | `${INSTANCE_ID}` | explicit / custom |

Combine with tracing resource attributes (`tracing.resourceAttributes`) so every trace span carries the same instance label as the HTTP response header.

### `includeErrorDetails`

Controls whether upstream error messages and internal endpoint information appear in JSON-RPC error responses returned to callers.

- **Default:** `true` (errors are verbose; helpful for development).
- **Production:** set to `false` to prevent leaking upstream URLs, API key fragments, and internal error strings.

Errors are still logged internally at full verbosity regardless of this setting.

### `trustedIPForwarders` + `trustedIPHeaders`

When eRPC sits behind a reverse proxy, load-balancer, or CDN, the TCP source IP is always the proxy's address. To recover the real client IP:

1. `trustedIPForwarders`: list of CIDR blocks (or individual IPs) whose `X-Forwarded-For` (or the named headers) are trusted. Requests from outside these ranges have their forwarded headers ignored.
2. `trustedIPHeaders`: ordered list of headers to read. eRPC picks the first header that is present on a request from a trusted forwarder.

This real IP is then used for:

- IP-based rate limiting (`network` auth strategy `allowedIPs`)
- Per-IP metric labels
- Any upstream selection that keys on client IP

Without this config, all requests appear to originate from your LB IP and IP-based policies are effectively global.

### Metrics tuning

See [Monitoring](/operation/monitoring.llms.txt) for `metrics.histogramDropLabels`: dropping high-cardinality label combinations (e.g. per-upstream request-size histograms) avoids cardinality explosion in Prometheus.

### Tracing in production

See [Tracing](/operation/tracing.llms.txt) for OTLP exporter setup, sampling rate config, and adding custom resource attributes. Use `tracing.resourceAttributes` to attach region/instance labels that correlate with the `responseHeaders` you set above.

### Rate-limit budgets per project / upstream

See [Rate Limiters](/config/rate-limiters.llms.txt) for `rateLimiters.budgets`: define per-project or per-upstream budgets and reference them from auth strategies (per-API-key limits) or directly from upstream config (cap upstream call rate to protect a paid plan).

### Common pitfalls

- **`GOMEMLIMIT` without `GOGC`:** with `GOGC` left at its default of 100, the heap can still double between collections until the soft limit forces a GC, leading to large heap swings just under the ceiling. Always pair them.
- **`GOGC=100` with a tight container limit:** the heap can double in size before GC fires.
  A container with a 512 MiB limit can OOM before GC triggers.
- **`waitBeforeShutdown` too short:** load-balancers / service meshes can take several seconds to drain an endpoint after a readiness probe fails. Values below 10 s risk connection resets on rolling restarts with Envoy.
- **`waitAfterShutdown` too short:** if the process exits before the proxy finishes draining, in-flight requests to that pod are reset. 30 s is a safe default.
- **`terminationGracePeriodSeconds` too short:** Kubernetes SIGKILL fires when this expires. It must exceed `waitBeforeShutdown + waitAfterShutdown + expected drain time`.
- **`includeErrorDetails: true` in production:** upstream error messages often contain full endpoint URLs with API keys embedded. Set to `false` before exposing eRPC to external callers.
- **Missing `trustedIPForwarders`:** IP-based rate limits and auth policies all see the LB IP, effectively becoming global instead of per-client.
- **Chain ID auto-detection in large deployments:** every eRPC replica calls `eth_chainId` on every upstream at startup. With many replicas and many upstreams this creates a startup burst. Configuring `evm.chainId` explicitly eliminates it.

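
The `terminationGracePeriodSeconds` rule from the pitfalls above is easy to get wrong when the wait values change, so it can help to compute it rather than hard-code it. The following is a minimal sketch under stated assumptions: `parseDurationSeconds` and `minTerminationGracePeriodSeconds` are hypothetical helpers (not eRPC APIs), only `ms`, `s`, and `m` suffixes are handled, and the drain time is an estimate you supply.

```typescript
// Hypothetical helper: convert "500ms" / "30s" / "2m" into seconds.
// Only these three suffixes are handled in this sketch.
function parseDurationSeconds(value: string): number {
  const match = /^(\d+(?:\.\d+)?)(ms|s|m)$/.exec(value.trim());
  if (!match) {
    throw new Error(`unsupported duration: ${value}`);
  }
  const amount = Number(match[1]);
  switch (match[2]) {
    case "ms":
      return amount / 1000;
    case "s":
      return amount;
    default: // "m"
      return amount * 60;
  }
}

// Smallest whole number of seconds strictly greater than
// waitBeforeShutdown + waitAfterShutdown + expected drain time.
function minTerminationGracePeriodSeconds(
  waitBeforeShutdown: string,
  waitAfterShutdown: string,
  expectedDrainSeconds: number, // assumption: your own estimate of drain time
): number {
  const total =
    parseDurationSeconds(waitBeforeShutdown) +
    parseDurationSeconds(waitAfterShutdown) +
    expectedDrainSeconds;
  return Math.floor(total) + 1;
}

// The 30 s + 30 s pattern with an assumed 29 s drain budget.
console.log(minTerminationGracePeriodSeconds("30s", "30s", 29));
```

With the 30 s + 30 s pattern and a drain estimate just under 30 s, the result lines up with the "at least 90 s" guidance; a longer drain estimate pushes the value higher.
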