# Production guidelines

> Source: https://docs.erpc.cloud/operation/production
> Memory/GC tuning, healthcheck rollout, instance identification, error visibility, and IP forwarding recommendations for running eRPC in production.
> Format: machine-readable markdown export of the docs page above.
> All collapsible AI sections are inlined and fully expanded.

Practical recommendations for running eRPC in production — from container sizing through zero-downtime rollouts and instance identification.

**What this page covers:**

- Memory usage and Go GC tuning (`GOGC`, `GOMEMLIMIT`)
- Failsafe policies (retry, timeout, hedge)
- Caching database selection
- Horizontal scaling with shared state
- Explicit chain ID configuration
- Zero-downtime healthcheck rollout (Cilium/Envoy drain pattern)
- Custom response headers for instance identification
- `includeErrorDetails` in production
- `trustedIPForwarders` and `trustedIPHeaders` behind a load-balancer or CDN

## Memory and GC tuning

The largest memory contributor in eRPC is the size of RPC responses. Common calls like `eth_getBlockByNumber` or `eth_getTransactionReceipt` are typically under 1 MB; heavy calls like `debug_traceTransaction` can reach 50 MB. Most deployments see ~256 MB RSS at modest load.

Start with a generous limit (e.g. 16 GB) while routing real traffic, then lower it once you know your p99 working set.

To prevent OOM-kills on Kubernetes, add both env vars to your container spec:

```bash
# Trigger GC when heap grows by 30 % (default is 100 %)
GOGC=30

# Soft memory limit for the Go runtime: GC runs more aggressively as usage nears 2 GiB.
# Tune to ~80 % of your container memory limit.
# WARNING: set this too low and GC will thrash; combine with GOGC for best results
GOMEMLIMIT=2GiB
```

Example Docker run:

```bash
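# Assumes erpc.yaml is in the current directory; it is mounted read-only at the path passed to -c.
docker run -e GOGC=30 -e GOMEMLIMIT=2GiB \
  -v "$(pwd)/erpc.yaml:/etc/erpc/erpc.yaml:ro" \
  ghcr.io/erpc/erpc:latest erpc start -c /etc/erpc/erpc.yaml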
```

Kubernetes container spec snippet:

```yaml
env:
  - name: GOGC
    value: "30"
  - name: GOMEMLIMIT
    value: "2GiB"
resources:
  limits:
    memory: "2.5Gi"
  requests:
    memory: "512Mi"
```

## Failsafe policies

Configure [retry](/config/failsafe/retry.llms.txt) at both network and upstream scopes:

- **Network-level retry** rotates to a different upstream on a transient failure. Even with a single upstream it's worth enabling. Set `maxAttempts` ≈ number of upstreams.
- **Upstream-level retry** covers per-attempt flakiness within the same upstream. Use 2–5 `maxAttempts`.

Set a [timeout](/config/failsafe/timeout.llms.txt) that matches your request profile. For standard EVM calls a `3s` default is safe; for heavy trace or `getLogs` calls allow 10 s or more. Set `quantile: 0.99` on the upstream-scope timeout to auto-tune per method.

Enable the [hedge policy](/config/failsafe/hedge.llms.txt) for latency-sensitive reads. With `delay: 500ms`, eRPC races a second upstream once the primary has been quiet for 500 ms and returns the first kept response — at the cost of duplicate traffic for slow requests. Hedge attempts are excluded from per-upstream scoring and from the circuit breaker.
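To make the latency versus duplicate-traffic trade-off concrete, here is a rough model of a single hedge. It is a sketch under stated assumptions, not eRPC's implementation: `hedgedOutcome` and its parameters are hypothetical names, both attempts are assumed to succeed, and only one extra upstream is raced.

```typescript
// Simplified model of one hedged request; hypothetical helper, not eRPC API.
interface HedgeOutcome {
  latencyMs: number; // time until the caller receives a response
  hedgeSent: boolean; // whether a duplicate request was issued
  winner: "primary" | "hedge";
}

function hedgedOutcome(primaryMs: number, hedgeMs: number, delayMs: number): HedgeOutcome {
  // Primary answered before the hedge delay elapsed: no duplicate traffic.
  if (primaryMs <= delayMs) {
    return { latencyMs: primaryMs, hedgeSent: false, winner: "primary" };
  }
  // Otherwise the hedge starts at delayMs and completes at delayMs + hedgeMs.
  const hedgeDoneAt = delayMs + hedgeMs;
  return primaryMs <= hedgeDoneAt
    ? { latencyMs: primaryMs, hedgeSent: true, winner: "primary" }
    : { latencyMs: hedgeDoneAt, hedgeSent: true, winner: "hedge" };
}

// Fast primary (120 ms) with delay 500 ms: no hedge is sent.
console.log(hedgedOutcome(120, 150, 500));
// Slow primary (2000 ms): the hedge answers at 500 + 150 = 650 ms.
console.log(hedgedOutcome(2000, 150, 500));
```

In this model only requests slower than `delay` generate a second upstream call, which is why a delay near your p95 latency keeps the duplicate traffic small.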

Use [consensus](/config/failsafe/consensus.llms.txt) for high-trust reads (gas price, nonce, contract calls during write paths). Set [`maxWaitOnResult`](/config/failsafe/consensus.llms.txt#tail-latency-caps-maxwaitonresult--maxwaitonempty) to bound tail latency when one participant lags.

[Execution trace headers](/config/failsafe.llms.txt#http-response-headers) (`X-ERPC-Upstreams-Tried`, `X-ERPC-Upstreams-Outcomes`, `X-ERPC-Upstreams-Reasons`, `X-ERPC-Upstreams-Durations-Ms`, `X-ERPC-Upstreams-Flags`) ship by default — clients can debug retry/hedge/consensus decisions without server-side traces. Disable with `server.executionHeaders: off` if you want zero diagnostic leakage.

## Caching database

Large read-heavy workloads (e.g. indexing 100 M Arbitrum blocks) require substantial cache storage. Start with Redis; switch to PostgreSQL when cached data exceeds available memory.

eRPC degrades gracefully if the cache backend is unavailable — it falls back to live upstream calls with no impact on availability.

See [Database](/config/database.llms.txt) for connector configuration. [eRPC Cloud](/deployment/cloud.llms.txt) offers the most cost-effective caching for multi-tenant deployments.

## Horizontal scaling

Run multiple eRPC replicas with a shared Redis connector to synchronize latest/finalized block numbers across instances. Without shared state, each replica polls independently, increasing upstream requests.

See [Shared State](/config/database/shared-state.llms.txt). Even when Redis is temporarily unavailable, eRPC continues serving requests using local state tracking.

## Explicitly configure chain ID

Auto-detected chain IDs add one upstream call per network at startup and slow rolling restarts. Configure them explicitly:

- `networks.*.evm.chainId` — under [Networks](/config/projects/networks.llms.txt)
- `upstreams.*.evm.chainId` — under [Upstreams](/config/projects/upstreams.llms.txt)

## Healthcheck and zero-downtime rollout

Configure a [Healthcheck](/operation/healthcheck.llms.txt) readiness probe so your orchestrator stops routing to a pod before it shuts down.

### Cilium / Envoy drain pattern

When using Cilium with Envoy (Ingress or Gateway API), set both shutdown wait fields to 30 s:

**Config path:** `server`

**YAML — `erpc.yaml`:**

```yaml
server:
  waitBeforeShutdown: 30s  # pod marked draining; readiness probe fails
  waitAfterShutdown:  30s  # process stays alive until Envoy drains its connections
```

**TypeScript — `erpc.ts`:**

```typescript
import { createConfig } from "@erpc-cloud/config";

export default createConfig({
  server: {
    waitBeforeShutdown: "30s",
    waitAfterShutdown: "30s",
  },
});
```

With shorter values, Envoy may reuse a connection after the listener has closed, or route to a pod that has already exited. Adjust both values to match your own probe intervals.

## Custom response headers

Use `server.responseHeaders` to stamp every HTTP response with instance metadata for quick debugging without opening a trace:

**Config path:** `server`

**YAML — `erpc.yaml`:**

```yaml
server:
  responseHeaders:
    X-ERPC-Region:   ${FLY_REGION}      # Fly.io region
    X-ERPC-Machine:  ${FLY_MACHINE_ID}  # Fly.io machine ID
    # Kubernetes:
    # X-ERPC-Pod: ${HOSTNAME}
```

**TypeScript — `erpc.ts`:**

```typescript
import { createConfig } from "@erpc-cloud/config";

export default createConfig({
  server: {
    responseHeaders: {
      "X-ERPC-Region":  process.env.FLY_REGION,
      "X-ERPC-Machine": process.env.FLY_MACHINE_ID,
      // Kubernetes:
      // "X-ERPC-Pod": process.env.HOSTNAME,
    },
  },
});
```

Headers with empty values (after env-var expansion) are automatically omitted. Combine with [custom trace attributes](/operation/tracing.llms.txt#custom-resource-attributes) for full observability.

## Error detail visibility

By default eRPC includes upstream error details in responses. In production, set `includeErrorDetails: false` to avoid leaking internal endpoint URLs, API key fragments, or upstream error messages to end-users:

**Config path:** `server`

**YAML — `erpc.yaml`:**

```yaml
server:
  includeErrorDetails: false
```

**TypeScript — `erpc.ts`:**

```typescript
import { createConfig } from "@erpc-cloud/config";

export default createConfig({
  server: {
    includeErrorDetails: false,
  },
});
```

## Trusted IP forwarding

When eRPC runs behind a load-balancer or CDN, the real client IP arrives in a forwarded header. Configure `trustedIPForwarders` (CIDR ranges of your LB/CDN) and `trustedIPHeaders` (the header names to read, in order of preference):

**Config path:** `server`

**YAML — `erpc.yaml`:**

```yaml
server:
  trustedIPForwarders:
    - "10.0.0.0/8"       # cluster-internal LB CIDR
    - "172.16.0.0/12"
  trustedIPHeaders:
    - "X-Forwarded-For"
    - "CF-Connecting-IP"  # Cloudflare
```

**TypeScript — `erpc.ts`:**

```typescript
import { createConfig } from "@erpc-cloud/config";

export default createConfig({
  server: {
    trustedIPForwarders: ["10.0.0.0/8", "172.16.0.0/12"],
    trustedIPHeaders: ["X-Forwarded-For", "CF-Connecting-IP"],
  },
});
```

Without this, IP-based rate limits and `network` auth strategies see the LB address rather than the real client.

### Memory / GC tuning

eRPC is a Go process. The runtime's default GC target (`GOGC=100`) is appropriate for development but often too loose for containers with hard memory limits.

**Recommended production pair:**

```bash
GOGC=30          # run GC after heap grows 30 % — smaller heap, more frequent collections
GOMEMLIMIT=2GiB  # soft ceiling: GC works harder as Go runtime memory nears this value
```

Set `GOMEMLIMIT` to ~80 % of your container memory limit. For example: 2 GiB limit → `GOMEMLIMIT=1600MiB`. Setting it equal to the limit leaves no headroom and risks GC thrash or OOM from transient allocation bursts.

Caution: `GOGC < 10` causes GC thrashing — the runtime spends most CPU collecting, not serving requests. Values of 20–50 are the practical floor.

If you have abundant RAM and want to reduce CPU overhead, raise `GOGC` (e.g. 200). The heap will grow larger but GC runs less often.
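The arithmetic behind these two settings can be sketched as follows. This is an approximation for sizing, not something eRPC ships: `suggestGoMemLimit` and `nextGcTargetMiB` are hypothetical helpers, and the GC target ignores GC roots and any earlier collection forced by `GOMEMLIMIT`.

```typescript
const MIB_PER_GIB = 1024;

// ~80 % of the container memory limit, rounded down to a whole MiB.
function suggestGoMemLimit(containerLimitMiB: number, fraction = 0.8): string {
  return `${Math.floor(containerLimitMiB * fraction)}MiB`;
}

// Approximate heap size at which the next GC cycle starts:
// live heap plus GOGC percent of the live heap.
function nextGcTargetMiB(liveHeapMiB: number, gogc: number): number {
  return liveHeapMiB + (liveHeapMiB * gogc) / 100;
}

// 2 GiB container limit: 2048 * 0.8 = 1638.4, floored to 1638 MiB.
console.log(suggestGoMemLimit(2 * MIB_PER_GIB)); // "1638MiB"

// 300 MiB live heap: GOGC=100 lets it reach 600 MiB, GOGC=30 stops at 390 MiB.
console.log(nextGcTargetMiB(300, 100)); // 600
console.log(nextGcTargetMiB(300, 30)); // 390
```

The `1600MiB` figure in the text is the same 80 % rule rounded down to a convenient number. The second pair of results shows why the default `GOGC=100` is risky under a 512 MiB limit while `GOGC=30` leaves headroom.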

### Healthcheck rollout pattern

eRPC's shutdown sequence:

1. Receive SIGTERM.
2. Stop accepting new connections (`waitBeforeShutdown` delay — readiness probe starts failing).
3. Drain in-flight requests.
4. Wait `waitAfterShutdown` (keeps the process alive so the LB/proxy can close open connections).
5. Exit 0.

For Kubernetes with Cilium/Envoy, both values should be at least 30 s:

```yaml
server:
  waitBeforeShutdown: 30s
  waitAfterShutdown:  30s
```

The readiness probe should return unhealthy within 10 s of SIGTERM (before `waitBeforeShutdown` expires) so the orchestrator removes the endpoint before connections are refused.

Kubernetes `terminationGracePeriodSeconds` must be greater than `waitBeforeShutdown + waitAfterShutdown + time to drain`. Set it to at least 90 s for the 30 s + 30 s pattern above.
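A quick way to sanity-check the grace period is to add up the three components. The helper below is a hypothetical sketch: `parseSeconds` handles only `s` and `m` suffixes, and the drain time is whatever you measure for your slowest in-flight requests.

```typescript
// Parses durations such as "30s" or "2m" into seconds (only s and m are handled here).
function parseSeconds(duration: string): number {
  const match = /^(\d+)(s|m)$/.exec(duration.trim());
  if (!match) throw new Error(`unsupported duration: ${duration}`);
  const value = Number(match[1]);
  return match[2] === "m" ? value * 60 : value;
}

// Smallest whole-second terminationGracePeriodSeconds that is strictly greater
// than waitBeforeShutdown + waitAfterShutdown + drain time.
function minGracePeriodSeconds(
  waitBeforeShutdown: string,
  waitAfterShutdown: string,
  drainSeconds: number,
): number {
  return parseSeconds(waitBeforeShutdown) + parseSeconds(waitAfterShutdown) + drainSeconds + 1;
}

// 30 s + 30 s + an assumed 20 s drain = 80 s, so the minimum is 81 s.
console.log(minGracePeriodSeconds("30s", "30s", 20)); // 81
```

With an assumed 20 s drain, the recommended 90 s leaves roughly 9 s of margin before Kubernetes sends SIGKILL.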

### `responseHeaders` for instance identification

`server.responseHeaders` is a map of header name → value. Values support `${VAR}` env-var expansion. Headers with an empty value after expansion are silently omitted, so optional env vars are safe to use.
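The expand-then-omit behaviour described above can be illustrated with a small sketch. It is not eRPC's code: `expandHeaders` is a hypothetical function, and it assumes variable names consist of letters, digits, and underscores.

```typescript
// Expands ${VAR} placeholders from `env` and drops headers whose value ends up empty.
function expandHeaders(
  templates: Record<string, string>,
  env: Record<string, string | undefined>,
): Record<string, string> {
  const out: Record<string, string> = {};
  for (const [header, template] of Object.entries(templates)) {
    // Replace each ${VAR} with its value, or with "" when the variable is unset.
    const value = template.replace(/\$\{(\w+)\}/g, (_m, key: string) => env[key] ?? "");
    if (value !== "") out[header] = value;
  }
  return out;
}

// HOSTNAME is unset here, so X-ERPC-Pod is omitted from the result.
const expanded = expandHeaders(
  { "X-ERPC-Region": "${FLY_REGION}", "X-ERPC-Pod": "${HOSTNAME}" },
  { FLY_REGION: "ams" },
);
console.log(expanded);
```

This is why one config file can be shared across platforms: headers tied to variables that a platform does not set simply disappear.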

Useful headers:

| Header | Env var | Platform |
|---|---|---|
| `X-ERPC-Region` | `${FLY_REGION}` | Fly.io |
| `X-ERPC-Machine` | `${FLY_MACHINE_ID}` | Fly.io |
| `X-ERPC-Pod` | `${HOSTNAME}` | Kubernetes (pod name) |
| `X-ERPC-Instance` | `${INSTANCE_ID}` | explicit / custom |

Combine with tracing resource attributes (`tracing.resourceAttributes`) so every trace span carries the same instance label as the HTTP response header.

### `includeErrorDetails`

Controls whether upstream error messages and internal endpoint information appear in JSON-RPC error responses returned to callers.

- **Default:** `true` (errors are verbose — helpful for development).
- **Production:** set to `false` to prevent leaking upstream URLs, API key fragments, and internal error strings.

Errors are still logged internally at full verbosity regardless of this setting.

### `trustedIPForwarders` + `trustedIPHeaders`

When eRPC sits behind a reverse proxy, load-balancer, or CDN, the TCP source IP is always the proxy's address. To recover the real client IP:

1. `trustedIPForwarders`: list of CIDR blocks (or individual IPs) of the proxies whose forwarded headers (`X-Forwarded-For` or any other configured header) are trusted. Requests arriving from outside these ranges have their forwarded headers ignored.
2. `trustedIPHeaders` — ordered list of headers to read. eRPC picks the first header that is present on a request from a trusted forwarder.

This real IP is then used for:
- IP-based rate limiting and the `network` auth strategy (`allowedIPs`)
- Per-IP metric labels
- Any upstream selection that keys on client IP

Without this config, all requests appear to originate from your LB IP and IP-based policies are effectively global.
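The two-step resolution above can be sketched as follows. This is an illustration under explicit assumptions, not eRPC's implementation: it handles IPv4 only, expects lower-cased header keys, and takes the leftmost entry of a comma-separated header value. All function names are hypothetical.

```typescript
// Converts a dotted IPv4 address to an unsigned 32-bit integer.
function ipv4ToInt(ip: string): number {
  const parts = ip.split(".").map(Number);
  if (parts.length !== 4 || parts.some((p) => !Number.isInteger(p) || p < 0 || p > 255)) {
    throw new Error(`invalid IPv4 address: ${ip}`);
  }
  return ((parts[0] << 24) | (parts[1] << 16) | (parts[2] << 8) | parts[3]) >>> 0;
}

// True when `ip` falls inside `cidr`; a bare IP is treated as a /32.
function inCidr(ip: string, cidr: string): boolean {
  const [base, bitsStr] = cidr.split("/");
  const bits = bitsStr === undefined ? 32 : Number(bitsStr);
  const mask = bits === 0 ? 0 : (0xffffffff << (32 - bits)) >>> 0;
  return ((ipv4ToInt(ip) & mask) >>> 0) === ((ipv4ToInt(base) & mask) >>> 0);
}

function resolveClientIp(
  remoteAddr: string,
  headers: Record<string, string | undefined>, // keys assumed lower-cased
  trustedForwarders: string[],
  trustedHeaders: string[],
): string {
  // Step 1: forwarded headers count only when the TCP peer is a trusted forwarder.
  if (!trustedForwarders.some((cidr) => inCidr(remoteAddr, cidr))) return remoteAddr;
  // Step 2: the first configured header that is present wins.
  for (const name of trustedHeaders) {
    const value = headers[name.toLowerCase()];
    if (value) return value.split(",")[0].trim(); // assumption: leftmost entry is the client
  }
  return remoteAddr;
}

const forwarders = ["10.0.0.0/8", "172.16.0.0/12"];
const headerOrder = ["X-Forwarded-For", "CF-Connecting-IP"];
const reqHeaders = { "x-forwarded-for": "203.0.113.7, 10.0.0.5" };

console.log(resolveClientIp("10.1.2.3", reqHeaders, forwarders, headerOrder)); // 203.0.113.7
console.log(resolveClientIp("198.51.100.9", reqHeaders, forwarders, headerOrder)); // 198.51.100.9
```

The second call shows why the forwarder list matters: a client connecting directly cannot spoof its address by sending its own `X-Forwarded-For` header.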

### Metrics tuning

See [Monitoring](/operation/monitoring.llms.txt) for `metrics.histogramDropLabels` — dropping high-cardinality label combinations (e.g. per-upstream request-size histograms) avoids cardinality explosion in Prometheus.

### Tracing in production

See [Tracing](/operation/tracing.llms.txt) for OTLP exporter setup, sampling rate config, and adding custom resource attributes. Use `tracing.resourceAttributes` to attach region/instance labels that correlate with the `responseHeaders` you set above.

### Rate-limit budgets per project / upstream

See [Rate Limiters](/config/rate-limiters.llms.txt) for `rateLimiters.budgets` — define per-project or per-upstream budgets and reference them from auth strategies (per-API-key limits) or directly from upstream config (cap upstream call rate to protect a paid plan).

### Common pitfalls

- **`GOMEMLIMIT` without tuning `GOGC`**: with the default `GOGC=100` the heap can still double between collections until it approaches the soft limit, which produces large heap swings just under the ceiling. Always pair them.
- **`GOGC=100` with a tight container limit** — the heap can double in size before GC fires. A container with a 512 MiB limit can OOM before GC triggers.
- **`waitBeforeShutdown` too short** — load-balancers / service meshes can take several seconds to drain an endpoint after a readiness probe fails. Values below 10 s risk connection resets on rolling restarts with Envoy.
- **`waitAfterShutdown` too short** — if the process exits before the proxy finishes draining, in-flight requests to that pod are reset. 30 s is a safe default.
- **`terminationGracePeriodSeconds` too short** — Kubernetes SIGKILL fires when this expires. It must exceed `waitBeforeShutdown + waitAfterShutdown + expected drain time`.
- **`includeErrorDetails: true` in production** — upstream error messages often contain full endpoint URLs with API keys embedded. Set to `false` before exposing eRPC to external callers.
- **Missing `trustedIPForwarders`** — IP-based rate limits and auth policies all see the LB IP, effectively becoming global instead of per-client.
- **Chain ID auto-detection in large deployments** — every eRPC replica calls `eth_chainId` on every upstream at startup. With many replicas and many upstreams this creates a startup burst. Configuring `evm.chainId` explicitly eliminates it.

