# Kubernetes

> Source: https://docs.erpc.cloud/deployment/kubernetes
> Stateless by design: scale eRPC to any replica count, roll restarts without dropping requests, and wire Prometheus scraping in one manifest apply.
> Format: machine-readable markdown export of the docs page above.
> All collapsible AI sections are inlined and fully expanded.

eRPC is a stateless proxy: every replica is identical, rolling restarts are safe, and you can scale to any count without session affinity. The reference manifests in `kube/` give you a Namespace, Deployment, ConfigMap, ClusterIP Service, and PodMonitor in one apply. A companion manifest adds a PostgreSQL cache store. No in-flight requests are dropped as long as the termination grace period covers your drain and timeout settings.

```bash
kubectl apply -f kube/erpc.yml
kubectl apply -f kube/postgres.yml  # optional cache store
```

## Agent reference

Copy one of these prompts into your AI agent session (Claude Code, Cursor, …). Each one points the agent at this page's machine-readable reference so it can do the work correctly:

**Prompt Example #1: deploy eRPC to Kubernetes from scratch**

```text
Apply the eRPC Kubernetes manifests to my cluster: Namespace, Deployment, ConfigMap, ClusterIP Service, and PodMonitor. Set GOMEMLIMIT to 90% of the memory limit, wire the Downward API for POD_NAME, and pin the image to a specific release tag. Work with my existing eRPC config.
Read the full reference first: https://docs.erpc.cloud/deployment/kubernetes.llms.txt
```

**Prompt Example #2: tune graceful drain to avoid dropped requests**

```text
My Kubernetes pods are dropping in-flight requests during rolling deploys. Tune terminationGracePeriodSeconds, server.waitBeforeShutdown, and server.waitAfterShutdown so all requests drain cleanly before SIGKILL. Work with my existing eRPC config.
Reference: https://docs.erpc.cloud/deployment/kubernetes.llms.txt
```

**Prompt Example #3: fix OOM kills and high tail latency on pods**

```text
My eRPC pods are being OOM-killed under load and I also see high p99 latency spikes. Help me set GOMEMLIMIT, GOGC, and resource limits/requests correctly, and explain whether I should remove the CPU limit to avoid CFS throttling. Work with my existing eRPC config.
Reference: https://docs.erpc.cloud/deployment/kubernetes.llms.txt
```

**Prompt Example #4: wire Prometheus scraping via PodMonitor**

```text
My Prometheus Operator is not scraping eRPC metrics. Verify the PodMonitor spec, labels, and port names match my cluster's podMonitorSelector, and explain what metrics to alert on for production. Work with my existing eRPC config.
Reference: https://docs.erpc.cloud/deployment/kubernetes.llms.txt
```

---

### Kubernetes: full agent reference

### How it works

`kube/erpc.yml` defines five Kubernetes resources: an `erpc` Namespace, a Deployment (single replica by default, safe to scale), a ConfigMap `erpc-config` with an embedded example `erpc.yaml`, a ClusterIP Service (port 80 → 4000), and a `PodMonitor` for Prometheus Operator. `kube/postgres.yml` adds a PersistentVolumeClaim (500 Gi), Deployment, Service, and Secret for a companion PostgreSQL cache store.

**Config delivery.** The Deployment mounts the ConfigMap as a volume at `/erpc.yaml`. To change config, update the ConfigMap and run `kubectl rollout restart deployment/erpc`; ConfigMap updates alone do not trigger a rolling restart.

**Graceful drain.** eRPC handles `SIGTERM` via `signal.NotifyContext`. After receiving SIGTERM:

1. The healthcheck starts returning 503, so the readiness probe fails and the pod is removed from Service endpoints.
2. `server.waitBeforeShutdown` elapses while in-flight requests drain.
3. The HTTP server calls `Shutdown`.
4. `server.waitAfterShutdown` elapses, letting kube-proxy/Envoy close lingering TCP connections.
5. The process exits.
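The sequence above implies a lower bound on how long Kubernetes must wait before sending SIGKILL: both wait phases plus the longest request the server may still be serving (`server.maxTimeout`). As a minimal sketch of that arithmetic, the hypothetical helper below adds the three values and an optional buffer. `min_grace_seconds` is an illustrative name, not an eRPC or kubectl command, and it assumes all durations are given in whole seconds.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of eRPC: computes a lower bound for
# terminationGracePeriodSeconds from the drain settings (whole seconds).
# Usage: min_grace_seconds WAIT_BEFORE WAIT_AFTER MAX_TIMEOUT [BUFFER]
min_grace_seconds() {
  local wait_before="$1" wait_after="$2" max_timeout="$3" buffer="${4:-0}"
  echo $(( wait_before + wait_after + max_timeout + buffer ))
}

# 30s before shutdown, 30s after, 30s max request timeout, no buffer.
min_grace_seconds 30 30 30       # prints 90
# Same settings with a 90s safety buffer.
min_grace_seconds 30 30 30 90    # prints 180
```

How the result reaches the pod spec is left to whatever templating or patching workflow you already use.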
Set `terminationGracePeriodSeconds >= waitBeforeShutdown + waitAfterShutdown + server.maxTimeout`. 180s is a safe default for most workloads.

**Prometheus scraping.** The pod template carries scrape annotations, and a `PodMonitor` (`monitoring.coreos.com/v1`) is created in the `erpc` namespace:

```yaml
# Pod annotations (kube/erpc.yml:L20-L23)
prometheus.io/scrape: "true"
prometheus.io/port: "4001"
prometheus.io/path: "/metrics"
```

The `PodMonitor` is labeled `release: monitoring`, targets the port named `http`, scrapes every `10s` with a `5s` timeout, and matches pods labeled `app: erpc`.

**Probes.** The `/healthcheck` HTTP endpoint drives the startup and readiness probes; the liveness probe in the reference snippet is a plain TCP check on port 4000. See [Healthcheck](/operation/healthcheck.llms.txt) for the full evaluation strategy.

```yaml
startupProbe:
  httpGet:
    path: /healthcheck
    port: 4000
  initialDelaySeconds: 10
  periodSeconds: 10
  failureThreshold: 6  # 60s window to absorb slow upstream init
readinessProbe:
  httpGet:
    path: /healthcheck
    port: 4000
  periodSeconds: 5
  failureThreshold: 2  # removed from endpoints after ~10s
livenessProbe:
  tcpSocket:
    port: 4000
  initialDelaySeconds: 30
  periodSeconds: 10
  failureThreshold: 3
```

After SIGTERM, the readiness probe fails immediately by design, signalling the orchestrator to stop routing new traffic while in-flight requests finish draining.

### Config schema

No `erpc.yaml` fields are specific to the Kubernetes deployment layer. The following env vars are relevant in the pod spec:

| Variable | Recommended value | Notes |
|---|---|---|
| `GOMEMLIMIT` | About 90% of `resources.limits.memory` (e.g. `2700MiB` for `3Gi`) | Not set in reference manifest; absence risks OOM kill at container limit |
| `GOGC` | `40` | Lower GC target reduces heap swings; pair with `GOMEMLIMIT` |
| `POD_NAME` | Downward API `metadata.name` | Used for shared-state lock ownership; resolution order: `INSTANCE_ID` → `POD_NAME` → `HOSTNAME` → random UUID |
| `LOG_LEVEL` | `info` or `warn` | `trace` is extremely verbose at high RPC traffic; can saturate log shippers |
| `LOG_WRITER` | `console` | If set to `"console"`, switches to a zerolog console writer with `04:05.000ms` time format; default is JSON structured output |

**CLI subcommands** available in the container binary ([`cmd/erpc/main.go`](https://github.com/erpc/erpc/blob/main/cmd/erpc/main.go)):

| Subcommand | Purpose |
|---|---|
| `erpc start` (or `erpc [config]`) | Start the server |
| `erpc validate [--format json\|md]` | Parse config, run validation report, exit non-zero on errors; useful in CI pre-deploy checks |
| `erpc dump [--format yaml\|json]` | Parse config and dump the resolved effective config (including selection policy expansion) to stdout |

Reference resource specs from the kube manifest (the only code-grounded recommendations):

| Resource | Requests | Limits |
|---|---|---|
| eRPC pod | `3Gi` memory, `2` CPU | `3Gi` memory, `2` CPU |
| PostgreSQL pod | `8Gi` memory, `4` CPU | `8Gi` memory, `4` CPU |
| PostgreSQL PVC | n/a | `500Gi` |

### Worked examples

**1. Minimal production Deployment patch.** Add `GOMEMLIMIT`, remove the CPU limit to avoid CFS throttling, and wire the Downward API for `POD_NAME`:

```yaml
env:
  - name: GOMEMLIMIT
    value: "2700MiB"
  - name: GOGC
    value: "40"
  - name: POD_NAME
    valueFrom:
      fieldRef:
        fieldPath: metadata.name
resources:
  requests:
    memory: "3Gi"
    cpu: "2"
  limits:
    memory: "3Gi"
    # no cpu limit: avoids CFS throttling on Go goroutines
```

**2. Graceful drain tuning for a 30s `maxTimeout`.** Set these in `erpc.yaml` and match `terminationGracePeriodSeconds` in the pod spec:

```yaml
# erpc.yaml (inside ConfigMap)
server:
  waitBeforeShutdown: 30s
  waitAfterShutdown: 30s
```

```yaml
# Deployment pod spec
terminationGracePeriodSeconds: 180
```

30s drain + 30s TCP teardown + 30s max request timeout = 90s minimum; 180s leaves a 90s buffer and is safe for most workloads.

**3. Triggering a config reload.** After updating the ConfigMap with a new `erpc.yaml`, restart the Deployment, because ConfigMap edits alone are not detected at runtime:

```bash
kubectl apply -f kube/erpc.yml
kubectl rollout restart deployment/erpc -n erpc
kubectl rollout status deployment/erpc -n erpc
```

**4. Horizontal scaling.** eRPC has no replica-count constraint. Scale freely: the proxy path needs no sticky sessions and no leader election:

```bash
kubectl scale deployment/erpc --replicas=5 -n erpc
```

### Best practices

- **Set `GOMEMLIMIT=2700MiB`** (or about 90% of your memory limit) as a container env var. The reference manifest omits it; without it, Go's GC may allow the heap to reach the hard `3Gi` limit and trigger an OOM kill instead of a managed GC cycle.
- **Remove `resources.limits.cpu`**. Go's work-stealing scheduler is sensitive to Linux CFS throttling; a hard CPU limit raises tail latency without reducing memory usage. Set `resources.requests.cpu` for scheduler placement but leave limits absent.
- **Pin the image tag**. The reference manifest uses `ghcr.io/erpc/erpc:latest`; pin to a specific tag or SHA for reproducible production rollouts.
- **Pin PostgreSQL too**. `kube/postgres.yml` uses `postgres:latest`; pin a specific PostgreSQL version and digest for production.
- **Always `kubectl rollout restart` after ConfigMap changes**. A ConfigMap update in place does not trigger pod restarts; running pods keep serving with the old config until you explicitly restart.
- **Size `terminationGracePeriodSeconds` generously**. It must be ≥ `waitBeforeShutdown + waitAfterShutdown + server.maxTimeout`; otherwise Kubernetes sends SIGKILL before the drain completes, dropping active requests.
- **Allow egress to upstream RPC endpoints**. eRPC makes outbound connections on 443/TCP and 80/TCP to upstream providers, plus 5432/TCP (PostgreSQL) and 6379/TCP (Redis) for cache backends. Include these in your NetworkPolicy.

### Edge cases & gotchas

1. **No `GOMEMLIMIT` in reference manifest**: With a `3Gi` limit and no `GOMEMLIMIT`, the Go heap can grow until the container is OOM-killed. Set `GOMEMLIMIT=2700MiB`.
2. **CPU limits cause throttling**: Hard CPU limits engage Linux CFS throttling on Go goroutines, raising tail latency. Omit `resources.limits.cpu`.
3. **`terminationGracePeriodSeconds` too small**: If shorter than `waitBeforeShutdown + waitAfterShutdown + maxTimeout`, Kubernetes sends SIGKILL before drain completes, dropping active requests.
4. **Startup probe `initialDelaySeconds` too low**: If upstreams are slow to respond at startup, the probe fails and Kubernetes restarts the pod in a loop. Use a `startupProbe` with generous `failureThreshold` (6 × 10s = 60s window).
5. **ConfigMap update alone does not restart pods**: Run `kubectl rollout restart deployment/erpc` or checksum the ConfigMap in pod template annotations to force a rolling update.
6. **Image tag `:latest` is not reproducible**: The reference manifest uses `:latest`; operators should pin a specific tag or SHA.
7. **`kube/postgres.yml` PVC is `ReadWriteOnce`**: Multi-zone clusters may fail to reschedule the pod to a different availability zone. Ensure your StorageClass supports cross-zone access or use a managed database instead.
8. **No official Helm chart**: The repository provides raw manifests under `kube/`. Community-maintained charts may exist on Artifact Hub.
9. **Network policy egress**: eRPC makes outbound connections to upstream RPC endpoints (443/TCP, 80/TCP) and optional cache backends (5432/TCP PostgreSQL, 6379/TCP Redis). Allow ingress on 4000/TCP from app pods and 4001/TCP from the Prometheus scraper.
10. **`kube/postgres.yml` uses `postgres:latest`**: Pin a specific PostgreSQL version and digest for production.
11. **gRPC and HTTP share port 4000 by default**: `server.grpcPortV4` and `server.httpPortV4` both default to `4000`. If you override the gRPC port to a different value, a second listener is bound and you must add a corresponding `containerPort` and Service port entry. The manifest only exposes 4000 and 4001.
12. **pprof binary present but never invoked by default**: The image ships `/erpc-server-pprof` (built with `-tags pprof`) alongside the default `/erpc-server`. To enable profiling on port 6060, override the container `command` to `/erpc-server-pprof`; the default `CMD` runs the non-pprof binary.

### Observability

Scraped by Prometheus via PodMonitor at `4001/metrics` every 10s. Key metrics for Kubernetes-level alerting:

| Metric | Type | When it fires |
|---|---|---|
| `erpc_upstream_request_errors_total` | counter | Every upstream request error (used in `HighErrorRate` alert) |
| `erpc_upstream_request_duration_seconds_budget` | histogram | Per-upstream request duration (used in `SlowRequests` p95 alert) |
| `erpc_upstream_request_total` | counter | Every upstream request; drives `HighRequestRate` alert (> 1000 req/s for 5 min) and `LowRequestRate` alert (< 1 req/s for 15 min, warning) |
| `erpc_upstream_request_self_rate_limited_total` | counter | Upstream self-rate-limiting events |
| `erpc_network_request_self_rate_limited_total` | counter | Network-level self-rate-limiting events |

The `PodMonitor` is labeled `release: monitoring` and must match your Prometheus Operator's `podMonitorSelector`. [`kube/erpc.yml:L145-L160`](https://github.com/erpc/erpc/blob/main/kube/erpc.yml#L145-L160)

### Source code entry points

- [`kube/erpc.yml`](https://github.com/erpc/erpc/blob/main/kube/erpc.yml): Namespace, Deployment, ConfigMap, Service, PodMonitor
- [`kube/postgres.yml`](https://github.com/erpc/erpc/blob/main/kube/postgres.yml): PVC (500Gi), Deployment, Service, Secret for PostgreSQL
- [`kube/erpc.yml:L33-L36`](https://github.com/erpc/erpc/blob/main/kube/erpc.yml#L33-L36): reference resource sizing: `3Gi` memory, `2` CPU
- [`kube/erpc.yml:L145-L160`](https://github.com/erpc/erpc/blob/main/kube/erpc.yml#L145-L160): PodMonitor spec (10s scrape interval, 5s timeout)
- [`kube/erpc.yml:L54-L130`](https://github.com/erpc/erpc/blob/main/kube/erpc.yml#L54-L130): embedded ConfigMap with Arbitrum example config
- [`cmd/erpc/main.go:L72-L74`](https://github.com/erpc/erpc/blob/main/cmd/erpc/main.go#L72-L74): SIGTERM graceful shutdown via `signal.NotifyContext`
- [`cmd/erpc/main.go:L279-L294`](https://github.com/erpc/erpc/blob/main/cmd/erpc/main.go#L279-L294): config file search order (12 paths)

### Related pages

- [Docker](/deployment/docker.llms.txt): single-container run and local compose stack with monitoring.
- [Railway](/deployment/railway.llms.txt): one-click managed deploy.
- [Healthcheck](/operation/healthcheck.llms.txt): probe strategy and evaluation logic.
- [Monitoring & metrics](/operation/monitoring.llms.txt): full metrics reference for dashboards and alerting.
- [Authentication](/config/auth.llms.txt): secure the ingress before exposing the Service externally.