Kubernetes deployment

eRPC runs well on Kubernetes as a stateless proxy. A minimal setup needs a Deployment, a Service, and a ConfigMap holding your erpc.yaml. The sections below show ready-to-apply manifests you can copy and adjust.

What this page covers:

  • Quick-start manifests: ConfigMap, Secret, Deployment, Service
  • Readiness / liveness probe configuration (via /healthcheck)
  • Horizontal Pod Autoscaler
  • Graceful shutdown with waitBeforeShutdown / waitAfterShutdown
  • Optional PostgreSQL StatefulSet for cache backend
  • Helm chart status and kube/ reference manifests

Quick start

1. ConfigMap and Secret

apiVersion: v1
kind: ConfigMap
metadata:
  name: erpc-config
data:
  erpc.yaml: |
    logLevel: debug
    projects:
     - id: main
       upstreams:
       - endpoint: alchemy://${ALCHEMY_API_KEY}
       - endpoint: blastapi://${BLASTAPI_API_KEY}
       - endpoint: https://mynode-chain-1.svc.cluster.local
---
apiVersion: v1
kind: Secret
metadata:
  name: erpc-secrets
type: Opaque
stringData:
  ALCHEMY_API_KEY: your-alchemy-key-here
  BLASTAPI_API_KEY: your-blastapi-key-here

2. Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: erpc
  labels:
    app: erpc
spec:
  replicas: 1
  selector:
    matchLabels:
      app: erpc
  template:
    metadata:
      labels:
        app: erpc
    spec:
      containers:
      - name: erpc
        image: ghcr.io/erpc/erpc:latest
        resources:
          # CPU limits removed as they can cause throttling issues
          requests:
            memory: "256Mi"
          limits:
            memory: "2Gi"
        env:
        - name: GOGC
          value: "40"
        - name: GOMEMLIMIT
          value: "1900MiB"
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        envFrom:
        - secretRef:
            name: erpc-secrets
        ports:
        - containerPort: 4000
          name: http
        - containerPort: 4001
          name: metrics
        volumeMounts:
        - name: config
          mountPath: /erpc.yaml
          subPath: erpc.yaml
        startupProbe:
          httpGet:
            path: /healthcheck
            port: 4000
          initialDelaySeconds: 10
          periodSeconds: 10
          timeoutSeconds: 5
          failureThreshold: 6
        readinessProbe:
          httpGet:
            path: /healthcheck
            port: 4000
          initialDelaySeconds: 10
          periodSeconds: 5
          timeoutSeconds: 5
          failureThreshold: 2
          successThreshold: 1
        livenessProbe:
          tcpSocket:
            port: 4000
          initialDelaySeconds: 30
          periodSeconds: 10
          timeoutSeconds: 1
          failureThreshold: 3
          successThreshold: 1
      volumes:
      - name: config
        configMap:
          name: erpc-config
      # Must be >= waitBeforeShutdown + waitAfterShutdown + server.maxTimeout (see erpc.yaml)
      terminationGracePeriodSeconds: 180

3. Service

apiVersion: v1
kind: Service
metadata:
  name: erpc
  labels:
    app: erpc
spec:
  ports:
  - port: 4000
    name: http
    targetPort: 4000
  - port: 4001
    name: metrics
    targetPort: 4001
  selector:
    app: erpc

4. Horizontal Pod Autoscaler

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: erpc
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: erpc
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80

5. Apply

# Assumes each manifest above was saved under the file name used here.
kubectl apply -f erpc-configmap.yaml
kubectl apply -f erpc-secret.yaml
kubectl apply -f erpc-deployment.yaml
kubectl apply -f erpc-service.yaml
kubectl apply -f erpc-hpa.yaml
 
kubectl get pods
kubectl get services
kubectl get hpa

The service is available at erpc:4000 for HTTP and erpc:4001 for metrics within the cluster.

Deployment manifest field reference

| Field | Recommended value | Notes |
| --- | --- | --- |
| spec.replicas | 2+ in production | eRPC is stateless; any replica count works. |
| image | ghcr.io/erpc/erpc:latest | Pin to a digest for reproducible rollouts. |
| resources.requests.memory | 256Mi | Baseline for the scheduler. |
| resources.limits.memory | 2Gi | Set GOMEMLIMIT ~5% below this (e.g. 1900MiB for a 2Gi limit). |
| CPU limit | omit | Go's scheduler is work-stealing; a hard CPU limit causes throttling without a proportional latency benefit. |
| terminationGracePeriodSeconds | >= server.maxTimeout | Gives the pod time to drain in-flight requests. 180s is a safe default. |
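
As a hedged illustration of the first two rows, the Deployment fragment below raises the replica count and pins the image by digest. The digest is a placeholder, not a real value; substitute the one your registry reports for the release you have tested.

```yaml
# Fragment of the quick-start Deployment; only the changed fields are shown.
spec:
  replicas: 2                      # table recommendation for production
  template:
    spec:
      containers:
      - name: erpc
        # <digest> is a placeholder: replace it with the sha256 digest of the
        # image you validated. Unlike the mutable :latest tag, a digest
        # reference always resolves to the same image.
        image: ghcr.io/erpc/erpc@sha256:<digest>
```

If the HorizontalPodAutoscaler from step 4 manages this Deployment, raise its minReplicas as well, because the autoscaler overrides spec.replicas.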

POD_NAME / INSTANCE_ID env vars

eRPC uses an instance identifier for shared-state lock ownership and misbehavior file templating. On Kubernetes, inject it via the downward API:

env:
- name: POD_NAME
  valueFrom:
    fieldRef:
      fieldPath: metadata.name

Resolution order: INSTANCE_ID → POD_NAME → HOSTNAME → random UUID. See CLI & env vars for details.
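
Because INSTANCE_ID comes first in the resolution order, you can set it explicitly when the pod name alone is not the identifier you want. The sketch below is one possible arrangement, not something this page prescribes: it joins namespace and pod name using Kubernetes' $(VAR) expansion. The POD_NAMESPACE variable and the hyphen-joined format are assumptions made for illustration.

```yaml
env:
- name: POD_NAME
  valueFrom:
    fieldRef:
      fieldPath: metadata.name
- name: POD_NAMESPACE            # hypothetical helper variable
  valueFrom:
    fieldRef:
      fieldPath: metadata.namespace
# Kubernetes expands $(VAR) references, which must point at variables
# defined earlier in this list.
- name: INSTANCE_ID
  value: "$(POD_NAMESPACE)-$(POD_NAME)"
```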

Readiness and liveness probes

Use the /healthcheck HTTP endpoint for the startup and readiness probes and a TCP socket check for liveness. See Healthcheck for available evaluation strategies.

startupProbe:
  httpGet:
    path: /healthcheck
    port: 4000
  initialDelaySeconds: 10
  periodSeconds: 10
  timeoutSeconds: 5
  failureThreshold: 6      # 6 x 10s = 60s of probing after the 10s initial delay
 
readinessProbe:
  httpGet:
    path: /healthcheck
    port: 4000
  initialDelaySeconds: 10
  periodSeconds: 5
  timeoutSeconds: 5
  failureThreshold: 2      # removed from endpoints after ~10s
  successThreshold: 1
 
livenessProbe:
  tcpSocket:
    port: 4000
  initialDelaySeconds: 30
  periodSeconds: 10
  timeoutSeconds: 1
  failureThreshold: 3

The readiness probe fails immediately after the pod receives SIGTERM — this is intentional: it signals the orchestrator to stop routing new traffic before the in-flight drain completes.

Service types

ClusterIP (default) — suitable when eRPC is consumed by other in-cluster workloads:

spec:
  type: ClusterIP

LoadBalancer — exposes eRPC with a cloud load-balancer IP:

spec:
  type: LoadBalancer

Ingress — for TLS termination and host/path-based routing, attach an Ingress to the ClusterIP Service:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: erpc
spec:
  rules:
  - host: rpc.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: erpc
            port:
              number: 4000
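
The Ingress above routes by host but does not yet declare TLS. A variant that terminates TLS might look like the following sketch. The ingressClassName and the erpc-tls Secret name are assumptions; use whatever controller class and certificate Secret exist in your cluster.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: erpc
spec:
  ingressClassName: nginx          # assumption: an ingress-nginx controller
  tls:
  - hosts:
    - rpc.example.com
    secretName: erpc-tls           # assumption: a kubernetes.io/tls Secret
  rules:
  - host: rpc.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: erpc
            port:
              number: 4000
```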

ConfigMap for erpc.yaml

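Mount the config as a volume rather than baking it into the image. Kubernetes does not trigger a rolling restart when a ConfigMap changes, and a file mounted through subPath (as in the Deployment above) is not refreshed inside running pods either. After applying a config update, run kubectl rollout restart deployment/erpc, or add a checksum annotation to the pod template so that a config change also changes the pod spec. The snippet below uses Helm template syntax and only works if you render your manifests with Helm:

# In Deployment spec.template.metadata.annotations:
annotations:
  checksum/config: {{ include (print $.Template.BasePath "/configmap.yaml") . | sha256sum }}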
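
If you deploy plain manifests rather than a Helm chart, Kustomize's configMapGenerator is one way to get a similar effect. The sketch below assumes the file names from the Apply step and a raw erpc.yaml next to the kustomization file; adjust both to your layout.

```yaml
# kustomization.yaml (file names are assumptions)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- erpc-secret.yaml
- erpc-deployment.yaml
- erpc-service.yaml
- erpc-hpa.yaml
configMapGenerator:
- name: erpc-config    # same name the Deployment's config volume references
  files:
  - erpc.yaml          # the raw config file, not a ConfigMap manifest
```

Apply it with kubectl apply -k . in that directory. Kustomize appends a content hash to the generated ConfigMap's name and rewrites the reference in the Deployment, so editing erpc.yaml changes the pod template and triggers a rollout. The hand-written ConfigMap manifest is no longer needed in this setup.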

Optional: PostgreSQL StatefulSet for cache backend

A reference PostgreSQL StatefulSet is available at kube/postgres.yml in the repository. Connect eRPC to it by adding a connectors entry in your erpc.yaml:

connectors:
- id: pg-cache
  driver: postgresql
  postgresql:
    connectionUri: "postgresql://erpc:password@postgres:5432/erpc"

Then reference it from a cache policy on your project. The StatefulSet in kube/postgres.yml uses a PersistentVolumeClaim; ensure your cluster has a StorageClass that supports ReadWriteOnce.
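
The connection URI above embeds the password in the ConfigMap. One way to avoid that is sketched below, assuming the ${VAR} substitution shown for upstream endpoints in the quick start also applies to connectionUri; this page does not confirm that, so verify it against your eRPC version. POSTGRES_PASSWORD is a hypothetical key name.

```yaml
# Secret: add the database password next to the provider keys.
apiVersion: v1
kind: Secret
metadata:
  name: erpc-secrets
type: Opaque
stringData:
  ALCHEMY_API_KEY: your-alchemy-key-here
  BLASTAPI_API_KEY: your-blastapi-key-here
  POSTGRES_PASSWORD: your-postgres-password-here   # hypothetical key name
---
# erpc.yaml fragment: reference the variable instead of a literal password.
connectors:
- id: pg-cache
  driver: postgresql
  postgresql:
    connectionUri: "postgresql://erpc:${POSTGRES_PASSWORD}@postgres:5432/erpc"
```

Because the Deployment already loads erpc-secrets through envFrom, the new key becomes an environment variable without further changes. Passwords containing URI-reserved characters need to be percent-encoded.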

HorizontalPodAutoscaler integration

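The HPA manifest above scales on CPU and memory utilization. eRPC exposes Prometheus metrics on :4001/metrics. If your cluster has the Prometheus Adapter installed, you can add custom metrics derived from them (e.g. from erpc_upstream_request_duration_seconds) as additional HPA targets for more precise scaling. The metric name in the snippet below must match one that your Prometheus Adapter rules expose through the custom metrics API.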

metrics:
- type: Pods
  pods:
    metric:
      name: erpc_requests_per_second
    target:
      type: AverageValue
      averageValue: "500"
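
The Prometheus Adapter only serves metrics that its rules define, so the erpc_requests_per_second name has to be produced by such a rule. A minimal sketch follows. It assumes that eRPC exports a request counter named erpc_upstream_request_total and that your Prometheus scrape configuration attaches namespace and pod labels to it; neither is confirmed on this page, so check the :4001/metrics output and your scrape config first.

```yaml
# Prometheus Adapter configuration fragment (rules section).
rules:
- seriesQuery: 'erpc_upstream_request_total{namespace!="",pod!=""}'   # assumed counter name
  resources:
    overrides:
      namespace: {resource: "namespace"}
      pod: {resource: "pod"}
  name:
    matches: "^erpc_upstream_request_total$"
    as: "erpc_requests_per_second"     # the name the HPA target refers to
  # Per-pod request rate over a 2-minute window.
  metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)'
```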

PodDisruptionBudget recommendation

Prevent all replicas from being evicted simultaneously during cluster maintenance:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: erpc
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: erpc

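For HA setups, run at least 2 replicas (and raise the HPA's minReplicas to match) so that this PDB allows one voluntary disruption at a time. With a single replica, minAvailable: 1 blocks every voluntary eviction, so a node drain stalls until the pod is removed by other means.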

Graceful drain with waitBeforeShutdown / waitAfterShutdown

Configure these in erpc.yaml so pods drain cleanly under rolling updates:

server:
  # After SIGTERM: stay alive, mark NotReady, drain in-flight requests.
  # Set >= (readinessProbe.periodSeconds x readinessProbe.failureThreshold) + buffer.
  waitBeforeShutdown: 30s
  # After HTTP server stops: keep the process alive so kube-proxy / Envoy
  # can close lingering TCP connections.
  waitAfterShutdown: 30s

Also set terminationGracePeriodSeconds in the Deployment to at least waitBeforeShutdown + waitAfterShutdown + server.maxTimeout. See Healthcheck for the full configuration reference.
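
As a worked example of that sum, the fragments below pair an erpc.yaml with the matching Deployment setting. The maxTimeout value is an example chosen for the arithmetic, not a documented default.

```yaml
# erpc.yaml fragment
server:
  maxTimeout: 60s            # example value
  waitBeforeShutdown: 30s    # exceeds the readiness window: 5s period x 2 failures = 10s
  waitAfterShutdown: 30s
---
# Deployment fragment
spec:
  template:
    spec:
      # 30s + 30s + 60s = 120s minimum; 180 leaves 60s of headroom.
      terminationGracePeriodSeconds: 180
```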

GOGC and GOMEMLIMIT tuning

Set both env vars in the Deployment to keep memory usage predictable. See CLI & env vars and Production tuning for guidance.

env:
- name: GOGC
  value: "40"
- name: GOMEMLIMIT
  value: "1900MiB"  # ~93% of the 2Gi limit, slightly below the 95% rule of thumb

Rule of thumb: GOMEMLIMIT = 95% of resources.limits.memory. Omitting GOGC while setting GOMEMLIMIT can cause large heap swings just below the limit — always pair them.
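
Applied to a different limit, the rule of thumb works out as in the sketch below. The 4Gi limit is an example value, not a recommendation from this page.

```yaml
# Container spec fragment
resources:
  limits:
    memory: "4Gi"            # 4Gi = 4096MiB
env:
- name: GOGC
  value: "40"
- name: GOMEMLIMIT
  value: "3891MiB"           # 4096MiB x 0.95 = 3891.2, rounded down
```

GOMEMLIMIT is a soft limit for the Go runtime, so the gap to the container limit is what absorbs short overshoots before the kernel's OOM killer steps in.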

Helm chart status

There is no official Helm chart yet. The repository provides raw manifests under kube/ (kube/erpc.yml, kube/postgres.yml) as a starting point. Community-maintained charts may exist; check Artifact Hub. Contributions of an official chart are welcome.

Network policy considerations

eRPC makes outbound connections to upstream RPC endpoints and (optionally) to cache backends (PostgreSQL, Redis, DynamoDB). If your cluster enforces NetworkPolicy, allow:

  • Egress on 443/TCP and 80/TCP to upstream providers (or to the internet if using managed providers)
  • Egress on your cache backend's port (e.g. 5432/TCP for PostgreSQL, 6379/TCP for Redis)
  • Ingress on 4000/TCP from your application pods (or from the Ingress controller)
  • Ingress on 4001/TCP from your Prometheus scraper
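
The rules above could be expressed roughly as the NetworkPolicy below. The client label, the monitoring namespace, and the app: postgres selector are assumptions for illustration; replace them with the labels your workloads actually carry. The DNS rule is an addition not listed above: once egress is restricted, pods cannot resolve upstream hostnames unless port 53 is allowed.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: erpc
spec:
  podSelector:
    matchLabels:
      app: erpc
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:                              # application pods
    - podSelector:
        matchLabels:
          erpc-client: "true"          # hypothetical label
    ports:
    - protocol: TCP
      port: 4000
  - from:                              # Prometheus scraper
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: monitoring   # hypothetical namespace
    ports:
    - protocol: TCP
      port: 4001
  egress:
  - ports:                             # upstream providers, any destination
    - protocol: TCP
      port: 443
    - protocol: TCP
      port: 80
  - to:                                # PostgreSQL cache backend
    - podSelector:
        matchLabels:
          app: postgres                # hypothetical label
    ports:
    - protocol: TCP
      port: 5432
  - ports:                             # DNS resolution
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
```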

Common pitfalls

  • Probe interval too short during cold start — if initialDelaySeconds is too low and upstreams are slow to respond, the startup probe fails and Kubernetes restarts the pod in a loop. Use the startupProbe with a generous failureThreshold to absorb slow upstream initialization.
  • Config reload is not automatic — updating a ConfigMap does not restart the Deployment. Either patch the pod template annotations with a config checksum or run kubectl rollout restart deployment/erpc.
  • Log volume at trace level — logLevel: trace is extremely verbose; at high RPC traffic it can saturate log shippers and consume significant CPU. Use info or warn in production.
  • CPU limits causing throttling — omit resources.limits.cpu. A hard CPU limit causes the Go scheduler to be throttled by the Linux CFS scheduler, increasing tail latency without reducing memory usage.
  • terminationGracePeriodSeconds too small — if this is shorter than the time needed to drain in-flight requests plus the waitBeforeShutdown / waitAfterShutdown delays, Kubernetes sends SIGKILL before the drain completes, dropping active requests.
