Kubernetes deployment
eRPC runs well on Kubernetes as a stateless proxy. A minimal setup needs a Deployment, a Service, and a ConfigMap holding your erpc.yaml. The sections below show ready-to-apply manifests you can copy and adjust.
What this page covers:
- Quick-start manifests: ConfigMap, Secret, Deployment, Service
- Readiness / liveness probe configuration (via /healthcheck)
- Horizontal Pod Autoscaler
- Graceful shutdown with waitBeforeShutdown / waitAfterShutdown
- Optional PostgreSQL StatefulSet for cache backend
- Helm chart status and kube/ reference manifests
Quick start
1. ConfigMap and Secret
apiVersion: v1
kind: ConfigMap
metadata:
  name: erpc-config
data:
  erpc.yaml: |
    logLevel: debug
    projects:
      - id: main
        upstreams:
          - endpoint: alchemy://${ALCHEMY_API_KEY}
          - endpoint: blastapi://${BLASTAPI_API_KEY}
          - endpoint: https://mynode-chain-1.svc.cluster.local
---
apiVersion: v1
kind: Secret
metadata:
  name: erpc-secrets
type: Opaque
stringData:
  ALCHEMY_API_KEY: your-alchemy-key-here
  BLASTAPI_API_KEY: your-blastapi-key-here

2. Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: erpc
  labels:
    app: erpc
spec:
  replicas: 1
  selector:
    matchLabels:
      app: erpc
  template:
    metadata:
      labels:
        app: erpc
    spec:
      containers:
        - name: erpc
          image: ghcr.io/erpc/erpc:latest
          resources:
            # CPU limits removed as they can cause throttling issues
            requests:
              memory: "256Mi"
            limits:
              memory: "2Gi"
          env:
            - name: GOGC
              value: "40"
            - name: GOMEMLIMIT
              value: "1900MiB"
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
          envFrom:
            - secretRef:
                name: erpc-secrets
          ports:
            - containerPort: 4000
              name: http
            - containerPort: 4001
              name: metrics
          volumeMounts:
            - name: config
              mountPath: /erpc.yaml
              subPath: erpc.yaml
          startupProbe:
            httpGet:
              path: /healthcheck
              port: 4000
            initialDelaySeconds: 10
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 6
          readinessProbe:
            httpGet:
              path: /healthcheck
              port: 4000
            initialDelaySeconds: 10
            periodSeconds: 5
            timeoutSeconds: 5
            failureThreshold: 2
            successThreshold: 1
          livenessProbe:
            tcpSocket:
              port: 4000
            initialDelaySeconds: 30
            periodSeconds: 10
            timeoutSeconds: 1
            failureThreshold: 3
            successThreshold: 1
      volumes:
        - name: config
          configMap:
            name: erpc-config
      # Must be >= server.maxTimeout in erpc.yaml
      terminationGracePeriodSeconds: 180

3. Service
apiVersion: v1
kind: Service
metadata:
  name: erpc
  labels:
    app: erpc
spec:
  ports:
    - port: 4000
      name: http
      targetPort: 4000
    - port: 4001
      name: metrics
      targetPort: 4001
  selector:
    app: erpc

4. Horizontal Pod Autoscaler
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: erpc
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: erpc
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80

5. Apply
kubectl apply -f erpc-configmap.yaml
kubectl apply -f erpc-secret.yaml
kubectl apply -f erpc-deployment.yaml
kubectl apply -f erpc-service.yaml
kubectl apply -f erpc-hpa.yaml
kubectl get pods
kubectl get services
kubectl get hpa

The service is available at erpc:4000 for HTTP and erpc:4001 for metrics within the cluster.
Deployment manifest field reference
| Field | Recommended value | Notes |
|---|---|---|
| spec.replicas | 2+ in production | eRPC is stateless; any replica count works. |
| image | ghcr.io/erpc/erpc:latest | Pin to a digest for reproducible rollouts. |
| resources.requests.memory | 256Mi | Baseline for the scheduler. |
| resources.limits.memory | 2Gi | Set GOMEMLIMIT roughly 5-7% below this (e.g. 1900MiB for a 2Gi limit). |
| CPU limit | omit | Go's scheduler is work-stealing; a hard CPU limit causes throttling without a proportional latency benefit. |
| terminationGracePeriodSeconds | >= server.maxTimeout | Gives the pod time to drain in-flight requests. Add waitBeforeShutdown and waitAfterShutdown to the budget if you set them. 180s is a safe default. |
POD_NAME / INSTANCE_ID env vars
eRPC uses an instance identifier for shared-state lock ownership and misbehavior file templating. On Kubernetes, inject it via the downward API:
env:
  - name: POD_NAME
    valueFrom:
      fieldRef:
        fieldPath: metadata.name

Resolution order: INSTANCE_ID → POD_NAME → HOSTNAME → random UUID. See CLI & env vars for details.
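The fallback chain can be pictured with a small shell sketch. It only illustrates the documented order and is not eRPC's actual implementation: resolve_instance_id is a hypothetical helper, and treating an empty variable the same as an unset one is an assumption.

```shell
# Hypothetical helper mirroring the documented order:
# INSTANCE_ID -> POD_NAME -> HOSTNAME -> random UUID.
# Assumption: an empty variable is treated like an unset one.
resolve_instance_id() {
  if [ -n "${INSTANCE_ID:-}" ]; then
    printf '%s\n' "$INSTANCE_ID"
  elif [ -n "${POD_NAME:-}" ]; then
    printf '%s\n' "$POD_NAME"
  elif [ -n "${HOSTNAME:-}" ]; then
    printf '%s\n' "$HOSTNAME"
  else
    # Random fallback; this kernel file is Linux-specific.
    cat /proc/sys/kernel/random/uuid
  fi
}

# With INSTANCE_ID unset and POD_NAME injected by the downward API,
# the pod name is chosen before HOSTNAME is consulted.
( unset INSTANCE_ID; POD_NAME=erpc-7d9f8; resolve_instance_id )
```

Setting INSTANCE_ID explicitly is therefore only needed when the pod name is not the identifier you want.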
Readiness and liveness probes
Use the /healthcheck HTTP endpoint for the readiness probe and a TCP socket check for liveness. See Healthcheck for available evaluation strategies.
startupProbe:
  httpGet:
    path: /healthcheck
    port: 4000
  initialDelaySeconds: 10
  periodSeconds: 10
  timeoutSeconds: 5
  failureThreshold: 6 # 60s window to finish startup
readinessProbe:
  httpGet:
    path: /healthcheck
    port: 4000
  initialDelaySeconds: 10
  periodSeconds: 5
  timeoutSeconds: 5
  failureThreshold: 2 # removed from endpoints after ~10s
  successThreshold: 1
livenessProbe:
  tcpSocket:
    port: 4000
  initialDelaySeconds: 30
  periodSeconds: 10
  timeoutSeconds: 1
  failureThreshold: 3

The readiness probe fails immediately after the pod receives SIGTERM. This is intentional: it signals the orchestrator to stop routing new traffic before the in-flight drain completes.
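The windows quoted in the probe comments follow from periodSeconds multiplied by failureThreshold. A minimal sketch of that arithmetic, using a hypothetical probe_window helper and ignoring initialDelaySeconds and per-request timeouts (both of which lengthen the real window somewhat):

```shell
# probe_window is a hypothetical helper: approximate seconds of consecutive
# failures before Kubernetes acts, i.e. periodSeconds * failureThreshold.
probe_window() {
  period=$1
  failure_threshold=$2
  echo $(( period * failure_threshold ))
}

probe_window 10 6   # startupProbe: prints 60
probe_window 5 2    # readinessProbe: prints 10
probe_window 10 3   # livenessProbe: prints 30
```

If upstream initialization regularly takes longer than the startup window, raising failureThreshold on the startupProbe widens it without slowing down readiness detection later.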
Service types
ClusterIP (default) is suitable when eRPC is consumed by other in-cluster workloads:
spec:
  type: ClusterIP

LoadBalancer exposes eRPC with a cloud load-balancer IP:
spec:
  type: LoadBalancer

Ingress is the option for TLS termination and host/path-based routing; attach one to the ClusterIP Service:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: erpc
spec:
  rules:
    - host: rpc.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: erpc
                port:
                  number: 4000

ConfigMap for erpc.yaml
Mount the config as a volume rather than baking it into the image. Note that Kubernetes does NOT trigger a rolling restart when a ConfigMap changes — use kubectl rollout restart deployment/erpc after applying a config update, or add an annotation checksum to the pod template.
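For setups that do not template manifests with Helm, the checksum idea can be sketched with plain kubectl. The helper below is hypothetical (config_checksum_patch is not part of eRPC or kubectl) and assumes sha256sum from GNU coreutils is available. It prints a JSON merge patch that stamps the pod template with the hash of the ConfigMap file; because the pod template changes whenever the hash changes, the Deployment rolls.

```shell
# config_checksum_patch is a hypothetical helper: it prints a JSON merge patch
# that sets a checksum/config annotation on the pod template to the SHA-256
# of the given file. Assumes sha256sum (GNU coreutils) is installed.
config_checksum_patch() {
  checksum=$(sha256sum "$1" | cut -d ' ' -f 1)
  printf '{"spec":{"template":{"metadata":{"annotations":{"checksum/config":"%s"}}}}}\n' "$checksum"
}

# Usage (needs cluster access, so shown as comments):
#   kubectl apply -f erpc-configmap.yaml
#   kubectl patch deployment erpc --type merge \
#     -p "$(config_checksum_patch erpc-configmap.yaml)"
```

If the file has not changed, the patch is a no-op and no rollout is triggered, which makes the command safe to run on every deploy.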
# Helm template syntax; goes under the Deployment's spec.template.metadata.annotations:
annotations:
  checksum/config: {{ include (print $.Template.BasePath "/configmap.yaml") . | sha256sum }}

Optional: PostgreSQL StatefulSet for cache backend
A reference PostgreSQL StatefulSet is available at kube/postgres.yml in the repository. Connect eRPC to it by adding a connectors entry in your erpc.yaml:
connectors:
  - id: pg-cache
    driver: postgresql
    postgresql:
      connectionUri: "postgresql://erpc:password@postgres:5432/erpc"

Then reference it from a cache policy on your project. The StatefulSet in kube/postgres.yml uses a PersistentVolumeClaim; ensure your cluster has a StorageClass that supports ReadWriteOnce.
HorizontalPodAutoscaler integration
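The HPA manifest above scales on CPU and memory utilization. Utilization targets are computed as a percentage of each container's resource requests, so the CPU target only takes effect once the Deployment sets resources.requests.cpu (the quick-start manifest sets only a memory request). eRPC exposes Prometheus metrics on :4001/metrics. If your cluster has the Prometheus Adapter installed, you can add custom metrics (e.g. erpc_upstream_request_duration_seconds) as additional HPA targets for more precise scaling. The metric name in an HPA target must match one exposed by your adapter rules; erpc_requests_per_second below is illustrative.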
metrics:
  - type: Pods
    pods:
      metric:
        name: erpc_requests_per_second
      target:
        type: AverageValue
        averageValue: "500"

PodDisruptionBudget recommendation
Prevent all replicas from being evicted simultaneously during cluster maintenance:
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: erpc
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: erpc

For HA setups, run at least 2 replicas so that this PDB allows one voluntary disruption at a time. With a single replica, minAvailable: 1 blocks voluntary evictions such as node drains entirely.
Graceful drain with waitBeforeShutdown / waitAfterShutdown
Configure these in erpc.yaml so pods drain cleanly under rolling updates:
server:
  # After SIGTERM: stay alive, mark NotReady, drain in-flight requests.
  # Set >= (readinessProbe.periodSeconds x readinessProbe.failureThreshold) + buffer.
  waitBeforeShutdown: 30s
  # After HTTP server stops: keep the process alive so kube-proxy / Envoy
  # can close lingering TCP connections.
  waitAfterShutdown: 30s

Also set terminationGracePeriodSeconds in the Deployment to at least waitBeforeShutdown + waitAfterShutdown + server.maxTimeout. See Healthcheck for the full configuration reference.
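The two constraints above can be checked mechanically before a rollout. The following is a minimal sketch with a hypothetical check_shutdown_budget helper; the server.maxTimeout of 120s in the example call is an assumed value for illustration, not a documented default, and the check does not add the extra buffer the comment recommends.

```shell
# check_shutdown_budget is a hypothetical helper. All arguments are seconds:
#   $1 readinessProbe.periodSeconds   $2 readinessProbe.failureThreshold
#   $3 waitBeforeShutdown             $4 waitAfterShutdown
#   $5 server.maxTimeout              $6 terminationGracePeriodSeconds
check_shutdown_budget() {
  readiness_window=$(( $1 * $2 ))
  required_grace=$(( $3 + $4 + $5 ))
  if [ "$3" -lt "$readiness_window" ]; then
    echo "waitBeforeShutdown ${3}s is shorter than the readiness window ${readiness_window}s"
    return 1
  fi
  if [ "$6" -lt "$required_grace" ]; then
    echo "terminationGracePeriodSeconds ${6}s is below the required ${required_grace}s"
    return 1
  fi
  echo "ok: grace period ${6}s covers ${required_grace}s"
}

# Values from this page, plus an assumed server.maxTimeout of 120s.
check_shutdown_budget 5 2 30 30 120 180
```

With those inputs the 180s grace period is exactly consumed, so a larger server.maxTimeout would call for a larger terminationGracePeriodSeconds.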
GOGC and GOMEMLIMIT tuning
Set both env vars in the Deployment to keep memory usage predictable. See CLI & env vars and Production tuning for guidance.
env:
  - name: GOGC
    value: "40"
  - name: GOMEMLIMIT
    value: "1900MiB" # ~93% of the 2Gi (2048MiB) limit

Rule of thumb: GOMEMLIMIT = 95% of resources.limits.memory; the 1900MiB used here is slightly more conservative. Omitting GOGC while setting GOMEMLIMIT can cause large heap swings just below the limit, so always pair them.
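The rule of thumb reduces to integer arithmetic on the memory limit. A small sketch, assuming the limit is expressed in MiB; gomemlimit_for is a hypothetical helper name:

```shell
# gomemlimit_for is a hypothetical helper: prints 95% of a container memory
# limit (given in MiB), rounded down, with the MiB suffix GOMEMLIMIT accepts.
gomemlimit_for() {
  limit_mib=$1
  echo "$(( limit_mib * 95 / 100 ))MiB"
}

gomemlimit_for 2048   # 2Gi limit: prints 1945MiB
gomemlimit_for 4096   # 4Gi limit: prints 3891MiB
```

Rounding down is deliberate here: erring toward a lower GOMEMLIMIT leaves more headroom before the kernel's OOM killer steps in.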
Helm chart status
There is no official Helm chart yet. The repository provides raw manifests under kube/ (kube/erpc.yml, kube/postgres.yml) as a starting point. Community-maintained charts may exist; check Artifact Hub. Contributions of an official chart are welcome.
Network policy considerations
eRPC makes outbound connections to upstream RPC endpoints and (optionally) to cache backends (PostgreSQL, Redis, DynamoDB). If your cluster enforces NetworkPolicy, allow:
- Egress on 443/TCP and 80/TCP to upstream providers (or to the internet if using managed providers)
- Egress on your cache backend's port (e.g. 5432/TCP for PostgreSQL, 6379/TCP for Redis)
- Ingress on 4000/TCP from your application pods (or from the Ingress controller)
- Ingress on 4001/TCP from your Prometheus scraper
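The rules listed above could be expressed as a single NetworkPolicy. The sketch below writes one to a file for review before applying it. Several names are assumptions for illustration and do not come from the eRPC repository: the role=rpc-client label on application pods, a Prometheus namespace called monitoring, and the app=postgres label on the cache backend. The DNS egress rule is an addition that most clusters need once egress is restricted.

```shell
# Writes a NetworkPolicy sketch to erpc-networkpolicy.yaml.
# Assumed (hypothetical) names: role=rpc-client, namespace "monitoring", app=postgres.
cat > erpc-networkpolicy.yaml <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: erpc
spec:
  podSelector:
    matchLabels:
      app: erpc
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:                      # application pods -> HTTP
        - podSelector:
            matchLabels:
              role: rpc-client
      ports:
        - protocol: TCP
          port: 4000
    - from:                      # Prometheus scraper -> metrics
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: monitoring
      ports:
        - protocol: TCP
          port: 4001
  egress:
    - ports:                     # upstream RPC providers, any destination
        - protocol: TCP
          port: 443
        - protocol: TCP
          port: 80
    - to:                        # PostgreSQL cache backend
        - podSelector:
            matchLabels:
              app: postgres
      ports:
        - protocol: TCP
          port: 5432
    - ports:                     # DNS lookups for upstream hostnames
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
EOF

# Review the file, then apply it with:
#   kubectl apply -f erpc-networkpolicy.yaml
```

Swap the PostgreSQL rule for port 6379 if Redis is the cache backend, and drop it entirely when no shared cache is configured.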
Common pitfalls
- Probe interval too short during cold start: if initialDelaySeconds is too low and upstreams are slow to respond, the startup probe fails and Kubernetes restarts the pod in a loop. Use the startupProbe with a generous failureThreshold to absorb slow upstream initialization.
- Config reload is not automatic: updating a ConfigMap does not restart the Deployment. Either patch the pod template annotations with a config checksum or run kubectl rollout restart deployment/erpc.
- Log volume at trace level: the trace log level (logLevel: trace) is extremely verbose; at high RPC traffic it can saturate log shippers and consume significant CPU. Use info or warn in production.
- CPU limits causing throttling: omit resources.limits.cpu. A hard CPU limit causes the Go scheduler to be throttled by the Linux CFS scheduler, increasing tail latency without reducing memory usage.
- terminationGracePeriodSeconds too small: if this is shorter than the time needed to drain in-flight requests plus the waitBeforeShutdown / waitAfterShutdown delays, Kubernetes sends SIGKILL before the drain completes, dropping active requests.