# See everything

> Source: https://docs.erpc.cloud/use-cases/see-everything
> Per-request metrics, traces, and honest healthchecks — know about problems before your users do.
> Format: machine-readable markdown export of the docs page above.
> All collapsible AI sections are inlined and fully expanded.

When something degrades at 3 a.m., "the RPC feels slow" isn't a diagnosis. eRPC measures
every request at every hop — which project, which chain, which upstream, which method,
cache hit or miss, retried or hedged — and exposes it all as Prometheus metrics and
OpenTelemetry traces. Grafana dashboards ship in the repo. Your on-call sees exactly
which provider degraded and when, instead of guessing.

- **[Monitoring & metrics](/operation/monitoring.llms.txt)** — The 20 metrics that matter on day one, plus ready-made dashboards.
- **[Tracing & logging](/operation/tracing.llms.txt)** — Follow one request through every policy and upstream attempt.
- **[Healthcheck](/operation/healthcheck.llms.txt)** — Readiness that tells your load balancer the truth.
- **[Error taxonomy](/reference/errors.llms.txt)** — Every error normalized to a named code you can alert on.
- **[Metrics reference](/reference/metrics.llms.txt)** — All 122 metrics, every label, in one table.

All of the above in one place — illustrative, not a tuned production config:

**Config path:** `(root)`

**YAML — `erpc.yaml`:**

```yaml
# structured logs with secret redaction
logLevel: info
# Prometheus on :4001/metrics
metrics:
  enabled: true
  port: 4001
# one OTel trace per request, every hop; keep 10% of traces
tracing:
  enabled: true
  protocol: grpc
  endpoint: otel-collector:4317
  sampleRate: 0.1
# /healthcheck is always on; error codes are normalized automatically.
```

**TypeScript — `erpc.ts`:**

```typescript
// structured logs with secret redaction
logLevel: "info",
// Prometheus on :4001/metrics
metrics: { enabled: true, port: 4001 },
// one OTel trace per request, every hop; keep 10% of traces
tracing: {
  enabled: true,
  protocol: "grpc",
  endpoint: "otel-collector:4317",
  sampleRate: 0.1,
},
// /healthcheck is always on; error codes are normalized automatically.
```
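Once the metrics endpoint from the config above is up, a quick sanity check is to scrape it and see which `erpc_` series are actually exposed. The sketch below is a minimal parser for the Prometheus text exposition format, not part of eRPC itself. The helper names `parseMetrics` and `seriesPerMetric` are hypothetical, and the parser deliberately ignores edge cases such as escaped quotes in label values.

```typescript
// One parsed sample from the Prometheus text exposition format.
interface Sample {
  name: string;
  labels: Record<string, string>;
  value: number;
}

// Parse exposition text into samples, skipping comments and blank lines.
// Simplifications: escaped quotes inside label values are not handled,
// trailing timestamps are ignored, and "+Inf"/"-Inf" values become NaN.
function parseMetrics(text: string): Sample[] {
  const lineRe = /^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)/;
  const samples: Sample[] = [];
  for (const raw of text.split("\n")) {
    const line = raw.trim();
    if (line === "" || line.startsWith("#")) continue;
    const m = lineRe.exec(line);
    if (m === null) continue;
    const labels: Record<string, string> = {};
    const labelRe = /([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"/g;
    let lm: RegExpExecArray | null;
    while ((lm = labelRe.exec(m[2] ?? "")) !== null) {
      labels[lm[1]] = lm[2];
    }
    samples.push({ name: m[1], labels, value: Number(m[3]) });
  }
  return samples;
}

// Count how many series each metric exposes under a given name prefix.
function seriesPerMetric(samples: Sample[], prefix: string): Map<string, number> {
  const counts = new Map<string, number>();
  for (const s of samples) {
    if (!s.name.startsWith(prefix)) continue;
    counts.set(s.name, (counts.get(s.name) ?? 0) + 1);
  }
  return counts;
}
```

Feeding it the response body of a `GET` against the metrics endpoint (for example `http://localhost:4001/metrics`, where the host is an assumption about your deployment) and calling `seriesPerMetric(samples, "erpc_")` gives a rough view of which metrics are live and how many label combinations each one carries.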

## Agent reference

Copy one of these prompts into your AI agent session (Claude Code, Cursor, …) — each one
points the agent at this page's machine-readable reference so it can do the work correctly:

**Prompt Example #1: wire up Prometheus metrics and a Grafana dashboard**

```text
I want to enable eRPC metrics so I can import the bundled Grafana dashboard and
start monitoring upstream health and cache hit rates. Update my eRPC config with the
metrics config and tell me which metrics matter most on day one. Read the full
reference first: https://docs.erpc.cloud/use-cases/see-everything.llms.txt
```

**Prompt Example #2: add distributed tracing and structured logging**

```text
I need to trace individual requests through eRPC all the way to the upstream to
debug slow responses. Configure OpenTelemetry tracing and structured log output in
my eRPC config with a sensible sample rate for production. Reference:
https://docs.erpc.cloud/use-cases/see-everything.llms.txt
```

**Prompt Example #3: build alerts for upstream degradation and chain-tip lag**

```text
I want Prometheus alerts that fire when an upstream starts erroring heavily or falls
behind the chain tip. Walk me through the right metric names, labels, and threshold
values to use, given the setup in my eRPC config. Reference:
https://docs.erpc.cloud/use-cases/see-everything.llms.txt
```

---

### See everything — agent starting points

This page sells the outcome; implementation lives in the feature pages. Fetch their
machine-readable companions:

- [Monitoring & metrics](/operation/monitoring.llms.txt) — metrics server config, key operational metrics, Grafana dashboards.
- [Tracing & logging](/operation/tracing.llms.txt) — OTel endpoint/sampling/protocol, span hierarchy, log levels, secret redaction.
- [Healthcheck](/operation/healthcheck.llms.txt) — eval strategies, root vs per-project checks, k8s probes.
- [Error taxonomy](/reference/errors.llms.txt) — every error code, retryability, HTTP status mapping.
- [Metrics reference](/reference/metrics.llms.txt) — the complete 122-metric catalog with labels and cardinality warnings.

Alerting hints for agents: upstream health is `erpc_upstream_request_errors_total`, broken
down by the `error` and `severity` labels; chain-tip freshness is
`erpc_network_latest_block_timestamp_distance_seconds`; and benching events surface via
cordon metrics. Exact names, labels, and firing conditions are in the agent section of the
metrics reference.
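To make the alerting hints concrete, here is a minimal sketch of the two checks over already-parsed samples. It assumes the error metric is a counter and the tip-distance metric is a gauge in seconds. Both thresholds and the function names `errorSpikes` and `laggingSeries` are hypothetical. In a real deployment this logic would more likely live in Prometheus alerting rules than in application code.

```typescript
interface Sample {
  name: string;
  labels: Record<string, string>;
  value: number;
}

// Hypothetical thresholds; tune them against your own traffic.
const MAX_NEW_ERRORS = 50;
const MAX_TIP_LAG_SECONDS = 60;

// Sum the values of one metric, grouped by a single label.
function sumBy(samples: Sample[], metric: string, label: string): Map<string, number> {
  const totals = new Map<string, number>();
  for (const s of samples) {
    if (s.name !== metric) continue;
    const key = s.labels[label] ?? "";
    totals.set(key, (totals.get(key) ?? 0) + s.value);
  }
  return totals;
}

// Compare two scrapes of the error counter and return the severities
// whose increase between scrapes exceeds the limit.
function errorSpikes(prev: Sample[], curr: Sample[], limit: number = MAX_NEW_ERRORS): string[] {
  const metric = "erpc_upstream_request_errors_total";
  const before = sumBy(prev, metric, "severity");
  const after = sumBy(curr, metric, "severity");
  const spiking: string[] = [];
  after.forEach((value, severity) => {
    const delta = value - (before.get(severity) ?? 0);
    // A negative delta means the counter reset (for example after a restart),
    // so the current value is treated as the increase.
    const increase = delta >= 0 ? delta : value;
    if (increase > limit) spiking.push(severity);
  });
  return spiking;
}

// Return every series whose chain-tip distance exceeds the limit.
function laggingSeries(curr: Sample[], limit: number = MAX_TIP_LAG_SECONDS): Sample[] {
  const metric = "erpc_network_latest_block_timestamp_distance_seconds";
  return curr.filter((s) => s.name === metric && s.value > limit);
}
```

Grouping by `severity` is only one choice; passing a different label to `sumBy` works the same way, provided that label exists on the metric in your eRPC version.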

---


## Navigation (machine-readable surface)

- Up: [All pages index](https://docs.erpc.cloud/llms.txt)
- Root index of every page: [llms.txt](https://docs.erpc.cloud/llms.txt) · everything in one file: [llms-full.txt](https://docs.erpc.cloud/llms-full.txt)

### Sibling pages

- [Cut RPC cost & latency](https://docs.erpc.cloud/use-cases/cut-costs-and-latency.llms.txt) — Serve repeated questions from cache, deduplicate identical requests, and stop paying providers for the same answer twice.
- [How eRPC works](https://docs.erpc.cloud/use-cases/how-it-works.llms.txt) — Every JSON-RPC call travels a battle-tested pipeline — auth, smart caching, parallel hedging, multi-upstream consensus — and arrives with full diagnostic headers. Zero glue code required.
- [Lock it down](https://docs.erpc.cloud/use-cases/lock-it-down.llms.txt) — Keys, JWTs, sign-in with Ethereum, per-user rate limits — your RPC endpoint stops being a free-for-all.
- [Scale chains & providers](https://docs.erpc.cloud/use-cases/scale-chains-and-providers.llms.txt) — One config line per provider, every chain they support — and the best upstream wins each request.
- [Survive provider outages](https://docs.erpc.cloud/use-cases/survive-provider-outages.llms.txt) — Keep serving traffic when an RPC provider slows down, rate-limits you, or disappears entirely.
- [Trust the data](https://docs.erpc.cloud/use-cases/trust-the-data.llms.txt) — Don't let one misbehaving node feed your app a wrong answer — verify, cross-check, and enforce integrity automatically.