Dashboard
The Layer dashboard is the operator surface that ships in-cluster alongside the gateway. It reads from the same gateway API customers do — no direct database, Aerospike, or VictoriaMetrics access — and surfaces the views that justify Layer’s role as the operating layer between an application and its vector store.
Deployments on EKS reach the dashboard at https://dashboard.hevlayer.com.
Self-hosted installs expose it via the layer-dashboard Service.
Layout
The dashboard groups everything operators care about into six tabs:
| Tab | What it answers |
|---|---|
| console | What is happening right now? At-a-glance gauges + activity log. |
| data | What is in the indexes? Namespace inventory, snapshot history, schema. |
| read | Are queries healthy? Query latency, p99 overhead, Aerospike pool. |
| write | Are writes flowing? Pipelines, embed pools, claim/heartbeat state. |
| cost | Where is spend going? AWS + Turbopuffer cost lines stacked over time. |
| observe | Catalog of every metric the gateway exports, grouped by family. |
Console
The first view a new operator opens. Two stripes:
- At a glance: single-number cards for queries/s, indexed rows/s, fetch p99, cache hit ratio, error budget burn. Each card links into the matching read / write / observe panel.
- Activity log: newest-first stream backed by /v2/activity/snapshots and the search-history endpoints. Filters are persisted in the URL so links survive a refresh.
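As a rough illustration of how filters can round-trip through a URL so that a refreshed or shared link restores the same view, here is a minimal sketch. The `/console` path and the filter names (`namespace`, `kind`) are hypothetical; the dashboard's actual parameter names are not documented here.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit


def filters_to_url(base_url: str, filters: dict[str, str]) -> str:
    """Serialize filters into the query string, sorted so links are stable."""
    query = urlencode(sorted(filters.items()))
    return f"{base_url}?{query}" if query else base_url


def filters_from_url(url: str) -> dict[str, str]:
    """Recover the filters from a refreshed or shared link."""
    return dict(parse_qsl(urlsplit(url).query))


# Hypothetical filter names, for illustration only.
link = filters_to_url("/console", {"namespace": "docs", "kind": "snapshot"})
restored = filters_from_url(link)
```

Sorting the keys keeps two links with the same filters byte-identical, which makes them easier to compare and cache.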
Data
The inventory view. Click a namespace to drill into:
- Schema and approximate row count proxied from Turbopuffer metadata.
- Recent snapshot SHAs with field histograms and skipped-field markers — see snapshots.
- The current freshness signals (stable_as_of, is_stable).
- The Index policy fields that govern the namespace (distanceMetric and the cache.warming.threads cap), read from the Index resource.
- A unified jobs panel covering snapshot, warm, and scan jobs (kind, id, status, progress, age) for the namespace.
Two operator actions live here:
- Trigger snapshot: materialize a snapshot for one field on demand (POST /v2/namespaces/{ns}/snapshots), picking the source (origin, auto, stored, cache).
- Delete namespace: DELETE /v2/namespaces/{ns}, behind a confirm dialog.
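The snapshot trigger can also be scripted against the gateway. The sketch below only builds the request and does not send it. The endpoint path and the four source values come from the list above; the JSON body field names (`field`, `source`), the `auto` default, and the base URL are assumptions, not the documented request schema.

```python
import json
from urllib.parse import quote
from urllib.request import Request

# Source values listed in this guide.
ALLOWED_SOURCES = {"origin", "auto", "stored", "cache"}


def build_snapshot_request(
    base_url: str, namespace: str, field: str, source: str = "auto"
) -> Request:
    """Build (but do not send) a POST to the snapshot endpoint."""
    if source not in ALLOWED_SOURCES:
        raise ValueError(f"unknown snapshot source: {source!r}")
    # Body shape is an assumption made for illustration.
    body = json.dumps({"field": field, "source": source}).encode("utf-8")
    url = f"{base_url}/v2/namespaces/{quote(namespace, safe='')}/snapshots"
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )
```

Passing the returned object to `urllib.request.urlopen` (with whatever credentials the gateway requires) would issue the call.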
This is where operators answer “did the last cutover land?” and “what shape is this namespace?” without leaving the dashboard.
Read
The operator's answer to “are queries healthy?”. It pulls from the layer_query_* histograms and the cache metric families:
- Query latency p50/p95/p99 over the window.
- Layer-side overhead (query_overhead_seconds) so the operator can see whether slowness is upstream or local.
- Cache hit ratio per namespace, computed from layer_cache_lookups_total.
- Aerospike pool depth and node state, which makes otherwise silent failures visible.
- Aerospike stop-writes, surfaced from layer_aerospike_op_duration_seconds{status="aerospike_stop_writes"}.
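To make the per-namespace hit ratio concrete, here is a minimal sketch of the arithmetic. It assumes the lookup counter can be split into hits and misses by a `result` label; that label name and its values are assumptions, since this guide only names the metric.

```python
from typing import Iterable, Optional


def hit_ratio_by_namespace(
    samples: Iterable[tuple[str, str, float]],
) -> dict[str, Optional[float]]:
    """Compute hits / lookups per namespace over one window.

    Each sample is (namespace, result, count_delta), where result is
    "hit" or "miss" (an assumed label, not a documented one).
    """
    totals: dict[str, tuple[float, float]] = {}
    for namespace, result, count in samples:
        hits, lookups = totals.get(namespace, (0.0, 0.0))
        if result == "hit":
            hits += count
        lookups += count
        totals[namespace] = (hits, lookups)
    # A namespace with no lookups has no meaningful ratio, so report None.
    return {
        ns: (hits / lookups if lookups else None)
        for ns, (hits, lookups) in totals.items()
    }
```

Returning `None` for an idle namespace avoids showing a misleading 0% on a card that simply had no traffic.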
Write
The pipeline operator view. Surfaces pending / in-flight / failed counts per pipeline and per UDF, the same numbers KEDA scales from. Click into a pipeline to see:
- Per-stage counts (pending, embedding, indexed, failed).
- Active claims with worker_id, lease expiry, heartbeat age.
- Embed pool size and the autoscaling rule attached.
- Reset / pause / resume controls for UDFs (mirrors of the /v2/udfs/{id}/{pause,resume,reset-failed} endpoints).
The infra sub-view leads with the compute pools defined in
InfraRules/default. These are the logical pools (name, kind, GPU type,
maxReplicasPerWorkload, selector/toleration summary) that pipelines and
UDFs select via spec.scaling.pool. Below them, the sub-view lists the
Karpenter NodePools that physically provision their nodes.
The write view is the first dashboard stop for PostgreSQL pressure. A
growing pending count with rising
layer_pg_query_duration_seconds{status="pg_error"} means the queue is
stalled at the indexing-state layer, not at Turbopuffer. Use the
failure-mode runbook before resizing or deleting any
queue state.
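The triage rule above can be written down as a small check. This is a sketch of the reasoning only, not the dashboard's own logic: the inputs are assumed to be time-ordered samples of the pending count and of the rate of pg_error observations, and "rising" is simplified to "last sample greater than first".

```python
from typing import Sequence


def is_rising(series: Sequence[float], min_delta: float = 0.0) -> bool:
    """True when the last sample exceeds the first by more than min_delta."""
    return len(series) >= 2 and series[-1] - series[0] > min_delta


def diagnose_write_stall(
    pending: Sequence[float], pg_error_rate: Sequence[float]
) -> str:
    """Classify a write backlog using the two signals named in this guide."""
    if not is_rising(pending):
        return "no_backlog_growth"
    if is_rising(pg_error_rate):
        # Growing queue plus rising PostgreSQL errors: the stall is in the
        # indexing-state layer, not at Turbopuffer.
        return "stalled_at_indexing_state"
    # The queue is growing for some other reason; look elsewhere first.
    return "investigate_elsewhere"
```

A real alert would likely use a slope over a longer window rather than two endpoints, but the decision order stays the same.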
Cost
Stacked-area chart driven by /v2/cost, /v2/cost/timeseries, and
/v2/cost/rate-card. Splits cost across AWS infrastructure lines (compute,
EBS, S3, NAT, ALB) computed from CloudWatch + AWS Pricing API and
Turbopuffer lines (storage, writes, queries) computed from usage metrics
× a code-resident rate card.
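The "usage metrics × rate card" step for the Turbopuffer lines amounts to a per-line multiplication. The sketch below shows that arithmetic only; the rates and units are made-up placeholders, not the values in the gateway's rate card.

```python
# Hypothetical rates per unit of usage; placeholders, not real pricing.
EXAMPLE_RATE_CARD = {"storage": 0.5, "writes": 2.0, "queries": 0.25}


def turbopuffer_cost_lines(
    usage: dict[str, float], rate_card: dict[str, float]
) -> dict[str, float]:
    """Multiply each usage figure by its rate; missing usage counts as zero."""
    return {line: usage.get(line, 0.0) * rate for line, rate in rate_card.items()}


# Example usage figures, also invented for illustration.
lines = turbopuffer_cost_lines(
    {"storage": 200.0, "writes": 3.0, "queries": 10.0}, EXAMPLE_RATE_CARD
)
total = sum(lines.values())
```

Keeping the rate card as plain data is what lets a picker re-run the same multiplication with different inputs to project a change before applying it.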
The instance picker uses the rate-card endpoint to project the impact of changing instance types before applying it. Per-namespace attribution is intentionally not modeled — this view is infra-level only.
Observe
The full metrics catalog, grouped by family (Turbopuffer ops, cache,
fetch, pipeline progress, resource saturation). Each metric expands into a
sparkline that runs the corresponding PromQL through
/v2/metrics/api/v1/query_range. This is the surface operators use when
they need to confirm a hypothesis about behavior without leaving the
dashboard for Grafana.
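For reference, a sparkline query against that endpoint could be assembled as below. The sketch assumes the gateway path forwards the standard Prometheus range-query parameters (`query`, `start`, `end`, `step`) unchanged; the base URL and the example PromQL expression are illustrative.

```python
from urllib.parse import urlencode


def query_range_url(
    base_url: str, promql: str, start: int, end: int, step_seconds: int
) -> str:
    """Build a range-query URL using Prometheus-style parameters."""
    params = urlencode(
        {"query": promql, "start": start, "end": end, "step": step_seconds}
    )
    return f"{base_url}/v2/metrics/api/v1/query_range?{params}"


# One hour of data at 60-second resolution (Unix timestamps).
url = query_range_url(
    "https://gateway.example",
    "rate(layer_cache_lookups_total[5m])",
    1700000000,
    1700003600,
    60,
)
```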
Operational notes
- Dashboard views should treat cache cold and upstream failures as separate operator states. A 503 cache_cold is recoverable on its own; a 502 from Turbopuffer is not.
- Customer workloads never receive the dashboard URL, only the gateway base URL and credentials (see Hosted access).
- The dashboard is intentionally read-mostly. Mutating actions (UDF pause, InfraRules or scaling edits) are gated through CRD apply or explicit confirm dialogs rather than inline controls.
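The first note, keeping cache-cold and upstream failures apart, could be expressed as a small classifier. This is a sketch under the assumption that cache_cold arrives as an error code alongside the 503 status; the exact response shape is not specified in this guide, and the state names returned here are hypothetical.

```python
from typing import Optional


def classify_failure(status: int, error_code: Optional[str] = None) -> str:
    """Map a failed gateway response to an operator state."""
    if status == 503 and error_code == "cache_cold":
        # Recoverable on its own once the cache warms.
        return "cache_cold"
    if status == 502:
        # Upstream (Turbopuffer) failure; will not clear by itself.
        return "upstream_failure"
    return "other"
```

A view built on this would show the first state as informational and the second as needing operator attention.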