Dashboard

The Layer dashboard is the operator surface that ships in-cluster alongside the gateway. It reads from the same gateway API customers do — no direct database, Aerospike, or VictoriaMetrics access — and surfaces the views that justify Layer’s role as the operating layer between an application and its vector store.

Deployments on EKS reach the dashboard at https://dashboard.hevlayer.com. Self-hosted installs expose it via the layer-dashboard Service.

Layout

The dashboard groups everything operators care about into six tabs:

| Tab | What it answers |
| --- | --- |
| console | What is happening right now? At-a-glance gauges + activity log. |
| data | What is in the indexes? Namespace inventory, snapshot history, schema. |
| read | Are queries healthy? Query latency, p99 overhead, Aerospike pool. |
| write | Are writes flowing? Pipelines, embed pools, claim/heartbeat state. |
| cost | Where is spend going? AWS + Turbopuffer cost lines stacked over time. |
| observe | Catalog of every metric the gateway exports, grouped by family. |

Console

The first view a new operator opens. Two stripes:

  • At a glance — single-number cards for queries/s, indexed rows/s, fetch p99, cache hit ratio, error budget burn. Each card links into the matching read / write / observe panel.
  • Activity log — newest-first stream backed by /v2/activity/snapshots and the search-history endpoints. Filters are persisted in the URL so links survive a refresh.
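As an illustration of URL-persisted filters, the sketch below writes filter values into a link's query string and reads them back. The /console path and the filter names (namespace, kind) are hypothetical; the dashboard's real parameter names are not documented here.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_filters(url: str, filters: dict) -> str:
    """Return url with the given filters merged into its query string.

    Keys are sorted so the same filter set always yields the same link.
    """
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query))
    query.update(filters)
    return urlunsplit(parts._replace(query=urlencode(sorted(query.items()))))


def read_filters(url: str) -> dict:
    """Recover the filter set from a shared or refreshed link."""
    return dict(parse_qsl(urlsplit(url).query))


# Hypothetical path and filter names, for illustration only.
link = with_filters(
    "https://dashboard.hevlayer.com/console",
    {"namespace": "docs", "kind": "snapshot"},
)
print(link)
print(read_filters(link))
```

Because the state lives entirely in the URL, a refresh or a pasted link restores the same filtered view without any server-side session.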

Data

The inventory view. Click a namespace to drill into:

  • Schema and approximate row count proxied from Turbopuffer metadata.
  • Recent snapshot SHAs with field histograms and skipped-field markers — see snapshots.
  • The current freshness signals (stable_as_of, is_stable).
  • The Index policy fields that govern the namespace — distanceMetric and the cache.warming.threads cap — read from the Index resource.
  • A unified jobs panel covering snapshot, warm, and scan jobs (kind, id, status, progress, age) for the namespace.

Two operator actions live here:

  • Trigger snapshot — materialize a snapshot for one field on demand (POST /v2/namespaces/{ns}/snapshots), picking the source (origin, auto, stored, cache).
  • Delete namespace (DELETE /v2/namespaces/{ns}), behind a confirm dialog.

This is where operators answer “did the last cutover land?” and “what shape is this namespace?” without leaving the dashboard.
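The Trigger snapshot action corresponds to a single POST against the gateway. A minimal sketch of building that request follows. The path and the four source values come from the description above; the JSON body field names (field, source), the bearer-token header, and the example gateway host are assumptions.

```python
import json
from urllib.parse import quote
from urllib.request import Request

# Source values listed for the Trigger snapshot action.
ALLOWED_SOURCES = {"origin", "auto", "stored", "cache"}


def build_snapshot_request(base_url: str, namespace: str, field: str,
                           source: str = "auto", token: str = "") -> Request:
    """Build (but do not send) a snapshot-trigger request for one field."""
    if source not in ALLOWED_SOURCES:
        raise ValueError(f"unknown snapshot source: {source!r}")
    url = f"{base_url.rstrip('/')}/v2/namespaces/{quote(namespace, safe='')}/snapshots"
    # Assumed body shape; the real request schema may differ.
    body = json.dumps({"field": field, "source": source}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"  # assumed auth scheme
    return Request(url, data=body, method="POST", headers=headers)


req = build_snapshot_request("https://gateway.example.com", "docs", "embedding", "cache")
print(req.get_method(), req.full_url)
```

The request is only constructed here; sending it with urllib.request.urlopen(req) would need a reachable gateway and valid credentials.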

Read

Answers the operator question “are queries healthy?”. Pulls from the layer_query_* histograms and the cache metric families:

  • Query latency p50/p95/p99 over the window.
  • Layer-side overhead (query_overhead_seconds) so the operator can see whether slowness is upstream or local.
  • Cache hit ratio per namespace, computed from layer_cache_lookups_total.
  • Aerospike pool depth and node state, which makes otherwise silent failures visible.
  • Aerospike stop-writes, surfaced from layer_aerospike_op_duration_seconds{status="aerospike_stop_writes"}.
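The cache hit ratio above is a ratio of counter increases over the window. The sketch below shows that arithmetic on two samples of layer_cache_lookups_total for one namespace. It assumes the counter is split by a result label with the values hit and miss; the actual label names are not specified here.

```python
from typing import Optional


def counter_increase(start: float, end: float) -> float:
    """Increase of a monotonic counter; a drop is treated as a reset to zero."""
    return end - start if end >= start else end


def hit_ratio(start: dict, end: dict) -> Optional[float]:
    """Hit ratio over a window, from counter samples keyed by result.

    Returns None when there were no lookups, so an idle namespace is not
    reported as a 0% hit ratio.
    """
    increases = {k: counter_increase(start.get(k, 0.0), v) for k, v in end.items()}
    total = sum(increases.values())
    if total <= 0:
        return None
    return increases.get("hit", 0.0) / total  # "hit" is an assumed label value


window_start = {"hit": 100.0, "miss": 50.0}
window_end = {"hit": 190.0, "miss": 60.0}
print(hit_ratio(window_start, window_end))  # 90 hits out of 100 lookups
```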

Write

The pipeline operator view. Surfaces pending / in-flight / failed counts per pipeline and per UDF, the same numbers KEDA scales from. Click into a pipeline to see:

  • Per-stage counts (pending, embedding, indexed, failed).
  • Active claims with worker_id, lease expiry, heartbeat age.
  • Embed pool size and the autoscaling rule attached.
  • Reset / pause / resume controls for UDFs (mirrors of the /v2/udfs/{id}/{pause,resume,reset-failed} endpoints).
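The claim columns (worker_id, lease expiry, heartbeat age) can be read as a small state check. The following is a sketch under assumptions: the Claim type, the status names, and the 30-second heartbeat threshold are all hypothetical and are not the gateway's actual rules.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Claim:
    worker_id: str
    lease_expires_at: datetime
    last_heartbeat_at: datetime


def claim_status(claim: Claim, now: datetime,
                 max_heartbeat_age: timedelta = timedelta(seconds=30)) -> str:
    """Classify a claim as expired, stale, or healthy at time `now`."""
    if now >= claim.lease_expires_at:
        return "expired"  # lease ran out; the work is presumably reclaimable
    if now - claim.last_heartbeat_at > max_heartbeat_age:
        return "stale"    # lease still valid, but the worker has gone quiet
    return "healthy"


now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
claim = Claim(
    worker_id="worker-1",
    lease_expires_at=now + timedelta(minutes=2),
    last_heartbeat_at=now - timedelta(seconds=45),
)
print(claim.worker_id, claim_status(claim, now))
```

A stale claim with time left on its lease is the case worth watching: the lease has not expired yet, but the heartbeat age suggests the worker may already be gone.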

The infra sub-view leads with the compute pools defined in InfraRules/default — the logical pools (name, kind, GPU type, maxReplicasPerWorkload, selector/toleration summary) that pipelines and UDFs select via spec.scaling.pool — above the Karpenter NodePools that physically provision their nodes.

The write view is the first dashboard stop for PostgreSQL pressure. A growing pending count with rising layer_pg_query_duration_seconds{status="pg_error"} means the queue is stalled at the indexing-state layer, not at Turbopuffer. Use the failure-mode runbook before resizing or deleting any queue state.
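The stall signal described above pairs two trends. A rough sketch of that reading follows, assuming the pending count and the rate of layer_pg_query_duration_seconds{status="pg_error"} observations have already been sampled into lists. The function names and returned labels are illustrative only.

```python
from typing import Sequence


def rising(samples: Sequence[float]) -> bool:
    """True when the last sample is higher than the first (needs two or more)."""
    return len(samples) >= 2 and samples[-1] > samples[0]


def diagnose_write_path(pending: Sequence[float],
                        pg_error_rate: Sequence[float]) -> str:
    """Rough triage of the write path from two time series over one window."""
    if rising(pending) and rising(pg_error_rate):
        # Backlog grows while PostgreSQL errors climb: the stall sits at the
        # indexing-state layer rather than at Turbopuffer.
        return "indexing_state_stall"
    if rising(pending):
        # Backlog grows without PostgreSQL errors: look elsewhere first.
        return "backlog_growing"
    return "ok"


print(diagnose_write_path([120, 180, 260], [0.0, 0.4, 1.1]))
```

Comparing only the first and last samples is deliberately crude; a real check would likely smooth the series or require the trend to hold for several intervals.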

Cost

Stacked-area chart driven by /v2/cost, /v2/cost/timeseries, and /v2/cost/rate-card. Splits cost into two groups: AWS infrastructure lines (compute, EBS, S3, NAT, ALB), computed from CloudWatch and the AWS Pricing API, and Turbopuffer lines (storage, writes, queries), computed from usage metrics × a code-resident rate card.

The instance picker uses the rate-card endpoint to project the impact of changing instance types before applying it. Per-namespace attribution is intentionally not modeled — this view is infra-level only.
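The Turbopuffer lines reduce to usage multiplied by a rate. The sketch below shows that arithmetic. The usage units and every rate in it are placeholder numbers chosen for easy checking; they are not real Turbopuffer prices and not the contents of the gateway's rate card.

```python
TURBOPUFFER_LINES = ("storage", "writes", "queries")


def turbopuffer_cost(usage: dict, rate_card: dict) -> dict:
    """Per-line cost as usage x rate; both inputs are keyed by cost line."""
    return {line: usage[line] * rate_card[line] for line in TURBOPUFFER_LINES}


# Placeholder values for illustration only (units and prices are assumptions).
usage = {"storage": 200.0, "writes": 10.0, "queries": 4.0}
rate_card = {"storage": 0.5, "writes": 2.0, "queries": 1.5}

lines = turbopuffer_cost(usage, rate_card)
print(lines)
print(sum(lines.values()))
```

Keeping the rate card as plain data next to the computation mirrors the "code-resident" description: changing a rate is a code change, not a lookup against an external pricing service.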

Observe

The full metrics catalog, grouped by family (Turbopuffer ops, cache, fetch, pipeline progress, resource saturation). Each metric expands into a sparkline that runs the corresponding PromQL through /v2/metrics/api/v1/query_range. This is the surface operators use when they need to confirm a hypothesis about behavior without leaving the dashboard for Grafana.
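A sparkline request of this kind can be sketched as a URL builder. The example assumes the proxied endpoint accepts the standard Prometheus range-query parameters (query, start, end, step), which the /api/v1/query_range path suggests but this page does not confirm. The gateway host is a placeholder.

```python
from urllib.parse import urlencode


def query_range_url(base_url: str, promql: str,
                    start: int, end: int, step_seconds: int) -> str:
    """Build a range-query URL against the gateway's metrics proxy.

    start and end are Unix timestamps in seconds; step_seconds is the
    resolution of the returned series.
    """
    params = urlencode({
        "query": promql,
        "start": start,
        "end": end,
        "step": step_seconds,
    })
    return f"{base_url.rstrip('/')}/v2/metrics/api/v1/query_range?{params}"


url = query_range_url(
    "https://gateway.example.com",              # placeholder host
    "sum(rate(layer_cache_lookups_total[5m]))",
    start=1700000000,
    end=1700003600,
    step_seconds=60,
)
print(url)
```

One hour at a 60-second step gives about 60 points, which is roughly the density a sparkline can usefully show.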

Operational notes

  • Dashboard views should treat cache cold and upstream failures as separate operator states. A 503 cache_cold is recoverable on its own; a 502 from Turbopuffer is not.
  • Customer workloads never receive the dashboard URL — only the gateway base URL and credentials (see Hosted access).
  • The dashboard is intentionally read-mostly. Mutating actions (UDF pause, InfraRules or scaling edits) are gated through CRD apply or explicit confirm dialogs rather than inline controls.
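
The first note, keeping cache-cold and upstream failures apart, can be sketched as a small classifier. It assumes the gateway returns a machine-readable error code such as cache_cold alongside the HTTP status; the parameter and state names are hypothetical.

```python
from typing import Optional


def operator_state(status: int, error_code: Optional[str] = None) -> str:
    """Map a gateway response to the operator state a view should show."""
    if 200 <= status < 300:
        return "ok"
    if status == 503 and error_code == "cache_cold":
        # Recoverable on its own: the cache is still warming.
        return "cache_cold"
    if status == 502:
        # Upstream (Turbopuffer) failure: will not clear by waiting on the cache.
        return "upstream_failure"
    return "other_error"


for status, code in [(200, None), (503, "cache_cold"), (502, None), (503, None)]:
    print(status, code, operator_state(status, code))
```

A 503 without the cache_cold code deliberately falls through to other_error, so a view never presents an unexplained outage as a warming cache.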