Start with install notes or jump straight into the API.

Overview

Concepts

Control loops

Layer uses a control loop as a core primitive for managing your indexes. It reconciles index state against metrics emitted by the search system, which is how Layer applies row-level transformations (UDFs) and keeps an index’s stable view current.
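A control loop of this kind can be read as an observe, compare, act cycle. The sketch below is a minimal illustration, not Layer's implementation: `IndexState`, `reconcile`, `run_until_converged`, and the field names are hypothetical, and the metrics from the search system are passed in as plain integers.

```python
from dataclasses import dataclass


@dataclass
class IndexState:
    """Hypothetical view of one index as the loop sees it."""
    rows_indexed: int      # rows the search system reports as indexed
    rows_transformed: int  # rows that already carry the UDF-derived attribute


def reconcile(state: IndexState, batch_size: int = 100) -> int:
    """One pass: compare observed state with the goal (every indexed row
    transformed) and return how many rows to process in this pass."""
    backlog = max(state.rows_indexed - state.rows_transformed, 0)
    return min(backlog, batch_size)


def run_until_converged(state: IndexState, batch_size: int = 100) -> int:
    """Repeat reconcile passes until no work is left; return the pass count."""
    passes = 0
    while (todo := reconcile(state, batch_size)) > 0:
        # Stand-in for actually applying the UDF to `todo` rows.
        state.rows_transformed += todo
        passes += 1
    return passes


state = IndexState(rows_indexed=250, rows_transformed=0)
passes = run_until_converged(state)
```

Each pass depends only on the observed state, so the loop can be interrupted and restarted without bookkeeping of its own.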

Related: UDFs, snapshots, stable watermark.

Kubernetes autoscaling

Because Layer is stateless, you can autoscale every tier independently. Karpenter handles node-level scaling, and KEDA scales pods against signals from an embedded PostgreSQL queue. The data in that queue drives scaling decisions only, and any state it holds is recoverable.
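To make queue-driven scaling concrete, the sketch below applies the average-value rule that the Kubernetes Horizontal Pod Autoscaler uses for external metrics such as those KEDA supplies: roughly, desired replicas = ceil(metric / target per pod), clamped to the configured bounds. It ignores tolerance and stabilization windows, and the function name, parameters, and numbers are illustrative assumptions rather than Layer's configuration.

```python
import math


def desired_replicas(queue_depth: int, target_per_pod: int,
                     min_replicas: int = 0, max_replicas: int = 10) -> int:
    """Approximate the replica count for a queue-depth signal.

    queue_depth:    pending items reported by the queue (assumed signal)
    target_per_pod: items one pod is expected to handle (assumed target)
    """
    if target_per_pod <= 0:
        raise ValueError("target_per_pod must be positive")
    wanted = math.ceil(queue_depth / target_per_pod)
    # Clamp to the configured bounds.
    return max(min_replicas, min(wanted, max_replicas))


small = desired_replicas(45, 20)     # ceil(45 / 20)
idle = desired_replicas(0, 20)       # nothing pending
capped = desired_replicas(1000, 20)  # limited by max_replicas
```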

Gateway enhancements

Where helpful, the gateway extends your search system with common query patterns and filtering primitives. Layer’s enhancements use reserved _hevlayer_* attributes; changing the schema of those attributes voids Layer’s guarantees, though behavior should degrade gracefully. All functionality is exposed through a single client, so applications can route every call through the gateway. Layer works best when traffic flows through it consistently, even for requests that need no extra behavior.

Scatter/gather

Layer can partition a single namespace into hash buckets — shards — by assigning each row a reserved _hevlayer_shard attribute (xxh64 of its id, modulo the shard count). The gateway then scatters a query to every bucket in parallel, one _hevlayer_shard-filtered query per shard, and gathers the results: it merges and re-ranks the combined rows down to your requested top_k before returning them. Sharding stays invisible to the client — you issue one query and get one ranked result set. The same scatter/gather path backs result count, scans, and UDF discovery scans.
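The shard assignment and the scatter/gather merge described above can be sketched as follows. This is an illustration under stated assumptions, not the gateway's code: Layer hashes ids with xxh64, but the sketch substitutes BLAKE2b from the standard library so it runs without extra packages, which means the bucket numbers will differ from Layer's. `shard_of`, `query_shard`, `scatter_gather`, the in-memory `rows` list, and the precomputed `score` field are all hypothetical, and scores are assumed to be comparable across shards.

```python
import hashlib
import heapq
from concurrent.futures import ThreadPoolExecutor

SHARD_COUNT = 4  # illustrative shard count


def shard_of(doc_id: str, shard_count: int = SHARD_COUNT) -> int:
    """Stand-in for xxh64(id) % shard_count, using a 64-bit BLAKE2b digest."""
    digest = hashlib.blake2b(doc_id.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big") % shard_count


def query_shard(rows, shard, top_k):
    """One shard-filtered query: the best top_k rows within a single bucket."""
    in_shard = [r for r in rows if r["_hevlayer_shard"] == shard]
    return heapq.nlargest(top_k, in_shard, key=lambda r: r["score"])


def scatter_gather(rows, top_k, shard_count=SHARD_COUNT):
    """Scatter one query per shard in parallel, then merge and re-rank."""
    with ThreadPoolExecutor(max_workers=shard_count) as pool:
        partials = pool.map(lambda s: query_shard(rows, s, top_k),
                            range(shard_count))
        merged = [row for part in partials for row in part]
    return heapq.nlargest(top_k, merged, key=lambda r: r["score"])


rows = [
    {"id": f"doc-{i}", "score": float(i), "_hevlayer_shard": shard_of(f"doc-{i}")}
    for i in range(20)
]
top = scatter_gather(rows, top_k=3)
```

Asking each shard for its own top_k is enough to recover the global top_k, because any row in the global result must also rank within the top_k of its own bucket.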

Pull-through cache

Document reads are served by a pull-through cache: the gateway checks the NVMe-backed cache (Aerospike) first, and on a miss reads through to Turbopuffer — or S3 for snapshots — returns the row, and backfills the cache best-effort. The cache is a read accelerator, not a hard dependency: if it is unavailable, reads fall through to origin and still succeed. One logical cache serves every read path, with different uses (document fetch, snapshot field-values) separated by Aerospike set.
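The read path above follows the usual pull-through pattern, sketched here with in-memory stand-ins. Nothing in this block is Layer's API: `DictCache`, `CacheUnavailable`, and `read_document` are hypothetical names, a plain dict plays the role of the origin, and the cache key of namespace plus document id is an assumption.

```python
class CacheUnavailable(Exception):
    """Raised by the hypothetical cache client when the cache is unreachable."""


class DictCache:
    """In-memory stand-in for the cache, with a switch to simulate an outage."""

    def __init__(self, available=True):
        self.data = {}
        self.available = available

    def get(self, key):
        if not self.available:
            raise CacheUnavailable()
        return self.data.get(key)

    def put(self, key, value):
        if not self.available:
            raise CacheUnavailable()
        self.data[key] = value


def read_document(cache, origin, namespace, doc_id):
    """Check the cache first; on a miss or outage, read through to origin."""
    key = (namespace, doc_id)
    try:
        hit = cache.get(key)
        if hit is not None:
            return hit
    except CacheUnavailable:
        pass  # the cache is an accelerator, so fall through to origin
    row = origin.get(key)  # stand-in for the origin read
    if row is not None:
        try:
            cache.put(key, row)  # best-effort backfill
        except CacheUnavailable:
            pass  # a failed backfill never fails the read
    return row


origin = {("docs", "42"): {"id": "42", "title": "hello"}}
cache = DictCache()
first = read_document(cache, origin, "docs", "42")   # miss, then backfill
second = read_document(cache, origin, "docs", "42")  # served from the cache
down = DictCache(available=False)
third = read_document(down, origin, "docs", "42")    # cache outage, still succeeds
```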

Observability as code

Layer’s observability contract is defined in the service itself. The gateway emits a self-describing catalog of every metric it exports — names, labels, and example PromQL — so the metric surface is code, not hand-maintained dashboard config. The bundled dashboard and any external automation read from that catalog, and an embedded, Prometheus-compatible VictoriaMetrics instance lets you run PromQL against the series directly or bring your own monitoring stack.

Glossary

| Concept | Current meaning |
| --- | --- |
| Namespace | A Turbopuffer namespace addressed through /v2/namespaces/{namespace}. |
| Document | A row id plus attributes, and optionally a vector when writing/searching. |
| Cache | NVMe-backed records keyed by namespace and document id, plus cache sets for pipeline chunks and snapshots. |
| Stable watermark | Epoch-ms cut tracked by the consistency watcher when Turbopuffer index status is up-to-date. |
| Pipeline | A PostgreSQL-backed state machine for CPU extraction and GPU embedding work. |
| Snapshot | A content-addressed S3 facet histogram written after a namespace is observed stable. |
| Facet listing | The distinct values for a configured snapshot field, surfaced as fields[].values[].v. |
| Facet count | The document count for a configured snapshot field value, surfaced as fields[].values[].n. |
| Result count | A synchronous ranked-query count over FTS or vector query input. |
| Scan | A filter scan that returns matching IDs asynchronously or a matching row count synchronously. |
| UDF | A stateless container the gateway calls once per row of an index to compute a derived attribute. |
| Gateway | The Rust proxy fronting Turbopuffer that serves the compatible API plus cache, scans, snapshots, pipelines, and the UDF runtime. |
| Operator | The Kubernetes controller that reconciles Layer’s CRDs: functions, pipelines, scaling, and cluster config. |
| Shard | A hash bucket within a single namespace. Each row carries a reserved _hevlayer_shard value (xxh64 of its id, modulo the shard count) so the gateway can scatter/gather a query across buckets. |
| CRD | Custom Resource Definition: the Kubernetes-native resources the operator reconciles, namely functions, pipelines, scaling, and indexes. |
| PromQL | The Prometheus query language. The gateway proxies it to the embedded VictoriaMetrics so you can query metrics without a separate scraper. |