Start with install notes or jump straight into the API.

Overview

Introduction

Layer provides a set of drop-in enhancements to your favorite retrieval systems. Layer lets you scale your own compute over multi-stage pipelines, reason about the state of your index, observe clickstream, track cost, and more.

╔════════════╗      ╔════════════╗          ╔═══ vector store ════════════════════════╗
║   layer    ║░     ║   layer    ║░         ║                                         ║░
║   client   ║◀────▶║  gateway   ║◀──API───▶║  ┏━━━━━━━━━━━━━━┓     ┏━━━━━━━━━━━━━━┓  ║░
║            ║░     ║            ║░         ║  ┃    BM 25     ┃     ┃  KNN / ANN   ┃  ║░
╚════════════╝░     ╚═════╤══════╝░         ║  ┃              ┃     ┃              ┃  ║░
 ░░░░░░░░░░░░░░      ░░░░░│░░░░░░░░         ║  ┗━━━━━━━━━━━━━━┛     ┗━━━━━━━━━━━━━━┛  ║░
                          │                 ║                                         ║░
╔════════════╗      ╔═════▼══════╗          ║                                         ║░
║   layer    ║░     ║   layer    ║░         ║                                         ║░
║ dashboard  ║◀────▶║  operator  ║◀──API───▶║                                         ║░
║            ║░     ║            ║░         ║                                         ║░
╚════════════╝░     ╚═════╤══════╝░         ╚═════════════════════════════════════════╝░
 ░░░░░░░░░░░░░░      ░░░░░│░░░░░░░░          ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
                          ▼
                   ┏━━━━━━━━━━━━━━┓
                   ┃ Object Store ┃
                   ┃ Bucket (S3)  ┃
                   ┗━━━━━━━━━━━━━━┛

You run two server components in your own cluster: a Rust gateway and a Kubernetes operator. The gateway is a transparent proxy in front of Turbopuffer. It extends native clients with fetch, scans, snapshots, result count, and operator-facing semantics around the cache, write path, and pipelines — you swap in Layer’s drop-in client and change nothing else. It also drives the function runtime: discovering UDF work, leasing it to worker pools, retrying, and writing results back, with KEDA scaling each pool to zero between bursts.

In addition to a set of wire-compatible clients, Layer also ships an optional GUI dashboard. The dashboard manages cluster configuration through CRDs; all other state is persisted in object storage (S3). No durable state lives in a Layer process, so the compute tier is stateless and fully elastic.

Because indexing is bursty — especially GPU-bound work — our Terraform installs Karpenter as a cluster autoscaler to provision and scale the nodes Layer’s compute runs on. The remaining backing services are the document cache, the indexing-state store, and the metrics store. Every component Layer runs alongside is open source:

  • Karpenter — cluster autoscaler that provisions and scales nodes for Layer’s bursty, GPU-bound compute (Apache-2.0).
  • Aerospike — NVMe-backed ephemeral document cache (AGPL-3.0).
  • PostgreSQL — indexing-state store for the pipeline and embed queue (PostgreSQL License).
  • VictoriaMetrics — metrics store (Apache-2.0).

To get started, see the install guide. For more technical detail, see Concepts, Guarantees, and Tradeoffs.