Start with install notes or jump straight into the API.

API

Introduction

Layer matches the Turbopuffer wire contract, so existing clients keep working when you point them at the gateway. Where a route has an upstream equivalent, this site documents only what Layer adds, not the upstream behavior itself. Follow the Upstream docs link on each page for the underlying request/response shape.

Install

The Python SDK is generated from apps/layer-gateway/openapi.yaml and ships the typed async client (AsyncHevlayer) and the layer CLI in one package.

pip install hevlayer

Requires Python 3.11+. Both the SDK and the CLI read connection info from environment variables:

| Variable | Purpose |
| --- | --- |
| LAYER_GATEWAY_URL | Base URL of the gateway. |
| LAYER_GATEWAY_API_KEY | API key sent on every request. |
| TURBOPUFFER_API_KEY | Optional direct fallback key for Turbopuffer-compatible SDK calls when the gateway is unreachable. |
| TURBOPUFFER_API_URL | Optional direct fallback base URL; defaults to https://aws-us-east-1.turbopuffer.com. |
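As a quick pre-flight check, the variables listed above can be collected and validated before any client is constructed. This is a minimal sketch using only the standard library; `ConnectionSettings` and `load_connection_settings` are hypothetical names, not part of the hevlayer package, and treating the two `LAYER_GATEWAY_*` variables as required is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Default taken from the variable description above.
DEFAULT_TURBOPUFFER_API_URL = "https://aws-us-east-1.turbopuffer.com"


@dataclass(frozen=True)
class ConnectionSettings:
    """Hypothetical container; not an SDK type."""
    gateway_url: str
    gateway_api_key: str
    turbopuffer_api_key: Optional[str]
    turbopuffer_api_url: str


def load_connection_settings(env: Mapping[str, str] = os.environ) -> ConnectionSettings:
    """Read the four documented variables from a mapping (os.environ by default)."""
    # Assumption: the gateway URL and key are mandatory; the fallback pair is optional.
    required = ("LAYER_GATEWAY_URL", "LAYER_GATEWAY_API_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("missing required variables: " + ", ".join(missing))
    return ConnectionSettings(
        gateway_url=env["LAYER_GATEWAY_URL"],
        gateway_api_key=env["LAYER_GATEWAY_API_KEY"],
        turbopuffer_api_key=env.get("TURBOPUFFER_API_KEY") or None,
        turbopuffer_api_url=env.get("TURBOPUFFER_API_URL") or DEFAULT_TURBOPUFFER_API_URL,
    )
```

Passing a plain dict instead of `os.environ` makes the helper easy to exercise in tests without touching the real environment.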

The CLI surface is documented separately under Layer CLI. Languages beyond Python are generated on demand through the SDK harness; reach out if you need one that isn’t shipped yet.

Client fall-through

The Python SDK can fall through to Turbopuffer directly when the gateway is unreachable. The fallback is limited to calls that can be satisfied without Layer state: simple vector queries and raw Turbopuffer-compatible methods such as write_namespace, query_turbopuffer_namespace, and namespace schema/listing calls. When a fallback occurs, the SDK logs a client-side warning and, if with_perf=True, sets LayerPerf.fallback to turbopuffer_direct.

Fetches, warm jobs, pipelines, UDFs, nearest_to_id queries, and other Layer-only workflows still fail fast because they depend on gateway-owned cache, queue, history, or consistency state. Set fallback_to_turbopuffer=False on AsyncHevlayer to disable direct fallback.
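The decision described above can be pictured as a small wrapper around two transports. The following is an illustrative sketch only, not the SDK's implementation: the operation labels, the `call_with_fallback` helper, and the use of `ConnectionError` to model an unreachable gateway are all assumptions. Only the `fallback_to_turbopuffer` flag name and the `turbopuffer_direct` label come from the text.

```python
import asyncio
import logging
from typing import Awaitable, Callable, Optional, Tuple, TypeVar

T = TypeVar("T")
log = logging.getLogger("layer_fallback_sketch")

# Hypothetical labels for calls that need no Layer state (see the list above).
FALLBACK_ELIGIBLE = {
    "vector_query",
    "write_namespace",
    "query_turbopuffer_namespace",
    "namespace_schema",
    "list_namespaces",
}


async def call_with_fallback(
    operation: str,
    via_gateway: Callable[[], Awaitable[T]],
    via_turbopuffer: Callable[[], Awaitable[T]],
    fallback_to_turbopuffer: bool = True,
) -> Tuple[T, Optional[str]]:
    """Try the gateway first; return (result, fallback label or None)."""
    try:
        return await via_gateway(), None
    except ConnectionError:
        # Layer-only workflows fail fast: re-raise instead of going direct.
        if not fallback_to_turbopuffer or operation not in FALLBACK_ELIGIBLE:
            raise
        log.warning("gateway unreachable; sending %s to Turbopuffer directly", operation)
        return await via_turbopuffer(), "turbopuffer_direct"
```

In this sketch a fetch or warm job raises the original connection error even with fallback enabled, which mirrors the fail-fast behavior described for Layer-only workflows.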

Enhancements to upstream routes

Each of the routes below is wire-compatible with Turbopuffer. The body of each section describes only what Layer overlays on top.

Write — POST /v2/namespaces/{ns} and PATCH /v2/namespaces/{ns}

  • Best-effort NVMe cache mirror before the upstream write.
  • Server-stamped _hevlayer_upserted_at on every upsert and patch, which powers the consistency watermark on the query path.
  • _hevlayer_* attributes are reserved — writes to them are rejected.

Page: Write.
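Because the `_hevlayer_` prefix is reserved, a client may want to scan its documents before sending a write. The helper below is a hypothetical client-side check, not a gateway or SDK function, and it assumes documents are plain dicts keyed by attribute name.

```python
from typing import Any, Iterable, List, Mapping, Tuple

RESERVED_PREFIX = "_hevlayer_"


def reserved_attributes(documents: Iterable[Mapping[str, Any]]) -> List[Tuple[int, str]]:
    """Return (document index, attribute name) for every reserved attribute found."""
    return [
        (index, name)
        for index, doc in enumerate(documents)
        for name in doc
        if name.startswith(RESERVED_PREFIX)
    ]


rows = [
    {"id": 1, "title": "intro"},
    {"id": 2, "title": "setup", "_hevlayer_upserted_at": 1700000000000},
]
offenders = reserved_attributes(rows)
# offenders now lists the second document's reserved attribute.
```

An empty result means no document in the batch touches the reserved namespace.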

Query — POST /v2/namespaces/{ns}/query

  • Strong-consistent reads via an injected _hevlayer_upserted_at <= watermark predicate while the upstream index is updating.
  • One-shot 429 retry with the watermark filter forced on, for queries that race a write storm.
  • stable_as_of echoed on query responses so callers can correlate freshness across reads; it is omitted only on a cold-start gateway that has not yet observed a stable poll.

Page: Query.
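To make the watermark predicate concrete, the sketch below shows one way such a filter could be combined with a caller's existing filter. It is an illustration, not the gateway's code: it assumes Turbopuffer's array-style filter encoding (`[attribute, operator, value]` and `["And", [...]]`) and a `filters` key in the query body, and `with_watermark` is a hypothetical name.

```python
import copy
from typing import Any, Dict


def with_watermark(query: Dict[str, Any], watermark_ms: int) -> Dict[str, Any]:
    """Return a copy of the query restricted to rows stamped at or before the watermark."""
    predicate = ["_hevlayer_upserted_at", "Lte", watermark_ms]
    out = copy.deepcopy(query)  # leave the caller's query untouched
    existing = out.get("filters")
    # AND the predicate with any caller-supplied filter; otherwise use it alone.
    out["filters"] = predicate if existing is None else ["And", [existing, predicate]]
    return out


query = {
    "rank_by": ["vector", "ANN", [0.1, 0.2]],
    "top_k": 5,
    "filters": ["category", "Eq", "docs"],
}
bounded = with_watermark(query, 1700000000000)
```

Rows stamped after the watermark are excluded, so every read in a series that uses the same watermark is bounded by the same point in time.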

Metadata — GET /v2/namespaces/{ns}/metadata

  • Proxied upstream verbatim, then enriched with a layer block containing stable_as_of and is_stable.

Page: Namespace metadata.
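A caller can read the enrichment without disturbing the upstream fields. The sketch below assumes the block sits under a top-level `layer` key of the decoded JSON body; the sample payload and the `freshness` helper are hypothetical, and the exact type of `stable_as_of` is not specified here.

```python
from typing import Any, Mapping, Tuple


def freshness(metadata: Mapping[str, Any]) -> Tuple[Any, bool]:
    """Return (stable_as_of, is_stable); tolerate a response without the layer block."""
    layer = metadata.get("layer") or {}
    return layer.get("stable_as_of"), bool(layer.get("is_stable", False))


# Hypothetical response body: one illustrative upstream field plus the layer block.
sample = {
    "approx_row_count": 1200,
    "layer": {"stable_as_of": 1700000000000, "is_stable": True},
}
stable_as_of, is_stable = freshness(sample)
```

Returning `(None, False)` when the block is absent lets the same code handle a response that came straight from the upstream.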

Cache warm hint — GET /v1/namespaces/{ns}/hint_cache_warm

  • Forwards the hint upstream, then runs Layer-side warm steps: a warm job to backfill the NVMe cache from origin, plus a mirror of the latest S3 snapshot body into NVMe.
  • Each step is independently toggleable per request.

Page: Warm cache.

Cross-cutting conventions

These apply to every endpoint Layer proxies, whether the route is upstream-compatible or Layer-only.

  • Server-stamped _hevlayer_upserted_at. Every upsert and patch is stamped with a server-side epoch-ms watermark. Caller-supplied values are silently overwritten.
  • _hevlayer_* reserved. Document attributes prefixed with _hevlayer_ are reserved for the proxy layer. Writing to them is a validation error; reading them is fine when explicitly requested.
  • Hard vs soft failures. Turbopuffer write/query failures are hard failures and surface as 5xx. NVMe cache failures are soft and never block the response.
  • x-layer-cache header. Fetch responses include hit, miss, or miss-on-error so callers can distinguish a cold cache from an outage.
  • Consistency hints. Reads that go through the watermark path include stable_as_of; queries omit it only on a cold-start gateway that has not yet observed a stable poll.
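The x-layer-cache values can be handled with a small classifier on the client side. This is a sketch under assumptions: `cache_status` is a hypothetical helper, headers are assumed to arrive as a simple name-to-value mapping, and the one-line meanings are a reading of the conventions above rather than official definitions.

```python
from typing import Mapping, Optional

# Interpretations paraphrase the conventions above; they are not official wording.
CACHE_MEANINGS = {
    "hit": "served from the NVMe cache",
    "miss": "cold cache; served from origin",
    "miss-on-error": "cache failed softly; served from origin",
}


def cache_status(headers: Mapping[str, str]) -> Optional[str]:
    """Return the normalized x-layer-cache value, or None if the header is absent."""
    for name, value in headers.items():
        if name.lower() == "x-layer-cache":  # header names are case-insensitive
            status = value.strip().lower()
            if status not in CACHE_MEANINGS:
                raise ValueError("unexpected x-layer-cache value: " + value)
            return status
    return None


status = cache_status({"Content-Type": "application/json", "X-Layer-Cache": "miss-on-error"})
```

Treating miss-on-error separately from miss is what lets monitoring alert on cache trouble without flagging ordinary cold reads.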

Compatibility posture

Layer aims to be a drop-in replacement for existing Turbopuffer clients. Routes that the upstream does not implement are namespaced under /v2/ and do not shadow upstream behavior. If a Turbopuffer client sends a request to a route Layer doesn’t proxy, the gateway returns 404; it does not silently re-route to an upstream that might handle it differently.