API
Introduction
Layer matches the Turbopuffer wire contract so existing clients keep working when you point them at the gateway. Where a route has an upstream equivalent, the site documents what Layer adds — not the upstream behavior itself. Follow the Upstream docs link on each page for the underlying request/response shape.
Install
The Python SDK is generated from apps/layer-gateway/openapi.yaml and
ships the typed async client (AsyncHevlayer) and the layer CLI in one
package.
pip install hevlayer
Requires Python 3.11+. Both the SDK and the CLI read connection info from environment variables:
| Variable | Purpose |
|---|---|
| LAYER_GATEWAY_URL | Base URL of the gateway. |
| LAYER_GATEWAY_API_KEY | API key sent on every request. |
| TURBOPUFFER_API_KEY | Optional direct fallback key for Turbopuffer-compatible SDK calls when the gateway is unreachable. |
| TURBOPUFFER_API_URL | Optional direct fallback base URL; defaults to https://aws-us-east-1.turbopuffer.com. |
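As a rough illustration of how these variables fit together, the sketch below resolves them using only the standard library. ConnectionSettings and load_settings are hypothetical names for this example and are not part of the SDK. Treating the two gateway variables as required is also an assumption; only the variable names and the default fallback URL come from the table above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Default taken from the table above.
DEFAULT_TURBOPUFFER_URL = "https://aws-us-east-1.turbopuffer.com"


@dataclass(frozen=True)
class ConnectionSettings:
    """Hypothetical container for the four documented variables."""
    gateway_url: str
    gateway_api_key: str
    turbopuffer_api_key: Optional[str]
    turbopuffer_api_url: str


def load_settings(env: Mapping[str, str] = os.environ) -> ConnectionSettings:
    """Read connection info from the environment (assumption: gateway values are required)."""
    required = ("LAYER_GATEWAY_URL", "LAYER_GATEWAY_API_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"missing required variables: {', '.join(missing)}")
    return ConnectionSettings(
        gateway_url=env["LAYER_GATEWAY_URL"],
        gateway_api_key=env["LAYER_GATEWAY_API_KEY"],
        turbopuffer_api_key=env.get("TURBOPUFFER_API_KEY"),
        turbopuffer_api_url=env.get("TURBOPUFFER_API_URL", DEFAULT_TURBOPUFFER_URL),
    )
```

Passing a plain dict instead of os.environ makes the helper easy to exercise in tests without touching the real environment.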
The CLI surface is documented separately under Layer CLI. Languages beyond Python are generated on demand through the SDK harness; reach out if you need one that isn’t shipped yet.
Client fall-through
The Python SDK can fall through to Turbopuffer direct when the gateway is
unreachable. The fallback is limited to calls that can be satisfied without
Layer state: simple vector queries and raw Turbopuffer-compatible methods
such as write_namespace, query_turbopuffer_namespace, and namespace
schema/listing calls. It emits a client log warning and sets
LayerPerf.fallback to turbopuffer_direct when with_perf=True.
Fetches, warm jobs, pipelines, UDFs, nearest_to_id queries, and other
Layer-only workflows still fail fast because they depend on gateway-owned
cache, queue, history, or consistency state. Set
fallback_to_turbopuffer=False on AsyncHevlayer to disable direct
fallback.
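The fall-through rules above can be read as a small decision function. The sketch below is a model of that description, not the SDK's internal logic: choose_route and Route are hypothetical names, the set of direct-capable methods lists only the two method names mentioned in the text, and the default of True for fallback_to_turbopuffer is inferred from the instruction to set it to False to disable fallback.

```python
from enum import Enum


class Route(Enum):
    GATEWAY = "gateway"
    TURBOPUFFER_DIRECT = "turbopuffer_direct"
    FAIL_FAST = "fail_fast"


# Method names taken from the text above; the real SDK may allow more
# (for example the namespace schema/listing calls).
DIRECT_CAPABLE = frozenset({"write_namespace", "query_turbopuffer_namespace"})


def choose_route(
    method: str,
    gateway_reachable: bool,
    fallback_to_turbopuffer: bool = True,
) -> Route:
    """Pick where a call goes under the documented fall-through rules."""
    if gateway_reachable:
        return Route.GATEWAY
    # Only calls that need no Layer state may bypass the gateway.
    if fallback_to_turbopuffer and method in DIRECT_CAPABLE:
        return Route.TURBOPUFFER_DIRECT
    # Layer-only workflows (fetches, warm jobs, pipelines, ...) fail fast.
    return Route.FAIL_FAST
```

For instance, choose_route("write_namespace", gateway_reachable=False) yields Route.TURBOPUFFER_DIRECT, while any method outside the direct-capable set yields Route.FAIL_FAST once the gateway is unreachable.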
Enhancements to upstream routes
Each of the routes below is wire-compatible with Turbopuffer. The body of each section describes only what Layer overlays on top.
Write — POST /v2/namespaces/{ns} and PATCH /v2/namespaces/{ns}
- Best-effort NVMe cache mirror before the upstream write.
- Server-stamped _hevlayer_upserted_at on every upsert and patch, which powers the consistency watermark on the query path.
- _hevlayer_* attributes are reserved; writes to them are rejected.
Page: Write.
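Because the _hevlayer_ prefix is reserved, a caller may want to check documents before sending a write. The helper below is a client-side sketch only; find_reserved_attributes is a hypothetical name, and the document shape (a mapping of attribute names to values) is an assumption about how the caller holds its data.

```python
from typing import Any, Iterable, Mapping

RESERVED_PREFIX = "_hevlayer_"


def find_reserved_attributes(
    documents: Iterable[Mapping[str, Any]],
) -> list[tuple[int, str]]:
    """Return (document index, attribute name) pairs that use the reserved prefix."""
    return [
        (index, name)
        for index, doc in enumerate(documents)
        for name in doc
        if name.startswith(RESERVED_PREFIX)
    ]


docs = [
    {"id": 1, "title": "first"},
    {"id": 2, "_hevlayer_upserted_at": 1700000000000},  # would collide with the server stamp
]
offenders = find_reserved_attributes(docs)
```

An empty result means no document in the batch touches the reserved namespace.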
Query — POST /v2/namespaces/{ns}/query
- Strong-consistent reads via an injected _hevlayer_upserted_at <= watermark predicate while the upstream index is updating.
- One-shot 429 retry with the watermark filter forced on, for queries that race a write storm.
- stable_as_of echoed on every response so callers can correlate freshness across reads.
Page: Query.
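To make the injected predicate concrete, the sketch below combines a caller's filter with an upper bound on _hevlayer_upserted_at. It assumes Turbopuffer-style filter arrays of the form [attribute, operator, value] joined by ["And", [...]]. How the gateway actually builds the predicate is not documented here, and with_watermark is a hypothetical name.

```python
from typing import Any, Optional

Filter = list[Any]


def with_watermark(filters: Optional[Filter], watermark_ms: int) -> Filter:
    """AND an upper bound on _hevlayer_upserted_at onto an optional existing filter."""
    bound: Filter = ["_hevlayer_upserted_at", "Lte", watermark_ms]
    if not filters:
        # No caller filter: the watermark bound is the whole predicate.
        return bound
    return ["And", [filters, bound]]


combined = with_watermark(["status", "Eq", "live"], 1700000000000)
```

The effect is that documents stamped after the watermark are excluded, so a read only sees writes at or before that point.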
Metadata — GET /v2/namespaces/{ns}/metadata
- Proxied upstream verbatim, then enriched with a layer block containing stable_as_of and is_stable.
Page: Namespace metadata.
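A caller that has already decoded the metadata response as JSON could read the enrichment like this. The sample payload is invented for illustration, and read_stability is a hypothetical helper. Only the layer, stable_as_of, and is_stable names come from the description above, and the value of stable_as_of is passed through untouched because its exact type is not stated here.

```python
from typing import Any, Mapping, Optional


def read_stability(metadata: Mapping[str, Any]) -> tuple[Optional[Any], bool]:
    """Return (stable_as_of, is_stable), tolerating a response without a layer block."""
    layer = metadata.get("layer") or {}
    return layer.get("stable_as_of"), bool(layer.get("is_stable", False))


# Illustrative payload only; upstream fields are omitted.
sample = {"layer": {"stable_as_of": 1700000000000, "is_stable": True}}
stable_as_of, is_stable = read_stability(sample)
```

Defaulting is_stable to False when the block is missing keeps a client that talks to plain Turbopuffer from assuming stability it was never told about.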
Cache warm hint — GET /v1/namespaces/{ns}/hint_cache_warm
- Forwards the hint upstream, then runs Layer-side warm steps: a warm job to backfill the NVMe cache from origin, plus a mirror of the latest S3 snapshot body into NVMe.
- Each step is independently toggleable per request.
Page: Warm cache.
Cross-cutting conventions
These apply to every endpoint Layer proxies, whether the route is upstream-compatible or Layer-only.
- Server-stamped _hevlayer_upserted_at. Every upsert and patch is stamped with a server-side epoch-ms watermark. Caller-supplied values are silently overwritten.
- _hevlayer_* reserved. Document attributes prefixed with _hevlayer_ are reserved for the proxy layer. Writing to them is a validation error; reading them is fine when explicitly requested.
- Hard vs soft failures. Turbopuffer write/query failures are hard failures and surface as 5xx. NVMe cache failures are soft and never block the response.
- x-layer-cache header. Fetch responses include hit, miss, or miss-on-error so callers can distinguish a cold cache from an outage.
- Consistency hints. Reads that go through the watermark path include stable_as_of; queries omit it only on a cold-start gateway that has not yet observed a stable poll.
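The x-layer-cache convention lends itself to a small parser on the client side. The sketch below maps the three documented values onto an enum. CacheOutcome and cache_outcome are hypothetical names, and returning None for a missing or unrecognized value is a choice made for this example.

```python
from enum import Enum
from typing import Mapping, Optional


class CacheOutcome(Enum):
    HIT = "hit"
    MISS = "miss"
    MISS_ON_ERROR = "miss-on-error"


def cache_outcome(headers: Mapping[str, str]) -> Optional[CacheOutcome]:
    """Parse x-layer-cache case-insensitively; None if absent or unrecognized."""
    for name, value in headers.items():
        if name.lower() == "x-layer-cache":
            try:
                return CacheOutcome(value.strip().lower())
            except ValueError:
                return None
    return None


outcome = cache_outcome({"X-Layer-Cache": "miss-on-error"})
```

A caller might log MISS_ON_ERROR at a higher severity than MISS, since the former suggests a cache problem rather than a cold cache.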
Compatibility posture
Layer aims to be a drop-in for existing Turbopuffer clients. Routes that
the upstream does not implement are namespaced under /v2/ and do not
shadow upstream behavior. If a Turbopuffer client sends a request to a
route Layer doesn’t proxy, the gateway returns 404 — it does not
silently re-route to an upstream that might handle it differently.