Namespaces

The gateway exposes a thin layer over Turbopuffer namespaces, adding:

  • Pull-through caching: reads check the NVMe cache first; write paths and cache misses backfill it.
  • Strong-consistent vector queries: while the upstream index is updating, queries are bounded by a per-namespace _hevlayer_upserted_at watermark, so they never read partially indexed data and never block on indexing. When the index is up to date, the filter is skipped, with a one-shot retry that forces the filter on if a race produces a 429.
  • Enhanced metadata: /v2/namespaces/{ns}/metadata proxies Turbopuffer’s metadata response and adds a layer block with freshness signals (stable_as_of, is_stable).
  • Snapshot history: every time the upstream is observed stable, the gateway records a content-addressed snapshot of configured facet histograms to S3. See Snapshots.

Namespace surface

  • POST /v2/namespaces/{namespace}: upsert and delete.
  • PATCH /v2/namespaces/{namespace}: column-level attribute merge.
  • DELETE /v2/namespaces/{namespace}: hard-delete a namespace and gateway-owned state.
  • POST /v2/namespaces/{namespace}/query: vector query with consistency watermark handling.
  • GET /v2/namespaces/{namespace}/documents/{id} and POST /v2/namespaces/{namespace}/documents: document fetch through the pull-through cache.
  • GET /v2/namespaces/{namespace}/metadata: Turbopuffer metadata plus layer block.
  • GET /v2/namespaces/{namespace}/history and /snapshots/{sha}: snapshot history.

Prerequisites

  • TURBOPUFFER_API_KEY set. Without it, the gateway panics on the first namespace request.
  • Aerospike configured at AEROSPIKE_HOSTS (default localhost:3000). The gateway starts even when Aerospike is cold or unreachable and reconnects in the background. Cache failures are non-fatal on writes; fetches fall through to Turbopuffer; cache-only surfaces return 503 cache_cold.

Upsert and delete

curl -X POST http://gateway:8080/v2/namespaces/products \
  -H 'content-type: application/json' \
  -d '{
    "upserts": [
      {
        "id": "asin-B08N5WRWNW",
        "vector": [0.0012, -0.043],
        "attributes": {"title": "Wireless headphones", "category": "Electronics"}
      }
    ],
    "deletes": ["asin-old-001"]
  }'

Write path:

  1. Aerospike (best-effort) — so subsequent reads can find the doc immediately, even before Turbopuffer’s index catches up.
  2. Turbopuffer (sync) — source of truth. A failure here returns 502; the cache write from step 1 is not rolled back, so the cache can briefly contain a doc that didn’t reach the index. Re-sending the upsert resolves it.
  3. 200 OK.

Every upsert is server-stamped with a hidden _hevlayer_upserted_at attribute (epoch milliseconds). Any caller-supplied value is silently overwritten — this stamp powers the consistency watermark below.

upserts and deletes can be combined in one request. A request in which both are empty returns 400.
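
The write path above can be sketched in Python as follows. This is a minimal illustration, not the gateway's actual code: handle_write, cache_write, upstream_write, UpstreamError, and the response payloads are hypothetical names chosen for the example. Only the ordering, the status codes, and the _hevlayer_upserted_at stamp come from the description above.

```python
import time


class UpstreamError(Exception):
    """Hypothetical error raised when the Turbopuffer write fails."""


def handle_write(body, cache_write, upstream_write, now_ms=None):
    """Return (status_code, payload) for an upsert/delete request body."""
    upserts = body.get("upserts") or []
    deletes = body.get("deletes") or []
    if not upserts and not deletes:
        return 400, {"error": "upserts and deletes are both empty"}

    stamp = int(time.time() * 1000) if now_ms is None else now_ms
    stamped = []
    for doc in upserts:
        attrs = dict(doc.get("attributes") or {})
        # The server stamp always overwrites any caller-supplied value.
        attrs["_hevlayer_upserted_at"] = stamp
        stamped.append({**doc, "attributes": attrs})

    # Step 1: best-effort cache write; a cache failure is non-fatal.
    try:
        cache_write(stamped, deletes)
    except Exception:
        pass

    # Step 2: synchronous upstream write; failure maps to 502 and the
    # cache write from step 1 is deliberately not rolled back.
    try:
        upstream_write(stamped, deletes)
    except UpstreamError as exc:
        return 502, {"error": str(exc)}

    # Step 3: success.
    return 200, {"status": "ok"}
```

Because the cache is written first and never rolled back, a 502 from this sketch leaves the same transient state the text describes, and re-sending the same body is the recovery path.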

Hard delete

curl -X DELETE http://gateway:8080/v2/namespaces/products

DELETE /v2/namespaces/{namespace} removes the Turbopuffer namespace first. After the upstream delete succeeds, the gateway purges its namespace-local state:

  • Aerospike document-cache rows, latest snapshot mirror, and search-history cache.
  • S3 snapshot history, search history, clickstream events, and shard metadata.
  • In-memory warm/scan/snapshot job state, cache-state markers, consistency watermarks, and namespace-list cache.
  • The operator-discovered Index CR for the namespace when Index garbage collection is enabled.

Repeated deletes are idempotent. A Turbopuffer 404 still triggers local cleanup and returns 200 OK. Any other Turbopuffer failure returns 502 and does not run local cleanup. If local cleanup fails after the upstream namespace is deleted, the route returns 502 with the failed cleanup step in the error message so operators can retry the same delete.
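
The status-code rules for hard delete can be summarized in a short Python sketch. The names hard_delete, delete_upstream, cleanup_steps, and UpstreamError are assumptions made for illustration, as are the payload shapes; the branching follows the paragraph above.

```python
class UpstreamError(Exception):
    """Hypothetical error carrying the HTTP status Turbopuffer returned."""

    def __init__(self, status):
        super().__init__(f"upstream returned {status}")
        self.status = status


def hard_delete(namespace, delete_upstream, cleanup_steps):
    """Return (status_code, payload).

    cleanup_steps is an ordered list of (name, callable) pairs, one per
    piece of gateway-owned state to purge.
    """
    try:
        delete_upstream(namespace)
    except UpstreamError as exc:
        if exc.status != 404:
            # Upstream may still hold the namespace: skip local cleanup.
            return 502, {"error": str(exc)}
        # 404 means the namespace is already gone upstream; clean up anyway.

    for name, step in cleanup_steps:
        try:
            step(namespace)
        except Exception as exc:
            # Name the failed step so an operator can retry the delete.
            return 502, {"error": f"cleanup step '{name}' failed: {exc}"}
    return 200, {"status": "ok"}
```

Retrying after a cleanup failure takes the 404 branch on the second attempt, which is what makes repeated deletes safe in this sketch.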

Fetch by ID

Single document:

curl 'http://gateway:8080/v2/namespaces/products/documents/asin-B08N5WRWNW?include_attributes=title,category'

Batch:

curl -X POST http://gateway:8080/v2/namespaces/products/documents \
  -H 'content-type: application/json' \
  -d '{"ids": ["asin-1", "asin-2"], "include_attributes": ["title"]}'

Response:

{
  "documents": [
    {"id": "asin-1", "attributes": {"title": "..."}}
  ],
  "missing": ["asin-2"]
}

Both forms are pull-through: Aerospike first, Turbopuffer for cache misses or cache errors, then a best-effort cache backfill when Aerospike is available. documents preserves request order; IDs the gateway couldn’t find anywhere land in missing.

  • x-layer-cache: miss-on-error if Aerospike was unavailable and the response came from Turbopuffer.
  • 404 if the doc is missing from both layers (single-fetch only — batch reports missing inline).
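
As an illustration of the pull-through behavior for the batch form, here is a minimal Python sketch. The function fetch_batch and its three callables (cache_get, upstream_get, cache_put) are hypothetical stand-ins for the Aerospike and Turbopuffer clients; only the lookup order, the documents/missing response shape, and the x-layer-cache value are taken from the text.

```python
def fetch_batch(ids, cache_get, upstream_get, cache_put):
    """Return (response_body, response_headers).

    cache_get(ids)    -> dict of id -> doc; may raise if the cache is down
    upstream_get(ids) -> dict of id -> doc
    cache_put(docs)   -> None; best effort
    """
    headers = {}
    cache_ok = True
    try:
        found = dict(cache_get(ids))
    except Exception:
        # Cache error: treat every ID as a miss and flag the response.
        found = {}
        cache_ok = False
        headers["x-layer-cache"] = "miss-on-error"

    misses = [doc_id for doc_id in ids if doc_id not in found]
    if misses:
        fetched = upstream_get(misses)
        found.update(fetched)
        if cache_ok and fetched:
            try:
                cache_put(list(fetched.values()))  # best-effort backfill
            except Exception:
                pass

    # documents keeps request order; unknown IDs are reported in missing.
    documents = [found[doc_id] for doc_id in ids if doc_id in found]
    missing = [doc_id for doc_id in ids if doc_id not in found]
    return {"documents": documents, "missing": missing}, headers
```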

Vector query

Request body for POST /v2/namespaces/{namespace}/query:

{
  "vector": [0.0012, -0.043],
  "top_k": 10,
  "filters": ["category", "Eq", "Electronics"],
  "include_attributes": ["title", "category"]
}

Filter shape follows Turbopuffer array syntax. The gateway may add its own _hevlayer_upserted_at predicate internally; callers should not send that field.
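
For callers who prefer a script over curl, the request can be assembled with the Python standard library. This is a sketch under assumptions: build_query_request is a hypothetical helper, the base URL is the same placeholder host used in the curl examples, and the client-side check for the reserved field is an extra precaution for the example rather than documented gateway behavior.

```python
import json
import urllib.request


def build_query_request(base_url, namespace, vector, top_k,
                        filters=None, include_attributes=None):
    """Build (but do not send) a POST request for the query endpoint."""
    body = {"vector": vector, "top_k": top_k}
    if filters is not None:
        # The gateway owns this field; callers should not send it.
        if "_hevlayer_upserted_at" in json.dumps(filters):
            raise ValueError("_hevlayer_upserted_at is reserved for the gateway")
        body["filters"] = filters
    if include_attributes:
        body["include_attributes"] = include_attributes
    return urllib.request.Request(
        f"{base_url}/v2/namespaces/{namespace}/query",
        data=json.dumps(body).encode("utf-8"),
        headers={"content-type": "application/json"},
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen sends the request.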

Strong-consistent reads

Turbopuffer indexes upserts asynchronously, so a naive query right after an upsert can return partial results, or fail outright with a 429 under streaming-write pressure. The gateway sidesteps both:

  1. Queries run at consistency=eventual upstream, so they never block on indexing.
  2. A background loop polls each registered namespace’s index.status and records both the latest status (Stable / Updating / Unknown) and, when stable, an as-of watermark equal to poll_start - safety_margin.
  3. Per-query decision:
    • Updating → inject a hidden _hevlayer_upserted_at <= watermark predicate so the read never sees partially-indexed rows.
    • Stable or Unknown → run without the predicate; the upstream index is caught up (or no contrary evidence exists).
  4. If Turbopuffer returns 429 on an unfiltered query, the gateway retries once with the watermark filter forced on.

Query responses report stable_as_of (epoch ms), the most recent watermark the watcher has recorded. The field is omitted on a cold-start gateway that has not yet observed a stable poll for the namespace.

Tunables: CONSISTENCY_POLL_INTERVAL_MS (default 1000) and CONSISTENCY_SAFETY_MARGIN_MS (default 500).
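
The watcher and the per-query decision can be sketched in Python as below. Everything named here (NamespaceState, record_poll, run_query, TooManyRequests, and the execute and apply_watermark callables) is hypothetical. The sketch also makes one assumption the text does not cover: when no watermark has been recorded yet, the query runs unfiltered and a 429 is passed through to the caller.

```python
from dataclasses import dataclass
from typing import Optional


class TooManyRequests(Exception):
    """Hypothetical error raised when the upstream answers 429."""


@dataclass
class NamespaceState:
    status: str = "Unknown"             # "Stable" | "Updating" | "Unknown"
    stable_as_of: Optional[int] = None  # epoch ms; None until first stable poll


def record_poll(state, is_up_to_date, poll_start_ms, safety_margin_ms=500):
    """Update state from one poll of the namespace's index.status."""
    if is_up_to_date:
        state.status = "Stable"
        state.stable_as_of = poll_start_ms - safety_margin_ms
    else:
        # The last stable watermark is kept for filtering.
        state.status = "Updating"


def run_query(state, query, execute, apply_watermark):
    """Run a query, adding the watermark filter only when it is needed."""
    can_filter = state.stable_as_of is not None
    if state.status == "Updating" and can_filter:
        return execute(apply_watermark(query, state.stable_as_of))
    try:
        # Stable or Unknown: run without the predicate.
        return execute(query)
    except TooManyRequests:
        if not can_filter:
            raise  # assumption: nothing to fall back to without a watermark
        # One-shot retry with the filter forced on.
        return execute(apply_watermark(query, state.stable_as_of))
```

With the default safety margin of 500 ms, a stable poll that started at epoch ms 10000 records a watermark of 9500.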

Filter shape

["category", "Eq", "Electronics"]                # leaf
["And", [["category", "Eq", "Electronics"],
         ["price", "Lte", 200]]]                 # conjunction
["Or",  [...]]                                   # disjunction

The gateway automatically wraps the caller’s filter and the watermark predicate in a two-element And; callers never see _hevlayer_upserted_at in their request or response.
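
A minimal sketch of that combination step, assuming the watermark predicate is expressed with the Lte operator shown above and that a query without a caller filter receives the bare predicate (the text does not specify that second case). The function name with_watermark is hypothetical.

```python
def with_watermark(caller_filter, watermark_ms):
    """Combine an optional caller filter with the watermark predicate."""
    predicate = ["_hevlayer_upserted_at", "Lte", watermark_ms]
    if caller_filter is None:
        # Assumption: with no caller filter, the predicate stands alone.
        return predicate
    return ["And", [caller_filter, predicate]]


combined = with_watermark(["category", "Eq", "Electronics"], 1715600400000)
# combined == ["And", [["category", "Eq", "Electronics"],
#                      ["_hevlayer_upserted_at", "Lte", 1715600400000]]]
```

Wrapping rather than flattening means a caller filter that is itself an And or Or is nested unchanged as the first element.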

Namespace metadata

curl http://gateway:8080/v2/namespaces/products/metadata

Response (the // comments are annotations, not part of the JSON):

{
  // Proxied from Turbopuffer verbatim
  "schema": { },
  "approx_row_count": 12500,
  "index": { "status": "up-to-date" },

  // Gateway enhancement
  "layer": {
    "stable_as_of": 1715600400000,  // most recent watermark, epoch ms; null on cold start
    "is_stable": true               // most recent poll observed index.status == up-to-date
  }
}

is_stable is the current signal (it drives the query path’s filter-skip decision); stable_as_of is the historical watermark. On a cold start, before the watcher has completed its first successful poll, stable_as_of can be null and is_stable false.
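
A client might read the layer block along these lines. The helper freshness and its lag_ms field are hypothetical conveniences for the example; the gateway itself only reports stable_as_of and is_stable.

```python
def freshness(metadata, now_ms):
    """Summarize the layer block of a metadata response.

    lag_ms is None until the gateway has recorded a watermark.
    """
    layer = metadata.get("layer") or {}
    stable_as_of = layer.get("stable_as_of")
    return {
        "is_stable": bool(layer.get("is_stable")),
        "lag_ms": None if stable_as_of is None else now_ms - stable_as_of,
    }
```

Treating a missing or null stable_as_of as "no lag known" rather than zero keeps a cold-start gateway from looking fully fresh.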