API
Query
Request
POST /v2/namespaces/products/query
Content-Type: application/json
{
  "vector": [0.0012, -0.043],
  "top_k": 10,
  "filters": ["category", "Eq", "Electronics"],
  "include_attributes": ["title", "category"]
}
Response
{
  "results": [
    {"id": "asin-B08N5WRWNW", "dist": 0.42, "attributes": {"title": "..."}}
  ],
  "stable_as_of": 1715600400000
}
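
As a rough client-side illustration, the request above could be assembled with Python's standard library as follows. `BASE_URL`, `build_query_request`, and `run_query` are hypothetical names chosen for this sketch, and the gateway address is an assumption; only the path, header, and body fields come from the example above.

```python
import json
import urllib.request

# Hypothetical gateway address; replace with wherever Layer is deployed.
BASE_URL = "http://localhost:8080"


def build_query_request(namespace, vector, top_k=10, filters=None,
                        include_attributes=None):
    """Build (but do not send) a POST request for the query endpoint."""
    body = {"vector": vector, "top_k": top_k}
    # Optional fields are only sent when the caller supplies them.
    if filters is not None:
        body["filters"] = filters
    if include_attributes is not None:
        body["include_attributes"] = include_attributes
    return urllib.request.Request(
        url=f"{BASE_URL}/v2/namespaces/{namespace}/query",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(request):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)
```

Keeping request construction separate from sending makes the payload easy to inspect or log before it goes over the wire.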
Strong-consistent reads
Turbopuffer indexes upserts asynchronously, so a naive query issued right after an upsert can return partial results, or be rejected outright with HTTP 429 under streaming-write pressure. Layer sidesteps both:
- Queries run at `consistency=eventual` upstream, so they never block on indexing.
- A background loop polls each registered namespace’s `index.status` and records the latest status plus, when stable, a watermark equal to `poll_start - safety_margin`.
- Per-query decision:
  - `Updating` → inject a hidden `_hevlayer_upserted_at <= watermark` predicate so the read never sees partially-indexed rows.
  - `Stable` or `Unknown` → run without the predicate. The upstream index is caught up (or no contrary evidence exists).
- On a 429 to an unfiltered query, Layer retries once with the watermark filter forced on.
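
The watcher bookkeeping and the per-query decision described above might look roughly like the sketch below. This is not Layer's actual implementation: `NamespaceState`, `record_poll`, and `watermark_for_query` are hypothetical names, and running unfiltered when no watermark has ever been recorded is an assumption the text does not spell out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NamespaceState:
    status: str = "Unknown"             # "Updating", "Stable", or "Unknown"
    watermark_ms: Optional[int] = None  # last watermark from a stable poll


def record_poll(state, status, poll_start_ms, safety_margin_ms=500):
    """Record one poll result; only a stable poll advances the watermark."""
    state.status = status
    if status == "Stable":
        state.watermark_ms = poll_start_ms - safety_margin_ms


def watermark_for_query(state, force=False):
    """Return the watermark to filter on, or None to run without the predicate.

    force=True models the single retry after a 429 on an unfiltered query.
    """
    if state.watermark_ms is None:
        # Assumption: with no watermark recorded yet, there is nothing to inject.
        return None
    if force or state.status == "Updating":
        return state.watermark_ms
    return None  # Stable or Unknown: run without the predicate


state = NamespaceState()
record_poll(state, "Stable", poll_start_ms=1715600400500)
record_poll(state, "Updating", poll_start_ms=1715600401500)
print(watermark_for_query(state))  # prints 1715600400000
```

Note that an `Updating` poll leaves the previous watermark in place, so filtered reads stay pinned to the last point at which the index was known to be caught up.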
Responses report `stable_as_of` (epoch ms), the most recent
watermark the watcher has recorded. The field is omitted on a cold-start
gateway that has not yet observed a stable poll.
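
Because `stable_as_of` can be absent, clients should treat it as optional. The helper below is a minimal sketch of one possible use: checking whether a write the caller made falls at or before the reported watermark. `covered_by_watermark` is a hypothetical name, and the sketch assumes the caller tracks its own upsert time in epoch milliseconds on a clock comparable to the gateway's.

```python
from typing import Optional


def covered_by_watermark(response: dict, upserted_at_ms: int) -> Optional[bool]:
    """Return True/False when a watermark is reported, None when it is omitted."""
    stable_as_of = response.get("stable_as_of")
    if stable_as_of is None:
        # Cold-start gateway: no stable poll observed yet, so no claim is possible.
        return None
    return upserted_at_ms <= stable_as_of
```

A `False` result does not mean the row is invisible: when the namespace is `Stable`, queries run without the predicate and may already include newer rows.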
Filter shape
["category", "Eq", "Electronics"] # leaf
["And", [["category", "Eq", "Electronics"],
["price", "Lte", 200]]] # conjunction
["Or", [...]] # disjunction
Filter shape follows Turbopuffer's array syntax. Layer automatically
combines the caller’s filter with the watermark predicate using a
2-element `And`, so callers never see `_hevlayer_upserted_at` in their
request or response.
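
The combination step can be pictured with a small sketch. `with_watermark` is a hypothetical helper, not Layer's code, and returning the bare predicate when the caller sent no filter is an assumption; the `Lte` operator and the `["And", [...]]` shape are taken from the examples above.

```python
WATERMARK_FIELD = "_hevlayer_upserted_at"


def with_watermark(caller_filter, watermark_ms):
    """Wrap the caller's filter and the watermark predicate in a 2-element And."""
    predicate = [WATERMARK_FIELD, "Lte", watermark_ms]
    if caller_filter is None:
        # Assumption: with no caller filter, the predicate stands alone.
        return predicate
    return ["And", [caller_filter, predicate]]


combined = with_watermark(["category", "Eq", "Electronics"], 1715600400000)
print(combined)
# prints ['And', [['category', 'Eq', 'Electronics'], ['_hevlayer_upserted_at', 'Lte', 1715600400000]]]
```

Wrapping rather than flattening means the caller's filter is passed through untouched, whatever its own nesting of `And` and `Or`.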
Tunables
| Variable | Default | Purpose |
|---|---|---|
| `CONSISTENCY_POLL_INTERVAL_MS` | 1000 | How often the watcher polls each namespace. |
| `CONSISTENCY_SAFETY_MARGIN_MS` | 500 | Cushion between poll time and watermark to cover in-flight upserts. |
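
Assuming these tunables are supplied as environment variables (the table does not say how they are set), a loader with the documented defaults could be sketched as below. `load_consistency_config` and its returned key names are hypothetical.

```python
import os


def load_consistency_config(env=None):
    """Read both tunables in milliseconds, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return {
        "poll_interval_ms": int(env.get("CONSISTENCY_POLL_INTERVAL_MS", "1000")),
        "safety_margin_ms": int(env.get("CONSISTENCY_SAFETY_MARGIN_MS", "500")),
    }
```

With the defaults, the recorded watermark trails each stable poll's start time by 500 ms and is refreshed roughly once per second.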