Scans

A scan answers an ad hoc filter question about a namespace. ID mode creates an asynchronous job that collects the matching document IDs. Count mode returns a single number synchronously and serves it from the latest snapshot when the snapshot covers the filter.

Use scans for bulk exports, manual inspection, UDF discovery debugging, cache/origin consistency checks, or exact row counts for a filter.

ID scans

curl -X POST http://gateway:8080/v2/namespaces/products/scans \
  -H 'content-type: application/json' \
  -d '{"mode": "ids", "source": "auto", "filters": ["category", "Eq", "Electronics"]}'

The create call returns 202 Accepted with a job:

{
  "id": "scan-uuid",
  "namespace": "products",
  "source": "auto",
  "status": "running",
  "progress": 0,
  "documents_scanned": 0,
  "created_at": "2026-05-26T10:00:00Z"
}

Poll the job, then read results:

curl http://gateway:8080/v2/namespaces/products/scans/scan-uuid
curl 'http://gateway:8080/v2/namespaces/products/scans/scan-uuid/results?limit=1000'
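
The poll-then-read flow can be wrapped in a small client loop. The following is a minimal Python sketch, not a gateway SDK: wait_for_scan, http_get_json, the one-second interval, and the poll cap are hypothetical names and values. This guide only shows the running status, so the loop assumes any other status is terminal.

```python
import json
import time
import urllib.request

BASE_URL = "http://gateway:8080/v2/namespaces"  # host taken from the curl examples


def http_get_json(url):
    """Fetch a URL and decode its JSON body."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def wait_for_scan(namespace, scan_id, fetch=http_get_json, interval_s=1.0, max_polls=600):
    """Poll the scan job until its status is no longer "running".

    Assumption: any status other than "running" means the job has finished
    (successfully or not); the guide does not list the terminal statuses.
    """
    url = f"{BASE_URL}/{namespace}/scans/{scan_id}"
    for _ in range(max_polls):
        job = fetch(url)
        if job.get("status") != "running":
            return job
        time.sleep(interval_s)
    raise TimeoutError(f"scan {scan_id} still running after {max_polls} polls")
```

Once wait_for_scan returns, read the IDs from the /results endpoint shown above. The fetch parameter exists so the loop can be exercised without a live gateway.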

Count scans

curl -X POST http://gateway:8080/v2/namespaces/products/scans \
  -H 'content-type: application/json' \
  -d '{"mode": "count", "source": "auto", "filters": ["category", "Eq", "Electronics"]}'

Count mode responds synchronously with the count and the path that served it:

{
  "count": 4210,
  "served_by": "snapshot",
  "snapshot_sha": "3f9e8b21",
  "watermark_ms": 1747300000123,
  "elapsed_ms": 3
}

With source: auto, the gateway checks the latest snapshot first for single-field Eq and In filters. If the field is fully present in the snapshot, the count is served from it and served_by is snapshot. Otherwise auto falls through to cache or origin. Use source: snapshot to require the snapshot path; filters the snapshot cannot serve return 412 precondition_failed.
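
A caller that needs the snapshot path can treat 412 as a distinct, expected outcome. This is a minimal Python sketch under stated assumptions: snapshot_count, post_json, and SnapshotNotEligible are hypothetical names, any 2xx status is treated as success, and error bodies are not parsed because their shape is not documented here.

```python
import json
import urllib.error
import urllib.request

BASE_URL = "http://gateway:8080/v2/namespaces"  # host taken from the curl examples


class SnapshotNotEligible(Exception):
    """The gateway answered 412 to a source: snapshot count scan."""


def post_json(url, payload):
    """POST a JSON payload; return (status_code, decoded_body_or_None)."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"content-type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status, json.load(resp)
    except urllib.error.HTTPError as err:
        # Error body shape is not documented in this guide, so it is ignored.
        return err.code, None


def snapshot_count(namespace, filters, post=post_json):
    """Run a count scan that must be served by the latest snapshot."""
    url = f"{BASE_URL}/{namespace}/scans"
    payload = {"mode": "count", "source": "snapshot", "filters": filters}
    status, body = post(url, payload)
    if status == 412:
        raise SnapshotNotEligible(f"snapshot cannot serve filter {filters!r}")
    if not 200 <= status < 300:
        raise RuntimeError(f"count scan failed with HTTP {status}")
    return body
```

Catching SnapshotNotEligible lets the caller decide whether a live count from cache or origin is acceptable, instead of silently getting one.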

Sources

Source   | ID mode                                   | Count mode
auto     | Cache when fresh enough, otherwise origin | Snapshot first, then cache/origin
snapshot | Not supported                             | Latest snapshot only; requires eligible Eq or In
cache    | Aerospike document cache only             | Aerospike document cache only
origin   | Turbopuffer paginated scan                | Turbopuffer paginated scan

When auto resolves to cache, the gateway applies _hevlayer_upserted_at <= cache_warmed_through before the user filter. This makes the scan a stable warmed view instead of a mixed view of old and new rows.
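
The effect of the watermark bound can be illustrated outside the gateway. This Python sketch is an assumption about shape, not the gateway's implementation: warmed_view_filter and visible_rows are hypothetical helpers, the bound is expressed as an ordinary Lte clause joined to the user filter with And, and the timestamps are plain integers chosen for the example.

```python
def warmed_view_filter(user_filter, cache_warmed_through):
    """Combine the watermark bound with the user's filter.

    Assumption: the bound can be written as a regular filter clause; the
    gateway's internal representation is not documented in this guide.
    """
    bound = ["_hevlayer_upserted_at", "Lte", cache_warmed_through]
    return ["And", [bound, user_filter]]


def visible_rows(rows, cache_warmed_through):
    """Keep only rows upserted at or before the warmed-through watermark."""
    return [r for r in rows if r["_hevlayer_upserted_at"] <= cache_warmed_through]


rows = [
    {"id": "a", "_hevlayer_upserted_at": 100},
    {"id": "b", "_hevlayer_upserted_at": 250},  # written after the cache was warmed
]
stable = visible_rows(rows, cache_warmed_through=200)
combined = warmed_view_filter(["category", "Eq", "Electronics"], 200)
```

Row b is excluded even if it already sits in the cache, so repeated scans against the same watermark see the same set of rows.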

Filters

Scans accept the same Turbopuffer filter array as query. On origin scans, the filter is pushed to Turbopuffer. On cache scans, the gateway evaluates it against cached document attributes.

Supported cache operators are Eq, NotEq, Gt, Gte, Lt, Lte, In, NotIn, And, Or, and Not. If auto sees a filter the cache cannot evaluate, it uses origin. Explicit source: cache with an unsupported filter fails rather than returning partial results.
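
A client can pre-check a filter against the operator list before choosing a source. The sketch below is a hypothetical client-side helper, not gateway code. It assumes the usual Turbopuffer filter shapes: [attribute, operator, value] for comparisons, ["And", [...]] and ["Or", [...]] for combinations, and ["Not", filter] for negation.

```python
COMPARISON_OPS = {"Eq", "NotEq", "Gt", "Gte", "Lt", "Lte", "In", "NotIn"}


def cache_can_evaluate(flt):
    """Return True if every operator in the filter is on the cache's list."""
    if not isinstance(flt, list) or not flt:
        return False
    head = flt[0]
    if head in ("And", "Or"):
        return (
            len(flt) == 2
            and isinstance(flt[1], list)
            and all(cache_can_evaluate(child) for child in flt[1])
        )
    if head == "Not":
        return len(flt) == 2 and cache_can_evaluate(flt[1])
    # Leaf comparison: [attribute, operator, value]
    return len(flt) == 3 and isinstance(head, str) and flt[1] in COMPARISON_OPS


def check_source(source, flt):
    """Mirror the documented behavior for unsupported filters.

    auto falls back to origin, while an explicit cache request is rejected.
    """
    supported = cache_can_evaluate(flt)
    if source == "cache" and not supported:
        raise ValueError("filter uses an operator the cache cannot evaluate")
    if source == "auto" and not supported:
        return "origin"
    return source
```

For example, check_source("auto", ["name", "Glob", "a*"]) returns "origin" because Glob is not on the supported list, while the same filter with source "cache" raises.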

Operational notes

  • ID scan state is in-memory and ephemeral; it resets on gateway restart.
  • Count scans run under a deadline: 30s by default, 300s at most.
  • Snapshot-served count scans are exact at the snapshot watermark_ms.
  • Live (non-snapshot) count scan responses also include bounded, timed_out, and shard fields.
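
Because a snapshot-served count is exact only as of watermark_ms, a caller may want to know how far behind the present that is. This is a minimal Python sketch that assumes watermark_ms is a Unix epoch timestamp in milliseconds, which the field name suggests but this guide does not state. watermark_time and snapshot_lag_ms are hypothetical helpers.

```python
from datetime import datetime, timedelta, timezone


def watermark_time(watermark_ms):
    """Convert a millisecond epoch watermark to an aware UTC datetime.

    Integer arithmetic avoids float rounding on the millisecond part.
    """
    seconds, millis = divmod(watermark_ms, 1000)
    return datetime.fromtimestamp(seconds, tz=timezone.utc) + timedelta(milliseconds=millis)


def snapshot_lag_ms(response, now_ms):
    """Return how far the snapshot trails now_ms, or None for live counts."""
    if response.get("served_by") != "snapshot":
        return None
    return now_ms - response["watermark_ms"]


response = {"count": 4210, "served_by": "snapshot", "watermark_ms": 1747300000123}
as_of = watermark_time(response["watermark_ms"])
lag = snapshot_lag_ms(response, now_ms=1747300060123)
```

A caller can compare the lag against its own freshness budget and rerun the count with source: cache or source: origin when the snapshot is too old.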