API
Cost
The gateway publishes a rolled-up cost view with two groups of lines. AWS infrastructure lines (compute, EBS, S3, NAT, ALB) are computed from Prometheus, CloudWatch, and the AWS Pricing API. Turbopuffer lines (storage, writes, queries) are computed by multiplying gateway-emitted usage metrics by a code-resident rate card.
Cost endpoints are infra-level only — per-namespace attribution is intentionally not modeled. Use the metrics catalog when you need per-namespace breakdowns.
Routes
| Route | Behavior |
|---|---|
| GET /v2/cost | Current snapshot rolled up over a window. |
| GET /v2/cost/timeseries | Per-line USD/hour samples bucketed by step. |
| GET /v2/cost/rate-card | Active AWS Pricing entries and Turbopuffer rate card. |
Cost snapshot
GET /v2/cost?window=24h
window accepts 1h, 6h, 24h (default), 7d, or 30d. Every window returns
the same set of lines; the window controls totals only.
{
  "window": "24h",
  "totals": {
    "total_usd": 142.33,
    "aws_usd": 86.10,
    "turbopuffer_usd": 56.23,
    "cost_per_document_usd": 0.000018,
    "cost_per_tib_indexed_usd": 0.97
  },
  "lines": [
    {
      "provider": "aws",
      "service": "compute",
      "service_detail": null,
      "region": "us-east-1",
      "site": "primary",
      "rate_card_version": null,
      "line_usd": 42.18,
      "monthly_projection_usd": 1264.93
    }
  ],
  "rate_card_status": {
    "aws_pricing_stale": false,
    "turbopuffer_version": "2026-04"
  },
  "caveats": []
}
When upstream pricing inputs are unavailable, the snapshot is still served:
missing lines are omitted, rate_card_status.aws_pricing_stale flips to
true, and the missing input is recorded in caveats. A stale rate card
is preferable to a missing chart in the dashboard.
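A client can treat a degraded snapshot as usable but flagged. The following is a minimal sketch that relies only on the response fields shown in the example; the function name `summarize_snapshot`, the keys it returns, and the caveat text in the sample are illustrative assumptions, not part of the gateway API.

```python
from typing import Any


def summarize_snapshot(snapshot: dict[str, Any]) -> dict[str, Any]:
    """Reduce a /v2/cost response to what a dashboard tile might need."""
    status = snapshot.get("rate_card_status", {})
    caveats = snapshot.get("caveats", [])
    lines = snapshot.get("lines", [])
    return {
        "total_usd": snapshot["totals"]["total_usd"],
        # Lines can be omitted when an upstream input is unavailable,
        # so report what was actually returned instead of a fixed set.
        "services_present": sorted({line["service"] for line in lines}),
        # Degraded means: AWS pricing is stale, or any caveat was recorded.
        "degraded": bool(status.get("aws_pricing_stale")) or bool(caveats),
        "caveats": list(caveats),
    }


example = {
    "window": "24h",
    "totals": {"total_usd": 142.33, "aws_usd": 86.10, "turbopuffer_usd": 56.23},
    "lines": [{"provider": "aws", "service": "compute", "line_usd": 42.18}],
    "rate_card_status": {"aws_pricing_stale": True, "turbopuffer_version": "2026-04"},
    "caveats": ["aws pricing unavailable"],  # hypothetical caveat text
}
print(summarize_snapshot(example))
```

Keeping the check on both the stale flag and the caveats list means the client still flags the snapshot if only one of the two signals is set.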
Cost timeseries
GET /v2/cost/timeseries?window=7d&step=1h
Returns per-line usd_per_hour samples for the requested window,
bucketed by step. The total series carries the bucketed sum across
every other series at the same timestamp — clients can render the
stacked chart without re-summing.
| window | Default step |
|---|---|
| 1h, 6h | 5m |
| 24h | 30m |
| 7d | 1h |
| 30d | 6h |
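To size a chart before requesting it, a client can derive the approximate number of buckets from the window and step. This sketch encodes the default steps from the table; the helper names `parse_duration` and `bucket_count` are hypothetical, and whether the gateway includes both endpoints of the window (one extra sample) is not stated here, so treat the count as an estimate.

```python
from datetime import timedelta
from typing import Optional

# Default step per window, copied from the table in this section.
DEFAULT_STEP = {
    "1h": "5m",
    "6h": "5m",
    "24h": "30m",
    "7d": "1h",
    "30d": "6h",
}

_UNITS = {"m": "minutes", "h": "hours", "d": "days"}


def parse_duration(text: str) -> timedelta:
    """Parse durations such as '5m', '24h', or '7d'."""
    value, unit = int(text[:-1]), text[-1]
    return timedelta(**{_UNITS[unit]: value})


def bucket_count(window: str, step: Optional[str] = None) -> int:
    """Number of whole step-sized buckets that fit in the window."""
    step = step or DEFAULT_STEP[window]
    return parse_duration(window) // parse_duration(step)


for window in DEFAULT_STEP:
    print(window, DEFAULT_STEP[window], bucket_count(window))
```

With the defaults, every window stays under 200 buckets per series, which keeps a stacked chart cheap to render.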
The samples are sourced from
hevlayer_cost_usd_per_hour{provider, service, region, site, rate_card_version},
which the gateway’s 60-second cost sampler writes into VictoriaMetrics.
The raw series can also be queried with PromQL through
/v2/metrics/query_range.
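As an illustration of raw access, the sketch below builds a request URL for the cost series. It assumes that /v2/metrics/query_range accepts the same `query`, `start`, `end`, and `step` parameters as the Prometheus HTTP API, which this page does not confirm; the helper name `cost_query_range_url` and the base URL are hypothetical.

```python
from urllib.parse import urlencode


def cost_query_range_url(base_url: str, service: str, start: str, end: str,
                         step: str = "1h") -> str:
    """Build a query_range URL summing USD/hour per provider for one service.

    Assumption: the endpoint takes Prometheus-style query parameters.
    """
    # Doubled braces emit literal { } around the PromQL label matcher.
    query = f'sum by (provider) (hevlayer_cost_usd_per_hour{{service="{service}"}})'
    params = urlencode({"query": query, "start": start, "end": end, "step": step})
    return f"{base_url}/v2/metrics/query_range?{params}"


print(cost_query_range_url(
    "https://gateway.example.internal",  # hypothetical host
    "compute",
    "2026-04-01T00:00:00Z",
    "2026-04-08T00:00:00Z",
))
```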
Rate card
GET /v2/cost/rate-card
Returns the rate cards in use by the cost engine:
- aws.items: cached AWS Pricing API entries for the gateway’s deploy region, used to drive the dashboard’s instance picker.
- aws.stale: true if the AWS Pricing API has not refreshed yet or has failed every refresh attempt. The cached set is still returned.
- turbopuffer: code-resident rate card with a version stamp.
The dashboard’s cost tab uses this surface to project the impact of an instance-type swap before the operator commits to it.
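The projection itself is simple arithmetic over two hourly rates. The sketch below is one way a client might compute it; the field names `instance_type` and `usd_per_hour` on aws.items entries, the 730-hour month, and the sample rates are all assumptions for illustration, since this page does not document the entry shape or the dashboard's actual formula.

```python
HOURS_PER_MONTH = 730  # assumption: average-month convention


def project_swap(items, current_type, candidate_type, instance_count):
    """Monthly USD delta of moving instance_count nodes to candidate_type.

    A positive result means the swap costs more per month.
    """
    # Index the (assumed) rate-card entries by instance type.
    rates = {item["instance_type"]: item["usd_per_hour"] for item in items}
    delta_per_hour = (rates[candidate_type] - rates[current_type]) * instance_count
    return round(delta_per_hour * HOURS_PER_MONTH, 2)


# Illustrative entries only; real values come from the rate-card response.
items = [
    {"instance_type": "m6i.xlarge", "usd_per_hour": 0.192},
    {"instance_type": "m6i.2xlarge", "usd_per_hour": 0.384},
]
print(project_swap(items, "m6i.xlarge", "m6i.2xlarge", 3))
```

If aws.stale is true, a client would likely want to label the projection as based on cached pricing.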
Caveats and shapes
- service is a coarse bucket: compute, storage_ebs, storage_s3, network, tpuf_storage, tpuf_writes, tpuf_queries. Network lines carry a service_detail (alb, nat_gateway) for further disambiguation; everything else leaves service_detail null.
- monthly_projection_usd is a pre-prorate projection for the bucket at its current sizing; it’s the operator’s “if we held this steady” number, not a forecast.
- Turbopuffer lines always carry a rate_card_version so the dashboard can footnote when the upstream pricing changed.