API

Cost

The gateway publishes a rolled-up cost view that combines two sets of lines. AWS infrastructure lines (compute, EBS, S3, NAT, ALB) are computed from Prometheus, CloudWatch, and the AWS Pricing API. Turbopuffer lines (storage, writes, queries) are computed by multiplying gateway-emitted usage metrics by a code-resident rate card.

Cost endpoints are infra-level only — per-namespace attribution is intentionally not modeled. Use the metrics catalog when you need per-namespace breakdowns.

Routes

Route	Behavior
`GET /v2/cost`	Current snapshot rolled up over a window.
`GET /v2/cost/timeseries`	Per-line USD/hour samples bucketed by step.
`GET /v2/cost/rate-card`	Active AWS Pricing entries and Turbopuffer rate card.

Cost snapshot

GET /v2/cost?window=24h

window accepts 1h, 6h, 24h (default), 7d, 30d. The same line set is returned at every window — the window controls totals only.

{
  "window": "24h",
  "totals": {
    "total_usd": 142.33,
    "aws_usd": 86.10,
    "turbopuffer_usd": 56.23,
    "cost_per_document_usd": 0.000018,
    "cost_per_tib_indexed_usd": 0.97
  },
  "lines": [
    {
      "provider": "aws",
      "service": "compute",
      "service_detail": null,
      "region": "us-east-1",
      "site": "primary",
      "rate_card_version": null,
      "line_usd": 42.18,
      "monthly_projection_usd": 1264.93
    }
  ],
  "rate_card_status": {
    "aws_pricing_stale": false,
    "turbopuffer_version": "2026-04"
  },
  "caveats": []
}

When upstream pricing inputs are unavailable, the snapshot is still served: missing lines are omitted, rate_card_status.aws_pricing_stale is set to true, and each missing input is recorded in caveats. A stale rate card is preferable to a missing chart in the dashboard.
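
A client can use these fields to decide whether to flag a snapshot as degraded. The following is a minimal sketch, assuming the response has already been parsed into a dict shaped like the sample above; `summarize_snapshot` is a hypothetical helper, and caveat entries are treated as opaque values because their exact shape is not specified here.

```python
def summarize_snapshot(snapshot: dict) -> dict:
    """Collect the fields a client needs to decide how to present a snapshot."""
    status = snapshot.get("rate_card_status", {})
    caveats = snapshot.get("caveats", [])
    lines = snapshot.get("lines", [])
    return {
        "total_usd": snapshot["totals"]["total_usd"],
        # Lines can be omitted when an upstream input is missing, so report
        # which service buckets are actually present in this response.
        "services_present": sorted({line["service"] for line in lines}),
        # Treat the snapshot as degraded if AWS pricing is stale or any
        # caveat was recorded (assumption: any caveat is worth surfacing).
        "degraded": bool(status.get("aws_pricing_stale")) or bool(caveats),
        "caveats": list(caveats),
    }
```

A dashboard could render the totals as usual and add a warning badge whenever `degraded` is true.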

Cost timeseries

GET /v2/cost/timeseries?window=7d&step=1h

Returns per-line usd_per_hour samples for the requested window, bucketed by step. The total series carries the bucketed sum across every other series at the same timestamp — clients can render the stacked chart without re-summing.
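
Although clients do not need to re-sum, a test harness may want to confirm the relationship between the total series and the per-line series. The sketch below assumes the samples have already been loaded into a hypothetical in-memory shape, a dict mapping a series name to a list of `(timestamp, usd_per_hour)` pairs with the total stored under the key `"total"`. The wire format of the timeseries response is not described here, so this shape is purely illustrative.

```python
import math

def total_matches(series: dict, tolerance: float = 1e-6) -> bool:
    """Check that the "total" series equals the per-timestamp sum of the others."""
    total = dict(series["total"])
    summed = {}
    for name, samples in series.items():
        if name == "total":
            continue
        for timestamp, usd_per_hour in samples:
            summed[timestamp] = summed.get(timestamp, 0.0) + usd_per_hour
    # Compare with a tolerance because the values are floating-point sums.
    return all(
        math.isclose(total.get(timestamp, 0.0), value, abs_tol=tolerance)
        for timestamp, value in summed.items()
    )
```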

`window`	Default `step`
`1h`, `6h`	`5m`
`24h`	`30m`
`7d`	`1h`
`30d`	`6h`
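
The default-step table can be mirrored on the client so that chart code knows the bucket size even when it omits `step`. This is a minimal sketch; `effective_step` and `timeseries_path` are hypothetical helper names, and only the `window` and `step` parameters shown above are used.

```python
from typing import Optional
from urllib.parse import urlencode

# Default step per window, copied from the table above.
DEFAULT_STEP = {"1h": "5m", "6h": "5m", "24h": "30m", "7d": "1h", "30d": "6h"}

def effective_step(window: str, step: Optional[str] = None) -> str:
    """Return the step that applies: the explicit one, else the documented default."""
    if window not in DEFAULT_STEP:
        raise ValueError(f"unsupported window: {window!r}")
    return step if step is not None else DEFAULT_STEP[window]

def timeseries_path(window: str, step: Optional[str] = None) -> str:
    """Build the request path, sending step only when the caller set it."""
    if window not in DEFAULT_STEP:
        raise ValueError(f"unsupported window: {window!r}")
    params = {"window": window}
    if step is not None:
        params["step"] = step
    return "/v2/cost/timeseries?" + urlencode(params)
```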

The samples are sourced from hevlayer_cost_usd_per_hour{provider, service, region, site, rate_card_version}, which the gateway’s 60-second cost sampler writes into VictoriaMetrics. The raw series can also be queried directly with PromQL via /v2/metrics/query_range.
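
As an illustration of raw-series access, the sketch below builds a PromQL expression over the metric and a request path for it. The request parameters (`query`, `start`, `end`, `step`) are an assumption: they follow the Prometheus HTTP API convention for range queries, and this page does not confirm that /v2/metrics/query_range accepts the same names. Both function names are hypothetical.

```python
from urllib.parse import urlencode

def cost_by_service_query(provider: str) -> str:
    """PromQL: USD/hour for one provider, summed across regions and sites per service."""
    return f'sum by (service) (hevlayer_cost_usd_per_hour{{provider="{provider}"}})'

def query_range_path(query: str, start: int, end: int, step: int) -> str:
    """Build the path; parameter names are assumed to mirror Prometheus."""
    params = {"query": query, "start": start, "end": end, "step": step}
    return "/v2/metrics/query_range?" + urlencode(params)
```

Here `start` and `end` are Unix timestamps in seconds and `step` is a resolution in seconds, again following the Prometheus convention rather than anything stated above.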

Rate card

GET /v2/cost/rate-card

Returns the rate cards in use by the cost engine:

aws.items — cached AWS Pricing API entries for the gateway’s deploy region, used to drive the dashboard’s instance picker.
aws.stale — true if the AWS Pricing API has not refreshed yet or has failed every refresh attempt. The cached set is still returned.
turbopuffer — code-resident rate card with a version stamp.

The dashboard’s cost tab uses this surface to project the impact of an instance-type swap before the operator commits to it.

Caveats and shapes

service is a coarse bucket: compute, storage_ebs, storage_s3, network, tpuf_storage, tpuf_writes, tpuf_queries. Network lines carry a service_detail (alb, nat_gateway) for further disambiguation; everything else leaves service_detail null.
monthly_projection_usd is a monthly projection, before proration, for the bucket at its current sizing. It is the operator’s “if we held this steady” number, not a forecast.
Turbopuffer lines always carry a rate_card_version so the dashboard can footnote when the upstream pricing changed.
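
The field shapes above translate into small client-side helpers. This is a sketch under the assumption that each line is a dict shaped like the sample in the snapshot response; `line_label` and `rate_card_versions` are hypothetical names.

```python
def line_label(line: dict) -> str:
    """Display label: 'service/service_detail' when a detail is set, else 'service'."""
    detail = line.get("service_detail")
    return f'{line["service"]}/{detail}' if detail else line["service"]

def rate_card_versions(lines: list) -> set:
    """Collect the distinct non-null rate_card_version values, e.g. for a footnote."""
    return {
        line["rate_card_version"]
        for line in lines
        if line.get("rate_card_version") is not None
    }
```

Collecting versions by non-null value rather than by provider name avoids guessing the exact `provider` string used for Turbopuffer lines.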