Start with install notes or jump straight into the API.

Guides

UDFs

A UDF is a stateless worker that preserves row count: one input row produces one derived attribute on the same row. Embeddings, classifications, tags, and attribute backfills all use the same primitive.

Use a pipeline when external data becomes rows, or when one row fans out into many rows. Use a UDF when rows already in Layer acquire derived attributes.


Gateway                              Worker Deployment
| create ID scan                     |
| POST /v2/namespaces/{ns}/scans     |
| filters: spec.filter               |
|                                     |
| enqueue (namespace, id) rows        |
| into the UDF queue                  |
|                                     |
| <----- POST /v2/udfs/{id}/claim ---|
| -----> rows + input columns ------>|
|                                     |  fn(*, id, title) -> list[str]
| <- POST /v2/udfs/{id}/items/complete
|                                     |
| writeback: Turbopuffer patch_columns

Author a worker

The Python SDK turns a normal function into the claim/process/complete loop.

import asyncio
from hevlayer.udf import PermanentError, TransientError, run_udf_worker, udf


@udf(inputs=["id", "title", "description"], output="tags", kind="tags")
def tag_product(*, id: str, title: str | None, description: str | None) -> list[str]:
    if not title:
        raise PermanentError(f"{id}: missing title")
    try:
        text = f"{title} {description or ''}".lower()
    except TypeError as exc:
        raise TransientError(str(exc)) from exc

    tags: list[str] = []
    if "wireless" in text:
        tags.append("wireless")
    if "waterproof" in text:
        tags.append("waterproof")
    return tags or ["uncategorized"]


if __name__ == "__main__":
    asyncio.run(run_udf_worker(tag_product, udf_id="product-tags"))

Function parameters are keyword-only and named to match inputs. Raise TransientError for retryable work and PermanentError for unrecoverable input.

Declare the function

Apply a Function CRD. The operator emits a worker Deployment, optional Service for push dispatch, and a KEDA ScaledObject from spec.scaling. The gateway uses the Function spec to register the UDF queue and discovery policy.

apiVersion: hevlayer.com/v1alpha1
kind: Function
metadata:
  name: product-tags
  namespace: hev-shop
spec:
  paused: false
  targetNamespaces:
    - amazon-products
  inputs:
    - id
    - title
    - description
  output:
    attribute: tags
    kind: tags
    version: v1
  filter:
    - "Or"
    - - ["tags_v", "NotEq", "v1"]
      - ["tags_v", "Eq", null]
  triggers:
    - discovery
  worker:
    image: ghcr.io/hev/hev-shop-udf-product-tags:latest
    dispatch: pull
    batchSize: 16
    timeoutSeconds: 30
  schedule:
    discoveryIntervalSeconds: 300
    leaseSeconds: 120
    maxInFlightBatches: 4
    maxConcurrentScans: 1
  retry:
    maxAttempts: 6
    initialBackoffSeconds: 5
    maxBackoffSeconds: 300
  scaling:
    pool: cpu
    mode: autoscale
    replicas:
      min: 0
      max: 4

spec.filter is the same JSON tuple syntax used in Turbopuffer queries. The 0.1 CRD preserves array-form filters, so compound expressions like the example above can be applied directly.

The worker pod receives HEVLAYER_UDF_ID, HEVLAYER_BASE_URL, HEVLAYER_UDF_BATCH_SIZE, HEVLAYER_UDF_TIMEOUT_SECONDS, HEVLAYER_UDF_LEASE_SECONDS, and LAYER_GATEWAY_API_KEY.

The CRD is the source of truth for the worker shape. Use POST /v2/udfs/{id}/discover, claim, and complete only for runtime coordination and manual recovery; do not create a separate Deployment for the same function unless you also take over scaling and placement.

Scaling and placement

spec.scaling is the scaling contract for the Function worker.

FieldPurpose
poolName of a compute pool in InfraRules/default.
modeautoscale, fixed, or disabled.
replicas.minMinimum worker replicas. Use 1 for warm workers.
replicas.maxMaximum worker replicas; must not exceed the pool cap.

InfraRules owns shared placement: node selectors, tolerations, resource requests, and per-workload replica ceilings. Workload specs choose a pool; they do not repeat placement rules.

For extra pod-level config, set spec.worker.podSpec. It is deep-merged into the operator pod spec. Container array overrides are not merged.

Gateway API

In Kubernetes installs the Function CRD is the source of truth and the runtime API below is registered from it. The routes are the same surface the Python SDK drives, and the path you reach for to register a UDF without the operator or to coordinate and recover workers by hand.

Spec routes

RouteBehavior
POST /v2/udfsCreate a UDF definition and queue.
GET /v2/udfsList UDFs.
GET /v2/udfs/{id}Read a UDF.
DELETE /v2/udfs/{id}Delete a UDF and its queue (does not delete written output).
GET /v2/udfs/{id}/statusQueue depth, in-flight, failed counts.

The create body carries the same shape the CRD spec expresses:

POST /v2/udfs
Content-Type: application/json

{
  "id": "product-tags",
  "spec": {
    "target_namespaces": ["amazon-products"],
    "inputs": ["id", "title", "description"],
    "output": {"attribute": "tags", "kind": "tags", "version": "v1"},
    "filter": ["Or", ["tags_v", "NotEq", "v1"], ["tags_v", "Eq", null]],
    "triggers": ["discovery"],
    "worker": {
      "image": "ghcr.io/hev/hev-shop-udf-product-tags:latest",
      "port": 8080,
      "batch_size": 16,
      "timeout_seconds": 30
    },
    "schedule": {
      "discovery_interval_seconds": 300,
      "lease_seconds": 120,
      "max_in_flight_batches": 4,
      "max_concurrent_scans": 1
    },
    "retry": {"max_attempts": 6, "initial_backoff_seconds": 5, "max_backoff_seconds": 300}
  }
}

Lifecycle routes

RouteBehavior
POST /v2/udfs/{id}/pauseStop both discovery and dispatch. Workers drain in-flight then idle.
POST /v2/udfs/{id}/resumeResume discovery and dispatch.
POST /v2/udfs/{id}/reset-failedMove every row in failed back to pending.
POST /v2/udfs/{id}/discoverTrigger a discovery sweep immediately.

reset-failed is the recovery path after a transient upstream incident — for permanent issues, fix the input shape or bump spec.output.version and re-apply.

Worker coordination routes

RouteBehavior
POST /v2/udfs/{id}/claimClaim a batch of rows for processing.
POST /v2/udfs/{id}/items/heartbeatExtend the lease on in-flight items.
POST /v2/udfs/{id}/items/completeReport success and persist output.
POST /v2/udfs/{id}/items/failReport failure (transient or permanent).

The Python SDK’s run_udf_worker implements the full loop — most workloads should never call these routes directly.

POST /v2/udfs/product-tags/items/complete
Content-Type: application/json

{
  "worker_id": "udf-worker-0",
  "items": [
    {"namespace": "amazon-products", "id": "asin-B08N5WRWNW", "output": ["wireless", "waterproof"]}
  ]
}

claim returns the batch as (namespace, id) pairs alongside the input columns the spec declared. Rows the gateway can’t bind from the index (missing required inputs) surface as bind errors, not silent skips, so the worker can fail them explicitly rather than retry forever. On fail, kind: transient honors spec.retry while kind: permanent dead-letters immediately — the SDK derives kind from TransientError / PermanentError.

Writeback and discovery

UDF outputs are patched onto the target row as the named attribute. output.kind is an SDK type hint; writeback semantics are the same for tags, classifications, scalars, and vectors. When spec.output.version is set, the gateway atomically writes the output and the matching {attribute}_v marker in a single patch.

Discovery sweeps create an ID scan with spec.filter against each target_namespace. Returned IDs are enqueued and deduplicated. The first sweep after create/apply is implicit; subsequent sweeps run on schedule.discovery_interval_seconds.

Lifecycle

kubectl get function product-tags
kubectl describe function product-tags

curl -H "authorization: Bearer $LAYER_GATEWAY_API_KEY" \
  $LAYER_GATEWAY_URL/v2/udfs/product-tags/status

kubectl patch function product-tags --type=merge -p '{"spec":{"paused":true}}'
kubectl patch function product-tags --type=merge -p '{"spec":{"paused":false}}'

curl -X POST -H "authorization: Bearer $LAYER_GATEWAY_API_KEY" \
  $LAYER_GATEWAY_URL/v2/udfs/product-tags/reset-failed

kubectl delete function product-tags

Deletion garbage-collects the operator-managed Deployment, Service, and ScaledObject. Written outputs are not deleted.

Version markers

spec.output.version is the re-run safety rail. When set, the gateway stamps {attribute}_v alongside every output write. Bump the version and keep the canonical stale filter when a model, taxonomy, or prompt changes.

Tuning knobs

KnobWhat it bounds
worker.batchSizeRows per worker batch.
worker.timeoutSecondsWorker call timeout.
schedule.leaseSecondsHow long a claim is held before reissue.
schedule.discoveryIntervalSecondsTime between discovery scan jobs.
schedule.maxInFlightBatchesConcurrent worker batches per UDF.
schedule.maxConcurrentScansConcurrent namespace discovery jobs.
retry.maxAttemptsTries before a row lands in failed.

Not in 0.1

  • Cross-namespace aggregate UDFs.
  • Chunkers or fan-out transforms; those remain pipelines.
  • Multi-output UDFs.
  • Managed image builds.