Run stardelt locally on kind
This page walks through bringing the MVP stardelt lakehouse slice up on a
single-node kind cluster. The same Helm charts and manifests are used as in
production, just with values tuned for a laptop.
Before you start, make sure the tools listed under Prerequisites are on your
$PATH.
The stack you'll get
        ┌──────────────────────┐
        │       Airflow        │
        │  scheduler + api +   │
        │   dag-processor +    │
        │      triggerer       │
        │                      │
        │  KubernetesExecutor  │
        │   spawns task pods   │
        └──────────┬───────────┘
                   │ pyiceberg
                   ▼
       trino (coord + 1 worker)
                   │
               REST + S3
                   │
     ┌─────────────┼────────────┐
     ▼                          ▼
lakekeeper             seaweedfs (master,
(Iceberg REST           volume, filer, s3)
 catalog)                       │
     │                          │
CNPG lakekeeper-pg              │
(Postgres, 1 instance)  bucket: lakehouse
                          (PVC-backed)
All workloads live in namespace stardelt; the only exception is the CloudNative-PG operator, which runs in cnpg-system. The kind cluster is also named stardelt.
Bring the stack up
From the stardelt-demos/ repo:
make up # kind + cnpg + seaweedfs + lakekeeper + trino + airflow (~12 min cold)
make smoke # acceptance: CREATE/INSERT/SELECT through Trino
make airflow-trigger # trigger the nyc_taxi_load DAG (1 year, ~3 min)
make pf # port-forwards: 8081 Trino, 8181 Lakekeeper
make airflow-ui # port-forward Airflow UI to localhost:8088
make down # tear down the cluster
make up is idempotent — re-running it skips steps that already succeeded.
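One common way to get this skip-if-done behavior is to pair every step with a probe that reports whether it has already succeeded. The sketch below illustrates that pattern only; it is not taken from the repo's Makefile, and the step names and the in-memory `state` dict are hypothetical stand-ins for real checks such as "does the kind cluster exist".

```python
from typing import Callable, List, Tuple

# Each step pairs a "done?" probe with the action that performs it.
# Both callables are hypothetical stand-ins, not the real Makefile targets.
Step = Tuple[str, Callable[[], bool], Callable[[], None]]


def run_idempotent(steps: List[Step]) -> List[str]:
    """Run each step only if its probe says it has not succeeded yet.

    Returns the names of the steps that were actually executed.
    """
    executed = []
    for name, is_done, action in steps:
        if is_done():
            continue  # already succeeded on an earlier run: skip it
        action()
        executed.append(name)
    return executed


# Simulated state so the sketch runs without a cluster.
state = {"kind": True, "cnpg": False}
steps = [
    ("kind", lambda: state["kind"], lambda: state.update(kind=True)),
    ("cnpg", lambda: state["cnpg"], lambda: state.update(cnpg=True)),
]
first_run = run_idempotent(steps)   # only the step that was not done yet
second_run = run_idempotent(steps)  # nothing left to do
print(first_run, second_run)
```

The second call executes nothing because every probe now reports success, which is the property that makes re-running safe.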
What make up does
- kind cluster (deploy/kind-config.yaml): single node, host ports 8080/8081/8181 mapped to the host.
- CloudNative-PG operator (cnpg/cloudnative-pg) in namespace cnpg-system. Used by Lakekeeper for its metadata Postgres.
- SeaweedFS in stardelt: master + volume + filer + S3 gateway, trimmed for kind (1 replica each, replication 000). The lakehouse bucket is auto-created on install.
- S3 credentials Secret (deploy/manifests/stardelt-s3-creds.yaml): access-key, secret-key, endpoint, bucket, region consumed by Lakekeeper bootstrap and Trino's catalog config.
- Lakekeeper Postgres (postgresql.cnpg.io/Cluster lakekeeper-pg): single instance, 2 GiB.
- Lakekeeper: bundled Postgres + OpenFGA disabled, authz.backend: allowall, points at the CNPG Postgres via the lakekeeper-pg-app Secret.
- Lakekeeper warehouse bootstrap (deploy/manifests/lakekeeper-bootstrap.yaml): a Job that POSTs /management/v1/bootstrap and /management/v1/warehouse (creating warehouse on s3://lakehouse/warehouse). Idempotent.
- Trino: coordinator + 1 worker, 2 GiB heap each. The catalog warehouse is configured with iceberg.catalog.type=rest, REST URI = Lakekeeper, S3 endpoint = SeaweedFS, path-style access, credentials from stardelt-s3-creds.
- Apache Airflow: slim image plus the postgres, fab, and cncf-kubernetes providers; KubernetesExecutor; ships the nyc_taxi_load DAG that loads NYC TLC yellow-taxi Parquet into warehouse.nyc_taxi.yellow_trips via pyiceberg.
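To make the Trino step more concrete, here is a minimal sketch that renders a catalog properties file of the kind described above. The property names are those of Trino's Iceberg connector and its native S3 file system. The service URLs, the `/catalog` path, the region, and the credential values are assumptions for illustration; the real values come from the Helm values and the stardelt-s3-creds Secret.

```python
def render_trino_catalog(rest_uri, warehouse, s3_endpoint, region,
                         access_key, secret_key):
    """Render an Iceberg REST catalog definition as .properties text."""
    props = {
        "connector.name": "iceberg",
        "iceberg.catalog.type": "rest",
        "iceberg.rest-catalog.uri": rest_uri,
        "iceberg.rest-catalog.warehouse": warehouse,
        # Native S3 file system, pointed at an S3-compatible endpoint.
        "fs.native-s3.enabled": "true",
        "s3.endpoint": s3_endpoint,
        "s3.region": region,
        "s3.path-style-access": "true",
        "s3.aws-access-key": access_key,
        "s3.aws-secret-key": secret_key,
    }
    return "\n".join(f"{key}={value}" for key, value in props.items()) + "\n"


# All argument values below are hypothetical placeholders.
catalog_text = render_trino_catalog(
    rest_uri="http://lakekeeper:8181/catalog",   # assumed service name and path
    warehouse="warehouse",
    s3_endpoint="http://seaweedfs-s3:8333",      # assumed service name and port
    region="us-east-1",                          # assumed region
    access_key="EXAMPLE_ACCESS_KEY",
    secret_key="EXAMPLE_SECRET_KEY",
)
print(catalog_text)
```

Path-style access matters here because an S3-compatible gateway inside the cluster is normally addressed as `endpoint/bucket` rather than through per-bucket hostnames.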
When this finishes, verify with the smoke test.
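For readers who want to see what a CREATE/INSERT/SELECT check looks like without the Makefile, the sketch below drives Trino's HTTP client protocol (POST to `/v1/statement`, then follow `nextUri`) using only the standard library. It is not the implementation behind `make smoke`: the `smoke` schema, the table `t`, and the user name are hypothetical, and the base URL assumes the Trino port-forward on 8081 from `make pf`.

```python
import json
import urllib.request


def http_json(url, body=None, user="smoke"):
    """POST `body` (or GET when body is None) and decode the JSON reply."""
    data = body.encode("utf-8") if body is not None else None
    req = urllib.request.Request(url, data=data, headers={"X-Trino-User": user})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def run_query(base_url, sql, fetch=http_json):
    """Submit one statement and follow nextUri until no more pages remain."""
    page = fetch(f"{base_url}/v1/statement", sql)
    rows = []
    while True:
        if "error" in page:
            raise RuntimeError(page["error"].get("message", "query failed"))
        rows.extend(page.get("data", []))
        next_uri = page.get("nextUri")
        if next_uri is None:
            return rows
        page = fetch(next_uri)


# Hypothetical smoke statements against the `warehouse` catalog.
SMOKE_STATEMENTS = [
    "CREATE SCHEMA IF NOT EXISTS warehouse.smoke",
    "CREATE TABLE IF NOT EXISTS warehouse.smoke.t (id integer, name varchar)",
    "INSERT INTO warehouse.smoke.t VALUES (1, 'ok')",
    "SELECT id, name FROM warehouse.smoke.t",
]


def smoke(base_url="http://localhost:8081", fetch=http_json):
    """Run the statements in order and return the rows of the last one."""
    rows = []
    for sql in SMOKE_STATEMENTS:
        rows = run_query(base_url, sql, fetch)
    return rows
```

With the port-forward active, calling `smoke()` should return the rows of the final SELECT. Because the INSERT is not deduplicated, repeated runs of this sketch add one row each time.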
Inspecting the stack
kubectl -n stardelt get pods # all should be Running/Ready
kubectl -n stardelt logs deploy/trino-coordinator -f
kubectl -n stardelt logs deploy/lakekeeper -f
kubectl -n stardelt exec -it deploy/trino-coordinator -- trino # interactive SQL
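The "all should be Running/Ready" check can also be scripted. The following is a rough sketch, assuming `kubectl` is on `$PATH` and pointed at the kind cluster; the function names are illustrative. It reads `kubectl get pods -o json` and lists pods whose Ready condition is not true, treating pods in phase Succeeded (for example a completed Job pod) as fine.

```python
import json
import subprocess


def not_ready(pod_list):
    """Return the names of pods that are neither Ready nor Succeeded."""
    pending = []
    for pod in pod_list.get("items", []):
        status = pod.get("status", {})
        if status.get("phase") == "Succeeded":
            continue  # completed Job pods are expected to stop
        ready = any(
            cond.get("type") == "Ready" and cond.get("status") == "True"
            for cond in status.get("conditions", [])
        )
        if not ready:
            pending.append(pod["metadata"]["name"])
    return pending


def fetch_pods(namespace="stardelt"):
    """Fetch the pod list for a namespace as parsed JSON via kubectl."""
    result = subprocess.run(
        ["kubectl", "-n", namespace, "get", "pods", "-o", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)
```

Calling `not_ready(fetch_pods())` against a healthy stack should return an empty list; any names it does return are good candidates for `kubectl -n stardelt describe pod`.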