Pre-alpha · MVP / vibecoding phase — moving fast
stardelt

A self-hostable data platform for Kubernetes.

An opinionated collection of upstream OSS services — lakehouse SQL, distributed compute, streaming, notebooks and BI — composed by one declarative platform operator. Apache 2.0. Runs in your cluster.

No managed control plane. No phone-home at runtime. No license server.

Why stardelt

Built for teams that need the data to stay inside their cluster.

Managed data warehouses keep the control plane with the vendor and charge per credit. stardelt runs the whole stack inside your Kubernetes cluster, with the same engines, the same SQL and the same ML, operated by you.

Runs on any Kubernetes

Hyperscaler, sovereign cloud, on-prem, edge, air-gapped. The control plane is operators in your cluster — there is no SaaS side.

No outbound calls at runtime

No telemetry, no license check, no usage report. Each release ships with a documented network-egress matrix listing every call its components make.

OSI-permissive licenses

Apache 2.0 / MIT / BSD across the stack. Services on BSL, SSPL, ELv2 or AGPL are excluded by policy.
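
As an illustration only, a license policy like this could be enforced with a small allowlist check over SPDX identifiers. The function name and the component inventory below are hypothetical and not part of stardelt; only the SPDX identifiers themselves are real.

```python
# Minimal sketch of an allowlist-based license gate (hypothetical helper,
# not stardelt's actual tooling). Anything outside the allowlist is rejected,
# which covers BSL, SSPL, ELv2 and AGPL without listing them explicitly.
ALLOWED = {"Apache-2.0", "MIT", "BSD-2-Clause", "BSD-3-Clause"}


def violations(components: dict[str, str]) -> list[str]:
    """Return the names of components whose SPDX license is not allowlisted."""
    return sorted(name for name, spdx in components.items() if spdx not in ALLOWED)


# Illustrative inventory: component name -> SPDX identifier.
inventory = {
    "trino": "Apache-2.0",
    "example-db": "SSPL-1.0",       # made-up component, used to show a rejection
    "example-search": "Elastic-2.0",  # made-up component
}
print(violations(inventory))
```

An allowlist is used here instead of a blocklist so that a license nobody has reviewed yet fails closed.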

One CRD per workload

A small set of top-level CRDs (Lakehouse, Pipeline, StreamApp) is reconciled into resources for the per-service operators, so platform teams do not have to wire those operators together by hand.
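
To make the idea concrete, here is a rough sketch of what a Lakehouse custom resource might look like, built as a plain Python dict. Only the kind `Lakehouse` comes from the text above; the API group `stardelt.example/v1alpha1` and every field under `spec` are assumptions made for illustration.

```python
import json


def lakehouse_manifest(name: str, namespace: str, bucket: str) -> dict:
    """Build a hypothetical Lakehouse custom resource as a plain dict."""
    return {
        "apiVersion": "stardelt.example/v1alpha1",  # assumed group/version
        "kind": "Lakehouse",                        # kind named in the text
        "metadata": {"name": name, "namespace": namespace},
        "spec": {  # every field below is an illustrative assumption
            "catalog": {"type": "lakekeeper"},
            "storage": {"bucket": bucket},
            "sql": {"engine": "trino", "workers": 3},
        },
    }


manifest = lakehouse_manifest("analytics", "data", "lakehouse-warehouse")
print(json.dumps(manifest, indent=2))
```

Because `kubectl apply -f -` accepts JSON as well as YAML, output like this could be piped straight into a cluster once the real schema is known.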

stardelt Nova — single UI

SSO, catalog browser, lineage view, cost attribution, audit search, and deep-links into the underlying tools. One place to start for engineers, one place to look for auditors.

Open formats at rest

Tables are Iceberg + Parquet on object storage. Uninstalling stardelt leaves your data readable by any Iceberg-compatible engine — no proprietary metadata to peel off.

Architecture

An opinionated collection of services on a shared foundation.

Services selected as current for 2026, on a shared Iceberg + Lakekeeper foundation. Composed by the stardelt Operator and surfaced through stardelt Nova.

Lakehouse SQL

Interactive SQL over Iceberg tables on object storage.

Trino · Iceberg · Lakekeeper · SeaweedFS

Compute & orchestration

Spark Connect for distributed jobs, Airflow for scheduling.

Spark Connect · Airflow

Streaming

Kafka as the event backbone; Flink (opt-in) for stream processing.

Kafka (KRaft) · Flink

Notebooks & BI

JupyterHub for analysis, Superset for dashboards — both wired into Trino and Spark.

JupyterHub · Superset

Sovereignty

For environments where data has to stay in your perimeter.

Documented egress

Every release publishes a network-egress matrix. Every outbound call is listed; nothing undocumented at runtime.
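
One way such a matrix could be used in practice is to diff observed connections against it. The sketch below assumes a matrix made of (component, host, port) entries; that shape, the hostnames and the function name are hypothetical, since the published format is not described here.

```python
from typing import NamedTuple


class Egress(NamedTuple):
    component: str
    host: str
    port: int


# Hypothetical matrix entries; a real release would publish its own list.
DOCUMENTED = {
    Egress("trino", "s3.internal.example", 443),
    Egress("airflow", "git.internal.example", 443),
}


def undocumented(observed: list[Egress]) -> list[Egress]:
    """Return observed outbound calls that the matrix does not list."""
    return sorted(set(observed) - DOCUMENTED)


# Example: connections collected from network flow logs (made-up data).
observed = [
    Egress("trino", "s3.internal.example", 443),
    Egress("superset", "telemetry.example.com", 443),
]
for call in undocumented(observed):
    print(f"undocumented egress: {call.component} -> {call.host}:{call.port}")
```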

Air-gap install path

Helm/OLM bundles ship every image. Harbor (or any OCI registry) serves as the in-cluster mirror; no DNS resolution of external hosts is required at install or upgrade.
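
Mirroring generally means rewriting each image reference to point at the in-cluster registry. The following is a minimal sketch of that rewrite, assuming the mirror drops the upstream registry host and keeps the repository path; the mirror address and the function name are hypothetical, and a real Harbor layout may differ.

```python
def mirror_ref(image: str, mirror: str) -> str:
    """Rewrite an image reference to point at an in-cluster mirror registry."""
    first, sep, rest = image.partition("/")
    # By container-reference convention, the first path segment is a registry
    # host only if it contains a dot or a port, or is exactly "localhost".
    has_registry = bool(sep) and ("." in first or ":" in first or first == "localhost")
    if has_registry:
        path = rest                   # drop the upstream registry host
    elif sep:
        path = image                  # Docker Hub style "namespace/name"
    else:
        path = f"library/{image}"     # Docker Hub official image, e.g. "app:1.0"
    return f"{mirror}/{path}"


MIRROR = "harbor.internal.example/stardelt"  # assumed mirror address
for ref in ("registry.example.com/team/app:1.0", "example/app:1.0", "app:1.0"):
    print(ref, "->", mirror_ref(ref, MIRROR))
```

Tags and digests pass through unchanged because only the leading registry segment is inspected.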

Sovereign-cloud CI

CI runs installs on European sovereign-cloud Kubernetes (STACKIT, OVHcloud, IONOS, Hetzner, Open Telekom Cloud, Scaleway) alongside the hyperscalers.

No non-EU vendor in the default path

The default install pulls only from the Apache Software Foundation, CNCF, Linux Foundation and vendor-neutral projects. Useful when GDPR Article 28 or the US CLOUD Act is in scope.