# Homelab
A 6-node Kubernetes cluster across Proxmox VMs and bare-metal Intel NUCs, deployed with Kubespray and managed end-to-end by Flux.
A self-hosted platform running in my basement. The goal was to run things the same way I would at work — GitOps, OIDC everywhere, secrets that never touch git — but on hardware I own and look after alone.
## Topology
Three control-plane nodes virtualised on a Proxmox host, three bare-metal Intel NUC workers, all on a dedicated lab VLAN behind an OPNsense router.
| Component | Version / config |
|---|---|
| Kubernetes | v1.34.3, deployed via Kubespray v2.30.0 (kubeadm under the hood) |
| Container runtime | containerd 2.2.1 |
| CNI | Calico (VXLAN, MTU 1450) |
| kube-proxy | IPVS mode (strict_arp: true) |
| Load balancing | MetalLB in L2 mode |
| Ingress | Envoy Gateway via the Gateway API |
| Certificates | cert-manager + Cloudflare DNS-01 |
| OS | Ubuntu 24.04 across all nodes |
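Two of those rows interact: kube-proxy's IPVS mode needs `strict_arp: true`, otherwise MetalLB's L2 speakers can't answer ARP for the service VIPs. A minimal sketch of the pool and advertisement; the address range is an assumption, not my real VLAN:

```yaml
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: lab-pool
  namespace: metallb-system
spec:
  addresses:
    - 10.20.0.240-10.20.0.250   # placeholder range on the lab VLAN
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: lab-l2
  namespace: metallb-system
spec:
  ipAddressPools:
    - lab-pool
```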
## GitOps with Flux

Every cluster-side change is driven from a Git repo. Flux reconciles four
layered Kustomizations with explicit `dependsOn` ordering, so the apply
sequence is fixed and I don't have to think about it.
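A sketch of what one layer's Kustomization CR looks like; the name, path, and interval here are illustrative, not copied from the repo:

```yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: infra-configs
  namespace: flux-system
spec:
  dependsOn:
    - name: infra-controllers  # configs only apply once the controllers layer is healthy
  interval: 10m
  path: ./infrastructure/configs
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system
```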
Repo layout:
clusters/homelab/ # Flux entry point — Kustomization CRs
infrastructure/
sources/ # HelmRepository definitions
controllers/ # HelmReleases for the cluster's runtime pieces
configs/ # post-install config (Gateway, IPAddressPool, RBAC, Kyverno policies)
apps/ # application workloads
.sops.yaml # encryption rules for secrets
Flux also runs image update automation — it watches ImagePolicy resources
across the cluster and commits updated image tags back to the repo automatically.
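Concretely, each tracked app has an ImageRepository/ImagePolicy pair. A hypothetical policy (the name and semver range are mine, and the API version depends on the Flux release):

```yaml
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImagePolicy
metadata:
  name: some-app
  namespace: flux-system
spec:
  imageRepositoryRef:
    name: some-app      # the ImageRepository that scans the registry for tags
  policy:
    semver:
      range: ">=1.0.0"
```

The tag to rewrite is flagged in the workload manifest with a marker comment, `# {"$imagepolicy": "flux-system:some-app"}`, which is how the automation controller knows which line to commit back.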
## Secrets: Vault + VSO, SOPS for the rest
Workload secrets are sourced from HashiCorp Vault running on the cluster.
The current pattern uses the Vault Secrets Operator
— each namespace that needs secrets gets a VaultAuth and one or more
VaultStaticSecret CRs, which VSO reconciles into native Kubernetes Secret
objects on a continuous sync loop. I wrote about migrating to this from a
CronJob-based pattern if you want the
full before/after.
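The per-namespace pair looks roughly like this; the mount, role, and path names are placeholders, not my actual Vault layout:

```yaml
apiVersion: secrets.hashicorp.com/v1beta1
kind: VaultAuth
metadata:
  name: default
  namespace: my-app
spec:
  method: kubernetes
  mount: kubernetes          # Vault's Kubernetes auth mount
  kubernetes:
    role: my-app
    serviceAccount: default
---
apiVersion: secrets.hashicorp.com/v1beta1
kind: VaultStaticSecret
metadata:
  name: my-app-env
  namespace: my-app
spec:
  vaultAuthRef: default
  mount: kv
  type: kv-v2
  path: my-app/env
  refreshAfter: 60s
  destination:
    name: my-app-env         # the native Secret VSO keeps in sync
    create: true
```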
Vault itself runs on the cluster, which is the obvious trade-off: if Vault is down, workloads that depend on secret sync can't start. At this scale it's an acceptable risk — the cluster is stable and Vault's HA mode would add complexity I don't need yet.
Secrets that VSO can't reach (Flux itself, HelmRelease values, anything
without a namespace to target) are encrypted in git with SOPS and age.
The cluster holds the age private key as a sops-age Secret in flux-system,
and Flux's kustomize-controller decrypts on the fly during reconciliation.
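Each Kustomization that touches encrypted files carries a decryption stanza pointing at that key; a sketch, reusing the illustrative layer from above:

```yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: infra-configs
  namespace: flux-system
spec:
  interval: 10m
  path: ./infrastructure/configs
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system
  decryption:
    provider: sops
    secretRef:
      name: sops-age   # holds the age private key
```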
## Authentication: OIDC via Authentik
The API server accepts client certificates (the default break-glass kubeconfig)
and OIDC tokens via Authentik. kubectl uses
kubelogin as an exec credential plugin
to drive the authorization-code flow in a browser; group claims map to RBAC
ClusterRoles via two simple bindings:
| Authentik role | Kubernetes group | RBAC |
|---|---|---|
| app-kubernetes-admin | oidc:admin | cluster-admin |
| app-kubernetes-user | oidc:user | view |
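The bindings themselves are small. The admin one, assuming the `oidc:` prefix comes from the API server's `--oidc-groups-prefix`:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: oidc-cluster-admin
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - apiGroup: rbac.authorization.k8s.io
    kind: Group
    name: oidc:admin    # the prefixed group claim from Authentik
```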
The OIDC flags are configured via Kubespray group vars, so the configuration moves with the cluster.
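Roughly what that group vars file contains; the variable names follow Kubespray's k8s-cluster vars, and the URL and client ID are placeholders:

```yaml
kube_oidc_auth: true
kube_oidc_url: https://auth.example.com/application/o/kubernetes/
kube_oidc_client_id: kubernetes
kube_oidc_username_claim: email
kube_oidc_groups_claim: groups
kube_oidc_groups_prefix: "oidc:"
```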
## Provisioning chain
The cluster doesn't bootstrap itself. A separate provisioning repo handles
everything below the apiserver:
| Tool | Purpose |
|---|---|
| Packer | Build the Ubuntu 24.04 Proxmox VM template |
| Terraform | Create Proxmox VMs and manage Vault, Authentik, Harbor, GitHub, Okta resources |
| Kubespray | Bootstrap and upgrade the Kubernetes control plane and workers |
| Flux | Take over once the cluster is up — everything else is GitOps |
There's also Kyverno running as an admission controller
with a handful of cluster-wide policies: no latest tags, no privilege
escalation, required labels, required resource requests. Not exhaustive, but
enough to catch the lazy mistakes.
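The `:latest` rule, for flavour, modeled on Kyverno's stock disallow-latest-tag sample (the real policies live under `infrastructure/configs/`):

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: disallow-latest-tag
spec:
  validationFailureAction: Enforce
  rules:
    - name: require-image-tag
      match:
        any:
          - resources:
              kinds: [Pod]
      validate:
        message: "Images must specify an explicit tag."
        pattern:
          spec:
            containers:
              - image: "*:*"        # an image with no tag defaults to :latest
    - name: disallow-latest
      match:
        any:
          - resources:
              kinds: [Pod]
      validate:
        message: "The ':latest' tag is not allowed."
        pattern:
          spec:
            containers:
              - image: "!*:latest"
```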
## Why not cloud?
Partly cost — I didn't want a recurring cloud bill for something I run at home. But mostly I wanted something physical I could actually work on, upgrade over time, and break without consequences. There's a difference between reading about NIC offloading issues and having to fix one because your node just fell off the network.
## Notable gotchas
**Wildcard DNS + `ndots: 5`.** A `*.sperring.io` record pointing at the
reverse proxy combined with Kubernetes' default `ndots: 5` and the node's
search domain caused pods to resolve external hostnames through the
wildcard, returning the wrong IP for anything with fewer than five dots.
Fixed by pointing kubelet at a clean `/etc/kubernetes/resolv.conf` with no
search domains.
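The shape of the fix, sketched: a resolv.conf with upstream servers only and no search line, plus the matching KubeletConfiguration field (the nameserver is a placeholder):

```yaml
# /etc/kubernetes/resolv.conf contains only upstream servers, e.g.:
#   nameserver 10.20.0.1
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
resolvConf: /etc/kubernetes/resolv.conf
```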
**Proxmox e1000e NIC hangs.** The onboard Intel I219 NIC on the Proxmox host
hangs under load and takes the whole node off the network with it. Mitigated
by disabling TX/RX offloading (`ethtool -K nic0 tso off gso off gro off`),
made persistent via the pve-base Ansible role.
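One way to sketch that persistence, as an Ansible task dropping a oneshot unit; the interface and unit names are assumptions, not the role's actual contents:

```yaml
- name: Install a oneshot unit that disables NIC offloading
  ansible.builtin.copy:
    dest: /etc/systemd/system/nic-offload-off.service
    content: |
      [Unit]
      Description=Disable TSO/GSO/GRO on the onboard e1000e NIC
      After=network.target

      [Service]
      Type=oneshot
      ExecStart=/usr/sbin/ethtool -K nic0 tso off gso off gro off

      [Install]
      WantedBy=multi-user.target

- name: Enable the unit
  ansible.builtin.systemd:
    name: nic-offload-off.service
    enabled: true
    daemon_reload: true
```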
## Stack
Kubernetes, Flux, Calico, MetalLB, cert-manager, Vault, VSO, Authentik, Kyverno, Prometheus, Grafana, Loki.