Setting up worker nodes

1. Update System

sudo apt update && sudo apt upgrade -y
sudo apt install apt-transport-https ca-certificates curl gnupg lsb-release -y

2. Install containerd

Kubernetes dropped the Docker runtime shim (dockershim) in v1.24, so containerd is the recommended container runtime here.

# Install containerd
sudo apt install containerd -y

# Create default config
sudo mkdir -p /etc/containerd
sudo containerd config default | sudo tee /etc/containerd/config.toml

# Set SystemdCgroup to true (important for k8s)
sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml

# Restart containerd
sudo systemctl restart containerd
sudo systemctl enable containerd
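
A quick sanity check that the cgroup change stuck and containerd restarted cleanly (optional):

grep SystemdCgroup /etc/containerd/config.toml   # should print SystemdCgroup = true
systemctl is-active containerd                   # should print "active"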

3. Install Kubernetes Tools (kubeadm, kubelet, kubectl)

Add the Kubernetes apt repository

Run the following:

sudo apt install -y apt-transport-https ca-certificates curl
sudo mkdir -p /etc/apt/keyrings
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.29/deb/Release.key |
sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo "deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.29/deb/ /" |
sudo tee /etc/apt/sources.list.d/kubernetes.list > /dev/null
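
To confirm the new repository resolves before installing (optional check; if it errors, see the notes below):

sudo apt update
apt-cache policy kubeadm   # should list candidate versions coming from pkgs.k8s.io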

(optional 1) Removing the old repository in case of an error

To remove the repository:

sudo vi /etc/apt/sources.list.d/kubernetes.list

Remove or comment out the problematic entry:

# deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.29/deb/ /

Save the file and exit


(optional 2) Checking other repository files if the error persists

If the error still persists, it's worth checking other places where a Kubernetes repository might be listed.

Check Other Files in /etc/apt/sources.list.d/:

ls /etc/apt/sources.list.d/

Then edit any other relevant files using nano or vi.

Output:

macc@craftlab:~$ ls /etc/apt/sources.list.d/
archive_uri-http_apt_kubernetes_io_-noble.list  kubernetes.list  ubuntu.sources  ubuntu.sources.curtin.orig

Notice the deprecated file archive_uri-http_apt_kubernetes_io_-noble.list. Remove it, or comment out its entries:

sudo vi /etc/apt/sources.list.d/archive_uri-http_apt_kubernetes_io_-noble.list

Remove or comment out the problematic entry:

# deb http://apt.kubernetes.io/ kubernetes-xenial main
# deb-src http://apt.kubernetes.io/ kubernetes-xenial main

Install kubelet, kubeadm, kubectl

Run the following:

sudo apt update

If the deprecated repository is still listed, apt update fails with an error like this (that is what the optional steps above fix):

Err:5 https://packages.cloud.google.com/apt kubernetes-xenial Release
  404  Not Found [IP: 142.250.138.100 443]
...
E: The repository 'http://apt.kubernetes.io kubernetes-xenial Release' does not have a Release file.

Once apt update runs cleanly, install the packages:

sudo apt install -y kubelet kubeadm kubectl

Prevent automatic updates/removals of Kubernetes packages:

sudo apt-mark hold kubelet kubeadm kubectl
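
To confirm the hold took effect:

apt-mark showhold   # should list kubeadm, kubectl and kubelet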

4. Disable Swap

sudo swapoff -a
sudo sed -i '/ swap / s/^/#/' /etc/fstab
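
A quick check that swap is really off:

swapon --show            # no output means no active swap
free -h | grep -i swap   # the Swap line should show 0B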

5. Join the Cluster

On the master node, run:

kubeadm token create --print-join-command

It will give you something like:

kubeadm join 192.168.1.10:6443 --token abc123.def456ghi789 \
    --discovery-token-ca-cert-hash sha256:xxxxxxxxxxxxxxxx

Run that command on the worker node as root:

sudo kubeadm join 192.168.1.10:6443 --token abc123.def456ghi789 \
    --discovery-token-ca-cert-hash sha256:xxxxxxxxxxxxxxxx

In case of an error, run the following on the worker node:

# 1) Load required kernel modules now and on boot
sudo tee /etc/modules-load.d/k8s.conf >/dev/null <<'EOF'
overlay
br_netfilter
EOF
sudo modprobe overlay
sudo modprobe br_netfilter

# 2) Set required sysctl params now and persist
sudo tee /etc/sysctl.d/99-kubernetes-cri.conf >/dev/null <<'EOF'
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF
sudo sysctl --system

# (Optional sanity checks)
lsmod | grep br_netfilter || echo "br_netfilter NOT loaded"
cat /proc/sys/net/bridge/bridge-nf-call-iptables
cat /proc/sys/net/ipv4/ip_forward

You should see br_netfilter in the lsmod output and both sysctl values printed as 1.

Also double‑check swap is disabled (K8s requirement):

sudo swapoff -a
grep -E "swap" /etc/fstab   # should return nothing uncommented

Now re‑run your join command (exactly as printed by the control plane):

sudo kubeadm join 192.168.1.10:6443 --token abc123.def456ghi789 \
    --discovery-token-ca-cert-hash sha256:xxxxxxxxxxxxxxxx

If the error persists, run the following on the control node:

# 1) Make sure the cluster is healthy
kubectl get nodes

# 2) (Re)create a fresh bootstrap token and ensure the cluster-info JWS is present
sudo kubeadm init phase bootstrap-token

# 3) Print a fresh join command (valid for 24h; adjust TTL if you want)
sudo kubeadm token create --ttl 24h --print-join-command

On the worker node, reset any partial state, then join again:

# 1) Reset any partial kubeadm state
sudo kubeadm reset -f

# 2) (Optional) clean CNI leftovers if they exist
sudo rm -rf /etc/cni/net.d/*

# 3) Re-run the *new* join command you copied from the master
sudo kubeadm join <master-ip>:6443 --token <new-token> --discovery-token-ca-cert-hash sha256:<hash>

You should see:

[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

To verify, run kubectl get nodes on the control plane:

$ kubectl get nodes
NAME         STATUS     ROLES           AGE   VERSION
craftlab     Ready      control-plane   95d   v1.29.15
craftnode1   NotReady   <none>          7s    v1.29.15

What are Taints? (optional)

A taint marks a node so that pods are not scheduled onto it unless they tolerate the taint; kubeadm taints the control-plane node by default so regular workloads land on the workers.

Command:

kubectl describe node <node>

shows the taints, if any (look for the Taints: line near the top of the output).
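
For example, on a kubeadm cluster the control-plane node normally carries a NoSchedule taint (node names below match this walkthrough; adjust to yours):

$ kubectl describe node craftlab | grep -i taints
Taints:             node-role.kubernetes.io/control-plane:NoSchedule

# Only if you want regular workloads on the control plane too (not generally recommended):
kubectl taint nodes craftlab node-role.kubernetes.io/control-plane:NoSchedule-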

6. Installing Helm

What is Helm?

Helm is the package manager for Kubernetes: it bundles manifests into versioned charts that you can install, upgrade, and roll back as a single release.

Installation steps

Run these commands on the master node (where you run kubectl):

# Download Helm install script
curl -fsSL https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash

That will download the right Helm 3 release for your platform, verify its checksum, and install the helm binary into /usr/local/bin.

Verify Installation

$ helm version
version.BuildInfo{Version:"v3.x.x", GitCommit:"...", ...}

Add Common Helm Repositories

helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo add grafana https://grafana.github.io/helm-charts
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update

Quick Test

Deploy something simple to test Helm:

kubectl create ns test
helm install my-nginx bitnami/nginx -n test

Output:

NAME: my-nginx
LAST DEPLOYED: Sun Aug 17 12:44:05 2025
NAMESPACE: test
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
CHART NAME: nginx
CHART VERSION: 21.1.22
APP VERSION: 1.29.1

⚠ WARNING: Since August 28th, 2025, only a limited subset of images/charts are available for free.
    Subscribe to Bitnami Secure Images to receive continued support and security updates.
    More info at https://bitnami.com and https://github.com/bitnami/containers/issues/83267

** Please be patient while the chart is being deployed **
NGINX can be accessed through the following DNS name from within your cluster:

    my-nginx.test.svc.cluster.local (port 80)

To access NGINX from outside the cluster, follow the steps below:

1. Get the NGINX URL by running these commands:

  NOTE: It may take a few minutes for the LoadBalancer IP to be available.
        Watch the status with: 'kubectl get svc --namespace test -w my-nginx'

    export SERVICE_PORT=$(kubectl get --namespace test -o jsonpath="{.spec.ports[0].port}" services my-nginx)
    export SERVICE_IP=$(kubectl get svc --namespace test my-nginx -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
    echo "http://${SERVICE_IP}:${SERVICE_PORT}"

WARNING: There are "resources" sections in the chart not set. Using "resourcesPreset" is not recommended for production. For production installations, please set the following values according to your workload needs:
  - cloneStaticSiteFromGit.gitSync.resources
  - resources
+info https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/

Verify

$ kubectl get pods -n test
NAME                        READY   STATUS    RESTARTS   AGE
my-nginx-6b895f4c9d-kqx76   1/1     Running   0          30s

When done:

helm uninstall my-nginx -n test
kubectl delete ns test

7. Set Up StorageClass / PVC

What is a StorageClass / PVC (and why you need it)

A PersistentVolumeClaim (PVC) is how a pod asks for storage; a StorageClass tells the cluster how to provision that storage dynamically, so apps like Grafana or Uptime Kuma can request a volume without you creating PersistentVolumes by hand.

For a small two-node cluster, the simplest dynamic provisioner is local-path-provisioner (by Rancher). It creates a directory on the node’s disk and binds the pod to that node. It’s fast and dead simple.

Limitations of local-path:

Installing a default StorageClass (local-path)

Install local-path-provisioner

kubectl apply -f https://raw.githubusercontent.com/rancher/local-path-provisioner/master/deploy/local-path-storage.yaml

Make it the default StorageClass

kubectl annotate storageclass local-path storageclass.kubernetes.io/is-default-class="true" --overwrite

Verify it’s running

kubectl get pods -n local-path-storage
kubectl get storageclass

Output:

NAME                                      READY   STATUS    RESTARTS   AGE
local-path-provisioner-7599fd788b-68t7w   1/1     Running   0          52s

NAME                   PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
local-path (default)   rancher.io/local-path   Delete          WaitForFirstConsumer   false                  66s

You should see the controller pod Running and local-path (default) in the SC list.
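
To confirm dynamic provisioning works end to end, you can apply a throwaway PVC plus a pod that writes to it (the names and the 100Mi size here are arbitrary):

cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-pvc
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 100Mi
---
apiVersion: v1
kind: Pod
metadata:
  name: test-pvc-pod
spec:
  containers:
  - name: writer
    image: busybox:1.36
    command: ["sh","-c","echo hello > /data/hello.txt && sleep 3600"]
    volumeMounts:
    - { name: data, mountPath: /data }
  volumes:
  - name: data
    persistentVolumeClaim: { claimName: test-pvc }
EOF

# The PVC stays Pending until the pod is scheduled (WaitForFirstConsumer), then binds:
kubectl get pvc test-pvc
kubectl get pod test-pvc-pod

# Clean up
kubectl delete pod test-pvc-pod
kubectl delete pvc test-pvc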

8. Installing Metrics Server

What is Metrics Server?

Metrics Server scrapes CPU and memory usage from each node's kubelet and exposes it through the Kubernetes Metrics API.

With Metrics Server you get kubectl top nodes / kubectl top pods, and the Horizontal Pod Autoscaler has the resource metrics it needs.

Installing Metrics Server

The Metrics Server project provides ready-to-use manifests. On your control node:

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

Verify

Check that the pod is running:

$ kubectl -n kube-system get pods | grep metrics-server
metrics-server-856f767b-f6mgb      0/1     Running   0              58s

Test it:

kubectl top nodes
kubectl top pods -A

Error reading certificates

Metrics Server is trying to scrape each kubelet by IP (e.g., https://192.168.101.187:10250/metrics/resource), but the kubelet’s serving certificate doesn’t include IP SubjectAltNames. That’s common in homelabs. The two practical fixes are:

  1. Easiest (what most homelabs do): tell Metrics Server to skip kubelet TLS verification and prefer DNS/hostname when possible.
  2. Hard-mode / “proper” fix: re-issue kubelet serving certs with IP SANs (involves kubelet serving cert rotation & cluster CSR approval). Not worth it here.

Run these on your control node:

# Add the kubelet flag to the metrics-server container
# (kubectl has no "set args" subcommand, so append the flag with a JSON patch instead;
#  --kubelet-insecure-tls alone skips the SAN check, and the Helm install below also
#  sets the preferred address types)
kubectl -n kube-system patch deploy metrics-server --type=json -p='[
  {"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"--kubelet-insecure-tls"}
]'

If you'd rather wipe the manifest-based install and switch to Helm, remove the old objects first (safe even if some aren't present):

# Stop anything running
kubectl -n kube-system scale deploy metrics-server --replicas=0 --timeout=30s || true

# Remove old objects that might conflict
kubectl -n kube-system delete deploy metrics-server --ignore-not-found
kubectl -n kube-system delete svc metrics-server --ignore-not-found
kubectl delete apiservice v1beta1.metrics.k8s.io --ignore-not-found

# (Optional) If you applied components.yaml earlier and want to fully reset:
# kubectl delete -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml --ignore-not-found=true

Install via Helm (recommended)

# Add the official repo
helm repo add metrics-server https://kubernetes-sigs.github.io/metrics-server/
helm repo update

# Install with homelab-friendly flags
helm upgrade --install metrics-server metrics-server/metrics-server \
  --namespace kube-system \
  --set args="{--kubelet-insecure-tls,--kubelet-preferred-address-types=InternalDNS\,Hostname\,InternalIP\,ExternalDNS\,ExternalIP}" \
  --set replicas=1

Verify

kubectl -n kube-system rollout status deploy/metrics-server
kubectl -n kube-system get pods -l k8s-app=metrics-server -o wide
kubectl get apiservice | grep metrics
kubectl describe apiservice v1beta1.metrics.k8s.io | sed -n '1,120p'   # look for Available=True

kubectl top nodes
kubectl top pods -A

Output:

NAME         CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
craftlab     236m         3%     5431Mi          22%
craftnode1   50m          1%     333Mi           4%

NAMESPACE            NAME                                      CPU(cores)   MEMORY(bytes)
crafty               crafty-664d488db-6cz4z                    78m          3999Mi
crafty               playit-agent-8679c7c5d-cj9nq              5m           43Mi
kube-flannel         kube-flannel-ds-2qfwh                     20m          48Mi
kube-flannel         kube-flannel-ds-8vz5x                     9m           11Mi
kube-system          coredns-76f75df574-hrzxl                  3m           59Mi
kube-system          coredns-76f75df574-rf95t                  4m           15Mi
kube-system          etcd-craftlab                             29m          92Mi
kube-system          kube-apiserver-craftlab                   77m          299Mi
kube-system          kube-controller-manager-craftlab          23m          123Mi
kube-system          kube-proxy-jpkmc                          1m           11Mi
kube-system          kube-proxy-jzz7f                          1m           62Mi
kube-system          kube-scheduler-craftlab                   5m           66Mi
kube-system          metrics-server-8648894d55-nmvtt           7m           19Mi
local-path-storage   local-path-provisioner-7599fd788b-68t7w   1m           7Mi

Installing kube-prometheus-stack

Run the following commands:

kubectl create ns monitoring
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update

helm upgrade --install kube-stack prometheus-community/kube-prometheus-stack \
  -n monitoring --create-namespace \
  --set grafana.persistence.enabled=true \
  --set grafana.persistence.storageClassName=local-path \
  --set grafana.persistence.size=5Gi \
  --set grafana.deploymentStrategy.type=Recreate \
  --set grafana.initChownData.enabled=false \
  --set grafana.podSecurityContext.fsGroup=472
  
kubectl -n monitoring get pods

Expose Grafana quickly (NodePort for now):

kubectl -n monitoring patch svc kube-stack-grafana -p '{"spec":{"type":"NodePort"}}'
kubectl -n monitoring get svc kube-stack-grafana
# Default creds: admin / prom-operator (change them!)

How to use Grafana

  1. Find a node IP:

kubectl get nodes -o wide

  2. Open Grafana in your browser (32186 is the NodePort assigned in this walkthrough; use the port shown by kubectl get svc), or try the terminal check after this list:

http://<any-node-internal-ip>:32186

  3. Get the default admin password:

kubectl --namespace monitoring get secrets kube-stack-grafana -o jsonpath="{.data.admin-password}" | base64 -d ; echo
prom-operator

  4. Immediately change the admin password (Gear icon → Users → admin → Change password)
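
If you'd rather check from the terminal first, Grafana's health endpoint answers on the same NodePort (32186 here is just the port from this walkthrough):

curl -s http://<any-node-internal-ip>:32186/api/health
# expect a small JSON blob with "database": "ok"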

Quick checks in Grafana

Open the dashboards that kube-prometheus-stack ships (for example the Kubernetes compute-resources and Node Exporter dashboards): you should see CPU and memory graphs for both nodes without any extra configuration.

9. Installing Ingress (Paused)

What is Ingress?

An Ingress is an HTTP/HTTPS routing rule: it maps hostnames and paths to Services inside the cluster; an Ingress controller (here ingress-nginx) does the actual proxying.

Install ingress-nginx (Helm)

Run these once:

helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update

kubectl create ns ingress-nginx
helm install ingress-nginx ingress-nginx/ingress-nginx -n ingress-nginx

Verify:

kubectl -n ingress-nginx get pods
kubectl -n ingress-nginx get svc

Here's the breakdown: the controller pod should show Running, and the ingress-nginx-controller Service is created as type LoadBalancer by default, so on a bare-metal homelab its EXTERNAL-IP stays <pending> unless you add a load-balancer implementation or switch the Service to NodePort.

So: this section stays paused until DNS and a proper entry point are sorted out.
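
For reference, once DNS is in place an Ingress for Grafana would look roughly like this (grafana.homelab is a placeholder hostname; kube-stack-grafana is the Service created by the kube-prometheus-stack install above, and port 80 is the chart's usual default):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: grafana
  namespace: monitoring
spec:
  ingressClassName: nginx
  rules:
  - host: grafana.homelab             # placeholder hostname
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: kube-stack-grafana  # Grafana Service from the kube-prometheus-stack release
            port:
              number: 80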

10. Certificates (Paused)

What “Certificates” means

Serving the apps you expose (Grafana, Dashy, n8n, etc.) over https:// with a TLS certificate instead of plain http://.

Without TLS, anyone on the same network could sniff traffic (not a huge deal if it’s just LAN testing, but important if you’ll open access via Tailscale or internet).

How this works in Kubernetes

cert-manager (or the Tailscale Operator) watches your Ingress resources, obtains certificates from an issuer, and stores them in Secrets that the Ingress controller uses to terminate TLS.

Do you need it now?

Not yet; see the bottom line below.

What it implies (when you decide to add it)
  1. Install cert-manager into the cluster.
  2. Choose an issuer (a minimal Let's Encrypt issuer sketch follows this list):
    • Let’s Encrypt (if you own a domain and point it to your cluster).
    • Tailscale Operator (easiest if you plan to run everything via Tailscale).
    • Self-signed (quick and dirty for LAN).
  3. Annotate your Ingress to request certs, e.g.:
annotations:
  cert-manager.io/cluster-issuer: "letsencrypt-prod"
spec:
  tls:
  - hosts:
    - grafana.homelab
    secretName: grafana-tls
  4. The Ingress controller terminates TLS using that secret automatically.
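
A minimal Let's Encrypt ClusterIssuer sketch (assumes cert-manager is installed and ingress-nginx answers HTTP-01 challenges; the name and e-mail are placeholders):

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: you@example.com              # placeholder
    privateKeySecretRef:
      name: letsencrypt-prod-account-key
    solvers:
    - http01:
        ingress:
          class: nginx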

Bottom line for you now: You don’t need certificates until you get DNS and Ingress going. You can skip this step and still build your apps (Grafana, Dashy, n8n, Elastic, etc.).

Later, when you want pretty https:// URLs with a padlock, you’ll install cert-manager or the Tailscale Operator and tie it into Ingress.


11. Installing Uptime Kuma

What is Uptime Kuma?

Uptime Kuma is a self-hosted uptime monitor with a web UI: you add HTTP/TCP/ping monitors for your services and it alerts you when something goes down.

Installing Uptime Kuma

kubectl create ns homelab

helm repo add uptime-kuma https://dirsigler.github.io/uptime-kuma-helm
helm repo update

# Install Uptime Kuma with persistence + NodePort
helm upgrade --install uptime-kuma uptime-kuma/uptime-kuma -n homelab \
  --set service.type=NodePort \
  --set service.nodePort=31081 \
  --set persistence.enabled=true \
  --set persistence.storageClass=local-path \
  --set persistence.size=2Gi
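
Check that the pod starts and the PVC binds, then open the UI at http://<any-node-ip>:31081 to create the admin account:

kubectl -n homelab get pods,pvc,svc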

Adding the first Monitor

In the Uptime Kuma UI, add an HTTP(S) monitor for the Kubernetes API server at https://<master-ip>:6443. This way you'll know if your cluster control plane is alive.

Add a second monitor for Grafana at http://<node-ip>:32186, so you know if your monitoring stack is working.

12. Installing Dashy

Apply Dashy (PVC + Deployment + Service)
# Dashy with persistent config and NodePort 31080
cat <<'EOF' | kubectl apply -n homelab -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dashy-data
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: local-path
  resources:
    requests:
      storage: 1Gi   # adjust size to taste
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: dashy
spec:
  replicas: 1
  selector: { matchLabels: { app: dashy } }
  template:
    metadata: { labels: { app: dashy } }
    spec:
      securityContext: { fsGroup: 1000 }
      initContainers:
      - name: fix-perms
        image: busybox:1.36
        command: ["sh","-c","chown -R 1000:1000 /app/public || true"]
        volumeMounts: [ { name: data, mountPath: /app/public } ]
      containers:
      - name: dashy
        image: lissy93/dashy:2.1.1
        env: [ { name: PORT, value: "8080" } ]
        ports: [ { containerPort: 8080 } ]
        volumeMounts: [ { name: data, mountPath: /app/public } ]
        resources:
          requests: { cpu: "100m", memory: "256Mi" }
          limits:   { cpu: "500m", memory: "1Gi" }
        # add probes back but generous
        readinessProbe:
          httpGet: { path: /, port: 8080 }
          initialDelaySeconds: 60
          periodSeconds: 10
        livenessProbe:
          httpGet: { path: /, port: 8080 }
          initialDelaySeconds: 90
          periodSeconds: 20
      volumes:
      - name: data
        persistentVolumeClaim: { claimName: dashy-data }
---
apiVersion: v1
kind: Service
metadata:
  name: dashy
spec:
  type: NodePort
  selector: { app: dashy }
  ports:
  - name: http
    port: 80
    targetPort: 8080
    nodePort: 31080
EOF

Deploy:

kubectl -n homelab rollout status deploy/dashy
kubectl -n homelab get pods -l app=dashy -o wide
kubectl -n homelab get endpoints dashy

Now on your browser:

http://192.168.101.186:31080

If Dashy just displays a blank screen, make sure the config exists:

kubectl -n homelab exec deploy/dashy -- sh -lc '
ls -al /app/public;
[ -s /app/public/conf.yml ] && { echo "--- conf.yml (head) ---"; head -n 40 /app/public/conf.yml; } || echo "conf.yml missing or empty";
'

If conf.yml is missing or empty, create a starter one:

kubectl -n homelab exec -i deploy/dashy -- sh -lc 'cat > /app/public/conf.yml << "EOF"
appConfig:
  title: "Homelab"
  theme: nord
  layout: auto
sections:
  - name: Monitoring
    icon: fas fa-chart-line
    items:
      - title: Grafana
        url: http://<NODE-IP>:32186
      - title: Uptime Kuma
        url: http://<NODE-IP>:31081
  - name: Apps
    icon: fas fa-rocket
    items:
      - title: Minecraft
        url: http://<NODE-IP>:30065
EOF'
kubectl -n homelab rollout restart deploy/dashy

Reload the page.