Last Updated on June 22, 2026 by Thiago Crepaldi
Today, we are building a homelab-grade, production-ready Kubernetes cluster on a GPU-accelerated server. The stack: K3s (Lightweight Kubernetes), Rancher for cluster management, Helm as the package manager, NGINX Ingress Controller as the reverse proxy, MetalLB as the load balancer, Longhorn for fast distributed block storage, an NFS mount from a Synology NAS for high-capacity storage, Cert-Manager for Let’s Encrypt certificates, and the NVIDIA Device Plugin for GPU acceleration.
To do this right, we will implement a Split-Plane Architecture that cleanly separates our management engine (“The Brain”) onto its own dedicated virtual machine (aka rancher-mgmt), while isolating our primary workload node (aka k3s-node-01). To make things more interesting, both the management and workload servers will be Proxmox virtual machines, and the latter will have a GPU passed through directly from the host to the VM for better performance.
Before diving in, make sure you have completed our two essential setup guides that lay the groundwork for this environment:
- Installing latest NVIDIA GPU Driver on Proxmox 9.2 (Debian Trixie + Linux Kernel 7.x) – This is a recipe for our Kubernetes Workload Node and goes through how to install latest NVIDIA GPU drivers on Proxmox 9.2+ (Debian Trixie / Kernel 7.x). This is how you can make your Proxmox VE GPU accelerated.
- Setup Nvidia GPU Passthrough for Ubuntu VMs on Proxmox 9.2 – To build our high-performance k3s-node-01 virtual machine and grant it exclusive, bare-metal access to the GPU cores.
1. The Architectural Blueprint
To understand our networking, control routing, and workload isolation, I have created a diagram that shows every component we are deploying, tracing exactly where each component resides and how they interact across the physical and virtual layers:

You might be asking why we split management from workload into two separate VMs. If we were to install Rancher directly inside our compute cluster, the runtime footprint of the management UI would constantly fight your active workloads for host memory and CPU cycles. By hosting Rancher on its own lightweight VM (rancher-mgmt), and running the K3s workload engine on a distinct, hardware-accelerated VM (k3s-node-01), we isolate our control loop. Even if a massive machine learning training job pegs k3s-node-01’s CPUs to 100%, the Rancher UI and control plane remain completely operational, allowing you to troubleshoot, inspect logs, and safely orchestrate cluster states.
You might also be wondering why we run Kubernetes inside a VM instead of on bare metal. Installing Rancher and K3s inside Proxmox VMs provides essential enterprise-grade resilience by enforcing hard resource isolation, ensuring heavy workloads cannot starve your management plane. This virtualized approach also unlocks instant snapshots for risk-free disaster recovery, dynamic scaling to resize storage or compute without downtime, and hardware abstraction for seamless node migration if physical components fail. Ultimately, it significantly reduces your security “blast radius,” containing any potential container breaches to a single VM rather than compromising your entire physical server and its underlying storage pools.
Enough said, let’s dive into it!
2. Optimizing ZFS’s Memory Allocation on Proxmox VE
Proxmox VE relies heavily on ZFS (Zettabyte File System) for software-defined RAID arrays. By default, ZFS implements an Adaptive Replacement Cache (ARC) designed to consume up to 50% of the host’s physical RAM as a read cache. While this speeds up local storage, it creates severe issues in virtualization environments.
If Proxmox is caching heavy disk writes, the ARC will greedily consume system memory. When you boot your large K3s virtual machines (which require static memory allocations), the Linux kernel’s Out-Of-Memory (OOM) Killer will step in to protect the hypervisor host. It will immediately terminate your most resource-intensive processes: your Kubernetes VMs.
To prevent this, we must configure Proxmox to clamp its ARC usage to a maximum of 2 GB, leaving the remaining memory pool free for our virtual machine layers.
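The zfs_arc_max module parameter takes its value in bytes, so the 2 GB ceiling has to be converted first. A minimal sketch of that arithmetic, using a hypothetical helper named gib_to_bytes (it is not part of any ZFS tooling):

```shell
# Hypothetical helper: convert a whole number of GiB into bytes.
gib_to_bytes() {
  echo $(( $1 * 1024 * 1024 * 1024 ))
}

# 2 GiB expressed in bytes, the value written to zfs_arc_max
gib_to_bytes 2   # prints 2147483648
```

Changing the argument lets you size the cache differently, for example gib_to_bytes 4 on a host with more RAM to spare.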
SSH into your Proxmox host (pve2) and set a permanent driver restriction configuration:
echo "options zfs zfs_arc_max=2147483648" | sudo tee /etc/modprobe.d/zfs.conf
Rebuild the initramfs so the new limit is applied at boot, then reboot the host to commit the change:
sudo update-initramfs -u -k all
sudo reboot
3. Provisioning the Management Plane Hardware
We will create our dedicated management VM (aka rancher-mgmt) directly through the Proxmox Web UI, mirroring our exact storage-optimized setup from the previous guide.
3.1 Download the Cloud Image on the Host Terminal
If you haven’t already, SSH into your Proxmox host shell (aka proxmox.domain.com) and make sure the official minimal Ubuntu 24.04 LTS Cloud Image is ready:
wget https://cloud-images.ubuntu.com/noble/current/noble-server-cloudimg-amd64.img
3.2 Create the rancher-mgmt VM via Web UI
Click Create VM in the top-right corner of the Proxmox UI and configure the tabs precisely as follows:
- General: Assign VM ID 101 and name it rancher-mgmt.
- OS: Select Do not use any media.
- System:
- Graphic card: Select Standard VGA (std).
- Machine: q35 (keeps our VM layouts uniform).
- BIOS: OVMF (UEFI). Assign to your local storage pool.
- SCSI Controller: VirtIO SCSI Single.
- Disks: Delete the default disk that Proxmox populates automatically so the tab is entirely blank.
- CPU: Allocate 4 Cores. Set the Type to host.
- Memory: Allocate 16GB (16384 MB) of RAM. Uncheck “Ballooning Device” (Rancher’s underlying database requires a stable, non-fluctuating memory allocation).
- Network: Set Model to VirtIO (paravirtualized).
Click Finish to save the hardware framework.
3.3 Import the Disk and Inject Multi-Console Mapping
To complete the layout, jump back onto your Proxmox host terminal to import the Ubuntu cloud image as this VM’s disk and attach a serial port for flexible debugging:
# 1. Import the Cloud Image directly into your new management VM storage
qm importdisk 101 noble-server-cloudimg-amd64.img local-lvm
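# 2. Attach the imported image explicitly to the VM as scsi0 (use the volume name that importdisk prints; it may be vm-101-disk-1 if the EFI disk already took disk-0)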
qm set 101 --scsi0 local-lvm:vm-101-disk-0,discard=on,iothread=1
# 3. Grow the operating system partition pool size to 64GB
qm resize 101 scsi0 +62G
# 4. Attach the dedicated Cloud-Init hardware drive
qm set 101 --ide0 local-lvm:cloudinit
# 5. Add the virtual serial port adapter hardware
qm set 101 --serial0 socket
# 6. Prioritize scsi0 during boot execution
qm set 101 --boot order=scsi0
Return to the Proxmox Web UI, click on VM rancher-mgmt, navigate to the Cloud-Init panel, and populate your administrative username, password, SSH public key, and static IP configuration (e.g., 192.168.1.11/24 with your network gateway). Click Regenerate Image.
3.4 Expand VM’s Storage
Modern Ubuntu Cloud Images use a flat partition layout with a tiny default root filesystem (~2.5GB). If you try installing K3s right now (or almost anything, for that matter), it will quickly run out of space and fail the installation.
Fire up the management node from the Proxmox terminal or Web UI. Once it is online, connect to it from your desktop terminal via SSH or through Proxmox’s Console. Next, force the VM’s kernel to recognize the full 64GB and resize the filesystem directly:
# 1. Force a hardware rescan of the virtual SCSI drive
echo 1 | sudo tee /sys/class/block/sda/device/rescan
# 2. Grow the flat root partition boundaries (Partition 1)
sudo growpart /dev/sda 1
# 3. Force the kernel to refresh the active geometry mappings
sudo partprobe /dev/sda
# 4. Live-expand the filesystem to claim the newly added space
sudo resize2fs /dev/sda1
Verify the layout using df -h /. You will see your available root storage scale up cleanly to the full 64GB boundary.
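The df check can also be turned into a quick pass/fail test. A small sketch, where the helper name root_size_gib and the 60 GiB threshold are assumptions chosen for this guide’s 64GB disk; adjust both to your layout:

```shell
# Hypothetical helper: print the size of the root filesystem in whole GiB.
# df -Pk prints POSIX-format output in 1024-byte blocks; column 2 is the total size.
root_size_gib() {
  df -Pk / | awk 'NR==2 { printf "%d\n", $2 / 1024 / 1024 }'
}

size="$(root_size_gib)"
if [ "$size" -ge 60 ]; then
  echo "root filesystem is ${size} GiB: resize looks complete"
else
  echo "root filesystem is only ${size} GiB: re-run growpart and resize2fs"
fi
```

The threshold sits a little below 64 because filesystem metadata and the boot partitions consume part of the raw disk.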
4. Installing Rancher (The Management Plane)
Now we deploy the Rancher management engine inside our newly provisioned rancher-mgmt virtual machine. Because this VM is decoupled from any specific compute node, it acts as our centralized administrative hub.
4.1 Install K3s in Master Mode
Let’s install K3s and, to make access easier, export KUBECONFIG from your user’s ~/.bash_aliases so kubectl always knows how to reach the cluster:
curl -sfL https://get.k3s.io | sh -
# Grant local read permission and apply a permanent session environment alias
sudo chmod 644 /etc/rancher/k3s/k3s.yaml
echo "export KUBECONFIG=/etc/rancher/k3s/k3s.yaml" >> ~/.bash_aliases
source ~/.bashrc
Verify the local control engine is active:
kubectl get nodes
4.2 Installing Helm
Helm is the official package manager for Kubernetes, acting as the equivalent of apt, yum, or homebrew for your containerized applications. It allows you to define, install, upgrade, and manage complex Kubernetes applications using a single tool.
# Install Helm
curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
Verify helm is installed and kicking:
# Get helm version
helm version
4.3 Deploy Cert-Manager
Cert-Manager is an automated certificate management controller that streamlines the process of issuing, renewing, and managing SSL/TLS certificates within your cluster. By integrating directly with public authorities like Let’s Encrypt, it eliminates the manual overhead of updating certificates by automatically monitoring their expiration and performing the necessary domain validation challenges.
We will deploy cert-manager via Helm to handle internal certificates:
# 1. Add stable repositories
helm repo add jetstack https://charts.jetstack.io
helm repo add rancher-stable https://releases.rancher.com/server-charts/stable
helm repo update
# 2. Install Cert-Manager for local Rancher validation tracking
helm install cert-manager jetstack/cert-manager \
--namespace cert-manager --create-namespace \
--set crds.enabled=true
Wait a couple of minutes and verify cert-manager was installed:
kubectl get pods -n cert-manager
If you configured cert-manager to use external recursive DNS resolvers (the extra arguments shown in section 6.6), confirm that they have been applied:
kubectl get deployment cert-manager -n cert-manager -o jsonpath='{.spec.template.spec.containers[0].args}'
Ensure the output contains:
--dns01-recursive-nameservers=8.8.8.8:53,1.1.1.1:53
--dns01-recursive-nameservers-only=true
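Scanning the jsonpath output by eye is error-prone, so the check can be scripted. A minimal sketch, assuming the argument list is piped in as text; the helper name has_dns01_flags is hypothetical:

```shell
# Hypothetical helper: succeed only if both DNS-01 flags appear in the text on stdin.
has_dns01_flags() {
  args="$(cat)"
  case "$args" in
    *--dns01-recursive-nameservers=*) ;;
    *) return 1 ;;
  esac
  case "$args" in
    *--dns01-recursive-nameservers-only=true*) ;;
    *) return 1 ;;
  esac
}

# Usage against the live Deployment (commented out; requires cluster access):
# kubectl get deployment cert-manager -n cert-manager \
#   -o jsonpath='{.spec.template.spec.containers[0].args}' | has_dns01_flags \
#   && echo "DNS-01 flags present" || echo "DNS-01 flags missing"
```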
4.4 Deploy Rancher
Rancher is an open-source, enterprise-grade container management platform that simplifies the operation of Kubernetes clusters across any environment. It provides a centralized dashboard for deploying, monitoring, and securing Kubernetes clusters, abstracting away the complexity of managing different cloud providers or on-premises infrastructure. By offering unified authentication, centralized security policies, and an integrated catalog of applications, Rancher enables teams to manage their entire containerized fleet from a single, intuitive interface.
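Rancher requires secure TLS termination to manage downstream nodes, and cert-manager handles its internal certificates automatically. Steps 1 and 2 below repeat the repository setup and cert-manager installation from section 4.3; skip them if you already ran that section, because a second helm install under the same release name will fail: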
# 1. Add stable repositories
helm repo add jetstack https://charts.jetstack.io
helm repo add rancher-stable https://releases.rancher.com/server-charts/stable
helm repo update
# 2. Install Cert-Manager for local Rancher validation tracking
helm install cert-manager jetstack/cert-manager \
--namespace cert-manager --create-namespace \
--set crds.enabled=true
# 3. Install Rancher Management Dashboard
helm install rancher rancher-stable/rancher --namespace cattle-system --create-namespace \
--set hostname=rancher.domain.com \
--set bootstrapPassword='YourStrongPassword' \
--set replicas=1
Give the pods a few minutes to initialize, then navigate to your configured address (e.g., https://rancher.domain.com) to access your new multi-cluster control center. Complete the registration wizard to define your master admin password and set the server URL to your intended local address: https://rancher.domain.com.
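Rather than guessing when the pods are ready, you can block until the Deployment finishes rolling out. A minimal sketch, assuming the release was installed into cattle-system under the name rancher as above; the wrapper name wait_for_rancher and its default timeout are illustrative:

```shell
# Hypothetical wrapper around `kubectl rollout status`.
# $1 is an optional timeout (defaults to 300s).
wait_for_rancher() {
  kubectl -n cattle-system rollout status deploy/rancher --timeout="${1:-300s}"
}

# Usage on rancher-mgmt (left commented so the snippet is safe to paste anywhere):
# wait_for_rancher 600s
```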
5. Creating the VM for the Accelerated Workload Plane
The k3s-node-01 VM acts as our primary compute powerhouse. This VM must be provisioned according to our prerequisite post: Setup Nvidia GPU Passthrough for Ubuntu VMs on Proxmox 9.2. This ensures that the VM has a virtual PCIe bus mapping your physical graphics card, with the Nvidia drivers compiled directly against the guest kernel.
Our K3s deployment here must be highly optimized. To construct an enterprise-grade stack, we must disable K3s’ default local storage provisioner and Traefik. By omitting these components, we prevent them from conflicting with the custom, production-ready routing (MetalLB, NGINX) and storage (Longhorn) components we deploy in the following steps. Start by connecting to the workload node:
ssh user@k3s-node-01
5.1 Bootstrapping the K3s Workload Worker Node
We will install K3s on this node with custom flags. We are explicitly instructing K3s to disable the default Traefik ingress controller and the local-path storage provisioner:
curl -sfL https://get.k3s.io | sh -s - \
--disable traefik \
--disable local-storage \
--write-kubeconfig-mode 644
Verify the local control engine is active:
kubectl get nodes
5.2 Connecting the Workload Node to Rancher
- Open your Rancher Web UI dashboard.
- Navigate to Cluster Management -> Import Existing Cluster -> Generic.
- Name the cluster Production-Cluster and click Create.
- Copy the registration string (kubectl apply -f ...) presented on your screen.
- Paste that exact command directly into your k3s-node-01 SSH terminal.
Within moments, you will see the node register inside the Rancher dashboard, moving smoothly from a status of Pending to a vibrant green Active flag.
6. Constructing the Network Pipeline & Core Add-ons
Right now, your virtual machine can see the NVIDIA GPU via nvidia-smi, but Kubernetes is completely blind to it. We must configure our container runtime tools and deploy the resource plugin so K3s can schedule workloads onto the GPU’s CUDA cores. Then, we will bring up our enterprise networking stack.
6.1 Bootstrap Helm on the Workload Node
Kubernetes configuration files (YAMLs) can get incredibly complex. Helm is the official package manager for Kubernetes. It allows us to install, upgrade, and configure complex applications (like our ingress controllers and storage drivers) using pre-packaged templates called Charts, rather than writing thousands of lines of YAML manually.
Because the workload plane operates inside an independent cluster, separate from our management VM, we need to install the Helm binary locally on k3s-node-01 as well. Then, we grant your user profile permission to read the K3s kubeconfig and export KUBECONFIG permanently for every session:
# Download and install the binary package
curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
# Grant local read permission and apply a permanent session environment alias
sudo chmod 644 /etc/rancher/k3s/k3s.yaml
echo "export KUBECONFIG=/etc/rancher/k3s/k3s.yaml" >> ~/.bash_aliases
source ~/.bashrc
Verify helm is installed and kicking:
# Get helm version
helm version
6.2 Install the Nvidia Container Toolkit
Before Kubernetes can schedule container workloads on the GPU, we must install the NVIDIA Container Toolkit and bind it to K3s’ containerd (2.0+) runtime. Run these commands inside your k3s-node-01 terminal to install the toolkit; they write the APT source to its own list file with a signed-by keyring and replace $(ARCH) with amd64:
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sed "s#\$(ARCH)#amd64#g" | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt update && sudo apt install -y nvidia-container-toolkit
Next, override K3s’ containerd configuration map to instruct the runtime engine to use the nvidia wrapper:
sudo mkdir -p /etc/rancher/k3s
sudo tee /etc/rancher/k3s/config.yaml << 'EOF'
disable:
- traefik
- local-storage
write-kubeconfig-mode: "644"
default-runtime: "nvidia"
EOF
# Restart the supervisor loop to commit the changes
sudo systemctl restart k3s
Verify k3s is using nvidia runtime by default:
sudo crictl info | grep "defaultRuntime"
You should see something like "defaultRuntimeName": "nvidia", in the output.
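If you want this check to fail loudly in a provisioning script, the grep can be wrapped in a function. A minimal sketch; the helper name default_runtime_is_nvidia is hypothetical, and it only inspects text piped in from crictl info:

```shell
# Hypothetical helper: read `crictl info` output on stdin and succeed
# only if the default runtime is "nvidia".
default_runtime_is_nvidia() {
  grep -q '"defaultRuntimeName": *"nvidia"'
}

# Usage on k3s-node-01 (commented out; requires root and a running K3s):
# sudo crictl info | default_runtime_is_nvidia && echo "nvidia runtime is the default"
```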
6.3 Installing NVIDIA Device Plugin
Out of the box, Kubernetes has no idea your physical GPU exists. To allow containerized workloads to execute code on your GPU cores, we need two components: The Nvidia Container Toolkit (which patches our containerd runtime to understand Nvidia binaries) and the Nvidia Device Plugin (a Kubernetes daemon that scans the host hardware and advertises the available GPU resources to the Kubernetes scheduler).
If you try to deploy the vanilla NVIDIA Device Plugin right now, you will notice that the cluster reports DESIRED: 0 pods. This is because k3s-node-01 acts as its own cluster master, meaning K3s automatically brands it with a control-plane scheduling taint (node-role.kubernetes.io/control-plane:NoSchedule).
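Whether that taint is actually present depends on how K3s was installed, so it can be worth confirming before adding tolerations. A minimal sketch; the helper name has_control_plane_taint is hypothetical and simply searches the output of kubectl describe node:

```shell
# Hypothetical helper: read `kubectl describe node` output on stdin and
# succeed if a control-plane NoSchedule taint is listed.
has_control_plane_taint() {
  grep -q 'node-role.kubernetes.io/control-plane:NoSchedule'
}

# Usage (commented out; requires cluster access):
# kubectl describe node k3s-node-01 | has_control_plane_taint \
#   && echo "control-plane taint present" || echo "no control-plane taint"
```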
To bypass this safety layout cleanly, we must pass an explicit Toleration block through Helm. This allows both the core plugin and the GPU Feature Discovery (GFD) engine to run on our node. GFD will automatically scan the active kernel layers and apply the correct scheduling tags for us without any manual intervention:
# Add the Nvidia Helm repository track
helm repo add nvdp https://nvidia.github.io/k8s-device-plugin
helm repo update
# Deploy with global tolerations and automated feature discovery enabled
helm upgrade -i nvdp nvdp/nvidia-device-plugin \
--namespace kube-system \
--set-json 'tolerations=[{"operator":"Exists","effect":"NoSchedule"}]' \
--set gfd.enabled=true
Verify the nvdp pods are up and running:
kubectl get pods -n kube-system -l app.kubernetes.io/name=nvidia-device-plugin
Verify that your cluster successfully tracks your hardware acceleration capacity:
kubectl describe node | grep -E "Allocatable|nvidia.com/gpu"
If successful, the console will print nvidia.com/gpu: 1. The cluster now holds full native scheduling rights over your NVIDIA GPU.
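Seeing the resource advertised does not prove a container can use it, so a one-shot pod that runs nvidia-smi makes a useful smoke test. A minimal sketch that writes the manifest to disk; the file name, pod name, and CUDA image tag are assumptions, and any CUDA base image compatible with your driver should work:

```shell
# Write a one-shot Pod manifest that requests a single GPU and runs nvidia-smi.
# The image tag is an assumption; pick one that matches your driver version.
cat > gpu-smoke-test.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: gpu-smoke-test
  namespace: default
spec:
  restartPolicy: Never
  containers:
    - name: cuda
      image: nvidia/cuda:12.4.1-base-ubuntu22.04
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu: 1
EOF

# Then, on a machine with cluster access (commented out here):
# kubectl apply -f gpu-smoke-test.yaml
# kubectl logs gpu-smoke-test      # once the pod reports Completed
# kubectl delete pod gpu-smoke-test
```

If the pod stays Pending, kubectl describe pod gpu-smoke-test usually shows whether the scheduler could not find the nvidia.com/gpu resource.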
6.4 Setting Up Bare-Metal Load Balancing (MetalLB)
When you request a service of type: LoadBalancer in a cloud provider like AWS, it automatically provisions a managed load balancer for you. On a bare-metal homelab, that request will just sit in a ‘Pending’ state forever. MetalLB bridges this gap. It monitors the cluster for load balancer requests and dynamically assigns them a real, routable IP address from a designated pool on your local network using Layer 2 ARP announcements.
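Because MetalLB hands out addresses on the same subnet your router serves, its pool must not collide with the router’s DHCP range. A minimal sketch of an overlap check; the helper names and the DHCP range of .100 to .199 are assumptions for illustration, so substitute your router’s actual range:

```shell
# Hypothetical helper: convert a dotted IPv4 address to an integer.
ip_to_int() {
  old_ifs="$IFS"; IFS=.
  set -- $1
  IFS="$old_ifs"
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Hypothetical helper: succeed if inclusive range A ($1-$2) overlaps range B ($3-$4).
ranges_overlap() {
  a_start="$(ip_to_int "$1")"; a_end="$(ip_to_int "$2")"
  b_start="$(ip_to_int "$3")"; b_end="$(ip_to_int "$4")"
  [ "$a_start" -le "$b_end" ] && [ "$b_start" -le "$a_end" ]
}

# MetalLB pool used in this guide vs. an assumed router DHCP pool
if ranges_overlap 192.168.1.201 192.168.1.210 192.168.1.100 192.168.1.199; then
  echo "overlap: pick a different MetalLB range"
else
  echo "no overlap: range is safe to hand to MetalLB"
fi
```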
Let’s install MetalLB to act as our local automated network load balancer:
# 1. Add the official MetalLB Helm repository channel
helm repo add metallb https://metallb.github.io/metallb
helm repo update
# 2. Deploy the core orchestration routers
helm install metallb metallb/metallb --namespace metallb-system --create-namespace
Wait until all MetalLB controller and speaker pods are running:
kubectl get pods -n metallb-system
Next, we assign a dedicated slice of unused IP addresses from our home network pool. Create a file named metallb-config.yaml (make sure this range is completely outside your local router’s DHCP pool to prevent collisions):
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: local-ip-pool
  namespace: metallb-system
spec:
  addresses:
    - 192.168.1.201-192.168.1.210 # Assign a pristine local pool slice
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: local-l2-advertisement
  namespace: metallb-system
spec:
  ipAddressPools:
    - local-ip-pool
Apply the networking policy configuration directly:
kubectl apply -f metallb-config.yaml
Verify the configured IPs in MetalLB pools:
kubectl get ipaddresspools -n metallb-system
You should see something like:
NAME            AUTO ASSIGN   AVOID BUGGY IPS   ADDRESSES
local-ip-pool   true          false             ["192.168.1.201-192.168.1.210"]
6.5 Deploying NGINX Ingress Controller (Cloud-Native Mode)
This will create the necessary LoadBalancer service that receives traffic from your local network.
# Add the official repository
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update
# Install the controller and request a LoadBalancer IP
helm upgrade --install ingress-nginx ingress-nginx/ingress-nginx \
--namespace ingress-nginx \
--create-namespace \
--set controller.service.type=LoadBalancer
6.5.1 Verify the LoadBalancer IP
After the installation, run the following command until you see an IP address (e.g., 192.168.1.201) appear under EXTERNAL-IP:
kubectl get svc -n ingress-nginx ingress-nginx-controller -w
6.6 Deploying Certificate Manager
Cert-Manager is an automated certificate authority controller. It talks to providers like Let’s Encrypt to request SSL certificates for your applications and automatically renews them before they expire.
In a private homelab, your local router’s DNS system often cannot resolve your public wildcard domains locally. During ACME DNS-01 validation challenges, this loopback limitation causes Cert-Manager to fail its self-check validations. To bypass this, we force Cert-Manager to query public recursive resolvers (Cloudflare at 1.1.1.1 and Google at 8.8.8.8) directly, bypassing local DNS loops entirely.
# Add the Jetstack repository for Cert-Manager
helm repo add jetstack https://charts.jetstack.io
helm repo update
# Install Cert-Manager with strict recursive DNS parameters enabled
helm install cert-manager jetstack/cert-manager \
--namespace cert-manager --create-namespace \
--set crds.enabled=true \
--set 'extraArgs={--dns01-recursive-nameservers=8.8.8.8:53\,1.1.1.1:53,--dns01-recursive-nameservers-only=true}'
Both flags are passed in a single extraArgs list because a second --set on the same key would replace the first. Verify that the Cert-Manager pods are healthy:
kubectl get pods -n cert-manager
For this tutorial, we will use Cloudflare to manage our DNS-01 challenge. To allow Cert-Manager to automatically create temporary DNS records for validation, you need a Cloudflare API token.
Generate your Cloudflare API Token:
- Log into your Cloudflare Dashboard.
- Go to My Profile -> API Tokens and click Create Token.
- Select Create Custom Token.
- Under Permissions, select Zone -> DNS -> Edit.
- Under Zone Resources, select Include -> Specific Zone -> Select your domain (e.g., yourdomain.com).
- Click Continue to summary and Create Token. Save this token securely.
Create a secure Kubernetes Secret to store your restricted Cloudflare API Token:
kubectl create secret generic cloudflare-api-token-secret \
--namespace cert-manager \
--from-literal=api-token="YOUR_CLOUDFLARE_API_TOKEN_HERE"
Create your cluster issuer registration manifest. Create letsencrypt-dns-production.yaml:
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@yourdomain.com
    privateKeySecretRef:
      name: letsencrypt-prod-account-key
    solvers:
      - dns01:
          cloudflare:
            email: admin@domain.com
            apiTokenSecretRef:
              name: cloudflare-api-token-secret
              key: api-token
Apply the file to register the automated ACME handshake issuer:
kubectl apply -f letsencrypt-dns-production.yaml
Verify if your ClusterIssuer is ready:
kubectl get clusterissuer letsencrypt-prod -o wide
Look at the output under the READY column. If it says True, your Cert-Manager is successfully communicating with Let’s Encrypt and is ready to issue certificates.
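Once the issuer is ready, an application requests a certificate by referencing it. A minimal sketch that writes a Certificate manifest to disk; the file name, resource name, secret name, and hostname are placeholders, and only the issuer name letsencrypt-prod comes from the manifest above:

```shell
# Write an example Certificate that asks the letsencrypt-prod ClusterIssuer
# for a certificate covering one hostname. All names except the issuer are placeholders.
cat > example-certificate.yaml <<'EOF'
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: app-domain-com
  namespace: default
spec:
  secretName: app-domain-com-tls
  dnsNames:
    - app.domain.com
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
EOF

# Then, on a machine with cluster access (commented out here):
# kubectl apply -f example-certificate.yaml
# kubectl get certificate app-domain-com -n default
```

The signed certificate and its private key land in the Secret named by secretName, which an Ingress can then reference in its tls section.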
7. Storage Provisioning (The Hybrid Storage Plane)
To give our containerized apps flexible data persistence, we will implement a hybrid storage plane. We will use Longhorn backed by a newly initialized Proxmox ZFS pool for high-performance block storage, and a Synology NAS via NFS for shared, multi-pod asset storage.
7.1 High-Performance Block Storage (Proxmox ZFS to Longhorn)
For database logs and container states that require low latency (ReadWriteOnce), we want to back our storage using a dedicated ZFS block volume on the host. If your Proxmox server does not have an active ZFS storage pool initialized yet, we will locate an empty disk, clean it, and build our pool fresh.
Step A: Provision the ZFS Pool on the Proxmox Host (pve2)
Open a separate terminal window on your physical Proxmox host (pve2) and execute the storage inventory sweep:
# 1. Inspect physical storage topography to locate your unassigned 1TB disk
lsblk
Locate your empty target disk name from the output (for this guide, we are targeting /dev/sdb, which offers ~953.9G of raw block space). Clear out any legacy filesystem headers and bind the device to a fresh pool named local-zfs:
# 2. Wipe hidden signature tables to prevent mounting blocks
sudo wipefs -a /dev/sdb
# 3. Initialize the new high-performance ZFS storage pool
sudo zpool create -f local-zfs /dev/sdb
# 4. Verify the pool is active, online, and healthy
sudo zpool status local-zfs
Step B: Carve out the Virtual Block Volume
With our pool cleanly initialized on the host, we will carve out an 850GB block volume (zvol). Leaving roughly 10-15% of the raw disk space unallocated gives ZFS the free-space buffer it needs to maintain good write performance:
# 1. Generate an 850GB ZFS Block Volume dataset
zfs create -V 850G local-zfs/k3s-persistent-storage
# 2. Map the block lane directly to our Workload VM (ID 100) on SCSI slot 1
qm set 100 -scsi1 /dev/zvol/local-zfs/k3s-persistent-storage
Step C: Format and Mount inside the Workload VM (k3s-node-01)
Open a separate terminal window and SSH directly into your k3s-node-01 VM. The Linux kernel inside the guest will surface the new block device as /dev/sdb (confirm the name with lsblk before formatting).
We will apply an enterprise-standard XFS filesystem layer directly over the drive. Compared to ext4, XFS utilizes independent allocation zones that handle massive parallel read/write tasks across our 16 CPU cores seamlessly without encountering monolithic allocation locks or running into ext4 inode exhaustion limits:
# 1. Install storage formatting libraries
sudo apt install -y xfsprogs
# 2. Flash an optimized XFS structure over the disk channel
sudo mkfs.xfs /dev/sdb
# 3. Create the permanent mounting target directory
sudo mkdir -p /var/lib/longhorn
# 4. Mount the drive to the initialization target path
sudo mount /dev/sdb /var/lib/longhorn
# 5. Commit the block device map permanently to fstab for safe boots
echo "/dev/sdb /var/lib/longhorn xfs defaults 0 0" | sudo tee -a /etc/fstab
Verify that your drive is mounted correctly:
df -h | grep longhorn
# Expect to see: /dev/sdb mounted to /var/lib/longhorn
Step D: Deploy Longhorn Storage Engine
Now that the file mount path is online with raw high-performance storage blocks, deploy Longhorn via Helm to manage the localized allocation matrix:
helm repo add longhorn https://charts.longhorn.io
helm repo update
helm install longhorn longhorn/longhorn --namespace longhorn-system --create-namespace
7.2 Shared Network Storage (Synology NFS Provisioner)
For workloads where multiple pods need to read and write to the exact same directories simultaneously (ReadWriteMany), we will connect K3s to a Synology NAS NFS export.
First, install the native NFS common libraries on your k3s-node-01 VM so the guest kernel can translate network file locks:
sudo apt update && sudo apt install -y nfs-common
Now, implement the storage strategy that fits your workload targets:
Strategy A: Dynamic Subfolder Provisioning
Best used for automated, encapsulated application data paths where Kubernetes manages isolated sub-folders:
# Add the NFS external provisioner repository
helm repo add nfs-subdir-external-provisioner https://kubernetes-sigs.github.io/nfs-subdir-external-provisioner
helm repo update
# Deploy the provisioner, mapping it to your NAS IP and Shared Folder
helm install nfs-subdir-external-provisioner nfs-subdir-external-provisioner/nfs-subdir-external-provisioner \
--namespace storage-system --create-namespace \
--set nfs.server=192.168.1.50 \
--set nfs.path=/volume1/k3s-nfs-share \
--set storageClass.name=network-nfs \
--set storageClass.defaultClass=false
To use the nfs-subdir-external-provisioner, you only need a PersistentVolumeClaim (PVC) that references the correct storageClassName. Unlike static PVs, you do not need to define the PersistentVolume (PV) manually; the provisioner will create it for you automatically. This is an example:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-dynamic-pvc
  namespace: default
spec:
  accessModes:
    - ReadWriteMany
  # This class name must match the one defined in your
  # nfs-subdir-external-provisioner configuration
  storageClassName: network-nfs
  resources:
    requests:
      storage: 10Gi
Strategy B: Static Root Volume Mapping
Best used for mapping top-level media pools or raw download endpoints directly, bypassing subfolders so your NAS apps (like Plex) can see them instantly. Create static-nfs-root.yaml:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: network-multimedia-root-pv
spec:
  capacity:
    storage: 5Ti
  volumeMode: Filesystem
  accessModes: [ReadWriteMany]
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: 192.168.1.50
    path: /volume1/k3s-nfs-share
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: network-multimedia-root-pvc
spec:
  accessModes: [ReadWriteMany]
  storageClassName: ""
  volumeName: network-multimedia-root-pv
  resources:
    requests:
      storage: 5Ti
Apply it to the cluster:
kubectl apply -f static-nfs-root.yaml
(Swap out 192.168.1.50 and /volume1/k3s-nfs-share with your actual Synology management IP and volume path).
To implement a static NFS mount, you need two objects: a PersistentVolume (which defines the actual NFS server and path details) and a PersistentVolumeClaim (which binds to that volume).
apiVersion: v1
kind: PersistentVolume
metadata:
  name: synology-nfs-static-pv
spec:
  capacity:
    storage: 5Ti
  volumeMode: Filesystem
  accessModes: [ReadWriteMany]
  persistentVolumeReclaimPolicy: Retain
  # storageClassName must be empty ("") to force static binding
  storageClassName: ""
  nfs:
    server: synology.domain.com
    path: /volume1/k3s-data
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-static-pvc
  namespace: default
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""
  # volumeName tells Kubernetes to bind specifically to the PV defined above
  volumeName: synology-nfs-static-pv
  resources:
    requests:
      storage: 5Ti
8. Local DNS Routing
Because this infrastructure operates on your private subnet, public domain names like app.domain.com will not automatically resolve to your internal nodes. Unless we map these domains internally, your web browser will look up the public DNS tree, fail to find your private IP address, and report a connection timeout.
We must map our wildcard domain zone directly to our local NGINX Ingress controller IP (e.g., 192.168.1.201) assigned by MetalLB.
Execution Options
Option A: Central DNS Server Override (Pi-hole / pfSense / Unbound)
If you manage a centralized local network DNS server, define a wildcard rewrite rule:
*.domain.com -> 192.168.1.201
Option B: Desktop System Hosts File Mapping
If you do not have a local DNS server, you can configure your testing system to point directly to the ingress IP:
- Linux / macOS (/etc/hosts): 192.168.1.201 app.domain.com
- Windows (C:\Windows\System32\drivers\etc\hosts): 192.168.1.201 app.domain.com
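Hosts files do not support wildcards, so every hostname served through the ingress needs its own line. A minimal sketch that generates those lines; the helper name hosts_entries and the second hostname are placeholders:

```shell
# Hypothetical helper: print one hosts-file line per hostname for a given ingress IP.
hosts_entries() {
  ip="$1"; shift
  for host in "$@"; do
    printf '%s %s\n' "$ip" "$host"
  done
}

# app2.domain.com is a placeholder for any additional service you expose.
hosts_entries 192.168.1.201 app.domain.com app2.domain.com
```

On Linux or macOS the output can be appended with sudo tee -a /etc/hosts once you have reviewed it.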
9. Putting it all together and validating the cluster
The goal of our validation suite is to test the reliability of our hardware acceleration, network routing, and dynamic storage configurations under load.
Rather than testing these in isolation, we will deploy a multi-container pod architecture:
- A high-performance CUDA container is assigned access to your passed-through GPU. It executes a constant monitoring loop and exports the current hardware status to a shared storage directory.
- A lightweight NGINX container mounts the exact same shared volume. It reads the exported status file and hosts it as a live web page.
- We back these containers with a dynamic Longhorn Storage Class volume to confirm our block-storage layer is fully operational.
- Simultaneously, we declare a secondary dynamic PVC backed by network NFS (the NFS subdir external provisioner) to append a live filesystem heartbeat, proving both hot and cold hybrid storage planes work in perfect harmony.
- Lastly, we also mount the static NFS root volume just for fun. Nothing is actually written to it; it is left in as an example of how to use a static volume in your own deployments.
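Because this test depends on pieces installed in earlier steps, a quick pre-flight check can save a debugging round trip. The sketch below assumes the StorageClass names used in this guide (longhorn and network-nfs) and the nvidia.com/gpu resource name advertised by the NVIDIA Device Plugin. The helper function names are hypothetical:

```shell
# Succeed only if every named StorageClass exists in the cluster.
storage_classes_exist() (
  for sc in "$@"; do
    if ! kubectl get storageclass "$sc" >/dev/null 2>&1; then
      echo "missing StorageClass: $sc" >&2
      exit 1
    fi
  done
)

# Show how many GPUs each node advertises to the scheduler.
gpu_allocatable() (
  kubectl get nodes \
    -o custom-columns='NODE:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu'
)
```

A typical run would be `storage_classes_exist longhorn network-nfs && gpu_allocatable`. If the GPU column is empty or shows `<none>` for k3s-node-01, revisit the device plugin step before deploying.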
Create a multi-resource testing schema file named cluster-validation.yaml:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-longhorn-block-pvc
  namespace: default
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-network-nfs-pvc
  namespace: default
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: network-nfs
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: synology-nfs-root-pv
spec:
  capacity:
    storage: 15Ti
  volumeMode: Filesystem
  accessModes: [ReadWriteMany]
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""
  nfs:
    server: synology.domain.com
    path: /volume1/k3s-data
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: validation-nfs-static-pvc
  namespace: default
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""
  volumeName: synology-nfs-root-pv
  resources:
    requests:
      storage: 5Ti
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: validation-web-app
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: validation-web
  template:
    metadata:
      labels:
        app: validation-web
    spec:
      containers:
        # Container 1: High-Performance Web Server Front-End
        - name: web-server
          image: "nginx:alpine"
          ports:
            - containerPort: 80
          volumeMounts:
            - name: html-mount
              mountPath: /usr/share/nginx/html
        # Container 2: Dedicated CUDA Silicon Pass-Through Tester
        - name: cuda-tester
          image: "nvidia/cuda:12.2.0-runtime-ubuntu22.04"
          command: ["/bin/bash", "-c"]
          args:
            - |
              echo "== starting hardware cluster diagnostics =="
              rm -f /usr/share/nginx/html/index.html
              while true; do
                echo "<html><head><meta http-equiv='refresh' content='10'></head><body style='font-family:monospace;background:#111;color:#eee;padding:20px;'>" > /usr/share/nginx/html/index.html
                echo "<h2>Cluster Validation Status: ACTIVE</h2>" >> /usr/share/nginx/html/index.html
                echo "<h3>[$(date)] Live GPU Metrics:</h3><pre>" >> /usr/share/nginx/html/index.html
                nvidia-smi >> /usr/share/nginx/html/index.html 2>&1
                echo "</pre></body></html>" >> /usr/share/nginx/html/index.html
                # Append write heartbeat test straight to the NFS mount path
                echo "[$(date)] NAS Write Operation Successful" >> /output/network-nfs-test/heartbeat.log
                sleep 10
              done
          resources:
            limits:
              nvidia.com/gpu: 1
          volumeMounts:
            - name: html-mount
              mountPath: /usr/share/nginx/html
            - name: network-mount
              mountPath: /output/network-nfs-test
            - name: nfs-static-mount
              mountPath: /output/network-nfs-static
      volumes:
        - name: html-mount
          persistentVolumeClaim: {claimName: test-longhorn-block-pvc}
        - name: network-mount
          persistentVolumeClaim: {claimName: test-network-nfs-pvc}
        - name: nfs-static-mount
          persistentVolumeClaim: {claimName: validation-nfs-static-pvc}
---
apiVersion: v1
kind: Service
metadata:
  name: validation-web-service
  namespace: default
spec:
  ports:
    - port: 80
      targetPort: 80
  selector:
    app: validation-web
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: validation-web-ingress
  namespace: default
  annotations:
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - "test.domain.com"
      secretName: validation-web-tls-certs
  rules:
    - host: test.domain.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: validation-web-service
                port: {number: 80}
Apply the execution block directly using kubectl:
kubectl apply -f cluster-validation.yaml
10. Running the Quality Assurance Diagnostics
Before opening the site in your browser, verify that your Kubernetes components are healthy:
1. Verify Storage Allocation
kubectl get pvc test-longhorn-block-pvc
# Status should be: Bound
kubectl get pvc test-network-nfs-pvc
# Status should be: Bound
2. Inspect Ingress and Certificate Status
kubectl get ingress validation-web-ingress
# Check that the address is successfully mapped to: 192.168.1.201
kubectl get certificate validation-web-tls-certs
# Expected output: READY -> True
3. Check GPU Scheduler Logs
kubectl logs -l app=validation-web -c cuda-tester --tail=50
# Confirm that nvidia-smi runs successfully and does not return: "Nvidia driver not found"
4. Check Network NFS Heartbeat Logging
Check the log file directly inside the mounted path:
kubectl exec -it $(kubectl get pods -l app=validation-web -o jsonpath='{.items[0].metadata.name}') -c cuda-tester -- tail -n 10 /output/network-nfs-test/heartbeat.log
Now, open your browser and navigate to:
https://test.domain.com
- Verify SSL: Check the browser address bar. You should see a clean padlock icon confirming a valid, trusted certificate issued by Let’s Encrypt.
- Verify Storage & Hardware Passthrough: The web page should display a live, refreshing table generated by the nvidia-smi command. Because the page you are viewing is generated by your heavy cuda-tester container but served by your independent web-server container, the rendered page itself is proof that the shared Longhorn volume is handling the writes from one container and the reads from the other.
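The same check can be run from a terminal, which helps when you want to rule out browser caching or local DNS. This is a sketch under two assumptions: the host is the one declared in the validation Ingress (test.domain.com), and the ingress IP is the example 192.168.1.201. The function names are hypothetical. curl's --resolve option pins the hostname to an address for that single request:

```shell
# Fetch the validation page through the ingress IP without relying on local DNS.
fetch_validation_page() (
  host="$1"; ip="$2"
  curl --silent --show-error --fail --max-time 15 \
    --resolve "${host}:443:${ip}" "https://${host}/"
)

# Succeed if the page on stdin contains the header line of a working nvidia-smi run.
# A failed nvidia-smi prints an error message without the "Driver Version:" field.
page_shows_gpu() (
  grep -q 'Driver Version:'
)
```

For example: `fetch_validation_page test.domain.com 192.168.1.201 | page_shows_gpu && echo "GPU page OK"`. Because curl verifies the certificate by default, a success here also confirms the Let’s Encrypt chain is trusted.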
Your split-plane control cluster is now fully configured, secure, accelerated, and operational.
The Lab Medic: Troubleshooting Common Gotchas
The “Disk Pressure” Pod Eviction Taint
- Symptom: Nodes move to an Unavailable status, and your application pods are forcefully evicted.
- Cause: Standard Ubuntu Cloud Images ship with a very small root partition. When the kubelet inside K3s detects that the operating system drive is filling up with container logs and images, it marks the node with disk pressure and starts evicting pods.
- The Fix: Grow the partition and filesystem from inside your VM console so the guest uses the full size of its virtual disk:
# Make the kernel re-read the disk size
echo 1 | sudo tee /sys/class/block/sda/device/rescan
# Grow partition 1 to fill the disk, then reload the partition table
sudo growpart /dev/sda 1
sudo partprobe /dev/sda
# Grow the filesystem to fill the partition
sudo resize2fs /dev/sda1
The Let’s Encrypt DNS Challenge Hang
- Symptom: kubectl get certificate shows READY -> False and hangs in an endless validation loop.
- Cause: By default, your local router intercepts outbound DNS queries, so the ACME challenge is checked against its own internal DNS cache, which has no knowledge yet of the record just created through the Cloudflare API.
- The Fix: Ensure your cert-manager instance was installed with --set 'extraArgs={--dns01-recursive-nameservers-only=true}' and the recursive server settings detailed in Step 6.5.
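To see whether the challenge record is visible outside your network, you can query a public resolver directly and compare the answer with what your router returns. This is a minimal sketch, assuming dig is installed and 1.1.1.1 is reachable from your machine. Both helper names are hypothetical:

```shell
# Name of the TXT record a DNS-01 challenge uses for a given domain.
# A wildcard such as *.domain.com is validated at the base domain.
acme_record() (
  domain="${1#\*.}"
  printf '_acme-challenge.%s\n' "$domain"
)

# Ask a public resolver for the challenge record, bypassing the router's DNS.
# Assumption: 1.1.1.1 is reachable; swap in any public resolver you prefer.
acme_txt_public() (
  dig +short TXT "$(acme_record "$1")" @1.1.1.1
)
```

While a certificate is pending, `acme_txt_public test.domain.com` should print a token. If it does but the same query without `@1.1.1.1` returns nothing, the local resolver is the one answering cert-manager's self-check.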
Summary: Your Proxmox Compute Power is Unlocked
Your technical foundation is now complete. You have a split-plane virtualized Kubernetes environment running on Proxmox VE capable of routing intensive containerized applications directly into dedicated graphics hardware, all while monitored and managed by a clean interface layer. Happy hacking!

