Bare-Metal Kubernetes: A Production-Ready Architecture Guide
A comprehensive approach to deploying resilient, scalable Kubernetes clusters on bare metal infrastructure
Executive Summary
Building production-grade Kubernetes on bare metal requires careful consideration of every layer—from the operating system to application deployment. This guide presents a battle-tested architecture using modern tooling that prioritises security, operability, and developer experience while maintaining the cost advantages of bare metal infrastructure.
Key Technologies: Talos Linux, Sidero Metal, LINSTOR, WireGuard, Argo CD, Cilium
🏗️ Foundation: Cluster Deployment & Scaling
The Talos Linux Advantage
When deploying bare-metal Kubernetes, the choice of operating system largely determines your operational burden. Talos Linux is a strong fit for production environments because it offers:
Immutable, API-first design with no SSH access or package managers
Atomic updates with automatic rollback capabilities
Minimal attack surface compared to general-purpose distributions
Built-in Kubernetes lifecycle management without external tooling
This removes the Ansible playbooks, SSH key management, and OS drift that burden traditional kubeadm or Rancher deployments.
Infrastructure as Cattle with Sidero Metal
Sidero Metal provides the missing piece for bare-metal provisioning, offering:
Cluster API (CAPI) integration for declarative node management
PXE/iPXE boot orchestration with BMC/IPMI control
A true "infrastructure as cattle" model: a failed server is wiped and re-provisioned rather than repaired in place
Cluster Topology Best Practices:
3 or 5 control-plane nodes spread across different power domains (an odd count, so etcd keeps quorum after losing 1 or 2 members respectively)
Worker pools segmented by hardware class (small/medium/large)
Strategic labelling and taints for workload placement
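To make the placement idea concrete, here is a minimal sketch of a workload pinned to one hardware class. The label and taint key example.com/hardware-class, the workload name, and the image are all hypothetical; the sketch assumes the matching label and taint were already applied to the nodes by your provisioning flow.

```yaml
# Hypothetical label and taint keys; substitute your own naming scheme.
# Assumed to exist on the target nodes:
#   label: example.com/hardware-class=large
#   taint: example.com/hardware-class=large:NoSchedule
apiVersion: apps/v1
kind: Deployment
metadata:
  name: analytics-worker
spec:
  replicas: 2
  selector:
    matchLabels:
      app: analytics-worker
  template:
    metadata:
      labels:
        app: analytics-worker
    spec:
      # Only schedule onto nodes in the "large" pool.
      nodeSelector:
        example.com/hardware-class: large
      # Allow scheduling despite the pool's taint.
      tolerations:
        - key: example.com/hardware-class
          operator: Equal
          value: large
          effect: NoSchedule
      containers:
        - name: worker
          image: registry.example.com/analytics-worker:1.0.0
```

The taint keeps general workloads off the expensive pool, while the nodeSelector keeps this workload from drifting onto smaller machines.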
Scaling Strategy
Horizontal & Vertical Pod Scaling:
Kubernetes-native HPA and VPA for application-level scaling
KEDA integration for event-driven workloads tied to business SLOs
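As an illustration of application-level scaling, a basic HPA might look like the sketch below. The Deployment name web, the replica bounds, and the 70% CPU target are placeholder assumptions, not recommendations.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web            # hypothetical workload
  minReplicas: 2
  maxReplicas: 10
  metrics:
    # Add or remove replicas to hold average CPU near 70% of requests.
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```

Utilization is measured against the containers' CPU requests, so the target only means something when requests are set realistically.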
Node-Level Scaling Reality Check: Bare-metal autoscaling remains challenging because hardware cannot be created on demand. A workable approach combines:
Intelligent capacity planning with N+1 redundancy
Low-priority "pause" pods for resource reservation
Pre-staged container images to reduce scheduling latency
Proactive alerting for hardware capacity thresholds
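The "pause" pod reservation can be sketched as follows. The PriorityClass name, replica count, and resource sizes are assumptions to adapt; the mechanism is that these placeholder pods hold capacity and are preempted as soon as a higher-priority workload needs the room.

```yaml
# Placeholder pods reserve headroom; any pod with a higher priority
# (including the default priority of 0) can preempt them.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: overprovisioning      # hypothetical name
value: -10
globalDefault: false
description: Placeholder pods that real workloads may preempt.
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: capacity-reservation
spec:
  replicas: 3                 # assumed headroom: 3 x (1 CPU, 2Gi)
  selector:
    matchLabels:
      app: capacity-reservation
  template:
    metadata:
      labels:
        app: capacity-reservation
    spec:
      priorityClassName: overprovisioning
      containers:
        - name: pause
          image: registry.k8s.io/pause:3.9
          resources:
            requests:
              cpu: "1"
              memory: 2Gi
```

When the placeholder pods go Pending after being preempted, that is a useful alerting signal that spare capacity is exhausted.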
💾 Storage: High-Performance Persistent Volumes
Why LINSTOR Over Ceph
For most bare-metal deployments, LINSTOR typically offers lower-latency I/O and simpler operations than Ceph:
LINSTOR Advantages:
Localised I/O path with direct NVMe/SSD access
DRBD replication handles data durability in the background
Lower operational complexity suitable for smaller platform teams
Native Talos integration via system extensions
Architecture Overview:
Dedicated storage disks separate from OS drives
LVM Volume Groups managed per worker node
DRBD provides multi-replica durability across failure domains
Piraeus Operator handles CSI and HA controller deployment
Storage Classes by Use Case:
lvm-thick-r2: Standard stateful applications (2 replicas)
lvm-thick-r3: High-durability workloads (3 replicas)
lvm-thin: Snapshot-capable volumes (slight latency trade-off)
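A StorageClass for the two-replica tier could look roughly like this. The storage pool name lvm-thick is an assumption (it must match a pool defined in your LINSTOR setup), and the parameter keys follow the LINSTOR CSI driver's conventions, so check them against the driver version you deploy.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: lvm-thick-r2
provisioner: linstor.csi.linbit.com
allowVolumeExpansion: true
# Delay provisioning until a pod is scheduled, so a replica
# can be placed on the node that will consume the volume.
volumeBindingMode: WaitForFirstConsumer
parameters:
  linstor.csi.linbit.com/storagePool: lvm-thick   # hypothetical pool name
  linstor.csi.linbit.com/placementCount: "2"      # number of DRBD replicas
```

The three-replica class would differ only in its name and a placementCount of "3".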
Quorum & Split-Brain Protection: A minimum 3-node deployment (2 storage + 1 diskless tie-breaker) ensures DRBD quorum. The LINSTOR HA controller speeds up failover by rescheduling pods whose volume has lost quorum on their current node.
🔒 Network Security & Connectivity
East-West: WireGuard with KubeSpan
Talos Linux includes KubeSpan, a built-in feature that encrypts cluster communication with WireGuard:
Nodes form a full mesh of WireGuard peers; there is no dedicated server role
Inter-node traffic is encrypted (kubelet, etcd, metrics, CNI overlay)
The same mesh extends to multi-site clusters
Enabling it takes only a few lines of machine configuration
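As a minimal sketch, KubeSpan is switched on through a Talos machine configuration patch like the one below. It assumes cluster discovery is available, since nodes use it to find their WireGuard peers.

```yaml
# Talos machine configuration patch (apply to every node).
machine:
  network:
    kubespan:
      enabled: true
cluster:
  discovery:
    enabled: true   # KubeSpan relies on discovery to exchange peer endpoints
```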
North-South: Flexible Ingress Options
Option A: Edge Load Balancer Architecture
Internet → Caddy (3 VMs) → ingress-nginx (DaemonSet) → Applications
Caddy handles public TLS termination and health checks
Cloudflare integration for automated certificate management
ingress-nginx operates in hostNetwork/NodePort mode
Re-encryption between Caddy and cluster for end-to-end security
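On the cluster side, the hostNetwork variant of Option A might be expressed as Helm values for the ingress-nginx chart, roughly as below. This is a sketch under the assumption that the edge proxies reach each node directly on ports 80/443; for the NodePort variant you would instead leave hostNetwork off and set the controller Service type to NodePort.

```yaml
# values.yaml for the ingress-nginx Helm chart (hostNetwork variant)
controller:
  kind: DaemonSet            # one controller pod per node
  hostNetwork: true          # bind 80/443 on the node itself
  dnsPolicy: ClusterFirstWithHostNet
  service:
    enabled: false           # no Service needed when binding host ports
```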
Option B: Cloudflare Tunnel
For environments without public IP allocation, Cloudflare Tunnel provides outbound-only connectivity with edge TLS termination.
L4 LoadBalancer Services: Cilium BGP
When BGP peering is available, Cilium's integrated BGP control plane offers superior L4 load balancing compared to MetalLB:
Native CNI integration with fewer moving parts
Built-in LB-IPAM for automated VIP management
Direct pod CIDR advertisement capability
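A rough sketch of the Cilium side follows. The ASNs, addresses, and labels are placeholders, and the CRD shapes shown here (v2alpha1 with a blocks list and a CiliumBGPPeeringPolicy) are an assumption about the Cilium release in use; field names have changed between releases and newer versions offer a different set of BGP resources, so verify against the documentation for your version.

```yaml
# Pool of virtual IPs handed out to Services of type LoadBalancer.
apiVersion: cilium.io/v2alpha1
kind: CiliumLoadBalancerIPPool
metadata:
  name: public-vips
spec:
  blocks:
    - cidr: "192.0.2.0/27"          # documentation range, replace
---
# Peer the selected nodes with the top-of-rack router.
apiVersion: cilium.io/v2alpha1
kind: CiliumBGPPeeringPolicy
metadata:
  name: rack0
spec:
  nodeSelector:
    matchLabels:
      example.com/rack: rack0       # hypothetical node label
  virtualRouters:
    - localASN: 64512
      exportPodCIDR: true           # advertise each node's pod CIDR
      serviceSelector:
        matchLabels:
          example.com/bgp: advertise  # hypothetical Service label
      neighbors:
        - peerAddress: "10.0.0.1/32"
          peerASN: 64512
```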
🚀 Application Deployment: GitOps Excellence
Argo CD: The GitOps Standard
Why Argo CD over Flux:
Superior UI/UX with live diffs and health monitoring
Strong multi-tenancy via Application and AppProject CRDs
ApplicationSet for multi-cluster deployments
First-class progressive delivery integration with Argo Rollouts
Repository Organization
Platform Repository:
Cluster-wide services (ingress, monitoring, storage)
Infrastructure-level configurations
Application Repositories:
Kustomize overlays for environment promotion
Helm charts with environment-specific values
PR-driven promotion workflow (dev → stage → prod)
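To show how a repository path maps to a cluster, here is a minimal Argo CD Application sketch. The repository URL, project name, path, and namespace are hypothetical; promotion then amounts to a pull request that changes what the prod overlay contains.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: web-prod
  namespace: argocd
spec:
  project: apps                      # hypothetical AppProject
  source:
    repoURL: https://git.example.com/team/web.git
    targetRevision: main
    path: deploy/overlays/prod       # Kustomize overlay for this environment
  destination:
    server: https://kubernetes.default.svc
    namespace: web
  syncPolicy:
    automated:
      prune: true                    # delete resources removed from Git
      selfHeal: true                 # revert manual drift
    syncOptions:
      - CreateNamespace=true
```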
Zero-Downtime Deployment Strategy
Deployment Configuration:
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxUnavailable: 0
    maxSurge: 1
Critical Components:
Readiness probes combined with an appropriate minReadySeconds
Graceful shutdown via preStop hooks and terminationGracePeriodSeconds
Versioned ConfigMaps/Secrets to trigger clean rollouts
PodDisruptionBudgets for all SLO-critical workloads
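Put together, the components above might look like this sketch. The names, image, probe path, and timings are illustrative assumptions; the preStop sleep also assumes the image ships a sleep binary.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  minReadySeconds: 10               # pod must stay Ready this long to count
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      terminationGracePeriodSeconds: 45
      containers:
        - name: web
          image: registry.example.com/web:1.2.3
          ports:
            - containerPort: 8080
          readinessProbe:
            httpGet:
              path: /healthz        # hypothetical endpoint
              port: 8080
            periodSeconds: 5
          lifecycle:
            preStop:
              exec:
                # Give load balancers time to stop sending traffic
                # before the process receives SIGTERM.
                command: ["sleep", "10"]
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web
spec:
  minAvailable: 2                   # node drains may evict one pod at a time
  selector:
    matchLabels:
      app: web
```

The grace period has to exceed the preStop delay plus the time the application needs to finish in-flight requests, otherwise the kubelet kills the container mid-shutdown.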
Progressive Delivery with Argo Rollouts
Advanced Deployment Patterns:
Canary releases with Prometheus-based automated promotion
Blue/Green deployments for immediate switchover capability
A/B testing via header/cookie-based routing
Traffic splitting through ingress-nginx integration
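A canary with metric-gated promotion could be sketched as below. The Service and Ingress names, the Prometheus address, the metric name, and the 99% threshold are all assumptions; the two Services and the stable Ingress must already exist for the ingress-nginx integration to work.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: web
spec:
  replicas: 4
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: registry.example.com/web:1.2.3
  strategy:
    canary:
      canaryService: web-canary     # hypothetical Services and Ingress
      stableService: web-stable
      trafficRouting:
        nginx:
          stableIngress: web
      steps:
        - setWeight: 10
        - pause: {duration: 5m}
        - analysis:                 # abort the rollout if the check fails
            templates:
              - templateName: success-rate
        - setWeight: 50
        - pause: {duration: 5m}
---
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: success-rate
spec:
  metrics:
    - name: success-rate
      interval: 1m
      count: 5
      failureLimit: 1
      successCondition: result[0] >= 0.99
      provider:
        prometheus:
          address: http://prometheus.monitoring.svc:9090
          # Share of non-5xx responses; metric name is an assumption.
          query: |
            sum(rate(http_requests_total{app="web",code!~"5.."}[2m]))
            /
            sum(rate(http_requests_total{app="web"}[2m]))
```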
Database Migration Safety:
Deploy backwards-compatible schema changes
Roll out the application code
Perform cleanup migrations
Use sync waves for proper sequencing
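One way to sequence the first step is to run the schema change as an Argo CD hook in an earlier sync wave than the application, as in this sketch. The image and its arguments are hypothetical, and it assumes the application resources sit in the default wave 0.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: db-migrate-expand
  annotations:
    argocd.argoproj.io/hook: Sync
    argocd.argoproj.io/hook-delete-policy: BeforeHookCreation
    # Lower waves run first; the app in wave 0 waits for this Job.
    argocd.argoproj.io/sync-wave: "-1"
spec:
  backoffLimit: 0                   # fail the sync rather than retry blindly
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: migrate
          image: registry.example.com/web-migrations:1.2.3
          args: ["migrate", "up"]   # hypothetical migration command
```

The cleanup migration would be a separate Job in a later release, once no running version depends on the old schema.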
🔧 Operational Excellence
Security by Design
No SSH access across the entire stack
RBAC-controlled API updates for all components
Minimal software footprint (a short SBOM) with hardened defaults aligned to CIS benchmarks
Network policies and admission controllers for runtime protection
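A common starting point for network policies is a per-namespace default deny, sketched here for a hypothetical payments namespace. Specific allow rules (DNS, ingress controller, databases) are then layered on top.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: payments        # hypothetical namespace
spec:
  podSelector: {}            # empty selector: applies to every pod
  policyTypes:
    - Ingress
    - Egress                 # no rules listed, so all traffic is denied
```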
Observability Stack
Prometheus + Grafana for metrics and visualization
Alertmanager for intelligent alert routing
KEDA metrics aligned with business SLOs (queue lag, p95 latency)
Multi-dimensional monitoring beyond CPU-only metrics
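Scaling on a business signal rather than CPU might look like this KEDA sketch, which sizes a consumer by queue backlog. The Deployment name, Prometheus address, metric name, and threshold are assumptions for illustration.

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: order-consumer
spec:
  scaleTargetRef:
    name: order-consumer            # hypothetical Deployment
  minReplicaCount: 1
  maxReplicaCount: 20
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring.svc:9090
        # Backlog of the orders queue; metric name is an assumption.
        query: sum(queue_messages_ready{queue="orders"})
        threshold: "100"            # target backlog per replica
```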
Day-2 Operations
Structured atomic upgrades for OS and Kubernetes versions
Respect for PodDisruptionBudgets during maintenance windows
Volume expansion support through LINSTOR StorageClasses
Capacity monitoring with proactive alerting
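Capacity alerting can be expressed as a Prometheus rule such as the sketch below. It assumes kube-state-metrics is installed, and the 80% threshold and 30-minute window are placeholders to tune against your hardware lead time.

```yaml
# Prometheus rule file
groups:
  - name: capacity
    rules:
      - alert: ClusterCPURequestsHigh
        # Fraction of allocatable CPU already claimed by pod requests.
        expr: |
          sum(kube_pod_container_resource_requests{resource="cpu"})
          /
          sum(kube_node_status_allocatable{resource="cpu"})
          > 0.8
        for: 30m
        labels:
          severity: warning
        annotations:
          summary: Over 80% of cluster CPU is requested; plan new hardware.
```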
📊 Architecture Benefits
This bare-metal Kubernetes architecture delivers:
✅ Operational Simplicity: API-driven management eliminates SSH sprawl and configuration drift
✅ Security Posture: Immutable nodes with encrypted inter-node communication
✅ Performance: Local storage with DRBD replication provides low-latency persistence
✅ Reliability: GitOps workflow with progressive delivery minimizes deployment risk
✅ Cost Efficiency: Bare-metal economics with cloud-native operational experience
✅ Team Productivity: Small platform teams can manage production-grade infrastructure
🎯 Key Takeaways
Choose purpose-built tools: Talos Linux and LINSTOR are specifically designed for Kubernetes workloads
Security first: WireGuard encryption and immutable nodes provide defense in depth
Embrace GitOps: Argo CD enables auditable, repeatable deployments with excellent developer experience
Plan for capacity: Bare-metal scaling requires proactive hardware planning and intelligent resource reservation
Progressive delivery: Argo Rollouts provides enterprise-grade deployment safety without vendor lock-in
This architecture has proven effective for organisations seeking the cost benefits of bare metal while maintaining the operational excellence expected in modern platform engineering.
Ready to implement bare-metal Kubernetes? The tooling ecosystem has matured to the point where small teams can deploy production-grade infrastructure without sacrificing reliability or developer experience.