Issue #107


Aug 26, 2025

Bare-Metal Kubernetes: A Production-Ready Architecture Guide

A comprehensive approach to deploying resilient, scalable Kubernetes clusters on bare metal infrastructure

Executive Summary

Building production-grade Kubernetes on bare metal requires careful consideration of every layer—from the operating system to application deployment. This guide presents a battle-tested architecture using modern tooling that prioritises security, operability, and developer experience while maintaining the cost advantages of bare metal infrastructure.

Key Technologies: Talos Linux, Sidero Metal, LINSTOR, WireGuard, Argo CD, Cilium

🏗️ Foundation: Cluster Deployment & Scaling

The Talos Linux Advantage

When deploying bare-metal Kubernetes, the choice of operating system largely determines your operational burden. Talos Linux stands out for production environments, offering:

Immutable, API-first design with no SSH access or package managers
Atomic updates with automatic rollback capabilities
Minimal attack surface compared to general-purpose distributions
Built-in Kubernetes lifecycle management without external tooling

This eliminates the complexity of managing Ansible playbooks, SSH keys, and OS drift that plague traditional kubeadm or Rancher deployments.

Infrastructure as Cattle with Sidero Metal

Sidero Metal provides the missing piece for bare-metal provisioning, offering:

Cluster API (CAPI) integration for declarative node management
PXE/iPXE boot orchestration with BMC/IPMI control
True "infrastructure as cattle" mentality

Cluster Topology Best Practices:

3 or 5 control-plane nodes (odd quorum) across different power domains
Worker pools segmented by hardware class (small/medium/large)
Strategic labelling and taints for workload placement
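
As a rough illustration of the labelling and taint scheme above, a workload could be pinned to one hardware class as sketched below. The label key `example.com/hardware-class`, the taint, the image, and every name here are hypothetical; substitute whatever your nodes actually carry.

```yaml
# Hypothetical Deployment pinned to the "large" worker pool.
# Assumes the nodes were already labelled example.com/hardware-class=large
# and tainted example.com/hardware-class=large:NoSchedule.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: analytics-worker
spec:
  replicas: 2
  selector:
    matchLabels:
      app: analytics-worker
  template:
    metadata:
      labels:
        app: analytics-worker
    spec:
      # Only schedule onto nodes in the "large" hardware class.
      nodeSelector:
        example.com/hardware-class: large
      # Tolerate the taint that keeps general workloads off those nodes.
      tolerations:
        - key: example.com/hardware-class
          operator: Equal
          value: large
          effect: NoSchedule
      containers:
        - name: worker
          image: registry.example.com/analytics-worker:1.0.0
```

The selector attracts the pods to the pool and the taint keeps everything else out, so both halves are needed for a dedicated pool.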

Scaling Strategy

Horizontal & Vertical Pod Scaling:

Kubernetes-native HPA and VPA for application-level scaling
KEDA integration for event-driven workloads tied to business SLOs
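
To make the event-driven option concrete, here is a minimal sketch of a KEDA ScaledObject driven by a Prometheus query. The Deployment name, the Prometheus address, and the `orders_queue_lag` metric are assumptions for illustration, not part of the original setup.

```yaml
# Sketch only: scales a hypothetical "orders-consumer" Deployment on queue lag.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: orders-consumer
spec:
  scaleTargetRef:
    name: orders-consumer          # Deployment in the same namespace
  minReplicaCount: 2
  maxReplicaCount: 20
  triggers:
    - type: prometheus
      metadata:
        # Assumed in-cluster Prometheus endpoint.
        serverAddress: http://prometheus.monitoring.svc:9090
        # Hypothetical metric exposing total unprocessed messages.
        query: sum(orders_queue_lag)
        # Target value per replica; KEDA adds replicas as lag grows past it.
        threshold: "100"
```

On bare metal, `maxReplicaCount` should stay within what the existing nodes can actually host, since no new hardware will appear to absorb the extra pods.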

Node-Level Scaling Reality Check: Bare-metal autoscaling remains challenging—hardware can't be created on demand. The solution combines:

Intelligent capacity planning with N+1 redundancy
Low-priority "pause" pods for resource reservation
Pre-staged container images to reduce scheduling latency
Proactive alerting for hardware capacity thresholds
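
The "pause" pod reservation can be sketched as a negative-priority PriorityClass plus a Deployment of placeholder pods. The names, replica count, and request sizes below are assumptions; size them to the headroom you want to hold back.

```yaml
# Placeholder pods sit below the default priority (0), so any ordinary
# workload that cannot fit will preempt them and take their resources.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: capacity-reservation
value: -10
globalDefault: false
description: Placeholder pods that any real workload may preempt.
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: capacity-reservation
spec:
  replicas: 3                      # hypothetical: three reserved slots
  selector:
    matchLabels:
      app: capacity-reservation
  template:
    metadata:
      labels:
        app: capacity-reservation
    spec:
      priorityClassName: capacity-reservation
      terminationGracePeriodSeconds: 0
      containers:
        - name: pause
          image: registry.k8s.io/pause:3.9
          resources:
            requests:              # the headroom each placeholder holds
              cpu: "2"
              memory: 4Gi
```

When placeholders start going Pending, the reserved headroom has been consumed, which makes their status a useful early signal for ordering hardware.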

💾 Storage: High-Performance Persistent Volumes

Why LINSTOR Over Ceph

For most bare-metal deployments, LINSTOR provides superior performance and operational simplicity compared to Ceph:

LINSTOR Advantages:

Localised I/O path with direct NVMe/SSD access
DRBD replication handles data durability in the background
Lower operational complexity suitable for smaller platform teams
Native Talos integration via system extensions

Architecture Overview:

Dedicated storage disks separate from OS drives
LVM Volume Groups managed per worker node
DRBD provides multi-replica durability across failure domains
Piraeus Operator handles CSI and HA controller deployment

Storage Classes by Use Case:

lvm-thick-r2: Standard stateful applications (2 replicas)
lvm-thick-r3: High-durability workloads (3 replicas)
lvm-thin: Snapshot-capable volumes (slight latency trade-off)
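
A minimal sketch of what the two-replica class might look like with the LINSTOR CSI driver follows. The storage pool name `lvm-thick` is an assumption, and parameter names should be checked against the driver version you run.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: lvm-thick-r2
provisioner: linstor.csi.linbit.com
allowVolumeExpansion: true
# Delay binding until a pod is scheduled so a replica can be placed locally.
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete
parameters:
  # Hypothetical LINSTOR storage pool backed by an LVM volume group.
  linstor.csi.linbit.com/storagePool: lvm-thick
  # Number of DRBD replicas to place for each volume.
  linstor.csi.linbit.com/placementCount: "2"
```

The three-replica class would differ only in its name and a `placementCount` of "3".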

Quorum & Split-Brain Protection: A minimum of three nodes (2 storage + 1 diskless tie-breaker) gives DRBD the majority it needs for quorum. The Piraeus HA controller speeds up failover by moving pods off volumes that have lost quorum.

🔒 Network Security & Connectivity

East-West: Built-In WireGuard

Talos Linux ships with WireGuard support and can encrypt cluster communication once it is enabled in the machine configuration:

Control-plane nodes act as WireGuard servers
Inter-node traffic routed over the tunnel is encrypted (kubelet, etcd, metrics, CNI)
KubeSpan extends this to multi-site full-mesh networks
No extra agents or sidecars to deploy and maintain
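
For reference, KubeSpan is switched on through the Talos machine configuration. The patch below is a minimal sketch, assuming the default discovery service is reachable from every node; apply it to control-plane and worker configs alike.

```yaml
# Talos machine config patch (sketch): enable the KubeSpan WireGuard mesh.
machine:
  network:
    kubespan:
      enabled: true
cluster:
  # KubeSpan relies on cluster discovery to exchange peer endpoints and keys.
  discovery:
    enabled: true
```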

North-South: Flexible Ingress Options

Option A: Edge Load Balancer Architecture

Internet → Caddy (3 VMs) → ingress-nginx (DaemonSet) → Applications

Caddy handles public TLS termination and health checks
Cloudflare integration for automated certificate management
ingress-nginx operates in hostNetwork/NodePort mode
Re-encryption between Caddy and cluster for end-to-end security

Option B: Cloudflare Tunnel

For environments without public IP allocation, Cloudflare Tunnel provides outbound-only connectivity with edge TLS termination.

L4 LoadBalancer Services: Cilium BGP

When BGP peering is available, Cilium's integrated BGP control plane offers superior L4 load balancing compared to MetalLB:

Native CNI integration with fewer moving parts
Built-in LB-IPAM for automated VIP management
Direct pod CIDR advertisement capability
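
As a hedged sketch of the LB-IPAM side, an address pool and a Service that draws from it might look like this. The API version and the `blocks` field have changed between Cilium releases, so treat both as assumptions to verify against your version. The CIDR is a documentation range and the Service is hypothetical. The BGP peering resources themselves are not shown.

```yaml
# Assumed API version; newer Cilium releases may serve this CRD as cilium.io/v2.
apiVersion: cilium.io/v2alpha1
kind: CiliumLoadBalancerIPPool
metadata:
  name: public-vips
spec:
  blocks:
    - cidr: 192.0.2.0/24           # documentation range, replace with real VIPs
---
# Any Service of type LoadBalancer can then receive a VIP from the pool.
apiVersion: v1
kind: Service
metadata:
  name: edge-gateway
spec:
  type: LoadBalancer
  selector:
    app: edge-gateway
  ports:
    - name: https
      port: 443
      targetPort: 8443
```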

🚀 Application Deployment: GitOps Excellence

Argo CD: The GitOps Standard

Why Argo CD over Flux:

Superior UI/UX with live diffs and health monitoring
Strong multi-tenancy via Application and AppProject CRDs
ApplicationSet for multi-cluster deployments
First-class progressive delivery integration with Argo Rollouts
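
To show what the multi-cluster piece looks like in practice, here is a minimal ApplicationSet sketch using the list generator. The cluster names, API server URLs, repository URL, paths, and the `platform` AppProject are all hypothetical.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: platform-services
  namespace: argocd
spec:
  generators:
    # One Application is generated per element in this list.
    - list:
        elements:
          - cluster: staging
            url: https://staging.k8s.example.com:6443
          - cluster: production
            url: https://production.k8s.example.com:6443
  template:
    metadata:
      name: '{{cluster}}-platform'
    spec:
      project: platform            # assumed AppProject scoping repos and destinations
      source:
        repoURL: https://git.example.com/platform/platform.git
        targetRevision: main
        path: 'clusters/{{cluster}}'
      destination:
        server: '{{url}}'
        namespace: platform
      syncPolicy:
        automated:
          prune: true
          selfHeal: true
```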

Repository Organization

Platform Repository:

Cluster-wide services (ingress, monitoring, storage)
Infrastructure-level configurations

Application Repositories:

Kustomize overlays for environment promotion
Helm charts with environment-specific values
PR-driven promotion workflow (dev → stage → prod)
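
A production overlay in this layout could be as small as the sketch below. The directory structure, namespace, image name, and patch file are assumptions; the point is that promotion becomes a one-line tag change in a pull request.

```yaml
# overlays/prod/kustomization.yaml (hypothetical layout)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: shop-prod
resources:
  - ../../base                     # shared manifests for every environment
images:
  # Promotion PRs bump only this tag.
  - name: registry.example.com/shop/api
    newTag: "1.4.2"
patches:
  # Environment-specific replica count kept in a small patch file.
  - path: replicas.yaml
    target:
      kind: Deployment
      name: api
```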

Zero-Downtime Deployment Strategy

Deployment Configuration:

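# Fragment of a Deployment's .spec
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxUnavailable: 0  # never drop below the desired replica count
    maxSurge: 1        # start one extra pod at a time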

Critical Components:

Readiness probes with appropriate minReadySeconds
Graceful shutdown via preStop hooks and terminationGracePeriodSeconds
Versioned ConfigMaps/Secrets to trigger clean rollouts
PodDisruptionBudgets for all SLO-critical workloads
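
Put together, the components above might look like the following sketch. The application name, image, port, probe path, and timing values are assumptions chosen for illustration, and the `sleep` preStop hook assumes the image ships a `sleep` binary.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
spec:
  replicas: 3
  minReadySeconds: 10              # pod must stay Ready this long before it counts
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0
      maxSurge: 1
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      terminationGracePeriodSeconds: 45   # must exceed preStop plus drain time
      containers:
        - name: api
          image: registry.example.com/shop/api:1.4.2
          ports:
            - containerPort: 8080
          readinessProbe:
            httpGet:
              path: /healthz       # hypothetical health endpoint
              port: 8080
            periodSeconds: 5
            failureThreshold: 3
          lifecycle:
            preStop:
              exec:
                # Keep serving while endpoints and proxies stop routing here.
                command: ["sleep", "10"]
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api
spec:
  maxUnavailable: 1                # node drains may evict one pod at a time
  selector:
    matchLabels:
      app: api
```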

Progressive Delivery with Argo Rollouts

Advanced Deployment Patterns:

Canary releases with Prometheus-based automated promotion
Blue/Green deployments for immediate switchover capability
A/B testing via header/cookie-based routing
Traffic splitting through ingress-nginx integration
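
A canary with Prometheus-gated promotion could be sketched as follows. The Services `api-stable` and `api-canary`, the Ingress `api`, the Prometheus address, the `http_requests_total` metric, and the 99% threshold are all assumptions and must exist or be adapted in a real cluster.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: success-rate
spec:
  metrics:
    - name: success-rate
      interval: 1m
      failureLimit: 2              # abort the rollout after repeated failures
      successCondition: result[0] >= 0.99
      provider:
        prometheus:
          address: http://prometheus.monitoring.svc:9090
          # Share of non-5xx responses over the last five minutes.
          query: |
            sum(rate(http_requests_total{app="api",code!~"5.."}[5m]))
            /
            sum(rate(http_requests_total{app="api"}[5m]))
---
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: registry.example.com/shop/api:1.4.2
  strategy:
    canary:
      stableService: api-stable
      canaryService: api-canary
      trafficRouting:
        nginx:
          stableIngress: api       # ingress-nginx Ingress fronting the stable Service
      steps:
        - setWeight: 10
        - pause: {duration: 5m}
        - analysis:                # gate further promotion on the metric above
            templates:
              - templateName: success-rate
        - setWeight: 50
        - pause: {duration: 5m}
```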

Database Migration Safety:

Deploy backwards-compatible schema changes
Roll out the application code
Perform cleanup migrations
Use sync waves for proper sequencing
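
One way to express that ordering in Argo CD is with sync-wave annotations on migration Jobs, as in the sketch below. The migration image and its `expand`/`contract` arguments are hypothetical. The application Deployment carries no annotation and therefore lands in the default wave 0, between the two Jobs.

```yaml
# Wave -1: backwards-compatible schema change runs before the new code.
apiVersion: batch/v1
kind: Job
metadata:
  name: schema-expand
  annotations:
    argocd.argoproj.io/hook: Sync
    argocd.argoproj.io/hook-delete-policy: BeforeHookCreation
    argocd.argoproj.io/sync-wave: "-1"
spec:
  backoffLimit: 0
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: migrate
          image: registry.example.com/shop/api-migrations:1.4.2
          args: ["expand"]
---
# Wave 1: cleanup runs only after the wave-0 Deployment is healthy.
apiVersion: batch/v1
kind: Job
metadata:
  name: schema-contract
  annotations:
    argocd.argoproj.io/hook: Sync
    argocd.argoproj.io/hook-delete-policy: BeforeHookCreation
    argocd.argoproj.io/sync-wave: "1"
spec:
  backoffLimit: 0
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: migrate
          image: registry.example.com/shop/api-migrations:1.4.2
          args: ["contract"]
```

Many teams ship the cleanup migration in a later release instead of the same sync, so that a rollback to the previous version still finds the old columns.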

🔧 Operational Excellence

Security by Design

No SSH access across the entire stack
RBAC-controlled API updates for all components
Minimal software footprint (a short SBOM) with CIS-aligned defaults
Network policies and admission controllers for runtime protection

Observability Stack

Prometheus + Grafana for metrics and visualization
Alertmanager for intelligent alert routing
KEDA metrics aligned with business SLOs (queue lag, p95 latency)
Multi-dimensional monitoring beyond CPU-only metrics

Day-2 Operations

Structured atomic upgrades for OS and Kubernetes versions
Respect for PodDisruptionBudgets during maintenance windows
Volume expansion support through LINSTOR StorageClasses
Capacity monitoring with proactive alerting
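
Since hardware lead times are long, a capacity alert is worth having early. The Prometheus rule below is a minimal sketch, assuming kube-state-metrics is installed; the 80% threshold and 30-minute window are illustrative values, not recommendations from the original setup.

```yaml
# Prometheus rule file (sketch): warn when CPU requests approach capacity.
groups:
  - name: capacity
    rules:
      - alert: ClusterCpuRequestsNearCapacity
        # Requested CPU across all pods divided by allocatable CPU across nodes.
        expr: |
          sum(kube_pod_container_resource_requests{resource="cpu"})
            /
          sum(kube_node_status_allocatable{resource="cpu"})
            > 0.80
        for: 30m
        labels:
          severity: warning
        annotations:
          summary: CPU requests exceed 80% of allocatable capacity
```

A matching rule with `resource="memory"` covers the other dimension that usually runs out first.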

📊 Architecture Benefits

This bare-metal Kubernetes architecture delivers:

✅ Operational Simplicity: API-driven management eliminates SSH sprawl and configuration drift

✅ Security Posture: Immutable nodes with encrypted inter-node communication

✅ Performance: Local storage with DRBD replication provides low-latency persistence

✅ Reliability: GitOps workflow with progressive delivery minimizes deployment risk

✅ Cost Efficiency: Bare-metal economics with cloud-native operational experience

✅ Team Productivity: Small platform teams can manage production-grade infrastructure

🎯 Key Takeaways

Choose purpose-built tools: Talos Linux is designed specifically for Kubernetes, and LINSTOR integrates with it natively through the Piraeus Operator
Security first: WireGuard encryption and immutable nodes provide defense in depth
Embrace GitOps: Argo CD enables auditable, repeatable deployments with excellent developer experience
Plan for capacity: Bare-metal scaling requires proactive hardware planning and intelligent resource reservation
Progressive delivery: Argo Rollouts provides enterprise-grade deployment safety without vendor lock-in

This architecture has proven effective for organisations seeking the cost benefits of bare metal while maintaining the operational excellence expected in modern platform engineering.

Ready to implement bare-metal Kubernetes? The tooling ecosystem has matured to the point where small teams can deploy production-grade infrastructure without sacrificing reliability or developer experience.

Infra Weekly Newsletter
