High Availability Overview

High availability (HA) ensures that Kubernetes clusters continue operating even when individual components fail. Understanding HA is crucial for production deployments because failures are inevitable—nodes fail, networks partition, disks corrupt. HA design ensures your cluster survives these failures and continues serving workloads.

Think of high availability like redundancy in critical systems. Just as airplanes have multiple engines (so one failure doesn’t crash the plane), HA Kubernetes clusters have multiple control plane nodes, multiple etcd nodes, and workloads distributed across multiple worker nodes. If one component fails, others take over.

What is High Availability?

High availability means the system continues operating despite component failures. For Kubernetes, this means:

  • Control plane redundancy - Multiple API servers, schedulers, controller managers
  • etcd clustering - Multiple etcd nodes with replication
  • Node distribution - Workloads spread across multiple nodes
  • Load balancing - Traffic distributed across multiple instances
  • Automatic recovery - Failed components are replaced automatically

Why HA Matters

Production clusters need HA because:

  • Component failures - Hardware fails, software crashes
  • Network issues - Network partitions, connectivity problems
  • Maintenance - Need to update/reboot nodes without downtime
  • Disaster recovery - Survive data center failures
  • Service level agreements - Meet uptime requirements

Without HA, a single failure can take down your entire cluster.

Control Plane HA

The control plane is the “brain” of Kubernetes. Making it highly available is critical:

graph TB
    LB[Load Balancer] --> API1[API Server 1]
    LB --> API2[API Server 2]
    LB --> API3[API Server 3]
    API1 --> etcd1[etcd 1]
    API2 --> etcd2[etcd 2]
    API3 --> etcd3[etcd 3]
    API1 --> Scheduler1[Scheduler 1]
    API2 --> Scheduler2[Scheduler 2]
    API3 --> Scheduler3[Scheduler 3]
    API1 --> CM1[Controller Manager 1]
    API2 --> CM2[Controller Manager 2]
    API3 --> CM3[Controller Manager 3]
    etcd1 <--> etcd2
    etcd2 <--> etcd3
    etcd3 <--> etcd1
    style LB fill:#e1f5ff
    style API1 fill:#fff4e1
    style API2 fill:#fff4e1
    style API3 fill:#fff4e1
    style etcd1 fill:#e8f5e9
    style etcd2 fill:#e8f5e9
    style etcd3 fill:#e8f5e9

API Server HA

Multiple API servers behind a load balancer:

  • Load balancer - Distributes requests across API servers
  • Stateless - API servers are stateless, any can handle any request
  • Health checks - Load balancer routes away from unhealthy servers
  • Automatic failover - If one fails, others continue serving
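With kubeadm, for example, the shared endpoint is declared once in the cluster configuration so that every client and component talks to the load balancer instead of a single API server. A minimal sketch, assuming a load balancer reachable at k8s-api.example.com; the hostname and version are placeholders:

apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: v1.29.0                          # placeholder version
# Clients, kubelets, and control plane components all connect to the load
# balancer address, which health-checks and fails over between API servers.
controlPlaneEndpoint: "k8s-api.example.com:6443"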

Scheduler HA

Multiple schedulers with leader election:

  • Leader election - Only one scheduler is active at a time
  • Automatic failover - If leader fails, another becomes leader
  • No duplicate scheduling - Prevents conflicts
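Leader election for the scheduler is controlled through its component configuration (it is enabled by default in most distributions). A minimal KubeSchedulerConfiguration sketch; the timing values shown are the usual defaults and are illustrative:

apiVersion: kubescheduler.config.k8s.io/v1
kind: KubeSchedulerConfiguration
leaderElection:
  leaderElect: true        # only the lease holder actively schedules pods
  leaseDuration: 15s       # how long a lease is valid without renewal
  renewDeadline: 10s       # the leader must renew before this deadline
  retryPeriod: 2s          # how often standbys try to acquire the lease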

Controller Manager HA

Multiple controller managers with leader election:

  • Leader election - Only one controller manager is active
  • Automatic failover - If leader fails, another takes over
  • Consistent state - Prevents duplicate actions
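Under the hood, the active instance holds a Lease object in the kube-system namespace and renews it continuously; a standby takes over only when the lease expires. A sketch of what that record looks like (the holder identity and timestamp are illustrative):

apiVersion: coordination.k8s.io/v1
kind: Lease
metadata:
  name: kube-controller-manager               # the scheduler uses a lease named kube-scheduler
  namespace: kube-system
spec:
  holderIdentity: control-plane-2_7d1f4c9a    # current leader (illustrative value)
  leaseDurationSeconds: 15
  renewTime: "2024-01-01T12:00:00.000000Z"    # illustrative timestamp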

etcd Clustering

etcd stores all cluster state. Clustering etcd is essential for HA:

etcd Cluster

Typically 3 or 5 etcd nodes:

  • 3 nodes - Can survive 1 node failure
  • 5 nodes - Can survive 2 node failures
  • Odd numbers - Recommended for quorum (majority voting); adding an even member raises the majority needed without tolerating any more failures (see the sample member configuration below)
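A minimal sketch of one member's etcd configuration file in a three-node cluster; the member names, IP addresses, and paths are assumptions:

# etcd configuration for member etcd-1 (names, IPs, and paths are assumptions)
name: etcd-1
data-dir: /var/lib/etcd
listen-peer-urls: https://10.0.0.11:2380
listen-client-urls: https://10.0.0.11:2379,https://127.0.0.1:2379
initial-advertise-peer-urls: https://10.0.0.11:2380
advertise-client-urls: https://10.0.0.11:2379
# All three members are listed so each node can find its peers and form quorum.
initial-cluster: etcd-1=https://10.0.0.11:2380,etcd-2=https://10.0.0.12:2380,etcd-3=https://10.0.0.13:2380
initial-cluster-state: new
initial-cluster-token: k8s-etcd-cluster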

Raft Consensus

etcd uses Raft consensus:

  • Leader - One node handles writes
  • Followers - Replicate from leader
  • Quorum - A majority of members (floor(n/2) + 1) must acknowledge a write before it commits
  • Automatic leader election - A new leader is elected if the current one fails

etcd Placement

For best HA:

  • Separate nodes - Run etcd on dedicated nodes
  • Separate zones - Distribute across availability zones
  • Network isolation - Protect etcd network
  • Regular backups - Backup etcd regularly
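One common pattern is to snapshot etcd on a schedule from a control plane node. The sketch below assumes a kubeadm-style cluster (default certificate paths under /etc/kubernetes/pki/etcd and etcd listening on localhost:2379); the image tag, schedule, and backup directory are placeholders:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: etcd-backup
  namespace: kube-system
spec:
  schedule: "0 */6 * * *"                        # every 6 hours (placeholder)
  jobTemplate:
    spec:
      template:
        spec:
          hostNetwork: true                      # reach etcd on the node's loopback
          nodeSelector:
            node-role.kubernetes.io/control-plane: ""
          tolerations:
          - key: node-role.kubernetes.io/control-plane
            operator: Exists
            effect: NoSchedule
          restartPolicy: OnFailure
          containers:
          - name: snapshot
            image: registry.k8s.io/etcd:3.5.12-0   # image tag is a placeholder
            command:
            - etcdctl
            - snapshot
            - save
            - /backup/etcd-snapshot.db
            - --endpoints=https://127.0.0.1:2379
            - --cacert=/etc/kubernetes/pki/etcd/ca.crt
            - --cert=/etc/kubernetes/pki/etcd/server.crt
            - --key=/etc/kubernetes/pki/etcd/server.key
            volumeMounts:
            - name: etcd-certs
              mountPath: /etc/kubernetes/pki/etcd
              readOnly: true
            - name: backup
              mountPath: /backup
          volumes:
          - name: etcd-certs
            hostPath:
              path: /etc/kubernetes/pki/etcd
          - name: backup
            hostPath:
              path: /var/backups/etcd
              type: DirectoryOrCreate

Snapshots taken this way can later be restored with etcdctl snapshot restore as part of disaster recovery.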

Node Distribution

Distribute workloads across multiple nodes:

Multiple Worker Nodes

  • Node redundancy - Multiple nodes run workloads
  • Automatic rescheduling - Pods rescheduled if node fails
  • Load distribution - Workloads spread across nodes

Availability Zones

Distribute nodes across zones:

  • Zone redundancy - Nodes in multiple zones
  • Zone-aware scheduling - Spread pods across zones
  • Survive zone failures - Cluster survives zone outages
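Zone-aware spreading can be expressed directly on a workload with topologySpreadConstraints, which use the well-known topology.kubernetes.io/zone node label. A sketch, assuming an application labelled my-app; the image and replica count are placeholders:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 6
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      topologySpreadConstraints:
      - maxSkew: 1                                # zones may differ by at most one pod
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: DoNotSchedule          # keep pods pending rather than imbalance zones
        labelSelector:
          matchLabels:
            app: my-app
      containers:
      - name: my-app
        image: my-app:1.0                         # placeholder image

The PodDisruptionBudget shown next can then keep a minimum number of these replicas available during voluntary disruptions such as node drains.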

Pod Disruption Budgets

Control pod evictions during maintenance:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: my-app

Load Balancing

Distribute traffic across multiple instances:

Service Load Balancing

Kubernetes Services provide load balancing:

  • ClusterIP - Internal load balancing
  • NodePort - Expose on node ports
  • LoadBalancer - Cloud load balancer integration
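For example, a Service of type LoadBalancer spreads traffic across every ready pod that matches its selector; the selector and port numbers below are assumptions:

apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  type: LoadBalancer          # asks the cloud provider to provision an external load balancer
  selector:
    app: my-app               # traffic is balanced across all ready pods with this label
  ports:
  - port: 80                  # port exposed by the Service
    targetPort: 8080          # container port (assumption)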

Ingress Load Balancing

Ingress controllers provide HTTP load balancing:

  • Multiple replicas - Ingress controller replicas
  • Traffic distribution - Distribute across pods
  • Health checks - Route away from unhealthy pods
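A minimal Ingress sketch, assuming an ingress controller with class nginx is installed and that the Service above exposes port 80; the hostname is a placeholder:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app
spec:
  ingressClassName: nginx            # assumes an NGINX ingress controller is installed
  rules:
  - host: app.example.com            # placeholder hostname
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: my-app             # Service that fronts the application pods
            port:
              number: 80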

Failure Scenarios

HA design handles various failures:

Single Node Failure

  • Worker node - Pods rescheduled to other nodes
  • Control plane node - Other nodes continue operating
  • etcd node - Cluster continues if quorum maintained

Network Partition

  • Split-brain prevention - etcd quorum prevents split-brain
  • Partition handling - Majority partition continues
  • Automatic recovery - Rejoins when network restored

Data Center Failure

  • Multi-zone deployment - Survive zone failures
  • Multi-region - Survive region failures (advanced)
  • Disaster recovery - Backup and restore procedures

HA Architecture Patterns

Single Zone HA

graph TB subgraph "Zone 1" CP1[Control Plane 1] CP2[Control Plane 2] CP3[Control Plane 3] Node1[Node 1] Node2[Node 2] Node3[Node 3] end style CP1 fill:#e1f5ff style CP2 fill:#e1f5ff style CP3 fill:#e1f5ff style Node1 fill:#fff4e1 style Node2 fill:#fff4e1 style Node3 fill:#fff4e1
  • Multiple control plane nodes
  • Multiple worker nodes
  • etcd cluster
  • Survives node failures

Multi-Zone HA

graph TB subgraph "Zone 1" CP1[CP 1] Node1[Node 1] end subgraph "Zone 2" CP2[CP 2] Node2[Node 2] end subgraph "Zone 3" CP3[CP 3] Node3[Node 3] end style CP1 fill:#e1f5ff style CP2 fill:#e1f5ff style CP3 fill:#e1f5ff style Node1 fill:#fff4e1 style Node2 fill:#fff4e1 style Node3 fill:#fff4e1
  • Control plane across zones
  • Worker nodes across zones
  • etcd across zones
  • Survives zone failures

HA Best Practices

Control Plane

  • Multiple API servers - At least 3
  • Load balancer - Distribute API server traffic
  • Leader election - For scheduler and controller manager
  • Health monitoring - Monitor all components

etcd

  • Cluster size - 3 or 5 nodes
  • Separate nodes - Dedicated etcd nodes
  • Regular backups - Backup etcd regularly
  • Monitor health - Monitor etcd cluster health

Workloads

  • Multiple replicas - Run multiple pod replicas
  • Pod Disruption Budgets - Control evictions
  • Anti-affinity - Spread pods across nodes
  • Health checks - Liveness and readiness probes
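As a sketch, anti-affinity and probes can be combined in a single Deployment; the app name, image, port, and /healthz endpoint are assumptions:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - topologyKey: kubernetes.io/hostname    # at most one replica per node
            labelSelector:
              matchLabels:
                app: my-app
      containers:
      - name: my-app
        image: my-app:1.0                          # placeholder image
        readinessProbe:                            # gate traffic until the pod is ready
          httpGet:
            path: /healthz                         # health endpoint is an assumption
            port: 8080
          periodSeconds: 5
        livenessProbe:                             # restart the container if it stops responding
          httpGet:
            path: /healthz
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 10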

Networking

  • Load balancing - Use Services and Ingress
  • Health checks - Route away from unhealthy pods
  • Multiple endpoints - Distribute across endpoints

Monitoring HA

Monitor HA health:

  • Control plane status - All components healthy
  • etcd health - Cluster quorum maintained
  • Node status - Sufficient healthy nodes
  • Pod distribution - Workloads properly distributed
  • Service availability - Services responding

Key Takeaways

  • High availability ensures clusters survive component failures
  • Control plane HA requires multiple API servers, schedulers, controller managers
  • etcd clustering (3 or 5 nodes) provides state storage HA
  • Node distribution spreads workloads across multiple nodes
  • Load balancing distributes traffic across instances
  • HA handles node failures, network partitions, and zone failures
  • Follow HA best practices for production clusters
  • Monitor HA health continuously

See Also