High Availability Overview

High availability (HA) ensures that Kubernetes clusters continue operating even when individual components fail. Understanding HA is crucial for production deployments because failures are inevitable—nodes fail, networks partition, disks corrupt. HA design ensures your cluster survives these failures and continues serving workloads.

Think of high availability like redundancy in critical systems. Just as airplanes have multiple engines (so one failure doesn’t crash the plane), HA Kubernetes clusters have multiple control plane nodes, multiple etcd nodes, and workloads distributed across multiple worker nodes. If one component fails, others take over.

What is High Availability?

High availability means the system continues operating despite component failures. For Kubernetes, this means:

  • Control plane redundancy - Multiple API servers, schedulers, controller managers
  • etcd clustering - Multiple etcd nodes with replication
  • Node distribution - Workloads spread across multiple nodes
  • Load balancing - Traffic distributed across multiple instances
  • Automatic recovery - Failed components are replaced automatically

Why HA Matters

Production clusters need HA because:

  • Component failures - Hardware fails, software crashes
  • Network issues - Network partitions, connectivity problems
  • Maintenance - Need to update/reboot nodes without downtime
  • Disaster recovery - Survive data center failures
  • Service level agreements - Meet uptime requirements

Without HA, a single failure can take down your entire cluster.

Control Plane HA

The control plane is the “brain” of Kubernetes. Making it highly available is critical:

graph TB
    LB[Load Balancer] --> API1[API Server 1]
    LB --> API2[API Server 2]
    LB --> API3[API Server 3]
    API1 --> etcd1[etcd 1]
    API2 --> etcd2[etcd 2]
    API3 --> etcd3[etcd 3]
    API1 --> Scheduler1[Scheduler 1]
    API2 --> Scheduler2[Scheduler 2]
    API3 --> Scheduler3[Scheduler 3]
    API1 --> CM1[Controller Manager 1]
    API2 --> CM2[Controller Manager 2]
    API3 --> CM3[Controller Manager 3]
    etcd1 <--> etcd2
    etcd2 <--> etcd3
    etcd3 <--> etcd1
    style LB fill:#e1f5ff
    style API1 fill:#fff4e1
    style API2 fill:#fff4e1
    style API3 fill:#fff4e1
    style etcd1 fill:#e8f5e9
    style etcd2 fill:#e8f5e9
    style etcd3 fill:#e8f5e9

API Server HA

Multiple API servers behind a load balancer:

  • Load balancer - Distributes requests across API servers
  • Stateless - API servers are stateless, any can handle any request
  • Health checks - Load balancer routes away from unhealthy servers
  • Automatic failover - If one fails, others continue serving
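With kubeadm, for example, the shared endpoint is declared once in the cluster configuration so that every client and component talks to the load balancer instead of a single API server. A minimal sketch, assuming a load balancer reachable at k8s-api.example.com; the hostname and version are placeholders:

apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: v1.29.0                          # placeholder version
# Clients, kubelets, and control plane components all connect to the load
# balancer address, which health-checks and fails over between API servers.
controlPlaneEndpoint: "k8s-api.example.com:6443"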

Scheduler HA

Multiple schedulers with leader election:

  • Leader election - Only one scheduler is active at a time
  • Automatic failover - If leader fails, another becomes leader
  • No duplicate scheduling - Prevents conflicts
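Leader election for the scheduler is controlled through its component configuration (it is enabled by default in most distributions). A minimal KubeSchedulerConfiguration sketch; the timing values shown are the usual defaults and are illustrative:

apiVersion: kubescheduler.config.k8s.io/v1
kind: KubeSchedulerConfiguration
leaderElection:
  leaderElect: true        # only the lease holder actively schedules pods
  leaseDuration: 15s       # how long a lease is valid without renewal
  renewDeadline: 10s       # the leader must renew before this deadline
  retryPeriod: 2s          # how often standbys try to acquire the lease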

Controller Manager HA

Multiple controller managers with leader election:

  • Leader election - Only one controller manager is active
  • Automatic failover - If leader fails, another takes over
  • Consistent state - Prevents duplicate actions
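Under the hood, the active instance holds a Lease object in the kube-system namespace and renews it continuously; a standby takes over only when the lease expires. A sketch of what that record looks like (the holder identity and timestamp are illustrative):

apiVersion: coordination.k8s.io/v1
kind: Lease
metadata:
  name: kube-controller-manager               # the scheduler uses a lease named kube-scheduler
  namespace: kube-system
spec:
  holderIdentity: control-plane-2_7d1f4c9a    # current leader (illustrative value)
  leaseDurationSeconds: 15
  renewTime: "2024-01-01T12:00:00.000000Z"    # illustrative timestamp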

etcd Clustering

etcd stores all cluster state. Clustering etcd is essential for HA:

etcd Cluster

Typically 3 or 5 etcd nodes:

  • 3 nodes - Can survive 1 node failure
  • 5 nodes - Can survive 2 node failures
  • Odd numbers - Recommended for quorum (majority voting); adding an even member raises the majority needed without tolerating any more failures (see the sample member configuration below)
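A minimal sketch of one member's etcd configuration file in a three-node cluster; the member names, IP addresses, and paths are assumptions:

# etcd configuration for member etcd-1 (names, IPs, and paths are assumptions)
name: etcd-1
data-dir: /var/lib/etcd
listen-peer-urls: https://10.0.0.11:2380
listen-client-urls: https://10.0.0.11:2379,https://127.0.0.1:2379
initial-advertise-peer-urls: https://10.0.0.11:2380
advertise-client-urls: https://10.0.0.11:2379
# All three members are listed so each node can find its peers and form quorum.
initial-cluster: etcd-1=https://10.0.0.11:2380,etcd-2=https://10.0.0.12:2380,etcd-3=https://10.0.0.13:2380
initial-cluster-state: new
initial-cluster-token: k8s-etcd-cluster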

Raft Consensus

etcd uses Raft consensus:

  • Leader - One node handles writes
  • Followers - Replicate from leader
  • Quorum - A majority of members (floor(n/2) + 1) must acknowledge a write before it commits
  • Automatic leader election - A new leader is elected if the current one fails

etcd Placement

For best HA:

  • Separate nodes - Run etcd on dedicated nodes
  • Separate zones - Distribute across availability zones
  • Network isolation - Protect etcd network
  • Regular backups - Backup etcd regularly
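One common pattern is to snapshot etcd on a schedule from a control plane node. The sketch below assumes a kubeadm-style cluster (default certificate paths under /etc/kubernetes/pki/etcd and etcd listening on localhost:2379); the image tag, schedule, and backup directory are placeholders:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: etcd-backup
  namespace: kube-system
spec:
  schedule: "0 */6 * * *"                        # every 6 hours (placeholder)
  jobTemplate:
    spec:
      template:
        spec:
          hostNetwork: true                      # reach etcd on the node's loopback
          nodeSelector:
            node-role.kubernetes.io/control-plane: ""
          tolerations:
          - key: node-role.kubernetes.io/control-plane
            operator: Exists
            effect: NoSchedule
          restartPolicy: OnFailure
          containers:
          - name: snapshot
            image: registry.k8s.io/etcd:3.5.12-0   # image tag is a placeholder
            command:
            - etcdctl
            - snapshot
            - save
            - /backup/etcd-snapshot.db
            - --endpoints=https://127.0.0.1:2379
            - --cacert=/etc/kubernetes/pki/etcd/ca.crt
            - --cert=/etc/kubernetes/pki/etcd/server.crt
            - --key=/etc/kubernetes/pki/etcd/server.key
            volumeMounts:
            - name: etcd-certs
              mountPath: /etc/kubernetes/pki/etcd
              readOnly: true
            - name: backup
              mountPath: /backup
          volumes:
          - name: etcd-certs
            hostPath:
              path: /etc/kubernetes/pki/etcd
          - name: backup
            hostPath:
              path: /var/backups/etcd
              type: DirectoryOrCreate

Snapshots taken this way can later be restored with etcdctl snapshot restore as part of disaster recovery.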

Node Distribution

Distribute workloads across multiple nodes:

Multiple Worker Nodes

  • Node redundancy - Multiple nodes run workloads
  • Automatic rescheduling - Pods rescheduled if node fails
  • Load distribution - Workloads spread across nodes

Availability Zones

Distribute nodes across zones:

  • Zone redundancy - Nodes in multiple zones
  • Zone-aware scheduling - Spread pods across zones
  • Survive zone failures - Cluster survives zone outages
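Zone-aware spreading can be expressed directly on a workload with topologySpreadConstraints, which use the well-known topology.kubernetes.io/zone node label. A sketch, assuming an application labelled my-app; the image and replica count are placeholders:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 6
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      topologySpreadConstraints:
      - maxSkew: 1                                # zones may differ by at most one pod
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: DoNotSchedule          # keep pods pending rather than imbalance zones
        labelSelector:
          matchLabels:
            app: my-app
      containers:
      - name: my-app
        image: my-app:1.0                         # placeholder image

The PodDisruptionBudget shown next can then keep a minimum number of these replicas available during voluntary disruptions such as node drains.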

Pod Disruption Budgets

Control pod evictions during maintenance:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: my-app

Load Balancing

Distribute traffic across multiple instances:

Service Load Balancing

Kubernetes Services provide load balancing:

  • ClusterIP - Internal load balancing
  • NodePort - Expose on node ports
  • LoadBalancer - Cloud load balancer integration
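For example, a Service of type LoadBalancer spreads traffic across every ready pod that matches its selector; the selector and port numbers below are assumptions:

apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  type: LoadBalancer          # asks the cloud provider to provision an external load balancer
  selector:
    app: my-app               # traffic is balanced across all ready pods with this label
  ports:
  - port: 80                  # port exposed by the Service
    targetPort: 8080          # container port (assumption)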

Ingress Load Balancing

Ingress controllers provide HTTP load balancing:

  • Multiple replicas - Ingress controller replicas
  • Traffic distribution - Distribute across pods
  • Health checks - Route away from unhealthy pods
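A minimal Ingress sketch, assuming an ingress controller with class nginx is installed and that the Service above exposes port 80; the hostname is a placeholder:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app
spec:
  ingressClassName: nginx            # assumes an NGINX ingress controller is installed
  rules:
  - host: app.example.com            # placeholder hostname
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: my-app             # Service that fronts the application pods
            port:
              number: 80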

Failure Scenarios

HA design handles various failures:

Single Node Failure

  • Worker node - Pods rescheduled to other nodes
  • Control plane node - Other nodes continue operating
  • etcd node - Cluster continues if quorum maintained

Network Partition

  • Split-brain prevention - etcd quorum prevents split-brain
  • Partition handling - Majority partition continues
  • Automatic recovery - Rejoins when network restored

Data Center Failure

  • Multi-zone deployment - Survive zone failures
  • Multi-region - Survive region failures (advanced)
  • Disaster recovery - Backup and restore procedures

HA Architecture Patterns

Single Zone HA

graph TB subgraph "Zone 1" CP1[Control Plane 1] CP2[Control Plane 2] CP3[Control Plane 3] Node1[Node 1] Node2[Node 2] Node3[Node 3] end style CP1 fill:#e1f5ff style CP2 fill:#e1f5ff style CP3 fill:#e1f5ff style Node1 fill:#fff4e1 style Node2 fill:#fff4e1 style Node3 fill:#fff4e1
  • Multiple control plane nodes
  • Multiple worker nodes
  • etcd cluster
  • Survives node failures

Multi-Zone HA

graph TB subgraph "Zone 1" CP1[CP 1] Node1[Node 1] end subgraph "Zone 2" CP2[CP 2] Node2[Node 2] end subgraph "Zone 3" CP3[CP 3] Node3[Node 3] end style CP1 fill:#e1f5ff style CP2 fill:#e1f5ff style CP3 fill:#e1f5ff style Node1 fill:#fff4e1 style Node2 fill:#fff4e1 style Node3 fill:#fff4e1
  • Control plane across zones
  • Worker nodes across zones
  • etcd across zones
  • Survives zone failures

HA Best Practices

Control Plane

  • Multiple API servers - At least 3
  • Load balancer - Distribute API server traffic
  • Leader election - For scheduler and controller manager
  • Health monitoring - Monitor all components

etcd

  • Cluster size - 3 or 5 nodes
  • Separate nodes - Dedicated etcd nodes
  • Regular backups - Backup etcd regularly
  • Monitor health - Monitor etcd cluster health

Workloads

  • Multiple replicas - Run multiple pod replicas
  • Pod Disruption Budgets - Control evictions
  • Anti-affinity - Spread pods across nodes
  • Health checks - Liveness and readiness probes
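As a sketch, anti-affinity and probes can be combined in a single Deployment; the app name, image, port, and /healthz endpoint are assumptions:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - topologyKey: kubernetes.io/hostname    # at most one replica per node
            labelSelector:
              matchLabels:
                app: my-app
      containers:
      - name: my-app
        image: my-app:1.0                          # placeholder image
        readinessProbe:                            # gate traffic until the pod is ready
          httpGet:
            path: /healthz                         # health endpoint is an assumption
            port: 8080
          periodSeconds: 5
        livenessProbe:                             # restart the container if it stops responding
          httpGet:
            path: /healthz
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 10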

Networking

  • Load balancing - Use Services and Ingress
  • Health checks - Route away from unhealthy pods
  • Multiple endpoints - Distribute across endpoints

Monitoring HA

Monitor HA health:

  • Control plane status - All components healthy
  • etcd health - Cluster quorum maintained
  • Node status - Sufficient healthy nodes
  • Pod distribution - Workloads properly distributed
  • Service availability - Services responding

Key Takeaways

  • High availability ensures clusters survive component failures
  • Control plane HA requires multiple API servers, schedulers, controller managers
  • etcd clustering (3 or 5 nodes) provides state storage HA
  • Node distribution spreads workloads across multiple nodes
  • Load balancing distributes traffic across instances
  • HA handles node failures, network partitions, and zone failures
  • Follow HA best practices for production clusters
  • Monitor HA health continuously

See Also