High Availability

High availability (HA) ensures your Kubernetes cluster continues operating even when individual components fail. In a single-node control plane, if that node fails, the entire cluster becomes unusable. High availability distributes control plane components across multiple nodes, so the failure of one node doesn’t bring down the cluster.

Think of high availability like having multiple engines on an airplane. If one engine fails, the others keep the plane flying. Similarly, with multiple control plane nodes, if one fails, the others continue serving the cluster.

What Is High Availability?

High availability in Kubernetes means:

  • Multiple Control Plane Nodes - Run API server, controller manager, and scheduler on multiple nodes
  • Clustered etcd - Run etcd as a distributed cluster (typically 3 or 5 nodes)
  • Load Balanced API Traffic - Distribute API server requests across control plane nodes
  • Node Redundancy - Run worker nodes across multiple availability zones
  • Automatic Failover - Components automatically use healthy nodes when others fail

```mermaid
graph TB
  subgraph "High Availability Cluster"
    subgraph "Control Plane"
      A1[API Server 1]
      A2[API Server 2]
      A3[API Server 3]
      E1[etcd 1]
      E2[etcd 2]
      E3[etcd 3]
      C1[Controller Manager]
      C2[Controller Manager]
      C3[Controller Manager]
      S1[Scheduler]
      S2[Scheduler]
      S3[Scheduler]
    end
    LB[Load Balancer]
    subgraph "Worker Nodes"
      W1[Node 1<br/>Zone A]
      W2[Node 2<br/>Zone B]
      W3[Node 3<br/>Zone C]
    end
  end
  LB --> A1
  LB --> A2
  LB --> A3
  A1 --> E1
  A2 --> E2
  A3 --> E3
  E1 <--> E2
  E2 <--> E3
  E3 <--> E1
  W1 --> LB
  W2 --> LB
  W3 --> LB
  style LB fill:#e1f5ff
  style A1 fill:#fff4e1
  style A2 fill:#fff4e1
  style A3 fill:#fff4e1
  style E1 fill:#e8f5e9
  style E2 fill:#e8f5e9
  style E3 fill:#e8f5e9
```

Control Plane High Availability

The control plane consists of several components, each with different HA requirements:

API Server

The API server is stateless and can run multiple instances. A load balancer distributes requests across all API server instances. If one API server fails, requests automatically go to the others.

etcd

etcd is stateful and requires clustering for HA. etcd uses a consensus algorithm (Raft) that requires a quorum (majority) of nodes to operate:

  • 3-node etcd - Can tolerate 1 node failure (needs 2 of 3 for quorum)
  • 5-node etcd - Can tolerate 2 node failures (needs 3 of 5 for quorum)
  • 7-node etcd - Can tolerate 3 node failures (needs 4 of 7 for quorum)

More nodes provide better fault tolerance, but every write must be replicated to a quorum, so larger clusters add latency and operational complexity. Most clusters use 3-node etcd.

```mermaid
graph LR
  subgraph "3-Node etcd Cluster"
    E1[etcd 1]
    E2[etcd 2]
    E3[etcd 3]
  end
  E1 <-->|Raft Consensus| E2
  E2 <-->|Raft Consensus| E3
  E3 <-->|Raft Consensus| E1
  Q[Quorum: 2 of 3<br/>Tolerates 1 failure]
  style E1 fill:#e8f5e9
  style E2 fill:#e8f5e9
  style E3 fill:#e8f5e9
  style Q fill:#fff4e1
```
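You can see quorum in practice by querying the cluster with etcdctl. The sketch below is illustrative only: the endpoint addresses and certificate paths are assumptions based on a typical kubeadm-managed stacked etcd, and it assumes etcdctl is available on a control plane node (it is also common to run these commands via kubectl exec into an etcd pod).

```bash
# Assumed endpoints and kubeadm-style certificate paths; adjust for your cluster.
export ETCDCTL_API=3
ENDPOINTS=https://10.0.0.11:2379,https://10.0.0.12:2379,https://10.0.0.13:2379

# List the members that form the cluster (3 here, so quorum is 2).
etcdctl --endpoints="$ENDPOINTS" \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  member list -w table

# Check each endpoint; the cluster stays writable as long as a
# majority (2 of 3) of members report healthy.
etcdctl --endpoints="$ENDPOINTS" \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  endpoint health
```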

Controller Manager and Scheduler

These components use leader election—only one instance is active at a time, but multiple instances run for redundancy. If the active instance fails, another instance takes over automatically.
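Leader election is easy to observe: each of these components holds a Lease object in the kube-system namespace, and the holder identity names the currently active instance. A minimal check, assuming only a working kubeconfig:

```bash
# The HOLDER column shows which replica currently owns the election lock;
# when that instance fails, another replica acquires the lease and takes over.
kubectl -n kube-system get lease kube-controller-manager kube-scheduler
```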

etcd Topologies

How etcd is deployed affects availability:

Stacked etcd Topology

etcd runs on the same nodes as control plane components. This is simpler but couples etcd availability with control plane availability.

Advantages:

  • Simpler setup (fewer nodes)
  • Lower resource requirements
  • Easier to manage

Disadvantages:

  • etcd and the API server share fate (if a node fails, both are affected)
  • More complex recovery (need to restore both)

External etcd Topology

etcd runs on separate nodes from control plane components. This provides better isolation and is recommended for production.

Advantages:

  • Better isolation (etcd and control plane failures are independent)
  • Can scale etcd separately
  • More resilient to failures

Disadvantages:

  • More nodes to manage
  • Higher resource requirements
  • More complex setup
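With kubeadm, the external topology is selected by pointing the control plane at an existing etcd cluster in the ClusterConfiguration. The sketch below is not a complete setup: the etcd endpoints, load balancer DNS name, and client certificate paths are assumptions you would replace with your own, and the v1beta3 API version may differ by kubeadm release.

```bash
# Hypothetical values throughout; substitute your etcd endpoints,
# load balancer address, and client certificate locations.
cat <<EOF > kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: stable
controlPlaneEndpoint: "lb.example.com:6443"
etcd:
  external:
    endpoints:
      - https://10.0.1.21:2379
      - https://10.0.1.22:2379
      - https://10.0.1.23:2379
    caFile: /etc/kubernetes/pki/etcd/ca.crt
    certFile: /etc/kubernetes/pki/apiserver-etcd-client.crt
    keyFile: /etc/kubernetes/pki/apiserver-etcd-client.key
EOF

# Initialize the first control plane node against the external etcd cluster.
kubeadm init --config kubeadm-config.yaml --upload-certs
```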

Load Balancing Control Plane Traffic

Everything that talks to the API server (kubelets, kube-proxy, controllers, and human users) needs a stable address to reach it. In an HA setup, a load balancer distributes this traffic across the API server instances:

```mermaid
graph TB
  subgraph "Clients"
    K1[kubelet 1]
    K2[kubelet 2]
    U1[Users]
    C1[Controllers]
  end
  LB[Load Balancer<br/>VIP: 10.0.0.100]
  subgraph "Control Plane"
    A1[API Server 1<br/>10.0.0.11]
    A2[API Server 2<br/>10.0.0.12]
    A3[API Server 3<br/>10.0.0.13]
  end
  K1 --> LB
  K2 --> LB
  U1 --> LB
  C1 --> LB
  LB --> A1
  LB --> A2
  LB --> A3
  style LB fill:#e1f5ff
  style A1 fill:#fff4e1
  style A2 fill:#fff4e1
  style A3 fill:#fff4e1
```

The load balancer must:

  • Health check API servers
  • Distribute traffic evenly
  • Handle API server failures gracefully
  • Provide a stable endpoint (VIP) that doesn’t change

Failure Scenarios

High availability protects against various failure scenarios:

Single Control Plane Node Failure

  • API server: Load balancer routes to other API servers (no impact)
  • Controller manager/scheduler: Another instance takes over via leader election (brief pause)
  • etcd (stacked): Cluster continues with remaining etcd nodes (if quorum maintained)

etcd Node Failure

  • 3-node etcd: Cluster continues with 2 nodes (quorum maintained)
  • 5-node etcd: Cluster continues with 4 nodes (quorum maintained)
  • If quorum is lost: etcd stops accepting writes, so cluster state can no longer change; running workloads continue, but the cluster is effectively down
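After losing a member you can confirm that the surviving cluster still has a leader and still accepts writes. A hedged sketch, reusing the assumed endpoints and certificate paths from the earlier etcdctl example:

```bash
export ETCDCTL_API=3
ENDPOINTS=https://10.0.0.11:2379,https://10.0.0.12:2379,https://10.0.0.13:2379  # assumed

# Endpoint status shows the current leader and Raft term per member;
# a surviving majority keeps exactly one leader.
etcdctl --endpoints="$ENDPOINTS" \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  endpoint status -w table

# A write succeeds only while quorum holds, so this doubles as a quorum test.
etcdctl --endpoints="$ENDPOINTS" \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  put ha-quorum-check "$(date +%s)"
```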

Load Balancer Failure

  • Single point of failure
  • Mitigate with redundant load balancers or DNS-based failover

Availability Zone Failure

  • Distribute control plane nodes across zones
  • Distribute worker nodes across zones
  • Use Pod Disruption Budgets to maintain application availability
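Pod Disruption Budgets can be created imperatively. The sketch below assumes a hypothetical application labelled app=web running at least three replicas spread across zones.

```bash
# Keep at least 2 pods of the (hypothetical) app=web workload available
# during voluntary disruptions such as node drains during zone maintenance.
kubectl create poddisruptionbudget web-pdb \
  --selector=app=web \
  --min-available=2

# Confirm how many disruptions are currently allowed.
kubectl get pdb web-pdb
```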

HA Setup with kubeadm

kubeadm supports HA cluster setup:

  1. Initialize first control plane node - Creates certificates and initial configuration
  2. Copy certificates - Share certificates to other control plane nodes
  3. Join additional control plane nodes - Use kubeadm join with the --control-plane flag
  4. Configure load balancer - Set up load balancer pointing to all API servers
  5. Update kubeconfig - Point to load balancer VIP instead of single node
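In command form, the flow looks roughly like this. The load balancer address, token, hash, and certificate key are all placeholders; kubeadm prints the real join command (including the certificate key when --upload-certs is used) at the end of kubeadm init.

```bash
# Steps 1-2: initialize the first control plane node against the LB endpoint
# and upload the certificates so other control plane nodes can fetch them.
sudo kubeadm init \
  --control-plane-endpoint "lb.example.com:6443" \
  --upload-certs

# Step 3: on each additional control plane node, join as a control plane
# (the values below are placeholders printed by the init command above).
sudo kubeadm join lb.example.com:6443 \
  --token <token> \
  --discovery-token-ca-cert-hash sha256:<hash> \
  --control-plane \
  --certificate-key <certificate-key>

# Step 5: the kubeconfig should point at the load balancer, not a single node.
kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}'
```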

Best Practices

  1. Use 3 or 5 etcd nodes - An odd number of members avoids tied votes; adding an even member raises the quorum size without improving fault tolerance
  2. Distribute across zones - Place nodes in different availability zones
  3. Monitor etcd health - Watch etcd cluster health and quorum status
  4. Test failure scenarios - Regularly test node failures to verify HA works
  5. Document procedures - Document how to add/remove control plane nodes
  6. Use external etcd for production - Better isolation than stacked topology
  7. Configure proper load balancing - Use health checks and proper algorithms
  8. Plan for upgrades - Upgrade HA clusters one node at a time
  9. Monitor leader election - Ensure controller manager and scheduler have leaders
  10. Backup etcd regularly - Even with HA, backups are essential
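For the last point, etcd snapshots are taken with etcdctl. A minimal sketch, again assuming kubeadm-style certificate paths and a local etcd member; in practice you would schedule this and ship the snapshot off the node.

```bash
# Take a point-in-time snapshot of etcd (paths are assumed kubeadm defaults).
ETCDCTL_API=3 etcdctl snapshot save /var/backups/etcd-$(date +%F).db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key

# Verify the snapshot is readable before relying on it.
ETCDCTL_API=3 etcdctl snapshot status /var/backups/etcd-$(date +%F).db -w table
```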
