Scheduling

Scheduling determines which nodes pods run on in your Kubernetes cluster. The Kubernetes scheduler assigns pods to nodes based on resource requirements, constraints, affinity rules, and other criteria. Understanding scheduling helps you control pod placement, optimize resource utilization, and ensure workloads run on appropriate nodes.

What Is Scheduling?

Scheduling is the process of assigning pods to nodes. When you create a pod, the scheduler evaluates cluster nodes and selects the best node based on various factors like resource availability, node constraints, pod requirements, and affinity/anti-affinity rules.

graph TB
  A[Pod Created] --> B[Scheduler Evaluates]
  B --> C[Check Node Resources]
  B --> D[Check Node Constraints]
  B --> E[Check Affinity Rules]
  B --> F[Check Taints/Tolerations]
  C --> G[Filter Nodes]
  D --> G
  E --> G
  F --> G
  G --> H[Score Remaining Nodes]
  H --> I[Select Best Node]
  I --> J[Bind Pod to Node]
  J --> K[Kubelet Starts Pod]
  style A fill:#e1f5ff
  style B fill:#fff4e1
  style I fill:#e8f5e9
  style K fill:#f3e5f5

The Scheduler

The Kubernetes scheduler (kube-scheduler) is a control plane component that runs as a long-lived process. It watches for newly created pods that have no node assigned and selects a node for each of them to run on.

graph LR
  A[Scheduler] --> B[Watch API Server]
  B --> C[Pods Pending]
  C --> D[Filter Nodes]
  D --> E[Score Nodes]
  E --> F[Bind Pod to Node]
  style A fill:#e1f5ff
  style C fill:#fff4e1
  style F fill:#e8f5e9

Scheduling Process

The scheduling process involves two phases:

1. Filtering (Predicates)

Filter out nodes that don’t meet pod requirements:

  • Resource availability - Node has enough CPU and memory
  • Node constraints - Node is untainted, or its taints are tolerated by the pod
  • Affinity rules - Node satisfies pod affinity requirements
  • Port availability - Required ports are available
  • Volume constraints - Required volumes can be attached

graph TD
  A[All Nodes] --> B[Filter: Resources]
  B --> C[Filter: Taints]
  C --> D[Filter: Affinity]
  D --> E[Filter: Ports]
  E --> F[Filter: Volumes]
  F --> G[Feasible Nodes]
  style A fill:#e1f5ff
  style G fill:#e8f5e9

2. Scoring (Priorities)

Rank the remaining nodes to select the best one:

  • Resource balance - Prefer nodes with balanced resource usage
  • Affinity preferences - Prefer nodes that match preferred affinity
  • Inter-pod affinity - Prefer nodes with related pods
  • Least requested - Prefer nodes with fewer requested resources
  • Node affinity - Prefer nodes that match node affinity preferences

graph TD
  A[Feasible Nodes] --> B[Score: Resource Balance]
  B --> C[Score: Affinity]
  C --> D[Score: Inter-Pod Affinity]
  D --> E[Score: Least Requested]
  E --> F[Select Highest Score]
  F --> G[Selected Node]
  style A fill:#e1f5ff
  style F fill:#fff4e1
  style G fill:#e8f5e9

Scheduling Constraints

Various mechanisms control pod placement:

Resource Requests and Limits

Define how much CPU and memory pods need:

resources:
  requests:
    memory: "64Mi"
    cpu: "250m"
  limits:
    memory: "128Mi"
    cpu: "500m"

The scheduler uses requests, not limits, to decide whether a node has enough free capacity for a pod; limits are enforced at runtime after the pod is placed.

Node Selectors

Simple node selection based on labels:

nodeSelector:
  disktype: ssd
  zone: us-west-1

Affinity and Anti-Affinity

Advanced rules for pod placement:

  • Node Affinity - Place pods on nodes with specific characteristics
  • Pod Affinity - Place pods near other pods
  • Pod Anti-Affinity - Keep pods away from other pods

graph TB
  A[Affinity Rules] --> B[Node Affinity]
  A --> C[Pod Affinity]
  A --> D[Pod Anti-Affinity]
  B --> E[Required or Preferred]
  C --> F[Required or Preferred]
  D --> G[Required or Preferred]
  style A fill:#e1f5ff
  style B fill:#fff4e1
  style C fill:#fff4e1
  style D fill:#fff4e1
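
As a minimal sketch (the label keys disktype and app, and their values, are placeholder assumptions), a pod spec fragment might require SSD nodes while preferring not to share a node with other pods of the same app:

# Requires nodes labeled disktype=ssd; prefers to avoid nodes
# already running pods labeled app=web (placeholder labels).
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: disktype
              operator: In
              values:
                - ssd
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: web
          topologyKey: kubernetes.io/hostname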

Taints and Tolerations

Node-level restrictions:

  • Taints - Mark nodes to repel pods (unless they have matching tolerations)
  • Tolerations - Allow pods to run on tainted nodes

Useful for:

  • Dedicated nodes (e.g., GPU nodes, database nodes)
  • Preventing regular workloads from scheduling on control plane nodes
  • Isolating workloads
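
As an illustrative sketch, assuming a node carries a dedicated=gpu:NoSchedule taint (the key, value, and effect are placeholders), a pod would need a matching toleration in its spec to be scheduled there:

# Toleration matching a hypothetical dedicated=gpu:NoSchedule taint
tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "gpu"
    effect: "NoSchedule"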

Topology Spread Constraints

Control how pods are distributed across zones, nodes, or other topology domains:

  • Distribute pods evenly across zones
  • Prevent too many pods on a single node
  • Ensure availability zone distribution
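
A minimal sketch of a spread constraint, assuming the pods are labeled app=web (a placeholder label), that keeps the pod count per zone within one of every other zone:

# Spread pods labeled app=web evenly across zones; maxSkew 1 means
# no zone may exceed the least-loaded zone by more than one pod.
topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: DoNotSchedule
    labelSelector:
      matchLabels:
        app: web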

Scheduling Scenarios

Scenario 1: Resource-Based Scheduling

graph LR
  A[Pod: 2 CPU, 4Gi RAM] --> B[Scheduler]
  B --> C[Node 1: 1 CPU, 2Gi RAM]
  B --> D[Node 2: 4 CPU, 8Gi RAM]
  B --> E[Node 3: 2 CPU, 6Gi RAM]
  C -.->|Insufficient| F[Filtered Out]
  D --> G[Can Schedule]
  E --> G
  style A fill:#e1f5ff
  style G fill:#e8f5e9
  style F fill:#ffe1e1
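
A pod spec matching this scenario could look like the following (the pod name and image are placeholders); only Node 2 and Node 3 survive the resource filter:

apiVersion: v1
kind: Pod
metadata:
  name: demo-app        # placeholder name
spec:
  containers:
    - name: app
      image: nginx      # placeholder image
      resources:
        requests:
          cpu: "2"
          memory: "4Gi"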

Scenario 2: Affinity-Based Scheduling

graph TB
  A[Pod with Node Affinity] --> B[Requires: zone=us-west-1]
  B --> C[Node 1: zone=us-east-1]
  B --> D[Node 2: zone=us-west-1]
  B --> E[Node 3: zone=us-west-1]
  C -.->|Filtered| F[Not Selected]
  D --> G[Can Schedule]
  E --> G
  style A fill:#e1f5ff
  style G fill:#e8f5e9
  style F fill:#ffe1e1

Scenario 3: Taint-Based Scheduling

graph TB
  A[Pod] --> B[No Tolerations]
  B --> C[Node 1: No Taints]
  B --> D[Node 2: Tainted]
  B --> E[Node 3: No Taints]
  C --> F[Can Schedule]
  D -.->|Blocked| G[Filtered Out]
  E --> F
  style A fill:#e1f5ff
  style F fill:#e8f5e9
  style G fill:#ffe1e1

Scheduling Best Practices

  1. Set resource requests - Always specify CPU and memory requests for predictable scheduling

  2. Use resource limits - Prevent pods from consuming excessive resources

  3. Leverage node selectors - For simple placement requirements

  4. Use affinity for complex rules - When you need more sophisticated placement logic

  5. Implement anti-affinity - For high availability and workload isolation

  6. Use taints for dedicated nodes - Reserve nodes for specific workloads

  7. Apply topology spread - Distribute pods across zones and nodes

  8. Monitor scheduling - Watch for unschedulable pods

  9. Test placement - Verify pods schedule on intended nodes

  10. Document constraints - Clearly document why specific scheduling rules are needed

Common Scheduling Issues

Pods Stuck in Pending

Pods can’t be scheduled when:

  • No nodes have sufficient resources
  • No nodes match node selectors or affinity rules
  • All nodes are tainted and pods lack tolerations
  • Resource quotas are exceeded
  • PersistentVolumeClaims can’t be satisfied

Pods Scheduled on Wrong Nodes

  • Check node labels and selectors
  • Verify affinity rules are correct
  • Review taint/toleration configuration
  • Check node capacity and resource availability

Uneven Pod Distribution

  • Use topology spread constraints
  • Implement pod anti-affinity
  • Review node capacity and resource requests
  • Consider node affinity preferences

See Also

  • Deployments - Workloads that use scheduling constraints
  • Nodes - Understanding Kubernetes nodes
  • Services - How scheduling affects service discovery