Scheduling Overview

Scheduling is the process by which Kubernetes decides which node should run each pod. Understanding scheduling is important because it affects where your workloads run, how resources are utilized, and how applications perform. The scheduler is a core component that ensures pods are placed on appropriate nodes based on resource requirements, constraints, and policies.

Think of the scheduler like a smart assignment system. When you have a task (pod) that needs to be done, the scheduler looks at all available workers (nodes), considers their capabilities, current workload, and any special requirements, then assigns the task to the best worker. It’s not random—it’s an intelligent decision-making process.

What is Scheduling?

Scheduling is the process of assigning pods to nodes. When you create a pod, the scheduler:

  1. Finds the pod - Detects pods with no node assignment
  2. Filters nodes - Removes nodes that can’t run the pod
  3. Scores nodes - Ranks remaining nodes by suitability
  4. Selects node - Chooses the best node
  5. Binds pod - Assigns pod to selected node

Once scheduled, the kubelet on that node creates and runs the pod’s containers.
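
To confirm the scheduler's decision, you can inspect which node a pod was bound to (my-pod is a placeholder name):

kubectl get pod my-pod -o wide
kubectl get pod my-pod -o jsonpath='{.spec.nodeName}'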

The Scheduler

The scheduler is a control plane component that runs as kube-scheduler. It’s responsible for:

  • Watching for unscheduled pods - Continuously monitors for new pods
  • Evaluating nodes - Checks resource availability, constraints, policies
  • Making placement decisions - Selects the best node for each pod
  • Binding pods - Updates pods with node assignments

The scheduler doesn’t actually create pods—it just decides where they should run. The kubelet on the selected node creates the containers.
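
On clusters bootstrapped with kubeadm, kube-scheduler typically runs as a static pod in the kube-system namespace; the label below assumes kubeadm's default manifests:

kubectl get pods -n kube-system -l component=kube-scheduler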

Scheduling Process

graph TB
    Pod[New Pod Created] --> Scheduler[Scheduler Watches]
    Scheduler --> Filter[Filter Nodes]
    Filter --> Score[Score Nodes]
    Score --> Select[Select Best Node]
    Select --> Bind[Bind Pod to Node]
    Bind --> Kubelet[Kubelet Creates Containers]
    style Pod fill:#e1f5ff
    style Scheduler fill:#fff4e1
    style Filter fill:#e8f5e9
    style Score fill:#f3e5f5
    style Bind fill:#fff4e1
    style Kubelet fill:#e1f5ff

Filtering Phase

The scheduler filters out nodes that can’t run the pod:

  • Insufficient resources - Node doesn’t have enough CPU/memory
  • Node selector mismatch - Pod’s nodeSelector doesn’t match
  • Taints without tolerations - Node has taints pod can’t tolerate
  • Pod affinity/anti-affinity - Affinity rules not satisfied
  • Node conditions - Node is not ready or has pressure
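
For example, a pod whose CPU request exceeds what any node can offer is filtered out of every node and stays Pending (the 64-CPU request below assumes it is larger than your biggest node):

apiVersion: v1
kind: Pod
metadata:
  name: too-big
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests:
        cpu: "64"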

Scoring Phase

The scheduler scores remaining nodes and selects the best one:

  • Resource availability - Prefers nodes with more available resources
  • Affinity preferences - Prefers nodes matching affinity rules
  • Load balancing - Spreads pods across nodes
  • Custom policies - Applies custom scoring plugins

Resource Requests and Limits

Pods specify resource requirements that affect scheduling:

Requests

Requests are the minimum resources guaranteed to a pod:

spec:
  containers:
  - name: app
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
  • Scheduler only places pod on nodes with enough resources
  • Resources are reserved for the pod
  • Used for scheduling decisions

Limits

Limits are the maximum resources a pod can use:

spec:
  containers:
  - name: app
    resources:
      limits:
        memory: "128Mi"
        cpu: "500m"
  • Pod can’t exceed these limits
  • Used for resource enforcement (not scheduling)
  • Helps prevent resource starvation

How They Affect Scheduling

graph LR
    Pod[Pod with<br/>Requests] --> Scheduler[Scheduler Checks]
    Scheduler --> Node1[Node 1<br/>CPU: 2 cores<br/>Available: 1 core]
    Scheduler --> Node2[Node 2<br/>CPU: 4 cores<br/>Available: 3 cores]
    Pod -.->|"Request: 500m"| Scheduler
    Scheduler -->|"Can fit"| Node1
    Scheduler -->|"Can fit"| Node2
    Scheduler -->|"Selects Node 2<br/>(more resources)"| Node2
    style Pod fill:#e1f5ff
    style Scheduler fill:#fff4e1
    style Node1 fill:#e8f5e9
    style Node2 fill:#e8f5e9

The scheduler:

  • Only considers nodes with enough available resources
  • Prefers nodes with more available resources by default, spreading load across nodes
  • Ensures node capacity isn’t exceeded
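
To see how much room a node actually has, kubectl describe node shows its allocatable capacity and the resources already requested by scheduled pods (node1 is a placeholder name):

kubectl describe node node1
kubectl get node node1 -o jsonpath='{.status.allocatable}'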

Node Selectors

Node selectors allow pods to specify which nodes they can run on:

spec:
  nodeSelector:
    disktype: ssd
    zone: us-west-1
  • Pods only scheduled on nodes matching all selectors
  • Simple key-value matching
  • Nodes must have matching labels
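
Selectors only match nodes that carry those labels, which you can set with kubectl label (node1 is a placeholder name):

kubectl label nodes node1 disktype=ssd zone=us-west-1
kubectl get nodes -l disktype=ssd,zone=us-west-1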

Affinity and Anti-Affinity

Affinity rules provide more flexible scheduling constraints:

Pod Affinity

Co-locate pods with other pods that match a label selector; the example below uses the hard required form:

spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - web
        topologyKey: kubernetes.io/hostname

Pod Anti-Affinity

Spread pods away from other pods that match a label selector; the example below uses the soft preferred form:

spec:
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: my-app
          topologyKey: kubernetes.io/hostname

Node Affinity

Constrain pods to nodes with specific characteristics; the example below requires nodes in a given zone:

spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: zone
            operator: In
            values:
            - us-west-1
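
Node affinity also has a soft form. A minimal sketch of a preferred rule that favors SSD nodes without making them mandatory (the label key and weight are illustrative):

spec:
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 50
        preference:
          matchExpressions:
          - key: disktype
            operator: In
            values:
            - ssd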

Taints and Tolerations

Taints and tolerations control which pods can run on which nodes:

Taints

Nodes can have taints that repel pods:

kubectl taint nodes node1 key=value:NoSchedule

Tolerations

Pods can have tolerations that allow them to run on tainted nodes:

spec:
  tolerations:
  - key: "key"
    operator: "Equal"
    value: "value"
    effect: "NoSchedule"

Effects:

  • NoSchedule - Don’t schedule new pods (existing pods unaffected)
  • PreferNoSchedule - Try not to schedule (soft taint)
  • NoExecute - Don’t schedule new pods and evict running pods that don’t tolerate the taint
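
To inspect a node’s taints or remove one (node1 and the key/value are placeholders; the trailing - removes the taint):

kubectl describe node node1 | grep Taints
kubectl taint nodes node1 key=value:NoSchedule-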

Topology Spread Constraints

Topology spread constraints control how pods are distributed across zones, nodes, or other topologies:

spec:
  topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: zone
    whenUnsatisfiable: DoNotSchedule
    labelSelector:
      matchLabels:
        app: my-app

This ensures pods are evenly distributed across zones.

Priority and Preemption

Pods can have priorities that affect scheduling:

apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority
value: 1000
---
spec:
  priorityClassName: high-priority

Higher priority pods can preempt (evict) lower priority pods if needed.
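
You can list the PriorityClasses defined in a cluster, including the built-in system-cluster-critical and system-node-critical classes:

kubectl get priorityclasses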

Scheduling Failures

Pods can fail to schedule if:

  • No nodes available - All nodes filtered out
  • Insufficient resources - No nodes have enough resources
  • Constraints not met - Affinity, taints, selectors not satisfied
  • Node conditions - All nodes have issues (pressure, not ready)

Check scheduling status:

kubectl describe pod my-pod

Look for events showing why scheduling failed.
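
Pending pods and their FailedScheduling events can also be listed cluster-wide using standard field selectors:

kubectl get pods --all-namespaces --field-selector status.phase=Pending
kubectl get events --field-selector reason=FailedScheduling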

Scheduler Extensibility

The scheduler can be extended with:

  • Scheduler plugins - Custom filtering and scoring logic
  • Scheduler profiles - Different scheduling strategies
  • Custom schedulers - Alternative schedulers for specific use cases
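
As a sketch, a scheduler profile passed to kube-scheduler via its --config flag can adjust the weight of a built-in score plugin (the weight here is illustrative):

apiVersion: kubescheduler.config.k8s.io/v1
kind: KubeSchedulerConfiguration
profiles:
- schedulerName: default-scheduler
  plugins:
    score:
      enabled:
      - name: NodeResourcesBalancedAllocation
        weight: 2

A pod can also opt into an alternative scheduler by setting spec.schedulerName (my-custom-scheduler is a placeholder); pods referencing a scheduler that isn’t running stay Pending:

spec:
  schedulerName: my-custom-scheduler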

Best Practices

Resource Management

  • Always specify resource requests
  • Set appropriate limits
  • Monitor resource usage
  • Right-size requests based on actual usage

Scheduling Constraints

  • Use node selectors for simple requirements
  • Use affinity for complex placement rules
  • Use taints/tolerations for dedicated nodes
  • Use topology spread for distribution

Performance

  • Avoid over-constraining pods (may not schedule)
  • Use preferred affinity when possible (softer constraint)
  • Monitor scheduling failures
  • Balance constraints with flexibility

Key Takeaways

  • Scheduling assigns pods to nodes based on resources, constraints, and policies
  • The scheduler filters nodes, then scores and selects the best one
  • Resource requests affect scheduling (limits don’t)
  • Node selectors, affinity, taints/tolerations control placement
  • Topology spread constraints ensure even distribution
  • Priority and preemption allow important pods to be scheduled first
  • Scheduling can fail if constraints can’t be met

See Also