Kubelet & CRI
Kubelet is the primary node agent that runs on each node, managing pod lifecycles and communicating with the container runtime. The Container Runtime Interface (CRI) provides the abstraction layer between kubelet and container runtimes. This guide covers troubleshooting kubelet and CRI issues.
Kubelet Overview
Kubelet is responsible for:
- Managing pod lifecycle
- Pulling container images
- Starting and stopping containers
- Reporting node and pod status
- Executing liveness and readiness probes
- Managing volumes
Common Kubelet Issues
Node Not Registered
Symptoms:
- Node doesn't appear in kubectl get nodes
- Node status unknown
- API server cannot communicate with kubelet
Diagnosis:
# Check if node is registered
kubectl get nodes
# Check kubelet status (on node)
systemctl status kubelet
# Check kubelet logs
journalctl -u kubelet -n 100
# Check kubelet configuration
cat /var/lib/kubelet/config.yaml
Common Causes:
- Kubelet not running
- Certificate/authentication issues
- API server connectivity problems
- Configuration errors
Solutions:
- Start kubelet: systemctl start kubelet
- Check certificates: Verify kubelet certificates
- Verify API server connectivity
- Check kubelet configuration
Image Pull Failures
Symptoms:
- Pods stuck in ImagePullBackOff
- Pod events show pull errors
- Containers fail to start
Diagnosis:
# Check pod events
kubectl describe pod <pod-name>
# Check kubelet logs for pull errors
journalctl -u kubelet | grep -i "pull\|image"
# Check image pull secrets
kubectl get pod <pod-name> -o jsonpath='{.spec.imagePullSecrets}'
# Test image pull manually
crictl pull <image-name>
Common Causes:
- Image doesn’t exist
- Wrong image name or tag
- Private registry without credentials
- Network connectivity issues
- Registry authentication failures
Solutions:
- Verify image name and tag
- Check image pull secrets
- Verify network connectivity to registry
- Test image pull manually
- Check registry authentication
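The causes listed above usually show up as recognizable fragments in the pull error message. The following is a minimal sketch of that mapping; the function name classify_pull_error and the matched substrings are assumptions based on commonly seen registry and runtime messages, not an exhaustive or authoritative list.

```shell
# Heuristic sketch: read an image pull error message on stdin and print a
# likely cause. The substrings below are assumptions; adjust them to the
# messages your registry and runtime actually emit.
classify_pull_error() {
  # Lowercase the message so matching is case-insensitive.
  msg=$(tr '[:upper:]' '[:lower:]')
  case "$msg" in
    *"manifest unknown"*|*"not found"*)
      echo "image-or-tag-missing" ;;
    *"unauthorized"*|*"authentication required"*|*"denied"*)
      echo "registry-auth" ;;
    *"no such host"*|*"i/o timeout"*|*"connection refused"*|*"tls handshake timeout"*)
      echo "network" ;;
    *)
      echo "unknown" ;;
  esac
}

# Possible usage on a live cluster (replace <pod-name>):
#   kubectl get events --field-selector involvedObject.name=<pod-name> \
#     -o jsonpath='{.items[*].message}' | classify_pull_error
```

The first matching pattern wins, so a message that mentions several problems is reported under the earliest category only.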
Pod Startup Failures
Symptoms:
- Pods stuck in Pending or ContainerCreating
- Containers fail to start
- Pod events show startup errors
Diagnosis:
# Check pod status
kubectl get pod <pod-name>
# Check pod events
kubectl describe pod <pod-name>
# Check kubelet logs
journalctl -u kubelet | grep <pod-name>
# Check container runtime logs
journalctl -u containerd | grep <pod-name>
Common Causes:
- Image pull failures
- Volume mount issues
- Resource constraints
- Configuration errors
- Container runtime issues
Solutions:
- Check image pull status
- Verify volume mounts
- Check resource limits
- Review pod configuration
- Check container runtime status
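A container's waiting reason often points to which of the solutions above to try first. The sketch below illustrates one possible mapping; the function name next_step_for_reason and the grouping of reasons are assumptions for illustration, not an official classification.

```shell
# Sketch: map a container "waiting" reason to a suggested first check.
# The grouping is an assumption; real failures can have more than one cause.
next_step_for_reason() {
  case "$1" in
    ErrImagePull|ImagePullBackOff|InvalidImageName)
      echo "check image name, tag, and pull secrets" ;;
    CreateContainerConfigError)
      echo "review pod configuration (referenced ConfigMaps and Secrets)" ;;
    CreateContainerError|RunContainerError)
      echo "check container runtime status and logs" ;;
    ContainerCreating|PodInitializing)
      echo "check volume mounts and pod events" ;;
    "")
      echo "no waiting reason; check scheduling and resource constraints" ;;
    *)
      echo "unrecognized reason: $1" ;;
  esac
}

# Possible usage on a live cluster (replace <pod-name>):
#   reason=$(kubectl get pod <pod-name> \
#     -o jsonpath='{.status.containerStatuses[0].state.waiting.reason}')
#   next_step_for_reason "$reason"
```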
CRI Issues
Symptoms:
- Containers not starting
- CRI errors in kubelet logs
- Container runtime unresponsive
Diagnosis:
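# Check container runtime status
systemctl status containerd
# or, for Docker Engine (needs the cri-dockerd adapter since Kubernetes 1.24)
systemctl status docker
# Check CRI socket
ls -la /var/run/containerd/containerd.sock
# or, for Docker Engine via cri-dockerd (docker.sock itself is not a CRI socket)
ls -la /var/run/cri-dockerd.sock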
# Test CRI connectivity
crictl version
# Check kubelet logs for CRI errors
journalctl -u kubelet | grep -i "cri\|runtime"
Common Causes:
- Container runtime not running
- CRI socket not accessible
- Version incompatibility
- Resource exhaustion
Solutions:
- Start container runtime
- Verify CRI socket permissions
- Check version compatibility
- Restart container runtime
Kubelet Logs
Accessing Kubelet Logs
On the Node
# View kubelet logs (systemd)
journalctl -u kubelet -n 100
# Follow kubelet logs
journalctl -u kubelet -f
# View logs since specific time
journalctl -u kubelet --since "1 hour ago"
# View logs without the pager (useful for piping or redirecting)
journalctl -u kubelet --since "1 hour ago" --no-pager
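When the journal is noisy, it can help to keep only error and fatal lines. The sketch below assumes the kubelet writes the default klog text format, where each message starts with a severity letter followed by the date (for example E0102); the function name kubelet_errors is hypothetical, and the filter will not match if the kubelet is configured for JSON logging.

```shell
# Sketch: keep only klog error (E) and fatal (F) lines from stdin.
# Assumes the default klog text header "Lmmdd hh:mm:ss.uuuuuu".
kubelet_errors() {
  grep -E '(^|[[:space:]])[EF][0-9]{4} [0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]+'
}

# Possible usage on a node:
#   journalctl -u kubelet --since "1 hour ago" --no-pager | kubelet_errors
```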
From Kubernetes API
If the kubelet exposes its /logs endpoint (depends on configuration), files under the node's /var/log directory can be read through the API server proxy:
# Get kubelet logs (if available)
kubectl get --raw /api/v1/nodes/<node-name>/proxy/logs/kubelet
Log File Location
# Kubelet log file (if configured)
tail -f /var/log/kubelet.log
# Systemd journal
journalctl -u kubelet
Kubelet Configuration
Configuration File
# Check kubelet configuration
cat /var/lib/kubelet/config.yaml
# Check kubelet command-line arguments
ps aux | grep kubelet
# Check kubelet kubeconfig (API server endpoint and credentials; kubeadm default path)
cat /etc/kubernetes/kubelet.conf
Important Configuration Options
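- --api-servers - API server endpoints (removed in current releases; the endpoint now comes from the file passed to --kubeconfig)
- --pod-manifest-path - Static pod manifest path (deprecated; staticPodPath in the config file)
- --config - Kubelet config file path
- --container-runtime-endpoint - CRI endpoint
- --image-pull-progress-deadline - Image pull timeout (dockershim only; removed in Kubernetes 1.24)
- --node-ip - Node IP address
- --pod-infra-container-image - Pod infrastructure (pause) image (deprecated; the sandbox image is configured in the container runtime)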
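To see which of these options a running kubelet was actually started with, the command line can be parsed directly. The following is a minimal sketch; kubelet_flag is a hypothetical helper, and it assumes flag values contain no spaces.

```shell
# Sketch: print the value of one flag from a command line read on stdin.
# Handles both "--flag=value" and "--flag value". Exits non-zero if absent.
kubelet_flag() {
  # One argument per line, then scan for the requested flag.
  tr -s ' ' '\n' | awk -v f="--$1" '
    index($0, f "=") == 1 { print substr($0, length(f) + 2); found = 1; exit }
    prev == f             { print; found = 1; exit }
    { prev = $0 }
    END { exit (found ? 0 : 1) }'
}

# Possible usage on a node:
#   ps -o args= -C kubelet | kubelet_flag container-runtime-endpoint
```

A flag that is missing from the command line may still be set in the file given by --config, so an empty result does not mean the option is unset.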
Container Runtime Interface (CRI)
CRI Overview
CRI is a gRPC API that provides the abstraction between kubelet and container runtimes. Kubelet talks to the runtime over a local Unix socket.
Supported Runtimes
- containerd - Default in many distributions
- CRI-O - Lightweight CRI implementation
- Docker Engine - Via the cri-dockerd adapter (dockershim was removed in Kubernetes 1.24)
CRI Socket Location
# containerd socket
/var/run/containerd/containerd.sock
# CRI-O socket
/var/run/crio/crio.sock
# Docker Engine via cri-dockerd (docker.sock itself is not a CRI socket)
/var/run/cri-dockerd.sock
Troubleshooting CRI Issues
Checking CRI Status
# Check container runtime status
systemctl status containerd
# or
systemctl status crio
# Test CRI connectivity
crictl version
# Check CRI socket
ls -la /var/run/containerd/containerd.sock
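Because the socket path differs between runtimes, a small helper can pick whichever candidate exists before calling crictl. This is a sketch under the assumption that the runtime uses one of the paths passed in; find_cri_socket is a hypothetical name.

```shell
# Sketch: print the first argument that is an existing Unix socket.
# Returns non-zero if none of the candidates is a socket.
find_cri_socket() {
  for sock in "$@"; do
    if [ -S "$sock" ]; then
      echo "$sock"
      return 0
    fi
  done
  return 1
}

# Possible usage on a node:
#   sock=$(find_cri_socket /var/run/containerd/containerd.sock /var/run/crio/crio.sock) \
#     && crictl --runtime-endpoint "unix://$sock" version
```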
Common CRI Problems
Container Runtime Not Running
# Check status
systemctl status containerd
# Start container runtime
systemctl start containerd
# Enable on boot
systemctl enable containerd
CRI Socket Not Accessible
# Check socket permissions
ls -la /var/run/containerd/containerd.sock
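# Check the socket exists and is reachable as root (the kubelet normally runs as root)
sudo test -S /var/run/containerd/containerd.sock && echo "socket present"
# If the socket is missing or broken, restart the runtime to recreate it
# (avoid chmod 666: a world-writable runtime socket gives any local user root-equivalent access)
systemctl restart containerd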
Version Incompatibility
# Check versions
crictl version
kubectl version
kubelet --version
# Check kubelet logs for version errors
journalctl -u kubelet | grep -i version
Debugging Kubelet Connectivity
API Server Connectivity
# Check API server connectivity from node
curl -k https://<api-server-ip>:6443/healthz
# Check kubelet can reach API server
journalctl -u kubelet | grep -i "connection\|unreachable"
# Verify kubelet certificate
openssl x509 -in /var/lib/kubelet/pki/kubelet-client-current.pem -text -noout
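Reading the full certificate text works, but an expiry check is easier to script. The sketch below relies on the -checkend option of openssl x509; the function name cert_status and the default window of seven days are assumptions chosen for illustration.

```shell
# Sketch: report whether a certificate expires within a time window.
# $1: certificate path, $2: window in seconds (default 604800 = 7 days).
cert_status() {
  cert=$1
  window=${2:-604800}
  if [ ! -r "$cert" ]; then
    echo "unreadable"
    return 2
  fi
  # -checkend exits 0 if the certificate is still valid after $window seconds.
  # A file that is not a valid certificate also lands in the else branch.
  if openssl x509 -in "$cert" -noout -checkend "$window" >/dev/null 2>&1; then
    echo "ok"
  else
    echo "expiring-or-invalid"
  fi
}

# Possible usage on a node:
#   cert_status /var/lib/kubelet/pki/kubelet-client-current.pem
```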
Node Registration
# Check node registration
kubectl get nodes
# Check node conditions
kubectl describe node <node-name>
# Check kubelet logs for registration
journalctl -u kubelet | grep -i "register\|certificate"
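On larger clusters it can be quicker to list only the nodes that are not ready. The following sketch filters the default table output of kubectl get nodes; not_ready_nodes is a hypothetical helper name, and it assumes the STATUS value is in the second column.

```shell
# Sketch: print names of nodes whose STATUS does not start with "Ready".
# Expects the default "kubectl get nodes" table (with header) on stdin.
not_ready_nodes() {
  awk 'NR > 1 && $2 !~ /^Ready/ { print $1 }'
}

# Possible usage:
#   kubectl get nodes | not_ready_nodes
```

A node that never registered does not appear in the table at all, so it will not be reported here; compare the output of kubectl get nodes against the expected node inventory for that case.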
Resource Management Problems
Resource Limits
Kubelet enforces resource limits:
# Check node capacity
kubectl describe node <node-name> | grep -A 5 "Capacity\|Allocatable"
# Check resource usage
kubectl top node <node-name>
# Check for resource pressure
kubectl describe node <node-name> | grep -i "pressure"
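Grepping the describe output shows every pressure condition, including the ones that are False. A stricter check is sketched below; active_pressure is a hypothetical helper that expects one type=status pair per line, as produced by the jsonpath expression in the usage comment.

```shell
# Sketch: print only the pressure conditions whose status is True.
# Expects lines such as "MemoryPressure=False" on stdin.
active_pressure() {
  awk -F= '$1 ~ /Pressure$/ && $2 == "True" { print $1 }'
}

# Possible usage (replace <node-name>):
#   kubectl get node <node-name> \
#     -o jsonpath='{range .status.conditions[*]}{.type}={.status}{"\n"}{end}' \
#     | active_pressure
```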
Eviction Policies
Kubelet evicts pods under resource pressure:
# Check eviction policy
grep -i eviction /var/lib/kubelet/config.yaml
# Check for evicted pods
kubectl get pods --all-namespaces | grep Evicted
# Check eviction events
kubectl get events --all-namespaces | grep Evicted
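To see where evictions are concentrated, the pod listing can be summarized per namespace. This is a minimal sketch; evicted_by_namespace is a hypothetical helper that assumes the default column layout of kubectl get pods --all-namespaces, with STATUS in the fourth column.

```shell
# Sketch: count Evicted pods per namespace.
# Expects the default "kubectl get pods --all-namespaces" table on stdin.
evicted_by_namespace() {
  awk 'NR > 1 && $4 == "Evicted" { count[$1]++ }
       END { for (ns in count) print ns, count[ns] }' | sort
}

# Possible usage:
#   kubectl get pods --all-namespaces | evicted_by_namespace
```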
Pod Startup Failures
Diagnosing Startup Issues
# Step 1: Check pod status
kubectl get pod <pod-name>
# Step 2: Check pod events
kubectl describe pod <pod-name>
# Step 3: Check kubelet logs
journalctl -u kubelet | grep <pod-name>
# Step 4: Check container runtime logs
journalctl -u containerd | grep <pod-name>
Common Startup Problems
Image Pull Issues
# Check image pull status
kubectl describe pod <pod-name> | grep -A 5 "Events"
# Check kubelet logs
journalctl -u kubelet | grep -i "pull\|image"
# Test image pull
crictl pull <image-name>
Volume Mount Issues
# Check volume mounts
kubectl describe pod <pod-name> | grep -A 5 "Volumes\|Mounts"
# Check volume events
kubectl get events --field-selector involvedObject.name=<pod-name> | grep -i volume
# Verify volume exists
kubectl get pv,pvc
Resource Constraints
# Check resource requests
kubectl describe pod <pod-name> | grep -A 5 "Requests\|Limits"
# Check node resources
kubectl describe node <node-name> | grep -A 5 "Allocatable"
# Check for resource pressure
kubectl describe node <node-name> | grep -i "pressure"
Best Practices
- Monitor kubelet health - Set up monitoring for kubelet metrics
- Regular log rotation - Configure log rotation for kubelet logs
- Resource limits - Set appropriate resource limits for kubelet
- Certificate management - Regularly rotate and verify certificates
- Version compatibility - Keep kubelet and container runtime versions compatible
- Configuration management - Use kubelet config files for better management
- Health checks - Implement kubelet health checks
Troubleshooting Checklist
- Check kubelet service status
- Review kubelet logs for errors
- Verify API server connectivity
- Check node registration
- Verify container runtime status
- Check CRI socket accessibility
- Verify image pull functionality
- Check resource constraints
- Review pod startup logs
- Verify certificates and authentication
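Several items on this checklist can be run as one pass on the node. The sketch below shows a small wrapper that prints PASS or FAIL per check; run_check is a hypothetical helper, and the example invocations assume containerd as the runtime and a node name equal to the hostname.

```shell
# Sketch: run a command quietly and report PASS or FAIL with a description.
# $1: description, remaining arguments: the command to run.
run_check() {
  desc=$1
  shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS  $desc"
  else
    echo "FAIL  $desc"
    return 1
  fi
}

# Possible usage on a node (service and node names are assumptions):
#   run_check "kubelet service active"    systemctl is-active --quiet kubelet
#   run_check "containerd service active" systemctl is-active --quiet containerd
#   run_check "CRI reachable"             crictl version
#   run_check "node registered"           kubectl get node "$(hostname)"
```

A FAIL line only says that the command exited non-zero; the matching section above describes how to dig into the cause.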
See Also
- Control Plane - Control plane troubleshooting
- Clusters & Nodes - General node troubleshooting
- Container & Pod logs - Viewing container logs