Networking
Network issues in Kubernetes can be complex due to the layered networking architecture. This guide covers common networking problems, diagnostic tools, and solutions for pod-to-pod communication, service connectivity, DNS, and network policies.
Network Troubleshooting Methodology
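Work through the network stack one layer at a time: confirm pod IPs and pod-to-pod connectivity first, then Service endpoints and selectors, then DNS, then NetworkPolicies, and finally Ingress.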
Common Network Issues
Pod-to-Pod Communication
Symptoms:
- Pods cannot ping each other
- Services cannot reach backend pods
- Network timeouts
Common Causes:
- CNI plugin issues
- NetworkPolicy blocking traffic
- IP address conflicts
- Routing problems
Service Connectivity
Symptoms:
- Cannot access services
- Service endpoints empty
- Service IP not responding
Common Causes:
- Service selector mismatch
- No pod endpoints
- NetworkPolicy blocking
- kube-proxy issues
DNS Failures
Symptoms:
- DNS resolution failures
- Services not resolvable
- nslookup failures
Common Causes:
- CoreDNS not running
- DNS configuration issues
- NetworkPolicy blocking DNS
- DNS service problems
Ingress Issues
Symptoms:
- Ingress not routing traffic
- 502/503 errors
- Certificates not working
Common Causes:
- Ingress controller not running
- Backend service issues
- Certificate problems
- Configuration errors
Diagnostic Tools
Basic Connectivity Tests
# Test pod-to-pod connectivity
kubectl run test-pod --image=busybox --rm -it -- ping <target-pod-ip>
# Test service connectivity
kubectl run test-pod --image=busybox --rm -it -- wget -O- <service-name>
# Test DNS resolution
kubectl run test-pod --image=busybox --rm -it -- nslookup <service-name>
Network Debug Pod
Create a persistent network debugging pod:
apiVersion: v1
kind: Pod
metadata:
  name: netshoot
  namespace: default
spec:
  containers:
  - name: netshoot
    image: nicolaka/netshoot
    command: ["sleep", "3600"]
Use for debugging:
# Execute into debug pod
kubectl exec -it netshoot -- bash
# Now you have network tools:
# - ping, traceroute
# - curl, wget
# - dig, nslookup
# - tcpdump
# - netstat, ss
kubectl exec Tools
# Execute into pod
kubectl exec -it <pod-name> -- /bin/sh
# Test connectivity
kubectl exec <pod-name> -- ping <target-ip>
kubectl exec <pod-name> -- curl http://<service-name>
# Check DNS
kubectl exec <pod-name> -- nslookup <service-name>
kubectl exec <pod-name> -- dig <service-name>
# Check network interfaces
kubectl exec <pod-name> -- ip addr
kubectl exec <pod-name> -- netstat -an
Pod-to-Pod Communication
Checking Pod IPs
# Get pod IP addresses
kubectl get pods -o wide
# Get specific pod IP
kubectl get pod <pod-name> -o jsonpath='{.status.podIP}'
# List all pod IPs
kubectl get pods -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.podIP}{"\n"}{end}'
Testing Connectivity
# Create test pod
kubectl run test-pod --image=busybox --rm -it -- sh
# Inside pod, test connectivity:
ping <target-pod-ip>
wget -O- http://<target-pod-ip>:8080
telnet <target-pod-ip> 8080
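When several pods are involved, the manual test above can be looped over every pod IP in a namespace. The following is a minimal sketch, not a definitive tool: `probe_pods` is a hypothetical helper name, and it assumes the source pod's image ships an `nc` that supports `-z` and `-w`.

```shell
# Probe one TCP port on every pod IP in a namespace, from inside a source pod.
# Assumption: the source pod's image provides `nc` with -z and -w support.
probe_pods() {
  probe_src="$1"; probe_ns="$2"; probe_port="$3"
  tab=$(printf '\t')
  kubectl get pods -n "$probe_ns" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.podIP}{"\n"}{end}' |
  while IFS="$tab" read -r pod_name pod_ip; do
    if [ -z "$pod_ip" ]; then
      echo "$pod_name: no IP assigned"
      continue
    fi
    # </dev/null keeps kubectl exec from swallowing the rest of the pod list.
    if kubectl exec "$probe_src" -n "$probe_ns" -- \
         nc -z -w 2 "$pod_ip" "$probe_port" </dev/null >/dev/null 2>&1; then
      echo "$pod_name ($pod_ip): reachable"
    else
      echo "$pod_name ($pod_ip): unreachable"
    fi
  done
}
```

For example, `probe_pods netshoot default 8080` would report each pod in `default` as reachable or unreachable on port 8080 from the `netshoot` pod.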
CNI Plugin Issues
# Check CNI plugin pods
kubectl get pods -n kube-system | grep -i cni
# Check CNI plugin logs
kubectl logs -n kube-system -l app=<cni-plugin>
# Check CNI configuration (run on the node itself, not through kubectl)
cat /etc/cni/net.d/*.conf*
# Check CNI binaries (run on the node itself)
ls -la /opt/cni/bin/
Service Troubleshooting
Service Status
# Check service
kubectl get service <service-name>
# Describe service
kubectl describe service <service-name>
# Get service details
kubectl get service <service-name> -o yaml
Service Endpoints
# Check service endpoints
kubectl get endpoints <service-name>
# Describe endpoints
kubectl describe endpoints <service-name>
# Check if endpoints are empty
kubectl get endpoints <service-name> -o jsonpath='{.subsets}'
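Empty endpoints usually indicate one of:
- No pods match the service selector
- Matching pods are not ready (readiness probe failing)
- The service targetPort names a port that no matching pod defines (a numeric port mismatch still lists endpoints, but connections fail)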
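The causes above can be partly told apart by looking at ready and not-ready addresses separately. This is a rough sketch under stated assumptions: `classify_endpoints` and `check_endpoints` are hypothetical helper names, and the messages are hints rather than a definitive diagnosis.

```shell
# Turn the ready / not-ready address lists into a likely cause.
classify_endpoints() {
  ready_ips="$1"; not_ready_ips="$2"
  if [ -n "$ready_ips" ]; then
    echo "ready endpoints: $ready_ips"
  elif [ -n "$not_ready_ips" ]; then
    echo "pods match the selector but are not ready: $not_ready_ips"
  else
    echo "no endpoints: check the selector, pod labels, and named ports"
  fi
}

# Fetch both address lists from a Service's Endpoints object and classify them.
check_endpoints() {
  ep_name="$1"; ep_ns="${2:-default}"
  ep_ready=$(kubectl get endpoints "$ep_name" -n "$ep_ns" \
    -o jsonpath='{.subsets[*].addresses[*].ip}') || return 1
  ep_not_ready=$(kubectl get endpoints "$ep_name" -n "$ep_ns" \
    -o jsonpath='{.subsets[*].notReadyAddresses[*].ip}') || return 1
  classify_endpoints "$ep_ready" "$ep_not_ready"
}
```

Calling `check_endpoints <service-name> <namespace>` prints one line; a "not ready" result points at readiness probes, while "no endpoints" points at the selector or port definition.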
Service Selector
# Check service selector
kubectl get service <service-name> -o jsonpath='{.spec.selector}'
# Check pods matching selector
kubectl get pods -l <selector>
# Verify labels match
kubectl get pods --show-labels
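The selector check can be chained so that the Service's own selector is fed straight into the pod query. A minimal sketch follows; the helper names are hypothetical, and it assumes a kubectl version that prints the selector as JSON (for example `{"app":"web"}`), which older releases may not do.

```shell
# Convert the JSON selector printed by kubectl, e.g. {"app":"web","tier":"fe"},
# into the comma-separated form accepted by `kubectl get pods -l`.
selector_to_label_query() {
  printf '%s' "$1" | tr -d '{}" ' | tr ':' '='
}

# List the pods that a Service actually selects.
pods_for_service() {
  svc_name="$1"; svc_ns="${2:-default}"
  selector_json=$(kubectl get service "$svc_name" -n "$svc_ns" \
    -o jsonpath='{.spec.selector}') || return 1
  if [ -z "$selector_json" ]; then
    echo "service $svc_name has no selector" >&2
    return 1
  fi
  kubectl get pods -n "$svc_ns" \
    -l "$(selector_to_label_query "$selector_json")" -o wide
}
```

If `pods_for_service <service-name> <namespace>` returns no pods, the selector and the pod labels do not match.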
Testing Service Connectivity
# Test service from pod
kubectl run test-pod --image=busybox --rm -it -- \
wget -O- http://<service-name>.<namespace>.svc.cluster.local
# Test service IP directly
kubectl get service <service-name> -o jsonpath='{.spec.clusterIP}'
kubectl run test-pod --image=busybox --rm -it -- \
wget -O- http://<cluster-ip>
# Test with port
kubectl run test-pod --image=busybox --rm -it -- \
wget -O- http://<service-name>:<port>
DNS Troubleshooting
CoreDNS Status
# Check CoreDNS pods
kubectl get pods -n kube-system | grep coredns
# Check CoreDNS logs
kubectl logs -n kube-system -l k8s-app=kube-dns
# Check CoreDNS configuration
kubectl get configmap coredns -n kube-system -o yaml
DNS Configuration
# Check pod DNS configuration
kubectl run test-pod --image=busybox --rm -it -- cat /etc/resolv.conf
# Expected output:
# nameserver <coredns-service-ip>
# search <namespace>.svc.cluster.local svc.cluster.local cluster.local
# options ndots:5
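To compare a pod's resolver settings against the expected output, the file can be reduced to the two values that matter most. This is a small sketch; `summarize_resolv_conf` is a hypothetical helper name, and it reports `ndots=1` (the resolver default) when no `ndots` option is present.

```shell
# Read resolv.conf content on stdin; print "nameserver=<ip> ndots=<n>".
# Only the first nameserver is reported.
summarize_resolv_conf() {
  awk '
    $1 == "nameserver" && ns == "" { ns = $2 }
    $1 == "options" {
      for (i = 2; i <= NF; i++)
        if ($i ~ /^ndots:/) { split($i, kv, ":"); ndots = kv[2] }
    }
    END { printf "nameserver=%s ndots=%s\n", ns, (ndots == "" ? "1" : ndots) }
  '
}
```

Typical use is `kubectl exec <pod-name> -- cat /etc/resolv.conf | summarize_resolv_conf`. With the default DNS policy the nameserver would normally equal the ClusterIP shown by `kubectl get service kube-dns -n kube-system`, though setups such as a node-local DNS cache differ.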
DNS Resolution Tests
# Test DNS resolution
kubectl run test-pod --image=busybox --rm -it -- nslookup <service-name>
# Test with dig (busybox does not ship dig, so use an image that does)
kubectl run test-pod --image=nicolaka/netshoot --rm -it -- \
dig <service-name>.<namespace>.svc.cluster.local
# Test FQDN
kubectl run test-pod --image=busybox --rm -it -- \
nslookup <service-name>.<namespace>.svc.cluster.local
Common DNS Issues
DNS Not Resolving
# Check CoreDNS is running
kubectl get pods -n kube-system -l k8s-app=kube-dns
# Check DNS service
kubectl get service kube-dns -n kube-system
# Test CoreDNS directly
kubectl run test-pod --image=busybox --rm -it -- \
nslookup kubernetes.default.svc.cluster.local <coredns-service-ip>
Slow DNS Resolution
# Check CoreDNS performance
kubectl logs -n kube-system -l k8s-app=kube-dns | grep -i "slow\|timeout"
# Check DNS configuration
kubectl get configmap coredns -n kube-system -o yaml
# Consider optimizing DNS
# - Reduce ndots
# - Add DNS caching
# - Scale CoreDNS
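The effect of `ndots` is easier to see by listing the names a resolver tries in order. The sketch below models glibc-style search-list behaviour only (other resolvers, such as musl, may differ), and `dns_candidates` is a hypothetical helper name.

```shell
# Print the lookup order a glibc-style resolver would use.
# usage: dns_candidates <name> <ndots> <search-domain>...
dns_candidates() {
  name="$1"; ndots="$2"; shift 2
  case "$name" in
    *.) echo "$name"; return 0 ;;   # trailing dot: already absolute, no search
  esac
  dots=$(( $(printf '%s' "$name" | tr -cd '.' | wc -c) ))
  # Enough dots: the name is tried as-is before the search list.
  if [ "$dots" -ge "$ndots" ]; then echo "$name."; fi
  for domain in "$@"; do echo "$name.$domain."; done
  # Too few dots: the name is tried as-is only after the search list fails.
  if [ "$dots" -lt "$ndots" ]; then echo "$name."; fi
  return 0
}
```

For instance, `dns_candidates api.example.com 5 default.svc.cluster.local svc.cluster.local cluster.local` lists three cluster-suffixed names before `api.example.com.`, which is why external names are slow with `ndots:5` and why a trailing dot or a lower `ndots` helps.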
NetworkPolicy Debugging
Checking Network Policies
# List network policies
kubectl get networkpolicies -A
# Get network policy details
kubectl get networkpolicy <policy-name> -o yaml
# Describe network policy
kubectl describe networkpolicy <policy-name>
Testing Network Policies
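# Test from allowed pod
# (NetworkPolicy rules match TCP/UDP/SCTP ports and ICMP handling varies by CNI,
# so also test the real application port; assumes wget exists in the pod image)
kubectl exec -it <allowed-pod> -- ping <target-pod-ip>
kubectl exec -it <allowed-pod> -- wget -O- -T 5 http://<target-pod-ip>:<port>
# Test from blocked pod
kubectl exec -it <blocked-pod> -- ping <target-pod-ip>
kubectl exec -it <blocked-pod> -- wget -O- -T 5 http://<target-pod-ip>:<port>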
# Check NetworkPolicy logs (if CNI supports it)
kubectl logs -n kube-system -l app=<cni-plugin> | grep -i networkpolicy
Common NetworkPolicy Issues
- Too restrictive - Blocking legitimate traffic
- Missing rules - Not blocking intended traffic
- Label mismatches - Selectors not matching pods
- Namespace issues - Policies not applying to correct namespace
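A frequent case of an overly restrictive policy is egress rules that also cut off DNS. One possible remedy is sketched below as a function that prints a manifest; the policy name `allow-dns-egress` and the helper name are hypothetical, and the sketch assumes cluster DNS runs in `kube-system` and that namespaces carry the automatic `kubernetes.io/metadata.name` label.

```shell
# Print a NetworkPolicy that lets every pod in a namespace reach cluster DNS.
# Assumptions: DNS pods live in kube-system; the policy name is illustrative.
dns_egress_policy() {
  policy_ns="${1:-default}"
  cat <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
  namespace: ${policy_ns}
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
EOF
}
```

The output can be reviewed first and then applied with `dns_egress_policy <namespace> | kubectl apply -f -`.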
Ingress Troubleshooting
Ingress Status
# Check ingress
kubectl get ingress <ingress-name>
# Describe ingress
kubectl describe ingress <ingress-name>
# Get ingress details
kubectl get ingress <ingress-name> -o yaml
Ingress Controller
# Check ingress controller pods
kubectl get pods -n <ingress-namespace> | grep ingress
# Check ingress controller logs
kubectl logs -n <ingress-namespace> -l app=<ingress-controller>
# Check ingress controller service
kubectl get service -n <ingress-namespace> | grep ingress
Testing Ingress
# Test ingress from outside
curl -H "Host: <hostname>" http://<ingress-ip>
# Test ingress from pod (wget takes --header; -H is not a header flag)
kubectl run test-pod --image=busybox --rm -it -- \
wget -O- --header "Host: <hostname>" http://<ingress-ip>
# Check ingress backend
kubectl describe ingress <ingress-name> | grep -A 5 "Backends\|Rules"
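The HTTP status returned through the ingress often hints at which layer is failing. The mapping below is a rough sketch, not controller documentation: exact behaviour varies between ingress controllers, a 404 can also come from the application itself, and both helper names are hypothetical.

```shell
# Map an HTTP status code seen through the ingress to a likely cause.
explain_ingress_status() {
  case "$1" in
    000) echo "no HTTP response: check the ingress IP, port, and controller Service" ;;
    2??|3??) echo "routed: a rule matched and the backend answered" ;;
    404) echo "not found: check the host and path rules in the Ingress spec" ;;
    502) echo "bad gateway: the backend pod refused or broke the connection" ;;
    503) echo "service unavailable: the backend Service may have no ready endpoints" ;;
    *) echo "unclassified status: check the ingress controller logs" ;;
  esac
}

# Request the ingress with an explicit Host header and explain the result.
probe_ingress() {
  ingress_host="$1"; ingress_ip="$2"
  # curl prints 000 and exits non-zero when it gets no response at all.
  ingress_code=$(curl -s -o /dev/null -m 5 -w '%{http_code}' \
    -H "Host: $ingress_host" "http://$ingress_ip/") || true
  echo "$ingress_code $(explain_ingress_status "$ingress_code")"
}
```

Running `probe_ingress <hostname> <ingress-ip>` prints the status code followed by the hint.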
kube-proxy Troubleshooting
kube-proxy Status
# Check kube-proxy pods
kubectl get pods -n kube-system | grep kube-proxy
# Check kube-proxy logs
kubectl logs -n kube-system -l k8s-app=kube-proxy
# Check kube-proxy mode
kubectl get configmap kube-proxy -n kube-system -o yaml | grep mode
Common kube-proxy Issues
- iptables rules - Check iptables rules for services
- IPVS mode - Verify IPVS configuration
- Connection tracking - Check conntrack table
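Which of these checks applies depends on the proxy mode. The sketch below reads the kube-proxy ConfigMap YAML and suggests a command to run on the node; the helper names are hypothetical, and it assumes an empty `mode` means the Linux default of iptables.

```shell
# Read the kube-proxy ConfigMap YAML on stdin and print the proxy mode.
# Assumption: an empty or missing mode means the default (iptables on Linux).
kube_proxy_mode() {
  mode=$(sed -n 's/^[[:space:]]*mode:[[:space:]]*"\{0,1\}\([a-z]*\)"\{0,1\}[[:space:]]*$/\1/p' | head -n 1)
  echo "${mode:-iptables}"
}

# Suggest the node-level command that shows Service rules for a given mode.
kube_proxy_node_check() {
  case "$1" in
    ipvs) echo "ipvsadm -Ln" ;;
    *) echo "iptables-save | grep KUBE-SVC" ;;
  esac
}
```

For example, `kubectl get configmap kube-proxy -n kube-system -o yaml | kube_proxy_mode` prints the mode, and the suggested command is then run on a node. In either mode, `conntrack -L` on the node lists the connection tracking table.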
Network Debugging Examples
Complete Connectivity Test
# Step 1: Start an interactive debug pod (run the remaining steps inside its shell)
kubectl run netshoot --image=nicolaka/netshoot --rm -it -- bash
# Step 2: Test DNS
nslookup kubernetes.default.svc.cluster.local
dig <service-name>.<namespace>.svc.cluster.local
# Step 3: Test service connectivity
curl http://<service-name>.<namespace>.svc.cluster.local
wget -O- http://<service-name>:<port>
# Step 4: Test pod-to-pod
ping <target-pod-ip>
telnet <target-pod-ip> <port>
# Step 5: Capture traffic
tcpdump -i any -n
Service Discovery Test
# Test service discovery
kubectl run test-pod --image=busybox --rm -it -- sh
# Inside pod:
# Short name (same namespace)
wget -O- http://<service-name>
# Namespace-qualified name (different namespace)
wget -O- http://<service-name>.<namespace>
# FQDN
wget -O- http://<service-name>.<namespace>.svc.cluster.local
Best Practices
- Use network debug pods - Keep debugging tools ready
- Test incrementally - Start from basic connectivity, then add layers
- Check logs - Review CNI, kube-proxy, and CoreDNS logs
- Verify configurations - Check service selectors, NetworkPolicies, DNS config
- Document network topology - Understand your CNI and network setup
- Monitor network metrics - Track network performance and errors
- Test regularly - Proactively test network connectivity
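Incremental testing can be scripted so that checks run from the lowest layer upward and stop at the first failure. This is a minimal sketch: `run_layered_checks` and the check function names in the usage note are hypothetical and would be written for the cluster at hand.

```shell
# Run the named check functions in order; stop at the first failure so the
# lowest broken layer is the one reported.
run_layered_checks() {
  for check in "$@"; do
    if "$check"; then
      echo "PASS $check"
    else
      echo "FAIL $check"
      return 1
    fi
  done
}
```

A caller would define one shell function per layer, for example `check_pod_ip`, `check_dns`, and `check_service`, each returning zero on success, and then run `run_layered_checks check_pod_ip check_dns check_service` on a schedule or before a release.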
Troubleshooting Checklist
- Verify pod IPs are assigned
- Test pod-to-pod connectivity
- Check service endpoints
- Verify service selector matches pods
- Test DNS resolution
- Check CoreDNS is running
- Review NetworkPolicies
- Verify CNI plugin is working
- Check kube-proxy status
- Test Ingress configuration
See Also
- Services - Service configuration
- Network Policies - NetworkPolicy configuration
- DNS & CoreDNS - DNS setup
- Troubleshooting - General troubleshooting guide