KubePodNotReady

Description

This alert fires when a Kubernetes Pod is running but has remained not Ready for longer than the duration configured in the alert rule.
A Pod marked as NotReady cannot receive traffic via Services, which may lead to partial or complete service disruption depending on replica count and workload type.

Possible Causes:

Failing readiness probes
Containers still starting or initializing
Application startup errors
Crashing containers (CrashLoopBackOff)
Insufficient CPU or memory resources
Volume mount failures
Network or CNI issues
Node pressure (MemoryPressure, DiskPressure)
Misconfigured health checks

Severity estimation

Severity ranges from Low to Critical, depending on impact.

Low if other replicas are Ready and traffic is unaffected
Medium if reduced capacity or increased latency occurs
High if few replicas exist or multiple pods are NotReady
Critical if all pods of a service are NotReady

Severity increases with:

Number of affected pods
Duration of the NotReady state
Criticality of the workload
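
The severity levels above can be sketched as a small helper. This is only an illustration: the function name `estimate_severity` and both thresholds (what counts as "few replicas", and the NotReady fraction treated as reduced capacity) are assumptions and should be tuned per workload.

```python
def estimate_severity(ready: int, total: int) -> str:
    """Map Ready/total replica counts to a rough severity level."""
    if total <= 0 or not 0 <= ready <= total:
        raise ValueError("expected 0 <= ready <= total and total > 0")

    not_ready = total - ready
    if not_ready == 0:
        return "none"        # nothing is NotReady
    if ready == 0:
        return "critical"    # all pods of the service are NotReady
    # Assumption: "few replicas" means 2 or fewer in total.
    if not_ready >= 2 or total <= 2:
        return "high"
    # Assumption: losing 25% or more of capacity is noticeable.
    if not_ready / total >= 0.25:
        return "medium"
    return "low"


if __name__ == "__main__":
    for ready, total in [(9, 10), (3, 4), (1, 2), (0, 3)]:
        print(ready, total, estimate_severity(ready, total))
```

Duration and workload criticality are not modelled here; they would typically raise the result by one level.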

Troubleshooting steps

Check pod readiness status
- Command / Action:
  - Inspect pod readiness conditions
  - kubectl get pod <pod-name> -n <namespace>
- Expected result:
  - Pod shows READY 1/1 (or expected container count)
- Additional info:
  - READY 0/1 indicates readiness probe or startup issues
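
When many pods need checking, the same information can be read from `kubectl get pod <pod-name> -n <namespace> -o json`. The sketch below is a hypothetical helper (`readiness_summary` is not part of any tool) that approximates the READY column and pulls out the pod's `Ready` condition; the sample pod is made up for illustration.

```python
def readiness_summary(pod: dict) -> dict:
    """Summarise readiness from a pod object parsed from `-o json`."""
    status = pod.get("status", {})
    total = len(pod.get("spec", {}).get("containers", []))
    ready = sum(1 for s in status.get("containerStatuses", []) if s.get("ready"))
    # The pod-level "Ready" condition explains why the pod is NotReady.
    cond = next(
        (c for c in status.get("conditions", []) if c.get("type") == "Ready"),
        {},
    )
    return {
        "ready": f"{ready}/{total}",
        "status": cond.get("status", "Unknown"),
        "reason": cond.get("reason", ""),
        "message": cond.get("message", ""),
    }


# Illustrative sample, not real cluster output.
sample_pod = {
    "spec": {"containers": [{"name": "app"}]},
    "status": {
        "containerStatuses": [{"name": "app", "ready": False}],
        "conditions": [
            {
                "type": "Ready",
                "status": "False",
                "reason": "ContainersNotReady",
                "message": "containers with unready status: [app]",
            }
        ],
    },
}

if __name__ == "__main__":
    print(readiness_summary(sample_pod))
```

In practice the dictionary would come from `json.loads` on the command output.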

Describe the pod
- Command / Action:
  - Review events, conditions, and probe results
  - kubectl describe pod <pod-name> -n <namespace>
- Expected result:
  - Events show successful container startup and readiness
- Additional info:
  - Look for probe failures, mount errors, or scheduling issues
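
The events shown by `kubectl describe` can also be fetched as JSON, for example with `kubectl get events -n <namespace> --field-selector involvedObject.name=<pod-name> -o json`. As a minimal sketch, assuming that output has been parsed into a dictionary, the hypothetical helper below keeps only `Warning` events, which is where probe failures, mount errors, and scheduling issues usually appear. The sample data is illustrative.

```python
def warning_events(event_list: dict) -> list:
    """Return (reason, message) pairs for Warning events only."""
    found = []
    for event in event_list.get("items", []):
        if event.get("type") == "Warning":
            found.append((event.get("reason", ""), event.get("message", "")))
    return found


# Illustrative sample, not real cluster output.
sample_events = {
    "items": [
        {"type": "Normal", "reason": "Pulled", "message": "Image pulled"},
        {
            "type": "Warning",
            "reason": "Unhealthy",
            "message": "Readiness probe failed: HTTP probe failed with statuscode: 503",
        },
    ]
}

if __name__ == "__main__":
    for reason, message in warning_events(sample_events):
        print(reason, "-", message)
```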

Check readiness probe configuration
- Command / Action:
  - Inspect readiness probe definition
  - kubectl get pod <pod-name> -n <namespace> -o yaml
- Expected result:
  - Probe matches actual application health endpoint
- Additional info:
  - Overly strict probes can keep pods NotReady
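
To judge whether a probe is too strict, it helps to turn its settings into rough timings. The sketch below fills in the Kubernetes defaults for unset fields (`initialDelaySeconds` 0, `periodSeconds` 10, `timeoutSeconds` 1, `successThreshold` 1, `failureThreshold` 3) and estimates how soon a container can become Ready and how quickly consecutive failures mark it NotReady. The helper name and the returned keys are assumptions, and the figures are approximations that ignore kubelet jitter.

```python
PROBE_DEFAULTS = {
    "initialDelaySeconds": 0,
    "periodSeconds": 10,
    "timeoutSeconds": 1,
    "successThreshold": 1,
    "failureThreshold": 3,
}


def probe_timing(probe: dict) -> dict:
    """Estimate readiness timings from a readinessProbe definition."""
    # Keep only timing fields; handler fields such as httpGet are ignored.
    p = dict(PROBE_DEFAULTS)
    p.update({k: v for k, v in probe.items() if k in PROBE_DEFAULTS})
    return {
        # Earliest moment the container can be reported Ready.
        "earliest_ready_s": p["initialDelaySeconds"]
        + (p["successThreshold"] - 1) * p["periodSeconds"],
        # Approximate time for consecutive failures to mark it NotReady.
        "unready_after_s": p["failureThreshold"] * p["periodSeconds"],
        # Each probe attempt must answer within this many seconds.
        "timeout_s": p["timeoutSeconds"],
    }


if __name__ == "__main__":
    probe = {"httpGet": {"path": "/healthz", "port": 8080}, "initialDelaySeconds": 5}
    print(probe_timing(probe))
```

A `timeout_s` of 1 against an endpoint that regularly takes longer to answer is a common reason for a healthy application to stay NotReady.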

Inspect container logs
- Command / Action:
  - Review application logs
  - kubectl logs <pod-name> -n <namespace>
- Expected result:
  - Application starts successfully
- Additional info:
  - Use --previous if the container has restarted
  - Use -c <container-name> to select a container in a multi-container pod
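
For pods with several containers, the log commands can be derived from the pod status. The sketch below is a hypothetical helper that builds one `kubectl logs` argument list per container and adds a `--previous` variant whenever `restartCount` is above zero. It only builds the commands and does not run them; the sample pod is illustrative.

```python
def logs_commands(pod: dict) -> list:
    """Build kubectl logs argument lists for each container in a pod."""
    meta = pod["metadata"]
    commands = []
    for status in pod.get("status", {}).get("containerStatuses", []):
        base = [
            "kubectl", "logs", meta["name"],
            "-n", meta["namespace"],
            "-c", status["name"],
        ]
        commands.append(base)
        # A restarted container also has logs from its previous instance.
        if status.get("restartCount", 0) > 0:
            commands.append(base + ["--previous"])
    return commands


# Illustrative sample, not real cluster output.
sample_pod = {
    "metadata": {"name": "web-0", "namespace": "shop"},
    "status": {
        "containerStatuses": [
            {"name": "app", "restartCount": 2},
            {"name": "proxy", "restartCount": 0},
        ]
    },
}

if __name__ == "__main__":
    for cmd in logs_commands(sample_pod):
        print(" ".join(cmd))
```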

Check init containers
- Command / Action:
  - Verify init containers completed successfully
  - kubectl get pod <pod-name> -n <namespace> -o jsonpath='{.status.initContainerStatuses}'
- Expected result:
  - All init containers show terminated with exit code 0
- Additional info:
  - Stuck init containers block readiness
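
The raw `initContainerStatuses` output is dense, so a small filter can make the check explicit. This sketch assumes the pod was fetched with `-o json` and parsed into a dictionary; `incomplete_init_containers` is a hypothetical name. It lists every init container that has not terminated with exit code 0. It does not account for restartable (sidecar-style) init containers, which are expected to keep running.

```python
def incomplete_init_containers(pod: dict) -> list:
    """Return names of init containers that have not exited with code 0."""
    pending = []
    for status in pod.get("status", {}).get("initContainerStatuses", []):
        terminated = status.get("state", {}).get("terminated")
        # Still waiting or running, or finished with a non-zero exit code.
        if terminated is None or terminated.get("exitCode") != 0:
            pending.append(status["name"])
    return pending


# Illustrative sample, not real cluster output.
sample_pod = {
    "status": {
        "initContainerStatuses": [
            {"name": "migrate", "state": {"terminated": {"exitCode": 0}}},
            {"name": "wait-for-db", "state": {"running": {}}},
            {"name": "fetch-config", "state": {"terminated": {"exitCode": 1}}},
        ]
    }
}

if __name__ == "__main__":
    print(incomplete_init_containers(sample_pod))
```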

Check node health
- Command / Action:
  - Verify node status and pressure conditions
  - kubectl get node <node-name>
  - kubectl describe node <node-name>
- Expected result:
  - Node is Ready with no pressure conditions
- Additional info:
  - Node issues can delay readiness
  - Find the node name with kubectl get pod <pod-name> -n <namespace> -o wide
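
Node conditions follow a simple convention: `Ready` should be `True`, while pressure conditions such as `MemoryPressure`, `DiskPressure`, and `PIDPressure` should be `False`. As a minimal sketch, assuming the node was fetched with `kubectl get node <node-name> -o json`, the hypothetical helper below returns every condition that deviates from that. The sample node is illustrative.

```python
def node_problems(node: dict) -> list:
    """Return node conditions that indicate an unhealthy node."""
    problems = []
    for cond in node.get("status", {}).get("conditions", []):
        ctype, cstatus = cond.get("type"), cond.get("status")
        if ctype == "Ready":
            # Ready is the only condition that should be "True".
            if cstatus != "True":
                problems.append(f"{ctype}={cstatus}")
        elif cstatus != "False":
            # Any other condition reporting True or Unknown is a problem.
            problems.append(f"{ctype}={cstatus}")
    return problems


# Illustrative sample, not real cluster output.
sample_node = {
    "status": {
        "conditions": [
            {"type": "MemoryPressure", "status": "True"},
            {"type": "DiskPressure", "status": "False"},
            {"type": "PIDPressure", "status": "False"},
            {"type": "Ready", "status": "True"},
        ]
    }
}

if __name__ == "__main__":
    print(node_problems(sample_node))
```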

Verify resource requests and limits
- Command / Action:
  - Inspect pod resource configuration
  - kubectl describe pod <pod-name> -n <namespace>
- Expected result:
  - Resources are sufficient for application startup
- Additional info:
  - CPU throttling or OOM kills can prevent readiness
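
OOM kills are recorded in the container status as a terminated state with reason `OOMKilled`, in either the current `state` or `lastState`. The sketch below, with the hypothetical name `oom_killed_containers`, scans a pod parsed from `-o json` for that reason. CPU throttling is not visible in the pod object and has to be checked through metrics instead.

```python
def oom_killed_containers(pod: dict) -> list:
    """Return names of containers whose current or last state is OOMKilled."""
    names = []
    for status in pod.get("status", {}).get("containerStatuses", []):
        for key in ("state", "lastState"):
            terminated = status.get(key, {}).get("terminated") or {}
            if terminated.get("reason") == "OOMKilled":
                names.append(status["name"])
                break  # report each container once
    return names


# Illustrative sample, not real cluster output.
sample_pod = {
    "status": {
        "containerStatuses": [
            {
                "name": "app",
                "state": {"running": {}},
                "lastState": {"terminated": {"reason": "OOMKilled", "exitCode": 137}},
            },
            {"name": "proxy", "state": {"running": {}}, "lastState": {}},
        ]
    }
}

if __name__ == "__main__":
    print(oom_killed_containers(sample_pod))
```

A container listed here usually needs a higher memory limit or a fix for the memory growth, not just a restart.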

Restart pod if appropriate
- Command / Action:
  - Restart pod after fixing root cause
  - kubectl delete pod <pod-name> -n <namespace>
- Expected result:
  - New pod becomes Ready
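- Additional info:
  - Avoid restarts without understanding the cause
  - A deleted pod is recreated only if a controller such as a Deployment, StatefulSet, or DaemonSet manages it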

Additional resources

Kubernetes Pod lifecycle and troubleshooting
Kubernetes probes documentation
Related alert: KubePodCrashLooping
Related alert: KubeDeploymentReplicasMismatch