KubeStatefulSetReplicaMismatch
Description
This alert fires when a Kubernetes StatefulSet does not have the expected number of replicas running and Ready.
It indicates that the actual number of Ready pods differs from the desired replica count, potentially impacting stateful workloads such as databases, queues, or clustered applications that rely on stable identities and storage.
Possible Causes:
- Pods failing to start or repeatedly crashing (CrashLoopBackOff)
- Pending pods due to insufficient CPU, memory, or storage
- PersistentVolumeClaim (PVC) provisioning or binding failures
- Volume mount or permission issues
- Node failures or nodes in NotReady state
- Strict pod management policy blocking pod creation
- Failing readiness or liveness probes
- PodDisruptionBudget constraints
- Manual pod deletion without replacement
Severity estimation
Usually Medium to High, but it can range from Low to Critical depending on the workload and replica count:
- Low if a non-critical replica is missing and redundancy remains
- Medium if reduced capacity or quorum risk exists
- High if critical replicas are missing (e.g. primary database pod)
- Critical if quorum is lost or all replicas are unavailable
Severity increases with:
- Number of missing replicas
- Role of the missing pod(s) in the application
- Duration of the mismatch
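As a rough illustration of how replica counts map onto the levels above, here is a minimal sketch. It is not part of the original runbook: the function name severity_hint is hypothetical, and it assumes the workload needs a majority quorum (floor(n/2) + 1 replicas), which may not hold for every application. It only sees counts, so it cannot detect the High case where a specific critical pod (such as a primary) is the one missing.

```shell
# Suggest a severity level from desired and ready replica counts.
# Assumption: the workload needs a majority quorum, floor(n/2) + 1.
severity_hint() {
  desired=$1
  ready=$2
  quorum=$((desired / 2 + 1))
  if [ "$ready" -ge "$desired" ]; then
    echo "ok"
  elif [ "$ready" -lt "$quorum" ]; then
    echo "critical"   # quorum lost, or no replicas available
  elif [ "$ready" -eq "$quorum" ]; then
    echo "medium"     # quorum holds, but one more failure would break it
  else
    echo "low"        # redundancy remains
  fi
}
```

For a 3-replica set with 2 Ready pods, quorum is 2, so one further failure would lose it and the hint is medium.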
Troubleshooting steps
- Check StatefulSet status
  - Command / Action:
    - Inspect desired vs ready replicas
    - kubectl get statefulset <statefulset-name> -n <namespace>
  - Expected result:
    - READY shows ready/desired with both numbers equal (for example, 3/3)
  - Additional info:
    - A mismatch confirms the alert condition
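The comparison in this step can also be scripted. The following is a minimal sketch, assuming kubectl is already configured for the target cluster. The function names replica_gap and sts_replica_gap are hypothetical. It reads .spec.replicas and .status.readyReplicas via JSONPath. The ready count may be absent from the status when no pod is Ready, so an empty value is treated as 0.

```shell
# Print how many replicas are missing, given desired and ready counts.
# An empty or missing ready count is treated as 0.
replica_gap() {
  desired=$1
  ready=${2:-0}
  echo $((desired - ready))
}

# Hypothetical wrapper: query the cluster and print the gap for one StatefulSet.
sts_replica_gap() {
  name=$1
  namespace=$2
  counts=$(kubectl get statefulset "$name" -n "$namespace" \
    -o jsonpath='{.spec.replicas} {.status.readyReplicas}') || return 1
  # Word splitting is intentional: counts is "desired ready" or "desired ".
  set -- $counts
  replica_gap "$1" "${2:-}"
}
```

Calling sts_replica_gap with the StatefulSet name and namespace prints 0 when every replica is Ready, and a positive number otherwise.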
- Describe the StatefulSet
  - Command / Action:
    - Review events and pod management behavior
    - kubectl describe statefulset <statefulset-name> -n <namespace>
  - Expected result:
    - Events show normal pod creation and updates
  - Additional info:
    - StatefulSets create pods sequentially by default
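To confirm whether sequential creation applies to a given StatefulSet, its pod management policy can be read directly. This is a small sketch, with the hypothetical function name sts_pod_policy, that reads the .spec.podManagementPolicy field. With OrderedReady (the default), a pod is not created until all lower-ordinal pods are Running and Ready, so one stuck pod can hold back every pod after it. With Parallel, pods are created without waiting for each other.

```shell
# Print the pod management policy (OrderedReady or Parallel) of a StatefulSet.
# Arguments: <statefulset-name> <namespace>
sts_pod_policy() {
  kubectl get statefulset "$1" -n "$2" \
    -o jsonpath='{.spec.podManagementPolicy}'
}
```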
- Inspect pods
  - Command / Action:
    - List StatefulSet pods and their status
    - kubectl get pods -n <namespace> -l <statefulset-label> -o wide
  - Expected result:
    - Pods are Running and Ready
  - Additional info:
    - Identify which ordinal pod is missing or unhealthy
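Finding a missing ordinal by eye gets tedious with many replicas. The sketch below assumes pods are named <statefulset-name>-<ordinal>, with ordinals running from 0 to replicas - 1 (the default numbering). The function name missing_pods is hypothetical. It reports only pods that do not exist at all. Pods that exist but are unhealthy still need the status check from this step.

```shell
# Print the expected pod names that are absent from the list read on stdin.
# Arguments: <statefulset-name> <desired-replicas>
# Assumption: pods are named <statefulset-name>-<ordinal>, ordinals 0..replicas-1.
missing_pods() {
  name=$1
  replicas=$2
  present=$(cat)
  i=0
  while [ "$i" -lt "$replicas" ]; do
    # -F -x: match the whole line as a fixed string, so web-1 does not match web-10.
    if ! printf '%s\n' "$present" | grep -qxF -- "$name-$i"; then
      echo "$name-$i"
    fi
    i=$((i + 1))
  done
}

# Example usage (requires cluster access):
#   kubectl get pods -n <namespace> -l <statefulset-label> \
#     -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}' \
#     | missing_pods <statefulset-name> <desired-replicas>
```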
- Describe problematic pods
  - Command / Action:
    - Inspect events and container status
    - kubectl describe pod <pod-name> -n <namespace>
  - Expected result:
    - Pods start successfully without repeated failures
  - Additional info:
    - Look for PVC, scheduling, or probe-related errors
- Check PersistentVolumeClaims
  - Command / Action:
    - Verify PVCs are bound and healthy
    - kubectl get pvc -n <namespace>
  - Expected result:
    - All PVCs are in Bound state
  - Additional info:
    - Unbound PVCs block pod startup
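In a namespace with many claims, filtering for the unbound ones is quicker than reading the full table. This is a minimal sketch with the hypothetical function name unbound_pvcs. It assumes the default table output of kubectl get pvc with --no-headers, where NAME is the first column and STATUS the second. PVCs created from a StatefulSet's volumeClaimTemplates are named <template-name>-<pod-name>, so the result points at the affected pod.

```shell
# Print the names of PVCs whose STATUS column is not "Bound".
# Input on stdin: output of `kubectl get pvc --no-headers`
# (NAME in column 1, STATUS in column 2).
unbound_pvcs() {
  awk '$2 != "Bound" { print $1 }'
}

# Example usage (requires cluster access):
#   kubectl get pvc -n <namespace> --no-headers | unbound_pvcs
```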
- Check pod logs
  - Command / Action:
    - Review logs for application-level failures
    - kubectl logs <pod-name> -n <namespace>
  - Expected result:
    - Application starts and runs normally
  - Additional info:
    - Use --previous for restarted containers
- Verify PodDisruptionBudgets
  - Command / Action:
    - Inspect PDBs that may block recovery
    - kubectl get pdb -n <namespace>
  - Expected result:
    - PDBs allow at least one pod to be unavailable
  - Additional info:
    - Overly strict PDBs can stall StatefulSet recovery
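One way to spot a PDB that currently permits no disruptions is to read .status.disruptionsAllowed for each budget. The sketch below assumes input lines of the form "<name> <disruptionsAllowed>". The function name blocking_pdbs is hypothetical. Note that a PDB limits voluntary evictions (for example during a node drain). A value of 0 is therefore a hint to investigate, not proof that the PDB is the cause.

```shell
# Print the names of PDBs that currently allow zero disruptions.
# Input on stdin: lines of "<pdb-name> <disruptionsAllowed>".
# A missing count is treated as 0.
blocking_pdbs() {
  awk '($2 + 0) == 0 { print $1 }'
}

# Example usage (requires cluster access):
#   kubectl get pdb -n <namespace> \
#     -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.disruptionsAllowed}{"\n"}{end}' \
#     | blocking_pdbs
```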
- Recover missing replicas
  - Command / Action:
    - Fix root cause and allow pod recreation
    - kubectl delete pod <pod-name> -n <namespace>
  - Expected result:
    - StatefulSet recreates pod with same identity
  - Additional info:
    - Avoid deleting PVCs unless data loss is acceptable
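After the root cause is fixed, deleting the pod and waiting for recovery can be combined. This is a sketch only: the function name recreate_and_wait and the 5-minute timeout are assumptions. It also assumes the StatefulSet uses the RollingUpdate update strategy, because kubectl rollout status reports an error for other strategies. It deletes only the pod and never touches PVCs.

```shell
# Delete one pod, then wait until the StatefulSet reports all replicas ready.
# Arguments: <pod-name> <statefulset-name> <namespace>
# Assumption: the StatefulSet uses the RollingUpdate update strategy.
recreate_and_wait() {
  pod=$1
  sts=$2
  ns=$3
  kubectl delete pod "$pod" -n "$ns" || return 1
  # The 5m timeout is an arbitrary choice; adjust it to the workload's startup time.
  kubectl rollout status "statefulset/$sts" -n "$ns" --timeout=5m
}
```

A non-zero exit status means the delete failed or the StatefulSet did not become fully ready within the timeout. In that case, return to the earlier steps.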
Additional resources
- Kubernetes StatefulSet documentation
- Kubernetes Persistent Volumes
- Kubernetes Pod lifecycle and troubleshooting
- Related alert: KubePodNotReady
- Related alert: KubePodCrashLooping