Alert Runbooks

KubeDeploymentReplicasMismatch

Description

This alert fires when the number of replicas defined in a Kubernetes Deployment does not match the number of available or ready replicas. It indicates that the Deployment is unable to reach or maintain the desired replica count, which can lead to reduced capacity or service unavailability.
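
The condition described above can be checked by hand for a single Deployment. The following is a minimal sketch, not the alert's actual rule: the function name replicas_match is hypothetical, and it assumes kubectl is already configured for the affected cluster. It compares .spec.replicas with .status.availableReplicas.

```shell
# Hypothetical helper: compare desired vs. available replicas for one Deployment.
# Usage: replicas_match <deployment-name> <namespace>
# Returns 0 when the counts match, non-zero otherwise.
replicas_match() {
  local deploy="$1" ns="$2" desired available
  desired=$(kubectl get deployment "$deploy" -n "$ns" -o jsonpath='{.spec.replicas}')
  available=$(kubectl get deployment "$deploy" -n "$ns" -o jsonpath='{.status.availableReplicas}')
  # The API omits availableReplicas when it is zero, so treat empty output as 0.
  available=${available:-0}
  echo "desired=${desired} available=${available}"
  [ "$desired" = "$available" ]
}
```

For example, running replicas_match <deployment-name> <namespace> prints both counts and exits non-zero while the mismatch persists.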


Possible Causes:


Severity estimation


Troubleshooting steps

  1. Check Deployment replica status

    • Command / Action:
      • Inspect desired vs available replicas
      • kubectl get deployment <deployment-name> -n <namespace>

    • Expected result:
      • Desired, current, ready, and available replica counts align
      • For example: READY=3/3, UP-TO-DATE=3, AVAILABLE=3

    • Additional info:
      • A mismatch indicates pods are not becoming ready
  2. Describe the Deployment

    • Command / Action:
      • Review events and rollout status
      • kubectl describe deployment <deployment-name> -n <namespace>

    • Expected result:
      • No errors related to scheduling, images, or probes
    • Additional info:
      • Events often explain why replicas are missing
  3. Inspect Pods

    • Command / Action:
      • List pods in the Deployment's namespace; add -l <label-selector> to show only the Deployment's pods
      • kubectl get pods -n <namespace>

    • Expected result:
      • Pods are Running and Ready
    • Additional info:
      • Investigate Pending, CrashLoopBackOff, or ImagePullBackOff states
  4. Describe problematic Pods

    • Command / Action:
      • Inspect pod details and events
      • kubectl describe pod <pod-name> -n <namespace>

    • Expected result:
      • Pod events show normal startup and readiness
    • Additional info:
      • Common issues include resource limits, probes, or permissions
  5. Check container logs

    • Command / Action:
      • Review application logs
      • kubectl logs <pod-name> -n <namespace>

    • Expected result:
      • Application starts without fatal or repeated errors
    • Additional info:
      • For restarted pods, check previous logs using --previous
  6. Verify node health

    • Command / Action:
      • Ensure nodes are healthy and schedulable
      • kubectl get nodes

    • Expected result:
      • Nodes are in Ready state with no pressure conditions
    • Additional info:
      • Node issues can prevent pods from being scheduled or running; kubectl describe node <node-name> lists any pressure conditions
  7. Check for HPA influence

    • Command / Action:
      • Verify if an HPA is controlling replicas
      • kubectl get hpa -n <namespace>

    • Expected result:
      • HPA behavior matches expectations
    • Additional info:
      • An HPA manages the Deployment's replica count, so a manually set replica value may be overwritten
  8. Scale or rollback if required

    • Command / Action:
      • Scale manually or roll back to a stable version
      • kubectl scale deployment <deployment-name> --replicas=<n> -n <namespace>
      • kubectl rollout undo deployment <deployment-name> -n <namespace>

    • Expected result:
      • Available replicas return to the desired count
    • Additional info:
      • Roll back if the current version is unstable
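
The read-only checks in steps 1, 2, 3, 6, and 7 can be collected in one pass. This is a sketch under stated assumptions: the helper names run and collect_diagnostics are hypothetical, kubectl is assumed to be configured for the affected cluster, and only the commands already listed in the steps above are used. Steps 4 and 5 need a specific pod name, so they are left to be run by hand.

```shell
# Hypothetical helper: print a header, then run one kubectl command.
# A failing command is reported but does not stop the remaining checks.
run() {
  echo "=== kubectl $* ==="
  kubectl "$@" || echo "(command failed; continuing)"
}

# Hypothetical helper: run the read-only checks from the troubleshooting steps.
# Usage: collect_diagnostics <deployment-name> <namespace>
collect_diagnostics() {
  local deploy="$1" ns="$2"
  run get deployment "$deploy" -n "$ns"       # step 1: replica status
  run describe deployment "$deploy" -n "$ns"  # step 2: events and rollout status
  run get pods -n "$ns"                       # step 3: pod states
  run get nodes                               # step 6: node health
  run get hpa -n "$ns"                        # step 7: HPA influence
}
```

Redirecting the output to a file, for example collect_diagnostics <deployment-name> <namespace> > diagnostics.txt, keeps a snapshot that can be attached to an incident ticket.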

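After scaling or rolling back in step 8, the recovery can be confirmed by waiting for the rollout to finish. The sketch below assumes kubectl is configured for the affected cluster; the function name wait_for_rollout and the 120s default timeout are illustrative choices, not part of this runbook.

```shell
# Hypothetical helper: wait for a Deployment rollout to complete.
# Usage: wait_for_rollout <deployment-name> <namespace> [timeout]
wait_for_rollout() {
  local deploy="$1" ns="$2" timeout="${3:-120s}"
  if kubectl rollout status "deployment/$deploy" -n "$ns" --timeout="$timeout"; then
    echo "rollout complete: $deploy"
  else
    # A non-zero exit means the rollout failed or did not finish in time.
    echo "rollout not complete within $timeout: $deploy" >&2
    return 1
  fi
}
```

If the function returns non-zero, the replicas have not recovered and the earlier troubleshooting steps should be repeated.
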
Additional resources