Alert Runbooks

KubeQuotaExceeded

Description

This alert fires when a Kubernetes ResourceQuota in a namespace has been fully exhausted, meaning one or more tracked resources (CPU, memory, pods, services, PVCs, etc.) have reached or exceeded their configured hard limits.

When a quota is exceeded, Kubernetes rejects new object creation in the namespace at admission time: pod creation, deployment rollouts, and scale-up operations fail with an "exceeded quota" error until consumption is reduced or the quota is raised.


Possible Causes:


Severity estimation

High severity: new workloads are actively being rejected and the namespace is operationally impaired.

Severity increases with:


Troubleshooting steps

  1. Identify the affected namespace, quota, and resource type

    • Command / Action:
      • Check alert labels for namespace and resource, then list all quotas in the namespace
      • kubectl get resourcequota -n <namespace>

    • Expected result:
      • The specific quota and exceeded resource type are clearly identified
    • Additional info:
      • A namespace can have multiple ResourceQuota objects; check all of them

  2. Describe the quota to see the full usage breakdown

    • Command / Action:
      • Get exact used vs. hard values for each tracked resource
      • kubectl describe resourcequota <quota-name> -n <namespace>

    • Expected result:
      • The used value equals or exceeds the hard limit for at least one resource
    • Additional info:
      • Common resources to check: requests.cpu, requests.memory, limits.cpu, limits.memory, pods, services, persistentvolumeclaims
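The used and hard values are reported as Kubernetes quantities (for example 1500m and 2), which are easy to misread side by side. The following is a minimal sketch, not part of the original runbook, that reads requests.cpu from one quota and prints the percentage consumed. The function names cpu_to_millicores and cpu_quota_percent are hypothetical helpers, and the conversion assumes CPU values are written either as plain cores or with the m suffix.

```shell
# Convert a Kubernetes CPU quantity ("500m", "2", "1.5") to millicores.
# Assumption: only plain cores or the "m" suffix are used.
cpu_to_millicores() {
  case "$1" in
    *m) printf '%s\n' "${1%m}" ;;
    *)  awk -v v="$1" 'BEGIN { printf "%.0f\n", v * 1000 }' ;;
  esac
}

# Print the share of the requests.cpu quota in use, as a whole percentage.
# Usage: cpu_quota_percent <namespace> <quota-name>
cpu_quota_percent() {
  ns="$1"; quota="$2"
  used=$(kubectl get resourcequota "$quota" -n "$ns" \
    -o jsonpath='{.status.used.requests\.cpu}')
  hard=$(kubectl get resourcequota "$quota" -n "$ns" \
    -o jsonpath='{.status.hard.requests\.cpu}')
  # Prints nothing when the quota does not track requests.cpu (hard is 0).
  awk -v u="$(cpu_to_millicores "$used")" -v h="$(cpu_to_millicores "$hard")" \
    'BEGIN { if (h > 0) printf "%.0f\n", u * 100 / h }'
}
```

A result of 100 or more means requests.cpu is the exhausted resource. Memory quantities use different suffixes (Mi, Gi) and would need their own conversion.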

  3. Confirm active failures caused by the exceeded quota

    • Command / Action:
      • Check for pods or deployments failing due to quota errors
      • kubectl get events -n <namespace> --sort-by='.lastTimestamp' | grep -i quota

      • kubectl describe replicaset <rs-name> -n <namespace>

    • Expected result:
      • Events show exceeded quota errors; identify which resource type is the bottleneck
    • Additional info:
      • This confirms the quota is the direct cause of any ongoing failures
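When many events are present, it can help to reduce them to the distinct quota and resource pairs that are blocking creation. The sketch below is an illustration under one assumption: that the event message follows the usual shape "exceeded quota: <quota-name>, requested: ..., used: ..., limited: <resource>=<value>". The function names summarize_quota_errors and quota_failures are hypothetical.

```shell
# Read event text on stdin and print each distinct "<quota-name> <limited-resources>" pair.
# Assumption: messages look like
#   ... exceeded quota: <name>, requested: ..., used: ..., limited: <resource>=<value>
summarize_quota_errors() {
  sed -n 's/.*exceeded quota: \([^,]*\),.*limited: \(.*\)$/\1 \2/p' | sort -u
}

# Fetch FailedCreate events for a namespace and summarize the quota errors.
# Usage: quota_failures <namespace>
quota_failures() {
  kubectl get events -n "$1" --field-selector reason=FailedCreate \
    | summarize_quota_errors
}
```

If the summary is empty while pods are still failing, the message format in your cluster may differ from the assumed one, so fall back to reading the raw events.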

  4. Clean up stale or completed objects to free quota immediately

    • Command / Action:
      • Delete completed pods, succeeded/failed jobs, and unused resources
      • kubectl get pods -n <namespace> --field-selector=status.phase=Succeeded

      • kubectl delete pod --field-selector=status.phase=Succeeded -n <namespace>

      • kubectl delete pod --field-selector=status.phase=Failed -n <namespace>

      • kubectl get jobs -n <namespace> -o json | jq '.items[] | select(.status.completionTime != null) | .metadata.name'

    • Expected result:
      • Quota usage drops below the hard limit after cleanup
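    • Additional info:
      • Also check for orphaned PVCs, unused ConfigMaps, and old Secrets that count toward object quotas
      • Pods in a terminal phase (Succeeded or Failed) do not count toward the pods or compute-resource quotas, so deleting them mainly helps when an object-count quota is the one exceeded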
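The cleanup commands can be wrapped so that they are previewed before anything is deleted. This is a sketch rather than a prescribed procedure: cleanup_finished is a hypothetical helper name, and the job selector status.successful=1 only matches jobs that needed a single completion, so multi-completion jobs would need a different filter.

```shell
# Delete finished pods and single-completion jobs in a namespace.
# Extra arguments are passed to every kubectl call, e.g. --dry-run=client.
# Usage: cleanup_finished <namespace> [extra kubectl flags...]
cleanup_finished() {
  ns="$1"; shift
  kubectl delete pods -n "$ns" --field-selector=status.phase=Succeeded "$@"
  kubectl delete pods -n "$ns" --field-selector=status.phase=Failed "$@"
  # Assumption: completed jobs of interest have exactly one successful pod.
  kubectl delete jobs -n "$ns" --field-selector=status.successful=1 "$@"
}
```

Running cleanup_finished <namespace> --dry-run=client first lists what would be removed. Failed pods are often the only record of why a workload crashed, so consider saving their logs before deleting them.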

  5. Identify the largest resource consumers in the namespace

    • Command / Action:
      • Find which pods are consuming the most of the exhausted resource
      • kubectl top pod -n <namespace> --sort-by=cpu

      • kubectl top pod -n <namespace> --sort-by=memory

    • Expected result:
      • Top consumers identified; candidates for limit reduction or removal confirmed
    • Additional info:
      • Compare actual usage to configured requests/limits to identify oversized workloads
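Compute quotas are charged against the requests and limits declared in pod specs, not against the measured usage that kubectl top reports, so the declared values are the other half of the comparison. The following sketch lists per-pod requests with custom columns; list_pod_requests is a hypothetical helper name and the column headers are arbitrary.

```shell
# Show the CPU and memory requests declared by every pod in a namespace.
# Pods with several containers show one comma-separated value per container.
# Usage: list_pod_requests <namespace>
list_pod_requests() {
  kubectl get pods -n "$1" -o \
    custom-columns='NAME:.metadata.name,CPU_REQ:.spec.containers[*].resources.requests.cpu,MEM_REQ:.spec.containers[*].resources.requests.memory'
}
```

Pods whose declared requests are far above what kubectl top shows are the most likely candidates for right-sizing.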

  6. Reduce resource requests or limits on oversized workloads

    • Command / Action:
      • Right-size requests and limits to free up quota without removing workloads
      • kubectl set resources deployment <deployment-name> --requests=cpu=<value>,memory=<value> -n <namespace>

      • kubectl set resources deployment <deployment-name> --limits=cpu=<value>,memory=<value> -n <namespace>

    • Expected result:
      • Total requested resources in the namespace drop below the hard quota limit
    • Additional info:
      • Use actual usage from kubectl top pod as a baseline; do not set limits below actual usage

  7. Temporarily scale down non-critical workloads if immediate relief is needed

    • Command / Action:
      • Scale down lower-priority deployments to free up quota for critical services
      • kubectl scale deployment <deployment-name> --replicas=<n> -n <namespace>

    • Expected result:
      • Quota usage falls below the hard limit; critical workloads can resume scheduling
    • Additional info:
      • This is a short-term mitigation; follow up with a quota increase or workload optimization
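Because the scale-down is temporary, it is worth recording the original replica count before changing it. A minimal sketch, assuming the workload is a Deployment and is not managed by an autoscaler that would immediately scale it back; scale_down_with_restore is a hypothetical helper name.

```shell
# Scale a deployment down and print the command that restores it.
# Usage: scale_down_with_restore <namespace> <deployment-name> <target-replicas>
scale_down_with_restore() {
  ns="$1"; deploy="$2"; target="$3"
  # Read the current desired replica count before changing anything.
  current=$(kubectl get deployment "$deploy" -n "$ns" -o jsonpath='{.spec.replicas}')
  echo "# restore with: kubectl scale deployment $deploy --replicas=$current -n $ns"
  kubectl scale deployment "$deploy" --replicas="$target" -n "$ns"
}
```

Paste the printed restore line into the incident notes so the original replica count is not lost once the quota issue is resolved.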

  8. Increase the ResourceQuota limit if cluster capacity allows

    • Command / Action:
      • Edit the quota to raise the hard limit for the exhausted resource
      • kubectl edit resourcequota <quota-name> -n <namespace>

    • Expected result:
      • The hard limit is raised; pending pod creations and deployments proceed normally
    • Additional info:
      • Verify available cluster capacity before raising limits: kubectl describe nodes | grep -A5 "Allocated resources"
      • Coordinate with the team responsible for the namespace and cluster capacity planning
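As a non-interactive alternative to kubectl edit, a merge patch changes a single hard limit and leaves a reproducible command in the shell history. This is a sketch under assumptions: raise_quota is a hypothetical helper, and the resource and value arguments are whatever the team has agreed on.

```shell
# Raise one hard limit on a ResourceQuota with a JSON merge patch.
# Usage: raise_quota <namespace> <quota-name> <resource> <new-value>
# Example: raise_quota demo team-quota requests.cpu 8
raise_quota() {
  # Build a patch such as {"spec":{"hard":{"requests.cpu":"8"}}}
  patch=$(printf '{"spec":{"hard":{"%s":"%s"}}}' "$3" "$4")
  kubectl patch resourcequota "$2" -n "$1" --type=merge -p "$patch"
}
```

If the quota is managed declaratively (for example through Helm or a GitOps repository), a manual patch may be reverted on the next sync, so the source definition should be updated as well.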

  9. Check if a runaway autoscaler is the root cause

    • Command / Action:
      • Inspect HPA or KEDA objects that may have scaled workloads beyond the quota
      • kubectl get hpa -n <namespace>

      • kubectl describe hpa <hpa-name> -n <namespace>

    • Expected result:
      • Max replicas setting is consistent with available quota capacity
    • Additional info:
      • If the HPA's maxReplicas exceeds what the quota can accommodate, lower maxReplicas or increase the quota
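Whether maxReplicas fits the quota is simple arithmetic: replicas multiplied by the per-pod request, compared with the hard limit. The sketch below does that check for CPU in millicores. hpa_fits_quota is a hypothetical helper, and it deliberately ignores every other workload in the namespace, so treat "fits" as a necessary condition rather than a guarantee.

```shell
# Check whether an autoscaler at full scale stays within a CPU quota.
# All values are integers in millicores.
# Usage: hpa_fits_quota <max-replicas> <per-pod-cpu-request> <quota-hard-limit>
hpa_fits_quota() {
  needed=$(( $1 * $2 ))
  if [ "$needed" -le "$3" ]; then
    echo "fits: ${needed}m of ${3}m"
  else
    echo "exceeds by $(( needed - $3 ))m"
  fi
}
```

For example, hpa_fits_quota 10 250 2000 reports an overshoot of 500m, while hpa_fits_quota 4 250 2000 fits. The maxReplicas value can be read with kubectl get hpa <hpa-name> -n <namespace> -o jsonpath='{.spec.maxReplicas}'.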

  10. Monitor quota usage after remediation

    • Command / Action:
      • Confirm the quota is no longer exceeded and track usage trend
      • kube_resourcequota{namespace="<namespace>", type="used"}

      • kubectl get resourcequota -n <namespace>

    • Expected result:
      • Used values are below hard limits; alert clears within the next evaluation cycle
    • Additional info:
      • If usage climbs back quickly, consider increasing the quota or reviewing workload sizing more broadly

Additional resources