KubeQuotaExceeded

Description

This alert fires when a Kubernetes ResourceQuota in a namespace has been fully exhausted, meaning one or more tracked resources (CPU, memory, pods, services, PVCs, etc.) have reached or exceeded their configured hard limits.

When a quota is exceeded, Kubernetes rejects the creation of new objects in the namespace: pod creation, deployment rollouts, and scaling operations fail with an "exceeded quota" error until consumption is reduced or the quota is raised.

Possible causes

Sudden burst in workload replicas due to autoscaling or manual scaling
Deployment of new services without updating the namespace quota first
Accumulated stale objects (completed jobs, old pods, unused PVCs) consuming quota
Resource requests or limits set too high across workloads
Quota limit set too low for the actual workload requirements
Runaway autoscaler creating more replicas than the quota allows
Misconfigured CI/CD pipeline deploying excessive resources

Severity estimation

High severity by default: new workloads are being rejected and the namespace is operationally impaired. Adjust according to impact:

Medium: Non-critical namespace affected; new deployments failing but existing workloads stable
High: Production namespace affected; scaling or rollouts blocked; existing workloads at risk
Critical: Core services cannot restart or scale; cluster stability or data integrity at risk

Severity increases with:

Criticality of the affected namespace and its workloads
Number of resource types exceeding the quota
Duration of the exceeded state
Active failures in deployments or pod restarts

Troubleshooting steps

Identify the affected namespace, quota, and resource type
- Command / Action:
  - Check alert labels for namespace and resource, then list all quotas in the namespace
  - kubectl get resourcequota -n <namespace>
- Expected result:
  - The specific quota and exceeded resource type are clearly identified
- additional info:
  - A namespace can have multiple ResourceQuota objects; check all of them

Describe the quota to see the full usage breakdown
- Command / Action:
  - Get exact used vs. hard values for each tracked resource
  - kubectl describe resourcequota <quota-name> -n <namespace>
- Expected result:
  - The used value equals or exceeds the hard limit for at least one resource
- additional info:
  - Common resources to check: requests.cpu, requests.memory, limits.cpu, limits.memory, pods, services, persistentvolumeclaims
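
The used-versus-hard comparison can be scripted. The sketch below is one possible helper, not part of kubectl: the function name quota_usage is hypothetical, and it assumes the table layout that kubectl describe resourcequota prints for a single quota (a Resource / Used / Hard header followed by a dashed separator row). It converts common quantity suffixes (m, k, M, G, T, Ki, Mi, Gi, Ti) to plain numbers and labels each resource ok, AT_LIMIT, or EXCEEDED.

```shell
# quota_usage (hypothetical helper): reads `kubectl describe resourcequota`
# output on stdin and prints "resource used hard percent status" per resource.
quota_usage() {
  awk '
    # Convert a Kubernetes quantity such as 500m, 2, 4Gi or 1G to a number.
    # Unknown suffixes are ignored and the numeric part is returned as-is.
    function to_num(q,    n, s) {
      n = q; s = ""
      if (match(q, /[A-Za-z]+$/)) {
        s = substr(q, RSTART)
        n = substr(q, 1, RSTART - 1)
      }
      n += 0
      if (s == "m")  return n / 1000
      if (s == "k")  return n * 1000
      if (s == "M")  return n * 1000000
      if (s == "G")  return n * 1000000000
      if (s == "T")  return n * 1000000000000
      if (s == "Ki") return n * 1024
      if (s == "Mi") return n * 1024 * 1024
      if (s == "Gi") return n * 1024 * 1024 * 1024
      if (s == "Ti") return n * 1024 * 1024 * 1024 * 1024
      return n
    }
    # The dashed separator row marks the start of the usage table.
    $1 ~ /^-+$/ { in_table = 1; next }
    in_table && NF >= 3 {
      used = to_num($2); hard = to_num($3)
      pct = (hard > 0) ? 100 * used / hard : 0
      state = "ok"
      if (used == hard) state = "AT_LIMIT"
      if (used > hard)  state = "EXCEEDED"
      printf "%s %s %s %.0f%% %s\n", $1, $2, $3, pct, state
    }
  '
}
```

Possible usage: kubectl describe resourcequota <quota-name> -n <namespace> | quota_usage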

Confirm active failures caused by the exceeded quota
- Command / Action:
  - Check for pods or deployments failing due to quota errors
  - kubectl get events -n <namespace> --sort-by='.lastTimestamp' | grep -i quota
  - kubectl describe replicaset <rs-name> -n <namespace>
- Expected result:
  - Events show exceeded quota errors; identify which resource type is the bottleneck
- additional info:
  - This confirms the quota is the direct cause of any ongoing failures
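
To narrow the event list to quota rejections only, the check above could be wrapped as follows. This is a sketch under two assumptions: that the controllers creating the pods (ReplicaSet, Job, StatefulSet) record rejected creations as events with reason FailedCreate, and that the message contains the text "exceeded quota". The function name quota_failures is hypothetical.

```shell
# quota_failures NAMESPACE (hypothetical helper)
# Lists FailedCreate events whose message mentions an exceeded quota.
# Returns non-zero when no matching event is found (grep exit status).
quota_failures() {
  kubectl get events -n "$1" \
    --field-selector reason=FailedCreate \
    --sort-by=.lastTimestamp | grep -i 'exceeded quota'
}
```

Possible usage: quota_failures <namespace>. An empty result does not rule out quota problems; objects created directly by users or pipelines are rejected at request time and may leave no event.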

Clean up stale or completed objects to free quota immediately
- Command / Action:
  - Delete completed pods, succeeded/failed jobs, and unused resources
  - kubectl get pods -n <namespace> --field-selector=status.phase=Succeeded
  - kubectl delete pod --field-selector=status.phase=Succeeded -n <namespace>
  - kubectl delete pod --field-selector=status.phase=Failed -n <namespace>
  - kubectl get jobs -n <namespace> -o json | jq '.items[] | select(.status.completionTime != null) | .metadata.name'
- Expected result:
  - Quota usage drops below the hard limit after cleanup
- additional info:
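  - Also check for orphaned PVCs, unused ConfigMaps, and old Secrets; they count toward object-count quotas when the quota tracks them
  - Pods in the Succeeded or Failed phase do not count toward the pods, requests.*, or limits.* quotas; deleting them only frees object-count quotas such as count/pods, and deleting finished jobs frees count/jobs.batch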
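
Because deletion is irreversible, it can help to preview what would be removed first. The sketch below wraps the delete commands from this step in a hypothetical function, cleanup_finished_pods, that defaults to kubectl's client-side dry run and only deletes when --apply is passed. The function name and the --apply switch are illustrative choices, not kubectl features.

```shell
# cleanup_finished_pods NAMESPACE [--apply]   (hypothetical helper)
# Without --apply, kubectl only reports what it would delete.
# The body runs in a subshell so its variables do not leak into the caller.
cleanup_finished_pods() (
  ns=$1
  mode=${2:-}
  dry_run="--dry-run=client"
  [ "$mode" = "--apply" ] && dry_run=""
  for phase in Succeeded Failed; do
    # $dry_run is left unquoted on purpose so it vanishes when empty.
    kubectl delete pod -n "$ns" --field-selector="status.phase=$phase" $dry_run
  done
)
```

Possible usage: run cleanup_finished_pods <namespace> to preview, then cleanup_finished_pods <namespace> --apply to delete.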

Identify the largest resource consumers in the namespace
- Command / Action:
  - Find which pods are consuming the most of the exhausted resource
  - kubectl top pod -n <namespace> --sort-by=cpu
  - kubectl top pod -n <namespace> --sort-by=memory
- Expected result:
  - Top consumers identified; candidates for limit reduction or removal confirmed
- additional info:
  - Compare actual usage to configured requests/limits to identify oversized workloads
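
kubectl top reports live usage only, so the configured requests have to come from the pod specs. One way to list them side by side is a custom-columns query, sketched below. The function name pod_requests and the column titles are hypothetical; for pods with several containers the values appear as a comma-separated list.

```shell
# pod_requests NAMESPACE (hypothetical helper)
# Prints the configured CPU and memory requests of every pod in the namespace.
pod_requests() {
  kubectl get pods -n "$1" \
    -o "custom-columns=NAME:.metadata.name,CPU_REQ:.spec.containers[*].resources.requests.cpu,MEM_REQ:.spec.containers[*].resources.requests.memory"
}
```

Possible usage: compare the output of pod_requests <namespace> with kubectl top pod -n <namespace>; pods whose requests are far above their observed usage are candidates for right-sizing.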

Reduce resource requests or limits on oversized workloads
- Command / Action:
  - Right-size requests and limits to free up quota without removing workloads
  - kubectl set resources deployment <deployment-name> --requests=cpu=<value>,memory=<value> -n <namespace>
  - kubectl set resources deployment <deployment-name> --limits=cpu=<value>,memory=<value> -n <namespace>
- Expected result:
  - Total requested resources in the namespace drop below the hard quota limit
- additional info:
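  - Use actual usage from kubectl top pod as a baseline; do not set limits below actual usage
  - Changing requests or limits modifies the pod template and triggers a rolling update of the deployment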

Temporarily scale down non-critical workloads if immediate relief is needed
- Command / Action:
  - Scale down lower-priority deployments to free up quota for critical services
  - kubectl scale deployment <deployment-name> --replicas=<n> -n <namespace>
- Expected result:
  - Quota usage falls below the hard limit; critical workloads can resume scheduling
- additional info:
  - This is a short-term mitigation; follow up with a quota increase or workload optimization
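
Since the scale-down is meant to be temporary, it is worth recording the previous replica count so it can be restored later. A minimal sketch, assuming a hypothetical helper named scale_down_recording:

```shell
# scale_down_recording DEPLOYMENT NAMESPACE TARGET_REPLICAS (hypothetical helper)
# Prints the current replica count before scaling, so it can be restored.
# The body runs in a subshell so its variables do not leak into the caller.
scale_down_recording() (
  deploy=$1; ns=$2; target=$3
  current=$(kubectl get deployment "$deploy" -n "$ns" -o jsonpath='{.spec.replicas}')
  echo "previous replicas for $deploy: $current"
  kubectl scale deployment "$deploy" -n "$ns" --replicas="$target"
)
```

Possible usage: scale_down_recording <deployment-name> <namespace> <n>, keeping the printed line in the incident notes. If an HPA manages the deployment, it may scale the workload back up on its own.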

Increase the ResourceQuota limit if cluster capacity allows
- Command / Action:
  - Edit the quota to raise the hard limit for the exhausted resource
  - kubectl edit resourcequota <quota-name> -n <namespace>
- Expected result:
  - The hard limit is raised; pending pod creations and deployments proceed normally
- additional info:
  - Verify available cluster capacity before raising limits: kubectl describe nodes | grep -A5 "Allocated resources"
  - Coordinate with the team responsible for the namespace and cluster capacity planning
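
When an interactive editor is inconvenient, for example in a scripted or reviewed change, the same update can be expressed as a merge patch. The sketch below is one way to do that; the function name raise_quota is hypothetical, and the resource name and value are passed through unchecked, so it assumes valid input such as requests.cpu and 8.

```shell
# raise_quota QUOTA NAMESPACE RESOURCE VALUE (hypothetical helper)
# Sets spec.hard[RESOURCE] to VALUE with a JSON merge patch.
# The body runs in a subshell so its variables do not leak into the caller.
raise_quota() (
  quota=$1; ns=$2; resource=$3; value=$4
  patch=$(printf '{"spec":{"hard":{"%s":"%s"}}}' "$resource" "$value")
  kubectl patch resourcequota "$quota" -n "$ns" --type=merge -p "$patch"
)
```

Possible usage: raise_quota <quota-name> <namespace> requests.cpu 8. If the quota is managed by GitOps or Helm, a manual patch may be reverted on the next sync, so the change should also be made at the source.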

Check if a runaway autoscaler is the root cause
- Command / Action:
  - Inspect HPA or KEDA objects that may have scaled workloads beyond the quota
  - kubectl get hpa -n <namespace>
  - kubectl describe hpa <hpa-name> -n <namespace>
- Expected result:
  - Max replicas setting is consistent with available quota capacity
- additional info:
  - If the HPA max replicas exceed what the quota can accommodate, lower maxReplicas or increase the quota
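
Whether maxReplicas fits the quota is simple arithmetic: subtract what other workloads already request from the hard limit, then divide by the per-pod request. As an illustration with made-up numbers, a requests.cpu quota of 4000m with 1000m requested by other workloads and 250m per replica leaves room for (4000 - 1000) / 250 = 12 replicas. The sketch below encodes that calculation; the function name replicas_that_fit is hypothetical and all inputs are integers in the same unit.

```shell
# replicas_that_fit HARD USED_BY_OTHERS PER_POD (hypothetical helper)
# All arguments are integers in one unit, for example CPU millicores.
# Prints how many replicas fit into the remaining quota, rounded down.
replicas_that_fit() (
  hard=$1; used_by_others=$2; per_pod=$3
  if [ "$per_pod" -le 0 ]; then
    echo "per-pod request must be a positive integer" >&2
    exit 1
  fi
  free=$(( hard - used_by_others ))
  # A quota that is already overcommitted leaves room for zero replicas.
  [ "$free" -lt 0 ] && free=0
  echo $(( free / per_pod ))
)
```

Possible usage: replicas_that_fit 4000 1000 250. Rolling updates can temporarily create extra pods (maxSurge), so it may be sensible to keep maxReplicas somewhat below the computed value.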

Monitor quota usage after remediation
- Command / Action:
  - Confirm the quota is no longer exceeded and track usage trend
  - kube_resourcequota{namespace="<namespace>", type="used"}
  - kubectl get resourcequota -n <namespace>
- Expected result:
  - Used values are below hard limits; alert clears within the next evaluation cycle
- additional info:
  - If usage climbs back quickly, consider increasing the quota or reviewing workload sizing more broadly
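
For trend tracking, a used-to-hard ratio is often easier to read than the raw used value. The sketch below prints a PromQL expression that divides the used series by the matching hard series for one namespace. It assumes the kube-state-metrics metric kube_resourcequota with a type label of "used" or "hard", as in the query above; the function name quota_ratio_query is hypothetical.

```shell
# quota_ratio_query NAMESPACE (hypothetical helper)
# Prints a PromQL expression for used/hard per quota and resource.
# A result of 1 or more means the resource is at or over its hard limit.
quota_ratio_query() {
  printf 'kube_resourcequota{namespace="%s", type="used"} / ignoring(type) (kube_resourcequota{namespace="%s", type="hard"} > 0)\n' "$1" "$1"
}
```

Possible usage: paste the output of quota_ratio_query <namespace> into the Prometheus expression browser or a dashboard panel. The "> 0" filter skips resources whose hard limit is zero, which would otherwise divide by zero.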

Additional resources

Kubernetes Resource Quotas
Managing Resources for Containers
Kubernetes LimitRange
Related alert: KubeQuotaAlmostFull
Related alert: KubeCPUQuotaOvercommit
Related alert: KubeMemoryQuotaOvercommit