KubeQuotaExceeded
Description
This alert fires when a Kubernetes ResourceQuota in a namespace has been fully exhausted, meaning one or more tracked resources (CPU, memory, pods, services, PVCs, etc.) have reached or exceeded their configured hard limits.
When a quota is exceeded, Kubernetes rejects new object creation in the namespace at admission time: pod creation, deployments, and scaling operations fail with an "exceeded quota" error until consumption is reduced or the quota is raised.
Possible Causes:
- Sudden burst in workload replicas due to autoscaling or manual scaling
- Deployment of new services without updating the namespace quota first
- Accumulated stale objects (completed jobs, old pods, unused PVCs) consuming quota
- Resource requests or limits set too high across workloads
- Quota limit set too low for the actual workload requirements
- Runaway autoscaler creating more replicas than the quota allows
- Misconfigured CI/CD pipeline deploying excessive resources
Severity estimation
High by default: new workloads are being rejected and the namespace is operationally impaired. Adjust the level based on impact:
- Medium: Non-critical namespace affected; new deployments failing but existing workloads stable
- High: Production namespace affected; scaling or rollouts blocked; existing workloads at risk
- Critical: Core services cannot restart or scale; cluster stability or data integrity at risk
Severity increases with:
- Criticality of the affected namespace and its workloads
- Number of resource types exceeding the quota
- Duration of the exceeded state
- Active failures in deployments or pod restarts
Troubleshooting steps
- Identify the affected namespace, quota, and resource type
  - Command / Action:
    - Check the alert labels for namespace and resource, then list all quotas in the namespace
    - kubectl get resourcequota -n <namespace>
  - Expected result:
    - The specific quota and the exceeded resource type are clearly identified
  - Additional info:
    - A namespace can have multiple ResourceQuota objects; check all of them
- Describe the quota to see the full usage breakdown
  - Command / Action:
    - Get exact used vs. hard values for each tracked resource
    - kubectl describe resourcequota <quota-name> -n <namespace>
  - Expected result:
    - The used value equals or exceeds the hard limit for at least one resource
  - Additional info:
    - Common resources to check: requests.cpu, requests.memory, limits.cpu, limits.memory, pods, services, persistentvolumeclaims
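When a quota tracks many resources, picking out the exhausted ones by eye is error-prone. The following is a minimal sketch, not part of the original runbook, that filters the describe output down to rows where the used value has reached the hard limit. The function name over_quota is hypothetical, and the unit handling is a simplification that covers only the common suffixes (m, k, M, G, T, Ki, Mi, Gi, Ti).

```shell
# over_quota: hypothetical helper that reads `kubectl describe resourcequota`
# output on stdin and prints the rows where Used >= Hard.
over_quota() {
  awk '
    # Convert a Kubernetes quantity (e.g. 500m, 2, 4Gi) to a plain number.
    # Only common suffixes are handled; anything else is treated as unitless.
    function to_num(q,    n, s) {
      n = q; s = ""
      if (match(q, /[A-Za-z]+$/)) { s = substr(q, RSTART); n = substr(q, 1, RSTART - 1) }
      if (s == "m")  return n / 1000
      if (s == "k")  return n * 1000
      if (s == "M")  return n * 1000000
      if (s == "G")  return n * 1000000000
      if (s == "T")  return n * 1000000000000
      if (s == "Ki") return n * 1024
      if (s == "Mi") return n * 1048576
      if (s == "Gi") return n * 1073741824
      if (s == "Ti") return n * 1099511627776
      return n + 0
    }
    /^Name:/ { body = 0 }          # a new quota block starts
    /^-+/    { body = 1; next }    # the dashed line ends the table header
    body && NF == 3 && to_num($2) >= to_num($3) { print $1, $2, $3 }
  '
}
# Usage (needs cluster access):
#   kubectl describe resourcequota -n <namespace> | over_quota
```

Each output line has the form "resource used hard". Because the filter resets at every "Name:" line, it can also be fed the describe output for all quotas in a namespace at once.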
- Confirm active failures caused by the exceeded quota
  - Command / Action:
    - Check for pods or deployments failing due to quota errors
    - kubectl get events -n <namespace> --sort-by='.lastTimestamp' | grep -i quota
    - kubectl describe replicaset <rs-name> -n <namespace>
  - Expected result:
    - Events show exceeded quota errors; identify which resource type is the bottleneck
  - Additional info:
    - This confirms the quota is the direct cause of any ongoing failures
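If many events match, it can help to collapse them to the distinct quota and limit involved. The sketch below is one way to do that, assuming the admission message has the shape "exceeded quota: <quota>, requested: <...>, used: <...>, limited: <...>" and that "limited:" is the last field on the line. The function name quota_errors is hypothetical.

```shell
# quota_errors: hypothetical helper that reads event or log lines on stdin and
# prints one "quota=<name> limited=<resource>=<hard>" line per distinct failure.
# Lines that do not contain an "exceeded quota" message are dropped.
quota_errors() {
  sed -n 's/.*exceeded quota: \([^,]*\), requested: .*, limited: \(.*\)$/quota=\1 limited=\2/p' | sort -u
}
# Usage (needs cluster access):
#   kubectl get events -n <namespace> | quota_errors
```

The output names the quota object and the hard limit that was hit, which is usually enough to decide which resource to free up or raise.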
- Clean up stale or completed objects to free quota immediately
  - Command / Action:
    - Delete completed pods, succeeded/failed jobs, and unused resources
    - kubectl get pods -n <namespace> --field-selector=status.phase=Succeeded
    - kubectl delete pod --field-selector=status.phase=Succeeded -n <namespace>
    - kubectl delete pod --field-selector=status.phase=Failed -n <namespace>
    - kubectl get jobs -n <namespace> -o json | jq '.items[] | select(.status.completionTime != null) | .metadata.name'
  - Expected result:
    - Quota usage drops below the hard limit after cleanup
  - Additional info:
    - Also check for orphaned PVCs, unused ConfigMaps, and old Secrets that count toward object quotas
    - Pods in a terminal phase (Succeeded or Failed) do not count toward the pods or compute quotas, so deleting them mainly frees object-count quotas such as count/pods
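Bulk deletes are easy to aim at the wrong namespace during an incident. The following sketch wraps the two pod delete commands above so that they are only printed by default and executed only when DRY_RUN=0 is set. The function name cleanup_finished_pods and the DRY_RUN variable are hypothetical conventions, not kubectl features.

```shell
# cleanup_finished_pods: hypothetical wrapper around the pod delete commands.
# It prints the commands unless DRY_RUN=0 is set in the environment.
cleanup_finished_pods() {
  ns="$1"
  for phase in Succeeded Failed; do
    if [ "${DRY_RUN:-1}" = "0" ]; then
      kubectl delete pod --field-selector="status.phase=$phase" -n "$ns"
    else
      echo "kubectl delete pod --field-selector=status.phase=$phase -n $ns"
    fi
  done
}
# Usage:
#   cleanup_finished_pods <namespace>             # print only
#   DRY_RUN=0 cleanup_finished_pods <namespace>   # actually delete
```

Reviewing the printed commands first gives a quick check that the namespace and phases are the intended ones.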
- Identify the largest resource consumers in the namespace
  - Command / Action:
    - Find which pods are consuming the most of the exhausted resource
    - kubectl top pod -n <namespace> --sort-by=cpu
    - kubectl top pod -n <namespace> --sort-by=memory
  - Expected result:
    - Top consumers identified; candidates for limit reduction or removal confirmed
  - Additional info:
    - Compare actual usage to configured requests/limits to identify oversized workloads
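kubectl top reports live usage, whereas a quota on requests.cpu is charged against the requests declared in each pod spec. To rank pods by what they actually cost the quota, a sketch along these lines may help. The function name rank_cpu_requests and the column names are hypothetical, and only plain core values and the m suffix are handled.

```shell
# rank_cpu_requests: hypothetical helper. Reads a two-column table
# (pod name, comma-separated per-container CPU requests) with a header row
# and prints "<total>m <pod>" sorted from largest to smallest.
rank_cpu_requests() {
  awk '
    # Convert a CPU quantity (e.g. 250m or 2) to millicores.
    function millicores(q) {
      if (q ~ /m$/) return substr(q, 1, length(q) - 1) + 0
      return q * 1000
    }
    NR > 1 {
      n = split($2, parts, ","); total = 0
      for (i = 1; i <= n; i++)
        if (parts[i] != "<none>") total += millicores(parts[i])
      print total "m", $1
    }
  ' | sort -rn
}
# Usage (needs cluster access):
#   kubectl get pods -n <namespace> \
#     -o custom-columns='NAME:.metadata.name,CPU_REQ:.spec.containers[*].resources.requests.cpu' \
#     | rank_cpu_requests
```

Pods near the top of this list with low usage in kubectl top are the likeliest candidates for right-sizing.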
- Reduce resource requests or limits on oversized workloads
  - Command / Action:
    - Right-size requests and limits to free up quota without removing workloads
    - kubectl set resources deployment <deployment-name> --requests=cpu=<value>,memory=<value> -n <namespace>
    - kubectl set resources deployment <deployment-name> --limits=cpu=<value>,memory=<value> -n <namespace>
  - Expected result:
    - Total requested resources in the namespace drop below the hard quota limit
  - Additional info:
    - Use actual usage from kubectl top pod as a baseline; do not set limits below actual usage
- Temporarily scale down non-critical workloads if immediate relief is needed
  - Command / Action:
    - Scale down lower-priority deployments to free up quota for critical services
    - kubectl scale deployment <deployment-name> --replicas=<n> -n <namespace>
  - Expected result:
    - Quota usage falls below the hard limit; critical workloads can resume scheduling
  - Additional info:
    - This is a short-term mitigation; follow up with a quota increase or workload optimization
- Increase the ResourceQuota limit if cluster capacity allows
  - Command / Action:
    - Edit the quota to raise the hard limit for the exhausted resource
    - kubectl edit resourcequota <quota-name> -n <namespace>
  - Expected result:
    - The hard limit is raised; pending pod creations and deployments proceed normally
  - Additional info:
    - Verify available cluster capacity before raising limits:
      kubectl describe nodes | grep -A5 "Allocated resources"
    - Coordinate with the team responsible for the namespace and cluster capacity planning
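kubectl edit is interactive, which is awkward in scripts and leaves no record of the exact change. As an alternative, the quota can be raised with a JSON merge patch. The sketch below only builds the patch document; the function name quota_patch is hypothetical, and it does not validate the resource name or the value.

```shell
# quota_patch: hypothetical helper that prints a JSON merge patch setting one
# hard limit on a ResourceQuota. Arguments: resource name, new value.
quota_patch() {
  printf '{"spec":{"hard":{"%s":"%s"}}}\n' "$1" "$2"
}
# Usage (needs cluster access and permission to patch quotas):
#   kubectl patch resourcequota <quota-name> -n <namespace> \
#     --type=merge -p "$(quota_patch requests.cpu 8)"
```

If the quota is managed through GitOps or Helm, a live patch is likely to be reverted on the next sync, so the same change should also land in the source manifest.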
- Check if a runaway autoscaler is the root cause
  - Command / Action:
    - Inspect HPA or KEDA objects that may have scaled workloads beyond the quota
    - kubectl get hpa -n <namespace>
    - kubectl describe hpa <hpa-name> -n <namespace>
  - Expected result:
    - Max replicas setting is consistent with available quota capacity
  - Additional info:
    - If the HPA max replicas exceed what the quota can accommodate, lower maxReplicas or increase the quota
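To judge whether an HPA ceiling is consistent with the quota, a rough upper bound is the quota left over after other workloads, divided by the per-pod request of the autoscaled workload. The following is a sketch of that arithmetic. The function name max_replicas_for_quota is hypothetical, and all three arguments are assumed to be integers in the same unit (for example millicores).

```shell
# max_replicas_for_quota: hypothetical helper.
# Arguments: hard limit, usage by other workloads, per-pod request.
# Prints the largest replica count that still fits within the quota.
max_replicas_for_quota() {
  hard="$1"; other="$2"; per_pod="$3"
  if [ "$per_pod" -le 0 ]; then
    echo "per-pod request must be positive" >&2
    return 1
  fi
  free=$((hard - other))
  if [ "$free" -lt 0 ]; then
    free=0
  fi
  echo $((free / per_pod))
}
# Usage: 8000m hard limit, 2000m used elsewhere, 500m per pod
#   max_replicas_for_quota 8000 2000 500
```

If the result is lower than the configured maxReplicas, the autoscaler can drive the namespace into the quota. The bound ignores surge pods created during rollouts, so some headroom is advisable.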
- Monitor quota usage after remediation
  - Command / Action:
    - Confirm the quota is no longer exceeded and track usage trend
    - kube_resourcequota{namespace="<namespace>", type="used"}
    - kubectl get resourcequota -n <namespace>
  - Expected result:
    - Used values are below hard limits; alert clears within the next evaluation cycle
  - Additional info:
    - If usage climbs back quickly, consider increasing the quota or reviewing workload sizing more broadly
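The used series alone does not show how close each resource is to its limit. A ratio of used to hard is easier to trend. The sketch below builds such a PromQL expression, assuming kube-state-metrics exposes kube_resourcequota with type labels "used" and "hard" and that the two series otherwise share the same labels. The function name quota_ratio_query and the PROM_URL variable are hypothetical.

```shell
# quota_ratio_query: hypothetical helper that prints a PromQL expression for
# used/hard per resource in one namespace. A value of 1 or more means the
# hard limit has been reached.
quota_ratio_query() {
  ns="$1"
  printf 'kube_resourcequota{namespace="%s", type="used"} / ignoring(type) kube_resourcequota{namespace="%s", type="hard"}\n' "$ns" "$ns"
}
# Usage (PROM_URL is an assumed Prometheus base URL):
#   curl -sG "$PROM_URL/api/v1/query" \
#     --data-urlencode "query=$(quota_ratio_query <namespace>)"
```

Resources whose hard limit is 0 produce an infinite ratio, so they may need to be filtered out before graphing.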
Additional resources
- Kubernetes Resource Quotas
- Managing Resources for Containers
- Kubernetes LimitRange
- Related alert: KubeQuotaAlmostFull
- Related alert: KubeCPUQuotaOvercommit
- Related alert: KubeMemoryQuotaOvercommit