KubeAPITerminatedRequests
Description
This alert fires when the Kubernetes API server terminates a high percentage of incoming requests before they complete. The API server terminates requests in self-defense, typically because of API Priority and Fairness (APF) limits, server overload, or resource exhaustion. Unlike a handful of transient errors, a sustained termination rate means the API server is under significant stress and is actively dropping requests.
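As a rough illustration of what "percentage of terminated requests" means, the sketch below computes a ratio from a single metrics snapshot. The function name termination_ratio is hypothetical, and the formula (terminated divided by completed plus terminated) is an assumption about how such a ratio is usually defined. The counters are cumulative since API server start, so this is not the windowed rate an alert rule would evaluate.

```shell
# Approximate the share of terminated requests from one /metrics snapshot.
# Reads Prometheus text-format metrics on stdin and prints a ratio in [0, 1].
# Assumption: ratio = terminated / (completed + terminated), based on
# cumulative counters rather than a rate over a time window.
termination_ratio() {
  awk '
    /^apiserver_request_terminations_total[{]/ { terminated += $NF }
    /^apiserver_request_total[{]/              { completed  += $NF }
    END {
      total = terminated + completed
      if (total == 0) { print "0.0000"; exit }
      printf "%.4f\n", terminated / total
    }
  '
}

# Usage against a live cluster:
#   kubectl get --raw /metrics | termination_ratio
```

Comparing two snapshots taken a few minutes apart gives a better picture of the current rate than a single lifetime ratio.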
Commonly affected operations:
- Deployments and scaling operations
- Controller reconciliation loops
- Cluster autoscaler decisions
- kubectl commands and API clients
- Operators and custom controllers
Possible Causes:
- API server overload from excessive client requests
- Misconfigured or misbehaving controllers making too many API calls
- Insufficient API server resources (CPU, memory)
- Too many concurrent long-running requests (watches, lists)
- API Priority and Fairness (APF) rejecting requests that exceed its concurrency limits
- Cluster scaling events triggering high API activity
- Webhook timeouts causing request queuing
- Network issues between API server and etcd
- Burst of requests from CI/CD pipelines or automation
Severity estimation
High severity, as terminated requests directly impact cluster operations.
- Medium if termination rate is low and brief
- High if sustained terminations affect deployments or scaling
- Critical if cluster operations are blocked or cascading failures occur
Severity increases with:
- Percentage and duration of terminated requests
- Number of affected clients and controllers
- Whether core cluster operations are impacted
Troubleshooting steps
- Check API server termination metrics
  - Command / Action:
    - Query metrics to identify termination rates by resource and verb
    - kubectl get --raw /metrics | grep apiserver_request_terminations_total
  - Expected result:
    - Low or zero termination counts
  - Additional info:
    - High counts for specific resources indicate targeted overload
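The raw grep output for this step can be long, so a small aggregation helps spot the worst offenders. The following is a minimal sketch, assuming the metric carries resource, verb, and code labels. The helper name top_terminations is hypothetical.

```shell
# Sum apiserver_request_terminations_total by resource, verb, and HTTP code.
# Reads Prometheus text-format metrics on stdin and prints
# "count resource verb code", highest count first.
top_terminations() {
  awk '
    # Return the value of label name, or "-" when absent or empty.
    function label(line, name,    re, s) {
      re = "[{,]" name "=\"[^\"]*\""
      if (match(line, re) == 0) return "-"
      s = substr(line, RSTART, RLENGTH)
      sub(/^[^"]*"/, "", s)   # drop everything up to the opening quote
      sub(/"$/, "", s)        # drop the closing quote
      return (s == "") ? "-" : s
    }
    /^apiserver_request_terminations_total[{]/ {
      key = label($0, "resource") " " label($0, "verb") " " label($0, "code")
      sum[key] += $NF
    }
    END { for (k in sum) printf "%d %s\n", sum[k], k }
  ' | sort -rn
}

# Usage against a live cluster:
#   kubectl get --raw /metrics | top_terminations | head
```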
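- Identify top API consumers
  - Command / Action:
    - Find which resources and verbs account for the most requests
    - kubectl get --raw /metrics | grep apiserver_request_total | head -50
  - Expected result:
    - Request volume spread across resources and verbs, with no single series dominating
  - Additional info:
    - apiserver_request_total is labeled by verb, resource, and response code, not by client
    - To attribute load to a specific client or user agent, use API server audit logs or the flow_schema label on the apiserver_flowcontrol metrics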
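Because head -50 only shows the first 50 series rather than the busiest ones, a sorted summary can be more useful here. This sketch, with the hypothetical helper top_requests, sums apiserver_request_total by verb and resource and prints the N largest groups. It assumes those two labels are present on the metric.

```shell
# Print the N busiest (verb, resource) pairs from apiserver_request_total.
# Reads Prometheus text-format metrics on stdin; N defaults to 10.
top_requests() {
  local n="${1:-10}"
  awk '
    # Return the value of label name, or "-" when absent or empty.
    function label(line, name,    re, s) {
      re = "[{,]" name "=\"[^\"]*\""
      if (match(line, re) == 0) return "-"
      s = substr(line, RSTART, RLENGTH)
      sub(/^[^"]*"/, "", s)
      sub(/"$/, "", s)
      return (s == "") ? "-" : s
    }
    /^apiserver_request_total[{]/ {
      sum[label($0, "verb") " " label($0, "resource")] += $NF
    }
    END { for (k in sum) printf "%d %s\n", sum[k], k }
  ' | sort -rn | head -n "$n"
}

# Usage against a live cluster:
#   kubectl get --raw /metrics | top_requests 20
```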
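- Check API Priority and Fairness status
  - Command / Action:
    - Review priority levels and flow schemas
    - kubectl get prioritylevelconfigurations
    - kubectl get flowschemas
  - Expected result:
    - Expected priority levels and flow schemas are present, and no flow schema references a missing priority level (MISSINGPL column is False)
  - Additional info:
    - These objects show configuration only; live rejections appear in the apiserver_flowcontrol metrics
    - APF may be limiting specific clients to protect the API server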
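When one priority level looks saturated, it helps to see which flow schemas feed it. The sketch below is one possible way to do that with custom columns. The helper name flowschemas_for_level is hypothetical, and it assumes kubectl is already pointed at the affected cluster.

```shell
# List the flow schemas that route requests into one priority level,
# ordered by matching precedence (lower numbers are matched first).
# Output: "precedence flow-schema-name".
flowschemas_for_level() {
  local level="$1"
  kubectl get flowschemas --no-headers \
    -o custom-columns='NAME:.metadata.name,LEVEL:.spec.priorityLevelConfiguration.name,PRECEDENCE:.spec.matchingPrecedence' |
    awk -v level="$level" '$2 == level { print $3, $1 }' |
    sort -n
}

# Usage:
#   flowschemas_for_level workload-low
```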
- Review APF metrics for throttling
  - Command / Action:
    - Check flow control metrics for rejected requests
    - kubectl get --raw /metrics | grep apiserver_flowcontrol
  - Expected result:
    - Low or zero rejected/queued requests
  - Additional info:
    - High apiserver_flowcontrol_rejected_requests_total confirms APF throttling
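To see which priority level and flow schema are rejecting requests, the rejected-requests counter can be grouped by its labels. This is a minimal sketch that assumes the metric exposes priority_level, flow_schema, and reason labels. The helper name apf_rejections is hypothetical.

```shell
# Sum apiserver_flowcontrol_rejected_requests_total by priority level,
# flow schema, and rejection reason. Reads metrics text on stdin and prints
# "count priority_level flow_schema reason", highest count first.
apf_rejections() {
  awk '
    # Return the value of label name, or "-" when absent or empty.
    function label(line, name,    re, s) {
      re = "[{,]" name "=\"[^\"]*\""
      if (match(line, re) == 0) return "-"
      s = substr(line, RSTART, RLENGTH)
      sub(/^[^"]*"/, "", s)
      sub(/"$/, "", s)
      return (s == "") ? "-" : s
    }
    /^apiserver_flowcontrol_rejected_requests_total[{]/ {
      key = label($0, "priority_level") " " label($0, "flow_schema") " " label($0, "reason")
      sum[key] += $NF
    }
    END { for (k in sum) printf "%d %s\n", sum[k], k }
  ' | sort -rn
}

# Usage against a live cluster:
#   kubectl get --raw /metrics | apf_rejections
```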
- Check API server logs
  - Command / Action:
    - Look for termination and throttling messages
    - kubectl logs -n kube-system -l component=kube-apiserver --tail=200
  - Expected result:
    - No repeated termination or resource exhaustion errors
  - Additional info:
    - Messages about "request terminated" or "too many requests" indicate overload
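Counting the two message types mentioned above gives a quick sense of scale. The sketch below is a simple case-insensitive counter. The helper name count_overload_lines is hypothetical, and the exact log wording varies between Kubernetes versions, so the two phrases are taken from this runbook rather than from a guaranteed log format.

```shell
# Count log lines mentioning the overload phrases named in this runbook.
# Reads log text on stdin; matching is case-insensitive.
count_overload_lines() {
  awk '
    { line = tolower($0) }
    index(line, "request terminated") { terminated++ }
    index(line, "too many requests")  { throttled++ }
    END {
      printf "request terminated: %d\n", terminated
      printf "too many requests: %d\n", throttled
    }
  '
}

# Usage against a live cluster:
#   kubectl logs -n kube-system -l component=kube-apiserver --tail=200 | count_overload_lines
```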
- Monitor API server resource usage
  - Command / Action:
    - Check CPU and memory consumption
    - kubectl top pods -n kube-system -l component=kube-apiserver
  - Expected result:
    - Resource usage within allocated limits
  - Additional info:
    - High CPU or memory may require scaling or resource increases
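On clusters with several API server replicas, a filter can highlight only the pods above a CPU threshold. This is a sketch under the assumption that kubectl top prints NAME, CPU, and MEMORY columns with CPU in millicores (for example 1850m). The helper name flag_hot_pods and the default limit of 1000 millicores are illustrative choices, not recommendations.

```shell
# Print pods whose CPU usage meets or exceeds a millicore limit.
# Reads "kubectl top pods" output (NAME CPU MEMORY) on stdin.
flag_hot_pods() {
  awk -v limit="${1:-1000}" '
    $1 == "NAME" { next }                      # skip the header row
    NF >= 3 {
      cpu = $2
      if (cpu ~ /m$/) { sub(/m$/, "", cpu); millicores = cpu + 0 }
      else            { millicores = cpu * 1000 }   # plain cores, e.g. "2"
      if (millicores >= limit) print $1, $2, $3
    }
  '
}

# Usage against a live cluster:
#   kubectl top pods -n kube-system -l component=kube-apiserver | flag_hot_pods 1500
```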
- Identify misbehaving controllers or operators
  - Command / Action:
    - Review recent deployments and controller activity
    - kubectl get pods -A --sort-by='.metadata.creationTimestamp' | tail -20
  - Expected result:
    - No recently deployed controllers with excessive API calls
  - Additional info:
    - New operators or CRDs can trigger reconciliation storms
- Scale down problematic clients
  - Command / Action:
    - Temporarily reduce replicas of offending controllers
    - kubectl scale deployment <deployment-name> -n <namespace> --replicas=0
  - Expected result:
    - Termination rate decreases
  - Additional info:
    - This is a temporary measure while investigating root cause
    - Note the current replica count before scaling down so it can be restored afterwards
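Since scaling to zero is meant to be temporary, it is worth recording the previous replica count first. The sketch below wraps the scale command and prints a restore command before acting. The helper name scale_down_recording and the namespace and deployment names in the usage line are hypothetical.

```shell
# Scale a deployment to zero, printing the command needed to restore it.
# Arguments: namespace, deployment name.
scale_down_recording() {
  local ns="$1" name="$2" replicas
  # Read the currently configured replica count before changing anything.
  replicas=$(kubectl get deployment "$name" -n "$ns" -o jsonpath='{.spec.replicas}') || return 1
  echo "restore with: kubectl scale deployment $name -n $ns --replicas=$replicas"
  kubectl scale deployment "$name" -n "$ns" --replicas=0
}

# Usage (names are placeholders):
#   scale_down_recording example-namespace example-controller
```

Keeping the printed restore line in the incident notes avoids guessing the original replica count later.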
- Adjust APF configuration if needed
  - Command / Action:
    - Increase limits for critical workloads
    - kubectl edit prioritylevelconfiguration <name>
  - Expected result:
    - Critical operations no longer throttled
  - Additional info:
    - Be cautious not to remove all protection from the API server
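As a non-interactive alternative to kubectl edit, a merge patch can raise the concurrency share of one priority level. This sketch assumes the cluster serves the flowcontrol.apiserver.k8s.io v1 or v1beta3 API, where the field is spec.limited.nominalConcurrencyShares. Older API versions used a different field name. The helper name set_concurrency_shares and the example value are hypothetical.

```shell
# Patch the nominal concurrency shares of a limited priority level.
# Arguments: priority level name, new share value (non-negative integer).
set_concurrency_shares() {
  local plc="$1" shares="$2"
  case "$shares" in
    ''|*[!0-9]*)
      echo "shares must be a non-negative integer" >&2
      return 2
      ;;
  esac
  kubectl patch prioritylevelconfiguration "$plc" --type=merge \
    -p "{\"spec\":{\"limited\":{\"nominalConcurrencyShares\":${shares}}}}"
}

# Usage (value is illustrative only):
#   set_concurrency_shares workload-high 60
```

Shares are relative weights across priority levels, so raising one level reduces the proportion available to the others.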
Additional resources
- Kubernetes API Priority and Fairness
- API Server Metrics
- Debugging Kubernetes API Server
- Related alert: KubeAggregatedAPIErrors
- Related alert: KubeAggregatedAPIDown