KubeAPITerminatedRequests

Description

This alert fires when the Kubernetes API server terminates a high percentage of incoming requests before they complete. The API server terminates requests to protect itself, typically because of API Priority and Fairness (APF) limits, server overload, or resource exhaustion. Unlike transient errors, a high termination rate means the API server is under significant stress and is actively dropping requests.

Commonly affected operations:

Deployments and scaling operations
Controller reconciliation loops
Cluster autoscaler decisions
kubectl commands and API clients
Operators and custom controllers

Possible causes:

API server overload from excessive client requests
Misconfigured or misbehaving controllers making too many API calls
Insufficient API server resources (CPU, memory)
Too many concurrent long-running requests (watches, lists)
API Priority and Fairness (APF) concurrency limits being reached
Cluster scaling events triggering high API activity
Webhook timeouts causing request queuing
Network issues between API server and etcd
Burst of requests from CI/CD pipelines or automation

Severity estimation

High severity, as terminated requests directly impact cluster operations.

Medium if termination rate is low and brief
High if sustained terminations affect deployments or scaling
Critical if cluster operations are blocked or cascading failures occur

Severity increases with:

Percentage and duration of terminated requests
Number of affected clients and controllers
Whether core cluster operations are impacted

Troubleshooting steps

Check API server termination metrics
- Command / Action:
  - Query metrics to identify termination rates by resource and verb
  - kubectl get --raw /metrics | grep apiserver_request_terminations_total
- Expected result:
  - Low or zero termination counts
- additional info:
  - High counts for specific resources indicate targeted overload
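
The raw metric output is long and unsorted. As a rough sketch, the shell function below totals the counter by resource and verb so the hot spots stand out. The function name is illustrative, and the parsing assumes the standard Prometheus text format with the sample value as the last field. Counters are cumulative since the API server started, so running it twice a few minutes apart shows what is growing right now.

```shell
# Sum apiserver_request_terminations_total by resource and verb.
# Reads Prometheus text-format metrics on stdin and prints
# "<count> <resource> <verb>" lines, highest count first.
# The function name is illustrative, not part of any tool.
terminations_by_resource_verb() {
  awk '
    # Return the value of the label called name, or "-" if absent or empty.
    function label(line, name,    re, n, val) {
      re = "[{,]" name "=\"[^\"]*\""
      if (match(line, re) == 0) return "-"
      n = length(name)
      val = substr(line, RSTART + n + 3, RLENGTH - n - 4)
      return (val == "" ? "-" : val)
    }
    index($0, "apiserver_request_terminations_total{") == 1 {
      # Assumes the sample value is the last field (no timestamp).
      sum[label($0, "resource") " " label($0, "verb")] += $NF
    }
    END { for (k in sum) printf "%d %s\n", sum[k], k }
  ' | sort -rn
}

# Usage: kubectl get --raw /metrics | terminations_by_resource_verb | head
```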

Identify top API consumers
- Command / Action:
  - Find which clients are making the most requests
  - kubectl get --raw /metrics | grep apiserver_request_total | head -50
- Expected result:
  - Balanced request distribution across clients
- additional info:
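  - apiserver_request_total is labeled by verb, resource, and response code, not by client. To find user agents with disproportionately high request counts, use the API server audit logs if they are enabled, or the flow_schema label on the apiserver_flowcontrol metrics.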

Check API Priority and Fairness status
- Command / Action:
  - Review priority levels and flow schemas
  - kubectl get prioritylevelconfigurations
  - kubectl get flowschemas
- Expected result:
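  - Every flow schema references an existing priority level (the MISSINGPL column is False). Rejections and queuing do not appear in this output; they are visible only in the apiserver_flowcontrol metrics.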
- additional info:
  - APF may be limiting specific clients to protect the API server
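
On a cluster with many flow schemas, the table is easy to misread. The sketch below filters the output of kubectl get flowschemas down to the schemas that point at a missing priority level. It assumes the default table output with a MISSINGPL column; the function name is illustrative.

```shell
# Print the names of flow schemas whose MISSINGPL column is "True", that is,
# schemas that reference a priority level that does not exist.
# Reads the table printed by `kubectl get flowschemas` on stdin.
dangling_flowschemas() {
  awk '
    NR == 1 {
      # Locate the MISSINGPL column by its header instead of by position.
      for (i = 1; i <= NF; i++) if ($i == "MISSINGPL") col = i
      next
    }
    col && $col == "True" { print $1 }
  '
}

# Usage: kubectl get flowschemas | dangling_flowschemas
```

No output means every flow schema resolves to a priority level.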

Review APF metrics for throttling
- Command / Action:
  - Check flow control metrics for rejected requests
  - kubectl get --raw /metrics | grep apiserver_flowcontrol
- Expected result:
  - Low or zero rejected/queued requests
- additional info:
  - High apiserver_flowcontrol_rejected_requests_total confirms APF throttling
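
A raw rejected count is hard to judge without knowing how much traffic the priority level handled. As a minimal sketch, the function below compares apiserver_flowcontrol_rejected_requests_total with apiserver_flowcontrol_dispatched_requests_total per priority level. The function name and output layout are illustrative, and the totals are cumulative since the API server started.

```shell
# For each APF priority level, print rejected and dispatched totals and the
# rejected share of all requests seen at that level.
# Reads Prometheus text-format metrics on stdin.
apf_rejection_share() {
  awk '
    # Extract the priority_level label value from a metric line.
    function plevel(line,    a, b) {
      split(line, a, "priority_level=\"")
      split(a[2], b, "\"")
      return b[1]
    }
    index($0, "apiserver_flowcontrol_rejected_requests_total{") == 1 {
      pl = plevel($0); rej[pl] += $NF; seen[pl] = 1
    }
    index($0, "apiserver_flowcontrol_dispatched_requests_total{") == 1 {
      pl = plevel($0); disp[pl] += $NF; seen[pl] = 1
    }
    END {
      for (pl in seen) {
        total = rej[pl] + disp[pl]
        pct = (total > 0) ? 100 * rej[pl] / total : 0
        printf "%s rejected=%d dispatched=%d share=%.1f%%\n", pl, rej[pl], disp[pl], pct
      }
    }
  ' | sort
}

# Usage: kubectl get --raw /metrics | apf_rejection_share
```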

Check API server logs
- Command / Action:
  - Look for termination and throttling messages
  - kubectl logs -n kube-system -l component=kube-apiserver --tail=200
- Expected result:
  - No repeated termination or resource exhaustion errors
- additional info:
  - Messages about “request terminated” or “too many requests” indicate overload

Monitor API server resource usage
- Command / Action:
  - Check CPU and memory consumption
  - kubectl top pods -n kube-system -l component=kube-apiserver
- Expected result:
  - Resource usage within allocated limits
- additional info:
  - High CPU or memory may require scaling or resource increases

Identify misbehaving controllers or operators
- Command / Action:
  - Review recent deployments and controller activity
  - kubectl get pods -A --sort-by='.metadata.creationTimestamp' | tail -20
- Expected result:
  - No recently deployed controllers with excessive API calls
- additional info:
  - New operators or CRDs can trigger reconciliation storms

Scale down problematic clients
- Command / Action:
  - Temporarily reduce replicas of offending controllers
  - kubectl scale deployment <deployment-name> -n <namespace> --replicas=0
- Expected result:
  - Termination rate decreases
- additional info:
  - This is a temporary measure while investigating root cause
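
Because scaling to zero discards the current replica count, it helps to record that count first. The following is a sketch under that assumption; the function names and argument order are illustrative, not part of kubectl.

```shell
# Scale a deployment to zero and print the replica count it had, so the
# original size can be restored once the investigation is done.
# Arguments: <namespace> <deployment-name>
pause_deployment() {
  ns=$1
  name=$2
  replicas=$(kubectl get deployment "$name" -n "$ns" \
    -o jsonpath='{.spec.replicas}') || return 1
  echo "previous replicas for $ns/$name: $replicas"
  kubectl scale deployment "$name" -n "$ns" --replicas=0
}

# Restore a deployment to a given replica count.
# Arguments: <namespace> <deployment-name> <replicas>
resume_deployment() {
  kubectl scale deployment "$2" -n "$1" --replicas="$3"
}
```

If the deployment is managed by an autoscaler, a GitOps tool, or an operator lifecycle manager, the change may be reverted automatically, so confirm afterwards that the replica count stays at zero.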

Adjust APF configuration if needed
- Command / Action:
  - Increase limits for critical workloads
  - kubectl edit prioritylevelconfiguration
- Expected result:
  - Critical operations no longer throttled
- additional info:
  - Be cautious not to remove all protection from the API server
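
For a repeatable change, a patch is easier to review than an interactive edit. The sketch below raises the concurrency shares of one Limited priority level. It assumes the flowcontrol.apiserver.k8s.io v1 or v1beta3 API, where the field is spec.limited.nominalConcurrencyShares (older API versions call it assuredConcurrencyShares). The function name is illustrative.

```shell
# Raise the concurrency shares of one Limited priority level.
# Arguments: <priority-level-name> <shares>
raise_apf_shares() {
  name=$1
  shares=$2
  # Refuse anything that is not a whole number.
  case "$shares" in
    ''|*[!0-9]*) echo "shares must be a whole number" >&2; return 2 ;;
  esac
  # Built-in priority levels are reset by the API server unless
  # auto-update is switched off for the object.
  kubectl annotate prioritylevelconfiguration "$name" --overwrite \
    apf.kubernetes.io/autoupdate-spec=false || return 1
  # Assumes the v1 or v1beta3 field name nominalConcurrencyShares.
  kubectl patch prioritylevelconfiguration "$name" --type=merge \
    -p "{\"spec\":{\"limited\":{\"nominalConcurrencyShares\":$shares}}}"
}

# Usage: raise_apf_shares workload-low 200
```

Shares are relative weights, so raising one priority level reduces the fraction of concurrency available to the others.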

Additional resources

Kubernetes API Priority and Fairness
API Server Metrics
Debugging Kubernetes API Server
Related alert: KubeAggregatedAPIErrors
Related alert: KubeAggregatedAPIDown