CPUThrottlingHigh
Description
This alert fires when a container or pod experiences high CPU throttling, meaning it is frequently prevented from using CPU time due to hitting its configured CPU limits.
High CPU throttling can increase latency, slow processing, and cause timeouts and degraded application performance, even when average CPU usage looks low, because the limit is enforced over short scheduling periods rather than on the average.
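To illustrate why throttling can coexist with low average usage, here is a minimal sketch of the CFS quota arithmetic. It assumes the common default CFS period of 100 ms and a limit expressed in millicores. The function names `quota_us` and `is_throttled` are hypothetical helpers, not part of any tool.

```shell
#!/bin/sh
# Sketch: how a CPU limit maps to a CFS quota (assumes a 100 ms default period).

# quota_us <limit_millicores> [period_us]
# Prints the CPU time (in microseconds) the container may use per period.
quota_us() {
    limit_m=$1
    period_us=${2:-100000}
    echo $(( limit_m * period_us / 1000 ))
}

# is_throttled <limit_millicores> <cpu_time_wanted_us_in_one_period>
# Prints "yes" when the demand within a single period exceeds the quota.
is_throttled() {
    quota=$(quota_us "$1")
    if [ "$2" -gt "$quota" ]; then echo yes; else echo no; fi
}

# A 500m limit allows 50 ms of CPU time per 100 ms period.
quota_us 500            # prints 50000
# A burst needing 80 ms within one period is throttled, even if the
# container is idle afterwards and its average usage stays low.
is_throttled 500 80000  # prints yes
```

A multi-threaded container can consume its quota faster than wall-clock time, which is one reason short bursts hit the limit sooner than averages suggest.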
Possible Causes:
- CPU limits set too low for the workload
- CPU bursts exceeding the configured limit
- Increased load or traffic
- Inefficient or CPU-intensive application code
- Too many CPU-constrained pods on the same node
- Node CPU overcommitment
- Background processes competing for CPU
- Misconfigured resource requests and limits
Severity estimation
Typically Medium to High, depending on impact:
- Low if throttling is brief and user impact is negligible
- Medium if application latency or throughput is degraded
- High if throttling causes timeouts, failed requests, or backlogs
- Critical if sustained throttling disrupts business-critical or user-facing services
Severity increases with:
- Throttling percentage and duration
- Number of affected pods
- Workload criticality
Troubleshooting steps
- Confirm CPU throttling
  - Command / Action:
    - Check throttling metrics (Prometheus)
    - container_cpu_cfs_throttled_seconds_total
  - Expected result:
    - Throttling rate is low or near zero
  - Additional info:
    - Sustained throttling confirms CPU limit pressure
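One way to quantify throttling is the share of CFS periods in which a container was throttled. The sketch below builds such a PromQL expression from the cAdvisor counters `container_cpu_cfs_throttled_periods_total` and `container_cpu_cfs_periods_total`. The function name, the 5m default window, and the `$PROM_URL` variable in the usage comment are assumptions rather than part of this runbook.

```shell
# throttle_ratio_query <namespace> <pod> [window]
# Prints a PromQL expression giving, per container, the fraction of CFS
# periods that were throttled over the window (0 = never, 1 = always).
throttle_ratio_query() {
    ns=$1
    pod=$2
    window=${3:-5m}
    sel="namespace=\"$ns\",pod=\"$pod\",container!=\"\""
    printf 'sum by (container) (rate(container_cpu_cfs_throttled_periods_total{%s}[%s])) / sum by (container) (rate(container_cpu_cfs_periods_total{%s}[%s]))\n' \
        "$sel" "$window" "$sel" "$window"
}

# Example usage (assumes Prometheus is reachable at $PROM_URL):
#   curl -sG "$PROM_URL/api/v1/query" \
#        --data-urlencode "query=$(throttle_ratio_query my-namespace my-pod)"
```

A ratio that stays well above zero across several windows points to sustained limit pressure rather than a one-off burst.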
- Check CPU usage vs limits
  - Command / Action:
    - Compare actual usage to limits
    - kubectl top pod <pod-name> -n <namespace>
  - Expected result:
    - CPU usage comfortably below the limit
  - Additional info:
    - Throttling can occur even if average usage seems low
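The comparison above can be sketched as a small calculation. It assumes usage is read by hand from `kubectl top pod` (for example `180m`) and the limit from the pod spec (for example `500m` or `0.5`). The helper names are hypothetical, and only the `m` suffix and plain core values are handled.

```shell
# to_millicores <quantity>
# Converts a Kubernetes CPU quantity ("250m", "1", "1.5") to millicores.
to_millicores() {
    case $1 in
        *m) echo "${1%m}" ;;
        *)  awk -v c="$1" 'BEGIN { printf "%d\n", c * 1000 + 0.5 }' ;;
    esac
}

# usage_pct <usage> <limit>
# Prints usage as a whole-number percentage of the limit (rounded down).
usage_pct() {
    u=$(to_millicores "$1")
    l=$(to_millicores "$2")
    echo $(( u * 100 / l ))
}

usage_pct 180m 500m   # prints 36
usage_pct 450m 0.5    # prints 90
```

Because `kubectl top` reports an average over a sampling window, a low percentage here does not rule out throttling during short bursts.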
- Inspect resource configuration
  - Command / Action:
    - Review requests and limits
    - kubectl describe pod <pod-name> -n <namespace>
  - Expected result:
    - CPU limits align with workload needs
  - Additional info:
    - Tight limits increase throttling risk
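When only the CPU requests and limits are of interest, a narrower query than `kubectl describe` may be easier to scan. The following is a sketch using `kubectl get` with a JSONPath template. The wrapper name `show_cpu_resources` is hypothetical.

```shell
# show_cpu_resources <pod-name> <namespace>
# Prints one line per container: name, CPU request, CPU limit (tab-separated).
# An empty field means the value is not set on that container.
show_cpu_resources() {
    kubectl get pod "$1" -n "$2" -o \
        jsonpath='{range .spec.containers[*]}{.name}{"\t"}{.resources.requests.cpu}{"\t"}{.resources.limits.cpu}{"\n"}{end}'
}

# Example usage:
#   show_cpu_resources <pod-name> <namespace>
```

A container with a request but no limit is not subject to CFS quota throttling, so it can be ruled out as the source of the alert.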
- Check node CPU pressure
  - Command / Action:
    - Inspect node resource usage
    - kubectl describe node <node-name>
  - Expected result:
    - Node has available CPU capacity
  - Additional info:
    - CPU overcommitment amplifies throttling
- Review application behavior
  - Command / Action:
    - Identify CPU-intensive operations
    - Review application metrics and profiling
  - Expected result:
    - CPU usage patterns match expectations
  - Additional info:
    - Busy loops or inefficient algorithms cause bursts
- Increase CPU limits (if appropriate)
  - Command / Action:
    - Adjust CPU limits to reduce throttling
    - kubectl set resources deployment <deployment-name> --limits=cpu=<value> -n <namespace>
  - Expected result:
    - Throttling rate decreases
  - Additional info:
    - Ensure node capacity can support higher limits
- Adjust CPU requests
  - Command / Action:
    - Increase CPU requests to improve scheduling
    - kubectl set resources deployment <deployment-name> --requests=cpu=<value> -n <namespace>
  - Expected result:
    - Pod is scheduled on nodes with sufficient CPU
  - Additional info:
    - Requests affect placement; limits affect throttling
- Scale the workload
  - Command / Action:
    - Distribute load across more replicas
    - kubectl scale deployment <deployment-name> --replicas=<n> -n <namespace>
  - Expected result:
    - CPU load and throttling per pod decrease
  - Additional info:
    - Horizontal scaling often reduces throttling
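As a rough aid for choosing `<n>`, the sketch below estimates a replica count from current per-pod CPU usage and a desired per-pod target. It assumes load spreads evenly across replicas and mirrors the ceil(current * usage / target) rule documented for the Horizontal Pod Autoscaler. The function name and the example figures are hypothetical.

```shell
# replicas_needed <current_replicas> <per_pod_usage_millicores> <target_per_pod_millicores>
# Prints ceil(current_replicas * usage / target) using integer arithmetic.
replicas_needed() {
    echo $(( ($1 * $2 + $3 - 1) / $3 ))
}

# 4 replicas at 450m each, aiming for about 300m per pod:
replicas_needed 4 450 300   # prints 6
```

The result is only a starting point; throttling metrics should be rechecked after scaling, since uneven load distribution can leave individual pods throttled.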