CPUThrottlingHigh

Description

This alert fires when a container or pod experiences high CPU throttling, meaning it is frequently denied CPU time because it has reached its configured CPU limit.
High CPU throttling can cause increased latency, slow processing, timeouts, and degraded application performance, even when average CPU usage looks low.

Possible Causes:

CPU limits set too low for the workload
CPU bursts exceeding the configured limit
Increased load or traffic
Inefficient or CPU-intensive application code
Too many CPU-constrained pods on the same node
Node CPU overcommitment
Background processes competing for CPU
Misconfigured resource requests and limits

Severity estimation

Typically Medium to High, but the level depends on impact:

Low if throttling is brief and user impact is negligible
Medium if application latency or throughput is degraded
High if throttling causes timeouts, failed requests, or backlogs
Critical if throttling affects critical or user-facing services

Severity increases with:

Throttling percentage and duration
Number of affected pods
Workload criticality

Troubleshooting steps

Confirm CPU throttling
- Command / Action:
  - Check throttling metrics (Prometheus)
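  - container_cpu_cfs_throttled_seconds_total
  - container_cpu_cfs_throttled_periods_total / container_cpu_cfs_periods_total (fraction of periods throttled)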
- Expected result:
  - Throttling rate is low or near zero
- Additional info:
  - Sustained throttling confirms CPU limit pressure
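A throttling percentage is usually derived from two cAdvisor counters, container_cpu_cfs_periods_total and container_cpu_cfs_throttled_periods_total, rather than read directly. The sketch below shows that arithmetic on two counter samples. The function name and the sample values are hypothetical, and the threshold that counts as "high" depends on how the alert rule is configured in your environment.

```python
def throttled_ratio(periods_start, periods_end, throttled_start, throttled_end):
    """Fraction of CFS enforcement periods in which the container was throttled.

    Arguments are raw counter readings taken at the start and end of a window.
    """
    elapsed_periods = periods_end - periods_start
    if elapsed_periods <= 0:
        # No periods elapsed in the window (or the counter was reset).
        return 0.0
    return (throttled_end - throttled_start) / elapsed_periods


# Hypothetical counter samples taken a few minutes apart.
ratio = throttled_ratio(
    periods_start=12_000, periods_end=15_000,
    throttled_start=3_000, throttled_end=4_200,
)
print(f"throttled in {ratio:.0%} of periods")
```

A ratio that stays high across several consecutive windows points to sustained limit pressure rather than a one-off burst.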

Check CPU usage vs limits
- Command / Action:
  - Compare actual usage to limits
  - kubectl top pod <pod-name> -n <namespace>
- Expected result:
  - CPU usage comfortably below the limit
- Additional info:
  - Throttling can occur even if average usage seems low
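To illustrate why throttling can coexist with low average usage, here is a simplified model of CFS quota enforcement. It assumes the default 100 ms period and, as a simplification, does not carry unfinished work over into the next period. The demand pattern is made up for illustration.

```python
CFS_PERIOD_MS = 100.0  # default CFS enforcement period


def simulate(limit_cores, demand_ms_per_period):
    """Return (average cores used, fraction of periods throttled).

    Simplified model: in each period the container may use at most
    limit_cores * CFS_PERIOD_MS of CPU time; any demand beyond that is
    throttled and (in this sketch) simply dropped.
    """
    if not demand_ms_per_period:
        return 0.0, 0.0
    quota_ms = limit_cores * CFS_PERIOD_MS
    used_ms = 0.0
    throttled_periods = 0
    for demand_ms in demand_ms_per_period:
        used_ms += min(demand_ms, quota_ms)
        if demand_ms > quota_ms:
            throttled_periods += 1
    n = len(demand_ms_per_period)
    return used_ms / (n * CFS_PERIOD_MS), throttled_periods / n


# Hypothetical bursty workload: every other period, 4 threads each want
# 20 ms of CPU (80 ms total); the periods in between need only 5 ms.
demand = [80.0, 5.0] * 5
avg_cores, throttled_fraction = simulate(limit_cores=0.5, demand_ms_per_period=demand)
print(f"average usage: {avg_cores:.3f} cores, throttled periods: {throttled_fraction:.0%}")
```

In this example the container averages well under its 500m limit, yet half of its periods are throttled, because multi-threaded bursts exhaust the 50 ms quota early in the period.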

Inspect resource configuration
- Command / Action:
  - Review requests and limits
  - kubectl describe pod <pod-name> -n <namespace>
- Expected result:
  - CPU limits align with workload needs
- Additional info:
  - Tight limits increase throttling risk
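When reading the limits shown by kubectl describe, it can help to translate them into the CPU time the container actually gets per enforcement period. The sketch below assumes the default 100 ms CFS period and handles only the two common quantity forms (plain cores such as "2" or "1.5", and millicores such as "500m"). The helper names are hypothetical.

```python
def cpu_to_millicores(quantity: str) -> int:
    """Convert a Kubernetes CPU quantity such as "500m" or "1.5" to millicores."""
    quantity = quantity.strip()
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def cfs_quota_us(limit: str, period_us: int = 100_000) -> int:
    """CPU time (microseconds) the limit allows per CFS period."""
    return cpu_to_millicores(limit) * period_us // 1000


for limit in ("250m", "500m", "1.5", "2"):
    print(f"limit {limit:>5} -> {cfs_quota_us(limit)} us of CPU per 100 ms period")
```

A limit of 500m, for instance, means 50 ms of CPU time per 100 ms period, shared by all threads in the container.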

Check node CPU pressure
- Command / Action:
  - Inspect node resource usage
  - kubectl describe node <node-name>
- Expected result:
  - Node has available CPU capacity
- Additional info:
  - CPU overcommitment amplifies throttling
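The "Allocated resources" section of kubectl describe node reports CPU requests and limits as a share of allocatable CPU. The same arithmetic can be sketched as follows; the node size and pod values are hypothetical, and a limits ratio above 1.0 indicates that limits are overcommitted on that node.

```python
def cpu_commitment(allocatable_m, pods):
    """Return (requests ratio, limits ratio) relative to allocatable CPU.

    allocatable_m: node allocatable CPU in millicores.
    pods: list of dicts with "request_m" and "limit_m" in millicores.
    """
    total_requests = sum(p["request_m"] for p in pods)
    total_limits = sum(p["limit_m"] for p in pods)
    return total_requests / allocatable_m, total_limits / allocatable_m


# Hypothetical node with 4 allocatable cores and three pods.
pods = [
    {"request_m": 500, "limit_m": 1000},
    {"request_m": 1000, "limit_m": 2000},
    {"request_m": 1500, "limit_m": 3000},
]
requests_ratio, limits_ratio = cpu_commitment(4000, pods)
print(f"requests: {requests_ratio:.0%} of allocatable, limits: {limits_ratio:.0%}")
```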

Review application behavior
- Command / Action:
  - Identify CPU-intensive operations
  - Review application metrics and profiling
- Expected result:
  - CPU usage patterns match expectations
- Additional info:
  - Busy loops or inefficient algorithms can cause CPU bursts

Increase CPU limits (if appropriate)
- Command / Action:
  - Adjust CPU limits to reduce throttling
  - kubectl set resources deployment <deployment-name> --limits=cpu=<value> -n <namespace>
- Expected result:
  - Throttling rate decreases
- Additional info:
  - Ensure node capacity can support higher limits

Adjust CPU requests
- Command / Action:
  - Increase CPU requests to improve scheduling
  - kubectl set resources deployment <deployment-name> --requests=cpu=<value> -n <namespace>
- Expected result:
  - Pod is scheduled on nodes with sufficient CPU
- Additional info:
  - Requests affect placement; limits affect throttling

Scale the workload
- Command / Action:
  - Distribute load across more replicas
  - kubectl scale deployment <deployment-name> --replicas=<n> -n <namespace>
- Expected result:
  - CPU load and throttling per pod decrease
- Additional info:
  - Horizontal scaling often reduces throttling
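One way to pick a value for the replica count is the proportional rule that the Horizontal Pod Autoscaler documents: scale the current replica count by the ratio of observed to target per-pod usage and round up. The sketch below applies it to hypothetical numbers; the target per-pod usage is an assumption you would choose to leave headroom below the CPU limit.

```python
import math


def desired_replicas(current_replicas, current_cpu_m_per_pod, target_cpu_m_per_pod):
    """Replica count that brings average per-pod CPU down to the target.

    Assumes load spreads evenly across replicas, which may not hold for
    every workload.
    """
    if target_cpu_m_per_pod <= 0:
        raise ValueError("target per-pod CPU must be positive")
    return math.ceil(current_replicas * current_cpu_m_per_pod / target_cpu_m_per_pod)


# Hypothetical: 3 replicas averaging 480m each, with a 300m per-pod target.
replicas = desired_replicas(3, 480, 300)
print(f"scale to {replicas} replicas")
```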
