KubeAggregatedAPIDown
Description
This alert fires when a Kubernetes aggregated API is unavailable and reports Available=False in its APIService status.
Aggregated APIs are served by extension API servers (such as metrics.k8s.io or custom APIs). When they are down, dependent Kubernetes features stop working.
Commonly impacted components:
- Metrics Server (metrics.k8s.io)
- Horizontal Pod Autoscaler (HPA)
- Custom Resource APIs
- Controllers relying on extension APIs
Possible Causes:
- Aggregated API server pods are not running or crashing
- Service backing the APIService has no ready endpoints
- TLS or CA bundle misconfiguration
- Expired or invalid certificates
- Network connectivity issues between kube-apiserver and the extension API server
- Misconfigured APIService object
- RBAC permission issues
- Failure of the node hosting the aggregated API pods
Severity estimation
Typically Medium to High, depending on how important the affected API is:
- Low if the API is optional or rarely used
- Medium if it impacts monitoring or autoscaling
- High if it affects workload scaling or controllers
- Critical if multiple aggregated APIs are unavailable
Severity increases with:
- Duration of the outage
- Number of dependent components
- Whether autoscaling or core operations are blocked
Troubleshooting steps
- Identify unavailable aggregated APIs
  - Command / Action:
    - List all APIService objects
    - kubectl get apiservice
  - Expected result:
    - AVAILABLE is True
  - Additional info:
    - Any False value indicates an unavailable aggregated API
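On clusters with many APIService objects it can help to print only the failing entries. The following is a minimal sketch, not a definitive tool: the function name unavailable_apiservices is hypothetical, and it assumes the default table layout of kubectl get apiservice, in which AVAILABLE is the third whitespace-separated column.

```shell
# Print the names of APIServices whose AVAILABLE column is not "True".
# Reads the default table output of `kubectl get apiservice` on stdin.
# Assumption: columns are NAME, SERVICE, AVAILABLE, AGE (AVAILABLE is field 3).
unavailable_apiservices() {
  awk 'NR > 1 && $3 != "True" { print $1 }'
}

# Usage against a live cluster:
#   kubectl get apiservice | unavailable_apiservices
```

An empty result means every APIService currently reports AVAILABLE as True.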
- Describe the failing APIService
  - Command / Action:
    - Inspect conditions and errors
    - kubectl describe apiservice <apiservice-name>
  - Expected result:
    - Condition Available=True
  - Additional info:
    - TLS, service, or permission errors are usually reported here
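When only the reason for the failure is needed, the Available condition can be read directly instead of scanning the full describe output. This is a sketch under stated assumptions: the function name apiservice_available_detail is hypothetical, and v1beta1.metrics.k8s.io in the usage comment is only an example name.

```shell
# Print "<reason>: <message>" from the Available condition of one APIService.
# $1: APIService name (hypothetical example: v1beta1.metrics.k8s.io)
apiservice_available_detail() {
  kubectl get apiservice "$1" \
    -o jsonpath='{.status.conditions[?(@.type=="Available")].reason}{": "}{.status.conditions[?(@.type=="Available")].message}{"\n"}'
}

# Usage against a live cluster:
#   apiservice_available_detail v1beta1.metrics.k8s.io
```

The reason and message fields usually point at the next step to take, for example a missing endpoint or a failed TLS handshake.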
- Check backing Service and Endpoints
  - Command / Action:
    - Verify Service and Endpoints exist
    - kubectl get svc -n <namespace>
    - kubectl get endpoints -n <namespace>
  - Expected result:
    - Endpoints list one or more ready pods
  - Additional info:
    - No endpoints means kube-apiserver cannot reach the API
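The namespace and Service name used in this check are recorded in the APIService spec, so they can be looked up rather than guessed. The sketch below assumes hypothetical helper names (apiservice_backend, ready_endpoint_ips) and relies on the Endpoints object listing ready pods under subsets[].addresses.

```shell
# Print "<namespace> <service-name>" for the Service behind an APIService.
# $1: APIService name
apiservice_backend() {
  kubectl get apiservice "$1" \
    -o jsonpath='{.spec.service.namespace}{" "}{.spec.service.name}{"\n"}'
}

# Print the ready endpoint IPs of a Service; empty output means none are ready.
# $1: namespace, $2: Service name
ready_endpoint_ips() {
  kubectl get endpoints "$2" -n "$1" \
    -o jsonpath='{.subsets[*].addresses[*].ip}'
}

# Usage against a live cluster (names are examples only):
#   apiservice_backend v1beta1.metrics.k8s.io
#   ready_endpoint_ips kube-system metrics-server
```

Addresses of pods that exist but are failing readiness checks appear under notReadyAddresses instead, so they are not printed by ready_endpoint_ips.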
- Inspect aggregated API pods
  - Command / Action:
    - Check pod status
    - kubectl get pods -n <namespace>
  - Expected result:
    - Pods are Running and Ready
  - Additional info:
    - Pods in CrashLoopBackOff or Pending state block API availability
- Check pod logs
  - Command / Action:
    - Review API server logs
    - kubectl logs <pod-name> -n <namespace>
  - Expected result:
    - API server starts without fatal errors
  - Additional info:
    - Certificate and RBAC errors are common causes
- Verify TLS configuration
  - Command / Action:
    - Inspect CA bundle configuration
    - kubectl get apiservice <apiservice-name> -o yaml
  - Expected result:
    - caBundle is present and valid
  - Additional info:
    - Invalid or expired certs cause API unavailability
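To check whether the CA certificate in caBundle has expired, the field can be decoded and passed to openssl. This is a sketch only: the function name ca_bundle_enddate is hypothetical, it assumes a base64 that accepts -d and an openssl binary on the PATH, and it inspects only the first certificate in the bundle. The caBundle field may legitimately be empty when the APIService sets insecureSkipTLSVerify, in which case there is nothing to decode.

```shell
# Read a base64-encoded PEM bundle on stdin and print the expiry date
# (notAfter) of its first certificate.
# Assumption: `base64 -d` and `openssl` are available locally.
ca_bundle_enddate() {
  base64 -d | openssl x509 -noout -enddate
}

# Usage against a live cluster (the APIService name is an example):
#   kubectl get apiservice v1beta1.metrics.k8s.io \
#     -o jsonpath='{.spec.caBundle}' | ca_bundle_enddate
```

A notAfter date in the past confirms an expired CA certificate. A valid date does not rule out a mismatch between the bundle and the certificate the extension API server actually presents.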
- Check node and network health
  - Command / Action:
    - Verify node readiness
    - kubectl get nodes
  - Expected result:
    - Nodes are Ready
  - Additional info:
    - Network or node issues affect aggregated APIs
- Restart or redeploy aggregated API
  - Command / Action:
    - Restart API server deployment
    - kubectl rollout restart deployment <deployment-name> -n <namespace>
  - Expected result:
    - APIService becomes Available=True
  - Additional info:
    - Fix the root cause before restarting to avoid restart loops
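After a restart, recovery can be confirmed by waiting on the Available condition rather than polling by hand. The following is a minimal sketch; the function name wait_for_apiservice and the 120s default timeout are assumptions chosen for illustration.

```shell
# Block until the APIService reports condition Available=True, or time out.
# $1: APIService name, $2: timeout (optional, assumed default of 120s)
wait_for_apiservice() {
  kubectl wait --for=condition=Available "apiservice/$1" --timeout="${2:-120s}"
}

# Usage against a live cluster (the APIService name is an example):
#   wait_for_apiservice v1beta1.metrics.k8s.io 300s
```

A non-zero exit status means the condition was not met within the timeout, which suggests the root cause is still present.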
Additional resources
- Kubernetes Aggregated API documentation
- Kubernetes APIService reference
- Metrics Server troubleshooting
- Related alert: TargetDown