KubeletCertificateExpiration
Description
A kubelet certificate used for node authentication to the Kubernetes API server is expiring soon or has expired.
Kubelet certificates authenticate each node to the Kubernetes control plane and secure communication between the kubelet and API server. When these certificates expire, the affected node will lose the ability to communicate with the API server, preventing pod management, status updates, and other critical node operations.
Possible Causes:
- Kubelet certificate renewal process not configured or not working
- Automatic certificate rotation disabled or misconfigured
- Certificate Signing Requests (CSRs) not being approved automatically
- Kubelet unable to request new certificates from the API server
- Clock skew on the node causing premature expiration detection
- Manual certificates that were not renewed in time
- Certificate rotation controller not running or failing
- Insufficient permissions for kubelet to create CSRs
Severity estimation
High to Critical severity, depending on timing and number of affected nodes.
- High if the certificate is expiring within 30 days on one or few nodes
- Critical if the certificate is expiring within 7 days
- Critical if the certificate has already expired
- Critical if multiple nodes are affected simultaneously
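The time-based thresholds above can be sketched as a small shell function. This is only an illustration: `classify_cert_severity` is a hypothetical helper name, `info` is an assumed label for certificates with more than 30 days left, and the escalation for multiple affected nodes is not modeled.

```shell
# Map days-until-expiry to the severity levels listed above.
# classify_cert_severity is a hypothetical helper, not part of any tool.
classify_cert_severity() {
  if [ "$1" -le 7 ]; then
    echo "critical"   # within 7 days, or already expired (zero or negative)
  elif [ "$1" -le 30 ]; then
    echo "high"       # within 30 days
  else
    echo "info"       # assumed label: more than 30 days remaining
  fi
}
```

For example, `classify_cert_severity 12` prints `high`.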
Impact assessment:
- Expired certificates cause immediate node communication failures
- Node cannot report status or receive pod scheduling instructions
- Existing pods continue running but cannot be managed
- New pods cannot be scheduled to affected nodes
- Node may appear as NotReady in the cluster
- Metrics and logs from the node may become unavailable
Troubleshooting steps
- Identify which nodes have expiring certificates
  - Command / Action:
    - Check alert labels to identify affected nodes
    - Review certificate expiration across all nodes
    - kubectl get nodes
    - Check node conditions for certificate issues
  - Expected result:
    - Identification of nodes with certificates expiring soon or expired
    - Nodes should be in Ready state
  - Additional info:
    - The alert should provide details about which node is affected
    - Focus on certificates with less than 30 days remaining
- Check kubelet certificate details on the affected node
  - Command / Action:
    - Examine kubelet certificate files on the node
    - SSH to the affected node
    - openssl x509 -in /var/lib/kubelet/pki/kubelet-client-current.pem -noout -dates
    - openssl x509 -in /var/lib/kubelet/pki/kubelet-client-current.pem -noout -text | grep -A2 Validity
  - Expected result:
    - Certificate expiration date and time displayed
    - Confirmation of which certificates need renewal
  - Additional info:
    - Kubelet certificates are typically in /var/lib/kubelet/pki/
    - kubelet-client-current.pem is the active client certificate
    - Also check kubelet-server-current.pem for the server certificate
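To turn the expiration date into a number of days remaining, something like the following could be run on the node. It is a sketch under assumptions: `days_until` is a hypothetical helper, it relies on GNU `date -d` (common on Linux nodes, not available everywhere), and the certificate path is the one used in the step above.

```shell
# days_until: whole days between "now" and an openssl-style notAfter timestamp.
# Hypothetical helper; assumes GNU date. The result is truncated toward zero,
# and negative values mean the certificate has already expired.
#   $1  timestamp such as "Mar 10 12:00:00 2030 GMT"
#   $2  optional "now" in epoch seconds (defaults to the current time)
days_until() {
  end_epoch=$(date -u -d "$1" +%s) || return 1
  now_epoch=${2:-$(date -u +%s)}
  echo $(( (end_epoch - now_epoch) / 86400 ))
}

cert=/var/lib/kubelet/pki/kubelet-client-current.pem
if [ -r "$cert" ] && command -v openssl >/dev/null 2>&1; then
  # -enddate prints one line of the form "notAfter=<timestamp>"
  not_after=$(openssl x509 -in "$cert" -noout -enddate | cut -d= -f2)
  echo "$cert: $(days_until "$not_after") day(s) remaining"
fi
```

The optional second argument exists only to make the calculation reproducible; in normal use it can be omitted.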
- Verify automatic certificate rotation is enabled
  - Command / Action:
    - Check kubelet configuration for certificate rotation
    - ps aux | grep kubelet | grep rotate-certificates
    - Check the kubelet config file:
    - cat /var/lib/kubelet/config.yaml | grep -i rotate
    - systemctl cat kubelet | grep rotate
  - Expected result:
    - --rotate-certificates flag should be set to true
    - rotateCertificates: true in config file
  - Additional info:
    - Certificate rotation must be enabled for automatic renewal
    - Also check --rotate-server-certificates for server cert rotation
    - By default, the kubelet requests a new certificate at roughly 80% of the certificate lifetime (the exact point is jittered)
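The config file check can be made a little stricter than a plain `grep -i rotate`, which also matches `rotateCertificates: false`. The sketch below assumes the config path used in the step above; `rotation_enabled` is a hypothetical helper and a line-based match, not a YAML parser.

```shell
# rotation_enabled: succeed only if the given kubelet config file contains a
# line setting rotateCertificates to true. Hypothetical helper.
rotation_enabled() {
  grep -Eq '^[[:space:]]*rotateCertificates:[[:space:]]*true[[:space:]]*$' "$1"
}

config=/var/lib/kubelet/config.yaml
if [ -r "$config" ]; then
  if rotation_enabled "$config"; then
    echo "client certificate rotation is enabled in $config"
  else
    echo "rotateCertificates is not set to true in $config"
  fi
fi
```

A kubelet started with the command-line flag instead of the config file would not be detected by this check, so the `ps` and `systemctl cat` commands above remain useful.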
- Check kubelet logs for certificate rotation issues
  - Command / Action:
    - Review kubelet logs on the affected node
    - journalctl -u kubelet -n 200 | grep -i certificate
    - journalctl -u kubelet | grep -iE "rotate|renew|CSR"
  - Expected result:
    - No errors related to certificate rotation
    - Logs should show successful certificate requests and renewals
  - Additional info:
    - Look for “certificate rotation is not enabled” messages
    - Check for API server connectivity errors
    - Look for “obtained new certificate” success messages
- Review pending Certificate Signing Requests
  - Command / Action:
    - Check for pending CSRs from the affected node
    - kubectl get csr
    - kubectl get csr | grep Pending
    - kubectl get csr -o json | jq '.items[] | select(.status.conditions == null) | {name: .metadata.name, user: .spec.username}'
  - Expected result:
    - No pending CSRs from the affected node
    - If pending, they should be legitimate and approved
  - Additional info:
    - Pending CSRs indicate kubelet is requesting renewal but needs approval
    - CSRs from system:node:nodename should be reviewed and approved
    - Verify CSR signer name and requested usages
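When many CSRs are listed, it can help to narrow the output to pending requests from a single node. The following is a sketch that assumes the default table output of `kubectl get csr` (name in the first column, condition in the last, requestor in its own column); `pending_csrs_for_node` is a hypothetical helper name.

```shell
# pending_csrs_for_node: print the names of Pending CSRs requested by one node.
# Hypothetical helper; relies on the default `kubectl get csr` column layout.
#   $1  node name (matched as system:node:<name>)
pending_csrs_for_node() {
  kubectl get csr --no-headers |
    awk -v who="system:node:$1" '
      $NF == "Pending" {
        # The requestor column position varies by version, so scan the row.
        for (i = 2; i < NF; i++) if ($i == who) { print $1; break }
      }'
}
```

Calling `pending_csrs_for_node <node-name>` prints one CSR name per line, or nothing if the node has no pending requests.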
- Approve pending CSRs if legitimate
  - Command / Action:
    - Review and approve pending kubelet CSRs
    - kubectl describe csr <csr-name>
    - Verify the CSR is from the correct node and user
    - kubectl certificate approve <csr-name>
  - Expected result:
    - CSR approved successfully
    - Kubelet receives new certificate
  - Additional info:
    - Only approve CSRs from legitimate nodes
    - CSR should be from user: system:node:<nodename>
    - Check CSR creation time and requesting identity
    - For automated approval, ensure certificate signing controller is running
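The "verify, then approve" sequence can be combined so that approval is refused when the requesting user is not the expected node. This is a minimal sketch, not a full review: `approve_if_from_node` is a hypothetical helper, and it checks only the requesting user, not the signer name, usages, or creation time mentioned above.

```shell
# approve_if_from_node: approve a CSR only when its requesting user matches
# the expected node identity. Hypothetical helper.
#   $1  CSR name
#   $2  node name (expected requester is system:node:<name>)
approve_if_from_node() {
  csr=$1
  expected="system:node:$2"
  requester=$(kubectl get csr "$csr" -o jsonpath='{.spec.username}') || return 1
  if [ "$requester" != "$expected" ]; then
    echo "refusing $csr: requested by '$requester', expected '$expected'" >&2
    return 1
  fi
  kubectl certificate approve "$csr"
}
```

Usage would look like `approve_if_from_node <csr-name> <node-name>`; a mismatch prints the reason to stderr and returns a non-zero status without approving anything.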
- Check certificate signing controller
  - Command / Action:
    - Verify the controller-manager has CSR signing enabled
    - kubectl logs -n kube-system kube-controller-manager-<node> | grep certificate
    - Check for CSR approval controller logs
  - Expected result:
    - CSR signing and approval controllers are running
    - No errors in certificate controller logs
  - Additional info:
    - Controller-manager handles automatic CSR approval
    - Check --cluster-signing-cert-file and --cluster-signing-key-file flags
    - Verify --controllers includes csrsigning and csrapproving
- Verify kubelet can communicate with API server
  - Command / Action:
    - Test API server connectivity from the node
    - SSH to the affected node
    - curl -k https://<api-server>:6443/healthz
    - Check kubelet can reach API server
  - Expected result:
    - API server is reachable from the node
    - Kubelet has network connectivity to control plane
  - Additional info:
    - Network issues prevent CSR submission
    - Check firewall rules and network policies
    - Verify DNS resolution for API server
- Check kubelet client certificate permissions
  - Command / Action:
    - Verify kubelet has proper RBAC permissions
    - kubectl get clusterrolebinding | grep system:node
    - kubectl describe clusterrole system:node
  - Expected result:
    - Kubelet should have permissions to create CSRs
    - system:node role should be bound correctly
  - Additional info:
    - Kubelet needs certificatesigningrequests create permission
    - Check for any RBAC changes that might affect kubelet
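Besides reading the role definitions, the effective permission can be probed with `kubectl auth can-i` while impersonating the node identity. This is a sketch with assumptions: `node_can_create_csr` is a hypothetical helper, the node is assumed to authenticate as `system:node:<name>` in the `system:nodes` group, and the caller must itself be allowed to impersonate users and groups.

```shell
# node_can_create_csr: ask the API server whether the node identity may create
# CertificateSigningRequests. Hypothetical helper; prints "yes" or "no" and
# passes through the exit status of kubectl.
#   $1  node name
node_can_create_csr() {
  kubectl auth can-i create certificatesigningrequests.certificates.k8s.io \
    --as "system:node:$1" --as-group system:nodes
}
```

A `no` answer here points at authorization (RBAC or node authorizer configuration) rather than at networking or the signing controller.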
- Manually restart kubelet to trigger certificate rotation
  - Command / Action:
    - Restart kubelet service on the affected node
    - SSH to the node
    - systemctl restart kubelet
    - Monitor kubelet logs for certificate rotation
  - Expected result:
    - Kubelet restarts successfully
    - New CSR is created and approved
    - Certificate is renewed
  - Additional info:
    - Restarting can trigger immediate rotation attempt
    - Monitor with: journalctl -u kubelet -f
    - Verify node returns to Ready state
- Manually renew kubelet certificates (if automatic rotation fails)
  - Command / Action:
    - Use kubeadm to renew certificates (kubeadm clusters)
    - SSH to the affected node
    - kubeadm certs renew all
    - systemctl restart kubelet
  - Expected result:
    - Certificates renewed successfully
    - Kubelet authenticates with new certificates
  - Additional info:
    - This is for kubeadm-managed clusters
    - kubeadm certs renew all covers the control-plane certificates that kubeadm manages; it does not reissue the kubelet client certificate in /var/lib/kubelet/pki/, so confirm the new expiration date afterwards
    - Back up existing certificates before renewal
    - Non-kubeadm clusters may need manual certificate generation
- Verify certificate renewal and node status
  - Command / Action:
    - Confirm new certificates are in place
    - openssl x509 -in /var/lib/kubelet/pki/kubelet-client-current.pem -noout -dates
    - kubectl get nodes <node-name>
    - Check node is Ready
  - Expected result:
    - Certificate has new expiration date (typically 1 year out)
    - Node shows Ready status
    - No certificate-related errors in kubelet logs
  - Additional info:
    - Verify certificate CommonName matches node name
    - Check that node can schedule and manage pods
    - Monitor for any authentication failures
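For scripted verification, the Ready condition can be read directly instead of inspecting the table output. A minimal sketch, assuming `kubectl` access to the cluster; `node_is_ready` is a hypothetical helper name.

```shell
# node_is_ready: succeed if the node's Ready condition has status "True".
# Hypothetical helper; fails if the node cannot be fetched.
#   $1  node name
node_is_ready() {
  status=$(kubectl get node "$1" \
    -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}') || return 1
  [ "$status" = "True" ]
}
```

It can be used in a condition such as `if node_is_ready <node-name>; then echo "node is Ready"; fi`. A status of `Unknown` usually means the kubelet has stopped reporting, which is what an expired client certificate tends to produce.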
- Enable automatic CSR approval (for clusters without it)
  - Command / Action:
    - Configure automatic CSR approval for kubelet certificates
    - Deploy or verify kubelet-rubber-stamp or similar CSR approver
    - Ensure controller-manager has automatic approval enabled
  - Expected result:
    - Future CSRs are automatically approved
    - Certificates renew without manual intervention
  - Additional info:
    - Use caution with automatic approval
    - Verify CSR approval criteria are secure
    - Consider using tools like kubelet-csr-approver
    - Document the approval process and policies
Additional resources
- Certificate rotation for the kubelet
- Kubelet TLS bootstrapping
- Kubernetes PKI certificates and requirements
- Managing TLS in a cluster
- Related alert: KubeClientCertificateExpiration
- Related alert: KubeNodeNotReady