Alert Runbooks

KubeProxyAbsent

Description

Prometheus target discovery has not found kube-proxy in the past 15 minutes.

Kube-proxy is a network proxy that runs on each node in the cluster. It maintains network rules for Service networking, enabling communication to pods from inside or outside the cluster. When absent from monitoring, metrics cannot be collected from kube-proxy instances.
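
In kubernetes-mixin-style rule sets this condition is typically expressed as something like absent(up{job="kube-proxy"} == 1). To see what Prometheus currently reports for these targets, you can query the API directly (a minimal sketch, assuming the Prometheus API is reachable at localhost:9090, e.g. via port-forward, and that jq is installed):

```shell
# List the health of all kube-proxy scrape targets as Prometheus sees them.
curl -s 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=up{job="kube-proxy"}' | jq '.data.result'
```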


Possible Causes:

  • kube-proxy pods are not running on some or all nodes (DaemonSet failures, CrashLoopBackOff, scheduling problems)
  • The metrics endpoint is bound to 127.0.0.1 only, so Prometheus cannot reach it
  • The kube-proxy Service, Endpoints, or ServiceMonitor used for discovery is missing or misconfigured
  • Network policies or node firewall rules block access to the metrics port (10249)
  • kube-proxy has been intentionally replaced by a CNI feature (e.g. Cilium in kube-proxy replacement mode)

Severity estimation

High severity - kube-proxy is essential for Service networking.

Note that this alert only indicates missing metrics, not necessarily that kube-proxy is down, so the impact depends on which of the two is actually the case. If kube-proxy is truly not running, Service networking rules on the affected nodes stop being updated and traffic to Services can start to fail; if only the metrics are unavailable, you lose visibility but workloads keep running.

If using an alternative CNI that replaces kube-proxy (like Cilium in kube-proxy replacement mode), this alert may be expected and can be tuned or disabled.


Troubleshooting steps

  1. Verify Prometheus target status

    • Command / Action:
      • Check if kube-proxy appears in Prometheus targets
      • Access Prometheus UI at /targets
      • Look for kube-proxy targets
    • Expected result:
      • Targets should be present and status should be “UP”
      • Should see one target per node
    • Additional info:
      • If targets are missing from discovery, this is a service discovery issue
      • If targets show as “DOWN”, the endpoints are unreachable
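
The same check can be scripted against the Prometheus targets API instead of the UI (a sketch, again assuming a port-forward to localhost:9090):

```shell
# Show discovery state, health, and last scrape error for each kube-proxy target.
curl -s 'http://localhost:9090/api/v1/targets?state=active' \
  | jq '.data.activeTargets[]
        | select(.labels.job == "kube-proxy")
        | {instance: .labels.instance, health, lastError}'
```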

  2. Check kube-proxy DaemonSet and pod status

    • Command / Action:
      • Verify kube-proxy is running on all nodes
      • kubectl get daemonset -n kube-system kube-proxy

      • kubectl get pods -n kube-system -l k8s-app=kube-proxy

    • Expected result:
      • DaemonSet should have desired number equal to current number
      • All kube-proxy pods should be in Running state
      • kube-proxy-xxxxx 1/1 Running

    • Additional info:
      • There should be one kube-proxy pod per node
      • Check for CrashLoopBackOff or ImagePullBackOff states
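
A quick one-liner for the desired-versus-current comparison above (field names follow the standard DaemonSet status):

```shell
# Desired and ready pod counts should match (one kube-proxy pod per schedulable node).
kubectl get daemonset -n kube-system kube-proxy \
  -o jsonpath='desired={.status.desiredNumberScheduled} ready={.status.numberReady}{"\n"}'
```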

  3. Describe kube-proxy pods for errors

    • Command / Action:
      • Inspect pod events for issues
      • kubectl describe pods -n kube-system -l k8s-app=kube-proxy

    • Expected result:
      • No errors in events
      • Pods should be healthy
    • Additional info:
      • Look for scheduling failures, resource constraints, or startup errors
      • Check if pods are stuck in Pending state
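
To surface only problems instead of reading the full describe output, Warning events can be filtered directly (a sketch):

```shell
# Show only Warning events in kube-system that mention kube-proxy.
kubectl get events -n kube-system --field-selector type=Warning | grep -i kube-proxy
```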

  4. Check kube-proxy logs

    • Command / Action:
      • Review logs for errors or startup issues
      • kubectl logs -n kube-system -l k8s-app=kube-proxy --tail=50

      • kubectl logs -n kube-system kube-proxy-<xxxxx>

    • Expected result:
      • No critical errors in logs
      • Kube-proxy should be syncing iptables/ipvs rules
    • Additional info:
      • Look for API server connectivity issues
      • Check for permission errors or config problems
      • Verify network mode (iptables, ipvs, or userspace)
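
To scan all kube-proxy pods at once rather than one by one (a sketch; the keyword list is illustrative, not exhaustive):

```shell
# Grep recent logs from every kube-proxy pod for common failure keywords.
kubectl logs -n kube-system -l k8s-app=kube-proxy --tail=200 --prefix \
  | grep -iE 'error|fail|unable|denied' || echo "no obvious errors found"
```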

  5. Verify metrics endpoint accessibility

    • Command / Action:
      • Test if metrics endpoint is reachable on a node
      • kubectl get pods -n kube-system -l k8s-app=kube-proxy -o wide

      • curl http://<node-ip>:10249/metrics

      • Or exec into the pod: kubectl exec -n kube-system kube-proxy-<xxxxx> -- curl localhost:10249/metrics
    • Expected result:
      • Metrics endpoint should return Prometheus-formatted metrics
      • HTTP 200 response with metric data
    • Additional info:
      • Default metrics port is 10249
      • Check if --metrics-bind-address is set correctly
      • Port should be accessible from Prometheus
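
A scripted version of the in-pod check above (assumes, as the step does, that curl is available inside the kube-proxy image):

```shell
# Resolve one kube-proxy pod, then probe its metrics endpoint from inside it.
POD=$(kubectl get pods -n kube-system -l k8s-app=kube-proxy \
  -o jsonpath='{.items[0].metadata.name}')
kubectl exec -n kube-system "$POD" -- curl -s http://127.0.0.1:10249/metrics | head -n 5
```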

  6. Check kube-proxy configuration

    • Command / Action:
      • Review kube-proxy ConfigMap and DaemonSet config
      • kubectl get configmap -n kube-system kube-proxy -o yaml

      • kubectl get daemonset -n kube-system kube-proxy -o yaml

    • Expected result:
      • Metrics bind address should not be 127.0.0.1 only
      • metricsBindAddress should be “0.0.0.0:10249” or similar
    • Additional info:
      • If bound to localhost only, metrics won’t be scrapable
      • Check proxy mode (iptables, ipvs, kernelspace)
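
To pull out just the relevant setting (kubeadm-managed clusters keep it in the kube-proxy ConfigMap; the usual fix is kubectl edit followed by a pod restart so the change takes effect):

```shell
# 127.0.0.1:10249 or "" here means remote scraping will fail;
# it typically needs to be 0.0.0.0:10249 for Prometheus to reach it.
kubectl get configmap -n kube-system kube-proxy -o yaml | grep metricsBindAddress
```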

  7. Verify ServiceMonitor configuration (for Prometheus Operator)

    • Command / Action:
      • Check ServiceMonitor for kube-proxy
      • kubectl get servicemonitor -A | grep proxy

      • kubectl get servicemonitor -n kube-system kube-proxy -o yaml

    • Expected result:
      • ServiceMonitor exists and targets correct endpoints
      • Selector matches kube-proxy service/pods
    • Additional info:
      • ServiceMonitor tells Prometheus how to discover kube-proxy
      • Check endpoint configuration and port settings
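
To compare the ServiceMonitor's selector with the labels on the kube-proxy Service by eye (names and namespaces vary by installation; this sketch assumes kube-system/kube-proxy as in the step above):

```shell
# Print the ServiceMonitor's label selector, then the Service's labels.
kubectl get servicemonitor -n kube-system kube-proxy \
  -o jsonpath='{.spec.selector.matchLabels}{"\n"}'
kubectl get svc -n kube-system kube-proxy --show-labels
```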

  8. Check service and endpoints

    • Command / Action:
      • Verify kube-proxy service exists with endpoints
      • kubectl get svc -n kube-system kube-proxy

      • kubectl get endpoints -n kube-system kube-proxy

    • Expected result:
      • Service should exist and have endpoints for each node
      • Endpoints should list IPs from all kube-proxy pods
    • Additional info:
      • Missing endpoints mean service can’t find kube-proxy pods
      • Check service selector matches pod labels
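
Note that Kubernetes does not create a kube-proxy Service by default; it is usually added by the monitoring stack (e.g. kube-prometheus-stack). A quick way to count the endpoints behind it, which should equal the number of nodes running kube-proxy:

```shell
# List endpoint IPs behind the kube-proxy service and count them.
kubectl get endpoints -n kube-system kube-proxy \
  -o jsonpath='{range .subsets[*].addresses[*]}{.ip}{"\n"}{end}' | wc -l
```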

  9. Verify network policies and firewall rules

    • Command / Action:
      • Check for policies blocking metric scraping
      • kubectl get networkpolicies -n kube-system

      • Check node firewall rules for port 10249
    • Expected result:
      • No policies blocking Prometheus access to port 10249
      • Firewall allows traffic on metrics port
    • Additional info:
      • Network policies may block Prometheus from reaching kube-proxy
      • Verify Prometheus namespace has access to kube-system
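
On the node itself (via SSH or a node debug session), you can confirm the port is actually listening and on which address (a sketch):

```shell
# A bind address of 127.0.0.1:10249 here explains why remote scrapes fail.
ss -tlnp | grep 10249
```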

  10. Check for alternative CNI solutions

    • Command / Action:
      • Determine if kube-proxy has been replaced by CNI
      • kubectl get pods -n kube-system | grep cilium

      • Check for Cilium, Calico eBPF mode, or other kube-proxy replacements
    • Expected result:
      • If using kube-proxy replacement, this alert is expected
    • Additional info:
      • Some CNI plugins (Cilium, Calico eBPF) can replace kube-proxy
      • If intentionally not running kube-proxy, adjust or silence this alert
      • Verify Service networking still works correctly
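
For Cilium specifically, the replacement mode can be checked from an agent pod (a sketch; depending on the Cilium version the in-pod binary is cilium or cilium-dbg):

```shell
# Report whether Cilium is running with kube-proxy replacement enabled.
kubectl -n kube-system exec ds/cilium -- cilium status | grep -i kubeproxyreplacement
```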

  11. Restart kube-proxy pods if necessary

    • Command / Action:
      • Delete kube-proxy pods to force restart
      • kubectl delete pods -n kube-system -l k8s-app=kube-proxy

      • DaemonSet will recreate them automatically
    • Expected result:
      • New pods start successfully
      • Prometheus begins scraping metrics again
      • Service networking continues to function
    • Additional info:
      • Pods will be recreated one by one across nodes
      • Monitor for successful metric collection in Prometheus
      • Verify no service disruption during restart
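
A gentler alternative to deleting the pods is a rolling restart of the DaemonSet, waiting for it to converge:

```shell
# Restart kube-proxy pods one node at a time and wait for completion.
kubectl rollout restart daemonset/kube-proxy -n kube-system
kubectl rollout status daemonset/kube-proxy -n kube-system
```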

Additional resources

  • kube-proxy reference: https://kubernetes.io/docs/reference/command-line-tools-reference/kube-proxy/
  • Kubernetes Services documentation: https://kubernetes.io/docs/concepts/services-networking/service/