Alert Runbooks

AuditorQueueAge

Runbook: AuditorQueueAge

Alert Details

Description

This alert fires when the age of the oldest job in the audit queue exceeds its configured threshold, which indicates that audit jobs are being processed with a delay.
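
The condition described above can be sketched as a small shell check. This is only an illustration of the logic, not the alert's actual rule: the function names max_job_age and queue_age_exceeded are hypothetical, the timestamps are made-up epoch seconds, and the 300-second threshold is a placeholder borrowed from the sample config later in this runbook (treating that value as seconds is an assumption).

```shell
#!/bin/sh
# Hypothetical helper: print the age, in seconds, of the oldest job.
# Usage: max_job_age NOW_EPOCH ENQUEUE_EPOCH...
max_job_age() {
    now=$1
    shift
    max=0
    for ts in "$@"; do
        age=$((now - ts))
        if [ "$age" -gt "$max" ]; then
            max=$age
        fi
    done
    echo "$max"
}

# Hypothetical check: returns 0 when the maximum age is above the threshold,
# i.e. when the alert condition would hold.
# Usage: queue_age_exceeded MAX_AGE_SECONDS THRESHOLD_SECONDS
queue_age_exceeded() {
    [ "$1" -gt "$2" ]
}
```

For example, with the current time at 1000 and jobs enqueued at 900, 400, and 950, max_job_age 1000 900 400 950 prints 600, and queue_age_exceeded 600 300 succeeds because 600 is greater than the assumed threshold of 300.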

Possible Causes

Troubleshooting Steps

  1. Check Audit Queue Length

    • Command: curl -s http://audit-service:port/metrics | grep pull_audit_new_jobs_queue (replace port with the metrics port of the audit service)
    • Expected Output: The current value of the audit queue length metric.
    • Example:
      $ curl -s http://audit-service:port/metrics | grep pull_audit_new_jobs_queue
      pull_audit_new_jobs_queue{environment="production"} 150
  2. Check Audit Service Status

    • Command: systemctl status audit-service
    • Expected Output: The status of the audit service. Look for “active (running)”.
    • Example:
      $ systemctl status audit-service
      ● audit-service.service - Audit Service
         Loaded: loaded (/etc/systemd/system/audit-service.service; enabled; vendor preset: enabled)
         Active: active (running) since Wed 2024-11-13 14:00:00 UTC; 19min ago
  3. Restart Audit Service

    • Command: sudo systemctl restart audit-service
    • Expected Output: No output on success. Run systemctl status audit-service afterwards to confirm that the service is “active (running)”.
    • Example:
      $ sudo systemctl restart audit-service
  4. Check Network Connectivity

    • Command: ping -c 4 audit-service-hostname
    • Expected Output: Successful ping responses.
    • Example:
      $ ping -c 4 audit-service-hostname
      PING audit-service-hostname (192.168.1.2) 56(84) bytes of data.
      64 bytes from audit-service-hostname: icmp_seq=1 ttl=64 time=0.123 ms
      64 bytes from audit-service-hostname: icmp_seq=2 ttl=64 time=0.124 ms
      64 bytes from audit-service-hostname: icmp_seq=3 ttl=64 time=0.125 ms
      64 bytes from audit-service-hostname: icmp_seq=4 ttl=64 time=0.126 ms
  5. Verify Audit Queue Configuration

    • Command: cat /etc/audit-service/config.yml
    • Expected Output: The contents of the configuration file. Confirm that settings such as queue_name and max_queue_age hold the expected values.
    • Example:
      $ cat /etc/audit-service/config.yml
      queue_name: 'audit-queue'
      max_queue_age: 300
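
The queue length check in step 1 can be scripted so that the value is compared against a limit rather than read by eye. The following is a minimal sketch under stated assumptions: the function names queue_length and check_queue_length are hypothetical, the limit passed in is a placeholder you would choose yourself, and the metric value is assumed to be a plain integer on a line with no trailing timestamp.

```shell
#!/bin/sh
# Read Prometheus-style metrics text on stdin and print the value of the
# first pull_audit_new_jobs_queue sample. Comment lines (# HELP, # TYPE)
# are skipped because they do not start with the metric name.
queue_length() {
    awk '/^pull_audit_new_jobs_queue[{ ]/ { print $NF; exit }'
}

# Hypothetical triage helper: compare a queue length against a limit.
# Assumes both arguments are integers. Returns 1 when the limit is exceeded.
# Usage: check_queue_length LENGTH LIMIT
check_queue_length() {
    length=$1
    limit=$2
    if [ "$length" -gt "$limit" ]; then
        echo "WARN: queue length $length exceeds $limit"
        return 1
    fi
    echo "OK: queue length $length"
}
```

A possible invocation, reusing the command from step 1, is check_queue_length "$(curl -s http://audit-service:port/metrics | queue_length)" 1000, where 1000 is an arbitrary example limit.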

Additional Steps