Runbook: ExporterScrapeTimeLimit Alert
Alert Details
- Alert Name: ExporterScrapeTimeLimit
- Expression:
scrape_duration_seconds{job="integrations/postgres"} > 2
Description
This alert fires when scraping the PostgreSQL metrics (job integrations/postgres) takes longer than 2 seconds.
Possible Causes
- Long-running queries
- High system load on the PostgreSQL server
Troubleshooting Steps
1. Check CPU and IOPS usage of the PostgreSQL server.
An overloaded server may be slow to answer the exporter's metric queries, which lengthens the scrape.
2. Check the PostgreSQL server logs to identify long-running queries.
You may need to set log_min_duration_statement so that statements exceeding a given duration are written to the log.
3. Identify and terminate heavy queries.
SELECT pg_terminate_backend(pid), usename, datname, application_name, client_addr, client_port, state, wait_event_type, wait_event, state_change, query
FROM pg_stat_activity
WHERE pid IN (<replace_with_pids>);  -- comma-separated integer PIDs, unquoted, e.g. 12345, 12346
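
Steps 2 and 3 can be sketched in SQL as follows. This is a minimal sketch, not a prescribed procedure: it assumes superuser access on a self-managed server (ALTER SYSTEM is often unavailable on managed services), and the 1 second logging threshold and 30 second runtime cutoff are illustrative values chosen here, not values taken from this runbook.

```sql
-- Step 2: log every statement that runs longer than 1 second.
-- The '1s' threshold is an assumption; tune it to your workload.
ALTER SYSTEM SET log_min_duration_statement = '1s';

-- Apply the change without restarting the server.
SELECT pg_reload_conf();

-- Confirm the active value.
SHOW log_min_duration_statement;

-- Step 3: list active sessions whose current statement has been running
-- for more than 30 seconds (assumed cutoff), longest first.
-- The pid column gives the values to pass to pg_terminate_backend.
SELECT pid,
       usename,
       datname,
       application_name,
       state,
       now() - query_start AS runtime,
       left(query, 120)    AS query_preview
FROM pg_stat_activity
WHERE state <> 'idle'
  AND pid <> pg_backend_pid()                      -- skip this session
  AND now() - query_start > interval '30 seconds'
ORDER BY runtime DESC;
```

Review the listed sessions before terminating anything. If dropping the whole connection is too disruptive, pg_cancel_backend(pid) cancels only the running statement and leaves the session connected.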