Runbook: HighTransactionFailRate Alert
Alert Details
- Alert Name: HighTransactionFailRate
- Expression:
rate(pg_stat_activity_count{state="idle in transaction (aborted)", datname!~"template.*"}[5m]) >= 5
Description
This alert triggers when the rate of aborted transactions over a 5-minute interval is greater than or equal to 5. These are transactions that were aborted but whose sessions remain idle, potentially locking resources and causing performance degradation.
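To see what the alert is reacting to, the affected sessions can be counted directly in the database. This is a minimal sketch that assumes the pg_stat_activity_count metric is derived from the pg_stat_activity view by the exporter in use. The role running it needs permission to see other sessions' state (for example, membership in pg_read_all_stats).

```sql
-- Count sessions currently in the aborted state, per database.
-- The datname filter mirrors the alert's datname!~"template.*" matcher.
SELECT datname,
       count(*) AS aborted_sessions
FROM pg_stat_activity
WHERE state = 'idle in transaction (aborted)'
  AND datname !~ '^template'
GROUP BY datname
ORDER BY aborted_sessions DESC;
```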
Possible Causes
- Improper handling of transaction errors (no ROLLBACK after a failed statement)
- Connection leaks in application code
- ORM misconfiguration
- Applications crashing mid-transaction
- Long-running clients that don’t properly close connections
- Missing or too high idle_in_transaction_session_timeout
- Insufficient connection timeouts
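For the timeout-related causes above, the setting can be inspected and adjusted from a SQL session. The sketch below is illustrative only: the role name app_user and the value of 5 minutes are hypothetical and should be replaced with values appropriate to the environment.

```sql
-- Show the value in effect for the current session (0 means disabled).
SHOW idle_in_transaction_session_timeout;

-- Hypothetical role and value: sessions opened by this role after the
-- change are terminated once they sit idle in a transaction for longer
-- than 5 minutes.
ALTER ROLE app_user SET idle_in_transaction_session_timeout = '5min';
```

A per-role setting like this only affects new sessions, so existing connections keep their previous value until they reconnect.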
Troubleshooting Steps
1. Terminate Problem Sessions:
SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE state = 'idle in transaction (aborted)';
2. Investigate Source:
- Check application logs for errors
- Identify which client hosts are involved (client_addr in pg_stat_activity)
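The investigation step can be sketched as a query that groups the aborted sessions by origin, which is usually worth running before terminating anything because terminated sessions disappear from pg_stat_activity. The 10-minute threshold in the second statement is an assumed value, not part of the original runbook.

```sql
-- Group aborted sessions by client host, user, and application to find
-- the source. longest_idle is the time since the session entered this state.
SELECT client_addr,
       usename,
       application_name,
       count(*)                  AS aborted_sessions,
       max(now() - state_change) AS longest_idle
FROM pg_stat_activity
WHERE state = 'idle in transaction (aborted)'
GROUP BY client_addr, usename, application_name
ORDER BY aborted_sessions DESC;

-- Narrower variant of step 1: terminate only sessions that have been in
-- the aborted state for more than 10 minutes (assumed threshold).
SELECT pid, pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE state = 'idle in transaction (aborted)'
  AND state_change < now() - interval '10 minutes';
```

The query column of pg_stat_activity holds the last statement each session ran, so adding it to the first query's select list (without the GROUP BY) can help match sessions to the errors seen in application logs.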