git.openstreetmap.org Git - chef.git/history - cookbooks/prometheus/templates/default/alert_rules.yml.erb
2023-10-02  Tom Hughes  Add laser power alerts for switches
2023-09-07  Tom Hughes  Add a connection limit alert for apache
2023-08-21  Tom Hughes  Restrict taginfo alerts to active instances
2023-08-04  Tom Hughes  Add alerts for juniper alarms
2023-08-04  Tom Hughes  Correct scaling for junos load average alerts
2023-08-04  Tom Hughes  Change switch fan alerts to use junos exporter metrics
2023-08-04  Tom Hughes  Change switch alerts to use junos exporter metrics
2023-07-10  Tom Hughes  Improve fastly alerts to make them more useful
2023-07-03  Tom Hughes  Fix incorrect change to replication lag alerts
2023-07-03  Tom Hughes  Update postgresql exporter configuration
2023-06-20  Tom Hughes  Merge remote-tracking branch 'github/pull/590'
2023-06-20  Tom Hughes  Add alerts for uplink status
2023-06-12  Tom Hughes  Fix threshold for juniper load average alarm
2023-06-12  Tom Hughes  Base juniper CPU alarm on 5 minute load average
2023-06-11  Tom Hughes  Fix exim mail queue alerts
2023-05-25  Tom Hughes  Drop the apache low request rate alert as it's not...
2023-05-18  Tom Hughes  Update site power limit for Amsterdam to 3.5kVA
2023-05-17  Tom Hughes  Reduce sensitivity of job processing rate alert
2023-05-17  Tom Hughes  Increase alert threshold for interface transmit/receive...
2023-05-17  Tom Hughes  Reduce sensitivity of postgres idle transaction alert
2023-05-06  Tom Hughes  Avoid alerting on transient taginfo size changes during...
2023-05-03  Tom Hughes  Add an alert for postgresql transactions which have...
2023-04-23  Tom Hughes  Add some alerts for taginfo
2023-04-10  Tom Hughes  Base site power alerts on a one hour rolling average
2023-04-10  Tom Hughes  Remove site current alerts and update pdu current alerts
2023-04-07  Grant Slater  Increase alert window for site power usage alert
2023-04-06  Tom Hughes  Alert for RAID batteries that have been recharging...
2023-04-06  Tom Hughes  Add alerts for site power usage in Amsterdam and Dublin
2023-03-21  Tom Hughes  Reduce sensitivity of CPU pressure alerts
2023-03-09  Tom Hughes  Relax thresholds for packet loss reporting
2023-03-08  Tom Hughes  Scale some percentage values correctly in alerts
2023-03-08  Tom Hughes  Add a packet loss alert
2023-03-06  Tom Hughes  Add alert for failing discourse jobs
2023-03-06  Tom Hughes  Fix statuscake alerts
2023-02-28  Tom Hughes  Fix alerting for failed chef runs
2023-02-27  Tom Hughes  Fix alerting for failed chef runs
2023-02-26  Tom Hughes  Merge remote-tracking branch 'github/pull/584'
2023-02-24  Tom Hughes  Add an alert for RAID controller battery failures
2023-02-16  Tom Hughes  Reduce sensitivity of postgres replication alarms
2023-02-16  Tom Hughes  Merge remote-tracking branch 'github/pull/572'
2023-02-12  Tom Hughes  Merge remote-tracking branch 'github/pull/571'
2023-02-12  Tom Hughes  Reduce sensitivity of render rate alarm
2023-01-27  Tom Hughes  Add an alert for low render rates on tile servers
2022-12-31  Tom Hughes  Fix typo
2022-12-31  Tom Hughes  Alert if the number of SNMP PDUs returned decreases
2022-12-31  Tom Hughes  Add some mysql alerts
2022-12-17  Tom Hughes  Adjust environment temperature alarm thresholds
2022-12-11  Tom Hughes  Merge remote-tracking branch 'github/pull/550'
2022-12-10  Tom Hughes  Only alert for readonly filesystems which were recently...
2022-12-10  Tom Hughes  Merge remote-tracking branch 'github/pull/528'
2022-12-09  Tom Hughes  Adjust environment alarm thresholds
2022-12-03  Tom Hughes  Add an alert for exim being down
2022-12-03  Tom Hughes  Add some passenger alerts
2022-11-30  Tom Hughes  Add high CPU alarm for Juniper switches
2022-11-30  Tom Hughes  Revert workarounds for intermittent SNMP monitoring
2022-11-21  Tom Hughes  Make fastly healthcheck alerts operate on a per-service...
2022-11-15  Tom Hughes  Fix wildcard match for Juniper fan state
2022-11-15  Tom Hughes  Treat runningAtFullSpeed as a good state for Juniper...
2022-11-09  Tom Hughes  Fix typo
2022-11-03  Sarah Hoffmann  increase alert time for overpass database age
2022-10-30  Tom Hughes  Increase alert threshold for OSM overpass database
2022-10-26  Tom Hughes  Extend lookback for juniper alerts
2022-10-26  Tom Hughes  Add temperature/humidity/power alerts for Dublin
2022-10-26  Tom Hughes  Attempt to make juniper alarms more robust
2022-10-26  Tom Hughes  Resolve duplicate alert names
2022-10-23  Tom Hughes  Fix errors in alert rules
2022-10-23  Tom Hughes  Add alert rule for nominatim replication delay
2022-10-23  Tom Hughes  Add alert rules for overpass database age
2022-08-29  Tom Hughes  Fix error in rasdaemon alerts
2022-08-28  Tom Hughes  Add alerts for rasdaemon events
2022-08-13  Tom Hughes  Exclude nominatim databases from deadlock alerts
2022-08-03  Tom Hughes  Merge remote-tracking branch 'github/pull/514'
2022-08-02  Grant Slater  Merge remote-tracking branch 'tigerfell/pr257'
2022-07-28  Tom Hughes  Fix typo
2022-07-28  Tom Hughes  Decrease threshold for SSD wearout alerting
2022-07-28  Tom Hughes  Reduce sensitivity of fastly healthcheck alerts
2022-07-25  Tom Hughes  Add an alert for prometheus configuration errors
2022-07-25  Tom Hughes  Fix syntax error in alert rules
2022-07-25  Tom Hughes  Add alerts for degraded raid arrays and failed raid...
2022-07-24  Tom Hughes  Add an alert for failing healthchecks on the CDN
2022-07-21  Tom Hughes  Reduce sensitivity of some alerts
2022-07-21  Tom Hughes  Reduce sensitivity of alert for wireguard interface...
2022-07-21  Tom Hughes  Add sensor values to cisco alerts
2022-07-21  Tom Hughes  Add alert rules for juniper switches
2022-07-21  Tom Hughes  Add alert rules for cisco switches
2022-07-20  Tom Hughes  Fix syntax error in alert rule
2022-07-18  Tom Hughes  Add an alert for unusually low apache request rates
2022-07-18  Tom Hughes  Add a prometheus alert for hosts shown as down by Statu...
2022-06-20  Tom Hughes  Add site monitoring alerts for amsterdam
2022-05-29  Tom Hughes  Reduce sensitivity of renderd replication delay alert
2022-05-29  Tom Hughes  Only alert if the job processing rate is low for an...
2022-05-21  Tom Hughes  Increase alerting threshold for CPU pressure
2022-05-19  Tom Hughes  Add an alert for chef not running for an extended time
2022-05-19  Tom Hughes  Only alert for failed chef-client services if they...
2022-02-24  Tom Hughes  Add alert for job processing rate
2021-12-22  Tom Hughes  Add alert for high error rates on fastly
2021-12-01  Tom Hughes  Add alert for mailman queue length
2021-11-25  Tom Hughes  Add an alert for the mail queue
2021-11-21  Tom Hughes  Correct alert name
2021-11-21  Tom Hughes  Use correct metric for CPU pressure