In <1110846746.16767.277.camel at localhost.localdomain> Daniel J McDonald <dan.mcdonald at austinenergy.com> writes:
I'm finally trying to get alerts working with hobbit (RC5, no post patches, Linux 2.6.8.1-24mdksmp i686). I'm getting paged on yellow.
First, check whether those are alert messages or recovery messages. In ~hobbit/data/acks/notifications.log you should see a log entry for each message you receive, like these:
Tue Mar 15 09:50:34 2005 backup-mx.post.tele.dk.smtp (195.41.53.68) henrik at hswn.dk 1110876634 725
Tue Mar 15 10:05:37 2005 backup-mx.post.tele.dk.smtp (195.41.53.68) henrik at hswn.dk 1110877537 725
Tue Mar 15 10:10:41 2005 backup-mx.post.tele.dk.smtp (195.41.53.68) henrik at hswn.dk 1110877840 725 4220
The first two are alerts, the last one is a recovery message (you can see that by the extra number "4220" which is how long the service was down).
RC5 has a known bug in the alert module, where it will send recovery messages even if you never received an alert message.
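To sort a longer notifications.log into alerts and recoveries, something like the following Python sketch might help. It is only an illustration: it assumes every entry looks like the three sample lines above (two trailing numbers for an alert, three for a recovery), and the function name classify_notification is made up for this example, not part of Hobbit.

```python
def classify_notification(line: str) -> str:
    """Classify one notifications.log entry as 'alert', 'recovery' or 'unknown'.

    Assumption (inferred only from the sample lines in this post):
    an alert entry ends in two numeric fields, a recovery entry ends
    in three, the extra one being how long the service was down.
    """
    trailing_numbers = 0
    # Walk the whitespace-separated fields from the end of the line and
    # count how many consecutive fields consist of digits only.
    for field in reversed(line.split()):
        if field.isdigit():
            trailing_numbers += 1
        else:
            break
    if trailing_numbers == 3:
        return "recovery"
    if trailing_numbers == 2:
        return "alert"
    return "unknown"


# The three sample entries quoted above.
SAMPLE_LOG = [
    "Tue Mar 15 09:50:34 2005 backup-mx.post.tele.dk.smtp (195.41.53.68) henrik at hswn.dk 1110876634 725",
    "Tue Mar 15 10:05:37 2005 backup-mx.post.tele.dk.smtp (195.41.53.68) henrik at hswn.dk 1110877537 725",
    "Tue Mar 15 10:10:41 2005 backup-mx.post.tele.dk.smtp (195.41.53.68) henrik at hswn.dk 1110877840 725 4220",
]

if __name__ == "__main__":
    for entry in SAMPLE_LOG:
        print(classify_notification(entry), "|", entry)
```

If every entry logged at the times you were paged comes out as "recovery", that points at the RC5 bug rather than at your rules.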
Here are the hobbitlaunch.cfg parameters:
Looks OK
and the paging rules:
HOST=ae-urps.aenetad.net MAIL=dan.mcdonald at austinenergy.com,barry.allen at austinenergy.com REPEAT=24h RECOVERED
HOST=%.*ups.*.austin-energy.net
    MAIL=dan.mcdonald at austinenergy.com REPEAT=2h DURATION>10m SERVICE=freq COLOR="red" RECOVERED
    MAIL=dan.mcdonald at austinenergy.com REPEAT=2h SERVICE=upsmin,upssec COLOR="red" RECOVERED
HOST=%.*probe.*.austin-energy.net MAIL=dan.mcdonald at austinenergy.com REPEAT=24h DURATION>20m COLOR="red" RECOVERED
HOST=%. MAIL=dan.mcdonald at austinenergy.com REPEAT=140h DURATION>30m RECOVERED COLOR="red" UNMATCED
I suspect your yellow alerts are recovery messages, and hence this is the known bug in RC5.
BTW, the "UNMATCED" in your last rule must be a typo for "UNMATCHED" ...
Regards, Henrik