Henrik Stoerner wrote:
On Mon, May 22, 2006 at 11:16:00AM +0200, Dominique Frise wrote:
We use the RECOVERED keyword for all recipients defined in hobbit-alerts.cfg.
We noticed a problem for hosts where alerting for a given service is excluded during a certain time window. When a problem occurs on the service outside the exclusion window, the yellow/red alarms get sent. When the problem is resolved, though, no recovered confirmation message/SMS is sent. This issue is not related to the amount of time the service was down.
Example configuration and logs:
----hobbit-alerts.cfg----
...
...
# Do not send anything for given service(s) during period of time
HOST=test3 SERVICE=http TIME=*:0305:0315
...
...
# Rules by administrator
HOST=test3
    MAIL test at example.com REPEAT=24h RECOVERED
    SCRIPT /usr/local/sendsms 0123456789 COLOR=red FORMAT=SMS REPEAT=24h RECOVERED
If I understand your configuration snippet correctly, then this is a configuration error. You shouldn't have rules with no recipients, like the first one you have shown here.
Is this a bug, or is something wrong with the exclusion specification?
Your exclusion is wrong. It should be (notice the TIME setting):
HOST=test3 TIME=*:0315:0305
    MAIL test at example.com REPEAT=24h RECOVERED
    SCRIPT /usr/local/sendsms 0123456789 COLOR=red FORMAT=SMS REPEAT=24h RECOVERED
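
To illustrate why swapping the two times turns the exclusion into an inclusion, here is a minimal Python sketch of how a window such as `*:0315:0305` could be evaluated. It is not Hobbit's actual code. It assumes that a window whose start is later than its end wraps past midnight (which is what the corrected rule implies), that both endpoints are inclusive, and it ignores the day field. The function names `parse_window` and `in_window` are hypothetical.

```python
from datetime import time


def parse_window(spec):
    """Split a 'DAYS:HHMM:HHMM' string, e.g. '*:0315:0305', into (days, start, end)."""
    days, start, end = spec.split(":")

    def to_time(hhmm):
        # 'HHMM' -> datetime.time(HH, MM)
        return time(int(hhmm[:2]), int(hhmm[2:]))

    return days, to_time(start), to_time(end)


def in_window(spec, now):
    """Return True if `now` lies inside the window (day field ignored in this sketch)."""
    _, start, end = parse_window(spec)
    if start <= end:
        # Ordinary window within a single day, endpoints assumed inclusive.
        return start <= now <= end
    # Assumption: start later than end means the window wraps past midnight.
    return now >= start or now <= end
```

Under these assumptions, `in_window("*:0315:0305", time(3, 10))` is false, so the rule and its recipients are skipped between 03:05 and 03:15, while `in_window("*:0315:0305", time(12, 0))` is true and alerts (including RECOVERED messages) go out as usual.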
Regards, Henrik
Thank you for these explanations.
Does that mean it is not possible to write a simple rule that excludes alerts for a given service on all hosts (HOST=*) during a period of time? Do we really have to write the same exclude/include rules for each host?
Dominique UNIL - University of Lausanne