In <AANLkTi=OUGSQU_YvUn3temW2Pii09zPXyvUkmfjNTFbx at mail.gmail.com> Colin Coe <colin.coe at gmail.com> writes:
The alerting is starting to take shape but I've a question regarding how the alerting works. If I have a stanza similar to the following, how is it evaluated? Once for all hosts, or for one host at a time?
I understand your curiosity, but does it really matter how? In any case, it is evaluated whenever a potential alert may be generated, based on the host/service combination, time of day and all the other criteria. Think of it as a set of rules: each time there is something red or yellow, hobbitd_alert looks at this set of rules and finds those actions that match (if any).
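To make the "set of rules" picture concrete, here is a minimal Python sketch of that matching step. It is only a toy model, not hobbitd_alert's actual code: the `Rule` class and the `host_matches` and `matching_recipients` functions are hypothetical names, only the host, service and color criteria are modelled, and it assumes that a leading "%" in a HOST value marks a regular expression.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    """Hypothetical stand-in for one HOST=... rule with a single MAIL action."""
    host: str        # assumption: a leading "%" marks a regular expression
    service: str     # value of SERVICE=
    recipient: str   # target of the MAIL action


def host_matches(pattern: str, hostname: str) -> bool:
    """Match a hostname against a HOST= value (regex if "%"-prefixed)."""
    if pattern.startswith("%"):
        return re.search(pattern[1:], hostname) is not None
    return pattern == hostname


def matching_recipients(rules, hostname, service, color):
    """Return the recipients of every rule that matches one status event."""
    # Only red or yellow statuses are potential alerts in this toy model.
    if color not in ("red", "yellow"):
        return []
    return [
        rule.recipient
        for rule in rules
        if host_matches(rule.host, hostname) and rule.service == service
    ]
```

In this model the rules are consulted once per host/service event, so the question of "all hosts at once or one at a time" does not really arise: each red or yellow status is checked against the rule set on its own.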
HOST=%.*
    # Proliant tests
    MAIL sms at somecompany.com SERVICE=proliant FORMAT=SMS REPEAT=1440m
    MAIL sms at somecompany.com SERVICE=proliant FORMAT=SMS RECOVERED
Also, I've noticed that when a fault occurs I get two emails (or SMSes) and another when the fault is rectified. I'm thinking this is because of the 'RECOVERED' line, but I thought this would only trigger when the fault clears. Have I misunderstood?
I think you have. Your configuration sets up two alerting actions, but both of them send mail to the same recipient. That's why you get two messages. What you want to do is simpler:
HOST=%.*
    # Proliant tests
    MAIL sms at somecompany.com SERVICE=proliant FORMAT=SMS REPEAT=1440m RECOVERED
This will give you one message when the service goes red or yellow, and one when it recovers. "RECOVERED" is an "add-on" to the normal alert, since you probably would like to know not only when something is fixed, but also when it broke.
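The difference between the two configurations can be sketched in a few lines of Python. This is an illustrative model of the behaviour described in this thread, not the real implementation; the `Action` class and `messages_sent` function are hypothetical names. The one assumption it encodes is the one stated above: every matching MAIL action fires on an alert, and RECOVERED adds a recovery message on top of that.

```python
from dataclasses import dataclass


@dataclass
class Action:
    """Hypothetical model of one MAIL line."""
    recipient: str
    recovered: bool = False   # True if the line carries the RECOVERED keyword


def messages_sent(actions, event):
    """Return the recipients notified for an "alert" or "recovery" event."""
    if event == "alert":
        # Every matching action fires when the status goes red or yellow.
        return [a.recipient for a in actions]
    if event == "recovery":
        # Only actions flagged RECOVERED also fire when the fault clears.
        return [a.recipient for a in actions if a.recovered]
    raise ValueError(f"unknown event: {event!r}")


# The two-line configuration from the question.
original = [
    Action("sms at somecompany.com"),
    Action("sms at somecompany.com", recovered=True),
]

# The single-line configuration suggested in the reply.
simplified = [Action("sms at somecompany.com", recovered=True)]

for name, actions in (("original", original), ("simplified", simplified)):
    on_alert = len(messages_sent(actions, "alert"))
    on_recovery = len(messages_sent(actions, "recovery"))
    print(f"{name}: {on_alert} on alert, {on_recovery} on recovery")
```

Under this model the two-line setup notifies the same recipient twice on a fault and once on recovery, which matches what was observed, while the single line gives one message for each.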
Regards, Henrik