On 05-07-2011 11:58, Heather Keen wrote:
Anyway, I think this is a BUG.
Xymon Version 4.3.3. Configuration as follows:
analysis.cfg:
  HOST=myhost.mydomain.com GROUP=heather PROC TESTtestTEST 1
alerts.cfg:
  HOST=* MAIL heather1 at mydomain.com RECOVERED
  GROUP=heather MAIL heather2 at mydomain.com RECOVERED
When the alert is generated, both e-mail addresses get the notification. But when the alert is cleared, only heather1 at mydomain.com gets the recovery message.
I've tried lots of different configuration options, and the only conclusion I can come to is that recovery messages to GROUPs do not work. :(
It's certainly not what you would expect - must agree with that. But solving it is not quite as easy as one would expect.
The problem is that when the PROC triggers a red status, Xymon knows that the rule was one that included a "GROUP=heather" setting. But when the recovery happens, it is because none of the rules in analysis.cfg triggered. So Xymon does not know that the green status is a recovery from a rule that contained the GROUP setting.
There is some state lost here.
To solve this, the xymond_alert module will have to keep track of the active alerts, and which GROUP settings triggered them. When the recovery happens, it will then use that list of groups that received the alert as the basis for sending out the recovered-notices.
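As a rough illustration of that idea (a Python sketch, not xymond_alert code): remember the groups when the alert fires, and reuse them when the recovery arrives. The names RULES, recipients and ActiveAlerts are made up for this example, the rule matching is heavily simplified, and "procs" is assumed to be the status column that the PROC rule feeds.

```python
from collections import defaultdict

# Hypothetical alert rules modelled on the alerts.cfg in the report:
# group=None stands for the HOST=* rule, which matches with or without a group.
RULES = [
    {"group": None, "mail": "heather1 at mydomain.com"},
    {"group": "heather", "mail": "heather2 at mydomain.com"},
]


def recipients(rules, groups):
    """Return the mail targets whose rule matches the given set of groups."""
    return [r["mail"] for r in rules if r["group"] is None or r["group"] in groups]


class ActiveAlerts:
    """Remembers which GROUP settings triggered each active alert."""

    def __init__(self):
        self._groups = defaultdict(set)  # (host, test) -> set of group names

    def alert(self, host, test, groups):
        # Record the groups at alert time, while they are still known.
        self._groups[(host, test)].update(groups)
        return recipients(RULES, self._groups[(host, test)])

    def recover(self, host, test):
        # The green status carries no group, so reuse the remembered ones.
        groups = self._groups.pop((host, test), set())
        return recipients(RULES, groups)


tracker = ActiveAlerts()
alerted = tracker.alert("myhost.mydomain.com", "procs", {"heather"})
recovered = tracker.recover("myhost.mydomain.com", "procs")
untracked = recipients(RULES, set())  # no group known: the behaviour in the report
```

Here alerted and recovered contain both addresses, while untracked contains only the HOST=* recipient, which is the behaviour Heather is seeing today.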
It can be solved, of course. Just don't be disappointed when you see 4.3.4 being released later today without a fix for this problem.
Regards, Henrik