Henrik, Tom,
My 15-minute DURATION fired. I don't think it is a coincidence that it fired at 1 day and 5 hours. I suspect the problem lies in the possible bug mentioned earlier, where specifying 15m gives you a particularly large number.
Tom Georgoulias wrote:
David Gore wrote:
So it's like nothing happens afterwards? Hopefully, I got all the relevant parts of the log file. I didn't want the posting to get too long. Any ideas?
Have you made any progress on this? I can't get the DURATION variable to work either, and this time around I'm sure a typo is not the reason for not getting an alert email.
Here's what I've done and what I see:
I added the --debug switch to hobbitd_alert in hobbitlaunch.cfg:
CMD hobbitd_channel --channel=page --log=$BBSERVERLOGS/page.log hobbitd_alert --debug
My rule from hobbit-alerts.cfg.
HOST=$FOUND_SYS MAIL broken at nandomedia.com SERVICE=procs COLOR=red DURATION>5 REPEAT=5
After I add this rule, I restart hobbit. I read on the list that restarting isn't necessary, but it has been my experience that changes made to hobbit-alerts.cfg do not always get put into effect unless hobbit is restarted.
Excerpts from page.log:
(note: I replaced a valid IP address with 0s in the 3rd field of the @@page line of this excerpt)
2005-02-02 08:11:12 hobbitd_alert: Got message 4 @@page#4|1107349872.146928|0.0.0.0|foundry01.nandomedia.com|procs|0.0.0.0|1107351672|red|red|1107227163|web6|315344
2005-02-02 08:11:12 Got page message from foundry01.nandomedia.com:procs
2005-02-02 08:11:12 Alert status changed from 0 to 1
2005-02-02 08:11:12 criteriamatch foundry01.nandomedia.com:procs %(foundry.*).nandomedia.com:(NULL):(NULL)
2005-02-02 08:11:12 pcre_exec returned 2
2005-02-02 08:11:12 Checking default color setting 70 against 5 gives 1
2005-02-02 08:11:12 Found a first matching rule
2005-02-02 08:11:12 criteriamatch foundry01.nandomedia.com:procs (NULL):(NULL):procs
2005-02-02 08:11:12 failed minduration 0<300
So it looks like the duration variable was checked, which is good. The next time I see this server in the page.log, the min duration isn't checked.
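The 300 in that last log line is consistent with DURATION>5 being converted to seconds (5 x 60) and compared against an elapsed time of 0. A minimal sketch of that comparison follows. The function name and the exact comparison are assumptions made for illustration, not hobbitd_alert's actual code.

```python
def min_duration_ok(elapsed_seconds: int, duration_minutes: int) -> bool:
    """Hypothetical check: has the event lasted at least the configured minutes?"""
    # Assumption: the rule's DURATION value is given in minutes and compared in seconds.
    threshold_seconds = duration_minutes * 60
    return elapsed_seconds >= threshold_seconds


# Mirrors "failed minduration 0<300": 0 seconds elapsed, DURATION>5 in the rule.
first_check = min_duration_ok(0, 5)
# A later check, five minutes on, would be expected to pass under this assumption.
later_check = min_duration_ok(300, 5)
```

If the real check works this way, a second evaluation five minutes later should succeed, which is why the missing minduration line in the next excerpt stands out.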
2005-02-02 08:16:12 hobbitd_alert: Got message 16 @@page#16|1107350172.517352|0.0.0.0|foundry01.nandomedia.com|procs|0.0.0.0|1107351972|red|red|1107227163|web6|315344
2005-02-02 08:16:12 Got page message from foundry01.nandomedia.com:procs
2005-02-02 08:16:12 0 alerts to go
2005-02-02 08:17:12 0 alerts to go
This message will repeat from now on, varying only in the message count #, but alerts are not sent out:
-bash-2.05b$ grep foundry data/acks/notifications.log
-bash-2.05b$
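One more thing that could be looked at: the @@page line carries several values that look like Unix epoch timestamps. Below is a small sketch for decoding them, assuming the message is simply "|"-separated. What each field means is not known from the log, so the code reports field positions only and assigns no names. The helper name epoch_fields is made up for this example.

```python
from datetime import datetime, timezone

# The first @@page message from the excerpt above, copied as logged.
PAGE_LINE = ("@@page#4|1107349872.146928|0.0.0.0|foundry01.nandomedia.com|procs|"
             "0.0.0.0|1107351672|red|red|1107227163|web6|315344")


def epoch_fields(message: str) -> list[tuple[int, datetime]]:
    """Return (field index, UTC time) for every field that looks like an epoch value."""
    found = []
    for index, field in enumerate(message.split("|")):
        try:
            value = float(field)
        except ValueError:
            continue  # hostnames, colors, masked IP addresses and so on
        if value > 1_000_000_000:  # heuristic: large enough to be a post-2001 epoch time
            found.append((index, datetime.fromtimestamp(value, tz=timezone.utc)))
    return found


for index, when in epoch_fields(PAGE_LINE):
    print(index, when.isoformat())
```

The first decoded value is 13:11:12 UTC on 2005-02-02, which lines up with the 08:11:12 log timestamp if the server clock is five hours behind UTC. Comparing the other values against the time the status actually turned red might show whether the 0 in "0<300" is plausible.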
I dunno what else to investigate at this point.
Tom