problem with alerting
Hi there,
I'm running XYmon 4.3.0-beta2 on the main server and XYmon 4.3.0-0.20080929 on the standby server. Both servers are running Debian 5.0.3 and XYmon was compiled from source. Now I'm experiencing a strange problem. I added some alerts to hobbit-alerts.cfg that should do the following:
if a service (cluster, conn, disk, ports, procs or eDIR) is switching to red and stays red for more than 15 minutes a mail will be sent to a defined address. No repeat.
I created a macro with the following parameters:
$NOVELL=MAIL user at domain.tld REPEAT=0 COLOR=red RECOVERED EXHOST=%-rib.* SERVICE=cluster,conn,disk,eDIR,ports,procs DURATION>15
and below that I created the alert entry:
HOST=%^fs.* $NOVELL
When I run bin/hobbitd_alert --config=etc/hobbit-alerts.cfg --dump-config it expands to:
HOST=%^fs.* MAIL user at domain.tld FORMAT=TEXT REPEAT=0 EXHOST=%-rib.* SERVICE=cluster,conn,disk,eDIR,ports,procs COLOR=red DURATION>15 RECOVERED
which seems to be ok.
Also when I look at the info page of one of the servers it looks good:
Service  Recipient               1st Delay  Stop after  Repeat  Time of Day  Colors
conn     user at domain.tld (R)  15m        -           -       -            red
disk     user at domain.tld (R)  15m        -           -       -            red
eDIR     user at domain.tld (R)  15m        -           -       -            red
ports    user at domain.tld (R)  15m        -           -       -            red
procs    user at domain.tld (R)  15m        -           -       -            red
But last night the service ports went down for over 4 hours on 1 client and XYmon sent out emails every minute, which should not happen if I understood the configuration correctly. The mails were sent from both servers, although [bbpage] is disabled on the standby server.
What am I doing wrong? I had a similar problem on another XYmon server (4.2.0); there I set REPEAT to 1 day and the problem disappeared. But that can't be the solution.
Any help would be appreciated.
Regards Torsten
In <320287131.245444.1257759574454.JavaMail.open-xchange at oxltgw01.schlund.de> "bb4 at richter-it.net" <bb4 at richter-it.net> writes:
if a service (cluster, conn, disk, ports, procs or eDIR) is switching to red and stays red for more than 15 minutes a mail will be sent to a defined address. No repeat.
I created a macro with the following parameters:
$NOVELL=MAIL user at domain.tld REPEAT=0 COLOR=red RECOVERED EXHOST=%-rib.*
[snip]
But last night the service ports went down for over 4 hours on 1 client and XYmon sent out emails every minute
The "REPEAT=0" is the culprit here. Use some large value instead of 0.
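A minimal sketch of that change, assuming the rest of the macro stays exactly as posted. The value 10080 (one week, since REPEAT is read in minutes by default) is only an illustrative choice and not a figure given in this thread:

```
# hobbit-alerts.cfg (sketch): REPEAT=0 replaced by a large interval.
# 10080 minutes = 7 days; this number is an assumption, pick whatever suits you.
$NOVELL=MAIL user at domain.tld REPEAT=10080 COLOR=red RECOVERED EXHOST=%-rib.* SERVICE=cluster,conn,disk,eDIR,ports,procs DURATION>15

HOST=%^fs.* $NOVELL
```

Running bin/hobbitd_alert --config=etc/hobbit-alerts.cfg --dump-config again can be used to check how the rule expands after the change.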
Regards, Henrik