Confusion on Alerts
Hi Folks -
I'm a little confused as to why something is happening in my alerts config, so I thought I would ask for some help to clear up the confusion.
I have a specific set of systems that I only want to alert on if the procs alarm has been on for 30 minutes or more, while my default is 6 minutes (1 more than the poller). So I have 2 rules for this (I have a whole lot more doing other things):
# only alarm after 30 min machines
HOST=%echo.*.domain.com SERVICE=procs
    MAIL $EMAIL_OPS DURATION>30m REPEAT=60m
    MAIL $OPS_PAGER_1 DURATION>30m REPEAT=10m COLOR=red STOP

HOST=*
    MAIL $EMAIL_OPS DURATION>6m REPEAT=60m
    MAIL $OPS_PAGER_1 DURATION>6m REPEAT=10m COLOR=red
I _THOUGHT_ that if the STOP keyword was in there, it would stop processing after that, but when I look in the web interface, I see this:
procs  bb-notice at domain.com  30m 1s  -  1h  -  purple,yellow,red
       pager at domain.com,pagerbox at domain.com (S)  30m 1s  -  10m  -  red
       bb-notice at domain.com  6m 1s  -  1h  -  purple,yellow,red
       pager at domain.com,pagerbox at domain.com  6m 1s  -  10m  -  red
The lines after the first set are red (which makes me think it's pointing out bad configuration?). What am I missing here?
Thanks! Skadz
Ryan,
Ran into the same problem. Here is a note I left in my alerts.cfg file. There may be another way to handle it, but this works for me. The problem is that while the DURATION has not yet been met, the recipient line does not match, so the STOP at the end of that line is never seen and processing falls through to the next rule.
#######################################################################################################################
Here is the proper way to handle a DURATION setting when you don't want any notification to happen
until it is in an alert condition for at least 2 minutes.
If you don't use the SCRIPT for xymon-ignore.sh it will fail to match if the duration is not met.
It will then NOT stop processing and will fall down to the default alert notifications which will be immediate.
The xymon-ignore.sh script just returns an exit code of 0. Does nothing else.
PAGE=%url/.* EXSERVICE=sslcert
MAIL admin at some_company.com DURATION>2 REPEAT=60 RECOVERED
SCRIPT /data/paging/bin/xymon-ignore.sh none FORMAT=SCRIPT STOP
#######################################################################################################################
cat /data/paging/bin/xymon-ignore.sh
#!/bin/sh
exit 0
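For illustration, here is an untested sketch of how the same workaround might look when applied to the two rules from the original post. It assumes the xymon-ignore.sh script exists at the path shown above and that the $EMAIL_OPS and $OPS_PAGER_1 macros are defined elsewhere in alerts.cfg. The idea is that the SCRIPT recipient carries no DURATION, so it should match from the first alert onward and its STOP should keep processing from reaching the HOST=* rule.

```
# Hypothetical adaptation of the workaround; not verified against a live Xymon server.
HOST=%echo.*.domain.com SERVICE=procs
    MAIL $EMAIL_OPS DURATION>30m REPEAT=60m
    MAIL $OPS_PAGER_1 DURATION>30m REPEAT=10m COLOR=red
    # No DURATION here, so this recipient matches immediately and STOP takes effect.
    SCRIPT /data/paging/bin/xymon-ignore.sh none FORMAT=SCRIPT STOP

HOST=*
    MAIL $EMAIL_OPS DURATION>6m REPEAT=60m
    MAIL $OPS_PAGER_1 DURATION>6m REPEAT=10m COLOR=red
```

If this behaves as described in the note, the web interface should no longer list the 6-minute recipients for the echo hosts' procs test.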
Larry
-----Original Message----- From: Xymon [mailto:xymon-bounces at xymon.com] On Behalf Of Ryan Skadberg Sent: Friday, August 01, 2014 1:41 PM To: xymon at xymon.com Subject: [Xymon] Confusion on Alerts
Xymon mailing list Xymon at xymon.com http://lists.xymon.com/mailman/listinfo/xymon
Thanks for the info, Larry, I see the problem now.
Is it really just Larry and me doing something like this? It would be nice to not have to hack this.
Skadz
On Fri, Aug 1, 2014 at 2:55 PM, Larry Bonham <larry at fni-stl.com> wrote: