alert script timing issue?
I have set up a SCRIPT in alerts.cfg that runs when connectivity to any monitored client goes red.
HOST=* SERVICE=conn COLOR=red SCRIPT /usr/local/xymonutil/alertscripts/noconnectivity.sh DURATION>15 REPEAT=15 RECOVERED
Inside the script, it does a few things, one of which is to get the NAME associated with the client reporting the issue.
It does this by:
STATION=$($XYMON $XYMONSERVERHOSTNAME "hostinfo host=$BBHOSTNAME" | $SED -e 's/|/\n/g' | $GREP NAME | $CUT -d: -f2)
I have seen instances where connectivity is lost to several clients at the same time, and it appears that this STATION value is getting the same value for different lost connections.
For example, if there are 2 clients, with NAME value of STATION1 and STATION2, and connectivity is lost to both clients at the same time, the SCRIPT is run two times, but each run of the script will get STATION2 as the result of the above command.
Is there anything wrong with my logic? Is there a better/different way to approach this?
Any thoughts or suggestions are appreciated. Thanks.
Kevin
On Fri, April 4, 2014 8:05 am, Kevin VerMeer wrote:
> Inside the script, it does a few things, one of which is to get the NAME associated with the client reporting the issue. It does this by:
> STATION=$($XYMON $XYMONSERVERHOSTNAME "hostinfo host=$BBHOSTNAME" | $SED -e 's/|/\n/g' | $GREP NAME | $CUT -d: -f2)
> I have seen instances where connectivity is lost to several clients at the same time, and it appears that this STATION value is getting the same value for different lost connections.
> For example, if there are 2 clients, with NAME value of STATION1 and STATION2, and connectivity is lost to both clients at the same time, the SCRIPT is run two times, but each run of the script will get STATION2 as the result of the above command.
> Is there anything wrong with my logic? Is there a better/different way to approach this?
Is one of the hostnames a substring of the other one? IIRC, the parameter sent via "host=" is done as a PCRE match. If you have a whole hostname, adding \b before or after might help.
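To illustrate the substring concern, here is a minimal sketch that uses grep -E as a stand-in for the server-side pattern match. The hostnames are hypothetical, and how Xymon itself evaluates "host=" should be checked against your version; the point is only that an unanchored pattern can select more than one host.

```shell
#!/bin/sh
# Hypothetical host list; these names are illustrative, not from the thread.
hosts="station1
station10
station2"

# Unanchored pattern: acts like a substring search, so station10 matches too.
unanchored=$(printf '%s\n' "$hosts" | grep -E 'station1')

# Anchored pattern: only the exact hostname matches.
anchored=$(printf '%s\n' "$hosts" | grep -E '^station1$')

printf 'unanchored: %s\n' "$(printf '%s' "$unanchored" | tr '\n' ' ')"
printf 'anchored:   %s\n' "$anchored"
```

If two hosts come back from one query, a later "grep NAME" sees both NAME lines, which could explain identical STATION values across separate alert runs.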
You can use "hostinfo clone=$BBHOSTNAME" to do an exact match, which will also break out the data into separate lines.
Finally, I think $XMH_RAW is available in the environment of scripts from xymond_alert itself. If so (it might have been a patch -- I forget :/ ) that would save you a callback into xymond for the data.
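A rough sketch of what that could look like, assuming XMH_RAW (when present) holds the same pipe-delimited record that the hostinfo query returns, including a NAME: field. That layout is an assumption, and extract_name and the sample record are hypothetical names made up for illustration.

```shell
#!/bin/sh
# extract_name (hypothetical helper): print the value of the NAME: field
# from a pipe-delimited record. Matching on the "NAME:" prefix avoids
# hits on other fields that merely contain the string NAME.
extract_name() {
    printf '%s\n' "$1" | tr '|' '\n' | sed -n 's/^NAME://p' | head -n 1
}

# Made-up record; the real layout of XMH_RAW is an assumption here.
sample='10.0.0.5|client1.example.com|NAME:STATION1|conn'

# Use XMH_RAW if the alert environment provides it; fall back to the
# sample only so that this sketch runs standalone.
STATION=$(extract_name "${XMH_RAW:-$sample}")
echo "STATION=$STATION"
```

Because the value comes from the environment of that one alert invocation, no pattern match against other hosts is involved.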
HTH,
-jc
participants (2)
- cleaver@terabithia.org
- KVerMeer@peoplenetonline.com