Henrik> I assume that both monitor3 and monitor5 are running the network tests
Henrik> (the [xymonnet] task), and they are the same version?
Yes to both questions:
[xymon at monitor5 ~]$ server/bin/xymoncmd xymonnet --version
2011-08-09 10:28:49 Using default environment file /home/xymon/server/etc/xymonserver.cfg
xymonnet version 4.3.2
SSL library : OpenSSL 0.9.8e-rhel5 01 Jul 2008
LDAP library: OpenLDAP 20343
[xymon at monitor5 ~]$
[xymon at monitor3 ~]$ server/bin/xymoncmd xymonnet --version
2011-08-09 10:27:51 Using default environment file /home/xymon/server/etc/xymonserver.cfg
xymonnet version 4.3.2
SSL library : OpenSSL 0.9.8e-rhel5 01 Jul 2008
LDAP library: OpenLDAP 20343
[xymon at monitor3 ~]$
Henrik> Could you provide the output from running
Henrik>
Henrik>   xymoncmd xymonnet --no-update --debug camilla
Henrik>
Henrik> on the two Xymon servers when it fails?
Yes, I set up a test host (tesla) to simulate the problem (delay + bad HTTP status) that I had with the production host (camilla) and ran:
[xymon at monitor3 ~]$ server/bin/xymoncmd xymonnet --no-update --debug tesla
...on both monitoring servers and attached the results and the '/home/xymon/data/hist/tesla' contents.
The hosts.cfg entries are:
192.168.0.2 tesla # ssh http://tesla httpstatus=dbB;http://tesla/cgi/dbB.pl;200; httpstatus=dbC;http://tesla/cgi/dbC.pl;200;
The only difference that I know of between the monitoring servers is that monitor5 has an 'analysis.cfg' with:
HOST=tesla DS http %.*http.*:sec >0.660 COLOR=yellow TEXT="Time exceeds &U at &V seconds."
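As a rough illustration of what this rule expresses (not Xymon's actual implementation), the following Python sketch checks a response time against the 0.660 upper limit and fills in the TEXT template. It assumes that &U expands to the upper limit and &V to the measured value, as the wording of the rule suggests. The function name `evaluate_ds_rule`, its parameters, the sample response times and the three-decimal formatting are all hypothetical.

```python
from typing import Optional, Tuple


def evaluate_ds_rule(value: float, upper: float, color: str,
                     text: str) -> Optional[Tuple[str, str]]:
    """Return (color, message) when value exceeds the upper limit, else None.

    Hypothetical model of a DS rule with a single ">limit" condition.
    """
    if value > upper:
        # Assumed expansion: &U is the upper limit, &V the measured value.
        message = (text.replace("&U", f"{upper:.3f}")
                       .replace("&V", f"{value:.3f}"))
        return color, message
    return None


rule_text = "Time exceeds &U at &V seconds."

# Made-up response times, one above and one below the 0.660 limit.
print(evaluate_ds_rule(0.912, 0.660, "yellow", rule_text))
print(evaluate_ds_rule(0.250, 0.660, "yellow", rule_text))
```

Note that the rule only looks at the response time; it knows nothing about the HTTP status code returned by the server.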
With the above analysis.cfg directive on monitor5, the status is reported as yellow; without it, the status is reported as red.
It looks like the analysis.cfg entry causes Xymon to mask the red with the yellow.
Any help understanding this would be appreciated.
- Troy
[debug output provided]
OK, *now* I understand what You were talking about with "analysis.cfg". I had not noticed that You were using a "DS" test for response-times.
There's some bad news and some good news. Both of them are that Xymon works as designed :-/ The DS checks can increase the severity of a status - e.g. from green to yellow - or they can decrease it, which is what happens in this case (from red to yellow).
I can see what it is You want to achieve, and Xymon obviously does not behave the way one would expect in this case. But I am not sure what the correct solution is.
An easy fix would be to say that the "DS" rules can only increase the severity of a status, not decrease it. But I can quickly come up with a couple of scenarios where I want the severity to be reduced.
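The trade-off can be made concrete with a small Python sketch. This is a simplified model of the behaviour described in this thread, not Xymon's code: the function names `ds_overrides` and `ds_only_escalates` and the three-level severity table are assumptions made for illustration.

```python
from typing import Optional

# Assumed ordering of status colors, lowest to highest severity.
SEVERITY = {"green": 0, "yellow": 1, "red": 2}


def ds_overrides(test_color: str, ds_color: Optional[str]) -> str:
    """Model of the reported behaviour: a matching DS rule sets the color,
    even when that lowers the severity."""
    return ds_color if ds_color is not None else test_color


def ds_only_escalates(test_color: str, ds_color: Optional[str]) -> str:
    """Model of the 'easy fix': a DS rule may raise the severity,
    but never lower it."""
    if ds_color is None:
        return test_color
    return max(test_color, ds_color, key=SEVERITY.__getitem__)


# Troy's case: the HTTP test is red (bad status), the DS rule says yellow.
print(ds_overrides("red", "yellow"))         # red is masked by yellow
print(ds_only_escalates("red", "yellow"))    # red is kept
print(ds_only_escalates("green", "yellow"))  # green is still raised to yellow
```

Under the second policy a DS rule could no longer be used to reduce a severity on purpose, which is the drawback mentioned above.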
The ideal solution would be a more advanced method of giving priorities to the different information Xymon has: Whether the network test actually worked, what the response time is, what data was returned from the server ... that can get quite complicated.
I'll have to think about how to solve this.
Regards, Henrik
Henrik,
Henrik> Could you provide the output from running
Henrik>
Henrik>   xymoncmd xymonnet --no-update --debug camilla
Henrik>
Henrik> on the two Xymon servers when it fails?

Could you add to the xymonnet man page that it is possible to debug the test for a single host? This is a very useful capability I've looked for previously.
Thanks, David.
-- David Baldwin - Assistant Director, Infrastructure (acting) Information and Communication Technology Services Australian Sports Commission http://ausport.gov.au Tel 02 62147830 Fax 02 62141830 PO Box 176 Belconnen ACT 2616 david.baldwin at ausport.gov.au Leverrier Street Bruce ACT 2617
On 10-08-2011 01:23, David Baldwin wrote:
Henrik> xymoncmd xymonnet --no-update --debug camilla
Could you add to the xymonnet man page that it is possible to debug the test for a single host? This is a very useful capability I've looked for previously.
I've added this:
By default, all servers are tested - if XYMONNETWORK is set via .I xymonserver.cfg(5) then only the hosts marked as belonging to this network are tested. If the command-line includes one or more hostnames, then only those servers are tested.
and changed the "Synopsis" of the man-page to better show the most commonly used command-line options and the support for providing hostnames.
Regards, Henrik
participants (3)
- david.baldwin@ausport.gov.au
- henrik@hswn.dk
- troy@athabascau.ca