Problem disabling large groups of hosts
I'm having a problem trying to disable 1000+ hosts and tests at one time for maintenance windows. I've even tried disabling in groups of 100-200, but it still misses a group of servers here and there. It seems that when I click "apply" and the screen refreshes, the enadis.sh/cgi script might stop running in the background (??)
Is there a way to debug this or modify the refresh time/interval?
The reason we want to disable all these hosts/tests for maintenance windows is to keep SLA during the reboots and service outages for patching, etc. Does anyone have other recommendations, or would you suggest using REPORTTIME globally instead?
FreeBSD 9.0 Apache 2.2.22 Xymon v4.3.10
Thanks in advance,
Clint
This e-mail (and any attachment) is strictly confidential and for use only by intended recipient(s). The opinions therein expressed are those of the author. Its contents, therefore, do not represent any commitment between the company and the recipient(s) and no liability or responsibility is accepted by the company for the above mentioned content. If you are not an intended recipient(s), please notify the author promptly and delete this message.
Adding --debug to cgioptions.cfg under ~xymon/server/etc/ might give you some additional info... Sadly, it could be httpd closing the connection :/
If you're doing that many, I might suggest simply generating a series of disable/enable commands and just piping them in via the xymon CLI.
There's also the 'schedule' option too.
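A minimal sketch of the "generate the commands and pipe them to the CLI" idea, for illustration only. The helper name gen_disable, the host list file and the 240-minute duration are made up for the example. It also assumes that dots in the hostname are written as commas in the disable message, and that "*" as the test name disables all tests for the host.

```shell
#!/bin/sh
# Sketch: emit one Xymon "disable" command per hostname read from stdin.
# gen_disable is a hypothetical helper name, not part of Xymon.
# Usage: gen_disable DURATION_MINUTES "REASON" < hosts.txt
gen_disable() {
    duration=$1
    reason=$2
    while read -r host; do
        # Skip blank lines in the host list
        [ -n "$host" ] || continue
        # Assumption: dots in the hostname are sent as commas
        h=$(printf '%s' "$host" | tr '.' ',')
        # "*" as the test name is assumed to mean "all tests on this host"
        printf 'disable %s.* %s %s\n' "$h" "$duration" "$reason"
    done
}
```

The generated lines could then be sent one at a time, e.g. `gen_disable 240 "Patching" < hosts.txt | while read -r cmd; do xymon 127.0.0.1 "$cmd"; done`, where 127.0.0.1 stands in for your Xymon server address. Swapping "disable" for "enable" (and dropping the duration and reason) would give the matching enable commands.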
Regards,
-jc
There's also the 'schedule' option too. <-- Using this method worked!
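For anyone finding this later, a rough sketch of how the 'schedule' approach could be scripted. It assumes the xymon "schedule TIMESTAMP COMMAND" message form, where TIMESTAMP is a Unix epoch time and COMMAND is a complete Xymon command. The helper name schedule_at and the epoch value in the usage note are purely illustrative.

```shell
#!/bin/sh
# Sketch: prefix each Xymon command read from stdin with "schedule EPOCH",
# so the server runs it at that time instead of immediately.
# schedule_at is a hypothetical helper name, not part of Xymon.
# Usage: schedule_at EPOCH_SECONDS < commands.txt
schedule_at() {
    when=$1
    while read -r cmd; do
        # Skip blank lines
        [ -n "$cmd" ] || continue
        printf 'schedule %s %s\n' "$when" "$cmd"
    done
}
```

For example, `printf 'disable db02.* 240 Patching\n' | schedule_at 1359500400` prints a single schedule line that can be handed to the xymon CLI the same way as any other command. How you compute the epoch time depends on your platform's date(1).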
Thanks, Clint
-----Original Message----- From: cleaver at terabithia.org [mailto:cleaver at terabithia.org] Sent: Tuesday, January 29, 2013 3:43 PM To: Simmons Clint Cc: xymon at xymon.com Subject: Re: [Xymon] Problem disabling large groups of hosts
On 30 January 2013 04:36, Simmons Clint <C.Simmons at criflending.com> wrote:
I’m having a problem trying to disable over 1000+ hosts and tests at one time for maintenance windows. I’ve even tried disabling in groups of 100-200 but it will still miss a group of servers here and there. It seems that when I click “apply” and once the screen refreshes could possibly stop running the enadis.sh/cgi script in the background (??)
I'm thinking that there's some limit being reached, perhaps maximum rate of commands to xymond (although I'm not aware of any such thing), maximum CGI request size, or maximum CGI lifetime.
First, check your xymond logs for warnings. Also, check the Apache logs. If nothing shows up there, run the following command:
xymond_channel --channel=enadis sh -c 'cat >/tmp/enadis-msgs.dump'
Then do your disable from the web interface. When it's all finished (perhaps when the last of the servers show as disabled), stop the above process and review the dump file, looking for the hosts that didn't get disabled. If they show up in the dump, the problem might be in the xymond process handling the disable commands. This would be tricky to diagnose, and might require a review of the code, or running xymond with debugging on.
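Reviewing the dump by eye for 1000+ hosts is tedious, so here is a small sketch of how the comparison might be automated. It makes no assumption about the dump's line format and only checks whether each hostname appears anywhere in the file as a fixed string, so a short name such as web1 would also match web10. The helper name missing_from_dump and the host list file are hypothetical.

```shell
#!/bin/sh
# Sketch: print every hostname from stdin that does NOT appear in the dump file.
# missing_from_dump is a hypothetical helper name, not part of Xymon.
# Usage: missing_from_dump /tmp/enadis-msgs.dump < hosts.txt
missing_from_dump() {
    dump=$1
    while read -r host; do
        # Skip blank lines in the host list
        [ -n "$host" ] || continue
        # Fixed-string search; if the dump uses commas instead of dots in
        # hostnames, convert the host list accordingly before running this
        grep -qF -- "$host" "$dump" || printf '%s\n' "$host"
    done
}
```

Any host this prints never reached the enadis channel, which would point at the CGI side rather than xymond.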
If the hosts didn't show up in the dump, the problem might be in the CGI process (enadis.cgi). You could replace enadis.sh with a modified version that first copies STDIN to a dump file before sending it on the normal path to the CGI. For example, add the "cat" line below immediately before the "exec" line, like so:
#!/bin/sh
# This is a wrapper for the Xymon enadis script

. /usr/lib/xymon/server/etc/cgioptions.cfg
cat > /tmp/enadis-cgi.dump; cat /tmp/enadis-cgi.dump |
exec /usr/lib/xymon/server/bin/enadis.cgi $CGI_ENADIS_OPTS
Also, you might see if sending the disable messages from the command-line all at once also produces the same behaviour. If it does, then the CGI is not your problem, and is more likely to be the xymond process.
J
This may be a genuine problem to be solved, but.... Doesn't DOWNTIME fit the bill here to designate recurring maintenance windows? http://www.xymon.com/xymon/help/manpages/man5/hosts.cfg.5.html
Cheers.
From: xymon-bounces at xymon.com [mailto:xymon-bounces at xymon.com] On Behalf Of Simmons Clint Sent: Tuesday, January 29, 2013 11:36 AM To: xymon at xymon.com Subject: [Xymon] Problem disabling large groups of hosts
participants (4)
- C.Simmons@criflending.com
- cleaver@terabithia.org
- dddugan@iastate.edu
- jlaidman@rebel-it.com.au