Hi,
I have installed the client but moved it to another directory. I tried to set XYMONHOME to this new directory, but this causes xymonlaunch to crash with a buffer overflow. The rest of the commands run fine. This is the code from ./common/xymonlaunch.c:
/* Find config */
if (!config) {
    if (stat("/etc/tasks.cfg", &st) != -1) config = strdup("/etc/tasks.cfg");
    else if (stat("/etc/xymon/tasks.cfg", &st) != -1) config = strdup("/etc/xymon/tasks.cfg");
    else if (stat("/etc/xymon-client/clientlaunch.cfg", &st) != -1) config = strdup("/etc/xymon-client/clientlaunch.cfg");
    else if (xgetenv("XYMONHOME")) {
        char *pf = NULL;
        sprintf(pf, "%s/etc/tasks.cfg", xgetenv("XYMONHOME"));
        if (pf && stat(pf, &st) != -1) config = strdup(pf);
    }
    if (config) dbgprintf("Using config file: %s\n", config);
}
I fixed it like this (line 546). I also added an extra branch so that xymonlaunch also tries clientlaunch.cfg:
/* Find config */
if (!config) {
    if (stat("/etc/tasks.cfg", &st) != -1) config = strdup("/etc/tasks.cfg");
    else if (stat("/etc/xymon/tasks.cfg", &st) != -1) config = strdup("/etc/xymon/tasks.cfg");
    else if (stat("/etc/xymon-client/clientlaunch.cfg", &st) != -1) config = strdup("/etc/xymon-client/clientlaunch.cfg");
    else if (xgetenv("XYMONHOME")) {
        char pf[1024];
        sprintf(pf, "%s/etc/tasks.cfg", xgetenv("XYMONHOME"));
        if (pf && stat(pf, &st) != -1) {
            config = strdup(pf);
        }
        else {
            sprintf(pf, "%s/etc/clientlaunch.cfg", xgetenv("XYMONHOME"));
            config = strdup(pf);
        }
    }
I also have a problem with this line from file common/xymoncmd.c: sprintf(envfn, "%s/etc/xymonserver.cfg", xgetenv("XYMONHOME"));
I want to run xymoncmd without setting XYMONHOME. So it has to use XYMONHOME from compile time, but that's not working. In the Makefile I set XYMONHOME to a directory but during the compile, /client is added.
Stef
On 13.01.2014 11:02, Stef Coene wrote:
I have installed the client but moved it to another directory. I tried to set XYMONHOME to this new directory, but this causes xymonlaunch to crash with a buffer overflow. This is the code from ./common/xymonlaunch.c: [...]
else if (xgetenv("XYMONHOME")) { char *pf = NULL; sprintf(pf, "%s/etc/tasks.cfg", xgetenv("XYMONHOME")); if (pf && stat(pf, &st) != -1) config = strdup(pf); }
I have no idea how that ever got into the official code. It is so obviously broken I should have caught it before hitting 'commit'. Fixed.
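For readers following along: in the snippet quoted above, sprintf() writes through pf while pf is still NULL, so no memory was ever set aside for the result. Below is a minimal sketch of one bounded way to build such a path. The helper name build_cfg_path and its return convention are assumptions made for illustration; this is not the fix that was committed to Xymon.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical helper (not from the Xymon source): format
 * "<home>/etc/<name>" into a caller-supplied buffer.
 * Returns 0 on success, -1 if an argument is missing or the
 * result would not fit in the buffer.
 */
static int build_cfg_path(char *buf, size_t buflen, const char *home, const char *name)
{
    int n;

    if (!buf || buflen == 0 || !home || !name) return -1;

    /* snprintf never writes more than buflen bytes, including the NUL */
    n = snprintf(buf, buflen, "%s/etc/%s", home, name);

    /* n < 0: output error; n >= buflen: the path was truncated */
    if (n < 0 || (size_t)n >= buflen) return -1;

    return 0;
}
```

A caller would pass a local array such as char pf[1024] together with sizeof(pf), and treat -1 as "skip this candidate" rather than using a truncated path.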
I also have a problem with this line from file common/xymoncmd.c: sprintf(envfn, "%s/etc/xymonserver.cfg", xgetenv("XYMONHOME"));
I want to run xymoncmd without setting XYMONHOME. So it has to use XYMONHOME from compile time, but that's not working.
Now I'm confused. You start by saying you are working on a *client* installation, but xymonserver.cfg refers to a *server* installation.
In the Makefile I set XYMONHOME to a directory but during the compile, /client is added.
The "/client" is added to the top-level Makefile during 'configure'. If you don't want it, then re-build the client with XYMONHOME set the way you want.
Anyway, if you override XYMONHOME in your xymonclient.cfg or xymonserver.cfg, then it should work fine. The only problem is bootstrapping it for xymonlaunch and xymoncmd (if you run the latter for commands not invoked through xymonlaunch). The easiest way in that case is to add the --env and --config options to xymonlaunch / xymoncmd.
Regards, Henrik
Now I'm confused. You start by saying you are working on a *client* installation, but xymonserver.cfg refers to a *server* installation.
It is indeed for a client. But the search logic for the config file for xymonlaunch is:
/etc/tasks.cfg
/etc/xymon-client/clientlaunch.cfg
$XYMONHOME/etc/tasks.cfg
And I think the fourth is missing: $XYMONHOME/etc/clientlaunch.cfg
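The search order discussed here can also be written as a candidate list that is walked until the first existing file is found. The sketch below is an illustration only: the function names first_existing and find_launch_config are hypothetical, the fixed paths are the ones in the code quoted earlier in the thread, and the last candidate is the addition proposed in this message, not something confirmed to be in Xymon.

```c
#include <stdio.h>
#include <stddef.h>
#include <sys/stat.h>

#define CFG_PATH_MAX 1024

/* Return the first path in a NULL-terminated list that stat() accepts, else NULL */
static const char *first_existing(const char **candidates)
{
    struct stat st;

    for (; *candidates; candidates++) {
        if (stat(*candidates, &st) != -1) return *candidates;
    }
    return NULL;
}

/*
 * Hypothetical sketch: 'home' stands in for the value of XYMONHOME and may
 * be NULL. The result points at a string literal or at a static buffer, so
 * this version is not reentrant.
 */
static const char *find_launch_config(const char *home)
{
    static char under_home[2][CFG_PATH_MAX];
    const char *candidates[6];
    size_t n = 0;
    int len;

    candidates[n++] = "/etc/tasks.cfg";
    candidates[n++] = "/etc/xymon/tasks.cfg";
    candidates[n++] = "/etc/xymon-client/clientlaunch.cfg";

    if (home) {
        len = snprintf(under_home[0], CFG_PATH_MAX, "%s/etc/tasks.cfg", home);
        if (len >= 0 && len < CFG_PATH_MAX) candidates[n++] = under_home[0];

        /* The proposed extra location */
        len = snprintf(under_home[1], CFG_PATH_MAX, "%s/etc/clientlaunch.cfg", home);
        if (len >= 0 && len < CFG_PATH_MAX) candidates[n++] = under_home[1];
    }
    candidates[n] = NULL;

    return first_existing(candidates);
}
```

Keeping the candidates in one list makes the order explicit and means a new location is a one-line change instead of another else-if branch.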
The "/client" is added to the top-level Makefile during 'configure'. If you don't want it, then re-build the client with XYMONHOME set the way you want.
Anyway, if you override XYMONHOME in your xymonclient.cfg or xymonserver.cfg, then it should work fine. The only problem is bootstrapping it for xymonlaunch and xymoncmd (if you run the latter for commands not invoked through xymonlaunch). The easiest way in that case is to add the --env and --config options to xymonlaunch / xymoncmd.
I just want to get rid of a forced /client in the path.
I found that in lib/environ.c XYMONHOME is set to BUILD_HOME. So even if you define XYMONHOME in the top-level Makefile, it is overwritten in lib/environ.c with BUILD_HOME. I fixed my problem by changing the Makefiles in lib and client to remove "/client" from the definition of BUILD_HOME.
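As a rough illustration of what a compile-time default means here, the sketch below prefers a runtime value and falls back to a macro supplied by the build. Only the macro name BUILD_HOME comes from this thread. The function name resolve_home and the placeholder path are assumptions, and this is not the code in lib/environ.c.

```c
#include <stddef.h>

/*
 * BUILD_HOME would normally come from the build, for example
 * -DBUILD_HOME=\"/some/dir\" on the compiler command line.
 * The value below is only a placeholder so the sketch compiles on its own.
 */
#ifndef BUILD_HOME
#define BUILD_HOME "/usr/local/xymon"
#endif

/*
 * Hypothetical helper: use the runtime value when it is set and non-empty,
 * otherwise fall back to the compiled-in default.
 */
static const char *resolve_home(const char *env_value)
{
    return (env_value && *env_value) ? env_value : BUILD_HOME;
}
```

A caller would pass getenv("XYMONHOME"). If the build appends "/client" to the macro, that suffix ends up in every binary that relies on the fallback, which matches the behaviour described above.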
Stef
On 13.01.2014 13:57, Stef Coene wrote:
I just want to get rid of a forced /client in the path.
OK, looking at this in more detail - it is a bit messy. I think the attached patch should fix it properly, it eliminates the ".../client" in a client-only configuration, but keeps it in a client+server installation.
Note that you should start from scratch with "./configure --client" to test it.
Regards, Henrik
On Tuesday 14 January 2014 10:36:52 henrik at hswn.dk wrote:
On 13.01.2014 13:57, Stef Coene wrote:
I just want to get rid of a forced /client in the path.
OK, looking at this in more detail - it is a bit messy. I think the attached patch should fix it properly, it eliminates the ".../client" in a client-only configuration, but keeps it in a client+server installation.
I think I should apply this patch against xymon-4.3.14?
Stef
On 14.01.2014 11:09, Stef Coene wrote:
I think I should apply this patch against xymon-4.3.14?
Works with 4.3.13 also.
Regards, Henrik
After applying the patch, I get the error
build/Makefile.rules:95: *** Recursive variable `CC' references itself (eventually). Stop.
Stef
On 14.01.2014 12:51, Stef Coene wrote:
After applying the patch, I get the error
build/Makefile.rules:95: *** Recursive variable `CC' references itself (eventually). Stop.
Doesn't happen here. Try going to http://sourceforge.net/p/xymon/code/HEAD/tree/branches/4.3.14/ and click on "Download snapshot" to grab the current test-version of 4.3.14, and then apply the diff on top of that.
Regards, Henrik
On Thursday 16 January 2014 15:38:22 henrik at hswn.dk wrote:
On 14.01.2014 12:51, Stef Coene wrote:
After applying the patch, I get the error
build/Makefile.rules:95: *** Recursive variable `CC' references itself (eventually). Stop.
Doesn't happen here. Try going to http://sourceforge.net/p/xymon/code/HEAD/tree/branches/4.3.14/ and click on "Download snapshot" to grab the current test-version of 4.3.14, and then apply the diff on top of that.
I already tried that. But I found a solution: I re-saved the diff and the patch is working now.
Stef
Hello,
Last Sunday, I wanted to start a xymonproxy (version 4.3.12) on Solaris 10.5 with 900 targets. I had a performance issue:
- The xymonproxy process used 100% of only one CPU (no multithreading seen).
- On the main Xymon server, data from this proxy arrived with difficulty (delays and gaps). I have one other proxy on Ubuntu with 50 targets that works fine.
I tried to use the -lqueue option for the proxy, with no success. Nothing special was seen in any logs. The UNIX admin told me that xymonproxy was just calling time() all the time.
When I try this in a test environment with Solaris for both proxy and server, all is OK. So I suspect a scaling issue (number of targets connected to the proxy), and possibly a multithreading issue as well.
Regards,
Gautier BEGIN
On 14.01.2014 14:23, Gautier Begin wrote:
Last Sunday, I wanted to start a xymonproxy (version 4.3.12) on Solaris 10.5 with 900 targets. I had a performance issue:
- The xymonproxy process used 100% of only one CPU (no multithreading seen).
- On the main Xymon server, data from this proxy arrived with difficulty (delays and gaps). I have one other proxy on Ubuntu with 50 targets that works fine.
I tried to use the -lqueue option for the proxy, with no success. Nothing special was seen in any logs. The UNIX admin told me that xymonproxy was just calling time() all the time.
I've had the proxy handling about 5000 hosts simultaneously. On Linux, though.
I have not heard of such behaviour before, and I am sure there must be others running the proxy on Solaris in a similar setup. If you can make it happen again, please do a "kill -USR2" on the xymonproxy process to toggle debugging on/off.
None of the Xymon tools are multithreaded - so far I have stuck to the traditional Unix way of doing things.
Regards,
Henrik
Henrik,
I managed to reproduce the issue:
- I start Xymon as a proxy server => no problem
- Stop Xymon => a process remains alive (xymonlaunch)
- Kill xymonlaunch
- Start Xymon as a proxy server => the problem occurs again
Could some files in Xymon's ./tmp directory influence this?
Regards,
Gautier BEGIN
From: henrik at hswn.dk To: <xymon at xymon.com> Date: 01/14/2014 03:00 PM Subject: Re: [Xymon] xymonproxy perf issue Sent by: "Xymon" <xymon-bounces at xymon.com>
Xymon mailing list Xymon at xymon.com http://lists.xymon.com/mailman/listinfo/xymon
Hello,
Could it come from how the xymonproxy program handles the MACHINE variable?
I say that because when the process goes haywire, it only makes 'gettimer' calls. I then had a look in the source file and found this line (line 452) that could correspond:
if (proxyname && ((now = gettimer()) >= (laststatus+300))) {
proxyname comes from $MACHINE
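Read on its own, the quoted condition calls gettimer() every time it is evaluated, but the body only runs once at least 300 seconds have passed since laststatus. A minimal sketch of that interval test follows. gettimer() is Xymon's own function, so the sketch takes the timestamps as plain parameters, and the name status_due is hypothetical.

```c
#include <time.h>

/* Hypothetical helper: has at least 'interval' seconds passed since 'last'? */
static int status_due(time_t now, time_t last, time_t interval)
{
    return now >= last + interval;
}
```

With last = 1000 and interval = 300, the check is false at now = 1299 and true from now = 1300 onwards.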
Regards,
Gautier BEGIN
On 17.01.2014 16:14, Gautier Begin wrote:
Could it come from how the xymonproxy program handles the MACHINE variable?
I say that because when the process goes haywire, it only makes 'gettimer' calls. I then had a look in the source file and found this line (line 452) that could correspond:
if (proxyname && ((now = gettimer()) >= (laststatus+300))) {
[snip]
Last Sunday, I wanted to start a xymonproxy (version 4.3.12) on Solaris 10.5 with 900 targets. I had a performance issue:
- The xymonproxy process used 100% of only one CPU (no multithreading seen).
- On the main Xymon server, data from this proxy arrived with difficulty (delays and gaps). I have one other proxy on Ubuntu with 50 targets that works fine.
I suspect some kind of error happened with the network socket handling. Could you add this patch and try to reproduce the problem? It doesn't change the behaviour, but it does add some error-reporting in case the core select() call fails.
Regards, Henrik
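The patch itself is not reproduced in this archive. As a rough idea of what error reporting around select() can look like, here is a minimal sketch. The wrapper name select_reporting is hypothetical and the message goes to stderr, which may differ from how xymonproxy writes its log.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/select.h>
#include <sys/time.h>

/*
 * Hypothetical wrapper, not the patch from this thread: call select() and,
 * on any failure other than an interrupted call, report the reason.
 * Returns whatever select() returned and leaves errno as select() set it.
 */
static int select_reporting(int nfds, fd_set *rd, fd_set *wr, fd_set *ex, struct timeval *tmo)
{
    int n = select(nfds, rd, wr, ex, tmo);

    if (n < 0 && errno != EINTR) {
        int saved = errno;   /* fprintf may change errno */

        fprintf(stderr, "select() failed: %s\n", strerror(saved));
        errno = saved;
    }
    return n;
}
```

Reporting the error does not change what the loop does next. If the caller simply retries with the same arguments after a failure, the loop can spin without ever blocking.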
Henrik,
I applied the patch on my test machine, which does not have the problem: everything continues to run well.
I applied the patch on the machine with the 100% CPU issue. I get the same behaviour; xymonproxy.log contains the following lines, looping:
1166 2014-01-28 18:09:00 state 0: reading from client
1166 2014-01-28 18:09:00 state 1: reading from client
1166 2014-01-28 18:09:00 state 2: request combining
1166 2014-01-28 18:09:00 state 3: sending to server
1166 2014-01-28 18:09:00 state 4: reading from client
2014-01-28 18:09:00 select() failed: Invalid argument
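"Invalid argument" is the text for EINVAL. POSIX lists two conditions under which select() fails with that error: nfds is negative or greater than FD_SETSIZE, or the timeout is invalid. The thread does not establish which of these, if either, applies here, and some platforms impose further limits. The sketch below only shows how the two documented conditions could be checked before the call. The function name select_args_plausible is hypothetical.

```c
#include <sys/select.h>
#include <sys/time.h>

/*
 * Hypothetical pre-check, not part of xymonproxy: returns 1 if the arguments
 * avoid the two EINVAL conditions POSIX lists for select(), else 0.
 * A NULL timeout means "block indefinitely" and is accepted.
 */
static int select_args_plausible(int nfds, const struct timeval *tmo)
{
    /* nfds must lie in the range 0..FD_SETSIZE */
    if (nfds < 0 || nfds > FD_SETSIZE) return 0;

    /* Seconds must not be negative; microseconds must be 0..999999 */
    if (tmo && (tmo->tv_sec < 0 || tmo->tv_usec < 0 || tmo->tv_usec >= 1000000)) return 0;

    return 1;
}
```

Logging nfds and the timeout values whenever this check fails would show which argument to look at first.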
Could the issue come from the fact that the IP address of the machine is associated with 2 hostnames (not aliases) in the DNS?
Regards,
Gautier BEGIN
System Tools Team Lead CACEIS and APERAM accounts CSC Computer Sciences Luxembourg S.A. 12D Impasse Drosbach L-1882 Luxembourg
Global Outsourcing Service | p:+352 24 834 276 | m:+352 621 229 172 | gbegin at csc.com | www.csc.com
[attachment "proxyerror.diff" deleted by Gautier Begin/LUX/CSC]
On 1/14/2014 4:23 AM, Gautier Begin wrote:
Hello,
Last Sunday, I wanted to start a xymonproxy (version 4.3.12) on Solaris 10.5 with 900 targets. I had a performance issue:
- The xymonproxy process used 100% of only one CPU (no multithreading seen).
- On the main Xymon server, data from this proxy arrived with difficulty (delays and gaps). I have one other proxy on Ubuntu with 50 targets that works fine.
Is proxy the only service enabled in your tasks.cfg?
I have occasionally seen xymon run away with the processor. In some cases I have been able to find a cause. In all cases, it has been caused by an error in my configuration files.
Is it possible that you have defined a proxy or notification loop?
Do things because you should, not just because you can.
John Thurston 907-465-8591 John.Thurston at alaska.gov Enterprise Technology Services Department of Administration State of Alaska
participants (4)
- gbegin@csc.com
- henrik@hswn.dk
- john.thurston@alaska.gov
- stef.coene@docum.org