Hello,
I have Xymon 4.3.7 running as server on a CentOS 6 box - no problems. I have a mixture of RHEL3, 4, 5 and 6, and CentOS 5 & 6 clients. These all work fine except for the RHEL3 client. That one shows purple for the cpu, disk, files, memory and procs tests. The tests seem to have picked up some data from the first run after starting up, but since then they seem to have reported nothing (and hence purple).
I have been looking into this for a while now, and am at a bit of a loss as to why just this one client is having a problem. I have stopped the firewall on both client and server, but that makes no difference. A tcpdump shows that data is being sent to/from the client/server. I can see no obvious difference between this client and any of the others.
Anyone any ideas about this?
Thanks,
John.
-- John Horne Tel: +44 (0)1752 587287 Plymouth University, UK Fax: +44 (0)1752 587001
Hello,
As far as I can remember, RHEL3 has an old 2.4 kernel. I'm using the Hobbit client 4.2.0 on such machines; you could give it a try and install this older Hobbit client.
Good luck!
Regards Christian
CHRISTIAN BECKER System Engineer CSC
August-Horch-Strasse 28, 56070 Koblenz, Germany Global Outsourcing Services Central Region | cbecker4 at csc.com | www.csc.com
-----Original Message----- From: xymon-bounces at xymon.com [mailto:xymon-bounces at xymon.com] On Behalf Of John Horne Sent: Monday, 9 July 2012 16:22 To: xymon at xymon.com Subject: [Xymon] Xymon on RHEL3 - not working?
Xymon mailing list Xymon at xymon.com http://lists.xymon.com/mailman/listinfo/xymon
Does it go green when you restart the client, then eventually go purple again? If so, it seems like something stuck in xymonlaunch -- the client itself is just a shell script.
Can you try running xymonlaunch with --debug to see what's happening, or strace it as it's running and send the output?
-jc
I have a couple of FreeBSD clients that go purple if /tmp fills to 100%.
Paul Root - Senior Engineer Managed Services Systems - CenturyLink
-----Original Message----- From: xymon-bounces at xymon.com [mailto:xymon-bounces at xymon.com] On Behalf Of cleaver at terabithia.org Sent: Monday, July 09, 2012 1:03 PM To: John Horne Cc: xymon at xymon.com Subject: Re: [Xymon] Xymon on RHEL3 - not working?
On Mon, 2012-07-09 at 11:02 -0700, cleaver at terabithia.org wrote:
> Does it go green when you restart the client, then eventually go purple again?
Yes. The initial 'vmstat' process started by xymonlaunch runs, then ends but isn't restarted (so it seems). Hence I get one run, then nothing and so the colours go green once then purple until xymon is restarted.
> Can you try running xymonlaunch with --debug to see what's happening, or strace it as it's running and send the output?
I did both yesterday but as far as I could see it showed nothing useful. (I'll repeat this and add the output below.)
I also ran it with the '--dump' option and that showed that the entries in clientlaunch.cfg were okay. The intervals were reported as 300 seconds (5 mins), so the processes should have been restarted. (I have 2 tasks configured.)
Looking at the code (in ./common/xymonlaunch.c) I can see the ('running') loop that it runs through and it looks fine. But because nothing at all is logged with debug, and as far as I can see all eventualities should report something, then the 'for' loop of the task list is probably not being entered (the initial ('for') loop works, but subsequently is failing?) I'll add some more logging to the client xymonlaunch code and see what happens.
Output from using 'xymonlaunch --debug':
===================================================
2012-07-10 12:01:21 xymonlaunch starting
2012-07-10 12:01:21 Loading tasklist configuration from /home/xymon/client/etc/clientlaunch.cfg
29487 2012-07-10 12:01:21 Opening file /home/xymon/client/etc/clientlaunch.cfg
29487 2012-07-10 12:01:21
29487 2012-07-10 12:01:21 Starting tasklist scan
29487 2012-07-10 12:01:21 About to start task client
29488 2012-07-10 12:01:21 client -> Loading environment from /home/xymon/client/etc/xymonclient.cfg area
29488 2012-07-10 12:01:21 Opening file /home/xymon/client/etc/xymonclient.cfg
29488 2012-07-10 12:01:21 client -> Assigning stdout/stderr to log '/home/xymon/client/logs/xymonclient.log'
29487 2012-07-10 12:01:21 About to start task dns
29490 2012-07-10 12:01:21 dns -> Loading environment from /home/xymon/client/etc/xymonclient.cfg area
29490 2012-07-10 12:01:21 Opening file /home/xymon/client/etc/xymonclient.cfg
29490 2012-07-10 12:01:21 dns -> Assigning stdout/stderr to log '/home/xymon/client/logs/xymonclient.log'
29487 2012-07-10 12:01:21
29487 2012-07-10 12:01:21 Starting tasklist scan
29487 2012-07-10 12:01:21 Task client active with PID 29488
29487 2012-07-10 12:01:26
29487 2012-07-10 12:01:26 Starting tasklist scan
29487 2012-07-10 12:01:26 Task client active with PID 29488
29487 2012-07-10 12:01:28
29487 2012-07-10 12:01:28 Starting tasklist scan
29487 2012-07-10 12:01:33
29487 2012-07-10 12:01:33 Starting tasklist scan
29487 2012-07-10 12:01:38
29487 2012-07-10 12:01:38 Starting tasklist scan
29487 2012-07-10 12:01:43
29487 2012-07-10 12:01:43 Starting tasklist scan
29487 2012-07-10 12:01:48
29487 2012-07-10 12:01:48 Starting tasklist scan
29487 2012-07-10 12:01:53
29487 2012-07-10 12:01:53 Starting tasklist scan
29487 2012-07-10 12:01:58
29487 2012-07-10 12:01:58 Starting tasklist scan
29487 2012-07-10 12:02:03
29487 2012-07-10 12:02:03 Starting tasklist scan
29487 2012-07-10 12:02:08
29487 2012-07-10 12:02:08 Starting tasklist scan
29487 2012-07-10 12:02:13
29487 2012-07-10 12:02:13 Starting tasklist scan
29487 2012-07-10 12:02:18
29487 2012-07-10 12:02:18 Starting tasklist scan
29487 2012-07-10 12:02:23
29487 2012-07-10 12:02:23 Starting tasklist scan
29487 2012-07-10 12:02:28
29487 2012-07-10 12:02:28 Starting tasklist scan
29487 2012-07-10 12:02:33
29487 2012-07-10 12:02:33 Starting tasklist scan
29487 2012-07-10 12:02:38
29487 2012-07-10 12:02:38 Starting tasklist scan
29487 2012-07-10 12:02:43
29487 2012-07-10 12:02:43 Starting tasklist scan
29487 2012-07-10 12:02:48
29487 2012-07-10 12:02:48 Starting tasklist scan
29487 2012-07-10 12:02:53
29487 2012-07-10 12:02:53 Starting tasklist scan
29487 2012-07-10 12:02:58
29487 2012-07-10 12:02:58 Starting tasklist scan
29487 2012-07-10 12:03:03
29487 2012-07-10 12:03:03 Starting tasklist scan
29487 2012-07-10 12:03:08
29487 2012-07-10 12:03:08 Starting tasklist scan
29487 2012-07-10 12:03:13
29487 2012-07-10 12:03:13 Starting tasklist scan
29487 2012-07-10 12:03:18
29487 2012-07-10 12:03:18 Starting tasklist scan
29487 2012-07-10 12:03:23
29487 2012-07-10 12:03:23 Starting tasklist scan
29487 2012-07-10 12:03:28
29487 2012-07-10 12:03:28 Starting tasklist scan
29487 2012-07-10 12:03:33
29487 2012-07-10 12:03:33 Starting tasklist scan
29487 2012-07-10 12:03:38
29487 2012-07-10 12:03:38 Starting tasklist scan
29487 2012-07-10 12:03:43
29487 2012-07-10 12:03:43 Starting tasklist scan
29487 2012-07-10 12:03:48
29487 2012-07-10 12:03:48 Starting tasklist scan
29487 2012-07-10 12:03:53
29487 2012-07-10 12:03:53 Starting tasklist scan
29487 2012-07-10 12:03:58
29487 2012-07-10 12:03:58 Starting tasklist scan
29487 2012-07-10 12:04:03
29487 2012-07-10 12:04:03 Starting tasklist scan
29487 2012-07-10 12:04:08
29487 2012-07-10 12:04:08 Starting tasklist scan
29487 2012-07-10 12:04:13
29487 2012-07-10 12:04:13 Starting tasklist scan
29487 2012-07-10 12:04:18
29487 2012-07-10 12:04:18 Starting tasklist scan
29487 2012-07-10 12:04:23
29487 2012-07-10 12:04:23 Starting tasklist scan
29487 2012-07-10 12:04:28
29487 2012-07-10 12:04:28 Starting tasklist scan
29487 2012-07-10 12:04:33
29487 2012-07-10 12:04:33 Starting tasklist scan
29487 2012-07-10 12:04:38
29487 2012-07-10 12:04:38 Starting tasklist scan
29487 2012-07-10 12:04:43
29487 2012-07-10 12:04:43 Starting tasklist scan
29487 2012-07-10 12:04:48
29487 2012-07-10 12:04:48 Starting tasklist scan
29487 2012-07-10 12:04:53
29487 2012-07-10 12:04:53 Starting tasklist scan
29487 2012-07-10 12:04:58
29487 2012-07-10 12:04:58 Starting tasklist scan
29487 2012-07-10 12:05:03
29487 2012-07-10 12:05:03 Starting tasklist scan
29487 2012-07-10 12:05:08
29487 2012-07-10 12:05:08 Starting tasklist scan
29487 2012-07-10 12:05:13
29487 2012-07-10 12:05:13 Starting tasklist scan
29487 2012-07-10 12:05:18
29487 2012-07-10 12:05:18 Starting tasklist scan
29487 2012-07-10 12:05:23
29487 2012-07-10 12:05:23 Starting tasklist scan
29487 2012-07-10 12:05:28
29487 2012-07-10 12:05:28 Starting tasklist scan
29487 2012-07-10 12:05:33
29487 2012-07-10 12:05:33 Starting tasklist scan
29487 2012-07-10 12:05:38
29487 2012-07-10 12:05:38 Starting tasklist scan
29487 2012-07-10 12:05:43
29487 2012-07-10 12:05:43 Starting tasklist scan
29487 2012-07-10 12:05:48
29487 2012-07-10 12:05:48 Starting tasklist scan
29487 2012-07-10 12:05:53
29487 2012-07-10 12:05:53 Starting tasklist scan
29487 2012-07-10 12:05:58
29487 2012-07-10 12:05:58 Starting tasklist scan
29487 2012-07-10 12:06:03
29487 2012-07-10 12:06:03 Starting tasklist scan
29487 2012-07-10 12:06:08
29487 2012-07-10 12:06:08 Starting tasklist scan
29487 2012-07-10 12:06:13
29487 2012-07-10 12:06:13 Starting tasklist scan
29487 2012-07-10 12:06:18
29487 2012-07-10 12:06:18 Starting tasklist scan
29487 2012-07-10 12:06:23
29487 2012-07-10 12:06:23 Starting tasklist scan
29487 2012-07-10 12:06:28
29487 2012-07-10 12:06:28 Starting tasklist scan
29487 2012-07-10 12:06:33
29487 2012-07-10 12:06:33 Starting tasklist scan
29487 2012-07-10 12:06:38
29487 2012-07-10 12:06:38 Starting tasklist scan
29487 2012-07-10 12:06:43
29487 2012-07-10 12:06:43 Starting tasklist scan
29487 2012-07-10 12:06:48
29487 2012-07-10 12:06:48 Starting tasklist scan
29487 2012-07-10 12:06:53
29487 2012-07-10 12:06:53 Starting tasklist scan
29487 2012-07-10 12:06:58
29487 2012-07-10 12:06:58 Starting tasklist scan
29487 2012-07-10 12:07:03
29487 2012-07-10 12:07:03 Starting tasklist scan
29487 2012-07-10 12:07:08
29487 2012-07-10 12:07:08 Starting tasklist scan
29487 2012-07-10 12:07:13
29487 2012-07-10 12:07:13 Starting tasklist scan
29487 2012-07-10 12:07:18
29487 2012-07-10 12:07:18 Starting tasklist scan
29487 2012-07-10 12:07:23
29487 2012-07-10 12:07:23 Starting tasklist scan
29487 2012-07-10 12:07:28
29487 2012-07-10 12:07:28 Starting tasklist scan
The 'vmstat' task has ended by 12:06:33. As can be seen the loop (scan) carries on but no tasks are restarted. (The 'dns' task is a task we run on the clients to report to the 'dnsr' column on the Xymon server. It runs once then should be started again by xymonlaunch after 5 mins.)
The output of 'strace -f -p <xymonlaunch pid>' is attached. It shows the loop (scan) occurring but nothing else - hence it doesn't seem that the 'for' loop of the task list is being executed. By the end of the strace output the 'vmstat' task of the client has ended but not been restarted.
John.
On Tue, 2012-07-10 at 12:26 +0100, John Horne wrote:
> Looking at the code (in ./common/xymonlaunch.c) I can see the ('running') loop that it runs through and it looks fine. But because nothing at all is logged with debug, and as far as I can see all eventualities should report something, then the 'for' loop of the task list is probably not being entered (the initial ('for') loop works, but subsequently is failing?) I'll add some more logging to the client xymonlaunch code and see what happens.
As far as I can tell the problem in xymonlaunch.c comes from line 607:
time_t now = gettimer();
Printing out what 'now' is gives 0 all the time. 'gettimer' comes from lib/timefunc.c line 49:
======================================
time_t gettimer(void)
{
        int res;
        struct timespec t;

#if (_POSIX_TIMERS > 0) && defined(_POSIX_MONOTONIC_CLOCK)
        res = clock_gettime(CLOCK_MONOTONIC, &t);
        return (time_t) t.tv_sec;
#else
        return time(NULL);
#endif
}
However printing out 'res' shows that 'clock_gettime' is always returning an error (-1). If I use 'time(NULL)' instead, then it all works fine - the logs look correct and the purple results stay green. (I have run this for about 30 mins now with no purple appearing.)
Not sure how you would want to fix this, but I changed it slightly to say:
======================================
        res = clock_gettime(CLOCK_MONOTONIC, &t);
        if (res == 0) {
                return (time_t) t.tv_sec;
        }
        else {
                return time(NULL);
        }
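For anyone wanting to try the same idea outside the Xymon tree, here is a minimal, self-contained sketch of a timer with a runtime fallback. The name gettimer_fallback and the "stop trying after the first failure" flag are my own assumptions for illustration; this is not the code shipped in lib/timefunc.c.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <time.h>
#include <unistd.h>

/*
 * gettimer_fallback() is an illustrative name, not the Xymon function.
 * It prefers CLOCK_MONOTONIC, but checks the return value of
 * clock_gettime() instead of trusting the compile-time macros alone.
 */
time_t gettimer_fallback(void)
{
#if defined(_POSIX_TIMERS) && (_POSIX_TIMERS > 0) && defined(_POSIX_MONOTONIC_CLOCK)
    /* 1 = keep using the monotonic clock, 0 = it failed once, stop trying */
    static int try_monotonic = 1;

    if (try_monotonic) {
        struct timespec t;

        if (clock_gettime(CLOCK_MONOTONIC, &t) == 0)
            return (time_t) t.tv_sec;

        /* For example EINVAL when the running kernel lacks the clock.
         * Do not retry, so later values all come from time(). */
        try_monotonic = 0;
    }
#endif
    /* Wall-clock fallback: can jump if the system time is changed. */
    return time(NULL);
}
```

The returned value is only meaningful for computing intervals (now minus an earlier reading), because the monotonic clock does not count from the epoch. The static flag is a design choice to avoid mixing readings from two different clocks if clock_gettime were ever to fail only some of the time.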
I wrote a small test program using clock_gettime, and on RHEL 4 and 6 they both ran without error. On RHEL 3 it always gives an error.
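A test of this kind could look roughly like the sketch below. It is not the program used above; probe_monotonic is a hypothetical helper name, and the output wording is made up. On older glibc versions (before 2.17) clock_gettime lives in librt, so the program needs to be linked with -lrt.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/*
 * probe_monotonic() is a hypothetical helper, not part of Xymon.
 * Returns 0 if CLOCK_MONOTONIC can be read, otherwise the errno value
 * set by clock_gettime() (EINVAL if the clock id is not supported).
 */
int probe_monotonic(void)
{
    struct timespec t;

    if (clock_gettime(CLOCK_MONOTONIC, &t) == 0) {
        printf("CLOCK_MONOTONIC ok: %ld s\n", (long) t.tv_sec);
        return 0;
    }

    int err = errno;  /* save errno before printf() can overwrite it */
    printf("clock_gettime failed: errno = %d, string = %s\n",
           err, strerror(err));
    return err;
}
```

Calling probe_monotonic() from a one-line main() and running it on each client should show quickly which hosts can actually read the monotonic clock.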
John.
On Tue, 2012-07-10 at 15:12 +0100, John Horne wrote:
> Printing out what 'now' is gives 0 all the time. 'gettimer' comes from lib/timefunc.c line 49:
>
> ======================================
> time_t gettimer(void)
> {
>         int res;
>         struct timespec t;
>
> #if (_POSIX_TIMERS > 0) && defined(_POSIX_MONOTONIC_CLOCK)
>         res = clock_gettime(CLOCK_MONOTONIC, &t);
>         return (time_t) t.tv_sec;
> #else
>         return time(NULL);
> #endif
> }
>
> However printing out 'res' shows that 'clock_gettime' is always returning an error (-1).
Sorry, should have added that errno is showing as 22 (EINVAL). The man page for clock_gettime states:
EINVAL The clk_id specified is not supported on this system.
However, it seems that although _POSIX_MONOTONIC_CLOCK is defined at compile time, the clock itself is not available at runtime on this system. This link (for the 'curl' program) has more discussion about it: http://curl.haxx.se/mail/tracker-2008-07/0003.html
Checking the RHEL3 system with a small program shows that _POSIX_MONOTONIC_CLOCK is defined when checked at compile time, but 'sysconf' reports at runtime that the clock is not available:
===========
monotonic clock:
_POSIX_TIMERS: 200112
_POSIX_MONOTONIC_CLOCK: 0
Err found: -1
Err found: -1, errno = 22, string = Invalid argument
Sysconf value for _POSIX_MONOTONIC_CLOCK: -1
So the clock seems to be available, but checking at runtime with sysconf says that it is not.
Whereas on an RHEL6 system it shows:
===========
monotonic clock:
_POSIX_TIMERS: 200809
_POSIX_MONOTONIC_CLOCK: 0
Sysconf value for _POSIX_MONOTONIC_CLOCK: 200809
Again it seems that the clock is available, and the runtime/sysconf check confirms that.
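The difference between the two checks can be sketched as follows. As far as I understand POSIX, an option macro defined as 0 (as _POSIX_MONOTONIC_CLOCK is in both outputs above) only says the interface can be compiled against, and the program is expected to ask sysconf() at runtime whether it really works. The helper name monotonic_supported_at_runtime is an assumption for illustration; the printed labels merely imitate the output above.

```c
#include <stdio.h>
#include <unistd.h>

/*
 * monotonic_supported_at_runtime() is a hypothetical helper.
 * Returns 1 if sysconf() reports the monotonic clock option as
 * supported on the running system, 0 otherwise.
 */
int monotonic_supported_at_runtime(void)
{
#if defined(_POSIX_MONOTONIC_CLOCK) && defined(_SC_MONOTONIC_CLOCK)
    /* Compile-time view: -1 = unsupported, 0 = ask at runtime,
     * greater than 0 = always supported. */
    printf("_POSIX_MONOTONIC_CLOCK: %ld\n", (long) _POSIX_MONOTONIC_CLOCK);

    /* Runtime view: -1 means the option is not available here. */
    long value = sysconf(_SC_MONOTONIC_CLOCK);
    printf("Sysconf value for _POSIX_MONOTONIC_CLOCK: %ld\n", value);

    return value > 0;
#else
    /* The headers do not even know the option. */
    return 0;
#endif
}
```

On a system behaving like the RHEL3 box above this would be expected to return 0, and on one like the RHEL6 box 1, which makes it usable as a guard before choosing CLOCK_MONOTONIC.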
John.
On Mon, 2012-07-09 at 15:22 +0100, John Horne wrote:
> Hello,
>
> I have Xymon 4.3.7 running as server on a CentOS 6 box - no problems. I have a mixture of RHEL3, 4, 5 and 6, and CentOS 5 & 6 clients. These all work fine except for the RHEL3 client. That one shows purple for the cpu, disk, files, memory and procs tests. The tests seem to have picked up some data from the first run after starting up, but since then they seem to have reported nothing (and hence purple).
>
> I have been looking into this for a while now, and am at a bit of a loss as to why just this one client is having a problem. I have stopped the firewall on both client and server, but that makes no difference. A tcpdump shows that data is being sent to/from the client/server.
Actually that is not quite true. Packets are sent to/from the client and server. However, the data relates to other tests - ntp, smtp etc, and not the cpu, memory etc tests.
In reply to others:
/tmp is not full, plenty of space available.
Although an earlier Hobbit version may work on RHEL3, I think that is avoiding the problem. (a) this is Xymon, not Hobbit, and (b) if this problem is hiding some other fault that may/may not be affecting other O/Ses then I would rather sort out the problem before it really does start to affect our other RHEL/CentOS clients. But thanks for the suggestion anyway. Yes, RHEL3 does use a 2.4 kernel.
John.
participants (4)
- christian.becker@rhein-zeitung.net
- cleaver@terabithia.org
- john.horne@plymouth.ac.uk
- Paul.Root@CenturyLink.com