On 4.3.0, has the --no-cache option been taken out for the rrd daemons? I have been having issues in testing where I am getting missing data for 5-15 minutes at random times. I have increased my MAX... items in xymonserver.cfg and have tried to use --no-cache in tasks.cfg. The process status currently looks like this:
xymon 27206 27201 0 Mar16 ? 00:00:00 xymond_channel --channel=status --log=/home/xymon/log/rrd-status.log xymond_rrd --rrddir=/home/xymon/data/rrd --no-cache
xymon 27207 27201 0 Mar16 ? 00:00:00 xymond_channel --channel=data --log=/home/xymon/log/rrd-data.log xymond_rrd --rrddir=/home/xymon/data/rrd --extra-tests=mpstat,zonestat --extra-script=/home/xymon/server/ext/rrd_data.pl --no-cache
xymon 27225 27206 0 Mar16 ? 00:00:02 xymond_rrd --rrddir=/home/xymon/data/rrd --no-cache
xymon 27249 27207 0 Mar16 ? 00:00:01 xymond_rrd --rrddir=/home/xymon/data/rrd --extra-tests=mpstat,zonestat --extra-script=/home/xymon/server/ext/rrd_data.pl --no-cache
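A quick way to confirm that every xymond_rrd worker really was started with the flag is to scan the process listing. The following is only a minimal sketch: the function name `rrd_workers_without_flag` is made up for illustration, and it assumes `ps -ef` style columns (UID PID PPID C STIME TTY TIME CMD) with no spaces inside the STIME field.

```python
import os


def rrd_workers_without_flag(ps_output: str, flag: str = "--no-cache") -> list[int]:
    """Return PIDs of xymond_rrd workers whose command line lacks `flag`.

    Assumes `ps -ef` style columns: UID PID PPID C STIME TTY TIME CMD...
    """
    missing = []
    for line in ps_output.splitlines():
        fields = line.split()
        # Skip the header line and anything too short to carry a command.
        if len(fields) < 8 or not fields[1].isdigit():
            continue
        cmd = fields[7:]
        # Only look at the workers themselves, not the xymond_channel wrappers.
        if os.path.basename(cmd[0]) != "xymond_rrd":
            continue
        if flag not in cmd:
            missing.append(int(fields[1]))
    return missing


if __name__ == "__main__":
    listing = (
        "xymon 27225 27206 0 Mar16 ? 00:00:02 xymond_rrd "
        "--rrddir=/home/xymon/data/rrd --no-cache\n"
    )
    print(rrd_workers_without_flag(listing))
```

An empty list means every worker in the listing carries the flag, which matches the output shown above.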
The missing data seems to be affecting the Solaris systems more than the other systems.
Any ideas would be great.
Thank you,
Tom
On Thu, 17 Mar 2011 15:28:34 -0500, "Stewart, Tom L." <Tom.Stewart at landsend.com> wrote:
On 4.3.0, has the --no-cache option been taken out for the rrd daemons? I have been having issues in testing where I am getting missing data for 5-15 minutes at random times.
"xymond_rrd --no-cache" should work fine.
Are you seeing any errors in the rrd-status.log or rrd-data.log files?
Regards, Henrik
I am using the --no-cache but I still get holes. They seem to be limited to Solaris 10 systems, on both SPARC and x86, and they occur at totally random times. I have included a picture from over the weekend of one of the systems, which does not even have a heavy load. The only graphs that are affected are: "CPU Load" and "Users and Processes".
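To pin down exactly when the holes occur, the output of `rrdtool fetch` for the affected RRD file can be scanned for runs of unknown values. This is only a sketch under assumptions: the function name `find_gaps` is hypothetical, and the input is assumed to be the usual `rrdtool fetch` text, with a header line of data source names followed by `timestamp: value ...` rows in which unknown samples print as `nan` or `-nan`.

```python
import math


def find_gaps(fetch_output: str) -> list[tuple[int, int]]:
    """Return (first_ts, last_ts) for each run of rows whose values are all NaN."""
    gaps = []
    start = prev = None
    for line in fetch_output.splitlines():
        ts_text, sep, rest = line.partition(":")
        # Header and blank lines have no "timestamp:" prefix; skip them.
        if not sep or not ts_text.strip().isdigit():
            continue
        ts = int(ts_text)
        values = [float(v) for v in rest.split()]
        if values and all(math.isnan(v) for v in values):
            if start is None:
                start = ts  # a new gap begins at this row
            prev = ts
        elif start is not None:
            gaps.append((start, prev))  # a known value closes the gap
            start = None
    if start is not None:
        gaps.append((start, prev))  # gap runs to the end of the data
    return gaps


if __name__ == "__main__":
    sample = "        la\n\n1300000000: 1.0e+00\n1300000300: nan\n1300000600: 2.0e+00\n"
    for first, last in find_gaps(sample):
        print(f"gap from {first} to {last}")
```

Comparing the gap timestamps against the times the clients reported in would show whether the data was never received or was received and then dropped.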
Here are the log files. They contain entries only from the last restart and nothing from the weekend.
[root@xxxxx log]# cat rrd-data.log
2011-03-24 13:28:41 Tried to down BOARDBUSY: Invalid argument
2011-03-24 13:28:41 Peer not up, flushing message queue
2011-03-24 13:28:41 Shutting down, flushing cached updates to disk
2011-03-24 13:28:41 Cache flush completed
2011-03-24 13:28:56 Peer not up, flushing message queue

[root@xxxxx log]# cat rrd-status.log
2011-03-24 13:28:41 Tried to down BOARDBUSY: Invalid argument
2011-03-24 13:28:41 Peer not up, flushing message queue
2011-03-24 13:28:41 Shutting down, flushing cached updates to disk
2011-03-24 13:28:41 Cache flush completed
2011-03-24 13:28:56 Peer not up, flushing message queue
I did see some improvement when I added more memory for the following based on error messages from xymond.
xymonserver.cfg:MAXMSG_CLIENT=2048 # added more by Tom S
xymonserver.cfg:MAXMSG_DATA=2048 # added more by Tom S
xymonserver.cfg:MAXMSG_STATUS=2048 # added more by Tom S
xymonserver.cfg:MAXMSG_NOTES=2048 # added more by Tom S
xymonserver.cfg:MAXMSG_USER=2048 # added more by Tom S
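To see what these limits amount to when comparing them against the size of a client message, the settings can be pulled out of the file (or out of grep output like the above) and converted to bytes. This is a rough sketch only: the function name `maxmsg_limits_bytes` is made up, and it assumes the MAXMSG_* values are given in kB, which is how I understand xymonserver.cfg to document them.

```python
import re

# Matches e.g. MAXMSG_STATUS=2048 or MAXMSG_STATUS="2048"
_MAXMSG = re.compile(r'(MAXMSG_[A-Z]+)\s*=\s*"?(\d+)"?')


def maxmsg_limits_bytes(text: str) -> dict[str, int]:
    """Map each active MAXMSG_* setting to its size in bytes (assuming kB units)."""
    limits = {}
    for line in text.splitlines():
        match = _MAXMSG.search(line)
        if not match:
            continue
        # Ignore settings that are commented out: a '#' before the name.
        if "#" in line[: match.start()]:
            continue
        limits[match.group(1)] = int(match.group(2)) * 1024
    return limits


if __name__ == "__main__":
    grep_output = "xymonserver.cfg:MAXMSG_STATUS=2048 # added more by Tom S\n"
    for name, size in maxmsg_limits_bytes(grep_output).items():
        print(f"{name}: {size} bytes")
```

With the values above, each channel would accept messages of up to 2048 kB, so a client report larger than that would still be a candidate for being truncated or dropped.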
The server is RHEL6 64-bit:
[root@xxxxx etc]# uname -a
Linux xxxx 2.6.32-71.18.1.el6.x86_64 #1 SMP Wed Feb 2 17:49:59 EST 2011 x86_64 x86_64 x86_64 GNU/Linux
The only other item is that the clients are still using 4.2.3, but I don't think that would make a difference?
Tom
-----Original Message----- From: xymon-bounces at xymon.com [mailto:xymon-bounces at xymon.com] On Behalf Of henrik at hswn.dk Sent: Monday, March 28, 2011 7:28 AM To: xymon at xymon.com Subject: Re: [Xymon] Using --no-cache
On Tue, Mar 29, 2011 at 2:10 AM, Stewart, Tom L. <Tom.Stewart at landsend.com> wrote:
I am using the --no-cache but I still get holes.
The only graphs that are affected are: "CPU Load" and "Users and Processes".
I get empty graphs for these two when I have --no-cache. Without --no-cache I get graphs with gaps.
I get no errors in the logs.
I did see some improvement when I added more memory for the following based on error messages from xymond.
Interesting. I wondered about these settings, but when I looked for errors in xymond.log, I saw nothing. I'll add these in my setup and see if it helps.
My clients and servers are all running 4.3.0 on SUSE. So for me, it's not a Solaris thing, or a Xymon version mismatch thing.
But I would imagine that Solaris messages would be different (perhaps larger) than Linux messages, and that might be why you're seeing problems only on your Solaris servers.
Cheers Jeremy
participants (3)
- henrik@hswn.dk
- jlaidman@rebel-it.com.au
- Tom.Stewart@landsend.com