We are having issues with RRD leaving 5-10 minute intervals with no data for items such as CPU load on various systems. Most of the time it happens three times in a row, for example at 2, 3 and 4 pm. I am not finding anything in the logs on either the client or the server. Some googling indicated that the issue may go away with the --no-cache option for RRD. I have added it to hobbitlaunch.cfg as follows:
hobbitlaunch.cfg: CMD hobbitd_channel --channel=status --log=$BBSERVERLOGS/rrd-status.log hobbitd_rrd --no-cache --extra-tests=cpucisco,ifaload,ifload,vload,wphlstat,wperrors --extra-script=/home/xymon/server/ext/extra-rrd.pl --rrddir=$BBVAR/rrd
hobbitlaunch.cfg: CMD hobbitd_channel --channel=data --log=$BBSERVERLOGS/rrd-data.log hobbitd_rrd --no-cache --extra-tests=mpstat,zonestat --extra-script=/home/xymon/server/ext/rrd_data.pl --rrddir=$BBVAR/rrd
This is on a 32-bit Red Hat system, and when I run ps -ef | grep rrd I see the following:
xymon 7635 7599 0 14:26 ? 00:00:01 hobbitd_channel --channel=status --log=/home/xymon/logs/rrd-status.log hobbitd_rrd --no-cache --extra-tests=cpucisco,ifaload,ifload,vload,wphlstat,wperrors --extra-script=/home/xymon/server/ext/extra-rrd.pl --rrddir=/home/xymon/data/rrd
xymon 7636 7599 0 14:26 ? 00:00:00 hobbitd_channel --channel=data --log=/home/xymon/logs/rrd-data.log hobbitd_rrd --no-cache --extra-tests=mpstat,zonestat --extra-script=/home/xymon/server/ext/rrd_data.pl --rrddir=/home/xymon/data/rrd
xymon 7672 7635 0 14:26 ? 00:00:07 hobbitd_rrd --no-cache --extra-tests=cpucisco,ifaload,ifload,vload,wphlstat,wperrors --extra-script=/home/xymon/server/ext/extra-rrd.pl --rrddir=/home/xymon/data/rrd
xymon 7681 7636 0 14:26 ? 00:00:03 hobbitd_rrd --no-cache --extra-tests=mpstat,zonestat --extra-script=/home/xymon/server/ext/rrd_data.pl --rrddir=/home/xymon/data/rrd
So it looks like the option is in effect, but looking in the tmp directory I still see the following:
srw-rw-rw- 1 xymon xymon 0 Nov 13 14:26 rrdctl.7672
srw-rw-rw- 1 xymon xymon 0 Nov 13 14:26 rrdctl.7681
When I stop and restart xymon I still get messages such as:
rrd-status.log:2009-11-13 14:26:09 Cache flush completed
rrd-status.log:2009-11-13 14:26:18 Peer not up, flushing message queue
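In case it helps pin down exactly when the gaps occur, here is a rough sketch of how they could be located from the text that `rrdtool fetch` prints. It assumes the usual fetch layout (a header line of data-source names, then rows of `timestamp: value`, with unknown samples shown as `nan` or `-nan`). The function name `find_gaps` and the sample data are made up for illustration and are not part of Xymon.

```python
def find_gaps(fetch_output):
    """Return (first_ts, last_ts) pairs for runs of rows that hold no data.

    Assumes text in the layout printed by `rrdtool fetch`: rows such as
    `1258138200: 1.2000000000e-01`, with unknown samples printed as
    `nan` or `-nan`. Header and blank lines are skipped.
    """
    gaps = []
    run_start = None
    run_end = None
    for line in fetch_output.splitlines():
        ts_text, sep, values_text = line.partition(":")
        if not sep or not ts_text.strip().isdigit():
            continue  # header line, blank line, or anything unexpected
        ts = int(ts_text)
        values = values_text.split()
        # A row counts as missing only if every data source is unknown.
        missing = bool(values) and all(
            v.lower().lstrip("+-") == "nan" for v in values
        )
        if missing:
            if run_start is None:
                run_start = ts
            run_end = ts
        elif run_start is not None:
            gaps.append((run_start, run_end))
            run_start = None
    if run_start is not None:
        gaps.append((run_start, run_end))
    return gaps


# Made-up sample in the fetch layout (5-minute step), for illustration only.
SAMPLE = """\
                             la

1258137900: 1.2000000000e-01
1258138200: nan
1258138500: nan
1258138800: 3.0000000000e-01
1258139100: -nan
"""

if __name__ == "__main__":
    for start, end in find_gaps(SAMPLE):
        print(start, end)
```

To run it against real data, the output of something like `rrdtool fetch <file>.rrd AVERAGE --start -1d` for one of the affected hosts could be passed to `find_gaps` in place of `SAMPLE`.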
So my question is: have I placed --no-cache in the wrong position on the startup command, or has that option been taken out of beta-2?
Thank you,
Tom