Users / Procs Graphing problem
All,
Long-term Big Brother user, new Hobbit convert.
I'm building some monitoring solutions at a place I'm working, and while doing so I have noticed an issue with one of the RRD graphs that I can't figure out. Every day at a specific time (different for each system) there is a 10-minute gap in the graph. I've done an rrd dump on the data and the gap appears as a NaN entry.
This is only affecting the users / procs graph. Anyone got any ideas? If it were affecting all the systems at the same time I might have an idea, but it seems to affect each system at a different time.
Regards,
Mike Rowell
On Wed, Feb 01, 2006 at 12:27:48PM -0000, Rowell, Mike wrote:
I'm building some monitoring solutions at a place I'm working, and while doing so I have noticed an issue with one of the RRD graphs that I can't figure out. Every day at a specific time (different for each system) there is a 10-minute gap in the graph. I've done an rrd dump on the data and the gap appears as a NaN entry.
This means that no data was being fed into the RRD file for 10-15 minutes.
This is only affecting the users / procs graph. Anyone got any ideas?
Could it be that you are rebooting these servers once a day? (I know, Unix folks rarely do that, but just in case.)
Is this with the Hobbit client, or the BB client reporting data?
Since it always happens at the same time for a given server, it would be interesting to see what messages are fed into Hobbit around that time. If you are running the Hobbit client, could you set up a cron job to fetch the client data around that time? It should run something like
wget "http://hobbitserver/cgi-bin/bb-hostsvc.sh?CLIENT=bad.client.name"
and store the output in a file where you can look at it later. Best thing would be if you could run this every minute for 15 minutes around the time this problem occurs.
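A minimal sketch of such a capture job, assuming a POSIX shell and wget on the machine that runs it. Only the bb-hostsvc.sh URL comes from the message above; the HOBBIT_URL and OUTDIR variables, the output directory, and the function names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: HOBBIT_URL, OUTDIR and all function names are assumptions.
HOBBIT_URL="${HOBBIT_URL:-http://hobbitserver/cgi-bin/bb-hostsvc.sh}"
OUTDIR="${OUTDIR:-/var/tmp/hobbit-capture}"

# Build the URL that returns the raw client data for one host.
client_url() {
    printf '%s?CLIENT=%s\n' "$HOBBIT_URL" "$1"
}

# Build a timestamped file name so every snapshot is kept separately.
snapshot_file() {
    printf '%s/%s.%s.txt\n' "$OUTDIR" "$1" "$(date +%Y%m%d-%H%M%S)"
}

# Fetch the client data once per minute, $2 times in total (default 15).
capture() {
    host=$1
    count=${2:-15}
    mkdir -p "$OUTDIR" || return 1
    i=0
    while [ "$i" -lt "$count" ]; do
        # The URL is quoted so the shell does not try to expand the "?".
        wget -q -O "$(snapshot_file "$host")" "$(client_url "$host")"
        i=$((i + 1))
        # No need to wait after the final snapshot.
        if [ "$i" -lt "$count" ]; then
            sleep 60
        fi
    done
}

# Example call, left commented out so sourcing the file has no side effects:
# capture bad.client.name 15
```

With the last line uncommented, a single cron entry that starts a few minutes before the usual gap would cover the whole window, and the timestamped files can then be compared with the time of the NaN entry in the RRD dump.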
If you are running the BB client, the interesting part is the "cpu" column data that is sent around that time. So something similar, except that the URL you should fetch is
http://hobbitserver/cgi-bin/bb-hostsvc.sh?HOSTSVC=bad,client,name.cpu
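Going by the example URL above, the dots in the host name are written as commas and the column name is appended after a final dot. A small helper along these lines could build that URL; hostsvc_url and HOBBIT_URL are hypothetical names, and the substitution rule is inferred only from the example, so check it against your own server.

```shell
#!/bin/sh
# Sketch only: HOBBIT_URL and hostsvc_url are assumed names.
HOBBIT_URL="${HOBBIT_URL:-http://hobbitserver/cgi-bin/bb-hostsvc.sh}"

# Turn host "bad.client.name" and column "cpu" into
# ".../bb-hostsvc.sh?HOSTSVC=bad,client,name.cpu".
hostsvc_url() {
    host=$(printf '%s' "$1" | tr '.' ',')
    printf '%s?HOSTSVC=%s.%s\n' "$HOBBIT_URL" "$host" "$2"
}

# Example fetch, left commented out so sourcing the file has no side effects:
# wget -q -O cpu-status.html "$(hostsvc_url bad.client.name cpu)"
```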
Regards, Henrik
Participants (2): henrik@hswn.dk, Mike.Rowell@viatel.com