Graphs stop updating 24 hours after client reboot; start again 24 hours later.
Hi all,
I need some help/suggestions to figure out why my "cpu load" and "users & processes" graphs stop updating about 24 hours after the systems reboot. The updates stop for anywhere from 12 to 24 hours, then simply start back up again. Only the "CPU load" and the "Users and Processes" graphs are having the problem; disk, memory, cpu utilization, network traffic don't miss a beat.
We have a number of identically configured systems. They all reboot around 00:30 local time on Wednesday mornings (they are in different time zones), and they all stop reporting cpu-load/users-and-processes graphs sometime Thursday morning, then start up again Thursday afternoon or Friday morning. They don't all stop and start at exactly the same time, but the majority do. I end up with about a 24-hour gap in my graphs every week.
The rrd files are all updated except for the la.rrd, procs.rrd, and users.rrd:

    -rw-r--r-- 1 hobbit hobbit  19552 Jan 22 16:56 clock.rrd
    -rw-r--r-- 1 hobbit hobbit  38536 Jan 22 16:56 disk,cvsrx.rrd
    -rw-r--r-- 1 hobbit hobbit  38536 Jan 22 16:56 disk,root.rrd
    -rw-r--r-- 1 hobbit hobbit  38536 Jan 22 16:56 ifstat.eth0.rrd
    -rw-r--r-- 1 hobbit hobbit  19552 Jan 22 00:38 la.rrd
    -rw-r--r-- 1 hobbit hobbit  19552 Jan 22 16:56 memory.actual.rrd
    -rw-r--r-- 1 hobbit hobbit  19552 Jan 22 16:56 memory.real.rrd
    -rw-r--r-- 1 hobbit hobbit  19552 Jan 22 16:56 memory.swap.rrd
    -rw-r--r-- 1 hobbit hobbit  57520 Jan 22 16:56 mysql.rrd
    -rw-r--r-- 1 hobbit hobbit 304312 Jan 22 16:56 netstat.rrd
    -rw-r--r-- 1 hobbit hobbit  19552 Jan 22 00:38 procs.rrd
    -rw-r--r-- 1 hobbit hobbit  19552 Jan 22 16:57 tcp.conn.rrd
    -rw-r--r-- 1 hobbit hobbit  19552 Jan 22 16:59 tcp.ssh.rrd
    -rw-r--r-- 1 hobbit hobbit  19552 Jan 22 00:38 users.rrd
    -rw-r--r-- 1 hobbit hobbit 323296 Jan 22 16:56 vmstat.rrd
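The same check can be automated instead of reading the directory listing by eye. The following is a minimal sketch in Python, not part of Hobbit: the function name find_stale_rrds and the 30-minute threshold are assumptions chosen for illustration, and the directory to scan is whatever holds the host's rrd files on your server.

```python
import os
import time


def find_stale_rrds(rrd_dir, max_age_seconds=1800, now=None):
    """Return (filename, age_in_seconds) for .rrd files not modified recently.

    rrd_dir is the per-host rrd directory (a hypothetical path on your server).
    max_age_seconds is an assumed threshold; pick one well above the normal
    update interval so healthy files are not reported.
    """
    now = time.time() if now is None else now
    stale = []
    for name in sorted(os.listdir(rrd_dir)):
        if not name.endswith(".rrd"):
            continue  # ignore anything that is not an rrd file
        age = now - os.path.getmtime(os.path.join(rrd_dir, name))
        if age > max_age_seconds:
            stale.append((name, int(age)))
    return stale
```

Run against a directory like the one listed above, this would be expected to report only the files whose timestamps lag behind the rest, which makes it easy to log exactly when the updates stop and resume on each host.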
I've restarted the hobbit client and the hobbit server; no help.
Any pointers/suggestions would be very welcome!
Tom
Tom Brand CVS/pharmacy
In <E38DCD6606C55F499A4125611AB8D99605C8F6B1 at cvsexbpd2.Corp.CVS.com> "Brand, Thomas R." <TRBrand at cvs.com> writes:
I need some help/suggestions to figure out why my "cpu load" and "users & processes" graphs stop updating about 24 hours after the systems reboot. The updates stop for anywhere from 12 to 24 hours, then simply start back up again. Only the "CPU load" and the "Users and Processes" graphs are having the problem; disk, memory, cpu utilization, network traffic don't miss a beat.
The only explanation I can come up with is that the format of the "cpu" status message is different for the first 24 hours after a reboot.
Could you send me an example of the cpu status shortly after a reboot, and one when the graphs are working?
What OS are these boxes?
Regards, Henrik
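To illustrate the kind of format dependence Henrik suspects, here is a minimal Python sketch. It is not Hobbit's actual parser. It assumes the "cpu" status contains an uptime-style line, and the sample lines in the comments are invented for illustration, modelled on typical Linux uptime output. The point is that the "up ..." portion changes shape as the system ages ("8 min", "16:26", "1 day, 15 min", "1 day, 16:26"), so a parser that depends on that portion can fail for some uptimes, while one that anchors only on the user count and load averages does not.

```python
import re

# Match only the tail of an uptime-style line. The "up ..." part is ignored
# on purpose because its layout varies with how long the system has been up.
TAIL_RE = re.compile(
    r"(\d+)\s+users?,\s+load average:\s*(\d+\.\d+),\s*(\d+\.\d+),\s*(\d+\.\d+)"
)


def parse_uptime_tail(line):
    """Return the user count and load averages, or None if the line does not match."""
    match = TAIL_RE.search(line)
    if match is None:
        return None
    return {
        "users": int(match.group(1)),
        "load": tuple(float(match.group(i)) for i in (2, 3, 4)),
    }


# Invented sample lines covering the different "up ..." layouts:
#   " 00:38:01 up 8 min,  1 user,  load average: 0.42, 0.31, 0.12"
#   " 16:56:01 up 16:26,  2 users,  load average: 0.00, 0.01, 0.05"
#   " 00:45:01 up 1 day, 15 min,  2 users,  load average: 0.10, 0.05, 0.01"
#   " 16:56:01 up 1 day, 16:26,  2 users,  load average: 0.00, 0.01, 0.05"
```

Comparing a status message captured shortly after a reboot with one captured while the graphs are working, as requested above, would show whether a difference of this kind is actually present.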
-----Original Message-----
From: Henrik Størner [mailto:henrik at hswn.dk]
Sent: Wednesday, January 28, 2009 7:23 AM
To: hobbit at hswn.dk
Subject: Re: [hobbit] Graphs stop update 24 hours after client reboot; start again 24 hours later.
Hi Henrik,
The systems are running SUSE Linux Enterprise Server 10 SP1 (SLES 10.1). It's pretty much a standard out-of-the-box OS install, nothing very odd. The hobbit server is also SLES 10.1.
Slight correction: the server reboots, the graphs show fine for 24 hours, then the graphs stop for 12-24 hours, then the graphs start again:
  reboot: Wed 00:30
  graph shows data until Thursday 00:30 and then stops
  graph data starts again 12-24 hours after stopping
I'm not sure what you mean by 'send me an example of the cpu status'... are you looking for a data file? a log file?
Thanks for taking time to respond, Tom