This is the hobbitd incoming messages service being watched, correct (judging by the screen capture)? You said it started happening on two different servers, but now it is happening on additional graphs; what data are these new graphs showing?
On 11/30/07, Gary Baluha <gumby3203 at gmail.com> wrote:
On Nov 30, 2007 1:15 PM, Hubbard, Greg L <greg.hubbard at eds.com> wrote:
It sounds like you are zeroing in on the problem. Based on your other post (and this) it seems that the data is getting logged okay in the RRD, and that data is being faithfully reproduced by the graphs. The problem is that the data itself has unexpected values. So whatever is providing that data to the RRD is either faulty, or is in turn being misled by something else further upstream.
Yeah, I'm fairly confident now that it is the initial data being fed into the rrd file that is faulty. I'm still not sure what the initial "entry point" of this bad data is, though, nor why it is happening. I have a feeling that once I determine where the entry point is, that will lead me to the "why".
I don't remember where you said that this data was coming from. I know there can be a problem with "rollovers" when a signed integer is used as a counter and it grows to the point where the sign bit flips. This can cause a big jump in a reading if the software cannot handle the switch from 2,147,483,647 (hex 7FFFFFFF) to the next value (hex 80000000), which flips the sign bit for a signed 32-bit integer. This has been a problem in the SNMP world for YEARS.
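For illustration only, here is a minimal Python sketch of the rollover described above. It is not taken from Hobbit, RRDtool, or any SNMP tool; the function names are hypothetical, and it assumes the counter is stored in a signed 32-bit field and advances by less than 2**32 between two readings.

```python
INT32_MAX = 2_147_483_647  # hex 7FFFFFFF, largest signed 32-bit value


def to_int32(value):
    """Interpret the low 32 bits of value as a signed 32-bit integer."""
    return (value + 2**31) % 2**32 - 2**31


def naive_delta(previous, current):
    """Plain subtraction, which breaks when the sign bit flips."""
    return current - previous


def wrap_aware_delta(previous, current):
    """Difference modulo 2**32, so a single wrap does not cause a jump."""
    return (current - previous) % 2**32


# Hypothetical readings taken just before and just after the rollover.
before = to_int32(INT32_MAX)      # still positive
after = to_int32(INT32_MAX + 1)   # hex 80000000, now negative

print(before, after)
print(naive_delta(before, after))
print(wrap_aware_delta(before, after))
```

The naive difference comes out as a huge negative number even though the counter only advanced by one, which is the kind of spike that could end up in a graph if the collector does not account for the wrap.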
Hrm, that has been something vaguely on my mind. But I haven't really thought of that as _the_ reason why, since I don't know why there would be some sort of data rollover. We're talking about load average and disk space usage graphs that are showing invalid data. I'm also curious why it would have started all of a sudden, on two separate machines. But it does seem more and more like something like an integer rollover, or similar situation.
-- Josh Luthman Office: 937-552-2340 Direct: 937-552-2343 1100 Wayne St Suite 1337 Troy, OH 45373
Those who don't understand UNIX are condemned to reinvent it, poorly. --- Henry Spencer