[hobbit] strange graph behavior - random machines & graphs
I'm almost certain now that this is a problem when Hobbit/rrdtool doesn't receive data when it is expecting it. In short, it seems that when a lack of data occurs, this huge number is inserted into the rrd database instead of NaN or 0. I'm still not sure where this number is generated from, though.
Possibly the maximum value for that data entry type?
Josh
On 12/3/07, Gary Baluha <gumby3203 at gmail.com> wrote:
I'm almost certain now that this is a problem when Hobbit/rrdtool doesn't receive data when it is expecting it. In short, it seems that when a lack of data occurs, instead of assigning NaN or 0, this huge number is inserted into rrd database instead. I'm still not sure where this number is generated from, though.
-- Josh Luthman Office: 937-552-2340 Direct: 937-552-2343 1100 Wayne St Suite 1337 Troy, OH 45373
Those who don't understand UNIX are condemned to reinvent it, poorly. --- Henry Spencer
I am now completely convinced that the strange behavior of the graphs is due to some bad data getting inserted into the .rrd database files. The bad data is always the same value: 5.1776682516e+170. That's what the value looks like when you run "rrdtool dump" on the .rrd database file.
I still have no idea where this value is coming from, but I have at least determined how to fix these graphs. I'm working on a script to do this, but for now I manually do an "rrdtool dump" of the file, change all bogus values to NaN, and then do an "rrdtool restore" from the modified XML file. To find the bogus values I basically search for "e+1": none of the values I trend generally get that large, so I know any such entries are just averaged values of correct data and the 5.17... number.
It would be nice to determine where this problem is coming from, though.
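For anyone who wants to check whether their own files are affected, here is a minimal shell sketch that lists the suspect rows in a dump. It assumes, as the workaround above does, that no legitimate value ever reaches 1e+100, and that the dump prints values in exponent notation like the 5.1776682516e+170 example. `list_bogus_rows` is a made-up helper name, not an rrdtool command.

```shell
# Print the lines of an "rrdtool dump" whose <v> value has a three-digit
# positive exponent, i.e. is 1e+100 or larger. Reads the XML on stdin.
# list_bogus_rows is a hypothetical helper name for this sketch.
list_bogus_rows() {
    grep -E '<v>[[:space:]]*[-+]?[0-9]+(\.[0-9]+)?e\+[1-9][0-9]{2}[[:space:]]*</v>'
}
```

Typical use would be something like `rrdtool dump clock.rrd | list_bogus_rows`. The timestamps on the matching lines may help correlate the bad entries with gaps in incoming data.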
Could you give a short example of a bogus and a changed (NaN) entry, just in case that is also happening to some of my data files?
/Thomas Kern /301-903-2211 (O) /301-905-6427 (M)
From: Gary Baluha [mailto:gumby3203 at gmail.com]
Sent: Wednesday, December 05, 2007 11:53 AM
To: hobbit at hswn.dk
Subject: Re: [hobbit] strange graph behavior - random machines & graphs
I am now completely convinced that the strange behavior of the graphs is due to some bad data getting inserted into the .rrd database files. The bad data is always the same value: 5.1776682516e+170. That's what the value looks like when you do an rrddump on the .rrd database file.
I still have no idea where this value is coming from, but I have at least determined how to fix these graphs. I'm working on a script to do this, but for now, I manually do an rrddump of the file, change all bogus values to NaN (basically, searching for "e+1", since none of the values I trend generally get that large, so I know these entries are just averaged values of correct data and the 5.17... number), and then do an rrdrestore from the modified xml file.
It would be nice to determine where this problem is coming from, though.
cd hobbit_data_dir/host_machine
rrdtool dump clock.rrd > clock.xml
I know that any value with a three-digit exponent ("e+1nn") is bogus, so I search for "e+1".
One of several bogus data lines:
<!-- 2007-11-26 19:00:00 EST / 1196121600 --> <row><v> 3.9551632477e+169</v></row>
Same line, changed to NaN (repeat for all affected lines):
<!-- 2007-11-26 19:00:00 EST / 1196121600 --> <row><v> NaN </v></row>
rrdtool restore clock.xml clock.rrd
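The hand edit above could probably be automated with sed. This is only a sketch under the same assumption (anything of 1e+100 or larger is bogus); `nan_out_bogus` is a hypothetical helper name, and the pattern is based solely on the dump lines shown in this message.

```shell
# Rewrite every <v> value of 1e+100 or larger to NaN.
# Reads "rrdtool dump" XML on stdin and writes the cleaned XML to stdout.
# nan_out_bogus is a hypothetical helper name for this sketch.
nan_out_bogus() {
    sed -E 's#<v>[[:space:]]*[-+]?[0-9]+(\.[0-9]+)?e\+[1-9][0-9]{2}[[:space:]]*</v>#<v> NaN </v>#g'
}
```

It would slot into the steps above as `rrdtool dump clock.rrd | nan_out_bogus > clock.xml`, followed by the restore. Depending on the rrdtool version, the restore may refuse to overwrite an existing clock.rrd, so restoring to a new file name and moving it into place afterwards is the cautious route.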
On Dec 5, 2007 11:57 AM, Kern, Thomas <Thomas.Kern at hq.doe.gov> wrote:
Could you give a short example of a bogus and a changed (NaN) entry, just in case that is also happening to some of my data files?
/Thomas Kern /301-903-2211 (O) /301-905-6427 (M)
I wrote a script to clean up these bogus values. Of course, if you have trend graphs where values large enough to be written as N.NNNe+1NN are valid, the script will have unexpected results. To run the script, you need to "cd" into the directory with the rrd files to be fixed.
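For readers of the archive without the attachment, a cleanup along these lines might look like the sketch below. It is not the script referred to in this message, just an illustration of the dump / edit / restore cycle applied to every .rrd file in the current directory. The function names are hypothetical, and the same caveat applies: any value of 1e+100 or larger is treated as bogus.

```shell
#!/bin/sh
# Sketch only: NaN-out bogus values in every .rrd file in the current directory.
# fix_rrd_file and fix_all_rrds are hypothetical names for this illustration.

fix_rrd_file() {
    rrd=$1
    xml="${rrd%.rrd}.xml"
    # Dump to XML, replacing any value of 1e+100 or larger with NaN.
    rrdtool dump "$rrd" |
        sed -E 's#<v>[[:space:]]*[-+]?[0-9]+(\.[0-9]+)?e\+[1-9][0-9]{2}[[:space:]]*</v>#<v> NaN </v>#g' > "$xml" || return 1
    # Restore into a new file so the original is only replaced on success.
    rrdtool restore "$xml" "$rrd.new" || return 1
    # Keep a backup of the original, swap in the cleaned file, drop the XML.
    cp -p "$rrd" "$rrd.bak" || return 1
    mv "$rrd.new" "$rrd" || return 1
    rm -f "$xml"
}

fix_all_rrds() {
    for rrd in *.rrd; do
        # If there are no .rrd files the glob stays literal; skip it.
        [ -f "$rrd" ] || continue
        fix_rrd_file "$rrd" || echo "failed: $rrd" >&2
    done
}
```

After a "cd" into the directory holding the rrd files, calling `fix_all_rrds` processes each file in turn. The restored files belong to whoever runs the script, so it is probably best run as the user that owns the Hobbit data files.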
On Dec 5, 2007 2:05 PM, Gary Baluha <gumby3203 at gmail.com> wrote:
cd hobbit_data_dir/host_machine
rrdtool dump clock.rrd > clock.xml
I know any number that shows up greater than "e+1nn" is bogus, so I search for "e+1".
One of several bogus data lines:
<!-- 2007-11-26 19:00:00 EST / 1196121600 --> <row><v> 3.9551632477e+169</v></row>
Same line, changed to NaN (repeat for all affected lines):
<!-- 2007-11-26 19:00:00 EST / 1196121600 --> <row><v> NaN </v></row>
rrdtool restore clock.xml clock.rrd
Very cool, Gary. Awesome to the max!
Thank you very much for sharing your experience and a fix!
Just for the record: when using this, double-check the rrdtool directory and the user:group permissions. Not everyone uses those same settings!
Again, thanks a lot!
On 12/5/07, Gary Baluha <gumby3203 at gmail.com> wrote:
I wrote a script to clean up these bogus values. Of course, if there are trend graphs where numbers large enough for NNNe+1NN to be valid, the script will have unexpected results. To run the script, you need to "cd" into the directory with the rrd files to be fixed.
On Dec 5, 2007 6:01 PM, Josh Luthman <josh at imaginenetworksllc.com> wrote:
Very cool, Gary. Awesome to the max!
Thank you very much for sharing your experience and a fix!
Just for the record - when using this double check the rrdtool dir and the user:group permissions - not everyone uses those same settings!
Yeah, I thought of that after I uploaded the script. It's a simple enough script that people can just modify the appropriate sections of code to suit their environments.
Again, thanks a lot!
You're fine - I just wanted it noted in the mailing list archive =)
On 12/6/07, Gary Baluha <gumby3203 at gmail.com> wrote:
Yeah, I thought of that after I uploaded the script. It's a simple enough script that people can just modify the appropriate sections of code to suit their environments.
This problem is definitely due to a lack of data. There are two hobbitd_rrd processes that run, as we all know. Why are there two processes, again?
One listens to the "data" channel and the other listens to the "status" channel?
From: Gary Baluha [mailto:gumby3203 at gmail.com]
Sent: Friday, December 07, 2007 2:13 PM
To: hobbit at hswn.dk
Subject: Re: [hobbit] strange graph behavior - random machines & graphs
This problem is definitely due to a lack of data.
There are two hobbitd_rrd processes that run, as we all know.
Why are there two processes, again?
I'm curious, is anyone else having this issue? For reference, I have attached a typical screen shot of one of the broken graphs; I think it's a better representation than an earlier screen shot I provided. If it's some bug in hobbit and/or rrdtool, I would think I'm not the only one with this problem.
To try and narrow down the problem some more, I got rid of all the old, unused rrd data files and directories in Hobbit's /data directory, and removed all the bogus data from all of them. Since the problem re-occurred, I can fairly confidently say it has nothing to do with having too many rrd files (hey, you never know).
On Dec 11, 2007 9:42 AM, Gary Baluha <gumby3203 at gmail.com> wrote:
I'm curious, is anyone else having this issue? For reference, I have attached a typical screen shot of one of the broken graphs; I think it's a better representation than an earlier screen shot I provided. If it's some bug in hobbit and/or rrdtool, I would think I'm not the only one with this problem.
Gary,
Are you cross-posting to the rrd lists as well? Someone there may be better able to tell you where to look, or give some idea of something Henrik may need to check in the code. Most of us are rrd consumers only, and even fewer see the same issue. I wish I were seeing it, since I like debugging troublesome issues...much more fun than just clicking "next" when prompted or './configure && make && make install'.
=G=
From: Gary Baluha [mailto:gumby3203 at gmail.com] Sent: Wednesday, December 12, 2007 10:15 AM To: hobbit at hswn.dk Subject: Re: [hobbit] strange graph behavior - random machines & graphs
To try and narrow down the problem some more, I got rid of all the old, unused rrd data files and directories in Hobbit's /data directory, and removed all the bogus data from all of them. Since the problem re-occurred, I can fairly confidently say it has nothing to do with having too many rrd files (hey, you never know).
I think a lot of us are thinking "speak for yourself, Gary" =)
I, too, enjoy a good challenge sometimes...but not when I'm too busy with other tasks, which is almost all the time =(
On 12/12/07, Galen Johnson <Galen.Johnson at sas.com> wrote:
Gary,
Are you cross-posting to the rrd lists as well? Someone there may be better able to tell you where to look..or give some idea of something Henrik may need to check in the code. Most of us are rrd consumers only and even fewer see the same issue. I wish I were since I like debugging troublesome issues…much more fun than just clicking "next" when prompted or './configure && make && make install'.
=G=
No, no...you misunderstood...I'm not saying that at all...just asking if he was pursuing that path as well.
=G=
From: Josh Luthman [mailto:josh at imaginenetworksllc.com] Sent: Wednesday, December 12, 2007 10:29 AM To: hobbit at hswn.dk Subject: Re: [hobbit] strange graph behavior - random machines & graphs
I think a lot of us are thinking "speak for yourself, Gary" =)
I, too, enjoy a good challenge some times...but not when I'm too busy with other tasks - almost all the time =(
I see.
So from what I can tell, it's up to the developer or an aide to fix this issue?
Gary did provide a workaround, still...
Josh
On 12/12/07, Galen Johnson <Galen.Johnson at sas.com> wrote:
No, no…you misunderstood…I'm not saying that at all…just asking if he was pursuing that path as well.
=G=
No, no…you misunderstood…I'm not saying that at all…just asking if he was pursuing that path as well.
Yes, I have posted a thread on the RRD forum (though I'm not doing a full cross-posting). I haven't gotten any responses back yet, and besides I'm not convinced it is a problem with RRD (or for that matter, Hobbit).
My main reason for updating this thread in the hobbit list is because much of it is Hobbit related (even if Hobbit itself isn't the cause).
=G=
*From:* Josh Luthman [mailto:josh at imaginenetworksllc.com] *Sent:* Wednesday, December 12, 2007 10:29 AM *To:* hobbit at hswn.dk *Subject:* Re: [hobbit] strange graph behavior - random machines & graphs
I think a lot of us are thinking "speak for yourself, Gary" =)
I, too, enjoy a good challenge some times...but not when I'm too busy with other tasks - almost all the time =(
Heheh, since I seem to be the only one with this problem so far (or at least, the only one to post about it), I'm doing most of the responding to my own thread. Oh well, I've learned quite a bit about how Hobbit and RRD both work.
This is certainly a good challenge, since so many factors are involved in this problem. Unfortunately for me, we have people here who might try to represent this problem as a reason to get off Hobbit and use an inferior pay-ware monitoring solution. Thankfully I figured out a work-around, so I'm hoping that will buy me enough time to finally figure this one out.
...And hopefully in the meantime, I don't cause too much irrelevant noise on this mailing list, responding to my own problem ;-)
We appreciate that...too many list users forget to reply with their fix and the debugging steps they took (this is a global phenomenon), so the next guy who goes looking is left wondering.
=G=
From: Gary Baluha [mailto:gumby3203 at gmail.com] Sent: Wednesday, December 12, 2007 1:46 PM To: hobbit at hswn.dk Subject: Re: [hobbit] strange graph behavior - random machines & graphs
Yes, I have posted a thread on the RRD forum (though I'm not doing a full cross-posting). I haven't gotten any responses back yet, and besides I'm not convinced it is a problem with RRD (or for that matter, Hobbit).
My main reason for updating this thread in the hobbit list is because much of it is Hobbit related (even if Hobbit itself isn't the cause).
Heheh, since I seem to be the only one with this problem so far (or at least, the only one to post about it), I'm doing most of the responding to my own thread. Oh well, I've learned quite a bit about how Hobbit and RRD both work.
This is certainly a good challenge, since so many factors are involved in this problem. Unfortunately for me, we have people here that might try to represent this problem as a reason to get off Hobbit and use an inferior pay-ware monitoring solution. Thankfully I figured out a work-around, so I'm hoping that will buy me enough time to finally figure this one out.
...And hopefully in the mean time, I don't cause too much irrelevant noise on this mailing list, responding to my own problem ;-)
I'm with Galen. Thanks for the documentation =)
On 12/12/07, Galen Johnson <Galen.Johnson at sas.com> wrote:
We appreciate that…too many list users forget to reply with their fix and the debugging steps they took (this is a global phenomenon), so the next guy who goes looking is left wondering.
=G=
*From:* Gary Baluha [mailto:gumby3203 at gmail.com] *Sent:* Wednesday, December 12, 2007 1:46 PM *To:* hobbit at hswn.dk *Subject:* Re: [hobbit] strange graph behavior - random machines & graphs
No, no…you misunderstood…I'm not saying that at all…just asking if he was pursuing that path as well.
Yes, I have posted a thread on the RRD forum (though I'm not doing a full cross-posting). I haven't gotten any responses back yet, and besides I'm not convinced it is a problem with RRD (or for that matter, Hobbit).
My main reason for updating this thread in the hobbit list is because much of it is Hobbit related (even if Hobbit itself isn't the cause).
=G=
*From:* Josh Luthman [mailto:josh at imaginenetworksllc.com] *Sent:* Wednesday, December 12, 2007 10:29 AM *To:* hobbit at hswn.dk *Subject:* Re: [hobbit] strange graph behavior - random machines & graphs
I think a lot of us are thinking "speak for yourself, Gary" =)
I, too, enjoy a good challenge sometimes...but not when I'm too busy with other tasks - almost all the time =(
Heheh, since I seem to be the only one with this problem so far (or at least, the only one to post about it), I'm doing most of the responding to my own thread. Oh well, I've learned quite a bit about how Hobbit and RRD both work.
This is certainly a good challenge, since so many factors are involved in this problem. Unfortunately for me, we have people here that might try to represent this problem as a reason to get off Hobbit and use an inferior pay-ware monitoring solution. Thankfully I figured out a work-around, so I'm hoping that will buy me enough time to finally figure this one out.
...And hopefully in the meantime, I don't cause too much irrelevant noise on this mailing list, responding to my own problem ;-)
On 12/12/07, *Galen Johnson* <Galen.Johnson at sas.com> wrote:
Gary,
Are you cross-posting to the rrd lists as well? Someone there may be better able to tell you where to look...or give some idea of something Henrik may need to check in the code. Most of us are rrd consumers only, and even fewer see the same issue. I wish I were, since I like debugging troublesome issues…much more fun than just clicking "next" when prompted or './configure && make && make install'.
=G=
*From:* Gary Baluha [mailto:gumby3203 at gmail.com] *Sent:* Wednesday, December 12, 2007 10:15 AM *To:* hobbit at hswn.dk *Subject:* Re: [hobbit] strange graph behavior - random machines & graphs
To try and narrow down the problem some more, I got rid of all the old, unused rrd data files and directories in Hobbit's /data directory, and removed all the bogus data from all of them. Since the problem re-occurred, I can fairly confidently say it has nothing to do with having too many rrd files (hey, you never know).
It's interesting that the CPU Load and Users and Processes graphs are the ones most likely to have this strange corruption. I have also seen it on a few Disk graphs, but not nearly as often as on the other two. Interestingly, the CPU Utilization, Network I/O, and TCP Connection Times graphs have _never_ had this corruption. I'd also like to say the Memory Utilization graph hasn't had this issue either, though I can't recall that with complete certainty.
I wonder what the main difference is between the 3 graphs that do have the issue and the 3 (possibly 4) graphs that have never exhibited it. There must be some physical difference, as I can't imagine it is all due purely to luck...
I thought I'd revisit this issue again. A new thought has occurred to me... Where does Hobbit generate the RRD files? I wonder what parameters Hobbit is using to pass to rrdtool, and if something there might be acting funny with some of the data I'm providing to that Hobbit module.
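One way to see what an RRD file was actually created with, without reading the Hobbit source, is to look at the data-source settings that `rrdtool info` reports. The sketch below is only an illustration: it assumes the output contains lines of the form `ds[name].key = value`, and the sample text, the data-source names `la` and `pct`, and both function names are hypothetical. A data source whose `max` is NaN has no upper limit, so rrdtool will store any value it is handed, however large.

```python
import re

# Assumption: `rrdtool info` prints data-source settings as lines such as
#   ds[la].type = "GAUGE"
#   ds[la].minimal_heartbeat = 600
DS_LINE = re.compile(r'^ds\[(?P<name>[^\]]+)\]\.(?P<key>\w+)\s*=\s*(?P<value>.*)$')
WANTED = ("type", "minimal_heartbeat", "min", "max")


def parse_ds_settings(info_text):
    """Collect type, heartbeat, min and max for each data source."""
    settings = {}
    for line in info_text.splitlines():
        match = DS_LINE.match(line.strip())
        if match and match.group("key") in WANTED:
            value = match.group("value").strip().strip('"')
            settings.setdefault(match.group("name"), {})[match.group("key")] = value
    return settings


def unbounded_sources(settings):
    """Names of data sources whose max is NaN, i.e. no upper limit is enforced."""
    return sorted(n for n, s in settings.items() if s.get("max", "NaN") == "NaN")


if __name__ == "__main__":
    # Hypothetical sample output, not taken from a real Hobbit installation.
    sample = (
        "step = 300\n"
        'ds[la].type = "GAUGE"\n'
        "ds[la].minimal_heartbeat = 600\n"
        "ds[la].min = 0.0000000000e+00\n"
        "ds[la].max = NaN\n"
        'ds[pct].type = "GAUGE"\n'
        "ds[pct].max = 1.0000000000e+02\n"
    )
    parsed = parse_ds_settings(sample)
    print(parsed)
    print("no upper limit:", unbounded_sources(parsed))
```

Comparing these settings between a graph that gets corrupted and one that never does might show whether the difference is in how the files were created.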
On Wed, Dec 19, 2007 at 10:59 AM, Gary Baluha <gumby3203 at gmail.com> wrote:
It's interesting that it seems the CPU Load and Users and Processes graphs are the graphs that are most likely to have this strange corruption. I have also seen it on a few Disk graphs, but not nearly as many as the other two graphs. Interestingly, the CPU Utilization, Network I/O, and TCP Connection Times graphs have _never_ had this corruption. I'd also like to say the Memory Utilization graph hasn't had this issue either, though I can't recall with complete certainty that that is the case.
I wonder what the main difference between the 3 graphs that do have the issue is, and the 3 (possibly 4) graphs that have never exhibited this issue. There must be some physical difference, as I can't imagine it is all due purely to luck...
If you look through the source, the RRD support modules are easy to spot. If you are using custom graphs, then you need to review the ncv method, or the "roll your own" method.
But, since "garbage in -> garbage out" you might be on the right path.
GLH
From: Gary Baluha [mailto:gumby3203 at gmail.com]
Sent: Monday, April 07, 2008 1:15 PM
To: hobbit at hswn.dk
Subject: Re: [hobbit] strange graph behavior - random machines & graphs
I thought I'd revisit this issue again. A new thought has occurred to me... Where does Hobbit generate the RRD files? I wonder what parameters Hobbit is using to pass to rrdtool, and if something there might be acting funny with some of the data I'm providing to that Hobbit module.
In my case, it seems the "garbage" that is going into the graphs is caused by a lack of data, rather than actual bad data. I'm specifically wondering if there's some time interval mix-up that is causing the issue. If anyone would like to see a current example of one of the graphs with bad data, I'll gladly provide a screenshot.
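To test the "lack of data" idea, the update times can be compared against the data source's heartbeat: rrdtool treats a data source as unknown once more than heartbeat seconds pass between two updates. The sketch below is a rough illustration only. The `find_gaps` name is made up, the timestamps are invented, and the 600-second heartbeat is an assumption that should be checked against the real file.

```python
def find_gaps(timestamps, heartbeat):
    """Return (previous, current, gap) for updates spaced more than heartbeat seconds apart."""
    ordered = sorted(timestamps)
    gaps = []
    for prev, cur in zip(ordered, ordered[1:]):
        if cur - prev > heartbeat:
            gaps.append((prev, cur, cur - prev))
    return gaps


if __name__ == "__main__":
    # Invented update times in seconds; one report is missing between 600 and 1500.
    updates = [0, 300, 600, 1500, 1800]
    for prev, cur, gap in find_gaps(updates, heartbeat=600):
        print(f"no update for {gap}s between {prev} and {cur}")
```

If the timestamps of the bogus rows in a dump line up with gaps like these, that would support the missing-data theory; if they do not, the time-interval angle is probably a dead end.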
On Mon, Apr 7, 2008 at 2:27 PM, Hubbard, Greg L <greg.hubbard at eds.com> wrote:
If you look through the source, the RRD support modules are easy to spot. If you are using custom graphs, then you need to review the ncv method, or the "roll your own" method.
But, since "garbage in -> garbage out" you might be on the right path.
GLH
Oh, and I'm using the NCV module.
On Mon, Apr 7, 2008 at 2:35 PM, Gary Baluha <gumby3203 at gmail.com> wrote:
In my case, it seems the "garbage" that is going into the graphs is caused by a lack of data, rather than actual bad data. I'm specifically wondering if there's some time interval mix-up that is causing the issue. If anyone would like to see a current example of one of the graphs with bad data, I'll gladly provide a screenshot.
participants (5)
- Galen.Johnson@sas.com
- greg.hubbard@eds.com
- gumby3203@gmail.com
- josh@imaginenetworksllc.com
- Thomas.Kern@hq.doe.gov