[xymon] Repair RRD data
Hello
I have many NCV graphs, and they are often "dotted" instead of being drawn as lines. It looks as if some data is missing from the RRD files (see attachment).
- How can I correct this without losing all the data?
- How did it happen?
I have searched for "rrd repair tools", but nothing seems to work on my server, or it's PEBCAK ;-)
(I hope my English is not too bad.)
Hi Diffusion
This is most likely a problem with the data source feeding into Xymon, in which case the RRD file will have times with no data, shown as "nan" (not a number) in "rrdtool fetch <filename> AVERAGE" output. Note that it's common to see one or two "NaN" records at the end of each timescale in the RRD file (and a bunch at the start also, if the host has been added relatively recently), but it's less common to have numbers intermingled with NaN lines. Here's an example:
$ rrdtool fetch disk,root.rrd AVERAGE
                         pct             used

1718868300: 8.6000000000e+01 6.1474990000e+06
1718868600: 8.6000000000e+01 6.1474990000e+06
1718868900: -nan -nan
1718869200: -nan -nan
1718869500: -nan -nan
1718869800: 8.6000000000e+01 6.1474990000e+06
1718870100: 8.6000000000e+01 6.1474990000e+06
1718870400: 8.6000000000e+01 6.1474990000e+06
1718870700: 8.6000000000e+01 6.1474990000e+06
...
This example is a sequence of data samples with numbers, then nan due to missing samples, and then more numbers. If you find this in your RRD files, then the data doesn't exist, and there's no way to repair it. The first column is epoch time, which can be converted to local time using (on Linux or with GNU date) "date --date @<epochtime>". This might be helpful in correlating the nan entries with the gaps in your graphs.
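To make that correlation less tedious, the "rrdtool fetch" output can be scanned for runs of nan rows. The following is only a minimal sketch, not part of Xymon or rrdtool: the function names parse_fetch and find_gaps are made up for illustration, and it assumes the output has been captured as text in the "<epoch>: <value> <value> ..." layout shown above.

```python
import math
from datetime import datetime


def parse_fetch(text):
    """Parse captured `rrdtool fetch` output into (epoch, [values]) tuples."""
    rows = []
    for line in text.splitlines():
        stamp, sep, rest = line.partition(":")
        if not sep or not stamp.strip().isdigit():
            continue  # skip the header line with DS names and blank lines
        # Some locales print a decimal comma; float() also accepts "-nan".
        values = [float(v.replace(",", ".")) for v in rest.split()]
        rows.append((int(stamp), values))
    return rows


def find_gaps(rows):
    """Return (first_epoch, last_epoch) for each run of rows that are all NaN."""
    gaps = []
    start = prev = None
    for epoch, values in rows:
        missing = bool(values) and all(math.isnan(v) for v in values)
        if missing:
            if start is None:
                start = epoch
            prev = epoch
        elif start is not None:
            gaps.append((start, prev))
            start = None
    if start is not None:
        gaps.append((start, prev))
    return gaps


# Sample text copied from the fetch output quoted in this thread.
SAMPLE = """\
                         pct             used

1718868300: 8.6000000000e+01 6.1474990000e+06
1718868600: 8.6000000000e+01 6.1474990000e+06
1718868900: -nan -nan
1718869200: -nan -nan
1718869500: -nan -nan
1718869800: 8.6000000000e+01 6.1474990000e+06
"""

for first, last in find_gaps(parse_fetch(SAMPLE)):
    # fromtimestamp() converts epoch seconds to the machine's local time.
    print(f"gap from {datetime.fromtimestamp(first)} to {datetime.fromtimestamp(last)}")
```

In practice the text would come from running "rrdtool fetch <filename> AVERAGE" and saving its output. A trailing run of nan rows is reported too, so the one or two expected nan records at the end of a timescale will show up as a final gap.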
In my experience, this is often caused by a connectivity problem between the client and the server. You may find that data from the client (e.g. disk, cpu) has gaps, but data from xymonnet probes (e.g. conn, http, dns) has no gaps. This is consistent with missing client data samples.
You might find log messages in xymond.log that help identify a problem. Also on the client, the xymonclient.log file might be worth a look.
Cheers Jeremy
On Fri, 21 Jun 2024 at 01:16, <diffusion at bulot-fr.com> wrote:
Hello
I have many NCV graphs, and they are often "dotted" instead of being drawn as lines. It looks as if some data is missing from the RRD files (see attachment).
- How can I correct this without losing all the data?
- How did it happen?
I have searched for "rrd repair tools", but nothing seems to work on my server, or it's PEBCAK ;-)
(I hope my English is not too bad.)
Xymon mailing list Xymon at xymon.com http://lists.xymon.com/mailman/listinfo/xymon
On Fri, 21 Jun 2024 10:23:24 +1000, Jeremy Laidman <jeremy at laidman.org> wrote:
This is most likely a problem with the data source feeding into Xymon, in which case the RRD file will have times with no data, shown as "nan" (not a number) in "rrdtool fetch <filename> AVERAGE" output. Note that it's common to see one or two "NaN" records at the end of each timescale in the RRD file (and a bunch at the start also, if the host has been added relatively recently), but it's less common to have numbers intermingled with NaN lines. Here's an example:
$ rrdtool fetch "${RRDBASEHOST}/ups,BCHARGE.rrd" AVERAGE | tail -15
1719046500: 1,0000000000e+02
1719046800: 1,0000000000e+02
1719047100: 1,0000000000e+02
1719047400: 1,0000000000e+02
1719047700: 1,0000000000e+02
1719048000: 1,0000000000e+02
1719048300: 1,0000000000e+02
1719048600: 1,0000000000e+02
1719048900: -nan
1719049200: -nan
1719049500: -nan
1719049800: -nan
1719050100: 1,0000000000e+02
1719050400: -nan
1719050700: -nan
You might find log messages in xymond.log that help identify a problem. Also on the client, the xymonclient.log file might be worth a look.
OK, I'll have a look. At first glance there is nothing, but I'll add verbosity to my "ups/bcharge" check (the easiest one to inspect, and the one with the least important history).
Gaps like that happen when there's missing data. There's not a lot you can do about that.
If you really, really want to make the gaps go away, you can try dumping out the RRD to XML, editing the dump, then loading it back in.
rrdtool dump xymon.rrd
After getting past the header, you'll see the actual data in this form:
<!-- 2024-06-19 18:55:00 EDT / 1718837700 --> <row><v>4.0000000000e+00</v><v>3.0000000000e+01</v></row>
<!-- 2024-06-19 19:00:00 EDT / 1718838000 --> <row><v>4.0000000000e+00</v><v>3.0000000000e+01</v></row>
<!-- 2024-06-19 19:05:00 EDT / 1718838300 --> <row><v>4.0000000000e+00</v><v>3.0000000000e+01</v></row>
<!-- 2024-06-19 19:10:00 EDT / 1718838600 --> <row><v>4.0000000000e+00</v><v>3.0000000000e+01</v></row>
<!-- 2024-06-19 19:15:00 EDT / 1718838900 --> <row><v>4.0000000000e+00</v><v>3.0000000000e+01</v></row>
I think where there's a missing value, you'll see "NaN" or something similar. You could rewrite the empty values with an average of the row above and below. After saving the file, import it back into a new RRD with "rrdtool restore". The problem with doing all that is you have to get it done between samples being written, which typically happens at 5-minute intervals. If you don't swap the old RRD for the new one in time, you'll miss the next sample.
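The editing step described above could be scripted instead of done by hand. What follows is a rough sketch under stated assumptions, not a tested repair tool: the names interpolate, fill_database and fill_dump are invented for illustration, and the input is assumed to be XML text as produced by "rrdtool dump", with one <row> per line and "NaN" for missing values. A single missing row gets the average of its neighbours; longer runs are filled by linear interpolation, which is a generalisation of that suggestion. Keep in mind that the filled values are invented data, so work on a copy of the RRD.

```python
import math
import re

ROW_RE = re.compile(r"<row>(.*?)</row>")
VAL_RE = re.compile(r"<v>\s*(.*?)\s*</v>")
DB_RE = re.compile(r"<database>.*?</database>", re.DOTALL)


def interpolate(column):
    """Fill interior NaN runs in a list of floats by linear interpolation."""
    filled = list(column)
    last_good = None  # index of the most recent non-NaN value
    for i, value in enumerate(column):
        if math.isnan(value):
            continue
        if last_good is not None and i - last_good > 1:
            step = (value - column[last_good]) / (i - last_good)
            for j in range(last_good + 1, i):
                filled[j] = column[last_good] + step * (j - last_good)
        last_good = i
    return filled  # leading and trailing NaN runs are left untouched


def fill_database(block):
    """Interpolate NaN values inside one <database>...</database> block."""
    rows = [[float(v) for v in VAL_RE.findall(m.group(1))]
            for m in ROW_RE.finditer(block)]
    if not rows:
        return block
    # Work per data source (column), then turn the columns back into rows.
    columns = [interpolate(list(col)) for col in zip(*rows)]
    fixed = iter(zip(*columns))

    def rebuild(_match):
        cells = ("NaN" if math.isnan(v) else f"{v:.10e}" for v in next(fixed))
        return "<row>" + "".join(f"<v>{c}</v>" for c in cells) + "</row>"

    return ROW_RE.sub(rebuild, block)


def fill_dump(xml_text):
    """Handle each RRA separately so values never leak across archives."""
    return DB_RE.sub(lambda m: fill_database(m.group(0)), xml_text)


# Tiny hand-made sample in the dump layout (values are illustrative only).
DUMP = """<database>
<!-- sample --> <row><v>9.8000000000e+01</v></row>
<!-- sample --> <row><v>NaN</v></row>
<!-- sample --> <row><v>1.0000000000e+02</v></row>
<!-- sample --> <row><v>NaN</v></row>
</database>
"""

print(fill_dump(DUMP))
```

A possible workflow would be "rrdtool dump old.rrd > old.xml", then running the script to write a fixed XML file, then "rrdtool restore fixed.xml new.rrd", and finally swapping the files between two samples as described above. The file names here are placeholders.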
Ralph Mitchell
On Thu, 20 Jun 2024 20:35:31 -0400, Ralph M <ralphmitchell at gmail.com> wrote:
Gaps like that happen when there's missing data. There's not a lot you can do about that.
I think so too (as does Jeremy Laidman's previous answer).
If you really, really want to make the gaps go away, you can try dumping out the RRD to XML, editing the dump, then loading it back in.
rrdtool dump xymon.rrd
$ rrdtool dump "${RRDBASEHOST}/ups,BCHARGE.rrd" | tail -7 | sed 's#.*-->\(.*\)# \1#'
 <row><v>9.9378222222e+01</v></row>
 <row><v>NaN</v></row>
 <row><v>1.0000000000e+02</v></row>
 <row><v>1.0000000000e+02</v></row>
</database>
</rra>
</rrd>
I'll try to find the cause of the trouble, and afterwards I'll try to fill in the missing data with plausible values.
participants (3)
- diffusion@bulot-fr.com
- jeremy@laidman.org
- ralphmitchell@gmail.com