[xymon] Problem with disk monitoring

23 Sep 2010


      This is a somewhat old post, but I'm responding anyway ...
In <AANLkTinFdgiz2ie3NCxhuop8picZj6izZPdH6fESQfif at mail.gmail.com> Steve Holmes <sholmes42 at mac.com> writes:
...
Please see below, there is a problem with disk monitoring on one of the
servers. Can someone tell me if I did something wrong?
W]d Jul 28 10:34:31 EDT 2010 - Filesystems NOT ok
7% / (8816628% used) has reached the PANIC level (95%)
38% /u01 (90371708% used) has reached the PANIC level (95%)
Filesystem         10
4-b]ocks      Used Available Capacity Mounted on
/dev/sda9              9920592    591896   8816628       7% /
/dev/sda10           152435112  54195172  90371708      38% /u01
/dev/sda8              9920592    154056   9254468       2% /tmp
...
It appears that Xymon has slipped one field to the left in parsing the df
output. The string at the beginning of each of the lines before the actual
df output should be the name of the filesystem (plus an icon, but we'll
ignore that for now). Then it is using the available number as the percent
used, which, of course, is huge.
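
To make the "slipped one field to the left" reading concrete, here is a small Python sketch. It is not Xymon's parser (the names parse_df_line, status and capacity_col are made up for illustration); it only models what happens when the usage percentage is read from the Available column instead of the Capacity column, using the "/" line from the report above.

```python
# Illustrative model of the symptom only; this is NOT Xymon's actual code.
# The df line is copied from the report; all function names are hypothetical.
DF_LINE = "/dev/sda9              9920592    591896   8816628       7% /"


def parse_df_line(line, capacity_col=4):
    """Split one df data line and read the usage figure from capacity_col."""
    fields = line.split()
    mount = fields[-1]  # the mount point is the last field
    pct_used = int(fields[capacity_col].rstrip("%"))
    return mount, pct_used


def status(line, capacity_col=4, panic=95):
    """Build a status line in the same style as the report."""
    mount, pct = parse_df_line(line, capacity_col)
    if pct >= panic:
        return f"{mount} ({pct}% used) has reached the PANIC level ({panic}%)"
    return f"{mount} ({pct}% used) is ok"


print(status(DF_LINE))                  # reads the Capacity column
print(status(DF_LINE, capacity_col=3))  # reads one column to the left (Available)
```

With the column index off by one, the sketch reports 8816628% used for "/", which is the same figure as in the broken status. It does not reproduce the stray "7%" at the start of the line, so it explains only part of what is shown.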
...
I don't know if this is causing the problem but there is some funkiness with
the first line of the df output. It is broken between the 10 and the 4 and
there is a ']' instead of an 'l' in the word "blocks". Maybe this is a
cut/paste error, but if not, it is certainly not right.
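
If the header really is damaged in the client data, a sanity check along these lines would catch it before any column parsing is attempted. This is only a sketch: the expected titles assume the POSIX "df -P" header, and header_looks_sane is a hypothetical helper, not something that exists in Xymon.

```python
# Hypothetical check, assuming the intact header is the POSIX "df -P" one.
EXPECTED_TITLES = ["Filesystem", "1024-blocks", "Used", "Available",
                   "Capacity", "Mounted on"]


def header_looks_sane(header_line):
    """Return True if one line holds every expected column title, in order."""
    pos = 0
    for title in EXPECTED_TITLES:
        found = header_line.find(title, pos)
        if found < 0:
            return False  # a title is missing, garbled, or on another line
        pos = found + len(title)
    return True


intact = "Filesystem         1024-blocks      Used Available Capacity Mounted on"
first_half = "Filesystem         10"
second_half = "4-b]ocks      Used Available Capacity Mounted on"

for line in (intact, first_half, second_half):
    print(header_looks_sane(line), repr(line))
```

Neither half of the header from the report passes the check, because the "1024-blocks" title is split across the line break and contains the ']' character.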
There is a bug somewhere in the Xymon 4.3.0-beta code with the "df"
status handling. I've seen it cause random RRD files to appear for
systems that don't have such filesystems, and occasionally it would
also result in this behaviour where a disk status goes wild.
I haven't been able to nail it yet, mostly because it seems to happen
very rarely and completely without any pattern. It would seem like
some sort of memory corruption problem, but I've had the client-message
handler running for days with valgrind (memory access checker) enabled,
and it came up with nothing.
Very annoying.
Regards,
Henrik

henrik＠hswn.dk