All,
Please see below; there is a problem with disk monitoring on one of the servers. Can someone tell me if I did something wrong?
W]d Jul 28 10:34:31 EDT 2010 - Filesystems NOT ok
7% / (8816628% used) has reached the PANIC level (95%)
38% /u01 (90371708% used) has reached the PANIC level (95%)
2% /tmp (9254468% used) has reached the PANIC level (95%)
34% /usr (6261556% used) has reached the PANIC level (95%)
18% /opt (7775588% used) has reached the PANIC level (95%)
4% /var (13653064% used) has reached the PANIC level (95%)
3% /home (27514896% used) has reached the PANIC level (95%)
30% /boot (67864% used) has reached the PANIC level (95%)
14% /u02 (1697518148% used) has reached the PANIC level (95%)
94% /u03 (136865636% used) has reached the PANIC level (95%)
Filesystem 10
4-b]ocks Used Available Capacity Mounted on
/dev/sda9 9920592 591896 8816628 7% /
/dev/sda10 152435112 54195172 90371708 38% /u01
/dev/sda8 9920592 154056 9254468 2% /tmp
/dev/sda7 9920592 3146968 6261556 34% /usr
/dev/sda6 9920592 1632936 7775588 18% /opt
/dev/sda5 14877060 456092 13653064 4% /var
/dev/sda3 29753588 702880 27514896 3% /home
/dev/sda1 101086 28003 67864 30% /boot
/dev/mapper/VolGroup02-u02 2064204960 261831260 1697518148 14% /u02
/dev/mapper/VolGroup03-u03 2064204960 1822483772 136865636 94% /u03
-- Shailesh K. Paudyal shailesh.paudyal at gmail.com
On Wed, Jul 28, 2010 at 10:43 AM, Shailesh Paudyal < shailesh.paudyal at gmail.com> wrote:
...snip...
Which OS? Steve
The OS is Red Hat Linux 5.4.
On Wed, Jul 28, 2010 at 9:51 AM, Steve Holmes <sholmes42 at mac.com> wrote:
...snip...
On Wed, Jul 28, 2010 at 10:57 AM, Shailesh Paudyal < shailesh.paudyal at gmail.com> wrote:
...snip...
It appears that Xymon has slipped one field to the left in parsing the df output. The string at the beginning of each of the lines before the actual df output should be the name of the filesystem (plus an icon, but we'll ignore that for now). Instead it is using the available number as the percent used, which, of course, is huge.
I don't know if this is causing the problem but there is some funkiness with the first line of the df output. It is broken between the 10 and the 4 and there is a ']' instead of an 'l' in the word "blocks". Maybe this is a cut/paste error, but if not, it is certainly not right.
I think it should read something like
Filesystem 1K-blocks Used Available Use% Mounted on
If your df actually outputs a broken first line, I think that should be fixed first; then see if Xymon parses the output correctly.
Steve
-- The test of a democracy is not the magnificence of buildings or the speed of automobiles or the efficiency of air transportation, but rather the care given to the welfare of all the people. -Helen Adams Keller, lecturer and author (1880-1968)
Truth never damages a cause that is just. -Mohandas Karamchand Gandhi (1869-1948)
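The field slip Steve describes can be reproduced with a small sketch. This is not Xymon's actual parser; `disk_status` and its `mount_index` parameter are hypothetical names, and the only assumption is that each df row has the six whitespace-separated columns shown in the report. Reading the mount point one field too early turns the capacity column into part of the label and the available-blocks column into the "percent used".

```python
PANIC_LEVEL = 95  # threshold quoted in the alert text


def disk_status(df_line, mount_index=5):
    """Build a status line from one df data row.

    mount_index is where the parser believes the mount point starts.
    5 is correct for 'device total used available capacity mount';
    4 simulates a parser that has slipped one field to the left.
    """
    fields = df_line.split()
    label = " ".join(fields[mount_index:])
    percent = int(fields[mount_index - 1].rstrip("%"))
    if percent >= PANIC_LEVEL:
        return f"{label} ({percent}% used) has reached the PANIC level ({PANIC_LEVEL}%)"
    return f"{label} ({percent}% used) is below the PANIC level ({PANIC_LEVEL}%)"


row = "/dev/sda9 9920592 591896 8816628 7% /"
print(disk_status(row))                 # columns read correctly
print(disk_status(row, mount_index=4))  # slipped one field to the left
```

With `mount_index=4` the second call reproduces the first alert line from the original report, which is consistent with the one-field slip, although it does not explain why the slip happens.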
The output of df is as follows:
[root at localhost u03]# df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda9 9920592 591896 8816628 7% /
/dev/sda10 152435112 54216212 90350668 38% /u01
/dev/sda8 9920592 154052 9254472 2% /tmp
/dev/sda7 9920592 3146968 6261556 34% /usr
/dev/sda6 9920592 1632936 7775588 18% /opt
/dev/sda5 14877060 456172 13652984 4% /var
/dev/sda3 29753588 702880 27514896 3% /home
/dev/sda1 101086 28003 67864 30% /boot
tmpfs 74156180 13079592 61076588 18% /dev/shm
/dev/mapper/VolGroup02-u02 2064204960 261882740 1697466668 14% /u02
/dev/mapper/VolGroup03-u03 2064204960 1827234716 132114692 94% /u03
When I pasted the output from the Xymon screen, I deleted the icons. The original did have them; they looked like red smileys.
Thank you,
-shailesh
On Wed, Jul 28, 2010 at 10:21 AM, Steve Holmes <sholmes42 at mac.com> wrote:
...snip...
On Wed, Jul 28, 2010 at 11:29 AM, Shailesh Paudyal < shailesh.paudyal at gmail.com> wrote:
...snip...
Well, that looks good to me. I have no idea. Steve
Are you using Xymon 4.3.0 beta?
I have seen this before on that version. It seems to be caused by status messages that are too long. Check your hobbit test and see if it reports any status messages that are too long. I fixed this problem by increasing the maximum status message size (in hobbitserver.cfg). There are other discussions about this on the mailing list.
/Johan
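As a rough way to check whether a given status message would exceed the configured limit, the sketch below reads size limits from configuration text and compares a message against them. It assumes the limits are the `MAXMSG_*` settings (for example `MAXMSG_STATUS`) given in kB, which is how I recall them from the Xymon documentation; please verify the names and units against the hobbitserver.cfg man page for your version. The helper names and the sample values are hypothetical.

```python
import re


def read_maxmsg_settings(cfg_text):
    """Return {setting name: size in kB} for MAXMSG_* lines in the config text."""
    settings = {}
    for line in cfg_text.splitlines():
        # Matches e.g. MAXMSG_STATUS="256"; commented-out lines do not match.
        match = re.match(r'\s*(MAXMSG_\w+)\s*=\s*"?(\d+)"?', line)
        if match:
            settings[match.group(1)] = int(match.group(2))
    return settings


def is_oversized(message, limit_kb):
    """True if the encoded message is larger than limit_kb kilobytes."""
    return len(message.encode("utf-8")) > limit_kb * 1024


# Hypothetical sample values, not taken from the thread.
sample_cfg = 'MAXMSG_STATUS="256"\nMAXMSG_CLIENT="512"\n'
limits = read_maxmsg_settings(sample_cfg)
print(limits)
print(is_oversized("x" * (300 * 1024), limits["MAXMSG_STATUS"]))
```

Running a check like this over the raw status text of the largest tests (ports, procs and so on) would show which ones are near the limit before the value is raised.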
From: sholmes42 at gmail.com [mailto:sholmes42 at gmail.com] On Behalf Of Steve Holmes Sent: 28 July 2010 17:40 To: xymon at xymon.com Subject: Re: [xymon] Problem with disk monitoring
...snip...
Yes, I am using Xymon 4.3.0 beta. I am going to increase the message size and see; I will post if it fixes the problem.
Thanks
On Wed, Jul 28, 2010 at 10:50 AM, Johan Sjöberg < johan.sjoberg at deltamanagement.se> wrote:
...snip...
This is a somewhat old post, but I'm responding anyway ...
In <AANLkTinFdgiz2ie3NCxhuop8picZj6izZPdH6fESQfif at mail.gmail.com> Steve Holmes <sholmes42 at mac.com> writes:
...snip...
There is a bug somewhere in the Xymon 4.3.0-beta code with the "df" status handling. I've seen it cause random RRD files to appear for systems that don't have such filesystems, and occasionally it would also result in this behaviour where a disk status goes wild.
I haven't been able to nail it yet, mostly because it seems to happen very rarely and completely without any pattern. It would seem like some sort of memory corruption problem, but I've had the client-message handler running for days with valgrind (memory access checker) enabled, and it came up with nothing.
Very annoying.
Regards, Henrik
Thanks Henrik, but I still see the problem. Please see the following alert, which came from Xymon a week or so ago:
red Su] Aug 22 01:26:51 EDT 2010 - Filesystems NOT ok
&red 21% / (20461936% used) has reached the PANIC level (95%)
&red 1% /app (65009052% used) has reached the PANIC level (95%)
&red 1% /home (112112360% used) has reached the PANIC level (95%)
&red 6% /var (8933348% used) has reached the PANIC level (95%)
&red 1% /tmp (18638136% used) has reached the PANIC level (95%)
&red 49% /boot (48938% used) has reached the PANIC level (95%)
&red 11% /u01 (218810168% used) has reached the PANIC level (95%)
&red 7% /u04 (228560164% used) has reached the PANIC level (95%)
&red 35% /u02 (1154279480% used) has reached the PANIC level (95%)
&red 24% /old_u02 (1507070236% used) has reached the PANIC level (95%)
Filesystem 1
24-]locks Used Available Capacity Mounted on
/dev/sda5 27054004 5195620 20461936 21% /
/dev/sdb1 68814716 253696 65009052 1% /app
/dev/sdc2 118417044 192356 112112360 1% /home
/dev/sda3 9920624 475208 8933348 6% /var
/dev/sdc1 19840892 178616 18638136 1% /tmp
/dev/sda1 101086 46929 48938 49% /boot
/dev/mapper/VolGroup01-u01 258022788 26105832 218810168 11% /u01
/dev/mapper/VolGroup04-u04 258022788 16355836 228560164 7% /u04
/dev/mapper/VolGroup03-u03 1857784872 609135396 1154279480 35% /u02
/dev/mapper/VolGroup02-u02 2064204960 452279172 1507070236 24% /old_u02
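The header in this report is again split across two lines and contains a stray ']'. One way to catch that kind of damage before trusting the numbers is to compare the first line against the header that `df -P` normally prints. This is a minimal sketch that assumes the client collects disk data with `df -P` (the "Capacity" column suggests the POSIX format); `header_is_intact` is a hypothetical helper, not part of Xymon.

```python
# Header printed by df in POSIX (-P) mode with 1024-byte blocks.
EXPECTED_HEADER = "Filesystem 1024-blocks Used Available Capacity Mounted on".split()


def header_is_intact(report_lines):
    """Return True if the first report line splits into the expected header tokens."""
    return report_lines[0].split() == EXPECTED_HEADER


damaged = [
    "Filesystem 1",
    "24-]locks Used Available Capacity Mounted on",
    "/dev/sda5 27054004 5195620 20461936 21% /",
]
clean = [
    "Filesystem 1024-blocks Used Available Capacity Mounted on",
    "/dev/sda5 27054004 5195620 20461936 21% /",
]
print(header_is_intact(damaged))
print(header_is_intact(clean))
```

A check like this only detects the corruption; it says nothing about where in the pipeline the bytes were altered.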
On Thu, Sep 23, 2010 at 3:45 PM, Henrik Størner <henrik at hswn.dk> wrote:
...snip...
I think this is somehow related to oversized status messages. We were having problems with this on 4.3.0 beta, and we also had a lot of oversized status messages (ports etc). Since we increased the max message size, we have not seen the problem with the disk test.
/Johan
From: Shailesh Paudyal [mailto:shailesh.paudyal at gmail.com] Sent: 23 September 2010 22:57 To: xymon at xymon.com Subject: Re: [xymon] Problem with disk monitoring
...snip...
Is the 'W]d' below a cut-n-paste error, or is that the real output?
From: Shailesh Paudyal [shailesh.paudyal at gmail.com] Sent: Wednesday, July 28, 2010 7:43 AM To: xymon at xymon.com Subject: [xymon] Problem with disk monitoring
...snip...
W]d Jul 28 10:34:31 EDT 2010 - Filesystems NOT ok
7% / (8816628% used) has reached the PANIC level (95%)
...snip...
That's the real output!
On Wed, Jul 28, 2010 at 10:55 AM, Tim McCloskey <tm at freedom.com> wrote:
...snip...
participants (5)
- henrik@hswn.dk
- johan.sjoberg@deltamanagement.se
- shailesh.paudyal@gmail.com
- sholmes42@mac.com
- tm@freedom.com