---------- Forwarded message ----------
From: Adam Goryachev <mailinglists at websitemanagers.com.au>

On 27/4/20 05:06, Gary Allen Vollink wrote:
Hi all,
I have a configuration which uses RAID meta-devices set up as raid1 over empty slots for GUI configuration and notification. As such, I have md0 and md1 showing up as fatal errors in Xymon. Again, this setup is standard for this installation. md2 and above are all normal and valid (and actually hold mounted filesystems).
I'd normally expect to be able to set up analysis.cfg to "something something IGNORE" for this machine. Like:
HOST=vault.home.vollink.com
    RAID md0 IGNORE
    RAID md1 IGNORE
Does such a thing exist (and I missed it or have the syntax wrong)? If not, /could/ such a thing exist?
I'm starting to become used to just having a RED screen (and that is dangerous).
If the answer to all of the above is 'no', then what is the best way to ignore all RAID for that machine?
Thank you much for any thoughts, Gary
You will need to share your /proc/mdstat and/or a pointer to which ext script you are using to monitor your md RAID. I suspect that your RAID arrays are defined as two-member RAID1 with one missing member; therefore, they would be expected to show as red, because they are failed.
You could either define the RAID arrays as RAID1 with only one member, or else define them as RAID0 with only one member.
Or, you could add the spare drives as spares, or simply not define them as RAID arrays until you actually need to use them.
Regards, Adam
Thank you for responding.
I'm going to guess that the answer to my actual question (is there a way to ignore individual md failures?) is "I don't know". To be clear: "I don't know" is acceptable. I read through the source code looking for a way and couldn't find one (and so many bits are auto-loaded that it's super hard to be sure enough to say "no"). I was hoping someone on-list would actually know, but I get why that might not be the case.
To the questions:
============================ /proc/mdstat ===========================
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md2 : active raid5 sda5[0] sdc5[2] sdb5[1]
      11711382912 blocks super 1.2 level 5, 64k chunk, algorithm 2 [3/3] [UUU]

md1 : active raid1 sda2[0] sdb2[1] sdc2[2]
      2097088 blocks [6/3] [UUU___]

md0 : active raid1 sda1[0] sdb1[1] sdc1[2]
      2490176 blocks [6/3] [UUU___]

unused devices: <none>
============================ /proc/mdstat ===========================
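[Archive readers: in the output above, `[6/3]` means six member slots are configured and three are active, and each `_` in `[UUU___]` marks a missing member. That is why md0 and md1 read as degraded while md2 (`[3/3] [UUU]`) does not. Below is a minimal Python sketch of reading those fields, assuming the mdstat layout shown here. `parse_mdstat` and `SAMPLE` are hypothetical names for illustration only; they are not part of Xymon or xymon-rclient.sh.]

```python
import re

# Array header line, e.g. "md1 : active raid1 sda2[0] sdb2[1] sdc2[2]"
ARRAY_RE = re.compile(r"^(md\d+) :")
# Status fields, e.g. "[6/3] [UUU___]" (configured/active, then per-slot flags)
STATUS_RE = re.compile(r"\[(\d+)/(\d+)\] \[([U_]+)\]")

# Sample text copied from the /proc/mdstat output in this message.
SAMPLE = """\
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md2 : active raid5 sda5[0] sdc5[2] sdb5[1]
      11711382912 blocks super 1.2 level 5, 64k chunk, algorithm 2 [3/3] [UUU]

md1 : active raid1 sda2[0] sdb2[1] sdc2[2]
      2097088 blocks [6/3] [UUU___]

md0 : active raid1 sda1[0] sdb1[1] sdc1[2]
      2490176 blocks [6/3] [UUU___]

unused devices: <none>
"""


def parse_mdstat(text):
    """Return {array: {"configured": int, "active": int, "degraded": bool}}."""
    arrays = {}
    current = None
    for line in text.splitlines():
        header = ARRAY_RE.match(line)
        if header:
            current = header.group(1)
            continue
        status = STATUS_RE.search(line)
        if status and current is not None:
            arrays[current] = {
                "configured": int(status.group(1)),
                "active": int(status.group(2)),
                # Any "_" means at least one member slot is missing.
                "degraded": "_" in status.group(3),
            }
    return arrays


if __name__ == "__main__":
    for name, info in sorted(parse_mdstat(SAMPLE).items()):
        print(name, info)
```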
I'm using the script here: http://www.it-eckert.com/blog/2015/agent-less-monitoring-with-xymon/ (xymon-rclient.sh).
Specifically, the platform is Synology and yes, Synology runs two raid1 arrays over all of the slots (even though some are empty). I could fix this easily by adding hard drives into the empty slots, but I specifically bought this unit so that I could expand it later. That is, I both understand that this is properly showing broken but unmounted RAIDs and I know why those RAIDs are broken (and thus why the errors are nominal in my setup).
I am still hoping that a failure state that is nominal would be something I'd be able to ignore (just as I can ignore specific libraries or individual filesystems).
The other choice for me is to entirely remove the mdstat portion of Eckert's script. (Sadly, there is nothing else for Synology monitoring that I can get to work at all, and that simple script otherwise covers all of what I need.) This means I won't be notified (through Xymon) if one of my drives does fail, but it's better than getting used to ignoring a RED background.
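[Archive readers: a possible middle ground between keeping and removing the mdstat portion would be to drop only the stanzas for the arrays whose degraded state is expected, before the text is handed to whatever evaluates it. Below is a rough Python sketch of that idea, assuming the usual mdstat layout (an `mdN :` header line followed by indented detail lines). `drop_arrays`, `MDSTAT`, and the ignore set are hypothetical; nothing here is an existing Xymon or xymon-rclient.sh feature, and I have not wired it into either.]

```python
import re

# Start of an array stanza, e.g. "md0 : active raid1 sda1[0] ..."
ARRAY_START = re.compile(r"^(md\d+) :")

# Shortened sample in the same layout as /proc/mdstat.
MDSTAT = """\
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md2 : active raid5 sda5[0] sdc5[2] sdb5[1]
      11711382912 blocks super 1.2 level 5, 64k chunk, algorithm 2 [3/3] [UUU]

md1 : active raid1 sda2[0] sdb2[1] sdc2[2]
      2097088 blocks [6/3] [UUU___]

md0 : active raid1 sda1[0] sdb1[1] sdc1[2]
      2490176 blocks [6/3] [UUU___]

unused devices: <none>
"""


def drop_arrays(mdstat_text, ignore):
    """Return mdstat_text with the stanzas of the named arrays removed."""
    kept = []
    skipping = False
    for line in mdstat_text.splitlines():
        header = ARRAY_START.match(line)
        if header:
            # A new stanza starts: skip it only if its array is ignored.
            skipping = header.group(1) in ignore
        elif line.strip() and not line.startswith((" ", "\t")):
            # Any other non-indented line (e.g. "unused devices") ends a stanza.
            skipping = False
        if not skipping:
            kept.append(line)
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    # md0 and md1 are the arrays whose degraded state is nominal here.
    print(drop_arrays(MDSTAT, {"md0", "md1"}), end="")
```

The trade-off is that a real failure in an ignored array would go unreported too, so the ignore set should be limited to arrays that hold nothing that matters.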
[Archive readers: It is okay to contact me directly with questions about my setup]
Thank you, Gary Allen Vollink