Hi All,
I'm just in the process of converting our old big brother monitor to xymon. I had a look at zabbix, nagios, groundworks, and pandorafms and found too many quirks, cruft and gotchas (plus the problem of open core systems). I returned to the bb clones and found xymon is both active and maintained. I must say Xymon is a well thought out, well executed step beyond big brother. Well done, Henrik.
Is there a way to have a hierarchy for alerts? For example, say we have a branch office with a print server, file server, router, switch, etc. I have a xymon entry for each (and multiples for each, i.e. poll the ftp port on the file server as well as just the conn monitoring). If the WAN link to the site dies, I get alerts for all of the above, whereas I should really just get one alert for the router. Or if the file server dies, I shouldn't get both ftp and file server alerts, just the server.
I know this wasn't available for big brother and I can't find anything in xymon. Does this exist and/or has this been considered?
For our big brother system, I created a perl script daemon that reads in the allevents log to create a "stateful" table of all monitored items. I also have a text file table of the relationships between the monitored items, using tabs:
core_switch wan_router site_router site_switch printer server ftp bbd_host local_server smtp user_subnet_switch printer
This table is read into a hashed array, the allevents log file is tailed to keep the state table current, and the script then awaits queries. When an alertable event comes in, the alert subsystem queries this daemon (using another perl script). The daemon checks whether any parent items are in a red state and, if so, the alert is discarded.
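A rough sketch of that parent check, written in Python rather than perl and not taken from the actual daemon. The item names are borrowed from the table above, but the nesting shown in `PARENTS` is only an assumed reading of it, and `ancestors` and `should_alert` are hypothetical helper names:

```python
# Assumed child -> parent map; the real relationships live in the tab-indented file.
PARENTS = {
    "wan_router": "core_switch",
    "site_router": "wan_router",
    "site_switch": "site_router",
    "server": "site_switch",
    "ftp": "server",
}


def ancestors(item, parents):
    """Yield each ancestor of item, nearest first; stop if a cycle is found."""
    seen = {item}
    current = parents.get(item)
    while current is not None and current not in seen:
        yield current
        seen.add(current)
        current = parents.get(current)


def should_alert(item, state, parents):
    """Return False when any ancestor of item is currently red."""
    return not any(state.get(a) == "red" for a in ancestors(item, parents))


# Example state table as it might look after the site router has failed.
state = {"site_router": "red", "server": "red", "ftp": "red"}
```

With this state, only `site_router` would pass the check. `server` and `ftp` are suppressed because `site_router` sits above both of them.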
This is effectively just a plugin and could be done more efficiently if integrated into the bbd. In terms of config files, having to maintain two files is not ideal. Currently the host listing and web layout are effectively combined in one file, so adding a hierarchy as above would be tricky. Perhaps a second "#" with the parent following, i.e.:
1.2.3.4 hostname.domain.com # smtp dns # core_switch.domain.com
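To show how little parsing the second "#" idea would need, here is a minimal Python sketch. It handles only the proposed line format (address, hostname, "#", tests, second "#", parent). It is not Xymon's real config parser, and `parse_host_line` is a made-up name:

```python
def parse_host_line(line):
    """Split 'ip hostname # tests # parent' into parts; the parent is optional."""
    fields = [part.strip() for part in line.split("#")]
    ip, hostname = fields[0].split()[:2]
    # Everything after the first '#' is the list of tests.
    tests = fields[1].split() if len(fields) > 1 else []
    # Anything after a second '#' is treated as the parent host.
    parent = fields[2] if len(fields) > 2 and fields[2] else None
    return {"ip": ip, "host": hostname, "tests": tests, "parent": parent}
```

A line without the second "#" simply yields a parent of `None`, so existing entries would not need to change.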
Anyway, this is just an initial query about this issue.
thanks and regards, Phil