On Sat, 09 Oct 2010 08:25:55 -0400, Elizabeth Schwartz wrote:
> The native xymon alert config is easier to read than Big Brother but it doesn't free me up from writing elaborate rules. Right now it takes *25* rules to cover our shift changes, and that's just for one set of alerts.
> Look at Monday mornings. Nadja in Singapore started at 9 pm EST Sunday night and she's covering until 5:00 am EST. Sam in England starts at 4:00 am EST. The US guys start at 8:00 am EST. So I've got a rule from 9:00 pm to midnight Sundays, then 12:00 am-4:00 am Monday, another rule from 4:00 am to 5:00 am Monday, and a third rule from 5:00 am to 8:00 am when the US guys start.... it's just endless. There's another set for Tuesday-Thursday and then more for Friday, Saturday and Sunday.
I can understand that your alert-rules get quite complex, but then it *is* a rather complex environment you have.
Now, I don't know how your time-based rules are intertwined with which systems your people manage. But from the very limited description I have, it sounds like you should structure your alert rules around the people who are staffing each time period. Does that change a lot?
$SINGAPORE=nadja
$ENGLAND=sam
$US=phil,dan,tom,joe
Something like this is what I'd suggest:
TIME=0:2100:0500 MAIL $SINGAPORE
TIME=0400:1200 MAIL $ENGLAND
TIME=0800:1600 MAIL $US
If you need to distribute alerts further - say, the US guys each have a different group of servers - you can add this as an extra condition on each MAIL recipient, like
TIME=0800:1600
    MAIL joe HOST=server1,server2
    MAIL tom PAGE=california
    MAIL $US UNMATCHED
Then Joe would get alerts for those two servers only, Tom would get alerts for the California servers (if they're on one page called "california"), and everyone in $US would get all the alerts that didn't go to Joe or Tom.
The conditions can of course also relate to specific services, not just hosts.
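For example, here is a rough sketch that splits the US shift by test name instead of by host. It assumes a SERVICE= condition can be put on a recipient the same way HOST= and PAGE= are above, and the test names "http" and "disk" are only placeholders for whatever tests you actually run:

TIME=0800:1600
    MAIL joe SERVICE=http
    MAIL tom SERVICE=disk
    MAIL $US UNMATCHED

With that, Joe would get the http alerts, Tom the disk alerts, and everything else during that window would go to the whole $US group.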
Does that help?
I'd be interested to hear suggestions for a better way of configuring your alerts. It's done the way it is because it seemed flexible enough to handle my needs, and it was much easier to understand than the BB setup (I could never quite figure out how the advanced parts of the BB alert setup worked). But that doesn't mean it is "set in stone" for ever. I am quite open to suggestions on how it can be improved.
Regards, Henrik