Yellow->red escalation, bug or feature?
> Let's say I implement the 3-hour delay before sending an escalation notice. What should happen if the status is yellow for two hours, then goes red for 2h50m, dips back into yellow for 10 minutes and then goes back to red? Should the 2h50m count after the status was yellow for a while? Or does a 10-minute yellow status completely reset the duration counter for the almost-3-hours red status?
Thinking about this again (since xymon woke everyone up again this morning), I'm liking the idea of a RECOVERY= flag.
Seems like there are two kinds of alerts: those where yellow->red->yellow means things are not so bad (like disk space, which is the one I keep hitting) and those where yellow->red->yellow means you are looking at a larger problem (like, say, CPU load or other performance metrics). Being able to treat those two situations separately would be the biggest win.
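To make the two behaviours concrete, here is a rough Python sketch of how the duration counter could work. This is not Xymon code or Xymon configuration; the function names, the `recovery_grace` parameter (standing in for the proposed RECOVERY= value) and the 15-minute figure are all made up for illustration. A dip out of red that lasts no longer than the grace period pauses the counter; a longer dip resets it.

```python
from datetime import timedelta

def red_time(timeline, recovery_grace=timedelta(0)):
    """Accumulated red time at the end of a timeline.

    timeline is a chronological list of (color, duration) segments.
    recovery_grace is a hypothetical per-test setting: a non-red dip
    longer than this resets the counter, a shorter one only pauses it.
    """
    red = timedelta(0)
    dip = timedelta(0)  # length of the current run of non-red status
    for color, length in timeline:
        if color == "red":
            red += length
            dip = timedelta(0)
        else:
            dip += length
            if dip > recovery_grace:
                red = timedelta(0)  # real recovery: start counting over
    return red

def should_escalate(timeline, delay=timedelta(hours=3),
                    recovery_grace=timedelta(0)):
    """True once the accumulated red time reaches the escalation delay."""
    return red_time(timeline, recovery_grace) >= delay

# The scenario from the quoted question, assuming the status then
# stays red for another 20 minutes.
timeline = [
    ("yellow", timedelta(hours=2)),
    ("red", timedelta(hours=2, minutes=50)),
    ("yellow", timedelta(minutes=10)),
    ("red", timedelta(minutes=20)),
]

# Disk-space style: any dip to yellow resets the counter.
print(should_escalate(timeline))
# CPU-load style: a 10-minute dip is ignored with a 15-minute grace.
print(should_escalate(timeline, recovery_grace=timedelta(minutes=15)))
```

With a grace of zero the 10-minute yellow wipes out the 2h50m, so only the last 20 minutes count and nothing is escalated. With a 15-minute grace the red time adds up to 3h10m and the escalation goes out. Setting the grace per test is what would let disk space and CPU load be treated differently.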
Thanks,
Betsy