On 4 November 2014 07:16, Nicole Beck <nskyrca at syr.edu> wrote:
> Our hobbit-alerts.cfg file has “DURATION>1m REPEAT=5m” for the msgs test for that machine.
You've configured REPEAT=5m, which means Xymon will resend the alert every 5 minutes until the status returns to green. Is this what you want?
This is a different issue to "msgs" staying yellow for more than 5 minutes. Nearly all of my "msgs" events last for 5 minutes.
Your symptoms are consistent with 6 or more client data messages containing the same (or new) log messages. So I think you should look at the client data when it next occurs and see if it's being updated from one client data message to the next.
It's interesting that the alert emails have the same log entries, suggesting that the state mechanism is not working on the Xymon client. This would happen if something was erasing the logfetch state file on the Xymon client, named $XYMONTMP/logfetch.$MACHINEDOTS.status. If logfetch doesn't know where it got up to in a logfile, it has to start from the beginning each time, and it will report the same messages in the client data, each time it runs.
Another possibility, though unlikely, is that the logfile is being shortened each time. When logfetch detects that a logfile is shorter than it was the last time it ran, it assumes that the logfile has been rotated, so it resets its state and goes back to the start of the logfile. How is the logfile being generated?
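To illustrate the two failure modes above, here is a rough Python sketch of an offset-tracking log reader. It is not logfetch's actual code; the function name and the single-offset state file format are assumptions made for the example. It only shows why a missing state file or a shrinking logfile causes the same lines to be reported again.

```python
import os

def read_new_lines(log_path, state_path):
    """Return the lines added to log_path since the previous call.

    The byte offset reached so far is kept in state_path (hypothetical format:
    a single integer). This is an illustration only, not how logfetch is written.
    """
    try:
        with open(state_path) as f:
            offset = int(f.read().strip() or 0)
    except (FileNotFoundError, ValueError):
        # No usable state: we don't know where we got up to, so start over.
        offset = 0

    if os.path.getsize(log_path) < offset:
        # File is shorter than last time: assume it was rotated and start over.
        offset = 0

    with open(log_path, "rb") as f:
        f.seek(offset)
        data = f.read()

    # Remember how far we read for the next run.
    with open(state_path, "w") as f:
        f.write(str(offset + len(data)))

    return data.decode(errors="replace").splitlines()
```

With this model, deleting the state file between runs, or truncating the logfile, makes the next call return lines that were already reported, which matches the repeated log entries in the alert emails.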
Cheers,
Jeremy