On 28 October 2014 09:58, Bill Arlofski <waa-hobbitml at revpol.com> wrote:
> Other ideas? Can I somehow hammer this square peg into a round hole?
You can create a dynamic file based on the logfile, and alert on that. For example, in client-local.cfg, something like this:
log:`LOG=/tmp/zlic.status; M=$(date +%M); [ $(expr $M % 10) -ge 5 ] && rm -f $LOG; grep "ArchivingAccountsLimit exceeded" /var/log/messages >> $LOG; [ -s $LOG ] && echo "$LOG"`:4096
I'm assuming that /var/log/messages is rotated daily. Each time the client runs (every 5 minutes), the matching entries from your current messages file are appended to zlic.status. If there are no matching entries, the filename is not echoed, so Xymon ignores the file and no alert is possible.
The trick here is that the zlic.status file is emptied only every second run (every 10 minutes) prior to appending the log entries. By shrinking the file size, logfetch thinks the file has been rotated, zeroes its status, and starts looking at the file from the beginning.
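To see the reset-every-second-run behaviour without waiting on the client, the same logic can be exercised by hand. This is only a sketch: the function name zlic_refresh, the sample log lines and the temporary paths are made up for illustration, and the minute is passed in as an argument instead of being read from date.

```shell
# Sketch only: zlic_refresh mirrors the scriptlet's logic, but takes the
# minute, source log and status file as arguments so it can be run against
# sample data. None of these names come from Xymon itself.
zlic_refresh() {
    minute=$1; src=$2; log=$3
    # In the second half of each ten minutes, remove the status file so that
    # it is rebuilt (and therefore shrinks) on this run.
    [ "$(expr "$minute" % 10)" -ge 5 ] && rm -f "$log"
    grep "ArchivingAccountsLimit exceeded" "$src" >> "$log"
    # Name the file only when it has content, as the scriptlet does.
    [ -s "$log" ] && echo "$log"
    return 0
}

tmp=$(mktemp -d)
printf '%s\n' \
    "Oct 28 09:00:01 host app: ArchivingAccountsLimit exceeded" \
    "Oct 28 09:01:00 host app: unrelated entry" > "$tmp/messages"

zlic_refresh 02 "$tmp/messages" "$tmp/zlic.status" >/dev/null
n1=$(wc -l < "$tmp/zlic.status" | tr -d ' ')   # first run: file created
zlic_refresh 04 "$tmp/messages" "$tmp/zlic.status" >/dev/null
n2=$(wc -l < "$tmp/zlic.status" | tr -d ' ')   # append-only run: file grows
zlic_refresh 07 "$tmp/messages" "$tmp/zlic.status" >/dev/null
n3=$(wc -l < "$tmp/zlic.status" | tr -d ' ')   # reset run: file shrinks
echo "$n1 $n2 $n3"   # prints: 1 2 1
rm -rf "$tmp"
```

The drop from 2 lines back to 1 is the shrink that makes logfetch treat the file as rotated.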
Note that the alert persists only until the messages file is next rotated, so it covers log entries from the last 0-24 hours. If an entry is logged just before rotation, you'll get an alert only between the time the entry is detected and the time the file is rotated, which could be just a few minutes, or no alert at all if the timing is unfavourable. If you want alerts to last longer than that, you could grep both the current and the previous messages file, so that you're alerting on any entries from the last 24-48 hours.
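Grepping both files might look something like the following sketch. It assumes the previous file is kept uncompressed as messages.1, which depends on how your log rotation is configured; the helper name zlic_matches and the sample data are hypothetical.

```shell
# Sketch only: print matching entries from the previous and current messages
# files, oldest first. Assumes the rotated copy is uncompressed and named
# messages.1; adjust for your own rotation settings.
zlic_matches() {
    dir=$1
    for f in "$dir/messages.1" "$dir/messages"; do
        # Skip a file that is missing, e.g. before the first rotation.
        if [ -r "$f" ]; then
            grep "ArchivingAccountsLimit exceeded" "$f"
        fi
    done
    return 0
}

tmp=$(mktemp -d)
echo "Oct 27 23:58:10 host app: ArchivingAccountsLimit exceeded" > "$tmp/messages.1"
echo "Oct 28 00:03:00 host app: unrelated entry" > "$tmp/messages"
matches=$(zlic_matches "$tmp" | wc -l | tr -d ' ')
echo "$matches"   # the entry logged just before rotation is still counted
rm -rf "$tmp"
```

In the scriptlet itself, the equivalent would be to give grep both file names, adding -h (supported by GNU and BSD grep) so that matching lines are not prefixed with the file name.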
Another way to do this is to use a "file:" definition, similarly creating a status file and then alarming on the file's size (non-zero indicating an alertable log entry). For example:
file:`LOG=/tmp/zlic.status; grep "ArchivingAccountsLimit exceeded" /var/log/messages > $LOG; echo $LOG`
Then in analysis.cfg, create a matching entry and alert on size>0. A downside of this approach is that the status message is particularly unhelpful, along the lines of "FILE /tmp/zlic.status red size >0".
A third, similar way to do this is to create a file that exists only while the licensing error is absent from the log. Like so:
file:`LOG=/tmp/zlic.OK; grep "ArchivingAccountsLimit exceeded" /var/log/messages >/dev/null && rm -f $LOG || touch $LOG; echo $LOG`
Then in analysis.cfg, create a matching entry and alert on "noexist".
Yet another way to do this is to use a pseudo-file to generate a status message. For example:
file:`COL=green; MSG="licensing OK"; LOGS=$(grep "ArchivingAccountsLimit exceeded" /var/log/messages); [ "$LOGS" ] && { COL=red; MSG="licensing error"; }; echo "status ${MACHINE}.zlic $COL $(date) $MSG" | $XYMON $XYMSRV @`
This pseudo-file produces no output, so Xymon has no file to check and simply ignores the entry; all that matters is the side effect of the $XYMON command that is run along the way. This is tantamount to having a client-side ext script, and you may simply prefer to do that, but this approach can be deployed centrally.
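For comparison, the ext-script version of the same check might look roughly like this. It is an untested sketch: the zlic_status function name and the localhost fallback are my own inventions, and it assumes XYMON, XYMSRV and MACHINE are exported by the Xymon client environment, as they are for the scriptlet above. An ext script would also need its own task entry on the client before it runs.

```shell
#!/bin/sh
# Sketch only: hypothetical client ext script for a "zlic" status column.
# XYMON, XYMSRV and MACHINE are assumed to come from the Xymon client
# environment; the localhost fallback is just for running it by hand.

zlic_status() {
    messages=$1
    col=green; msg="licensing OK"
    if grep "ArchivingAccountsLimit exceeded" "$messages" >/dev/null 2>&1; then
        col=red; msg="licensing error"
    fi
    echo "status ${MACHINE:-localhost}.zlic $col $(date) $msg"
}

# Send the status only when the Xymon variables are set; otherwise just
# print the message so it can be inspected.
if [ -n "$XYMON" ] && [ -n "$XYMSRV" ]; then
    zlic_status /var/log/messages | "$XYMON" "$XYMSRV" @
else
    zlic_status /var/log/messages
fi
```

Run outside the client environment, it only prints the status line, which is a quick way to check the colour and message before letting it report for real.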
A few notes:
- None of these specific examples has been tested, and they may contain syntax errors, but scriptlets like these have been used on production systems.
- Inside the scriptlets I deliberately avoided using colons and backticks, because they are interpreted by the logfetch binary and would break the scriptlets.
- These scriptlets take up to 15 minutes to start reporting after being added to client-local.cfg. When I'm testing this sort of thing, I like to bring up a xymoncmd shell, paste in the bits between the backticks, and look for errors or unexpected output.
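A rough pre-flight check along those lines, before pasting anything into a xymoncmd shell, could be something like the sketch below. check_scriptlet is a hypothetical helper, not part of Xymon: it only looks for the two problem characters and asks sh to parse the text without running it.

```shell
# Sketch only: check_scriptlet is a made-up helper, not a Xymon tool.
check_scriptlet() {
    s=$1
    # Colons and backticks inside the scriptlet would be interpreted by logfetch.
    case $s in
        *:*|*\`*) echo "contains a colon or backtick"; return 1 ;;
    esac
    # sh -n parses the commands without executing them.
    sh -n -c "$s" || return 2
    echo "no obvious problems"
}

check_scriptlet 'LOG=/tmp/zlic.OK; grep "ArchivingAccountsLimit exceeded" /var/log/messages >/dev/null && rm -f $LOG || touch $LOG; echo $LOG'
```

This catches only syntax slips and forbidden characters; it says nothing about whether the scriptlet does what you want, so running it by hand is still worthwhile.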
J