On Fri, Jun 02, 2006 at 11:03:52AM -0500, Jeff Newman wrote:
Is there a facility already in place, or a way to graph the number of "hits" returned by a pattern match for a log file?
For instance:
I am checking xyz log file for the word "wrap". It would be *very* useful to have a graph that shows the number of times that word showed up between the previous check and the current check.
No, there isn't.
This could be very useful to illustrate, say, a disk dying (one blip of a bad read or something would be one thing, but looking at a graph over time that shows 1 blip one week, 10 the next, and 20 the week after that would indicate the disk was almost dead) etc...
Hobbit only looks at log entries over a 30-minute period, so we would have to extend that significantly. So this would have to be done at the client side rather than on the server. (Not a problem, I'm just thinking out loud).
Right now, the only way I have to do this is with a client side script that runs in a constant loop:
while true; do
    NUM=`grep "Buffer wrapped" /quotes/env/errlog | wc -l | sed 's/ *//g'`
    if [ $NUM -gt $INITIALNUM ] ; then
        WRAP_NUM=`expr $NUM - $INITIALNUM`
        $BB $BBDISP "status $MACHINE.wraps green `date` `echo "wraps:$WRAP_NUM"`"
        INITIALNUM=$NUM
    else
        OKNUM=0
        $BB $BBDISP "status $MACHINE.wraps green `date` `echo "wraps:$OKNUM"`"
    fi
If all that you want is the graph and not alerts, then I wonder if it couldn't be done more easily. Just do the "grep" and report the number like you do now. Then send it into the NCV handler, with a dataset definition that uses the DERIVE datatype (which is the default, btw). Then RRDtool should handle all of the "subtract current value from previous value if it's greater, else ..." stuff and you needn't worry about it.
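A rough sketch of that idea, in case it helps: the script reports the running total on every run and leaves the subtraction to RRDtool. The function names count_matches and build_status are made up for illustration, and the "wraps : N" line assumes the NCV handler's "name : value" format. On the server you would also need to route the "wraps" test to the NCV handler (something like adding wraps=ncv to TEST2RRD and setting NCV_wraps in hobbitserver.cfg; please check the exact syntax against your version's documentation).

```shell
#!/bin/sh
# Sketch only: report the absolute match count and let the DERIVE
# datatype on the server turn successive totals into a rate.

# count_matches PATTERN FILE
# Print the number of lines in FILE matching PATTERN (0 if FILE is unreadable).
count_matches() {
    n=$(grep -c -e "$1" "$2" 2>/dev/null) || true
    echo "${n:-0}"
}

# build_status HOST TESTNAME COUNT
# Print a status message whose body is one "name : value" line.
build_status() {
    printf 'status %s.%s green %s\n\nwraps : %s\n' "$1" "$2" "$(date)" "$3"
}

# Typical use from a client-side extension script, where BB, BBDISP and
# MACHINE come from the Hobbit client environment:
#   $BB $BBDISP "$(build_status "$MACHINE" wraps \
#       "$(count_matches 'Buffer wrapped' /quotes/env/errlog)")"
```

With this approach the script no longer needs to remember INITIALNUM between runs, so it can be run once per client cycle instead of in a constant loop.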
.....
After thinking a bit more about this, I believe that having a method to do "grep ...| wc -l" in the client might be a good thing. So I've added a new type of configuration to the client-local.cfg file, so you can do
linecount:/var/log/messages
diskerrors I/O error.*/dev/hd
badlogins Login failed
and it will report back in the client message the data
diskerrors: 0
badlogins: 2
which are the number of times these two expressions were found in the /var/log/messages file.
Given those data, on the server side it will be easy to feed them into a graph and do other nice things with it.
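To make the intended behaviour concrete, here is a small shell emulation of what a linecount section does. It is not the client's actual code: linecount_report is a hypothetical helper, and it uses plain grep regular expressions, which may differ in detail from the matching the client performs.

```shell
#!/bin/sh
# Sketch only: emulate a "linecount:" section with grep.

# linecount_report LOGFILE
# Reads "keyword pattern" lines on stdin (the keyword is the first word,
# the pattern is the rest of the line) and prints "keyword: count" for each.
linecount_report() {
    logfile=$1
    while read -r keyword pattern; do
        # Skip blank lines in the specification.
        [ -n "$keyword" ] || continue
        n=$(grep -c -e "$pattern" "$logfile" 2>/dev/null) || true
        echo "$keyword: ${n:-0}"
    done
}

# Example, using the two expressions from the configuration above:
#   printf '%s\n' 'diskerrors I/O error.*/dev/hd' 'badlogins Login failed' |
#       linecount_report /var/log/messages
```

Each output line has the same "keyword: count" shape as the client message shown above.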
Regards, Henrik