[hobbit] system log and application log monitoring
Henrik,
Is there a facility already in place, or a way to graph the number of "hits" returned by a pattern match for a log file?
For instance:
I am checking xyz log file for the word "wrap". It would be *very* useful to have a graph that shows the number of times that word showed up between the previous check and the current check.
This could be very useful to illustrate, say, a disk dying (one blip of a bad read or something would be one thing, but looking at a graph over time that shows 1 blip one week, 10 the next, and 20 the week after that would indicate the disk was almost dead) etc...
Right now, the only way I have to do this is with a client side script that runs in a constant loop:
while true; do
    NUM=`grep "Buffer wrapped" /quotes/env/errlog | wc -l | sed 's/ *//g'`
    if [ $NUM -gt $INITIALNUM ] ; then
        WRAP_NUM=`expr $NUM - $INITIALNUM`
        $BB $BBDISP "status $MACHINE.wraps green `date`
`echo "wraps:$WRAP_NUM"`
"
        INITIALNUM=$NUM
    else
        OKNUM=0
        $BB $BBDISP "status $MACHINE.wraps green `date`
`echo "wraps:$OKNUM"`
"
    fi
done
-Jeff
On 5/28/06, Henrik Stoerner <henrik at hswn.dk> wrote:
On Sun, May 21, 2006 at 07:29:49PM +0200, Olivier Beau wrote:
Well, I was glad to find OS log file definitions in client-local.cfg. Could there be basic OS pattern definitions in hobbit-client.cfg's DEFAULT?
You'll have to contribute some, then. I don't really know what people are looking for in their logfiles.
Next step: application log monitoring. Let's say I have 100 servers (different OSes, of course) running mysql, and I want to follow "ended" in /var/log/mysqld.log. Setting up 100 entries in client-local.cfg doesn't seem great; could there be some kind of grouping in client-local.cfg (by PAGE, actually)? (I guess this would require processing client-local.cfg before transferring it to the clients.)
Welcome to the world of configuration "classes".
Step 1: Put a "CLASS:mysqlservers" on those hosts in bb-hosts.
Step 2: Put a section in your client-local.cfg file with
    [mysqlservers]
    logfile:/var/log/mysql/status.log
Step 3: Configure hobbit-clients.cfg for these logfiles.
The only problem is that you'll need today's snapshot for this to work.
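As a rough sketch, the three steps might end up looking like the fragments below. The host name and IP address are hypothetical, the "logfile:" line is copied from the message above, and the LOG rule (including the "ended" pattern and the yellow color) is an assumption about hobbit-clients.cfg syntax that should be checked against the documentation shipped with your snapshot.

```text
# bb-hosts (hypothetical host entry)
10.0.0.21   db01.example.com   # CLASS:mysqlservers

# client-local.cfg (section named after the class)
[mysqlservers]
logfile:/var/log/mysql/status.log

# hobbit-clients.cfg (assumed rule syntax)
CLASS=mysqlservers
    LOG /var/log/mysql/status.log ended COLOR=yellow
```

With this layout, adding another MySQL server only needs the CLASS tag on its bb-hosts line.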
Regards, Henrik
On Fri, Jun 02, 2006 at 11:03:52AM -0500, Jeff Newman wrote:
Is there a facility already in place, or a way to graph the number of "hits" returned by a pattern match for a log file?
For instance:
I am checking xyz log file for the word "wrap". It would be *very* useful to have a graph that shows the number of times that word showed up between the previous check and the current check.
No, there isn't.
This could be very useful to illustrate, say, a disk dying (one blip of a bad read or something would be one thing, but looking at a graph over time that shows 1 blip one week, 10 the next, and 20 the week after that would indicate the disk was almost dead) etc...
Hobbit only looks at log entries over a 30-minute period, so we would have to extend that significantly. So this would have to be done at the client side rather than on the server. (Not a problem, I'm just thinking out loud).
Right now, the only way I have to do this is with a client side script that runs in a constant loop:
while true; do
    NUM=`grep "Buffer wrapped" /quotes/env/errlog | wc -l | sed 's/ *//g'`
    if [ $NUM -gt $INITIALNUM ] ; then
        WRAP_NUM=`expr $NUM - $INITIALNUM`
        $BB $BBDISP "status $MACHINE.wraps green `date`
`echo "wraps:$WRAP_NUM"`
"
        INITIALNUM=$NUM
    else
        OKNUM=0
        $BB $BBDISP "status $MACHINE.wraps green `date`
`echo "wraps:$OKNUM"`
"
    fi
done
If all that you want is the graph and not alerts, then I wonder if it couldn't be done more easily. Just do the "grep" and report the number like you do now. Then send it into the NCV handler, with a dataset definition that uses the DERIVE datatype (which is the default, btw). Then RRDtool should handle all of the "subtract current value from previous value if it's greater, else ..." stuff and you needn't worry about it.
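A minimal sketch of that suggestion, assuming the same log file, pattern, and "wraps" column as the script above. The function names count_matches and build_status are made up for illustration. The key difference from the loop is that it reports the running total rather than the difference, because a DERIVE dataset makes RRDtool compute the change between samples itself. On the server side the "wraps" column would still have to be routed to the NCV handler; in Hobbit that is normally a "wraps=ncv" entry in TEST2RRD plus an NCV_wraps setting in hobbitserver.cfg, but treat those names as assumptions and check them against your version's documentation.

```shell
#!/bin/sh
# Sketch only: report a cumulative match count and leave the
# "current minus previous" arithmetic to RRDtool (DERIVE).

LOGFILE="${LOGFILE:-/quotes/env/errlog}"
PATTERN="${PATTERN:-Buffer wrapped}"

# count_matches PATTERN FILE
# Prints the number of matching lines, or 0 if the file is unreadable.
count_matches() {
    n=$(grep -c -- "$1" "$2" 2>/dev/null) || true
    echo "${n:-0}"
}

# build_status HOSTNAME COUNT
# Builds a status message with one "name: value" line for the NCV handler.
build_status() {
    printf 'status %s.wraps green %s\n\nwraps: %s\n' "$1" "$(date)" "$2"
}

# Only send when running under the Hobbit client environment,
# where BB, BBDISP and MACHINE are set.
if [ -n "${BB:-}" ] && [ -n "${BBDISP:-}" ]; then
    "$BB" "$BBDISP" "$(build_status "$MACHINE" "$(count_matches "$PATTERN" "$LOGFILE")")"
fi
```

Run once per client cycle (from the client's task scheduler rather than a constant loop), this removes the need to track INITIALNUM in the script.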
.....
After thinking a bit more about this, I believe that having a method to do "grep ...| wc -l" in the client might be a good thing. So I've added a new type of configuration to the client-local.cfg file, so you can do
linecount:/var/log/messages
diskerrors I/O error.*/dev/hd
badlogins Login failed
and it will report back in the client message the data
diskerrors: 0
badlogins: 2
which are the number of times these two expressions were found in the /var/log/messages file.
Given those data, on the server side it will be easy to feed them into a graph and do other nice things with it.
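To illustrate what such a linecount section computes, here is a small shell sketch that produces the same "keyword: count" lines for a file. It is not the client's actual implementation: the function name linecount_report is hypothetical, and the use of grep -E for the patterns is an assumption about the regular expression flavor.

```shell
#!/bin/sh
# Sketch only: emulate the "keyword: count" output described above.
#
# Usage:
#   linecount_report /var/log/messages <<'EOF'
#   diskerrors I/O error.*/dev/hd
#   badlogins Login failed
#   EOF

# linecount_report FILE
# Reads "keyword pattern" lines on stdin; for each one, prints the keyword
# and the number of lines in FILE matching the pattern.
linecount_report() {
    file=$1
    while read -r keyword pattern; do
        # Skip blank lines
        [ -n "$keyword" ] || continue
        # grep -c exits non-zero when nothing matches, so ignore its status
        n=$(grep -c -E -- "$pattern" "$file" 2>/dev/null) || true
        echo "$keyword: ${n:-0}"
    done
}
```

The first word of each input line is taken as the keyword and the rest of the line as the pattern, which matches the layout of the example section.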
Regards, Henrik
On Sun, Jun 04, 2006 at 10:04:44AM +0200, Henrik Stoerner wrote:
and it will report back in the client message the data
diskerrors: 0
badlogins: 2
which are the number of times these two expressions were found in the /var/log/messages file.
Given those data, on the server side it will be easy to feed them into a graph and do other nice things with it.
The graphs are now created by default, so you can track the trend of how often those lines are logged.
Regards, Henrik