Bug Report: Critical error in log couldn't be sent to server every time
Hi,
My company uses Hobbit (4.2) to monitor OutOfMemoryError and StackOverflowError in an application log, but we found that the Hobbit client sometimes does not send the data containing these error strings to the server, so no error is reported. Below is our configuration snippet from client-local.cfg. Although we set the maximum amount of data to 10240 bytes, we also set a trigger on the keyword Error, so even if the log holds more data than the maximum size, the lines matching the trigger should be sent to the server in any case:
    [our server]
    log:/home/mine/server.log:10240
    trigger Error
I can think of two possible reasons:
- The regular expression for the trigger is wrong.
- There is a bug or limitation in the logfetch tool: it can only process a maximum amount of data. For example, if the application happened to write 100M of data to the log file in 5 minutes, the tool would only process, say, the last 10M.
I ran some tests to find the root cause. Each test has two steps:
- Clean the log and wait until the client has sent out its data.
- Fill some data into the log: the first line is "OutOfMemoryError StackOverflowError", and the rest is just garbage data.
Here are the results. For each test I list the lines (L) and bytes (C) of the log after filling in the data:
- 485L, 54545C, error caught
- 1445L, 163025C, error not caught
- 707L, 53771C, error not caught
- 468L, 36451C, error not caught
- 226L, 18615C, error caught
The tests show that the trigger pattern is correct, and that the logfetch tool has a problem processing all of the new data when it is large (in lines or in bytes, I don't know which).
We need a fix or a workaround, since these errors are too important to miss.
Thanks, Samuel Cai
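For anyone wanting to repeat the experiment, here is a minimal sketch of how such a test log could be generated. The function name `build_test_log`, the filler content, and the line counts are illustrative assumptions; they are not the data used in the tests above.

```python
import random
import string

ERROR_LINE = "OutOfMemoryError StackOverflowError"


def build_test_log(filler_lines, line_length=80, seed=0):
    """Return log text whose first line is the error line, followed by filler."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    lines = [ERROR_LINE]
    for _ in range(filler_lines):
        # Lowercase letters only, so filler can never match the "Error" trigger.
        lines.append(
            "".join(rng.choice(string.ascii_lowercase) for _ in range(line_length))
        )
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    text = build_test_log(filler_lines=484)
    # Report the same two figures used in the test results: lines and bytes.
    print(len(text.splitlines()), "lines,", len(text.encode()), "bytes")
```

Writing the returned text to the monitored log after a client run, and varying `filler_lines`, should make it easier to narrow down the size at which the error line stops being reported.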
Samuel,
Maybe the current release of Hobbit is not up to this task (maybe you should ask for a refund :) )? I think the Hobbit logfetch function is aimed more at "convenience monitoring" instead of real-time log filtering. It is not hard to envision cases where processing log files in "30 minute chunks" might have scalability problems.
If these messages are VERY important, you might search the Web for a tool that will scan a log file watching for these messages, and then write them to another log, and then have the Hobbit agent watch the log you create that only has "interesting" messages in it.
GLH
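A minimal sketch of the kind of pre-filter described above, in Python. This is not an existing tool; the function name `extract_interesting`, the default pattern, and the idea of tracking a byte offset between runs are all assumptions made for illustration.

```python
import re


def extract_interesting(source_path, dest_path, offset,
                        pattern=r"OutOfMemoryError|StackOverflowError"):
    """Append lines matching `pattern` from source (from byte `offset`) to dest.

    Returns the new offset so the next call resumes where this one stopped.
    """
    matcher = re.compile(pattern.encode())
    with open(source_path, "rb") as src:
        src.seek(0, 2)  # seek to the end to learn the current size
        if src.tell() < offset:
            offset = 0  # log was truncated or rotated: start over
        src.seek(offset)
        with open(dest_path, "ab") as dst:
            for line in src:
                if not line.endswith(b"\n"):
                    break  # partial last line: leave it for the next pass
                offset += len(line)
                if matcher.search(line):
                    dst.write(line)
    return offset
```

Something like this could be run from cron or a small loop, with the returned offset saved between runs, and the Hobbit client would then watch only the much smaller filtered file.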
-----Original Message----- From: Samuel Cai [mailto:Samuel.Cai at ehealth-china.com] Sent: Wednesday, July 23, 2008 5:22 AM To: hobbit at hswn.dk Subject: [hobbit] Bug Report: Critical error in log couldn't be sent to server every time
To unsubscribe from the hobbit list, send an e-mail to hobbit-unsubscribe at hswn.dk
It really depends on what log level your application is logging at. If you are logging at 'INFO' level, then there will be a lot of data to process. As you see, Hobbit implements a limit on how much log data it will parse. This is a good thing, at least in my opinion.
It all depends on what is in your log, and why so much data is being written. If the entries were all errors, Hobbit would be catching them and telling you there are errors. Since this is not the case, I would guess your log has data other than errors.
Suggestions:
- Tune your application log settings so that only errors are written.
- Make use of the ignore setting for the log in client-local.cfg. This allows the Hobbit client to identify extraneous messages and ignore them. Per the man page:
The ignore PATTERN line (optional) defines lines in the logfile which are ignored entirely, i.e. they are stripped from the logfile data before sending it to the Hobbit server. It is used to remove completely unwanted "noise" entries from the logdata processed by Hobbit. "PATTERN" is a regular expression.
I hope this helps you, ~Steve
It's great to hear from you both, Hubbard and Steve, and to know that you also see this as a limitation (more than a bug) rather than a mistake in my configuration.
I was thinking of modifying the source code before, but that might be difficult for me. I'll try your suggestions, thanks!
Samuel Cai
-----Original Message----- From: s_aiello at comcast.net [mailto:s_aiello at comcast.net] Sent: Wednesday, July 23, 2008 9:50 PM To: hobbit at hswn.dk Subject: Re: [hobbit] Bug Report: Critical error in log couldn't be sent to server every time
In my reply to your email, I said that this behavior "was a good thing". I do not find this to be a limitation at all. I offered you two possible solutions; were either of them applicable?
The "limitation" really resides in whatever application is logging so verbosely. Production-level applications should have their logging limited as much as possible, logging only indicators of errors. And whenever this isn't possible, make use of the IGNORE option.
~Steve
Hi, Steve
Regarding your two suggestions: I checked the source code, and there is a 100K limit, so introducing ignore doesn't help if the Error falls outside that range. You may wonder why we output more than 100K in just 30 minutes. There are several reasons:
- It is our production server log, which is busy.
- We set the level to Warning, which outputs more than the Error level.
- Some of our code does not set the log level correctly; we are in the process of cleaning it up.
- If there is an exception, we output the whole thread log at INFO level, which is huge.
Anyway, I still think 100K in 30 minutes is rather small for a busy site's log. I would like to remove this limit and also keep cleaning up our logs.
Thanks for your suggestion, Samuel Cai
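The reasoning above can be illustrated with a toy model. It assumes, as a simplification of what is described in this thread and not as a copy of logfetch's code, that only the last `scan_limit` bytes of new data are examined, so a trigger line is seen only if it falls inside that window. All names are hypothetical, and the model does not try to reproduce the exact thresholds observed in the earlier tests.

```python
import re

SCAN_LIMIT = 100 * 1024  # the 100K limit mentioned above (assumed to apply to new data)


def trigger_lines_seen(new_data, trigger=r"Error", scan_limit=SCAN_LIMIT):
    """Return trigger-matching lines found in the last `scan_limit` bytes."""
    window = new_data[-scan_limit:]  # anything earlier is never examined
    pattern = re.compile(trigger.encode())
    return [line for line in window.splitlines() if pattern.search(line)]


error = b"OutOfMemoryError StackOverflowError\n"
filler = b"x" * 99 + b"\n"  # one 100-byte filler line

small = error + filler * 100   # about 10K of new data
large = error + filler * 2000  # about 200K of new data

print(len(trigger_lines_seen(small)))  # 1: the error line is inside the window
print(len(trigger_lines_seen(large)))  # 0: the error line fell outside the window
```

Under this assumption, neither trigger nor ignore can help with an error line that lies before the window, because those patterns are only ever applied to the data that was read.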
-----Original Message----- From: s_aiello at comcast.net [mailto:s_aiello at comcast.net] Sent: Thursday, July 24, 2008 8:41 PM To: hobbit at hswn.dk Subject: Re: [hobbit] Bug Report: Critical error in log couldn't be sent to server every time
Samuel,
If you think this through, you probably don't really want the Hobbit agent to send that much data up to the Hobbit server every 5 minutes. You are welcome to modify your own source code to change the 100K limit to anything you want, but I think you will be better served to look for a near-real-time log filtering process that can process your big, busy log on the local host, and then spit out "significant" messages in another log that you can wire into Hobbit.
I do not have any solutions to offer, but I think you can probably find plenty of options if you spend a few minutes in a Google search.
-----Original Message----- From: Samuel Cai [mailto:Samuel.Cai at ehealth-china.com] Sent: Thursday, July 24, 2008 8:01 PM To: hobbit at hswn.dk Subject: RE: [hobbit] Bug Report: Critical error in log couldn't be sent to server every time
Hi Hubbard,
I think there is a misunderstanding about how Hobbit works. The Hobbit client still sends only the limited amount of data you defined to the server, no more. Removing the 100K limit just lets the logfetch program process more data each time before it sends data to the server, so it may affect the client's performance, but it won't affect network traffic. Anyway, I will stick with my solution; introducing another filtering process may make the Hobbit client configuration complex.
Thanks, Samuel Cai
-----Original Message----- From: Hubbard, Greg L [mailto:greg.hubbard at eds.com] Sent: Friday, July 25, 2008 9:45 PM To: hobbit at hswn.dk Subject: RE: [hobbit] Bug Report: Critical error in log couldn't be sent to server every time
participants (3): greg.hubbard@eds.com, s_aiello@comcast.net, Samuel.Cai@ehealth-china.com