hobbitfetch spinning again
The patched hobbitfetch ran longer, but it still ended up consuming a processor after about 3 days. I did a kill -6 on it, but couldn't find the corefile anywhere.
-- Daniel J McDonald, CCIE # 2495, CISSP # 78281, CNX Austin Energy http://www.austinenergy.com
On Fri, 2007-07-27 at 12:29 -0500, McDonald, Dan wrote:
> The patched hobbitfetch ran longer, but it still ended up consuming a processor after about 3 days. I did a kill -6 on it, but couldn't find the corefile anywhere.
Died again... Guess it's not more stable, just luck of the draw as to when it hangs. Again, when I did a kill -6, no corefile was generated anywhere.
Should I strace the process the next time it "gang aft agley"?
-- Daniel J McDonald, CCIE # 2495, CISSP # 78281, CNX Austin Energy http://www.austinenergy.com
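A note on the missing corefile: kill -6 sends SIGABRT, which normally dumps core, but on Linux nothing is written when the process's soft core-size limit is 0. Whether that is what happened here is only a guess; the thread does not say. The sketch below parses the text of /proc/<pid>/limits to report that limit. The helper names (core_limits, explain, read_limits) are made up for illustration and are not part of Hobbit.

```python
def core_limits(limits_text):
    """Return (soft, hard) core-size limits from the text of /proc/<pid>/limits.

    Returns None if the "Max core file size" row is not present.
    """
    label = "Max core file size"
    for line in limits_text.splitlines():
        if line.startswith(label):
            # The rest of the row is: soft limit, hard limit, units.
            soft, hard = line[len(label):].split()[:2]
            return soft, hard
    return None


def explain(limits_text):
    """Turn the limits text into a one-line hint about core dumps."""
    limits = core_limits(limits_text)
    if limits is None:
        return "core limit not found"
    soft, _hard = limits
    if soft == "0":
        return "soft core limit is 0: no corefile will be written"
    return "soft core limit is " + soft + ": check the working directory and kernel.core_pattern"


def read_limits(pid):
    """Read /proc/<pid>/limits (Linux only); pid is the spinning hobbitfetch."""
    with open("/proc/" + str(pid) + "/limits") as fh:
        return fh.read()
```

On the affected box, print(explain(read_limits(PID))) with the PID of the running hobbitfetch would show whether the limit is the culprit. If it is, the limit has to be raised in the environment that starts Hobbit before the next hang, since it is inherited at process start.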
Hello,
I need a recommendation on paging. When I last submitted this, Henrik pointed out that when I use DURATION<30, I won't get the alert if the status goes from yellow to red within 30 minutes.
So I thought that if I split the rule up as below, the problem would be solved.
HOST=%du* SERVICE=conn,cpu,disk,nfs
    MAIL DB_TEAM at mysite.com REPEAT=15 COLOR=YELLOW DURATION<30 RECOVERED
HOST=%du* SERVICE=conn,cpu,disk,nfs
    MAIL 5555551212 at airmail.com REPEAT=15 COLOR=RED DURATION<30 RECOVERED
So, with the settings above, at around midnight last night a database server reached YELLOW and sent out two emails 15 minutes apart (within the 30-minute window).
One hour and 30 minutes later, the database server reached RED; however, no pages were sent out.
From notifications log:
Sat Jul 28 00:08:55 2007 du102.disk (192.168.1.76) DB_TEAM at mysite.com [128] 1185599335 100
Sat Jul 28 00:24:24 2007 du102.disk (191.168.1.76) DB_TEAM at mysite.com 1185600264 100
Sat Jul 28 00:13:19 2007 du103.disk (192.168.1.78) DB_TEAM at mysite.com [128] 1185599599 100
Sat Jul 28 00:28:24 2007 du103.disk (192.168.1.78) DB_TEAM at mysite.com [128] 1185600504 100
What I'm trying to do is reduce the overall number of pages that get sent when an alert occurs, regardless of whether it's red or yellow.
The people receiving the notifications want at most two notifications per red or yellow alert.
Any suggestions? How is everyone else handling not getting bombarded with pages? I'm open to any strategy that ensures I get paged on everything but doesn't send multiple pages for the same problem.
I would also very much appreciate it if you could include the alert-file settings with your suggestion, so I can set mine up the same way.
Thanks All for the help.
James
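The log above fits one reading of DURATION: the clock starts when the status first leaves green and is not reset when yellow turns to red. That is an assumption, not something confirmed in this thread. Under it, the red rule can never fire here, because by the time the status is red the event is already 90 minutes old. A toy model (not Hobbit code; the Rule class, the "pager" label and the fixed REPEAT ticks are simplifications made up for this sketch):

```python
from dataclasses import dataclass


@dataclass
class Rule:
    recipient: str      # label only, e.g. "DB_TEAM" or "pager"
    color: str          # models COLOR=
    max_duration: int   # models DURATION<max_duration, in minutes
    repeat: int         # models REPEAT=, in minutes


def current_color(color_changes, minute):
    """Colour in effect at `minute`; color_changes is a sorted list of (minute, color)."""
    color = None
    for start, new_color in color_changes:
        if start <= minute:
            color = new_color
    return color


def simulate(rules, color_changes, end_minute):
    """List (minute, recipient, color) for every notification that would go out.

    ASSUMPTION: duration is measured from minute 0 (status first non-green)
    and keeps counting across the yellow -> red change.
    """
    sent = []
    for rule in rules:
        for minute in range(0, end_minute + 1, rule.repeat):
            color = current_color(color_changes, minute)
            if color == rule.color and minute < rule.max_duration:
                sent.append((minute, rule.recipient, color))
    return sorted(sent)


rules = [
    Rule("DB_TEAM", "yellow", max_duration=30, repeat=15),
    Rule("pager", "red", max_duration=30, repeat=15),
]
# Yellow at minute 0, red 90 minutes later, watched for two hours.
result = simulate(rules, [(0, "yellow"), (90, "red")], end_minute=120)
print(result)
```

With these inputs the model yields two yellow mails (minutes 0 and 15) and no red page, which matches what the notifications log shows. If the assumption holds, a DURATION<30 cap on the red rule only helps when red arrives in the first half hour of the event.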
On Sat, 2007-07-28 at 09:53 -0500, James Wade wrote:
> Hello,
> I need a recommendation on paging. When I last submitted this, Henrik pointed out that when I use DURATION<30, I won't get the alert if the status goes from yellow to red within 30 minutes.
> So I thought that if I split the rule up as below, the problem would be solved.
> HOST=%du* SERVICE=conn,cpu,disk,nfs
>     MAIL DB_TEAM at mysite.com REPEAT=15 COLOR=YELLOW DURATION<30 RECOVERED
> HOST=%du* SERVICE=conn,cpu,disk,nfs
>     MAIL 5555551212 at airmail.com REPEAT=15 COLOR=RED DURATION<30 RECOVERED
Try testing your rules and see which ones it matches.
Usage: hobbitd_alert --test HOST SERVICE [duration [color [time]]]
Depending on your results, you may want the red alert first, since the hobbit-alerts file is read from the top down.
Trent
Just want to confirm that I'm getting the following right:
--When the process hobbitd_client encounters errors in parsing its configuration file hobbit-clients.cfg, it logs them to clientdata.log rather than to hobbitclient.log (which never seems to get any entries).
--Any "unknown token" error in hobbit-clients.cfg will result in hobbitd_client disregarding all lines below the error.
participants (4)
- Dan.McDonald@austinenergy.com
- hobbit@epperson.homelinux.net
- jkwade@futurefrontiers.com
- trent.melcher@sitel.com