I noticed this morning that our Hobbit server was sending out alerts for the process check for a machine that was actually shown as green on the hobbit web page. I checked on the monitored machine, and the alert was indeed green, yet the server was still sending out emails as though it were in a yellow state. I restarted the Hobbit client on the monitored machine, and then restarted the Hobbit server on the server. After doing this, I noticed the following in the page.log file:
2007-11-14 10:26:13 Stale alert for host-name:procs dropped
(I changed the actual host name to "host-name" to protect the innocent.) What exactly does this mean? Before I restarted the Hobbit server process, I manually edited the alert.chk temp file and removed the erroneous alert, but that didn't correct the problem. Only after I restarted the Hobbit server process did it clear the alert. Is this a bug in the 4.2.0 code, or is there something else going on here?
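For anyone wanting to check how often this happens, here is a minimal sketch that scans log lines for the "Stale alert ... dropped" message. The regular expression is inferred only from the single line quoted above, so the exact format is an assumption, and `find_stale_alerts` is a hypothetical helper name, not part of Hobbit.

```python
import re
from datetime import datetime

# Pattern inferred from the one log line quoted in this message (an assumption).
STALE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) Stale alert for (\S+):(\S+) dropped$"
)

def find_stale_alerts(lines):
    """Return (timestamp, host, test) for each 'Stale alert ... dropped' line."""
    found = []
    for line in lines:
        match = STALE_RE.match(line.strip())
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            found.append((stamp, match.group(2), match.group(3)))
    return found

sample = ["2007-11-14 10:26:13 Stale alert for host-name:procs dropped"]
for stamp, host, test in find_stale_alerts(sample):
    print(stamp, host, test)
```

In practice the lines would come from reading page.log; the sample list only stands in for that file.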
Click on the host's test and then click on history - was it red at all?
Are the WWW pages updating? Look in the top right corner of the page once you click on the host's test link.
To unsubscribe from the hobbit list, send an e-mail to hobbit-unsubscribe at hswn.dk
-- Josh Luthman Office: 937-552-2340 Direct: 937-552-2343 1100 Wayne St Suite 1337 Troy, OH 45373
Those who don't understand UNIX are condemned to reinvent it, poorly. --- Henry Spencer
Yes, the alert history shows it went yellow and then 5 minutes later recovered. The web page is showing everything correctly. However, when I check the notifications.log file, I can see that it was still sending alerts about it being yellow, even though it was definitely green.
You're saying it went yellow, then green. The log tells you it sent an alert when it was yellow.
I'm not sure I'm seeing the problem here =/ It sent an alert first when it was yellow and another when it switched to green to inform you it recovered, correct?
Order of events:
1am: alert went yellow, email was sent out
1:15am: alert recovered
2am, and each additional hour: email was sent out saying alert was yellow (it was actually showing green)
10:30a: I restart Hobbit and get the "stale alert" message, and it finally stops sending alerts.
Recovery email was never sent out, even though it is in the alert rules.
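The timeline above can be restated as a small check: given the status history and the notification times, flag every notification that reported a non-green state while the last known status was already green. This is only a sketch over in-memory tuples; the function name and data layout are hypothetical, and it does not read the real notifications.log, whose format is not shown in this thread.

```python
from datetime import datetime

def alerts_after_recovery(status_changes, notifications):
    """Return notifications that reported non-green while the status was green.

    status_changes: list of (time, color), sorted by time.
    notifications:  list of (time, reported_color), sorted by time.
    """
    suspicious = []
    for sent_at, reported in notifications:
        current = None
        # Find the most recent status change at or before the notification.
        for changed_at, color in status_changes:
            if changed_at <= sent_at:
                current = color
        if current == "green" and reported != "green":
            suspicious.append((sent_at, reported))
    return suspicious

def at(hour, minute=0):
    """Build a timestamp on the day of the incident (2007-11-14)."""
    return datetime(2007, 11, 14, hour, minute)

# Events taken from the timeline above; only the first two repeats are listed.
status_changes = [(at(1), "yellow"), (at(1, 15), "green")]
notifications = [(at(1), "yellow"), (at(2), "yellow"), (at(3), "yellow")]

for sent_at, reported in alerts_after_recovery(status_changes, notifications):
    print(sent_at, "reported", reported, "after recovery")
```

The 1am alert is legitimate and is not flagged; the hourly repeats after 1:15am are.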
Gary,
We get those too, along with leftover semaphores and shared memory segments when we stop the hobbit server. The snapshot seems much better, but with tooltips and host descriptions pushing our display to the far right of the screen, we cannot use it right now.
~David
P.S. Just to let you know it's not just your setup.
This has never happened to me - are the two of you using the 4.2.0 release?
Josh
I am running 4.2.0 with the allinone patch. I do have a suspicion that external server side scripts could contribute to the stale alerts. In my case, specifically a script that ssh's to a remote host and executes a shell script on the remote host. It may also happen in conjunction with these errors in the logs, which someone else recently reported:
page.log.1:2007-11-05 16:20:53 hobbitd_alert: Got message 14425, expected 14424
As soon as the issue with tooltips is resolved, I will be moving our primary hobbit to the snapshot release.
~David
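To see whether the sequence errors line up with the stale alerts, one could pull the "Got message N, expected M" lines out of the logs. Below is a minimal sketch; the pattern is based only on the one line quoted above, and reading the difference as "messages apparently skipped" is my assumption, not something confirmed in this thread.

```python
import re

# Pattern inferred from the single hobbitd_alert line quoted above (an assumption).
SEQ_RE = re.compile(r"hobbitd_alert: Got message (\d+), expected (\d+)")

def sequence_gaps(lines):
    """Return (got, expected, difference) for each sequence-mismatch line."""
    gaps = []
    for line in lines:
        match = SEQ_RE.search(line)
        if match:
            got, expected = int(match.group(1)), int(match.group(2))
            gaps.append((got, expected, got - expected))
    return gaps

sample = [
    "page.log.1:2007-11-05 16:20:53 hobbitd_alert: Got message 14425, expected 14424",
]
for got, expected, diff in sequence_gaps(sample):
    print("got", got, "expected", expected, "difference", diff)
```

Using `search` rather than `match` lets the same pattern work with or without the `page.log.1:` file-name prefix that grep adds.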
Re: the tooltips issue:
Wouldn't it be useful to make it an option whether they are displayed on the bb2 page or not?
I really like the tooltips, but I agree with David that they clutter the bb2 page to the point where it is no longer usable...
Johann
In regards to the hobbit snapshot:
They say a picture is worth a thousand words. If the list server allows it, I have attached an image of my bb2.html page. bb2 is our primary view and, as you can see from the image, the tooltips push all the hobbit columns out of view to the right.
Tooltips are the text you see when you hover over items like the smileys in hobbit, right? I suppose that has always worked well for the little smiley gif icons, just not so well for the bb2 host links, where it seems to add the tooltip text to the hostname rather than create floating text.
~David
Hi all,
In contrast to others on this mailing list, I think the tooltips are very handy. I agree that the tooltips make the output of the bb2 page a bit different, but the big advantage is that people instantly know which system is having the problem. This matters especially on the bb2 page, which is mostly used by managers who don't know the names of the systems.
The biggest challenge is to keep the tooltip within boundaries while keeping it useful.
Regards,
Bert
On Nov 15, 2007 7:46 AM, Gore, David W (David) <david.gore at verizonbusiness.com> wrote:
> I am running 4.2.0 with the allinone patch. I do have a suspicion that external server side scripts could contribute to the stale alerts. In my case, specifically a script that ssh's to a remote host and executes a shell script on the remote host. It may also happen in conjunction with these errors in the logs, which someone else recently reported:
Hmm. For the machines in question, we have a client-side script that automatically restarts certain processes that tend to die on their own. It's not all of the time, but when it restarts them, it seems Hobbit sometimes gets these stale alerts. Thankfully it seems the fix is just to restart the Hobbit server, but this should still be fixed eventually.
> page.log.1:2007-11-05 16:20:53 hobbitd_alert: Got message 14425, expected 14424
> As soon as the issue with tooltips is resolved, I will be moving our primary hobbit to the snapshot release. ~David
Sounds like this is something we should consider as well.
Currently (4.2.0), the "info" column shows the email recipient only for an alert rule of this form:
HOST=t.test.com SERVICE=conn MAIL test at test.com
The alert recipient for a GROUP-based rule such as the following cannot be displayed in the "info" column:
host-client.cfg: HOST=database.test.com PROC mysql 1 -1 red "TEXT=mysqld" GROUP=mysql-process-page-alerts
host-alerts.cfg: GROUP=mysql-process-page-alerts MAIL 12345678 at test.com FORMAT=SMS
Is there a way to do this? If not, could it be an enhancement for the next release?
T.J. Yang
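The lookup being asked for amounts to following the GROUP name from the client rule to the alert rule and collecting the MAIL recipients there. Here is a rough sketch of that idea. It treats each rule as the single flattened line shown in the message, which may not match the real file layout, and the function names and the un-obfuscated address `12345678@test.com` are assumptions made purely for illustration.

```python
import shlex

def groups_in_rule(rule):
    """Return the GROUP names mentioned on one flattened rule line."""
    return [tok.split("=", 1)[1] for tok in shlex.split(rule) if tok.startswith("GROUP=")]

def group_recipients(alert_rules):
    """Map each GROUP name to the MAIL recipients found on the same rule line."""
    recipients = {}
    for rule in alert_rules:
        tokens = shlex.split(rule)
        for group in groups_in_rule(rule):
            for i, tok in enumerate(tokens[:-1]):
                if tok == "MAIL":
                    recipients.setdefault(group, []).append(tokens[i + 1])
    return recipients

# Rule lines adapted from the message; the address is written with "@" here.
client_rule = 'HOST=database.test.com PROC mysql 1 -1 red "TEXT=mysqld" GROUP=mysql-process-page-alerts'
alert_rules = ["GROUP=mysql-process-page-alerts MAIL 12345678@test.com FORMAT=SMS"]

lookup = group_recipients(alert_rules)
for group in groups_in_rule(client_rule):
    print(group, "->", lookup.get(group, []))
```

This only shows that the information needed for the "info" column can be derived from the two rules; whether Hobbit can display it is the open question in the message.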
participants (6)
- david.gore@verizonbusiness.com
- gumby3203@gmail.com
- Johann.Eggers@teleatlas.com
- josh@imaginenetworksllc.com
- klomph@nlr.nl
- tj_yang@hotmail.com