Single alert for a host going purple
Hi
When a client is down, all of its monitored services turn purple. This usually causes a separate alert to be sent for each service. Recipients get really annoyed by the number of alerts when all they need to know is that the client should be restarted.
Is there a way to get just a single alert telling you that the client stopped running, or at least just one purple event? Has anyone configured the alerts file to prevent this situation?
Thanks,
Hezki
Let me make sure I understand your situation. You have one host, e.g.:
1.2.3.4 mybroken.host.com # ssh dns pop3 smtp http://mybrokenhost.com
Now, after some time, ssh and dns go down, but the box is still green on conn/ping? Logically, those are two separate problems, a broken BIND and a broken OpenSSH daemon, so two alerts is a wise choice.
If you mean that every test in the example above goes down because the host is offline, then in my case at least I only get one alert, for the failed conn test. The rest are essentially tagged with "depend" and do not alert you. Is this not the case for you?
-- Josh Luthman Office: 937-552-2340 Direct: 937-552-2343 1100 Wayne St Suite 1337 Troy, OH 45373
Those who don't understand UNIX are condemned to reinvent it, poorly. --- Henry Spencer
Hi
There are various situations where a host responds to ping but does not send any status to hobbitd:
- a firewall rule was modified, so connections from the server to the hobbitd daemon are no longer allowed
- the Hobbit client is not started automatically after a system restart
- the Hobbit client has stopped, hung, etc.
--
Frédéric Mangeant
Steria EDC Sophia Antipolis
Well, your first problem can't really be avoided unless the person modifying the firewall rules is more on top of their game.
The second issue is easy to fix: have the client start when entering runlevel 3 (which a lot of people use, the remainder mostly using 5) by running ln -s /etc/init.d/hobbit /etc/rc3.d/S82hobbit
I have never seen Hobbit crash, but I have only been using it for a month or two. Still, a crashed Hobbit client won't stop the host from answering ICMP echoes.
Josh
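The distinction discussed above, between a host that is unreachable and a host that still answers ping while its client has gone silent, can be sketched as follows. This is only an illustration in Python and not part of Hobbit: the function name classify, the colour strings, and the 30-minute STATUS_LIFETIME are assumptions made for the example, so check the values that apply to your own setup.

```python
from datetime import datetime, timedelta

# Assumed validity period of a client status; after this long without an
# update the status would be considered stale (purple).
STATUS_LIFETIME = timedelta(minutes=30)


def classify(conn_color, last_client_report, now):
    """Tell a dead host apart from a silent client (hypothetical helper).

    conn_color: colour of the server-side ping test, e.g. "green" or "red".
    last_client_report: datetime of the last status received from the client.
    now: current time, passed in so the function is easy to test.
    """
    if conn_color == "red":
        # The server cannot reach the host at all.
        return "host unreachable"
    if now - last_client_report > STATUS_LIFETIME:
        # The host answers ping, but the client has stopped reporting.
        return "client not reporting"
    return "ok"
```

With these assumptions, a host that is green on conn but whose last client report is an hour old is classified as "client not reporting", which is the situation that produces the purple statuses.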
To unsubscribe from the hobbit list, send an e-mail to hobbit-unsubscribe at hswn.dk
Thanks, but I'm not asking how to avoid purple dots altogether. When you monitor a large number of clients, it is quite likely that a client will stop reporting once in a while. That is why I am asking for a way to reduce the number of notifications per host when everything on it goes purple.
- firewall rule modified, so no connection from a server to the hobbitd daemon allowed
- Hobbit client not run automatically after system restart
- Hobbit client stopped, hung, etc.
Yes, I am talking about these cases, not about tests done by the server, which are "hidden" by conn when there is no connectivity. I am talking about a Hobbit client that is not running or cannot reach the server, so that all the locally monitored services (cpu, disk, memory, etc.) send many purple alerts.
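To make the request concrete, here is a minimal sketch of the behaviour being asked for: many purple events from one host collapsed into a single notification. It is written in Python purely as an illustration and does not describe an existing Hobbit feature or the syntax of the alerts file. The function consolidate_purple, the CLIENT_TESTS set, and the (host, test, color) event tuples are all hypothetical.

```python
from collections import defaultdict

# Hypothetical set of tests that are reported by the client itself,
# as opposed to network tests run from the server.
CLIENT_TESTS = {"cpu", "disk", "memory", "procs", "msgs"}


def consolidate_purple(events):
    """Collapse purple client-side events into one alert per host.

    events: iterable of (host, test, color) tuples that would each
    normally trigger an alert. Purple events for client-side tests are
    grouped by host; every other event produces its own alert.
    """
    purple_by_host = defaultdict(list)
    alerts = []
    for host, test, color in events:
        if color == "purple" and test in CLIENT_TESTS:
            purple_by_host[host].append(test)
        else:
            alerts.append(f"{host}: {test} is {color}")
    # One summary alert per host whose client has gone silent.
    for host, tests in sorted(purple_by_host.items()):
        alerts.append(
            f"{host}: client not reporting "
            f"({len(tests)} tests purple: {', '.join(sorted(tests))})"
        )
    return alerts
```

For example, three purple events (cpu, disk, memory) from the same host yield one "client not reporting" alert that lists the three tests, while a red http test on another host still produces its own alert.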
participants (3)
- frederic.mangeant@steria.com
- josh@imaginenetworksllc.com
- me2unix@gmail.com