Hi all,
I'm hoping someone might be able to help me. I'm running Hobbit 4.1.2 on Fedora Core 4, monitoring approximately 500 servers. I have been running Hobbit for a few months, and a few times our DNS server has been rebooted for patching. When this happens, some servers go purple, and the only way I've been able to fix this is to restart the Hobbit service, but it has generated a ton of alerts and not a lot of happy alert recipients. My /etc/resolv.conf file lists primary and secondary DNS servers, so I would have thought that if one wasn't available it would use the other, but this doesn't seem to be the case. Has anyone seen this, or does anyone know what I could do to prevent these purples from occurring when the DNS server is rebooted?
Thanks much in advance.
On Tue, Jan 10, 2006 at 10:58:10AM -0500, Bill Perez wrote:
I'm hoping someone might be able to help me. I'm running Hobbit 4.1.2 on Fedora Core 4, monitoring approximately 500 servers. I have been running Hobbit for a few months, and a few times our DNS server has been rebooted for patching. When this happens, some servers go purple, and the only way I've been able to fix this is to restart the Hobbit service, but it has generated a ton of alerts and not a lot of happy alert recipients. My /etc/resolv.conf file lists primary and secondary DNS servers, so I would have thought that if one wasn't available it would use the other, but this doesn't seem to be the case.
Which tests are going purple? The network tests (conn, smtp, http etc.) or the client-side tests (cpu, disk, memory ...)?
If it's the network tests, then the problem is probably that Hobbit is timing out the DNS requests because it takes too long to do the DNS lookups. It probably sends the query first to the server which is down, and then times out waiting for the response. But that would normally cause your network tests to go red - with a DNS error status - not purple. But setting up a caching DNS server on the Hobbit server might help with that (and is generally a good idea when testing many servers).
So I think it's your client-side tests that go purple. That doesn't really make sense, since the only communication between the clients and Hobbit normally uses the IP address directly. But you should check the BBDISP setting in your clients' etc/hobbitclient.cfg and make sure it is set to the IP of your Hobbit server, not the hostname.
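To check whether slow or failing lookups are involved, as the timeout reasoning above suggests, the lookups can be timed from the Hobbit server while a DNS server is down. What follows is only a minimal sketch: the function names and the 2-second threshold are assumptions for illustration, not anything Hobbit itself uses.

```python
import socket
import time


def time_lookup(hostname, resolver=socket.getaddrinfo):
    """Return (seconds, addresses) for one lookup; addresses is None on failure."""
    start = time.monotonic()
    try:
        infos = resolver(hostname, None)
        addrs = sorted({info[4][0] for info in infos})
    except socket.gaierror:
        addrs = None
    return time.monotonic() - start, addrs


def slow_lookups(hostnames, threshold=2.0, resolver=socket.getaddrinfo):
    """Return [(hostname, seconds)] for lookups that failed or exceeded threshold."""
    slow = []
    for name in hostnames:
        elapsed, addrs = time_lookup(name, resolver)
        if addrs is None or elapsed > threshold:
            slow.append((name, round(elapsed, 2)))
    return slow


if __name__ == "__main__":
    import sys

    # Hostnames to check are passed on the command line.
    for name, secs in slow_lookups(sys.argv[1:]):
        print(f"{name}: {secs}s")
```

Running it once with both DNS servers up and once with the first one down should show how long the resolver waits before falling back to the second server.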
Regards, Henrik
Which tests are going purple? The network tests (conn, smtp, http etc.) or the client-side tests (cpu, disk, memory ...)?
Henrik - It is the network test (conn) that went purple for several switches, a router, Windows servers and a Unix server - there was really no consistency in what went purple. I was thinking of using the --dns=ip switch for bbtest-net to resolve this - do you think that is a viable solution, or would I be better off looking into setting up a caching DNS server on the Hobbit server?
Thank you
To unsubscribe from the hobbit list, send an e-mail to hobbit-unsubscribe at hswn.dk
On Thu, Jan 12, 2006 at 08:18:55AM -0500, Bill Perez wrote:
Which tests are going purple? The network tests (conn, smtp, http etc.) or the client-side tests (cpu, disk, memory ...)?
Henrik - It is the network test (conn) that went purple for several switches, a router, Windows servers and a Unix server - there was really no consistency in what went purple. I was thinking of using the --dns=ip switch for bbtest-net to resolve this - do you think that is a viable solution, or would I be better off looking into setting up a caching DNS server on the Hobbit server?
Since you wrote that you are monitoring some 500 hosts, I would really suggest that you set up a caching DNS server on your Hobbit server.
Last I used Red Hat (Fedora), there was a "caching-dns" RPM included with the necessary config files to set this up. All I needed to do was to add a "forwarders" entry to named.conf, so that it would query our local DNS server (the one from resolv.conf) instead of the public root DNS servers; and change resolv.conf to point at 127.0.0.1.
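As a rough illustration of the "forwarders" step described above, the sketch below reads the nameserver lines from resolv.conf-style text and prints a matching forwarders statement for the options block of named.conf. The helper names are made up for this example, and the layout of the rest of named.conf is assumed to come from whatever caching-nameserver package is installed.

```python
def nameservers(resolv_conf_text):
    """Extract nameserver addresses from resolv.conf-style text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def forwarders_stanza(servers):
    """Render a BIND 'forwarders' statement, skipping loopback addresses."""
    # Loopback entries are dropped so the local cache never forwards to itself.
    upstream = [ip for ip in servers if not ip.startswith("127.")]
    entries = " ".join(f"{ip};" for ip in upstream)
    return f"forwarders {{ {entries} }};"


if __name__ == "__main__":
    # Run this before pointing resolv.conf at 127.0.0.1.
    with open("/etc/resolv.conf") as fh:
        print(forwarders_stanza(nameservers(fh.read())))
```

The printed line goes inside the existing options { ... } block of named.conf. Once named is running, resolv.conf can be changed to list only 127.0.0.1.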
The --dns=ip switch will work, but I don't really like it because you will inevitably change the IP of one of your hosts, and you're bound to forget to change the bb-hosts file as well as the DNS entries. That causes some confusion and a frustrated admin when you find out why the ping-test doesn't work.
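If --dns=ip is used anyway, the drift described above can be caught with a periodic comparison of bb-hosts against DNS. This is only a sketch under a few assumptions: host lines are taken to have the form "IP hostname # tags", entries with 0.0.0.0 are assumed to rely on DNS and are skipped, and the function names are invented for the example.

```python
import ipaddress
import socket


def parse_bb_hosts(text):
    """Yield (ip, hostname) for lines that start with an IPv4 address."""
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2:
            continue
        try:
            ip = ipaddress.IPv4Address(fields[0])
        except ValueError:
            continue  # page/group/include lines and similar
        yield str(ip), fields[1]


def mismatches(text, lookup=socket.gethostbyname_ex):
    """Return [(hostname, file_ip, dns_ips)] where bb-hosts and DNS disagree."""
    bad = []
    for ip, host in parse_bb_hosts(text):
        if ip == "0.0.0.0":
            continue  # assumed to mean "resolve via DNS"
        try:
            dns_ips = lookup(host)[2]
        except OSError:
            dns_ips = []  # lookup failed; reported as a mismatch
        if ip not in dns_ips:
            bad.append((host, ip, dns_ips))
    return bad


if __name__ == "__main__":
    import sys

    # The path to bb-hosts is passed as the first argument.
    with open(sys.argv[1]) as fh:
        for host, file_ip, dns_ips in mismatches(fh.read()):
            print(f"{host}: bb-hosts has {file_ip}, DNS has {dns_ips}")
```

Run from cron, a check like this would flag a host whose DNS entry was updated while its bb-hosts line was not.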
Regards, Henrik
Has anyone developed a way to disable and enable a single monitored service on Windows clients, instead of all services?
Michael Frey
Thanks for the information, Henrik. I really appreciate it.
participants (3): billieperez@gmail.com, henrik@hswn.dk, michael_frey@glic.com