Unsure if this is a bug or not. I noticed that for a few days I've been getting "Runtime 483 longer than time limit (300)" and the bbtest column shows:
TIME SPENT
<snip>
DNS tests executed 450.074868
<snip>
TIME TOTAL 482.741716
I was able to successfully debug this by adding "--debug" to the [bbnet] CMD in hobbitlaunch.cfg. This revealed:
2009-09-01 08:26:01 ares_search: tlookup='dns.server.com', class=1, type=1
2009-09-01 08:26:01 Processing 0 DNS lookups with ARES
2009-09-01 08:33:31 Finished ARES queue after loop 20
Note that this took about 7 minutes to complete. 'dns.server.com' is a server that we took offline on the day the issue began. We didn't remove it from bb-hosts and instead marked it blue.
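As a quick check on the timestamps, the gap between the "Processing" and "Finished ARES queue" lines can be computed directly. This is a minimal Python sketch that uses only the two timestamps quoted in the debug output above; the function and variable names are illustrative.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

def elapsed_seconds(start: str, end: str) -> float:
    """Return the number of seconds between two log timestamps."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds()

# Timestamps copied from the --debug output above.
queue_start = "2009-09-01 08:26:01"
queue_end = "2009-09-01 08:33:31"

secs = elapsed_seconds(queue_start, queue_end)
print(f"ARES queue took {secs:.0f} s ({secs / 60:.1f} min)")
```

The result is 450 seconds, which lines up with the 450.07 reported for "DNS tests executed" in the bbtest column.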
The fix was to comment it out in bb-hosts. The log says "loop 20" above, so I searched for config options with a value of 20 but didn't find anything.
The issue is resolved on my end but I'd like to find out if this is a bug or expected behavior. It seems that a single DNS check on a host that's down shouldn't delay things by 7 minutes.
I'm running Xymon v4.2.3 on FreeBSD 7.2-RELEASE.
Thanks, Chris
Xymon still runs the tests even when a host is blue. If the server is offline, the DNS tests will time out (as will the other tests, obviously).
No bug, expected behavior.
Josh Luthman Office: 937-552-2340 Direct: 937-552-2343 1100 Wayne St Suite 1337 Troy, OH 45373
"When you have eliminated the impossible, that which remains, however improbable, must be the truth." --- Sir Arthur Conan Doyle
Taking 7 minutes to time out still seems wrong. I would also normally expect that if a server is unreachable via ICMP, no further tests are run against it.
From what I can tell, the DNS check is run by [bbnet], where I have the timeout at the default of 15:
[bbnet]
	ENVFILE /usr/local/www/hobbit/server/etc/hobbitserver.cfg
	NEEDS hobbitd
	CMD bbtest-net --report --ping --checkresponse --timeout=15
	LOGFILE $BBSERVERLOGS/bb-network.log
	INTERVAL 5m
bbtest-net --help lists a --dns-timeout option, but it says that defaults to 30, so something still seems wrong.
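To see which timeouts are actually set on the command line, the CMD string can be split into its options. The following is a rough Python sketch, not part of Xymon; `cmd_options` is a hypothetical helper, and the CMD string is the one from the [bbnet] section above.

```python
import shlex

def cmd_options(cmd: str) -> dict:
    """Map each --option in a command line to its value (True if it has none)."""
    opts = {}
    for token in shlex.split(cmd)[1:]:  # skip the program name
        if token.startswith("--"):
            name, sep, value = token[2:].partition("=")
            opts[name] = value if sep else True
    return opts

# CMD line copied from the [bbnet] section above.
cmd = "bbtest-net --report --ping --checkresponse --timeout=15"
opts = cmd_options(cmd)

print("timeout:", opts.get("timeout"))
print("dns-timeout set explicitly:", "dns-timeout" in opts)
```

With this CMD line, --timeout is 15 and --dns-timeout is not given at all, so whatever default bbtest-net uses for DNS lookups applies.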
--Chris
I've also tuned my DNS tests. The [bbnet] section in hobbitlaunch.cfg is the place to do it. See 'man bbtest-net' for options. I've upped the general timeout because I have a slow web server. I might shift known ones to urlplus because I can do custom timeouts on each server. My settings are:

[bbnet]
	ENVFILE /usr/lib/hobbit/server/etc/hobbitserver.cfg
	NEEDS hobbitd
	CMD bbtest-net --report --ping --checkresponse --dns-timeout=15 --timeout=20 --concurrency=512
	LOGFILE $BBSERVERLOGS/bb-network.log
	INTERVAL 5m
Check the 'bbtest' column for your hobbit server for stats on bbtest-net runtimes - there's an RRD graph in there too.
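If you want to pull those numbers out of the column text rather than read them by eye, something like the following Python sketch could work. It assumes each timing row ends in a duration in seconds, as in the two rows quoted in the original report; the real column has more rows and possibly more fields, so the pattern may need adjusting. `parse_timings` and `TIME_LIMIT` are illustrative names.

```python
import re

TIME_LIMIT = 300  # seconds, from the "longer than time limit (300)" message

def parse_timings(text: str) -> dict:
    """Extract 'label seconds' pairs, assuming each row ends in a float."""
    timings = {}
    for line in text.splitlines():
        m = re.match(r"^(.*?)\s+(\d+\.\d+)\s*$", line)
        if m:
            timings[m.group(1).strip()] = float(m.group(2))
    return timings

# The two rows quoted in the original report.
sample = """\
DNS tests executed 450.074868
TIME TOTAL 482.741716
"""

timings = parse_timings(sample)
total = timings.pop("TIME TOTAL")
slowest = max(timings, key=timings.get)
print(f"total {total:.0f}s, limit {TIME_LIMIT}s, over by {total - TIME_LIMIT:.0f}s")
print(f"slowest phase: {slowest} ({timings[slowest] / total:.0%} of total)")
```

On the quoted rows this singles out the DNS phase as the one responsible for nearly all of the run time.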
David.
-- David Baldwin - IT Unit Australian Sports Commission www.ausport.gov.au Tel 02 62147830 Fax 02 62141830 PO Box 176 Belconnen ACT 2616 david.baldwin at ausport.gov.au Leverrier Street Bruce ACT 2617