On Wed, Nov 29, 2006 at 06:40:58AM -0600, Daniel J McDonald wrote:
I am seeing intermittent resolver failures in /var/log/hobbit/bb-network.log.
This box also runs MRTG, which chews up a lot of resources with as many points as I monitor. The failures appear to occur one minute and 3-10 seconds into the MRTG polling cycle, at which point the box is CPU-bound for about 30 seconds (decrypting several thousand SNMPv3 PDUs).
Is there a way to extend the ares resolver timeout?
"--dns-timeout=N" (default: 30 seconds) for the bbtest-net program.
Or is there some local resolver caching I could set up to help mitigate this problem?
A local caching DNS server on the Hobbit box doing network tests is always a good idea.
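Which caching nameserver you run is up to you. Assuming it listens on the loopback address, /etc/resolv.conf on the Hobbit box would look roughly like this; the second entry is a placeholder for your existing upstream server.

```
# /etc/resolv.conf
# Local caching nameserver first (assumed to listen on 127.0.0.1)
nameserver 127.0.0.1
# Placeholder for your current upstream DNS server
nameserver 192.0.2.53
```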
At any other point in the MRTG polling cycle the resolver seems to work fine. The other phases leave the system network-bound during the initial poll (about 25 seconds) and disk-bound (40 seconds) whilst rewriting the ~6000 RRD files.
So another solution might be to make sure that the MRTG update and the Hobbit network tests do not run at the same time. You can do that if you run the MRTG update from hobbitlaunch instead of through cron. The GROUP keyword in each hobbitlaunch.cfg section ensures that only one task belonging to a given group runs at any one time.
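Something along these lines, untested. The group name "pollers", the [mrtg] section, and all paths are made up for illustration. Check the hobbitlaunch.cfg man page for the exact GROUP syntax and for whether the group needs its own declaration line.

```
# hobbitlaunch.cfg (sketch; names and paths are assumptions)
[bbnet]
	ENVFILE /usr/lib/hobbit/server/etc/hobbitserver.cfg
	NEEDS hobbitd
	# Same group as the MRTG task below
	GROUP pollers
	CMD bbtest-net --report --ping --checkresponse
	LOGFILE $BBSERVERLOGS/bb-network.log
	INTERVAL 5m

[mrtg]
	# Hypothetical task replacing the MRTG cron job
	GROUP pollers
	CMD /usr/bin/mrtg /etc/mrtg/mrtg.cfg
	LOGFILE $BBSERVERLOGS/mrtg.log
	INTERVAL 5m
```

Remember to remove the MRTG entry from cron once hobbitlaunch runs it, otherwise it will still overlap with the network tests.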
Regards, Henrik