On 5/13/2012 at 10:56 AM, in message <4FAFCBAE.6010809 at hswn.dk>, Henrik Størner <henrik at hswn.dk> wrote:
On 13-05-2012 04:41, Jon Dustin wrote:
This evening I had the luck to experience a "purple storm" with Xymon v4.3.7 [snip]
The xymonnet column for shepherd remained purple until the DNS servers came back. It almost seemed like Xymon could not "find itself"?
Any thoughts on ways to combat this issue?
What's logged in your xymonnet.log file?
All I found were the following two entries:
2012-05-12 20:58:14 WARNING: Runtime 481 longer than time limit (300)
2012-05-12 22:07:20 WARNING: Runtime 767 longer than time limit (300)
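For anyone wanting to check how often and by how much the network tests overran, here is a minimal Python sketch that picks these warnings out of a log. It assumes the line format is exactly as in the two entries above; the function name `find_overruns` is made up for illustration, and the units of the runtime and limit values are not stated in the log (they are presumably seconds).

```python
import re

# Matches lines shaped like the two entries quoted above, e.g.
# 2012-05-12 20:58:14 WARNING: Runtime 481 longer than time limit (300)
RUNTIME_WARNING = re.compile(
    r"^(?P<stamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) WARNING: "
    r"Runtime (?P<runtime>\d+) longer than time limit \((?P<limit>\d+)\)"
)


def find_overruns(lines):
    """Return (timestamp, runtime, limit, overrun) for each matching line."""
    overruns = []
    for line in lines:
        match = RUNTIME_WARNING.match(line.strip())
        if match:
            runtime = int(match["runtime"])
            limit = int(match["limit"])
            overruns.append((match["stamp"], runtime, limit, runtime - limit))
    return overruns


# The two entries from this thread; a real run would read xymonnet.log instead.
sample = [
    "2012-05-12 20:58:14 WARNING: Runtime 481 longer than time limit (300)",
    "2012-05-12 22:07:20 WARNING: Runtime 767 longer than time limit (300)",
]

for stamp, runtime, limit, overrun in find_overruns(sample):
    print(f"{stamp}: runtime {runtime}, limit {limit}, over by {overrun}")
```

Lines that do not match the pattern are skipped, so the same function can be fed the whole log file.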
Are your network tests and the Xymon website on the same server, or different servers?
Same server, physical SLES11SP1x64, 16 GiB RAM, not overloaded.
Do you have a local caching DNS server, or does your resolv.conf point to remote DNS servers?
resolv.conf uses two Active Directory name servers, but the majority of tests go against domains provided by another entity in my University system. When their DNS servers go south, my Xymon server starts having troubles.
Also, the TTL for our DNS records is very low (5 minutes, I believe). I'm going to see if we can increase this for our server names.
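One way to confirm that slow or failing name resolution is what stretches the test runtime is to time the lookups directly from the Xymon server. The following is only a rough sketch, not part of Xymon: `time_lookups` and the host list are hypothetical, and it relies on Python's standard `socket.getaddrinfo`, which goes through the system resolver configuration.

```python
import socket
import time


def time_lookups(hostnames, resolve=socket.getaddrinfo):
    """Resolve each hostname once; return {hostname: (seconds, error_or_None)}."""
    results = {}
    for name in hostnames:
        start = time.monotonic()
        try:
            resolve(name, None)
            error = None
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            error = str(exc)
        results[name] = (time.monotonic() - start, error)
    return results


if __name__ == "__main__":
    # Hypothetical host list; substitute the names your network tests use.
    hosts = ["localhost"]
    for name, (elapsed, error) in time_lookups(hosts).items():
        status = "ok" if error is None else f"FAILED ({error})"
        print(f"{name}: {elapsed:.3f}s {status}")
```

Running this while the remote DNS servers are down versus while they are healthy should show whether lookups alone account for the extra runtime. The `resolve` parameter exists only so the function can be exercised without touching the network.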
Thanks for reading.
--
Jon Dustin - Network Specialist
University of Southern Maine
Portland, ME
207-780-4152