Hi, I know this is just a workaround, but maybe it helps you. I have a Xymon machine with a local caching BIND daemon, which also speeds up the DNS tests a lot.
yum install bind
Customize /etc/named.conf:

options {
        listen-on port 53 { 127.0.0.1; };
        #listen-on-v6 port 53 { ::1; };
        directory "/var/named";
        dump-file "/var/named/data/cache_dump.db";
        statistics-file "/var/named/data/named_stats.txt";
        memstatistics-file "/var/named/data/named_mem_stats.txt";
        allow-query { localhost; };
        #recursion yes;
        forwarders { foo1; foo2; };
        forward only;
        notify no;

        dnssec-enable no;
        dnssec-validation no;
        #dnssec-lookaside auto;

        /* Path to ISC DLV key */
        bindkeys-file "/etc/named.iscdlv.key";

        managed-keys-directory "/var/named/dynamic";
};

zone "." IN {
        type hint;
        file "named.ca";
};

include "/etc/named.rfc1912.zones";
include "/etc/named.root.key";

Make sure named.conf has mode 640.
Add the local resolver to /etc/resolv.conf:

nameserver 127.0.0.1
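To check whether the local cache actually speeds up lookups, something like the following could be run before and after the change. This is only a rough sketch: the function name time_lookups is made up for illustration, and it simply times whatever resolver the system is configured to use (so 127.0.0.1 once resolv.conf points there).

```python
import socket
import sys
import time


def time_lookups(hostnames, resolve=socket.gethostbyname):
    """Time one lookup per hostname.

    Returns {hostname: seconds}, with None for lookups that failed.
    `resolve` is injectable so the function can be tried without a network.
    """
    results = {}
    for name in hostnames:
        start = time.perf_counter()
        try:
            resolve(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            results[name] = None
        else:
            results[name] = time.perf_counter() - start
    return results


if __name__ == "__main__":
    # Hostnames are taken from the command line; nothing is hard-coded.
    for name, elapsed in time_lookups(sys.argv[1:]).items():
        status = "FAILED" if elapsed is None else f"{elapsed * 1000:.1f} ms"
        print(f"{name}: {status}")
```

Running it twice in a row for the same names should show the second run answering faster if the cache is being used.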
Regards,
Lukas Kohl
ERGO Direkt Versicherungen
Systembetrieb 2
Karl-Martell-Straße 60
90344 Nürnberg
Deutschland
Tel.: +49-911-148-2857
From: L-M-J <linuxmasterjedi at free.fr>
To: Xymon at xymon.com
Date: 16.02.2016 10:46
Subject: [SPAM] Re: [Xymon] Xymon disruption every night!
Sent by: "Xymon" <xymon-bounces at xymon.com>
Hi,
I'm still running into trouble every night between ~0h30 and ~2h40 :-(
- I checked the backup on my physical Xymon server: it starts around 9pm and runs for 4:45 min.
- We cross-monitored the DNS server from another monitoring tool: no DNS outage detected.
- I monitored the Xymon server's network link state with "mii-tool" every second: no trouble detected.
- I pinged my Xymon servers from 2 different places in the network all night long: no trouble detected.
- There are no firewalls between my Xymon server and the monitored hosts.
- Out of more than 500 hosts, only ~30 are in trouble every night, and they are mostly the same ones.
- The affected hosts are VMs, physical servers and public internet websites.
Here is what I've found in the xymond.log today:

2016-02-16 02:02:57 Flapping detected for www.foo1.com:http - 5 changes in 1708 seconds
2016-02-16 02:02:57 Flapping detected for www.foo2.com:http - 5 changes in 1708 seconds
2016-02-16 02:02:57 Flapping detected for www.microsoft.com:http - 5 changes in 1708 seconds
2016-02-16 02:06:14 Flapping detected for server01:http - 5 changes in 1678 seconds
2016-02-16 02:06:14 Flapping detected for server02:http - 5 changes in 1678 seconds
2016-02-16 02:06:29 Flapping detected for server03:conn - 5 changes in 1745 seconds
2016-02-16 02:07:21 Flapping detected for server04:ldap - 5 changes in 1745 seconds
2016-02-16 02:07:21 Flapping detected for server06:ssh - 5 changes in 1745 seconds
2016-02-16 02:07:21 Flapping detected for server05:http - 5 changes in 1745 seconds
2016-02-16 02:07:21 Flapping detected for server07:http - 5 changes in 1745 seconds
2016-02-16 02:07:21 Flapping detected for server08:http - 5 changes in 1745 seconds
2016-02-16 02:07:21 Flapping detected for server09:http - 5 changes in 1745 seconds
2016-02-16 02:07:21 Flapping detected for foo.bar1.com:http - 5 changes in 1745 seconds
2016-02-16 02:07:21 Flapping detected for foo.bar2.com:http - 5 changes in 1745 seconds
2016-02-16 02:07:21 Flapping detected for foo.bar3.fr:http - 5 changes in 1745 seconds
2016-02-16 02:07:21 Flapping detected for server10:http - 5 changes in 1745 seconds
2016-02-16 02:07:21 Flapping detected for server11-t:http - 5 changes in 1745 seconds
2016-02-16 02:07:21 Flapping detected for server12:http - 5 changes in 1745 seconds
2016-02-16 02:07:21 Flapping detected for server13:http - 5 changes in 1745 seconds
2016-02-16 02:07:21 Flapping detected for server14:http - 5 changes in 1745 seconds
2016-02-16 02:07:21 Flapping detected for server15:http - 5 changes in 1745 seconds
2016-02-16 02:07:21 Flapping detected for server16:http - 5 changes in 1745 seconds
2016-02-16 02:07:21 Flapping detected for server17:http - 5 changes in 1745 seconds
2016-02-16 02:07:21 Flapping detected for server18:http - 5 changes in 1745 seconds
2016-02-16 02:07:21 Flapping detected for server19:http - 5 changes in 1745 seconds
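To see at a glance which tests flap and when, the log entries above could be summarized with a small script. This is only a sketch under assumptions: it expects one entry per line in exactly the format quoted above, the names parse_flapping and count_by_test are made up for illustration, and the log file path has to be passed on the command line.

```python
import re
import sys
from collections import Counter

# Matches lines such as:
# 2016-02-16 02:06:29 Flapping detected for server03:conn - 5 changes in 1745 seconds
FLAP_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"Flapping detected for (?P<host>\S+):(?P<test>[^\s:]+) - "
    r"(?P<changes>\d+) changes in (?P<seconds>\d+) seconds$"
)


def parse_flapping(lines):
    """Return a list of (timestamp, host, test) for every flapping entry."""
    events = []
    for line in lines:
        match = FLAP_RE.match(line.strip())
        if match:
            events.append((match["ts"], match["host"], match["test"]))
    return events


def count_by_test(events):
    """Count flapping entries per test name (http, conn, ldap, ...)."""
    return Counter(test for _, _, test in events)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python flapping.py <path to xymond.log>
    with open(sys.argv[1]) as log_file:
        found = parse_flapping(log_file)
    for test, count in count_by_test(found).most_common():
        print(f"{test}: {count}")
```

Lines that are not flapping entries are skipped, so the whole log can be fed in unchanged.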
Here is part of the configuration and the errors displayed in the Xymon web interface:

hosts.cfg :
0.0.0.0 server03 # conn NAME:"server03" DESCR:"VM FOO BAR"

Error :
conn NOT ok : DNS lookup failed / Unable to resolve hostname server03
System unreachable for 2 poll periods (86 seconds)
Everything looks as if DNS resolution failed.
hosts.cfg :
10.X.Y.188 server05 # conn tse NAME:"Server 05" DESCR:"My comment" http://server05/

Error :
DNS error
red http://server05/ - DNS error
- Why do I get a "DNS error" here? I set the IP for this host yesterday to solve the issue. The "conn" error has been gone since yesterday evening, but the http error remains.
On 29 January 2016 13:22:06 GMT+01:00, Becker Christian <christian.becker at rhein-zeitung.net> wrote:
My intention was to figure out whether the network connection of the Xymon server itself has a problem… For example, if your Xymon server is hardware, then it has a wired network interface that is connected to a network switch. That’s your link between the Xymon server and all of your other VMs and physical servers.