The new server went into a "flapping" state.
During my next test, I'll try stopping the tests on the new server and see what happens.
-----Original Message-----
From: xymon-bounces at xymon.com [mailto:xymon-bounces at xymon.com] On Behalf Of Henrik Størner
Sent: Monday, August 15, 2011 4:17 PM
To: xymon at xymon.com
Subject: Re: [Xymon] New server causing issues with CONN test
On 15-08-2011 22:46, Poppy, Ben wrote:
I'm having a pretty strange issue. Our existing Hobbit servers run Hobbit 4.2.0 on Fedora. I'm working on installing brand new servers that will run CentOS 6 64-bit and the latest version of Xymon (4.3.3, before I saw 4.3.4 today).
[installs and starts 4.3 version]
Within a few minutes, 4 servers turn to red alerts on CONN on the existing Fedora-based Hobbit servers. They keep flapping in and out of red alert until I shut down the new CentOS Xymon server. Within a few minutes of the new server being shut down, the alerts go away for good.
I have tried CentOS 5 32-bit and 64-bit, and even Xymon 4.2.3 or all the way back to Hobbit 4.2.0, all with the same result and the exact same 4 servers each time.
As I understand it, you were running both versions simultaneously. Did those servers also go red on the new Xymon version, or only on the old one? If they were also red on the new server, did you try stopping network tests on the old server, and did that make a difference?
Which ping tool are you using - xymonping or fping?
I haven't heard of anything like this before, but I suspect it may be an issue with the way "ping" works. When routing traffic, most systems will pass ping-traffic with a low priority, so it is quite easy for ping-requests and -responses to be dropped. Since xymonping and fping pump out a lot of ping-traffic rather quickly, maybe the new server just happened to be more "lucky" with its data than the old one - perhaps due to the switch port it is on, or the speed of the network interface and so on.
It might be worthwhile to make sure that the old and the new systems do not run the network tests at the same time - keep an eye (with "ps") on when the network test runs on the old system, and don't start Xymon on the new system until about 30 secs after the old system completes the network tests. (This assumes your network tests don't take more than a couple of minutes, so there is time for both systems to run their tests within the default 5 minute interval.)
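The staggering described above could be scripted roughly as follows. This is only a sketch under assumptions, not anything shipped with Xymon: the function names are made up for illustration, the process name "bbtest-net" is assumed to be what the Hobbit 4.2 network tester shows up as in "ps" (adjust it if your listing differs), and how the "ps" output of the old system gets fetched (locally, over ssh, ...) is left to the caller.

```python
import time


def network_test_running(ps_output, process_name="bbtest-net"):
    """Return True if any line of the given `ps` output mentions the process.

    `process_name` is an assumption; use whatever name the network test
    has in the `ps` listing on your old server.
    """
    return any(process_name in line for line in ps_output.splitlines())


def wait_for_quiet_window(get_ps_output, process_name="bbtest-net",
                          settle_seconds=30, poll_seconds=5,
                          sleep=time.sleep):
    """Return once the old system's network test has run and finished.

    `get_ps_output` is a caller-supplied (hypothetical) callable that
    returns the current `ps` output of the OLD system as a string.
    """
    # Phase 1: wait until a network test run is actually observed.
    while not network_test_running(get_ps_output(), process_name):
        sleep(poll_seconds)
    # Phase 2: wait until that run has completed.
    while network_test_running(get_ps_output(), process_name):
        sleep(poll_seconds)
    # Leave a gap (about 30 secs, as suggested above) before the caller
    # starts Xymon on the new system.
    sleep(settle_seconds)
```

Once wait_for_quiet_window() returns, Xymon can be started on the new system. Since both systems then repeat on the same 5 minute interval, the offset should hold as long as neither test run drifts or takes much longer than expected.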
Regards, Henrik