On Mon, January 25, 2016 1:13 pm, Randall Badilla Castro wrote:
Hi guys: We are getting this graph from a webserver and the boss wants a deeper explanation of it.
So AFAIK the conn test is a ping, so it will give an idea of how the network is working and can also reflect how the server is performing when serving requests. (Side note: I feel that TCP connection times from a ping test can be misleading; ping is ICMP, which people call layer 3.5.)
Hi Randall,
The "conn" test is actually just a simple ICMP echo (the "TCP" there is a misnomer unfortunately; it's being graphed by the 'tcp' RRD interpreter). The test is identical to the output you'd get from a run of "fping -Ae" against a list of IP addresses, since that's actually what's happening on the backend :) (Unless you're using the legacy 'xymonping', but that was intended to do the same thing.)
[root@localhost ~]# fping -Ae 4.2.2.2 8.8.8.8 127.0.0.1
4.2.2.2 is alive (15.2 ms)
8.8.8.8 is alive (22.4 ms)
127.0.0.1 is alive (0.01 ms)
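If you want to check those numbers outside of Xymon, the output above is easy to parse. This is only a minimal sketch, assuming the "is alive (N ms)" line format shown in that run; the `parse_fping` name and the 18 ms cut-off are made up for illustration and are not part of Xymon or fping.

```python
import re

# Matches lines such as "8.8.8.8 is alive (22.4 ms)" from `fping -Ae`.
ALIVE_RE = re.compile(r"^(\S+) is alive \(([\d.]+) ms\)$")


def parse_fping(output):
    """Return {address: round-trip time in ms} for hosts reported alive."""
    times = {}
    for line in output.splitlines():
        match = ALIVE_RE.match(line.strip())
        if match:
            times[match.group(1)] = float(match.group(2))
    return times


# Sample text copied from the fping run above.
sample = """4.2.2.2 is alive (15.2 ms)
8.8.8.8 is alive (22.4 ms)
127.0.0.1 is alive (0.01 ms)"""

rtts = parse_fping(sample)

# Hypothetical threshold: list hosts slower than 18 ms.
slow = {host: ms for host, ms in rtts.items() if ms > 18.0}
print(slow)
```

Lines that do not match the "is alive" pattern (unreachable hosts, for example) are simply skipped here.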
With your graph in particular, your average is in the microsecond range, but something pushed the response time up to 18-20 ms for long enough to factor into the RRD averaging for that period. RRD data becomes less granular as it ages, but you might be able to zoom in to get a better idea of the time frames.
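To illustrate why the averaging matters, here is a rough sketch of what happens when fine-grained samples get rolled up into coarser rows. It is a simplification, not RRDtool's actual consolidation logic, and the sample values and the `consolidate` helper are hypothetical.

```python
def consolidate(samples, points_per_row):
    """Average consecutive groups of samples into coarser rows."""
    rows = []
    for start in range(0, len(samples), points_per_row):
        chunk = samples[start:start + points_per_row]
        rows.append(sum(chunk) / len(chunk))
    return rows


# Hypothetical response times in seconds: mostly around 200 microseconds,
# with two slow polls of 18 ms and 20 ms in the middle.
samples = [0.0002] * 5 + [0.018, 0.020] + [0.0002] * 5

# Roll six fine-grained samples into each coarse row.
coarse = consolidate(samples, 6)

for row in coarse:
    print(f"{row * 1000:.2f} ms")
```

Each coarse row ends up at a few milliseconds: well above the microsecond baseline, but well below the real 18-20 ms peaks. That is why zooming in to a finer-grained period usually shows a shorter, taller spike.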
It's possible something was happening on your monitor server... a quick way to rule it out is to see if all other network graphs have the same problem. More likely, there was an issue somewhere in the network.
The 'conn' test won't normally go red as long as the destination is actually alive (returns a response before fping times out), but you can force an override by using the 'DS' syntax in analysis.cfg to trigger an alert at any arbitrary level.
For details on that, see https://xymon.com/help/manpages/man5/analysis.cfg.5.html#lbAN
HTH, -jc