Hello Xymon Gurus!
I have searched the interwebs for it but could not find anything useful.
My question is how the values for the Clock offset graph, displayed at the bottom of the "Trends" column, are calculated. I have seen some posts (and my experience confirms this) saying that Xymon's ntp check (specified in hosts.cfg) only tests whether the NTP daemon on a client is up (responsive), while the clock offset is plotted via some internal logic (the client feeds data to the server).
Would really appreciate it if someone could shed some light on this!
Some links that I have gone through: http://osdir.com/ml/monitoring.hobbit/2007-03/msg00361.html http://lists.xymon.com/oldarchive/2009/01/msg00417.html
-- Regards, Junaid Shahid
On Wed, June 22, 2016 10:56 am, Junaid Shahid wrote:
Hi,
The "clock" value is computed as the difference between the timestamp of the client message (taken at the end of generation by the client) and the time at which xymond_client on the server generates the "cpu" status message. (It thus depends on your xymon server having its time set correctly.) The client itself isn't comparing against anything external.
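As a rough illustration of that comparison, here is a minimal shell sketch. The function name clock_offset and the sample epoch values are made up for the example, and the sign convention (server time minus client time) is an assumption, not taken from the xymond_client source.

```shell
# Hypothetical helper: the offset as the server would see it, in whole seconds.
#   $1 = epoch time the client stamped on its message
#   $2 = epoch time at which the server generates the "cpu" status
# Assumption: offset = server time - client time.
clock_offset() {
    echo $(( $2 - $1 ))
}

# Client stamped the message at 1466589360; the server handled it 7 seconds later.
clock_offset 1466589360 1466589367
```

With both clocks correct, anything that delays the message on its way to the server shows up in this number as apparent offset.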
This is subject to skew if:
a) your xymon server itself is wrong
b) you have a xymonproxy in the middle and messages are delayed getting to xymond
c) your xymond_client process is backlogged with [client] messages
d) your xymon server is overloaded and has a long period between transmission and TCP processing by xymond
For b) and c), the 4.4 version (and the Terabithia RPMs, IIRC) compares against the receipt time at the first proxy encountered, or against xymond's parsing/separation time otherwise, which fixes both of those. Local ntp problems and "raw" TCP lag until connection receipt will still affect things, however.
One "feature" of this in 4.3 though is that if you see clock skew rising and times are correct on both sides, then you can easily tell that your xymon server is having performance problems.
HTH, -jc
Thanks JC!
That makes it very clear how (and why) the CPU stats contain the server's timestamp.
I have checked we are running version 4.3.21.
Now let's look at the possible reasons for skew:
a) your xymon server itself is wrong
   Our server's time is correct (I have checked it manually multiple times and also with "ntpstats"). Plus, we have 300+ clients under Xymon monitoring, and none of the others exhibits any skew in its Clock offset trend.
b) you have a xymonproxy in the middle and messages are delayed getting to xymond
   We don't use any xymon proxy.
c) your xymond_client process is backlogged with [client] messages
   This can't be the reason either, because none of the other clients exhibits any noticeable skew in its Clock offset trend.
d) your xymon server is overloaded and has a long period between transmission and TCP processing by xymond
   This should not be the case either, as no other client shows any noticeable clock offset.
In our case there is one specific server (out of 300+) whose clock offset trend alternates between 2 and 15 seconds (like a sinusoidal wave). This machine's time is in perfect sync with our NTP server, though (there is no actual clock drift). The machine does sit in a somewhat complicated network topology (behind various layers such as firewalls, load balancers, etc.). My only guess now is that the offset is caused by its unusual network location. What do you think, JC?
On Tue, 28 Jun 2016, 23:27 Junaid Shahid <shahid.junaid at gmail.com> wrote:
I tend to agree. If it takes a few seconds to make a TCP connection to the xymon server and transmit the client message, you will see such a delay.
Try manually sending a client message and see how long it takes. Something like:
$ time $XYMON $XYMSRV "client/timetest $MACHINE.$SERVEROSTYPE"
(run within a xymoncmd shell on the client)
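Since the offset on the affected host swings between 2 and 15 seconds, a single timing may not catch the slow periods. The sketch below repeats the same send a few times and prints how long each one took. The helper name elapsed_seconds, the count of 10 and the 30-second pause are arbitrary choices for illustration; the send itself is the command given above and is only attempted when $XYMON and $XYMSRV are set, i.e. inside a xymoncmd shell.

```shell
# elapsed_seconds CMD [ARGS...]: run a command and print how many whole
# seconds it took (hypothetical helper, one-second resolution).
elapsed_seconds() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    end=$(date +%s)
    echo $(( end - start ))
}

# Only attempt the send when the Xymon client environment is loaded.
if [ -n "${XYMON:-}" ] && [ -n "${XYMSRV:-}" ]; then
    i=0
    while [ "$i" -lt 10 ]; do
        secs=$(elapsed_seconds "$XYMON" "$XYMSRV" "client/timetest $MACHINE.$SERVEROSTYPE")
        echo "$(date '+%H:%M:%S') send took ${secs}s"
        sleep 30
        i=$(( i + 1 ))
    done
fi
```

If the printed durations rise and fall in step with the Clock offset graph, that would point at connection or transmission delay rather than at the clock on either side.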
J
participants (3): cleaver@terabithia.org, jlaidman@rebel-it.com.au, shahid.junaid@gmail.com