Hi all, I recently added 2 new hosts to monitor in Xymon, but from time to time the response times are quite high (2 or 3 seconds). The graphs show ups and downs in a very recognisable pattern:
[image: Inline image 1] [image: Inline image 2]
Testing the host (a Windows server) with telnet for hours (using a Perl script) doesn't show these response times (they are always under 1 second). Does anyone know where the problem could be? Is it a Xymon server problem, or could it be in the remote server? I think it is in the remote server, but I couldn't demonstrate it, and those curious graphs don't help to clarify it.
On Tuesday 23 April 2013 13:33:05 Raul GN wrote:
Hi all, I recently added 2 new hosts to monitor in Xymon, but from time to time the response times are quite high (2 or 3 seconds). The graphs show ups and downs in a very recognisable pattern.
I see the same on some tests, most of them ping. I never figured out why this is happening.
Stef
Hi All,
I want to install Xymon on my desktop from scratch. Can you please send me any documentation for it?
Thanks, Srinivas
-----Original Message----- From: xymon-bounces at xymon.com [mailto:xymon-bounces at xymon.com] On Behalf Of Stef Coene Sent: Friday, April 26, 2013 11:44 PM To: xymon at xymon.com Subject: Re: [Xymon] Rare shape in graphs.
Xymon mailing list Xymon at xymon.com http://lists.xymon.com/mailman/listinfo/xymon
Please create a new thread for a new topic from now on.
http://en.wikibooks.org/wiki/System_Monitoring_with_Xymon/Administration_Gui...
Josh Luthman
On Sat, Apr 27, 2013 at 12:43 AM, <srinivas.dasari1 at wipro.com> wrote:
Hi All,
I want to install Xymon on my desktop from scratch. Can you please send me any documentation for it?
Thanks, Srinivas
On 23 April 2013 21:33, Raul GN <ragonlan at gmail.com> wrote:
Hi all, I recently added 2 new hosts to monitor in Xymon, but from time to time the response times are quite high (2 or 3 seconds). The graphs show ups and downs in a very recognisable pattern:
I believe this is a result of quantization and interpolation when RRD stores the results, and it shows up when RRD creates the graphs. It seems more pronounced when the metrics are whole numbers, such as clock offset, and more obvious for really small numbers. Here's a graph of mine that shows an interesting repeating diamond shape.
[image: Inline images 1] (But your multi-coloured "Connection Times" graph is far prettier.)
It took me a while to understand what's going on, and I still don't quite get it. But by way of an example, let's say that the clock offset (error) of a host is 2.333 seconds relative to the Xymon server, and that quantization causes RRD to accept only whole numbers (I'm stretching the truth here for demonstration purposes). The numbers will come in as 2.333, 2.333, 2.333 and so on, but RRD will store them as 2, 2, 3, 2, 2, 3. So you end up with a repeating pattern (a spike every 3rd sample) that doesn't really exist, and that is related to the way RRD stores, consolidates and retrieves numbers. Our brain's vision circuitry easily picks up on this pattern.
Of course, RRD can indeed store floating point numbers, but when you have really small variations in numbers (many decimal places), RRD has to round the numbers at some point, and this gives the same effect. Also, the number of decimal places required to avoid such effects is increased when you do averaging and interpolation.
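The whole-number example above can be reproduced with a toy model. This is only a sketch of the simplified picture in this message, not RRDtool's actual storage algorithm: it assumes the fraction that does not fit is carried into the next sample, and the function name `store_as_whole_numbers` is made up for illustration.

```python
import math
from fractions import Fraction


def store_as_whole_numbers(samples):
    """Toy model (assumption): keep whole numbers, carry the dropped fraction forward."""
    carry = Fraction(0)
    stored = []
    for value in samples:
        total = value + carry
        whole = math.floor(total)  # the part that fits in a whole-number slot
        carry = total - whole      # remainder pushed into the next sample
        stored.append(whole)
    return stored


# Six identical readings of 2.333... (exactly 7/3, to avoid float rounding noise).
readings = [Fraction(7, 3)] * 6
stored = store_as_whole_numbers(readings)
print(stored)  # [2, 2, 3, 2, 2, 3]
```

The input never changes, yet the stored series shows a spike on every 3rd sample, which is the kind of regular shape that stands out in a graph.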
Interpolation, the way I see it, is as follows. When RRD gets a value of 7 at 3:05:00 and a value of 8 at 3:10:01, it doesn't store "8" for the second value, because it didn't come in exactly 5 minutes after the first value. Instead, RRD removes one second's worth from the value, because it has to store the second value at 3:10:00. If it took 301 seconds to go from 7 to 8, then RRD will calculate that the value after 300 seconds would have been around 7.996... So if your sampling jumps around a little earlier and later, in a regular manner, you may see an interpolation artefact that matches that variation.
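The arithmetic in that example can be checked with a few lines. This is a sketch of the simplified straight-line model described in this message, and RRDtool's real consolidation may work differently in detail. The function name `value_at_boundary` is hypothetical, and times are given in seconds after 3:05:00.

```python
def value_at_boundary(t_prev, v_prev, t_curr, v_curr, t_boundary):
    """Estimate the value at t_boundary on a straight line between two samples."""
    fraction = (t_boundary - t_prev) / (t_curr - t_prev)
    return v_prev + (v_curr - v_prev) * fraction


# 7 arrives at 3:05:00 (t=0), 8 arrives at 3:10:01 (t=301), the slot is 3:10:00 (t=300).
estimate = value_at_boundary(0, 7.0, 301, 8.0, 300)
print(round(estimate, 4))  # 7.9967
```

A sample arriving one second late shifts the stored value by about 0.003 here, so a regular wobble in arrival times turns into a regular wobble in the stored values.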
Typically, xymonnet performs its checks in the same order every time, and earlier checks might delay later checks very slightly, just enough to push the sample timestamp into the next whole second, causing an interpolation artefact. Because the ordering is not randomised, the artefacts are consistent and recognisable as patterns in the graphs. It could be that, by adding "--shuffle" to the call to xymonnet, the artefacts will no longer form a recognisable pattern.
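The effect of a fixed test order versus a shuffled one can be illustrated with a small simulation. This is a simplified sketch and not how xymonnet is implemented: it assumes the checks run strictly one after another, and the host names and per-check durations are made-up values.

```python
import random

# Hypothetical per-host check durations in seconds (illustrative values only).
DURATIONS = {"host-a": 0.4, "host-b": 0.3, "host-c": 0.5, "host-d": 0.2}


def start_offsets(order):
    """Return how long each host waits before its check starts, for a given order."""
    offsets, elapsed = {}, 0.0
    for host in order:
        offsets[host] = elapsed
        elapsed += DURATIONS[host]
    return offsets


def offsets_over_rounds(target, rounds, shuffle, seed=1):
    """Collect the start offset of one host over several polling rounds."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    hosts = list(DURATIONS)
    seen = []
    for _ in range(rounds):
        order = hosts[:]
        if shuffle:
            rng.shuffle(order)
        seen.append(round(start_offsets(order)[target], 3))
    return seen


fixed = offsets_over_rounds("host-d", 20, shuffle=False)
shuffled = offsets_over_rounds("host-d", 20, shuffle=True)
print(sorted(set(fixed)))     # one constant delay in every round
print(sorted(set(shuffled)))  # several different delays across rounds
```

With a fixed order the last host is always delayed by the same amount, so any timing artefact repeats identically in every round. With a shuffled order the delay varies from round to round, so the artefact would be spread out rather than forming a regular shape.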
While I don't fully understand what's going on, I think it's safe to say that it's not a problem with either Xymon or the servers being tested, but is a result of how RRD does its thing.
J
Thank you, Jeremy. That interpolation behaviour explains a lot of things about RRD files.
On Mon, Apr 29, 2013 at 6:55 AM, Jeremy Laidman <jlaidman at rebel-it.com.au>wrote:
While I don't fully understand what's going on, I think it's safe to say that it's not a problem with either Xymon or the servers being tested, but is a result of how RRD does its thing.
J
participants (5)
- jlaidman@rebel-it.com.au
- josh@imaginenetworksllc.com
- ragonlan@gmail.com
- srinivas.dasari1@wipro.com
- stef.coene@docum.org