[Xymon] Scaling

16 Apr 2013

[Sorry to respond so late; I am catching up on email.]
I monitor about 43,000 devices split across 8 instances.
It runs on ancient hardware: Sun X4200s with 2 CPUs and 8 GB of RAM.
I split the RRDs off to a different host, and xymongen and the histfiles
are handled outside of stock Xymon.
The only issue I have run into (which I suspect beefier hardware would
fix) is that once I get to around 5,000 hosts, if xymon crashes, the
IPC/shared memory is not cleaned up right away, and it goes into a
continual restart loop. Henrik posted to the list earlier a way to
restart that kills all those things, so I haven't had issues since (I am
still tracking down what causes the crash).
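
Henrik's restart procedure is not reproduced in this message, so the following is only a rough sketch of the general idea: find the System V shared memory segments and semaphore sets still owned by the Xymon user after a crash, and remove them before restarting. It assumes Linux-style `ipcs` output (key, id, owner in the first three columns) and a user named `xymon`; the function names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read `ipcs -m` or `ipcs -s` output on stdin and
# print the IDs of the objects owned by the given user.
# Assumes the Linux (util-linux) column layout: key, id, owner, ...
ipc_ids_for_owner() {
    awk -v owner="$1" '$1 ~ /^0x/ && $3 == owner { print $2 }'
}

# Hypothetical helper: remove leftover shared memory segments and
# semaphore sets for the user (default "xymon").
# Only call this once every Xymon process has been stopped.
cleanup_xymon_ipc() {
    user="${1:-xymon}"
    for id in $(ipcs -m | ipc_ids_for_owner "$user"); do
        ipcrm -m "$id"
    done
    for id in $(ipcs -s | ipc_ids_for_owner "$user"); do
        ipcrm -s "$id"
    done
}
```

Calling `cleanup_xymon_ipc` between stopping and starting Xymon would clear the stale objects. On Solaris the `ipcs` columns differ, so the awk filter would need adjusting there.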
On 4/11/13 4:23 PM, "Olivier AUDRY" <olivier at audry.fr> wrote:
...
Great, many thanks for your time. I will check this.
...
but there are only so many hours in the
day and there's other low-hanging fruit at the moment :)
so true :)
On Thursday 11 April 2013 at 20:12 +0000, cleaver at terabithia.org wrote:
...
...
On Thursday 11 April 2013 at 20:40 +0200, Olivier AUDRY wrote:
...
Hello,
as I understand it, I should run xymon on a single node to improve memory
access latency. Right?
--snip--
...
...
numactl --hardware
available: 2 nodes (0-1)
node 0 size: 12097 MB
node 0 free: 594 MB
node 1 size: 12120 MB
node 1 free: 12 MB
node distances:
node   0   1
0:  10  20
even though I have 24 CPUs, multi-core with hyperthreading. Is that correct?
That seems odd; almost like hyperthreading is disabled? You should see
"node 0 cpus: ..." above each size. I'm running RHEL 6.4; it's possible
things have changed in that output over time if you're on a different
system.
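
As a quick way to read the per-node "size" and "free" figures in `numactl --hardware` output like the one quoted above, here is a small sketch that turns them into a free-memory percentage per node. The function name is made up, and it assumes the "node N size: X MB" / "node N free: Y MB" line format shown in this thread.

```shell
#!/bin/sh
# Hypothetical helper: read `numactl --hardware` output on stdin and
# print the percentage of free memory per NUMA node (rounded down).
numa_free_pct() {
    awk '
        $1 == "node" && $3 == "size:" { size[$2] = $4 }
        $1 == "node" && $3 == "free:" { free[$2] = $4 }
        END {
            for (n in size)
                if (size[n] > 0)
                    printf "node %s: %d%% free\n", n, (free[n] * 100) / size[n]
        }
    ' | sort
}
```

Typical use would be `numactl --hardware | numa_free_pct`. Note that on Linux the "free" figure does not count memory used by the page cache, so a low number here does not by itself mean the node is out of usable memory.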
...
...
As far as I can see, my two nodes are full. Not good at all, I guess.
My policy is the default one. Perhaps you can advise a specific policy
for a xymon setup?
numactl --show
policy: default
preferred node: current
physcpubind: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
cpubind: 0 1
nodebind: 0 1
membind: 0 1
Generally speaking, yeah, use numactl in front of xymonlaunch to ensure
the entire process tree gets assigned to a single node. But it really
depends on your workload (can everything fit in that node?) and what else
is going on on the box. If you have something which analyzes xymon data
in a large dump, then does heavy munging on it and sends it back, it might
be better to have that on a different node than (say) the xymond_* worker
modules.
'numastat -s -z -p xymon' is your friend
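
A minimal sketch of what "numactl in front of xymonlaunch" could look like, assuming `xymonlaunch` is on the PATH and that everything fits on the chosen node. The wrapper name `run_on_node` and the `DRY_RUN` switch are made up for illustration; only `--cpunodebind` and `--membind` are real numactl options.

```shell
#!/bin/sh
# Hypothetical wrapper: start a command with both its CPUs and its memory
# bound to one NUMA node. Child processes inherit the binding, so the
# whole tree started by the command stays on that node.
# With DRY_RUN=1 the command line is printed instead of being executed.
run_on_node() {
    node="$1"
    shift
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo numactl --cpunodebind="$node" --membind="$node" "$@"
    else
        numactl --cpunodebind="$node" --membind="$node" "$@"
    fi
}
```

For example, `run_on_node 0 xymonlaunch` followed by your usual xymonlaunch arguments. If the working set may not fit on one node, `numactl --preferred=0` is a softer alternative to `--membind`, since it falls back to other nodes instead of failing allocations. The numastat command above then shows where the memory actually ended up.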
The RH Performance Tuning and Resource Management guides are definitely
useful reading as well. I'm sure there's plenty of cgroup stuff that could
be helpful if/when the time came, but there are only so many hours in the
day and there's other low-hanging fruit at the moment :)
I'd definitely start with running the 'numad' service and seeing what it
does over time; it really could be all that you need.
HTH,
-jc

Xymon mailing list
Xymon at xymon.com
http://lists.xymon.com/mailman/listinfo/xymon

sean.clark＠twcable.com