[Sorry to respond so late, I am catching up on emails]
I monitor about 43,000 devices split across 8 instances. It all runs on ancient hardware: Sun x4200s with 2 CPUs and 8GB RAM.
I split the RRDs off to a different host, and xymongen and the histfiles are also handled outside of stock xymon.
The only issue I have run into (which I suspect will be fixed by beefier hardware) is that once I get to around 5,000 hosts, if xymon crashes, the IPC/shared memory does not get cleaned up right away, and it goes into a continual restart loop. Henrik posted to the list earlier a way to restart that kills all those things, so I haven't had issues since (I'm still tracking down what causes the crash).
On 4/11/13 4:23 PM, "Olivier AUDRY" <olivier at audry.fr> wrote:
Great, many thanks for your time. I will check this.
but there are only so many hours in the day and there's other low-hanging fruit at the moment :)
so true :)
On Thursday 11 April 2013 at 20:12 +0000, cleaver at terabithia.org wrote:
On Thursday 11 April 2013 at 20:40 +0200, Olivier AUDRY wrote:
hello
As I understand it, I should run xymon on a single node to improve memory access latency. Right?
--snip--
numactl --hardware
available: 2 nodes (0-1)
node 0 size: 12097 MB
node 0 free: 594 MB
node 1 size: 12120 MB
node 1 free: 12 MB
node distances:
node 0 1
0: 10 20
Even though I have 24 CPUs (multi-core with hyperthreading). Is that correct?
That seems odd; almost like hyperthreading is disabled? You should see "node 0 cpus: ..." above each size. I'm running RHEL 6.4; it's possible things have changed in that output over time if you're on a different system.
As far as I can see, both of my nodes are full. Not good at all, I guess.
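To put a number on "full", here is a minimal sketch that parses the size/free lines of the `numactl --hardware` output quoted above and reports the percentage free per node. The function names are made up for illustration, and the sample text is simply copied from this thread; on another system the output layout may differ.

```python
import re

# Sample text copied from the numactl --hardware output quoted in this thread.
SAMPLE = """\
available: 2 nodes (0-1)
node 0 size: 12097 MB
node 0 free: 594 MB
node 1 size: 12120 MB
node 1 free: 12 MB
"""

def node_memory(text):
    """Return {node: {"size": MB, "free": MB}} parsed from numactl --hardware text."""
    nodes = {}
    for node, field, mb in re.findall(r"node (\d+) (size|free): (\d+) MB", text):
        nodes.setdefault(int(node), {})[field] = int(mb)
    return nodes

def free_percent(nodes):
    """Return {node: percentage of that node's memory reported as free}."""
    return {n: 100.0 * v["free"] / v["size"] for n, v in nodes.items()}

if __name__ == "__main__":
    for node, pct in sorted(free_percent(node_memory(SAMPLE)).items()):
        print(f"node {node}: {pct:.1f}% free")
```

With the figures above this comes out to roughly 4.9% free on node 0 and 0.1% free on node 1.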
My policy is the default one. Perhaps you can advise a specific policy for a xymon setup?
numactl --show
policy: default
preferred node: current
physcpubind: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
cpubind: 0 1
nodebind: 0 1
membind: 0 1
Generally speaking, yeah, use numactl in front of xymonlaunch to ensure the entire process tree gets assigned to a single node. But it really depends on your workload (can everything fit in that node?) and what else is going on on the box. If you have something which analyzes xymon data in a large dump, then does heavy munging on it and sends it back, it might be better to have that on a different node than (say) the xymond_* worker modules.
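As a rough sketch of what "numactl in front of xymonlaunch" could look like, the helper below builds the command line rather than running it. The helper name and the xymonlaunch path are assumptions for illustration only; `--cpunodebind`, `--membind` and `--preferred` are standard numactl options. `--membind` is strict, so it only makes sense if everything fits in that node; `--preferred` falls back to other nodes when the chosen one is exhausted.

```python
import shlex

def pinned_command(node, command, strict=True):
    """Prefix `command` with numactl so it runs on the CPUs of one NUMA node.

    strict=True  -> --membind: allocate memory only from that node.
    strict=False -> --preferred: prefer that node, fall back to others.
    """
    mem_opt = f"--membind={node}" if strict else f"--preferred={node}"
    return ["numactl", f"--cpunodebind={node}", mem_opt] + list(command)

if __name__ == "__main__":
    # Hypothetical install path; adjust to wherever xymonlaunch lives locally.
    xymonlaunch = ["/usr/lib/xymon/server/bin/xymonlaunch"]
    print(shlex.join(pinned_command(0, xymonlaunch)))
    print(shlex.join(pinned_command(0, xymonlaunch, strict=False)))
```

Child processes inherit the policy, which is why wrapping the launcher covers the whole process tree.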
'numastat -s -z -p xymon' is your friend
The RH Performance Tuning and Resource Management guides are definitely useful reading as well. I'm sure there's plenty of cgroup stuff that could be helpful if/when the time came, but there are only so many hours in the day and there's other low-hanging fruit at the moment :)
I'd definitely start with running the 'numad' service and seeing what it does over time; it really could be all that you need.
HTH,
-jc
Xymon mailing list Xymon at xymon.com http://lists.xymon.com/mailman/listinfo/xymon