On Thu, May 15, 2014 7:26 am, Weber, Matt wrote:
Hi all,
We are attempting to set up Xymon at our organization, where it would be monitoring approximately 20,000 hosts with the Xymon client in central mode (pulldata). Just wondering if anyone has Xymon monitoring anywhere close to that number of machines? We are looking for ideas on what hardware specs would be required for the machine running the server side of the Xymon software, or other suggestions on how to set up the environment. Is there a way to load balance multiple Xymon servers?
Thanks, Matt
We're checking a fairly large number of "things" in Xymon, but many are not actual servers (although they do receive client messages).
We've never used the xymonfetch utility at that scale (although it was one of the options we considered as we grew), so that is an area to look into; I'm not sure what its parallelization capabilities are.
We've made heavy use of the "backfeed queue" (BFQ) that Henrik introduced in 4.3.13 to help keep the core xymond efficient at processing messages. We're also using a --bfq option to xymonproxy, which lets it receive incoming one-way TCP messages from various systems, close the connection, and drop them onto the BFQ. This frees xymond up to simply a) handle two-way messages and requests for data, and b) do channel management.
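For anyone unfamiliar with what a "one-way" message looks like on the wire, here is a rough Python sketch of a sender, not anything taken from our setup. It assumes the standard Xymon status message format ("status HOSTNAME.TESTNAME COLOR text", with dots in the hostname replaced by commas) and the default Xymon port 1984. The function names, the example hostname, and the server name are made up for illustration.

```python
import socket


def build_status(hostname: str, test: str, color: str, text: str) -> str:
    """Build a Xymon status message.

    Dots in the hostname are replaced with commas so the last dot
    unambiguously separates the host from the test name.
    """
    return f"status {hostname.replace('.', ',')}.{test} {color} {text}"


def send_one_way(server: str, message: str, port: int = 1984,
                 timeout: float = 10.0) -> None:
    """Send a message without waiting for any reply.

    The sender connects, writes the message, and closes its side of the
    connection. `server` is a hypothetical receiver (xymond or a proxy).
    """
    with socket.create_connection((server, port), timeout=timeout) as sock:
        sock.sendall(message.encode("utf-8"))
        sock.shutdown(socket.SHUT_WR)  # signal end of message
```

A call would look something like send_one_way("xymon.example.com", build_status("www.example.com", "http", "green", "OK")). Since the sender never waits on a response, the receiving side can queue the message and move on, which is the property the BFQ arrangement relies on.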
Currently we have ~285000 host+svc combinations (ignoring info/trends) and ~75000 logical "hosts" (though most of those aren't servers per se). We process about 2800 msgs/s on a 32-way box with a load average of about 7. Lots of RAM. We've moved our pollers off the main server, but that was mainly to keep long-running network breaks from causing lots of alerts rather than for any efficiency need. If not for that, everything would be running on a pair of redundant boxes.
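To put those numbers next to the 20,000-host question, here is a back-of-envelope calculation in Python. It is only a sketch: the 300-second reporting interval and the messages-per-host figure are assumptions, not measurements, and the function name is made up. Plug in your own values.

```python
def msgs_per_second(hosts: int, interval_s: float = 300.0,
                    msgs_per_host: float = 1.0) -> float:
    """Average message rate if every host reports once per interval.

    interval_s and msgs_per_host are assumptions for illustration.
    """
    return hosts * msgs_per_host / interval_s


# Figures quoted in this thread.
planned_hosts = 20_000
observed_rate = 2800           # msgs/s on our server
combos, logical_hosts = 285_000, 75_000

client_rate = msgs_per_second(planned_hosts)
# Assumed: 10 messages per host per cycle once derived statuses are counted.
busy_rate = msgs_per_second(planned_hosts, msgs_per_host=10)

print(f"client messages only: {client_rate:.0f} msgs/s")
print(f"with 10 msgs/host:    {busy_rate:.0f} msgs/s")
print(f"our statuses per logical host: {combos / logical_hosts:.1f}")
print(f"our observed rate:    {observed_rate} msgs/s")
```

Under those assumptions, 20,000 hosts come to roughly 67 client messages per second, or a few hundred per second with derived status messages counted, which is well under the rate we see here. Bursts will be higher than the average if clients report in step with each other.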
We're using the most recent RPMs at http://terabithia.org/rpms/xymon/testing/el6/ in production at this time.
HTH, -jc