Looking to monitor ~20,000 hosts.
Hi all,
We are attempting to set up Xymon at our organization, where it would be monitoring approximately 20,000 hosts with the Xymon client in central mode (pulldata). Just wondering if anyone has Xymon monitoring anywhere close to that number of machines? We are looking for ideas on what hardware specs would be required for the machine running the server side of the Xymon software, or other suggestions on how to set up the environment. Is there a way to load balance multiple Xymon servers?
Thanks, Matt
The information contained in this e-mail, and any attachment, is confidential and is intended solely for the use of the intended recipient. Access, copying or re-use of the e-mail or any attachment, or any information contained therein, by any other person is not authorized. If you are not the intended recipient please return the e-mail to the sender and delete it from your computer. Although we attempt to sweep e-mail and attachments for viruses, we do not guarantee that either are virus-free and accept no liability for any damage sustained as a result of viruses.
Please refer to http://disclaimer.bnymellon.com/eu.htm for certain disclosures relating to European legal entities.
On Thu, May 15, 2014 7:26 am, Weber, Matt wrote:
Hi all,
We are attempting to set up Xymon at our organization, where it would be monitoring approximately 20,000 hosts with the Xymon client in central mode (pulldata). Just wondering if anyone has Xymon monitoring anywhere close to that number of machines? We are looking for ideas on what hardware specs would be required for the machine running the server side of the Xymon software, or other suggestions on how to set up the environment. Is there a way to load balance multiple Xymon servers?
Thanks, Matt
We're checking a fairly large number of "things" in Xymon, but many are not actual servers (although they do receive client messages).
We've never used the xymonfetch utility at that scale (although it was one of the options we considered as we grew out), but that might be an area to look into as I'm not sure what its parallelization capabilities are.
We've made heavy use of the "backfeed queue" that Henrik introduced in 4.3.13 to help keep the core xymond efficient at processing messages. We're also using a --bfq option to xymonproxy to allow it to receive incoming one-way TCP messages from various systems, close the connection, and drop them onto said BFQ queue. This frees xymond up for simply a) handling two-way messages and requests for data, and b) channel management.
Currently we have ~285000 host+svc combinations (ignoring info/trends) and ~75000 logical "hosts" (though most of those aren't servers per se). We process about 2800 msgs/s on a 32-way box with a load average of about 7. Lots of RAM. We've moved our pollers off the main server, but that was mainly to stop long-running network breaks from causing lots of alerts, more than any efficiency need. If not for that, everything would be running on a pair of redundant boxes.
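As a rough back-of-the-envelope check on those figures, the sketch below divides the number of host+svc combinations by the message rate to get the implied average interval per combination, and projects a message rate for a 20,000-host site. The function names, the 10 statuses per host, and the assumption that every message is a status update are illustrative guesses, not measurements from this installation; 300 seconds is the usual Xymon client reporting interval.

```python
def implied_interval_seconds(combinations: int, msgs_per_second: float) -> float:
    """Average seconds between messages per host+svc combination.

    Crude: treats every message as a status update, which is an assumption.
    """
    return combinations / msgs_per_second


def projected_msgs_per_second(hosts: int, statuses_per_host: int,
                              interval_seconds: float) -> float:
    """Steady-state message rate if every status is refreshed once per interval."""
    return hosts * statuses_per_host / interval_seconds


if __name__ == "__main__":
    # Figures quoted above: ~285000 combinations at ~2800 msgs/s.
    print(f"{implied_interval_seconds(285_000, 2_800):.0f} s per combination")

    # Hypothetical 20,000-host site, 10 statuses per host (assumed), 300 s interval.
    print(f"{projected_msgs_per_second(20_000, 10, 300):.0f} msgs/s projected")
```

Under those assumptions a 20,000-host site would sit well below the rate quoted above, but the real number depends heavily on how many tests each host reports.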
We're using the most recent RPMs at http://terabithia.org/rpms/xymon/testing/el6/ in production at this time.
HTH, -jc
Hi Matt,
the reply by J.C. Cleaver pretty much obsoletes my message-in-progress but it may be of some help anyway (see <original reply> below).
Like J.C., I’m not sure whether xymonfetch (pulldata) is fast enough. In addition, it requires extra setup of msgcache on all systems, requires a TCP port to be reachable from the Xymon server, and adds extra latency (depending on the configured poll interval).
Depending on your network topology, using ssh-tunnel (shared by Padraig Lennon) in conjunction with xymonproxy might be an alternative to using Xymon satellite servers. I published two blog posts about that [1][2] and have a patched version of ssh-tunnel (along with documentation) describing the setup [3].
Probably old news: You should definitely route the xymon messages through xymonproxy.
HTH Thomas
[1] http://www.it-eckert.com/blog/2014/remote-site-monitoring-with-ssh-tunnel/
[2] http://www.it-eckert.com/blog/2014/combine-ssh-tunnel-with-xymonproxy/
[3] http://www.it-eckert.com/software/patches/ssh-tunnel/
<original reply> There was a thread on the mailing list (Subject: Big Environment) a while ago:
http://lists.xymon.com/archive/2011-October/032605.html
One setup at least in the range you are looking for was with satellite xymon servers reporting to one central instance w/ ~10k hosts (contributed by Nicolas Lienard). That was on a 4x SSD RAID5 system.
Another setup (reported by Thomas Brand) had ~8k hosts on a single instance (but with 2 network test satellites). The disk setup was not reported in detail, but there were significant I/O issues.
Given that SSD technology has made massive progress since Oct. 2011, an 8-disk SSD system in RAID10 is definitely affordable nowadays, and I would _expect_ that the I/O problems can be solved with a modern RAID setup.
Both installations are _way_ below the 3300 msg/s limit where the backfeed-queue (http://sourceforge.net/p/xymon/code/HEAD/tree/branches/4.3.18/README.backfee...) feature is needed. </original reply>
On 15-05-2014 16:26, Weber, Matt wrote:
Hi all,
We are attempting to set up Xymon at our organization, where it would be monitoring approximately 20,000 hosts with the Xymon client in central mode (pulldata). Just wondering if anyone has Xymon monitoring anywhere close to that number of machines? We are looking for ideas on what hardware specs would be required for the machine running the server side of the Xymon software, or other suggestions on how to set up the environment. Is there a way to load balance multiple Xymon servers?
I wasn't quite up to that number of hosts when I had a large installation - only about 7000 hosts. But I expect that your main bottlenecks will be:
a) Disk I/O for the RRD files. Use SSD disks for those.
b) TCP sockets. All of your client messages end up being converted into status-messages, and sent back to xymond via a normal TCP connection, so you will be using lots of sockets on the Xymon server. Enabling the "backfeed" feature will fix that for you.
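To illustrate why point b) adds up, here is a minimal sketch of what delivering a single status message over "a normal TCP connection" involves: one connect, one send and one close per message. The message layout (`status HOST.TEST COLOR text`, with dots in the hostname written as commas) and port 1984 follow the standard Xymon protocol; the function names and the server name `xymon.example.com` are placeholders invented for this example.

```python
import socket


def build_status(host: str, test: str, color: str, text: str) -> bytes:
    """Build a Xymon status message; dots in the hostname become commas."""
    hostname = host.replace(".", ",")
    return f"status {hostname}.{test} {color} {text}\n".encode()


def send_to_xymond(message: bytes, server: str = "xymon.example.com",
                   port: int = 1984, timeout: float = 5.0) -> None:
    """Open a TCP connection, send one message, and close it again.

    Every call costs the server one socket, which is the overhead that
    piles up at thousands of messages per second.
    """
    with socket.create_connection((server, port), timeout=timeout) as sock:
        sock.sendall(message)
        sock.shutdown(socket.SHUT_WR)  # signal end of message to the server


if __name__ == "__main__":
    # Only builds and prints the message; call send_to_xymond() against a real server.
    print(build_status("web01.example.com", "cpu", "green", "load ok"))
```

The backfeed queue avoids this per-message connection setup for locally generated status messages, which is why it relieves the socket pressure described above.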
As for hardware specs, I think any "decent" server will work fine. Dual- or quad-core, and for memory a rough estimate is ~200 kB per host you monitor. So 4 GB for Xymon, and hence a 64-bit system.
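The memory figure is simple arithmetic on the ~200 kB per host rule of thumb; the small sketch below just makes it reusable for other host counts. The function name is made up for illustration, and the result uses decimal units (1 GB = 1,000,000 kB).

```python
def estimate_memory_gb(hosts: int, kb_per_host: float = 200.0) -> float:
    """Rough Xymon server memory estimate in GB from a per-host rule of thumb."""
    return hosts * kb_per_host / 1_000_000  # kB -> GB (decimal)


if __name__ == "__main__":
    for hosts in (7_000, 20_000, 75_000):
        print(f"{hosts:>6} hosts -> ~{estimate_memory_gb(hosts):.1f} GB")
```

For 20,000 hosts this gives the 4 GB mentioned above. That is about the whole address space of a 32-bit process, hence the 64-bit recommendation.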
I am curious to hear how it works out for you.
Regards, Henrik
participants (4)
- cleaver@terabithia.org
- henrik@hswn.dk
- matt.weber@bnymellon.com
- thomas.eckert@it-eckert.de