On Fri, August 18, 2023 01:31, Adam Thorn wrote:
On 17/08/2023 19:21, J.C. Cleaver wrote:
As it currently stands, xymon is already very efficient, so it's worth pointing out that the complexity really doesn't have to come in until you start hitting things at scale. If you're on a modern core and handling 90 msg/s, xymond is going to be just fine. If you're trying to push 9000 msg/s, more needs to be taken into consideration.
We see about 100 msgs/second, and CPU load on xymond has never even crossed our radar (we run xymond on a relatively low-spec VM with a single virtual CPU).
The place where we have historically encountered performance-related issues was writing data to RRD files. We have a lot of custom tests and like being able to store and graph numerical data. My colleague therefore implemented a custom xymond_channel listener that takes all our custom messages, which come in a whole range of different formats, parses them to extract the numerical fields we're interested in, and stores them in a PostgreSQL backend database rather than RRD files. This also meant a moderate amount of custom development work on the web frontend (including some external JS libs to draw graphs) to extract the data from the backend.
That wouldn't have been possible without the xymond_channel architecture along with the simple and well-defined format of the xymon messages themselves.
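For anyone who wants a feel for what such a listener involves, here is a minimal sketch in Python. It is not the code described above. It assumes the usual channel framing (a header line starting with "@@" and pipe-delimited, the message body, then a line containing only "@@"), assumes that on the status channel the header carries the timestamp, hostname and test name at positions 1, 4 and 5, and assumes the interesting values appear as "name : value" or "name = value" lines. The function names are made up for illustration, and the database write is left out: the rows are simply yielded.

```python
import re
from typing import Dict, Iterable, Iterator, List, Tuple

# Assumption: numeric fields look like "name : 42" or "name = 1.5".
NUMERIC_FIELD = re.compile(r"^\s*([A-Za-z_][\w.-]*)\s*[:=]\s*(-?\d+(?:\.\d+)?)\s*$")

Row = Tuple[str, str, float, str, float]  # host, test, timestamp, field, value


def iter_channel_messages(lines: Iterable[str]) -> Iterator[Tuple[List[str], List[str]]]:
    """Split a line stream into (header fields, body lines) pairs.

    Assumed framing: a header line starting with "@@", body lines,
    then a terminator line that is exactly "@@".
    """
    header = None
    body: List[str] = []
    for raw in lines:
        line = raw.rstrip("\n")
        if line == "@@":
            if header is not None:
                yield header, body
            header, body = None, []
        elif header is None:
            if line.startswith("@@"):
                header = line[2:].split("|")
            # Lines outside any message are ignored.
        else:
            body.append(line)


def extract_metrics(body: Iterable[str]) -> Dict[str, float]:
    """Pick out the lines that match the assumed "name : value" shape."""
    metrics: Dict[str, float] = {}
    for line in body:
        match = NUMERIC_FIELD.match(line)
        if match:
            metrics[match.group(1)] = float(match.group(2))
    return metrics


def rows_from_stream(lines: Iterable[str]) -> Iterator[Row]:
    """Yield one row per numeric field found in status messages."""
    for header, body in iter_channel_messages(lines):
        # Skip control messages and anything with an unexpectedly short header.
        if not header[0].startswith("status") or len(header) <= 5:
            continue
        timestamp = float(header[1])        # assumed header position
        host, test = header[4], header[5]   # assumed header positions
        for name, value in extract_metrics(body).items():
            yield host, test, timestamp, name, value
```

Run under xymond_channel, a worker built on this would loop over rows_from_stream(sys.stdin) and hand each row to whatever does the insert into the backend; real messages in mixed formats would need a parser per format rather than a single regex.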
Agreed. The simplicity of a streaming pipe of well-defined ASCII messages combined with a bus architecture makes it incredibly easy to extend or (in larger orgs) spew data out to another team to process in whatever manner they see fit.
We had a bigdata team that wanted to do multi-dimensional analysis on "metrics" and a data/DS feed into their cluster via xymond_channel solved the problem completely. Often the only sticking point is that xymond_channel's recipient needs to be able to handle the quantity of data that's coming at it efficiently. The SysV IPC for registered channel listeners flagging their message as taken is one of the very few pinch points in the system.
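As a rough illustration of "handling the quantity of data efficiently" on the recipient side, one generic pattern (not something specific to xymon) is to keep the loop that reads from the channel as cheap as possible and push the slow work, such as database writes or forwarding to another cluster, onto a worker thread behind a bounded queue. The sketch below is in Python; drain_with_worker, handle, batch_size and max_pending are names invented for the example.

```python
import queue
import threading
from typing import Callable, Iterable, List


def drain_with_worker(
    messages: Iterable[str],
    handle: Callable[[List[str]], None],
    batch_size: int = 100,
    max_pending: int = 10000,
) -> None:
    """Read messages quickly and let a worker thread process them in batches.

    `handle` is a hypothetical callback that does the slow part
    (for example a bulk insert). The queue is bounded, so if the worker
    falls more than `max_pending` messages behind, the reader blocks
    instead of growing memory without limit.
    """
    pending: "queue.Queue[object]" = queue.Queue(maxsize=max_pending)
    done = object()  # sentinel telling the worker to flush and stop

    def worker() -> None:
        batch: List[str] = []
        while True:
            item = pending.get()
            if item is done:
                break
            batch.append(item)  # type: ignore[arg-type]
            if len(batch) >= batch_size:
                handle(batch)
                batch = []
        if batch:
            handle(batch)  # flush the final partial batch

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    for message in messages:
        pending.put(message)  # cheap unless the queue is full
    pending.put(done)
    thread.join()
```

Batching matters more than the thread itself in most cases: one bulk write per hundred messages is usually far cheaper than a hundred single writes. Whether this is enough at several thousand messages per second depends entirely on what the downstream consumer can absorb.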
-jc