OK, that string looks good to me. Although due to (presumably) HTML tags, it showed up as white on white in my email client (Gmail), so I'll reproduce it here, with formatting removed:
[uptime] 20:36 up 22 days, 8:17, 6 users, load averages: 2.12 2.12 1.61
For comparison, [uptime] from one of my Linux servers:
[uptime] 12:54pm up 178 days 20:49, 0 users, load average: 1.20, 1.44, 1.52
Your string has what xymond_client.c is looking for:
- the "[uptime]" section header
- the string "<space>up<space>", required to calculate uptime and include "up N days" in the CPU status
- the string "load average:<space>" or "load averages:<space>" followed by either "<float> <float> <float>" or "<float>,<space><float>,<space><float>"
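The three checks above can be sketched in Python. This is only an illustration, not the actual xymond_client.c parser: the function name check_uptime_section and the regular expression are made up for this example, it assumes the header and data sit on one line as shown in this email, and it is slightly more permissive than the description (it would accept a mix of comma and space separators).

```python
import re

# Hypothetical illustration only; NOT the real xymond_client.c code.
_LOAD_RE = re.compile(
    r"load averages?: "      # "load average: " or "load averages: "
    r"(\d+(?:\.\d+)?),? "    # 1-minute load, optional comma, then a space
    r"(\d+(?:\.\d+)?),? "    # 5-minute load, optional comma, then a space
    r"(\d+(?:\.\d+)?)"       # 15-minute load
)


def check_uptime_section(line):
    """Return the 5-minute load average, or None if any check fails.

    Assumes the "[uptime]" header and its data are on a single line,
    which is how the samples appear in this thread.
    """
    if not line.startswith("[uptime]"):
        return None          # section header missing
    if " up " not in line:
        return None          # needed for the uptime calculation
    match = _LOAD_RE.search(line)
    if match is None:
        return None          # no load average string with three numbers
    return float(match.group(2))   # the middle (5-minute) number
```

Feeding it the two [uptime] samples above should return the middle number of each.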
The uptime calculation is not related to graphing the CPU load average. However, that section of code has a debug line, so if you enable debugging, you might be able to use this to confirm that the uptime code is being run. If that code is being run, then the [uptime] section is making its way to the parser. The debug message is "CPU check host <hostname>: <uptime> days". There's another similar line showing hours of uptime. It might not be practical for you to run xymond_client with "--debug", depending on your environment. But it might be instructive if you can.
The fact that Xymon is sending a status "cpu" message for you is a good sign. But at that point, any thresholding that is going to happen, will have happened. So if you were to set LOAD to a very low level in analysis.cfg, you should be able to make the status yellow or red.
Then onto the graphing. The RRD file is populated from the "load=N" string in the CPU status message header. The CPU status page should have a first line that looks like:
"<timestamp> up: <uptime string>, <usercount> users, <proccount> procs, load=<loadavg>"
From one of my servers:
"Wed Jul 31 13:02:34 EST 2024 up: 178 days, 0 users, 225 procs, load=1.41"
The parser (within xymond_rrd) that grabs the load number for graphing looks at the "load=" string at the end of this line. But it only uses a line that contains "up:<space>" or "uptime:" or "Uptime:". The load value (in this case 1.41) can be "NN%" or "NN.NN" or "NN".
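To test a captured first line against those rules, something like the following Python sketch might help. It is not the xymond_rrd code: the function name load_from_status_line is hypothetical, anchoring the match at the very end of the line is my reading of "at the end of this line", and dropping the percent sign is an assumption about how "NN%" is treated.

```python
import re

# Hypothetical illustration only; NOT the real xymond_rrd code.
# Accepts "NN", "NN.NN" or "NN%" after "load=", at the end of the line.
_LOAD_VALUE_RE = re.compile(r"load=(\d+(?:\.\d+)?)%?\s*$")


def load_from_status_line(first_line):
    """Return the load value from a cpu status first line, or None."""
    markers = ("up: ", "uptime:", "Uptime:")
    if not any(marker in first_line for marker in markers):
        return None          # line would be ignored for graphing
    match = _LOAD_VALUE_RE.search(first_line)
    if match is None:
        return None          # no usable "load=" value at the end
    return float(match.group(1))
```

A line with anything after the number (other than a percent sign or whitespace) returns None in this sketch, which makes stray trailing characters easy to spot.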
It might be helpful to capture a raw CPU status message using "xymond_channel --channel=status", to see if the structure looks right, or if there are any spurious characters.
A couple of other things:
- it can take a few samples before an RRD file gets any numbers, so give it maybe 15 minutes before you conclude it's not working
- the RRD routines might be logging an error in rrd-status.log
- check that the RRD file doesn't already exist; remove it if it does (it might have incompatible structure)
- check that the xymon user has permissions to write to the directory where the RRD file is located
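The last two checks can be scripted. Here is a minimal Python sketch, assuming you already know the full path of the la.rrd file on your server (I have not guessed it here, so pass it in). The function name rrd_preflight is made up, and the writability result only means something if you run it as the xymon user.

```python
import os


def rrd_preflight(rrd_path):
    """Report basic filesystem facts about an RRD file path.

    rrd_path is whatever path your server uses for the host's la.rrd;
    it is a parameter because the location varies by installation.
    """
    directory = os.path.dirname(rrd_path) or "."
    return {
        # An existing file might have an incompatible structure.
        "file_exists": os.path.isfile(rrd_path),
        "dir_exists": os.path.isdir(directory),
        # Creating a file needs write and search permission on the directory.
        "dir_writable": os.access(directory, os.W_OK | os.X_OK),
    }
```

If file_exists is true and the graph is still empty, moving the old file aside is one way to test the incompatible-structure theory.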
J
On Wed, 31 Jul 2024 at 12:39, Kris Springer <kspringer@innovateteam.com> wrote:
Previously I had both a [cpu] section and an [uptime] section in my client script. I removed all references to [cpu] from my script so that only [uptime] would get sent to the server. I tried 'darwin', 'powershell', and 'linux' as the OSTYPE, and the text content of the uptime data does get displayed on the cpu webpage, but no graph is generated. Depending on which OSTYPE I choose in my script it slightly changes the text that's displayed on the cpu webpage.
Here's what the uptime output looks like in the clientlog when using 'darwin' as the ostype.
[uptime] 20:36 up 22 days, 8:17, 6 users, load averages: 2.12 2.12 1.61
I should note that the load average in that uptime output is totally wrong. I have a good block of code that gives the correct percentage output, but for now I'm just trying to figure out the secret to getting the la graph to generate.
Kris Springer
On 7/30/24 19:23, Jeremy Laidman wrote:
Kris
Glad you're making progress.
What have you been using for OSTYPE that works for everything but CPU?
Can you show an example of the section in the client message with CPU load information, which I think is likely to be [uptime]?
The Darwin client message parser (as many others do, including eg Linux and Solaris) essentially sends the "load average" string to the generic UNIX CPU parser. I'd assume the load average string is in [uptime] the same as for Linux. But whatever shows "load average:" or "load averages:" would work. If the UNIX parser sees either of these strings, followed by three numbers, it will pick out the middle (5 minute) number and use that for thresholding and constructing a status message. Then the status parser for "cpu" will take the value to use in graphing (ie updating the la.rrd file).
J
On Wed, 31 Jul 2024 at 10:28, Kris Springer <kspringer@innovateteam.com> wrote:
Thanks Zak. I finally got back to working on my Mac client and have made significant progress now that I'm sending things to the server in the correct format. I am now hitting an issue that I was also running into previously regarding the server not generating a cpu (la) graph. It seems that everything else works fine with my client, but no matter what I do I can't get the server to generate a cpu graph when sending the data as client $hostname.$ostype $clientclass
The cpu graph works perfectly if I send data as status $hostname.cpu green\n$cpuData
But of course I can't do that because it forces the color green and then I can't use the analysis.cfg tolerances on it. If I send it with no color it doesn't work at all. status $hostname.cpu $cpuData
I've looked all through the original XymonPSclient for Windows that I'm using as a functioning example, but I'm not seeing the magic sauce that makes the cpu graph work. All the other tests graphs work just fine, but this cpu graph has given me trouble since I first started this project. Any ideas?
I've tried the following $ostype $clientclass options with none seeming to change anything. linux freebsd netbsd openbsd darwin bbwin powershell generic
Kris Springer
On 7/10/24 03:42, Beck, Zak wrote:
Hi Kris
Jeremy is right – for server side analysis, don’t send multiple individual status messages for these core tests. I don’t think analysis is triggered for status messages.
Instead, send one ‘client’ message with [cpu], [disk], [memory] sections, and send it as a client message, e.g. this from the Windows Powershell client:
client $($clientname).$($script:XymonSettings.clientsoftware) $($script:XymonSettings.clientclass) XymonPS
The defaults here are ‘powershell’ for clientsoftware (which is really OSTYPE) and clientclass.
If you have a Windows host running the Powershell client, you can see the last client message sent to xymon in the xymon-lastcollect.txt file. This is verbatim what is sent to the server via TCP.
https://www.xymon.com/help/manpages/man1/xymon.1.html
client[/COLLECTORID] HOSTNAME.OSTYPE [HOSTCLASS]
Used to send a "client" message to the Xymon server. Client messages are generated by the Xymon client; when sent to the Xymon server they are matched against the rules in the analysis.cfg(5) configuration file (https://www.xymon.com/help/manpages/man5/analysis.cfg.5.html), and status messages are generated for the client-side tests. The COLLECTORID is used when sending client-data that are additions to the standard client data. The data will be concatenated with the normal client data.
You need to find the right OSTYPE for MacOS… xymond/xymond_client.c in the server source is what handles these messages, you’ll see a bunch of includes to match different OS types:
#include "client/linux.c"
#include "client/freebsd.c"
#include "client/netbsd.c"
#include "client/openbsd.c"
#include "client/solaris.c"
#include "client/hpux.c"
#include "client/osf.c"
#include "client/aix.c"
#include "client/darwin.c"
#include "client/irix.c"
#include "client/sco_sv.c"
#include "client/bbwin.c"
#include "client/powershell.c" /* Must go after client/bbwin.c */
#include "client/zvm.c"
#include "client/zvse.c"
#include "client/zos.c"
#include "client/mqcollect.c"
#include "client/snmpcollect.c"
#include "client/generic.c"
These included files handle the different OSTYPEs and differences in message formats between each (e.g. the output of the df command has different headings for some OSes).
My guess is Darwin is something like OS X. You’re right that there’s nothing for modern MacOS, but maybe Darwin would work.
Alternatively, if you can make the output resemble what you’d get from a Unix client then you could use OSTYPE = one of the unix types (possibly one of the BSDs, as MacOS has BSD origins).
I’d be tempted to start with darwin, and look in xymond/client/darwin.c - note that xymond_client is probably expecting to see some sections that you may not be sending yet:
timestr = getdata("date");
uptimestr = getdata("uptime");
clockstr = getdata("clock");
msgcachestr = getdata("msgcache");
whostr = getdata("who");
psstr = getdata("ps");
topstr = getdata("top");
dfstr = getdata("df");
inodestr = getdata("inode");
meminfostr = getdata("meminfo");
msgsstr = getdata("msgs");
netstatstr = getdata("netstat");
ifstatstr = getdata("ifstat");
portsstr = getdata("ports");
Cheers
Zak
From: Kris Springer <kspringer@innovateteam.com>
Sent: Wednesday, July 10, 2024 5:37 AM
To: Xymon mailinglist <xymon@xymon.com>; Jeremy Laidman <jeremy@laidman.org>
Subject: [External] [Xymon] Re: analysis tolerances not applying to scripted MacOS host
Understood, and thanks for the info. I thought it might be something like that. The problem is that there isn't a modern functioning MacOS xymon client that uses the xymond_client process. All the math is wrong in the 2015 version. The cpu, mem, and disk checks are all different in the modern MacOS versions than they were 10 years ago. For whatever reason Xymon seems to have ignored that Macs need monitoring too. I've got a good script that produces good data, but it's not using any xymond_client process. I'd be happy to share my code with anyone who can assist with getting this working the 'correct' way. We could wrap it up into an App.dmg for easy install. Then we'll finally have a MacOS client available just like we already have the Windows PS client and the Linux clients.
Kris Springer
On July 9, 2024 10:01:28 PM Jeremy Laidman <jeremy@laidman.org> wrote:
The analysis.cfg thresholds are applied in central mode. If you are operating in local mode (which you appear to be, as you are sending status messages from the client) then the CPU thresholding needs to be done on the client.
In central mode, the xymond_client process parses client messages for relevant sections (eg [df] for disk data, [uptime] for CPU load averages) and performs thresholding checks against settings in analysis.cfg. It's my understanding that in local mode, the xymond_client process runs on the client side, so if you're implementing a full client that runs in local mode, you have to replicate the behaviour of xymond_client.
It's far simpler to make a client that runs only in central mode. You construct a client message consisting of all of the sections you're interested in, and send it to the server for it to parse for thresholding and extracting metrics for rrd files. The structure of the client message contents is important, as the parser expects the sections to be in a format that matches the OS ID of the client ("darwin" for MacOS?).
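As a rough illustration of constructing such a client message, here is a Python sketch. The header follows the "client HOSTNAME.OSTYPE [HOSTCLASS]" form from the xymon(1) man page. Everything else is an assumption for illustration: the function names, the section layout (a "[name]" line followed by its data), replacing dots in the hostname with commas (the usual Xymon convention for message headers, as I understand it), and port 1984 as the port the server normally listens on.

```python
import socket


def build_client_message(hostname, ostype, hostclass, sections):
    """Build a client message string.

    sections maps a section name (e.g. "uptime") to its text.
    Assumption: dots in the hostname are written as commas in the header.
    """
    header = f"client {hostname.replace('.', ',')}.{ostype} {hostclass}"
    parts = [header]
    for name, body in sections.items():
        parts.append(f"[{name}]")            # section header line
        parts.append(body.rstrip("\n"))      # section data
    return "\n".join(parts) + "\n"


def send_to_xymon(message, server, port=1984, timeout=10.0):
    """Send one message over TCP, then close the sending side.

    Assumption: the server listens on port 1984; adjust if yours differs.
    Any reply from the server is ignored in this sketch.
    """
    with socket.create_connection((server, port), timeout=timeout) as conn:
        conn.sendall(message.encode("utf-8"))
        conn.shutdown(socket.SHUT_WR)
```

With all sections in one message like this, the server does the thresholding and the RRD updates, so the client never has to pick a color.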
J
On Wed, 10 July 2024, 07:28 Kris Springer, <kspringer@innovateteam.com> wrote:
I've been working to create a new MacOS Xymon Client since the last iteration I can find is using MacPorts and it's from 2015. I've tried both Python and Powershell using Homebrew and I've settled on Powershell. I may convert it to Python later once I get it finalized. It's just a single script that gathers host data and sends it to the server without any 2-way communication or client-local.cfg handshake stuff going on. I have data being sent successfully from a Mac laptop and the server is displaying everything fine with graphs and text outputs. But the server isn't applying the analysis.cfg tolerances to the host and I don't know why. Is there some magic piece of code that I'm overlooking or don't know exists that tells the server to compare the data to the tolerances in analysis.cfg?
My MacPSclient.ps1 is sending its data in this format, but the status is always green on the server. "status $hostname.cpu green\n$cpuData"
This also works, but the status is always green. "status $hostname.cpu green $timestamp $cpuData"
This also works, but the status is always green. "status $hostname.disk green\n[disk]\n$diskUsageToSend"
I can change the word green to yellow and it changes the color on the server, but if I remove the word green, no data is updated on the server side when the script is run. So I'm assuming the color tag is required. I've attempted to read how-tos, looked in other client scripts for a clue, and searched in forums, but I'm not finding anything. Can someone here help?
Thanks so much!
-- Kris Springer
Xymon mailing list -- xymon@xymon.com To unsubscribe send an email to xymon-leave@xymon.com