memory status not updated on some Solaris clients after upgrade
Hi,
Since upgrading to 4.3.13, the 'memory' tests on some Solaris clients have turned purple. All the other tests for these hosts are updated correctly, but svcstatus for memory reflects that no update has been applied and the rrd files memory.real and memory.swap are not being updated. None of the clients were upgraded, only the server.
The clientlog output for the memory and swap sections is indistinguishable in format between a Solaris host that works and one that doesn't.
Whilst checking the timestamp on the memory.real.rrd file, I noticed that some of the rrd files in both directories are up to 60 minutes old. However, if an attempt is made to access the data, for instance by opening the status page, all the rrd files seem to get updated. I am not sure this is relevant, but if anyone can explain it, I would be grateful.
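For anyone wanting to repeat the timestamp check, here is a minimal sketch in Python. The function name stale_rrd_files, the 10-minute default threshold and the example path in the comment are assumptions made for illustration, not anything taken from Xymon itself. It only reports on-disk modification times and says nothing about why a file is old.

```python
import os
import time


def stale_rrd_files(rrd_dir, max_age_seconds=600, now=None):
    """Return (filename, age_in_seconds) for .rrd files older than max_age_seconds.

    rrd_dir is the per-host RRD directory; where that lives depends on the
    installation (for example something like <xymon data dir>/rrd/<hostname>,
    which is an assumption here).
    """
    now = time.time() if now is None else now
    stale = []
    for name in sorted(os.listdir(rrd_dir)):
        if not name.endswith(".rrd"):
            continue  # ignore anything that is not an RRD file
        age = now - os.path.getmtime(os.path.join(rrd_dir, name))
        if age > max_age_seconds:
            stale.append((name, int(age)))
    return stale


# Example use (the directory is hypothetical):
#   for name, age in stale_rrd_files("/path/to/rrd/myhost"):
#       print(f"{name}: last written {age // 60} min ago")
```

Running it against the directory of a working host and a failing host side by side should show whether only the memory files lag behind or whether every file does.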
Does anyone have any suggestions as to how I can debug the errant Solaris memory tests? I can find nothing relevant in any of the logs tagged with the host name or even the word 'memory'. I have upgraded one Solaris client (4.3.10->4.3.13) just to rule it out, but as expected, it made no difference.
Thanks
Andy
Hi,
On 15 January 2014 10:16, Andy Smith <abs at shadymint.com> wrote:
[...]
If a Solaris host is stood up with no swap configured, any memory checks submitted to 4.3.13 (and possibly previous versions) are silently discarded. This affects most (but not all) of our non-global zones. This patch for xymond/client/solaris.c fixes the problem.
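To illustrate the class of problem (this is not the patch itself, which is in C and is not reproduced here), the sketch below shows one plausible way a report can be lost: if "swap is zero" is handled the same way as "swap data missing", the whole memory message gets dropped. It is written in Python for brevity. The function names, the dictionary layout and the assumption that the swap figures arrive as a `swap -s` style summary line are all hypothetical.

```python
import re

# Matches the summary line printed by `swap -s`, e.g.
# "total: 10492k bytes allocated + 7840k reserved = 18332k used, 21568k available"
_SWAP_S = re.compile(r"=\s*(\d+)k used,\s*(\d+)k available")


def parse_swap_summary(text):
    """Return (used_kb, available_kb), or None if the line cannot be parsed."""
    m = _SWAP_S.search(text or "")
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))


def memory_report(phys_total_mb, phys_used_mb, swap_text):
    """Build a memory report dict, or return None when there is nothing to report.

    Hypothetical illustration only: the point is to keep "swap data missing"
    and "swap total is zero" as two separate cases.
    """
    if phys_total_mb <= 0:
        return None  # no usable physical memory figure
    report = {"real": (phys_used_mb, phys_total_mb)}
    swap = parse_swap_summary(swap_text)
    if swap is not None:
        used_kb, avail_kb = swap
        total_kb = used_kb + avail_kb
        # A total of 0 is a valid answer (no swap configured): report 0%
        # instead of dividing by zero or discarding the whole message.
        pct = 0 if total_kb == 0 else round(100 * used_kb / total_kb)
        report["swap"] = (used_kb // 1024, total_kb // 1024, pct)
    # If swap could not be parsed, the physical memory figures are still reported.
    return report
```

With this shape, a host with no swap still yields a report for physical memory, and a zero swap total is shown as 0% used rather than being rejected.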
On the subject of non-global zones, it is necessary to run slightly different commands in a non-global zone to extract the correct information. This is especially true in respect of memory and swap and will be particularly visible to anyone running memory capped zones. I offer this patch to xymonclient-sunos.sh if anyone is interested.
Thanks
Andy
On 17-01-2014 21:20, Andy Smith wrote:
[...]
Thanks, applied. I don't know enough about Solaris to determine whether your client-script patch might break some setups, so I trust that you have tested it on different systems.
The change for server-side handling of the swap data seems sound, so I have no worries about that.
Regards, Henrik
participants (2): abs@shadymint.com, henrik@hswn.dk