FreeBSD Actual Memory Usage
Good Xymon Folks
Xymon doesn't support reporting "actual" memory usage for FreeBSD systems - that is, usage that doesn't count memory held for buffers and cache, which is still available when needed. It only reports, graphs and alerts on swap and physical memory usage. Some OSes use otherwise unused memory for filesystem caching and other performance-related purposes, so the reported free memory count goes down over time even though the memory available for use is not decreasing. On my systems, reported free memory is only a few percent, which gives no indication of the risk of memory "resource exhaustion". So what I really need is to report on "actual" memory usage. From what reading I've done, for FreeBSD this would be total memory minus ("free" + "inactive") memory, although that depends on who you ask.
The "actual" memory reporting is not only a problem for FreeBSD, but for all supported OSes except for Linux, IRIX and Windows. I suspect many of these OSes report their free memory to include used-but-available memory also, and so the "real" available memory is a useful number.
Even real memory usage reporting seems to have caused trouble in the past for FreeBSD, as Xymon has had to have its own client-side binary for getting the memory numbers, as is also the case for HPUX and the other *BSDs, NetBSD and OpenBSD. For all other OS types, standard OS tools (free, sar) are used to get memory usage numbers.
Memory usage reporting in Xymon seems to be quite a mixed bag, in general. Some clients report usage to [memory], some to [freemem], some to [meminfo], some to [free]. The Irix client has no specific memory usage report at all. This is by no means a complaint - I'm sure Henrik would have been much happier if all OSes had a standard memory query interface, and I suspect a lot of these different reporting methods were for legacy support.
So. I need to get actual memory reported for some FreeBSD systems, so I'm trying to work out the best way to do this. I'd like to fix it in a way that fits in with the "standard" model, but as I described above, there isn't really a "standard" model. Here are several ways I can think of to solve the problem:
- FreeBSD reports [top] header output, if top is installed (as do many OSes). The server-side code could simply grab the numbers from there. This would work for other OSes that don't report "active" memory, and could be a common interface to memory usage metrics. Any new OS that doesn't have extensive client support could simply report a [top] section in client data, and Xymon would magically start reporting actual free memory. The down-side to this is that top isn't installed everywhere. Although on my systems it is, so that's OK. The other down-side is that it requires patches to the server code.
- I could replace the "freebsd-meminfo" binary that comes with the Xymon client, so that the "free" figure has the "inactive" memory added (or whatever adjustments are appropriate). This doesn't solve the problem for any other OS. Perhaps that's OK - perhaps the problem is very much OS dependent because each OS has its own unique memory management. I think the binary can be replaced by a simple shell script that parses the "top" header or "sysctl vm.vmtotal" for the correct figures. (It seems that "top" gets all its numbers from sysctl anyway, so I'd do the latter, so as to avoid a dependency.)
- I could report each of the different memory metrics separately to Xymon: active, inactive, wired, cache, buffers, free. Then I can graph them all, and look for various conditions on each of them separately, or in certain combinations that make sense. This is the most flexible option, and would provide the highest degree of insight to someone trying to troubleshoot a sluggish server, but it requires a lot more work on both client and server. It's also specific to *BSD systems.
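To make the second option concrete, here is a minimal sketch of the arithmetic in Python. It is not the shipped freebsd-meminfo code. It assumes the counters are collected with something like "sysctl hw.pagesize vm.stats.vm.v_page_count vm.stats.vm.v_free_count vm.stats.vm.v_inactive_count" and arrive as "name: value" lines; the function names and the sample numbers are invented for illustration, and whether free + inactive is the right notion of "available" is still open.

```python
# Hypothetical sample of "name: value" lines as sysctl(8) prints them.
SAMPLE_SYSCTL = """\
hw.pagesize: 4096
vm.stats.vm.v_page_count: 1048576
vm.stats.vm.v_free_count: 51200
vm.stats.vm.v_inactive_count: 512000
"""

MB = 1024 * 1024


def parse_sysctl(text):
    """Turn 'name: value' lines into a dict of integers, skipping anything else."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        if sep and value.isdigit():
            values[name.strip()] = int(value)
    return values


def memory_summary(values):
    """Return total, free, available (free + inactive) and actual used, in MB."""
    page = values["hw.pagesize"]
    total = values["vm.stats.vm.v_page_count"]
    free = values["vm.stats.vm.v_free_count"]
    inactive = values["vm.stats.vm.v_inactive_count"]
    return {
        "total_mb": total * page // MB,
        "free_mb": free * page // MB,
        "available_mb": (free + inactive) * page // MB,
        # "Actual" usage as discussed above: total minus (free + inactive).
        "actual_used_mb": (total - free - inactive) * page // MB,
    }


summary = memory_summary(parse_sysctl(SAMPLE_SYSCTL))
```

With these made-up counters, plain "free" is about 5% of total memory, while free + inactive is more than half of it, which is the gap the current report hides.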
So, any other suggestions on the best way to achieve this? Which of the above is the best approach, do you think?
The other issue I have is that nobody seems to agree on what's a useful measure to keep an eye on. The Xymon server-side code for Darwin reports used memory as the sum of active, inactive and wired. But other sources use the sum of active, wired, cache and buffers. Yet other sources say that buffers cannot be freed, and also that inactive pages are kind-of available if needed. My intention is to be able to predict when it's time to add RAM to avoid performance degradation, but it's not clear what numbers are going to give me that.
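To show how far apart these definitions can land, here is a small Python sketch that applies each of them to one set of numbers. The figures (in MB) are invented for illustration, not taken from a real host, and the function name is hypothetical.

```python
# Hypothetical memory figures in MB, shaped like the categories "top" shows.
SAMPLE_MB = {
    "total": 4096,
    "active": 1200,
    "inactive": 2000,
    "wired": 600,
    "cache": 50,
    "buffers": 400,
    "free": 200,
}


def used_candidates(m):
    """Compute "used" memory under each definition mentioned in this thread."""
    return {
        # What the Xymon server-side code for Darwin is described as doing.
        "active+inactive+wired": m["active"] + m["inactive"] + m["wired"],
        # The alternative quoted from other sources.
        "active+wired+cache+buffers": m["active"] + m["wired"] + m["cache"] + m["buffers"],
        # Total minus (free + inactive), treating inactive pages as available.
        "total-free-inactive": m["total"] - m["free"] - m["inactive"],
    }


candidates = used_candidates(SAMPLE_MB)
for name, used in candidates.items():
    print(f"{name}: {used} MB ({100 * used / SAMPLE_MB['total']:.1f}% of total)")
```

On these sample figures the three definitions differ by a factor of about two, so the choice of definition matters more than the choice of threshold.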
Cheers Jeremy
On Thu, Nov 21, 2013, at 0:23, Jeremy Laidman wrote:
> Good Xymon Folks
> Xymon doesn't support reporting "actual" memory usage for FreeBSD systems
Correct, and this is quite annoying!
> - I could report each of the different memory metrics separately to Xymon: active, inactive, wired, cache, buffers, free. Then I can graph them all, and look for various conditions on each of them separately, or in certain combinations that make sense. This is the most flexible option, and would provide the highest degree of insight to someone trying to troubleshoot a sluggish server, but it requires a lot more work on both client and server. It's also specific to *BSD systems.
Yes, more data is better. For example, look at what Observium pulls over SNMP vs what Xymon reports:
> So, any other suggestions on the best way to achieve this? Which of the above is the best approach, do you think?
> The other issue I have is that nobody seems to agree on what's a useful measure to keep an eye on. The Xymon server-side code for Darwin reports used memory as the sum of active, inactive and wired. But other sources use the sum of active, wired, cache and buffers. Yet other sources say that buffers cannot be freed, and also that inactive pages are kind-of available if needed. My intention is to be able to predict when it's time to add RAM to avoid performance degradation, but it's not clear what numbers are going to give me that.
Graph it all as granularly as you can. Let the admins figure out what's important to monitor.
On 22 November 2013 08:49, Mark Felder <feld at feld.me> wrote:
> Yes, more data is better. For example, look at what Observium pulls over SNMP vs what Xymon reports:
Now, that's what I want!
Interestingly, Observium(SNMP) splits all memory into used+cached+buffers+shared+free. I don't know where those numbers come from - they don't map neatly to what "top" shows: active+inactive+wired+cache+buffers+free. So used+shared = active+inactive+wired??
According to this thread, Net-SNMP counts cache memory twice when calculating MIB::memAvailReal.0, so it's a bit suspect: http://www.daemonforums.org/showthread.php?t=2125
>> So, any other suggestions on the best way to achieve this?
> Graph it all as granularly as you can. Let the admins figure out what's important to monitor.
You're correct of course. But it's the most work, and the least likely to get completed anytime soon.
A bigger problem is that Xymon's genericised way of reporting memory is a call to unix_memory_report() with parameters for total, used and actual - and that's all. (For FreeBSD and others, "actual" is set to -1.) The function unix_memory_report() does the memory threshold checks (via status message) and also governs what gets sent to the RRD files. If I wanted to alert on all available memory numbers, and to have them all on the graph for the "memory" page, I'd have to find another way to get them sent to the RRD files and to check for threshold violations, because Xymon is simply not geared up to do this. And it probably won't ever be, because different OSes do memory management differently.
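As a rough illustration of that total/used/actual model, here is a Python sketch of a threshold check in which "actual" is simply skipped when it is -1. This is not Xymon's unix_memory_report(); the function names, the output line format and the percentage limits are all assumptions made for the example, and swap is left out for brevity.

```python
SEVERITY = {"green": 0, "yellow": 1, "red": 2}


def colour(used, total, warn_pct, panic_pct):
    """Return (percent used, colour) for one memory figure."""
    pct = 100 * used // total
    if pct >= panic_pct:
        return pct, "red"
    if pct >= warn_pct:
        return pct, "yellow"
    return pct, "green"


def memory_report(total, used, actual, phys_limits=(100, 101), act_limits=(90, 97)):
    """Check physical and (if known) actual usage; the limits are illustrative."""
    rows = [("Physical", used, phys_limits)]
    if actual != -1:  # -1 means the client did not report an "actual" figure
        rows.append(("Actual", actual, act_limits))

    worst = "green"
    lines = []
    for label, value, (warn, panic) in rows:
        pct, col = colour(value, total, warn, panic)
        lines.append(f"{label}: {value}M of {total}M ({pct}%) {col}")
        if SEVERITY[col] > SEVERITY[worst]:
            worst = col
    return worst, lines


# FreeBSD today: no "actual" figure, so only physical usage is checked.
status_without_actual = memory_report(4096, 3896, -1)
# With an "actual" figure supplied, a second row is checked as well.
status_with_actual = memory_report(4096, 3896, 3800)
```

The sketch also shows the limitation described above: anything beyond these figures (wired, cache, buffers and so on) has no place in this interface.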
I think what I'm left with is a two-prong approach.
Improve the "memory" page: I need to have "actual" memory reported by the client, and parsed by the OS-specific code in xymond, so that it checks thresholds on, and generates a status message with, three numbers instead of two. This needs adjustments to the client-side code client/freebsd-meminfo.c, to add an "Actual: nnn" line to its output; and also to the server-side code xymond/client/freebsd.c, to parse that line in the same way that the Linux code does.
Display the extra numbers: I need to get all the separate numbers - perhaps from [top] - reported into a completely separate graph (eg [topmem]), that can be viewed on the trends page. I can knock up a server-side perl script to do that right now, but ultimately this would be best done in the Xymon server-side code (probably xymond/client/freebsd.c), and could include thresholding if it makes sense.
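For the second prong, a server-side script mostly needs to pick the "Mem:" line out of the [top] section and split it into numbers. Below is a minimal Python sketch of that parsing step (the thread mentions perl; Python is used here only for illustration). The sample line follows the usual FreeBSD top header layout but is invented, the function names are hypothetical, and the "name : value" output is just one plausible shape for feeding a separate graph such as [topmem].

```python
import re

# Multipliers for the size suffixes that top uses in its memory summary.
UNIT = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3, "T": 1024 ** 4}

# Hypothetical sample of the memory line from a FreeBSD top header.
SAMPLE_LINE = "Mem: 1200M Active, 2000M Inact, 600M Wired, 50M Cache, 400M Buf, 200M Free"


def parse_top_mem(line):
    """Return {label: bytes} for a 'Mem:' header line from top."""
    if not line.startswith("Mem:"):
        raise ValueError("not a top memory line")
    metrics = {}
    for number, unit, label in re.findall(r"(\d+)([KMGT]?)\s+([A-Za-z]+)", line[4:]):
        metrics[label.lower()] = int(number) * UNIT[unit]
    return metrics


def as_name_value_lines(metrics):
    """Format the metrics as 'name : value' lines, one per metric."""
    return [f"{name} : {value}" for name, value in metrics.items()]


metrics = parse_top_mem(SAMPLE_LINE)
report_lines = as_name_value_lines(metrics)
```

Keeping the labels exactly as top prints them (lower-cased) avoids having to decide up front which of the numbers matter.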
J
On Nov 21, 2013, at 20:00, Jeremy Laidman <jlaidman at rebel-it.com.au> wrote:
> You're correct of course. But it's the most work, and the least likely to get completed anytime soon.
Let me save you a ton of work. Everything you need should be obtained from sysctl.
On 22 November 2013 13:10, Mark Felder <feld at feld.me> wrote:
> Let me save you a ton of work. Everything you need should be obtained from sysctl.
I can get these numbers from the [top] client data, because top gets them from sysctl system calls. So the client side really is the easy part. Most of the work is in making changes to the Xymon server code - not just because it's code, but also because it needs to be turned into a patch and submitted for inclusion, vetted and approved by Henrik, then the new code packaged up and installed onto Xymon servers.
J
participants (2)
- feld@feld.me
- jlaidman@rebel-it.com.au