Suppress log contents from 'msgs' column?
In testing 4.3.20, I've noticed that the 'msgs' column (for hosts running the Xymon Solaris client) contains lines from my logs even when they are reporting 'green'. I don't really want the contents of /var/adm/messages leaked to every viewer of my Xymon web interface.
I now see that this behavior is also on my 4.3.17 server. I've just never noticed it because: A) I never visit any 'green' bubbles B) I only have three hosts running the xymon client
When the test goes 'red', I can see value in displaying the offending lines from the log, but I can't see any value in continually leaking arbitrary lines of my logs to arbitrary users. Is there an easy way to suppress these log lines from the green status messages?
Argh. I've only now noticed that the 'cpu' column is also spilling all of my running processes. I thought I had taken care of that problem when I suppressed the 'procs' column. Does anyone actually derive value from having this information continually published on the Xymon web interface for all the world to see?
Do things because you should, not just because you can.
John Thurston 907-465-8591 John.Thurston at alaska.gov Enterprise Technology Services Department of Administration State of Alaska
On Wed, May 13, 2015, at 11:43, John Thurston wrote:
In testing 4.3.20, I've noticed that the 'msgs' column (for hosts running the Xymon Solaris client) contains lines from my logs even when they are reporting 'green'. I don't really want the contents of /var/adm/messages leaked to every viewer of my Xymon web interface.
I now see that this behavior is also on my 4.3.17 server. I've just never noticed it because: A) I never visit any 'green' bubbles B) I only have three hosts running the xymon client
When the test goes 'red', I can see value in displaying the offending lines from the log, but I can't see any value in continually leaking arbitrary lines of my logs to arbitrary users. Is there an easy way to suppress these log lines from the green status messages?
Argh. I've only now noticed that the 'cpu' column is also spilling all of my running processes. I thought I had taken care of that problem when I suppressed the 'procs' column. Does anyone actually derive value from having this information continually published on the Xymon web interface for all the world to see?
Is your Xymon server open to the public internet? :-)
cpu: I think it's useful to capture those moments in time especially when you want to see what changed between green and red.
logs: I agree; it's a waste to capture/display the contents if you're not matching anything
On 5/13/2015 8:48 AM, Mark Felder wrote:
- snip -
Is your Xymon server open to the public internet? :-)
On Wed, May 13, 2015, at 11:43, John Thurston wrote:
In testing 4.3.20, I've noticed that the 'msgs' column (for hosts running the Xymon Solaris client) contains lines from my logs even when they are reporting 'green'. I don't really want the contents of /var/adm/messages leaked to every viewer of my Xymon web interface.
No, but my xymon server is available to anyone on my network and handles messages from clients in several departments. I don't see any reason or value in publishing my /var/adm/messages (or spilling my process list) for everyone to read.
cpu: I think it's useful to capture those moments in time especially when you want to see what changed between green and red.
Where is the value in continually spilling the process list when it is green? When there is an alarm, I can begin to see the value. But not on a day-to-day, always-green, basis.
The client normally only reports on a 5-minute interval. When it reports 'red' on 'cpu', is there really any value in knowing what it was idling on five minutes earlier? At best I'm going to care what is sucking the resources right now. In most cases, though, I don't even care about that. In the event of a 'red' notice, I go directly to the host and use my system tools to determine what is wrong.
logs: I agree; it's a waste to capture/display the contents if you're not matching anything
Are you aware of any way to suppress this display? I'd like to stop it.
I can switch to a different client, or suppress more columns, but I'd like to find a more elegant way to reach the goal.
Do things because you should, not just because you can.
John Thurston 907-465-8591 John.Thurston at alaska.gov Enterprise Technology Services Department of Administration State of Alaska
On Wed, May 13, 2015 10:07 am, John Thurston wrote:
On 5/13/2015 8:48 AM, Mark Felder wrote:
- snip -
Is your Xymon server open to the public internet? :-)
On Wed, May 13, 2015, at 11:43, John Thurston wrote:
In testing 4.3.20, I've noticed that the 'msgs' column (for hosts running the Xymon Solaris client) contains lines from my logs even when they are reporting 'green'. I don't really want the contents of /var/adm/messages leaked to every viewer of my Xymon web interface.
No, but my xymon server is available to anyone on my network and handles messages from clients in several departments. I don't see any reason or value in publishing my /var/adm/messages (or spilling my process list) for everyone to read.
cpu: I think it's useful to capture those moments in time especially when you want to see what changed between green and red.
Where is the value in continually spilling the process list when it is green? When there is an alarm, I can begin to see the value. But not on a day-to-day, always-green, basis.
The client normally only reports on a 5-minute interval. When it reports 'red' on 'cpu', is there really any value in knowing what it was idling on five minutes earlier? At best I'm going to care what is sucking the resources right now. In most cases, though, I don't even care about that. In the event of a 'red' notice, I go directly to the host and use my system tools to determine what is wrong.
logs: I agree; it's a waste to capture/display the contents if you're not matching anything
Are you aware of any way to suppress this display? I'd like to stop it.
I can switch to a different client, or suppress more columns, but I'd like to find a more elegant way to reach the goal.
I think it really comes down to your audience. xymon/hobbit/Big Brother were all designed around making life easier for sysadmins, so it tends to default showing more technical data and placing it within easy reach.
For precedent, we do have the following options to xymond_client currently:
--no-ps-listing Normally the "procs" status message includes the full process-listing received from the client. If you prefer to just have the monitored processes shown, this option will turn off the full ps-listing.
--no-port-listing Normally the "ports" status message includes the full netstat-listing received from the client. If you prefer to just have the monitored ports shown, this option will turn off the full netstat-listing.
I could see one of two paths here:
a) adding individual flags for each of the remaining test results xymond_client generates, controlling the raw data on the status page, or
b) a single master "raw data on status" flag used for all at once
They're not mutually exclusive, of course (most restrictive flag would win), but I'm not sure on the various use cases out there.
One thing to keep in mind is that if you're doing server-side configuration with dynamic status message (I'd have to check on the static HTML output, since I haven't used it in quite a while), anyone who can read the status message will also be able to read the Client Data link, which contains the raw data anyway. So if this is to hide data, I feel like it really only makes sense when clients are in --local evaluation mode.
OTOH, there are other reasons besides security to not display data in the status message: it's more legible, and it's less data to re-transmit internally (like netstat output on huge servers).
And finally, there are middle ground options that could be patched in. xymongen and svcstatus.cgi still see everything in the message output as a blob to stick into a <PRE> tag, but there's no reason a small disclosure triangle widget couldn't be used to hide the things below the '&color' lines, which serve as de-facto sub-statuses.
Thoughts?
-jc
J.C. Cleaver wrote:
On Wed, May 13, 2015 10:07 am, John Thurston wrote:
On 5/13/2015 8:48 AM, Mark Felder wrote:
- snip -
Is your Xymon server open to the public internet? :-)
No, but my xymon server is available to anyone on my network and handles messages from clients in several departments. I don't see any reason or value in publishing my /var/adm/messages (or spilling my process list) for everyone to read.
On Wed, May 13, 2015, at 11:43, John Thurston wrote:
In testing 4.3.20, I've noticed that the 'msgs' column (for hosts running the Xymon Solaris client) contains lines from my logs even when they are reporting 'green'. I don't really want the contents of /var/adm/messages leaked to every viewer of my Xymon web interface.
cpu: I think it's useful to capture those moments in time especially when you want to see what changed between green and red.
Where is the value in continually spilling the process list when it is green? When there is an alarm, I can begin to see the value. But not on a day-to-day, always-green, basis.
The client normally only reports on a 5-minute interval. When it reports 'red' on 'cpu', is there really any value in knowing what it was idling on five minutes earlier? At best I'm going to care what is sucking the resources right now. In most cases, though, I don't even care about that. In the event of a 'red' notice, I go directly to the host and use my system tools to determine what is wrong.
logs: I agree; it's a waste to capture/display the contents if you're not matching anything
Are you aware of any way to suppress this display? I'd like to stop it.
I can switch to a different client, or suppress more columns, but I'd like to find a more elegant way to reach the goal.
I think it really comes down to your audience. xymon/hobbit/Big Brother were all designed around making life easier for sysadmins, so it tends to default showing more technical data and placing it within easy reach.
For precedent, we do have the following options to xymond_client currently:
--no-ps-listing Normally the "procs" status message includes the full process-listing received from the client. If you prefer to just have the monitored processes shown, this option will turn off the full ps-listing.
--no-port-listing Normally the "ports" status message includes the full netstat-listing received from the client. If you prefer to just have the monitored ports shown, this option will turn off the full netstat-listing.
I could see one of two paths here:
a) adding individual flags for each of the remaining test results xymond_client generates, controlling the raw data on the status page, or
b) a single master "raw data on status" flag used for all at once
They're not mutually exclusive, of course (most restrictive flag would win), but I'm not sure on the various use cases out there.
One thing to keep in mind is that if you're doing server-side configuration with dynamic status message (I'd have to check on the static HTML output, since I haven't used it in quite a while), anyone who can read the status message will also be able to read the Client Data link, which contains the raw data anyway. So if this is to hide data, I feel like it really only makes sense when clients are in --local evaluation mode.
OTOH, there are other reasons besides security to not display data in the status message: it's more legible, and it's less data to re-transmit internally (like netstat output on huge servers).
And finally, there are middle ground options that could be patched in. xymongen and svcstatus.cgi still see everything in the message output as a blob to stick into a <PRE> tag, but there's no reason a small disclosure triangle widget couldn't be used to hide the things below the '&color' lines, which serve as de-facto sub-statuses.
Thoughts?
-jc
We find the listings of both procs and ports incredibly useful for post incident forensics, we would not want to be without it. What about per-host flags similar to HIDEHTTP which performs a similar function for sensitive web page checks?
Andy
On 5/13/2015 9:20 AM, J.C. Cleaver wrote:
On Wed, May 13, 2015 10:07 am, John Thurston wrote:
On 5/13/2015 8:48 AM, Mark Felder wrote:
- snip -
Is your Xymon server open to the public internet? :-)
On Wed, May 13, 2015, at 11:43, John Thurston wrote:
In testing 4.3.20, I've noticed that the 'msgs' column (for hosts running the Xymon Solaris client) contains lines from my logs even when they are reporting 'green'. I don't really want the contents of /var/adm/messages leaked to every viewer of my Xymon web interface.
No, but my xymon server is available to anyone on my network and handles messages from clients in several departments. I don't see any reason or value in publishing my /var/adm/messages (or spilling my process list) for everyone to read.
. . .
I think it really comes down to your audience. xymon/hobbit/Big Brother were all designed around making life easier for sysadmins, so it tends to default showing more technical data and placing it within easy reach.
To me, xymon/hobbit/BB are alerting tools. Their purpose is to tell me "A threshold you defined has been exceeded. You'd better go figure out if there is a problem brewing!" When Xymon has done this, its job is done. I don't expect it to do much more.
It's silly (to me, anyway) to think I can predict all the information I will need to diagnose or correct future host problems and pre-populate Xymon with that information. To know what information I might need, I'd need to know what problem I am going to have. If I know what problem I'm going to have, I should take preemptive steps to avoid having the problem.
For precedent, we do have the following options to xymond_client currently: --no-ps-listing ... --no-port-listing ...
Thank you for pointing these out. This is just the sort of thing I was hoping to find. It helps clean up some of the leaked information. It still leaves me 'cpu' and 'msgs', but this is a start.
I could see one of two paths here:
a) adding individual flags for each of the remaining test results xymond_client generates, controlling the raw data on the status page, or
b) a single master "raw data on status" flag used for all at once
(a) would mean any new client capabilities would need to be tightly coupled to the server so they weren't left out. It is flexible, but it could be a hassle.
(b) would probably meet my needs nicely. I just want the status. It's nice to have an indication of what breached the threshold, but tailing /var/adm/messages into the HTML page for every green update seems pointless. Maybe "IncludeRawData=color[,color]" so it could be included with red and yellow but not with green.
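The "IncludeRawData=color[,color]" idea above could be sketched roughly as follows. This is only an illustration in Python, not Xymon code: IncludeRawData is the option name floated in this thread and does not exist in Xymon, the function names are made up, and the rule "the first line plus any line starting with an '&color' marker is the evaluated summary, everything else is raw data" is an assumption about how status messages are laid out.

```python
# Hypothetical sketch: "IncludeRawData" is only a proposal from this thread,
# not an existing Xymon setting. Function names are invented for illustration.
COLOR_MARKERS = ("&green", "&yellow", "&red", "&clear", "&purple", "&blue")


def parse_include_raw_data(value):
    """Parse a value such as 'red,yellow' into a set of lowercase color names."""
    return {c.strip().lower() for c in value.split(",") if c.strip()}


def trim_status(message, color, include_colors):
    """Return the status text, dropping raw data unless the color is listed.

    Assumption: the first line and every line that starts with an '&color'
    marker form the evaluated summary; all other lines are raw client data.
    """
    if color.lower() in include_colors:
        return message
    lines = message.splitlines()
    kept = lines[:1] + [
        ln for ln in lines[1:] if ln.lstrip().startswith(COLOR_MARKERS)
    ]
    return "\n".join(kept)
```

With an include set built from "red,yellow", a green 'msgs' status would keep only its summary lines, while a red or yellow one would pass through unchanged.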
See below for an idea for (c).
One thing to keep in mind is that if you're doing server-side configuration with dynamic status message . . . .
Only three of my clients are in "central" mode, and those are only in that mode because they are running on my Xymon servers and come out of the build-process configured that way. All of my other clients are running older BB or BBPe clients. Which means in my case, I should probably pursue option:
(c) I suppress the columns I don't want on the three clients that display them. I suspect that my desire to suppress this information is a fringe-case and not worth the effort to incorporate.
-- Do things because you should, not just because you can.
John Thurston 907-465-8591 John.Thurston at alaska.gov Enterprise Technology Services Department of Administration State of Alaska
On 14 May 2015 at 06:32, John Thurston <john.thurston at alaska.gov> wrote:
To me, xymon/hobbit/BB are alerting tools. Their purpose is to tell me "A threshold you defined has been exceeded. You'd better go figure out if there is a problem brewing!" When Xymon has done this, its job is done. I don't expect it to do much more.
Personally, I see Xymon as much more than an alerting tool. It's also critical for forensics. When a fault has been detected, the graphs and snapshot reports are extremely valuable for working out what historical factors may be relevant to a fault.
Two ways I use Xymon for forensics:
If an event has a history, there might be a pattern that can enlighten the cause (eg disk space problems at the start of every month) or a coincident event (eg packet loss concurrent with a spike in disk I/O).
If a threshold measure has a short-term spike or a long-term slow increase, then identifying when the metric started its incline can help pin down the change or event that caused it.
"Go fix it" helps with the immediate problem and its purpose is tactical, for the short term. But looking to the past can help prevent recurrence in the future.
In the specific case of a CPU load fault, it can be valuable to know what processes are new - in other words, what wasn't running 5 minutes before the event, that was running after the event. In some cases a new process lifetime can be gleaned from the STIME column in the output of "ps -ef". In other cases, it might be a process that is run from cron or inetd, or in a while loop, and doesn't have a very long lifetime. Or there might be a situation where you have a clean-up process that has crashed, and you might want to know what was running that is no longer. In reality, these are somewhat contrived scenarios, and I have no concrete examples to prove that it can happen. But in your own words, it's "silly to think [we] can predict all the information [we'll] need", and so in my opinion (and experience) the more, the better.
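The before/after comparison described above can be sketched like this. It is a minimal Python illustration, assuming the command strings have already been pulled out of two successive process listings by some other means; the function name and the sample commands are hypothetical, and this is not how Xymon itself stores or compares snapshots.

```python
def diff_process_snapshots(before, after):
    """Compare two lists of command strings from successive process listings.

    Returns the commands that appear only in the later snapshot ("started")
    and those that appear only in the earlier one ("stopped").
    """
    before_set, after_set = set(before), set(after)
    return {
        "started": sorted(after_set - before_set),
        "stopped": sorted(before_set - after_set),
    }


# Hypothetical snapshots taken five minutes apart.
earlier = ["/usr/sbin/sshd", "/usr/sbin/cron", "/opt/app/cleanup.sh"]
later = ["/usr/sbin/sshd", "/usr/sbin/cron", "/opt/app/report.pl"]
changes = diff_process_snapshots(earlier, later)
```

Because the comparison is on sets, a process that ran several times in one snapshot counts once, and a short-lived process that starts and exits between two reports is still missed.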
If security is the problem, then secure the data. Suppressing the data is only one way to secure the data, and doing so can have down-sides.
In my deployment, I limit unauthenticated access to Apache, so those who don't need to see my log files and process listings, don't get to see them, but those who might benefit, can see them.
Cheers Jeremy
On Wed, May 13, 2015 1:32 pm, John Thurston wrote:
On 5/13/2015 9:20 AM, J.C. Cleaver wrote:
I could see one of two paths here:
a) adding individual flags for each of the remaining test results xymond_client generates, controlling the raw data on the status page, or
b) a single master "raw data on status" flag used for all at once
(a) would mean any new client capabilities would need to be tightly coupled to the server so they weren't left out. It is flexible, but it could be a hassle.
(b) would probably meet my needs nicely. I just want the status. It's nice to have an indication of what breached the threshold, but tailing /var/adm/messages into the HTML page for every green update seems pointless. Maybe "IncludeRawData=color[,color]" so it could be included with red and yellow but not with green.
See below for an idea for (c).
I actually probably should have clarified these as xymond_client command line options rather than flags per se. Although it does bring up an issue in that the command line is hard-coded for local-configuration mode users. The only way to modify that is to edit the xymonclient.sh script.
Although an environment flag (cf. (c)) could be useful, a "LOCALCLIENTOPTS=" variable in clientlaunch.cfg would allow arbitrary options to be passed to xymond_client in these cases.
One thing to keep in mind is that if you're doing server-side configuration with dynamic status message . . . .
Only three of my clients are in "central" mode, and those are only in that mode because they are running on my Xymon servers and come out of the build-process configured that way. All of my other clients are running older BB or BBPe clients. Which means in my case, I should probably pursue option:
This is correct. If the BB clients are creating status messages rather than transmitting the raw client messages, then you'd need to edit/configure things there. It's morally equivalent to Xymon's local mode.
(c) I suppress the columns I don't want on the three clients that display them. I suspect that my desire to suppress this information is a fringe-case and not worth the effort to incorporate.
On Wed, May 13, 2015 12:42 pm, Andy Smith wrote:
What about per-host flags similar to HIDEHTTP which performs a similar function for sensitive web page checks?
I think there's a use case for both types of control here. A set of xymond_client --options (+a generic option) for not including anything beyond threshold evaluations in the status messages generated. This helps the --local config use case as well as provides easy global server control(*). Secondly, a 'HIDECLIENTDATA' flag in hosts.cfg read in by xymond_client could give the same effect on a per-host basis for specific exceptions.
*svcstatus.cgi could be altered to inhibit CLIENTLOG display for a host with this setting in hosts.cfg too, but given that the data was still present in the client channel to begin with, that's only a surface level of security. The "best" solution will be to use --local mode and never send that raw data to begin with.
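How a global option and a per-host flag might combine, with the most restrictive setting winning, can be sketched as below. This is a Python illustration only: HIDECLIENTDATA is the flag name proposed in this thread and is not an existing hosts.cfg tag, the function names and sample hosts are invented, and the only thing taken from hosts.cfg itself is the convention that tags follow a '#' after the IP address and hostname.

```python
def host_tags(hosts_cfg_line):
    """Return the set of tags that follow the first '#' on a hosts.cfg-style line."""
    _, _, tag_part = hosts_cfg_line.partition("#")
    return set(tag_part.split())


def include_raw_data(global_hide, hosts_cfg_line):
    """Decide whether raw client data belongs in the status message.

    Most restrictive wins: raw data is shown only when neither the
    (hypothetical) global option nor the (hypothetical) HIDECLIENTDATA
    host tag asks for it to be hidden.
    """
    return not (global_hide or "HIDECLIENTDATA" in host_tags(hosts_cfg_line))
```

For example, with the global option off, a line such as "10.0.0.5 db1.example.com # ssh HIDECLIENTDATA" would hide raw data for that one host while every other host keeps it.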
I don't think either of these types of options are ready for 4.3.20, but it's probably something that can be put into the next release pretty easily.
Regards,
-jc
On 5/13/2015 10:43 AM, John Thurston wrote:
In testing 4.3.20, I've noticed that the 'msgs' column (for hosts running the Xymon Solaris client) contains lines from my logs even when they are reporting 'green'. I don't really want the contents of /var/adm/messages leaked to every viewer of my Xymon web interface.
In client-local.cfg, where your central mode clients get the config that tells them what should be monitored and sent, any lines in your logs that match an "ignore" line will NOT be sent from the client to the server. Just define log filters that filter out anything unimportant or sensitive:
Here's one stanza in my client-local.cfg file, monitoring a gluster host. I ignore all lines at INFO (the " I " filter) and certain error/warning lines that I do not want to even be sent from the client to xymon:
[slc01nas2.REDACTED.com]
log:/var/log/glusterfs/etc-glusterfs-glusterd.vol.log:10240
ignore ( I )
log:/var/log/glusterfs/glustershd.log:10240
ignore ( I )
log:/var/log/glusterfs/nfs.log:10240
ignore xlator\/protocol\/client.so\(client_inodelk\+0x96\) \[0x7ffd96db3426\]\)\)\) 0-: Assertion failed: 0
ignore <gfid:00000000-0000-0000-0000-000000000000> failed .Invalid argument
ignore ( I )
log:/var/log/messages:10240
Here's the corresponding entry in analysis.cfg that actually produces alarms:
HOST=%slc01(dfs|nas)
    LOG %gluster "%( E )" COLOR=red
    PROC glusterd 1 red
    PROC glusterfs 1 red
    MEMACT 99 100
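The effect of the client-side "ignore" lines described above can be illustrated with a small Python sketch. It is not the Xymon client's implementation: Xymon uses its own regular-expression handling, whereas this uses Python's re module, and the function name and sample log lines are made up for the example.

```python
import re


def filter_log_lines(lines, ignore_patterns):
    """Drop every line that matches at least one of the ignore regexes."""
    compiled = [re.compile(p) for p in ignore_patterns]
    return [ln for ln in lines if not any(rx.search(ln) for rx in compiled)]


# Hypothetical gluster-style log lines; only the severity letter matters here.
sample = [
    "[2015-05-13 10:00:00] I [glusterd.c:1] 0-management: started",
    "[2015-05-13 10:00:01] E [nfs.c:2] 0-nfs: mount failed",
]
# Same pattern as the "ignore ( I )" lines in the stanza above.
remaining = filter_log_lines(sample, ["( I )"])
```

Lines dropped at this stage never leave the client, so they cannot show up in either the status message or the Client Data view on the server.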
Thanks, Shawn
participants (6)
- abs@shadymint.com
- cleaver@terabithia.org
- feld@feld.me
- hobbit@elyograg.org
- jlaidman@rebel-it.com.au
- john.thurston@alaska.gov