Xymon client doesn't clean up all of its children
~~ I can't verify on other OSes right now, so I'm hoping someone can chime in ~~
On FreeBSD when I stop the Xymon client process it doesn't clean up all of its children. Primarily you'll find that the vmstat command is not sent a signal and continues to run ... indefinitely?
As a result, restarting the Xymon client leaves vmstat processes around that really should not be there. I'm working around this right now, but is it possible this is being seen on other platforms and it's a bug in the client shutdown code?
Thanks!
On 2/26/2015 9:14 AM, Mark Felder wrote:
~~ I can't verify on other OSes right now, so I'm hoping someone can chime in ~~
On FreeBSD when I stop the Xymon client process it doesn't clean up all of its children. Primarily you'll find that the vmstat command is not sent a signal and continues to run ... indefinitely?
I have observed this behavior on Solaris, but the vmstat does eventually disappear. It does not run forever.
-- Do things because you should, not just because you can.
John Thurston 907-465-8591 John.Thurston at alaska.gov Enterprise Technology Services Department of Administration State of Alaska
However, the result is that you can't do a restart on Solaris 10 with SMF if you are using vmstat. I have patched my scripts on Solaris to kill the child processes.
*Note: UMDNJ is now Rutgers-Biomedical and Health Sciences*
Ryan Novosielski - Senior Technologist, Rutgers Biomedical and Health Sciences
novosirj at rutgers.edu - 973/972.0922 (2x0922)
OIRT/High Perf & Res Comp - MSB C630, Newark
Xymon mailing list Xymon at xymon.com http://lists.xymon.com/mailman/listinfo/xymon
On 2/26/2015 9:44 AM, Novosielski, Ryan wrote:
However, the result is that you can't do a restart on Solaris 10 with SMF if you are using vmstat. I have patched my scripts on Solaris to kill the child processes.
Ahhh. I haven't run into this problem because I'm not trying to use SMF to control it. I use "~/server/xymon.sh restart" if I want to restart it interactively as the non-privileged user. As root, my init script does about the same thing.
For future reference, how did you modify your manifest or scripts to meet your needs?
Do things because you should, not just because you can.
John Thurston 907-465-8591 John.Thurston at alaska.gov Enterprise Technology Services Department of Administration State of Alaska
On Thu, February 26, 2015 10:44 am, Novosielski, Ryan wrote:
However, the result is that you can't do a restart on Solaris 10 with SMF if you are using vmstat. I have patched my scripts on Solaris to kill the child processes.
That's... interesting. vmstat (and anything of a similar nature) is indeed being forked via nohup on a 5m timer. If it lasts more than 5m after the last run of xymonclient.sh, there's definitely something wrong somewhere. I wasn't aware that backgrounded processes like that could cause a problem for Solaris under SMF though.
It does raise the question of three changes I'd considered committing but wanted differing perspectives on first (especially from non-Linux OSes):
- Pipe the vmstat command to a nohup'd shell instead of executing it directly. I'm curious if it might help SMF cope a little better, but must confess that the primary reason was simply to have a 'ps' output that doesn't look quite as scary:
    5599 ?        S      0:00 /bin/sh
    5602 ?        S      0:00  \_ vmstat 300 2
- Kill backgrounded vmstat (or any other processes) owned by the configured user when given a 'stop' SysV script command, but *not* when given a 'restart'.
- Generally speaking, patch the startup scripts and default configuration to simply run 'xymoncmd /path/to/xymonlaunch --log=/path/to/log/file'. This was the path I took for putting in systemd compatibility, and it's possible it might provide something simpler for OSs' service monitors to track too.
Long-time users of the Terabithia RPMs might notice that these three have been in for a while, but going beyond RH-derived Linux distros is a bigger step.
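The second item in the list above (clean up on 'stop' but not on 'restart') could be sketched roughly as follows. This is only an illustration of the dispatch logic; every function name here is hypothetical and none of it is taken from the actual Xymon or Terabithia scripts.

```shell
#!/bin/sh
# Sketch only: all names below are hypothetical placeholders.

stop_client()    { echo "stop client"; }
start_client()   { echo "start client"; }
# Placeholder for whatever cleanup a site prefers, for example a kill
# restricted to processes owned by the configured client user.
kill_leftovers() { echo "kill leftover collectors"; }

client_action() {
    case "$1" in
        stop)
            stop_client
            kill_leftovers          # only a full stop cleans up
            ;;
        restart)
            stop_client
            start_client            # leftovers are left to expire on their own
            ;;
        *)
            echo "usage: client_action {stop|restart}" >&2
            return 1
            ;;
    esac
}
```

Calling `client_action stop` runs the cleanup step, while `client_action restart` skips it.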
What would you folks think?
-jc
On Thu, Feb 26, 2015, at 13:33, J.C. Cleaver wrote:
What would you folks think?
I don't have much of an opinion. On FreeBSD I just pushed an update to "pkill -U $xymon_client_user vmstat" so it cleans up on a stop/restart.
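For anyone wanting to try the same cleanup by hand first, a minimal sketch might look like the following. It assumes `xymon_client_user` holds the account the client runs as (defaulting to `xymon` here purely for illustration), and `DRY_RUN` and `cleanup_vmstat` are hypothetical names added for this sketch, not part of any port or package.

```shell
#!/bin/sh
# Assumption: xymon_client_user names the account the client runs as.
# DRY_RUN is a hypothetical switch added for this sketch.
xymon_client_user="${xymon_client_user:-xymon}"

cleanup_vmstat() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Show what would be run without signalling anything.
        echo "would run: pkill -U $xymon_client_user vmstat"
    else
        # pkill exits non-zero when nothing matched; that is fine for cleanup.
        pkill -U "$xymon_client_user" vmstat || true
    fi
}
```

Setting `DRY_RUN=1` before calling `cleanup_vmstat` only prints the command, which is a cheap way to check the user name before anything is signalled.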
I swear I saw the vmstat live on longer than 5 minutes, but maybe I need to do some more formal testing. It appears that the command is:
vmstat 300 2
but on FreeBSD that means it updates/prints new output every 300 seconds and the 2 means two intervals, so I guess the max time it will run is 10 minutes / 600 seconds?
On Thu, February 26, 2015 12:10 pm, Mark Felder wrote:
vmstat 300 2
but on FreeBSD that means it updates/prints new output every 300 seconds and the 2 means two intervals, so I guess the max time it will run is 10 minutes / 600 seconds?
Interesting. I guess that makes it similar to AIX in that regard.
http://lists.xymon.com/archive/2014-September/040192.html
We should clarify this on each supported OS. On RHEL5, "vmstat 10 2" gives me:
-bash-3.2$ vmstat 10 2
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0    312 9461348 203472 84676768    0    0    34   319    0    0  4  1 94  0  0
(wait 10 seconds)
 1  0    312 9466564 203472 84677632    0    0    51  5290 3138 8928  3  2 95  0  0
(exit)
-jc
On Thu, Feb 26, 2015, at 14:36, J.C. Cleaver wrote:
We should clarify this on each supported OS. On RHEL5, "vmstat 10 2" gives me:
-bash-3.2$ vmstat 10 2
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0    312 9461348 203472 84676768    0    0    34   319    0    0  4  1 94  0  0
(wait 10 seconds)
 1  0    312 9466564 203472 84677632    0    0    51  5290 3138 8928  3  2 95  0  0
(exit)
Yes, on FreeBSD it is behaving exactly like that. I guess I described it incorrectly. In order for it to hit 600 seconds it would have to have a third interval... so it prints once immediately at 0 seconds, and at 300 seconds it prints a second time and exits.
My mistake!
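The corrected arithmetic can be written down as a small helper. This is a sketch that assumes the first report is printed immediately, as observed above on RHEL5 and FreeBSD; other platforms may differ. `vmstat_runtime` is a hypothetical name, not an existing tool.

```shell
#!/bin/sh
# Approximate wall-clock runtime, in seconds, of "vmstat INTERVAL COUNT",
# assuming the first report appears at 0 seconds and each later report
# follows after one full interval.
vmstat_runtime() {
    interval=$1
    count=$2
    echo $(( interval * (count - 1) ))
}

vmstat_runtime 300 2   # prints 300
vmstat_runtime 10 2    # prints 10
```

Under that assumption, a third interval (`vmstat 300 3`) would be needed to reach 600 seconds.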
On Thu, February 26, 2015 12:39 pm, Mark Felder wrote:
Yes, on FreeBSD it is behaving exactly like that. I guess I described it incorrectly. In order for it to hit 600 seconds it would have to have a third interval... so it prints once immediately at 0 seconds, and at 300 seconds it prints a second time and exits.
My mistake!
Ah, no worries! :)
-jc
The vmstat is set to run for 5 minutes. It is a forked process. I see it on all unix systems that I've run Xymon on. I usually ignore it or just kill it. As John said, it doesn't run forever.
=G=
participants (5)
- cleaver@terabithia.org
- feld@feld.me
- Galen.Johnson@sas.com
- john.thurston@alaska.gov
- novosirj@ca.rutgers.edu