On Tue, December 1, 2015 12:06 pm, John Thurston wrote:
So that I can kill off my errant process and keep xymon alerts flowing, I need to identify that process. It seems like it would be simple to get the right PID. I have xymond_history writing a pid file when it is launched with the following line in my tasks.cfg:
CMD xymond_channel --channel=stachg --log=$XYMONSERVERLOGS/h.log xymond_history --pidfile=$XYMONSERVERLOGS/x_h.pid
But I am unable to get xymond_alert to create this pid file. It isn't a documented option on the man-page, but I had hoped the capability was buried in there anyway. It doesn't appear to be so :(
So I thought maybe I could get the pid of the parent by specifying --pidfile on that instance of xymond_channel with:
CMD xymond_channel --channel=page --log=$XYMONSERVERLOGS/a.log --pidfile=$XYMONSERVERLOGS/x_a.pid xymond_alert --debug --checkpoint-file=$XYMONTMP/a.chk --checkpoint-interval=600
but this doesn't seem to work either.
Unfortunately, this isn't an option with xymond_alert as-is, although it will be in 4.4, since the --pid-file option will be in a standard library. 4.4 (and the Terabithia RPMs) also have a 'PIDFILE' option in tasks.cfg for controlling this at the xymonlaunch level.
The idea to have xymond_channel write out pid files too is an interesting one -- something I hadn't considered -- and might be able to be added in some form in 4.4. (How it could interact with --multilocal would be interesting, though...)
The only way I can think to identify the process is to pgrep -f "xymond_channel --channel=page" which seems like a long way around the barn, but I can make it work.
Should I be able to get a pidfile written for this worker process, or is it simply impossible?
Using pgrep is probably the easiest option in this case. Really, the xymond_channel process can stand in for xymond_alert here, since xymond_channel will kill its worker(s) when it terminates itself (under normal circumstances).
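If a pid file is still wanted in the meantime, the pgrep lookup could be wrapped in a small shell helper along these lines. This is only a rough sketch: the function name write_pidfile is made up for illustration, and it assumes the command line from tasks.cfg is visible unchanged in the process list and that only one xymond_channel is attached to the page channel.

```shell
#!/bin/sh
# Hypothetical helper (not part of Xymon): write the PID of the single
# process whose full command line matches PATTERN into PIDFILE.
# PATTERN is an extended regular expression, as used by pgrep -f.
write_pidfile() {
    pattern=$1
    pidfile=$2

    # pgrep exits non-zero when nothing matches
    pids=$(pgrep -f "$pattern") || return 1

    # Refuse to guess if more than one process matches
    set -- $pids
    [ "$#" -eq 1 ] || return 2

    printf '%s\n' "$1" > "$pidfile"
}

# Example call, assuming the tasks.cfg entry shown earlier:
#   write_pidfile 'xymond_channel --channel=page' "$XYMONSERVERLOGS/x_a.pid"
```

The resulting file can then be used the same way as the one xymond_history writes, e.g. kill "$(cat "$XYMONSERVERLOGS/x_a.pid")". The helper returns 1 when nothing matches and 2 when the pattern is ambiguous, so a wrapper script can avoid signalling the wrong process.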
HTH, -jc