I have custom scripts running on a client system that do the actual monitoring and send the status updates to Xymon; if they detect an outage, the script itself performs the corrective action. You can accomplish this with a properly configured sudo that grants the needed rights to the Xymon user running the client.
You can also prevent repeated restart attempts by having the script create a restart file, which is deleted on the next monitoring pass that finds a "green" condition.
If a down condition is encountered and the restart file does not exist, the script performs the restart and creates the restart file (I usually also set the status to "yellow" so the restart attempt is captured in the Xymon history). If, on the next pass, the down condition still exists and the restart file also exists, I set the status to red, which triggers a page so a human can review the situation. Finally, whenever the script finds a "green" condition, it always checks for the restart file and deletes it if found.
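The restart-file logic described above can be sketched roughly as follows. This is only an illustration in Python, not the actual script: the function name `next_action` and the flag-file path are hypothetical, and the caller is assumed to run the real restart command (via sudo) and report the returned color to Xymon itself.

```python
from pathlib import Path
from typing import Tuple


def next_action(service_up: bool, flag: Path) -> Tuple[str, bool]:
    """Decide the status color and whether to attempt a restart.

    `flag` is the restart file (hypothetical path chosen by the caller).
    Returns (color, should_restart).
    """
    if service_up:
        # Green pass: always clear a leftover restart file so the next
        # outage gets a fresh restart attempt.
        flag.unlink(missing_ok=True)
        return "green", False
    if not flag.exists():
        # First failing pass: record that a restart is being attempted
        # and report yellow so the attempt shows up in the history.
        flag.touch()
        return "yellow", True
    # Still down and a restart was already tried: go red so a human is paged.
    return "red", False
```

A monitoring pass would call `next_action(is_up, Path("/some/dir/apache.restart"))`, run the restart command only when the second value is true, and then send the returned color as the status.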
Hope that helps, Bruce
Bruce White, Senior Enterprise Systems Engineer, Fellowes, Inc.

-----Original Message-----
From: xymon-bounces at xymon.com [mailto:xymon-bounces at xymon.com] On Behalf Of Henrik Størner
Sent: Friday, December 23, 2011 4:21 PM
To: xymon at xymon.com
Subject: Re: [Xymon] XYMON - corrective measures?
On 23-12-2011 23:17, Tom S wrote:
Thanks Henrik,
So this should do it?
Yes.
HOST=www.foo.com SERVICE=http
    MAIL cio at foo.com DURATION>2 COLOR=red
    SCRIPT /usr/local/bin/restartapache.sh 123456789 REPEAT 1440
The above will email cio at foo.com after 2 minutes of red. Will it also call /usr/local/bin/restartapache.sh and run it once every 24 hours if it's down that long?
Yes.
Do I need to put a DURATION on that one also, or does it keep the 2 minutes from the line above, or does it run as soon as it sees red? Can I put a DURATION on that also (e.g. SCRIPT /usr/local/bin/restartapache.sh 123456789 REPEAT 1440 DURATION>2)?
When you put the DURATION setting on the MAIL or SCRIPT line, it is local to that recipient, so you can add a DURATION on the SCRIPT line also, either the same or different.
You can also put it on the HOST+SERVICE line, in which case it will be the default for all of the recipients.
Regards, Henrik