GROUPs and recovery alerts
Hi,
I want to be able to group servers so that alerts for one set of servers go to one group of people, and alerts for another set of servers go to another group of people.
I used the following config -
analysis.cfg:
HOST=myhost.mydomain.com GROUP=mygroup
    PROC blah
    PROC blahblah
    DISK ....etc

alerts.cfg:
GROUP=mygroup NOTICE RECOVERED COLOR=red
    MAIL me at mydomain.com REPEAT=15
Now, I tested this by stopping one of the PROCs listed, and I successfully received the alert e-mail. However, when I restarted that process to clear the alert, the status went green but I did not receive any recovery message.
I also tried just having the GROUP defined against each individual PROC line (rather than against the HOST), but that didn't result in a recovery message either.
Am I doing something wrong, or is this a bug?
I can provide the debug output from the alert process if required.
Cheers, Heather
I generally put the RECOVERED on the mail line.
Paul Root - Engineer III - Qwest is becoming CenturyLink
From: xymon-bounces at xymon.com [mailto:xymon-bounces at xymon.com] On Behalf Of Heather Keen Sent: Friday, July 01, 2011 11:47 AM To: xymon at xymon.com Subject: [Xymon] GROUPs and recovery alerts
This communication is the property of Qwest and may contain confidential or privileged information. Unauthorized use of this communication is strictly prohibited and may be unlawful. If you have received this communication in error, please immediately notify the sender by reply e-mail and destroy all copies of the communication and any attachments.
Yeah, I tried that too. No joy.
On 1 July 2011 19:59, Root, Paul <Paul.Root at qwest.com> wrote:
I generally put the RECOVERED on the mail line.
Have you tried using the test program to see how it acts for the failure?
/usr/lib64/xymon/server/bin/hobbitd_alert --test myhost procs --duration=500 |grep -v Failed
Paul Root - Engineer III - Qwest is becoming CenturyLink
From: Heather Keen [mailto:keenha at googlemail.com] Sent: Saturday, July 02, 2011 12:51 PM To: Root, Paul Cc: xymon at xymon.com Subject: Re: [Xymon] GROUPs and recovery alerts
Yeah, I tried that too. No joy.
The test for the alert works fine - but how do you mimic a "recovered" message with the --test option? I don't think you can.
On 3 July 2011 00:29, Root, Paul <Paul.Root at qwest.com> wrote:
Have you tried using the test program to see how it acts for the failure?
You can't.
I've never used NOTICE. What does it do?
I've never had any luck with RECOVERED on anything except the MAIL line.
Paul Root - Engineer III - Qwest is becoming CenturyLink
From: Heather Keen [mailto:keenha at googlemail.com] Sent: Monday, July 04, 2011 5:13 AM To: Root, Paul Cc: xymon at xymon.com Subject: Re: [Xymon] GROUPs and recovery alerts
The test for the alert works fine - but how do you mimic a "recovered" message with the --test option? I don't think you can.
NOTICE means that you get a message when an alert is enabled or disabled.
I can get RECOVERED to work on HOST or MAIL or SCRIPT lines, just as long as they aren't in a GROUP.
On 4 July 2011 16:48, Root, Paul <Paul.Root at qwest.com> wrote:
You can’t.
I’ve never used notice. What does it do?
I’ve never had any luck with recovered on anything except the Mail line.
On Mon, Jul 4, 2011 at 12:19 PM, Heather Keen <keenha at googlemail.com> wrote:
NOTICE means that when you enable or disable an alert you'll get a msg. I can get RECOVERED to work on HOST or MAIL or SCRIPT lines, just as long as they aren't in a GROUP.
RECOVERED and COLOR=RED on same line?
Xymon mailing list Xymon at xymon.com http://lists.xymon.com/mailman/listinfo/xymon
-- Asif Iqbal PGP Key: 0xE62693C5 KeyServer: pgp.mit.edu A: Because it messes up the order in which people normally read text. Q: Why is top-posting such a bad thing?
Aye, there is nothing wrong with that.
Anyway, I think this is a BUG.
Xymon Version 4.3.3. Configuration as follows:
analysis.cfg:
HOST=myhost.mydomain.com GROUP=heather
    PROC TESTtestTEST 1

(Equally, you could put the GROUP entry on the PROC line; it doesn't matter in this instance, as the result is the same.)

alerts.cfg:
HOST=*
    MAIL heather1 at mydomain.com RECOVERED

GROUP=heather
    MAIL heather2 at mydomain.com RECOVERED
When the alert is generated, both e-mail addresses get the notification. But when the alert is cleared, only heather1 at mydomain.com gets the recovery message.
I've tried lots of different configuration options, and the only conclusion I can come to is that recovery messages to GROUPs do not work. :(
On 4 July 2011 17:58, Asif Iqbal <vadud3 at gmail.com> wrote:
RECOVERED and COLOR=RED on same line?
I'm afraid you are right! We have also noticed that RECOVERED messages do not get sent to GROUPs (using Xymon-4.3.2).
Dominique
On 07/ 5/11 11:58 AM, Heather Keen wrote:
I've tried lots of different configuration options, and the only conclusion I can come to is that recovery messages to GROUPs do not work. :(
On 05-07-2011 11:58, Heather Keen wrote:
Anyway, I think this is a BUG.
Xymon Version 4.3.3. Configuration as follows:
analysis.cfg:
HOST=myhost.mydomain.com GROUP=heather
    PROC TESTtestTEST 1

alerts.cfg:
HOST=*
    MAIL heather1 at mydomain.com RECOVERED

GROUP=heather
    MAIL heather2 at mydomain.com RECOVERED

When the alert is generated, both e-mail addresses get the notification. But when the alert is cleared, only heather1 at mydomain.com gets the recovery message.
I've tried lots of different configuration options, and the only conclusion I can come to is that recovery messages to GROUPs do not work. :(
It's certainly not what you would expect - must agree with that. But solving it is not quite as easy as one would expect.
The problem is that when the PROC triggers a red status, Xymon knows that the rule was one that included a "GROUP=heather" setting. But when the recovery happens, it is because none of the rules in analysis.cfg triggered. So Xymon does not know that the green status is a recovery from a rule that contained the GROUP setting.
There is some state lost here.
To solve this, the xymond_alert module will have to keep track of the active alerts, and which GROUP settings triggered them. When the recovery happens, it will then use that list of groups that received the alert as the basis for sending out the recovered-notices.
It can be solved, of course. Just don't be disappointed when you see 4.3.4 being released later today without a fix for this problem.
Regards, Henrik
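The state tracking Henrik describes can be illustrated with a small sketch. This is not xymond_alert's actual code (Xymon is written in C); the class and method names below are hypothetical, and the sketch only shows the idea of remembering which GROUP settings triggered an alert so that they are still known when the status goes green.

```python
from collections import defaultdict


class ActiveAlerts:
    """Hypothetical tracker: remembers which GROUPs triggered each active alert."""

    def __init__(self):
        # (host, service) -> set of group names attached to the triggering rules
        self._groups = defaultdict(set)

    def record_alert(self, host, service, groups):
        """Store the groups from the rule(s) that turned the status red."""
        self._groups[(host, service)].update(groups)

    def groups_for_recovery(self, host, service):
        """Return the remembered groups and forget the alert, since it has cleared."""
        return self._groups.pop((host, service), set())


tracker = ActiveAlerts()
tracker.record_alert("myhost.mydomain.com", "procs", {"heather"})

# The status is green now and no analysis.cfg rule matched,
# but the group that caused the original alert is still available.
recovered = tracker.groups_for_recovery("myhost.mydomain.com", "procs")
print(sorted(recovered))
```

Keying the table on (host, service) is an assumption made for the example; the point is only that the group list has to be stored at alert time, because it cannot be recomputed at recovery time.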
On 01-08-2011 17:14, Henrik Størner wrote:
On 05-07-2011 11:58, Heather Keen wrote:
I've tried lots of different configuration options, and the only conclusion I can come to is that recovery messages to GROUPs do not work. :(
It's certainly not what you would expect - must agree with that. But solving it is not quite as easy as one would expect.
After looking at this once again, I actually think there is a very simple solution to this after all. If we don't check the GROUP rules at all for recovery-messages (i.e. any group setting will match), then xymond_alert will consider all the possible recipients. However, there is another check so it only sends recovery-messages to those recipients that actually did receive the alert. So I think the attached patch should solve this.
Regards, Henrik
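The idea behind the patch can be sketched roughly as follows. The patch itself is not reproduced in this thread, so this is only an illustration of the logic described above, with hypothetical names (Rule, recipients) and plain strings standing in for mail addresses: on recovery the GROUP filter is skipped, and a second check limits the message to recipients who actually received the alert.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    group: str      # GROUP= value on the rule ("" means no GROUP filter)
    recipient: str  # recipient named on the MAIL line


def recipients(rules, alert_groups, recovered, already_alerted):
    """Pick recipients for an alert or for a recovery message."""
    chosen = []
    for rule in rules:
        if recovered:
            # Recovery: ignore GROUP, but only notify those who got the alert.
            if rule.recipient in already_alerted:
                chosen.append(rule.recipient)
        elif not rule.group or rule.group in alert_groups:
            # Alert: the rule must have no GROUP or a matching GROUP.
            chosen.append(rule.recipient)
    return chosen


rules = [
    Rule("", "heather1"),
    Rule("heather", "heather2"),
    Rule("othergroup", "someone-else"),  # hypothetical extra rule
]

alerted = recipients(rules, {"heather"}, recovered=False, already_alerted=set())
recovery = recipients(rules, set(), recovered=True, already_alerted=set(alerted))
print(alerted, recovery)
```

In this sketch the recipient of the unrelated group is considered during recovery but filtered out again, because that recipient never received the original alert.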
On 1 August 2011 16:37, Henrik Størner <henrik at hswn.dk> wrote:
After looking at this once again, I actually think there is a very simple solution to this after all. If we don't check the GROUP rules at all for recovery-messages (i.e. any group setting will match), then xymond_alert will consider all the possible recipients. However, there is another check so it only sends recovery-messages to those recipients that actually did receive the alert. So I think the attached patch should solve this.
Regards, Henrik
Henrik,
I've been doing a bit more testing with alerts using GROUPS, and I've discovered a slight flaw with this solution, when you are using SCRIPT as the recipient rather than MAIL. Because it doesn't check the GROUP when it sends a RECOVERED message, you can end up getting multiple RECOVERED messages sent to the same person. (tested with v4.3.7)
For example:
GROUP=A SERVICE=procs RECOVERED COLOR=red
SCRIPT /home/xymon/server/ext/sms_notification 447777123456 FORMAT=SMS DURATION>5
GROUP=B SERVICE=procs RECOVERED COLOR=red
SCRIPT /home/xymon/server/ext/sms_notification 447777123456 FORMAT=SMS DURATION>10
So you've got two groups of machines, each having the same recipient, but needing a different alert delay. Now, if procs goes red on a machine in group A, the red alert is handled fine, but when it recovers, 447777123456 actually gets two recovery messages.
Note that this only happens if the recipient is a SCRIPT command; it works fine if you use MAIL recipients.
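One way to picture the duplicate, and a possible way to suppress it, is to collapse pending recovery notifications that share the same script and recipient. The sketch below is an assumption for illustration only; it does not describe how xymond_alert handles SCRIPT or MAIL recipients internally, and the function name dedupe_recoveries is hypothetical.

```python
def dedupe_recoveries(notifications):
    """Drop repeated (script, recipient) pairs, keeping the first occurrence."""
    seen = set()
    unique = []
    for script, recipient in notifications:
        key = (script, recipient)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


# With the GROUP check skipped on recovery, both rules match the same recipient.
pending = [
    ("/home/xymon/server/ext/sms_notification", "447777123456"),  # GROUP=A rule
    ("/home/xymon/server/ext/sms_notification", "447777123456"),  # GROUP=B rule
]

unique = dedupe_recoveries(pending)
print(len(unique))
```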
Help!
Cheers, Heather
On Mon, Jul 4, 2011 at 12:19 PM, Heather Keen <keenha at googlemail.com> wrote:
NOTICE means that when you enable or disable an alert you'll get a msg. I can get RECOVERED to work on HOST or MAIL or SCRIPT lines, just as long as they aren't in a GROUP.
RECOVERED and COLOR=RED on same line?
On 4 July 2011 16:48, Root, Paul <Paul.Root at qwest.com> wrote:
You can’t.
I’ve never used notice. What does it do?
I’ve never had any luck with recovered on anything except the Mail line.
Paul Root - Engineer III - Qwest is becoming CenturyLink
From: Heather Keen [mailto:keenha at googlemail.com] Sent: Monday, July 04, 2011 5:13 AM
To: Root, Paul Cc: xymon at xymon.com Subject: Re: [Xymon] GROUPs and recovery alerts
The test for the alert works fine - but how do you mimic a "recovered" message with the --test option? I don't think you can.
On 3 July 2011 00:29, Root, Paul <Paul.Root at qwest.com> wrote:
Have you tried using the test program to see how it acts for the failure?
/usr/lib64/xymon/server/bin/hobbitd_alert --test myhost procs --duration=500 |grep -v Failed
Paul Root - Engineer III - Qwest is becoming CenturyLink
From: Heather Keen [mailto:keenha at googlemail.com] Sent: Saturday, July 02, 2011 12:51 PM To: Root, Paul Cc: xymon at xymon.com Subject: Re: [Xymon] GROUPs and recovery alerts
Yeah, I tried that too. No joy.
On 1 July 2011 19:59, Root, Paul <Paul.Root at qwest.com> wrote:
I generally put the RECOVERED on the mail line.
Paul Root - Engineer III - Qwest is becoming CenturyLink
From: xymon-bounces at xymon.com [mailto:xymon-bounces at xymon.com] On Behalf Of Heather Keen Sent: Friday, July 01, 2011 11:47 AM To: xymon at xymon.com Subject: [Xymon] GROUPs and recovery alerts
Hi,
I want to be able to group servers so that alerts for one bunch of servers go to one group of people, and another group of servers to another group of people.
I used the following config -
analysis.cfg:
HOST=myhost.mydomain.com GROUP=mygroup
PROC blah
PROC blahblah
DISK ....etc
alerts.cfg:
GROUP=mygroup NOTICE RECOVERED COLOR=red
MAIL me at mydomain.com REPEAT=15
Now, I tested this by stopping one of the PROCs listed, and I successfully received the alert e-mail. However, when I restarted that process to clear the alert, the status goes green but I do not receive any recovery message.
I also tried just having the GROUP defined against each individual PROC line (rather than against the HOST), but that didn't result in a recovery message either.
Am I doing something wrong, or is this a bug?
I can provide the debug output from the alert process if required.
Cheers,
Heather
This communication is the property of Qwest and may contain confidential or privileged information. Unauthorized use of this communication is strictly prohibited and may be unlawful. If you have received this communication in error, please immediately notify the sender by reply e-mail and destroy all copies of the communication and any attachments.
-- Asif Iqbal PGP Key: 0xE62693C5 KeyServer: pgp.mit.edu A: Because it messes up the order in which people normally read text. Q: Why is top-posting such a bad thing?
participants (5)
- dominique.frise@unil.ch
- henrik@hswn.dk
- keenha@googlemail.com
- Paul.Root@qwest.com
- vadud3@gmail.com