Duplicate email Recipients in Info page
Hi folks. Still working on cleaning up my xymon infrastructure. I believe I'm getting a better handle on things with my analysis.cfg and alerts.cfg settings now.
I made some more changes to the analysis and alerts files to add other groups for emails.
So now I think I've duplicated a notify group in one place or the other.
When I click on a certain host's info link on an application page, I'm seeing the same target recipient listed more than once for tests like conn, cpu, disk, and memory.
What's the best way to troubleshoot my rules/settings to figure out where I'm setting the duplicated target recipients at?
PS – I can send along the alerts and analysis files if that helps.
Thanks much!
Don K
(resending to cc to group)
On Tue, Jun 5, 2012 at 2:02 PM, Don Kuhlman <Don.Kuhlman at schawk.com> wrote:
What's the best way to troubleshoot my rules/settings to figure out where I'm setting the duplicated target recipients at?
Try on the server: /usr/local/xymon/server/bin/xymond_alert --test <host> <options>
example:
/usr/local/xymon/server/bin/xymond_alert --test myhost.example.com disk
/usr/local/xymon/server/bin/xymond_alert --test myhost.example.com --color=yellow --duration=200 #in seconds
This will show you exactly which rules in alerts.cfg are firing. (substitute correct path for your xymon installation)
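Since the Info page shows the duplicate for several tests (conn, cpu, disk, memory), it may help to run the same check for each of them in one pass. A minimal sketch, assuming a POSIX shell; the function name test_services and the XYMOND_ALERT variable are made up for illustration, and the default path is simply the one quoted above.

```shell
# Path taken from the example above; adjust it for your installation.
XYMOND_ALERT="${XYMOND_ALERT:-/usr/local/xymon/server/bin/xymond_alert}"

# test_services HOST SERVICE...
# (hypothetical helper) Runs the --test check once per service and
# prints a header before each block of output.
test_services() {
    host="$1"
    shift
    for svc in "$@"; do
        echo "=== $host $svc ==="
        "$XYMOND_ALERT" --test "$host" "$svc" 2>&1
    done
}
```

Usage would look like: test_services myhost.example.com conn cpu disk memory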
Thanks Betsy! I'll try testing this way.
Now, since I have your attention ;), do you happen to know how I can test why my service monitors aren't working?
I've been working on the email issue and now seem to have another issue. I shut down the bbwin service on two windows computers that xymon was monitoring and it's not alerting that the service and/or process has stopped.
Much appreciated!
Don K
On 6/6/12 12:26 PM, "Betsy Schwartz" <betsy.schwartz at gmail.com> wrote:
On Wed, Jun 6, 2012 at 1:28 PM, Don Kuhlman <Don.Kuhlman at schawk.com> wrote:
I've been working on the email issue and now seem to have another issue. I shut down the bbwin service on two windows computers that xymon was monitoring and it's not alerting that the service and/or process has stopped.
The default purple interval, aka "status lifetime", is half an hour... I think it's hardcoded on the server side.
Individual status messages can also set their own lifetimes. We do that with some of our custom tests: we run them every half hour with a lifetime of a couple of hours.
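For reference, a custom test sets its own lifetime by adding +LIFETIME to the "status" keyword in the message it sends (in minutes unless a unit is given, as far as I know). A rough sketch, assuming the script runs as a Xymon client extension where $XYMON (the client binary) and $XYMSRV (the server address) are already set; send_status and its arguments are hypothetical names for illustration only.

```shell
# send_status HOST TEST COLOR LIFETIME_MINUTES TEXT
# (hypothetical helper) Sends one status message that stays valid for
# LIFETIME_MINUTES before the server turns it purple.
# HOST should be in the form the client uses (normally $MACHINE).
send_status() {
    host="$1"; testname="$2"; color="$3"; lifetime="$4"; text="$5"
    "$XYMON" "$XYMSRV" "status+${lifetime} ${host}.${testname} ${color} $(date) ${text}"
}
```

For a test that runs every half hour with a two-hour lifetime, the call might be: send_status "$MACHINE" mytest green 120 "all ok"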
Hi folks. We recently rebuilt our Xymon server to 4.3.10. Lost some alert configurations. After re-doing some of them, we are getting 3 or 4 emails for the same thing. Any tips on what I've missed would be appreciated!
Thanks
Don K
As an example, I ran "./xymond_alert --test myhostname disk".
It gives back this:
00016804 2012-09-14 08:39:38 Matching host:service:dgroup:page 'Media-Agent02-IMM:disk:NONE:Other' against rule line 186
00016804 2012-09-14 08:39:38 *** Match with 'SERVICE=cpu,disk,memory TIME=W:0800:1700 EXHOST=DONXP RECOVERED' ***
00016804 2012-09-14 08:39:38 Matching host:service:dgroup:page 'Media-Agent02-IMM:disk:NONE:Other' against rule line 186
00016804 2012-09-14 08:39:38 *** Match with 'SERVICE=cpu,disk,memory TIME=W:0800:1700 EXHOST=DONXP RECOVERED' ***
00016804 2012-09-14 08:39:38 Script alert with command '/usr/lib64/xymon/server/ext/html_mail.pl' and recipient donk at co.com
Here is the rule from alerts.cfg on line 186:
SERVICE=cpu,disk,memory TIME=W:0800:1700 EXHOST=DONXP RECOVERED
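The --test output above reports the same rule text as matched twice. When the output gets longer, a quick tally can make repeats like that easier to spot. This is only a sketch: it assumes the log lines keep the "*** Match with ..." and "... and recipient ..." wording shown above, and count_alert_matches is a made-up helper name.

```shell
# count_alert_matches
# (hypothetical helper) Reads xymond_alert --test output on stdin and
# prints how often each rule matched and each recipient was selected.
count_alert_matches() {
    awk '
        /\*\*\* Match with / {
            rule = $0
            sub(/^.*\*\*\* Match with /, "", rule)   # drop timestamp and prefix
            sub(/ \*\*\*$/, "", rule)                # drop trailing marker
            rulecount[rule]++
        }
        / and recipient / {
            rcpt = $0
            sub(/^.* and recipient /, "", rcpt)      # keep only the address
            rcptcount[rcpt]++
        }
        END {
            for (r in rulecount) printf "%d x rule %s\n", rulecount[r], r
            for (a in rcptcount) printf "%d x recipient %s\n", rcptcount[a], a
        }
    ' | sort
}
```

Usage would be something like: ./xymond_alert --test myhostname disk 2>&1 | count_alert_matches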
I'm getting 4 emails as below with GREEN status and 1 with YELLOW status and can't figure out where the duplication is coming from. Can anyone suggest what is wrong with my rules?
Monitoring Alert for:Media-Agent02-IMM - disk - status is (GREEN) Fri 9/14/12 8:36 AM
Monitoring Alert for:Media-Agent02-IMM - disk - status is (GREEN) Fri 9/14/12 8:36 AM
Monitoring Alert for:Media-Agent02-IMM - disk - status is (GREEN) Fri 9/14/12 8:36 AM
Monitoring Alert for:Media-Agent02-IMM - disk - status is (GREEN) Fri 9/14/12 8:36 AM
Monitoring Alert for:Media-Agent02-IMM - disk - status is (YELLOW) Fri 9/14/12 8:34 AM
The email body includes the following:
yellow Fri Sep 14 08:34:57 2012

Filesystem        1024-blocks   Used    Available   Capacity   Mounted on
Physical memory   102196        93324   8872        91%        Physical memory
Virtual memory    102196        93324   8872        91%        Virtual memory
Swap space        0             0       0           0%         Swap space
Devmon version 0.3.0-rc1 running on Systems_Monitor