Hi folks,
I'm sure this is going to be something really silly that I've missed - but I've been pulling my hair out over this one for a couple of days now.
alerts.cfg:
$EMAIL_ALERT=carl.inglis at rakon.com
$LIN_WINDOWS_PROBLEMS=$EMAIL_ALERT
HOST=%lin(.*) SERVICE=%win(.*) MAIL $LIN_WINDOWS_PROBLEMS REPEAT=24h DURATION>1d RECOVERED STOP
HOST=* EXPAGE=printers MAIL $EMAIL_ALERT REPEAT=1h RECOVERED UNMATCHED STOP
When the host "lin-apps-01" has a yellow alert on its "winUpdates" service, I expect it to shout about it once every 24h. It is, however, shouting about it once every hour.
It's clear that the first HOST line is being ignored - I suspect my regex is incorrect in some way.
Any thoughts or pointers would be appreciated.
Regards,
Carl
Carl Inglis Systems Administrator
Rakon UK Limited Dowsett House, Sadler Road, Lincoln LN6 3RS, United Kingdom Tel: +44 (0)1522 812630 | Fax:+44 (0) 1522 812664 | Mob: +44 (0) 7786 552915 Carl.Inglis at rakon.com | www.rakon.com
Hi.
How does it look in the "Info" page for that server? Both alert lines would match that server, thus giving you an alert every hour, with an extra e-mail once per day, after a day, unless the server is in the printers page. If you don't want the second alert line to match that server/test, you need to exclude it.
/Johan
From: xymon-bounces at xymon.com [mailto:xymon-bounces at xymon.com] On Behalf Of Carl Inglis
Sent: 15 December 2011 11:03
To: xymon at xymon.com
Subject: [Xymon] Alerting - I'm not doing it right...
On Thu, 15 Dec 2011 10:02:43 +0000, Carl Inglis <Carl.Inglis at rakon.com> wrote:
> alerts.cfg:
> $EMAIL_ALERT=carl.inglis at rakon.com
> $LIN_WINDOWS_PROBLEMS=$EMAIL_ALERT
> HOST=%lin(.*) SERVICE=%win(.*) MAIL $LIN_WINDOWS_PROBLEMS REPEAT=24h DURATION>1d RECOVERED STOP
> HOST=* EXPAGE=printers MAIL $EMAIL_ALERT REPEAT=1h RECOVERED UNMATCHED STOP
> When the host "lin-apps-01" has a yellow alert on its "winUpdates" service, I expect it to shout about it once every 24h. It is, however, shouting about it once every hour.
There may be some confusion about "service" here.
When you refer to "winUpdates" - is that a status-column in Xymon, or a Windows Service that you are monitoring with a client on the Windows machine? The latter would typically show up in a "svcs" (services) status column on Xymon.
The SERVICE=... setting in alerts.cfg refers to the status column, not a Windows service. So to catch a "Windows updates" service that is not running, you would have 'SERVICE=svcs' in alerts.cfg.
What the first part of your alerts.cfg says is "if you have a host whose name contains 'lin', and that host has a status column whose name contains 'win', then send an alert after 1 day, and repeat every 24 hours".
The second part of your configuration says "Any status that has an error - except those on the 'printers' page, and those handled by other rules - trigger an alert that is repeated once an hour". Pretty broad definition, I think.
Hope that removes a bit of confusion.
Regards, Henrik
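Henrik's reading of the first rule ("a host whose name contains 'lin'" with a status column that "contains 'win'") can be checked with a small sketch. It is written in Python rather than Xymon's own PCRE, and it assumes the patterns after the "%" marker are searched unanchored, which is what "contains" implies; the host name "berlin-db" and the column "cpu" are made-up examples, not from this thread.

```python
import re

# Patterns copied from the first rule, without the leading "%" marker.
# Assumption: they are applied as unanchored searches ("contains").
HOST_PATTERN = re.compile(r"lin(.*)")
SERVICE_PATTERN = re.compile(r"win(.*)")


def rule_patterns_match(host: str, service: str) -> bool:
    """Return True when both patterns are found somewhere in their inputs."""
    host_ok = HOST_PATTERN.search(host) is not None
    service_ok = SERVICE_PATTERN.search(service) is not None
    return host_ok and service_ok


# The host and status column from the original question.
print(rule_patterns_match("lin-apps-01", "winUpdates"))  # True
# Hypothetical host: unanchored "lin" is also found inside "berlin-db".
print(rule_patterns_match("berlin-db", "winUpdates"))    # True
# Hypothetical column name without "win" in it.
print(rule_patterns_match("lin-apps-01", "cpu"))         # False
```

Under these assumptions the patterns do match "lin-apps-01" and "winUpdates", so the regex itself is unlikely to be the reason the first rule appears to be ignored. If only names that start with "lin" should match, an anchored pattern such as "^lin" would be the usual regex way to express that.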
-----Original Message-----
From: xymon-bounces at xymon.com [mailto:xymon-bounces at xymon.com] On Behalf Of henrik at hswn.dk
Sent: 15 December 2011 11:36
To: xymon at xymon.com
> There may be some confusion about "service" here.
> When you refer to "winUpdates" - is that a status-column in Xymon, or a Windows Service that you are monitoring with a client on the Windows machine? The latter would typically show up in a "svcs" (services) status column on Xymon.
It's a status column that's returned by a BBWIN ext script; it goes yellow if there are pending Windows Updates on that server.
> The SERVICE=... setting in alerts.cfg refers to the status column, not a Windows service. So to catch a "Windows updates" service that is not running, you would have 'SERVICE=svcs' in alerts.cfg.
> What the first part of your alerts.cfg says is "if you have a host whose name contains 'lin', and that host has a status column whose name contains 'win', then send an alert after 1 day, and repeat every 24 hours".
Which is what I wanted it to do.
> The second part of your configuration says "Any status that has an error - except those on the 'printers' page, and those handled by other rules - trigger an alert that is repeated once an hour". Pretty broad definition, I think.
Indeed - I'm currently in development mode trying to finalise how we're going to do our alerting; the last line of the configuration was intended as a "you missed one" alert for me. There are a number of lines above the first line in my original email.
> Hope that removes a bit of confusion.
It does indeed, thank you.
It appears that removing the "DURATION>1d" option has stopped the second rule from firing - which would make sense since (as Johan suggested) the first rule is unmatched until the alert has a duration of more than one day.
Is that interpretation correct?
Thanks,
Carl
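Carl's interpretation can be illustrated with a toy model of the two rules. This is a simplified sketch in Python, not Xymon's actual matching code: the Rule class, its field names and the alerting_rules function are all hypothetical, EXPAGE, RECOVERED and the repeat timers are ignored, and DURATION, UNMATCHED and STOP are modelled only as described in this thread.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    name: str
    host: str                        # regex searched in the host name
    service: str                     # regex searched in the status column name
    repeat_hours: int                # models REPEAT=...
    min_duration_hours: float = 0.0  # models DURATION>...
    unmatched: bool = False          # models UNMATCHED
    stop: bool = True                # models STOP


def alerting_rules(rules, host, service, duration_hours):
    """Return (rule name, repeat interval) for each rule that would alert."""
    fired = []
    for rule in rules:
        if not re.search(rule.host, host):
            continue
        if not re.search(rule.service, service):
            continue
        if duration_hours <= rule.min_duration_hours:
            continue  # DURATION>N not satisfied yet, so the rule is skipped
        if rule.unmatched and fired:
            continue  # UNMATCHED applies only when nothing else has alerted
        fired.append((rule.name, rule.repeat_hours))
        if rule.stop:
            break     # STOP ends processing once this rule has alerted
    return fired


with_duration = [
    Rule("lin-win", r"lin(.*)", r"win(.*)", 24, min_duration_hours=24),
    Rule("catch-all", r".*", r".*", 1, unmatched=True),
]
without_duration = [
    Rule("lin-win", r"lin(.*)", r"win(.*)", 24),
    Rule("catch-all", r".*", r".*", 1, unmatched=True),
]

# Two hours into the yellow status, with DURATION>1d on the first rule:
print(alerting_rules(with_duration, "lin-apps-01", "winUpdates", 2))
# [('catch-all', 1)]

# The same moment with DURATION>1d removed:
print(alerting_rules(without_duration, "lin-apps-01", "winUpdates", 2))
# [('lin-win', 24)]
```

In this model the hourly mails during the first day come from the catch-all rule, because the first rule does not count as matched until the status has lasted more than a day. How the real alert module behaves once the day has passed is best confirmed against the alerts.cfg documentation or the "Info" page that Johan mentions.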
participants (3)
- Carl.Inglis@rakon.com
- henrik@hswn.dk
- Johan.Sjoberg@deltamanagement.se