Had an event last night that caused a test to go red for 20 minutes, but no email was sent. All other events before and after sent alerts as expected. I verified via the info dot that xymon believes it should alert for this event. There is nothing in the maillog for this event, so it appears xymon did not send anything.
Where else can I look to troubleshoot?
In case you want to see for yourself:
from bb-hosts
{code}
page SLA SLA Checks
0.0.0.0 <url redacted> # noconn cont;https://<url redacted>/;experience
{code}
from hobbit-alerts.cfg (will have to trust me that there are no matching exceptions above this entry in the file)
{code}
PAGE=SLA DURATION>1
    IGNORE SERVICE=sslcert
    MAIL <address redacted> RECOVERED REPEAT=365d COLOR=yellow,purple
    MAIL <address redacted> RECOVERED REPEAT=30m COLOR=red
{code}
from the info dot
{code}
Alerting:
Service | Recipient              | 1st Delay | Stop after | Repeat | Time of Day | Colors
content | <address redacted> (R) | 1m        | -          | 52w 1d | -           | purple,yellow
        | <address redacted> (R) | 1m        | -          | 30m    | -           | red
{code}
The first place to look would be wherever you track messages. For instance, I use sendmail to transport messages from the server Xymon runs on to other MTAs in my organization. I'm assuming you're running on Linux, so if you're using sendmail, I'd check /var/log/maillog to see whether the message was submitted from Xymon and where it went from there.
Jamison Maxwell
From: xymon-bounces at xymon.com [mailto:xymon-bounces at xymon.com] On Behalf Of Michael Baydoun Sent: Tuesday, March 06, 2012 9:36 AM To: xymon at xymon.com Subject: [Xymon] troubleshooting missing alert
Ooh, re-read your original message. Sorry, you've already checked there....
...I'll shut up now.
From: xymon-bounces at xymon.com [mailto:xymon-bounces at xymon.com] On Behalf Of Jamison Maxwell Sent: Tuesday, March 06, 2012 6:44 PM To: xymon at xymon.com Subject: Re: [Xymon] troubleshooting missing alert
On Wed, Mar 7, 2012 at 1:35 AM, Michael Baydoun <indymichaelb at gmail.com> wrote:
> Had an event last night that caused a test to go red for 20 minutes, but no email was sent.
> There is nothing in the maillog for this event, it appears xymon did not send anything.
Has this worked recently? Perhaps the link between alerting and your MTA has been severed.
> Where else can I look to troubleshoot?
Perhaps look for errors from xymond_alert. These will be in "alert.log". You might find a reason why it can't run /usr/bin/mail. Also, try "xymond_alert --dump-config" and check the parsed config to see if it matches what you expect.
J
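To narrow down the log checks suggested in this thread, a small script can pull out only the lines that mention the affected host, test, or recipient. This is a minimal sketch and not part of Xymon: the names `find_mentions` and `scan_log` are made up for illustration, the location of "alert.log" depends on your installation, and the sample lines in the comments are invented rather than real Xymon or sendmail output.

```python
import re


def find_mentions(lines, keywords):
    """Return (line_number, text) pairs for lines containing any keyword.

    Matching is a case-insensitive substring match. No particular log
    format is assumed, so this works on maillog and alert.log alike.
    """
    if not keywords:
        return []
    pattern = re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if pattern.search(line)
    ]


def scan_log(path, keywords):
    """Open a log file and return its matching lines (hypothetical helper)."""
    # errors="replace" keeps the scan going if the log has odd bytes in it.
    with open(path, errors="replace") as handle:
        return find_mentions(handle, keywords)


# Example call; the alert.log path is an assumption, adjust it for your install:
# for number, text in scan_log("/var/log/xymon/alert.log", ["content", "mail"]):
#     print(number, text)
```

Running it once against /var/log/maillog and once against alert.log with the same keywords makes it easier to see whether the alert was ever generated and, if so, whether it was handed to the mailer.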
participants (3)
- indymichaelb@gmail.com
- jamison@newasterisk.com
- jlaidman@rebel-it.com.au