Monitoring Exchange 2010 - imaps/pop3s alarms
I have recently migrated from Exchange 2003 to Exchange 2010. Several times a day, I get alarms for imaps and pop3s, which are resolved less than a minute later.
The graph (http://www.elyograg.org/pop3s.png) shows hourly spikes in response time. The alarms are not every hour, but when they do happen, they correspond to the spikes. The email from Xymon looks like this.
Note the "seconds" value, which is extremely low. Anytime there's an alarm, the value is low like this.
Service pop3s on server.example.com is not OK : Service unavailable (connect timeout)
Seconds: 0.003435
There have been no complaints from people who use the service, and believe me, I'd hear about it if there was a problem.
The non-SSL versions are not alarming. I just noticed that the graph for pop3 has two entries - pop3 and pop3s. From what I can tell, it checks TLS on the standard test. The pop3s graph on the pop3 entry looks identical to the pop3s graph I've included here, except it's red instead of blue.
I suspect that it's related to the throttling policies in Exchange 2010, but I don't know what to change. I don't want to open the default throttling policy way up, but I did change the anonymous connection limit from 1 to 5. Has anyone else had this problem and found a way to solve it?
Thanks, Shawn
On 11/18/2010 2:02 PM, Shawn Heisey wrote:
I have recently migrated from Exchange 2003 to Exchange 2010. Several times a day, I get alarms for imaps and pop3s, which are resolved less than a minute later.
The graph (http://www.elyograg.org/pop3s.png) shows hourly spikes in response time. The alarms are not every hour, but when they do happen, they correspond to the spikes. The email from Xymon looks like this.
Note the "seconds" value, which is extremely low. Anytime there's an alarm, the value is low like this.
Service pop3s on server.example.com is not OK : Service unavailable (connect timeout)
Seconds: 0.003435
There have been no complaints from people who use the service, and believe me, I'd hear about it if there was a problem.
The non-SSL versions are not alarming. I just noticed that the graph for pop3 has two entries - pop3 and pop3s. From what I can tell, it checks TLS on the standard test. The pop3s graph on the pop3 entry looks identical to the pop3s graph I've included here, except it's red instead of blue.
I suspect that it's related to the throttling policies in Exchange 2010, but I don't know what to change. I don't want to open the default throttling policy way up, but I did change the anonymous connection limit from 1 to 5. Has anyone else had this problem and found a way to solve it?
I forgot to mention versions. Xymon is the debian package in lenny-backports, x86_64 version 4.3.0~beta2.dfsg-5~bpo50+1. The Exchange server is running on Windows Server 2008 R2 with BBWin 0.12.
On Thu, November 18, 2010 16:02, Shawn Heisey wrote:
I have recently migrated from Exchange 2003 to Exchange 2010. Several times a day, I get alarms for imaps and pop3s, which are resolved less than a minute later.
Not the same thing, but I had a situation where internal web servers would all go red a number of times a day, then go green again on the next test. No complaints from users, and I couldn't identify the network anomaly causing it, so I just use "badhttp:2:3:4" for them in bb-hosts. They pretty much stay in "smiley" green, but that's better than alarms for no good purpose.
It's "badTEST" in the manpage.
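For anyone unfamiliar with the tag, here is a rough Python sketch of how I read the badTEST:x:y:z description in the bb-hosts manpage: fewer than x failures in a row stays green, x or more goes clear, y or more goes yellow, z or more goes red. The function names are made up for illustration and are not part of Xymon, and the optional weekday/time part of the tag is not handled.

```python
def parse_bad_tag(tag):
    """Split a tag such as 'badpop3s:1:2:3' into (test, x, y, z).

    Assumption: the plain 'badTEST:x:y:z' form only; the optional
    '-weekdays-starttime-endtime' part is not handled here.
    """
    name, x, y, z = tag.split(":")
    if not name.startswith("bad"):
        raise ValueError("not a badTEST tag: %r" % tag)
    return name[len("bad"):], int(x), int(y), int(z)


def color_for(failures, x, y, z):
    """Map a count of successive failed tests to a status color."""
    if failures >= z:
        return "red"
    if failures >= y:
        return "yellow"
    if failures >= x:
        return "clear"
    return "green"


def replay(tag, results):
    """Replay a list of poll results (True = test passed) against a tag.

    Returns the color the column would show after each poll, under the
    reading of the manpage described above.
    """
    _test, x, y, z = parse_bad_tag(tag)
    streak = 0
    colors = []
    for ok in results:
        # A successful poll resets the run of failures.
        streak = 0 if ok else streak + 1
        colors.append(color_for(streak, x, y, z))
    return colors
```

With "badhttp:2:3:4", a single failed poll followed by a good one never leaves green; it takes four failures in a row to reach red. With "badpop3s:1:2:3", the first failure shows clear and the third in a row shows red.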
On 11/18/2010 4:51 PM, Xymon User in Richmond wrote:
Not the same thing, but I had a situation where internal web servers would all go red a number of times a day, then go green again on the next test. No complaints from users, and I couldn't identify the network anomaly causing it, so I just use "badhttp:2:3:4" for them in bb-hosts. They pretty much stay in "smiley" green, but that's better than alarms for no good purpose.
It's "badTEST" in the manpage.
Thanks! This will get rid of the false alarms while I work out what's really wrong and how to fix it. I think that'll take an extended tcpdump followed by inspection in wireshark. I went with:
badimaps:1:2:3 badpop3s:1:2:3
I'm still interested in knowing if anyone else has run into this already and dealt with it at the source. If I do manage to find a way in Exchange to fix it, I'll post it here.
Shawn
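As a lighter-weight companion to a long tcpdump, a small probe can log what each connection attempt actually does, so the entries can be lined up against the hourly spikes. This is only a sketch: probe and watch are hypothetical helper names, it times the plain TCP connect and does not attempt the TLS handshake that the pop3s/imaps tests perform, and the host and port are whatever you pass in.

```python
import socket
import time


def probe(host, port, timeout=5.0):
    """Attempt one TCP connect; return (outcome, seconds_elapsed).

    outcome is "ok", "timeout", "refused", or "error: <details>".
    Only the TCP connect is timed; no TLS handshake is attempted.
    """
    start = time.monotonic()
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except socket.timeout:
        outcome = "timeout"
    except ConnectionRefusedError:
        outcome = "refused"
    except OSError as exc:
        outcome = "error: %s" % exc
    else:
        sock.close()
        outcome = "ok"
    return outcome, time.monotonic() - start


def watch(host, port, interval=10.0, count=6):
    """Probe repeatedly and print one timestamped line per attempt."""
    for _ in range(count):
        outcome, elapsed = probe(host, port)
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        print("%s %s:%d %s %.6f" % (stamp, host, port, outcome, elapsed))
        time.sleep(interval)
```

Running something like watch("server.example.com", 995) for pop3s (993 for imaps) across one of the spikes would show whether the attempts are refused almost immediately or really do time out.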
On 11/18/2010 4:51 PM, Xymon User in Richmond wrote:
Not the same thing, but I had a situation where internal web servers would all go red a number of times a day, then go green again on the next test. No complaints from users, and I couldn't identify the network anomaly causing it, so I just use "badhttp:2:3:4" for them in bb-hosts. They pretty much stay in "smiley" green, but that's better than alarms for no good purpose.
It's "badTEST" in the manpage.
Does this work for all tests, or does it only work for the built-in network tests? I have a couple of custom scripts that occasionally give false alarms.
Shawn
On Fri, November 19, 2010 10:29, Shawn Heisey wrote:
On 11/18/2010 4:51 PM, Xymon User in Richmond wrote:
Not the same thing, but I had a situation where internal web servers would all go red a number of times a day, then go green again on the next test. No complaints from users, and I couldn't identify the network anomaly causing it, so I just use "badhttp:2:3:4" for them in bb-hosts. They pretty much stay in "smiley" green, but that's better than alarms for no good purpose.
It's "badTEST" in the manpage.
Does this work for all tests, or does it only work for the built-in network tests? I have a couple of custom scripts that occasionally give false alarms.
IIRC it's only for the built-in network tests. That's implied but not definitively stated in the manpage.
participants (3)
- epperson@alumni.unc.edu
- hobbit@elyograg.org
- hobbit@epperson.homelinux.net