We have one class of device (Exadata storage cells), and one single server out of hundreds, for which we occasionally get bogus trap alerts.
We don't use trap alerts at present on *any* devices.
What happens is, we get a trap alert, and then after some time passes it turns purple. Someone gets paged and woken up for the purple, which is VERY annoying. I drop it on the Xymon server side, and it goes away for days or weeks or months.
I can set a rule to ignore trap when paging, but why are we getting them? The Exadata storage cells do not run a Xymon client; they are ping monitored only. The lone server does run the Linux client. What could be triggering the trap just for these boxes? Has anyone else ever experienced this?
On Tue, Mar 12, 2013 at 5:44 PM, Jeremy Laidman <jlaidman at rebel-it.com.au> wrote:
> On 12 March 2013 21:54, Betsy Schwartz <betsy.schwartz at gmail.com> wrote:
>> We have one class of device (Exadata storage cells), and one single server out of hundreds, for which we occasionally get bogus trap alerts.
> Are you talking of SNMP traps?
Yes, we don't use them at all, and then out of the blue we'll get a purple:
clear Thu Feb 16 10:31:22 2012 Unknown trap (.1.3.6.1.4.1.111.16.2.0.1)
We've only gotten them for one particular Linux host, plus several Exadata cells. The above OID is from an Exadata cell and does appear to be from an Oracle Exadata MIB. But why would the Xymon server, only occasionally, get one of these when SNMP is not enabled? I'm not running devmon, and I'm not sure what other ways there are to have Xymon pick up an SNMP alert.
Betsy,
> On Tue, Mar 12, 2013 at 5:44 PM, Jeremy Laidman <jlaidman at rebel-it.com.au> wrote:
>> On 12 March 2013 21:54, Betsy Schwartz <betsy.schwartz at gmail.com> wrote:
>>> We have one class of device (Exadata storage cells), and one single server out of hundreds, for which we occasionally get bogus trap alerts.
>> Are you talking of SNMP traps?
> Yes, we don't use them at all, and then out of the blue we'll get a purple:
> clear Thu Feb 16 10:31:22 2012 Unknown trap (.1.3.6.1.4.1.111.16.2.0.1)
SNMP traps are event-based notifications. If you follow the recipe http://cerebro.victoriacollege.edu/hobbit-trap.html for setting up trap notifications using snmptt/sec/etc., part of the config requires a poller that checks for expiring (soon to go purple) trap status messages and sends a clear "no traps" message to prevent that.
> We've only gotten them for one particular Linux host, plus several Exadata cells. The above OID is from an Exadata cell and does appear to be from an Oracle Exadata MIB. But why would the Xymon server, only occasionally, get one of these when SNMP is not enabled? I'm not running devmon, not sure what other ways there are to have Xymon pick up an SNMP alert?
It absolutely requires some test to generate these. Check the IP address of the originating server that sent the trap status message, then check what tests are running from there. Might also be worth checking Ghost Clients to see if there are more of these that you don't know about.
devmon does not do SNMP traps in any way. It is SNMP polling only.
David.
--
David Baldwin - Senior Systems Administrator (Datacentres + Networks)
Information and Communication Technology Services
Australian Sports Commission  http://ausport.gov.au
Tel 02 62147830  Fax 02 62141830
PO Box 176 Belconnen ACT 2616  david.baldwin at ausport.gov.au
Leverrier Street Bruce ACT 2617
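For anyone following along, here is a rough sketch of the kind of "no traps" message David describes the poller sending. It is not the script from the linked recipe; it only builds a status line in Xymon's `status+LIFETIME HOST.COLUMN COLOR text` format and sends it to the server. The column name `trap`, the message text, and the helper names `build_trap_clear` and `send_to_xymon` are assumptions for illustration.

```python
import socket


def build_trap_clear(hostname: str, lifetime: str = "30", column: str = "trap") -> str:
    """Build a 'clear' status line for one host in Xymon's status-message format."""
    # Convention inherited from Big Brother clients: dots in the hostname are
    # written as commas, so the last dot separates the host from the column.
    host_field = hostname.replace(".", ",")
    # LIFETIME is in minutes unless it carries an h/d/w suffix (e.g. "2h").
    # The message text "No traps received" is a made-up placeholder.
    return f"status+{lifetime} {host_field}.{column} clear No traps received"


def send_to_xymon(message: str, server: str, port: int = 1984) -> None:
    """Send one message to xymond over TCP (1984 is the usual Xymon port)."""
    with socket.create_connection((server, port), timeout=5) as sock:
        sock.sendall(message.encode("utf-8"))
```

If nothing refreshes the status before its lifetime runs out, the column goes purple, which matches the behaviour described at the start of the thread.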
On 14 March 2013 10:03, David Baldwin <david.baldwin at ausport.gov.au> wrote:
> It absolutely requires some test to generate these. Check the IP address of the originating server that sent the trap status message, then check what tests are running from there. Might also be worth checking Ghost Clients to see if there are more of these that you don't know about.
Also, check the trap destination configured on the device. If it's set to the Xymon server, then look for a process on your Xymon server that's listening for SNMP packets. On Linux, you can do "sudo netstat -naup | grep :162" and it should show the PID and name of the process that is receiving the traps.
> devmon does not do SNMP traps in any way. It is SNMP polling only.
(As David implied) neither does Xymon. There must be another process that receives a trap and then generates a Xymon status message, but not necessarily running on the Xymon server.
Googling the phrase ["Unknown trap" xymon] shows the HOWTO that David linked to. I suspect someone has set this up on your Xymon server. This means you probably have snmptrapd running, which you should stop if you don't ever use SNMP traps.
J
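The netstat check Jeremy suggests can also be scripted so that only sockets bound to exactly UDP port 162 are reported, rather than any line that happens to contain "162". This is a minimal sketch that parses `netstat -naup` style output; the function names are made up for illustration, and the column layout is assumed to be the usual Linux net-tools one (local address in the fourth column, PID/program last).

```python
import subprocess
from typing import List, Tuple


def udp_listeners(netstat_output: str, port: int = 162) -> List[Tuple[str, str]]:
    """Return (local_address, owner) for UDP sockets bound to exactly `port`."""
    matches = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Data rows look like: udp 0 0 0.0.0.0:162 0.0.0.0:* [STATE] PID/Program
        if len(fields) < 5 or not fields[0].startswith("udp"):
            continue
        local = fields[3]
        # rpartition copes with IPv6 forms such as ":::162" as well as IPv4.
        local_port = local.rpartition(":")[2]
        if local_port == str(port):
            # Without root, netstat prints "-" instead of PID/Program.
            owner = fields[-1] if len(fields) > 5 else "-"
            matches.append((local, owner))
    return matches


def scan_local(port: int = 162) -> List[Tuple[str, str]]:
    """Run netstat on this machine (needs net-tools; run as root to see PIDs)."""
    result = subprocess.run(["netstat", "-naup"], capture_output=True, text=True)
    return udp_listeners(result.stdout, port)
```

An empty list from `scan_local()` on the Xymon server would point toward the trap receiver living on some other machine, as Jeremy notes is possible.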
On Wed, Mar 13, 2013 at 7:49 PM, Jeremy Laidman <jlaidman at rebel-it.com.au> wrote:
> On 14 March 2013 10:03, David Baldwin <david.baldwin at ausport.gov.au> wrote:
>> It absolutely requires some test to generate these. Check the IP address of the originating server that sent the trap status message, then check what tests are running from there. Might also be worth checking Ghost Clients to see if there are more of these that you don't know about.
> Also, check the trap destination configured on the device. If it's set to the Xymon server, then look for a process on your Xymon server that's listening for SNMP packets. On Linux, you can do "sudo netstat -naup | grep :162" and it should show the PID and name of the process that is receiving the traps.
>> devmon does not do SNMP traps in any way. It is SNMP polling only.
> (As David implied) neither does Xymon. There must be another process that receives a trap and then generates a Xymon status message, but not necessarily running on the Xymon server.
> Googling the phrase ["Unknown trap" xymon] shows the HOWTO that David linked to. I suspect someone has set this up on your Xymon server. This means you probably have snmptrapd running, which you should stop if you don't ever use SNMP traps.
Hm, still puzzled. The Exadata cells are minimal storage devices that don't run a Linux client. The lone server that we get these messages from is not running any SNMP tests. We only get them once in a blue moon.
There are no SNMP processes of any sort running on the Linux server. I built that server myself from our master image and only installed Xymon. The one thing that has something to do with SNMP is the VMware CLI and ESXi tests, BUT the hosts that are alerting aren't VMware hosts (and the ESXi tests are running on another server since I haven't gotten them to run yet on this one, but that's another story).
Running "sudo netstat -naup | grep 162" returns nothing.
The one Linux server that we occasionally see this from is running two HP hardware tests that call hpacucli, plus bb-roracle and the ntp and memory tests.
I have a bazillion ghost clients because the ESXi test is returning non-FQDN names for all VMware hosts, but those aren't the hosts that are alerting. I'm getting alerts from two Exadata cells and one Linux HP G5.
From Jan 1, 2012 we've gotten this many purple trap alerts:
Host                     State changes
dm02cel14.example.com    4 (28.57 %)
dm02cel13.example.com    4 (28.57 %)
dm03cel14.example.com    1 ( 7.14 %)
dm03cel13.example.com    1 ( 7.14 %)
dm03cel12.example.com    1 ( 7.14 %)
dm03cel11.example.com    1 ( 7.14 %)
dm03cel09.example.com    1 ( 7.14 %)
dba-apps2.example.com    1 ( 7.14 %)
Other hosts              0 ( 0.00 %)
Note that we have 28 identically configured Exadata cells and only seven of them have ever alerted.
It's not really that many, fourteen alerts in 15 months, but since every single one is bogus, I'd like to hunt it down and shoot it. If something were really sending these alerts I think we'd get more than this.
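David's earlier suggestion to check which machine sent the trap status can be sketched roughly as follows: ask xymond for its board, restricted to the trap column, and include the `sender` field. This is only a sketch under assumptions: the column is assumed to be named `trap`, the server address is whatever your Xymon server is, and the helper names are made up. It relies on the `xymondboard` command returning one pipe-delimited line per status.

```python
import socket
from typing import Dict, List

FIELDS = ["hostname", "testname", "color", "sender"]


def board_command(column: str = "trap") -> str:
    """Build an xymondboard query limited to one column and the fields above."""
    return f"xymondboard test={column} fields={','.join(FIELDS)}"


def query_xymon(command: str, server: str, port: int = 1984) -> str:
    """Send a command to xymond and read the whole reply."""
    with socket.create_connection((server, port), timeout=10) as sock:
        sock.sendall(command.encode("utf-8"))
        # Half-close so xymond knows the request is complete, then read.
        sock.shutdown(socket.SHUT_WR)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_board(output: str) -> List[Dict[str, str]]:
    """Turn pipe-delimited xymondboard lines into one dict per status."""
    rows = []
    for line in output.splitlines():
        if line.strip():
            rows.append(dict(zip(FIELDS, line.split("|"))))
    return rows
```

Calling `parse_board(query_xymon(board_command(), "your-xymon-server"))` while one of the trap statuses still exists should show, per host, the address that last reported it, which is the place to look for whatever is generating the messages.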