Suitability for bb replacement in large enterprise
Hello list,
I learned about hobbit a couple years back, and I've been using it in some small shops, but I'm hoping to be able to deploy it for my main employer.
At my day job, we've been using the old Big Brother 1.9e plus bbgen-3.5 to monitor hundreds of Unix and Windows servers spread across 2 data centers. I'd like to move to hobbit, but it is important that it be able to do what bb does. For the most part it is better than bb, but I have 2 specific areas of concern:
- snmp traps: Our netcool ticketing system relies on snmp traps from big brother whenever there is a significant event.
Are any xymon users currently using snmp traps in a similar sort of way?
- alerting failover: We currently have active-active monitoring of all our systems. That is, the bb servers in both data centers redundantly monitor all the servers. Getting 2 pages for every event would be a nuisance, so we use the bb failover setup: normally only the bb server in data center "A" sends alerts, but if the bb server in data center "B" can't reach the bb server in data center "A", it goes into failover mode and sends the alerts on behalf of the unreachable server. The worst case would be some sort of split brain where we get 2 alerts, but we have not seen that scenario arise, and the setup works well.
Is xymon 4.2.2 capable of this sort of failover behavior?
I'm hoping to replace bb with xymon, but these 2 items are deal breakers.
Thanks in advance for your words of wisdom,
Joe
Joe Sloan wrote:
> - snmp traps: Our netcool ticketing system relies on snmp traps from big brother whenever there is a significant event.
We use the elegant method from Andy Farrior : http://cerebro.victoriacollege.edu/hobbit-trap.html
Dominique
UNIL - University of Lausanne
Dominique Frise wrote:
> Joe Sloan wrote:
>> - snmp traps: Our netcool ticketing system relies on snmp traps from big brother whenever there is a significant event.
> We use the elegant method from Andy Farrior : http://cerebro.victoriacollege.edu/hobbit-trap.html
Interesting, thanks for the reference -
Joe
On Sunday 07 December 2008 12:46:14 Dominique Frise wrote:
> Joe Sloan wrote:
>> - snmp traps: Our netcool ticketing system relies on snmp traps from big brother whenever there is a significant event.
> We use the elegant method from Andy Farrior : http://cerebro.victoriacollege.edu/hobbit-trap.html
This is the wrong way around, though: it adds support for Hobbit reporting on SNMP traps sent by other devices, not for sending SNMP traps for alerts.
However, it should be relatively easy to make an alerting script to plug into Hobbit to send traps.
The biggest requirement for someone to be able to test such a script compatible with the BB feature would be the MIB file used by BB ...
Regards, Buchan
In <200812081608.21732.bgmilne at staff.telkomsa.net> Buchan Milne <bgmilne at staff.telkomsa.net> writes:
> However, it should be relatively easy to make an alerting script to plug into Hobbit to send traps.
> The biggest requirement for someone to be able to test such a script compatible with the BB feature would be the MIB file used by BB ...
As far as I recall, BB just sends a trap message using an OID that's been configured in the alert-setup config. A trap is really just a plain text-string wrapped into an SNMP message.
So there's no MIB involved, other than the "MIB" which has only the one OID which is used for the trap.
Regards, Henrik
Henrik Størner wrote:
> In <200812081608.21732.bgmilne at staff.telkomsa.net> Buchan Milne <bgmilne at staff.telkomsa.net> writes:
>> However, it should be relatively easy to make an alerting script to plug into Hobbit to send traps.
>> The biggest requirement for someone to be able to test such a script compatible with the BB feature would be the MIB file used by BB ...
> As far as I recall, BB just sends a trap message using an OID that's been configured in the alert-setup config. A trap is really just a plain text-string wrapped into an SNMP message.
> So there's no MIB involved, other than the "MIB" which has only the one OID which is used for the trap.
Yes, it's very simple and straightforward: send out an snmp trap based on an event, per the bbwarnrules.cfg, just like the email alerts. If hobbit can be made to do that, one of the two key requirements would be met, and only the failover behaviour would still need to be resolved.
Joe
In <493DA5AF.1080704 at tmsusa.com> J Sloan <joe at tmsusa.com> writes:
> Yes, it's very simple and straightforward: send out an snmp trap based on an event, per the bbwarnrules.cfg, just like the email alerts. If hobbit can be made to do that, one of the two key requirements would be met, and only the failover behaviour would still need to be resolved.
In hobbit-alerts.cfg you'd have something like
    HOST=* TEST=disk
        SCRIPT /usr/local/bin/trapmessage 0
Then your /usr/local/bin/trapmessage would do the trap-sending. E.g. using Net-SNMP tools and the defaults from BB's bbwarnsetup.cfg:
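    #!/bin/sh
    # Send an SNMPv1 trap for a Hobbit alert, using the BB trap layout.

    OID="enterprises.7058"     # 7058 is the Big Brother OID
    SNMPSTATION="10.3.2.1"     # IP of your monitoring system

    # BB maps a service to a numeric trapcode. See the
    # 'trapcodes' definition in bbwarnsetup.cfg
    # Add those you use here.
    case "$BBSVCNAME" in
      "disk")
        TRAPCODE="2"
        ;;
      "cpu")
        TRAPCODE="4"
        ;;
      *)
        # No trapcode defined for this service, so send nothing
        exit 0
        ;;
    esac

    # ... and adds 1 if the status has recovered
    if test "$RECOVERED" = "1"
    then
        TRAPCODE=`expr $TRAPCODE + 1`
    fi

    # Arguments: station, enterprise OID, agent, generic trap 6
    # (enterprise-specific), specific trapcode, empty uptime, then the
    # message as a string varbind.
    snmptrap -v1 -c public "$SNMPSTATION" "$OID" \
        "$BBHOSTNAME" 6 "$TRAPCODE" '' "$OID" s "$BBALPHAMSG"

    exit 0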
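To check the trapcode mapping without actually sending a trap, the same logic can be pulled into a small function and run by hand. This is only a sketch: the function name trapcode_for is made up for illustration, and only the disk=2 and cpu=4 codes are taken from the script above; any other service needs its code looked up in bbwarnsetup.cfg.

```shell
#!/bin/sh
# Dry-run helper: print the trapcode the trap script would use.
# trapcode_for is a hypothetical name, not part of Hobbit or BB.
trapcode_for() {
    svc="$1"          # service name, as passed in $BBSVCNAME
    recovered="$2"    # "1" if the status has recovered, as in $RECOVERED
    case "$svc" in
        disk) code=2 ;;
        cpu)  code=4 ;;
        *)    return 1 ;;   # no trapcode defined for this service
    esac
    # A recovery uses the alert trapcode plus 1
    if [ "$recovered" = "1" ]; then
        code=$((code + 1))
    fi
    echo "$code"
}

trapcode_for disk 0    # prints 2
trapcode_for cpu 1     # prints 5
```

A service with no mapping makes the function return non-zero and print nothing, which mirrors the "send nothing" choice for unknown services.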
Regards, Henrik
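The failover question is not answered in this thread. One possible approach, offered only as a sketch, is to have the server in data center "B" run its alerts through a wrapper script that stays silent while the server in data center "A" is reachable. Everything named here is an assumption: PRIMARY and PROBE are hypothetical settings rather than Hobbit options, the host name is a placeholder, and a plain ICMP ping is only a rough stand-in for checking that the other Hobbit server is really working.

```shell
#!/bin/sh
# Failover gate for the standby server's alert script (sketch only).
# PRIMARY and PROBE are hypothetical settings, not Hobbit options.
PRIMARY="${PRIMARY:-bb-a.example.com}"   # placeholder name for server "A"
PROBE="${PROBE:-ping -c 1}"              # command used to test reachability

# Returns 0 only when the primary cannot be reached, that is, when the
# standby should send alerts on its behalf.
primary_is_down() {
    if $PROBE "$PRIMARY" >/dev/null 2>&1; then
        return 1
    fi
    return 0
}

# Usage on the standby: run the real alert script only in failover mode.
# if primary_is_down; then /usr/local/bin/trapmessage "$@"; fi
```

On the standby, the SCRIPT line in hobbit-alerts.cfg would then point at the wrapper instead of at the trap script directly. As with the bb setup described at the start of the thread, a broken link between the two data centers would still give the split-brain case where both servers alert.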
participants (4)
- bgmilne@staff.telkomsa.net
- dominique.frise@unil.ch
- henrik@hswn.dk
- joe@tmsusa.com