Given your situation, I'd use Henrik's solution, which he describes as follows:
"I run two completely separate systems in parallel, and have the clients report to both of them. The system at our disaster center has the paging module disabled (just disable the [bbpage] section in hobbitlaunch.cfg), to avoid double alerts - it is simple to activate it, if necessary.
"Config files are rsync'ed from the primary site to the disaster site regularly."
Though to be honest, this failover script may be something that can be converted for use in Hobbit. You might be better off with one of a dozen options that differ slightly from how you have it set up, but that's up to you.
Hobbit definitely doesn't have this built in. I would think it's fairly easy to use it to get much the same effect, though. I'll wait for others' responses on your situation and throw my own thoughts back in tomorrow morning.
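For the config-sync half of Henrik's setup quoted above, a minimal sketch might look like the following. The host name and directory paths are hypothetical placeholders, not values from anyone's actual installation. Excluding hobbitlaunch.cfg is my own assumption, so that the disabled [bbpage] section at the disaster site is not overwritten on every sync.

```shell
#!/bin/sh
# Sketch only: the host name and paths below are hypothetical placeholders.
RSYNC="${RSYNC:-rsync}"
PRIMARY_ETC="${PRIMARY_ETC:-/usr/lib/hobbit/server/etc/}"
DR_HOST="${DR_HOST:-dr-monitor.example.com}"
DR_ETC="${DR_ETC:-/usr/lib/hobbit/server/etc/}"

# Push the primary's config directory to the disaster-site server.
# -a preserves permissions and timestamps, -z compresses in transit.
# hobbitlaunch.cfg is skipped (assumption) so paging stays disabled
# on the disaster-site copy.
sync_config() {
    "$RSYNC" -az --exclude 'hobbitlaunch.cfg' "$PRIMARY_ETC" "$DR_HOST:$DR_ETC"
}
```

Calling sync_config from a cron job on the primary would give the "regularly" part. To activate paging at the disaster site you would still edit its hobbitlaunch.cfg by hand, as Henrik describes.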
Tod Hansmann
Network Engineer

-----Original Message-----
From: Sloan [mailto:joe at tmsusa.com]
Sent: Thursday, November 01, 2007 5:03 PM
To: hobbit at hswn.dk
Subject: Re: [hobbit] big brother replacement
Tod Hansmann wrote:
> Let me see if I understand. You have several bb servers at one datacenter, each with their twin at the other datacenter, and both sets do the tests. They report to one central display server, but only one set reports at a time, depending on failover state, correct?
You have the basic idea, but there is no single central server, just pairs of bb servers, one per data center, for each LAN being monitored. For each pair, only the server at data center A does the reporting, unless the server in data center B cannot reach the server in data center A. In that case the server in data center B takes over the reporting duties until the bb server in data center A becomes reachable again. While this could theoretically lead to a split-brain condition, the failover has only ever triggered when there was a WAN outage.
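The takeover rule described above can be sketched roughly as follows. This is not the actual Big Brother failover script, only an illustration of the decision. PRIMARY_HOST, the function names, and the single-ping reachability probe are all assumptions.

```shell
#!/bin/sh
# Sketch only: PRIMARY_HOST and the ping-based probe are assumptions,
# not taken from the real bb failover script.
PRIMARY_HOST="${PRIMARY_HOST:-bb-a.example.com}"

# Succeeds if the server in data center A answers a single ping.
primary_reachable() {
    ping -c 1 "$PRIMARY_HOST" >/dev/null 2>&1
}

# Decide whether this server should report during the current cycle.
#   $1 = role: "primary" (data center A) or "secondary" (data center B)
#   $2 = command used to probe the primary (defaults to primary_reachable)
# Returns 0 to report, 1 to stay quiet, 2 for an unknown role.
should_report() {
    role=$1
    probe=${2:-primary_reachable}
    case $role in
        primary)
            return 0 ;;
        secondary)
            # Take over only while the primary cannot be reached.
            if "$probe"; then return 1; else return 0; fi ;;
        *)
            echo "unknown role: $role" >&2
            return 2 ;;
    esac
}
```

Note that the primary always reports in this sketch, so during a WAN outage both servers report at once. That is the theoretical split-brain case mentioned above.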
> Is this failover automatic? If so, how is this failover determined? What if this failover has a false positive? If not, what is your timeframe to swap over?
Yes, it is automatic. IIRC it takes one bb cycle to kick in.
We've not seen a false positive, as I mentioned above.
It's just the standard built-in bb failover. The head of ~bb/ext/failover follows:
#!/bin/sh
# failover
# BIG BROTHER - FAILOVER SCRIPT
# Sean MacGuire
# (c) Copyright Quest Software, Inc. 1997-2003 All rights reserved.
# failover WATCHES BBNET and BBPAGER
# IF BBNET OR BBPAGER BECOMES UNAVAILABLE, THEN TAKE OVER UNTIL THEY
# RETURN
# To use, just add failover to the BBEXT variable in etc/bbdef.sh
# To configure BBPAGER failover:
#     define both the primary and failover machines as BBPAGERS in
#     etc/bb-hosts
#     and set bbwarn: FAILOVER in etc/bbwarnsetup.cfg
Joe