[hobbit] Some thoughts on clustered hobbit

10 May 2005

      On 5/9/05, Kauffman, Tom <KauffmanT at nibco.com> wrote:
...
First, let me express my thanks to Brian for putting this document
together and allowing Henrik to distribute it! I've a lot of experience
with IBM's HACMP for AIX, and getting a clustered configuration working
as desired is not a trivial procedure.
Henrik -- check me on this: it's my impression we no longer need a
'BBPAGER' entry on the client-side bb-hosts because the hobbit server
passes all potentially alertable statuses to hobbit-alert and it decides
if an alert is really required.
Brian -- no offense, but I would rather categorise your configuration as
"active/inactive". I'm looking at doing an "active/passive" cluster when
time frees up -- about a month from now. The difference? I'm running two
hobbit/apache instances all the time -- but the 'passive' (fallover)
side is not doing alerting or network tests. It does build displays
(it's my technical documentation server as well) and it does keep both
history and rrd data updated. Both hosts show up on the client side as
'BBDISPLAY'. On failover it will take over the IP address for the hobbit
display and re-launch hobbit with network testing and alerting enabled.
I agree with your assessment, but I chose this model for a few reasons
(note that I'm basing this on about 2 1/2 years of experience running a
dual Big Brother failover setup):

* There is always one repository for both configuration and data, kept
  reasonably identical on both systems (within the synch delay).
* Only one IP address accepts BB reports, which cuts down on both
  network traffic and firewall rules (for hosts in locked-down VLANs).
* The other system can be dedicated to another purpose (it currently
  hosts our documentation site, which fails over in the opposite
  direction).
* No redundant work is done. Indeed, no load is 'shared' across the
  systems unless you host the web server on the other box.
There is a risk to this, based on the possibility of complete machine
failure in between synchronizations: Hobbit may come up without all
the updates for hosts or alerts. Based on my current model, I will lose
about a day of historical data. The synch rate can be changed, and
a gigabit crossover between the machines cuts down on any traffic
imposed by multiple synchs.
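
To make the synch trade-off concrete, here is a rough sketch in Python.
It assumes the data is pushed one way, from the active box to the
standby, with rsync; the function names, host name and directory are
made up for illustration and are not taken from the setup described
above. The second function only shows that the data at risk is whatever
was written since the last completed synch.

```python
from datetime import datetime, timedelta


def build_rsync_command(src_dir, standby_host, dest_dir):
    """Build a one-way rsync push from the active box to the standby.

    -a keeps permissions and timestamps; --delete removes files on the
    standby that no longer exist on the active side. The trailing slash
    on the source makes rsync copy the directory contents.
    """
    src = src_dir.rstrip("/") + "/"
    dest = f"{standby_host}:{dest_dir.rstrip('/')}/"
    return ["rsync", "-a", "--delete", src, dest]


def data_at_risk(last_synch, failure_time):
    """Span of updates that exist only on the failed machine."""
    return max(failure_time - last_synch, timedelta(0))


# Hypothetical example: nightly synch at 02:00, machine dies at 01:00
# the following night, so almost a full day of history is missing.
lost = data_at_risk(datetime(2005, 5, 9, 2, 0), datetime(2005, 5, 10, 1, 0))
command = build_rsync_command("/var/lib/hobbit", "standby", "/var/lib/hobbit")
```

The command list could be handed to subprocess.run() from cron or from
the cluster software; running it more often shrinks the window at the
cost of more traffic over the crossover link.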

Note that you could very easily turn off the Hobbit alerts with the same
clustering software by truncating and restoring the hobbit-alerts.cfg
file. I'm not sure how to disable the network tests, so that may require
some custom coding... Once that is done, you could use the same cluster
resource software to accomplish a 'hot' standby.
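
A minimal sketch of the truncate-and-restore idea, in Python. The
function names and the location of the saved copy are hypothetical, and
the sketch assumes that an empty hobbit-alerts.cfg means no alerts get
sent, as suggested above. It does nothing about the network tests.

```python
import os
import shutil


def disable_alerts(cfg_path, saved_path):
    """Save the current alert rules, then truncate the live file."""
    # If the live file is already empty, keep the existing saved copy
    # instead of overwriting it with nothing.
    if os.path.getsize(cfg_path) == 0:
        return
    shutil.copy2(cfg_path, saved_path)
    with open(cfg_path, "w"):
        pass  # opening in "w" mode truncates the file to zero bytes


def enable_alerts(cfg_path, saved_path):
    """Restore the saved alert rules into the live file."""
    shutil.copy2(saved_path, cfg_path)
```

The cluster software would call disable_alerts() when a node becomes the
standby and enable_alerts() when it takes over. The guard in
disable_alerts() matters because cluster scripts often run more than
once; without it a second call would replace the saved rules with an
empty file.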
Depending on host count and test count, this might be a bad idea -- but
...
we've only got about 300 entries in bb-hosts.
So -- thanks ever so much, again, for providing this -- it will make my
life ever so much easier next month when I get the time to automate the
failover environment.
Tom
Tom Kauffman
NIBCO, Inc

brianlynch＠gmail.com