dr for hobbit

19 May 2008


Hi all,
I am redesigning the method we use for performing a failover to a disaster
recovery installation of hobbit. I am interested in opinions on the approach
and any shortcomings.
Note: This is not HA/clustering, it is for DR purposes.
We are aiming to have:
- a production hobbit deployment
- a DR hobbit deployment
Clients will be configured to send metrics to both servers, which will keep
historical RRD data (and similar state) up to date on both.
The production server will be configured to send out alerts. The DR server
will not.
At regular intervals, rsync will be used to synchronise data from the
production server to the DR server, including the checkpoint file of the
in-memory state.
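A rough sketch of the periodic sync step, e.g. for running from cron on the
production server. The paths, the host name and the helper names are
hypothetical placeholders, not hobbit defaults; only the rsync flags (-a,
--dry-run) are real.

```python
import shlex
import subprocess

# Hypothetical locations; adjust to the local install.
HOBBIT_DATA_DIR = "/var/lib/hobbit/"  # assumed data dir (trailing slash: copy contents)
DR_TARGET = "hobbit@dr-hobbit.example.com:/var/lib/hobbit/"  # assumed DR host and path


def build_sync_command(source=HOBBIT_DATA_DIR, target=DR_TARGET, dry_run=False):
    """Return the rsync argument list for a one-way production -> DR sync."""
    cmd = ["rsync", "-a"]  # archive mode: recurse, keep permissions and mtimes
    if dry_run:
        cmd.append("--dry-run")  # report what would change without copying
    cmd += [source, target]
    return cmd


def run_sync(dry_run=False):
    """Run the sync and raise CalledProcessError if rsync fails."""
    cmd = build_sync_command(dry_run=dry_run)
    print("running:", shlex.join(cmd))
    return subprocess.run(cmd, check=True)
```

Calling run_sync(dry_run=True) first shows what would be transferred. For
failback, the same helper could be called with source and target swapped.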
In a DR event, the DR hobbit server will be promoted to active by
restarting hobbit so that it loads the checkpoint and the alert configuration.
I expect this will ensure that the DR server is "up to date" with
production as of the last checkpoint. This includes tests that have
been disabled or acknowledged.
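Since the DR state is only as current as the last synced checkpoint, a
pre-promotion check along these lines might help. This is a minimal sketch:
the 15 minute tolerance and the function names are assumptions, not anything
hobbit provides.

```python
import os
import time

MAX_CHECKPOINT_AGE = 15 * 60  # seconds; assumed tolerance, not a hobbit default


def checkpoint_age(path, now=None):
    """Seconds since the checkpoint file was last modified."""
    if now is None:
        now = time.time()
    return now - os.path.getmtime(path)


def safe_to_promote(path, max_age=MAX_CHECKPOINT_AGE, now=None):
    """True if the checkpoint exists and is no older than max_age seconds."""
    if not os.path.isfile(path):
        return False
    return checkpoint_age(path, now) <= max_age
```

Because rsync -a preserves modification times, the age measured on the DR
server reflects when production last wrote the checkpoint, not when it was
copied.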
Prior to failback to the production hobbit installation, the reverse of the
above would be performed.
An rsync of RRD data files would also be performed to cover any window in
which one of the servers was offline.
Is there anything wrong with this approach?
Cheers
Phil
--
Tel: 0400 466 952
Fax: 0433 123 226
email: philwild AT gmail.com
