On Thursday 26 March 2009, Malcolm Hunter wrote:
The other (the slave one) first checks the status of the other server (a simple wget of the status page can be enough) and only sends out the alert if this page is not green.
So, basically, both servers are triggering on the same alert, but the slave server only sends out the alert if the primary server is not green.
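The slave-side check described above could be sketched roughly as follows. This is only an illustration, not a tested Hobbit integration: the function names are hypothetical, and the pattern assumes the status page shows its overall colour through a background image name such as "bkg-green.gif", which you would need to verify against your own page. A page that cannot be fetched at all is treated as "not green".

```python
import http.client
import re
import urllib.request
from typing import Optional

# Assumption: the overall colour appears in the page as "bkg-<colour>".
# Adjust this pattern to whatever your status page actually contains.
COLOR_PATTERN = re.compile(
    r"bkg-(green|yellow|red|purple|blue|clear)", re.IGNORECASE
)


def extract_color(html: str) -> Optional[str]:
    """Return the first colour found in the page, or None if there is none."""
    match = COLOR_PATTERN.search(html)
    return match.group(1).lower() if match else None


def fetch_primary_color(url: str, timeout: float = 10.0) -> Optional[str]:
    """Fetch the primary server's status page and return its colour.

    Returns None when the page cannot be fetched or no colour is found.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            html = response.read().decode("utf-8", errors="replace")
    except (OSError, ValueError, http.client.HTTPException):
        return None
    return extract_color(html)


def slave_should_alert(primary_color: Optional[str]) -> bool:
    """The slave sends the alert only if the primary is not green."""
    return primary_color != "green"
```

The slave's alert script would call fetch_primary_color() with the URL of the primary's status page and send the alert only when slave_should_alert() returns True.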
Wouldn't there be more involved in this? What if the primary server's hobbit daemon was down, but the web service was still running? The secondary server would want to report the hobbit daemon being down, but wouldn't because the primary server's page was still green and hadn't been updated.

Not if you let the master hobbit send its status to the slave hobbit and check that status on the slave hobbit. If the master hobbit is down, its status on the slave hobbit will turn purple. It is still possible that you miss some alerts.
If I needed to do something like this, I would create a custom test that does some more advanced checking (Can I send an alert? Can I reach my SMTP server? ...). The status is sent to the other hobbit server and can be used to trigger a failover in alerting.
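One of those checks (can I reach my SMTP server?) might look roughly like the sketch below. It is a minimal illustration under assumptions: the function names and the column name "alertpath" are made up, and the message layout ("status host.column colour text", with the dots in the hostname replaced by commas) is the usual status message convention as I understand it, so check it against your own setup. Sending the message to the other hobbit server is left out here.

```python
import socket
import time


def can_reach_smtp(host: str, port: int = 25, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to the SMTP server can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def build_status_message(machine: str, column: str, ok: bool) -> str:
    """Build a status message reporting whether alerts can be delivered.

    Assumption: dots in the hostname are written as commas in the message.
    """
    color = "green" if ok else "red"
    summary = "SMTP server reachable" if ok else "SMTP server NOT reachable"
    timestamp = time.strftime("%a %b %d %H:%M:%S %Z %Y")
    host_id = machine.replace(".", ",")
    return f"status {host_id}.{column} {color} {timestamp} {summary}"
```

The resulting line would be reported to the other hobbit server, which can then use the colour of that column to decide whether it should take over the alerting.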
Stef