CLUSTERS


Fault recovery in a cluster
Fault recovery is the ability of an IBM® Lotus® Domino™ server to clean up and restart itself after a failure. Fault recovery works well in a Domino cluster. If there is no Domino server to fail over to, fault recovery still restores users' access to their data by restarting the failed server. Even if users fail over to another cluster server, fault recovery increases availability because the failed server becomes available again. In addition, depending on the workload balancing parameters you've set, some users will fail back to the original server when they open new databases.

If you are using an operating system cluster in conjunction with a Domino cluster, whether to use fault recovery depends on how you configured the operating system cluster. If you configured the operating system cluster to fail over on a hardware failure only, fault recovery works well. Fault recovery restarts Domino on its current server, and no operating system failover occurs.

If you configured your operating system cluster to fail over on both hardware and software failures, you don't need fault recovery because the operating system cluster will restart Domino on another server in the cluster. In fact, you should disable fault recovery. Otherwise, Domino may try to restart itself on the failed server while the operating system cluster is restarting it on another server, and the two restarts can interfere with each other.
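
The guidance above can be summarized as a small decision rule. The following Python sketch is only an illustration of that rule; the names OsClusterFailover and should_enable_fault_recovery are hypothetical and are not part of Domino or of any operating system cluster product.

```python
from enum import Enum
from typing import Optional


class OsClusterFailover(Enum):
    """Hypothetical labels for how an operating system cluster is set to fail over."""
    HARDWARE_ONLY = "hardware_only"
    HARDWARE_AND_SOFTWARE = "hardware_and_software"


def should_enable_fault_recovery(os_cluster: Optional[OsClusterFailover]) -> bool:
    """Return True if Domino fault recovery is recommended.

    Pass None when no operating system cluster is in use.
    """
    if os_cluster is None:
        # Domino cluster only (or a standalone server): fault recovery helps.
        return True
    if os_cluster is OsClusterFailover.HARDWARE_ONLY:
        # The OS cluster ignores software failures, so Domino restarts itself.
        return True
    # The OS cluster already restarts Domino on another server after a
    # software failure; enabling fault recovery too would duplicate the restart.
    return False


if __name__ == "__main__":
    for setting in (None, *OsClusterFailover):
        label = "none" if setting is None else setting.value
        print(f"{label}: enable fault recovery = {should_enable_fault_recovery(setting)}")
```

The rule returns False only when the operating system cluster fails over on both hardware and software failures, which is the one case described above where fault recovery should be disabled.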

By default, fault recovery is disabled. To enable it, edit the Server document as follows.

1. From the Domino Administrator or the Web Administrator, click the Configuration tab.

2. In the Task pane, expand Server, and click All Server Documents.

3. In the Results pane, select the Server document you want, click Edit Server, and then click the Basics tab.

4. In the Fault Recovery section, choose "Enabled" in the "Automatically Restart Server After Fault/Crash" field.

5. (Optional) Complete any of the other fields in the Fault Recovery section that you want.

6. Make any other changes you want to the Server document, and then click Save & Close.