So what does Cluster Recovery actually recover anyway?

The Windows Server 2003 Resource Kit introduced a handy, easy-to-use utility that even warranted its own download link on Microsoft.com/downloads. However, the name “Cluster Recovery” can lead to misunderstanding and, taken at face value, makes quite a bold statement. So what exactly does this utility recover?

ClusterRecovery.exe in fact performs two completely independent but important tasks. The first probably suits its name best: it recovers lost cluster resource checkpoints. Understanding this requires a bit of technical background. The Microsoft server clustering service can cluster various server features, roles, applications and services. These “resources” may have registry configuration keys stored in various locations under HKLM\Software and/or HKLM\System that need to be maintained and synchronized between all nodes of a cluster. This synchronization is necessary to ensure stable, predictable behavior of the resource across the entire cluster. The clustering service provides a method for replicating these keys between server nodes, and this automated process includes saving the registry keys onto the quorum device under the default folder location <quorumdrive>:\MSCS\{Resource GUID}. These folders and files (0000000n.CPR) are what we call Resource Checkpoints. Should these checkpoint files ever be lost or deleted, for whatever reason, we can use ClusterRecovery.exe to recreate them.
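To make the folder layout concrete, here is a minimal Python sketch that lists which resource GUID folders under an MSCS directory contain checkpoint files and which contain none. It is only an illustration of the layout described above: the function names `find_checkpoints` and `missing_checkpoints` are hypothetical, the path you pass in is whatever your quorum folder happens to be, and the script only reads the directory. It is not part of Cluster Recovery and does not recreate anything.

```python
import re
from pathlib import Path

# Assumption: each checkpoint folder is named after a resource GUID,
# with or without surrounding braces.
GUID_DIR = re.compile(
    r"^\{?[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}\}?$"
)


def find_checkpoints(mscs_root):
    """Map each GUID-named folder under mscs_root to its sorted .CPR file names."""
    found = {}
    for entry in Path(mscs_root).iterdir():
        if entry.is_dir() and GUID_DIR.match(entry.name):
            found[entry.name] = sorted(
                f.name
                for f in entry.iterdir()
                if f.is_file() and f.suffix.upper() == ".CPR"
            )
    return found


def missing_checkpoints(mscs_root):
    """Return the GUID folders that hold no .CPR checkpoint file at all."""
    return sorted(
        guid for guid, files in find_checkpoints(mscs_root).items() if not files
    )
```

Calling `missing_checkpoints` with the path to the MSCS folder on the quorum drive returns the GUID folders that have no checkpoint files. A GUID folder that has been deleted entirely will not show up, so an empty result does not prove that every resource still has its checkpoints.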

The second function that Cluster Recovery provides is arguably the more valuable, but it may also be the most misunderstood. The option to “Replace a physical disk resource” doesn’t actually do any disk replacement for you. What it does is automate a very tedious cleanup job on all of your other resources after you have had to replace one or more clustered storage volumes.

Here is what you need to know before using Cluster Recovery if you have had a disk failure or simply need to replace or migrate your clustered disks to new disks. First, you will need to present/attach your new LUNs, partition and format the disks, and restore all data to your new disks. Then, using Cluster Administrator, create a new Physical Disk resource to manage the new disk. Once that is complete, you are still left with an original disk resource that, more than likely, has multiple other resources configured to be dependent upon it. In the case of file server clusters, this could be quite a few File Share resources.

Here is where Cluster Recovery comes into play. It will analyze the cluster resources to find any resource that is dependent upon your specified original disk resource and move that dependency to the new disk resource. It will also rename your original disk resource to “<Original Name> (lost)” and rename the new disk resource to “<Original Name>”. For example, suppose you have a Physical Disk resource “Disk U:” with a File Share resource named “User Shares - U:\Users” that is dependent upon “Disk U:”, and you have now added a larger “New DISK U:” to the cluster to replace the original. The following is what will occur through the use of Cluster Recovery:

Before:

[Disk U:]                 <-- original resource

     |_______ [User Shares - U:\Users]

[New DISK U:]             <-- new resource

After:

[Disk U: (lost)]          <-- original resource

[Disk U:]                 <-- new resource

     |_______ [User Shares - U:\Users]
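
The dependency move and rename shown in the Before/After diagram can be modeled in a few lines of Python. This is only a sketch of the behavior described in this post, using an in-memory dictionary of resource names and their dependencies. The function name `replace_disk` and the data structure are assumptions made for illustration; this is not how ClusterRecovery.exe is implemented, and it does not talk to a real cluster.

```python
def replace_disk(resources, old_name, new_name):
    """Return a new resource map after swapping old_name for new_name.

    resources maps each resource name to the set of resource names it
    depends on. The input map is not modified.
    """
    lost_name = f"{old_name} (lost)"
    if old_name not in resources or new_name not in resources:
        raise KeyError("both disk resources must already exist")
    if lost_name in resources:
        raise ValueError(f"a resource named {lost_name!r} already exists")

    # Original disk becomes "<Original Name> (lost)"; new disk takes its name.
    rename = {old_name: lost_name, new_name: old_name}

    updated = {}
    for name, deps in resources.items():
        # Step 1: move any dependency on the original disk to the new disk.
        moved = {new_name if dep == old_name else dep for dep in deps}
        # Step 2: apply the renames to the resource and to its dependencies.
        updated[rename.get(name, name)] = {rename.get(dep, dep) for dep in moved}
    return updated


if __name__ == "__main__":
    before = {
        "Disk U:": set(),
        "New DISK U:": set(),
        "User Shares - U:\\Users": {"Disk U:"},
    }
    for name, deps in replace_disk(before, "Disk U:", "New DISK U:").items():
        print(name, "depends on", sorted(deps))
```

In the result, the file share still lists “Disk U:” as its dependency, but that name now belongs to the new disk resource, while the original disk resource carries the “(lost)” suffix and has no dependents.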

You can download Cluster Recovery here.

Notes: Once again, you must add and format the new disks, restore any data, and manage the drive letters manually, in addition to creating the cluster resource yourself. Also, there is no 64-bit version as of this writing, and the 32-bit download should not be used with 64-bit operating systems.

Author: Chris Allen

Microsoft Enterprise Platforms Support

Support Escalation Engineer