Windows Failover Cluster Storage Quick Test

This blog post is brought to you by eighteen-year veteran Microsoft Premier Field Engineer David Morgan.

Goal of this Post

One of the best troubleshooting tools for Windows Failover Clustering since Windows Server 2008 is the Cluster Validation Wizard. The Validation Wizard can make certain your new failover cluster deployment goes right the first time, but it’s also a great tool for finding problems in an existing production cluster. Did you know that having a current Validation Wizard report is part of the Microsoft Support statement for failover clustering? It is, and a current report can save a lot of questions and troubleshooting work for the support engineer who assists you.

What I want to talk about today is one part of the Validation Report: storage testing. For the Validation Wizard to test your clustered storage, it must have storage to test and, more importantly, storage that is in an offline state. For example, if you have a SQL cluster, the odds are you will have to take your SQL Service offline to test any associated storage hardware. The average cluster generally has no free storage to test.

Well, most failover clusters do have a witness disk, and here lies a quick method of getting storage testing performed. The witness disk since Server 2008 has been different from the quorum disk that clustering used in its quorum model in Server 2003 and older versions. In those legacy clusters the cluster could not survive the loss of the quorum disk, but a 2008 or newer cluster can lose its witness disk and keep right on providing high availability. Now, the method I’m going to provide instructions for shortly won’t provide the ‘best’ storage test results, but it will give you a quick, safe way to get ‘some’, and that may be better than taking your cluster out of production or having a storage team carve out and assign a piece of storage for you. The following method requires just a few clicks of the mouse. To mitigate the slight risk involved here, one could ‘change’ the quorum model to a File Share Witness (FSW) until the storage validation tests are complete and then change it back to a disk witness. I’ll put a link to FSW configuration in the directions below, at the point where you would need to configure it.
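To put a rough number on that ‘slight risk’, here is a minimal sketch of static majority arithmetic, assuming one vote per node plus one for a witness. The function names are my own for illustration, and the sketch deliberately ignores the dynamic quorum behavior in newer Windows Server versions, which can adjust votes on the fly.

```python
def votes_needed(total_votes: int) -> int:
    """Smallest number of votes that is a strict majority."""
    return total_votes // 2 + 1


def failures_tolerated(nodes: int, witness: bool) -> int:
    """How many votes can be lost before a static majority is gone.

    Assumes one vote per node and one vote for the witness (disk or
    file share). Dynamic quorum is intentionally not modeled.
    """
    total = nodes + (1 if witness else 0)
    return total - votes_needed(total)


if __name__ == "__main__":
    for nodes in (2, 3, 4):
        for witness in (True, False):
            print(nodes, "nodes, witness:", witness,
                  "-> tolerated vote losses:", failures_tolerated(nodes, witness))
```

Under these assumptions a two-node cluster tolerates one lost vote with a witness and none without, which is why swapping in a File Share Witness for the duration of the test is a reasonable precaution on clusters with an even number of nodes.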

A great TechNet article one should read prior to performing this procedure:

Instruction Detail
(At wizard screens not shown in this document, just click Next):

STEP 1:    Note the three red squared items we will be working with in the main Cluster Administrator window.


STEP 2:

  • Right click the Cluster Name
  • Choose More Actions
  • Choose Configure Cluster Quorum Settings


STEP 3:

  • Choose Advanced Quorum Configuration


STEP 4:

  • Choose Do Not Configure A Quorum Witness
    • Note: If you wish to use the risk mitigation option of an FSW choose Configure A File Share Witness instead and follow the instructions in that path.


STEP 5:

  • Note the warning and choose Next


STEP 6:

  • Note the No Witness Configured result and choose Finish.


STEP 7:

  • In Cluster Administrator select Disks in the navigation pane and verify the Disk Witness disk has been returned to Available Storage.
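If you would rather script Steps 2 through 7, the FailoverClusters PowerShell module offers what I believe are equivalent cmdlets. The sketch below is only a Python helper that assembles those command lines so you can review them before pasting them into an elevated PowerShell session; it does not run anything. The helper names and the share path in the usage lines are hypothetical, and the `-NoWitness` parameter assumes Windows Server 2012 R2 or later (older releases used differently named parameters).

```python
from typing import List, Optional


def quote_ps(value: str) -> str:
    """Wrap a value in PowerShell single quotes, doubling embedded quotes."""
    return "'" + value.replace("'", "''") + "'"


def remove_witness_commands(file_share: Optional[str] = None) -> List[str]:
    """Build the PowerShell lines that mirror Steps 2-7.

    file_share: optional UNC path; when given, a File Share Witness is
    configured instead of running with no witness (the risk mitigation
    option mentioned in Step 4).
    """
    commands = ["Get-ClusterQuorum"]  # record the current witness first
    if file_share is None:
        commands.append("Set-ClusterQuorum -NoWitness")
    else:
        commands.append("Set-ClusterQuorum -FileShareWitness " + quote_ps(file_share))
    # Step 7: confirm the former witness disk is back in Available Storage
    commands.append("Get-ClusterGroup 'Available Storage' | Get-ClusterResource")
    return commands


if __name__ == "__main__":
    # Hypothetical share path, for illustration only
    for line in remove_witness_commands(r"\\fs01\witness"):
        print(line)
```

Printing the commands instead of executing them keeps a human in the loop, which seems prudent for anything that touches quorum on a production cluster.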


STEP 8:

  • Start the Validate a Cluster option in the main Cluster Administrator window
  • In the Validate a Configuration Wizard choose Run Only The Tests I Select


STEP 9:

  • Deselect all but the Storage option


STEP 10:

  • Select the disk you just returned to Available Storage


STEP 11:    Allow the Wizard to take the disk offline and run the tests.


STEP 12:    View the report and verify the tests completed
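Steps 8 through 12 can likely be reproduced with the `Test-Cluster` cmdlet, limited to the Storage category and pointed at a single disk. Here is a minimal sketch of a Python helper that builds that one command line for review; the helper names, the disk name ‘Cluster Disk 1’, and the report name are assumptions for illustration, and the `-Disk` parameter assumes Windows Server 2012 or later.

```python
from typing import Optional


def quote_ps(value: str) -> str:
    """Wrap a value in PowerShell single quotes, doubling embedded quotes."""
    return "'" + value.replace("'", "''") + "'"


def storage_validation_command(disk_name: str,
                               report_name: Optional[str] = None) -> str:
    """Build a Test-Cluster line that runs only the Storage tests.

    disk_name: the cluster disk resource that was returned to
    Available Storage (it must not be in use by a clustered role).
    report_name: optional name for the validation report.
    """
    parts = ["Test-Cluster", "-Include", "'Storage'", "-Disk", quote_ps(disk_name)]
    if report_name:
        parts += ["-ReportName", quote_ps(report_name)]
    return " ".join(parts)


if __name__ == "__main__":
    # Hypothetical disk and report names, for illustration only
    print(storage_validation_command("Cluster Disk 1", "WitnessDiskStorageTest"))
```

Check the disk resource name against what Step 7 showed in Available Storage before running the generated command.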


STEP 13:    Right click the Cluster Name:

  • Choose More Actions
  • Choose Configure Cluster Quorum Settings
  • Choose Select the Quorum Witness


STEP 14:    Choose Configure a Disk Witness


STEP 15:    Select the disk that was the previous Disk Witness


STEP 16:    Verify the Disk Witness has been configured successfully when the wizard finishes


STEP 17:    Verify the Disk Witness has been configured successfully in the Cluster Core Resources pane in Cluster Administrator


STEP 18:    Verify the Disk Witness has been configured successfully via the Storage/Disks option in the navigation pane
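Steps 13 through 18 also appear to have a PowerShell counterpart. The sketch below, again a Python helper that only assembles command lines for review, puts the disk witness back and then lists the checks I would run afterward. The helper names and the disk name are hypothetical, `-DiskWitness` assumes Windows Server 2012 R2 or later, and ‘Cluster Group’ is the default name of the core cluster resource group, so adjust it if yours was renamed.

```python
from typing import List


def quote_ps(value: str) -> str:
    """Wrap a value in PowerShell single quotes, doubling embedded quotes."""
    return "'" + value.replace("'", "''") + "'"


def restore_disk_witness_commands(disk_name: str) -> List[str]:
    """Build the PowerShell lines that mirror Steps 13-18."""
    return [
        # Steps 13-15: make the tested disk the witness again
        "Set-ClusterQuorum -DiskWitness " + quote_ps(disk_name),
        # Step 16: confirm the quorum configuration reports the disk witness
        "Get-ClusterQuorum",
        # Step 17: the disk should be listed with the core cluster resources
        "Get-ClusterGroup 'Cluster Group' | Get-ClusterResource",
        # Step 18: the disk should no longer be listed in Available Storage
        "Get-ClusterGroup 'Available Storage' | Get-ClusterResource",
    ]


if __name__ == "__main__":
    # Hypothetical disk name, for illustration only
    for line in restore_disk_witness_commands("Cluster Disk 1"):
        print(line)
```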


STEP 19:    Review the storage test results and troubleshoot as necessary.

To summarize:

  • We just removed and replaced the Witness Disk in a functioning cluster with no downtime, reboots, failovers, restarts of the cluster service, etc. We’ve given ourselves the opportunity to test the entire storage stack, from the OS file system through the HBAs and all the way down to a specific physical disk, with the testing capabilities of the Failover Cluster Validation Wizard.