Windows 2008 Failover Cluster Validation Fails on ‘Validate SCSI-3 Persistent Reservation’

We’ve been seeing a lot of calls lately from customers who are running the validation that’s required prior to installing and configuring failover clustering, and the validation fails in the ‘Storage’ portion of the tests. The specific error seen in the validation report is:

[Image: the error as shown in the validation report]

If you click on the ‘Validate SCSI-3 Persistent Reservation’ link in the report, it will take you to the detail section.

Validate SCSI-3 Persistent Reservation

  • Validate that storage supports the SCSI-3 Persistent Reservation commands.
  • Validating Cluster Disk 0 for Persistent Reservation support
  • Registering PR key for cluster disk 0 from node node1.cluster.com
  • Failed to Register PR key for cluster disk 0 from node node1.cluster.com status 1
  • Cluster Disk 0 does not support Persistent Reservation

If you dig a little deeper, you can also look at the ValidateStorage.txt file that’s located in the Windows\Cluster\Reports directory.

00000fd4.00000fd8::15:56:45.857 CprepDiskPRUnRegister: Enter CprepDiskPRUnRegister: ulSignature 0xd0426bb2
00000fd4.00000fd8::15:56:45.857 CprepDiskFind: found disk with signature 0xd0426bb2
00000fd4.00000fd8::15:56:49.977 CprepDiskPRUnRegister: Failed to unregister PR key, status 1117
00000fd4.00000fd8::15:56:49.977 CprepDiskPRUnRegister: Exit CprepDiskPRUnRegister: hr 0x8007045d
00000fd4.00000fd8::15:56:54.097 CprepDiskFind: found disk with signature 0xd0426bb2
00000fd4.00000fd8::15:56:54.097 CprepDiskIsPRPresent: Failed to read PR reservations, status 0
00000fd4.00000fd8::15:56:54.097 CprepDiskIsPRPresent: Exit CprepDiskIsPRPresent hr 0x0, Present 0
00000fd4.00000fd8::15:56:54.097 CprepDiskFind: found disk with signature 0xd0426bb2
00000fd4.00000fd8::15:56:54.097 DoIoctlAndAlloc: ControlCode 0x70050, retCode 1, status 122
00000fd4.00000fd8::15:56:54.097 CprepDiskGetArbSectors: Exit CprepDiskGetArbSectors: hr 0x0, SectorX 11 SectorY 12
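The `hr` values in this excerpt are HRESULTs, and the one that matters here, 0x8007045d, is simply Win32 error 1117 (ERROR_IO_DEVICE) wrapped in the Win32 facility, which matches the "status 1117" on the line before it. As a rough sketch of how to decode these values while reading the log, the Python snippet below pulls the `hr` fields out of a line and splits them into their parts. The helper names (`decode_hresult`, `describe`, `hresults_in`) are made up for this illustration, and the lookup table only covers the codes that appear in the excerpt; treating the report's "status 1" as a Win32 error code is an assumption.

```python
import re

# Win32 error codes seen in the excerpt above (not a complete table).
# Assumption: the "status N" values in the log are Win32 error codes.
WIN32_ERRORS = {
    0: "ERROR_SUCCESS",
    1: "ERROR_INVALID_FUNCTION",
    122: "ERROR_INSUFFICIENT_BUFFER",
    1117: "ERROR_IO_DEVICE",
}

FACILITY_WIN32 = 7

# Matches fields such as "hr 0x8007045d" or "hr 0x0" in a log line.
HR_PATTERN = re.compile(r"\bhr (0x[0-9a-fA-F]+)")


def decode_hresult(hr):
    """Split an HRESULT into (failed, facility, code)."""
    failed = bool(hr & 0x80000000)      # top bit set means failure
    facility = (hr >> 16) & 0x1FFF      # which subsystem produced it
    code = hr & 0xFFFF                  # low 16 bits carry the error code
    return failed, facility, code


def describe(hr):
    """Return a short human-readable description of an HRESULT."""
    if hr == 0:
        return "S_OK"
    failed, facility, code = decode_hresult(hr)
    if facility == FACILITY_WIN32:
        name = WIN32_ERRORS.get(code, "not in this sketch's table")
        return f"Win32 error {code} ({name})"
    return f"facility {facility}, code {code}"


def hresults_in(line):
    """Return every hr value found in one log line, as integers."""
    return [int(value, 16) for value in HR_PATTERN.findall(line)]


sample = ("00000fd4.00000fd8::15:56:49.977 CprepDiskPRUnRegister: "
          "Exit CprepDiskPRUnRegister: hr 0x8007045d")
for hr in hresults_in(sample):
    print(hex(hr), "->", describe(hr))
# prints: 0x8007045d -> Win32 error 1117 (ERROR_IO_DEVICE)
```

In other words, the attempt to unregister the PR key came back from the storage stack as a device I/O error, which is what you would expect when the array rejects the Persistent Reservation command.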

So what is a “Persistent Reservation” (PR) and why should you care? PRs are implemented by a set of SCSI-3 commands, which clustering uses to protect LUNs. When a LUN is reserved, no other computers on the SAN can access the disk, except the ones the cluster controls. This matters because it prevents other machines from accessing the disk and corrupting the data on it.
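To make the idea concrete, here is a toy Python model of the behavior described above: nodes register a key with the LUN, one of them takes the reservation, and from then on only registered nodes may write. This is purely a conceptual sketch. The class `ToyLun`, its methods, and the node names are invented for illustration; it sends no real SCSI commands and assumes a simple "registrants only" style of reservation.

```python
class ToyLun:
    """Toy model of a LUN protected by a registrants-only reservation."""

    def __init__(self):
        self.keys = {}       # initiator name -> registered PR key
        self.holder = None   # initiator currently holding the reservation

    def register(self, initiator, key):
        """Record a PR key for an initiator (the step that fails in Validate)."""
        self.keys[initiator] = key

    def unregister(self, initiator):
        """Remove an initiator's key; drop the reservation if it held it."""
        self.keys.pop(initiator, None)
        if self.holder == initiator:
            self.holder = None

    def reserve(self, initiator):
        """Take the reservation; only a registered initiator may do so."""
        if initiator not in self.keys:
            raise PermissionError(f"{initiator} has no registered key")
        if self.holder not in (None, initiator):
            raise PermissionError(f"reservation already held by {self.holder}")
        self.holder = initiator

    def can_write(self, initiator):
        """With no reservation anyone may write; otherwise registrants only."""
        if self.holder is None:
            return True
        return initiator in self.keys


# Hypothetical two-node cluster plus an unrelated machine on the same SAN.
lun = ToyLun()
lun.register("node1", 0x1)
lun.register("node2", 0x2)
lun.reserve("node1")

print(lun.can_write("node2"))          # True: a registered cluster node
print(lun.can_write("backup-server"))  # False: never registered a key
```

If the storage cannot perform the register step at all, as in the report above, none of the later protection is possible, which is why Validate flags the disk.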

Validate is a functional test tool that verifies that your storage supports all the SCSI commands that clustering requires. For your cluster to work correctly, it is critical that the Validate tests pass. The Storage tests are by far the most important, and they should not be dismissed!

If you are reading this blog, then the bad news is that Validate has probably reported that your storage does not support Persistent Reservations and is therefore not compatible with Windows Server 2008 Failover Clustering. The good news is that it most likely will work; you just have to do a few things! All storage vendors and almost all currently shipping models support Windows Server 2008 Failover Clustering, but many require firmware updates or configuration changes. Microsoft has been working closely with partners such as HP, EMC, IBM, NetApp, HDS, Fujitsu, Lefthand, Equallogic, Xiotech, NEC, LSI, Infortrend, 3PAR, Intransa, FalconStor, Nexsan, and more, and they all work!

First things first: call your storage vendor and ask them whether your storage is compatible AND configured for use with Windows Server 2008 Failover Clustering.

There are two things to verify:

  1. Correct firmware version
  2. Correct configuration settings

The storage vendor is the right source for how to correctly configure their arrays to work with Failover Clustering. We can’t post the specific steps for each vendor, but as we become aware of publicly available documentation from the SAN vendors, we’ll add it to this post.

HP has a publicly available document detailing the steps needed to get SCSI-3 PRs to work on their hardware (pages 16-18):

Implementing Microsoft® Windows® Server 2008 Service Pack 2 beta on HP ProLiant servers

Although these vendor links may contain steps to resolve the PR problem, we still strongly recommend engaging directly with the vendor to verify that the storage configuration changes are current, appropriate for your environment and hardware, and non-destructive to your data. Microsoft makes no guarantees about any of the third-party links we provide. They are intended solely to give you information to have on hand when you talk to your particular vendor.

Jeff Hughes
Senior Support Escalation Engineer
Microsoft Enterprise Platforms Support