Support Tip: Virtual Machine Connections to Storage Lost When Adding Host Cluster in VMM 2012 R2

~ Chuck Timon | Senior Support Escalation Engineer

Recently we had a customer report a significant outage in his Windows Server 2012 R2 Hyper-V Failover Cluster when adding it to System Center 2012 R2 Virtual Machine Manager (VMM 2012 R2). The result was that virtual machine workloads went offline because of a loss of connectivity to the backend storage.

We have not seen this issue very often, and we suspect the root cause is environmental in nature; however, we wanted to make this information available for awareness. The issue surfaces only when there is multi-path connectivity to the storage hosting the virtual machine files. The storage connectivity can be via the inbox Microsoft MPIO or by way of a third-party MPIO solution, that is, a Device Specific Module (DSM).

When Hyper-V Hosts\Clusters are added to VMM for management, VMM deploys an agent that provides the communications connectivity between the host and the VMM server.


Additionally, a Refresh Host step is included as part of the job to ensure that VMM collects all of the relevant configuration information pertaining to the host\cluster, which is then added to the VMM database. Configuration information includes what storage the host has access to and whether or not the access is via a single or multiple paths. In multi-path scenarios where the Multipath I/O feature is installed, registry keys are created that VMM will pre-populate with a list of devices.

The two registry keys are:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\MPDEV\MPIOSupportedDeviceList

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\msdsm\Parameters\DsmSupportedDeviceList

There is a single device added to each list by the Multipath I/O feature installation: 'Vendor 8Product 16'. This is a placeholder value to show the correct Vendor ID\Product ID format and is not used for management of any disk devices. See the following for more information:

Determining the Hardware ID to Be Managed by MPIO
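
As a rough illustration of the 'Vendor 8Product 16' format, the sketch below pads a vendor ID to 8 characters and a product ID to 16 characters and joins them into one string. It assumes that entries are fixed-width and space-padded, which is what the placeholder suggests. The function name Format-MpioHardwareId is hypothetical and is not part of the MPIO module.

```powershell
# Hypothetical helper: build a fixed-width MPIO hardware ID string.
# Assumption: 8 characters for the vendor ID, 16 for the product ID, space-padded.
function Format-MpioHardwareId {
    param(
        [Parameter(Mandatory)] [string] $VendorId,
        [Parameter(Mandatory)] [string] $ProductId
    )

    if ($VendorId.Length -gt 8)   { throw "VendorId is longer than 8 characters: '$VendorId'" }
    if ($ProductId.Length -gt 16) { throw "ProductId is longer than 16 characters: '$ProductId'" }

    # A negative alignment value left-aligns the text and pads it with spaces.
    '{0,-8}{1,-16}' -f $VendorId, $ProductId
}

# Example: 'NETAPP' is padded to 8 characters and 'LUN' to 16, giving 24 in total.
$hardwareId = Format-MpioHardwareId -VendorId 'NETAPP' -ProductId 'LUN'
"[$hardwareId] has length $($hardwareId.Length)"
```

An ID whose fields already fill their widths, such as MSFT2005 with iSCSIBusType_0x9, comes out with no padding at all, which would explain why the two parts appear to run together in a device list.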

Looking at a debug trace of the Add-SCVMHost job execution shows the list of devices being added. An example is below.

[Microsoft-VirtualMachineManager-Debug]9,2,AddHyperVHostSubtask.cs,544,Mpio hardware VendorId MSFT2005 ProductId iSCSIBusType_0x9 addition successful on host contoso-hyp3.contoso.com,{00000000-0000-0000-0000-000000000000}

[Microsoft-VirtualMachineManager-Debug]9,2,AddHyperVHostSubtask.cs,544,Mpio hardware VendorId MSFT2011 ProductId SASBusType_0xA addition successful on host contoso-hyp3.contoso.com,{00000000-0000-0000-0000-000000000000}

[Microsoft-VirtualMachineManager-Debug]9,2,AddHyperVHostSubtask.cs,544,Mpio hardware VendorId NETAPP ProductId LUN addition successful on host contoso-hyp3.contoso.com,{00000000-0000-0000-0000-000000000000}

[Microsoft-VirtualMachineManager-Debug]9,2,AddHyperVHostSubtask.cs,544,Mpio hardware VendorId EMC ProductId SYMMETRIX addition successful on host contoso-hyp3.contoso.com,{00000000-0000-0000-0000-000000000000}

After all the devices are added, a 'claim' is placed on them by VMM and the trace reflects this.

[Microsoft-VirtualMachineManager-Debug]4,4,WsmanAPIWrapper.cs,2424,WinRM: URL: [http://contoso-hyp3.contoso.com:5985], Verb: [INVOKE], Method: [Update], Resource: [http://schemas.microsoft.com/wbem/wsman/1/wmi/root/microsoft/windows/storage/MSFT_MPIOClaimedHW],{00000000-0000-0000-0000-000000000000}

[Microsoft-VirtualMachineManager-Debug]9,2,AddHyperVHostSubtask.cs,586,Claiming Mpio hardware successful on host contoso-hyp3.contoso.com with restart required True,{00000000-0000-0000-0000-000000000000}

The final list of devices includes the following:

Vendor 8Product 16 (placeholder)
MSFT2005iSCSIBusType_0x9
MSFT2011SASBusType_0xA
NETAPP LUN
NETAPP LUN C-Mode
EMC SYMMETRIX
HP HSV300
DGC DISK
DGC LUNZ
DGC RAID 0
DGC RAID 1
DGC RAID 10
DGC RAID 3
DGC RAID 5
DGC VRAID
3PARdataVV
EQLOGIC_100E-00
IBM 2145

You will notice in the final trace snippet above that a restart is required. This is also reflected in a Warning message (26211) after the job completes:


Under ideal conditions (i.e. the hosts\clusters are added to VMM before workloads are deployed), the interruption in storage connectivity goes unnoticed: the hosts are rebooted and life goes on. However, we understand that in most environments the hosts are already deployed and running workloads when System Center Virtual Machine Manager is added later. In an effort to be more proactive, and to help customers avoid a potential outage, the SCVMM Product Team has provided the PowerShell script below to pre-populate the MPIO registry keys on a host before bringing it under management by VMM.

Here is the script:

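#-----------------------------------------------------------------------------------------------
#This script pre-populates the device whitelist when MPIO is being used for access to storage
#This script is provided 'As-Is' so use it at your own risk
#-----------------------------------------------------------------------------------------------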

$scvmmMpioHardwareIdArray = @()
$mpioHardwareIdClass = Get-CimClass -ClassName MSFT_MSDSMSupportedHW -Namespace ROOT/Microsoft/Windows/Storage
$scvmmMpioHardwareIdArray += New-CimInstance -CimClass $mpioHardwareIdClass -Property @{"VendorID"="MSFT2005"; "ProductID"="iSCSIBusType_0x9"} -ClientOnly
$scvmmMpioHardwareIdArray += New-CimInstance -CimClass $mpioHardwareIdClass -Property @{"VendorID"="MSFT2011"; "ProductID"="SASBusType_0xA"} -ClientOnly
$scvmmMpioHardwareIdArray += New-CimInstance -CimClass $mpioHardwareIdClass -Property @{"VendorID"="NETAPP" ; "ProductID"="LUN"} -ClientOnly
$scvmmMpioHardwareIdArray += New-CimInstance -CimClass $mpioHardwareIdClass -Property @{"VendorID"="NETAPP" ; "ProductID"="LUN C-Mode"} -ClientOnly
$scvmmMpioHardwareIdArray += New-CimInstance -CimClass $mpioHardwareIdClass -Property @{"VendorID"="EMC" ; "ProductID"="SYMMETRIX"} -ClientOnly
$scvmmMpioHardwareIdArray += New-CimInstance -CimClass $mpioHardwareIdClass -Property @{"VendorID"="HP" ; "ProductID"="HSV300"} -ClientOnly
$scvmmMpioHardwareIdArray += New-CimInstance -CimClass $mpioHardwareIdClass -Property @{"VendorID"="DGC" ; "ProductID"="DISK"} -ClientOnly
$scvmmMpioHardwareIdArray += New-CimInstance -CimClass $mpioHardwareIdClass -Property @{"VendorID"="DGC" ; "ProductID"="LUNZ"} -ClientOnly
$scvmmMpioHardwareIdArray += New-CimInstance -CimClass $mpioHardwareIdClass -Property @{"VendorID"="DGC" ; "ProductID"="RAID 0"} -ClientOnly
$scvmmMpioHardwareIdArray += New-CimInstance -CimClass $mpioHardwareIdClass -Property @{"VendorID"="DGC" ; "ProductID"="RAID 1"} -ClientOnly
$scvmmMpioHardwareIdArray += New-CimInstance -CimClass $mpioHardwareIdClass -Property @{"VendorID"="DGC" ; "ProductID"="RAID 10"} -ClientOnly
$scvmmMpioHardwareIdArray += New-CimInstance -CimClass $mpioHardwareIdClass -Property @{"VendorID"="DGC" ; "ProductID"="RAID 3"} -ClientOnly
$scvmmMpioHardwareIdArray += New-CimInstance -CimClass $mpioHardwareIdClass -Property @{"VendorID"="DGC" ; "ProductID"="RAID 5"} -ClientOnly
$scvmmMpioHardwareIdArray += New-CimInstance -CimClass $mpioHardwareIdClass -Property @{"VendorID"="DGC" ; "ProductID"="VRAID"} -ClientOnly
$scvmmMpioHardwareIdArray += New-CimInstance -CimClass $mpioHardwareIdClass -Property @{"VendorID"="3PARdata"; "ProductID"="VV"} -ClientOnly
$scvmmMpioHardwareIdArray += New-CimInstance -CimClass $mpioHardwareIdClass -Property @{"VendorID"="EQLOGIC_"; "ProductID"="100E-00"} -ClientOnly
$scvmmMpioHardwareIdArray += New-CimInstance -CimClass $mpioHardwareIdClass -Property @{"VendorID"="IBM" ; "ProductID"="2145"} -ClientOnly
$hwIdsOnSystem = Get-MSDSMSupportedHW

foreach ($scvmmMpioHardwareId in $scvmmMpioHardwareIdArray)
{
    # Check whether this hardware ID is already in the MSDSM supported hardware list
    $found = $false
    foreach ($hwIdOnSystem in $hwIdsOnSystem)
    {
        if (($hwIdOnSystem.ProductId -eq $scvmmMpioHardwareId.ProductId) -and ($hwIdOnSystem.VendorId -eq $scvmmMpioHardwareId.VendorId))
        {
            $found = $true
            break
        }
    }

    # Add the hardware ID only if it is not already present
    if (!$found)
    {
        New-MSDSMSupportedHW -VendorID $scvmmMpioHardwareId.VendorId -ProductID $scvmmMpioHardwareId.ProductId
    }
}
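
Before changing anything on a production host, it may help to preview which entries the script would add. The sketch below repeats the same vendor and product comparison but only reports the missing IDs. The function name Get-MissingMpioHardwareId is hypothetical, and the sample objects are stand-ins so the example runs without the MPIO feature installed.

```powershell
# Hypothetical helper: return the desired hardware IDs that are not yet present.
# It makes no changes; it mirrors the comparison used by the script above.
function Get-MissingMpioHardwareId {
    param(
        [object[]] $Desired,   # for example, $scvmmMpioHardwareIdArray
        [object[]] $Present    # for example, the output of Get-MSDSMSupportedHW
    )

    foreach ($want in $Desired) {
        $hit = $Present | Where-Object {
            $_.VendorId -eq $want.VendorId -and $_.ProductId -eq $want.ProductId
        }
        if (-not $hit) { $want }
    }
}

# Stand-in data for illustration only.
$desired = @(
    [pscustomobject]@{ VendorId = 'NETAPP'; ProductId = 'LUN' }
    [pscustomobject]@{ VendorId = 'EMC';    ProductId = 'SYMMETRIX' }
)
$present = @(
    [pscustomobject]@{ VendorId = 'NETAPP'; ProductId = 'LUN' }
)

Get-MissingMpioHardwareId -Desired $desired -Present $present
```

On a host with the Multipath I/O feature installed, the same function could be called as Get-MissingMpioHardwareId -Desired $scvmmMpioHardwareIdArray -Present (Get-MSDSMSupportedHW) to list what would be added.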

Looking at a debug trace of the Add-SCVMHost job execution where the PowerShell script has been 'pre-emptively' run on the host, we see the following:

[Microsoft-VirtualMachineManager-Debug]4,4,WsmanAPIWrapper.cs,2085,WinRM: URL: [http://contoso-hyp3.contoso.com:5985], Verb:[ENUMERATE], Resource: [http://schemas.microsoft.com/wbem/wsman/1/wmi/root/microsoft/windows/storage/MSFT_MSDSMSupportedHW], Filter: [],{00000000-0000-0000-0000-000000000000}

[Microsoft-VirtualMachineManager-Debug]9,4,AddHyperVHostSubtask.cs,561,Mpio hardware VendorId MSFT2005 ProductId iSCSIBusType_0x9 already present on host contoso-hyp3.contoso.com, skipping addition,{00000000-0000-0000-0000-000000000000}

[Microsoft-VirtualMachineManager-Debug]9,4,AddHyperVHostSubtask.cs,561,Mpio hardware VendorId MSFT2011 ProductId SASBusType_0xA already present on host contoso-hyp3.contoso.com, skipping addition,{00000000-0000-0000-0000-000000000000}

[Microsoft-VirtualMachineManager-Debug]9,4,AddHyperVHostSubtask.cs,561,Mpio hardware VendorId NETAPP ProductId LUN already present on host contoso-hyp3.contoso.com, skipping addition,{00000000-0000-0000-0000-000000000000}

[Microsoft-VirtualMachineManager-Debug]9,4,AddHyperVHostSubtask.cs,561,Mpio hardware VendorId NETAPP ProductId LUN C-Mode already present on host contoso-hyp3.contoso.com, skipping addition,{00000000-0000-0000-0000-000000000000}

[Microsoft-VirtualMachineManager-Debug]9,4,AddHyperVHostSubtask.cs,561,Mpio hardware VendorId EMC ProductId SYMMETRIX already present on host contoso-hyp3.contoso.com, skipping addition,{00000000-0000-0000-0000-000000000000}
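
To compare two traces, the device entries can be pulled out of the text with a regular expression. The sketch below is one way to do that, assuming the lines keep the wording shown in the excerpts in this post ("addition successful" or "already present"). The function name Get-MpioTraceEntry is hypothetical, and the pattern is not guaranteed to match trace lines from other VMM builds.

```powershell
# Hypothetical helper: extract MPIO hardware ID entries from VMM debug trace lines.
function Get-MpioTraceEntry {
    param([string[]] $Line)

    # The lazy product match allows product IDs that contain spaces, such as 'LUN C-Mode'.
    $pattern = 'Mpio hardware VendorId (?<Vendor>\S+) ProductId (?<Product>.+?) ' +
               '(?<Action>addition successful|already present) on host (?<HostName>[^,\s]+)'

    foreach ($text in $Line) {
        if ($text -match $pattern) {
            [pscustomobject]@{
                VendorId  = $Matches['Vendor']
                ProductId = $Matches['Product']
                Action    = $Matches['Action']
                HostName  = $Matches['HostName']
            }
        }
    }
}

# Two lines copied from the trace excerpts in this post.
$sample = @(
    '[Microsoft-VirtualMachineManager-Debug]9,2,AddHyperVHostSubtask.cs,544,Mpio hardware VendorId MSFT2005 ProductId iSCSIBusType_0x9 addition successful on host contoso-hyp3.contoso.com,{00000000-0000-0000-0000-000000000000}'
    '[Microsoft-VirtualMachineManager-Debug]9,4,AddHyperVHostSubtask.cs,561,Mpio hardware VendorId NETAPP ProductId LUN C-Mode already present on host contoso-hyp3.contoso.com, skipping addition,{00000000-0000-0000-0000-000000000000}'
)

Get-MpioTraceEntry -Line $sample | Format-Table -AutoSize
```

Any entry whose Action is "addition successful" marks a device that VMM added itself, so a trace with only "already present" entries suggests the pre-population script covered the whole list.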

While we are unable to provide any support for the script itself, I hope you will find this useful.

Thanks, and come back again soon!

Chuck Timon | Senior Support Escalation Engineer | Microsoft Enterprise Platforms Support

