Windows Server 2012 SMB Multichannel and CSV Redirected traffic caveats

Cluster Shared Volumes (CSV) were introduced with Windows Server 2008 R2 to enable simultaneous access to the same LUN from several Hyper-V hosts. They use a common namespace called ClusterStorage and let you store several VMs on one LUN, providing high availability and Live Migration without moving the LUN from one host to another.

In addition to simplifying and centralising VM management, CSV added greater fault tolerance: if a node loses direct connectivity to the SAN, its access to the LUN can be redirected over the Cluster Networks to the Coordinator node. For example, VMs hosted on a Hyper-V server that loses connectivity to storage keep running, because disk access is redirected over the cluster networks until connectivity to the SAN is restored.

This article explains how Windows Server 2012 Failover Clustering handles the CSV networks together with SMB Multichannel. Before moving forward, let’s review how this works in Windows Server 2008 R2.

The old way. Windows Server 2008 R2 Cluster Networks roles and metrics

In Windows Server 2008 R2 we can define which Cluster Network is used for CSV redirected traffic through the network metric, which is derived from the role assigned to each network. CSV redirected traffic uses the network with the lowest metric. There are three types of Cluster Networks, based on their role:

· External Network

The external (public) network is the one that clients connect to in order to reach the highly available resources on the cluster. It is connected to the main network infrastructure in your organization. The cluster selects it automatically when it is built, by determining which network has a default gateway.

· Internal Network

All networks enabled for cluster use carry intra-cluster communication. You can fine-tune this so that intra-cluster communication prefers a specific network and falls back to another network if required. The internal (private) network should also be used for management tasks, so that no additional overhead is added to the public, storage, and Live Migration networks.

· Excluded Cluster Networks

Any network that we want to isolate from the Cluster and keep out of intra-cluster communication must be excluded. Storage networks, such as iSCSI or the SMB storage traffic introduced in Windows Server 2012, are the typical networks to exclude.

The following table shows the types of roles that exist in Windows Server 2008 R2 and the value of the metric assigned to each of these roles.

Type | Role | Metric | Default Gateway

External Network | 3 | 10000 | Yes

Internal Network | 1 | 1000 | No

Excluded Cluster Network | 0 | 10100 | No

Comments:

· External Network: by default, external cluster networks get metric values starting at 10000 and incrementing by 100. The first external network that the cluster sees when it first comes online has a metric of 10000, the second has a metric of 10100, and so on.

· Internal Network: by default, internal cluster networks get metric values starting at 1000 and incrementing by 100. The first internal network that the cluster sees when it first comes online has a metric of 1000, the second has a metric of 1100, and so on.

· Excluded Cluster Network: by default, an excluded network gets a metric 100 higher than the highest External Network metric. If the highest external network has a metric of 10100, the first excluded network will have a value of 10200.

We will use an example to show how the Cluster assigns network metrics depending on the role that each network plays in Windows Server 2008 R2.

In the screenshot below, we have a Cluster with three networks (Mgmt, Cluster, and LiveMigration) whose metrics have been assigned automatically.

· Network Mgmt is external and the Cluster automatically assigns the 10000 metric.

· The Cluster network is the first detected internal network and is automatically assigned the 1000 metric.

· The second internal network, LiveMigration, is assigned the metric 1100 (1000 + 100).

This configuration ensures that CSV redirected traffic always uses the “Cluster” network, because it has the lowest metric.

[Screenshot: automatically assigned Cluster Network metrics in Windows Server 2008 R2]
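The default assignment described above can be modelled in a few lines of Python. This is only an illustrative sketch, not cluster code: the `ClusterNetwork` class and both function names are made up for this example, and two details are assumptions the article does not state (that additional excluded networks also increment by 100, and that an excluded network gets 10100 when no external network exists).

```python
from dataclasses import dataclass

# Role values as listed in the table: 3 = external, 1 = internal, 0 = excluded.
EXTERNAL, INTERNAL, EXCLUDED = 3, 1, 0


@dataclass
class ClusterNetwork:  # hypothetical helper type, for illustration only
    name: str
    role: int
    metric: int = 0


def assign_default_metrics(networks):
    """Assign 2008 R2 style default metrics, in detection (list) order."""
    next_internal, next_external = 1000, 10000
    for net in networks:
        if net.role == INTERNAL:
            net.metric, next_internal = next_internal, next_internal + 100
        elif net.role == EXTERNAL:
            net.metric, next_external = next_external, next_external + 100
    # Excluded networks sit 100 above the highest external metric.
    # Assumption: with no external network we start from 10000 anyway.
    highest_external = max(
        (n.metric for n in networks if n.role == EXTERNAL), default=10000
    )
    next_excluded = highest_external + 100
    for net in networks:
        if net.role == EXCLUDED:
            # Assumption: further excluded networks also step by 100.
            net.metric, next_excluded = next_excluded, next_excluded + 100
    return networks


def csv_redirect_network(networks):
    """Return the non-excluded network with the lowest metric."""
    candidates = [n for n in networks if n.role != EXCLUDED]
    return min(candidates, key=lambda n: n.metric)


nets = assign_default_metrics([
    ClusterNetwork("Mgmt", EXTERNAL),
    ClusterNetwork("Cluster", INTERNAL),
    ClusterNetwork("LiveMigration", INTERNAL),
])
print({n.name: n.metric for n in nets})
print(csv_redirect_network(nets).name)
```

With the three networks from the example, the sketch gives Mgmt 10000, Cluster 1000 and LiveMigration 1100, and picks “Cluster” for CSV redirected traffic.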

Although the Cluster assigns suitable metrics based on role, as the previous example shows, in environments with multiple Cluster Networks we may need to configure these values manually. For this, Windows Server 2008 R2 provides the Failover Clustering PowerShell module. The following example changes the metric of the “Cluster” network from its default of 1000 to an even lower value, which ensures that the other internal Cluster networks always have higher values. You can find more detailed information at https://blogs.msdn.com/b/clustering/archive/2011/06/17/10176338.aspx

[Screenshot: PowerShell example that sets the “Cluster” network metric manually]

The new way. Windows Server 2012 Cluster Network roles and Metrics

In Windows Server 2012 Failover Clusters, the philosophy of metrics and Cluster network roles is maintained: we still have external, internal, and excluded networks. However, the default metric values change substantially, for better integration with the new SMB Multichannel functionality. In Windows Server 2008 R2, CSV traffic could be redirected through a single physical network adapter only; in Windows Server 2012, it can be redirected through more than one network adapter simultaneously, taking advantage of SMB Multichannel.

However, if restrictions in our environment require that only the network with the lowest metric is used, we must understand how the Cluster in Windows Server 2012 decides which network carries CSV redirected traffic.

The following table shows how Windows Server 2012 automatically assigns metrics to the Cluster networks, based not only on the role but also on the capabilities of the physical network adapter.

Type | Role | Metric | Default Gateway | If RDMA capable | If RSS capable | NetFT link speed (at least 1 Gbps)

External Network | 3 | Starting at 80000 | Yes | -19000 | -9600 | -(16 * adapter link speed in Gbps)

Internal Network | 1 | Starting at 40000 | No | -19000 | -9600 | -(16 * adapter link speed in Gbps)

Excluded Cluster Network | 0 | Starting at 80000 | No | -19000 | -9600 | -(16 * adapter link speed in Gbps)

Comments (these apply to all three types):

· The RSS value is not subtracted from the metric if the adapter is RDMA capable.

· If a second or later network ends up with the same value as a network already assigned, the cluster increases the final metric by one for each of these additional networks.

As with Windows Server 2008 R2, let’s look at a real example in Windows Server 2012 to better understand how this formula applies to our environment.

In the screenshot below, we have a Cluster with five networks (Contoso_Mgmt, Contoso_Cluster, iSCSI, Live Migration and Slow) and the metrics that were assigned automatically when the cluster was created.

· Network “Contoso_Mgmt” is external and the Cluster automatically assigns it the metric 70240.

o This is because the physical adapter is not RDMA capable but is RSS capable, with a 10 Gbps link speed (80000 - 9600 - 160 = 70240).

· The “Live Migration” network was the first internal network detected when the cluster was created and has the autometric 30240.

o This is because the physical adapter is not RDMA capable but is RSS capable, with a 10 Gbps link speed (40000 - 9600 - 160 = 30240).

· The second internal network, “Contoso_Cluster”, has the autometric 30241 (30240 + 1).

o This is because the physical adapter is not RDMA capable but is RSS capable, with a 10 Gbps link speed, and the value 30240 is already taken (40000 - 9600 - 160 + 1 = 30241).

· The third internal network, “Slow”, has the autometric 40000.

o This is because the physical adapter is neither RDMA nor RSS capable and its link speed is 100 Mbps. The NetFT link speed value is not subtracted because the adapter must be at least 1 Gbps.

· The iSCSI network is automatically detected and excluded from the Cluster Networks. Its metric (70241) is the highest, and the network is not used for Cluster communications.

o This is because the physical adapter is not RDMA capable but is RSS capable, with a 10 Gbps link speed, and the value 70240 is already taken (80000 - 9600 - 160 + 1 = 70241).

[Screenshot: automatically assigned Cluster Network metrics in Windows Server 2012]
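To check the arithmetic, the autometric formula from the table can be written as a short Python sketch. The function names and the tuple layout are invented for this illustration. The tie-break (add one while the value is already taken by any earlier network, regardless of role) is an assumption inferred from the iSCSI and Contoso_Cluster values in the example, and the detection order of the networks is assumed as well.

```python
# Starting metric per role, taken from the table (3 = external, 1 = internal, 0 = excluded).
BASE_METRIC = {3: 80000, 1: 40000, 0: 80000}


def auto_metric(role, link_gbps, rdma=False, rss=False):
    """Metric before tie-breaking, following the table's subtraction rules."""
    metric = BASE_METRIC[role]
    if rdma:
        metric -= 19000          # RDMA capable
    elif rss:
        metric -= 9600           # RSS is only subtracted when the adapter is not RDMA capable
    if link_gbps >= 1:
        metric -= int(16 * link_gbps)  # NetFT link speed term, needs at least 1 Gbps
    return metric


def assign_metrics(networks):
    """networks: list of (name, role, link_gbps, rdma, rss) in detection order."""
    used, result = set(), {}
    for name, role, gbps, rdma, rss in networks:
        metric = auto_metric(role, gbps, rdma, rss)
        while metric in used:    # assumption: bump by one until the value is free
            metric += 1
        used.add(metric)
        result[name] = metric
    return result


example = assign_metrics([
    ("Contoso_Mgmt",    3, 10,  False, True),
    ("Live Migration",  1, 10,  False, True),
    ("Contoso_Cluster", 1, 10,  False, True),
    ("Slow",            1, 0.1, False, False),
    ("iSCSI",           0, 10,  False, True),
])
for name, metric in example.items():
    print(name, metric)
```

Run against the five networks above, the sketch reproduces the values in the list: 70240, 30240, 30241, 40000 and 70241.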

In the above configuration, the “Live Migration” Cluster network has the lowest metric. This may not be what we want in our environment, so let’s change the metrics manually and assign the lowest metric to the “Contoso_Cluster” Cluster Network.

[Screenshot: Cluster Network metrics after manually assigning the lowest metric to “Contoso_Cluster”]

How SMB Multichannel changes the selection of the CSV Cluster Network

Now the “Contoso_Cluster” Cluster Network has the lowest metric, but does this mean that it will be the only adapter used for CSV redirected traffic? The answer is NO! The metrics are not decisive in Windows Server 2012 by default, because SMB Multichannel is enabled by default. So let’s work out which subnets or Cluster Networks will be used in this particular case by applying these rules:

· Rule 1: SMB Multichannel takes precedence over the NetFT network priorities when deciding which subnets to use for CSV redirected traffic.

· Rule 2: By default, the cluster uses only Internal Cluster Networks for SMB Multichannel. This behavior can be changed to also use the External Networks by modifying the UseClientAccessNetworksForSharedVolumes Cluster parameter.

· Rule 3: SMB Multichannel requires adapters with identical link speed and features (RSS and/or RDMA) in order to stream CSV redirected traffic over different subnets simultaneously.

· Rule 4: If the adapters are not identical, SMB Multichannel uses only the faster adapter or adapters to stream CSV redirected traffic.

· Rule 5: The Failover Cluster falls back to NetFT for the decision of which subnet to use only if SMB Multichannel is not available or is disabled. In that case the lowest-metric logic applies and CSV redirected traffic is sent over the subnet with the lowest metric.

Reading rules 1 and 2, we see that SMB Multichannel is in effect, because it is enabled by default in Windows Server 2012, and that three network adapters are candidates for CSV redirected traffic: the internal adapters with role 1. Continuing with rules 3 and 4, the “Slow” network has a lower link speed (100 Mbps) than the “Contoso_Cluster” and “Live Migration” networks (10 Gbps each), which explains why these two Cluster Networks are the ones used for CSV redirected traffic.
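The five rules can be condensed into a small Python sketch that models the decision; it does not query a real cluster. All names here (`Net`, `csv_redirect_networks`, the `multichannel` and `use_client_access` flags) are hypothetical. Ranking adapters by link speed first and then by RDMA and RSS is a simplifying assumption, and the metric 30000 for “Contoso_Cluster” is a placeholder for the manually assigned value, which the article only shows in a screenshot.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Net:  # hypothetical model of a Cluster Network and its adapter
    name: str
    role: int            # 3 = external, 1 = internal, 0 = excluded
    metric: int
    link_gbps: float
    rdma: bool = False
    rss: bool = False


def csv_redirect_networks(nets, multichannel=True, use_client_access=False):
    """Return the names of the networks used for CSV redirected traffic."""
    if not multichannel:
        # Rule 5: NetFT decides, and the lowest metric wins.
        usable = [n for n in nets if n.role != 0]
        return [min(usable, key=lambda n: n.metric).name]
    # Rules 1 and 2: SMB Multichannel decides, on internal networks by default.
    roles = {1, 3} if use_client_access else {1}
    candidates = [n for n in nets if n.role in roles]

    def rank(n):
        # Assumption: speed first, then RDMA, then RSS.
        return (n.link_gbps, n.rdma, n.rss)

    # Rules 3 and 4: only the fastest, identical adapters are used together.
    best = max(rank(n) for n in candidates)
    return [n.name for n in candidates if rank(n) == best]


nets = [
    Net("Contoso_Mgmt", 3, 70240, 10, rss=True),
    Net("Live Migration", 1, 30240, 10, rss=True),
    Net("Contoso_Cluster", 1, 30000, 10, rss=True),  # placeholder manual metric
    Net("Slow", 1, 40000, 0.1),
    Net("iSCSI", 0, 70241, 10, rss=True),
]
print(csv_redirect_networks(nets))
print(csv_redirect_networks(nets, multichannel=False))
```

With SMB Multichannel enabled, the sketch selects “Live Migration” and “Contoso_Cluster”; with it disabled, only the lowest-metric network, “Contoso_Cluster”, remains.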

In some scenarios we may need to force CSV redirected traffic onto one particular subnet and avoid SMB Multichannel, although this should be a last resort. To achieve that, we need to disable SMB Multichannel with the command “Set-SmbClientConfiguration -EnableMultichannel $false” and reboot the server (note that this must be applied on all cluster nodes). After this change, the Cluster uses the NetFT network priorities and sends CSV redirected traffic only over the subnet with the lowest metric, which we configured manually earlier.