Deploying Windows Server 2012 with SMB Direct (SMB over RDMA) and the Mellanox ConnectX-2/ConnectX-3 using InfiniBand – Step by Step

1) Introduction

We have covered the basics of SMB Direct and some of the use cases in previous blog posts and TechNet articles. You can find them at https://smb3.info.

However, I get a lot of questions about specifically which cards work with this new feature and how exactly you set those up. This is one in a series of blog posts that cover specific instructions for RDMA NICs. In this specific post, we’ll cover all the details to deploy the Mellanox ConnectX-2 and ConnectX-3 adapters, using the InfiniBand “flavor” of RDMA.

 

2) Hardware and Software

To implement and test this technology, you will need:

  • Two or more computers running Windows Server 2012
  • One or more Mellanox ConnectX-2 or ConnectX-3 adapters for each server
  • One or more Mellanox InfiniBand switches
  • Two or more cables required for InfiniBand (typically using QSFP connectors)

Mellanox states support for Windows Server 2012 SMB Direct and Kernel-mode RDMA capabilities on the following adapter models:

  • Mellanox ConnectX-2. This card uses Quad Data Rate (QDR) InfiniBand at 32 Gbps data rate.
  • Mellanox ConnectX-3. This card uses Fourteen Data Rate (FDR) InfiniBand at 54 Gbps data rate.

You can find more information about these adapters on Mellanox’s web site.

Important note: The older Mellanox InfiniBand adapters (including the first generation of ConnectX adapters and the InfiniHost III adapters), won't work with SMB Direct in Windows Server 2012.

There are many options in terms of adapters, cables and switches.  At the Mellanox web site you can find more information about these InfiniBand adapters (https://www.mellanox.com/content/pages.php?pg=infiniband_cards_overview&menu_section=41) and InfiniBand switches (https://www.mellanox.com/content/pages.php?pg=switch_systems_overview&menu_section=49). Here are some examples of configurations you can use to try the Windows Server 2012:

2.1) Two computers using QDR

If you want to setup a simple pair of computers to test SMB Direct, you simply need two InfiniBand cards and a back-to-back cable. This could be used for simple testing like one file server and one Hyper-V server. If you want the most affordable InfiniBand solution, you can use a single-port QDR card, which operates at 32Gbps data rate. Here are the parts you will need:

  • 2 x ConnectX-2, Single port, QSFP connector, QDR InfiniBand (part # MHQH19B-XTR)
  • 1 x QSFP to QSFP cables, 1m (part # MC2206130-001)

2.2) Eight computers using QDR

If you want to try a more realistic configuration with InfiniBand, you could setup a two-node file server cluster connected to a six-node Hyper-V cluster. In this setup, you will need 8 computers, each with an InfiniBand card. You will also need a switch with at least 8 ports (Mellanox offers an 8-port model). Using QDR speeds, you’ll need the following parts:

  • 8 x ConnectX-2, Single port, QSFP connector, QDR InfiniBand (part # MHQH19B-XTR)
  • 8 x QSFP to QSFP cables, 1m (part # MC2206130-001)
  • 1 x IS5022 InfiniBand Switch, 8 ports, QSFP, QDR (part # MIS5022Q-1BFR)

2.3) Two computers using FDR

You may also try the faster FDR speeds (54Gbps data rate). The minimum setup in this case would again be two cards and a cable. Please note that the QDR and FDR cables are different, although they use similar connectors. Here’s what you will need:

  • 2 x ConnectX-3 adapter, Single port, QSFP, FDR InfiniBand (part # MCX353A-FCBT)
  • 1 x QSFP to QSFP cables (FDR), 1m (part # MC2207130-001)

Please note that you will need a system with PCIe Gen3 slots to achieve the rated speed in this card. These slots are available on newer system like the ones equipped with an Intel Romley motherboard. If you use an older system, the card will be limited by the speed of the older PCIe Gen2 bus.

2.4) Ten computers using dual FDR cards

If you’re interested in experience great throughput in a private cloud setup, you could configure a two-node file server cluster plus an eight-node Hyper-V cluster. You could also use two InfiniBand cards for each system, for added performance and fault tolerance. In this setup, you would need 20 FDR cards and a 20-port FDR switch (Mellanox sells a model with 36 FDR ports). Here are the parts required:

  • 20 x ConnectX-3 adapter, Single port, QSFP, FDR InfiniBand (part # MCX353A-FCBT)
  • 20 x QSFP to QSFP cables (FDR), 1m (part # MC2207130-001)
  • 1 x SX6036 InfiniBand Switch, 36 ports, QSFP, FDR 

3) Download and update the drivers

Windows Server 2012 includes an inbox driver for the Mellanox ConnectX-2 and ConnectX-3 cards. However, Mellanox provides updated firmware and drivers for download. You should be able to use the inbox driver to access the Internet to download the updated driver.

The latest Mellanox drivers for Windows Server 2012 can be downloaded from the Windows Server 2012 tab on this page on the Mellanox web site: https://www.mellanox.com/content/pages.php?pg=products_dyn&product_family=32&menu_section=34.

The package is provided to you as a single executable file. Simply run the EXE file to update the firmware and driver. This package will also install Mellanox tools on the server. Please note that this package is different from the Windows Server 2012 Beta package. Make sure you grab the latest version.

After the download, simply run the executable file and choose one of the installation options (complete or custom). The installer will automatically detect if you have at least one card with an old firmware, offering to update it. You should always update to the latest firmware provided.

Note 1: This package does not update firmware for OEM cards. If you using this type of card, contact your OEM for an update.

Note 2: Certain Intel Romley systems won't boot Windows Server 2012 when an old Mellanox firmware is present. It might be required for you to update the firmware of the Mellanox card using another system before you can use that Mellanox card on the Intel Romley system. That issue might also be addressed in certain cases by updating the firmware/BIOS of the Intel Romley system.

 

4) Configure a subnet manager

When using an InfiniBand network, you are required to have a subnet manager running. The best option is to use a managed InfiniBand switch (which runs a subnet manager), but you can also install a subnet manager on a computer connected to an unmanaged switch. Here are some details:

4.1) Best option – Using a managed switches with a built-in subnet manager

For this option, make sure you use managed switches. These switches come ready to run their own subnet manager and all you have to do is enable that option using the switch’s web interface.

clip_image001

 

4.2) Using OpenSM with a single unmanaged switch

If you don’t have a managed switch, you can use one of the computers running Windows Server 2012 to run your subnet manager. When you installed the Mellanox tools on step 3, you also installed the OpenSM.EXE tool, which is a subnet manager that runs on Windows Server. You want to make sure you install it as an auto-starting service.

Although the installation program configures OpenSM to run as a service, it misses the parameter to limit the log size. Here are a few commands to remove the default service and add a new one that has all the right parameters and starts automatically. Run them from a PowerShell prompt running as Administrator:

SC.EXE delete OpenSM
New-Service –Name "OpenSM" –BinaryPathName "`"C:\Program Files\Mellanox\MLNX_VPI\IB\Tools\opensm.exe`" --service -L 128" -DisplayName "OpenSM" –Description "OpenSM" -StartupType Automatic
Start-Service OpenSM

Note 1: This assumes that you installed the tools to the default location: C:\Program Files\Mellanox\MLNX_VPI

Note 2: For fault tolerance, make sure you have two computers on your network configured to run OpenSM. It is not recommended to run OpenSM in more than two computers connected to a switch.

4.3) Using OpenSM with two unmanaged switches

For complete fault tolerance, you want to have two switches and have two cards (or a dual-ported card) per computer, one going to each switch. With SMB Multichannel, you get fault tolerance in case a single card, cable or switch has a problem. However, each instance of OpenSM can only handle a single switch. In this case, you need two instances of OpenSM.EXE running on the computer, one for each card, working as a subnet manager for each of the two unmanaged switches.

In order to identify the two ports you have on the system (either on a single dual-ported card or in two single-ported cards). To do this, you need to run the IBSTAT tool from Mellanox, which will show you the identification for each InfiniBand port in your system (look for a line showing the port GUID). Here’s a sample with the two port GUIDs highlighted:

PS C:\> ibstat
CA 'ibv_device0'
        CA type:
        Number of ports: 2
        Firmware version: 0x20009209e
        Hardware version: 0xb0
        Node GUID: 0x0002c903000f9956
        System image GUID: 0x0002c903000f9959

        Port 1:
                State: Active
                Physical state: LinkUp
                Rate: 40
                Base lid: 1
                LMC: 0
                SM lid: 1
                Capability mask: 0x90580000
                Port GUID: 0x0002c903000f9957

        Port 2:
                State: Down
                Physical state: Polling
                Rate: 70
                Base lid: 0
                LMC: 0
                SM lid: 0
                Capability mask: 0x90580000
                Port GUID: 0x0002c903000f9958

Once you have identified the two port GUIDs, you can run the following commands from a PowerShell prompt running as Administrator:

SC.EXE delete OpenSM
New-Service –Name "OpenSM1" –BinaryPathName "`"C:\Program Files\Mellanox\MLNX_VPI\IB\Tools\opensm.exe`" --service -g 0x0002c903000f9957 -L 128" -DisplayName "OpenSM1" –Description "OpenSM for the first IB subnet" -StartupType Automatic
New-Service –Name "OpenSM2" –BinaryPathName "`"C:\Program Files\Mellanox\MLNX_VPI\IB\Tools\opensm.exe`" --service -g 0x0002c903000f9958 -L 128" -DisplayName "OpenSM2" –Description "OpenSM for the second IB subnet" -StartupType Automatic
Start-Service OpenSM1
Start-Service OpenSM2

Note 1: This assumes that you installed the tools to the default location: C:\Program Files\Mellanox\MLNX_VPI

Note 2: For fault tolerance, make sure you have two computers on your network, both configured to run two instances of OpenSM. It is not recommended to run OpenSM in more than two computers connected to a switch.

 

5) Configure IP Addresses

After you have the drivers in place, you should configure the IP address for your NIC. If you’re using DHCP, that should happen automatically, so just skip to the next step.

For those doing manual configuration, assign an IP address to your interface using either the GUI or something similar to the PowerShell below. This assumes that the interface is called RDMA1, that you’re assigning the IP address 192.168.1.10 to the interface and that your DNS server is at 192.168.1.2.

Set-NetIPInterface -InterfaceAlias RDMA1 -DHCP Disabled
Remove-NetIPAddress -InterfaceAlias RDMA1 -AddressFamily IPv4 -Confirm:$false
New-NetIPAddress -InterfaceAlias RDMA1 -IPAddress 192.168.1.10 -PrefixLength 24 -Type Unicast
Set-DnsClientServerAddress -InterfaceAlias RDMA1 -ServerAddresses 192.168.1.2

6) Verify everything is working

Follow the steps below to confirm everything is working as expected:

6.1) Verify network adapter configuration

Use the following PowerShell cmdlets to verify Network Direct is globally enabled and that you have NICs with the RDMA capability. Run on both the SMB server and the SMB client.

Get-NetOffloadGlobalSetting | Select NetworkDirect
Get-NetAdapterRDMA
Get-NetAdapterHardwareInfo

6.2) Verify SMB configuration

Use the following PowerShell cmdlets to make sure SMB Multichannel is enabled, confirm the NICs are being properly recognized by SMB and that their RDMA capability is being properly identified.

On the SMB client, run the following PowerShell cmdlets:

Get-SmbClientConfiguration | Select EnableMultichannel
Get-SmbClientNetworkInterface

On the SMB server, run the following PowerShell cmdlets:

Get-SmbServerConfiguration | Select EnableMultichannel
Get-SmbServerNetworkInterface
netstat.exe -xan | ? {$_ -match "445"}

Note: The NETSTAT command confirms if the File Server is listening on the RDMA interfaces.

6.3) Verify the SMB connection

On the SMB client, start a long-running file copy to create a lasting session with the SMB Server. While the copy is ongoing, open a PowerShell window and run the following cmdlets to verify the connection is using the right SMB dialect and that SMB Direct is working:

Get-SmbConnection
Get-SmbMultichannelConnection
netstat.exe -xan | ? {$_ -match "445"}

Note: If you have no activity while you run the commands above, it’s possible you get an empty list. This is likely because your session has expired and there are no current connections.

7) Review Performance Counters

There are several performance counters that you can use to verify that the RDMA interfaces are being used and that the SMB Direct connections are being established. You can also use the regular SMB Server and and SMB Client performance counters to verify the performance of SMB, including IOPs (data requests per second), Latency (average seconds per request) and Throughput (data bytes per second). Here's a short list of the relevant performance counters.

On the SMB Client, watch for the following performance counters:

  • RDMA Activity - One instance per RDMA interface
  • SMB Direct Connection - One instance per SMB Direct connection
  • SMB Client Shares - One instance per SMB share the client is currently using

On the SMB Server, watch for the following performance counters:

  • RDMA Activity - One instance per RDMA interface
  • SMB Direct Connection - One instance per SMB Direct connection
  • SMB Server Shares - One instance per SMB share the server is currently sharing
  • SMB Server Session - One instance per client SMB session established with the server

8) Review the connection log details (optional)

SMB 3.0 now offers a “Object State Diagnostic” event log that can be used to troubleshoot Multichannel (and therefore RDMA) connections. Keep  in mind that this is a debug log, so it’s very verbose and requires a special procedure for gathering the events. You can follow the steps below:

First, enable the log in Event Viewer:

  • Open Event Viewer
  • On the menu, select “View” then “Show Analytic and Debug Logs”
  • Expand the tree on the left: Applications and Services Log, Microsoft, Windows, SMB Client, ObjectStateDiagnostic
  • On the “Actions” pane on the right, select “Enable Log”
  • Click OK to confirm the action.

After the log is enabled, perform the operation that requires an RDMA connection. For instance, copy a file or run a specific operation.
If  you’re using mapped drives, be sure to map them after you enable the log, or else the connection events won’t be properly captured.

Next, disable the log in Event Viewer:

  • In Event Viewer, make sure you select Applications and Services Log, Microsoft, Windows, SMB Client, ObjectStateDiagnostic
  • On the “Actions” page on the right, “Disable Log”

Finally, review the events on the log in Event Viewer. You can filter the log to include only the SMB events that confirm that you have an SMB Direct connection or only error events.

The “Smb_MultiChannel” keyword will filter for connection, disconnection and error events related to SMB. You can also filter by event numbers 30700 to 30706.

  • Click on the “ObjectStateDiagnostic” item on the tree on the left.
  • On the “Actions” pane on the right, select “Filter Current Log…”
  • Select the appropriate filters

You can also use a PowerShell window and run the following cmdlets to view the events. If there are any RDMA-related connection errors, you can use the following:

Get-WinEvent -LogName Microsoft-Windows-SMBClient/ObjectStateDiagnostic -Oldest |? Message -match "RDMA"

9) Conclusion

I hope this helps you with your testing of the Mellanox InfiniBand adapters. I wanted to covered all different angles to make sure you don’t miss any relevant steps. I also wanted to have enough troubleshooting guidance here to get you covered for any known issues. Let us know how was your experience by posting a comment.