Infiniband on HPC Server 2008 - again

After I published a document about the installation of Mellanox Infiniband on Server 2008, I received some good feedback that is worth sharing. Note that:

The WinIB 1.4 beta available for download today on Mellanox's web site does not work with the HPC Server 2008 March CTP. We are working with Mellanox to fix that.

The procedure I illustrated in that document uses a "trick": deploy the Mellanox package with msiexec first, then add the Infiniband network to the cluster configuration afterwards. This works with both our deployment tools and third-party ones. However, you can make better use of the built-in HPC Server 2008 tools. Here's how:

1. Install the Mellanox WinIB package (currently WinIB_x86_1_4_0_2094.msi) on the head node. Set up the cluster network configuration to include Infiniband as the MPI network.

2. Create an OS image and a deployment template for the compute nodes.

3. In the Admin Console, right-click the image and select Manage Drivers. You need three drivers for the card to be visible in the Admin Console, and hence configurable:

  • ib_bus.inf: Mellanox InfiniBand Fabric driver
  • mthca.inf: InfiniBand Host Channel Adapter driver
  • netipoib.inf: Mellanox IP over Infiniband protocol driver

You will find the first two files in C:\Program Files\Mellanox\WinIB\Drivers on the head node after installing the WinIB package. The last one is in C:\Program Files\Mellanox\WinIB\IPoIB.

4. Copy the WinIB_x86_1_4_0_2094.msi package to %CCP_DATA%\InstallShare on the head node.

5. Edit the compute node deployment template and add an Installation->Unicast Copy task from z:\<winib package>.msi to c:\<winib package>.msi; then move the copy task before the "Install CCP" task.

6. In the same template, add an Installation->Execute OS command task that runs: msiexec /i c:\<winib package>.msi /qn ADDLOCAL=ALL

7. Deploy the compute nodes.

8. When they are deployed, you can start the Network Direct provider with: clusrun /nodes:<list of nodes> "%WinIB_HOME%\IPoIB\NDI\ndinstall.exe" -i
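
If you want to avoid typos when filling in the package name and node list, the strings used in steps 3, 6 and 8 can be generated rather than typed. What follows is only a minimal sketch in Python, not part of the HPC Server 2008 or Mellanox tooling. The helper names (driver_paths, install_command, ndinstall_command) and the node names are made up for illustration, and it assumes the default WinIB install folder and a comma-separated node list for clusrun.

```python
import ntpath  # Windows path rules, so the sketch behaves the same on any OS

# Default install location mentioned in step 3 (assumption: WinIB was
# installed to its default folder on the head node).
WINIB_ROOT = r"C:\Program Files\Mellanox\WinIB"

# Driver .inf files and the WinIB subfolder each one lives in (step 3).
DRIVERS = {
    "ib_bus.inf": "Drivers",
    "mthca.inf": "Drivers",
    "netipoib.inf": "IPoIB",
}


def driver_paths(root=WINIB_ROOT):
    """Return the full path of each .inf file needed under Manage Drivers."""
    return [ntpath.join(root, folder, name) for name, folder in DRIVERS.items()]


def install_command(package):
    """Build the command line for the Execute OS command task (step 6)."""
    return f"msiexec /i c:\\{package} /qn ADDLOCAL=ALL"


def ndinstall_command(nodes):
    """Build the clusrun command from step 8 for the given nodes.

    Assumption: clusrun takes the node list as comma-separated names.
    """
    node_list = ",".join(nodes)
    return f'clusrun /nodes:{node_list} "%WinIB_HOME%\\IPoIB\\NDI\\ndinstall.exe" -i'


if __name__ == "__main__":
    for path in driver_paths():
        print(path)
    print(install_command("WinIB_x86_1_4_0_2094.msi"))
    # NODE01 and NODE02 are hypothetical node names.
    print(ndinstall_command(["NODE01", "NODE02"]))
```

The script only prints strings; you still paste them into the Manage Drivers dialog, the deployment template, and a command prompt on the head node yourself.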