CentOS cluster in Azure - Part 1

I have been working on Azure Resource Manager (ARM) templates for a while. They are an interesting way to describe and deploy complex configurations in Azure. In particular, I have developed a template to deploy a CentOS 6.5 cluster. It will soon be published in the main Azure repository on GitHub, but if you are interested you can find it in my repository. Note that the template is set up to deploy from the main Azure repository once it has been pulled in. If you want to use it before then, you'll have to clone it and edit azuredeploy.json to change the location of deploy.sh.

This template deploys a cluster of 2 to 10 nodes using the ARM "copy" feature. Each node has two network cards:

  • The first one is on a "public" subnet.
  • The second one is on a "private" subnet.

The public subnet is fronted by a load balancer with one dynamically assigned public IP address and as many NAT ports as there are nodes, starting from port 50000. The template also asks for a DNS name and a location for the IP address. Thus, to connect to node 0, you'll type:

  • ssh <user>@<name>.<location>.cloudapp.azure.com -p 50000
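
To reach the other nodes, the same command applies with a different port. Here is a minimal Python sketch that builds the command for any node. It assumes the NAT ports are assigned sequentially, so that node i is reachable on port 50000 + i; the function name and the sample values are made up for illustration.

```python
def ssh_command(user, dns_name, location, node_index, base_port=50000):
    """Build the ssh command for a given node behind the load balancer.

    Assumption: node i is mapped to NAT port base_port + i.
    """
    if node_index < 0:
        raise ValueError("node_index must be non-negative")
    host = f"{dns_name}.{location}.cloudapp.azure.com"
    port = base_port + node_index
    return f"ssh {user}@{host} -p {port}"


# Hypothetical values, for illustration only.
print(ssh_command("azureuser", "mycluster", "westus", 3))
```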

The private network is intended for cluster communications only.

The template also configures:

  • A storage account where all the virtual hard disks are stored.
  • A virtual network where the private and public subnets reside.

The template invokes a custom bash script that configures the second network card on the CentOS nodes. By default the second card is recognized, but its network configuration is not set up.
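
To give an idea of what that configuration involves, here is a rough Python sketch that renders the kind of interface file CentOS 6 expects under /etc/sysconfig/network-scripts/. This is not the content of deploy.sh; the device name eth1 and the choice of DHCP are assumptions made for illustration.

```python
def render_ifcfg(device="eth1", bootproto="dhcp"):
    """Return the text of a minimal ifcfg-<device> file.

    Assumptions: the second card shows up as eth1 and gets its
    address from DHCP on the private subnet.
    """
    lines = [
        f"DEVICE={device}",
        "TYPE=Ethernet",
        "ONBOOT=yes",
        f"BOOTPROTO={bootproto}",
    ]
    return "\n".join(lines) + "\n"


# Print the file content instead of writing it, so the sketch is safe to run.
print(render_ifcfg())
```

On a real node, the content would be saved as /etc/sysconfig/network-scripts/ifcfg-eth1 and the interface brought up with ifup eth1.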

The limit on the number of nodes is artificial. Alas, ARM does not support arithmetic operators yet, so one has to list all the possible NAT ports explicitly in the load balancer configuration. I listed 10. Feel free to add more to increase the number of nodes. Also, if not all nodes require external connections, you can have as many nodes as the current subnet size allows.
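
If you do want to add more ports, typing the rules by hand gets tedious. One option is to generate them with a small script and paste the result into the template. The Python sketch below is only an illustration: the rule names and the frontEndIPConfigID variable are hypothetical, the backend port is assumed to be 22 (SSH), and the property names follow the load balancer inboundNatRules layout as I understand it, so check them against the actual azuredeploy.json.

```python
import json


def nat_rules(node_count, base_port=50000, backend_port=22,
              frontend_ip_id="[variables('frontEndIPConfigID')]"):
    """Build one inbound NAT rule per node: base_port + i -> backend_port.

    frontend_ip_id is a hypothetical template expression; replace it
    with whatever the template actually uses.
    """
    rules = []
    for i in range(node_count):
        rules.append({
            "name": f"SSH-VM{i}",  # hypothetical naming scheme
            "properties": {
                "frontendIPConfiguration": {"id": frontend_ip_id},
                "protocol": "tcp",
                "frontendPort": base_port + i,
                "backendPort": backend_port,
                "enableFloatingIP": False,
            },
        })
    return rules


# Emit JSON for 12 nodes, ready to be pasted into the template.
print(json.dumps(nat_rules(12), indent=2))
```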

In part 2 I will add:

  • Shared storage for all the nodes using Azure file shares.
  • A script to set up opengrid and MPI.