Getting data and applications to / from Azure

After Wenming’s post on deploying an Azure HPC cluster with PowerShell, I have been looking for an easy way to transfer applications and data to and from such a cluster.

I have found that in HPC SP3 the hpcpack command includes an option to upload a VHD and mount it as a drive on Azure nodes. The command is also available in the free client utilities. This is very useful because:

- You can put whatever you like in the VHD file, including the executable that you need to run.

- It is stored in blob storage, so it is persistent. If you delete the cluster deployment to save money, the data in the VHD remains.

- The VHD can be snapshotted, and the snapshots can be mounted read-only on multiple nodes.

- You can download the VHD back to your PC when you’re done and mount it there.

Here’s a summary of the process:

1. On your PC, create and attach a fixed-size VHD with the storage management tools (GUI or diskpart).

2. Copy your files to the drive.

3. Again with the storage management GUI or diskpart, detach the VHD.

4. Use hpcpack (or a GUI tool such as CloudBerry Explorer) to upload the VHD file to blob storage:

- hpcpack upload <file>.vhd /account:<storage account name> /key:<storage key> /container:<container name> /description:<description string>

5. Connect via RDP to the head node in Azure (or any other node where you want to mount the drive).

6. To mount read-write on one node, run:

- hpcpack mount <file>.vhd /account:<storage account name> /key:<storage key> /container:<container name>

7. To mount read-only, add the /snapshot parameter. You can also run the command on specific nodes via Job Manager or clusrun.
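Steps 1 to 3 can also be scripted. The following is a minimal diskpart sketch rather than a tested recipe: the path C:\vhd\hpcdata.vhd, the 1024 MB size, the volume label and the drive letter V are all placeholders chosen for illustration. Save the lines to a text file and run them from an elevated prompt with diskpart /s <script file>.

```
rem Create a fixed-size 1024 MB VHD (path and size are assumptions)
create vdisk file="C:\vhd\hpcdata.vhd" maximum=1024 type=fixed
rem Select and attach it so it shows up as a disk
select vdisk file="C:\vhd\hpcdata.vhd"
attach vdisk
rem Partition, format and assign a drive letter (label and letter are assumptions)
create partition primary
format fs=ntfs label="hpcdata" quick
assign letter=V
```

After copying your files to the new drive, a second short script detaches it again so that the file can be uploaded:

```
rem Detach the VHD before uploading it
select vdisk file="C:\vhd\hpcdata.vhd"
detach vdisk
```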

Once the drive is mounted, you can share it via the GUI or with net share. For that to work, the SMB endpoints and firewall exceptions must be configured on all nodes.
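As a rough sketch of the sharing step, assuming the VHD ended up mounted as drive V: on the node (the actual letter may differ) and using a hypothetical share name hpcdata, the commands could look like this. The netsh line enables the built-in "File and Printer Sharing" firewall rule group, which is one way to open SMB in the Windows firewall. It does not configure the SMB endpoints of the Azure deployment itself, which have to be set up separately.

```
rem Share the mounted drive read-only (share name and drive letter are assumptions)
net share hpcdata=V:\ /grant:Everyone,READ

rem Allow SMB traffic through the Windows firewall on this node
netsh advfirewall firewall set rule group="File and Printer Sharing" new enable=Yes

rem On another node, map the share (replace <node name> with the sharing node)
net use V: \\<node name>\hpcdata
```

For a drive mounted read-write, CHANGE or FULL can be granted instead of READ.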

When you’re done, you can use the same GUI tool or hpcpack to download the VHD file back to your PC. The download does not delete the file from blob storage.

- hpcpack download <file>.vhd /account:<storage account name> /key:<storage key> /container:<container name>

Mount the VHD on your PC and retrieve your results.
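If you prefer diskpart to the GUI for this last step, a minimal sketch follows. It assumes the file was downloaded to C:\vhd\hpcdata.vhd (a placeholder path) and attaches it read-only so that the results cannot be modified by accident.

```
rem Attach the downloaded VHD read-only (path is an assumption)
select vdisk file="C:\vhd\hpcdata.vhd"
attach vdisk readonly
```

Drop the readonly keyword if you want to keep working in the same VHD and upload it again later.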