Azure Batch - VNET and Custom Image Support for Virtual Machine Pools

Overview

This blog post is out-of-date as it is now possible to use custom images and VNETs using a regular 'Batch service' Batch account; a 'user subscription' Batch account is no longer required.  There is documentation available for how to use custom images and VNETs.

Two of the most frequently requested capabilities for Azure Batch are now available – the ability for the VMs in a VirtualMachine pool to be created from a custom image and the ability for the pool VMs to be part of a virtual network (VNET).

Copying and installing software, copying data, and configuring the OS when a standard VM image is used can be very time consuming, lengthening the time until the VM is ready to perform work. A custom image avoids the data copy and installation overhead – custom images are produced with everything installed and configured.  Custom images can be produced and used to create both Windows and Linux pool VMs.

Another common requirement is for pool VMs to communicate securely with a license server or file server, that is, with other VMs that are part of a virtual network. Cloud Service pool VMs have been able to join a VNET for a while; VirtualMachine pool VMs can now be associated with a VNET as well.

To use these two new features, a new Batch account must be created with two new account properties set: pool allocation mode and key vault reference. To date, when pools have been created, the VMs have been provisioned in system Batch service subscriptions. When the new pool allocation mode property is set to “User Subscription”, pool allocation behavior changes as follows:

  • Azure Batch will create virtual machine pool VMs directly in the subscription associated with the Batch account; one or more Virtual Machine Scale Sets (VMSSs) and associated storage accounts will be created and will be visible in the Azure portal, PowerShell, etc.
  • Virtual machine core quotas will be applied, not the Batch core quota.
  • The Batch client API must use Azure Active Directory authentication, which is now supported in addition to the existing account name and key authentication. Azure Batch support for Azure Active Directory is documented here.
  • It will not be possible to create Cloud Service pools.

When a Batch account has the pool allocation mode property set to “Batch Service”, the default, pool allocation is unchanged: Cloud Services or VMSSs are created in internal Batch subscriptions and are hidden, and it is not possible to use VNETs with virtual machine pools or to use custom images.

Creating “User Subscription” Batch Accounts

Subscription Access Control

The Azure Batch Service Principal must be given access to the subscription to allow Azure Batch to create resources in the subscription.

In the portal:

  • Select "Subscriptions" from the left-hand menu, then select the subscription which you will use to create the "User Subscription" Batch account.
  • Select the "Access Control (IAM)" menu item for the subscription.
  • Select the "Add" button, select "Contributor" for the role, select either "MicrosoftAzureBatch" or "Microsoft Azure Batch" as the member, then Save the change.

[Figure: Update Subscription for Azure Batch Account]

Create a Key Vault

The new Batch account requires a Key Vault that belongs to the same resource group as the Batch account.

If you have an existing Key Vault, then note the resource group and ensure it is in the same region as the new Batch account.  If you are creating a new Key Vault, then it is recommended you create it first, ahead of creating the Batch account:

  • Select "Key vaults" from the main portal menu, then select "Add".
  • Enter a name.
  • Pick an existing resource group or create a new one; note down the resource group.
  • Ensure you pick the same Location as the new Batch account.
  • Leave all other properties at their default values.

Create the “User Subscription” Batch Account

To enable pool resource allocations in your subscription, the pool allocation mode and key vault properties must be set in the “New Batch account” blade.

For the Key Vault:

  • If creating a new Key Vault, then set the Resource Group to be the same as you will use for the Batch account.
  • Keep all other properties as default; they will be configured as part of the "New Batch account" process.
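
For readers who script account creation rather than using the portal, the two account properties described above can be sketched as a request body. This is only an illustration: the property names `poolAllocationMode` and `keyVaultReference` (with `id` and `url`) are assumed to match the Batch management REST API, and every value below is a hypothetical placeholder.

```python
import json


def batch_account_body(location, key_vault_id, key_vault_url):
    """Build a request body for a 'User Subscription' Batch account.

    Property names are assumptions based on the Batch management API;
    check them against the API version you actually target.
    """
    return {
        "location": location,
        "properties": {
            # Pool VMs are allocated in the caller's own subscription.
            "poolAllocationMode": "UserSubscription",
            # Reference to an existing Key Vault in the same region.
            "keyVaultReference": {"id": key_vault_id, "url": key_vault_url},
        },
    }


# Hypothetical placeholder values, not real resources.
body = batch_account_body(
    location="westus",
    key_vault_id=(
        "/subscriptions/00000000-0000-0000-0000-000000000000"
        "/resourceGroups/example-rg"
        "/providers/Microsoft.KeyVault/vaults/example-vault"
    ),
    key_vault_url="https://example-vault.vault.azure.net/",
)
print(json.dumps(body, indent=2))
```

The sketch only builds and prints the body; sending it requires an authenticated call to the management endpoint, which is out of scope here.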

[Figure: User Subscription Batch Account]

If the subscription access control has not been configured correctly, an error is displayed on the “New Batch account” blade and the account cannot be created.

API Authentication for "User Subscription" Batch Accounts

Requests to a "User Subscription" Batch account made through the Batch client API must be authenticated with Azure Active Directory, which Batch now supports in addition to the existing account name and key authentication. Azure Batch support for Azure Active Directory is documented here.
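
As a rough illustration of what Azure Active Directory authentication involves, the sketch below builds (but does not send) a client-credentials token request. It assumes the Azure AD v1 token endpoint and the Batch resource URI `https://batch.core.windows.net/`; the tenant, client ID, and secret are hypothetical placeholders, and a real application would normally use an authentication library rather than hand-built requests.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed resource URI identifying the Batch service to Azure AD.
BATCH_RESOURCE = "https://batch.core.windows.net/"


def build_token_request(tenant_id, client_id, client_secret):
    """Build (without sending) a client-credentials token request."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/token"
    form = urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
            "resource": BATCH_RESOURCE,
        }
    ).encode("ascii")
    return Request(url, data=form, method="POST")


# Hypothetical placeholder identifiers.
req = build_token_request(
    "00000000-0000-0000-0000-000000000000",
    "example-client-id",
    "example-secret",
)
print(req.get_method(), req.full_url)
```

The access token returned by such a request would then be presented as a bearer token on Batch API calls, in place of the account name and key.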

 

Creating Batch Pools using the "User Subscription" Batch Account

When a Batch pool is created in the 'User Subscription' account, a number of resources are created and will be visible in the portal - resource groups containing virtual machine scale sets, storage accounts, load balancers, network security groups, public IP addresses, and virtual networks.

Quotas

Because VMs and other resources, such as storage accounts, are created directly in your subscription when a pool is created, the Azure Batch core quota is not used; the compute core quota and other subscription resource quotas apply instead.

All quota values for a subscription can be viewed in the Azure Portal – select Subscriptions, select the subscription you will use for the Batch account, select the ‘Usage + quotas’ item in the ‘Settings’ section.

For every 40 Linux VMs or 20 Windows VMs in a Batch pool, the following resources are required:

  • One storage account (Quota = “Storage Accounts”, Provider = “Microsoft.Storage”)
  • One public IP (Quota = “Public IP Addresses”, Provider = “Microsoft.Network”)
  • One virtual network (Quota = “Virtual Networks”, Provider = “Microsoft.Network”)
  • One network security group (Quota = “Network Security Groups”, Provider = “Microsoft.Network”)
  • One virtual machine scale set (Quota = “Virtual Machine Scale Sets”, Provider = “Microsoft.Compute”)
  • One load balancer (Quota = “Load Balancers”, Provider = “Microsoft.Network”)

The compute core quotas, at the region level and per VM family, should also be set high enough for the pool:

  • Quota = “Total Regional Cores”, Provider = “Microsoft.Compute”
  • Quota = “… Family Cores”, Provider = “Microsoft.Compute”
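
To make the quota arithmetic concrete, the sketch below estimates how many of each resource a pool consumes. It assumes that a partially filled group of VMs still needs a full set of resources (rounding up), which the text implies but does not state; the function names and the cores-per-VM figure are hypothetical.

```python
import math

# VMs served by one set of resources, as listed above.
VMS_PER_RESOURCE_SET = {"linux": 40, "windows": 20}

# Quota names of the resources created once per set.
RESOURCES_PER_SET = [
    "Storage Accounts",
    "Public IP Addresses",
    "Virtual Networks",
    "Network Security Groups",
    "Virtual Machine Scale Sets",
    "Load Balancers",
]


def required_resources(vm_count, os_type):
    """Return the count of each resource needed for a pool of vm_count VMs."""
    sets = math.ceil(vm_count / VMS_PER_RESOURCE_SET[os_type])
    return {name: sets for name in RESOURCES_PER_SET}


def required_cores(vm_count, cores_per_vm):
    """Return the compute cores the pool counts against the core quotas."""
    return vm_count * cores_per_vm


print(required_resources(100, "linux"))
print(required_cores(100, 4))
```

Under these assumptions, a pool of 100 Linux VMs needs three of each listed resource, while 100 Windows VMs would need five.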

Pool Creation

Pool creation is slightly different for a "User Subscription" Batch account:

  • For "Image Type", the value of "Cloud Services (Windows only)" cannot be selected; for standard images "Marketplace (Linux/Windows)" must be selected.
  • A new "Image Type" value of "Custom Image (Linux/Windows)" is available.

Create a Pool using a Custom Image

Only Batch accounts with the pool allocation mode set to ‘UserSubscription’ can create pools using a custom VM image. A new property called ‘osDisk’ is specified when a pool is created.

The following conditions must be met to create a pool using a custom image:

  • One unique custom image VHD blob can support up to 40 Linux VM instances or 20 Windows VM instances. You will need to create copies of the VHD blob to create pools with more VMs.  For example, a pool with 200 Windows VMs needs 10 unique VHD blobs to be supplied in the 'osDisk' property.
  • The storage accounts holding the custom image VHD blobs need to be in the same subscription as the Batch account.
  • The specified storage accounts should be in the same region as the 'User Subscription' Batch account.
  • Only Standard storage is currently supported; Premium storage will be supported in the future.
  • You can specify one storage account with multiple custom VHD blobs or multiple storage accounts each having a single blob. We recommend using multiple storage accounts for better performance.
  • You need to select the appropriate nodeAgentSkuId for the OS of the base image of your VHD. You can get the mapping of available nodeAgentSkuId values to OS image references by performing the List supported NodeAgentSkuId request.
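
The conditions above can be sketched as a helper that checks the number of VHD blobs before building a pool request. Only the 'osDisk' property and the nodeAgentSkuId setting come from the text; the field names `imageUris`, `nodeAgentSKUId`, `targetDedicated`, and `vmSize`, the storage URLs, and the node agent SKU value are assumptions for illustration and should be checked against the API version in use.

```python
import math

# Maximum VMs one unique VHD blob can back, per the conditions above.
MAX_VMS_PER_VHD = {"linux": 40, "windows": 20}


def custom_image_pool_body(pool_id, vm_size, target_nodes, os_type,
                           vhd_uris, node_agent_sku_id):
    """Build a pool request body, verifying enough unique VHDs are given."""
    needed = math.ceil(target_nodes / MAX_VMS_PER_VHD[os_type])
    unique_uris = list(dict.fromkeys(vhd_uris))  # drop duplicates, keep order
    if len(unique_uris) < needed:
        raise ValueError(
            f"{target_nodes} {os_type} VMs need {needed} unique VHD blobs, "
            f"got {len(unique_uris)}"
        )
    return {
        "id": pool_id,
        "vmSize": vm_size,
        "targetDedicated": target_nodes,
        "virtualMachineConfiguration": {
            "nodeAgentSKUId": node_agent_sku_id,
            # Assumed shape of the 'osDisk' property.
            "osDisk": {"imageUris": unique_uris},
        },
    }


# Hypothetical storage account, container, and blob names.
uris = [
    f"https://examplestorage.blob.core.windows.net/vhds/image-{i}.vhd"
    for i in range(10)
]
pool = custom_image_pool_body(
    "custom-image-pool", "STANDARD_D2_V2", 200, "windows",
    uris, "batch.node.windows amd64",
)
print(len(pool["virtualMachineConfiguration"]["osDisk"]["imageUris"]))
```

Failing early on too few blobs mirrors the 200 Windows VM example above, which needs 10 unique VHDs.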

To create a pool using the Azure portal:

  • Open the blade for a Batch account, select the Pools menu item, and on the Pools blade select the "Add" command; the "Add pool" blade will be displayed.
  • Select “Custom Image (Linux/Windows)” from the Image Type dropdown; the “Custom Image” picker will be displayed. Open the picker, pick one or more VHDs from the same container, and click the Select button.  Support for multiple VHDs from different storage accounts and different containers will be added soon.
  • Pick the correct Publisher/Offer/Sku for your custom VHDs and select the desired Caching mode, then fill in all the other parameters for the pool.
  • To check whether a pool is based on a custom image, see the "Operating System" property in the resource summary section of the Pool details blade; it should have a value of "Custom VM image".
  • All the custom VHDs associated with a pool are displayed on the pool properties blade.

Include Pool VMs in a Virtual Network

The following conditions must be met to deploy a pool in a virtual network:

  • The specified virtual network (VNET) must be in the same Azure region as the ‘User Subscription’ Batch Account.
  • The specified VNET must be in the same subscription as the ‘User Subscription’ Batch Account.
  • The specified VNET must be an Azure Resource Manager (ARM) based VNET.  Classic VNETs created via Azure Service Management are not supported.
  • The specified subnet should have enough free IP addresses to accommodate the number of VMs that will be in the pool. If the subnet doesn't have enough free IP addresses, the pool will partially allocate compute nodes, and a resize error will occur.
  • The specified subnet must allow communication from the Azure Batch service to be able to schedule tasks on the compute nodes. This can be verified by checking if the specified VNET has any associated Network Security Groups (NSG). If communication to the compute nodes in the specified subnet is denied by an NSG, then the Batch service will set the state of the compute nodes to unusable.
  • If the specified VNET has any associated Network Security Groups (NSG), then inbound communication must be enabled on ports 29876 and 29877 for every pool, plus port 22 for a Linux pool or port 3389 for a Windows pool.
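
As a quick sanity check on the NSG condition, the sketch below reports which required inbound ports are missing from a set of allowed ports. It reads the list above as: ports 29876 and 29877 for every pool, plus 22 for Linux or 3389 for Windows. That reading is an interpretation, and the helper names are hypothetical; the code does not query Azure.

```python
# Ports assumed to be needed for Batch service communication on every pool.
BATCH_NODE_PORTS = {29876, 29877}

# Remote access port per OS, per the list above.
REMOTE_ACCESS_PORT = {"linux": 22, "windows": 3389}


def required_inbound_ports(os_type):
    """Return the set of inbound ports a pool of this OS type needs."""
    return BATCH_NODE_PORTS | {REMOTE_ACCESS_PORT[os_type]}


def missing_inbound_ports(os_type, allowed_ports):
    """Return the required ports absent from allowed_ports, sorted."""
    return sorted(required_inbound_ports(os_type) - set(allowed_ports))


print(missing_inbound_ports("linux", [22, 443]))
print(missing_inbound_ports("windows", [3389, 29876, 29877]))
```

An empty result means the allowed ports cover the requirement; anything listed would leave compute nodes at risk of being marked unusable.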

The specification of a VNET is not currently supported by the Azure Portal; the API must be used to create a pool associated with a VNET. The pool is associated with the VNET by specifying the resourceId of the VNET's subnet in the 'networkConfiguration' section of the request to create the pool.
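
A minimal sketch of that association follows. It assumes the subnet is identified by a standard ARM resource ID and that the field inside 'networkConfiguration' is named `subnetId`; the subscription, resource group, VNET, and subnet names are hypothetical placeholders.

```python
def subnet_resource_id(subscription_id, resource_group, vnet_name, subnet_name):
    """Compose the ARM resource ID of a subnet in an ARM-based VNET."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Network/virtualNetworks/{vnet_name}"
        f"/subnets/{subnet_name}"
    )


def with_vnet(pool_body, subnet_id):
    """Return a copy of a pool request body that joins the given subnet."""
    body = dict(pool_body)
    # 'subnetId' is the assumed field name inside 'networkConfiguration'.
    body["networkConfiguration"] = {"subnetId": subnet_id}
    return body


# Hypothetical placeholder identifiers.
subnet_id = subnet_resource_id(
    "00000000-0000-0000-0000-000000000000",
    "example-rg", "example-vnet", "batch-subnet",
)
pool = with_vnet({"id": "vnet-pool", "vmSize": "STANDARD_D2_V2"}, subnet_id)
print(pool["networkConfiguration"]["subnetId"])
```

The `Microsoft.Network` provider segment reflects the requirement that the VNET be ARM-based rather than classic.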