N-series - first Azure VMs with GPU

12/13/2016

Great news - Azure N-series Virtual Machines are now available in public preview. N-Series instances are enabled with NVIDIA’s cutting edge GPUs to allow you to run GPU-accelerated workloads and visualize them. These powerful sizes come with the agility you have come to expect from Azure, paying per-minute of usage.

N-Series VMs are split into two categories: NC-series and NV-series. NC-Series (compute-focused GPUs) are powered by Tesla K80 GPUs and offer the fastest computational GPU available in the public cloud. Furthermore, unlike other providers, these new SKUs expose the GPUs through discrete device assignment (DDA), which results in close to bare-metal performance. You can now crunch through data much faster with CUDA across many scenarios, including energy exploration applications, crash simulations, ray-traced rendering, deep learning and more. The Tesla K80 delivers 4992 CUDA cores with a dual-GPU design, up to 2.91 Teraflops of double-precision and up to 8.93 Teraflops of single-precision performance.

NV-Series is focused more on visualization. Data movement has traditionally been a challenge with HPC scenarios using large datasets produced in the cloud. With the Azure NV-Series, you’ll be able to use Tesla M60 GPUs and NVIDIA GRID in Azure for desktop accelerated applications and virtual desktops. With these powerful visualization GPUs in Azure, you will be able to visualize graphic-intensive workflows to get superior graphics capability and run single precision workloads such as encoding and rendering. The Tesla M60 delivers 4096 CUDA cores in a dual-GPU design with up to 36 streams of 1080p H.264.

N-series VMs are powered by Intel Xeon E5-2690v3 CPUs. You can scale from 6 cores + 56 GB RAM + 1x GPU (NV6/NC6) to 24 cores + 224 GB RAM + 4x GPUs (NV24/NC24). A list of all N-series VM sizes is available here. These VMs are currently available in 4 Azure regions: East US (NC only), South Central US, West Europe and Southeast Asia. Currently you can use only Standard (non-SSD) Storage for data, but all N-series VMs include at least 340 GB of temporary SSD storage.

Prices are available here. Don't be put off by the high monthly pricing. Yes, N-series VMs are much more expensive than A-series.

But don't forget that you are charged on a per-minute basis. Designers and engineers don't need to work with heavy graphics 24/7, so the following estimate is more indicative:

For example, an NV6 VM (Windows Server 2016, South Central US, $1.35/hour) for an engineer who works 8 hours a day, 22 days per month (176 hours total), will cost around $237 per month. Combined with storage and network traffic costs, it will be around $250 per month per end user. That is $9000 over 3 years, which is comparable to a professional graphics workstation with similar hardware characteristics ($6000-$10000), but:

A hardware graphics station can't scale. It's hard to predict how much compute and graphics power an engineer or designer will need for their work. In Azure you can scale from NV6 to NV24 and back in minutes; it takes just a simple reboot.
Hardware graphics stations cost a lot. This is a significant CAPEX for the organization. In Azure you pay as you grow, with no upfront investment required.
A hardware graphics station can fail. The OS can fail. You will need to find a replacement for the user. In Azure you can spin up a replacement VM in minutes.
You don't pay when your engineer or designer doesn't use the workstation. They can be on vacation, on sick leave, or they can leave the company. It will be hard to sell the hardware graphics station, but in Azure you just turn off the VM and stop paying.
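The monthly and three-year figures in the NV6 example can be reproduced with a few lines of Python. This is only a minimal sketch: the hourly rate, working hours and the rounded $250 monthly total come from the example above, while the function and variable names are illustrative assumptions and not part of any Azure tooling.

```python
# Figures taken from the article's NV6 example.
HOURLY_RATE_USD = 1.35        # NV6, Windows Server 2016, South Central US
HOURS_PER_DAY = 8
WORKDAYS_PER_MONTH = 22
# The article's rounded monthly figure, including storage and network traffic.
MONTHLY_TOTAL_USD = 250.0
MONTHS = 36                   # 3 years


def monthly_compute_cost(hourly_rate, hours_per_day, workdays_per_month):
    """Compute-only cost for one user who runs the VM only while working."""
    return hourly_rate * hours_per_day * workdays_per_month


compute_usd = monthly_compute_cost(HOURLY_RATE_USD, HOURS_PER_DAY, WORKDAYS_PER_MONTH)
three_year_usd = MONTHLY_TOTAL_USD * MONTHS

print(f"Billed hours per month: {HOURS_PER_DAY * WORKDAYS_PER_MONTH}")
print(f"Compute per month: ${compute_usd:.2f}")
print(f"Three-year total: ${three_year_usd:.0f}")
```

Changing the rate or the working hours shows how sensitive the comparison with a hardware workstation is to actual usage.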

An Azure N-series VM is a great option when you've just started a business. You can always purchase a hardware graphics station later, when you have more budget and know how much compute and graphics power your employees need.

Deployment of N-series

Let's test an Azure N-series virtual machine in action. Log in to the Azure Portal and create a new VM. I will deploy an NV6 VM with Windows Server 2016 in the West Europe region.

The VM is created in less than a minute, which is much faster than a regular A-series VM. I assume that such VMs are pre-created, and then just booted up and specialized with user settings.

The next step is to install the Nvidia drivers, as described here.

Reboot and check that the Nvidia GPU is now visible in Device Manager.

Then let's enable H.264/AVC 444 mode for RDP. I've already mentioned this new RDP feature of Windows Server 2016, which improves picture quality and increases FPS. To do that, open Group Policy Editor (gpedit.msc) and go to Computer Configuration -> Administrative Templates -> Windows Components -> Remote Desktop Services -> Remote Desktop Session Host -> Remote Session Environment. Then enable 2 policies:

Prioritize H.264/AVC 444 Graphics mode for Remote Desktop connections

Configure H.264/AVC hardware encoding for Remote Desktop connections

Now let's test it in a real-life scenario.

Testing

To test the performance of the GPU inside a VM running in an Azure datacenter 2200 km away from my location, I will use the Nvidia demos. The Lifelike Human Face Rendering demo is a good example of 3D rendering in real time.

Let's test the network latency between my location (Moscow, Russia) and the VM in the West Europe Azure region (Netherlands). I use a GPON Internet connection, and the device is connected through 802.11n Wi-Fi. As you can see, the average latency is 50 ms.

Let's start the demo in full screen at Full HD resolution. It looks great!

I use an Intel Compute Stick with Windows 10 as an endpoint. Just imagine seeing such 3D quality on a $120 device that fits in a pocket.

Here is a live recording from my phone. As you can see, the FPS is pretty stable, so this solution will be suitable for designers and engineers.

[embed]https://youtu.be/2l5Z688ym9Y[/embed]

Cost optimizations

To reduce the cost, we need to configure automatic power-off after work hours and let end users power their own VMs on and off. This can be done by granting them Contributor rights at the VM level.

Then I will put 2 scripts on the desktop. The first script will power on the user's VM, and the second script will shut it down. These scripts leverage simple Azure RM cmdlets: Start-AzureRmVM and Stop-AzureRmVM.

You can enable automatic power-off in the VM properties.

It is also good practice to switch the external IP to Static mode. Otherwise it will change when the VM is booted back up after de-allocation.

The user turns on the VM, waits for 3 minutes, then clicks on the RDP icon and connects to the VM. When the work is done, the user turns off the VM. If they forget to do it, the VM will automatically shut down at the scheduled time.
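The effect of the scheduled shutdown on a single day's bill can be sketched as follows. This is a simplified model under stated assumptions: the helper name billed_hours and the example hours (start at 9:00, manual stop at 17:00, automatic shutdown at 19:00) are hypothetical, all times fall within one calendar day, and the $1.35 hourly rate is the NV6 price quoted earlier.

```python
HOURLY_RATE_USD = 1.35  # NV6 rate quoted earlier in the article


def billed_hours(start_hour, user_stop_hour, auto_shutdown_hour):
    """Hours billed for one day.

    user_stop_hour is None when the user forgot to turn the VM off,
    in which case the scheduled shutdown ends the billing period.
    """
    if user_stop_hour is None:
        stop = auto_shutdown_hour
    else:
        # Whichever happens first ends the billing period.
        stop = min(user_stop_hour, auto_shutdown_hour)
    return max(0, stop - start_hour)


# Hypothetical day: work starts at 9:00, auto-shutdown is scheduled for 19:00.
normal_day = billed_hours(9, 17, 19)       # user stops the VM at 17:00
forgetful_day = billed_hours(9, None, 19)  # user forgets, schedule kicks in

print(f"Normal day: {normal_day} h, ${normal_day * HOURLY_RATE_USD:.2f}")
print(f"Forgetful day: {forgetful_day} h, ${forgetful_day * HOURLY_RATE_USD:.2f}")
```

In this model a forgotten VM costs two extra hours instead of running overnight, which is the point of combining user-controlled scripts with a scheduled power-off.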

This is only the beginning of GPU-powered VMs in Azure. They open a lot of new opportunities for different industries. Let's stay in touch and see how this idea evolves after General Availability in 2017.
