Hybrid Cloud Blog


This post was written by Michael Kelley, Principal PM Manager, Cloud + Enterprise team

Introduction

This blog post is #4 in a series of technical posts about running and managing Linux and FreeBSD in your on-premises datacenter.  Other posts in the series are here:

Overview

Running Linux and FreeBSD as a guest operating system on Hyper-V

Managing Linux and UNIX using System Center and PowerShell DSC

Linux Network Goals

Achieving high network performance is critical to running production workloads in a virtual environment.  Microsoft’s goal is for Linux virtual machines to achieve essentially the same level of performance that could be achieved in a non-virtual environment – i.e., just running directly on the hardware without a hypervisor.  For example, if the physical network is 10G Ethernet, our goal is that a Linux virtual machine should be able to saturate the physical network and achieve nearly 10 Gbps in network performance.

Because we focus on Linux servers running production workloads in a datacenter or the Azure public cloud, the typical virtual machine also has multiple vCPUs and is running on physical hardware with multiple processor sockets and cores.  The networking usage profile has many network connections to multiple running processes or threads that are part of the workload.  While the latency of a single network packet is important, even more important is the overall throughput that can be achieved by this multi-connection workload.  The Linux networking features we have implemented, and the environments in which we measure the resulting network performance, are designed to match this usage profile.  Later sections of this post give specific details on these configurations.
 
Linux Network Performance Features

The Linux Integration Services drivers in Linux guests on Hyper-V implement several features that improve network performance and throughput.  These features include Virtual Receive Side Scaling (vRSS) and various TCP/IP offloads.

The vRSS feature provides additional network performance by using multiple vCPUs to handle incoming network packets.  In the absence of vRSS, incoming network packets always cause an interrupt on vCPU 0.  Under heavy network load, vCPU 0 may reach 100% utilization and become a bottleneck even though the virtual machine overall has spare CPU cycles because other vCPUs are lightly loaded.  The vRSS code in the network device driver spreads out the interrupts from arriving network packets across multiple vCPUs so that vCPU 0 is less likely to become a bottleneck.  As a result, a typical production virtual machine, with its multiple vCPUs, can get higher network throughput.  Our measurements have shown network throughput increases as the interrupt load is spread across up to eight vCPUs.  If you are running a virtual machine with more than eight vCPUs, vRSS only uses eight of the vCPUs to handle interrupts.  Conversely, if you are running a small VM with only one vCPU, vRSS will not provide any benefit.
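One way to see this interrupt spreading at work is to watch the per-CPU interrupt counts inside the guest.  A hedged sketch (the exact interrupt line and the device name eth0 vary by distribution, kernel, and driver version):

```shell
# Per-CPU interrupt counts: with vRSS active, the hypervisor callback
# counts grow in multiple CPU columns, not just the CPU0 column.
grep -i hyp /proc/interrupts

# Newer kernels also let ethtool report how many combined
# channels (receive queues) the virtual NIC is using.
ethtool -l eth0
```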

Similar to vRSS, the sending of network packets is also spread out across multiple vCPUs so that the virtual switch workload on the Hyper-V host does not bottleneck on a single CPU.

TCP large send offload accumulates multiple outgoing network packets into one large Ethernet frame that may be bigger than the standard Ethernet MTU.  This large frame is handed to the Hyper-V host via the virtual NIC driver in the Linux guest.  The Hyper-V host can then use hardware in the physical NIC to segment the large frame as it goes onto the physical Ethernet.  If the physical NIC doesn’t support such segmentation offload, Hyper-V will do the segmentation in software.  But in either case, passing a single large packet from the Linux guest to the Hyper-V host is more efficient than passing multiple smaller packets.  This efficiency gain results in enhanced network throughput.

Checksum offload uses a similar approach.  The outgoing network packet, without a checksum, is handed to the Hyper-V host via the virtual NIC driver in the Linux guest.  The Hyper-V host can then take advantage of hardware in the physical NIC to calculate the checksum, reducing the load on the CPU.  If the physical NIC doesn’t support checksum offload, Hyper-V will do the checksum in software.  In this case, there’s no real advantage, since the cost of calculating the checksum is essentially the same whether it is done in the guest or the host.
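You can inspect which of these offloads are currently enabled in the guest with ethtool.  A minimal sketch, assuming the virtual NIC is eth0 (the device name and exact feature list vary by kernel):

```shell
# List offload features for the virtual NIC; look for
# tcp-segmentation-offload (large send offload) and
# tx-checksumming / rx-checksumming in the output.
ethtool -k eth0 | grep -E 'segmentation|checksum'
```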

These features are transparent to applications and don’t usually require any management or tuning on your part.  They work behind the scenes as key enablers for driving up overall network throughput and reducing the CPU overhead of network transfers.

Linux Network Availability Features

A new networking feature is hot add and remove for virtual NICs.  This feature is available starting with Windows Server 2016 Hyper-V (which is currently available as a Technical Preview), and with the latest versions of the Linux Integration Services drivers (LIS 4.0 and later).  With hot add/remove of vNICs, you can add a vNIC to a virtual machine, or remove a vNIC from a virtual machine, while the virtual machine is running.  This capability increases the uptime of your virtual machines.  It can be particularly handy in troubleshooting networking problems, because you can add a network connection to gain access to a virtual machine if some failure has caused the existing network connection to stop working.
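Besides the Hyper-V Manager UI described below, hot add/remove can also be scripted from the Hyper-V host with PowerShell.  A sketch, assuming a hypothetical VM named "web01" and a virtual switch named "External":

```powershell
# On the Hyper-V host: add a vNIC to the running VM
# (VM and switch names here are hypothetical)
Add-VMNetworkAdapter -VMName "web01" -SwitchName "External"

# Remove a vNIC again while the VM keeps running
Remove-VMNetworkAdapter -VMName "web01" -Name "Network Adapter"
```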

As a simple example, suppose you have a Linux virtual machine running on Hyper-V with a single vNIC.  The output of ifconfig in the Linux guest might look like this:

 

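The screenshot from the original post isn’t reproduced here, but a representative listing (with made-up MAC and IP addresses; 00:15:5d is the Hyper-V MAC prefix) would be:

```shell
$ ifconfig
eth0      Link encap:Ethernet  HWaddr 00:15:5d:01:02:03
          inet addr:192.168.1.10  Bcast:192.168.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
```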
As expected, you see one network connection on eth0, plus the loopback connection.

But then, if you go into Hyper-V Manager and change the Settings on the VM to add another vNIC, almost immediately the Linux VM notices the new vNIC, creates eth1, and assigns an IP address using DHCP.  This second vNIC is live and can be used by real workloads.  The output of ifconfig immediately becomes:
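Again, the original screenshot isn’t reproduced here; a representative listing with the same hypothetical addresses plus the new eth1 would be:

```shell
$ ifconfig
eth0      Link encap:Ethernet  HWaddr 00:15:5d:01:02:03
          inet addr:192.168.1.10  Bcast:192.168.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

eth1      Link encap:Ethernet  HWaddr 00:15:5d:01:02:04
          inet addr:192.168.1.11  Bcast:192.168.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
```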

Similarly, if you use Hyper-V Manager to remove a vNIC from the VM, the corresponding eth device in the Linux VM disappears almost immediately and any network connections using the device are dropped.

Linux Network Performance

We measure baseline network throughput by sending data between two Linux virtual machines, each residing on a separate Hyper-V host.  The Hyper-V hosts are connected via physical Ethernet, either 10G, or the new and faster 40G version.  The Linux virtual machines are configured with eight vCPUs each, in order to get the maximum benefit from vRSS.  We use iperf3 as the tool running in each Linux guest for creating the network load and measuring throughput.  iperf3 is an open source project available on GitHub.  We configure iperf3 with 16 parallel streams in order to have a number of different connections, simulating a typical production server workload.

So the typical measurement configuration looks like this:
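The original diagram isn’t reproduced here.  In outline: an 8-vCPU Linux VM on Hyper-V host 1, connected through the virtual switch and the 10G/40G physical Ethernet to a second 8-vCPU Linux VM on Hyper-V host 2.  The iperf3 invocations look roughly like this (server IP address is hypothetical):

```shell
# On the receiving VM: run iperf3 in server mode
iperf3 -s

# On the sending VM: 16 parallel streams to the server for 60 seconds
iperf3 -c 192.168.1.20 -P 16 -t 60
```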

With this configuration, we see a maximum of about 9.4 Gbps throughput, which is essentially saturating the 10G physical Ethernet connection.

With a 40G physical Ethernet connection, we currently get a maximum of about 30 Gbps.  That’s not saturating the physical Ethernet connection, so this is an area of ongoing measurement and performance improvement to the Linux Integration Services drivers and the underlying Hyper-V host.

Summary

High network performance and throughput are key requirements for virtualizing production workloads.  Linux on Hyper-V has achieved its goal of saturating a 10G physical network and is more than capable of running demanding production workloads.  But as technology continues to improve, the bar for success moves as well, so we’re now working on the same goal for 40G physical networks.

Additionally, networking features like vNIC hot add/remove provide you with a great deal of flexibility to augment and reconfigure your networks – all while your virtual machines are running, so you don’t incur any downtime.

Next week’s topic is FreeBSD guests on Hyper-V.  Jason Anderson will describe why we’re enabling FreeBSD as a guest on Hyper-V, the current status of the FreeBSD Integration Services drivers, and where we stack up on key capabilities such as network performance.