Optimizing Linux Performance on Hyper-V

Hi all, this is Kevin Kelling, and I’m a Premier Field Engineer (PFE) with Microsoft focused on both Hyper-V and Azure. Today I’d like to talk about the performance of Linux workloads on both Hyper-V and Azure. Every now and then we are asked how well Linux performs on Hyper-V. We’ve done extensive testing, and most of the time it runs quite well, but we’ve found a few cases where some tuning can make all the difference.

I recently worked on an issue where a Linux application on Hyper-V (Windows Server 2016) was benchmarking 30-35% slower than on other hypervisors. We instinctively felt something was wrong, so we rolled up our sleeves and got to work.

CHALLENGE ACCEPTED!

After some effort from our outstanding product engineering team, we found that this gap was completely eliminated by tuning some settings in the BIOS, tweaking some Linux kernel parameters, and keeping the in-guest tools current.

No changes were made to the Windows Server or Hyper-V configuration – just BIOS settings, and some changes within the Linux guest VMs. In other words, Hyper-V delivers great performance, but sometimes we need to tune the environment to take full advantage of it.

What changes made up for what initially was a significant difference in performance? Let’s go take a look!

TURN THOSE C-STATES OFF!

Many systems have power management profiles in the BIOS, and while most environments correctly have this set to “performance”, the BIOS will often still expose additional options for individual C-states. If you want a primer on what a C-state is, there’s a good article here from Intel, but basically a C-state is a processor idle state: parts of the CPU are powered down to save energy, at the cost of some latency when the core has to wake up again.

Hyper-V tries to aggressively leverage C-states to reduce power consumption, but if you want the best possible CPU performance on your host servers, turn them all off!

In this case, we observed a 15% performance improvement after disabling all C-states in the host server’s BIOS.

Upgrade Linux Integration Services (LIS)

Linux Integration Services (LIS) consists of optimized synthetic device drivers and other “virtualization helpers” that go into the guest OS. Windows servers have integration services built in and can get updates via Windows Update. Linux is similar in that the various distributions build LIS into their images and, depending on what version you are running, distribute updates through their respective patch channels.

LIS 4.1.3 was released in December of 2016 (download here), and we found that simply updating to this version of LIS produced a 10% performance increase. The performance difference will of course vary based on the application, the version of Linux and the version of the tools previously installed, but in this case the difference was quite significant.
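Before upgrading, it helps to know which LIS version a guest is actually running. Here is a minimal Python sketch of one way to check. It assumes the “hv_vmbus” driver exposes a version string under /sys/module/hv_vmbus/version, which is not guaranteed: drivers built into a distribution’s own kernel may not report one at all. The helper names and the 4.1.3 threshold are only illustrative.

```python
from pathlib import Path


def parse_version(text):
    """Turn '4.1.3' (or '4.1.3-2') into a tuple of ints such as (4, 1, 3).

    Raises ValueError if the string is not a dotted numeric version.
    """
    numeric = text.strip().split("-")[0]
    return tuple(int(piece) for piece in numeric.split("."))


def lis_version(module_dir="/sys/module/hv_vmbus"):
    """Return the hv_vmbus version string, or None if it is not exposed.

    Assumption: the driver publishes a 'version' file in sysfs.
    """
    try:
        return (Path(module_dir) / "version").read_text().strip()
    except OSError:
        return None


def is_at_least(found, wanted="4.1.3"):
    """True if version string `found` is the same as or newer than `wanted`."""
    return parse_version(found) >= parse_version(wanted)


if __name__ == "__main__":
    version = lis_version()
    if version is None:
        print("hv_vmbus does not expose a version here; check with your distro")
    else:
        try:
            status = "ok" if is_at_least(version) else "older than 4.1.3"
        except ValueError:
            status = "not a plain dotted version, compare manually"
        print(f"hv_vmbus version {version}: {status}")
```

If no version is reported, the distribution’s package manager and the Hyper-V Linux support page are the better sources of truth.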

A note of caution here: some Linux distributions will not support LIS unless it is released through their patch distribution. Before you manually install LIS, make sure you understand whether there are any support implications from your Linux vendor.

On the topic of LIS, one final important point to keep in mind is that it makes a big difference what version of Linux you are running, as not all features are supported on all versions of Linux. If you visit the Hyper-V Linux support page, you will see sections for RHEL/CentOS, Debian, Oracle, Ubuntu, SUSE and FreeBSD. In each of these sections you can find details on which Hyper-V features are supported in which Linux versions.

The bottom line, again, is that in this case simply upgrading LIS to the current version of 4.1.3 delivered a 10% performance improvement.

Set the Linux Clock Source

Let’s say you didn’t want to upgrade to the latest LIS.  You can still capture much of the performance improvement here simply by tuning the Linux clock.

Changing the Linux kernel’s clock source to “tsc” (which stands for “Time Stamp Counter”) resulted in a performance improvement of 6%. This particular application had a lot of context switches and made heavy use of clocks and timers, so your mileage may vary here.

A bit of information on this parameter is available here.
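To see which clock source a guest is using and switch it on the fly, the kernel exposes two files under /sys/devices/system/clocksource/clocksource0. The following is a minimal Python sketch built on those files; the function names are made up for illustration, and writing the file requires root and only lasts until the next reboot.

```python
from pathlib import Path

# Standard sysfs location of the kernel clock source controls.
CLOCKSOURCE_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksource(base=CLOCKSOURCE_DIR):
    """Return (current, available) clock sources as reported under `base`."""
    base = Path(base)
    current = (base / "current_clocksource").read_text().strip()
    available = (base / "available_clocksource").read_text().split()
    return current, available


def set_clocksource(name, base=CLOCKSOURCE_DIR):
    """Switch the running kernel to clock source `name`.

    Returns False without writing anything if the kernel does not list
    `name` as available. Needs root on a real system.
    """
    base = Path(base)
    _, available = read_clocksource(base)
    if name not in available:
        return False
    (base / "current_clocksource").write_text(name + "\n")
    return True


if __name__ == "__main__":
    try:
        current, available = read_clocksource()
        print(f"current: {current}; available: {', '.join(available)}")
    except OSError as err:
        print(f"could not read clock source info: {err}")
```

Calling set_clocksource("tsc") as root applies the change to the running system. To keep it across reboots, the usual route is to add “clocksource=tsc” to the kernel command line; how you do that depends on your distribution’s bootloader setup.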

How Many CPUs?

Linux kernels have an optional boot parameter, “possible_cpus”, that limits the number of CPUs the kernel prepares for. For example, if you have 8 virtual CPUs in the guest, edit the kernel parameters to include “possible_cpus=8” (no spaces around the equals sign). Setting this parameter removed some additional overhead processing from the Linux CPU scheduler.

In this application, a performance improvement of 2% was observed after applying this setting.
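A quick way to tell whether this setting is in effect is to compare the kernel command line with the CPU ranges the kernel reports in sysfs. The sketch below is one hedged way to do that in Python; the function names are illustrative, and it only reads /proc/cmdline and the “possible” and “online” files under /sys/devices/system/cpu.

```python
import re
from pathlib import Path


def parse_possible_cpus(cmdline):
    """Return N from 'possible_cpus=N' in a kernel command line, else None."""
    match = re.search(r"(?:^|\s)possible_cpus=(\d+)(?=\s|$)", cmdline)
    return int(match.group(1)) if match else None


def count_cpus(cpu_list):
    """Count CPUs in a sysfs CPU list such as '0-7' or '0-3,8-11'."""
    total = 0
    for part in cpu_list.strip().split(","):
        if not part:
            continue
        low, _, high = part.partition("-")
        # A single CPU such as '5' has no upper bound, so reuse the lower one.
        total += int(high or low) - int(low) + 1
    return total


def report(cmdline_path="/proc/cmdline", cpu_dir="/sys/devices/system/cpu"):
    """Collect the configured, possible and online CPU counts."""
    cpu_dir = Path(cpu_dir)
    return {
        "configured": parse_possible_cpus(Path(cmdline_path).read_text()),
        "possible": count_cpus((cpu_dir / "possible").read_text()),
        "online": count_cpus((cpu_dir / "online").read_text()),
    }


if __name__ == "__main__":
    try:
        print(report())
    except OSError as err:
        print(f"could not read CPU information: {err}")
```

If “possible” is larger than “online” and “configured” is None, the parameter is not set and the guest may benefit from it. Making it permanent means adding it to the kernel command line in your bootloader configuration (for example the GRUB_CMDLINE_LINUX line in /etc/default/grub on distributions that use GRUB 2), and the exact steps vary by distribution.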

CPU Load Balancing

Hyper-V presents a somewhat different view of the hardware to the guest OS than some other hypervisors do, in regard to how the L3 cache is presented. As a result, the Linux scheduler was not being sufficiently aggressive in scheduling the workload and some vCPUs were going into an idle state when they shouldn’t have.

After we added a simple shell script that runs at boot and adjusts the scheduler’s behavior, the Linux CPU scheduler used all of the vCPUs more efficiently, resulting in a 6% performance improvement for this application.
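The script itself is not reproduced here, so the following does not attempt to recreate it. As a starting point for your own investigation, this read-only Python sketch lists the scheduler domains the guest kernel has built for each vCPU. It assumes a kernel that exposes them under /proc/sys/kernel/sched_domain, which requires scheduler debugging support and is typical of kernels from this era; newer kernels publish similar information elsewhere. The function name is illustrative.

```python
from pathlib import Path


def list_sched_domains(root="/proc/sys/kernel/sched_domain"):
    """Return (cpu, domain, name, flags) tuples for every scheduler domain.

    Returns an empty list if the kernel does not expose the tree at `root`.
    """
    rows = []
    for domain_dir in sorted(Path(root).glob("cpu*/domain*")):
        name_file = domain_dir / "name"
        flags_file = domain_dir / "flags"
        name = name_file.read_text().strip() if name_file.is_file() else "?"
        flags = flags_file.read_text().strip() if flags_file.is_file() else "?"
        rows.append((domain_dir.parent.name, domain_dir.name, name, flags))
    return rows


if __name__ == "__main__":
    try:
        domains = list_sched_domains()
    except OSError as err:
        domains = []
        print(f"could not read scheduler domains: {err}")
    if not domains:
        print("no scheduler domain information exposed by this kernel")
    for cpu, domain, name, flags in domains:
        print(f"{cpu}/{domain}: name={name} flags={flags}")
```

Comparing this output between a Hyper-V guest and the same image on other hardware can show whether the scheduler sees a different cache topology, which is the kind of difference described above.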

The Bottom Line

IT is often complicated. There are so many distros, Linux versions, application workload patterns and other variables to consider that your mileage may vary.

But what at first appeared to be a big difference in performance was completely eliminated without making any change to the Windows Server operating system. Only changes in the BIOS (hardware) and within the Linux guest were needed.

In other words, you can rest assured that Hyper-V 2016 is capable of delivering strong performance for Linux workloads. Furthermore, you can expect future updates to LIS (Linux Integration Services) to automatically exploit some of these tweaks, with the benefits extending to Linux workloads not just on Hyper-V, but on Azure as well.

If you are concerned that your Linux workloads may be running slower than they could be on Hyper-V, check these three basic things:

  • Disable C-States in BIOS
  • Upgrade to current version of LIS (If not an option, change clock source to “tsc”)
  • Set “possible_cpus” in the Linux kernel boot parameters

And if you suspect a problem with idle cores not being utilized, you may be able to further improve performance by tweaking the Linux scheduler (guest OS).

For one application, the steps above resulted in a performance improvement of over 30%.

So if you have any concerns about Linux performance on Hyper-V, you can be confident that Hyper-V and Azure are capable of delivering strong performance. In some cases you may need to tweak a few guest-level settings in the Linux OS to achieve maximum performance.