Notes from the Field: Microsoft SDN Software Load Balancers

Kyle Bisnett and Bill Curtis here. We are two Software Defined Network Blackbelts and Premier Field Engineers at Microsoft who specialize in Hybrid Cloud technologies, which include Cloud Platform System, Azure Stack and WSSD/SDDC. Most importantly, we ensure it’s easy for our customers and partners to deploy and leverage Software Defined Networking (SDN), whether it’s within an enterprise or as part of a Partner Solution (WSSD).

Recently, a customer came to us with questions about our SDN Software Load Balancers (SLB); they were looking to rely on fewer physical appliances and fewer deployments of the venerable Microsoft Network Load Balancing (NLB) by moving to an SDN solution. In this blog, we will cover some common questions we received from this customer and others in the field about SDN SLB.

Briefly, what is Microsoft Software Defined Networking?

If you have deployed Windows Server 2016 and/or Windows Server 2019, chances are you’ve heard about Software Defined Networking (SDN), which comes at no additional cost in our Datacenter SKU. Also, if you’ve looked at our prior blogs, you have seen mentions of SDN going mainstream.

Microsoft SDN provides software-based network functions such as virtual networking with switching, routing, firewalling with micro-segmentation, third-party appliances, and of course load balancing, the subject of today’s post. These are all virtualized and highly optimized for availability and performance and, like Storage Spaces Direct, are a key component of the Windows Server Software Defined (WSSD)/Software Defined Datacenter (SDDC) solution.

Why should I use Microsoft’s SDN Software Load Balancer?

…There are plenty of other SDN Load Balancer solutions that have been around for longer, right?

Microsoft SDN is an end-to-end solution. All the components work in harmony together, and you can leverage features that are a direct result of this synchronization, such as Direct Server Return (DSR), health-probing on the Hyper-V hosts, and NAT functionality. The other benefit is administrative: you no longer need to worry about expensive support contracts, hardware upgrade cadences (these are just Windows VMs), or oddities like Active/Passive configurations. All SLB MUXs are always Active/Active, whether there are two or eight.

SDN in Server 2016\2019 is closely based on the SDN running in Microsoft Azure

Software Defined Networking is being utilized across 32 different global Azure datacenters. When you configure a Standard or Basic Load Balancer, Virtual Network (vNet), Site to Site VPN Connections and more in Microsoft Azure, you are using SDN architecture that has been ported over to SDN in Windows Server 2016\2019, and Azure Stack. Microsoft SDN is well-tested at scale and is very competitive with other SDN products in terms of performance and scalability.

Are the SLB MUXs highly available?

If so, how can I make sure it is checking whether my Guest VMs are ‘up’ or ‘down’?

SLB MUXs are fault tolerant and utilize Border Gateway Protocol (BGP), a dynamic routing protocol that advertises all MUXs within the pool to the top-of-rack switch as /32 routes. When a keep-alive is missed, BGP automatically removes the individual load balancer from the routing table. This is helpful during a host outage or when an individual MUX is taken down for monthly patching.
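The withdrawal behavior can be pictured with a small toy model. This is only a sketch under stated assumptions, not how BGP or the SLB MUX is actually implemented: the `TorRoutingTable` class, the 9-second hold time, the MUX names, and the documentation-range VIP are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TorRoutingTable:
    """Toy view of a top-of-rack switch's next hops for each /32 route (hypothetical)."""

    hold_time: float = 9.0  # assumed: seconds without a keep-alive before withdrawal
    next_hops: dict = field(default_factory=dict)  # route -> {mux name: last keep-alive time}

    def keepalive(self, route: str, mux: str, now: float) -> None:
        """Record that a MUX advertising this route is still alive."""
        self.next_hops.setdefault(route, {})[mux] = now

    def expire(self, now: float) -> None:
        """Withdraw every MUX whose last keep-alive is older than the hold time."""
        for muxes in self.next_hops.values():
            stale = [m for m, seen in muxes.items() if now - seen > self.hold_time]
            for m in stale:
                del muxes[m]

    def active_muxes(self, route: str) -> list:
        """Return the MUXs that traffic for this route can still be sent to."""
        return sorted(self.next_hops.get(route, {}))


table = TorRoutingTable()
table.keepalive("203.0.113.10/32", "MUX01", now=0.0)
table.keepalive("203.0.113.10/32", "MUX02", now=0.0)
table.keepalive("203.0.113.10/32", "MUX01", now=8.0)  # MUX02 goes quiet, e.g. for patching
table.expire(now=10.0)
print(table.active_muxes("203.0.113.10/32"))
```

In this toy run only the MUX that kept sending keep-alives remains a next hop, so new traffic simply stops reaching the one being patched.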

So that’s great! We have fantastic fault tolerance for the MUX infrastructure, but how about our Guest VMs that leverage the SLB MUXs?

Well, we have a feature that is best known in the load balancing community as Health Probing, and our implementation is state-of-the-art. In Windows Server 2016 and above, we support both TCP probe-to-port and HTTP probe-to-port and URL.

Unlike traditional load balancer solutions where the probe originates on the appliance and is sent across the wire to the guest IP, SLB probes originate on the host where that Guest VM IP is located and are sent directly from the SLB Host Agent running on the Hyper-V Host to the VM IP. This eliminates wire traffic and spreads the overhead of conducting health probes between the Hyper-V hosts within the SDN-enabled cluster.
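To make the two probe types concrete, here is a minimal sketch of what a TCP probe and an HTTP probe check. It is not the SLB Host Agent's code: the function names, the 2-second timeout, and the rule that only an HTTP 200 counts as healthy are assumptions made for illustration.

```python
import http.client
import socket


def tcp_probe(dip: str, port: int, timeout: float = 2.0) -> bool:
    """TCP probe-to-port: healthy if a connection to dip:port can be opened."""
    try:
        with socket.create_connection((dip, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out: treat the DIP as down.
        return False


def http_probe(dip: str, port: int, path: str = "/", timeout: float = 2.0) -> bool:
    """HTTP probe-to-port and URL: healthy if GET path answers 200 (assumed rule)."""
    conn = http.client.HTTPConnection(dip, port, timeout=timeout)
    try:
        conn.request("GET", path)
        return conn.getresponse().status == 200
    except (OSError, http.client.HTTPException):
        return False
    finally:
        conn.close()
```

For example, `tcp_probe("10.0.0.4", 3389)` would check an RDP listener on a hypothetical DIP, and `http_probe("10.0.0.4", 80, "/health")` would check a hypothetical health URL on a web server.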

How much performance can I expect from the load balancers?

Direct Server Return (DSR) is a fantastic feature. In the two scenarios below, you’ll see this in action. For external traffic, DSR can eliminate most of the outbound traffic going through an SLB MUX, as the Hyper-V Host sends it directly to the top-of-rack switch\router. For internal load balancing, it can eliminate most of the traffic received at the load balancer infrastructure, as traffic flows strictly VM to VM after the initial packet. Let’s look at these scenarios:

External Load Balancing

For a Public Virtual IP (VIP) load balancing scenario, the initial packet will arrive at our public VIP on the Top of Rack Switch/Router, which will then route it to one of our SLB MUXs, and then on to the host and to the individual tenant VM. Now, on the outbound path, the egress packet avoids the MUX infrastructure altogether, since the Hyper-V host has performed NAT on the packet and routed it directly to the Top of Rack Switch. This increases available bandwidth for tenant and infrastructure workloads by 50% when compared to other appliances and solutions.

  1. Internet traffic routed to a Public VIP comes in through the Top-of-Rack switch\router, and then, using ECMP, an SLB MUX VM is chosen to receive the traffic.
  2. The SLB MUX VM then finds which Dynamic IPs (DIPs – the actual IPs of the VMs) the Public VIP is associated with. One of the DIPs is chosen, the traffic is encapsulated into VXLAN packets, and it is then sent to the Hyper-V Host which owns the VM with the chosen DIP.
  3. The Hyper-V Host receives the packets, removes the VXLAN encapsulation, and routes it to the VM.
  4. When the VM sends a response packet, it is intercepted by the Hyper-V Host’s virtual switch, the response packet is re-written with the Public VIP IP, and routed directly to the Top-of-Rack switch\router bypassing the SLB MUX VMs. This results in massive scalability as DSR eliminates the SLB MUX VM(s) from being a bottleneck for return traffic.
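The four steps above can be traced with a small simulation. This is a rough sketch, not the real data path: the addresses, MUX and host names, and the SHA-256 based `pick` helper are hypothetical stand-ins (real ECMP hashing happens in the switch, and real encapsulation uses VXLAN headers that are only hinted at here in comments).

```python
import hashlib
from dataclasses import dataclass

# Hypothetical names and documentation-range addresses, for illustration only.
VIP = "203.0.113.10"
MUXES = ["MUX01", "MUX02", "MUX03"]
VIP_TO_DIPS = {VIP: ["10.0.0.4", "10.0.0.5"]}
DIP_TO_HOST = {"10.0.0.4": "HV01", "10.0.0.5": "HV02"}


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str


def pick(flow_key: str, candidates: list) -> str:
    """Hash a flow onto one candidate so the same flow always gets the same answer."""
    digest = hashlib.sha256(flow_key.encode()).digest()
    return candidates[int.from_bytes(digest[:4], "big") % len(candidates)]


def inbound(packet: Packet):
    """Steps 1-3: ToR -> MUX -> Hyper-V host -> VM. Returns (path, delivered packet)."""
    flow = f"{packet.src}->{packet.dst}"
    mux = pick(flow, MUXES)                    # step 1: ECMP-style choice of a MUX
    dip = pick(flow, VIP_TO_DIPS[packet.dst])  # step 2: MUX chooses a DIP behind the VIP
    host = DIP_TO_HOST[dip]                    # step 2: encapsulated traffic goes to this host
    delivered = Packet(packet.src, dip)        # step 3: host removes encapsulation, delivers to VM
    return ["ToR", mux, host, dip], delivered


def outbound(reply: Packet, vip: str):
    """Step 4: the host's virtual switch rewrites the source to the VIP and skips the MUXs."""
    host = DIP_TO_HOST[reply.src]
    return [reply.src, host, "ToR"], Packet(vip, reply.dst)


client = "198.51.100.7"
in_path, delivered = inbound(Packet(client, VIP))
out_path, reply = outbound(Packet(delivered.dst, client), VIP)
print("inbound :", " -> ".join(in_path))
print("outbound:", " -> ".join(out_path))
```

The inbound path contains a MUX while the outbound path does not, and the reply leaves with the VIP as its source address, which is the essence of DSR.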

Internal Load Balancing

During the internal load balancing scenario, the initial packet will flow to the internal VIP, the SLB MUX will find the DIPs (guest VMs), encapsulate the packet using VXLAN, and send it to the host, which removes the encapsulation and forwards it to the DIP, i.e. the Tenant VM. Now, the best part: all traffic after this initial exchange avoids the MUX and flows VM to VM until a health event occurs, such as a probe failure. This can eliminate a large percentage of internal load balancing traffic.

  1. The first internal VIP request goes through the SLB MUX to pick a DIP.
  2. The SLB MUX detects that the source and destination are on the same VM Network and then the MUX sends a redirect packet to the source host.
  3. The source host then sends subsequent packets for that session direct to the destination. The SLB MUX is bypassed completely!
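The redirect behavior in these three steps can be sketched as a per-session cache on the source host. This is a toy model under stated assumptions: the `Mux` and `SourceHost` classes, the CRC-based DIP choice, and the addresses are hypothetical, and the toy MUX does not track health itself.

```python
import zlib


class Mux:
    """Toy SLB MUX that only sees the first packet of each internal session."""

    def __init__(self, vip_to_dips: dict):
        self.vip_to_dips = vip_to_dips
        self.packets_seen = 0

    def first_packet(self, vip: str, session: str) -> str:
        """Pick a DIP for the session and hand it back as the redirect target."""
        self.packets_seen += 1
        dips = self.vip_to_dips[vip]
        return dips[zlib.crc32(session.encode()) % len(dips)]


class SourceHost:
    """Toy source Hyper-V host that remembers redirects per session."""

    def __init__(self, mux: Mux):
        self.mux = mux
        self.redirects = {}  # (vip, session) -> dip

    def send(self, vip: str, session: str) -> list:
        """Return the path a packet takes: via the MUX once, then direct."""
        key = (vip, session)
        if key not in self.redirects:
            self.redirects[key] = self.mux.first_packet(vip, session)
            return ["source host", "SLB MUX", self.redirects[key]]
        return ["source host", self.redirects[key]]

    def health_event(self, dip: str) -> None:
        """Forget redirects to a DIP that failed its probe (a real MUX would also stop choosing it)."""
        self.redirects = {k: d for k, d in self.redirects.items() if d != dip}


mux = Mux({"10.0.0.100": ["10.0.0.4", "10.0.0.5"]})
host = SourceHost(mux)
for _ in range(3):
    print(host.send("10.0.0.100", "session-1"))
print("packets through MUX:", mux.packets_seen)
```

Of the three sends in this run, only the first passes through the MUX; the other two go straight from the source host to the chosen DIP.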

How do I grant my business units access to a jump box within an isolated vNet? Could I also grant Internet access to all of the VMs without using a Gateway connection?

If you have ever created a virtual machine in Microsoft Azure, you will have seen that it gets a Public IP and a Private IP. The private IP is used for intra-vNet traffic in Azure and can also be used for ExpressRoute and/or Site to Site connections. The public IP, however, is a NAT interface on which you can expose RDP (port 3389). SDN offers the same functionality for both inbound and outbound NAT. Outbound NAT is especially useful for giving all the VMs within a vNet internet access without needing a Gateway connection for each vNet!

Inbound NAT

Let’s walk through how inbound NAT occurs: NAT will not terminate within the load balancer but on the Hyper-V host itself. When the Public VIP is created and configured, along with an external port, the SLB MUXs will start advertising the VIP by updating the routes using BGP to the Top of Rack switch. When a packet is destined for the Public VIP, the switch forwards it to an available MUX, which looks up the DIPs and encapsulates the packet using VXLAN so it can be forwarded to the Hyper-V host. The Hyper-V host will remove the encapsulation and re-write the packet, so the destination is now the DIP and internal port that you wish to use.

A great use of this feature that we see in the field is the request: “Our infrastructure team wishes to allow a business unit RDP access to multiple VMs inside of the ‘Finance’ vNet.” Within VMM, the infrastructure team can assign separate public ports, e.g. 3340, 3341, etc., that all map to the same back-end port of 3389 but on different DIPs. This fulfills the requirement of RDP access to a few jump boxes inside the vNet.
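The jump box example boils down to a lookup table from public VIP and port to DIP and back-end port. A minimal sketch follows, with the caveat that the VIP, the DIPs, and the `translate` helper are hypothetical; only the 3340/3341 to 3389 mapping comes from the scenario above.

```python
from typing import NamedTuple, Optional


class Backend(NamedTuple):
    dip: str
    port: int


# Hypothetical inbound NAT rules for the 'Finance' vNet jump boxes.
INBOUND_NAT = {
    ("203.0.113.10", 3340): Backend("10.0.0.4", 3389),
    ("203.0.113.10", 3341): Backend("10.0.0.5", 3389),
}


def translate(vip: str, public_port: int) -> Optional[Backend]:
    """Return the DIP and back-end port a packet is re-written to, or None if no rule matches."""
    return INBOUND_NAT.get((vip, public_port))


print(translate("203.0.113.10", 3340))
print(translate("203.0.113.10", 3341))
print(translate("203.0.113.10", 3389))  # no rule for this public port
```

Each public port lands on a different jump box while every VM keeps listening on the standard RDP port, and a port without a rule is simply not translated.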

Can I use SDN Software Load Balancers on VMs that are not using Hyper-V Network Virtualization?

Yes! In some organizations, the extra configuration required for Hyper-V Network Virtualization (HNV), along with the SDN RAS Gateways that must be configured so that VMs on HNV-enabled networks can communicate with the physical network, can be overkill. Virtual Machines that are not using HNV VM Networks can still take advantage of SDN load balancing.

Microsoft Network Load Balancer can also be used, but it does not come close to providing all the robust features and scalability that SDN SLB provides, as mentioned above.

If the following criteria are met, SDN SLB can be used on non-HNV VMs:

  • Top of Rack Switch is BGP capable
  • Network Controller is deployed
  • Hyper-V Hosts are managed by Network Controller
  • Software Load Balancer MUX VMs have been deployed and onboarded by NC
  • The VM Networks being used by the VMs that require load balancing are on a defined VLAN and are managed by Network Controller

How do I get started evaluating SDN Software Load Balancers?

Deploying SDN has never been easier! As announced during our Top 10 Network Features series, SDN has gone mainstream!

There are two methods for deploying SDN:

SDN Express

SDN Express now includes a GUI (see our SDN Goes Mainstream post)! You can also deploy via PowerShell for environments not utilizing System Center Virtual Machine Manager (SCVMM). Additional details on how to deploy SDN using SDN Express are located here and scripts and other resources are in the Microsoft SDN repository on GitHub.

System Center Virtual Machine Manager 2016 or higher

SDN can also be deployed and managed by SCVMM 2016 and higher. Instructions for how to deploy SDN in SCVMM are located here and scripts and other resources are in the Microsoft SDN repository on GitHub.

How can Microsoft help my enterprise become part of SDN?

That’s a great question and we are sure glad that our customer asked. There are a few different options listed below:

Premier Advisory Call

Ask your Technical Account Manager (TAM) who is assigned to your account to get you in touch with the Microsoft SDN Blackbelt community. We can hold a remote advisory call to discuss prerequisites and ensure that it will meet the requirements of your business. This is also a great time for a Q & A session!

Premier WorkshopPLUS: Windows Server: Software Defined Networking

This is a full 4-day workshop that walks through planning, architecture, implementation, and operation of an SDN-enabled hybrid cloud. It includes labs that are hosted on our Learn on Demand platform; simply bring your own device and you gain access to all the content and labs. Also, towards the end of this year, our Unified Support customers will have access to all the Blended Learning Unit (BLU) video recordings we completed. It’s sort of like a Bill and Kyle SDN on-demand channel!

SDN Blackbelt Community

The SDN Blackbelt community is also here to assist remotely. We can certainly have an advisory call as mentioned and that should be your first step. However, if you have a quick question or need assistance, send us a quick note at SDNBlackbelt@microsoft.com and one of us will get back to you.

Summary

We hope you found this blog to be useful and the scenarios beneficial. There are some fantastic features gained from implementing SDN including the battle-tested and performant Software Load Balancing included in your datacenter SKU. Stay tuned for more Notes from the Field and check the tag below for the full series.  We plan to post future blogs that will discuss many other components of SDN!

Stay tuned and see you next time!

Kyle Bisnett and Bill Curtis