Top 10 Networking Features in Windows Server 2019: #9 LEDBAT – Latency Optimized Background Transport

This blog is part of a series for the Top 10 Networking Features in Windows Server 2019!
-- Click HERE to see the other blogs in this series.

Look for the Try it out sections then give us some feedback in the comments!
Don't forget to tune in next week for the next feature in our Top 10 list!

Keeping a network secure is a never-ending job for IT Pros, and doing so requires regularly updating systems to protect against the latest threat vectors.  This is one of the most common tasks that an IT Pro must perform.  Unfortunately, it can result in dissatisfaction for end-users as the network bandwidth used for the update can compete with interactive tasks that the end-user requires to be productive.

Have you ever had a support call that started like this?

“…I can’t seem to save my presentation to SharePoint”
“…my Skype session sounds like I’ve entered the Matrix!”

With Windows Server 2019, we bring a latency-optimized network congestion control provider called LEDBAT, which stands for Low Extra Delay Background Transport. LEDBAT is designed to automatically yield bandwidth to users and applications, while consuming the entire bandwidth available when the network is not in use. It’s a scavenger protocol – it scavenges whatever bandwidth is available on the network, and uses it. In other words, you can transfer SCCM Packages or Microsoft Updates without interfering with your users’ sanity.

Important: LEDBAT can optimize any TCP sender-side workload.  It is not limited to updates!

As you may remember from our Anniversary Update post, Announcing: New Transport Advancements in the Anniversary Update for Windows 10 and Windows Server 2016, LEDBAT was previously configured through an undocumented socket option.  As of Windows Server 2019, LEDBAT is a fully supported feature.

Here’s what some of our Microsoft MVPs had to say about their experience with LEDBAT:

“LEDBAT will play a key part of how Enterprises deal with infrastructure being more and more component based in the future, being the workhorse that keeps the Enterprise up to date without interrupting or impacting critical business traffic.”

– Andreas Hammarskjöld (Co-founder of 2Pint Software/Übergeek)

“This issue listed in KB4163525 caused extremely high network bandwidth consumption due to clients running full SUP scans. LEDBAT could have minimized this so that network saturation did not occur. We need LEDBAT!  Sign me up as soon as it is ready for WS 2016!”

– Mike Terrill (Enterprise Experiences & Management MVP and OS Engineer at a Global Financial Company)

Challenges with Existing Approaches

Some protocols like BITS (Background Intelligent Transfer Service) use an Adaptive Bit Rate (ABR) to adjust the bandwidth of lower-priority traffic. ABRs usually require multiple adjustments before reaching an optimized level of bandwidth that does not interfere with other current workloads.  However, each adjustment can take up to 2 seconds (which is not insignificant in our instant gratification world!). In addition, these two-second increments add up, negatively affecting the user’s experience over the long run! As a result, BITS has switched to using LEDBAT for upload traffic.

Another existing approach is throttling: specifying the maximum amount of bandwidth that can be used for a specific purpose. For instance, many of our customers have an SCCM distribution point that throttles the downloads of packages to its clients to 50% of the available bandwidth. In this scenario, you’ll only ever use 50% of the bandwidth even if 100% is available – you’ve set a maximum amount that cannot be exceeded under any circumstance. As a result, your client downloads could take 2x as long! Even worse, user traffic may require more than 50% of the overall bandwidth – in such scenarios, the bandwidth set aside for background transfers would interfere with the user experience.  You can see this effect in the picture below:

In contrast, LEDBAT leverages unused network resources, and does not need the bandwidth caps for background transfers that other solutions typically require:
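
To put rough numbers on the difference between a fixed throttle and a scavenger, here is a small back-of-the-envelope sketch in Python. The package size, link speed, and user demand are made-up values for illustration, and the "scavenger" is an idealized model that simply takes whatever share user traffic leaves free; it is not how LEDBAT itself is implemented.

```python
def transfer_seconds(size_mb, link_mbps, share):
    """Seconds to send size_mb megabytes using `share` (0..1) of the link."""
    return (size_mb * 8) / (link_mbps * share)


def throttled_share(cap):
    """A fixed throttle takes the same slice whether or not users need the link."""
    return cap


def scavenger_share(user_demand):
    """An idealized scavenger takes only what user traffic leaves unused."""
    return max(0.0, 1.0 - user_demand)


def oversubscribed(background_share, user_demand):
    """True when background plus user traffic want more than the whole link."""
    return background_share + user_demand > 1.0 + 1e-9


if __name__ == "__main__":
    # Hypothetical values: a 1000 MB package, a 100 Mbps link, a 50% throttle.
    SIZE_MB, LINK_MBPS, CAP = 1000, 100, 0.5

    # Idle network: the throttle still holds the transfer to half the link.
    print(transfer_seconds(SIZE_MB, LINK_MBPS, throttled_share(CAP)))   # 160.0
    print(transfer_seconds(SIZE_MB, LINK_MBPS, scavenger_share(0.0)))   # 80.0

    # Busy network: users want 80% of the link.
    print(oversubscribed(throttled_share(CAP), 0.8))       # True
    print(oversubscribed(scavenger_share(0.8), 0.8))       # False
```

With these numbers the throttled transfer takes twice as long on an idle network, and on a busy network it still competes with users, which are the two problems described above.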

Latency as the Key Metric

My favorite quote from Primer on Latency and Bandwidth (Networking 101, chapter 1) is: “To succeed, network latency has to be carefully managed and be an explicit design criteria at all stages of development.”

One of the things we realized over the years is that latency is the key metric to optimize when it comes to having a great user experience. Whether it is a website that needs to load, a Skype call that needs to connect (and stay connected with high quality), or watching the recent World Cup – keeping latency low is critical. An increase in latency generally indicates increased usage of the network, and such increases usually result in a poor experience for users trying to get their work done.

Consequently, LEDBAT carefully tracks latency and automatically yields the network to other traffic as latency starts to rise beyond a threshold. It operates on the sending side of network communication, implementing RFC 6817 [1] with some modifications. We open-sourced these modifications and noted this at a recent IETF meeting to ensure the community could benefit from our learnings.
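
For readers who like to see the mechanics, here is a minimal Python sketch of the congestion window update described in RFC 6817. It covers only the base rule from the RFC, not the Windows implementation or the modifications mentioned above. The MSS value is an assumption, and the sketch leaves out the RFC's cap on window growth relative to the data in flight, as well as its loss handling and delay filtering.

```python
TARGET = 0.100   # target queuing delay in seconds; RFC 6817 requires 100 ms or less
GAIN = 1.0       # RFC 6817 requires a gain of 1 or less
MSS = 1460       # sender maximum segment size in bytes (assumed value)
MIN_CWND = 2     # window floor, in segments


def ledbat_on_ack(cwnd, base_delay, current_delay, bytes_acked):
    """Return the new congestion window (bytes) after one acknowledgment.

    base_delay is the lowest one-way delay seen so far (the empty-queue
    estimate); current_delay is the latest measurement. Both are in seconds.
    """
    queuing_delay = current_delay - base_delay
    # Positive below the target (grow), negative above it (yield).
    off_target = (TARGET - queuing_delay) / TARGET
    cwnd += GAIN * off_target * bytes_acked * MSS / cwnd
    # Never shrink below the minimum window.
    return max(cwnd, MIN_CWND * MSS)


if __name__ == "__main__":
    cwnd = 10 * MSS
    # No extra queuing delay: the sender ramps up.
    print(ledbat_on_ack(cwnd, 0.010, 0.010, MSS) > cwnd)   # True
    # Queuing delay well above the target: the sender backs off.
    print(ledbat_on_ack(cwnd, 0.010, 0.310, MSS) < cwnd)   # True
```

The key design point is that the sender reacts to rising delay rather than waiting for packet loss, so it starts yielding as soon as other traffic begins to fill the queue.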

LEDBAT in Action

On the left side of the image below we see a time series graph calibrated over latency without LEDBAT.  Before I started sending data (this is the user/application traffic), the latency was hovering around 10ms, which generally translates to a nice and smooth experience for users.  At about 10 seconds into the experiment, I started a data flow not optimized using LEDBAT (say, someone else initiated a large file download) and BOOM!  The latency goes straight to the moon!  Over three thousand milliseconds!

As noted earlier, this significantly impacts the user experience, and I’m guessing as an IT Pro, you do not want to take that support call. With that said, if your company uses VOIP, that frustrated user might not be able to get through to tech support anyway! 😊

LEDBAT minimizes latency and user frustration

In contrast, we see the same experiment with LEDBAT on the right side of the image above.  Just as before, I started the data flow (this time a LEDBAT-optimized flow) at about 10 seconds into the experiment and the latency did indeed go up, but only a little, averaging about 100ms.  That’s less than the time it takes you to blink, and is generally not perceptible for many actions!  As a result, user experience typically will not be impacted by these updates, which translates to happier end-users. In other words, as an IT Pro, you will be able to distribute updates to keep your organization secure without significantly impacting a user’s experience.

Here’s a video that further illustrates the effect of the latency:

The image below tracks network throughput over time. The height of the bars indicates network utilization, and the color of the bar indicates the type of traffic. LEDBAT is displayed in blue and non-LEDBAT (this is the user productivity traffic) is displayed in orange. Observe that until the 13-second marker, there is no competing user traffic – consequently, LEDBAT utilizes a significant portion of the network. At 13 seconds, I start the non-LEDBAT data flow. Observe how LEDBAT promptly backs off, giving the non-LEDBAT data flow the bandwidth it needs. Then at the 25-second marker, I stop the non-LEDBAT data flow (this would be equivalent to the user finishing a video or similar) and LEDBAT comes right back, ramping up to good utilization automatically.

Here’s a demo illustrating the user productivity shown above:

Think about what this means to a system update.  The system updates are in blue using LEDBAT.  When a user (in orange) starts using the network, LEDBAT quickly and automatically gets out of the way.  Subsequently, when the user is not using the network, LEDBAT automatically ramps back up to full utilization.  No throttles, no tuning, no scheduling, no hassles for the IT Pro.  Doesn’t that sound nice?


Ready to give it a shot!?   Download the latest Insider build and Try it out!

*** There was a bug in the validation guide.  The guide incorrectly referred to the DatacenterCustom template.  
Please use the InternetCustom template instead.  The guide has been fixed.

LEDBAT with SCCM

LEDBAT can also be enabled on an SCCM distribution point running Windows Server 2019.  Because LEDBAT operates on the sending side, any client, regardless of its operating system, will enjoy the benefits that it brings.  To enable this in SCCM, check the following option:

Here’s more information on how you can enable SCCM distribution points to use network congestion control

Well that is the end of this blogpost.  I hope you have enjoyed reading and watching the videos.  Please remember that LEDBAT can be used for any TCP-based workload that sends large amounts of data.  Don’t forget to check out the validation guide. We would love to hear your feedback in the comments section below!

Thanks again for reading,

Daniel “low latency” Havey

Here’s a quick summary of the resources included in this article:

[1] Request for Comments: 6817 — Low Extra Delay Background Transport (LEDBAT)