SDN Troubleshooting: UDP Communication failures and changing the Network Controller Certificate

With this blog post, I wanted to highlight a couple of issues that we have encountered recently with Software Defined Networking (SDN) customer deployments in Windows Server 2016.

Issue #1: UDP communication isn’t working when outbound NAT is configured

A customer had configured outbound NAT access for his virtual network through SCVMM (which internally uses the SDN Software Load Balancer) so that machines in the virtual network could access the Internet. The customer noticed that TCP traffic to the Internet was working fine, but all User Datagram Protocol (UDP) traffic was being dropped. Moreover, this happened only when the Software Load Balancer MUX was on a different Hyper-V host than the tenant VM.

Deeper analysis showed that the destination VM was rejecting the packets because the UDP checksum was reported as incorrect. Further investigation traced this to the physical NIC: the customer was using a NIC that was not certified for SDN with Windows Server 2016, and it was incorrectly setting the UdpChecksumFailed flag even when the inner packet had a valid checksum.
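To make "a valid checksum" concrete, here is a minimal Python sketch of how a UDP checksum over IPv4 is computed and validated (one's-complement sum over the pseudo-header, UDP header and payload, as described in RFC 768). It is only an illustration of the arithmetic, not the code path used by the NIC or by Windows; the function names and the sample addresses and ports are assumptions made for the example.

```python
import socket
import struct


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's-complement sum of data (zero-padded to an even length)."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return total


def pseudo_header(src_ip: str, dst_ip: str, udp_length: int) -> bytes:
    """IPv4 pseudo-header: source, destination, zero byte, protocol 17, UDP length."""
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!BBH", 0, 17, udp_length))


def build_udp_datagram(src_ip: str, dst_ip: str, src_port: int,
                       dst_port: int, payload: bytes) -> bytes:
    """Build a UDP header plus payload with a correct checksum (illustrative helper)."""
    length = 8 + len(payload)
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)
    covered = pseudo_header(src_ip, dst_ip, length) + header + payload
    checksum = ~ones_complement_sum(covered) & 0xFFFF
    if checksum == 0:
        checksum = 0xFFFF  # zero on the wire means "no checksum was computed"
    return struct.pack("!HHHH", src_port, dst_port, length, checksum) + payload


def udp_checksum_is_valid(src_ip: str, dst_ip: str, datagram: bytes) -> bool:
    """Return True if the datagram's checksum verifies (or was not computed)."""
    (stored,) = struct.unpack("!H", datagram[6:8])
    if stored == 0:
        return True  # allowed over IPv4: the sender skipped the checksum
    covered = pseudo_header(src_ip, dst_ip, len(datagram)) + datagram
    return ones_complement_sum(covered) == 0xFFFF
```

A packet that passes a check like this one is valid; in the case described above, the NIC was reporting a failure for such packets anyway, so the problem could not be fixed on the sending side.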

If you are planning to use SDN with Windows Server 2016, please ensure that you use certified NICs. You can verify whether a network adapter is certified by checking the Windows Server Catalog.

Click Software-Defined Data Center (SDDC) Premium to filter the Windows Server Catalog LAN card list.

Issue #2: Changing the Network Controller Server certificate

A customer wanted to change the Network Controller server certificate used for communication with the Northbound clients. He was using self-signed certificates and wanted to move to Certificate Authority-based certificates. After installing the new certificate on all the Network Controller nodes, he used the Set-NetworkController PowerShell command to point Network Controller to the new certificate.

Although the command succeeded, Network Controller communication with SCVMM stopped working.

This is due to a bug in the product where the certificate binding is only changed on one Network Controller node (where the command was run) and is not updated on the other nodes. We are planning to release a fix soon.

As a workaround, you need to manually change the certificate binding on the other Network Controller nodes. The process is as follows:

  • Install the new certificate in the Personal store of the LocalMachine account.
  • Execute the PowerShell command: Set-NetworkController -ServerCertificate <new cert>
  • Retrieve the thumbprint of the new SSL certificate that you want to use with Network Controller.
  • To do so, double-click the certificate, click Details, and note the value of the Thumbprint property. Remove any spaces from the value.

[Illustration: the Thumbprint property of the certificate.]
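Cleaning up the copied thumbprint by hand is error-prone, so here is a small Python sketch of that step. It assumes the thumbprint is the SHA-1 hash shown in the certificate dialog (40 hexadecimal digits) and drops anything that is not a hex digit, which covers the spaces as well as any invisible characters that may come along when copying from the dialog. The function name is made up for this example.

```python
import re


def normalize_thumbprint(raw: str) -> str:
    """Return the thumbprint as 40 upper-case hex digits with no separators."""
    # Keep only hex digits; this removes spaces and non-printing characters.
    cleaned = re.sub(r"[^0-9A-Fa-f]", "", raw)
    if len(cleaned) != 40:
        # Assumption: a SHA-1 thumbprint, i.e. 20 bytes = 40 hex digits.
        raise ValueError(f"expected 40 hex digits, got {len(cleaned)}")
    return cleaned.upper()
```

The result can be pasted directly as the certhash value in the netsh commands used later in this post.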

On each Network Controller node, check the SSL binding by executing the following command from a command prompt:

netsh http show sslcert

If the result shows the thumbprint of the old certificate, change the binding by executing the following commands:

netsh http delete sslcert ipport=0.0.0.0:443

netsh http add sslcert certhash=<thumbprint of the new certificate> appid=<application ID> ipport=0.0.0.0:443 certstorename=MY

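You can retrieve the appid from the Application ID field in the output of the netsh http show sslcert command. Note it before you delete the old binding, because the deleted binding no longer appears in that output.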
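If you have to repeat this on several nodes, it can help to generate the two command lines rather than type them. The Python sketch below only builds the strings for the commands shown above; it does not run them. The function name and the sample values in the usage lines are hypothetical, and the check that the application ID is a GUID in braces is an assumption based on how netsh http show sslcert displays it.

```python
import re
from typing import List

_GUID_IN_BRACES = re.compile(
    r"\{[0-9A-Fa-f]{8}(-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12}\}"
)


def build_rebind_commands(thumbprint: str, app_id: str,
                          ipport: str = "0.0.0.0:443") -> List[str]:
    """Return the delete and add commands for one Network Controller node."""
    if not re.fullmatch(r"[0-9A-Fa-f]{40}", thumbprint):
        raise ValueError("thumbprint must be 40 hex digits with no spaces")
    if not _GUID_IN_BRACES.fullmatch(app_id):
        raise ValueError("app_id must look like {xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx}")
    return [
        f"netsh http delete sslcert ipport={ipport}",
        f"netsh http add sslcert certhash={thumbprint} appid={app_id} "
        f"ipport={ipport} certstorename=MY",
    ]


# Usage with made-up values; substitute your own thumbprint and Application ID.
for command in build_rebind_commands(
    "A909502DD82AE41433E6F83886B00D4277A32A7B",
    "{00112233-4455-6677-8899-aabbccddeeff}",
):
    print(command)
```

The thumbprint check rejects values that still contain spaces, which is the most common reason the add command fails.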

If the binding shows the thumbprint of the new certificate, no further action is needed on that node.
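The per-node check can also be scripted. The following Python sketch parses text captured from netsh http show sslcert and reports whether the binding on 0.0.0.0:443 still points at a different certificate than the new one. The field names ("IP:port", "Certificate Hash", "Application ID") and the "name : value" layout are assumptions based on typical output of that command, and the sample text contains made-up values, so compare it with what your nodes actually print before relying on it.

```python
from typing import Dict, List

# Illustrative sample only; the hash and Application ID are invented.
SAMPLE_OUTPUT = """
SSL Certificate bindings:
-------------------------

    IP:port                      : 0.0.0.0:443
    Certificate Hash             : a909502dd82ae41433e6f83886b00d4277a32a7b
    Application ID               : {00112233-4455-6677-8899-aabbccddeeff}
    Certificate Store Name       : MY
"""


def parse_sslcert_bindings(output: str) -> List[Dict[str, str]]:
    """Split the command output into one dict of fields per binding."""
    bindings: List[Dict[str, str]] = []
    current = None
    for line in output.splitlines():
        key, sep, value = line.partition(" : ")
        if not sep:
            continue  # headings, separators and blank lines
        key, value = key.strip(), value.strip()
        if key == "IP:port":  # assumed to start each binding
            current = {}
            bindings.append(current)
        if current is not None:
            current[key] = value
    return bindings


def binding_needs_update(output: str, new_thumbprint: str,
                         ipport: str = "0.0.0.0:443") -> bool:
    """True if the binding for ipport uses a different certificate hash."""
    for binding in parse_sslcert_bindings(output):
        if binding.get("IP:port") == ipport:
            current_hash = binding.get("Certificate Hash", "")
            return current_hash.lower() != new_thumbprint.lower()
    raise LookupError(f"no SSL binding found for {ipport}")
```

The parsed binding also carries the Application ID, so the same output gives you the appid value needed for the add command.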

Additional Information

Here are a few links to SDN topics to assist with your planning and deployment:

If you plan to assess your needs and environment for deploying SDN, see the topic Plan a Software Defined Network Infrastructure.

If you want to deploy SDN using System Center Virtual Machine Manager, see the topic Set up a Software Defined Network (SDN) infrastructure in the VMM fabric.

If you have any questions/feedback about SDN, send an email to sdn_feedback@microsoft.com.