VMQ Deep Dive, 3 of 3

Introduction

By now I hope everyone has had the opportunity to read my two previous posts on VMQ, VMQ Deep Dive 1 and VMQ Deep Dive 2, and is more knowledgeable about the offload and when to use it. In this last post of the series, I want to go into how to monitor and troubleshoot problems you may be having related to VMQ.


Monitoring

Windows Performance Monitor (Perfmon) is an inbox tool that you can use to examine how the programs you run affect your computer’s performance. If you are not familiar with Perfmon, you can find more information here, Windows Performance Monitor. In Perfmon, there are 3 counters that are extremely helpful for evaluating VMQ. To use these counters, open Perfmon and right click on the graph to select ‘Add Counters…’. Click on the ‘Hyper-V Virtual Switch Processor’ category and you will find the following counters under it:

1. Number of VMQs – The number of VMQs affinitized to that processor

2. Packets from External – Packets indicated to a processor from any external NIC

3. Packets from Internal – Packets indicated to a processor from any internal NIC, such as a vmNIC or vNIC.

Quick note for customers using Windows Server 2012, since the counter below may be of interest:

Hyper-V Hypervisor Logical Processor → Hardware Interrupts per sec – Counters 2 and 3 were not implemented until Windows Server 2012 R2. If you are running Windows Server 2012, you can use this counter to see which processors are receiving a large number of interrupts from the NIC and so locate the VMQ processors.

For all these counters you will want to include all of the processors on your system. You will also want to change the graph type to Report, because the line graph view is not really useful for these counters. An example of the kind of report you’ll see is below.

[Image: example Perfmon report view]
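
The same counter category can also be sampled from PowerShell rather than the Perfmon UI. The following is a minimal sketch, assuming it runs on a Hyper-V host where the ‘Hyper-V Virtual Switch Processor’ counter set is registered. It asks Windows for the counter paths instead of hard-coding them, since the exact counter names can differ between releases.

```powershell
# Sketch: read the 'Hyper-V Virtual Switch Processor' counters without the Perfmon UI.
# Assumes a Hyper-V host where this counter set is registered.
$set = Get-Counter -ListSet 'Hyper-V Virtual Switch Processor'

# Show which counters this Windows release exposes in the set.
$set.Paths

# Take one sample of every counter for every processor instance and
# print only the non-zero values, so idle processors drop out of the list.
(Get-Counter -Counter $set.Paths -MaxSamples 1).CounterSamples |
    Where-Object { $_.CookedValue -gt 0 } |
    Sort-Object Path |
    Format-Table Path, CookedValue -AutoSize
```

Filtering out zero values gives roughly the same at-a-glance picture as the Report view: only the processors that currently hold a VMQ or are receiving packets remain.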

Once you can see where the interrupts are occurring you can modify your VMQ settings to have the VMQs only interrupt the set of processors you define.  You can see the VMQ processors currently configured by using the following cmdlet:

PS C:\> Get-NetAdapterVmq -Name "<Your NIC here>" | fl

This will return results that look similar to the below output:

Caption : MSFT_NetAdapterVmqSettingData 'Mellanox ConnectX-3 Ethernet Adapter #2'
Description : Mellanox ConnectX-3 Ethernet Adapter #2
ElementName : Mellanox ConnectX-3 Ethernet Adapter #2
InstanceID : {32690636-E5F4-40AD-94F7-59B12657D095}
InterfaceDescription : Mellanox ConnectX-3 Ethernet Adapter #2
Name : SLOT 4 4
Source : 2
SystemName : 27-3145J0513
AnyVlanSupported :
BaseProcessorGroup : 0
BaseProcessorNumber : 0
DynamicProcessorAffinityChangeSupported :
Enabled : True
InterruptVectorCoalescingSupported :
LookaheadSplitSupported : False
MaxLookaheadSplitSize : 0
MaxProcessorNumber : 7
MaxProcessors : 8
MinLookaheadSplitSize : 0
NumaNode : 65535
NumberOfReceiveQueues : 125
NumMacAddressesPerPort : 0
NumVlansPerPort : 0
TotalNumberOfMacAddresses : 0
VlanFilteringSupported : True
PSComputerName :
ifAlias : SLOT 4 4
InterfaceAlias : SLOT 4 4
ifDesc : Mellanox ConnectX-3 Ethernet Adapter #2

The first parameter to look at is ‘BaseProcessorNumber’. This is where VMQ processing will start: it will never occur on a processor lower than the one indicated in this setting. Next, look at MaxProcessors and MaxProcessorNumber. MaxProcessors is the number of processors your NIC’s queues can use for VMQ. MaxProcessorNumber is the highest-numbered processor in the system that your NIC will use. I’m going to give a few examples to make the point clear. In the examples below, we will pretend we have a system with 6 CPUs, 0 to 5. The red box will encompass the VMQ capable processors. In the first example, all the processors will be available to VMQ.

BaseProcessorNumber: 0

MaxProcessorNumber: 5

MaxProcessors: 6

[Image: the six CPUs, with the VMQ capable processors boxed in red]

In this next example, let’s change the MaxProcessorNumber to 3 and see what happens.

BaseProcessorNumber: 0

MaxProcessorNumber: 3

MaxProcessors: 6

[Image: the six CPUs, with the VMQ capable processors boxed in red]

Here you can see that the MaxProcessorNumber keyword takes precedence over the MaxProcessors keyword. In general, the system chooses the smallest set of processors so that none of the keyword settings is violated. Let’s reverse MaxProcessorNumber and MaxProcessors and see what we get.

BaseProcessorNumber: 0

MaxProcessorNumber: 6

MaxProcessors: 3

[Image: the six CPUs, with the VMQ capable processors boxed in red]

In this case, the system will not violate the MaxProcessors keyword and will only use 3 CPUs total.
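
The rule illustrated by the three examples can be written down as a small calculation. The sketch below is only an illustration of that rule as described in this post, not the actual driver or vmswitch logic. Get-VmqProcessorBudget and its SystemProcessorCount parameter are hypothetical names made up for this example. It reports the range of processors VMQ may use and how many of them it will use at most; it does not try to predict which specific processors get picked.

```powershell
# Hypothetical helper for illustration only; it is not part of any Windows module.
function Get-VmqProcessorBudget {
    param(
        [int]$BaseProcessorNumber,
        [int]$MaxProcessorNumber,
        [int]$MaxProcessors,
        [int]$SystemProcessorCount   # assumption: CPUs are numbered 0..Count-1
    )

    # The highest usable index is bounded by both the setting and the hardware.
    $last = [Math]::Min($MaxProcessorNumber, $SystemProcessorCount - 1)

    # Candidate processors run from the base processor up to that index.
    if ($last -lt $BaseProcessorNumber) {
        $candidates = @()
    } else {
        $candidates = @($BaseProcessorNumber..$last)
    }

    # MaxProcessors caps how many of the candidates are actually used.
    [pscustomobject]@{
        CandidateProcessors = $candidates
        ProcessorsUsed      = [Math]::Min($MaxProcessors, $candidates.Count)
    }
}

# The three examples above, on the pretend 6-CPU system.
Get-VmqProcessorBudget -BaseProcessorNumber 0 -MaxProcessorNumber 5 -MaxProcessors 6 -SystemProcessorCount 6
Get-VmqProcessorBudget -BaseProcessorNumber 0 -MaxProcessorNumber 3 -MaxProcessors 6 -SystemProcessorCount 6
Get-VmqProcessorBudget -BaseProcessorNumber 0 -MaxProcessorNumber 6 -MaxProcessors 3 -SystemProcessorCount 6
```

For the three calls, ProcessorsUsed comes out as 6, 4 and 3, matching the examples. On a real system the settings themselves are changed with Set-NetAdapterVmq, which accepts -BaseProcessorNumber, -MaxProcessorNumber and -MaxProcessors.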

Troubleshooting

Once you have Perfmon set up correctly you’ll be able to see which processors currently have VMQs assigned to them and the number of packets/interrupts each one is handling. When you combine this with the output of Get-NetAdapterVmq -Name <NIC> | fl and Get-NetAdapterVmqQueue, you should be able to tell very easily where VMQ traffic is being processed. The processors being used should line up with the processors that you set when configuring VMQ. Note that not all the processors will be used, but the processors being used should be within the configured range. If a processor is not above 90% utilization, we would not expect VMQ to try to expand the processing any further.
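
One way to line the two outputs up from a single prompt is sketched below. It uses only the two cmdlets named above. The adapter name is the one from the sample output earlier in this post and is a placeholder for your own.

```powershell
# Placeholder adapter name taken from the sample output earlier in this post.
$nic = 'SLOT 4 4'

# Configured processor range for the adapter.
Get-NetAdapterVmq -Name $nic |
    Format-List Name, Enabled, BaseProcessorGroup, BaseProcessorNumber, MaxProcessorNumber, MaxProcessors

# Queues currently allocated on the adapter. The output lists the processor
# each queue is affinitized to, which should fall inside the range above.
Get-NetAdapterVmqQueue -Name $nic
```

If a queue shows up on a processor below BaseProcessorNumber or above MaxProcessorNumber, that is the mismatch to investigate.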

Packets processed on the wrong processor

A problem we often see that affects performance is packets being processed on the wrong VMQ processors. Symptoms include a drastic, unexplained drop in throughput or just low throughput. This problem is usually related to a bug in the NIC, and although these bugs are obvious, they are often the hardest to troubleshoot. By opening Perfmon and using the monitoring techniques from above, you will see that there is traffic being processed on a processor that is not in the configured range.

For these types of bugs we recommend you file a bug with your NIC vendor and make them aware of the situation.  You can always bring them to our attention as well for further investigation.

VMQ and 1G NICs

The second issue that is reported frequently is the implementation of VMQ on 1G NICs. By default, we do not enable VMQ on 1G NICs because a single processor is usually more than sufficient to handle the networking traffic generated. If your workload requires that you use VMQ on a 1G card you will need to enable it by setting a registry key. The registry key is below:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\VMSMP\Parameters\BelowTenGigVmqEnabled

DWORD = 1

[Image: the BelowTenGigVmqEnabled registry value]

After applying this registry key and rebooting the server, VMQ should start to work.
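
For reference, the value can also be created from an elevated PowerShell prompt instead of the Registry Editor. This is a sketch that assumes the Parameters key shown above already exists on the host. It only writes the value and does not reboot the server.

```powershell
# Key and value name are the ones given above; run from an elevated prompt.
$key = 'HKLM:\SYSTEM\CurrentControlSet\Services\VMSMP\Parameters'

# Create (or overwrite) the DWORD value and set it to 1.
New-ItemProperty -Path $key -Name 'BelowTenGigVmqEnabled' -PropertyType DWord -Value 1 -Force | Out-Null

# Read the value back to confirm it was written.
(Get-ItemProperty -Path $key -Name 'BelowTenGigVmqEnabled').BelowTenGigVmqEnabled
```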

Conclusion

Summarizing the key takeaways from this post:

· There are performance monitor counters available that are extremely helpful in finding which processor a VMQ is on and the amount of traffic arriving on a processor

· Keep an eye on the processor set you choose for VMQ and make sure that packets are being indicated on the correct processors

· Be cognizant of the link speed of your NIC because 1G NICs do not have VMQ enabled by default

This concludes our series on VMQ. I hope that these posts were helpful in understanding the concepts behind VMQ and how to correctly configure relevant scenarios and troubleshoot issues.

Gabriel Silva, Program Manager, Windows Core Networking