WNV Deep Dive Part 3 – Capturing and Reading Virtualized Network Traffic

By James Kehr, Networking Support Escalation Engineer

There are three primary tools used to capture virtual network traffic in Windows: netsh trace, the PowerShell NetEventPacketCapture module, and Message Analyzer. I won’t focus much on Message Analyzer captures here. Most server admins don’t like installing tools, so I will focus on the built-in tools for capturing. MA will be discussed as an analysis tool, though it is capable of capturing NBLs and ETW events. There are a number of online tutorials for MA should you prefer a graphical capture tool.

netsh trace

“netsh trace” was added to Windows in the 7/2008 R2 generation. This is a single-command start, single-command stop option. It’s great for simple captures and scenario-based captures, but becomes cumbersome with complex captures. It is also being deprecated in favor of the PowerShell NetEventPacketCapture module in new versions of Windows. There is a rare chance that the packets will not be captured with a netsh trace in Windows Server 2012 R2+/8.1+. When this happens, you must use an alternate method.

A netsh trace scenario is a pre-packaged set of ETW providers. The most common scenario used by Microsoft support is the NetConnection scenario. It is the jack-of-all-trades scenario, using 45 ETW providers and covering everything from the network stack to the various network subsystems, such as wireless, wired, WWAN, 802.1x authentication, firewall, and much more. A sample NetConnection command looks like this:

netsh trace start capture=yes scenario=NetConnection

There are some problems with netsh trace beyond those already discussed. Scenarios do not work on Server Core, so each provider must be added manually to the netsh trace command. This makes the command large and unwieldy to adjust or troubleshoot should something go wrong. Nano Server doesn’t have netsh trace at all. And then there is stop time.

Every time netsh trace is stopped with “netsh trace stop” it will generate a report and compress the data using the CAB format. This is time consuming (several minutes) and can hammer system resources.

I’m not saying netsh trace is bad, because it’s not. Netsh trace is the bread and butter of Microsoft networking support. There are just some caveats to be aware of when using it. I could write an entire article on the intricacies of capturing network data on Windows…some other day perhaps.

PowerShell NetEventPacketCapture

PowerShell is a bit more complex to learn, but is more flexible, stops immediately, and can be better integrated into scripts. The PowerShell method can be used on Server Core and is the only packet capture tool supported on Nano Server, one of the primary Windows Container operating systems. This option is only available on Windows Server 2012 R2+/8.1+. Installing PowerShell 4 or 5 on older versions of Windows will not add the NetEventPacketCapture module, as that is an OS-specific module, not a PowerShell-specific module.

The NetEventPacketCapture (NEPC) module is built to do packet captures “The PowerShell Way”. Rather than grouping up everything into one large command, like netsh trace, it’s spread out and can be controlled by variables. There are no scenarios, nor is there an automatically generated report, like netsh trace has. NEPC is built for creating a script file that collects what you need. The script can then be stored in a repository or share for use across your environment, rather than memorizing a command.

Having used both for several years in a support capacity, I can honestly say both have their benefits, but I prefer “The PowerShell Way” when I am troubleshooting a version of Windows with NEPC. It is, in my opinion, easier to send a script along with six basic steps to execute it, and know I will get all the data I need every time, than to send a list of commands and more complex instructions. Even then, I will still use netsh trace frequently because it can be so simple to use.

Capturing data

The steps below cover the very basics of collecting data for Windows Network Virtualization (WNV), using both netsh trace and PowerShell NEPC. Both tools are far more flexible and complex, when needed, than what I will show here.

Basic instructions for capturing virtual network data with netsh trace
  1. Open an elevated Command Prompt (Run as administrator) console.
  2. Run this command to start the trace.
    netsh trace start capture=yes overwrite=yes maxsize=1024 tracefile=c:\%computername%_vNetTrace.etl provider="Microsoft-Windows-Hyper-V-VmSwitch" keywords=0xffffffffffffffff level=0xff capturetype=both
  3. Reproduce the issue, or perform the operation you wish to investigate.
  4. Stop the trace with this command.
    netsh trace stop
  5. The ETL file and CAB report will be stored in the root of C:.

Basic instructions for capturing virtual network data with PowerShell

Please note that this code contains my own special flair. The most basic capture code can be much shorter than this, while code that applies more stringent coding practices can be much larger.

    1. Open an elevated PowerShell (Run as administrator) console.
    2. Execute these commands to start tracing.
# Basic Windows Virtual Network host capture

# the primary WNV ETW provider.
[array]$providerList = 'Microsoft-Windows-Hyper-V-VmSwitch'

# create the capture session
New-NetEventSession -Name WNV_Trace -LocalFilePath "C:\$env:computername`_vNetTrace.etl" -MaxFileSize 1024
            
# add the packet capture provider
Add-NetEventPacketCaptureProvider -SessionName WNV_Trace -TruncationLength 1500

# add providers to the trace
foreach ($provider in $providerList) {
    Write-Host "Adding provider $provider"
    try {
        Add-NetEventProvider -SessionName WNV_Trace -Name $provider -Level $([byte]0x5) -EA Stop
    } catch {
        Write-Host "Could not add provider $provider"
    }
}

# start the trace
Start-NetEventSession WNV_Trace 
    3. Reproduce the issue, or perform the operation you wish to investigate.
    4. Stop the trace with these commands.
      # stop the trace
      Stop-NetEventSession WNV_Trace
      
      # remove the session
      Remove-NetEventSession WNV_Trace 
      

 

Cautionary Side Note:

ETL files are a bit finicky to work with, compared to traditional packet capture file types. Here are some caveats and information to consider when working with ETL files. Yes, there are a lot.

  • ETLs can only be collected by a member of the Administrators group.
    • Command Prompt, PowerShell, and Message Analyzer must be “Run as administrator” to collect data.
  • There can only be one ETL packet capture at a time, per instance of Windows, regardless of tool used.
    • You can collect an NDIS packet capture on a Hyper-V host and any VM guest at the same time, as those are separate instances of Windows.
  • The capture must be stopped in the same user context that started the capture. If Bob starts the capture, Fred can’t stop it.
  • Writing ETLs requires faster storage as the network speed increases.
    • The buffering levels are not as large as other tools, and thus ETL files are prone to losing data when written to slow storage.
    • This is not an ETL only issue, but it can be more prominent with ETL collections.
    • Writing to an SSD or enterprise HDD [array] is needed for multiple-Gb packet captures.
    • The only sure-fire way to accurately capture on a saturated 10Gb+ connection is by using a RAM disk or NVMe-grade solid state storage. No exaggeration.
  • Never capture to a mapped network drive or other network storage. That’s good practice regardless of packet capture tool.
  • Packets are collected from all network interfaces, by default.
    • This will cause packet duplication on systems using virtual networking, as each packet is collected on each physical and virtual NIC, and sometimes the vmSwitch, as it moves through Windows. Capture on a single interface to see only a single set of packets.
  • ETLs can only be parsed by Microsoft tools: Message Analyzer and Network Monitor (limited support). They can also be converted to a text file with “netsh trace convert”.
    • Wireshark and other third-party tools will not open an ETL (as of this writing).
  • A reasonably fast computer can parse an ETL file up to ~2GB in size.
    • This seems like a lot of data, but it takes a saturated 10Gb network connection about 2-3 seconds to fill a 2GB file. ~20-30 seconds on a single, full 1Gb connection.
    • Files larger than 2GB can be parsed with Message Analyzer and “netsh trace convert”, but parsing and analyzing files this size can be extremely slow. Be patient.
    • Truncate (snaplen in tcpdump terms) packets to fit more packets into a smaller file. This works if you don’t care about the payload, and just need packet headers.
  • ETL tracing has limited filtering capabilities. Use Message Analyzer if more complex filtering is needed.
  • Message Analyzer can usually export/convert an ETL to a CAP file that Wireshark can read.
    1. Open and apply a filter to the trace first.
      • Any non-packet ETLs in the output will prevent Wireshark from opening the file.
      • Use “Ethernet” as the filter if you want all the packets and nothing else.
    2. Save [As]
    3. Select the “Filtered Messages for Analysis Grid view” option
    4. Export
    5. Pick the location and filename
    6. Save
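
To put the ~2GB guideline from the list above into numbers, here is a minimal sketch of the line-rate arithmetic. Get-CaptureFillSeconds is a hypothetical helper written for this article, not part of any Windows module. It assumes a fully saturated link and ignores ETW per-event overhead, so treat the result as a best-case floor in the same range as the rough estimates quoted above, not a measurement.

```powershell
# Hypothetical helper: best-case time for a capture file to fill at line rate.
# Assumes a saturated link and no ETW overhead; real captures will differ.
function Get-CaptureFillSeconds {
    param(
        [Parameter(Mandatory)] [double]$FileSizeGB,  # capture file size, in gigabytes
        [Parameter(Mandatory)] [double]$LinkGbps     # link speed, in gigabits per second
    )

    if ($LinkGbps -le 0) { throw "LinkGbps must be greater than zero." }

    # 1 byte = 8 bits, so a file of N GB holds N * 8 gigabits of data.
    return ($FileSizeGB * 8) / $LinkGbps
}

Get-CaptureFillSeconds -FileSizeGB 2 -LinkGbps 10   # 1.6 seconds
Get-CaptureFillSeconds -FileSizeGB 2 -LinkGbps 1    # 16 seconds
```

Running the same math against the -MaxFileSize or maxsize value you plan to use gives a quick feel for how short the capture window will be on a busy link, and whether truncating packets is worth it.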

Looking at a Container trace

Here’s an example captured from a Container host, with Docker networking set in Transparent mode. This is how the TCP SYN, the first frame in all TCP connections, passes through the Windows host to the Container. Message Analyzer 1.4 was used to process the ETL file.

Notes:

  • The traffic below is slightly modified. The vmNIC and vmSwitch GUIDs are included in the actual output, but these make it difficult to read in an article format.
  • You need Hyper-V installed to see the vmSwitch events. The ETW parsing details are installed with the feature or role. Windows 10 or Server 2016 is highly recommended for the best parsing experience.

Flags: ……S., SrcPort: 52891, DstPort: HTTP(80), Length: 0, Seq Range: 1717747265 – 1717747266, Ack: 0, Win: 8192(negotiating scale factor: 8)
-> The first line is the TCP SYN as it arrives on the physical network adapter of the Container host. The destination port is 80, which is the HTTP port.

NBL received from Nic /DEVICE/ (Friendly Name: Microsoft Hyper-V Network Adapter) in switch (Friendly Name: Layered Ethernet)
-> The second line shows the Microsoft_Windows_Hyper_V_VmSwitch module. This is how Message Analyzer parses the Microsoft-Windows-Hyper-V-VmSwitch ETW provider. This provider is what captures the NBL reference as it passes through the Hyper-V vmSwitch.
-> The NBL event text shows that the NBL was received by the Microsoft Hyper-V Network Adapter, which is the host’s management network adapter in this case. The network adapter names are sometimes the same in parser output. When this happens, use the outputs of Get-VMNetworkAdapter and Get-NetAdapter on the host, in PowerShell, to see which NIC is used by comparing the vmNIC’s GUID in the PowerShell output to the adapter GUID in the trace.

Flags: ……S., SrcPort: 52891, DstPort: HTTP(80), Length: 0, Seq Range: 1717747265 – 1717747266, Ack: 0, Win: 8192(negotiating scale factor: 8)
-> This is the packet on the vmSwitch. The packets will not always show up on the vmSwitch in a trace. It depends on the version of Windows and technology used. The NBLs may be the only thing you see, so don’t worry if your results don’t directly match this.

NBL routed from Nic /DEVICE/ (Friendly Name: Microsoft Hyper-V Network Adapter) to Nic (Friendly Name: Container NIC fb5c285c) on switch (Friendly Name: Layered Ethernet)
-> The next NBL shows the packet being routed across the vmSwitch to the Container vNIC.

Flags: ……S., SrcPort: 52891, DstPort: HTTP(80), Length: 0, Seq Range: 1717747265 – 1717747266, Ack: 0, Win: 8192(negotiating scale factor: 8)
-> The packet captured on the vmSwitch again.

NBL delivered to Nic (Friendly Name: Container NIC fb5c285c) in switch (Friendly Name: Layered Ethernet)
-> The NBL is successfully delivered to the Container NIC.

Flags: ……S., SrcPort: 52891, DstPort: HTTP(80), Length: 0, Seq Range: 1717747265 – 1717747266, Ack: 0, Win: 8192(negotiating scale factor: 8)
-> This is where the packet finally arrives at the Container.
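
When the friendly names in the parser output are identical, the GUID comparison mentioned earlier can be scripted rather than done by eye. The sketch below is one way to do it. Find-AdapterByGuid is a hypothetical helper, not part of any Microsoft module, and the sample objects and GUIDs are made up so the example runs anywhere. It assumes the GUID from the trace appears somewhere inside the chosen property; InterfaceGuid is a real Get-NetAdapter property, but which Get-VMNetworkAdapter property carries the matching GUID (Id is a guess) should be checked against your own output.

```powershell
# Hypothetical helper: return the adapter objects whose GUID-bearing property
# contains the GUID copied from the trace. Braces and case are ignored.
function Find-AdapterByGuid {
    param(
        [Parameter(Mandatory, ValueFromPipeline)] $Adapter,
        [Parameter(Mandatory)] [string]$TraceGuid,
        [string]$Property = 'InterfaceGuid'   # property to search; an assumption for vmNIC objects
    )
    begin {
        # Remove braces and whitespace so "{ABC...}" and "abc..." compare equal.
        $needle = $TraceGuid -replace '[{}\s]', ''
        if (-not $needle) { throw "TraceGuid does not contain a GUID." }
    }
    process {
        $value = [string]$Adapter.$Property
        # -like is case-insensitive; the wildcards allow the GUID to sit inside a longer ID string.
        if ($value -and $value -like "*$needle*") { $Adapter }
    }
}

# Made-up sample data standing in for Get-NetAdapter output.
$sample = @(
    [pscustomobject]@{ Name = 'Ethernet';  InterfaceGuid = '{11111111-2222-3333-4444-555555555555}' }
    [pscustomobject]@{ Name = 'vEthernet'; InterfaceGuid = '{AAAAAAAA-BBBB-CCCC-DDDD-EEEEEEEEEEEE}' }
)
$match = $sample | Find-AdapterByGuid -TraceGuid 'aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee'
$match.Name   # vEthernet
```

On a real host the same function would be fed from the pipeline, for example Get-NetAdapter | Find-AdapterByGuid -TraceGuid '<GUID from the trace>', with -Property changed as needed for Get-VMNetworkAdapter output.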

The interesting bit here is that the TCP connection shows up after the packet arrives on the Container. Which shouldn’t seem odd. Except that the capture was taken on the host. Maybe it’s just me.


Next up is a smashing bit about LBFO and Hyper-V traffic. It’s filled with fabulous pictures, hastily drawn in Paint, to provide a little visual context to the flow of data. This section also covers the basics of troubleshooting WNV traffic.

-James