Detecting ephemeral port exhaustion

Symptoms

When Windows or Windows Server runs out of ephemeral (also called outbound or dynamic) network ports, it cannot establish any new outbound network connections. The result is a large number of connection failures, for example to databases or domain controllers. If the system is not responding, try increasing the port range (discussed below) - this change is effective immediately. If the system immediately recovers, then you know you are dealing with this issue. From there you can either leave the port range increased, which may just delay the problem, or try to identify the cause.

Detecting user mode port leaks

Ephemeral ports are a range of ports that Windows and Windows Server use for outbound communications over the TCP/IP network protocol. When an outbound connection is finished, the port associated with the connection is put into a TIME_WAIT state for two minutes by default. This allows any lingering packets on the network to be ignored. Windows Server 2008 and later use the IANA range, which spans ports 49152 through 65535 and provides 16,384 ports.

Some applications and services such as Microsoft Exchange Server CAS servers can be very “chatty” and might actually use all 16,384 ports within a two minute time period. The result is connection failures similar to “Couldn’t connect to X, due to no ports available from the end point mapper”.
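
To put a number on “chatty”, here is a rough back-of-the-envelope calculation in Python. It is only a sketch under simplifying assumptions: the default IANA range, a 120-second TIME_WAIT, every connection closed from the local side, and no port reuse across different remote endpoints. The function names are made up for illustration.

```python
# Rough capacity estimate for the default dynamic port range.
# Assumptions: IANA range 49152-65535, 120-second TIME_WAIT, and every
# closed connection parking its local port in TIME_WAIT for the full delay.

FIRST_PORT = 49152
LAST_PORT = 65535
TIME_WAIT_SECONDS = 120


def port_count(first: int, last: int) -> int:
    """Number of ports in an inclusive range."""
    return last - first + 1


def max_sustained_rate(ports: int, time_wait_seconds: int) -> float:
    """New outbound connections per second the pool can absorb indefinitely."""
    return ports / time_wait_seconds


ports = port_count(FIRST_PORT, LAST_PORT)
rate = max_sustained_rate(ports, TIME_WAIT_SECONDS)
print(f"{ports} ports, about {rate:.0f} new connections per second")
```

Under these assumptions the inclusive range holds 16,384 ports, so a process that opens and closes more than roughly 137 connections per second for two minutes can drain the pool by itself.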

If you suspect ephemeral port exhaustion, consider running the following PowerShell script, called “Log-EphemeralPortStats.ps1”, available at
https://1drv.ms/f/s!AhuJirRUDDbmkotkPocbTrN0wgKB7Q

Warning: This script is provided as sample code only. Please review it and use at your own risk.

Be aware that this only detects user mode port leaks. If a driver (kernel mode) is leaking ports, then a complete memory dump must be sent to Microsoft Support for analysis to know which driver is responsible. The memory dump must be taken when the system has leaked a significant number of ports.

This script is designed to run in an infinite loop of 1 minute sleep intervals and write to a log file called “EphemeralPortStats.log”. Here is an example of the output it produces:

Computer       DateTime             LocalAddress  #OfEPortsInUse Max#OfEPorts %EPortUsage #OfTcpListeningPorts #OfPids
--------       --------             ------------  -------------- ------------ ----------- -------------------- -------
ETCHEDCHAMPION 8/9/2013 12:37:42 PM 127.0.0.1                  6        16384           0                   15      11
ETCHEDCHAMPION 8/9/2013 12:37:42 PM 172.18.96.192              3        16384           0                   15      10
ETCHEDCHAMPION 8/9/2013 12:37:42 PM 192.168.1.2               69        16384         0.4                   15      17

This script is intended to be run from the console of the computer suspected to be running low on ephemeral ports, and to be left running. Periodically review the log to see whether any ephemeral port exhaustion was detected.

This script gets the port range from:

netsh int ipv4 show dynamicportrange tcp

Then, correlates this information with the output of:

netstat -ano -p tcp
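
The correlation can be sketched in a few lines of Python. This is not the Log-EphemeralPortStats.ps1 script itself, only an illustration of the idea. It assumes English-language command output in the layout shown in the sample strings, and it assumes listening ports should be excluded from the in-use count. The function names and sample data are hypothetical.

```python
import re
from collections import Counter

# Sample text standing in for the output of the two commands above
# (English-language Windows assumed; the addresses and PIDs are made up).
NETSH_SAMPLE = """
Protocol tcp Dynamic Port Range
---------------------------------
Start Port      : 49152
Number of Ports : 16384
"""

NETSTAT_SAMPLE = """
Active Connections

  Proto  Local Address          Foreign Address        State           PID
  TCP    0.0.0.0:135            0.0.0.0:0              LISTENING       888
  TCP    0.0.0.0:49664          0.0.0.0:0              LISTENING       640
  TCP    192.168.1.2:49733      10.0.0.5:1433          ESTABLISHED     4321
  TCP    192.168.1.2:49734      10.0.0.5:1433          TIME_WAIT       0
  TCP    192.168.1.2:3389       10.0.0.9:52011         ESTABLISHED     1100
  TCP    127.0.0.1:50001        127.0.0.1:50002        ESTABLISHED     2200
"""


def parse_dynamic_range(netsh_text):
    """Return (start_port, number_of_ports) from the netsh output."""
    start = int(re.search(r"Start Port\s*:\s*(\d+)", netsh_text).group(1))
    count = int(re.search(r"Number of Ports\s*:\s*(\d+)", netsh_text).group(1))
    return start, count


def ephemeral_ports_in_use(netstat_text, start, count):
    """Count non-listening TCP rows per local address whose local port
    falls inside the dynamic range."""
    in_use = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        if len(fields) != 5 or fields[0] != "TCP":
            continue  # skip headers, blank lines, and anything unexpected
        _, local, _, state, _ = fields
        address, _, port = local.rpartition(":")
        if state != "LISTENING" and start <= int(port) < start + count:
            in_use[address] += 1
    return in_use


start, count = parse_dynamic_range(NETSH_SAMPLE)
usage = ephemeral_ports_in_use(NETSTAT_SAMPLE, start, count)
for address, used in sorted(usage.items()):
    print(f"{address}: {used} of {count} ({100 * used / count:.1f}%)")
```

On a live system the two sample strings would be replaced by the captured output of the real commands.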

PsExec can potentially be used to get this information from remote computers, but keep in mind that passwords used in PsExec are sent in the clear over the network.

My PFE colleagues and customers have used this script quite a bit and I hope it will help you as well.

Also, as a temporary work-around, the ephemeral/dynamic/outbound port range can be increased using the following command:

netsh int ipv4 set dynamicport tcp start=10000 num=55535

This change takes effect immediately. No need to restart. No need to reboot. The new range is a little over three times larger than the default. If this change has a positive effect when the problem occurs, then you know that this is the issue your system is dealing with.
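
Before picking other values for start and num, it helps to check that the range is legal. The Python sketch below does that arithmetic and builds the command string. The limits it enforces (start of at least 1025, at least 255 ports, last port no higher than 65535) are my recollection of Microsoft's guidance for this setting and should be treated as assumptions. The helper name is hypothetical, and nothing is executed.

```python
# Assumed limits for the dynamic port range (verify against Microsoft's
# documentation before relying on them).
MIN_START = 1025
MIN_COUNT = 255
MAX_PORT = 65535
DEFAULT_COUNT = 16384


def dynamic_port_command(start, num, protocol="tcp"):
    """Validate a range and return the matching netsh command line."""
    if start < MIN_START:
        raise ValueError(f"start must be at least {MIN_START}")
    if num < MIN_COUNT:
        raise ValueError(f"num must be at least {MIN_COUNT}")
    last = start + num - 1  # the range is inclusive of its first port
    if last > MAX_PORT:
        raise ValueError(f"range would end at {last}, beyond {MAX_PORT}")
    return f"netsh int ipv4 set dynamicport {protocol} start={start} num={num}"


print(dynamic_port_command(10000, 55535))
print(f"ratio to default: {55535 / DEFAULT_COUNT:.2f}")
```

With start=10000 and num=55535 the last port in the range is 65534, and the range is about 3.39 times the default size.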

To set the port range back to default, use the following command:

netsh int ipv4 set dynamicport tcp start=49152 num=16384

To check the port range:

netsh int ipv4 show dynamicportrange tcp

Once you have identified the process consuming the ports, contact the developer of the process for a fix.
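
One way to find that process is to group the netstat rows by PID. The Python sketch below does this against sample text. The sample data and function name are hypothetical, the default range is assumed, and the parsing assumes the five-column layout of netstat -ano -p tcp.

```python
from collections import Counter

# Made-up sample in the layout of "netstat -ano -p tcp".
NETSTAT_SAMPLE = """
  Proto  Local Address          Foreign Address        State           PID
  TCP    0.0.0.0:49664          0.0.0.0:0              LISTENING       640
  TCP    192.168.1.2:49733      10.0.0.5:1433          ESTABLISHED     4321
  TCP    192.168.1.2:49735      10.0.0.5:1433          ESTABLISHED     4321
  TCP    192.168.1.2:49736      10.0.0.5:1433          CLOSE_WAIT      4321
  TCP    192.168.1.2:49740      10.0.0.7:443           ESTABLISHED     980
  TCP    192.168.1.2:49734      10.0.0.5:1433          TIME_WAIT       0
"""


def ports_by_pid(netstat_text, start=49152, count=16384):
    """Count non-listening ephemeral ports held by each PID."""
    per_pid = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        if len(fields) != 5 or fields[0] != "TCP":
            continue  # skip the header and blank lines
        _, local, _, state, pid = fields
        port = int(local.rpartition(":")[2])
        if state != "LISTENING" and start <= port < start + count:
            per_pid[int(pid)] += 1
    return per_pid


counts = ports_by_pid(NETSTAT_SAMPLE)
for pid, used in counts.most_common(3):
    print(f"PID {pid}: {used} ephemeral ports")
```

Connections in TIME_WAIT are listed under PID 0, so they cannot be attributed to a process this way. A PID can be mapped to an image name with tasklist /FI "PID eq 4321" (substituting the real PID).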

Also, consider decreasing the TIME_WAIT delay using the TcpTimedWaitDelay registry value under HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters.

Detecting kernel mode port leaks

All of the above covers user mode port leaks - meaning leaks originating from processes. If a driver (kernel mode) leaks a port, then netstat will not be able to report it. Kernel mode port leaks can only be detected by inducing a complete memory dump and then analyzing the dump with the MEX debug extension. Load MEX into WinDbg and then run the following commands to help identify ports associated with drivers.

!mex.afd -conn -report -verbose

!afd -endp -report

!tcpip -p

Once you identify the driver consuming the ports, contact the developer of the driver.

If you suspect a Windows Server 2008 R2 system might be running out of ephemeral ports, then apply the following hotfix:

Kernel sockets leak on a multiprocessor computer that is running Windows Server 2008 R2 or Windows 7
https://support.microsoft.com/en-us/kb/2577795

Windows Server 2012 and later are already patched with this fix.

UPDATE

Windows 10 and Windows Server 2012 R2 introduced the "q" parameter in netstat (for example, netstat -anoq). This parameter displays all connections, listening ports, and bound non-listening TCP ports. Bound non-listening ports may or may not be associated with an active connection.