RPC Endpoint Mapper Returns Dynamic Port Incorrectly When Active Directory is Configured to Use Static Port

Hi Folks,

Gary Green, Lakshman Hariharan and Rick Sasser here with a new post on RPC. This post draws attention to an issue that our friends in the Directory Services team have uncovered: the RPC Endpoint Mapper (EPM) incorrectly returns a dynamic port instead of the static port configured for Active Directory Domain Services (ADDS). As a result, one or more of the following symptoms may be observed:

a. Slow console and RDP logons

b. Failed console and RDP logons

c. Netlogon event ID 5783: The session setup to the Domain Controller <Domain Controller Name> for the domain <Domain Name> is not responsive. The current RPC call from NETLOGON on <Machine name> to <Domain Controller Name> has been cancelled.

d. Netlogon event ID 5719: This computer was not able to set up a secure session with a domain controller in domain <Domain Name> due to the following: The RPC server is unavailable. This may lead to authentication problems. Make sure that this computer is connected to the network. If the problem persists, please contact your domain administrator.

e. Active Directory replication fails with one of the following errors:

Active Directory Replication Error 1753: There are no more endpoints available from the endpoint mapper.
OR

Active Directory Replication Error 1722: The RPC server is unavailable.

f. Local Users and Groups: Existing users cannot be viewed

g. Local Users and Groups: New users cannot be added

h. Local Users and Groups: Security principals appear as SIDs instead of friendly names.

i. Security principals (users from remote domains) cannot be added to groups from the local domain.

j. Kerberos authentication to web servers fails. The reason is that the Kerberos PAC cannot be verified.

 

Some background information

Before we discuss the issue in more detail, we highly recommend reading this post, an excellent introduction to RPC concepts such as dynamic ports, Universally Unique Identifiers (UUIDs), floors, towers and opnums. This post assumes familiarity with many of these concepts.
To summarize, from the post referenced above:

1. RPC “clients” connect to the remote server over port 135 and reference the UUID for the service they want to access.

2. The EPM on the RPC server responds with the dynamically assigned port number for the requested service. That port falls within the default RPC dynamic port range for the operating system version running on the server. The default dynamic port range for Windows 2000 and Windows Server 2003 is the "low" range of 1024-5000, while Windows Server 2008 and later use the "high" range of 49152-65535. Note that in some environments these defaults can be (and have been) modified so that Windows Server 2008 and later use the "low" range.

This means that intermediate devices and firewalls need to be configured to allow connectivity over the default or modified ranges. It also follows that if you have domain controllers using both the low AND high port ranges, connectivity has to be enabled over both. Because these ranges span so many ports, it is sometimes desirable to configure certain services to use a specific port number, provided the service supports such a configuration.
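As a quick illustration of the two default ranges, here is a minimal Python sketch that classifies a port number. The function name and the returned labels are hypothetical, and the sketch only knows the default ranges quoted above; it cannot detect a range that an administrator has customized.

```python
# Default RPC dynamic port ranges quoted above (inclusive bounds).
LOW_RANGE = range(1024, 5001)      # Windows 2000 / Windows Server 2003
HIGH_RANGE = range(49152, 65536)   # Windows Server 2008 and later


def classify_dynamic_port(port: int) -> str:
    """Return which default dynamic range, if any, contains the port."""
    if port in LOW_RANGE:
        return "low"
    if port in HIGH_RANGE:
        return "high"
    return "outside default ranges"
```

A firewall review could run a port seen in a trace through such a helper to decide which rule set it should fall under.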

Active Directory is an example of such a service, as documented in Knowledge Base article 224196. The lsass.exe process, which is responsible for pretty much all things ADDS related on a domain controller, hosts, among others, four services: DRSUAPI, LSARPC, NETLOGON and SAMR. With a static port configuration in place, when a client computer queries the Endpoint Mapper (EPM) on a domain controller to connect to the DRSUAPI or NETLOGON interface, the EPM response includes the static port(s) configured in the registry instead of a dynamically assigned port. The UUIDs associated with these services are:

NETLOGON -- {12345678-1234-ABCD-EF00-01234567CFFB}
LSARPC -- {12345778-1234-ABCD-EF00-0123456789AB}
DRSUAPI -- {E3514235-4B06-11D1-AB04-00C04FC2DCD2} 
SAMR -- {12345778-1234-ABCD-EF00-0123456789AC}
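When reading traces it can help to turn these UUIDs into names programmatically. The following is a small sketch, not part of any Microsoft tooling; the dictionary and function names are made up for illustration, and the UUID values are copied from the list above.

```python
import uuid

# Interface UUIDs as listed above.
AD_RPC_INTERFACES = {
    uuid.UUID("12345678-1234-ABCD-EF00-01234567CFFB"): "NETLOGON",
    uuid.UUID("12345778-1234-ABCD-EF00-0123456789AB"): "LSARPC",
    uuid.UUID("E3514235-4B06-11D1-AB04-00C04FC2DCD2"): "DRSUAPI",
    uuid.UUID("12345778-1234-ABCD-EF00-0123456789AC"): "SAMR",
}


def interface_name(uuid_text: str) -> str:
    """Map a UUID string (braces and letter case are ignored) to its name."""
    try:
        return AD_RPC_INTERFACES.get(uuid.UUID(uuid_text), "unknown")
    except ValueError:
        # The text was not a well-formed UUID at all.
        return "unknown"
```

Parsing through `uuid.UUID` rather than comparing raw strings means the lookup works whether the trace tool prints the UUID in upper or lower case, with or without braces.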

The issue

The endpoint mapper response contains a tower list with the ports that the RPC caller should use for the requested RPC interface. While the tower list may contain multiple UUID and port mappings, the caller uses only the first port returned.

Problems arise when a code defect causes the endpoint mapper to randomly return the dynamic port for the DRSUAPI, LSARPC and NETLOGON interfaces in the tower list ahead of the statically assigned port configured in the registry. To be clear, this random behavior is tied to a particular start of the NETLOGON or Directory Service (DRSUAPI) service, not to each use of the RPC interface.

If connectivity over the dynamically assigned port is blocked, RPC-based operations that depend on those three interfaces, such as Active Directory replication, will fail.
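The failure condition described above can be modeled in a few lines. This is only a sketch of the reasoning, not of the actual RPC runtime: the function names are hypothetical, the tower list is reduced to a plain sequence of port numbers, and the static port 50000 used in the usage note is an invented example value.

```python
from typing import Sequence


def port_client_will_use(tower_ports: Sequence[int]) -> int:
    """The RPC caller uses only the first port in the tower list."""
    if not tower_ports:
        raise ValueError("endpoint mapper returned no towers")
    return tower_ports[0]


def is_affected(tower_ports: Sequence[int], static_port: int) -> bool:
    """True when the first returned port is not the configured static port."""
    return port_client_will_use(tower_ports) != static_port
```

With a static port of 50000, `is_affected([49155, 50000], 50000)` is true because the dynamic port comes first, while `is_affected([50000, 49155], 50000)` is false.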

 

How to confirm

If KB article 224196 has been used to hard-code the port used by NETLOGON and DRSUAPI in your Active Directory environment, AND you are observing any of the symptoms listed above when connecting to or communicating with a domain controller that has a static DRSUAPI or NETLOGON port configured, then you are most likely seeing a variant of this issue.
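To check whether a domain controller has static ports configured, one option is to read the relevant registry values. The sketch below assumes the value names that KB 224196 describes, to the best of our knowledge ("TCP/IP Port" under the NTDS parameters key and "DCTcpipPort" under the Netlogon parameters key); verify them against the article before relying on this. The function names are hypothetical, and the default reader only works on Windows because it uses the standard `winreg` module.

```python
from typing import Callable, Dict, Optional

# Registry locations as described in KB 224196 (assumption: verify against the article).
STATIC_PORT_VALUES = {
    "DRSUAPI": (r"SYSTEM\CurrentControlSet\Services\NTDS\Parameters", "TCP/IP Port"),
    "NETLOGON": (r"SYSTEM\CurrentControlSet\Services\Netlogon\Parameters", "DCTcpipPort"),
}


def _read_hklm_dword(key_path: str, value_name: str) -> Optional[int]:
    """Read a DWORD under HKEY_LOCAL_MACHINE; return None if it is absent."""
    import winreg  # Windows-only module, imported lazily

    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, key_path) as key:
            value, _value_type = winreg.QueryValueEx(key, value_name)
            return int(value)
    except FileNotFoundError:
        return None


def configured_static_ports(
    read_dword: Callable[[str, str], Optional[int]] = _read_hklm_dword,
) -> Dict[str, Optional[int]]:
    """Return {interface: static port or None} for DRSUAPI and NETLOGON."""
    return {
        name: read_dword(path, value)
        for name, (path, value) in STATIC_PORT_VALUES.items()
    }
```

Running `configured_static_ports()` on the domain controller returns `None` for any interface that has no static port set. The reader is injectable so the logic can be exercised on a non-Windows machine.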

A network trace will show the behavior described below. Refer to this post on how to capture a network trace using netsh.exe. The trace has to be started early enough to catch the RPC bind operation. Once the trace has been captured, use Network Monitor to analyze the EPM response from the domain controller, and match the UUID to the failing operation. For example, if AD replication is failing, look for the DRSUAPI UUID; if it is a NETLOGON session setup error, look for the NETLOGON UUID.

 
Below is one frame from a sample network trace: an EPM response captured while forcing AD replication between two domain controllers, which as mentioned earlier uses the DRSUAPI interface. The frame shows the DRSUAPI UUID ({E3514235-4B06-11D1-AB04-00C04FC2DCD2}), a single tower (NumTowers: 1), and the port carried in that tower, 49155, which is the dynamic port for the DRSUAPI interface. The port returned at the top of the tower list will be different from the static port configured in the registry of the RPC "server".

 

- Epm: Response: ept_map: NDR, DRSR(DRSR) {E3514235-4B06-11D1-AB04-00C04FC2DCD2} v4.0, RPC v5, 10.0.0.2:49155 (0xC003) [49155]
  - EntryHandle:
      ContextType: 0 (0x0)
      ContextUuid: {00000000-0000-0000-0000-000000000000}
    NumTowers: 1 (0x1)
  - Towers: 1 Elements
    + ArrayInfo: 1 Elements
    - TwrPtr: Pointer To 0x0000000000000003
        ReferentID: 0x0000000000000003
      - Tower: NDR, DRSR(DRSR) {E3514235-4B06-11D1-AB04-00C04FC2DCD2} v4.0, RPC v5, 10.0.0.2:49155 (0xC003) [49155]
        + Length: 75 Elements
          TowerLength: 75 (0x4B)
        + Floors: NDR, DRSR(DRSR) {E3514235-4B06-11D1-AB04-00C04FC2DCD2} v4.0, RPC v5, 10.0.0.2:49155 (0xC003) [49155]
        + Pad: 1 Bytes
  + Status: 0x00000000 - EP_S_SUCCESS
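If the frame summary is exported as text, the interface UUID and the returned port can be pulled out with a regular expression and compared with the static port you expect. This is a rough sketch that assumes the summary line has the same shape as the sample frame above; the function name is hypothetical, and other Network Monitor versions or parsers may format the line differently.

```python
import re
from typing import Optional, Tuple

# Matches the interface UUID and the ip:port pair in an EPM response summary line.
_EPM_SUMMARY = re.compile(
    r"\{(?P<uuid>[0-9A-Fa-f-]{36})\}.*?"
    r"(?P<ip>\d{1,3}(?:\.\d{1,3}){3}):(?P<port>\d+)"
)


def parse_epm_summary(line: str) -> Optional[Tuple[str, str, int]]:
    """Return (uuid, ip, port) from an EPM response summary line, or None."""
    match = _EPM_SUMMARY.search(line)
    if match is None:
        return None
    return match.group("uuid").upper(), match.group("ip"), int(match.group("port"))
```

For the sample frame this yields the DRSUAPI UUID, the address 10.0.0.2 and port 49155. If that port differs from the static port configured on the domain controller, the trace matches the issue described in this post.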

Resolution

If Knowledge Base article 224196 is being used, we recommend proactively having these fixes available on the domain controllers in question. Depending on the symptoms, one or more hotfixes will have to be applied. Refer to the following table for guidance on operating system versions and the respective hotfixes.

Operating System Versions            | DRSUAPI         | LSARPC          | NETLOGON
Windows 7 / Windows Server 2008 R2   | KB 2912805      | KB 2987849      | KB 2827870
Windows 8 / Windows Server 2012      | Contact Support | Contact Support | Contact Support
Windows 8.1 / Windows Server 2012 R2 | KB 2912805      | KB 2987849      | N/A

Below is a short description (from headings of the articles referenced in the table above) of the symptoms and associated hotfixes.

 

Long logon time after you set a specific static port for NTDS and NETLOGON in a Windows Server 2008 R2-based domain environment
https://support.microsoft.com/kb/2827870/en-us

AD replication fails with an RPC issue after you set a static port for NTDS in a Windows-based domain environment
https://support.microsoft.com/kb/2912805/en-us

Logon fails after you restrict client RPC to DC traffic in Windows Server 2012 R2 or Windows Server 2008 R2
https://support.microsoft.com/kb/2987849/en-us

 

Please post any questions in the comments.

 

Gary Green, Lakshman Hariharan and Rick Sasser