Troubleshooting the “RPC server is unavailable” error reported when AD replication fails.

 

In this scenario we are troubleshooting AD replication between two DCs separated by a firewall.

 

In order to ensure that the important well-known ports required in a domain environment are open on the firewall between these DCs, use the PortqryUI tool.

PortqryUI

https://www.microsoft.com/downloads/details.aspx?FamilyID=8355e537-1ea6-4569-aabb-f248f4bd91d0&displaylang=en

Run this tool on both these DCs to test the communication on a selected set of ports on the target DC (replication partner).

· Invoke PortqryUI.exe

· Enter replication partner’s IP or FQDN in the “Destination to query” textbox.

· Select Pre-defined query – “Domains and Trusts”.

· Hit the “Query” button and let it finish.

· Save the above output to a text file.


 

 

Review the PortqryUI query result by searching the output for the phrase “Return Code”.

→ A return code of 0 indicates that this DC was able to communicate with its partner DC on that particular port.

→ A return code of 2 is normally reported for UDP ports, because UDP does not send back an ACK for that communication. It can be ignored if it is returned for a UDP port.

→ A return code of 1 indicates that this DC was unable to talk to the target DC on the respective port. Either the service related to this port is not running on the target, or the port is FILTERED on the firewall.

→ Any other return code also needs investigation.
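As a rough illustration of the triage rules above, here is a minimal Python sketch. The function name and the three labels it returns are my own (hypothetical) choices, not part of PortqryUI; it only encodes the reading of the return codes given in this post.

```python
def interpret_return_code(code: int, protocol: str) -> str:
    """Classify a PortqryUI return code following the rules listed above.

    The labels "ok", "ignore" and "investigate" are illustrative names,
    not values produced by PortqryUI itself.
    """
    protocol = protocol.upper()
    if code == 0:
        # The partner DC answered on this port.
        return "ok"
    if code == 2 and protocol == "UDP":
        # Normal for UDP, which does not acknowledge the probe.
        return "ignore"
    # Return code 1, return code 2 on TCP, or anything else.
    return "investigate"
```

For example, `interpret_return_code(2, "udp")` falls into the "ignore" bucket, while the same code on a TCP port is flagged for investigation.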

 

Sample output of PortqryUI

 

 

 

 

In our scenario, we need to ensure that the following ports are open on both these DCs.

· TCP 135 – Endpoint mapper

· TCP, UDP 389 – LDAP

· TCP, UDP 88 – Kerberos

· TCP 445 – SMB

· TCP 139 – SMB, named pipes

· TCP, UDP 53 – DNS, if these servers are DNS servers too.
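If PortqryUI is not at hand, a plain TCP connect test gives a similar first impression for the TCP ports in the list above. The sketch below is an assumption-laden stand-in, not a replacement for PortqryUI: it covers TCP only (UDP needs a protocol-aware probe), the function names are hypothetical, and the host name in the comment is a placeholder.

```python
import socket

# TCP ports from the list above; 53 only matters if the partner is also a DNS server.
REQUIRED_TCP_PORTS = {
    135: "Endpoint mapper",
    389: "LDAP",
    88: "Kerberos",
    445: "SMB",
    139: "SMB, named pipes",
    53: "DNS",
}


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_partner(host: str, timeout: float = 3.0) -> dict:
    """Map each required TCP port to True (reachable) or False (not reachable)."""
    return {port: tcp_port_open(host, port, timeout) for port in REQUIRED_TCP_PORTS}


# Example call with a placeholder name for the replication partner:
# check_partner("dc2.example.com")
```

A failed connect does not say whether the service is down or the firewall is filtering the port, so treat a `False` the same way as a return code that needs investigation.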

Of the ports above, the most important one to look at for RPC-related errors is TCP 135.

→ This is the Endpoint Mapper port. A DC first contacts its partner on port 135 to learn which TCP ports the NTDS and Netlogon services are listening on. Only after it gets this response from the Endpoint Mapper does it communicate with the NTDS (DRS) and Netlogon services on the target (partner) DC.

To list the endpoints on the partner DC, together with the services and ports associated with them, we can use another tool called RPCdump. This tool can also check whether the source server can communicate with every endpoint on the destination server.

 

RPCdump is part of the Windows Resource Kit.


https://www.microsoft.com/downloads/details.aspx?familyid=9d467a69-57ff-4ae7-96ee-b18c4790cffd&displaylang=en

Command : RPCDUMP /s destination_server /v /i > RPCdump_destination.txt

In our scenario, we need to review the TCP port the DRS interface is listening on. You can search the output either for the phrase DRS or for the UUID e3514235-4b06-11d1-ab04-00c04fc2dcd2.

Other UUIDs:

e3514235_4b06_11d1_ab04_00c04fc2dcd2 DRS

12345778_1234_abcd_ef00_0123456789ab LSA

12345678_1234_abcd_ef00_01234567cffb NETLOGON

12345778_1234_abcd_ef00_0123456789ac SAM
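Because the same UUID shows up above once with hyphens and once with underscores, a small lookup helper can save some squinting when searching the log. This is only a sketch: the table holds the four UUIDs listed above, and the function names are hypothetical.

```python
# The four interface UUIDs listed above, stored in hyphenated lowercase form.
KNOWN_INTERFACES = {
    "e3514235-4b06-11d1-ab04-00c04fc2dcd2": "DRS",
    "12345778-1234-abcd-ef00-0123456789ab": "LSA",
    "12345678-1234-abcd-ef00-01234567cffb": "NETLOGON",
    "12345778-1234-abcd-ef00-0123456789ac": "SAM",
}


def normalize_uuid(text: str) -> str:
    """Lowercase a UUID and convert underscore separators to hyphens."""
    return text.strip().lower().replace("_", "-")


def interface_name(uuid_text: str):
    """Return the interface name for a known UUID, or None if it is not in the table."""
    return KNOWN_INTERFACES.get(normalize_uuid(uuid_text))
```

Both `interface_name("E3514235_4B06_11D1_AB04_00C04FC2DCD2")` and the hyphenated spelling resolve to the DRS interface.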

When reviewing this log, check the IsListening result for the respective UUID.

→ IsListening:YES – the source server was able to communicate with the target server on the respective port.

→ IsListening:NO – this port is filtered on the firewall.

→ IsListening:Unknown – this needs further investigation, as packets targeted at the respective port may not be reaching the server. In this case a simultaneous network trace may help.

Sample RPCdump output

 

 

You can confirm whether the source server can communicate with the destination server on a particular port by using PortqryUI again. This time specify a port instead of a predefined query.

 

If you find that the DRS and Netlogon service ports cannot be reached from either of the two DCs that are supposed to replicate with each other, have the network team analyse the firewall or network device and allow communication on these ports.

In some scenarios both of the tests above pass and the DCs can communicate with each other on the required ports, yet AD replication still fails with the “RPC server is unavailable” message.

In this scenario, we need to install Network Monitor on both the source and target DCs.

· Start a Network Monitor capture on both DCs simultaneously.

· Force AD replication between these DCs using the “Active Directory Sites and Services” snap-in.

· Leave the capture running for 2 to 3 minutes after initiating the replication, then stop it.

 

The main thing to check in these network traces (for RPC errors) is whether any packets between the DCs are getting dropped. To do this, look at the communication between the two DCs in both simultaneous traces. If a packet is visible in one trace but not reflected in the other, it may have been dropped on the way.
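The comparison described above boils down to a set difference between the two captures. The sketch below assumes you have already exported each trace into a list of packet identifiers; the tuple layout (source IP, destination IP, destination port, TCP sequence number) and the sample values are hypothetical and only there to show the idea, not a Network Monitor export format.

```python
def packets_missing_from(sent_trace, received_trace):
    """Return packets seen in sent_trace but absent from received_trace, in capture order."""
    seen = set(received_trace)
    return [packet for packet in sent_trace if packet not in seen]


# Hypothetical identifiers: (source IP, destination IP, destination port, TCP sequence number)
source_dc_trace = [
    ("10.0.0.1", "10.0.1.1", 135, 1000),
    ("10.0.0.1", "10.0.1.1", 49155, 2000),
    ("10.0.0.1", "10.0.1.1", 49155, 2100),
]
target_dc_trace = [
    ("10.0.0.1", "10.0.1.1", 135, 1000),
    ("10.0.0.1", "10.0.1.1", 49155, 2000),
]

# Packets the source DC sent that never showed up on the target DC.
dropped = packets_missing_from(source_dc_trace, target_dc_trace)
```

Run the comparison in both directions, since a packet can be lost on the way back as well.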

 

I have seen a few firewalls with an “Intrusion Prevention System” feature drop selected packets.

Network Monitor

https://www.microsoft.com/downloads/details.aspx?displaylang=en&FamilyID=983b941d-06cb-4658-b7f6-3088333d062f

The troubleshooting above should be able to pinpoint the reason behind the “RPC server is unavailable” messages.

More Information - related to configuration in the above scenario:

On the subject of communication issues between DCs, especially when they are separated by a firewall, I suggest you review the information below to configure your DCs and firewall. The idea is to avoid opening a large number of ports for RPC communication, which makes your network more secure and gives you better control over the communication behaviour of these DCs.

Dynamic RPC ports are used by services such as Netlogon, NTDS, FRS and DFSR, and these ports are picked from the range 1024-65535/TCP.

If you want to restrict the range of ports these services pick from for RPC communication, follow the KB article below and define a range of ports to be used for RPC dynamic allocation.

How to configure RPC dynamic port allocation to work with firewalls

https://support.microsoft.com/kb/154596/

If you want to specify static ports for known services on a DC, such as Netlogon, NTDS and FRS, follow the articles below.

Restricting Active Directory replication traffic to a specific port

https://support.microsoft.com/?id=224196

How to restrict FRS replication traffic to a specific static port

https://support.microsoft.com/?id=319553

Important: If you modify the RPC range or assign static RPC ports, you only need to open those ports in the firewall instead of the whole range 1024-65535/tcp. If you plan to place a firewall between the clients and the DCs in the main site, you need to allow most of the above exceptions on that firewall too.

If you add Windows Server 2008 DCs to your domain, note that the default dynamic port range has changed in Windows Server 2008 and now starts at 49152/tcp (49152-65535).

The default dynamic port range for TCP/IP has changed in Windows Vista and in Windows Server 2008

https://support.microsoft.com/?id=929851
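When reading firewall logs in a mixed environment, it can help to check whether a blocked port falls inside the default dynamic range of the DC in question. A minimal sketch, assuming the default ranges mentioned in this post (1024-65535 for the older systems, 49152-65535 for Windows Server 2008) and no custom RPC range configured; the names are hypothetical.

```python
# Default dynamic RPC port ranges as described in this post (upper bound is exclusive in range()).
LEGACY_DYNAMIC_RANGE = range(1024, 65536)    # before Windows Vista / Windows Server 2008
WS2008_DYNAMIC_RANGE = range(49152, 65536)   # Windows Vista / Windows Server 2008


def in_default_dynamic_range(port: int, windows_2008_or_later: bool) -> bool:
    """Return True if the port lies in the default dynamic range for that OS generation."""
    ports = WS2008_DYNAMIC_RANGE if windows_2008_or_later else LEGACY_DYNAMIC_RANGE
    return port in ports
```

For instance, port 5000 is inside the default range of an older DC but outside the default range of a Windows Server 2008 DC, so a firewall rule written for one may not fit the other.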

 

 

 

-Abizer