Something to think about when looking at the Endpoint Mapper in a network trace.
That is some title; it took a bit to come up with. In most cases, when I find myself writing a technical document or blog, it has to do with a specific fix for a specific problem. This blog will be a bit different: I am writing it more as something to think about when you are troubleshooting. I will reference some things that I have found in traces, and I hope this gives you a starting point for troubleshooting.
So why am I writing this? I was working on an issue and noticed something in a trace. When I tried to find more information, there was really nothing that helped, so I hope that this information may help you. Because I am on a roll, let me start by telling you that reading a trace is part science and part art, combined with a bit of luck. The thing about reading traces is that you can see what happened, but the answer to why it happened is sometimes more difficult. To tell you the truth, after fixing the issue that you are going to read about here, I still do not know for sure why it happened; I can only guess.
So let me tell you a story.
I was asked to help with an issue where a connection attempt from a SharePoint server to a Domain Controller for the purposes of getting user information would fail. Now, it only failed when it was using Domain Controllers in a specific site so the thought was that there was a network problem. This is where I came in; I was asked to review the network trace to determine where the problem was. Now remember that a network trace only shows me what happened and not why it happened.
Let’s start with how an RPC connection should look.
We start with a three way handshake.
After the three way handshake we initiate an RPC Bind to the Endpoint Mapper.
After receiving a response, we send an Endpoint Map request to the Endpoint Mapper asking where the application is listening (this is what we will be looking at).
We receive a response indicating the IP address and port that we will need to connect to.
We close this connection and start a new connection with the host and port referenced in the Epm response, starting with a three way handshake.
We initiate an RPC Bind to the application.
We continue working with the application from here.
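To make the sequence above a little easier to compare against a trace, here is a minimal Python sketch. The step labels and the `first_deviation` helper are my own hypothetical names, not something a capture tool produces; you would have to translate the frames in your trace into these labels by hand.

```python
from enum import Enum, auto
from typing import List, Optional


class Step(Enum):
    # Hypothetical labels for the frames described above.
    TCP_HANDSHAKE = auto()
    RPC_BIND_EPM = auto()
    RPC_BIND_ACK = auto()
    EPM_MAP_REQUEST = auto()
    EPM_MAP_RESPONSE = auto()
    TCP_CLOSE = auto()
    RPC_BIND_APP = auto()


# The order in which a healthy RPC connection is expected to progress.
EXPECTED = [
    Step.TCP_HANDSHAKE,     # connect to the Endpoint Mapper
    Step.RPC_BIND_EPM,      # bind to the Endpoint Mapper
    Step.RPC_BIND_ACK,      # response to the bind
    Step.EPM_MAP_REQUEST,   # ask where the application is listening
    Step.EPM_MAP_RESPONSE,  # IP address and port to use
    Step.TCP_CLOSE,         # close the Endpoint Mapper connection
    Step.TCP_HANDSHAKE,     # connect to the address and port returned
    Step.RPC_BIND_APP,      # bind to the application itself
]


def first_deviation(observed: List[Step]) -> Optional[int]:
    """Return the index of the first expected step that is missing or
    different in the observed sequence, or None if all steps match."""
    for index, expected in enumerate(EXPECTED):
        if index >= len(observed) or observed[index] is not expected:
            return index
    return None
```

If the trace stops after the Epm response, for example, `first_deviation` points at the step where the connection should have been closed and reopened, which tells you where to start looking.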
So now we have made it to the issue that I encountered.
Here is what I saw in the Epm for the failing connection.
Notice that there is an IP address in the Epm Request above.
Here is the same thing from the working trace.
So this started the questions:
- Why is there an IP address listed?
- Where did the IP address come from?
- Why would this cause a problem?
Well I could not find much information about this issue. What I did find was a reference in the MSDN technical reference indicating that the Epm request can contact a host address. The problem with this is it is application specific, so the application would need to be written to allow for it.
For the most part, my job with regard to finding a network problem was over; there was no problem. The communication was sent, received, and answered. Not being an expert in the application being used, I began to look for differences between the configuration when connecting to the working server and the failing server.
Let me make this very clear – the application is not important here and I am not suggesting that what I am about to tell you will fix any particular problem. What I am saying is that it fixed the one I had and it is something that you can look at to help troubleshoot a similar issue.
In reviewing the configuration, I found that the application could be configured to find Domain Controllers automatically or it could be configured to use a specific Domain Controller. In my case, both the working and non-working configurations were set to use a specific Domain Controller. The difference was that the working configuration had the FQDN of the Domain Controller (server.domain.com) and the non-working configuration had the IP address of the Domain Controller.
On the non-working configuration, we added an entry for the Domain Controller to the HOSTS file and replaced the IP address in the application's configuration with the FQDN. With that change, the problem was fixed.
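For anyone who has not edited a HOSTS file before, an entry is just the address, some whitespace, and the host name on one line. Here is a small Python sketch that builds such a line. The `hosts_line` helper is a hypothetical name of mine, and 192.0.2.10 is a documentation-only address standing in for the real Domain Controller address, which I am not showing here.

```python
import ipaddress


def hosts_line(ip: str, fqdn: str) -> str:
    """Format one HOSTS-file entry: address, a tab, then the host name."""
    # ip_address() raises ValueError if ip is not a valid IPv4/IPv6 literal,
    # which catches a mistyped address before it goes into the file.
    address = ipaddress.ip_address(ip)
    return f"{address}\t{fqdn}"


# Placeholder address; substitute the real Domain Controller address.
print(hosts_line("192.0.2.10", "server.domain.com"))
```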
Okay, so here we go again. I am not saying that I know for sure what was happening but the indication is that when you place an IP address in this configuration it sets the Epm request to use that IP. There is no reason that I can think of that this would not work but in my case it did not.
If you are having this type of problem and notice this behavior in the trace, it might be a good idea to check whether your application allows you to specify an IP address and, if so, change it to an FQDN.
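If you have a number of configured server values to review, a quick check like the following Python sketch can flag the ones that are IP addresses rather than names. The `is_ip_literal` helper and the sample values are hypothetical; how you read the setting out of your own application is up to you.

```python
import ipaddress


def is_ip_literal(target: str) -> bool:
    """Return True if target is an IPv4 or IPv6 address rather than a name."""
    try:
        ipaddress.ip_address(target.strip())
    except ValueError:
        # Not parseable as an address, so treat it as a host name or FQDN.
        return False
    return True


# Hypothetical configured values for the Domain Controller setting.
for value in ("server.domain.com", "192.0.2.10"):
    if is_ip_literal(value):
        print(f"{value}: IP address configured, consider using the FQDN")
    else:
        print(f"{value}: host name configured")
```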
– Michael Andreacola