The Good News Is… (a hotfix we have developed for shorter LDAP timeout)

We now have a solution for a problem that was described last month in "Exchange Does Not Always Use Local GCs". The gist of the problem is that Exchange makes occasional requests to out-of-site DCs, and if a WAN link is flaky or the remote DC is unresponsive, a few of those remote calls can block the majority of LDAP calls that are otherwise happy to use local GCs. This sometimes led to severe server outages, especially when the initial LDAP connect succeeded but the bind or request response took too long to complete.

Not anymore. We have introduced a shorter timeout for LDAP calls, so a query that takes too long now times out quickly and allows other calls to proceed. The new default timeout is 30 seconds, but it can be tuned further by using the following registry value:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\MSExchangeDSAccess\

Value: LdapBindTimeoutSecs


This value is available with the fix contained in the soon-to-be-published knowledge base article KB 911830 (please check back, the article WILL be there). It is an Exchange 2003 post-SP2 fix. Without this fix, Exchange uses the default LDAP timeout of 2 minutes and attempts 24 retries. With the fix, we use a 30 second timeout (or whatever LdapBindTimeoutSecs is set to) and still attempt 24 retries. If you experience the problem, it's probably best to set the value based on what is normal for a remote LDAP query in your environment. An LDAP query to a local GC generally takes well under 1 second. If a remote query takes 3-5 seconds in your environment, maybe set LdapBindTimeoutSecs to double that. This is just a suggestion. You are the best judge of what works for your particular environment.
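
To put those numbers in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes, purely for illustration, that every one of the 24 retries runs to its full timeout, one after another; the real DSAccess retry logic may well behave differently, so treat the results as an upper bound rather than a measurement. The function names are made up for this example and are not part of Exchange.

```python
RETRIES = 24  # retry count stated in the post (unchanged by the fix)


def worst_case_wait_secs(timeout_secs, retries=RETRIES):
    """Upper bound on the wait if every retry runs to its full timeout.

    Assumption: retries happen back to back, each consuming the whole timeout.
    """
    return timeout_secs * retries


def suggested_timeout_secs(normal_remote_query_secs, factor=2):
    """Rule of thumb from the post: double a normal remote query time."""
    return normal_remote_query_secs * factor


if __name__ == "__main__":
    before = worst_case_wait_secs(120)  # old default: 2 minutes
    after = worst_case_wait_secs(30)    # new default: 30 seconds
    tuned = worst_case_wait_secs(suggested_timeout_secs(5))  # 5 s remote query
    for label, secs in [("before fix", before), ("after fix", after), ("tuned", tuned)]:
        print(f"{label}: {secs} s ({secs / 60:.0f} min)")
```

Under that assumption, the bound drops from 48 minutes to 12 minutes with the new default, and to 4 minutes if a 5 second remote query is doubled to a 10 second timeout.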

I should emphasize that the fix does NOT solve LDAP connectivity issues. It only ensures that you don't suffer a complete Exchange outage because some remote DC that Exchange tried to contact is unresponsive while the local GCs are still available to service the majority of Exchange LDAP requests. If the Exchange server is hung waiting on responses from local GCs, or all local GCs are unavailable and therefore all requests end up going to out-of-site servers, the fix is not particularly helpful. In both of these situations, the first order of business should be figuring out what is wrong with the local GCs and fixing that. For the situation where the fix is useful, you should also address whatever is wrong with the underlying substrate (WAN link or whatever else the case might be).

In the blog feedback Nino requested in November, MKohlman said he would like to see:

"...a post or two regarding a recent KB article concerning a recent issue followed with a history of how the problem was discovered and documented. It would be interesting to follow this process from discovery to resolution or work-around (say, was it discovered internally by MS or did it start as an issue in the field that was researched and resolved either by the admins, MS or both?) I know that I've been very curious on occasion when I've run into an issue that I could not find an answer for via KB or newsgroups, then see a KB turn up a few weeks or months later that matches or closely matches the issue that no one else could initially confirm or duplicate."

MKohlman, we hope this is what you had in mind!

- Jasper Kuria

Comments (11)
  1. Michael Kohlman (MKohlman) says:

    You made my day, guys!  Not only is this exactly the kind of stuff that I thought would be useful to see (not only for myself but for others) but we are in the process of deploying a single Exchange cluster servicing multiple child domains over some pretty questionable (on occasion) WAN links.  This info is very relevant and the history behind it is very interesting indeed.

  2. This is why I like this blog. It proves that the Exchange team listens to its customers. Excellent!

  3. jasperk says:

    And we like our customers too because they read the blog and write comments!

  4. Steven Z says:

    Is there a way to force a local GC?



  5. jasperk says:

    Please read the referenced "Exchange Does Not Always Use Local GCs", Steve. A local GC (local domain) cannot service EVERY query, and a few need to go to DCs in remote domains, which will likely be in a remote site. Of course you could design your AD topology such that there is a DC for every domain in every site to ensure every query is serviced locally, but that would be really wasteful. And ugly!

  6. Steve Z says:

    Is there a way to tell Exchange not to try to use certain GCs that may be behind a firewall? The Exchange server tries to go to the GC behind the firewall even when it has a GC locally.



  7. jasperk says:

    I will say this, Steve: there is no way today to tell Exchange NOT to use a particular GC.

  8. Joe says:

    1. I am interested in the 24 retries. Why did you not allow that to be tuneable since you were already in the code? How did you happen to pick 24? Any significance or is that what the dart hit?

    2. How does the LDAP query queue for dsaccess work? Is it a prio queue? fifo? Preemptive (either forced or cooperative)? When you say retry is it a requeue or does it just sit there spinning until all 24 retries have occurred?


  9. jasperk says:

    Hi Joe,

    The answer to your first question is I don't know :) I don't know how 24 was arrived at, but I don't think the actual number is that important as long as it's not ridiculously high or low. You also can't make everything tuneable or else you run into the kinds of problems described in this post.

    For your second question, no, we don't just sit there spinning until all 24 retries have occurred. It's a much more sophisticated, highly optimized retry mechanism.

  10. Anonymous says:

    Research In Motion and NTP Sign Definitive Settlement Agreement to End Litigation

    RIM Announces Availability…

  11. Anonymous says:

    Here is the updated, corrected post. Exchange Does Not Always Use Local GC(s). Thanks to Dmitri for his…

Comments are closed.
