How to detect applications using "hardcoded" DC name or IP?


You look at Windows Server 2012 R2 and you tell yourself: "it would be nice if I could leverage all those new features". Then you remember…

  • Adding new domain controllers is usually not a problem. Besides, if you want to add your new DCs in a smooth way, without impacting the existing environment, you can follow this excellent post which, despite its age, is still valid for Windows Server 2012 R2: Minimizing Risk During AD Upgrades.
  • Removing the old ones is what you are worried about. "What if I have applications explicitly using one specific domain controller's name or IP?" Well, unless you reuse the same name and the same IP address for your new domain controller, it might break things. And breaking things isn't fun…

How can we do it without breaking things?

First, it is important that all applications consuming Active Directory data (for authentication as well as for data storage) are configured in a way that they are not bound to a specific DC. Being proactive means two things:

  1. Communicate with and educate the application owners about the magic the NetLogon service does. If possible, draw up a list of all business critical apps, sit down with the teams in charge of administering them and try to determine how their apps discover domain controllers.
  2. When acquiring new software, ask the vendors whether their applications discover a domain controller through the Windows API or require a hardcoded configuration. And be careful! Specifying the FQDN of the domain might bring some flexibility, but it does not necessarily imply that the application uses the Windows API to discover domain controllers. We'll discuss this later in this article.

Second, we can try to detect which applications are using this kind of hardcoded configuration. This is a tough one. You cannot just look at the logs of the domain controllers, because the decision to use a specific DC is made on the client side. Enabling LDAP logging will basically just list all your active clients, with no way to tell whether a request comes from a hardcoded app or from a regular Windows client. When replacing a DC with a new one that has a new name, you might be tempted to create a DNS alias pointing to the new DC. It might do the trick for the application, but it is in fact just punting: you will have to maintain the DNS record, and some functionality such as LDAPS or Kerberos could break with this DNS spoofing workaround, since the alias the client asks for typically matches neither the certificate nor the service principal names of the new DC. It looks like a goner…

The ldap://contoso.com illusion

This is actually also true for \\contoso.com, but that is less relevant for the purpose of this post. When we use the FQDN of the domain in an application's connection string, we could assume that we rely only on the DNS resolution of contoso.com, and therefore perform a simple A lookup of the domain name (resulting in a round robin across all the DCs that register the LdapIpAddress record). That is not the case. Well, not always. On a Windows client, when running [ADSI]"LDAP://contoso.com/DC=contoso,DC=com" in a PowerShell console for example, the ADSI component, like other Windows LDAP clients, leverages the DsGetDcName function to get the closest domain controller. It will not use the (same as parent folder) record that you see in your DNS console.

Give it a try:

  1. Empty your resolver's cache: ipconfig /flushdns
  2. Start a network capture
  3. Run the following command from a newly opened PowerShell console: [ADSI]"LDAP://contoso.com/DC=contoso,DC=com"

What are you seeing? That you are using the classic DCLocator discovery mechanism described here: http://technet.microsoft.com/en-us/library/cc759550(v=ws.10).aspx (section Domain Controller Locator). So it leverages SRV records and not the (same as parent folder) thingy you would expect.
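To make the difference concrete, here is a minimal sketch of the SRV names you should see queried in the capture. The function name Get-DcLocatorSrvName and the site name used in the usage note are hypothetical, for illustration only; the record names themselves follow the standard _msdcs layout.

```powershell
# Hypothetical helper: returns the SRV owner names used to locate a DC
# for LDAP, site-specific name first when a site is supplied.
function Get-DcLocatorSrvName {
    param(
        [Parameter(Mandatory = $true)][string] $DomainFqdn,
        [string] $SiteName
    )
    if ($SiteName) {
        # Record registered by the DCs covering that site
        '_ldap._tcp.{0}._sites.dc._msdcs.{1}' -f $SiteName, $DomainFqdn
    }
    # Domain-wide record registered by every DC of the domain
    '_ldap._tcp.dc._msdcs.{0}' -f $DomainFqdn
}
```

On a domain-joined machine you could pipe these names to Resolve-DnsName with -Type SRV, compare the answers with Resolve-DnsName -Name contoso.com -Type A (the (same as parent folder) records), and run nltest /dsgetdc:contoso.com to see which DC the locator actually picked.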

"So what? Even if I were using the (same as parent folder) record, in the end I find and use a domain controller." True, and sometimes netmask ordering is generous and gives you a pretty close target (see http://support.microsoft.com/kb/842197 for more info; the behavior is pretty much the same in more recent versions of the OS). The problem is that you might think that, because AD is highly resilient, an application pointing at the FQDN of the domain inherits that resiliency. This is not always true. If the application leverages the Windows API to find a domain controller, it is fine: if one domain controller goes down, it will find another one (there might be some timeouts depending on the app, but they should be manageable). However, if the application relies on the DNS round robin of the domain's FQDN and the DC it is currently pointing at goes down, the app is likely broken because of the DNS cache. I will write another post about it. For now, I just want to raise awareness of the problem so that you can check your apps.

Hope

Well, it's not really hope, it's more about the method. You know that Windows clients leverage the DCLocator process that we just talked about. It means they use specific DNS records to locate domain controllers. Those SRV records are registered by the NetLogon service of each domain controller (the NetLogon service also registers a few A records, such as the (same as parent folder) record and some GC related ones). Without those records, the DCs cannot be located and therefore cannot be used. And THAT is the trick. Here is one method:

  1. Add new domain controllers in your environment (same OS version or new OS version if you are confident about application compatibility).
  2. Mask the old domain controllers in DNS, that is, remove everything registered by the NetLogon service (well, not everything: the GUID records are used for replication, so we must keep those).
  3. Wait until the clients' caches expire and TADA! Every LDAP query and every authentication request you still see reaching the masked DCs comes from applications and servers that do not leverage the DCLocator and possibly have a hardcoded configuration. Because the hidden domain controllers are still running and replicating, the hardcoded applications can keep using them in the meantime.
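Before trusting the logs, it helps to confirm that the masked DCs have really disappeared from the locator records. The sketch below assumes you feed it SRV records such as those returned by Resolve-DnsName (objects with a NameTarget property); the function name Find-StillRegisteredDc and the DC names are hypothetical.

```powershell
# Hypothetical helper: lists the DCs that should be hidden but still
# appear as targets in a set of SRV records.
function Find-StillRegisteredDc {
    param(
        [Parameter(Mandatory = $true)][AllowEmptyCollection()][object[]] $SrvRecord,
        [Parameter(Mandatory = $true)][string[]] $HiddenDc
    )
    $SrvRecord |
        # Skip records without a target (for example additional A records),
        # then keep targets matching a hidden DC (-contains ignores case)
        Where-Object { $_.NameTarget -and ($HiddenDc -contains $_.NameTarget.TrimEnd('.')) } |
        Select-Object -ExpandProperty NameTarget -Unique
}
```

For example, store the result of Resolve-DnsName -Name _ldap._tcp.dc._msdcs.contoso.com -Type SRV in a variable and pass it as -SrvRecord together with -HiddenDc dc01.contoso.com. An empty result means that record set no longer advertises the hidden DC; repeat for the other record names (site-specific, Kerberos, GC) you care about.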

Step by step, ooh baby

Of course, as usual, make sure you understand everything in this article and that you have a valid backup and test that in your lab first! If you are not feeling it, ask for assistance or even better: ask for a PFE!

  1. Adding new domain controllers

    I don't think I need to describe that one. You just launch a bunch of dcpromo, ideally adding one new domain controller for each domain controller you are planning to hide ("dcpromo" here is just a nostalgic, 2000s way of saying: deploy and configure Active Directory Domain Services from your Server Manager console). If you do need assistance to add new domain controllers, you had better stop here and ask for external assistance.

  2. Mask the old domain controllers

    Be careful! If you hide all or almost all of your domain controllers by mistake, you might cause a serious outage! One way to do it is to create a group policy and link it to the Domain Controllers OU.

    1. Create an empty group policy and disable the user configuration settings.

    2. Remove Authenticated Users from the security filtering. Instead, we will manually add the computer accounts of the domain controllers we need to hide. "Why not create a group and add those DC accounts to that group?" Because a group membership change requires restarting the domain controller we want to hide. Since every domain controller is potentially hardcoded somewhere, you want to avoid any sort of service disruption. You don't have to go all in. It is even better to go slowly, starting with one or a few domain controllers. It will take time to parse the logs anyway. Why not start with those domain controllers whose name or IP (or both) you have been carrying over for years?

    3. Edit the group policy and find the following setting: Computer Configuration > Policies > Administrative Templates > System > Net Logon > DC Locator DNS Records > Specify DC Locator DNS records not registered by the DCs. Enable this setting and, in the field, type the mnemonics of all the records that you don't want to see in DNS (these keywords are explained here: http://support.microsoft.com/kb/306602). So type the following (the separator is a space character): LdapIpAddress Ldap LdapAtSite Pdc Gc GcAtSite GcIpAddress Kdc KdcAtSite Dc DcAtSite Rfc1510Kdc Rfc1510KdcAtSite GenericGc GenericGcAtSite Rfc1510UdpKdc Rfc1510Kpwd Rfc1510UdpKpwd. Do not add DsaCname to the list: this record is used for replication.

  3. Wait

    You have several things to wait for.

    1. If you have a multi site environment, you have to wait for a replication convergence. The group policy needs to be replicated on the affected DC to be effective.

    2. The group policy refresh interval is 5 minutes on domain controllers (unless it has been changed in your domain). So you have to wait up to 5 minutes for the setting to be applied.

    3. Then the NetLogon service refreshes its records every 30 minutes, so you have to add 30 minutes. You could also restart the service, but since you will have to wait for the clients anyway, let's not be too intrusive.

    4. Then, for the clients using the FQDN of the domain and not leveraging the DCLocator, you have to wait until the TTL of the records expires. By default it is 10 minutes.

    5. Then you have to wait until all Windows clients pick other domain controllers. By default, since Windows Vista, clients rediscover a domain controller every 12 hours. So you have to wait 12 hours. You still have Windows XP or Windows Server 2003? This is a tricky one. If you have deployed KB 939252, you wait 12 hours. If you haven't deployed it… Well, the XP/2003 client will not refresh its domain controller selection unless the currently selected domain controller becomes unreachable or you restart the machine (actually, just restarting the NetLogon service is enough). Your machines will restart at some point because of updates and software management, so you can also wait until the next cycle.
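As a rough worked example, the default delays listed above can be added up to get a worst-case waiting time. This sketch assumes the delays simply stack one after the other and leaves out replication convergence, which depends on your topology; the labels are only descriptive.

```powershell
# Default delays quoted in the steps above (replication convergence excluded)
$waits = [ordered]@{
    'Group policy refresh on the DC'          = [TimeSpan]::FromMinutes(5)
    'NetLogon record refresh'                 = [TimeSpan]::FromMinutes(30)
    'DNS record TTL'                          = [TimeSpan]::FromMinutes(10)
    'Client DC rediscovery (Vista and later)' = [TimeSpan]::FromHours(12)
}

# Sum the delays, assuming each one only starts when the previous one ends
$total = [TimeSpan]::Zero
foreach ($delay in $waits.Values) { $total = $total.Add($delay) }

$total.ToString()   # 12:45:00
```

In other words, with default settings, plan on a bit less than 13 hours before starting to read the logs, and longer if XP/2003 clients without KB 939252 are still around.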

  4. Enable and collect the logs

    Let's focus on the LDAP logging. You need the time and the source IP of each call. Under the registry key HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\NTDS\Diagnostics, set the value named "16 LDAP Interface Events" to 5. Note that you could also do this in the group policy we created, using the Group Policy Preferences feature to modify this registry value. From now on, every LDAP call is logged in the Directory Service event log. We will look at event IDs 1138 and 1139.
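    If you prefer to set the value by hand on one hidden DC, a minimal sketch could look like this. It assumes an elevated PowerShell console running locally on the domain controller; try it in your lab first.

```powershell
# Registry key holding the NTDS diagnostic logging levels
$key = 'HKLM:\SYSTEM\CurrentControlSet\Services\NTDS\Diagnostics'

# Level 5 is the most verbose level for LDAP interface events
Set-ItemProperty -Path $key -Name '16 LDAP Interface Events' -Value 5 -Type DWord

# Read the value back to confirm the change
Get-ItemProperty -Path $key -Name '16 LDAP Interface Events'

# When the investigation is over, put the value back to 0 (no diagnostic logging):
# Set-ItemProperty -Path $key -Name '16 LDAP Interface Events' -Value 0 -Type DWord
```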

    Note that each event also gives you the security context of the caller (for example, a SID ending in -500 is the builtin Administrator) and the type of LDAP operation, such as ldap_search, ldap_bind… Do not rely on the way the GUI parses the event; it sometimes gets things wrong.

    This can generate a lot of logs. So you might consider raising the maximum size of the Directory Service event log to something way larger (as long as you stay under the recommended limits, which are really a problem only for Windows Server 2003 domain controllers: http://support.microsoft.com/kb/957662). Or you can remotely collect the logs by script at short intervals. I can share some code for that if you think it might be useful.

    You can also open perfmon and look at how many authentications per second and LDAP queries per second you still have, to get an idea of the number of requests still arriving on these domain controllers. Once you think you have addressed all the servers and apps you were seeing in the logs, you can let perfmon run for a little while. Other domain controllers are still using the hidden domain controllers, so you will still see some authentication activity (replication, monitoring apps and other internal calls).

Here is a simple PowerShell script you can use to list all the IPs:

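Get-WinEvent -ComputerName dc01.contoso.com -MaxEvents 1000 -FilterHashtable @{ LogName = "Directory Service"; ID = 1139 } | ForEach-Object {
       # Positions used for event 1139: 0 = LDAP operation, 2 = caller, 3 = client IP and port
       $info = @{
           "Operation" = [string] $_.Properties[0].Value
           "User"      = [string] $_.Properties[2].Value
           "IP:Port"   = [string] $_.Properties[3].Value
       }
       # Emit one object per event so the output can be sorted, grouped or exported
       New-Object psobject -Property $info
}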
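Since every connection uses a different source port, the raw list is long. Here is a minimal sketch that collapses it into one line per client IP. It assumes objects with an "IP:Port" property like the ones produced by the script above, and that the port is whatever follows the last colon; the function name Get-ClientIpSummary is hypothetical.

```powershell
# Hypothetical helper: counts logged LDAP calls per client IP address.
function Get-ClientIpSummary {
    param(
        [Parameter(Mandatory = $true)][object[]] $LdapEvent
    )
    $LdapEvent |
        ForEach-Object {
            $endpoint = [string] $_.'IP:Port'
            $colon = $endpoint.LastIndexOf(':')
            # Drop the port when there is one, otherwise keep the value as-is
            if ($colon -gt 0) { $endpoint.Substring(0, $colon) } else { $endpoint }
        } |
        Group-Object |
        Sort-Object -Property Count -Descending |
        Select-Object -Property @{ Name = 'ClientIP'; Expression = { $_.Name } }, Count
}
```

Store the output of the script above in a variable, pass it as -LdapEvent, and you get the busiest clients first, which is usually where the hardcoded applications are hiding.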

What about NetBIOS name resolution?

Really? You also have WINS in your environment? And you are scared that your app is not only hardcoded to use the NetBIOS name of the domain but also relies on NetBIOS name resolution to discover a DC? I am sure you only wish to get rid of WINS! I will discuss this in another article. In the meantime, just make sure that the 1C records do not list the domain controllers you want to hide. Ping me if you want more details.

Comments (15)

  1. Anonymous says:

    Great! thank you

  2. nick says:

    Excellent article! Thank you for the very informative post!

  3. Byron says:

    Very good post, very informative. Wish I could see this earlier. Thanks

  4. JPK91120 says:

    Thanks for sharing !

  5. Karl says:

    Very thorough and a great read. Thanks!

  6. Dave says:

    Good article, thank you!
    A much better solution than "shut it down and see who shouts"…

  7. John says:

    This is great, however, none of my 1139 events are capturing the IP address or port. That data is just not there.

    Any ideas why that would be?

    Thank you.

  8. Unfortunately the event contains that data only starting with Windows Server 2012.

    Alternatively, you can use a network capture to detect the IP addresses sending a SYN to TCP port 389.
    You can use netsh (e.g. netsh trace start capture=yes PacketTruncateBytes=100 provider=Microsoft-Windows-TCPIP tracefile=c:\tcpiptrace.etl) and parse the file with Netmon or using the technique I describe here:

    http://blogs.technet.com/b/pie/archive/2014/03/09/track-ldaps-clients-on-a-domain-controller.aspx.
    I know that it can seem a bit overwhelming but because the DC is isolated, it will not be that much data to go through.

    Have a look and tell me if you need assistance to set that up.

  9. Craig says:

    This is a great post. I’ve used it to eliminate almost all the hits on our one last remaining W2K3 Domain Controller, but ‘almost’ translates to a hundred or so machines (both servers and workstations) that may be using WINS to find the DC.
    Did you already post on how to hide a DC from WINS? I’m searching diligently but haven’t found anything yet.

  10. Narayanan says:

    Any idea how to log and see LDAPS events…i know LDAP sources are easy to get…but wondering how to track Secure LDAP sources to the domain controller

  11. Jamil says:

    This article’s awesome. One thing I’m not clear on though:
    “Mask the old domain controllers in the DNS, it means remove everything registered by the NetLogon service”

    How do the existing SRV records previously registered in DNS by the Netlogon service on the DC get removed? Does the group policy setting tell the Netlogon service to now remove (or unregister) those SRV records, or are we assuming DNS scavenging is enabled in the environment?

    Thanks.

    1. As long as dynamic update is supported by the DNS server and configured on the zone, the SRV records will be unregistered by the Netlogon service of the hidden DCs. So no scavenging involved.

  12. Phil says:

    Hi, great article. I have implemented on a 2003 DC. When I run the powershell script to list IP’s it returns an error.

    New-Object : Cannot validate argument on parameter ‘Property’. The argument is null or empty. Supply an argument that is not null or empty and then try the command again.
    At C:\scripts\HideDC.ps1:8 char:37
    + New-Object psobject -Property <<<< $_info
    + CategoryInfo : InvalidData: (:) [New-Object], ParameterBindingValidationException
    + FullyQualifiedErrorId : ParameterArgumentValidationError,Microsoft.PowerShell.Commands.NewObjectCommand

    1. The IP address will be there only if the server you are hiding runs Windows Server 2012 or later. I know it is not ideal, but operating systems before 2012 do not show the IP address in the log. So you'll have to go the harder way, using network traces. I wrote an article about capturing traffic on a DC while minimizing the performance impact. You could write a similar filter that just captures TCP SYN packets for LDAP connections: https://blogs.technet.microsoft.com/pie/2014/03/09/track-down-ldaps-clients-on-a-domain-controller/ (that one looks for LDAPS, but you can adapt it to catch LDAP).