How DNS Scavenging and the DHCP Lease Duration Relate

Hello everyone, Sean Ivey here from the US PFE – Carolinas team. I’m what we refer to as a platforms-AD PFE. Basically I focus on Active Directory and related networking technologies. Recently, and on three separate occasions, I worked with SCCM administrators having issues deploying the SCCM client. Specifically, they were seeing the error “Failed to get token for current process (5)” in ccm.log. We discovered the problem was related to DNS and DHCP rather than SCCM. As a matter of fact, other services were suffering from the same issue but either didn’t experience symptoms or showed slightly different symptoms. Let’s talk about the problem and discuss how it can be PREVENTED!

The Scenario

Consider the following simplified scenario.

DHCP

  • A DHCP scope has its lease duration set to the default 8 days.
  • The DHCP scope is low on available IP addresses.
  • Client-A has NOT renewed its IP address lease in 8 days, so it has expired.
  • Client-B is requesting a new IP address.
  • The DHCP server assigns Client-B the address that was leased to Client-A.

 So far so good. This is a very typical scenario and everything works as we would expect. Now let’s add DNS into this story.

DNS

  • An Active Directory integrated DNS zone is set to scavenge stale resource records.
  • The defaults are used; “No Refresh = 7 days” and “Refresh = 7 days”.
    • The server defaults are used as well; “Scavenging Period = 7 days”.
  • Client-A renewed its DNS record 8 days ago (the last time its DHCP lease was updated as well).
  • Client-A is the owner of its DNS record so it cannot be deleted by the DHCP server (by default).
  • Client-B registers its DNS record with the new IP address it received from the DHCP server.
    • This IP address is the same one that is registered to Client-A!
  • The DNS server will not be able to scavenge Client-A’s DNS record for another 6 days!

(NOTE: if you’re unsure what all of this “scavenging”, “refresh/no refresh” stuff is check out Josh Jones’ blog, it’s great!)

Uh-oh, not so good. This happens more than you’d think. Now Client-A and Client-B have the same IP address registered in DNS!

 

 

Figure 1

Ugh, now we’ve got two different names associated with the same IP address in DNS. And it will likely stay this way for at least 6 days using the defaults for DNS scavenging outlined above. What kind of problems can we expect to see? Let’s take a look.

The Problem

I mentioned this issue manifesting itself as a problem installing the SCCM client, but in reality we can demonstrate this with a much simpler example; accessing a shared folder. Ultimately the problem is the same.

So now let’s say I need access to a share on Client-A. Let’s use the administrative share as an example. Maybe a deployment requires this share to be accessible.

 

Figure 2

Well that’s interesting. First, Client-A isn’t even turned on…but we get a response. And of all things it’s a logon failure! Some of you may already realize this is what happens when we send a Kerberos ticket intended for one computer to another computer, but let’s quickly walk through this process.

We see that our computer (Infra-App1) does a DNS query for client-a.corp.contoso.com. It then gets a response back saying the IP address is 10.0.0.100.

 

Figure 3

As far as DNS is concerned, this is true. Client-A does have 10.0.0.100 listed as its IP address, but so does Client-B.

Great, so now let’s go get a Kerberos ticket. Our DNS query was for Client-A, so our TGS request will be for Client-A as well. 

 

Figure 4

 

Figure 5

We can see the request in Figure 4, and the domain controller’s response in Figure 5.

Now that we have our ticket, we try to connect to what we think is Client-A. 

 

Figure 6

We see the Kerberos ticket is being presented in this frame.

And finally, we get an error returned from Client-A. Why? Because Client-A isn’t Client-A, it’s Client-B!

 

Figure 7

All of this just reiterates what you might have guessed. For Kerberos to work you have to present the right ticket to the right account.
(NOTE: for more information on Kerberos, go read Rob Greene’s blog, “Kerberos for the Busy Admin”)

Realize this works just fine if the IP address is used instead of the FQDN. Why? Because NTLM authentication will be used instead of Kerberos authentication. With the IP address, we make no assumption about which client we’re connecting to (which is why we have to negotiate NTLM in the first place). We’re simply connecting to an IP address. Some of you might be thinking that it should work when using the FQDN as well. After all, if Kerberos fails we try NTLM right? Not quite, I won’t go into the details here, but it’s only if we fail to negotiate Kerberos that we will fall back to NTLM. You can read more about it here. Either way, in our scenario Kerberos didn’t fail. It returned a valid response.

Prevention

So now that we understand that this issue is related to stale DNS records, let’s discuss how we can prevent the problem from happening in the first place. There are a few different approaches, so let’s talk about each.

NOTE: For each of these I recommend lowering the scavenging interval to 1-3 days. The 7 day default will prolong the period invalid records remain in DNS.

  1. Increase the DHCP lease duration to match the “no-refresh + refresh” interval. In our example we would increase the DHCP lease to 14 days.
    1. Pros:
      1.  DHCP leases will remain until the DNS record is scavenged which means no other client will receive the address and register it in DNS
      2.  It’s easy.
    2. Cons:
      1.  If the DHCP scope is already low on addresses, you’ll likely run out
      2.  A small percentage of records may not be scavenged before the lease expires because of small time differences. Setting the scavenging interval to 1 day will ensure the defunct records are removed the next day.
  2. Decrease the “no-refresh + refresh” interval to match the DHCP lease. In our example we would decrease both “no-refresh” and “refresh” to 4 days.
    1. Pros:
      1.  The existing DNS record will be scavenged sooner affectively achieving the same results as in the first solution
      2.  It’s easy.
    2. Cons:
      1.  Active Directory replication will increase (if these are AD integrated DNS zones). This is because the DNS records will be refreshed by the clients more often (every 4 days instead of every 7)
      2.  A small percentage of records may not be scavenged before the lease expires because of small time differences. Setting the scavenging interval to 1 day will ensure the defunct records are removed the next day
  3.  Allow the server DHCP to register the addresses on behalf of the clients.
    1. Pros:
      1.  The DHCP server will be able to remove the DNS record as soon as the lease expires
      2.  If setup correctly no duplicate records should exist.
    2.  Cons:
      1.  The setup is more involved.
      2.  A service account will need to be setup to run the DHCP service, or all the DHCP servers will need to be joined to the DNSUpdateProxy group (less secure) adding complexity.

For steps on doing this, read this KB article (around the Use the DnsUpdateProxy security group section).

Experiment with the DHCP lease duration, and “no-refresh/refresh” intervals. You may find a need to depart completely from the defaults. Low DHCP lease durations (in the hours) are sometimes used for wireless subnets. Be mindful of the performance of your servers though, especially if you have a DNS server set to scavenge every few hours on very large DNS zones.

Identifying Records with Duplicate IPs

Almost there! Now, we understand the problem, when the problem happens, and how to prevent it. But how can we easily identify these duplicate records? You could search through DNS easily enough, but why not use PowerShell? 


#Import the Active Directory Module
import-module activedirectory

#Define an empty array to store computers with duplicate IP address registrations in DNS
$duplicate_comp = @()

#Get all computers in the current Active Directory domain along with the IPv4 address
#The IPv4 address is not a property on the computer account so a DNS lookup is performed
#The list of computers is sorted based on IPv4 address and assigned to the variable $comp
$comp = get-adcomputer -filter * -properties ipv4address | sort-object -property ipv4address

#For each computer object returned, assign just a sorted list of all
#of the IPv4 addresses for each computer to $sorted_ipv4
$sorted_ipv4 = $comp | foreach {$_.ipv4address} | sort-object

#For each computer object returned, assign just a sorted, unique list
#of all of the IPv4 addresses for each computer to $unique_ipv4
$unique_ipv4 = $comp | foreach {$_.ipv4address} | sort-object | get-unique

#compare $unique_ipv4 to $sorted_ipv4 and assign just the additional
#IPv4 addresses in $sorted_ipv4 to $duplicate_ipv4
$duplicate_ipv4 = Compare-object -referenceobject $unique_ipv4 -differenceobject $sorted_ipv4 | foreach {$_.inputobject}

#For each instance in $duplicate_ipv4 and for each instance
#in $comp, compare $duplicate_ipv4 to $comp If they are equal, assign
#the computer object to array $duplicate_comp
foreach ($duplicate_inst in $duplicate_ipv4)
{
    foreach ($comp_inst in $comp)
    {
        if (!($duplicate_inst.compareto($comp_inst.ipv4address)))
        {
            $duplicate_comp = $duplicate_comp + $comp_inst
        }
    }
}

#Pipe all of the duplicate computers to a formatted table
$duplicate_comp | ft name,ipv4address -a


Here’s a sample of the output:

 

Figure 8

This is a pretty straightforward PowerShell script. Consider it a sample. This will only return duplicate IP addresses registered to actual computer accounts in Active Directory. Keep in mind it will query every computer in an Active Directory domain and then it will do a DNS query to get the IP address. If you have many computers, use the -searchbase switch with get-adcomputer to limit the number of computers returned each time. If the computer is not joined to AD it will never be returned from get-adcomputer. This is really aimed at finding records in DNS that contain duplicate IP addresses because of the scenario listed above.

Summary

There are a number of articles and blogs that discuss this issue in some shape or form. My goal was to tie all of these separate pieces together to make the big picture a little clearer. To recap:

  • The Scenario
    • Default “no-refresh/refresh” interval coupled with the default DHCP lease = stale DNS records.
  • The Symptoms
    • SCCM “Failed to get token for current process (5)”
    • File Shares “Logon Failure: The target account name is incorrect”
    • Many, many others (potentially anything using Kerberos)
  • The Problem
    • Kerberos authentication requires the ticket be specific to a computer. Stale DNS records mean we could be sending the Kerberos ticket to the wrong computer.
  • The Fix
    • A combination of changing the “no-refresh/refresh” intervals and the DHCP lease period.
    • Configuring DHCP to register records for the clients.
    • Identifying and removing duplicate records (either waiting for scavenging or using the provided script).

I hope you find this information useful! Tuning these DHCP and DNS settings will leave your environment in a much healthier state!

-Sean Ivey