Dynamic DNS registration process can cause queue build-up and failures

Hey Folks! Ajay Sarkaria here with new information on changes to dynamic DNS registration. In the past, many customers have called Microsoft support with scenarios where dynamic DNS registration from a Microsoft DHCP server either fails or is delayed. As a result, printers or other critical devices are either not registered in DNS or are registered only after a delay.

This issue has been difficult to diagnose because the existing logs did not provide adequate visibility into the reasons for the failures.

Scenario:

A dynamic DNS update usually includes an attempt to update the PTR record, in conjunction with the host (A) record update. To send a PTR update, an SOA query is first made for the host's reverse record to find out which server is authoritative to accept the update. If the DNS server does not have a reverse lookup zone, the error sent back to the DHCP server is interpreted as a general update failure. The update request is then re-queued at the DHCP server, which eventually causes a queue build-up.
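To make the reverse record concrete, here is a minimal Python sketch that derives the PTR name for an address and lists the enclosing reverse lookup zones that could hold it. It only uses the standard `ipaddress` module; the function names `ptr_name` and `candidate_reverse_zones` are my own, and the sketch does not query any DNS server.

```python
import ipaddress


def ptr_name(address: str) -> str:
    """Return the reverse-lookup (PTR) owner name for an IPv4 or IPv6 address."""
    return ipaddress.ip_address(address).reverse_pointer


def candidate_reverse_zones(address: str) -> list:
    """List the enclosing reverse zones, most specific first.

    The bare "in-addr.arpa" / "ip6.arpa" suffix is left out on purpose.
    """
    labels = ptr_name(address).split(".")
    return [".".join(labels[i:]) for i in range(1, len(labels) - 2)]


print(ptr_name("192.168.1.10"))
print(candidate_reverse_zones("192.168.1.10"))
```

If none of the listed zones exists on the DNS server, the SOA query for the PTR name has nothing authoritative to land on, which is the situation described above.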

Symptoms:
  • After a renewal from a Microsoft DHCP server, devices fail to update their records in DNS

For context, the following is how DHCP server-DNS client interaction works:

Whenever a device obtains a new IP address lease from the DHCP server, the DHCP server sends a dynamic DNS update request to the DNS server. It does so by calling a DNS client API, which puts the request in the DNS client's dynamic DNS update processing queue. When the DHCP server service starts, it sets the length of this queue in the DNS client. By default, the length is set to 2048. It can be overridden by a DHCP server registry key.

The DHCP server also registers a callback function with the DNS client, to be called once the dynamic DNS update completes, whether successfully or not. The DNS client makes 3 attempts to register each DNS record in the queue, with an interval of 5 minutes between attempts. When the registration succeeds or all attempts have failed, the DNS client de-queues the request and calls the DHCP server's callback function to indicate success or failure. If the callback indicates that the registration failed, the DHCP server records in its lease database that the DNS update status for the lease is “Pending”.

If the dynamic DNS registration is successful, it’s marked as “Complete”. A background process (the scavenger) in the DHCP server wakes up every hour, goes through the lease database, and calls the DNS client API to register leases whose DNS update status is “Pending”. These attempts go to the same DNS client queue mentioned above. If the queue is full, the scavenger does not call the DNS registration API for that lease and moves on to the next unregistered IP address in the lease database.
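The flow above can be sketched as a small toy model. This is not the real DNS client API: `Lease`, `DnsClientQueueModel`, `on_update_done`, and `scavenge` are hypothetical names, the 5-minute waits between attempts are skipped, and a registration is assumed to fail on every attempt whenever its zone is missing.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Lease:
    ip: str
    zone_exists: bool            # assumption: the only cause of failure we model
    dns_status: str = "Pending"


class DnsClientQueueModel:
    """Toy stand-in for the DNS client's dynamic update queue."""

    def __init__(self, max_length=2048, attempts=3):
        self.max_length = max_length
        self.attempts = attempts
        self.queue = deque()

    def submit(self, lease, callback):
        """Queue an update; return False if the queue is already full."""
        if len(self.queue) >= self.max_length:
            return False
        self.queue.append((lease, callback))
        return True

    def drain(self):
        """Process every queued request and report the outcome via its callback."""
        while self.queue:
            lease, callback = self.queue.popleft()
            # Every attempt fails while the zone is missing.
            succeeded = any(lease.zone_exists for _ in range(self.attempts))
            callback(lease, succeeded)


def on_update_done(lease, succeeded):
    """Callback: record the DNS update status in the (toy) lease database."""
    lease.dns_status = "Complete" if succeeded else "Pending"


def scavenge(leases, dns_client):
    """Hourly pass: re-submit Pending leases, skipping any that do not fit."""
    skipped = 0
    for lease in leases:
        if lease.dns_status == "Pending":
            if not dns_client.submit(lease, on_update_done):
                skipped += 1
    return skipped


leases = [Lease("10.0.0.1", True), Lease("10.0.0.2", False), Lease("10.0.0.3", True)]
client = DnsClientQueueModel(max_length=2)   # tiny queue to show the skip
skipped = scavenge(leases, client)
client.drain()
statuses = [lease.dns_status for lease in leases]
print(skipped, statuses)
```

In this run the third lease would have registered fine, but it is skipped because a failing registration is holding one of the two queue slots. That is the same effect, at a much smaller scale, that a 2048-entry queue sees when many registrations fail.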

What we have observed is that a missing configuration often causes dynamic DNS registration to fail. The most common case is a missing reverse lookup zone. Given the implementation above, when the reverse lookup zone is not present, the queue inside the DNS client builds up and causes long delays in the registration of other DHCP clients that would otherwise have gone through.

To alleviate this problem, the following changes were made:

On Windows Server 2016:

  1. Previously, the DHCP server logs did not tell an administrator why DNS registrations were failing. New events have been added to the DHCP server on Windows Server 2016 that make it easy to identify when DNS registration is failing because of a missing reverse lookup zone. You can then resolve the issue by adding that zone on the DNS server.

    New Events in Windows Server 2016:

Event Category | Event Text
-------------- | ----------
DHCPv4.ForwardRecordDNSFailure | Forward record registration for IPv4 address %1 and FQDN %2 failed with error %3. This is likely to be because the forward lookup zone for this record does not exist on the DNS server.
DHCPv4.ForwardRecordDNSTimeout | Forward record registration for IPv4 address %1 and FQDN %2 failed with error %3.
DHCPv4.PTRRecordDNSFailure | PTR record registration for IPv4 address %1 and FQDN %2 failed with error %3. This is likely to be because the reverse lookup zone for this record does not exist on the DNS server.
DHCPv4.PTRRecordDNSTimeout | PTR record registration for IPv4 address %1 and FQDN %2 failed with error %3.
DHCPv6.ForwardRecordDNSFailure | Forward record registration for IPv6 address %1 and FQDN %2 failed with error %3. This is likely to be because the forward lookup zone for this record does not exist on the DNS server.
DHCPv6.ForwardRecordDNSTimeout | Forward record registration for IPv6 address %1 and FQDN %2 failed with error %3.
DHCPv6.PTRRecordDNSFailure | PTR record registration for IPv6 address %1 and FQDN %2 failed with error %3. This is likely to be because the reverse lookup zone for this record does not exist on the DNS server.
DHCPv6.PTRRecordDNSTimeout | PTR record registration for IPv6 address %1 and FQDN %2 failed with error %3.
DHCPv4.ForwardRecordDNSError | Forward record registration for IPv4 address %1 and FQDN %2 failed with error %3 (%4).
DHCPv4.PTRRecordDNSError | PTR record registration for IPv4 address %1 and FQDN %2 failed with error %3 (%4).
DHCPv6.ForwardRecordDNSError | Forward record registration for IPv6 address %1 and FQDN %2 failed with error %3 (%4).
DHCPv6.PTRRecordDNSError | PTR record registration for IPv6 address %1 and FQDN %2 failed with error %3 (%4).
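If you collect these events as text, the %1, %2, %3 placeholders make it fairly easy to pull the address, FQDN, and error back out. Below is a rough Python sketch that turns an event template into a regular expression. The helper names (`template_to_regex`, `parse_event`) and the field names are my own, and the sample message is made up by filling the template with placeholder values; it is not output from a real server.

```python
import re

# Hypothetical field names for the placeholders, based on the event text.
FIELD_NAMES = {"1": "address", "2": "fqdn", "3": "error"}

PTR_FAILURE_TEMPLATE = (
    "PTR record registration for IPv4 address %1 and FQDN %2 failed with "
    "error %3. This is likely to be because the reverse lookup zone for "
    "this record does not exist on the DNS server."
)


def template_to_regex(template):
    """Turn a template containing %1, %2, ... into a regex with named groups."""
    parts = re.split(r"%(\d+)", template)   # literals at even indexes, numbers at odd
    pattern = ""
    for index, part in enumerate(parts):
        if index % 2 == 0:
            pattern += re.escape(part)
        else:
            name = FIELD_NAMES.get(part, "p" + part)
            pattern += "(?P<" + name + r">\S+?)"
    return re.compile(pattern)


def parse_event(template, message):
    """Return the placeholder values as a dict, or None if the text does not match."""
    match = template_to_regex(template).fullmatch(message)
    return match.groupdict() if match else None


# Made-up sample: placeholder values substituted into the template.
sample = (PTR_FAILURE_TEMPLATE
          .replace("%1", "192.0.2.10")
          .replace("%2", "printer1.example.com")
          .replace("%3", "1234"))
fields = parse_event(PTR_FAILURE_TEMPLATE, sample)
print(fields)
```

Grouping the parsed addresses by subnet is then a quick way to see which reverse lookup zone is missing.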

  2. The second change is related to retries in the DNS client. The previous implementation made 3 attempts at a failed registration, with an interval of 5 minutes between attempts. When a DNS zone is missing, many registrations fail, and each one stays in the DNS client queue for as long as 10 minutes. This causes the queue length limit to be hit, so other valid registrations are either not done or are significantly delayed. In other words, the impact of a missing configuration, such as a missing reverse lookup zone, becomes quite severe.

So, a change was made in the DNS client to not retry failed registrations. Failed registrations are still retried by the scavenger in the DHCP server, as mentioned above. This ensures that when a zone is not present, the failed registration does not stay in the queue for a long time, and the queue build-up is not seen. This feature is enabled by default in Windows Server 2016.
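A rough back-of-the-envelope check shows why dropping the retries matters. The sketch below assumes a steady stream of failing registrations and treats each attempt itself as instantaneous, so a request only occupies the queue during the waits between attempts. The function names are my own, and the result is an estimate, not a measurement.

```python
def residence_minutes(attempts, retry_interval_min=5.0):
    """Minutes an always-failing request stays queued (waits fall between attempts)."""
    return max(attempts - 1, 0) * retry_interval_min


def steady_state_backlog(failures_per_min, attempts, retry_interval_min=5.0):
    """Little's law: average items in queue = arrival rate x time spent in queue."""
    return failures_per_min * residence_minutes(attempts, retry_interval_min)


def rate_that_fills_queue(queue_length=2048, attempts=3, retry_interval_min=5.0):
    """Failing registrations per minute needed to keep the queue full."""
    stay = residence_minutes(attempts, retry_interval_min)
    return float("inf") if stay == 0 else queue_length / stay


print(residence_minutes(3))               # 3 attempts, 5 minutes apart
print(rate_that_fills_queue(2048, 3))     # default queue length, with retries
print(rate_that_fills_queue(2048, 1))     # single attempt, no retries
```

Under these assumptions, three attempts keep a failing request queued for 10 minutes, so roughly 205 failing registrations per minute are enough to keep a 2048-entry queue full. With a single attempt, failed requests leave the queue right away and no sustained backlog forms from retries alone.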

On Windows Server 2012 R2:

  • The change in Windows Server 2012 R2 covers only the DNS client retries described in item 2 above for Windows Server 2016. No additional events have been added.

How to get this change on Windows Server 2012 R2?

The change (not enabled by default) is part of the following Monthly Quality Rollup for Windows Server 2012 R2:

November 2016 Preview of Monthly Quality Rollup for Windows 8.1 and Windows Server 2012 R2

Important: Installing the Monthly Quality Rollup for Windows Server 2012 R2 (KB3197875) does not change the behavior by itself. The behavior is controlled by a registry setting, which must be configured as follows:

Disclaimer: It is always a good idea to take a backup of the registry key before making any changes so please do that!

  1. Create a new value called “DnsRegistrationMaxRetries” of type DWORD under
    HKLM\System\CurrentControlSet\Services\DhcpServer\Parameters
  2. Set the value of “DnsRegistrationMaxRetries” to 0
  3. Reboot the DHCP Server

Note: If you already have KB3197875 installed, “DnsRegistrationMaxRetries” may already exist with a value of 3. In that case, change the value to 0 and reboot the DHCP Server.
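If you prefer to script the registry steps, here is a minimal Python sketch that builds the equivalent `reg export` (backup) and `reg add` commands. The backup file name and the helper names are placeholders of my own, and the sketch only prints the commands. `run_commands` is provided for running them from an elevated prompt on the DHCP server, but it is not called here. You still need to reboot afterwards.

```python
import subprocess

KEY = r"HKLM\System\CurrentControlSet\Services\DhcpServer\Parameters"
VALUE_NAME = "DnsRegistrationMaxRetries"


def backup_command(backup_file):
    """Command that exports the Parameters key to a .reg file (/y overwrites)."""
    return ["reg", "export", KEY, backup_file, "/y"]


def set_retries_command(retries=0):
    """Command that creates or updates the DWORD value (/f skips the prompt)."""
    return ["reg", "add", KEY, "/v", VALUE_NAME,
            "/t", "REG_DWORD", "/d", str(retries), "/f"]


def run_commands(commands):
    """Run each command in order, stopping at the first failure (Windows only)."""
    for command in commands:
        subprocess.run(command, check=True)


# Hypothetical backup file name; pick any path that suits you.
commands = [backup_command("dhcp-parameters-backup.reg"), set_retries_command(0)]
for command in commands:
    print(subprocess.list2cmdline(command))
```

Printing first and running second is deliberate: it lets you review the exact commands before anything touches the registry.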

Until next time!

Ajay Sarkaria
Microsoft