DHCP Failover, Dynamic DNS updates and a Perfect Storm!

This blog article is authored by Joel Christiansen, Senior Support Escalation Engineer, Microsoft

In a recent support case, a perfect storm of circumstances came together and resulted in some unexpected behavior. These kinds of perfect storms make for good blogging material.

The unexpected behavior was that a DHCP failover node that did not own the lease for a client deleted the DNS record for that client. Here is how it all happened.

Firstly, the DHCP scope was configured in this manner:

The DHCP scope is configured to register records on behalf of clients: 

  • Always dynamically update DNS A and PTR records
  • Dynamically update DNS A and PTR records for DHCP clients that do not request updates (for example, clients running Windows NT 4.0)

The DHCP scope is configured to delete records of clients:

  • Discard A and PTR records when lease is deleted

Here is a screenshot of a scope with this configuration:

 

Secondly, the DHCP scope had to be enabled for DHCP failover. In this case, the DHCP scope was enabled for DHCP failover in load balance mode. The servers were Windows Server 2012 R2, but the same behavior would likely be seen on Windows Server 2012 RTM.

Thirdly, BOOTP has to be disabled on the DHCP failover scope.

Note: DHCP failover does not support BOOTP in Windows Server 2012 RTM, but DHCP failover does support BOOTP in Windows Server 2012 R2.

Here is a screenshot of BOOTP being disabled on the scope.

Finally, and this is the secret sauce to the perfect storm, the client has to behave in an anomalous manner as described here:

  • The DHCP client sends two BOOTP REQUEST messages before sending a DHCP DISCOVER message.
  • The DHCP client sends the initial DHCP DISCOVER message with the SECONDS field set to 6 or greater.
  • The DHCP client changes the DHCP Transaction ID between the DISCOVER message and the REQUEST message.

In this specific case, the client was a Lexmark T650 series network printer. 

When the device was power cycled, it would send two BOOTP Request messages. The DHCP Failover nodes would ignore these messages since BOOTP was not enabled.

The timeout for the BOOTP messages took 6 seconds total:

Time Delta in seconds    Message
0                        First BOOTP REQUEST sent
2                        Second BOOTP REQUEST sent
6                        DHCP DISCOVER sent

Therefore, when the DHCP DISCOVER was sent, the device populated the DHCP SECONDS field with the value of 6. Note: The DHCP DISCOVER should have had a SECONDS value of 0 since this was the first DHCP packet.

When the DHCP DISCOVER message was received, both DHCP Failover nodes responded to the client because the SECONDS field was set to a value of 6 or greater. This behavior is by design. When a DISCOVER message is received with a SECONDS value of 6 or higher, the DHCP Failover component treats the SECONDS field as an indicator that the client has been unable to obtain a lease from the server that should have provided it, so both servers respond to the DISCOVER message.
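
The decision described above can be illustrated with a small sketch. This is not the DHCP Server implementation; the function names, the node names `dhcp1` and `dhcp2`, and the idea of passing in the owning node directly are assumptions made for illustration. Only the threshold of 6 comes from the behavior described in this article.

```python
# Illustrative model only: how a pair of load-balance failover nodes might
# decide whether to answer a DISCOVER. Names here are hypothetical.
SECS_THRESHOLD = 6  # value described in this article


def should_respond(owns_client, secs, threshold=SECS_THRESHOLD):
    """Return True if a node should answer the DISCOVER.

    owns_client: True if this node is the one that serves this client.
    secs: value of the SECONDS field in the DISCOVER.
    """
    return owns_client or secs >= threshold


def responders(owner, secs, nodes=("dhcp1", "dhcp2")):
    """List the nodes that would answer, given which node owns the client."""
    return [node for node in nodes if should_respond(node == owner, secs)]


print(responders("dhcp1", 0))  # ['dhcp1']
print(responders("dhcp1", 6))  # ['dhcp1', 'dhcp2']
```

With a SECONDS value of 0, only the owning node answers. With the value of 6 that the printer sent, both nodes answer, which is the precondition for the behavior seen in this case.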

Here is an example of the SECONDS field being set to 6 in a DHCP network packet.

- Dhcp: Request, MsgType = DISCOVER, TransactionID = 0xBCBCFAE3
    OpCode: Request, 1(0x01)
    Hardwaretype: Ethernet
    HardwareAddressLength: 6 (0x6)
    HopCount: 0 (0x0)
    TransactionID: 3166501603 (0xBCBCFAE3)
    Seconds: 6 (0x6)
  + Flags: 32768 (0x8000)
    ClientIP: 0.0.0.0
    YourIP: 0.0.0.0
    ServerIP: 0.0.0.0
    RelayAgentIP: 0.0.0.0
  + ClientHardwareAddress: 00-12-3F-17-E0-CF
    ServerHostName:
    BootFileName:
    MagicCookie: 99.130.83.99
  + MessageType: DISCOVER - Type 53
  + AutoConfigure: Auto Configure  (1) - Type 116
  + clientID: (Type 1) - Type 61
  + RequestedIPAddress: 10.0.0.3 - Type 50
  + DHCPEOptionsHostName:
  + DHCPEOptionsVendorClassIdentifier:
  + ParameterRequestList:  - Type 55
  + End:
    Padding: Binary Large Object (5 Bytes)
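
If you want to check the SECONDS field programmatically instead of reading it in a network monitor, the following sketch reads the first 12 bytes of a BOOTP/DHCP payload using the fixed header layout from RFC 2131. The function name `parse_header` is hypothetical, and the sample bytes are rebuilt from the values shown in the packet above rather than taken from a real capture.

```python
import struct

# Start of the fixed BOOTP/DHCP header (RFC 2131), network byte order:
# op, htype, hlen, hops (1 byte each), xid (4 bytes), secs (2), flags (2)
HEADER = struct.Struct("!BBBBIHH")


def parse_header(payload):
    """Decode the first 12 bytes of a BOOTP/DHCP message into a dict."""
    op, htype, hlen, hops, xid, secs, flags = HEADER.unpack_from(payload)
    return {
        "op": op,
        "htype": htype,
        "hlen": hlen,
        "hops": hops,
        "xid": xid,
        "secs": secs,
        "flags": flags,
    }


# Rebuild the header bytes from the values in the trace above:
# OpCode 1, Ethernet (htype 1), address length 6, hop count 0,
# TransactionID 0xBCBCFAE3, Seconds 6, Flags 0x8000.
sample = HEADER.pack(1, 1, 6, 0, 0xBCBCFAE3, 6, 0x8000)
fields = parse_header(sample)
print(hex(fields["xid"]), fields["secs"])  # 0xbcbcfae3 6
```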

 

The last interesting behavior we saw in this case was that the client device was switching DHCP Transaction IDs between the DISCOVER and the REQUEST. The network trace looked something like this:

DHCP:Request, MsgType = DISCOVER, TransactionID = 0xBCBCFAE3
DHCP:Reply, MsgType = OFFER, TransactionID = 0xBCBCFAE3
DHCP:Request, MsgType = REQUEST, TransactionID = 0xBCBCFAE4
DHCP:Reply, MsgType = ACK, TransactionID = 0xBCBCFAE4
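
A trace like the one above can be checked for this pattern with a few lines of code. The sketch below assumes the trace has already been reduced to (message type, transaction ID) pairs for a single client, in capture order; the function name `find_xid_mismatches` and that input shape are assumptions for illustration, not the output format of any particular tool.

```python
def find_xid_mismatches(trace):
    """Return (discover_xid, request_xid) pairs where the IDs differ.

    trace: iterable of (msg_type, xid) tuples for one client, in order.
    A REQUEST with no preceding DISCOVER (for example a renewal) is ignored.
    """
    mismatches = []
    last_discover_xid = None
    for msg_type, xid in trace:
        if msg_type == "DISCOVER":
            last_discover_xid = xid
        elif msg_type == "REQUEST" and last_discover_xid is not None:
            if xid != last_discover_xid:
                mismatches.append((last_discover_xid, xid))
            last_discover_xid = None
    return mismatches


# The four messages from the trace above.
trace = [
    ("DISCOVER", 0xBCBCFAE3),
    ("OFFER", 0xBCBCFAE3),
    ("REQUEST", 0xBCBCFAE4),
    ("ACK", 0xBCBCFAE4),
]
print([(hex(d), hex(r)) for d, r in find_xid_mismatches(trace)])
# [('0xbcbcfae3', '0xbcbcfae4')]
```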

If you see behavior like this with a DHCP client, contact the DHCP client vendor and ask whether they have an update that resolves it, as this behavior is not RFC compliant.

To resolve this issue, we had two options:

  1. Disable BOOTP on the client device so that it only sent a DISCOVER on reboot and the SECONDS value would be set to 0x0.

  2. Contact the device vendor to have them fix:

    1. The SECONDS field being set to 6 in the initial DISCOVER packet.

    2. The Transaction IDs being mismatched between the DISCOVER and REQUEST packets.

In this case, the customer simply disabled BOOTP on the client device, since it wasn’t enabled on the DHCP server anyway. After this change, only the DHCP server that owned the client lease responded, and we no longer had issues with DNS records being removed.