DHCP Failover Load Balance Mode


As described in the blog on DHCP failover, there are two types of DHCP failover relationship: Load Balance, which provides an Active-Active configuration, and Hot Standby, which provides an Active-Passive configuration. This article elaborates on the Load Balance failover relationship.

As the name suggests, in load balance mode both servers respond to client requests. Here’s how the servers distribute client requests between themselves:

On receiving a client request, each DHCP server calculates a hash of the MAC address in the request, using the hashing algorithm specified in RFC 3074. Each server hashes any MAC address to a value between 1 and 256. If the load distribution ratio between the two servers is left at the default of 50:50, the first server responds to the client when the hash falls between 1 and 128, and the other server responds when the hash falls between 129 and 256. This ensures that only one server responds to a specific client. If the admin has changed the load distribution ratio, the hash buckets are distributed in that proportion. The admin does not need to configure MAC addresses on either server in advance.
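The bucket decision described above can be sketched in a few lines of Python. This is only an illustration: the RFC 3074 hash itself is not reproduced here (the function takes an already computed hash value between 1 and 256, following this article's numbering), the function names are hypothetical, and the assumption that the first server owns the lowest-numbered contiguous buckets for ratios other than 50:50 is ours, not something the article states.

```python
TOTAL_BUCKETS = 256  # the article describes hash values from 1 to 256


def first_server_bucket_count(first_server_percent: int) -> int:
    """Number of hash buckets owned by the first server (assumed: rounded down)."""
    if not 0 <= first_server_percent <= 100:
        raise ValueError("percentage must be between 0 and 100")
    return TOTAL_BUCKETS * first_server_percent // 100


def responding_server(mac_hash: int, first_server_percent: int = 50) -> str:
    """Return 'first' or 'second' for a client whose MAC address hashes to mac_hash."""
    if not 1 <= mac_hash <= TOTAL_BUCKETS:
        raise ValueError("hash must be between 1 and 256")
    # Assumption: the first server owns buckets 1..boundary, the second the rest.
    boundary = first_server_bucket_count(first_server_percent)
    return "first" if mac_hash <= boundary else "second"
```

With the default ratio, `responding_server(128)` selects the first server and `responding_server(129)` the second, matching the 1 to 128 and 129 to 256 split described above.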

Figure 1: Load Balance Ratio in a Failover Relationship

Handling of IP address pool in load balance failover

The free IP addresses of each failover scope are also distributed in the same proportion as the load balancing ratio. For example, take a failover scope 10.10.10.0/24 with an IP address range of 10.10.10.1 through 10.10.10.250. Suppose all IP addresses from 10.10.10.1 to 10.10.10.50 in this scope are leased out and all IP addresses from 10.10.10.51 onward are free. Assuming the load balance ratio is 50:50, the IP addresses from 10.10.10.51 through 10.10.10.150 would be apportioned to the first server and 10.10.10.151 through 10.10.10.250 to the second server. A client requesting a new lease whose MAC address hash falls within the hash buckets of the first server would get the IP address 10.10.10.51, and so on. If the client’s MAC address hash falls within the hash buckets of the second server, the client will get the IP address 10.10.10.151.

As this example shows, when a scope is configured for failover, the two failover servers grant new IP address leases from two different portions of the scope’s IP address range. This is in contrast to a standalone server, which proceeds sequentially through the free IP address pool of a scope to give out new leases, starting with the first free IP address.
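The apportioning in the example can be reproduced with a short Python sketch. It assumes the free addresses form one contiguous range and that the first server receives the lower portion, as in the example; the function name and the rounding rule are hypothetical and not taken from the DHCP server implementation.

```python
import ipaddress


def split_free_range(first_free, last_free, first_server_percent=50):
    """Split an inclusive range of free addresses between two servers.

    Returns (first_pool, second_pool); each pool is an inclusive
    (start, end) tuple of IPv4Address objects, or None if it is empty.
    """
    if not 0 <= first_server_percent <= 100:
        raise ValueError("percentage must be between 0 and 100")
    start = int(ipaddress.IPv4Address(first_free))
    end = int(ipaddress.IPv4Address(last_free))
    if end < start:
        raise ValueError("last_free must not be below first_free")

    total = end - start + 1
    # Assumption: the first server's share is rounded down.
    first_count = total * first_server_percent // 100

    def pool(lo, hi):
        if hi < lo:
            return None
        return (ipaddress.IPv4Address(lo), ipaddress.IPv4Address(hi))

    return pool(start, start + first_count - 1), pool(start + first_count, end)
```

Calling `split_free_range("10.10.10.51", "10.10.10.250")` yields 10.10.10.51 through 10.10.10.150 for the first server and 10.10.10.151 through 10.10.10.250 for the second, the same split as in the example.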

Figure 2: IP Address Lease view of a Failover Scope 

As clients request new leases, the free IP address pool of one server may get depleted faster than the other’s, depending on the MAC addresses of the clients. To ensure that the free IP address pool is apportioned as per the load balancing ratio at all times, every 5 minutes the primary server checks the distribution of the free IP address pool and transfers ownership of IP addresses from itself to the partner server, or vice versa, using server-to-server failover protocol messages (binding update). This is referred to as periodic rebalancing of the free IP address pool.
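The arithmetic behind a rebalancing pass might look like the following Python sketch. It only computes how many free addresses would need to change owner to restore the configured ratio; the function name is hypothetical, the rounding is an assumption, and the real server may apply thresholds or other rules that the article does not describe.

```python
def addresses_to_transfer(first_free: int, second_free: int,
                          first_server_percent: int = 50) -> int:
    """Free addresses to move so the pools match the load balance ratio.

    A positive result means the first server hands that many addresses to
    the second; a negative result means the transfer goes the other way.
    """
    if first_free < 0 or second_free < 0:
        raise ValueError("free address counts must not be negative")
    if not 0 <= first_server_percent <= 100:
        raise ValueError("percentage must be between 0 and 100")
    total_free = first_free + second_free
    # Assumption: the first server's target share is rounded down.
    target_first = total_free * first_server_percent // 100
    return first_free - target_first
```

For instance, if the first server has 40 free addresses left and the second has 100, `addresses_to_transfer(40, 100)` returns -30: the second server would hand 30 addresses to the first so that each owns 70.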

You can get the number of free IP addresses (and the percentage of the free IP pool) on each server for a failover scope by viewing the scope statistics. The fields Addresses Available (this Server’s Pool) and Addresses Available (Partner Pool) indicate the number of free IP addresses owned by each server for the specific scope. You can view the scope statistics in DHCP MMC by right-clicking the failover scope and clicking Display Statistics. You can also use the PowerShell cmdlet Get-DhcpServerv4ScopeStatistics with the -Failover switch to get the same information in PowerShell. The two additional fields shown in the statistics, Addresses granted (this Server’s Pool) and Addresses granted (Partner Pool), show the number of IP addresses leased out by each server.

Figure 3: Statistics for a scope in Failover Relationship

Load balancing operation in various failover states

When the failover relationship is in the Normal state, the hash bucket algorithm is applied to every DHCP client request. In the communication-interrupted and partner-down states (i.e. when the partner server is unreachable or has gone down), the hash bucket algorithm is not used for servicing client requests, and the server responds to all clients to ensure service continuity.

Even in the Normal state, the server responds to a client outside its hash buckets if the client has been retransmitting the same request for a while. The server determines that a client has been retransmitting based on the secs field in the DHCP client request. RFC 2131 defines the secs field as “seconds elapsed since client began address acquisition or renewal process”. If the secs field in the client request is greater than 6 seconds, the DHCP server will respond to the client even if the hash of the client MAC address does not fall within the hash buckets of the server. The idea behind this approach is to cater to a scenario where the server that actually owns the hash bucket for that client is down but the relationship state is still Normal (there is a lag of 30 seconds between the network connection, or the server, going down and this being detected by the partner server).
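Taken together, the rules in this section amount to a small decision function. The Python sketch below is one way to express them; the enum, the constant and the function name are hypothetical, and only the three states named in this article are modelled.

```python
from enum import Enum


class FailoverState(Enum):
    NORMAL = "normal"
    COMMUNICATION_INTERRUPTED = "communication-interrupted"
    PARTNER_DOWN = "partner-down"


SECS_THRESHOLD = 6  # the article: respond anyway if secs is greater than 6


def should_respond(state: FailoverState, owns_hash_bucket: bool, secs: int) -> bool:
    """Decide whether this server answers a client request.

    owns_hash_bucket: whether the client's MAC hash falls in this server's buckets.
    secs: value of the secs field in the client's DHCP request.
    """
    if state is not FailoverState.NORMAL:
        # Partner unreachable or down: serve every client.
        return True
    if owns_hash_bucket:
        return True
    # Normal state, partner's bucket: answer only a client that keeps retransmitting.
    return secs > SECS_THRESHOLD
```

In the Normal state a client in the partner's bucket is ignored at `secs=0` but answered at `secs=7`, which covers the window before the partner's failure has been detected.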

Most of the details shared in this article are not something that a DHCP administrator has to worry about. However, if you ever wondered how failover works under the hood (and most people do!), now you know!

Team DHCP

Comments (26)

  1. teamdhcp says:

    Jobish, please look at the description of MCLT as well as the DHCP examples section in the "Understand and deployment guide" for DHCP failover:
    http://technet.microsoft.com/en-us/library/dn338985.aspx

  2. Anonymous says:

    When a client sends a request for a new lease, it will get a lease for the MCLT duration. When the client attempts to renew the lease at half the lease period, i.e. MCLT/2, it will be given the scope lease duration if DHCP failover is in the NORMAL state.

    This is as per the DHCP failover protocol. This does increase the traffic from new clients but given the scalability of Windows DHCP server, this should not pose any deployment problem.

  3. Anonymous says:

    Joe, yes, client 1 will get the same IP address again. There are two cases possible –

    1. DHCP 1 synced the IP address 10.2.3.4 to DHCP2 before it went down.

    In this case, the client will be given the full lease duration which has been configured for the scope.

    2. DHCP 1 went down before syncing the IP address 10.2.3.4 to DHCP2

    In this case also, the client will be able to renew but the lease duration will be shorter – same as the value configured for MCLT.

    There is another client behavior to be aware of here – at half the lease period, the client will attempt to renew the lease. This is just normal DHCP client behavior as per the DHCP protocol. The renew message is unicast. So in the scenario above, it will be directed to DHCP 1 which is down and so there will be no response. At 7/8th of the lease period, the client will broadcast the renew request message. This message will be seen by DHCP2 which will respond to the renew request.

  4. teamdhcp says:

    John, when one server goes down, the second server (in communication interrupted state) will renew all existing clients, including clients which were earlier responded to by the server which went down. I think this addresses your concern?
    There is a different aspect: the second server will be serving "new leases" from 50% of the "free" IP addresses in the scope. However, after the second server moves to the "Partner down" state, it will have taken over 100% of the "free" IP addresses in the scope.

  5. Anonymous says:

    Thomas, please see the blog article "DHCP Failover using PowerShell" at – blogs.technet.com/…/dhcp-failover-using-powershell.aspx

  6. Felicio Santos says:

    Hello DHCP Team, I had issues with DHCP Snooping as described in http://support.microsoft.com/kb/2978225, but instead of reducing the number of servers on the switches I changed the mode from Load Balance to Hot Standby, and so far I have not detected packets being dropped by DHCP Snooping. If this is actually a valid configuration that keeps both features fully functional, you could consider including it in the article as an additional solution.

  7. Anonymous says:

    yongfoo, you can configure the address rebalancing interval using the registry value DhcpFailoverAddrRebalacingTimeInt. You can create this under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\DHCP Server\Configuration. You need to specify the value in seconds.

  8. Anonymous says:

    Andrew, it's not clear what exact behavior you are referring to. A DHCP failover scope can be updated on either of the servers. However, on any update, you need to invoke "Replicate scope" from the server on which the update was performed.

    If, let's say, you want to remove server 2, perform "deconfigure failover" from the server 1 MMC. The failover scopes will then be removed from server 2 and retained only on server 1.

  9. Anonymous says:

    Hi Andrew, once the DHCP server service on the second server starts, it will automatically sync up with the first server and make its lease database up to date. After that, it will enter NORMAL failover state and start servicing clients. At that point, they will start sharing the load 50:50. There is no admin intervention required. It is expected to just work!

  10. Web blog of Janine Patterson says:

    I was looking through some of your blog posts on this site and I believe this web site is really informative! Keep on putting up.This site is really helpful for us.

    http://janinepatterson.com/

  11. Joe says:

    Question:

    If client1 gets an IP address of 10.2.3.4 from DHCP1 and then DHCP1 goes down but is in a load balance failover pair with DHCP2 and client1 does a renew – will client1 get the same IP address again?

    In other words – Does DHCP2 have IP to MAC address mapping for both pairs in the load balance?

  12. Hi, teamdhcp says:

    I got a question here.

    If the failover pair stays in Normal state, which lease time will the client obtain: MCLT or the time defined in the scope configuration?

    My test result is MCLT. I think that is unreasonable, as it doubles the DHCP request traffic.

    According to most of the failover documentation, MCLT should ONLY be activated when failover enters the COMMUNICATION-INTERRUPTED or PARTNER-DOWN state.

    I'm confused.

  13. Thomas Lee says:

    It would be useful to show the PowerShell Cmdlets used to set this up.

  14. Jobish George says:

    Could you please provide more explanation on MCLT? Does it have any relation to the DHCP lease period? I am confused.

  15. John Vollmer says:

    If one server goes down for an extended period, is there a setting to make 100% of the scope available to the good server instead of 50% so that all 200 of our hosts get an ip address instead of only 100? Thanks.

  16. Andrew Read says:

    Assuming one DHCP server in a load-balanced pair has crashed and the second server is supporting 100% of the clients, what is the process to recover the second server and re-establish the 50%/50% split? Is this process documented somewhere on TechNet?

  17. Andrew Read says:

    Thanks for the previous answer – that helps, but I do have a follow-up question.

    My customer has noticed that there seems to be a master/slave (or Primary/Secondary) relationship, whereby a scope created on DHCP-Server-1 can only be updated (i.e. reconfigured) on that server (and not on DHCP-Server-2). Is that expected behaviour? And if that is the case, what happens if one server from a load-balancing pair has to be permanently removed? Is it possible to force the 'secondary' DHCP server to become the master for all defined scopes?

  18. Andrew Read says:

    Thanks for the quick reply – I will double check how the customer has configured the load-balancing relationship and make sure they invoke the 'replicate scope' function after any changes.

  19. yongfoo says:

    Can the "5 minutes" interval for the periodic rebalancing of the free IP address pool be configured to a different value?

  20. Jeff says:

    Hi, I was wondering: if you run a 50:50 failover with only 1 available address, which server would get it? And if, as in my case, both DHCP servers state that the other one has it, what should I do?

  21. ZA_Lad_84 says:

    Hey,

    I’ve got a failover configuration between 2 Server 2012 R2 DCs set up exactly like this.
    Firstly, am I correct in saying that the scope options, lease information, and everything else is automatically replicated between the 2 servers, EXCEPT for manual reservations? Secondly, is it only possible to automate reservation replication by using a script (as outlined in this article: http://blogs.technet.com/b/teamdhcp/archive/2012/11/27/automatic-syncing-of-scope-configuration-changes-between-2-dhcp-failover-servers.aspx), or is there another option?

  22. teamdhcp says:

    ZA_Lad_84,
    Lease information is replicated between the 2 DHCP failover servers using the DHCP failover synchronization protocol. Scope options, reservations and other "configuration" are replicated using one of the following:
    – using IPAM 2012R2 to manage DHCP failover which performs the option, reservation update on both DHCP servers
    – using the auto sync script in the blog mentioned in your comment above
    – "Replicate" option in DHCP MMC/PowerShell (Manual action by admin)

  23. ZA_Lad_84 says:

    teamdhcp
    Thanks for the quick response. I hadn’t heard of the first option you listed – IPAM. I’ll look into that and see how that works for us, thanks!

  24. Galenklein says:

    Hello,

    I recently experienced an issue with a DHCP LB environment responsible for approximately 27 scopes in 50/50 mode. All clients started receiving lease times equivalent to MCLT, and things did not return to normal until we disabled failover on all scopes. I suspect there was a communication error between DHCP1 and DHCP2 that invoked the "communication interrupted" state, but both were online and continuing to service client computers. Could it be that while in this state, all clients receive MCLT lease times when they renew from their respective servers, causing this situation to occur? Should I focus on ensuring good communication between these servers over port 647?

  25. Sundar says:

    I have a quick question here. In communications interrupted state, where one peer has gone down and the client comes with an elapsed time greater than 6, will the working peer provide a lease from its own pool?

    As per the RFC, in communications interrupted state each server will provide new leases from its own share of IP addresses.

  26. teamdhcp says:

    Sundar, when you say "the client comes with an elapsed time …", that seems to imply a scenario of a client which already has a lease and is trying to renew. It's not a case of a new lease. In this case, the DHCP server which is running in communication interrupted state will renew the lease for the client for a lease duration of MCLT.