DHCP Failover Load Balance Mode

As described in the blog on DHCP failover, there are two types of DHCP Failover relations – Load Balance which provides Active-Active configuration and Hot Standby which provides Active-Passive configuration. This blog article elaborates on the Load Balance failover relationship.

As is evident from the name, in load-balance mode of operation both the servers respond to client requests. Here’s how the servers ensure the distribution of client requests between themselves:

Each DHCP server on receiving the client request calculates hash of the MAC address in the client request as per hashing algorithm specified in RFC 3074.  Each server hashes any MAC address to a value between 1 and 256. If the load distribution ratio between the 2 servers is left at the default of 50:50; and if the hash of the MAC address falls between 1 and 128 then the first server will respond to the client request else if the hash is any value between 129 and 256, the other server responds to the client. This ensures that only one server responds for a specific client. If the load distribution ratio has been changed by the admin to a different value, the distribution of hash buckets would be in that proportion. The admin does not need to configure the MAC addresses on any server configuration a-priori.

Figure 1: Load Balance Ratio in a Failover Relationship

Handling of IP address pool in load balance failover

The free IP addresses of each failover scope are also distributed in the same proportion as the load balancing ratio. So for example, let’s say the failover scope - 10.10.10.0/24 - with an IP address range of 10.10.10.1 through 10.10.10.250. Suppose all IP addresses from 10.10.10.1 to 10.10.10.50 in this scope are leased out and all IPs starting from 10.10.10.51 are free. In this situation, the IP addresses from 10.10.10.51 through 10.10.10.150 would be apportioned to the first server and IP addresses 10.10.10.151 through 10.10.10.250 to the second server assuming the load balance ratio is 50:50. So, a client requesting a new lease and whose MAC address hash falls within the hash buckets of the first server would get an IP address 10.10.10.51 and so on. If the client’s MAC address hash falls within the hash buckets of the second server, the client will get the IP address 10.10.10.151.

As you can see from this example, when a scope is configured for failover, the 2 failover servers would be granting new IP address leases from two different portions of the IP address range of the scope. This is in contrast to the case of a standalone server where the server proceeds  sequentially through the free IP address pool of a scope, to give out new leases, starting with the first free IP address.

Figure 2: IP Address Lease view of a Failover Scope 

As clients request new leases, based on the MAC addresses of the clients, the free IP address pool of one server may get depleted faster than the other. To ensure that free IP address pool is at all times apportioned as per the load balancing ratio, every 5 minutes, the primary server checks the distribution of free IP pool distribution and transfers ownership of the IP address from itself to the partner server or vice versa using server to server failover protocol messages (binding update). This is referred as periodic rebalancing of the free IP address pool.

You can get the number of free IP addresses (and percentage of free IP pool) on each server for a failover scope, by viewing the scope statistics. The fields Addresses Available (this Server's Pool)  and Addresses Available (Partner Pool) indicate the number of free IP addresses owned by each server for the specific scope. You can view the scope statistics in DHCP MMC by right clicking on the failover scope and click on Display Statistics. You can also use the PowerShell cmdlet Get-DhcpServerv4ScopeStatistics with the –failover switch to get the same information in PowerShell. The two additional fields shown in display statistics – Addresses granted (this Server's Pool) and Addresses granted (Partner Pool) – show the number of IP addresses leased out by the servers.

Figure 3: Statistics for a scope in Failover Relationship

Load balancing operation in various failover states

When the failover relationship is in Normal state, hash bucket algorithm is applied for serving every DHCP client request. In communication-interrupted and partner-down states (i.e. when the partner server is unreachable or has gone down) hash bucket algorithm is not employed for servicing
client requests and server responds to all the clients to ensure service continuity.

Even while in Normal state, the server responds to the client if the client has been retransmitting the same request for a while. The server determines that a client has been retransmitting based on the secs field in DHCP client request. As per RFC 2131 the secs field is defined as “seconds elapsed since client began address acquisition or renewal process”. If secs field in client request is greater than 6 seconds, DHCP server will respond to the client even if the hash of the client MAC address does not fall within the hash buckets of the server. The idea behind this approach is to cater to a scenario where the server which actually owns the hash bucket for that client is down, but relation state is still Normal (there is a lag of 30 seconds between network connection (or the server) going down and this being detected by the partner server).

Most of the details shared in this article are not something that a DHCP administrator has to worry about. However, if you ever wondered how failover works under the hood (and most people do!), now you know!

Team DHCP