[This post comes to us courtesy of Vithalprasad Gaitonde from Product Group and Gopalakrishnan Krishnan from Global Windows Networking Support]
Starting with Windows Server 2012, the DHCP service has its own built-in failover capability. With failover in the picture, how can one back up the DHCP server service along with the failover configuration parameters? That is the subject of this blog post.
Why do we need to back up in the first place?
We all know how crucial DHCP is as a service to any environment. If this one service goes down, be prepared for all hell to break loose.
Although the service is much improved in features and capabilities, it still runs on a Jet database, which keeps the service efficient and fast. DHCP has a built-in backup/restore capability, but it is still a good idea to take backups and store them elsewhere. While the reliability of the service has improved vastly over successive Windows Server releases, there are rare but critical issues caused by corruption of the DHCP database. If and when such corruption happens, the admin should be able to restore the service using the last known good backup of the DHCP database.
How did backups work with DHCP in Windows Server 2008 R2 and earlier?
The below articles talk about how backup and restore functioned in the earlier releases of Windows Server (2008 R2 and prior):
In addition to the above, we have also seen enterprise customers use the "netsh dhcp server export" context to export the DHCP database for disaster recovery purposes. The exported file is then stored in a separate location and can be used to restore the DHCP server service via the "netsh dhcp server import" context in times of dire need.
Below are a couple of articles that already explained how this could be accomplished so we won't be visiting them here again:
- How to move a DHCP database from a computer that is running Windows Server 2003 to Windows Server 2008
Note: The advantage of the "netsh dhcp server export" context over the "netsh dhcp server dump" context is that the export command also extracts and stores the active lease information from the DHCP server. This way, when we restore, the active leases are restored as of that point in time. The dump context, by contrast, can only restore the scopes, options and reservation information, NOT the active leases.
What’s different with DHCP and backup in Windows Server 2012 and Windows Server 2012 R2?
With Windows Server 2012, failover was introduced along with other useful features such as policy-based assignment. This means that both the functionality and the underlying database structure have changed.
Why we suggest NOT to use “netsh dhcp server export” in Server 2012 and Windows Server 2012 R2:
The "netsh dhcp server export" context works, and works well, in standalone DHCP server environments. However, what happens when the Windows Server 2012 DHCP role is configured in a failover relationship and DHCP policies are in the picture as well? Would this still work?
The answer is "YES, BUT WITH LIMITATIONS". The key point is that failover relationships and policies are NOT backed up by the "netsh dhcp server export" command. Exporting and importing the configuration this way on Windows Server 2012 brings up the DHCP server service and the active leases, as in earlier releases, but without the failover configuration. The failover relationships then have to be rebuilt: the second server must be removed from the failover configuration, and failover must be reconfigured between the servers from scratch.
This can get cumbersome if multiple failover relationships are configured on the server for its various scopes. Also, as we can see, the export context does not cover the improvements that came with DHCP in Windows Server 2012. So what is the best way to back up in such a scenario?
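To get a feel for how much configuration a netsh export would leave behind, we can list the failover relationships and policies on a server. The following is a minimal sketch, assuming the DhcpServer PowerShell module is available (Windows Server 2012 or later, or the matching RSAT tools). The function name Get-DhcpFailoverInventory is our own illustration, not a built-in cmdlet.

```powershell
# Sketch only: Get-DhcpFailoverInventory is a hypothetical helper name.
# It relies on the DhcpServer module cmdlets Get-DhcpServerv4Failover,
# Get-DhcpServerv4Policy and Get-DhcpServerv4Scope.
function Get-DhcpFailoverInventory {
    param(
        [Parameter(Mandatory)]
        [string]$ComputerName
    )

    # Failover relationships this server participates in.
    $failover = @(Get-DhcpServerv4Failover -ComputerName $ComputerName)

    # Server-level policies first, then the policies of every IPv4 scope.
    $policies = @(Get-DhcpServerv4Policy -ComputerName $ComputerName)
    foreach ($scope in Get-DhcpServerv4Scope -ComputerName $ComputerName) {
        $policies += @(Get-DhcpServerv4Policy -ComputerName $ComputerName -ScopeId $scope.ScopeId)
    }

    # Return both lists together so they can be reviewed or counted.
    [pscustomobject]@{
        Failover = $failover
        Policies = $policies
    }
}
```

For example, (Get-DhcpFailoverInventory -ComputerName campusDHCP1.campus.com).Failover would list the relationships that need manual rebuilding after a netsh-based restore.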
Note: Also, the netsh context is being deprecated and this is one of the important reasons why we suggest not to use the netsh context with DHCP going forward.
The backup/restore options in the DHCP GUI discussed earlier also back up the failover configuration information. So we can manually back up via the GUI to a different location or drive altogether.
The same can be achieved with the Backup-DhcpServer and Restore-DhcpServer PowerShell cmdlets.
The commands are discussed here:
Let us say that we have 2 DHCP servers:
And there is a failover relationship configured between these servers as seen below:
There are hundreds of scopes hosted between these servers. To back up the database with the failover configuration now from campusDHCP1 we can execute:
Backup-DhcpServer -ComputerName campusDHCP1.campus.com -Path D:\DHCP\Backup
The command should complete successfully as shown below:
Note that the backup is stored on a different drive altogether here. This PowerShell command works very much like the backup option in the DHCP management console. Looking at the backup location, we can see files such as the following created:
And inside the new folder:
This is exactly what happens when the automatic DHCP backup runs, by default every 60 minutes. The only difference is that administrators can now run this command at regular intervals themselves and send the backup to a different drive or server location for disaster recovery purposes.
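Running the backup at regular intervals could look like the sketch below, which writes each run into its own timestamped folder so that older backups are not overwritten. The function names, the timestamp format and the default root folder are our own assumptions for illustration. The sketch also assumes it runs locally on the DHCP server, so that the folder it creates and the path handed to Backup-DhcpServer are on the same machine.

```powershell
# Sketch only: Get-DhcpBackupFolder and Invoke-DhcpBackup are hypothetical helper names.

function Get-DhcpBackupFolder {
    param(
        [Parameter(Mandatory)]
        [string]$Root,
        [datetime]$Timestamp = (Get-Date)
    )
    # One folder per run, named after the time the backup was taken.
    Join-Path -Path $Root -ChildPath $Timestamp.ToString('yyyyMMdd-HHmmss')
}

function Invoke-DhcpBackup {
    param(
        [string]$Root = 'D:\DHCP\Backup'
    )

    $target = Get-DhcpBackupFolder -Root $Root

    # Create the folder up front; -Force avoids an error if it already exists.
    New-Item -ItemType Directory -Path $target -Force | Out-Null

    # Back up the local DHCP server database and configuration.
    Backup-DhcpServer -Path $target

    # Emit the folder so a calling script can log it.
    $target
}
```

Calling Invoke-DhcpBackup from a scheduled task would then leave a series of dated folders under D:\DHCP\Backup to choose from when a restore is needed.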
Now let us consider a scenario where the DHCP service on CampusDHCP1.campus.com has crashed due to database corruption. Any attempt to bring the service up again fails with an error such as "Error 4312: The object identifier does not represent a valid object". Attempts to restore from the files in the default backup folder fail with the same error, which tells us that the contents of that folder have also become corrupted.
Jetpack or Esentutl operations to verify the consistency of the database have not helped us recover from the corruption. In this scenario, the only option is to restore from a last known good backup, if we have one. Luckily, we do have one here, taken via the Backup-DhcpServer PowerShell command.
In order to restore the database, we can do the following:
First we need to get the DHCP server service up and running.
To do so, remove the contents of the C:\Windows\System32\DHCP folder (or the directory that the DHCP service was primarily configured to load the database from).
We can either move these files to a different location (just in case) or delete them from this folder. This is possible because the service is down and not starting up. The idea is to remove the corrupted database file (dhcp.mdb) from the location that the DHCP server service reads from. Once the folder contents are removed, start the DHCP server service again. This time the service should start up fine and create fresh instances of the files that we just removed. This is like starting from scratch: the DHCP service now has a fresh, clean database to work with.
Note: Do NOT delete the contents of the folder unless you are absolutely sure that you have a good backup.
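The clean-up step above could be scripted roughly as follows. This is a minimal sketch that keeps a copy of the old files instead of deleting them outright. The function names and the holding folder are our own assumptions, and DHCPServer is assumed to be the service name of the DHCP Server role.

```powershell
# Sketch only: Move-FolderContent and Reset-DhcpDatabaseFolder are hypothetical helper names.

function Move-FolderContent {
    param(
        [Parameter(Mandatory)] [string]$Source,
        [Parameter(Mandatory)] [string]$Destination
    )
    New-Item -ItemType Directory -Path $Destination -Force | Out-Null

    # Copy first and stop on any error, so nothing is removed unless the copy succeeded.
    Copy-Item -Path (Join-Path $Source '*') -Destination $Destination -Recurse -Force -ErrorAction Stop
    Remove-Item -Path (Join-Path $Source '*') -Recurse -Force
}

function Reset-DhcpDatabaseFolder {
    param(
        [string]$DatabaseFolder = 'C:\Windows\System32\DHCP',
        [Parameter(Mandatory)] [string]$HoldingFolder
    )

    # The service is expected to be down already; stop it anyway so no file stays locked.
    Stop-Service -Name DHCPServer -Force -ErrorAction SilentlyContinue

    # Set the possibly corrupted files aside.
    Move-FolderContent -Source $DatabaseFolder -Destination $HoldingFolder

    # On start, the service creates a fresh, empty database in the emptied folder.
    Start-Service -Name DHCPServer
}
```

A call such as Reset-DhcpDatabaseFolder -HoldingFolder D:\DHCP\OldDatabase (a hypothetical path) would leave the old files available for later inspection.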
Once the service is back up and running, then run the following PowerShell command to restore the database from the backup that was created previously:
Restore-DhcpServer -ComputerName CampusDHCP1.campus.com -Path D:\Dhcp\Backup
Answer Y (Yes) when prompted to confirm that we really want to go ahead with this operation:
When prompted for a service restart, restart the DHCP server service:
Once this is done, you should see the DHCP server service up and running on the server and trying to sync with the partner servers in the failover relationships:
Checking the Failover tab from the IPv4 properties:
You would see that the service has come up and continues to run in the Recover Wait state. The server typically stays in this state for the MCLT duration before resuming Normal operations with its partner. More information on the transition states can be found here:
Give the server some time now so that it can sync up with its partner servers before resuming normal operations on the network again.
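The failover state can also be watched from PowerShell instead of the console. The sketch below polls Get-DhcpServerv4Failover until every relationship reports Normal or a check limit is reached. The function names, the 30-second interval and the limit are arbitrary choices of ours, and the comparison assumes the State property is reported as the string Normal.

```powershell
# Sketch only: Test-FailoverNormal and Wait-DhcpFailoverNormal are hypothetical helper names.

function Test-FailoverNormal {
    param([object[]]$Relationship)
    # True only when there is at least one relationship and none is outside the Normal state.
    $notNormal = @($Relationship | Where-Object { $_.State -ne 'Normal' })
    ($Relationship.Count -gt 0) -and ($notNormal.Count -eq 0)
}

function Wait-DhcpFailoverNormal {
    param(
        [Parameter(Mandatory)] [string]$ComputerName,
        [int]$IntervalSeconds = 30,
        [int]$MaxChecks = 120
    )

    for ($i = 0; $i -lt $MaxChecks; $i++) {
        $relationships = @(Get-DhcpServerv4Failover -ComputerName $ComputerName)

        # Show the current state of each relationship on the console.
        $relationships | Select-Object Name, PartnerServer, Mode, State |
            Format-Table -AutoSize | Out-Host

        if (Test-FailoverNormal -Relationship $relationships) { return $true }
        Start-Sleep -Seconds $IntervalSeconds
    }

    # The limit was reached before all relationships returned to Normal.
    $false
}
```

Since the server normally waits out the MCLT in the Recover Wait state, the interval and limit should be chosen so that the total wait comfortably exceeds the MCLT configured on the relationship.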
The advantage, as we can clearly see, is that the backup and restore commands also restore the failover relationships that existed on the servers.