Disaster Recovery – not a nightmare with virtualization

Hello all, my name is Chris Steffen and I am the Principal Technical Architect at Kroll Factual Data. 


Kroll Factual Data provides business information to mortgage lenders, consumer lenders, property management firms and other businesses.  We have been part of the Microsoft Technology Adoption Program for virtualization products for the past several years, and we have about 1,500 virtual machines (VMs) across our computing environment.


As companies experiment with server virtualization, one benefit that is often overlooked (or not emphasized as much as it should be) by those discussing virtualization technologies is the impact that virtualization can have on creating and maintaining a disaster recovery environment.


Put simply, the more that your production environment is virtualized, the easier it is to build a disaster recovery environment using the golden images created for your production systems.


For years, I had worried about my company’s ability to recover from a catastrophic disaster.  But over the last 18 months, we have utilized Microsoft Virtual Server to create the situation that we are in today:


We have a production data center environment that is 85% virtualized.  All of the golden images are replicated to our Disaster Recovery (DR) site: whenever a golden image is updated, patched or otherwise modified, it is automatically copied to DR as soon as it changes.  We have standing virtual machine hosts awaiting a VM image (these machines are running and used for testing in the meantime).  Our Internet circuits and vendor connections are “hot” and tested constantly.  Databases and all other non-VM machines are already duplicated and maintained at DR (we’re talking a couple dozen machines, not the 600 that are located in our production environment today).
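
To make the "copy it as soon as it changes" idea concrete, here is a minimal sketch in Python. It is not the tooling we actually run; it only illustrates one way to detect changed images, by comparing checksums, and copy them to a replica folder. The function names, the folder layout and the `*.vhd` file pattern are all assumptions made for the example.

```python
import hashlib
import shutil
from pathlib import Path
from typing import List


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def sync_changed_images(source: Path, replica: Path, pattern: str = "*.vhd") -> List[str]:
    """Copy every image that is missing or different at the replica.

    `source` and `replica` are hypothetical folders standing in for the
    production golden image store and the DR copy. Returns the sorted
    names of the images that were copied.
    """
    replica.mkdir(parents=True, exist_ok=True)
    copied = []
    for image in sorted(source.glob(pattern)):
        target = replica / image.name
        # Copy when the replica has no such image or its content differs.
        if not target.exists() or file_digest(image) != file_digest(target):
            shutil.copy2(image, target)
            copied.append(image.name)
    return copied
```

Run on a schedule, or triggered whenever an image is patched, a function like this only moves the images that actually changed, which matters when each image is several gigabytes.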


Were we to experience a major disaster, the IT staff at our DR site would immediately start copying VM images to selected hosts.  We would repoint our circuit routing and externally hosted Domain Name System (DNS) records to the DR IP range.  Once the VMs have been copied to the host machines, we would start to bring up the “new” data center.  Our testing has shown that this can realistically be completed in less than a day, which keeps us within our 24-hour Recovery Time Objective (RTO).
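
If you want to sanity-check a plan like this against your own RTO, a back-of-the-envelope estimate is a reasonable starting point. The Python sketch below assumes the hosts copy images in parallel while each host copies its own images one after another, plus a fixed overhead for rerouting circuits and DNS. Apart from the 24-hour RTO, every figure in it (image size, host count, copy rate, overhead) is invented for illustration and is not a measurement from our environment.

```python
import math


def estimate_recovery_hours(vm_count, image_gb, hosts, copy_gb_per_hour, overhead_hours):
    """Rough recovery time in hours.

    Hosts copy in parallel; each host copies its share of images
    sequentially. `overhead_hours` covers fixed work such as changing
    circuit routing and DNS. All parameters are hypothetical inputs.
    """
    images_per_host = math.ceil(vm_count / hosts)
    copy_hours = images_per_host * image_gb / copy_gb_per_hour
    return copy_hours + overhead_hours


RTO_HOURS = 24  # the Recovery Time Objective stated in the post

# Illustrative numbers only: 600 images of 10 GB, 60 hosts,
# 100 GB/hour copy rate per host, 2 hours of fixed overhead.
estimate = estimate_recovery_hours(
    vm_count=600, image_gb=10, hosts=60, copy_gb_per_hour=100, overhead_hours=2
)
print(f"Estimated recovery: {estimate:.1f} h (RTO {RTO_HOURS} h)")
print("Within RTO" if estimate <= RTO_HOURS else "RTO at risk")
```

An estimate like this is no substitute for an actual recovery test, but it shows quickly which input (image size, host count or copy rate) dominates the total.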


Our DR site is populated by our hand-me-down equipment.  But with server virtualization, this is not a problem.  Sure, you may not be able to run 20 VMs on a single host machine due to the older hardware configs, but even if you can only run a quarter of that number, you are gaining additional life out of “end of life” equipment, and getting your production systems up and running in pretty much the same condition that they were running in your real production environment.
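
The trade-off is simple arithmetic: lower VM density per host means more hosts for the same workload. As a rough sketch in Python, using the 20 VMs per host and "a quarter of that number" from above and an invented workload of 100 VMs:

```python
import math


def hosts_needed(vm_count, vms_per_host):
    """Number of hosts required to run vm_count VMs at a given density."""
    if vms_per_host < 1:
        raise ValueError("vms_per_host must be at least 1")
    return math.ceil(vm_count / vms_per_host)


PRODUCTION_DENSITY = 20                  # VMs per host on current hardware
DR_DENSITY = PRODUCTION_DENSITY // 4     # a quarter of that on older hardware
VM_COUNT = 100                           # hypothetical workload, for illustration

print("Production hosts:", hosts_needed(VM_COUNT, PRODUCTION_DENSITY))
print("DR hosts:", hosts_needed(VM_COUNT, DR_DENSITY))
```

With these assumed figures the DR site needs four times as many hosts as production, which is affordable precisely because the hosts are retired equipment you already own.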


I still worry about business continuity situations (who doesn’t?).  But the DR plan that we have established for our core technology finally lets me rest a little bit better.  And using the Microsoft virtualization software and their System Center management tools has made the entire disaster recovery project more cost effective, not only on the licensing and equipment side, but also from a staffing and administrative perspective.  As you embrace a virtualization technology, keep disaster recovery in mind: virtualization may well be the solution to your DR maintenance nightmare.


I welcome your questions or comments.