Overview: Disaster recovery scenarios, simple site-to-site replication, or the Prod-to-Dev refresh scenario are generally what drive IT administrators to look into virtual machine replication.
We want to build our environments so that if something happens in our primary data center, our critical machines and data will be up and running somewhere else. Our developers may reside in a different location but want to work with the most recent datasets available. Each of these requirements raises a slew of questions about how to deliver results. Replication over wide area networks takes careful planning and consideration for any solution. In this article I focus on achieving results with Windows Server 2012 Hyper-V; however, the methodology applies to almost any replication environment.
Important Questions: I was talking with a fellow IT Pro at one of our recent camps, and he asked me, “How do I know what kind of bandwidth I need to perform replication from my main data center to my secondary site?” Great question, and one of many that I have received in my past 7 years of virtualization consulting. Many people go out and build an infrastructure to support replication, identify the virtual machines they want to replicate, and then just give it a whirl. More often than not, they face long replication times, timeouts, and other logistical issues, if not immediately then a few weeks down the road.
It can be a discouraging process, I know. However, I believe that with proper planning these scenarios are quite doable and may not require nearly as much budget as one would think. Once we have identified the virtual machines that need to be replicated, the very next step is to understand how much time we can afford to lose in the event of an outage and how quickly we can recover at the alternate location.
For those of you who have already defined your requirements and just want to get down to the more advanced configurations, skip ahead to the Bandwidth Restrictions section in Part II of this series. If you still need a copy of Windows Server 2012 to try out for 180 days, click here.
So let’s take a peek at the entire process.
1) Identify the critical workloads and any dependencies they may have (e.g., Active Directory must be available before a file server).
2) Identify the current and requested recovery point objective (RPO) for each workload (i.e., how much recent work or data, measured in time, can I afford to lose?).
3) Identify the current and requested return-to-operations objective (RTO), more commonly called the recovery time objective, for each workload.
a) How fast can I recover to my RPO for this VM?
b) This value may be more about your infrastructure’s abilities than the request of the application owner.
4) Determine the size of the actual footprint of the workload
5) Determine the amount of change occurring inside the given workloads.
6) Review the requirements with the application owners
a) Hint: the application owner will always say they need 100% uptime, so we need to ask the right questions.
b) More on this topic later.
7) Determine the amount of open bandwidth available, as well as the times of day/week when the most bandwidth is free.
8) Test replication and bandwidth between site A and site B for performance and reliability.
9) Document the steps necessary to fail over to the alternate site, then fail back to the production site per application.
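The numbers gathered in steps 2, 4, and 5 can be turned into a rough bandwidth estimate to compare against the open bandwidth found in step 7. The following is only a minimal sketch, not the calculation used by any Microsoft tool: the `Workload` class, the function names, the 70% link efficiency, the `peak_factor` multiplier, and the sample values are all assumptions made for illustration. It also assumes changes are spread evenly across the day unless `peak_factor` says otherwise.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical record of the numbers gathered in steps 2, 4, and 5."""
    name: str
    footprint_gb: float     # step 4: size of the workload on disk
    daily_change_gb: float  # step 5: data changed per day
    rpo_minutes: float      # step 2: requested recovery point objective


def required_mbps(data_gb: float, window_seconds: float,
                  efficiency: float = 0.7) -> float:
    """Average link speed (megabits/s) needed to move data_gb in the window.

    efficiency is an assumed fraction of the link that carries useful
    replication data after protocol overhead and competing traffic.
    """
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    megabits = data_gb * 8 * 1000  # decimal units: 1 GB = 8,000 megabits
    return megabits / window_seconds / efficiency


def steady_state_mbps(w: Workload, peak_factor: float = 1.0,
                      efficiency: float = 0.7) -> float:
    """Bandwidth needed so each RPO interval's changes ship within that interval.

    peak_factor > 1 models busy periods where the change rate exceeds
    the daily average (an assumption; measure it in step 5 if possible).
    """
    rpo_seconds = w.rpo_minutes * 60
    change_per_interval_gb = (
        w.daily_change_gb * (rpo_seconds / 86400) * peak_factor
    )
    return required_mbps(change_per_interval_gb, rpo_seconds, efficiency)


def initial_copy_hours(w: Workload, link_mbps: float,
                       efficiency: float = 0.7) -> float:
    """Hours needed to send the full footprint over the link once."""
    if link_mbps <= 0:
        raise ValueError("link_mbps must be positive")
    megabits = w.footprint_gb * 8 * 1000
    return megabits / (link_mbps * efficiency) / 3600


if __name__ == "__main__":
    # Illustrative values only; "SQL01" is a made-up workload name.
    sql = Workload("SQL01", footprint_gb=450, daily_change_gb=86.4,
                   rpo_minutes=15)
    print(f"{sql.name}: steady state needs "
          f"{steady_state_mbps(sql, peak_factor=3.0):.1f} Mbps")
    print(f"{sql.name}: initial copy over 100 Mbps takes "
          f"{initial_copy_hours(sql, link_mbps=100):.1f} hours")
```

As a sanity check on the arithmetic, a 450 GB footprint over a 100 Mbps link takes 10 hours at perfect efficiency, and longer once overhead is factored in. Summing the steady-state figure across all replicated workloads gives a first number to hold up against the bandwidth actually available between site A and site B.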
One of the most overlooked questions in a project like this is how quickly we can fail back to the primary site when all is said and done. Windows Server 2012 takes this into consideration and allows for Reverse Replication automatically when a failback event occurs.
Now that we have a process to work from (and believe me, the process shown above can take many different turns and angles), we need a set of tools to work with. Since I work at Microsoft, the first tool that comes to mind is a spreadsheet! I just so happen to have said spreadsheet handy, and I will share it with you here.
Please continue reading at: Replication with Hyper-V Replica – Part II