In August, Brad Anderson, as part of his Transform the Datacenter blog series, covered Microsoft’s Cloud Integrated Disaster Recovery Solution: Hyper-V Recovery Manager. Last week we announced GA for Cloud OS. Today, we are excited to announce the Paid Preview of the Windows Azure Hyper-V Recovery Manager (HRM). This release of the service has been updated to support GA releases of Windows Server 2012 R2 and System Center Virtual Machine Manager 2012 R2. We are also happy to announce that the service now supports production deployments, so IT administrators can now start planning production roll out of DR plans based on HRM.
This post provides a refresher on the Hyper-V Recovery Manager service. In the coming weeks, we will post additional content on this blog covering new features, deployment guidance, and workload-specific DR implementation guidance.
Windows Azure Hyper-V Recovery Manager (HRM) helps protect your business critical services by coordinating the replication and recovery of System Center Virtual Machine Manager 2012 SP1 and System Center Virtual Machine Manager 2012 R2 private clouds at a secondary location. With automated protection, asynchronous ongoing replication, and orderly recovery, the HRM service can help restore important services accurately, consistently, and with minimal downtime.
Leveraging existing investments that fabric administrators have made in System Center Virtual Machine Manager (VMM), and by extending key enhancements made in Windows Server 2012 R2 Hyper-V Replica (HVR), HRM provides a cloud-integrated solution for Disaster Recovery, which minimizes the cost and complexity associated with current solutions.
Our singular focus with HRM is democratizing Disaster Recovery by making it available to everybody, everywhere. HRM builds on the world-class assets of Windows Server, System Center, and Windows Azure and is delivered via the Windows Azure Management Portal. Coupled with Windows Azure Backup (WAB), which offers data protection or backup, HRM completes the “Recovery Services” offering in Windows Azure. The following diagram provides an overview of the HRM Architecture:
To build a simple, easily deployable, and readily operable Disaster Recovery (DR) solution, we focused on five key tenets:
- Highly Secure
- Enlightened with VMM
- Works at cloud-scale for all clouds
- Application-level failover
- Service-oriented extensible approach
Application data always travels on your on-premises replication channel. Only the metadata needed for orchestration (such as the names of logical clouds, virtual machines, and networks) is sent to Azure. All traffic sent to or from Azure is encrypted, as shown in the following schematic.
Installation and Registration
Once you have planned your deployment from a capacity, topology, and security perspective such that the secondary site provides the resources needed for business continuity, you are ready to protect your Virtual Machines and create orchestration units for failover.
Using the Windows Azure Management Portal, browse to Data Services > Recovery Services and click New to create a New Hyper-V Recovery Manager Vault. You can name the vault and specify a region where you would like the vault to be created. For Example: We created our ContosoDR vault in East Asia using Quick Create.
A toast at the bottom of the portal indicates progress, and once the creation process completes, a message tells you that the vault has been successfully created. The newly created vault is then listed as Active in the resources for Recovery Services. You can now upload a certificate to the vault, either a self-signed certificate or one issued by a Microsoft-trusted CA.
To complete the installation and registration process, the Hyper-V Recovery Manager provider needs to be installed only on the VMM servers that manage your Hyper-V hosts. The provider is available from the Windows Azure Download Center and follows a simple setup process. It communicates securely with the HRM service, and no additional agent or software is needed to complete the DR setup. Because HRM manages all your sites, as well as complex inter-site relationships, from one portal in the cloud, it reduces the complexity and risk associated with on-premises DR solutions. Because the service is hosted in Windows Azure, the DR solution itself is protected against disasters.
Configure Clouds for Protection and Mapping Networks
Once the VMM servers have been registered, all clouds configured on the servers are displayed in the vault. HRM works at the logical abstractions of VMM, making it a truly cloud-scale solution. To map the resources for compute and memory, you can now configure Protected Items that represent logical clouds. For Example: We chose to protect our NewYork_Primary cloud by the Chicago_Recovery cloud in the ContosoDR vault.
Once you have specified the configuration settings and submitted the operation, the Jobs tab on the portal provides details of the various stages in the cloud configuration process. This view presents rich information for all user-driven gestures, showing their progress and explicitly flagging events that need user action. For Example: The progress of the protection configuration for NewYork_Primary is shown below.
If you have ensured that the capacity of the recovery cloud will meet the DR requirements of virtual machines protected in the primary cloud, the system transparently configures the hosts with the required certificates, firewall rules, and HVR settings. This works equally well for stand-alone hosts or clusters. The diagram below shows the same process with hosts configured.
Just as HRM allows you to configure protection of your existing SCVMM clouds, networks created by Fabric Administrators as part of their SCVMM deployment can be mapped as well. The mapping ensures intelligent placement of virtual machines, as well as business continuity and availability after failover. Furthermore, if you are using static IPs (and this is most likely the case), the service will reserve an IP address for the virtual machine and inject it into the virtual machine on failover.
Network mapping works for an entire array of networks – VLANs or Hyper-V Network Virtualization. It even works for heterogeneous deployments, by supporting different types of networks on primary and recovery sites.
Refer to our detailed instruction guide for advanced information on how to configure cloud protection and map networks. The schematic below shows the tenant networks of the multi-tenanted Gold cloud as mapped to the tenant networks of the multi-tenanted Gold-Recovery cloud. Because of this mapping, the replica virtual machines are attached to the corresponding recovery networks. For example, the replica virtual machine of Marketing is attached to Network Marketing Recovery since (a) the primary virtual machine is connected to Network Marketing and (b) Network Marketing in turn is mapped to Network Marketing Recovery.
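The lookup described above can be sketched in a few lines of Python. This is only an illustrative model of the idea, not the HRM or VMM API: the `VirtualMachine` class, the `recovery_network_for` function, and the dictionary layout are all hypothetical names introduced for this example. Only the network and virtual machine names come from the Marketing example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VirtualMachine:
    """Hypothetical stand-in for a protected VM; not an HRM or VMM type."""
    name: str
    network: str  # network the primary virtual machine is connected to


def recovery_network_for(vm, network_mapping):
    """Return the recovery-site network the replica of `vm` attaches to."""
    try:
        return network_mapping[vm.network]
    except KeyError:
        # An unmapped network means the replica has no defined placement.
        raise LookupError(
            f"No recovery network mapped for {vm.network!r} (VM {vm.name!r})"
        ) from None


# Primary network -> recovery network, as in the Marketing example.
network_mapping = {"Network Marketing": "Network Marketing Recovery"}

marketing_vm = VirtualMachine(name="Marketing", network="Network Marketing")
print(recovery_network_for(marketing_vm, network_mapping))
```

Raising an error for an unmapped network is a design choice of this sketch; it simply makes a missing mapping visible rather than silently leaving a replica unattached.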
Virtual Machine Protection and Recovery Plans
With your clouds protected and networks mapped, you can Enable Protection for Virtual Machines by updating the Manage Protection property of the VMs in SCVMM. The Jobs tab on the HRM Dashboard indicates progress. For Example: We enabled Disaster Protection for NYVM003 using SCVMM; corresponding in-progress job information starts updating on the HRM portal
To orchestrate failovers with built-in support for dependency groups and custom actions, we developed a robust Recovery Plan framework (we like to call the plans RPs). RPs allow administrators to digitize the intelligence and details of their disaster recovery plans. Apart from helping our customers meet their compliance and audit requirements, RPs also help us deliver on our promise of simplified orchestration. Once an RP has been created, it can be used to orchestrate failover and recovery, whether for a DR drill or for an actual failover. For Example: We created a NewYork Recovery Plan composed of two groups for 3 VMs. We added custom actions to invoke scripts that needed to be executed as part of the failover. All VMs in a group fail over together, thereby improving the Recovery Time Objective (RTO). Across groups, the failover runs in sequence, thereby preserving dependencies: Group1 followed by Group2, and so on.
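The ordering rule for a recovery plan (virtual machines within a group together, groups one after another) can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not how HRM is implemented: `Group`, `run_recovery_plan`, and the `post_actions` hook are hypothetical, running custom actions after a group is an assumption of this example, and NYVM001 and NYVM002 are made-up names alongside NYVM003.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Group:
    """Hypothetical recovery-plan group; not an HRM type."""
    name: str
    vms: List[str]
    # Assumed hook: callables run once every VM in the group has failed over.
    post_actions: List[Callable[[], None]] = field(default_factory=list)


def run_recovery_plan(groups, fail_over_vm):
    """Fail over groups in order; VMs inside one group run concurrently."""
    for group in groups:
        with ThreadPoolExecutor(max_workers=max(1, len(group.vms))) as pool:
            # list() waits for every VM in the group and surfaces any error
            # before the next group is allowed to start.
            list(pool.map(fail_over_vm, group.vms))
        for action in group.post_actions:
            action()


events = []
plan = [
    Group("Group1", ["NYVM001", "NYVM002"],
          post_actions=[lambda: events.append("script:after-Group1")]),
    Group("Group2", ["NYVM003"]),
]
run_recovery_plan(plan, lambda vm: events.append("failover:" + vm))
print(events)
```

The two Group1 entries may appear in either order because they run concurrently, but the custom action and the Group2 failover always follow them, which is the dependency guarantee the plan is meant to preserve.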
At this point, you have successfully protected your workloads and ensured that, in the event of a disaster, they will fail over and resume work at the recovery site. During an actual disaster, recovery plans are used to perform a planned or an unplanned failover. HRM also supports Test Failovers, or Disaster Recovery Drills, for purposes such as compliance, training staff on their roles through simulated runs, and system patching. For Example: The NewYork Recovery Plan that we created can be used to perform a Test Failover, or an actual Planned (or Unplanned) Failover
- Test Failover or DR Drills: Enable support for application testing by creating test virtual machines and networks as specified by the user. Without impacting production workloads or their protection, HRM can quickly enable periodic workload testing
- Planned Failovers (PFO): For compliance, or in the event of a planned outage, customers can use planned failovers. Virtual machines are shut down, final changes are replicated to ensure zero data loss, and the virtual machines are then brought up on the recovery site in the order specified by the RP. More importantly, failback is a single-click gesture that executes a planned failover in the reverse direction
- Unplanned Failovers (UFO): In the event of unplanned outage or a natural disaster, HRM opportunistically attempts to shut down the primary machines in case some of the virtual machines are still running when the disaster strikes. It then automates their recovery on the secondary site as specified by the RP
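To compare the three failover modes side by side, the steps listed above can be written out as data. The sketch below is only a summary aid: `FailoverType` and `failover_steps` are hypothetical names for this example, and the step strings paraphrase the bullets rather than describe any HRM interface.

```python
from enum import Enum


class FailoverType(Enum):
    TEST = "test"
    PLANNED = "planned"
    UNPLANNED = "unplanned"


def failover_steps(kind):
    """Return the ordered high-level steps for one failover mode."""
    start_replicas = "bring up replica virtual machines in recovery-plan order"
    if kind is FailoverType.TEST:
        # Production workloads and their protection are left untouched.
        return ["create test virtual machines and networks"]
    if kind is FailoverType.PLANNED:
        return [
            "shut down primary virtual machines",
            "replicate final changes to the recovery site",
            start_replicas,
        ]
    if kind is FailoverType.UNPLANNED:
        return [
            "attempt to shut down primary virtual machines still running",
            start_replicas,
        ]
    raise ValueError(f"unknown failover type: {kind!r}")


for kind in FailoverType:
    print(kind.name, failover_steps(kind))
```

Seen this way, the difference between the planned and unplanned modes is the final replication step: only a planned failover can replicate the last changes, which is why only it is described as having zero data loss.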
Windows Azure Hyper-V Recovery Manager builds on top of world-class assets of Windows Server, System Center, and Windows Azure with the goal of reducing the cost and complexity associated with current Disaster Recovery solutions. With dedicated focus on simplified deployment and operation, commitment to security, ability to work at cloud-scale, and support for application-level failover, Hyper-V Recovery Manager ensures that no workload is left behind when it comes to disaster recovery.
You can read more about Windows Azure Hyper-V Recovery Manager in Brad Anderson’s 9-part series, Transform the Datacenter. To learn more about setting up Hyper-V Recovery Manager in your deployment, follow our detailed step-by-step guide. You can also visit the HRM forum on MSDN for additional information and to engage with other customers. Check out additional product and pricing information and sign up to try out the Windows Azure Hyper-V Recovery Manager service.
We look forward to helping you protect your business critical workloads and enabling a seamless disaster recovery solution for your organizations.
Abhishek Agrawal, Senior Program Manager Lead, Windows Server & System Center