Automated Disaster Recovery Testing and Failover with Hyper-V Replica and PowerShell 3.0 for FREE!


Hyper-V Replica is a new built-in feature of both Windows Server 2012 and our FREE Hyper-V Server 2012 products.  Hyper-V Replica lets Hyper-V hosts or clusters replicate running VMs to remote Hyper-V hosts over a standard IP WAN connection.  It provides a very cost-effective disaster recovery solution in the event of a primary data center outage. To get this feature in the VMware world, we'd pay gobs of extra money to license "Site Recovery Manager", but in Hyper-V it's just included as part of the core feature set at no extra cost.

To learn more about Hyper-V Replica, Kevin Remde, my friend and colleague, has a great new post over on his "Full of I.T." blog.  Be sure to check it out!

How Can I Automate Disaster Recovery and DR Testing with Hyper-V Replica?

There are two ways to manage the Hyper-V Replica feature:

  • GUI – We can use the built-in Hyper-V Manager tool to manage Hyper-V Replica settings, enable VMs for replication, and execute planned failovers, unplanned failovers and test failovers … but, being a GUI tool, it's intended for interacting with Hyper-V Replica on an individual VM-by-VM basis.
     
  • Script – PowerShell 3.0 (of course – what else?!) includes 22 Cmdlets for configuring, enabling, monitoring, and managing Hyper-V Replica on an automated basis. 

Since most organizations have many VMs and often want an automated solution for managing Disaster Recovery, the basic components of a scripted solution are what we'll focus on in this article.

Enabling Hyper-V Replica on each Hyper-V Host

The first step to leveraging Hyper-V Replica is enabling it on each Hyper-V Host.  With PowerShell 3.0, this can easily be done using the "Set-VMReplicationServer" Cmdlet as follows:

Set-VMReplicationServer -ComputerName HYPERV_HOST -ReplicationEnabled $true -AllowedAuthenticationType Kerberos -ReplicationAllowedFromAnyServer $true -DefaultStorageLocation DRIVE:\FOLDER
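If you have more than a couple of hosts, the same command can be wrapped in a loop. The following is only a sketch: the function name Enable-ReplicaOnHosts, its parameter names, and the host names and path in the example call are hypothetical placeholders, not part of Hyper-V itself.

```powershell
# Sketch: enable Hyper-V Replica on a list of hosts.
# Enable-ReplicaOnHosts and its parameter names are hypothetical.
function Enable-ReplicaOnHosts {
    param(
        [string[]]$HostNames,
        [string]$StoragePath
    )
    foreach ($HostName in $HostNames) {
        # Same settings as the single-host command above, passed via splatting
        $Settings = @{
            ComputerName                    = $HostName
            ReplicationEnabled              = $true
            AllowedAuthenticationType       = 'Kerberos'
            ReplicationAllowedFromAnyServer = $true
            DefaultStorageLocation          = $StoragePath
        }
        Set-VMReplicationServer @Settings
    }
}

# Example call (placeholder values):
# Enable-ReplicaOnHosts -HostNames 'HYPERV_HOST1','HYPERV_HOST2' -StoragePath 'DRIVE:\FOLDER'
```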

If you've got Windows Firewall enabled on each of your Hyper-V hosts, you'll also need to make sure that the Hyper-V Replica inbound port is open on each Hyper-V host. You can do this using the "Enable-NetFirewallRule" PowerShell 3.0 Cmdlet, which replaces the now deprecated "netsh advfirewall" command line tool.

Enable-NetFirewallRule -DisplayName "Hyper-V Replica HTTP Listener (TCP-In)"
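To open the rule on several hosts without logging on to each one, the cmdlet's -CimSession parameter can target remote computers. This is a rough sketch that assumes remote management is already enabled on the hosts; Enable-ReplicaFirewallRule and the host names are hypothetical.

```powershell
# Sketch: open the Hyper-V Replica HTTP listener rule on several hosts.
# Enable-ReplicaFirewallRule and its parameter name are hypothetical.
function Enable-ReplicaFirewallRule {
    param([string[]]$HostNames)
    foreach ($HostName in $HostNames) {
        # -CimSession accepts a computer name and runs the cmdlet against that host
        Enable-NetFirewallRule -DisplayName 'Hyper-V Replica HTTP Listener (TCP-In)' -CimSession $HostName
    }
}

# Example call (placeholder values):
# Enable-ReplicaFirewallRule -HostNames 'HYPERV_HOST1','HYPERV_HOST2'
```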

Enabling Replication for each Virtual Machine

After Hyper-V Replica is enabled on each Hyper-V host, the next step is to enable replication for each Virtual Machine currently running on your primary Hyper-V hosts.  To do this, you can use the "Enable-VMReplication" PowerShell 3.0 Cmdlet as follows:

Enable-VMReplication -VMName VM_NAME -ReplicaServerName DEST_HYPERV_HOST -ReplicaServerPort 80 -AuthenticationType Kerberos -CompressionEnabled $true -RecoveryHistory 0 

Note that once VM replication is enabled, you can then use the "Set-VMReplication" PowerShell Cmdlet should you need to modify any of the initial replication settings.
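Since the goal is automation, here is a sketch of enabling replication for every VM on a primary host in one pass. It assumes none of the VMs is already enabled for replication, and the function name Enable-ReplicationForAllVMs and its parameters are hypothetical. It also calls "Start-VMInitialReplication", because enabling replication alone does not begin copying data.

```powershell
# Sketch: enable replication for every VM on a primary host, then start the initial copy.
# Enable-ReplicationForAllVMs and its parameter names are hypothetical.
function Enable-ReplicationForAllVMs {
    param(
        [string]$PrimaryHost,
        [string]$ReplicaHost
    )
    foreach ($VM in Get-VM -ComputerName $PrimaryHost) {
        # Same settings as the single-VM command above
        $Settings = @{
            ComputerName       = $PrimaryHost
            VMName             = $VM.Name
            ReplicaServerName  = $ReplicaHost
            ReplicaServerPort  = 80
            AuthenticationType = 'Kerberos'
            CompressionEnabled = $true
            RecoveryHistory    = 0
        }
        Enable-VMReplication @Settings
        # Kick off the initial replication for this VM
        Start-VMInitialReplication -ComputerName $PrimaryHost -VMName $VM.Name
    }
}

# Example call (placeholder values):
# Enable-ReplicationForAllVMs -PrimaryHost 'HYPERV_HOST' -ReplicaHost 'DEST_HYPERV_HOST'
```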

Checking in on Replication

Once a VM is enabled for replication, you can check on the replication status by using the "Measure-VMReplication" PowerShell cmdlet as follows:

Measure-VMReplication -VMName VM_NAME

Name      State       Health LReplTime            PReplSize(M) AvgLatency AvgReplSize(M) SuccReplCount
----      -----       ------ ---------            ------------ ---------- -------------- -------------
Win8Ent01 Replicating Normal 10/5/2012 9:57:31 AM 0.0039       00:08:41   3,691.15       3 of 3

In the Cmdlet output above, we see the state (Replicating), the overall Health and Last Replication Time, as well as some useful insights for WAN capacity planning to support replication: AvgLatency and AvgReplSize (in MB). To use these statistics for planning your Recovery Point Objective (RPO) and bandwidth requirements between hosts, check out this article posted by one of my peer IT Pro Technical Evangelists, Tommy Patterson.
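As a back-of-the-envelope illustration of how AvgReplSize relates to bandwidth, the sketch below converts an average replication size into the sustained throughput needed to ship it within one replication interval. Get-ReplicaBandwidthMbps is a hypothetical helper, and the 300-second default assumes the 5-minute replication interval of Hyper-V Replica in Windows Server 2012. It is no substitute for proper capacity planning.

```powershell
# Sketch: rough WAN bandwidth estimate from a Measure-VMReplication figure.
# Get-ReplicaBandwidthMbps is a hypothetical helper; the 300-second interval is an assumption.
function Get-ReplicaBandwidthMbps {
    param(
        [double]$AvgReplSizeMB,
        [int]$IntervalSeconds = 300
    )
    # MB -> megabits (x8), spread across one replication interval
    [math]::Round(($AvgReplSizeMB * 8) / $IntervalSeconds, 2)
}

Get-ReplicaBandwidthMbps -AvgReplSizeMB 3691.15
```

For the sample output above this returns 98.43 (Mbps). An average taken over only three cycles may be skewed by the initial replication, so it's worth sampling over a longer period before sizing a link.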

Testing VM Failover

Testing Disaster Recovery Failover with traditional solutions is usually a painful process. To test failover procedures, you normally have to completely fail over your production workloads to your DR site after hours and then, oftentimes, spend the remainder of your maintenance window trying to fail everything back to production before users come back to work … Not so with Hyper-V Replica! Hyper-V provides the ability to test failover at any time by using the "Start-VMFailover" cmdlet with the -AsTest parameter.  Here's an example:

Start-VMFailover -ComputerName DEST_HYPERV_HOST -VMName VM_NAME -AsTest 

When testing failover using the Cmdlet above, Hyper-V will create a copy of the replicated VM that is disconnected from all virtual networks.  This allows you to start that VM and verify that all services are running properly, without risk of exposing the replicated VM to your production network and causing potential network conflicts.  Cool stuff!
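A test run usually has two halves: bring the test copy up, then clean it up when you're done. Here's a sketch of both. Start-DrTest and Stop-DrTest are hypothetical wrapper names, and the "<VM name> - Test" naming of the test copy is an assumption you should confirm on your own replica host.

```powershell
# Sketch: start a test failover, and later end it.
# Start-DrTest and Stop-DrTest are hypothetical wrappers.
function Start-DrTest {
    param([string]$ReplicaHost, [string]$VMName)
    Start-VMFailover -ComputerName $ReplicaHost -VMName $VMName -AsTest
    # Assumption: the test copy is named "<VM name> - Test"
    Start-VM -ComputerName $ReplicaHost -Name "$VMName - Test"
}

function Stop-DrTest {
    param([string]$ReplicaHost, [string]$VMName)
    # Ends the test failover on the replica VM, which should also discard the test copy
    Stop-VMFailover -ComputerName $ReplicaHost -VMName $VMName
}

# Example calls (placeholder values):
# Start-DrTest -ReplicaHost 'DEST_HYPERV_HOST' -VMName 'VM_NAME'
# Stop-DrTest  -ReplicaHost 'DEST_HYPERV_HOST' -VMName 'VM_NAME'
```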

Checking for Data Center Connectivity

Hyper-V Replica is intended as a data center, or site-wide, recovery solution. As such, we'd normally want to execute a failover in situations where the primary data center or Hyper-V replica server is no longer reachable.  To test for Hyper-V Replica connectivity between the primary and secondary sites, we can create a PowerShell function that leverages the "Test-VMReplicationConnection" Cmdlet as follows:

Function PrimarySiteAvailable {
    Param ([string]$HyperVHost)

    $Test = Test-VMReplicationConnection -AuthenticationType Kerberos -ReplicaServerName $HyperVHost -ReplicaServerPort 80 -ErrorAction SilentlyContinue

    If ( $Test -match "was successful") {
        Return $True
    }
    Else {
        Return $False
    }
}

Now, we have a function we can call from the remote Hyper-V Replica host periodically to test connectivity to the primary Hyper-V Replica host and site as follows:

$IsPrimarySiteUp = PrimarySiteAvailable -HyperVHost PRIMARY_HYPERV_HOST

If ($IsPrimarySiteUp -eq $False) { … Insert Failover or Notification Commands Here … }

Of course, we could get a lot more sophisticated with our code to test connectivity by including additional support for multiple retries, timing logic, running as a scheduled task, etc … but this example is sufficient to demonstrate the basic capabilities.
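To give a flavor of the retry idea, here is a minimal sketch that runs a connectivity probe several times before declaring the site down, so a single dropped packet doesn't trigger anything. Test-SiteWithRetry and its parameters ($Probe, $Attempts, $DelaySeconds) are hypothetical names, and the defaults are arbitrary.

```powershell
# Sketch: call a connectivity probe several times before declaring the site down.
# Test-SiteWithRetry, $Probe, $Attempts and $DelaySeconds are hypothetical names.
function Test-SiteWithRetry {
    param(
        [scriptblock]$Probe,
        [int]$Attempts = 3,
        [int]$DelaySeconds = 60
    )
    for ($i = 1; $i -le $Attempts; $i++) {
        # Any single success means the site is reachable
        if (& $Probe) { return $true }
        # Wait between attempts, but not after the last one
        if ($i -lt $Attempts) { Start-Sleep -Seconds $DelaySeconds }
    }
    # Every attempt failed
    return $false
}

# Example call, reusing the PrimarySiteAvailable function defined earlier:
# $IsPrimarySiteUp = Test-SiteWithRetry -Probe { PrimarySiteAvailable -HyperVHost PRIMARY_HYPERV_HOST }
```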

CAUTION! True automated datacenter failover can be tricky to implement due to potential "Split Brain" scenarios with some application configurations.  Be sure that you test your logic to account for these conditions with the applications that you use in your environment.  Many organizations stop just short of fully automating the failover process and instead have a notification sent to the Admin team when a primary site outage is detected.  The Admin team can then triage the situation and determine if a failover is warranted.  To speed the failover process, the Admin team can then choose to use a script containing the Cmdlets below for each VM requiring failover.
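For the notify-instead-of-failover approach, a sketch using the built-in "Send-MailMessage" Cmdlet might look like the following. Send-OutageNotification is a hypothetical function name, and the addresses and SMTP server are placeholders you'd replace with your own.

```powershell
# Sketch: notify the Admin team instead of failing over automatically.
# Send-OutageNotification and every address/server value below are hypothetical placeholders.
function Send-OutageNotification {
    param([string]$PrimaryHost)
    $Mail = @{
        To         = 'ADMIN_TEAM@EXAMPLE.COM'
        From       = 'DR_MONITOR@EXAMPLE.COM'
        Subject    = "Hyper-V Replica: $PrimaryHost is not reachable"
        Body       = "Replication connectivity to $PrimaryHost failed at $(Get-Date). Please triage and decide whether a failover is warranted."
        SmtpServer = 'SMTP_SERVER'
    }
    Send-MailMessage @Mail
}

# Example call (placeholder value):
# Send-OutageNotification -PrimaryHost 'PRIMARY_HYPERV_HOST'
```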

At last! Performing Scripted Failover …

We're almost there … we just need to add the actual Cmdlets that perform VM failover into a PowerShell script to automate the failover process when we invoke it.  We've already seen the needed Cmdlet above when we were testing failover: our old friend "Start-VMFailover".  We'll just remove the -AsTest parameter to perform a real unplanned failover.  Here's an example:

$VM = Start-VMFailover -ComputerName DEST_HYPERV_HOST -VMName VM_NAME -PassThru -Confirm:$false

Start-VM -VM $VM
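To run those two Cmdlets for each VM requiring failover, they can be wrapped in a loop over a list of VM names. This is a sketch only: Start-SiteFailover and its parameters are hypothetical, and it does nothing to guard against the split-brain conditions mentioned above.

```powershell
# Sketch: fail over and start a list of replica VMs on the DR host.
# Start-SiteFailover and its parameter names are hypothetical.
function Start-SiteFailover {
    param(
        [string]$ReplicaHost,
        [string[]]$VMNames
    )
    foreach ($VMName in $VMNames) {
        # Unplanned failover of the replica VM, without a confirmation prompt
        $VM = Start-VMFailover -ComputerName $ReplicaHost -VMName $VMName -PassThru -Confirm:$false
        # Bring the failed-over VM online
        Start-VM -VM $VM
    }
}

# Example call (placeholder values):
# Start-SiteFailover -ReplicaHost 'DEST_HYPERV_HOST' -VMNames 'VM_NAME_1','VM_NAME_2'
```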

What's Next?

In this article, we walked through examples of the PowerShell 3.0 Cmdlets that you can leverage to easily implement an enterprise disaster recovery solution using Windows Server 2012 and Hyper-V Replica.  To continue your learning, check out these resources next …

  • Hyper-V Replica: Learn more about Hyper-V Replica in this post on Kevin's blog!
  • PowerShell: Get up to speed on PowerShell 3.0 in this post on Matt's blog!
  • Server Management: Interested in learning more about server management in Windows Server 2012?  Check out this post on Brian's blog!
  • Want to Get Certified on Windows Server 2012? Become an "Early Expert" for FREE at http://EarlyExperts.net!

How Are You Using Hyper-V Replica?

Have you found a unique and interesting use case for Hyper-V Replica and/or PowerShell 3.0?  Be sure to share your story in the comments below!

HTH,

Keith

Comments (21)
  1. KeithMayer says:

    Hi Michal,

    We definitely have this feature! 🙂  Automated failover of a set of VMs from one host to another is provided by our integrated Failover Clustering feature.  Windows Server 2012 and Hyper-V include a number of high availability features: Live Migration for moving running VMs host-to-host during planned downtime of a host, Failover Clustering for automated failover during unplanned downtime of a host, and Hyper-V Replica as a solution for recovery of site-level disasters to a remote datacenter or DR location.  All of these features are provided in Windows Server 2012 Standard edition, Datacenter edition and our completely free Hyper-V Server 2012 product.

    In your scenario above, I'd recommend investigating Failover Clustering.  With Windows Server 2012 and Hyper-V Server 2012, you can create clusters of up to 64 host servers that provide automated failover for up to 8,000 VMs across that cluster.  

    You can check-out the steps involved in building a Failover Cluster at the following article location:

    blogs.technet.com/…/step-by-step-building-a-free-hyper-v-server-2012-cluster-part-1-of-2.aspx

    Hope this helps!

    Keith

  2. KeithMayer says:

    Hi Josh,

    Thanks for your comment!  You are correct that SRM does help to automate the failover and failback process in VMware hypervisor environments.  However, a couple of points to keep in mind: first, SRM is certainly not free (starting at $5K+ USD for smaller environments).  Second, SRM does not automatically initiate failover and failback, but rather orchestrates the failover and failback process of multiple VMs via a defined recovery plan when an administrator initiates the recovery plan manually.

    If you're instead looking for easy, orchestrated failover and failback, you may be interested in checking out our new Windows Azure Hyper-V Recovery Manager – which provides site-to-site protection of entire Private Clouds.  By leveraging Windows Azure, the entire recovery plan is safely located off-site in the cloud, so that you also don't have to be concerned with a physical site disaster preventing the initiation of your disaster plan.

    You can check out the details on Windows Azure Hyper-V Recovery Manager, which is really the appropriate solution to contrast with SRM, at blogs.technet.com/…/step-by-step-disaster-recovery-for-private-clouds-with-windows-azure-hyper-v-recovery-manager-build-your-private-cloud-in-a-month.aspx.

    Hope this helps!

    Keith

  3. Great article. It helps me in improving Disaster Recovery for our SharePoint enterprise infrastructure. Many thanks for sharing.

    -T.s

  4. KeithMayer says:

    Thanks, PowerComp, for your detailed script example!  Fantastic work! 🙂

  5. DZoquier says:

    Thanks for the article.  This is just what I was looking for.  I will start to implement it tomorrow.  Wish me luck and I may have more questions.

  6. michal says:

    i've been playing around with live migration, replication and other features of hyperv in server 2012.  i cannot find a way to setup automated failover.  planned failover and failover testing works great… but if the source vm/host goes offline in the middle of the night i want the replica to start up on its own.  unless i'm missing something MS forgot to include this feature.   Can't really compete with vmware without it.

  7. PowerComp says:

    Failover Clustering is great, but not if shared storage isn't an option. I took this post and developed a very robust AUTOMATIC FAILOVER script. It first checks if the VM is online; if not it checks the host's hyper-v service status (for our needs, this works better because the failover isn't triggered by normal VM reboots). If the host hyper-v service is down a countdown is triggered which rechecks the VM and HyperV service at specific intervals. (NOTE: the host Hyper-V service can be down while the VM is still running!!! This is checked for.) While the failover isn't instant, the check and wait periods can be adjusted to other needs. Thanks for the original post, hope this helps!

  8. PowerComp says:

    Here's the script:

    Function PrimarySiteAvailable {
        $Test = Test-VMReplicationConnection -AuthenticationType Kerberos -ReplicaServerName HOST1 -ReplicaServerPort 80 -ErrorAction SilentlyContinue
        If ($Test -match "was successful") {
            Return $True
        }
        Else {
            Return $False
        }
    }

    Start-Sleep -s 5

    # Poll the VM's Server service until it stops reporting "Running"
    Do {
        $status = (get-service -Name lanmanserver -ComputerName Webctrl).Status
        $date = Get-Date
        " Monitoring VM "
        " "
        Write-Host "VM is online "  $date -ForegroundColor "Green"
        Start-Sleep -s 10
        cls
    } While ($status -eq "Running")

    cls
    Write-Host "VM is offline " $date  -Foregroundcolor "Yellow"
    " "
    "checking HOST1 HYPER-V service…"
    " "

    # While the host still answers, keep watching; restart the script if the VM comes back
    Do {
        # The function takes no parameters, so these arguments are ignored; HOST1 is hard-coded above
        $IsPrimarySiteUp = PrimarySiteAvailable -HOST1 PRIMARY_HYPERV_HOST
        $date = Get-Date
        " Monitoring HOST1 "
        " "
        Write-Host "HOST1 is online " $date -ForegroundColor "Green"
        Start-Sleep -s 30
        $status = (get-service -Name lanmanserver -ComputerName Webctrl).Status
        If ($status -eq "Running") {c:\scripts\control.bat} ELSE {}
    } While ($IsPrimarySiteUp -eq $True)
    " "

    ######## start failover countdown
    cls
    Write-Host " Primary VM server is not responding, starting AutoFailover in 10 minutes " -Foregroundcolor "Red"
    " "
    "press CTRL-C to abort"
    Start-sleep -s 240
    " "
    "checking HOST1 HYPER-V service…"
    " "
    $IsPrimarySiteUp = PrimarySiteAvailable -HOST1 PRIMARY_HYPERV_HOST
    If ($IsPrimarySiteUp -eq $True) {c:\scripts\control.bat} ELSE {}

    ########## 6 minute check
    cls
    Write-Host " Primary VM server is not responding, starting AutoFailover in 6 minutes " -Foregroundcolor "Red"
    " "
    "press CTRL-C to abort"
    Start-sleep -s 180
    " "
    "checking HOST1 HYPER-V service…"
    " "
    $IsPrimarySiteUp = PrimarySiteAvailable -HOST1 PRIMARY_HYPERV_HOST
    If ($IsPrimarySiteUp -eq $True) {c:\scripts\control.bat} ELSE {}

    ########## 3 minute check
    cls
    Write-Host " Primary VM server is not responding, starting AutoFailover in 3 minutes " -Foregroundcolor "Red"
    " "
    "press CTRL-C to abort"
    Start-sleep -s 180
    " "
    "checking HOST1 HYPER-V service…"
    " "
    $IsPrimarySiteUp = PrimarySiteAvailable -HOST1 PRIMARY_HYPERV_HOST
    If ($IsPrimarySiteUp -eq $True) {c:\scripts\control.bat} ELSE {}

    # Countdown expired with the host still unreachable: perform the failover
    cls
    " starting failover…"
    $VM = Start-VMFailover -ComputerName HOST2 -VMName WebCTRL -PassThru -Confirm:$false
    Start-VM -VM $VM

  9. PowerComp says:

    just a followup… the C:\scripts\control.bat is simply a batch file that calls this ps script. Calling it from within the script basically restarts it from scratch.

  10. Josh Holt says:

    Or.. you use Vmware and install SRM… which does this automatically.. both for failover AND failback..

  11. Anonymous says:

    December 23rd, 2013: Updated to include additional resources …

    Module 0: Added links for New FREE EBOOKS and Documentation for Windows Server 2012 R2, System Center 2012 R2 VMM and Windows Azure Pack
    Module 1: Added links to Datacenter TCO

  15. Anonymous says:

    In lots of customer discussions, the one thing that comes out often – How does Microsoft Virtualization

  16. Anonymous says:

    Update: The Hyper-V Replica Capacity Planner tool was updated this month with a new version that adds several new features, including support for Windows Server 2012 R2, Hyper-V Replica Extended Replication and new storage options. This article has

  17. Anonymous says:

    Last week, Brad Anderson, CVP for Windows Server and System Center, announced Windows Azure Hyper-V Recovery Manager (HRM) as a Cloud-integrated Disaster Recovery solution on his In the Cloud blog. Be sure to get the full details in Brad’s article

  18. Anonymous says:

    Implementing an effective Disaster Recovery and Business Continuance plan can be a really complicated and expensive ordeal for some Private Clouds. In many cases, competing solutions either don't meet the Recovery Point Objectives (RPO) that an organization

  19. Anonymous says:

    Throughout our Disaster Recovery Planning Series of IT Pros this month, we've been discussing the steps involved in planning an effective Disaster Recovery strategy … And planning for the additional capacity needed to implement a successful DR plan

  20. Anonymous says:

    April 11, 2014: Updated to include additional resources …

    Take this Build Your Cloud series with you "on-the-go" … Download our FREE Windows Phone app! Built for Windows Phone using App Studio

    My fellow Technical Evangelists

  21. Anonymous says:

    Pingback from VMware and Hyper-V comparison | CKINET
