SQL Server 2008 Hyper-V and high availability

Imagine you have a windows 2003/SQL Server 2005 cluster (active-active) and your mission is to both virtualise and maintain high availability using Hyper-V in Windows Server 2008/SQL Server 2008.  This may sound odd but I was asked the question at TechEd and I wanted to do some research on it. Fortunately I on Windows 7 training this weekend and have access to 2 of the key players Jeff Woolsey who ‘owns’ Hyper-V development and clustering product manager Dave Dion. 

So are your options? and more importantly how do they help?

I would start this by addressing what are you most worried about…

  • Server outage
  • Site Outage
  • Network issues
  • SAN issues and media loss

Also what are your goals e.g.

  • Database availability
  • User response time
  • Supported by your solution providers (e.g. hardware vendors,  Microsoft etc.)
  • Ease of Maintenance, automation diagnostics
  • Predictability that when something goes wrong you know what will happen to mitigate it.

Finally other workloads are also on this setup which will also be virtualised.

The customer who asked me this wanted to cluster the cluster by..

  • Creating a hyper-V cluster ClusterH on Node1 and Node2
  • Creating  virtual machines (VM) vmA and vmB to run on that Cluster with a resource group for each VM such that vmA runs on Node1 and vmB runs on Node2 (i.e. active active). Both of these are using pass though disks i.e. they are talking directly to a LUN on a SAN and so will perform  very well.
  • Clustering vmA and vmB to form ClusterV with SQL Server 2008 installed.

So how does this stack up against our criteria above?

Basically not very good..

Initially this solution looks quite good - if node 1 or node 2 dies then the surviving vm on the surviving node will take  ownership of the storage. But this will take time. Currently hyper-V VM’s only support iSCSI (fibre channel is on the way). The two VMs will have different routes to the same shared storage so if that is not accessible then the system will fail. However the killer is that Microsoft do not support SQL Server failover clustering of VMs as per this KB. BTW this limitation applies to all virtualisation technologies, not just Hyper-V.

Is there a better way? For example if you mirrored the critical databases between vmA and vmB then you now have a copy of the database on either shared or separate storage.  You could then script a manual failover to the mirror if the node running on the principal moved over to the other node as its storage would now be remote to the VM. This is supported, and mitigates a number of failures including:

  • Hardware failure of a node.
  • Media failure on one of the nodes or part of the SAN.
  • Site failure if the nodes are remote to each other and have shared SAN
  • An instance of SQL Server crashing

Of course this is at the database level, but the principal could be extended for more than one database and other workloads as well. 

However in my opinion the best way to consolidate here is to create a physical cluster and consolidate SQL Server workloads on into it.  Clustering is a lot easier in Windows Server 2008 and you n o longer need carefully matched hardware. though the surviving nodes in a cluster need to be powerful enough to handle all of the work of the cluster.

Let me know what you think and what your plans are in this space.

Technorati Tags: sql server,high availability,virtualisation