Windows Server Failover Cluster on Azure IaaS VM – Part 1 (Storage)

Hello, cluster fans. This is Mario Liu and I am a Support Escalation Engineer on the Windows High Availability team in Microsoft CSS Americas. I have good news for you: starting in April 2015, Microsoft will support Windows Server Failover Cluster (WSFC) on Azure IaaS Virtual Machines. Here is the supportability announcement for Windows Server on Azure VMs:

Microsoft server software support for Microsoft Azure virtual machines
https://support.microsoft.com/en-us/kb/2721672

The Failover Cluster feature is part of that announcement. The knowledge base article above is subject to change as more improvements for WSFC on Azure IaaS VMs are made. Please check the link above for the latest updates.

Today, I’d like to share the main differences between deploying WSFC on-premises and deploying it within Azure. First, the Azure VM operating system must be Windows Server 2008 R2, Windows Server 2012, or Windows Server 2012 R2. Please note that Windows Server 2008 R2 and Windows Server 2012 both require this hotfix to be installed.
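The operating system requirement above can be written down as a small lookup. This is only a sketch of the support statement as it appears in this post; the function name and the table are made up for illustration and are not part of any Microsoft tool, and the KB article remains the authoritative source.

```python
# Support matrix as stated in this post (April 2015).
# The value records whether the post says a hotfix is required.
# Names here are illustrative only, not part of any Microsoft tooling.
SUPPORTED_OS = {
    "Windows Server 2008 R2": True,   # hotfix required
    "Windows Server 2012": True,      # hotfix required
    "Windows Server 2012 R2": False,  # no hotfix mentioned in this post
}


def check_wsfc_os_support(os_name):
    """Return (supported, hotfix_required) for a guest OS name."""
    if os_name not in SUPPORTED_OS:
        return (False, False)
    return (True, SUPPORTED_OS[os_name])


if __name__ == "__main__":
    for name in ("Windows Server 2012", "Windows Server 2012 R2", "Windows Server 2008"):
        supported, hotfix = check_wsfc_os_support(name)
        print(f"{name}: supported={supported}, hotfix_required={hotfix}")
```

Keeping the matrix in one dictionary means that any change to the KB article only needs a change in one place.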

At a high level, the Failover Cluster feature does not change inside the VM and is still a standard Server OS feature. The challenges are outside the VM and relate to Storage and Network. In this blog, I will be discussing Storage.

The biggest challenge to implementing Failover Clustering in Azure is that Azure does not provide native shared block storage to VMs, unlike on-premises environments, where Fibre Channel SAN, SAS, or iSCSI is available. That has made SQL Server AlwaysOn Availability Groups (AG) the primary use case in Azure, because SQL AG does not use shared storage. Instead, it relies on its own replication at the application layer to replicate the SQL data across the Azure IaaS VMs.

image

We now have a few more options to work around the shared storage limitation, and that is how we can expand the scenarios beyond SQL AlwaysOn.

Option 1: Application-level replication for non-shared storage

Some applications handle replication themselves at the application layer. SQL Server AlwaysOn Availability Groups uses this method.

Option 2: Volume-level replication for non-shared storage

In other words, third-party storage replication.

image

A common third-party solution is SIOS DataKeeper Cluster Edition. It is just one example; there are other solutions on the market. For more details, please check SIOS’s website:

DataKeeper Cluster Edition: Real-Time Replication of Windows Server Environments
https://us.sios.com/products/datakeeper-cluster/

Option 3: Leverage ExpressRoute for remote iSCSI Target shared block storage for file-based storage from Azure IaaS VMs

ExpressRoute is an Azure-exclusive feature. It enables you to create dedicated private connections between Azure datacenters and infrastructure that’s on your premises. It provides high-throughput network connectivity so that disk performance is not degraded.

One existing example is NetApp Private Storage (NPS). NPS exposes an iSCSI Target via ExpressRoute with Equinix to Azure IaaS VMs.

Availability on Demand - ASR with NetApp Private Storage
https://channel9.msdn.com/Blogs/Windows-Azure/Availability-on-Demand-ASR-with-NetApp-Private-Storage

image

For more details about ExpressRoute, please see

ExpressRoute
https://azure.microsoft.com/en-us/services/expressroute/
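
To put the three options side by side, here is a minimal sketch that records them as data and picks one from two yes/no questions. The option names and examples come from this post; the `StorageOption` class, the `suggest_option` function, and the two questions are assumptions made up for illustration, not an official decision guide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageOption:
    number: int
    name: str
    shared_block_storage: bool  # does the cluster see shared block storage?
    example: str                # example named in this post


# The three options as described in this post.
OPTIONS = [
    StorageOption(1, "Application-level replication", False,
                  "SQL Server AlwaysOn Availability Groups"),
    StorageOption(2, "Volume-level replication", False,
                  "SIOS DataKeeper Cluster Edition"),
    StorageOption(3, "Remote iSCSI Target over ExpressRoute", True,
                  "NetApp Private Storage"),
]


def suggest_option(app_replicates_itself, needs_shared_block_storage):
    """Pick an option from two hypothetical yes/no questions.

    This is an illustrative rule of thumb, not guidance from Microsoft.
    """
    if needs_shared_block_storage:
        return OPTIONS[2]
    if app_replicates_itself:
        return OPTIONS[0]
    return OPTIONS[1]


if __name__ == "__main__":
    choice = suggest_option(app_replicates_itself=False,
                            needs_shared_block_storage=False)
    print(f"Option {choice.number}: {choice.name} (e.g. {choice.example})")
```

In a real deployment the choice also depends on cost, licensing, and performance requirements, which this sketch does not try to model.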

There will be more options to present “shared storage” to Failover Clusters as new scenarios emerge in the future. We’ll update this blog along with the KB once new announcements become available. Once you have the storage sorted out, you’ve built the foundation of the Cluster.

In my next blog, Part 2, I’ll go through the network part and the creation of a Cluster.

Stay tuned and enjoy clustering in Azure!

Mario Liu
Support Escalation Engineer
CSS Americas | WINDOWS | HIGH AVAILABILITY