Windows Server 2012 Continuous Availability File Server feature

General availability of Windows Server 2012 and Windows 8 is almost here!  As a result, I have been slowly absorbing all of the new features so I can better support my customers.  Continuous Availability File Server is my favorite feature, so I wanted to take a few minutes to highlight its importance and potential value.  When I think back on some of the more difficult cases I have worked on over the years, many of them were intermittent in nature.  One such scenario often involved highly available file shares on Windows Failover Clusters.  If your business has defined a network file server as mission critical, deploying a solution that optimizes uptime and reliability certainly makes sense.  A standalone file server does not meet those requirements, because a simple hardware failure makes its resources unavailable.

Back in Windows 2000 (yes, believe it or not, I still get calls on it) the product team created Highly Available File Shares to address this issue.  The basic idea is that the cluster creates a virtual network name and IP address and uses a shared disk presented to both nodes, so the resources can be hosted on either node.  If Node 1 owns the file share resource group and reboots for patching, the resource fails over to Node 2 within seconds, providing a highly available solution.

Notice that I said "highly available"; I didn't say fault tolerant.  You are still getting 99% uptime, but the reality is... your client had an outage.  When the group containing the file share moves, the customer may experience an outage, depending on the application.  To provide high availability, the cluster service routinely checks the health of the file share to confirm it is online.  How does it do that?  It periodically performs a directory listing of the file share itself to ensure it's alive and well.  There are many reasons why this check may fail, including memory pressure, Server service load, and file system filter drivers such as A/V; the list goes on and on.  Typical errors may look like the following:

Error Messages: ClusSvc Event ID:1055
Cluster File Share resource ‘demo’ has failed a status check. The error code is 64.

00000888.00001108::2009/08/04-02:16:09.021 INFO File Share <DEMO>: Retrying FindFirstFile on error 64 for share \\HAFileShare\demo\*.* !

ERR 64 = The specified network name is no longer available.
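To make the status check concrete, here is a minimal Python sketch of a "list the directory, retry on failure" probe.  It is only an illustration of the idea described above, not the cluster service's actual implementation (which, per the log line, calls FindFirstFile); the function name share_is_alive and the retries and delay parameters are my own hypothetical choices.

```python
import os
import time


def share_is_alive(path, retries=3, delay=0.5):
    """Return True if a directory listing of `path` succeeds within `retries` attempts.

    Hypothetical helper: it mimics the idea of the health check, not its real code.
    """
    for attempt in range(1, retries + 1):
        try:
            with os.scandir(path) as entries:
                next(entries, None)  # touching the first entry is enough
            return True
        except OSError as err:
            # On Windows, a dropped share surfaces here as an OSError.
            print(f"attempt {attempt}: listing {path} failed ({err})")
            if attempt < retries:
                time.sleep(delay)
    return False
```

Pointed at a UNC path such as r"\\HAFileShare\demo", anything that makes the listing fail (filter drivers, memory pressure, and so on) fails the probe in the same way, which is why these errors tend to be intermittent.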

The question in my mind is “Does Windows Server 2012 offer a solution to this problem?”  The answer is obvious; otherwise I wouldn’t have written this blog ;)

One of the primary design goals the product team had for Windows Server 2012 was to enable workloads such as SQL Server or Hyper-V to use SMB file server back-end storage as primary storage for datacenter scenarios.  So what do SQL Server and Hyper-V have to do with the scenario above?  Well, part of the framework required for that objective was a major overhaul of SMB to version 3, including “transparent failover”.  Windows Server 2012 offers two file server types: the one we all know and love, now called the “classic” file server, and the "scale-out" file server that supports the workloads above.  The “classic” file server role can now fail over a file share resource with zero downtime for SMB3 clients, and it is the scenario we have been discussing.  Scale-out file server is a different discussion for another blog.

I know what you’re thinking… this is awesome, all I have to do is migrate my back-end file server to Windows Server 2012 and I’m golden.  After all, you’re more than halfway through your Windows XP to Windows 7 refresh project.  Well… no.  This technology, at its core, depends on SMB3, which is included only in Windows Server 2012 and Windows 8.  Therefore, both the file server and the clients must be running those SKUs to take advantage of this feature.
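The client dependency is easier to see as a dialect negotiation.  The sketch below is a deliberately simplified model in Python: client and server settle on the newest SMB dialect they both speak, and transparent failover is only on the table when that dialect is 3.0.  The table and function names are my own illustration, and real negotiation involves capabilities and share settings that are not modeled here.

```python
# Dialects ordered oldest to newest.
DIALECT_ORDER = ["1.0", "2.0.2", "2.1", "3.0"]

# Simplified view of which dialects each OS can speak.
SUPPORTED = {
    "Windows XP": ["1.0"],
    "Windows 7": ["1.0", "2.0.2", "2.1"],
    "Windows 8": ["1.0", "2.0.2", "2.1", "3.0"],
    "Windows Server 2012": ["1.0", "2.0.2", "2.1", "3.0"],
}


def negotiate(client, server):
    """Return the newest dialect that both ends support."""
    common = set(SUPPORTED[client]) & set(SUPPORTED[server])
    return max(common, key=DIALECT_ORDER.index)


def supports_transparent_failover(client, server):
    """Transparent failover needs SMB 3.0 on both ends (simplified)."""
    return negotiate(client, server) == "3.0"


for client in ("Windows XP", "Windows 7", "Windows 8"):
    dialect = negotiate(client, "Windows Server 2012")
    ok = supports_transparent_failover(client, "Windows Server 2012")
    print(f"{client}: SMB {dialect}, transparent failover: {ok}")
```

In this model a Windows 7 client still connects to the new file server without trouble; it simply negotiates SMB 2.1 and gets the old failover behavior.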

For me, seeing is believing, so let’s move from the academic discussion into real world application and testing.  My lab consists of five machines:

Windows Domain controller (iSCSI target for lab scenario)
Two Node Windows Server 2012 Cluster with shared storage using iSCSI and “Classic File Share” configured with continuous availability.
One XP Client
One Windows 8 Client

Test Scenario:
Create a continuously available file share and initiate a large file transfer from the client.  During the copy, move the resource group containing the file share to the other node and note any disruptions.  We know that on the XP client, any disruption in availability of the file share immediately returns an error to the user.  How will Windows 8 fare?
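If you would rather script the copy than watch the Explorer dialog, something like the following Python sketch can stand in for the client side of the test.  It copies a file in chunks and records a rough throughput sample per chunk, so a stall during the group move shows up as a dip in the samples, while a hard failure surfaces as an OSError.  The function name and chunk size are my own choices, and because the OS buffers writes, the numbers are only a rough indicator.

```python
import time


def copy_with_rate_log(src, dst, chunk_size=1024 * 1024):
    """Copy src to dst in chunks; return (bytes_copied, per-chunk MB/s samples)."""
    samples = []
    copied = 0
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            start = time.perf_counter()
            chunk = fin.read(chunk_size)
            if not chunk:
                break
            fout.write(chunk)  # raises OSError if the share goes away for good
            elapsed = time.perf_counter() - start
            rate = len(chunk) / (1024 * 1024) / elapsed if elapsed > 0 else float("inf")
            samples.append(rate)
            copied += len(chunk)
    return copied, samples
```

A run against the lab share might look like copy_with_rate_log(r"C:\temp\largefile.iso", r"\\HAFileShare\demo\largefile.iso"), where the local source path is just a placeholder.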

 

[Add file share role to each cluster node]

 

[Select file server for general use, scale-out is for Hyper-V or SQL only]

 

[Create client access point, the virtual name your clients will use to connect]

 

[Select a disk resource to host file share]

 

[Complete wizard]

 

[HAFileShare is my CAP and it’s online owned by Node 1]

 

[Create file share for continuous availability]

 

[Make a selection, notice the other settings item pending]


 
[Define share name]


 
[Default option for continuous availability is selected]

 

[Demo Share is online]


 

[Initiate large file transfer from client to file server on XP, move cluster group containing file share during copy process]

[Network trace snippet] 
1832 1:37:42 PM 10/21/2012 17.3213007 System HAFILESHARE  SMB SMB:C; Nt Create Andx, FileName = \largefile.iso {SMB:639, SMBOverTCP:19, TCP:18, IPv4:17}
245 1:37:30 PM 10/21/2012 5.6650507 System HAFILESHARE   SMB SMB:R; Nt Create Andx - NT Status: System - Error, Code = (52) STATUS_OBJECT_NAME_NOT_FOUND {SMB:30, SMBOverTCP:19, TCP:18, IPv4:17}
287 1:37:30 PM 10/21/2012 5.6650507 System HAFILESHARE SMB SMB:R; Nt Create Andx - NT Status: System - Error, Code = (172) STATUS_PIPE_NOT_AVAILABLE {SMB:39, SMBOverTCP:19, TCP:18, IPv4:17}

[Initiate large file transfer from client to file server on Windows 8, move cluster group containing file share during copy process.  Notice the temporary dip in transfer rate but no exception, copy completes]

 

[Event logs on the Windows 8 machine] 

Log Name:      Microsoft-Windows-SMBClient/Operational
Source:        Microsoft-Windows-SMBClient
Event ID:      30624
Level:         Information
User:          SYSTEM
Description:
Connection to share \\hafileshare\demo was re-established.

 

Log Name:      Microsoft-Windows-SMBClient/Operational
Source:        Microsoft-Windows-SMBClient
Event ID:      30623
Level:         Warning
Description:
Connection to share \\hafileshare\demo was lost. Status 0xC000020C
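The status values in the trace and in the warning event are all NTSTATUS codes, and a few lines of Python are enough to take one apart.  The sketch below splits a 32-bit NTSTATUS into severity, facility and code using the standard bit layout; the small name table covers only the three values that appear in this post, and the function name is my own.  As far as I can tell, the decimal "Code = (52)" and "Code = (172)" shown in the XP trace are the low 16 bits of the first two entries.

```python
SEVERITY = {0: "Success", 1: "Informational", 2: "Warning", 3: "Error"}

# Only the status values seen in this post.
KNOWN = {
    0xC0000034: "STATUS_OBJECT_NAME_NOT_FOUND",
    0xC00000AC: "STATUS_PIPE_NOT_AVAILABLE",
    0xC000020C: "STATUS_CONNECTION_DISCONNECTED",
}


def decode_ntstatus(value):
    """Split a 32-bit NTSTATUS into its fields."""
    return {
        "name": KNOWN.get(value, "unknown"),
        "severity": SEVERITY[(value >> 30) & 0x3],  # bits 31-30
        "customer": bool((value >> 29) & 0x1),      # bit 29
        "facility": (value >> 16) & 0xFFF,          # bits 27-16
        "code": value & 0xFFFF,                     # bits 15-0
    }


for status in KNOWN:
    print(hex(status), decode_ntstatus(status))
```

Decoded this way, the 0xC000020C in event 30623 is simply "connection disconnected": the Windows 8 client noticed the drop, logged it, and then reconnected on its own (event 30624) instead of handing the error to the user.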

Summary:
Continuously available file shares are a great new feature, and I hope you will evaluate Windows Server 2012 and Windows 8 with the goal of providing your business partners with highly available and fault-tolerant services.