Enhanced Always On Availability Groups in SQL Server 2016

Always On Availability Groups

Always On Availability Groups are a fundamental component of the availability story for SQL Server, and provide a robust disaster recovery solution as well. Availability Groups were first introduced in SQL Server 2012, replacing the prior Database Mirroring technologies. They enable a DBA to configure two SQL instances to host replicas of a set of databases, which can be kept exactly in sync, giving zero data loss guarantees, or near exactly in sync (async replication, which is optimal for geographically remote replicas). This technology has become the standard which the majority of critical production SQL Server instances leverage.

Enhancements in SQL Server 2016 Availability Groups

SQL Server 2016 is making some significant improvements to the Always On Availability Groups set of features. There are a number of features, such as:

  • Round-robin load balancing in readable secondaries
  • Increased number of auto-failover targets
  • Enhanced log replication throughput and redo speed
  • Support for group-managed service accounts
  • Support for Distributed Transactions (DTC)
  • Basic HA in Standard edition
  • Direct seeding of new database replicas

These each deserve their own blog post, and I hope to get to them all. However, there is one feature area that I’d like to focus on today because it opens up some interesting scenarios, and has been requested for a very long time.

Domain Independent Availability Groups

When Always On Availability Groups was introduced in SQL Server 2012, it represented a substantial advancement in the HA story over the previous technology, Database Mirroring (DBM). By adopting certain layers of Windows Server Failover Cluster (WSFC) functionality to use as infrastructure, the new solution is able to be much more performant, and to scale much better than the original DBM solution. Where DBM was limited to only two replicas, the Principal and the Mirror, Availability Groups could initially support five replicas (up to eight as of SQL Server 2014). Perhaps more significantly, where DBM established a relationship between one database and a copy of that database, Availability Groups, as the name implies, establishes a relationship between a set or group of databases, and replicas of that group of databases on one or more replicas. This means that for applications which access more than one database such as a SharePoint farm, we can now have all of the databases in the group move as a unit. We no longer need to worry about implementing complex scripting solutions to cause multiple databases to move as a unit.

One of the additional restrictions that came with Availability Groups was the restriction that all nodes in the Availability Group must reside in the same Active Directory Domain.

While this works well for a majority of our customers, there are some deployment patterns which cannot be met with current Always On Availability Groups/Windows Server Failover Cluster solutions:

  • Some corporate environments have multiple domains that cannot be merged, but would benefit from spanning an Availability Group across them. These multiple domains may or may not have trust relationships between them, but still there is a need to have servers hosting DR replicas.
  • Some customers or specific installations are not in the context of Active Directory domains at all.

SQL Server and Windows Server Collaboration

In order to relieve this situation, SQL Server had two options: We could (once again) re-implement the functionality that we are currently getting from WSFC, or we could collaborate with the Windows Server Failover Cluster team to eliminate the single-domain restriction. The two teams met and worked for over a year to understand what the technical restrictions were that led to this limitation, and to envision what a solution that did not rely on AD for authentication and security would look like, and how it could be made usable.

The result is that with the release of Windows Server 2016, Windows Server Failover Clusters will no longer require that all nodes in a cluster reside in the same domain. In fact, they will no longer require the nodes to live in any domain at all. You can now form a WSFC cluster from machines which are in workgroups. Because of this change, SQL Server 2016 is now able to deploy Always On Availability Groups in environments with:

  • All nodes in a single domain
  • Nodes in multiple domains with full trust
  • Nodes in multiple domains with no trust
  • Nodes in no domain at all

This opens up a great number of new scenarios to customers, and removes previous blocks preventing migration from the deprecated Database Mirroring technology to AlwaysOn Availability Groups.

Installing and Configuring an Always On Availability Group without Domains

Although many customers will see this change as a great step forward in openness and flexibility, the AD infrastructure does make many operations simpler, so in order to operate without it we must compensate for these changes.

Setting up Windows Server Failover Clustering

In order to set up Windows Server Failover Clustering in an environment without AD security, we must provide a means to securely configure the cluster and provide for secure communications within the cluster. WSFC as of Windows Server 2016, creates self-signed certificates and uses these to secure intra-cluster authentication. In environments with AD security, we leverage that security infrastructure in order to establish the secure communication links. Without AD, we must take an extra step. In order to establish the cluster in environments without AD, WSFC requires you to set up synchronized admin accounts on all proposed cluster nodes. That means that there must be an account with the same name and the same password, that is in the Administrators group, on each node in the cluster. You must also be able to resolve the hostnames of all nodes in the cluster so that they can find each other.

As of now, WSFC cannot be managed using the UI in environments without AD security, so you will need to form and manipulate the cluster using PowerShell cmdlets. Fortunately, the minimum of commands for our purposes is very simple.

PS C:\> New-Cluster -Name “MyCluster” -Nodes Node1,Node2,Node3,Node4 -AdministrativeAccessPoint DNS

That’s it! It really is that simple once the basics are set up.

You can find more information on Domainless clusters from the Windows Cluster team on the Failover Clustering and Network Load Balancing team blog.

Configuring an Always On Availability Group

In environments without AD security, we must secure the “Mirroring Endpoints” in the same way they were when used in Database Mirroring without domains. This involves a number of steps, and can be cumbersome, especially when there are a number of nodes in the AG. In order to facilitate this, I’ve created a pair of scripts to help out. The first one, CreateEndpointCert, creates a certificate based on a password, and backs it up to a network share that all machines in the AG will have access to. In order to facilitate this, I recommend that you configure the SQL Server service to run as a user account which is synchronized in the same way that the Admin account used to set up the cluster was. This account need not be a box administrator, but this will enable a network share to be set up with write access to this account. The script then sets up the Mirroring Endpoint to be authenticated using this certificate. This means that when a connection attempt is made via the endpoint to another machine, that machine will be presented with the certificate as credentials for the login attempt.

The second script, InstallEndpointCert, takes as parameters the remote computer whose certificate is to be installed on this instance, the share on which the certificates were backed up, and a strong password to be used for a local login.

The script will restore the remote certificate, and create a login based on that certificate. It will then grant that user and login access to the instance. Now, when the endpoint from the other system attempts to connect, and presents the certificate, the local node will recognize that certificate as being linked to the local user, and grant the connection based upon that.

So, the total sequence looks like:

Node A

Node B

Node C

EXEC CreateEndpointCert ‘\\NodeB\MyShare’ '1R3@llyStr0ngP@ssw0rd!'

EXEC CreateEndpointCert ‘\\NodeB\MyShare’ '1R3@llyStr0ngP@ssw0rd!'

EXEC CreateEndpointCert ‘\\NodeB\MyShare’ '1R3@llyStr0ngP@ssw0rd!'

EXEC InstallEndpointCert ‘NodeB’ 'D1ff3rentStr0ngP@ssw0rd!'

EXEC InstallEndpointCert ‘NodeA’ 'D1ff3rentStr0ngP@ssw0rd!'

EXEC InstallEndpointCert ‘NodeA’ 'D1ff3rentStr0ngP@ssw0rd!'

EXEC InstallEndpointCert ‘NodeC’ 'D1ff3rentStr0ngP@ssw0rd!'

EXEC InstallEndpointCert ‘NodeC’ 'D1ff3rentStr0ngP@ssw0rd!'

EXEC InstallEndpointCert ‘NodeB’ 'D1ff3rentStr0ngP@ssw0rd!'

Finishing the Setup

If you have set up the instances with synchronized service accounts, from this point on, you can create and manipulate the Availability Group much like any other. You can script the creation, or use the New AG wizard to create the AG. You manipulate it like any other, and monitor it using the Dashboard, DMVs or other SQL tools.

Scripts

CreateEndpointCert

————————————————————————————————–
— This procedure automates the creation of a local certificate and the endpoints required for a domainless AG.
— Parameters are the strong password for the cert, and the location of a share which receives the backup of the cert.
— The share should be accessible to all nodes in the AG, as they will need to read the certs for each other.
— The procedure also creates the endpoint based upon the newly created cert.

— EXEC CreateEndpointCert '\\Myserver\Myshare' '1R3@llyStr0ngP@ssw0rd!'
—————————————————————————————————

CREATE PROCEDURE CreateEndpointCert
 @ShareName SYSNAME ,
 @StrongPassword SYSNAME
AS BEGIN
 
 –This must be executed in the context of Master
 IF (DB_NAME() <> 'master')
 BEGIN
  PRINT N'This SP must be executed in master.  USE master and then retry.'
  RETURN (-1)
 END

    DECLARE @DynamicSQL varchar(1000);
    DECLARE @CompName varchar(250);
 DECLARE @HasMasterKey INT;
    SELECT @CompName = CONVERT(SysName, SERVERPROPERTY('MachineName'));

 — Only create a master key if it doesn't already exist

 SELECT @HasMasterKey = is_master_key_encrypted_by_server from sys.databases where name = 'master'
 IF (@HasMasterKey = 0)
 BEGIN
 –Create a MASTER KEY to encrypt the certificate.
  SET @DynamicSQL = CONCAT('CREATE MASTER KEY ENCRYPTION BY PASSWORD = ' ,  QUOTENAME(@StrongPassword, ''''));
  EXEC (@DynamicSQL)
 END

 –Create the certificate to authenticate the endpoint
    SET @DynamicSQL = CONCAT('CREATE CERTIFICATE ', QUOTENAME(@CompName + '-Cert'), ' WITH SUBJECT = ', QUOTENAME(@CompName, '''')) ;
    EXEC (@DynamicSQL);

    –Create the database mirroring endpoint authenticated by the certificate.
    SET @DynamicSQL =
        CONCAT('CREATE ENDPOINT Endpoint_Mirroring
            STATE = STARTED
            AS TCP (LISTENER_PORT = 5022, LISTENER_IP = ALL)
            FOR DATABASE_MIRRORING (AUTHENTICATION = CERTIFICATE ',QUOTENAME(@CompName + '-Cert'), ' , ENCRYPTION = REQUIRED ALGORITHM AES, ROLE = ALL)');
    EXEC (@DynamicSQL);

    –Back up the certificate to a common network share for import into other nodes in the cluster
    SET @DynamicSQL = CONCAT('BACKUP CERTIFICATE ',QUOTENAME(@CompName + '-Cert'),' To FILE = ', QUOTENAME( @ShareName + '\SQL-' + @CompName + '.cer', ''''));
    EXEC (@DynamicSQL);
END
GO

InstallEndpointCert

————————————————————————————-
—  This procedure assumes that a certificate has been created on another node in the AG, and backed up to a common network share.
—  Parameters:
—    @CompName – The name of the computer whose certificate needs to be installed here.  i.e. the other replica that this node needs to communicate with.
—    @ShareName – A common network share to which certificates were backed up from each machine in the cluster/AG.
—    @StrongPassword – A strong password to be used for the login created to log in on behalf of the endpoint on the other node.

—  This procedure assumes that each node has run CreateEndpointCert and that all cert backup files reside on the share pointed to by the second parameter.
—  The procedure creates a login and a user for the remote machine, and then created a certificate to authorize the user when the certificate is used as authentication from the remote endpoint.
—————————————————————————————

CREATE PROCEDURE InstallEndpointCert
    @CompName SYSNAME,
 @ShareName SYSNAME,
 @StrongPassword SYSNAME
AS BEGIN
    DECLARE @DynamicSQL varchar(1000);
    DECLARE @MyCompName varchar(250);
    SELECT @MyCompName = CONVERT(SysName, SERVERPROPERTY('MachineName'));
  –Don't need to create LOGINs for the local system
    IF (@MyCompName <> @CompName)
    BEGIN
        SET @DynamicSQL = CONCAT('CREATE LOGIN ', QUOTENAME (@CompName + '-Login'), ' WITH PASSWORD= ', QUOTENAME( @StrongPassword, ''''));
        EXEC (@DynamicSQL);

        SET @DynamicSQL = CONCAT('CREATE USER ', QUOTENAME( @CompName + '-User'), ' FOR LOGIN ', QUOTENAME(@CompName + '-Login'));
        EXEC (@DynamicSQL);

        SET @DynamicSQL = CONCAT('CREATE CERTIFICATE ', QUOTENAME(@CompName +'-Cert'), ' AUTHORIZATION ', QUOTENAME(@CompName +'-User'), ' FROM FILE = ', QUOTENAME(@ShareName + '\SQL-' + @CompName + '.cer' , ''''));
        EXEC (@DynamicSQL);

        SET @DynamicSQL = CONCAT('GRANT CONNECT ON ENDPOINT::Endpoint_Mirroring TO ', QUOTENAME(@CompName +'-Login'));
        EXEC (@DynamicSQL);
 END
END
GO

See the other posts in the SQL Server 2016 blogging series.

Try SQL Server 2016 RC