Issue with Accessing CMS on Paired Standard Edition Server

Update 5/31/13 - Added information about using "Prepare first Standard Edition server" to add firewall rules.

I came across this issue while testing paired pool fail-over.  Whether you run into it depends on a couple of things, namely how you configure your base OS server build and when you pair the Standard Edition Servers.  For me, it became an issue when I performed the following steps in this order:

  1. Created/Installed two Lync Server 2013 Standard Edition Servers that aren't paired together.
  2. Ran the Install-CsDatabase -CentralManagementStore cmdlet on the second Standard Edition Server.
  3. Set up pool pairing in Topology Builder and published the new topology from the second Standard Edition Server.
  4. Ran Step 2 from the Deployment Wizard on both Standard Edition Servers to install the Lync Server Backup Service.
  5. Followed the steps in the Managing Lync Server 2013 Disaster Recovery, High Availability, and Backup Service TechNet article to fail-over to the second Standard Edition Server.
  6. Tried to access the CMS from another Lync Server, or brought the first Standard Edition Server back online and tried to access the CMS from it.

With the CMS failed over to the second Standard Edition Server, I was unable to access the CMS from any server except the second Standard Edition Server.  Trying to download the topology in Topology Builder on the first Standard Edition Server returned the following:

Running Get-CsBackupServiceStatus on the first Standard Edition Server to check the state of the Lync Server Backup Service returned the following:

However, running the same cmdlets on the second Standard Edition Server returned different results:

Also, the second Standard Edition Server was logging the following in the Lync Server Event Log every 2 minutes:

 

Log Name:      Lync Server
Source:        LS Backup Service
Date:          4/28/2013 11:21:01 AM
Event ID:      4090
Task Category: (4000)
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      TEST-LS15-SE2.test.deitterick.com
Description:
Microsoft Lync Server 2013, Backup Service central management backup module performs a full sync export.

 

At the same time, the first Standard Edition Server was logging the following errors in the Lync Server Event Log:

Log Name:      Lync Server
Source:        LS Backup Service
Date:          4/28/2013 11:27:15 AM
Event ID:      4095
Task Category: (4000)
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      TEST-LS15-SE1.test.deitterick.com
Description:
Failed to read topology from Master Central Management database Microsoft Lync Server 2013, Backup Service will continuously attempt to retrieve the topology.

While this condition persists, the module will not be able to perform backup.
Exception:
Could not connect to SQL server : [Exception=System.Data.SqlClient.SqlException (0x80131904): A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: SQL Network Interfaces, error: 26 - Error Locating Server/Instance Specified)
   at System.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
   at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)
   at System.Data.SqlClient.TdsParser.Connect(ServerInfo serverInfo, SqlInternalConnectionTds connHandler, Boolean ignoreSniOpenTimeout, Int64 timerExpire, Boolean encrypt, Boolean trustServerCert, Boolean integratedSecurity, Boolean withFailover)
   at System.Data.SqlClient.SqlInternalConnectionTds.AttemptOneLogin(ServerInfo serverInfo, String newPassword, SecureString newSecurePassword, Boolean ignoreSniOpenTimeout, TimeoutTimer timeout, Boolean withFailover)
   at System.Data.SqlClient.SqlInternalConnectionTds.LoginNoFailover(ServerInfo serverInfo, String newPassword, SecureString newSecurePassword, Boolean redirectedUserInstance, SqlConnectionString connectionOptions, SqlCredential credential, TimeoutTimer timeout)
   at System.Data.SqlClient.SqlInternalConnectionTds.OpenLoginEnlist(TimeoutTimer timeout, SqlConnectionString connectionOptions, SqlCredential credential, String newPassword, SecureString newSecurePassword, Boolean redirectedUserInstance)
   at System.Data.SqlClient.SqlInternalConnectionTds..ctor(DbConnectionPoolIdentity identity, SqlConnectionString connectionOptions, SqlCredential credential, Object providerInfo, String newPassword, SecureString newSecurePassword, Boolean redirectedUserInstance, SqlConnectionString userConnectionOptions)
   at System.Data.SqlClient.SqlConnectionFactory.CreateConnection(DbConnectionOptions options, DbConnectionPoolKey poolKey, Object poolGroupProviderInfo, DbConnectionPool pool, DbConnection owningConnection, DbConnectionOptions userOptions)
   at System.Data.ProviderBase.DbConnectionFactory.CreatePooledConnection(DbConnectionPool pool, DbConnectionOptions options, DbConnectionPoolKey poolKey, DbConnectionOptions userOptions)
   at System.Data.ProviderBase.DbConnectionPool.CreateObject(DbConnectionOptions userOptions)
   at System.Data.ProviderBase.DbConnectionPool.UserCreateRequest(DbConnectionOptions userOptions)
   at System.Data.ProviderBase.DbConnectionPool.TryGetConnection(DbConnection owningObject, UInt32 waitForMultipleObjectsTimeout, Boolean allowCreate, Boolean onlyOneCheckConnection, DbConnectionOptions userOptions, DbConnectionInternal& connection)
   at System.Data.ProviderBase.DbConnectionPool.TryGetConnection(DbConnection owningObject, TaskCompletionSource`1 retry, DbConnectionOptions userOptions, DbConnectionInternal& connection)
   at System.Data.ProviderBase.DbConnectionFactory.TryGetConnection(DbConnection owningConnection, TaskCompletionSource`1 retry, DbConnectionOptions userOptions, DbConnectionInternal& connection)
   at System.Data.ProviderBase.DbConnectionClosed.TryOpenConnection(DbConnection outerConnection, DbConnectionFactory connectionFactory, TaskCompletionSource`1 retry, DbConnectionOptions userOptions)
   at System.Data.SqlClient.SqlConnection.TryOpen(TaskCompletionSource`1 retry)
   at System.Data.SqlClient.SqlConnection.Open()
   at Microsoft.Rtc.Common.Data.DBCore.PerformSprocContextExecution(SprocContext sprocContext)
ClientConnectionId:00000000-0000-0000-0000-000000000000]
Cause: Possible issues with Master Backend database
Resolution:
Ensure that the SQL Server hosting the Master Central Management is running.

Log Name:      Lync Server
Source:        LS Backup Service
Date:          4/28/2013 11:27:15 AM
Event ID:      4082
Task Category: (4000)
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      TEST-LS15-SE1.test.deitterick.com
Description:
Microsoft Lync Server 2013, Backup Service central management backup module failed to complete import operation.

Configurations:
Backup Module Identity:CentralMgmt.CMSMaster
Working Directory path:\\TEST-LS15-SE1.test.deitterick.com\share\1-BackupService-4\BackupStore\Temp
Local File Store Unc path:\\TEST-LS15-SE1.test.deitterick.com\share\1-BackupService-4\BackupStore
Remote File Store Unc path:\\TEST-LS15-SE2.test.deitterick.com\share\1-BackupService-3\BackupStore

 Additional Message:
 Exception: Microsoft.Rtc.BackupService.ModuleUnavailableException: Backup module is temporarily unavailable at this point. Reason: CMS backup module is not initialized yet.
   at Microsoft.Rtc.BackupService.BackupModules.CentralMgmtBackupModule.CheckModuleAvailability(Nullable`1 primaryPool)
   at Microsoft.Rtc.BackupService.BackupModules.CentralMgmtBackupModule.ApplyChanges(Unzipper unzipper, String& newCookie, Boolean& forceSetErrorState)
   at Microsoft.Rtc.BackupService.BackupModuleHandler.ReceiveBackupDataTask.ApplyChanges(Boolean& forceSetErrorState)
   at Microsoft.Rtc.BackupService.BackupModuleHandler.ReceiveBackupDataTask.InternalExecute()
   at Microsoft.Rtc.Common.TaskManager`1.ExecuteTask(Object state)

Cause: Either network or permission issues. Please look through the exception details for more information.
Resolution:
Resolution

Log Name:      Lync Server
Source:        LS Backup Service
Date:          4/28/2013 11:30:45 AM
Event ID:      4080
Task Category: (4000)
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      TEST-LS15-SE1.test.deitterick.com
Description:
Microsoft Lync Server 2013, Backup Service central management backup module failed to complete export operation.

Configurations:
Backup Module Identity:CentralMgmt.CMSMaster
Working Directory path:\\TEST-LS15-SE1.test.deitterick.com\share\1-BackupService-4\BackupStore\Temp
Local File Store Unc path:\\TEST-LS15-SE1.test.deitterick.com\share\1-BackupService-4\BackupStore
Remote File Store Unc path:\\TEST-LS15-SE2.test.deitterick.com\share\1-BackupService-3\BackupStore

 Additional Message:
 Exception: Microsoft.Rtc.BackupService.ModuleUnavailableException: Backup module is temporarily unavailable at this point. Reason: CMS backup module is not initialized yet.
   at Microsoft.Rtc.BackupService.BackupModules.CentralMgmtBackupModule.CheckModuleAvailability(Nullable`1 primaryPool)
   at Microsoft.Rtc.BackupService.BackupModules.CentralMgmtBackupModule.ConfirmChanges(String cookie)
   at Microsoft.Rtc.BackupService.BackupModuleHandler.SendBackupDataTask.ConfirmChanges(String cookie)
   at Microsoft.Rtc.BackupService.BackupModuleHandler.SendBackupDataTask.ConfirmChangesAndPrepareCookieToSync(Boolean primaryDataExists, Boolean secondDataExists, CookieContainer& oldCookie, Boolean& forPrimaryBatch)
   at Microsoft.Rtc.BackupService.BackupModuleHandler.SendBackupDataTask.InternalExecute()
   at Microsoft.Rtc.Common.TaskManager`1.ExecuteTask(Object state)

Cause: Either network or permission issues. Please look through the exception details for more information.
Resolution:
Resolution
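
With this many entries repeating every few minutes, it can help to reduce the copied event text to a per-server count of event IDs before reading the stack traces.  The following is a minimal Python sketch, assuming the events were copied as text in the same layout as above; the function names `parse_events` and `summarize` are my own for illustration and are not part of any Lync tooling.

```python
import re
from collections import Counter

# Header fields as they appear in the copied event text, e.g. "Event ID:      4095".
FIELD = re.compile(r"^(Log Name|Source|Date|Event ID|Level|Computer):\s*(.*)$")


def parse_events(text):
    """Return one dict per event; a new event starts at each 'Log Name:' line."""
    events = []
    for line in text.splitlines():
        match = FIELD.match(line.strip())
        if not match:
            continue  # description and stack-trace lines are skipped
        key, value = match.groups()
        if key == "Log Name":
            events.append({})
        if events:
            events[-1][key] = value.strip()
    return events


def summarize(events):
    """Count events per (computer, level, event ID)."""
    return Counter(
        (e.get("Computer"), e.get("Level"), e.get("Event ID")) for e in events
    )


if __name__ == "__main__":
    sample = """\
Log Name:      Lync Server
Source:        LS Backup Service
Event ID:      4095
Level:         Error
Computer:      TEST-LS15-SE1.test.deitterick.com
Description:
Failed to read topology from Master Central Management database
"""
    for key, count in sorted(summarize(parse_events(sample)).items()):
        print(key, count)
```

In my case a summary like this would show the errors (4095, 4082, 4080) only on the first Standard Edition Server, which points at a problem reaching the second server rather than a problem on it.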

 

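The issue is that the Windows Firewall is blocking access to SQL.  The "error: 26 - Error Locating Server/Instance Specified" in event 4095 is what a SQL client reports when it cannot get an answer from the SQL Browser service on the target server.  If you compare the Windows Firewall rules on the first Standard Edition Server to the rules on the second Standard Edition Server, you'll notice that two rules are missing on the second Standard Edition Server: "SQL RTC Access" and "SQL Browser".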
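
If you'd rather not compare the rule lists by eye, the comparison can be scripted.  This is a rough Python sketch, assuming you have saved the output of `netsh advfirewall firewall show rule name=all` from each server to text and that the output is in English (each rule starts with a "Rule Name:" line).  The function names are hypothetical.

```python
def rule_names(netsh_output):
    """Collect rule names from 'netsh advfirewall firewall show rule' output."""
    names = set()
    for line in netsh_output.splitlines():
        label, sep, value = line.partition(":")
        # Only the "Rule Name:" lines matter; everything else is rule detail.
        if sep and label.strip() == "Rule Name":
            names.add(value.strip())
    return names


def missing_rules(reference_output, candidate_output):
    """Rules present on the reference server but absent on the candidate."""
    return sorted(rule_names(reference_output) - rule_names(candidate_output))


if __name__ == "__main__":
    first_server = """\
Rule Name:                            SQL RTC Access
Enabled:                              Yes
Rule Name:                            SQL Browser
Enabled:                              Yes
"""
    second_server = ""
    print(missing_rules(first_server, second_server))
```

Passing the first server's output as the reference and the second server's output as the candidate lists the rules that only exist on the first server.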

This is why access to the CMS on the second Standard Edition Server fails from every server except the second Standard Edition Server itself.  The two rules were added to the first Standard Edition Server when I ran the "Prepare first Standard Edition server" step in the Deployment Wizard.  That step was not run on the second Standard Edition Server, so the rules are missing there.  The fix is to manually add these rules to the second Standard Edition Server.  Once the rules are added, you'll be able to access the CMS from other Lync Servers.  This also resolves the ErrorState that Get-CsBackupServiceStatus displays for the first Standard Edition Server.
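
One way to add the rules is to reuse the exact netsh commands that the Deployment Wizard logs (they appear in its output later in this post).  The Python sketch below only wraps those two commands; the default `sqlservr.exe` path is taken from that output and is an assumption about where SQL Server is installed on your server, and the helper names are mine.  It prints the commands by default; actually running them requires an elevated prompt on the second Standard Edition Server.

```python
import subprocess

# Path as logged by the Deployment Wizard; adjust if SQL Server lives elsewhere.
DEFAULT_SQLSERVR = (
    r"c:\Program Files\Microsoft SQL Server\MSSQL11.RTC\MSSQL\Binn\sqlservr.exe"
)


def firewall_commands(sqlservr_path=DEFAULT_SQLSERVR):
    """Return the two netsh commands that create the missing rules."""
    return [
        'netsh advfirewall firewall add rule name="SQL RTC Access" dir=in '
        f'action=allow program="{sqlservr_path}" enable=yes profile=any',
        'netsh advfirewall firewall add rule name="SQL Browser" dir=in '
        "action=allow protocol=UDP localport=1434",
    ]


def apply_rules(commands, dry_run=True):
    """Print the commands, or run them when dry_run is False (Windows only)."""
    for command in commands:
        if dry_run:
            print(command)
        else:
            # On Windows a command string is handed to the process as-is.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    apply_rules(firewall_commands())
```

Running the printed commands by hand in an elevated command prompt has the same effect and is probably simpler for a one-off fix.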

If you had decided to pair the Standard Edition Servers when you first authored the topology, you could have just run the "Prepare first Standard Edition server" step in the Deployment Wizard on both Standard Edition Servers:

This would have created the required SQL instance and placed the correct rules in the Windows Firewall:

> Creating firewall exception for SQL instance

netsh advfirewall firewall add rule name="SQL RTC Access" dir=in action=allow program="c:\Program Files\Microsoft SQL Server\MSSQL11.RTC\MSSQL\Binn\sqlservr.exe" enable=yes profile=any
Ok.

> Creating firewall exception for SQL Browser

netsh advfirewall firewall add rule name="SQL Browser" dir=in action=allow protocol=UDP localport=1434
Ok.

Of course, if you have the Windows Firewall disabled, you won't run into this issue, but disabling the Windows Firewall is not recommended!  The important thing to remember is to make sure that the Windows Firewall rules on your paired Standard Edition Server are set up correctly.  Otherwise you might be in for a surprise when practicing your disaster recovery plan!
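
To check the rules ahead of a fail-over drill, you can probe the paired server from another machine.  The Python sketch below is an illustration under a few assumptions: it asks the SQL Browser service on UDP 1434 for the RTC instance using the SQL Server Resolution Protocol (request byte 0x04 plus the instance name, response starting with 0x05), then tries a TCP connection to the port the Browser reports.  The instance name RTC is inferred from the MSSQL11.RTC path above, and the function names are hypothetical.

```python
import socket
import struct


def parse_browser_response(payload):
    """Turn a SQL Browser response datagram into a dict of its key/value pairs."""
    if len(payload) < 3 or payload[0] != 0x05:
        raise ValueError("not a SQL Browser response")
    (size,) = struct.unpack_from("<H", payload, 1)  # little-endian length
    tokens = payload[3:3 + size].decode("ascii", errors="replace").split(";")
    info = {}
    for key, value in zip(tokens[0::2], tokens[1::2]):
        if key:  # the data ends with ";;", which yields empty tokens
            info[key] = value
    return info


def query_instance(host, instance="RTC", timeout=3.0):
    """Ask SQL Browser on the host which TCP port the instance listens on."""
    request = b"\x04" + instance.encode("ascii") + b"\x00"
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(request, (host, 1434))
        payload, _ = sock.recvfrom(4096)
    return parse_browser_response(payload)


def check_cms_reachable(host, instance="RTC", timeout=3.0):
    """Return a short description of how far the connection attempt got."""
    try:
        info = query_instance(host, instance, timeout)
    except OSError as exc:  # includes timeouts
        return f"SQL Browser did not answer on UDP 1434: {exc}"
    port = info.get("tcp")
    if port is None:
        return "SQL Browser answered but reported no TCP port"
    try:
        with socket.create_connection((host, int(port)), timeout=timeout):
            return f"instance {instance} is reachable on TCP {port}"
    except OSError as exc:
        return f"TCP {port} is not reachable: {exc}"
```

A timeout on the first step suggests the "SQL Browser" rule is missing, while a failure on the second step suggests the "SQL RTC Access" rule is missing.  For example, `check_cms_reachable("TEST-LS15-SE2.test.deitterick.com")` run from the first Standard Edition Server would exercise both rules on the second.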