Exchange 2007 – Upgrading a service pack on a single copy cluster instance when SAN based replication is utilized.

In Exchange 2007 there are two clustered installation models. Some customers elect to utilize a clustered installation model based on shared storage – this is a single copy cluster installation. In order to achieve site resiliency or provide for disaster recovery, some customers will implement a SAN based data replication solution. 

Recently I encountered a customer that was utilizing SAN-based data replication and the single copy cluster installation model to provide their site-resilient solution. The installation encompassed a source cluster with a single copy cluster configuration and a target cluster with a single copy cluster configuration. Each clustered mailbox server (CMS) was established utilizing a different name, for example Exchange-Main and Exchange-DR. The physical disk resources that were assigned to each CMS instance represented the LUNs that were replicated between SANs. When it was necessary to activate the solution, the databases on the target CMS would be marked as “Allow this database to be overwritten by a restore” and then mounted. Mailboxes would then be re-homed utilizing Move-Mailbox -ConfigurationOnly to restore client access to the replicated databases.
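The activation sequence described above might look roughly like this in the Exchange Management Shell. This is a sketch only: the storage group and database names (SG1, DB1) are hypothetical placeholders, and it assumes the replicated LUNs have already been made writable and brought online on Exchange-DR.

```powershell
# Sketch only; SG1 and DB1 are hypothetical names, not from the customer environment.

# 1. Flag every database on the DR CMS as overwritable by a restore.
Get-MailboxDatabase -Server Exchange-DR | Set-MailboxDatabase -AllowFileRestore:$true

# 2. Mount the replicated databases on the DR CMS.
Get-MailboxDatabase -Server Exchange-DR | Mount-Database

# 3. Repoint the mailboxes at the DR database in Active Directory.
#    -ConfigurationOnly updates the directory attributes without moving any mailbox data.
Get-Mailbox -Database "Exchange-Main\SG1\DB1" |
    Move-Mailbox -ConfigurationOnly -TargetDatabase "Exchange-DR\SG1\DB1"
```

Step 3 would be repeated for each source and target database pair.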

This presented an interesting challenge for this customer when it came to deploying service packs. When the same physical disk resources are utilized between clusters, only one set of the physical disk resources can be brought online at a time. This is because one SAN has a Read / Write setting and the other SAN has a Read Only setting. Essentially, an attempt to bring the database instances of the CMS Exchange-DR online would fail because the physical disks they depend on could not be brought online (they were read only).

When an /upgradeCMS is performed after upgrading the binaries on a clustered node, the resources are initially in an offline state. As its final step, the /upgradeCMS setup process attempts to bring the clustered mailbox server group online. Should any resource fail to come online, the upgrade is considered to have failed. The administrator performing the upgrade is notified that a failure occurred, and the upgrade setup watermark persists in the registry. It is therefore necessary that the /upgradeCMS be allowed to complete. In this case the database instances could not be brought online because their associated storage was Read Only and could not be brought online.
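For orientation, the overall upgrade sequence on one cluster might be sketched as follows. This is an assumption-laden outline rather than the customer's exact procedure: the node name NODE2 and the stop reason text are hypothetical, it assumes the cluster group is named after the CMS, and Setup.com is normally run from a command prompt in the service pack source directory rather than from the Exchange Management Shell.

```powershell
# Sketch only; NODE2 is a hypothetical node name.

# 1. On the node that does not currently own the CMS, upgrade the Exchange binaries.
.\Setup.com /mode:upgrade

# 2. From the Exchange Management Shell, take the CMS offline.
Stop-ClusteredMailboxServer Exchange-DR -StopReason "Service pack upgrade"

# 3. Move the (offline) CMS group to the upgraded node.
#    Assumes the cluster group carries the same name as the CMS.
cluster group "Exchange-DR" /move:NODE2

# 4. On the upgraded node, upgrade the CMS itself. Setup finishes by bringing
#    the CMS group online, which is the step that fails if the disks are read only.
.\Setup.com /upgradecms
```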

In order to complete the upgrade process the following steps were utilized (utilizing my sample clustered mailbox server names).

  • Following SAN vendor recommendations, replication was suspended between Exchange-Main and Exchange-DR.
  • The LUNs on the remote SAN were marked as Read / Write (allowing Exchange-DR full access to the storage).
  • Databases on the secondary CMS were set to “Allow this database to be overwritten by a restore”:
    • Get-MailboxDatabase -Server Exchange-DR | Set-MailboxDatabase -AllowFileRestore:$true
  • The upgrade process was completed on Exchange-Main, which brought all of its resources online.
  • The upgrade process was completed on Exchange-DR, which brought all of its resources online.

At this point both Exchange-Main and Exchange-DR are online. This means that the databases that were previously replicated to Exchange-DR are no longer identical to the databases that exist on Exchange-Main, because each copy has been mounted independently. As a post-upgrade step we need to do the following:

  • Stop the resources on Exchange-DR.
  • Mark the replicated LUNs on the remote SAN as Read Only (preventing Exchange-DR access to the storage).
  • Following SAN vendor recommendations, re-establish replication between the source and remote SANs, ensuring that the SOURCE SAN is utilized as the origin for data synchronization.
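The first post-upgrade step, stopping the resources on Exchange-DR, could be sketched as follows. The stop reason text is illustrative, and the cluster group name is an assumption (it is taken to match the CMS name). The SAN steps are vendor specific and are not shown.

```powershell
# Sketch only. Take the Exchange resources of the DR CMS offline.
Stop-ClusteredMailboxServer Exchange-DR -StopReason "Returning LUNs to read only after upgrade"

# Take the rest of the group, including the physical disk resources, offline
# so the LUNs can be set back to Read Only on the remote SAN.
cluster group "Exchange-DR" /offline

# Confirm the state of the CMS before changing the LUN settings.
Get-ClusteredMailboxServerStatus -Identity Exchange-DR
```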

In this installation it was necessary to temporarily break and re-establish replication in order to complete the /upgradeCMS process.