Exchange 2003/2007 clustering & high availability

03/02/2007

The Exchange development team have done a nice job of expanding the high availability options with the 2007 release. With Exchange 2003, the only real HA design was what is now known as a Single Copy Cluster (SCC) - i.e. there's one copy of the databases and log files, held on a Storage Area Network, and multiple physical cluster nodes connect to that SAN. Exchange 2007 introduced Local Continuous Replication and Cluster Continuous Replication, and is due to add Standby Continuous Replication later this year.

In the 2003 model, the "Exchange Virtual Server" (EVS) was a collection of resources that ran on one of the physical cluster nodes at any given time, with each EVS having its own name and IP address to which the clients connected.

This model works well in providing a high level of service for clients - Microsoft's own IT department ran an SLA of 99.99%, which allows roughly 53 minutes of downtime a year. Routine maintenance tasks (like patching the OS, upgrading the firmware, etc.) could be performed one node at a time, by having the workload fail over to the passive node during the maintenance period. The downside of this single-copy approach is that there's a single point of failure: the disks. Even though SAN technology is highly fault tolerant, it's still possible to knock out the entire SAN, or to have some catastrophe make the SAN corrupt the data on the disks.
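As a quick check on the 53-minute figure, the downtime allowed by an availability target is simple arithmetic. The function name below is made up for illustration, and the calculation assumes a 365-day year:

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_budget_minutes(availability_percent: float) -> float:
    """Return the minutes of downtime per year allowed by an availability target."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


# A 99.99% SLA leaves a little under 53 minutes a year.
print(round(downtime_budget_minutes(99.99), 1))  # 52.6
```

By the same sum, dropping to 99.9% would allow nearly nine hours a year, which shows how much each extra "nine" costs.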

Exchange 2007 brought two new options to the high availability arena - Local Continuous Replication (LCR), which doesn't use clustering at all, and Cluster Continuous Replication (CCR), which does. The name "Exchange Virtual Server" used in clustering has also changed to "Clustered Mailbox Server", to prevent confusion with the Virtual Server software virtualisation technology.

Local Continuous Replication

In an LCR environment, the server keeps a second copy of its databases and log files on a physically separate set of disks, which could be in the same place (maybe even a USB disk hanging off the back of the server, if it was a branch office or small business one).

Figure: Basic Architecture of Local Continuous Replication

LCR could also replicate data to another datacenter using iSCSI storage, accessed over the WAN (assuming the bandwidth and network latency are OK). Downsides to LCR are that the server in question is doing more work (by keeping two separate sets of disks updated) and that there's no automatic failover - an administrator would need to manually bring the LCR copy of data back online, in the event of a hardware failure of the server.

Cluster Continuous Replication

CCR provides a more complex but more robust (in terms of recovery) solution. There are two nodes in a cluster (and there can only be two, unlike the SCC approach which could have up to 8 nodes), with each node holding its own copy of the databases and log files. When a log file is closed on the active node, the passive one will copy it over the LAN/WAN and will apply the changes to its own copy of the database. The plus side of CCR is that there's little overhead on the active node (since it's not taking care of the second copy) and, because we're using clustering, the nodes can fail over between each other automatically - they maintain a networked heartbeat between them, so the passive node can tell if it needs to come fully online.
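The log shipping idea can be sketched in a few lines of Python. This is a toy model under loud assumptions, not how Exchange implements it: the class names, the `pull_from` method and the three-records-per-log limit are all invented for illustration, and a real log file is closed on size rather than record count.

```python
from dataclasses import dataclass, field

LOG_CAPACITY = 3  # hypothetical: records written before a log file is closed


@dataclass
class ActiveNode:
    database: dict = field(default_factory=dict)
    open_log: list = field(default_factory=list)
    closed_logs: list = field(default_factory=list)

    def write(self, key, value):
        """Apply a change locally and record it in the currently open log."""
        self.database[key] = value
        self.open_log.append((key, value))
        if len(self.open_log) == LOG_CAPACITY:
            # The log is full: close it so the passive node can copy it.
            self.closed_logs.append(self.open_log)
            self.open_log = []


@dataclass
class PassiveNode:
    database: dict = field(default_factory=dict)
    logs_replayed: int = 0

    def pull_from(self, active: ActiveNode):
        """Copy any closed logs not yet seen and replay them into the local copy."""
        for log in active.closed_logs[self.logs_replayed:]:
            for key, value in log:
                self.database[key] = value
            self.logs_replayed += 1


active, passive = ActiveNode(), PassiveNode()
for i in range(4):
    active.write(f"msg{i}", i)
passive.pull_from(active)
print(sorted(passive.database))  # ['msg0', 'msg1', 'msg2']
```

In this model the fourth record is still sitting in the open log, so the passive copy trails the active one by whatever has not yet been closed and shipped. The passive node does the pulling and replaying, which is why the active node carries little extra load.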

When a planned or unplanned failover occurs, the passive node takes over the role of servicing users, meaning the clients continue connecting to the same name and IP address they were using previously. The formerly active node takes up the passive role, and starts pulling any changes back from the newly activated one.

In order to prevent both nodes coming online at the same time (something that's referred to as a "split brain" cluster), there's also a new "witness" role. It guards against the scenario where the passive node thinks the sky has fallen in and everything's gone dead, when in fact it's the passive node that's fallen off the network. The witness is just a file share, which uses locking semantics to indicate whether the active node is still alive (since both nodes connect to the file share witness) - so if the passive node can read the witness and deduce that the active node is still running, it won't bring itself online, even if it can't currently see the heartbeat from the active node.
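The passive node's decision can be summarised as a small truth table. The sketch below is a simplification with made-up names, not the actual cluster quorum logic. In particular, it assumes that a passive node which cannot reach the witness at all stays passive, on the grounds that it may be the isolated one:

```python
def passive_should_activate(heartbeat_seen: bool,
                            witness_reachable: bool,
                            active_holds_witness_lock: bool) -> bool:
    """Decide whether the passive node should bring itself online (toy model)."""
    if heartbeat_seen:
        # The active node is visibly alive: nothing to do.
        return False
    if not witness_reachable:
        # Assumption: no heartbeat and no witness suggests the passive node
        # itself has fallen off the network, so it must not activate.
        return False
    # The witness is readable: activate only if the active node's lock is gone.
    return not active_holds_witness_lock


# Heartbeat lost, but the witness shows the active node is still running.
print(passive_should_activate(False, True, True))   # False
# Heartbeat lost and the active node has released its lock on the witness.
print(passive_should_activate(False, True, False))  # True
```

Only the last case, where the heartbeat is lost and the witness confirms the active node is gone, leads to activation. That is what keeps two nodes from both serving users at once.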

CCR provides a solution to the single point of failure in the SCC model, but there are some limitations - namely, there can only be two cluster nodes, and they need to be on the same IP subnet. This means it can be tricky to have a node in a Disaster Recovery datacenter, what with needing to span a subnet and an AD site across the WAN. What many people feel would be the ideal scenario is to have both CCR nodes and copies sited in one datacenter, with a third node in the DR datacenter, on a different subnet.

Standby Continuous Replication

Service Pack 1 for Exchange 2007 (due in the second half of 2007) is planned to introduce a new replication paradigm called Standby Continuous Replication (SCR). This could be used in conjunction with a CCR model, where the active/passive nodes are in one place and will automatically fail over between each other, but a third (standby) node is in a different place. Activation of the third node will only take place when both of the primary nodes are offline, such as if the primary datacenter failed completely. In that situation, a manual process will be followed to mount the databases on the standby node, similar to how an administrator would bring the copy on an LCR server online. The third node is not a member of the cluster, and will not need to be on the same IP subnet.

SCR will also offer the option of having a standalone Exchange server sending a copy of its data to another standalone server, meaning that cross-datacenter fault tolerance could be achieved without clustering at all, albeit at the expense of a manual failover regime.

More information on High Availability in Exchange 2003 can be found online here, and for Exchange 2007, here. Further details of what's going to be in SP1 will be posted in the coming weeks to the Exchange team blog.
