Exchange 2007 SP1 Continuous Replication Disaster Recovery Decisions flowcharts


The ability to continue to provide a full service to your user community in the unlikely event of the loss of a datacentre is an increasingly common requirement. Continuous Replication (CCR and SCR) with Exchange 2007 Service Pack 1 can be used to provide both data availability and site resilience.

An example Exchange 2007 design using continuous replication is as follows:

With any design, it is important to understand the processes and decisions that might be involved when certain scenarios present themselves. If we are designing for high availability, administrators need to know which decisions they may face and which processes they would have to follow should a particular set of circumstances occur. For example, what should the recovery strategy be in the event of the loss of a single mailbox database? Should the Exchange cluster group be moved to the passive node at this stage? If so, this would mean a temporary loss of service for all users on this server for the sake of those on one mailbox store.
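The trade-off in that last question can be made concrete with a rough calculation. The following is a minimal sketch, not part of the original design or flowcharts: the database names, user counts and outage durations are all hypothetical, and "user-minutes lost" is only one of several factors an administrator would weigh.

```python
from dataclasses import dataclass


@dataclass
class RecoveryOption:
    name: str
    users_affected: int    # users without service while this option runs
    outage_minutes: float  # assumed duration of that loss of service

    @property
    def user_minutes_lost(self) -> float:
        return self.users_affected * self.outage_minutes


def compare_options(users_per_database, failed_database,
                    failover_minutes, restore_minutes):
    """Return both recovery options, lowest user-minutes lost first.

    All inputs are illustrative assumptions, not measured values.
    """
    total_users = sum(users_per_database.values())
    failed_users = users_per_database[failed_database]
    options = [
        # Moving the cluster group briefly interrupts every user on the server.
        RecoveryOption("move cluster group to passive node",
                       total_users, failover_minutes),
        # Recovering in place affects only users on the failed database,
        # but typically for longer.
        RecoveryOption("recover failed database in place",
                       failed_users, restore_minutes),
    ]
    return sorted(options, key=lambda o: o.user_minutes_lost)


# Hypothetical server: three databases, DB1 has failed.
users = {"DB1": 200, "DB2": 300, "DB3": 500}
ranked = compare_options(users, "DB1", failover_minutes=5, restore_minutes=60)
for option in ranked:
    print(f"{option.name}: {option.user_minutes_lost:.0f} user-minutes lost")
```

With these assumed figures the brief outage for everyone costs less overall than a long outage for one database's users, but a shorter restore time or a smaller failed database would reverse the result, which is why the decision needs to be thought through in advance.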

The following flowcharts show the likely processes and decision-making flow in certain disaster recovery situations, based on the above Exchange 2007 design.

Total Site Failure - Likely steps and decision-making process in recovering from a total physical site failure:

Single Server Failure - Likely steps and decision-making process in recovering from a single server failure:

Single Database Failure - Likely steps and decision-making process in recovering from a single active database failure:

These decision matrices do not provide a definitive answer, and there are often numerous possible recovery paths in any given disaster recovery scenario. However, they do highlight the decisions that are likely to be made and the importance of understanding the processes an administrator might have to follow to recover service and data for their user community.

A few additional notes:

To see a different version of those charts: "CCR, Site Resilience and sample decision making processes" - please go here.

As requested last time, we have made these charts available as a download so that you can print them in full resolution. To get the ZIP file with XPS and PDF versions of the charts, please go here.

- Doug Gowans


Comments (7)
  1. Simon says:

    Hey Doug

    Good info. Thanks. Nice to see you spelling centre the correct way ;o)

    Any chance of posting the visio files for download?

    Simon

  2. Well, this is a very nice post, but I noticed something in the AD sites…

    Which is considered best practice?

    Configuring the two sites as one AD site, so that replication between the DCs in the local and remote sites is RPC and there is no need to reconfigure subnet associations later?

    Or making two Active Directory sites, using an IP site link with a network association for each, and, when activating, moving the subnets so they are associated with the DR site?

  3. sixth says:

    Love these charts!!! :-)

  4. Matt says:

    This is a great post – thank you for spending the time to make it!  It fits perfectly with our data center model.

  5. Nuno Mota says:

    Like said before, excellent job with these charts! Everyday I love E2K7 even more!   :)

    Thanks for all the work Doug!

  6. Jesse Harris says:

    Your note (Single Server Failure chart, rebuilding failed mailbox cluster node) that you should remember to "also ensure that server is removed from redundant server list" was just what I needed. Haven’t been able to find this anywhere else. For anyone in the same boat, the necessary command is detailed at http://technet.microsoft.com/en-us/library/cc535022(EXCHG.80).aspx.

  7. AdamX says:

    First, good job on the DR process flow charts!

    Second, I have a similar question as Mohammed asked. In the example setup, two physical sites are configured as one AD site. It may work OK if you have only two physical sites. What if you have multiple physical sites (more than two) deployed with Exchange, and use an additional physical site as a dedicated DR site? It is not going to be practical to configure all these physical sites as one single AD site. Apparently, Exchange server won’t like it that way. And we won’t deploy them that way. The better approach should be, I think, to configure an AD site for each physical site. And the DR processes reflect a "true" site wide failover.

Comments are closed.
