Exchange 2010 High Availability Misconceptions Addressed


I’ve recently returned from TechEd North America 2011 in Atlanta, Georgia, where I had a wonderful time seeing old friends and new, and talking with customers and partners about my favorite subject: high availability in Exchange Server 2010. In case you missed TechEd, or were there and missed some sessions, you can download slide decks and watch presentations on Channel 9.

While at TechEd, I noticed several misconceptions being repeated about certain aspects of Exchange HA. I thought a blog post might help clear them up.

Exchange 2010 High Availability and Site Resilience Terminology

But first let’s start with some terminology to make sure everyone understands the components, settings and concepts I’ll be discussing.

  • Activation: The process of making a passive database copy into an active database copy. How configured: Move-ActiveMailboxDatabase cmdlet.
  • Activation Suspended or Blocked: The process of suspending or blocking a single database or an entire server from automatic activation. How configured: Suspend-MailboxDatabaseCopy or Set-MailboxServer cmdlets.
  • Active Manager: An internal Exchange component that runs inside the Microsoft Exchange Replication service and is responsible for failure monitoring and corrective action within a DAG. How configured: N/A.
  • Alternate Witness Directory: An optional property of a DAG that specifies the name of the directory that is shared on the Alternate Witness Server for quorum purposes. How configured: Set-DatabaseAvailabilityGroup cmdlet or Exchange Management Console (EMC).
  • Alternate Witness Server: An optional property of a DAG that specifies the witness server that will be used by the DAG after you have performed a datacenter switchover. How configured: Set-DatabaseAvailabilityGroup cmdlet or EMC.
  • Attempt Copy Last Logs (ACLL): A process performed during Best Copy Selection in which the system attempts to copy from the original active database copy any log files missing from the passive copy selected for activation. How configured: N/A.
  • AutoDatabaseMountDial: A property of a Mailbox server that determines whether a passive database copy will automatically mount as the new active copy, based on the number of log files missing from the copy being mounted. How configured: Set-MailboxServer cmdlet.
  • Best Copy Selection (BCS): A process performed by Active Manager to determine the best passive mailbox database copy to activate, either in response to a failure affecting the active mailbox database copy or as a result of an administrator performing a targetless switchover. How configured: N/A.
  • Continuous Replication Block Mode: A form of continuous replication that replicates blocks of ESE transaction data from the ESE log buffer on the active mailbox database copy to replication log buffers on one or more passive mailbox database copies. How configured: N/A; not configurable. The system automatically switches between file mode and block mode based on the passive copy’s ability to keep up with continuous replication.
  • Continuous Replication File Mode: A form of continuous replication that replicates closed transaction log files from the active mailbox database copy to one or more passive mailbox database copies. How configured: N/A; not configurable.
  • Datacenter: In Exchange, typically this refers to an Active Directory site; however, it can also refer to a physical site. In the context of this blog post, datacenter equals Active Directory site. How configured: N/A.
  • Datacenter Activation Coordination (DAC) Mode: A property of the DAG that, when enabled, forces starting DAG members to acquire permission in order to mount databases. How configured: Set-DatabaseAvailabilityGroup cmdlet.
  • Datacenter Activation Coordination Protocol (DACP): A bit in memory (0 or 1) used by DAG members in DAC mode. A value of 0 indicates the DAG member cannot mount databases; a value of 1 indicates the DAG member can mount active databases that it currently hosts. How configured: N/A; automatically used when the DAG is configured for DAC mode.
  • Datacenter Switchover: The manual process performed to activate the Exchange components of a standby datacenter and restore Exchange service and data after a failure of the primary datacenter. How configured: Process documented in Datacenter Switchovers.
  • Failover: The automatic process performed by the system in response to a failure affecting one or more active mailbox database copies. How configured: N/A; happens automatically, although you can activation-block or suspend individual mailbox database copies or an entire DAG member.
  • File Share Witness: A cluster-managed resource that is created and used when the cluster uses the node and file share majority quorum model. How configured: N/A; automatically created and destroyed by Exchange, as needed, based on the number of DAG members.
  • Incremental Resync: A built-in process that corrects certain forms of divergence that occur between mailbox database copies in a DAG (typically between a new active copy and the previous active copy). How configured: N/A; happens automatically.
  • Lossy Failover: A database failover condition in which the passive copy being activated is mounted with one or more log files missing. How configured: Partially controlled via AutoDatabaseMountDial.
  • Primary Active Manager (PAM): An Active Manager role held by a single DAG member at any given time that is responsible for responding to failures and initiating corrective action in the form of a database or server failover. How configured: N/A; the PAM role is automatically held by the DAG member that owns the cluster’s core resource group.
  • Quorum: Has a dual meaning. First, it refers to the consensus model used by members of a cluster to ensure consistency within the cluster; Exchange uses two of the four cluster quorum models: Node Majority (for DAGs with an odd number of members) and Node and File Share Majority (for DAGs with an even number of members). Second, it refers to the data (e.g., “quorum data”) shared by the cluster members that is used to maintain quorum. How configured: N/A; automatically configured by Exchange, as needed, based on the number of DAG members.
  • RPCClientAccessServer: A property of a mailbox database that specifies the Client Access server or Client Access server array through which MAPI/RPC clients (Outlook 2003, Outlook 2007, Outlook 2010) access their mailboxes on that database. How configured: Set-MailboxDatabase cmdlet.
  • Site: An Active Directory site, which is defined as one or more fast, reliable, and well-connected (< 10 ms latency) subnets. How configured: N/A; configured outside of Exchange by using Active Directory Sites and Services.
  • Split Brain: A condition in which multiple cluster members believe they are authoritative for one or more common resources. How configured: N/A.
  • Standby Active Manager (SAM): An Active Manager role held by all DAG members that do not hold the PAM role; it is responsible for monitoring for local failures and responding to queries from Client Access and Hub Transport servers about the location of a user’s mailbox database. How configured: N/A; the SAM role is automatically held by the DAG members that do not own the cluster’s core resource group.
  • StartedMailboxServers: A list of DAG members that are currently operational and functional. How configured: Members are added automatically during normal operations, and added manually as part of re-activation of the primary datacenter using the Start-DatabaseAvailabilityGroup cmdlet.
  • StoppedMailboxServers: A list of DAG members that have been marked as non-operational or non-functional. How configured: Members are added automatically during normal operations, and added manually as part of activation of the second datacenter using the Stop-DatabaseAvailabilityGroup cmdlet.
  • Switchover: In the context of a single database, the manual process used to move the active mailbox database copy to another server. In the context of an entire server, the manual process used to move all active mailbox database copies from one server to one or more other servers. How configured: Move-ActiveMailboxDatabase cmdlet.
  • Targetless Switchover: A switchover in which the administrator does not specify which passive mailbox database copy should become active, but instead allows Active Manager’s BCS process to choose the best copy to activate. How configured: Move-ActiveMailboxDatabase cmdlet (specifically, without specifying a target server via the ActivateOnServer parameter).
  • Transport Dumpster: A feature of the Hub Transport role that retains copies of messages sent to users on replicated mailbox databases in a DAG so that messages can be recovered by the system in the event of a lossy failover of the user’s database. How configured: Set-TransportConfig cmdlet or EMC.
  • Witness Directory: A required property of every DAG that specifies the name of the directory that is shared on the Witness Server for quorum purposes. How configured: Set-DatabaseAvailabilityGroup cmdlet or EMC.
  • Witness Server: A required property of every DAG that specifies a server external to the DAG that participates in quorum for the DAG’s underlying cluster when the DAG contains an even number of members. How configured: Set-DatabaseAvailabilityGroup cmdlet or EMC.
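
If you want to see how these DAG-level settings are configured in your own environment, a quick Exchange Management Shell check looks something like the following (the DAG name DAG1 is hypothetical):

    # Show the witness, alternate witness, DAC mode, and member-state settings for a DAG
    Get-DatabaseAvailabilityGroup -Identity DAG1 -Status | Format-List Name,WitnessServer,WitnessDirectory,AlternateWitnessServer,AlternateWitnessDirectory,DatacenterActivationMode,StartedMailboxServers,StoppedMailboxServers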

OK, that’s enough terms for now.

Exchange 2010 High Availability and Site Resilience Misconceptions

Now let’s discuss (and dispel) these misconceptions, in no particular order.

Misconception Number 1: The Alternate Witness Server (AWS) provides redundancy for the Witness Server (WS)

The actual name – Alternate Witness Server – originates from its intended purpose: to provide a replacement witness server for a DAG to use after a datacenter switchover. When you perform a datacenter switchover, you’re restoring service and data to an alternate or standby datacenter after you’ve deemed your primary datacenter unusable from a messaging service perspective.

Although you can configure an Alternate Witness Server (and corresponding Alternate Witness Directory) for a DAG at any time, the Alternate Witness Server will not be used by the DAG until part-way through a datacenter switchover; specifically, when the Restore-DatabaseAvailabilityGroup cmdlet is used.

The Alternate Witness Server itself does not provide any redundancy for the Witness Server, and DAGs do not dynamically switch witness servers, nor do they automatically start using the Alternate Witness Server in the event of a problem with the Witness Server.

The reality is that the Witness Server does not need to be made redundant. In the event the server acting as the Witness Server is lost, it is a quick and easy operation to configure a replacement Witness Server from either the Exchange Management Console or the Exchange Management Shell.
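
As a rough sketch of what that replacement looks like in the Exchange Management Shell (the DAG, server, and directory names here are hypothetical):

    # Point the DAG at a replacement witness server and directory
    Set-DatabaseAvailabilityGroup -Identity DAG1 -WitnessServer CAS2 -WitnessDirectory C:\DAG1FSW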

Misconception Number 2: Microsoft recommends that you deploy the Witness Server in a third datacenter when extending a two-member DAG across two datacenters

In this scenario, you have one DAG member in the primary datacenter (Portland) and one DAG member in a secondary datacenter (Redmond). Because this is a two-member DAG, it will use a Witness Server. Our recommendation is (and has always been) to locate the Witness Server in the primary datacenter, as shown below.

Figure 1: When extending a two-member DAG across two datacenters, locate the Witness Server in the primary datacenter

In this example, Portland is the primary datacenter because it contains the majority of the user population. As illustrated below, in the event of a WAN outage (which always results in a loss of communication between some DAG members when a DAG is extended across a WAN), the DAG member in the Portland datacenter will maintain quorum and continue servicing the local user population, while the DAG member in the Redmond datacenter will lose quorum and will require manual intervention to be restored to service after WAN connectivity is restored.

Figure 2: In the event of a WAN outage, the DAG member in the primary datacenter will maintain quorum and continue servicing local users

The reason for this behavior has to do with the core rules around quorum and DAGs, specifically:

  • All DAGs and DAG members require quorum to operate. If you don’t have quorum, you don’t have an operational DAG. When quorum is lost, databases are dismounted, connectivity is unavailable and replication is stopped.
  • Quorum requires a majority of voters to achieve a consensus. Thus, when you have an even number of members in a DAG, you need an external component to provide a weighted vote for one of the actual quorum voters to prevent ties from occurring.
    • In a Windows Failover Cluster, only members of the cluster are quorum voters. When the cluster is one vote away from losing quorum and the Witness Server is needed to maintain quorum, one of the DAG members that can communicate with the Witness Server places a Server Message Block (SMB) lock on a file called witness.log that is located in the Witness Directory. The DAG member that places the SMB lock on this file is referred to as the locking node. Once an SMB lock is placed on the file, no other DAG member can lock the file.
    • The locking node then acquires a weighted vote; that is, instead of its vote counting for 1, it counts for 2 (itself and the Witness Server).
    • If the number of members that can communicate with the locking node constitutes a majority, then the members in communication with the locking node will maintain quorum and continue servicing clients. DAG members that cannot communicate with the locking node are in the minority; they lose quorum and terminate cluster and DAG operations.
  • The majority formula for maintaining quorum is V/2 + 1, where V is the number of voters, V/2 is rounded down to a whole number, and the + 1 is there for tie-breaking purposes (see the quick arithmetic sketch after this list).
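
If it helps to see the arithmetic, here is a tiny illustrative helper (not an Exchange cmdlet, just the formula above) and the values it produces for common DAG sizes:

    # Hypothetical helper: votes needed for a majority, given V voters
    function Get-QuorumMajority([int]$Voters) {
        [math]::Floor($Voters / 2) + 1
    }

    Get-QuorumMajority 2   # returns 2 (two-member DAG; the locking node's weighted vote supplies the second vote)
    Get-QuorumMajority 3   # returns 2 (three-member DAG, node majority)
    Get-QuorumMajority 4   # returns 3 (four-member DAG; witness breaks ties via the locking node)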

Going back to our example, consider the placement of the Witness Server in a third datacenter, which would look like the following: 

Figure 3: Locating the Witness Server in a third datacenter does not provide you with any different behavior

The above configuration does not provide you with any different behavior. In the event WAN connectivity is lost between Portland and Redmond, one DAG member will retain quorum and one DAG member will lose quorum, as illustrated below: 

Figure 4: In the event of a WAN outage between the two datacenters, one DAG member will retain quorum

Here we have two DAG members; thus two voters. Using the formula V/2 + 1, we need at least 2 votes to maintain quorum. When the WAN connection between Portland and Redmond is lost, it causes the DAG’s underlying cluster to verify that it still has quorum.

In this example, the DAG member in Portland is able to place an SMB lock on the witness.log file on the Witness Server in Olympia. Because the DAG member in Portland is the locking node, it gets the weighted vote, and now therefore holds the two votes necessary to retain quorum and keep its cluster and DAG functions operating.

Although the DAG member in Redmond can communicate with the Witness Server in Olympia, it cannot place an SMB lock on the witness.log file because one already exists. And because it cannot communicate with the locking node, the Redmond DAG member is in the minority, it loses quorum, and it terminates its cluster and DAG functions. Remember, it doesn’t matter if the other DAG members can communicate with the Witness Server; they need to be able to communicate with the locking node in order to participate in quorum and remain functional.

As documented in Managing Database Availability Groups on TechNet, if you have a DAG extended across two sites, we recommend that you place the Witness Server in the datacenter that you consider to be your primary datacenter based on the location of your user population. If you have multiple datacenters with active user populations, we recommend using two DAGs (also as documented in Database Availability Group Design Examples on TechNet).

Misconception Number 2a: When I have a DAG with an even number of members that is extended to two datacenters, placing the witness server in a third datacenter enhances resilience

In addition to Misconception Number 2, there is a related misconception that extending an even-member DAG to two datacenters and using a witness server in a third enables greater resilience because it allows you to configure the system to perform a “datacenter failover.” You may have noticed that the term “datacenter failover” is not defined above in the Terminology section. From an Exchange perspective, there’s no such thing. As a result, no configuration can enable a true datacenter failover for Exchange.

Remember, failover is corrective action performed automatically by the system. There is no mechanism to achieve this for datacenter-level failures in Exchange 2010. While the above configuration may enable server failovers and database failovers, it cannot enable datacenter failovers. Instead, the process for recovering from a datacenter-level failure or disaster is a manual process called a datacenter switchover, and that process always begins with humans making the decision to activate a second or standby datacenter.

Activating a second datacenter is not a trivial task, and it involves much more than the inner workings of a DAG. It also involves moving messaging namespaces from the primary datacenter to the second datacenter. Moreover, it assumes that the primary datacenter is no longer able to provide a sufficient level of service to meet the needs of the organization. This is a condition that the system simply cannot detect on its own. It has no awareness of the nature or duration of the outage. Thus, a datacenter switchover is always a manual process that begins with the decision-making process itself.

Once the decision to perform a datacenter switchover has been made, performing one is a straightforward process that is well-documented in Datacenter Switchovers.
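
For orientation only, here is a heavily condensed sketch of the Shell portion of that process, using hypothetical DAG, Active Directory site, and witness names; the authoritative, step-by-step procedure is the Datacenter Switchovers documentation:

    # 1. Mark the failed primary datacenter's DAG members as stopped (run against AD if they are unreachable)
    Stop-DatabaseAvailabilityGroup -Identity DAG1 -ActiveDirectorySite Portland -ConfigurationOnly

    # 2. On each surviving DAG member in the standby datacenter, stop the Cluster service
    Stop-Service ClusSvc

    # 3. Activate the surviving members in the standby datacenter (requires DAC mode)
    Restore-DatabaseAvailabilityGroup -Identity DAG1 -ActiveDirectorySite Redmond -AlternateWitnessServer RED-HUB1 -AlternateWitnessDirectory C:\DAG1AWD

    # 4. Later, when the primary datacenter is healthy again, bring its members back into the DAG
    Start-DatabaseAvailabilityGroup -Identity DAG1 -ActiveDirectorySite Portland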

Misconception Number 3: Enabling DAC mode prevents automatic failover between datacenters; therefore, if I want to create a datacenter failover configuration, I shouldn’t enable DAC mode for my DAG

Datacenter Activation Coordination (DAC) mode has nothing whatsoever to do with failover. DAC mode is a property of the DAG that, when enabled, forces starting DAG members to acquire permission from other DAG members in order to mount mailbox databases. DAC mode was created to handle the following basic scenario:

  • You have a DAG extended to two datacenters.
  • You lose the power to your primary datacenter, which also takes out WAN connectivity between your primary and secondary datacenters.
  • Because primary datacenter power will be down for a while, you decide to activate your secondary datacenter and you perform a datacenter switchover.
  • Eventually, power is restored to your primary datacenter, but WAN connectivity between the two datacenters is not yet functional.
  • The DAG members starting up in the primary datacenter cannot communicate with any of the running DAG members in the secondary datacenter.

In this scenario, the starting DAG members in the primary datacenter have no idea that a datacenter switchover has occurred. They still believe they are responsible for hosting active copies of databases, and without DAC mode, if they have a sufficient number of votes to establish quorum, they would try to mount their active databases. This would result in a bad condition called split brain, which in this case would occur at the database level: multiple DAG members that cannot communicate with each other each host an active copy of the same mailbox database. This is a very unfortunate condition that increases the chances of data loss and makes data recovery challenging and lengthy (albeit possible, but definitely not a situation we would want any customer to be in).

The way databases are mounted in Exchange 2010 has changed. Yes, the Information Store still performs the mount, but it will only do so if Active Manager asks it to. Even when an administrator right-clicks a mailbox database in the EMC and selects Mount Database, it is Active Manager that provides the administrative interface for that task, and performs the RPC request into the Information Store to perform the mount operation (even on Mailbox servers that are not members of a DAG).

Thus, whenever a DAG member starts, it is Active Manager that decides whether or not to send a mount request for a mailbox database to the Information Store. When a DAG is enabled for DAC mode, this startup and decision-making process by Active Manager is altered. Specifically, in DAC mode, a starting DAG member must ask for permission from other DAG members before it can mount any databases.

DAC mode works by using a bit stored in memory by Active Manager called the Datacenter Activation Coordination Protocol (DACP). That’s a very fancy name for something that is simply a bit in memory set to either a 1 or a 0. A value of 1 means Active Manager can issue mount requests, and a value of 0 means it cannot.

The starting bit is always 0, and because the bit is held in memory, any time the Microsoft Exchange Replication service (MSExchangeRepl.exe) is stopped and restarted, the bit reverts to 0. In order to change its DACP bit to 1 and be able to mount databases, a starting DAG member needs to either:

  • Be able to communicate with any other DAG member that has a DACP bit set to 1; or
  • Be able to communicate with all DAG members that are listed on the StartedMailboxServers list.

If either condition is true, Active Manager on a starting DAG member will issue mount requests for the active database copies it hosts. If neither condition is true, Active Manager will not issue any mount requests.
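
To make the decision rule concrete, here is a hypothetical PowerShell-style sketch of the logic described above; it is purely illustrative and is not the actual Active Manager implementation:

    # Illustrative only: the DAC mode startup check described in the two bullets above
    function Test-CanSetDacpBit {
        param(
            [string[]]$ReachableMembers,        # DAG members this starting server can communicate with
            [hashtable]$DacpBits,               # DACP bit (0 or 1) reported by each reachable member
            [string[]]$StartedMailboxServers    # the DAG's StartedMailboxServers list
        )

        # Condition 1: at least one reachable member already has its DACP bit set to 1
        $anyBitSet = @($ReachableMembers | Where-Object { $DacpBits[$_] -eq 1 }).Count -gt 0

        # Condition 2: every member on the StartedMailboxServers list is reachable
        $allStartedReachable = @($StartedMailboxServers | Where-Object { $ReachableMembers -notcontains $_ }).Count -eq 0

        return ($anyBitSet -or $allStartedReachable)
    }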

Returning to the intended DAC mode scenario: when power is restored to the primary datacenter without WAN connectivity, the DAG members starting up in that datacenter can communicate only with each other. And because they are starting up from a power loss, their DACP bit will be set to 0. As a result, none of the starting DAG members in the primary datacenter is able to meet either of the conditions above, so they cannot change their DACP bit to 1 and cannot issue mount requests.

So that’s how DAC mode prevents split brain at the database level. It has nothing whatsoever to do with failovers, and therefore leaving DAC mode disabled will not enable automatic datacenter failovers.

By the way, as documented in Understanding Datacenter Activation Coordination Mode on TechNet, a nice side benefit of DAC mode is that it also provides you with the ability to use the built-in Exchange site resilience tasks.
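
For reference, enabling DAC mode is a single property change (the DAG name is hypothetical), and once it is enabled you can use the built-in site resilience cmdlets such as Stop-DatabaseAvailabilityGroup, Restore-DatabaseAvailabilityGroup, and Start-DatabaseAvailabilityGroup:

    # Enable DAC mode on a DAG
    Set-DatabaseAvailabilityGroup -Identity DAG1 -DatacenterActivationMode DagOnly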

Misconception Number 4: The AutoDatabaseMountDial setting controls how many log files are thrown away by the system in order to mount a database

This is a case where two separate functions are being combined to form this misperception: the AutoDatabaseMountDial setting and a feature known as Incremental Resync (aka Incremental Reseed v2). These features are actually not related, but they appear to be because they deal with roughly the same number of log files on different copies of the same database.

When a failure occurs in a DAG that affects the active copy of a replicated mailbox database, a passive copy of that database is activated one of two ways: either automatically by the system, or manually by an administrator. The automatic recovery action is based on the value of the AutoDatabaseMountDial setting.

As noted in the Terminology section above, this dial setting is the administrator’s way of telling a DAG member the maximum number of log files that can be missing while still allowing its database copies to be mounted. The default setting is GoodAvailability, which translates to 6 or fewer logs missing. This means that if 6 or fewer log files never made it from the active copy to this passive copy, it is still OK for the server to mount this database copy as the new active copy. This scenario is referred to as a lossy failover, and it is Exchange doing what it was designed to do. Other settings include BestAvailability (12 or fewer logs missing) and Lossless (0 logs missing).
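
Viewing or changing the dial is straightforward in the Exchange Management Shell (the server name EX2 matches the example that follows):

    # Check the current dial setting on a DAG member
    Get-MailboxServer -Identity EX2 | Format-List Name,AutoDatabaseMountDial

    # Change it; valid values are Lossless, GoodAvailability, and BestAvailability
    Set-MailboxServer -Identity EX2 -AutoDatabaseMountDial BestAvailability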

After a passive copy has been activated in a lossy failover, it will create log files continuing the log generation sequence based on the last log file it received from the active copy (either through normal replication, or as a result of successful copying during the ACLL process). To illustrate this, let’s look at the scenario in detail, starting before a failure occurs.

We have two copies of DB1; the active copy is hosted on EX1 and the passive copy is hosted on EX2. The current settings and mailbox database copy status at the time of failure are as follows (a Shell command for checking these values on your own copies appears after the list):

  • AutoDatabaseMountDial: BestAvailability
  • Copy Queue Length: 4
  • Replay Queue Length: 0
  • Last log generated by DB1\EX1: E0000000010
  • Last log received by DB1\EX2: E0000000006
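
You can see these same values for your own database copies with Get-MailboxDatabaseCopyStatus; a minimal sketch (database name as in this example):

    # Show copy status, copy queue length, and replay queue length for all copies of DB1
    Get-MailboxDatabaseCopyStatus -Identity DB1 | Format-Table Name,Status,CopyQueueLength,ReplayQueueLength -AutoSize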

At this point, someone accidentally powers off EX1, and we have a lossy failover in which DB1\EX2 is mounted as the new active copy of the database. Because E0000000006 is the last log file DB1\EX2 has, it continues the generation stream, creating log files E0000000007, E0000000008, E0000000009, E0000000010, and so forth.

An administrator notices that EX1 is turned off and restarts it. EX1 boots up and, among other things, the Microsoft Exchange Replication service starts. The Active Manager component, which runs inside this service, detects that:

  • DB1\EX2 was activated as part of a lossy failover
  • DB1\EX2 is now the active copy
  • DB1\EX1 is now a diverged passive copy

Any time a lossy failover occurs where the original active copy may still be viable, there is always divergence in the log stream that the system must deal with. This state causes DB1\EX1 to automatically invoke a process called Incremental Resync, which is designed to deal with divergence in the log stream after a lossy failover has occurred. Its purpose is to resynchronize database copies so that when certain failure conditions occur, you don’t have to perform a full reseed of a database copy.

In this example, divergence occurred with log generation E0000000007, as illustrated below:

Figure 5: Divergence in the log stream occurred with log E0000000007

 DB1\EX2 received generations 1 through 6 from DB1\EX1 when DB1\EX1 was the active copy. But a failover occurred, and logs 7 through 10 were never copied from EX1 to EX2. Thus, when DB1\EX2 became the active copy, it continued the log generation sequence from the last log that it had, log 6. As a result, DB1\EX2 generated its own logs 7-10 that now contain data that is different from the data contained in logs 7-10 that were generated by DB1\EX1.

To detect (and resolve) this divergence, the Incremental Resync feature starts with the latest log generation on each database copy (in this example, log file 10), and it compares the two different log files, working back in the sequence until it finds a matching pair. In this example, log generation 6 is the last log file that is the same on both systems. Because DB1\EX1 is now a passive copy, and because its logs 7 through 10 are diverged from logs 7 through 10 on DB1\EX2, which is now the active copy, these log files will be thrown away by the system. Of course, this does not represent lost messages because the messages themselves are recoverable through the Transport Dumpster mechanism.

Then, logs 7 through 10 on DB1\EX2 will be replicated to DB1\EX1, and DB1\EX1 will be a healthy up-to-date copy of DB1\EX2, as illustrated below: 

Figure 6: Incremental Resync corrects divergence in the log stream

I should point out that I am oversimplifying the complete Incremental Resync process, and that it is more complicated than what I have described here; however, for purposes of this discussion only a basic understanding is needed.

As we saw in this example, even though DB1\EX2 was missing four log files, it was still able to mount as the new active database copy because the number of missing log files was within EX2’s configured value for AutoDatabaseMountDial. And we also saw that, in order to correct divergence in the log stream after the lossy failover, the Incremental Resync function threw away four log files.

But the fact that both operations dealt with four log files does not make them related, nor does it mean that the system is throwing away log files based on the AutoDatabaseMountDial setting.

To help understand why these are really not related functions, and why AutoDatabaseMountDial does not throw away log files, consider the failure scenario itself. AutoDatabaseMountDial simply determines whether a database copy will mount during activation based on the number of missing log files. The key here is the word missing. We’re talking about log files that have not been replicated to this activated copy. If they have not been replicated, they don’t exist on this copy, and therefore, they cannot be thrown away. You can’t throw away something you don’t have.

It is also important to understand that the Incremental Resync process can only work if the previous active copy is still viable. In our example, someone accidentally shut down the server, and typically, that act should not adversely affect the mailbox database or its log stream. Thus, it left the original active copy intact and viable, making it a great candidate for Incremental Resync.

But let’s say instead that the failure was actually a storage failure, and that we’ve lost DB1\EX1 altogether. Without a viable database, Incremental Resync can’t help here, and all you can do to recover is to perform a reseed operation.
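
In that case the recovery is a reseed of the affected copy, which in its simplest form (copy identity taken from this example) looks like this:

    # Suspend the dead copy, then reseed it from the current active copy
    Suspend-MailboxDatabaseCopy -Identity DB1\EX1
    Update-MailboxDatabaseCopy -Identity DB1\EX1 -DeleteExistingFiles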

So, as you can see:

  • AutoDatabaseMountDial does not control how many log files the system throws away
  • AutoDatabaseMountDial is a completely separate process that does not require Incremental Resync to be available or successful
  • Incremental Resync throws away log files as part of its divergence correction mechanism, but does not lose messages as a result of doing so

Misconception Number 5: Hub Transport and Client Access servers should not have more than 8 GB of memory because they run slower if you install more than that

This has been followed by statements like:

“a Hub Transport server with 16 GB of memory runs twice as slow as a Hub Transport server with 8 GB of memory, and the Exchange 2010 server roles were optimized to run with only 4 to 8 GB of memory.”

This misconception isn’t directly related to high availability, per se, but because scalability and cost both factor into any Exchange high availability solution, it’s important to discuss this as well, so that you can be confident that your servers are sized appropriately and that you have the proper server role ratio.

It is also important to address this misconception because it’s blatantly wrong. You can read our recommendations for memory and processors for all server roles and multi-role servers in TechNet. At no time have we ever said to limit memory to 8 GB or less on a Hub Transport or Client Access server. In fact, examining our published guidance will show you that the exact opposite is true.

Consider the recommended maximum number of processor cores we state that you should have for a Client Access or Hub Transport server. It’s 12. Now consider that our memory guidance for Client Access servers is 2 GB per core and for Hub Transport it is 1 GB per core. Thus, if you have a 12-core Client Access server, you’d install 24 GB of memory, and if you had a 12-core Hub Transport server, you would install 12 GB of memory.

Exchange 2010 is a high-performance, highly-scalable, resource-efficient, enterprise-class application. In this 64-bit world of ever-increasing socket and core count and memory slots, of course Exchange 2010 is designed to handle much more than 4-8 GB of memory.

Microsoft’s internal IT department, MSIT, knows first-hand how well Exchange 2010 scales beyond 8 GB. As detailed in the white paper Exchange Server 2010 Design and Architecture at Microsoft: How Microsoft IT Deployed Exchange Server 2010, MSIT deployed single-role Hub Transport and Client Access servers with 16 GB of memory.

It has been suggested that a possible basis for this misconception is a statement we have in Understanding Memory Configurations and Exchange Performance on TechNet that reads as follows:

Be aware that some servers experience a performance improvement when more memory slots are filled, while others experience a reduction in performance. Check with your hardware vendor to understand this effect on your server architecture.

The reality is that the statement is there because if you fail to follow your hardware vendor’s recommendation for memory layout, you can adversely affect the performance of the server. This statement, while worth heeding in Exchange environments, has nothing whatsoever to do with Exchange or any other specific application. It’s there because server vendors have specific configurations for memory based on a variety of elements, such as chipset, type of memory, socket configuration, processor configuration, and more. By no means does it mean that if you add more than 8 GB, Exchange performance will suffer. It just means you should make sure your hardware is configured correctly.

As stated in the article, and as mentioned above:

  • Our memory sizing guidance for a dedicated Client Access server role is 2 GB per core.
  • Our memory sizing guidance for a dedicated Hub Transport server role is 1 GB per core.

Misconception Number 6: A Two-Member DAG is designed for a small office with 250 mailboxes or less

This misconception is really related more to Misconception Number 5 than to high availability, because again it’s addressing the scalability of the solution itself. Like Misconception Number 5, this one is also blatantly wrong.

The fact is, a properly sized two-member DAG can host thousands of mailboxes, scaling far beyond 250 users. For example, consider the HP E5000 Messaging System for Exchange 2010, which is a pre-configured solution that uses a two-member DAG to provide high availability solutions for customers with a mailbox count ranging from 250 up to 15,000.

Ultimately, the true size and design of your DAG will depend on a variety of factors, such as your high availability requirements, your service level agreements, and other business requirements. When sizing your servers, be sure to use the guidance and information documented in Understanding Exchange Performance, as it will help ensure your servers are sized appropriately to handle your organization’s messaging workload.

What Have You Heard?

Have you heard any Exchange high availability misconceptions? Feel free to share the details with me in email. Who knows, it might just spawn another blog post!

For More Information

For more information on the high availability and site resilience features of Exchange Server 2010, check out these resources:

Blogcasts

TechNet Documentation

TechEd Presentations

  • EXL312 Designing Microsoft Exchange 2010 Mailbox High Availability for Failure Domains – TechEd North America 2011
  • EXL327 Real-World Site Resilience Design in Microsoft Exchange Server 2010 – TechEd North America 2011
  • EXL401 Exchange Server 2010 High Availability Management and Operations – TechEd North America 2011
  • UNC401 Microsoft Exchange Server 2010: High Availability Deep Dive (including changes introduced by SP1) – TechEd Europe 2010
  • UNC302 Exchange Server 2010 SP1 High Availability Design Considerations – TechEd New Zealand 2010

Scott Schnoll

Comments (26)
  1. Randall Vogsland says:

    Great article Scott, good to have this type of thing easily accessible for productive discourse with clients!

  2. Andrew says:

    I still say that AutoDatabaseMountDial "indirectly" sets how many log files can be discarded.  The ADMD setting tells the passive copy how many logs it can be behind before automatically mounting.  Once the passive automatically mounts, the former active has up to that many log files that are "diverged" and will be discarded once it comes back online.

    You are correct that ADMD doesn't itself throw away log files, but those log files are discarded once the former active starts replication again.  Thus ADMD indirectly sets the amount that can be discarded :)

  3. Andrew says:

    Was trying to type "Great Post" – this is the level of detail that people are looking for.  Missed that in my first comment.

  4. Scott Schnoll [MSFT] says:

    Andrew, thanks for the kind words.  Please do re-read the part where I state:

    – AutoDatabaseMountDial is a completely separate process that does not require Incremental Resync to be available or successful

    – Incremental Resync throws away log files as part of its divergence correction mechanism, but does not lose messages as a result of doing so

    Remember, Active Manager consults the value for ADMD before issuing a mount request during Best Copy Selection.  Once that operation is complete, the value for ADMD is not used by the copy being activated, nor it used by the Incremental Resync process.  Again, the Incremental Resync process will only occur if the previous active copy is still viable and can be resynchronized. And it is only in that case where logs will be discarded.

    You may be confusing the value of the ADMD setting with the actual number of logs lost.  Consider the above example where ADMD is set to BestAvailability, which translates to 12 or fewer log files.  When Incremental Resync runs, that does not mean it can discard up to 12 log files.  It will only discard diverged log files, and in our example, there were only 4 that were diverged.  Thus, only 4 are discarded, even though ADMD is configured with a value of up to 12.

    To illustrate further, consider the same example scenario I used in the blog post, but instead of losing 4 log files, the copy queue length was actually 20 log files.  Since ADMD is set to BestAvailability, the passive copy will not automatically mount.  So, the administrator forces a mount of the database to restore client access.  When the server hosting the previous active copy is restarted, Incremental Resync will perform its divergence check and discard 20 log files, several more than what ADMD was set to.

  5. Bruce says:

    i will add we have a 2 member DAG in UAT and a 4 Member in PRD. We tested in both scenarios and the only way to allow for automatic service restore was to allow for a witness server in a 3rd DC. We are going from a mail system that did allow for automatic failover in all situations and the new Exch environment needs to meet that expectation, to account for a 3am type of outage of network services

  6. Scott Schnoll [MSFT] says:

    Bruce, I would be very interested to hear all of the details of your configuration, the failure scenario and your test cases.  If you would like, please email the information directly to me.  My email address is in the above blog post.

  7. Scott Schnoll [MSFT] says:

    Randall, thanks very much for the kind words!

  8. Bruce says:

    Np i'll send you some info and basic diagram

  9. Tim Huang says:

    @Scott: Thanks for an excellent, excellent post – watching your High Availability for Failure Domains session from TechEd.

  10. Paul Cunningham says:

    Highly useful information Scott, thanks a lot!

  11. Rajiv Arora says:

    @Scott and Exchange team: Blogs like these from experts like you earn a lot of respect for Exchange team and Microsoft.

  12. Peter says:

    Scott, this is absolutely fantastic post!!! Please, produce more of stuff like this.

  13. Michel de Rooij says:

    Great post Scott!

  14. Bill T says:

    My thought is that #2a actually works in the following scenario:

    -Both data centers are super well-connected. So much so that they are combined to the same AD site, and we don't care about different CAS namespaces or about network traffic between CAS/Hubs in one data center going to a mailbox in the other.

    -The third 'data center' is also very well connected, but may just be a critical office building. Network topology allows for it to connect to either data center directly.

    -Both have no local user population (i.e. pure data centers.) Everyone is remote.

    With the caveat that even though it will be able to automatically maintain or restore service if any of the three buildings blows up, you'll still want your administrators to be checking things out ASAP.

    For #5, I have seen this "8GB recommended max" for CAS in writing on printed MS Exchange 2010 training material delivered by a PFE. And even that was for "large-scale" deployments.

  15. Scott Schnoll [MSFT] says:

    Paul, Rajiv, Peter, Tim and Michel, thanks for the kind words.  BTW, Tim, I did not deliver the HA failure domain session.  That was delivered by Ross Smith IV, and its an excellent session!

    Bill, thanks for the comment.  As Bruce implied in his note, and as I later confirmed when he sent me more information offline, if you have multiple datacenters, but you have sufficient connectivity where they are treated as a single datacenter from an AD perspective and from a namespace perspective, then that scenario is analagous to a DAG that exists in a single datacenter.  In that case, the datacenter switchover scenario does not apply, regardless of the number of well-connected datacenters.  In the context of this blog, datacenter equals AD site.

    As for the PFE content, feel free to send me a copy of it via email and I can have it corrected.  I'm not familiar with the specific training you mention, but I would like to determine the source of the content and have it corrected.

  16. justin says:

    While a solid article, misconception #2 makes a large assumption in that between the two sites, all users exist solely in site "a" and site "b" is largely just a DR.  

    The popularity of a 3rd witness site stems from both sites being "active" and the desire to automatically fail-over between the two sites and differentiate which sit had the WAN failure.  Basically you left out what would happen if "site A" failed:  we'd have to manually /forcequorum to get site b up and running whereas with a 3rd site being used as FSW we could have either site mount and operate databases for the other in the event of a  failure … we simply have to tweak the cluster sensitivity (and have GSLB in place, yadda yadda).

    So to say it doesn't provide any different behavior is false:  it gives us the ability to automatically fail over in either direction, rather than forcing a manual fail over.

  17. justin says:

    I see a "2a" dropped in that states a data center switchover is always manual … Ill simply agree to disagree there.  Again, with proper CAS arrays setup and a simple GSLB with 5m TTL you can very easily switch over from site a to site b "auto-magically".  It _can_ be very trivial as long as the exchange admin takes the time to occasionally ensure his DAG replication is up-to-snuff (something he should be doing anyway, no matter your geographic DAG model of choice).

    In fact, Barracuda added GSLB to one of their firmware updates so even the relatively inexpensive 340 can now do this … making it a very real possibility "on the cheap" for lots of clients.  The hardest part is identifying that latency so you dont get any "false switch overs".

  18. arjuncs@hotmail.com says:

    thanks for the post Scott….

  19. Anthony T says:

    The old adage, if you can't explain it so others can understand it, you don't understand it yourself.

    Glad you guys finally understand it!

  20. prettygoodid@hotmail.com says:

    A super duper post for Exchange enthusiasts..

    Thanks a lot Scott for this :-)

  21. Søren says:

    Excellent article!

    After reading #5 I realise that you attended Rand Morimoto's EXL305 session too, as this is exactly Morimoto's claim on slide 20/time index 31.40 in the video ;-)

  22. morrongiello@hotmail.com says:

    Scott, thanks for the excellent article. I also enjoyed your talks at TechEd 2011 in Atlanta.

    Regarding Number 5, I did find this statement in the "Infrastructure Planning and Design Guide for Exchange 2010 SP1", in Section "Step 4: Design the Client Access Server Infrastructure":

    Memory Subsystem – The Exchange Server product group also recommends 2 GB of memory per processor core, with a maximum of 8 GB per server for optimal performance.

    The Guide can be found at http://www.microsoft.com/ipd

  23. Scott Schnoll [MSFT] says:

    ajm88, thanks for the kind words.  There are a couple of bugs in Exchange 2010 SP1 IPD that we are working to get corrected.  I don't have an ETA for the updated version, but we will announce it on this blog when it's published.  Sorry for any confusion the IPD doc bugs may have caused.

  24. cparker@aardvarktactical.com says:

    The High Availability for Exchange links for parts 1 – 3 are broken. Or rather, the pages themselves may be at fault.

  25. Scott Schnoll [MSFT] says:

    The links to the HA blogcasts should be working now.  Sorry for the inconvenience.

  26. 1 DAG - active/active says:

    Great artikle thanks Scott.

    We are in the process to planning a two Datacenter solution with one DAG and 2 MBX Servers. The 2000 users are equaly shared about the two Datacenters and we have one CAS Array in each Datacenter and for the internet we use one entry point

    Should we work with two namespaces, one for each Datacenter and should we place the witness server in Datacenter one or use a third location for the FSW.

Comments are closed.