We recently updated our maximum item count guidance for Exchange 2007 when IMAP clients are being used. As a result, I’ve gotten a number of questions about the reasons for the change. Hopefully this post will help to answer those questions.
In Exchange 2007, the product went through major architectural changes: various components that had previously been closely tied to the store.exe process were decoupled and moved to the middle-tier Client Access Server role. This included the IMAP service. As part of this move, the IMAP service was completely rearchitected and rewritten, which has resulted in a performance and scalability profile different from that of Exchange 2000 and 2003.
Alongside those differences, user mailboxes have grown as storage costs have fallen. Together, these factors have led to observations of excessive CAS CPU utilization by the Exchange 2007 IMAP service when it operates on folders containing a large number of items. In Exchange 2007 SP2 RU2, the product team released new functionality that can log user activity through the IMAP service against folders whose item count exceeds a configured threshold. See KB 969948 for more details.
Exchange has provided item count guidance for various product releases, and this guidance has usually been associated with MAPI clients such as Microsoft Outlook accessing “critical path” folders like the Inbox and the resulting effect on disk I/O and CPU utilization of the store.exe process. Given the level of interest in access to high item count folders with IMAP clients, which primarily affects CAS CPU on Exchange 2007, I performed some testing to characterize CAS CPU utilization and provide a data-based recommendation on the maximum number of items to store in folders used by IMAP clients.
Topology: The test was run in a simple Exchange 2007 topology. The topology consisted of a domain controller with co-located Hub Transport server, Client Access server, and Mailbox server roles.
Software: Windows Server 2008 R2 with Exchange 2007 SP3 RU2 (current patch level)
Hardware: The hardware was configured to eliminate I/O bottlenecks and focus solely on CPU consumption, so the Mailbox server was extremely overprovisioned on RAM and storage for the required workload. Specifically, the Mailbox server was configured with 48GB RAM and enough transactional I/O capacity to support approximately 4000 IOPS. The CAS server was configured with 16GB RAM. The CAS and Mailbox servers each had two 4-core processors (Intel Xeon L5520, running at 2.27GHz) installed. To minimize the workload required to obtain meaningful results, the number of active cores on these machines was reduced to 4 via a Windows boot configuration parameter.
To ensure meaningful CPU measurements, simultaneous multithreading (on this platform that would be Intel Hyper-Threading Technology) was disabled, and the “High Performance” power profile was activated to ensure that all cores would run at 100% of their frequency for the duration of the test.
Workload: In this test, the workload was purposefully designed to be very simple. SSL encryption was disabled to allow for packet captures (which were used to validate the workload generated against the server). Each IMAP session simply logged in, selected the Inbox as the folder to act on, and fetched the UID and FLAGS properties of each message in the folder. This is common behavior for IMAP clients when opening a folder whose content is already cached locally on the user’s PC. Think of this as the best-case scenario, where no message content (other than the flags associated with the state of a message) is actually downloaded to the client. The workload was generated by the Exchange Load Generator tool. The script used by LoadGen to generate this workload was:
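For readers who want to reproduce the same protocol exchange outside LoadGen, here is a minimal sketch in Python. To be clear, this is not the LoadGen script; it is an illustration of the same SELECT and FETCH sequence using the standard library's `imaplib`. The function name `fetch_uid_flags` and the response parsing are my own assumptions about how one might do this.

```python
import re

# Patterns for the two items requested in the FETCH command.
_UID_RE = re.compile(rb"UID (\d+)")
_FLAGS_RE = re.compile(rb"FLAGS \(([^)]*)\)")


def fetch_uid_flags(conn):
    """Select the Inbox and return {uid: [flags]} for every message in it.

    `conn` is an already logged-in imaplib.IMAP4 object (or anything that
    offers compatible select() and fetch() methods).
    """
    typ, data = conn.select("INBOX")
    if typ != "OK":
        raise RuntimeError("SELECT failed")
    # SELECT reports the number of messages in the folder.
    if int(data[0]) == 0:
        return {}

    # Ask only for UID and FLAGS of every message; no message content.
    typ, lines = conn.fetch("1:*", "(UID FLAGS)")
    if typ != "OK":
        raise RuntimeError("FETCH failed")

    result = {}
    for line in lines:
        if not isinstance(line, bytes):
            continue
        uid = _UID_RE.search(line)
        flags = _FLAGS_RE.search(line)
        if uid and flags:
            result[int(uid.group(1))] = flags.group(1).decode().split()
    return result
```

Against a real server you would create the connection with `imaplib.IMAP4(host)`, call `login(user, password)`, pass the connection to `fetch_uid_flags`, and finish with `logout()`. The host and credentials are whatever your test environment uses. Note that the server has to walk every item in the folder to answer this request, even though very little data goes back to the client.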
Content: Groups of mailboxes were configured on Exchange, with each group having a different number of items in the Inbox folder: 100, 500, 1000, 5000, 10000, 50000, 100000, and 500000. Each message was generated by LoadGen’s built-in content generation engine, configured for HTML content type and an average message size of 75 KB.
For each iteration of the test, the LoadGen workload was tuned to generate enough load to keep CPU utilization on the CAS server above 60%. Low CPU utilization can produce misleading results when attempting to normalize CPU utilization to specific operations or overall sessions. The workload was adjusted by tuning the number of users simultaneously accessing the server with the provided script.
Additionally, the initial delay in starting each simulated user was tuned to generate the cleanest “steady state” possible. The number of simulated users was slowly increased by adding client machines to the mix until the target CPU utilization level was achieved in steady state. The test was run for at least 1 hour in steady state to collect performance measurements for analysis, and then the next iteration was run. Performance data representing both the workload and its effect on machine resources was collected from both the CAS and Mailbox servers.
Even before I looked at the actual performance results, the number of user sessions that were run for each of the iterations clearly demonstrated the dramatically increasing impact of high item count folders on the system. While determining the primary cause of the impact, I ruled out Mailbox server bottlenecks due to I/O, CPU, or network. I/O and network bottlenecks were ruled out on the CAS server as well. The CAS server was not under memory pressure for the duration of the tests, and .NET garbage collection activity was minimal (i.e., not a significant contributor to IMAP process CPU). The CPU data from the CAS server clearly demonstrates the effect of high item count.
In order to make each iteration comparable, I normalized the data first to megacycles per IMAP session. A megacycle is a unit of work equaling one million CPU cycles. A 1 MHz processor would process a megacycle every second. Due to various optimizations in modern processors, clock speed is not the only indication of a processor’s ability to process a megacycle in a given period of time when comparing to other processors. In this case, all of our measurements were obtained from the same type of processor (actually the same physical processor), so the results are comparable between test iterations.
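As a rough illustration of this kind of normalization, here is a small Python sketch. It assumes the usual approach of multiplying average CPU utilization by the megacycles the machine could deliver over the measurement interval, then dividing by the number of sessions completed. The function name and the example numbers are hypothetical; they are not the measurements from this test.

```python
def megacycles_per_session(cpu_fraction, cores, clock_mhz, duration_s, sessions):
    """Average CPU work per completed session, in megacycles.

    cpu_fraction: average CPU utilization over the interval, 0.0 to 1.0
    cores:        number of active cores
    clock_mhz:    clock speed of each core in MHz (megacycles per second)
    duration_s:   length of the steady-state interval in seconds
    sessions:     sessions completed during that interval
    """
    if sessions <= 0:
        raise ValueError("sessions must be positive")
    # Megacycles the machine could have delivered at 100% utilization.
    capacity = cores * clock_mhz * duration_s
    # Megacycles actually consumed, spread across the completed sessions.
    return cpu_fraction * capacity / sessions


# Hypothetical run: 60% CPU on 4 cores at 2270 MHz for one hour, 10,000 sessions.
example = megacycles_per_session(0.60, 4, 2270, 3600, 10_000)
print(f"{example:.2f} megacycles per session")
```

Because every iteration ran on the same processors, the clock speed and core count are constants here, and only the utilization and session count change between iterations.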
Here’s what the megacycles per IMAP session look like when plotted on a log-log graph:
Note that the y-axis (the vertical axis) of the graph is on a logarithmic scale (as is the x-axis given the item count values that were selected for the tests) – the curve of the graph is much more noticeable on a linear scale, but it makes the lower end of the scale harder to interpret. As one might expect, a session where we process higher item counts costs significantly more than a session where we process lower item counts. Not terribly surprising.
It’s far more interesting to normalize the data to a smaller unit of work: megacycles per message in the folder being processed. Since we are primarily concerned with the impact of each item that contributes to the high item count, this is much more meaningful than megacycles per session. Obviously there are some fixed costs associated with maintaining an IMAP session, which occur regardless of item count. But given that costs rise so dramatically as item count increases (as seen in the megacycles per session graph), it’s clear that the per-message costs will dominate.
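To see why the fixed session costs matter only at the low end, consider this Python sketch. It uses a deliberately simple model that I am assuming for illustration: a fixed per-session overhead plus a constant cost per item. The two constants are made-up values, not measured data.

```python
def per_message_cost(session_megacycles, item_count):
    """Normalize a per-session cost to a per-message cost."""
    if item_count <= 0:
        raise ValueError("item_count must be positive")
    return session_megacycles / item_count


# Hypothetical model, not measured data: a fixed per-session overhead plus a
# constant cost for each item in the folder.
FIXED_MEGACYCLES = 5.0
MEGACYCLES_PER_ITEM = 0.01


def modeled_session_cost(item_count):
    return FIXED_MEGACYCLES + MEGACYCLES_PER_ITEM * item_count


costs = {n: per_message_cost(modeled_session_cost(n), n) for n in (100, 1000, 10000)}
for n, cost in costs.items():
    print(f"{n:>6} items: {cost:.4f} megacycles per message")
```

In this model the per-message figure starts out inflated by the fixed overhead at 100 items and settles toward the constant per-item cost as the folder grows. Any rise in the per-message figure at high item counts therefore cannot be explained by fixed session costs.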
Here’s what the megacycles per message look like when plotted on a graph:
In this graph, the y-axis is linear rather than logarithmic. As the graph demonstrates, the CPU cost per message in the folder being operated on is essentially flat up to 10,000 messages (that is, the total cost of a session grows roughly linearly with item count). In fact, the cost is slightly higher in the extreme case on the left, where the fixed per-session costs for the IMAP service likely had a bigger impact. After 10,000 messages, the per-message cost begins to rise rather dramatically, suggesting that the overall scalability of the service is impacted.
Given that the IMAP service in Exchange 2007 appears to scale well on a per-message basis up to 10,000 messages per folder, we recommend a maximum of 10,000 items per folder for users who connect over IMAP. Users who are not using IMAP should follow the other published guidance on item count for Exchange 2007. The test data does suggest that there is only a moderate increase in per-message cost between 10,000 and 50,000 messages, but it is also important to remember that the per-session cost increases dramatically over that same range. This means that this activity will take significantly longer for a given IMAP user on a CAS server, and when multiple users are accessing the system this can easily lead to CAS CPU capacity issues.
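The arithmetic behind that last point is worth spelling out, since a moderate per-message increase can sound harmless. Here is a short Python sketch; the 30% figure is a hypothetical value chosen for illustration, not a number taken from the test data.

```python
def session_cost_ratio(items_a, per_msg_a, items_b, per_msg_b):
    """How many times more CPU a session over folder B costs than one over folder A."""
    return (items_b * per_msg_b) / (items_a * per_msg_a)


# Hypothetical: per-message cost rises 30% between 10,000 and 50,000 items.
ratio = session_cost_ratio(10_000, 1.0, 50_000, 1.3)
print(f"A 50,000-item session costs about {ratio:.1f}x a 10,000-item session")
```

The item count alone makes the session five times more expensive, and any increase in per-message cost multiplies on top of that.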