For Exchange 2010, 2013, and 2016 do this before calling Microsoft

Hi everybody! I wanted to post some steps that might save you some time and money. Although all of the settings below are published by Microsoft somewhere, they aren't all in one place, and they aren't always specifically called out as a way to build a more efficient and stable Exchange environment. Where applicable, I have noted whether a setting is an official recommendation from the Exchange Product Group (PG). As a Microsoft Exchange Support Escalation Engineer, I have had quite a bit of success with these settings against the symptoms listed below. If you are experiencing any of them, do this before opening a case with Microsoft; many cases can be resolved right here. If you still need a case, you will have shortened the troubleshooting and trimmed the action plan, which further shortens the time to resolution. Even if you are not having problems, this is a good tune-up.

1. Intermittent Outlook connectivity issues.
2. Intermittent ActiveSync connectivity issues.
3. Slow mail delivery to Outlook or ActiveSync devices.
4. High LDAP search times.
5. High CPU utilization.

My teammate David Paulson maintains a script that tests several of the things I mention below and gives you remediation advice. Download a fresh copy every time you use it, because it is constantly updated for new versions of Exchange and any new .NET hotfixes. It can be downloaded from here:

Exchange Server Performance Health Checker Script https://github.com/dpaulson45/HealthChecker/releases .

Run the above script, follow its guidance, then come back here and see if I covered anything the script did not.

For all Exchange 2010, 2013, and 2016 servers:

1. Set the following in the registry. *Please note, the keys do not exist and will need to be created:

Minimum Connection Timeout. Configure the RPC timeout on Exchange servers to make sure that components which use RPC will trigger a keep alive signal within the time frame you specify here. This will help keep network devices in front of Exchange from closing sessions prematurely:
HKLM\Software\Policies\Microsoft\Windows NT\RPC\MinimumConnectionTimeout
DWORD  0x00000078 (120 decimal, in seconds)

Set the keep-alive timeout. This determines how often TCP sends keep-alive transmissions to verify that an idle connection is still active. Many network devices such as load balancers and firewalls use an aggressive 5-minute idle session timeout. This setting will help keep those devices from closing a session prematurely:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\KeepAliveTime
DWORD value = 1200000 decimal (20 minutes, expressed in milliseconds; 0x00124F80 hex)

There is debate about the above value. I have it set for 20 minutes, but anywhere between 15 and 30 minutes would be viable.

My teammate Josh Jerdon has written an excellent script that will tell you the current KeepAliveTime for all servers in your Exchange org and set it for you if you like. It can be found here: https://gallery.technet.microsoft.com/office/TCP-Keep-Alive-Time-Report-c9a240d0 .

*Note: Item 1 is expected to be done in conjunction with item 9. Please be sure to follow both.

2. Install the latest tcpip.sys for your server OS. This is a recommendation from several different networking engineers that I have worked with on Exchange cases involving a suspected network issue. There is no official recommendation from PG to do this; it is just another one of those items I have enjoyed some success with. If you are keeping up with Windows Updates, you can ignore this, as you are already getting the needed updates.

3. If your servers are hosted on VMware, follow this: https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2039495 . Exchange is a very bursty server, and there is a potential issue in ESXi 4.x and 5.x where packet loss can occur during very high traffic bursts. Increasing the Rx buffer as described in this article can prevent that.

4. Disable Hyper-threading on physical servers and at the VM level for virtualized servers. This is an official recommendation from PG and is discussed in https://technet.microsoft.com/en-us/library/dn879075(v=exchg.150).aspx under the processing section.

5. DO NOT use Dynamic Memory Allocation. It is not supported, as discussed here: https://technet.microsoft.com/en-us/library/jj619301(v=exchg.160).aspx#BKMK_ExchangeMemory . Please use fixed or reserved memory.

6. Set your power management. This is an official recommendation from PG and is discussed in https://technet.microsoft.com/en-us/library/dn879075(v=exchg.150).aspx in the power management section:
Set BIOS to allow the operating system (OS) to manage power.
In the OS, turn on the High Performance power plan.

7. Use the default SNP offload settings where available, and make sure that RSS is enabled (the default setting in Windows Server 2012 and later). RSS will help scale CPU utilization, especially on 10GbE. RSS and TCP Chimney Offload have to be set both in the OS and in the NIC device configuration to work. This is an official recommendation from PG and is discussed in https://technet.microsoft.com/en-us/library/dn879075(v=exchg.150).aspx . To set them in the OS, run the following in an elevated cmd prompt:
netsh int tcp set global rss=enabled
netsh int tcp set global chimney=Automatic

8. For Exchange 2010, the page file size minimum and maximum must be set to physical RAM plus 10MB, as discussed in Exchange 2010 System Requirements https://technet.microsoft.com/library/aa996719%28EXCHG.140%29.aspx (see below for Exchange 2013 and 2016).

9. Make sure your TCP idle session timeout on the load balancer is set to at least 30 minutes. As you traverse network devices outward (firewalls, routers, etc.), the idle session timeout should get successively higher. So, if the NLB is set to 30 minutes, the router is set to 35 and the firewall to 40, all the way out to either the client or the network border. Higher times are acceptable (within reason) as long as you follow the formula. DON'T SKIP THIS STEP!! IT IS VERY IMPORTANT! This is discussed in https://msdn.microsoft.com/en-us/library/dn643702.aspx
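
To sanity-check the numbers in items 1 and 9 together, here is a minimal Python sketch. It converts the two timeouts into the DWORD values the registry expects (MinimumConnectionTimeout in seconds, KeepAliveTime in milliseconds) and checks that a chain of idle timeouts grows as you move outward. The function names and the example device list are hypothetical, and the rule that the keep-alive interval should be shorter than the load balancer's idle timeout is my reading of why items 1 and 9 go together, not an official formula.

```python
def dword(value):
    """Format a non-negative integer the way a registry DWORD is usually shown."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("value does not fit in a 32-bit DWORD")
    return "0x{:08X} ({} decimal)".format(value, value)


def keepalive_ms(minutes):
    """KeepAliveTime is stored in milliseconds."""
    return minutes * 60 * 1000


def check_idle_chain(keepalive_minutes, hops):
    """Return a list of problems; an empty list means the chain looks fine.

    hops: (name, idle_timeout_minutes) pairs, ordered from the load
    balancer outward toward the client or the network border.
    """
    if not hops:
        return ["no network devices listed"]
    problems = []
    first_name, first_timeout = hops[0]
    if first_timeout < 30:
        problems.append(f"{first_name}: idle timeout is below 30 minutes")
    # Assumption (not from the article): keep-alives should fire before
    # the nearest device gives up on an idle session.
    if keepalive_minutes >= first_timeout:
        problems.append("KeepAliveTime is not shorter than the first idle timeout")
    # Each device further out should have a higher timeout than the one before it.
    for (inner, t_in), (outer, t_out) in zip(hops, hops[1:]):
        if t_out <= t_in:
            problems.append(
                f"{outer} ({t_out} min) should be higher than {inner} ({t_in} min)"
            )
    return problems


if __name__ == "__main__":
    print("MinimumConnectionTimeout:", dword(120))
    print("KeepAliveTime:", dword(keepalive_ms(20)))
    chain = [("load balancer", 30), ("router", 35), ("firewall", 40)]
    print(check_idle_chain(20, chain) or "idle timeout chain OK")
```

Feed it your own device list, innermost device first. Any entry it reports is a hop worth a second look before you open a case.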

Specific to Exchange 2013/2016:

1. Install at least the N-1 Cumulative Update (CU), where N is the current CU. The current list with release dates can be found here: Exchange Server Updates: build numbers and release dates https://technet.microsoft.com/en-us/library/hh135098(v=exchg.150).aspx . If you are not on at least N-1, you are dealing with issues you don't need to. PG's official recommendation is to stay no more than one CU behind the current release. Also, the .NET recommendations I am going to make need CU7 or higher to be most effective.

2. Install the latest .NET for your CU of Exchange. The .NET Support Matrix can be found here: https://technet.microsoft.com/en-us/library/ff728623(v=exchg.150).aspx

3. Set the page file to installed memory + 10MB or set the page file to 32GB + 10MB (32,778MB) if more than 32GB of memory is installed. Make sure your page file is not set to be on your mailbox database or log file drives. This is an official recommendation from PG and is discussed in Exchange 2013 Sizing and Configuration Recommendations https://technet.microsoft.com/en-us/library/dn879075(v=exchg.150).aspx .

4. Do not - Do not - Do not scale up your hardware past 24 cores and 192GB of memory. Follow the guidance in Exchange 2013 Sizing and Configuration Recommendations https://technet.microsoft.com/en-us/library/dn879075(v=exchg.150).aspx . Exchange 2013 was designed with O365 in mind: a large number of commodity servers in a data center rather than a few monster servers. If you ignore this guidance, you may have problems, and the further above 24 cores you go, the more potential for trouble you will have.
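
The page file rule in item 3 and the hardware ceiling in item 4 are easy to get wrong by a few megabytes, so here is a small Python sketch of the arithmetic, with the Exchange 2010 rule from the first list included for comparison. The function and constant names are made up for illustration, and it assumes you supply installed memory in whole gigabytes (1GB = 1024MB).

```python
PAGEFILE_PAD_MB = 10     # the "+ 10MB" in the page file rules
PAGEFILE_CAP_GB = 32     # Exchange 2013/2016: cap the basis at 32GB
MAX_CORES = 24           # sizing ceiling from item 4
MAX_MEMORY_GB = 192      # sizing ceiling from item 4


def pagefile_mb(installed_gb, exchange_version):
    """Return the recommended page file size in MB."""
    if exchange_version == 2010:
        # Exchange 2010: physical RAM plus 10MB, with no cap.
        basis_gb = installed_gb
    elif exchange_version in (2013, 2016):
        # Exchange 2013/2016: installed memory plus 10MB, capped at 32GB plus 10MB.
        basis_gb = min(installed_gb, PAGEFILE_CAP_GB)
    else:
        raise ValueError("only Exchange 2010, 2013, and 2016 are covered here")
    return basis_gb * 1024 + PAGEFILE_PAD_MB


def within_sizing_guidance(cores, memory_gb):
    """True when the server stays at or below 24 cores and 192GB of memory."""
    return cores <= MAX_CORES and memory_gb <= MAX_MEMORY_GB


if __name__ == "__main__":
    for gb in (16, 32, 64, 128):
        print(f"{gb}GB installed -> page file {pagefile_mb(gb, 2016)}MB")
    print("24 cores / 192GB OK?", within_sizing_guidance(24, 192))
    print("32 cores / 256GB OK?", within_sizing_guidance(32, 256))
```

For a server with more than 32GB installed, the 2013/2016 result is always 32,778MB, which matches the figure given in item 3.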

Revised May 30, 2018