Tales of the weird: High CPU on Exchange when using CBA authentication

Here's another one from the 'Tales of the Weird and Unusual' bin here at Microsoft Exchange Support. We recently had a case where a customer was experiencing high CPU on their Exchange 2016 servers. High CPU can happen in an Exchange environment from time to time, but in this case the customer reported that it started within 2 hours of deploying iOS 11 to their organization. We were able to determine very quickly that this was NOT the same issue as the one described in Apple's support article HT208136.

What made this case even weirder is that the customer was also using certificate-based authentication (CBA), like below: (screenshot from my CBA lab)

If we removed CBA, the issue would not reproduce.

When we've seen issues like this in the past, they have often been caused by poorly performing filter drivers, IPS, or even, in one instance, a WAN accelerator. We eliminated these potential causes fairly quickly, as the high CPU was becoming more of a concern.

So we collected a few procdumps and started debugging.

I know some people out there like to try debugging this on their own, and if you do venture into that realm, you may see threads stuck like this:

Child-SP RetAddr Call Site
00 00000075` 00007ff9`ebb113e8 ntdll!ZwDeviceIoControlFile+0x14
01 00000075` 00007ff9`ebb1434d httpapi!HttpApiOverlappedDeviceControl+0x68
02 00000075` 00007ff9`9a8b288a httpapi!HttpReceiveClientCertificate+0x6d
03 00000075` 00007ff9`9a8eed27 w3dt!UL_NATIVE_REQUEST::ReceiveClientCertificate+0x152
04 00000075` 00007ff9`9a8de0c3 iiscore!W3_REQUEST::NegotiateClientCertificate+0x13f
w3dt!UL_NATIVE_REQUEST::ReceiveClientCertificate

In the procdumps, we found threads stuck in http.sys, in SSL decryption, where the device was repeatedly trying to perform an SSL handshake and failing. We have seen this happen from time to time with customers on slow connections, for example when they are doing CBA over a very poor cellular connection (think rural areas, satellite Internet connections, etc.). However, that was not true in this case.
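If you have saved the call stacks from a dump to a text file, a quick tally can show how many of them are parked in the client certificate path. The following is only a minimal Python sketch, not the tooling we used on this case. It assumes each stack starts with a header line containing "Child-SP" and "Call Site", as in the excerpt above. The function names and the sample text are hypothetical, and the sample frames are abbreviated (no addresses).

```python
# Frame name taken from the stack excerpt above. Threads with this frame
# are waiting on a client certificate from the device.
MARKER = "httpapi!HttpReceiveClientCertificate"

# Hypothetical, abbreviated sample: two stacks, one of them in the CBA path.
SAMPLE = """
Child-SP RetAddr Call Site
00 ntdll!ZwDeviceIoControlFile+0x14
01 httpapi!HttpApiOverlappedDeviceControl+0x68
02 httpapi!HttpReceiveClientCertificate+0x6d

Child-SP RetAddr Call Site
00 ntdll!ZwWaitForSingleObject+0x14
"""


def split_stacks(dump_text):
    """Split debugger text into one list of frame lines per stack.

    Assumption: every stack begins with a header line that contains
    both 'Child-SP' and 'Call Site'.
    """
    stacks = []
    current = None
    for line in dump_text.splitlines():
        if "Child-SP" in line and "Call Site" in line:
            current = []              # header line starts a new stack
            stacks.append(current)
        elif current is not None and line.strip():
            current.append(line.strip())
    return stacks


def summarize(dump_text, marker=MARKER):
    """Return (total stacks, stacks that contain the marker frame)."""
    stacks = split_stacks(dump_text)
    hits = sum(1 for frames in stacks if any(marker in f for f in frames))
    return len(stacks), hits


if __name__ == "__main__":
    total, hits = summarize(SAMPLE)
    print(f"{hits} of {total} stacks are waiting on a client certificate")
```

A high share of stacks in this path only suggests where to look. It does not by itself tell you whether the cause is a slow client, a failing handshake, or something else.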

After discussing with our counterparts in Windows, we learned that some performance tweaks to how SSL is handled were made in the October 2017 update for Windows Server 2016 (KB 4041688). We applied KB 4041688, and the issue was resolved upon reboot.

So, if you are doing an iOS 11 deployment, are using CBA, and start to experience high CPU in the MSExchangeSyncApp pool, apply KB 4041688 and your issue may very well be resolved.

Special thanks to Radomir Zaric, Nick Tilton, and Jason Slaughter for their help on this very weird case.