Unable to send messages from Outlook behind Forefront TMG after migrating to Cloud Services

Introduction

Consider a scenario where a client migrated from on-premises Exchange to Exchange Online, and after the migration the users have trouble sending e-mail. During peak hours Outlook clients can’t send e-mail, and messages get stuck in the Outbox. While this issue was happening, event 31212 was also being logged on TMG:

image

One important detail: while this issue was happening, users were able to browse HTTP sites, but not HTTPS sites.

Data Collection

For this scenario we will most likely need:

  • Client: Network Monitor trace on the client
  • Server:
    • TMG Data Packager
    • Perfmon
    • User mode dump

 

Data Analysis

When analyzing data of this nature, add the core OS subsystems (memory, network, processor and disk) to perfmon, as well as the core Forefront TMG components. The diagram below shows an interesting trend: the Memory Pool for SSL Requests counter (black line) starts to decrease, climbs back to 100%, and then suddenly drops to zero.

image

This is exactly the time that users start to experience issues with Outlook getting messages stuck in the Outbox.
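
To find that moment in a long capture, a perfmon log exported to CSV can be scanned for the first sample where the counter reaches zero. The following is a minimal Python sketch and was not part of the original troubleshooting. It assumes the first CSV column holds the timestamp, and the column name in the sample data is a hypothetical placeholder; copy the real counter path from the header row of your own export.

```python
import csv
import io


def first_exhaustion(csv_text, column, threshold=0.0):
    """Return the timestamp of the first sample where `column` <= threshold.

    Assumes the first CSV column holds the timestamp.
    Returns None if the counter never reaches the threshold.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    idx = header.index(column)  # raises ValueError if the column is missing
    for row in reader:
        try:
            value = float(row[idx])
        except (ValueError, IndexError):
            continue  # skip blank or malformed samples
        if value <= threshold:
            return row[0]
    return None


# Hypothetical sample data; the counter name is a placeholder.
sample = (
    '"Time","SSL memory pool (%)"\n'
    '"10:00:00","62.0"\n'
    '"10:00:15"," "\n'
    '"10:00:30","100.0"\n'
    '"10:00:45","0.0"\n'
)
print(first_exhaustion(sample, "SSL memory pool (%)"))
```

The returned timestamp can then be compared with the time users reported messages stuck in the Outbox and with the time event 31212 was logged.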

Solution

This problem happened because TMG was running out of memory pool for SSL requests. To fix it, you need to raise the registry value ProxyVmemAlloc1pSize above its default of 1024. You can follow the guidelines in KB842438 (which also applies to TMG) to adjust this value, or you can install Forefront TMG 2010 SP2 (just released), which changes this value to 4096. In this particular case, after changing the value to 4096 the users no longer experienced the problem, and the server’s perfmon started looking much better even under heavy load, as shown below:

image
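
If you make the registry change by hand rather than through SP2, it can help to keep it in a .reg file so that it can be reviewed before it is imported. The sketch below is one possible way to generate such a file in Python. The key path and the spelling of the value name in the example are assumptions that should be confirmed against KB842438 before use, and the helper name render_reg_file is hypothetical.

```python
def render_reg_file(key_path, value_name, dword_value):
    """Build the text of a .reg file that sets a single DWORD value."""
    if not 0 <= dword_value <= 0xFFFFFFFF:
        raise ValueError("DWORD must fit in 32 bits")
    return (
        "Windows Registry Editor Version 5.00\n"
        "\n"
        f"[{key_path}]\n"
        # .reg files store DWORDs as 8 hexadecimal digits
        f'"{value_name}"=dword:{dword_value:08x}\n'
    )


# ASSUMPTION: key path and value name are illustrative; verify both in KB842438.
KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\W3Proxy\Parameters"
print(render_reg_file(KEY, "ProxyVmemAlloc1pSize", 4096))
```

The decimal value 4096 is written as dword:00001000 because the .reg format expects hexadecimal. Test a change like this outside production first.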

Takeaway

There are a couple of key takeaways regarding this scenario that I want to call out:

  • Don’t go directly to the cloud without proper planning. You might experience issues like the one described in this article and wrongly conclude that the cloud service is causing the problem.
  • Remember that when you start moving your main applications (Exchange, CRM, SharePoint, etc.) to the cloud, the traffic from inside to outside will increase, and your edge device (regardless of which one you use) needs to be ready for that.

Planning is definitely the key to a successful migration, but in order to plan well you really need to know your own environment, your traffic profile and your growth plans. To reduce the impact of the cloud migration, determine these up front and perform the migration in waves (not all users nor all applications at the same time).