Outlook in Online Mode is not working as I expect

This article is meant to treat performance from the network perspective. Please keep in mind that an overall performance analysis also covers Outlook add-ins, antivirus, RAM, CPU, NIC and HDD.

This article reflects my experience during my years as a Support Escalation Engineer at Microsoft, handling performance degradation cases where Outlook is connected to Exchange Online (EXO) in Online Mode. When I refer to the Online Mode scenario, I also include accessing a Shared Mailbox or a Public Folder without having enabled the options "Download shared folders" or "Download Public Folder Favorites".

There are multiple reasons for someone to use Outlook in this mode, but the majority do so because they run Outlook in a Terminal Services environment.

What does Microsoft say about this?

Microsoft recommends using Outlook in Cached Mode for the best performance when using Office 365 (O365) over a slow network.

https://support.office.com/en-us/article/Best-practices-for-using-Office-365-on-a-slow-network-fd16c8d2-4799-4c39-8fd7-045f06640166

But what is a slow network?

It depends on what you’re doing, and everyone has to make their own assessment. You could have a network that is fast for general Internet use but not for other services delivered over the Internet. You could have a network that is fast for Outlook Web App but not for Outlook in Online Mode.

Network quality is assessed by analyzing throughput, which depends on both bandwidth and latency.

https://support.office.com/en-us/article/Office-365-performance-tuning-using-baselines-and-performance-history-1492cb94-bd62-43e6-b8d0-2a61ed88ebae
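To illustrate why bandwidth alone does not decide throughput, here is a minimal sketch in Python. It relies on the well-known upper bound for a single TCP connection, which is the window size divided by the round-trip time. The function name, the 64 KiB window (no window scaling) and the latency values are my assumptions for illustration, not measurements.

```python
def max_tcp_throughput_mbps(window_bytes: int, rtt_ms: float) -> float:
    """Upper bound on single-connection TCP throughput: window / RTT."""
    if rtt_ms <= 0:
        raise ValueError("rtt_ms must be positive")
    bits_per_second = (window_bytes * 8) / (rtt_ms / 1000.0)
    return bits_per_second / 1_000_000


# Assumed values: a 64 KiB window at three example round-trip times.
for rtt in (10, 50, 100):
    bound = max_tcp_throughput_mbps(65536, rtt)
    print(f"RTT {rtt:>3} ms -> at most {bound:.1f} Mbps per connection")
```

With these assumed numbers, a single connection over a 100 ms path stays at roughly 5 Mbps, no matter how fast the line itself is.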

Also, it is imperative to know how much Internet traffic the internal users generate each day, and to try to maintain 25-30% network headroom to accommodate occasional spikes.
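The headroom check can be sketched in a few lines of Python. The function names and the example figures (a 100 Mbps line with an 80 Mbps daily peak) are hypothetical. The 25% target is the lower end of the range mentioned above.

```python
def headroom_percent(link_mbps: float, peak_usage_mbps: float) -> float:
    """Share of the link that is still free at peak usage, in percent."""
    if link_mbps <= 0:
        raise ValueError("link_mbps must be positive")
    return (link_mbps - peak_usage_mbps) / link_mbps * 100.0


def has_enough_headroom(link_mbps: float, peak_usage_mbps: float,
                        target_percent: float = 25.0) -> bool:
    """True when the free share of the link meets the target."""
    return headroom_percent(link_mbps, peak_usage_mbps) >= target_percent


# Hypothetical example: 100 Mbps line, 80 Mbps measured at the daily peak.
print(f"Headroom: {headroom_percent(100, 80):.0f}%")
print("Enough headroom:", has_enough_headroom(100, 80))
```

The peak value has to come from your own monitoring; the sketch only does the arithmetic.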

The links below explain how you can assess your Internet line for Office 365.

https://blogs.msdn.microsoft.com/vilath/2015/08/06/office-365-the-internet-bandwidth-planning/

https://support.office.com/en-us/article/Network-connectivity-to-Office-365-64b420ef-0218-48f6-8a34-74bb27633b10

So, the fact that you have a 100 Mbps Internet line, for which you are paying hundreds of dollars, does not mean that you have a network that can accommodate Outlook in Online Mode.

Why do I have a good experience in OWA but not in Outlook?

It’s not the same type of connection. OWA uses HTTPS and Outlook uses MAPI over HTTP, so the traffic patterns are different.

https://blogs.technet.microsoft.com/exchange/2014/05/09/outlook-connectivity-with-mapi-over-http/

Outlook works by using ROPs.

What is a ROP?

A ROP (remote operation) is basically a request made by Outlook to the Exchange server for a certain action, such as logging on to the mailbox, opening a folder or listing the mailbox content.

ROPs are documented in more technical depth in the articles below:

https://msdn.microsoft.com/en-us/library/hh354786(v=exchg.80).aspx

https://blogs.technet.microsoft.com/mahuynh/2014/09/25/rop-breakdown-by-user/

What interests us is that each event is stamped with the time at which it leaves the Outlook client.

So, what is happening, and why do these ROPs help us find out where the performance degradation is?

Let’s take for example a ropOpenFolder sent by Outlook to Exchange Online.

  1. The packet is stamped with the time at which it leaves the Outlook client, in the format HH:MM:SS:0000000
  2. It traverses your Internal Network --> ISP Network --> O365 Network and reaches the Exchange server
  3. When it hits Exchange, a counter starts; it stops when the packet has been processed by the server.
  4. The Process time is stamped on the packet before it is sent back to the Outlook client.
  5. Outlook receives the packet, subtracts the Process time from the total elapsed time and calculates the time it took to Transmit the packet over the wire

You can see the same thing in Outlook Connection Status by looking at Avg Resp and Avg Proc. In networking terms, Avg Resp – Avg Proc = RTT. But this is more of an overall picture.
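As a minimal sketch of that subtraction in Python: the function name is mine, and the 12 ms processing time in the example is a hypothetical reading, not a value from a real trace.

```python
def transit_ms(avg_resp_ms: float, avg_proc_ms: float) -> float:
    """Time spent on the wire: total response time minus server processing time."""
    if avg_proc_ms > avg_resp_ms:
        raise ValueError("processing time cannot exceed response time")
    return avg_resp_ms - avg_proc_ms


# Hypothetical readings from Outlook Connection Status.
avg_resp = 84   # Avg Resp, in ms
avg_proc = 12   # Avg Proc, in ms (assumed value)
print(f"Time on the network: {transit_ms(avg_resp, avg_proc)} ms")
```

A high Avg Resp combined with a low Avg Proc points to the path between client and server rather than to the server itself.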

In my experience as a Support Escalation Engineer for Exchange Online, you should see approximately the following Avg Resp values:

Best Performance: 40 – 70 ms

Good Performance: 70 – 100 ms

Poor Performance: above 100 ms
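If you collect these values regularly, the ranges above are easy to turn into a small helper. This is only a sketch: the function name is hypothetical, and how the boundary values 70 and 100 are assigned, as well as treating anything below 40 ms as "Best", are my assumptions.

```python
def rate_avg_resp(avg_resp_ms: float) -> str:
    """Map an Avg Resp reading (in ms) to the performance ranges listed above."""
    if avg_resp_ms < 0:
        raise ValueError("avg_resp_ms cannot be negative")
    if avg_resp_ms <= 70:    # assumption: 70 ms still counts as "Best"
        return "Best"
    if avg_resp_ms <= 100:   # assumption: 100 ms still counts as "Good"
        return "Good"
    return "Poor"


# Example readings in milliseconds.
for reading in (60, 84, 250):
    print(f"{reading} ms -> {rate_avg_resp(reading)}")
```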

To get an in-depth view of the situation, we analyze the Advanced Outlook Logging file and check the ROP behavior: the type of ROP, the millisecond at which it took place, and whether it matches the action performed by the user at that time.

What does this mean?

It means that if you have an Outlook client connected to Exchange in Online Mode and you expect an e-mail to open in less than 3 seconds, then you should have a constant Avg Resp between 40 and 70 ms.
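One way to reason about this is to ask how many request/response round trips fit into the time you are willing to wait. The sketch below assumes that the requests are sequential and that each one costs about one Avg Resp. How many ROPs Outlook actually needs to open an e-mail is not something I am stating here, and the function name is hypothetical.

```python
def round_trip_budget(time_budget_ms: float, avg_resp_ms: float) -> int:
    """Number of sequential request/response round trips that fit in the budget."""
    if avg_resp_ms <= 0:
        raise ValueError("avg_resp_ms must be positive")
    return int(time_budget_ms // avg_resp_ms)


# 3 seconds is the expectation mentioned above; the Avg Resp values are examples.
for avg_resp in (40, 70, 100, 300):
    fits = round_trip_budget(3000, avg_resp)
    print(f"Avg Resp {avg_resp:>3} ms -> {fits} round trips in 3 seconds")
```

The higher the Avg Resp, the fewer requests fit into the same 3 seconds, which is why Online Mode is so sensitive to latency.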

So, Transmit means Internal Network --> ISP Network --> O365 Network, but where is the issue?

You should take into consideration that the traffic is MAPI over HTTP, and that it runs over SSL / TLS. So Ping, PsPing, Traceroute and SpeedTest show you just the surface of the network, based on ICMP requests or synthetic test traffic, not the actual Outlook traffic as it is filtered by firewalls, routers, load balancers etc.
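A test that gets a little closer to the real traffic than ICMP is to time a TCP connection plus TLS handshake to the endpoint on port 443. Below is a minimal sketch using only the Python standard library. The function names are hypothetical, and keep in mind that this still measures only connection setup, not MAPI over HTTP requests.

```python
import socket
import ssl
import statistics
import time


def tls_connect_ms(host: str, port: int = 443, timeout: float = 5.0) -> float:
    """Time one TCP connect plus TLS handshake, in milliseconds."""
    context = ssl.create_default_context()
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout) as raw_sock:
        # wrap_socket performs the TLS handshake before it returns.
        with context.wrap_socket(raw_sock, server_hostname=host):
            elapsed = time.perf_counter() - start
    return elapsed * 1000.0


def summarize(samples_ms: list) -> dict:
    """Minimum, median and maximum of a list of timings."""
    return {
        "min": min(samples_ms),
        "median": statistics.median(samples_ms),
        "max": max(samples_ms),
    }


def probe(host: str, count: int = 10) -> dict:
    """Run several handshakes against the host and summarize the timings."""
    return summarize([tls_connect_ms(host) for _ in range(count)])
```

Calling `probe("outlook.office365.com")` from the affected machine, at different times of the day, gives a rough idea of how stable the path is through your firewall or proxy.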

Honestly, it is almost impossible to find out how long a ROP packet took on each network along the way to the Exchange server and back. This would require the packet to be stamped with a date and time on each network. I haven’t found a method to do that yet. Not to mention that the traffic would have to be decrypted.

Microsoft, I have an Avg Resp between 100 and 300 ms, but the issue is on your network.

Office 365 is a shared platform, meaning that all users connect to the same endpoints and use the same routes inside the Office 365 network before ending up on a server that is shared with others as well. What I am saying is that if this were indeed an issue inside the Microsoft network, it would become a Service Incident in seconds.

From a network design perspective, Microsoft operates one of the leading networks in the world. The ongoing investments aim to bring the Office 365 network closer to each ISP and to maintain 50% network headroom on all internal circuits. For this, Microsoft has peering with many ISPs, and the internal network is designed to accommodate occasional spikes. So network congestion is not a likely explanation.

Also, the Office 365 internal network is predictable, meaning that traffic inside it will always show the same behavior, latency and routing path.

What is Peering?

If you suspect that the issue is in Office 365, this is easy to exclude.

Outlook will always make connections to the following endpoints:

  • autodiscover.outlook.com
  • autodiscover-s.outlook.com
  • outlook.office365.com

Depending on the location of your DNS, the names will be resolved accordingly. For example, if it is in EMEA, outlook.office365.com resolves to outlook-emeaXXXX.office365.com.

These names are then resolved to IPs located in Office 365 datacenters. For EMEA, these are in Finland, Ireland, Austria, the Netherlands or the United Kingdom.

Test to exclude the network

  1. From CMD type nslookup <Office 365 endpoint>

          Example: nslookup outlook.office365.com

  2. Select the IPv4 IPs and look them up on the Internet to see in which country or city they are. By the way, if you see some of them reported in the US, this is not true. It is just how those sites report them, based on the owning company.
  3. Add static DNS entries in the Windows hosts file for autodiscover.outlook.com, autodiscover-s.outlook.com and outlook.office365.com, using one IP at a time, to exclude them one by one and to see on which one you get better performance. Is it an IP from Finland? Is it one from Austria?
  4. From CMD you will need to type ipconfig /flushdns after each test to empty the DNS cache.
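The steps above can be sketched in Python, to avoid typing the hosts entries by hand. The function names are hypothetical. The script only prints candidate lines; editing the hosts file and flushing the DNS cache remain manual steps.

```python
import socket

# The three endpoints listed earlier in this article.
ENDPOINTS = (
    "autodiscover.outlook.com",
    "autodiscover-s.outlook.com",
    "outlook.office365.com",
)


def resolve_ipv4(name: str) -> list:
    """Return the IPv4 addresses that the local DNS resolver gives for a name."""
    infos = socket.getaddrinfo(name, 443, family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


def hosts_lines(ip: str, names=ENDPOINTS) -> list:
    """Build hosts-file lines that pin every endpoint to one IP under test."""
    return [f"{ip} {name}" for name in names]


def print_candidates(name: str = "outlook.office365.com") -> None:
    """Print one block of hosts entries per resolved IP, to be tested one by one."""
    for ip in resolve_ipv4(name):
        print(f"# Test with {ip}")
        print("\n".join(hosts_lines(ip)))
```

Run `print_candidates()` on the affected machine, paste one block at a time into the hosts file, and remove the entries once the test is finished.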

Test to exclude the servers

  1. From PowerShell check where your mailbox is currently located.
  • Get-Mailbox "Mailbox" | fl ServerName
  2. Perform a mailbox move.
  3. Your mailbox will be moved to a different server or location.

If you consistently see better results on other endpoints, you will need to discuss the routing path to those IPs with your ISP.

Returning to what is good and what is bad, and whether your network is designed for Outlook in Online Mode or not, I will let you judge for yourself based on the data below, taken from my environment hosted in Azure on a basic subscription with limited resources.

The environment is composed of an Exchange 2016 server and a domain-joined Windows machine.

What we are interested in is the Avg Resp on the Exchange Mail connection type. The results are just "snapshots" of what happened during a 5-minute monitoring period.

How did I end up with those results? Is it mandatory to have that network connectivity or bandwidth? I don’t know, and it is hard to tell, but those are the values.

Your challenge is to have the same :)

Scenario 1 - Mailbox located on a local Exchange 2016 server + Outlook 2016 + 10 Gbps link speed inside my LAN

[Screenshot: outlok-server-1]

Scenario 2 - Mailbox A located in Office 365 + Outlook 2016 + 1 Gbps Internet link speed

Get-Mailbox shows that it is located on amspr06mb501, so in Amsterdam.

nslookup outlook.office365.com shows that I got resolved to outlook-emeacenter2.office365.com

[Screenshot: nslookup]

Netmon shows that I got connected to 40.101.62.34, which is in Amsterdam

[Screenshot: netmon1]

SpeedTest to an ISP in Amsterdam shows 2 ms latency and 680 Mbps download speed

[Screenshot: speedazure]

Outlook Connection Status shows an Avg Resp on Exchange Mail of 60 ms

[Screenshot: outlookconenctinams]

Scenario 3 - Mailbox B located in Office 365 + Outlook 2016 + 1 Gbps Internet link speed

Get-Mailbox shows that it is currently located on vi1pr06mb1680, so in Vienna.

nslookup outlook.office365.com shows that I got resolved to outlook-emeacenter3.office365.com

[Screenshot: nslookup1]

Netmon shows that I got connected to 40.101.61.130, which is in Amsterdam

[Screenshot: netmon2]

SpeedTest to an ISP in Vienna shows 24 ms latency and 636 Mbps download speed

[Screenshot: speedazurevienna]

Outlook Connection Status shows an Avg Resp on Exchange Mail of 84 ms

[Screenshot: connectionstatusvienna]

One more thing I want to mention is that latency can be improved only within the limits set by your network, the ISP network and, of course, the Office 365 network. You can optimize the path but not break the laws of physics.

Also, the fact that "I can reproduce the same on my home network" does not stand as proof, because your home network guarantees that you can access the Internet, not that it is predictable. Using the home network means that you have excluded the firewall, proxy or any other device that filters the traffic. Not to mention that this is usually done over Wi-Fi, so the signal may vary.

Predictable means that I will have the same performance each time, regardless of the time or user activity, as is the case with ExpressRoute for Office 365.