So why on Earth did you do that stupid ‘push notification’ thingy in Exchange, and why is it so NAT unfriendly?


JC Hillerman left a comment on my bio:


I heard that Exchange was purchased by Microsoft, but it seems clear that you were working on new development. Maybe it was just the MTA that was purchased? I also heard that Exchange was initially written in Pascal and was ported to C using a translator, and that's why it took so long to come out with the 5.0 version. Why was Exchange started with version 4.0? And who DID write that NAT unfriendly push notification (New Message Notification)? Why didn't they just utilize one of the two open TCP connections from the client to the Exchange server to send a push notification?

Thanks!


 

I can’t speak to most of his issues (actually I can, but I’ll leave that to someone else :-) ), but I CAN speak to the questions about push notification.  The answer’s easy: "Me".  I realize now that my biography’s ambiguous: I didn’t indicate that I did push notification for Exchange 4.0, and I should have made that clearer :-).


So what’s the deal with this whole push notification thingy anyway?  Yeah, I know Joel says that nobody outside Microsoft starts sentences with "so", but I do.  Basically push notification is the word we used for the feature in Exchange that allowed the server to tell the client that there was new mail available for the user (to be more accurate, it was a mechanism for the server to tell the client that "something happened on the server that you might be interested in").


Up until that point, all the existing email clients polled their server.  Every minute, they’d send a request to the server saying "Hey, has anything happened?"  There are two big problems with this.  First, it means that the system doesn’t respond to realtime events - for example, if you started a search, the user wouldn’t see search results for tens of seconds, even though the server had search results within milliseconds.  The second is that it’s bandwidth intensive.


When we do bandwidth analysis of this kind of problem, we need to make some assumptions about the number of email messages a person receives.  For the sake of argument, let’s assume that each user receives 50 email messages a day, and sends 10.  So there will be 50 "interesting" events that occur during the course of an 8 hour business day.  There are 480 minutes in 8 hours, so that’s roughly one message every 10 minutes (since email traffic tends to be bursty, this isn’t 100% accurate, but it works).  What this means is that 90% of the "Hey, has anything happened?" requests come back "No."


Let’s assume the "Hey, has anything happened?" request takes up 100 bytes and one TSDU (a "transport service data unit", roughly a packet on the wire).  The response "Nope, nothing happened" takes another 100 bytes and again one TSDU.  This means that each client sends 200 bytes of traffic and 2 TSDUs every minute, and 90% of the time it’s for nothing.  Every hour, that’s 54 pointless polls: 10,800 unnecessary bytes and 108 TSDUs.  That’s for each client.  If you have 10,000 users in your organization (a medium size organization), this means that you’re sending 108 megabytes of unnecessary data and 1.08 million extra TSDUs each hour.
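The arithmetic above is easy to check with a few lines of Python. This is only a back-of-the-envelope sketch using the post's own assumptions (one poll a minute, 100 bytes each way, one "interesting" event every 10 minutes); the constant and function names are made up for illustration.

```python
# Back-of-the-envelope polling cost, using the assumptions stated in the post.
POLLS_PER_HOUR = 60          # one "has anything happened?" poll per minute
BYTES_PER_POLL = 100 + 100   # 100-byte request + 100-byte "nope" response
TSDUS_PER_POLL = 2           # one TSDU in each direction
MINUTES_PER_MESSAGE = 10     # 50 messages over 480 minutes, rounded


def wasted_per_hour(users):
    """Return (bytes, tsdus) spent per hour on polls that come back empty."""
    useful_polls = POLLS_PER_HOUR // MINUTES_PER_MESSAGE   # 6 polls find something
    empty_polls = POLLS_PER_HOUR - useful_polls            # 54 polls find nothing
    return (users * empty_polls * BYTES_PER_POLL,
            users * empty_polls * TSDUS_PER_POLL)


print(wasted_per_hour(1))        # one client
print(wasted_per_hour(10_000))   # a medium size organization
```

For a single client this gives 10,800 bytes and 108 TSDUs an hour; for 10,000 clients it gives 108,000,000 bytes and 1,080,000 TSDUs.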


This is a big deal, especially back in 1994, when most networks were either 10Mbps Ethernet or token ring.  Also, many of our customers were running unroutable protocols like NetBEUI, which meant that they didn’t even have the option of segmenting their network to reduce traffic.  Adding hundreds of megabytes of extra traffic to a network could cripple it.  So we needed a mechanism to remove this load from the network.


The solution we came up with was "Push Notification".  The idea was that the client would register that it was interested in notification with the server, then start listening for a response.  When the server had something "interesting" to tell the client, it would send a ping to the client that said "Hey, something happened, you should look".  When the client received this ping from the server, it would then issue the same "Hey, has anything happened?" message, and it’d get a list of the things that had changed.
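The register / ping / fetch sequence can be sketched as a toy in-memory model. To be clear, this is not Exchange's protocol or API: the `Server` and `Client` classes and their method names are hypothetical, and the "ping" is just a function call standing in for a message on the wire. The point it illustrates is that the ping carries no content, and the client only asks "has anything happened?" when it has been told there is something to fetch.

```python
class Server:
    """Toy server: queues events per client and pings registered clients."""

    def __init__(self):
        self.pending = {}   # client id -> events the client has not fetched yet
        self.pingers = {}   # client id -> callable that delivers the ping

    def register(self, client_id, ping):
        self.pending[client_id] = []
        self.pingers[client_id] = ping

    def something_happened(self, client_id, event):
        self.pending[client_id].append(event)
        self.pingers[client_id]()   # content-free "you should look" ping

    def has_anything_happened(self, client_id):
        events, self.pending[client_id] = self.pending[client_id], []
        return events


class Client:
    """Toy client: registers once, then only asks the server when pinged."""

    def __init__(self, client_id, server):
        self.client_id = client_id
        self.server = server
        self.seen = []
        self.requests_sent = 0
        server.register(client_id, self.on_ping)

    def on_ping(self):
        self.requests_sent += 1
        self.seen.extend(self.server.has_anything_happened(self.client_id))


server = Server()
alice = Client("alice", server)
server.something_happened("alice", "new mail")
print(alice.seen, alice.requests_sent)
```

In this model the number of requests the client sends equals the number of pings it received, so an idle mailbox generates no traffic at all.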


There were a bunch of unexpected benefits that came from implementing push notification. For example, searches felt much faster to the user.  This is (roughly) because an Outlook/Capone search is done by creating a search folder on the server.  The store on the server scans the database and adds the messages that match the search criteria to that search folder.  And as each message is added to the folder, it generates a push notification to the client, which retrieves the current state of the folder.  Without push notification, the client polled the server every 5 or 10 seconds, and the search results appeared in bursts.  With push notification, the messages appeared in the search folder smoothly.  There were several reviews of Exchange that gave Exchange’s search functionality high marks just because of this feature!
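To see why polled search results arrive in bursts, here is a small simulation. The hit times are invented for illustration, and the model ignores network latency: with polling, a result found at time t only becomes visible at the next poll boundary, while with push it is visible (more or less) when the server finds it.

```python
import math


def visible_times(found_times, poll_interval=None):
    """Return when each search hit becomes visible to the user, in seconds.

    poll_interval=None models push notification (visible as soon as found);
    otherwise each hit waits for the next poll at a multiple of poll_interval.
    """
    if poll_interval is None:
        return list(found_times)
    return [math.ceil(t / poll_interval) * poll_interval for t in found_times]


found = [0.2, 0.9, 1.7, 3.1, 6.4]             # hypothetical times the store finds hits
print(visible_times(found, poll_interval=5))  # [5, 5, 5, 5, 10]
print(visible_times(found))                   # [0.2, 0.9, 1.7, 3.1, 6.4]
```

With a 5 second poll, the first four hits all show up at once at the 5 second mark. With push they trickle in as they are found.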

 

Ok, now that I’ve sold you on how cool Push Notification is, on to the design (which is the meat of what JC’s asking about).

 

To make push notification as lightweight as possible, I designed the solution to use a datagram to send the "ping" message.  The idea was that since the message was advisory, if the routers dropped the datagram it wouldn’t be a big deal.  In addition, we’d be able to get this working with our 16 bit clients (remember, this is 1994 - Win95 hadn’t shipped yet).  Our DOS and Win16 clients were sufficiently resource constrained that we couldn’t add anything as heavy-weight as a connection - the Trumpet Winsock client (a very popular shareware TCP/IP stack for Win16) had a limit of 3 or 4 TCP connections, for example.  And since DOS and Win16 clients were single-threaded, whatever mechanism we used for notification needed to work asynchronously: we didn’t have the luxury of creating threads on the client to process the requests.  Again, using a datagram worked, since all of our target client protocols supported asynchronous datagram reception.

When the client wanted to register for push notifications, it configured itself to receive the datagram (by issuing an NCBDGRECV, or whatever).  It then sent the server enough information to identify the client (as a sockaddr structure), and the server remembered this address.  When the server had something "interesting" to send to the client, it issued a sendto() socket command to send the "hey, something happened" message to the client.
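Here is a minimal sketch of that datagram exchange over UDP on the loopback interface, using Python's standard `socket` module as a stand-in for the NetBIOS and Winsock calls mentioned above. The function name and payload are invented, and the "registration" is simplified to handing the client's address straight to the server side; the real clients sent it over their existing RPC channel.

```python
import socket


def demo_push_ping(timeout=2.0):
    """Send one advisory "ping" datagram from a server socket to a client socket."""
    # Client side: bind a datagram socket and get ready to receive the ping.
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.bind(("127.0.0.1", 0))             # port 0: let the OS pick a free port
    client.settimeout(timeout)                # the ping is advisory; don't wait forever
    registered_addr = client.getsockname()    # the address the client hands to the server

    # Server side: remember the address, and later send a datagram to it.
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        server.sendto(b"hey, something happened", registered_addr)
        payload, _sender = client.recvfrom(1024)
        return payload
    finally:
        server.close()
        client.close()


print(demo_push_ping())
```

Note that the server never opens a connection to the client. It just fires a datagram at whatever address it was given, which is exactly what makes the scheme cheap and also what makes it fragile when that address is not reachable.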

 

This, by the way, is why this didn’t work over NAT connections - if you’re behind a firewall that supports NAT, the IP address of the client isn’t the public IP address of the machine, because the firewall hides the local addresses.  So the address that the client sent to the server had no relationship to the address that the server could use to respond.  Back when we designed this, the first NAT RFC (RFC1631) had been out for only 6 months as an informational RFC (which means it wasn’t a standard, and thus was only available on a couple of networking products).  The standards track RFC for NAT didn’t get published until 2000.  NAT wasn’t even a consideration when we designed it; the idea of using Exchange over a protocol like HTTP hadn’t even occurred to us (heck, the HTTP 1.1 protocol standard didn’t exist until 1997).
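The mismatch can be illustrated with a toy model of a NAT. Everything here is hypothetical: the `Nat` class, the port numbering, and the addresses (a private 192.168.x.x address and a documentation-range public address). It is a sketch of the general idea, not of any particular firewall.

```python
class Nat:
    """Toy NAT: rewrites outbound source addresses and remembers the mapping."""

    def __init__(self, public_ip):
        self.public_ip = public_ip
        self.table = {}          # public port -> private (ip, port)
        self.next_port = 40000   # arbitrary starting port for this sketch

    def outbound(self, private_src):
        """Return the source address the outside world sees for this packet."""
        port = self.next_port
        self.next_port += 1
        self.table[port] = private_src
        return (self.public_ip, port)

    def inbound(self, dest):
        """Return the private host a packet to dest reaches, or None if dropped."""
        ip, port = dest
        if ip != self.public_ip:
            return None          # not addressed to the NAT at all
        return self.table.get(port)


nat = Nat("198.51.100.7")
client_local = ("192.168.1.10", 5000)      # the address the client puts in its registration
observed = nat.outbound(client_local)      # the address the server actually sees traffic from

print(nat.inbound(client_local))   # None: the registered address is meaningless outside
print(nat.inbound(observed))       # ('192.168.1.10', 5000)
```

The server was handed the first address, so its ping never arrives. The second lookup only shows that a mapping exists for traffic the client itself initiated; it should not be read as a claim about how the design could have been fixed.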


The reason we didn’t use the existing TCP connections was two-fold.  First, and most important, not all of our clients ran TCP/IP.  In fact, the majority of our clients for Exchange 4.0 ran either NetBEUI or IPX/SPX.  On IPX, you didn’t even HAVE a connection; it was a datagram based RPC protocol.  Second, the two connections were used for the address book provider (emsabpXX.dll) and the store provider (emsmdbXX.dll) for their RPC traffic to the server.


At the time, RPC barely supported asynchronous operations (it was possible but not on all platforms), and in order to use the existing connection, we would have had to totally rearchitect the communication protocol used between the store and the client.  It was unclear if it would even be possible to support this on clients like MS-DOS or Windows.  So using the existing store TCP connection wasn’t really an option for us at the time.

 

If we were to re-do push notification, I’m sure we’d come up with something that was more NAT friendly, probably using async RPC.  The good news is that the 16 bit issues that guided almost all of the design criteria associated with this feature are now long dead.

- Larry Osterman

Comments (5)
  1. jesse says:

    How does this relate to Exchange 2k3 and Outlook 2k3 in cached mode? Is the same push notification still in use today?

  2. Larry Osterman says:

    I believe that it is still in use, but I’m not 100% sure.

  3. E2k3 <-> O2k3, cached mode or not, still uses the same UDP notifications, though we’ve made a couple tweaks in the implementation to make it a tad more robust.

    In the original implementation, any random packet sent to the termination port would cause the client to stop listening for notifications. We fixed this by requiring a specially formatted packet for shutdown:

    http://support.microsoft.com/?kbid=329415

    http://support.microsoft.com/?kbid=329024

    And on the Outlook side, we’ve added a ForcePolling key to bypass the whole shebang if you’re behind a NAT:

    http://support.microsoft.com/?kbid=305572

    I’m not sure what we do in the RPC over HTTP case though.

  4. Andy says:

    1. It would sure be nice if the Outlook folks had a way to set the port used to a static high port (that could be set via GPO) so that ipsec/port filtering and other things would deal with it without having to resort to polling.

    2. Awesome post, Larry.

  5. KC Lemson says:

    Push notification is not used in rpc/http, the client polls periodically.

