Understanding NAT When Setting Up Lync, Part 2 - STUN and TURN Explained

Written by Joe Lefort, Senior Microsoft Premier Field Engineer.

Better late than never. I finally got a bit of free time to try to explain part two of the ICE NAT and media stuff I began a while back: STUN and TURN.

First off, not all NATs are the same, and that is the key to whether STUN or TURN gets used when setting up media traversal with Lync. I am going to attempt to define the relevant NAT types here so you can understand why STUN or TURN gets chosen when ICE occurs. (For the network nerds out there: I know these are somewhat simplistic definitions and don’t cover every real scenario. This is meant to educate, not intimidate, so I took this route).

NAT Types

Full cone NAT: my internal socket is NAT mapped to an unchanging external socket that is ‘open’ to all. For example, if my client on 10.1.2.3 sends a request out on port 53098 (the socket is 10.1.2.3:53098), the NAT converts that to an external socket of 204.45.15.12:1802. Any external device can communicate to my client via my external socket (204.45.15.12:1802), which gets mapped back to my internal socket (10.1.2.3:53098) by the NAT.

[Figure: Full cone NAT]

Address restricted cone NAT: my internal socket is NAT mapped to an external socket that is unchanging and open to only the IP address of the destination in my original request. For example, if my client on 10.1.2.3 sends a request out on port 53098 (the socket is therefore 10.1.2.3:53098) for a web page on 56.56.55.54 (socket being 56.56.55.54:80), the NAT converts the internal socket to an external socket of 204.45.15.12:1802. Only the IP that I was originally making a request of (56.56.55.54) can communicate to my client via my external socket (204.45.15.12:1802), which gets mapped back to my internal socket (10.1.2.3:53098) by the NAT.

[Figure: Address restricted cone NAT]

Port restricted cone NAT: my internal socket is NAT mapped to an external socket that is unchanging and open to only the socket of the destination in my original request. For example, if my client on 10.1.2.3 sends a request out on port 53098 (the socket is therefore 10.1.2.3:53098) for a web page on 56.56.55.54 (socket being 56.56.55.54:80), the NAT converts the internal socket to an external socket of 204.45.15.12:1802. Only the socket that I was originally making a request of (56.56.55.54:80) can communicate to my client via my external socket (204.45.15.12:1802), which gets mapped back to my internal socket (10.1.2.3:53098) by the NAT.

[Figure: Port restricted cone NAT]

Symmetric NAT: my internal socket is NAT mapped to an external socket that changes with each new destination I send to and is open only to the socket of that destination. For example, if my client on 10.1.2.3 sends a request out on port 53098 (the socket is therefore 10.1.2.3:53098) for a web page on 56.56.55.54 (socket being 56.56.55.54:80), the NAT converts the internal socket to an external socket of 204.45.15.12:1802 for this request. Only the socket that I was originally making a request of (56.56.55.54:80) can communicate to my client via my external socket (204.45.15.12:1802). I then make a second request from the same internal socket to a different socket on the same external host (say 56.56.55.54:443), and the NAT creates a new external socket (204.45.15.12:1803), which is only open to the target of this latest request.

From a security standpoint, symmetric NAT is the tightest type of NAT as a socket is only made available to a specific host for a specific request. On the other end, full cone is the most open as anyone can get to a socket once exposed.
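To make the four definitions concrete, here is a minimal Python sketch of a toy NAT that applies the same simplified rules described above. The class name Nat, its two methods, and the test addresses are all made up for illustration; no real NAT is this simple, and it ignores timeouts, protocols, and port exhaustion.

```python
from typing import Dict, Set, Tuple

Socket = Tuple[str, int]  # (ip, port)


class Nat:
    """Toy model of the four NAT types (hypothetical, simplified)."""

    def __init__(self, kind: str, public_ip: str = "204.45.15.12"):
        self.kind = kind  # "full", "address", "port", or "symmetric"
        self.public_ip = public_ip
        self._next_port = 1802
        self._mappings: Dict[tuple, Socket] = {}
        self._contacted: Dict[Socket, Set[Socket]] = {}

    def send(self, internal: Socket, dest: Socket) -> Socket:
        """Outbound packet: return the external socket the NAT uses."""
        # A symmetric NAT keys the mapping on the destination as well,
        # so each new destination gets a new external socket.
        key = (internal, dest) if self.kind == "symmetric" else (internal,)
        if key not in self._mappings:
            self._mappings[key] = (self.public_ip, self._next_port)
            self._next_port += 1
        external = self._mappings[key]
        self._contacted.setdefault(external, set()).add(dest)
        return external

    def accepts(self, external: Socket, source: Socket) -> bool:
        """Inbound packet: would the NAT pass it to the internal client?"""
        contacted = self._contacted.get(external)
        if contacted is None:
            return False  # no mapping exists for this external socket
        if self.kind == "full":
            return True  # open to all
        if self.kind == "address":
            return any(ip == source[0] for ip, _ in contacted)
        # Port restricted and symmetric: the exact socket must match.
        return source in contacted
```

With this model, Nat("address") lets 56.56.55.54 answer from any port after the client has sent to 56.56.55.54:80, while Nat("port") only accepts the answer from 56.56.55.54:80 itself.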

Now that you (hopefully) understand a bit about the types of NAT, let’s launch into STUN and TURN.

STUN and TURN

There are RFCs that define STUN and TURN and these can be found here:

STUN (Session Traversal Utilities for NAT):

  • RFC 3489 (this is an obsolete RFC, but I include it as a reference)
  • RFC 5389 (this RFC also explains why 3489 was better in theory than in practice)

From a Communicator client perspective, the STUN setup is as follows: The client has built-in logic that says effectively “I am not sure that I am visible from the outside, but I know there is a STUN server (the Lync Edge server) around that I can talk to, in order to increase my chances of being visible. I am going to try to BIND to it over UDP 3478 / TCP 443.” After receiving the bind request, the STUN server replies to the client with a payload that includes the socket it saw as the sender of the request. The client then examines the reply. If the sender socket in the server reply is the same as the internal client socket, there is no NAT in play between the client and the STUN server. If the reply’s sender socket is different from the internal socket, there is a NAT in play (technically, the sender socket in the reply is known as the server reflexive address). In either case, the Communicator client remembers the sender socket in the reply and provides it to other parties when media needs to be sent (in the form of an ICE candidate in the SDP payload).

In summary: During the BIND, the STUN server acts like an external client, in that it responds to whatever socket it sees the request come from. The BIND response carries payload data identifying the source socket seen. The client saves that socket as an ICE candidate.

Effectively, the BIND action checks to see if the NAT punches a consistent hole through for the original client.
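For the curious, the BIND exchange can be sketched in a few lines of Python. This follows the generic RFC 5389 message layout (20-byte header, XOR-MAPPED-ADDRESS attribute, IPv4 only); the Lync Edge speaks Microsoft's own flavor of the protocol, so treat this as an illustration of the idea and not as what Communicator actually puts on the wire. The function names are my own.

```python
import os
import socket
import struct
from typing import Optional, Tuple

MAGIC_COOKIE = 0x2112A442      # fixed value defined by RFC 5389
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
XOR_MAPPED_ADDRESS = 0x0020


def build_binding_request(transaction_id: Optional[bytes] = None) -> bytes:
    """Build a bare Binding Request: 20-byte header, no attributes."""
    tid = transaction_id or os.urandom(12)
    return struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE) + tid


def parse_reflexive_address(response: bytes) -> Optional[Tuple[str, int]]:
    """Return the (ip, port) the server saw, or None if it is not present."""
    if len(response) < 20:
        return None
    msg_type, length, cookie = struct.unpack("!HHI", response[:8])
    if msg_type != BINDING_SUCCESS or cookie != MAGIC_COOKIE:
        return None
    offset, end = 20, min(20 + length, len(response))
    while offset + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", response[offset:offset + 4])
        value = response[offset + 4:offset + 4 + attr_len]
        if attr_type == XOR_MAPPED_ADDRESS and len(value) >= 8 and value[1] == 0x01:
            xport, xaddr = struct.unpack("!HI", value[2:8])
            # Port and address are XORed with the magic cookie on the wire.
            port = xport ^ (MAGIC_COOKIE >> 16)
            ip = socket.inet_ntoa(struct.pack("!I", xaddr ^ MAGIC_COOKIE))
            return ip, port
        offset += 4 + attr_len + (-attr_len % 4)  # attributes pad to 4 bytes
    return None


def nat_in_play(local: Tuple[str, int], reflexive: Tuple[str, int]) -> bool:
    """A NAT sits in the path if the server saw a different socket."""
    return local != reflexive
```

In use, the client would send the request over UDP from its local socket, pass the reply to parse_reflexive_address, and keep the result as a candidate whether or not nat_in_play returns True.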

TURN (Traversal Using Relays around NAT):

TURN is an extension to STUN, where the Communicator client uses the TURN server (the Lync Edge) as a RELAY (proxy) to allow media traversal over a NAT that does not do the “consistent hole punch” required by STUN traffic. First, the client establishes a single continuous session to the TURN server (which creates a “consistent hole punch” for that session). The TURN server then sets up a direct connection from itself to the desired endpoint, relaying media between the client and the endpoint as needed.

It is important to note that the TURN server needs to have a public IP address, or it might end up behind a NAT that does not provide a “consistent hole punch” from the TURN server to the endpoint. You know where that will end up...

Since the TURN server is able to establish a direct connection to the endpoint and the client has a 'permanent' connection to the TURN server, TURN simply works. This is because the TURN server is visible publicly, and will always provide a specific socket to the endpoint. There are downsides to this though:

  • The TURN server is another (potentially unnecessary) hop in the media chain. Hops add complexity and latency. Not ideal for real-time communications.
  • The TURN server needs to be able to handle all the media traffic flowing through it, which can be a lot more traffic than a single client would see/generate. That netbook just won't cut it as a TURN server.

For these reasons TURN is less preferred than STUN.
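Here is a rough Python sketch of the relay idea, purely as a mental model. The TurnRelay class, its method names, and the port numbers are invented for illustration; a real TURN server also deals with authentication, permissions, lifetimes, and actual network I/O, none of which appear here.

```python
from typing import Dict, Tuple

Socket = Tuple[str, int]  # (ip, port)


class TurnRelay:
    """In-memory model of a relay (hypothetical, no real networking)."""

    def __init__(self, public_ip: str, first_port: int = 50000):
        self.public_ip = public_ip
        self._next_port = first_port
        self._relayed_for: Dict[Socket, Socket] = {}  # client -> relayed socket
        self._client_for: Dict[Socket, Socket] = {}   # relayed socket -> client
        self.bytes_relayed = 0  # every media byte passes through the relay

    def allocate(self, client: Socket) -> Socket:
        """Give the client one stable public socket on the relay."""
        if client not in self._relayed_for:
            relayed = (self.public_ip, self._next_port)
            self._next_port += 1
            self._relayed_for[client] = relayed
            self._client_for[relayed] = client
        return self._relayed_for[client]

    def client_to_peer(self, client: Socket, peer: Socket, payload: bytes):
        """Return (source, destination, payload) of the packet the relay emits."""
        self.bytes_relayed += len(payload)
        return self._relayed_for[client], peer, payload

    def peer_to_client(self, relayed: Socket, payload: bytes):
        """Return (destination, payload) for media arriving at a relayed socket."""
        self.bytes_relayed += len(payload)
        return self._client_for[relayed], payload
```

The bytes_relayed counter is there to make the second downside visible: the relay carries every packet in both directions for every client it serves.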

...in the pudding

If you have run any Snooper traces of ICE traffic and looked at the candidate lists, the preference values are those 'random' numbers listed along with the candidate socket information: the higher the number, the more preferred the candidate. And you thought those were just random numbers. Nope.
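To show where numbers like that can come from, here is the priority formula from the ICE RFC (RFC 5245, section 4.1.2.1) in Python, using the type preferences that RFC recommends. Whether Lync uses exactly these preferences is an assumption on my part, so the values in your own traces may differ.

```python
# Type preferences recommended by RFC 5245; Lync's actual values are
# not confirmed here and may differ.
TYPE_PREFERENCE = {
    "host": 126,   # the client's own local socket
    "srflx": 100,  # server reflexive: the socket learned via STUN
    "relay": 0,    # relayed: the socket allocated on the TURN server
}


def candidate_priority(kind: str, local_preference: int = 65535,
                       component_id: int = 1) -> int:
    """priority = 2^24 * type pref + 2^8 * local pref + (256 - component ID)."""
    return ((TYPE_PREFERENCE[kind] << 24)
            + (local_preference << 8)
            + (256 - component_id))


# Highest priority first: host, then STUN (srflx), then TURN (relay).
ranked = sorted(TYPE_PREFERENCE, key=candidate_priority, reverse=True)
```

With these inputs the relayed candidate always lands at the bottom of the list, which matches the point above that TURN is the least preferred option.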

Pulling It All Together

From a Lync media traversal standpoint, any of the cone NATs allow media traversal via STUN. Because there is a “consistent hole punch” (the external socket on the NAT that maps to my internal socket, as reported back by the STUN server) for the other party to communicate to, media can pass the NAT, meaning media content gets to where it needs to be. Symmetric NAT is not good for STUN based media traversal though, because the external socket changes from one request to the next, so the socket the STUN server reported is not the one the other party would need to reach. This means TURN becomes the only viable option. Remember though that use of TURN means another hop and a lot of processing on the TURN server.
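The same reasoning can be written as a small Python sketch: compare the external socket the NAT hands out when the client talks to the STUN server with the one it hands out when the client talks to the other party. The function names and the Edge and peer addresses are hypothetical, and the model keeps the simplified NAT definitions used in this post.

```python
from typing import Dict, Tuple

Socket = Tuple[str, int]  # (ip, port)


def external_socket(nat_kind: str, internal: Socket, dest: Socket,
                    table: Dict[tuple, Socket],
                    public_ip: str = "204.45.15.12",
                    first_port: int = 1802) -> Socket:
    """Return the external socket the NAT maps this flow to."""
    # Cone NATs reuse one mapping per internal socket; a symmetric NAT
    # creates a new mapping for each destination.
    key = (internal, dest) if nat_kind == "symmetric" else (internal,)
    if key not in table:
        table[key] = (public_ip, first_port + len(table))
    return table[key]


def traversal_method(nat_kind: str) -> str:
    """STUN works only if the advertised socket is the one the peer must use."""
    table: Dict[tuple, Socket] = {}
    client = ("10.1.2.3", 53098)
    edge = ("198.51.100.10", 3478)   # hypothetical STUN/TURN server address
    peer = ("203.0.113.7", 50000)    # hypothetical other party
    advertised = external_socket(nat_kind, client, edge, table)
    needed = external_socket(nat_kind, client, peer, table)
    return "STUN" if advertised == needed else "TURN"
```

For "full", "address", and "port" the two sockets match and the function returns "STUN"; for "symmetric" the NAT issues a second external socket and the answer is "TURN".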

Hopefully this clears up the mysteries of STUN and TURN for those of you who are interested. For those of you who are not, how did you get this far into the posting? Was it a dare? Are you bored and randomly Binging results?

One final note: you may have noticed that I did not speak to whether my client was inside my corporate LAN or outside the corporate LAN. Why not? Because it does not matter! Wherever my client is, in order for it to be able to communicate to other clients over Lync it has to be authenticated and authorized by the internal Front End pool. Additionally, the client will always contact the Edge server (STUN/TURN server) as part of the follow on from in-band provisioning. The Edge doesn't care where the request came from; it just replies to the client request with the payload (requesting socket seen) as noted above. Once the client gets that response, it has another candidate to use in the ICE negotiations.

In any case, comments are welcome and appreciated.

See you next time…