Issues with MRAS and Limited External Calling

This week I ran into two interesting issues, both around MRAS and clients displaying Limited External Calling.  So what is MRAS anyway?  MRAS (Media Relay Authentication Service) is a service on the Edge Server that provides clients with the credentials they need to request ports and establish media sessions through the Edge Server.  Without these credentials, clients cannot include Edge Server candidates in their candidate list when trying to establish a media session.

Clients request MRAS credentials every time the user signs in, by sending a SIP SERVICE message to the Front End Server.  This request includes the user's SIP address and whether the client is on the intranet or the internet, among other things.  You can see a normal request below:

Start-Line: SERVICE sip:TEST-LS14-EDGE1.dmz.test.deitterick.com:5062;grid SIP/2.0
From: "William Cooper"<sip:wcooper@test.deitterick.com>;tag=6392822386;epid=e114d373a8
To: <sip:TEST-LS14-EDGE1.dmz.test.deitterick.com@test.deitterick.com;gruu;opaque=srvr:MRAS:tmsOHGdb9lGxCPJSi2p6eAAA>
CSeq: 1 SERVICE
Call-ID: 3ca1e573619f4be4975e6cd6ec1f44bd
Record-Route: <sip:TEST-LS14-SE1.test.deitterick.com:5061;transport=tls;opaque=state:T;lr>;tag=F0801F2A7EAD2D0381EEE38345CE6E66
Via: SIP/2.0/TLS 172.16.7.6:60353;branch=z9hG4bK58E6CA87.914C06A09A4174A8;branched=FALSE
Max-Forwards: 69
ms-application-via: SIP;ms-urc-rs-from;ms-server=TEST-LS14-SE1.test.deitterick.com;ms-pool=TEST-LS14-SE1.test.deitterick.com;ms-application=ad894dc3-55e0-44bf-a07e-3c073aaa4a57
Via: SIP/2.0/TLS 172.16.7.8:58674;ms-received-port=58674;ms-received-cid=39700
Contact: <sip:wcooper@test.deitterick.com;opaque=user:epid:2L_Ol0dqJVO_R1E7cD7V_gAA;gruu>
User-Agent: UCCAPI/4.0.7577.0 OC/4.0.7577.0 (Microsoft Lync 2010)
Content-Type: application/msrtc-media-relay-auth+xml
Content-Length: 496
ms-user-data: ms-publiccloud=FALSE;ms-federation=FALSE
Message-Body: <request requestID="3757984" version="2.0" to="sip:TEST-LS14-EDGE1.dmz.test.deitterick.com@test.deitterick.com;gruu;opaque=srvr:MRAS:tmsOHGdb9lGxCPJSi2p6eAAA" from="sip:wcooper@test.deitterick.com" xmlns="https://schemas.microsoft.com/2006/09/sip/mrasp" xmlns:xsi="https://www.w3.org/2001/XMLSchema-instance"><credentialsRequest credentialsRequestID="3757984"><identity>sip:wcooper@test.deitterick.com</identity><location>intranet</location><duration>480</duration></credentialsRequest></request>
$$end_record
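
If you are digging through many of these trace records, it can help to split each one into its header fields first.  The following is a minimal Python sketch, assuming the record has been copied out of the log as plain text in the "Name: value" layout shown above.  The function name parse_trace_record is my own and is not part of any Lync tooling, and the sample is a shortened copy of the request above.

```python
def parse_trace_record(text):
    """Split one trace record into a dict of lowercased field name -> list of values."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line == "$$end_record":
            continue  # skip blank lines and the record terminator
        name, sep, value = line.partition(": ")
        if not sep:
            continue  # not a "Name: value" line
        # Fields such as Via can repeat, so keep every value in order.
        fields.setdefault(name.lower(), []).append(value)
    return fields


# Shortened copy of the request shown above.
record = """\
Start-Line: SERVICE sip:TEST-LS14-EDGE1.dmz.test.deitterick.com:5062;grid SIP/2.0
CSeq: 1 SERVICE
Via: SIP/2.0/TLS 172.16.7.6:60353;branch=z9hG4bK58E6CA87.914C06A09A4174A8;branched=FALSE
Via: SIP/2.0/TLS 172.16.7.8:58674;ms-received-port=58674;ms-received-cid=39700
Content-Type: application/msrtc-media-relay-auth+xml
$$end_record
"""

fields = parse_trace_record(record)
method = fields["start-line"][0].split()[0]
is_mras = fields["content-type"][0] == "application/msrtc-media-relay-auth+xml"
print(method, is_mras, len(fields["via"]))
```

Filtering on the SERVICE method together with the msrtc-media-relay-auth content type is one way to pick the MRAS exchanges out of a busy trace.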

The response that is returned contains the FQDN of the Edge Server or the Edge pool, the ports that the client should send requests to, and the credentials for that user.  These credentials are valid for 8 hours (the duration value of 480 is in minutes).  You can see a normal response below:

Start-Line: SIP/2.0 200 OK
From: "William Cooper"<sip:wcooper@test.deitterick.com>;tag=6392822386;epid=e114d373a8
To: <sip:TEST-LS14-EDGE1.dmz.test.deitterick.com@test.deitterick.com;gruu;opaque=srvr:MRAS:tmsOHGdb9lGxCPJSi2p6eAAA>;tag=6b04cac19
CSeq: 1 SERVICE
Call-ID: 3ca1e573619f4be4975e6cd6ec1f44bd
VIA: SIP/2.0/TLS 172.16.7.6:60353;branch=z9hG4bK58E6CA87.914C06A09A4174A8;branched=FALSE,SIP/2.0/TLS 172.16.7.8:58674;ms-received-port=58674;ms-received-cid=39700
RECORD-ROUTE: <sip:TEST-LS14-SE1.test.deitterick.com:5061;transport=tls;opaque=state:T;lr>;tag=F0801F2A7EAD2D0381EEE38345CE6E66
CONTENT-LENGTH: 1002
CONTENT-TYPE: application/msrtc-media-relay-auth+xml
SERVER: RTCC/4.0.0.0 MRAS/2.0
ms-edge-proxy-message-trust: ms-source-type=EdgeProxyGenerated;ms-ep-fqdn=TEST-LS14-EDGE1.dmz.test.deitterick.com;ms-source-verified-user=verified
Message-Body: <?xml version="1.0"?>
<response xmlns:xsi="https://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="https://www.w3.org/2001/XMLSchema" requestID="3757984" version="2.0" serverVersion="2.0" to="sip:TEST-LS14-EDGE1.dmz.test.deitterick.com@test.deitterick.com;gruu;opaque=srvr:MRAS:tmsOHGdb9lGxCPJSi2p6eAAA" from="sip:wcooper@test.deitterick.com" reasonPhrase="OK" xmlns="https://schemas.microsoft.com/2006/09/sip/mrasp">
  <credentialsResponse credentialsRequestID="3757984">
    <credentials>
      <username>AgAAJCFf4/sBzVDF7Q9GNmDEp0oLdfto4acI50YDy3IAAAAAOoZBWBV8je4fgGTvyKbxh7sioMY=</username>
      <password>k6GJX093FIYdfVlmHkGSw0Yi+lM=</password>
      <duration>480</duration>
    </credentials>
    <mediaRelayList>
      <mediaRelay>
        <location>intranet</location>
        <hostName>TEST-LS14-EDGE1.dmz.test.deitterick.com</hostName>
        <udpPort>3478</udpPort>
        <tcpPort>443</tcpPort>
      </mediaRelay>
    </mediaRelayList>
  </credentialsResponse>
</response>
$$end_record
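
To check what a response actually handed to the client, you can pull the relay list and the credential lifetime out of the message body.  This is a rough Python sketch using only the standard library.  It matches elements by local name and ignores the XML namespace, which is my own simplification, and the sample body is a trimmed copy of the response above with the username and password elements left out.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def parse_mras_response(body):
    """Return the credential duration (minutes) and the media relays in a response body."""
    root = ET.fromstring(body)
    result = {"duration_minutes": None, "relays": []}
    for elem in root.iter():
        name = local_name(elem.tag)
        if name == "duration":
            result["duration_minutes"] = int(elem.text)
        elif name == "mediaRelay":
            # Collect location, hostName, udpPort and tcpPort as strings.
            result["relays"].append({local_name(c.tag): c.text for c in elem})
    return result


# Trimmed copy of the response body shown above (credentials omitted).
body = """<response xmlns="https://schemas.microsoft.com/2006/09/sip/mrasp" reasonPhrase="OK">
  <credentialsResponse credentialsRequestID="3757984">
    <credentials>
      <duration>480</duration>
    </credentials>
    <mediaRelayList>
      <mediaRelay>
        <location>intranet</location>
        <hostName>TEST-LS14-EDGE1.dmz.test.deitterick.com</hostName>
        <udpPort>3478</udpPort>
        <tcpPort>443</tcpPort>
      </mediaRelay>
    </mediaRelayList>
  </credentialsResponse>
</response>"""

info = parse_mras_response(body)
print(info["duration_minutes"] / 60, "hours")
for relay in info["relays"]:
    print(relay["hostName"], relay["udpPort"], relay["tcpPort"])
```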

There are a few things that can cause the client to fail to get MRAS credentials.  Some of those are detailed below:

Issue #1

In this issue, clients were seeing the Limited External Calling icon in the Lync 2010 client.

We ran through the normal troubleshooting steps of making sure that the proper ports were open, DNS resolution was working, and certificates were trusted by both servers.  Since all of that checked out, the next step was to take SIPStack tracing on the Front End Server and the Edge Server during the client login process.  Reviewing the SIPStack trace from the Front End Server showed that a SIP/2.0 504 Server time-out was being returned to the client.  Specifically the response contained the following:

ms-diagnostics: 1038;reason="Failed to connect to a peer server";WinsockFailureCode="10061(WSAECONNREFUSED)";WinsockFailureDescription="The peer actively refused the connection attempt";Peer="TEST-LS14-EDGE1.dmz.test.deitterick.com";Port="5062";source="TEST-LS14-SE1.test.deitterick.com"
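
The ms-diagnostics header above is a numeric code followed by semicolon-separated name="value" pairs, so it is easy to break apart when you want to compare failures across traces.  Here is a small Python sketch.  The function name is my own, and the parsing is based only on the layout of the headers shown in this post, not on a formal grammar.

```python
import re


def parse_ms_diagnostics(header):
    """Return (code, params) from an 'ms-diagnostics: ...' header line."""
    value = header.split(":", 1)[1].strip()      # drop the header name
    code, _, rest = value.partition(";")         # leading numeric code
    params = dict(re.findall(r'([\w-]+)="([^"]*)"', rest))
    return int(code), params


header = (
    'ms-diagnostics: 1038;reason="Failed to connect to a peer server";'
    'WinsockFailureCode="10061(WSAECONNREFUSED)";'
    'WinsockFailureDescription="The peer actively refused the connection attempt";'
    'Peer="TEST-LS14-EDGE1.dmz.test.deitterick.com";Port="5062";'
    'source="TEST-LS14-SE1.test.deitterick.com"'
)

code, params = parse_ms_diagnostics(header)
print(code, params["Peer"], params["Port"], params["WinsockFailureCode"])
```

The Peer and Port values tell you which connection failed: here it is the Front End Server trying to reach the Edge Server on port 5062.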

The SIPStack trace from the Edge Server didn't show any MRAS requests.  This means that whatever the issue is, it is happening before the Edge Server can receive the SIP SERVICE message.  This usually points to a certificate/MTLS issue.  You can sometimes see certificate/MTLS errors directly in the trace, but on a busy Front End Server the log grows very large, which makes them harder to find.

Looking at the error from the SIPStack trace above, we can see that the Edge Server is refusing the connection from the Front End Server.  Since this is generally a certificate issue, we looked at the certificates on the Front End Server again.  Everything about the certificate itself was correct, but we did find numerous Intermediate CA certificates listed for their Issuing CA.  They only had one Issuing CA in the environment, and looking at each of the certificates listed, we found that they had different thumbprints.  It appears that the Issuing CA was deployed multiple times with the same name, but the certificates weren't cleaned up in Active Directory.  We deleted the invalid Intermediate CA certificates for their Issuing CA and tested again.  This time the client was able to request MRAS credentials from the Edge Server.
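
The check we did by hand, looking for one CA name that appears with several different thumbprints, can be sketched in a few lines of Python.  This assumes you have already exported the Intermediate CA certificates and have each subject name together with the DER-encoded bytes.  The subjects and byte strings below are placeholders, not data from this environment.  The thumbprint is computed as the SHA-1 hash of the DER-encoded certificate, which is the value Windows displays.

```python
import hashlib
from collections import defaultdict


def thumbprint(der_bytes):
    """SHA-1 hash of the DER-encoded certificate, as uppercase hex."""
    return hashlib.sha1(der_bytes).hexdigest().upper()


def find_duplicate_subjects(certs):
    """Return subjects that appear with more than one distinct thumbprint."""
    by_subject = defaultdict(set)
    for subject, der_bytes in certs:
        by_subject[subject].add(thumbprint(der_bytes))
    return {s: sorted(t) for s, t in by_subject.items() if len(t) > 1}


# Placeholder data: real input would be the exported certificates.
certs = [
    ("CN=Example Issuing CA", b"placeholder-der-1"),
    ("CN=Example Issuing CA", b"placeholder-der-2"),
    ("CN=Example Root CA", b"placeholder-der-root"),
]

dupes = find_duplicate_subjects(certs)
for subject, prints in dupes.items():
    print(subject, "has", len(prints), "different thumbprints")
```

Any subject that shows up in the result deserves a closer look before you delete anything, since only one of those certificates should match the CA that is actually issuing certificates today.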

Issue #2

This issue occurred on Communicator 2007 R2 clients in an OCS 2007 R2 environment.  Clients were seeing the Limited External Calling icon.  We confirmed that the proper ports were open, DNS resolution was working, and that certificates were trusted by both servers.  We took SIPStack tracing on the Front End Server and the Edge Server.  Similar to Issue #1 above, we were getting a SIP/2.0 504 Server time-out response returned to the client.  It included the following:

ms-diagnostics: 2;reason="See response code and reason phrase";source="TEST-OCSR2-SE1.test.deitterick.com";HRESULT="0xC3E93C69(SIPPROXY_E_CONNECTION_FAILED)"

This time we're getting a SIPPROXY_E_CONNECTION_FAILED.  The certificate chains checked out, and everything appeared to be configured correctly, yet the Front End Server would return this error message every time.  I remembered a couple of environments where weird things would happen if you had patched your Front End Server(s) but not your Edge Server(s).  I checked the patch levels on both the Front End Server and the Edge Server, and sure enough, the Front End Server was patched and the Edge Server had no patches applied.  We applied the latest OCS 2007 R2 patches to the Edge Server and tested again, and this time the client was able to request MRAS credentials from the Edge Server.
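
If you collect the product version from each server role, spotting a role that is behind is a simple comparison.  Below is a minimal Python sketch.  The role names and version strings are placeholders that I made up for illustration, not the actual build numbers from this environment, and how you gather the versions is left out.

```python
def parse_version(text):
    """Turn a dotted version string into a tuple of ints so it compares numerically."""
    return tuple(int(part) for part in text.split("."))


def find_behind(versions):
    """Return the roles whose version is lower than the newest one seen."""
    newest = max(parse_version(v) for v in versions.values())
    return sorted(role for role, v in versions.items() if parse_version(v) < newest)


# Hypothetical values: substitute what you read from each server.
versions = {
    "front-end": "3.5.6907.100",
    "edge": "3.5.6907.0",
}

behind = find_behind(versions)
print("Roles behind the newest patch level:", behind)
```

Comparing tuples of integers rather than the raw strings matters here, because a string comparison would sort "3.5.6907.9" after "3.5.6907.100".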

These are just a couple of issues I've run across with MRAS.  The important thing to remember is to check the basics and then take logging and dig into what is really going on.