Author: Steve Schiemann
Product Version: Skype for Business Server 2015
During the first few months after the release of Skype for Business Server 2015, Unified Communications Customer Service and Support (CSS) saw a minor “surge” of cases involving a puzzling issue: the Skype for Business Server 2015 Front End (FE) service would crash every 24 hours, almost to the minute, after the last reboot or service restart. Here is a typical event ID 1000 associated with such a crash:
Log Name: Application
Source: Application Error
Event ID: 1000
Task Category: (100)
Faulting application name: RTCSrv.exe, version: XXXX, time stamp: XXXX
Faulting module name: SIPStack.dll, version: XXXX, time stamp: XXXX
Exception code: 0xc0000005
Fault offset: 0x0000000000442aa1
Faulting process id: 0x1ab4
Faulting application start time: 0x01d0bbeed3ffd292
Faulting application path: C:\Program Files\Skype for Business Server 2015\Server\Core\RTCSrv.exe
Faulting module path: C:\PROGRA~1\SKYPEF~1\Server\Core\SIPStack.dll
Report Id: 874748dc-28ab-11e5-8115-005056b52889
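One way to check the 24-hour pattern from an event like this is to decode the “Faulting application start time” field, which Windows records as a FILETIME (a count of 100-nanosecond intervals since January 1, 1601 UTC), and compare it with the time the event was logged. The following is a minimal Python sketch of that arithmetic. The function names are hypothetical, and the logged time in the usage section is a placeholder to be replaced with the timestamp from your own event, since the event above does not show one.

```python
from datetime import datetime, timedelta, timezone

# FILETIME values count 100-nanosecond ticks from this epoch (UTC).
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(hex_value: str) -> datetime:
    """Convert a hex FILETIME string (e.g. from event ID 1000) to a UTC datetime."""
    ticks = int(hex_value, 16)  # accepts an optional "0x" prefix
    # 10 ticks = 1 microsecond; sub-microsecond precision is dropped.
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)


def uptime_before_crash(start_hex: str, crash_time: datetime) -> timedelta:
    """Return how long the process ran before the crash was logged."""
    return crash_time - filetime_to_datetime(start_hex)


if __name__ == "__main__":
    start_hex = "0x01d0bbeed3ffd292"  # value from the event shown above
    print("Process started (UTC):", filetime_to_datetime(start_hex))

    # Placeholder only: substitute the logged time of your own event 1000.
    logged = datetime(2015, 7, 12, 15, 33, tzinfo=timezone.utc)
    print("Uptime before crash:", uptime_before_crash(start_hex, logged))
```

If the uptime comes out at very nearly 24 hours across several crashes, that matches the pattern described in this post.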
Shortly after each case was opened, we gathered relevant data, including event logs, a topology export, and crash dumps captured from an elevated command prompt with “procdump -ma -e <PID_of_Process> <output path>”. The data showed that the FE service (RTCSrv.exe) was using incorrect logic to validate the address of the trusted TCP server. Part of this logic assumed that Transport Layer Security (TLS) was in use. The certificates in use on the FE server all checked out fine. So the certificate information must be coming from a remote server, right? Not so in this case...
During the course of the investigation, CSS asked the affected customers about their environments. We also looked at the topology files. A pattern soon emerged: these customers were all running trusted applications from third parties. Furthermore, the applications were configured to use TCP connections instead of TLS, which requires a certificate. With help from the Skype for Business product group, the issue was at last fully understood: there was no problem with a certificate; the problem was that there was NO certificate. When a TCP connection came in from a trusted app, an access violation was generated because of the “missing” certificate. This problem has now been corrected in Skype for Business Server 2015 Cumulative Update 2. See the latest Front End/Edge update for Skype for Business Server 2015 and the related KB 3141114 for more information on this particular issue.
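To make the failure mode concrete, here is a small Python sketch of the general pattern: validation code that assumes every incoming connection carries a certificate, next to a version that handles plain TCP explicitly. This is purely an illustration and not the actual RTCSrv.exe or SIPStack.dll logic. Every name in it (Connection, validate_assuming_tls, validate_guarded, the host names) is hypothetical, and where the real service hit an access violation, Python raises an AttributeError instead.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Connection:
    """Hypothetical model of an incoming trusted-application connection."""
    remote_fqdn: str
    transport: str                           # "TLS" or "TCP"
    peer_cert_subject: Optional[str] = None  # None when no certificate was presented


def validate_assuming_tls(conn: Connection, trusted: Set[str]) -> bool:
    # Flawed: assumes a certificate is always present. On a plain TCP
    # connection peer_cert_subject is None, so .lower() raises AttributeError.
    return conn.peer_cert_subject.lower() in trusted


def validate_guarded(conn: Connection, trusted_tls: Set[str], trusted_tcp: Set[str]) -> bool:
    # Plain TCP has no certificate, so match on the configured address instead.
    if conn.transport == "TCP":
        return conn.remote_fqdn.lower() in trusted_tcp
    # For TLS, a missing certificate is a rejection, not a crash.
    if conn.peer_cert_subject is None:
        return False
    return conn.peer_cert_subject.lower() in trusted_tls


if __name__ == "__main__":
    tcp_app = Connection(remote_fqdn="app01.contoso.example", transport="TCP")
    print(validate_guarded(tcp_app, set(), {"app01.contoso.example"}))
    try:
        validate_assuming_tls(tcp_app, {"app01.contoso.example"})
    except AttributeError as exc:
        print("Unguarded check failed:", exc)
```

The point of the guarded version is simply that the absence of a certificate is treated as a normal case to be handled, which is the kind of assumption the fix in Cumulative Update 2 had to address.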
But why not use TCP for trusted application connections? I’ve found nothing in Microsoft documentation that says to use TLS only. As a matter of fact, you have to run New-CsTrustedApplication when installing or configuring a new trusted app, and the -EnableTCP parameter is valid. The documentation for New-CsTrustedApplication says, “Use this parameter only if the trusted application is not a Microsoft Unified Communications Managed API (UCMA) application”. This doesn’t mean that you SHOULD use -EnableTCP if the trusted app is not UCMA, but it is an option.
The use of plain TCP connections with trusted applications is not nearly as common as TLS connections. It turns out that TCP connectivity from trusted applications gets limited test coverage. Most trusted apps nowadays can and should be configured to use TLS. With TCP connectivity, there is no support for IPv6, and it lacks the added security and privacy of TLS.
Back to the crashing problem: why did the crash occur almost exactly 24 hours after the last FE service restart? It seems that TCP connections are refreshed at a fixed 24-hour interval measured from the last service start. While this issue is resolved in Skype for Business Server 2015 CU2, we recommend using TLS over TCP for your trusted applications.