Troubleshooting Windows HPC WCF/SOA Issues

HPC uses HPC sessions to support the service-oriented architecture (SOA) programming model based on Windows Communication Foundation (WCF). Troubleshooting errors from these SOA-based applications can be challenging. However, the tip I'm about to share should help you figure out exactly where an issue is coming from. Looking through the trace of the communication between the service hosts (running on the compute nodes) and the broker is often the key to identifying where the problem lies.

You can use the Windows Communication Foundation (WCF) Service Trace Viewer Tool to analyze messages logged by WCF. Service Trace Viewer is included in the Microsoft Windows Software Development Kit (SDK) for Windows Vista and .NET Framework Runtime Components. You can download the Windows SDK from the Microsoft Download Center at https://go.microsoft.com/fwlink/?LinkID=75636. For more information about using this tool, see "Service Trace Viewer Tool (SvcTraceViewer.exe)" at https://go.microsoft.com/fwlink/?LinkId=88991.

The following are the instructions to enable tracing.

1. Modify the system.diagnostics section of HpcServiceHost.exe.config in %CCP_HOME%\bin\ (*on each compute node*) as follows:

<system.diagnostics>
  <sources>
    <source name="Microsoft.Hpc.HpcServiceHosting" switchValue="All">
      <listeners>
        <add name="Console" />
        <add name="ServiceHostTraceListener" />
      </listeners>
    </source>
  </sources>
  <sharedListeners>
    <add initializeData="\\<HEADNODE>\CcpSpoolDir\host.svclog" type="System.Diagnostics.XmlWriterTraceListener"
      name="ServiceHostTraceListener">
      <filter type="" />
    </add>
    <add type="System.Diagnostics.ConsoleTraceListener, System, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089"
      name="Console" traceOutputOptions="DateTime, ThreadId">
      <filter type="" />
    </add>
  </sharedListeners>
  <trace autoflush="true" />
</system.diagnostics>

Also modify the system.diagnostics section of HpcWcfBroker.exe.config in %CCP_HOME%\bin\ (*on all broker nodes*) as follows:

  <system.diagnostics>
    <sources>
      <source name="Microsoft.Hpc.ServiceBroker" switchValue="All">
        <listeners>
          <add name="Console">
            <filter type="" />
          </add>
          <add name="WSLBTraceListener">
            <filter type="" />
          </add>
          <remove name="Default" />
        </listeners>
      </source>
    </sources>
    <sharedListeners>
      <add type="System.Diagnostics.ConsoleTraceListener, System, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089"
        name="Console" traceOutputOptions="DateTime, ThreadId">
        <filter type="" />
      </add>
      <add initializeData="\\<HEADNODE>\CcpSpoolDir\broker.svclog"
        type="System.Diagnostics.XmlWriterTraceListener, System, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089"
        name="WSLBTraceListener" traceOutputOptions="Timestamp">
        <filter type="" />
      </add>
    </sharedListeners>
    <trace autoflush="true" />
  </system.diagnostics>

2. Replace <HEADNODE> in both files with the name of your head node.

3. Run your application until you see the errors.

4. All .svclog files will be written under \\<HEADNODE>\CcpSpoolDir\. Open them in Service Trace Viewer to analyze the trace.
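
If you have many nodes to touch, the placeholder substitution in step 2 can be scripted instead of done by hand. The following is only a minimal sketch, not part of the HPC tooling: the function names fill_head_node and patch_config are made up for illustration, and the script assumes the config files are UTF-8 text that already contains the <HEADNODE> placeholder from the snippets above.

```python
from pathlib import Path

# Placeholder used in the config snippets in this post.
PLACEHOLDER = "<HEADNODE>"


def fill_head_node(config_text: str, head_node: str) -> str:
    """Return config_text with every <HEADNODE> placeholder replaced.

    head_node is expected to be a bare machine name (no backslashes),
    because the snippets already supply the leading "\\\\" of the UNC path.
    """
    if not head_node or "\\" in head_node:
        raise ValueError("head_node must be a bare machine name")
    return config_text.replace(PLACEHOLDER, head_node)


def patch_config(path: Path, head_node: str) -> bool:
    """Rewrite one config file in place; return True if it was changed.

    Hypothetical helper: a .bak copy of the original is written next to
    the file before it is overwritten.
    """
    # newline="" keeps the file's existing line endings untouched.
    with open(path, "r", encoding="utf-8", newline="") as f:
        original = f.read()

    updated = fill_head_node(original, head_node)
    if updated == original:
        return False  # No placeholder found; leave the file alone.

    backup = path.with_name(path.name + ".bak")
    with open(backup, "w", encoding="utf-8", newline="") as f:
        f.write(original)
    with open(path, "w", encoding="utf-8", newline="") as f:
        f.write(updated)
    return True
```

On a compute node you would call patch_config with the path to HpcServiceHost.exe.config under %CCP_HOME%\bin\ and your head node name (for example a hypothetical "HN01"); on a broker node, do the same for HpcWcfBroker.exe.config. Run it from an elevated prompt, since those files normally sit in a protected directory.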