Insertion of an error report was throttled to prevent flooding of the Call Detail Recording (CDR) Database


Lync administrators often see Event ID 56208 ("Insertion of an error report was throttled to prevent flooding of the Call Detail Recording (CDR) Database") on their Front-End Servers, and potentially on other server roles such as Mediation Servers.

While most administrators choose to ignore this alert, this blog post explains what it represents and when it is safe to ignore.

A CDR report is generated by each party in a Peer-to-Peer IM Session, Multi-Party IM Session, and/or Conferencing Session. For PSTN calls, the Mediation Server generates a CDR report.

Since CDR reports are generated both by P2P conversations and by conferences, let's look at each in turn.

P2P and Multi-Party IM Conversations – Each party generates a CDR report when the session ends (i.e., the user closes the active tab or the session times out due to inactivity).

Conferences typically start at the top of the hour or at the 30-minute mark, and conclude at the top of the hour or at the 30-minute mark. This means that on every hour and half hour during core business hours, CDR reports are expected to be generated at higher volumes.

Conferences and meetings also present a unique scenario: people may be walking to a conference room, switching from one wireless access point to another, and/or not responding to IM sessions already in progress because they are just joining or leaving a meeting, which can cause those sessions to time out due to inactivity.

For a session involving audio and/or video, a Quality of Experience (QoE) report is also generated, but the volume of QoE reports is often much lower than the volume of CDR reports.

In both Lync Server 2013 and Skype for Business Server 2015, a throttle on the Monitoring Server allows only 10 reports with a particular MS-Diagnostic ID to be inserted into the LcsCDR database per second. This throttle was deliberately introduced to distribute the load on the server over a longer period of time, so that report creation isn't impacted if the monitoring and reporting servers are collocated.
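To make the per-ID nature of the throttle concrete, here is a minimal Python sketch of a "10 per second per diagnostic ID" cap. It is only an illustration of the rule described above, not the product's implementation: the function name, the `(second, diagnostic_id)` tuple shape, and the second diagnostic ID (`1`) are all made up for the example.

```python
from collections import defaultdict

LIMIT_PER_SECOND = 10  # per-diagnostic-ID limit described in the post


def apply_throttle(reports, limit=LIMIT_PER_SECOND):
    """Split reports into (inserted, throttled) using a per-ID, per-second cap.

    `reports` is an iterable of (second, diagnostic_id) tuples. The tuple
    shape is an assumption for illustration, not the product's data model.
    """
    seen = defaultdict(int)  # (second, diagnostic_id) -> inserts so far
    inserted, throttled = [], []
    for second, diag_id in reports:
        key = (second, diag_id)
        if seen[key] < limit:
            seen[key] += 1
            inserted.append((second, diag_id))
        else:
            throttled.append((second, diag_id))
    return inserted, throttled


# 25 reports with ID 52094 and 3 with a hypothetical ID 1, all in second 0
burst = [(0, 52094)] * 25 + [(0, 1)] * 3
inserted, throttled = apply_throttle(burst)
print(len(inserted), len(throttled))  # prints: 13 15
```

Because the cap is applied per diagnostic ID, the three reports with the other ID are not affected by the burst of 52094 reports; only the excess 52094 reports are held back.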

The diagnostic IDs that the throttle is keyed on can be seen in the dbo.MSDiagMetaData table in the LcsCDR database. One common example is:

DiagnosticId: 52094
ReasonString: Conversation suspended due to a loss of network
Description: The signaling session was terminated due to a loss of network. This typically occurs when a wireless connection drops or a machine enters hibernation mode. Audio calls will typically continue if audio is still flowing. If the network dropped, the client will attempt to re-establish and send any queued instant messages once connectivity is restored.

By design, these reports are held in the LYSS database in the SQL LyncLocal instance and are retried until they are eventually committed to the Monitoring Server.
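The hold-and-retry behavior can be pictured with a small simulation. The sketch below assumes a simplified model in which each one-second round commits at most 10 reports per diagnostic ID and everything else stays queued for the next round. The function name, the round model, and the second diagnostic ID (`1`) are assumptions for illustration; the actual storage service logic is not shown in this post.

```python
from collections import Counter, deque


def seconds_to_drain(pending_ids, limit=10):
    """Count one-second rounds needed to commit every queued report.

    `pending_ids` holds one diagnostic ID per queued report. Each round
    commits at most `limit` reports per ID; the rest are retried later.
    """
    queue = deque(pending_ids)
    seconds = 0
    while queue:
        committed = Counter()  # diagnostic_id -> commits in this round
        retry = deque()
        for diag_id in queue:
            if committed[diag_id] < limit:
                committed[diag_id] += 1
            else:
                retry.append(diag_id)  # throttled, try again next round
        queue = retry
        seconds += 1
    return seconds


drain_same_id = seconds_to_drain([52094] * 1000)
drain_two_ids = seconds_to_drain([52094] * 500 + [1] * 500)
print(drain_same_id, drain_two_ids)  # prints: 100 50
```

Under this model, 1,000 queued reports that all share one diagnostic ID drain in 100 seconds, while the same number split evenly across two IDs drains in 50, since each ID has its own allowance.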

In rare instances, mostly in very large environments that use a centralized Monitoring Server or in environments where the Monitoring Server has been offline for a considerable period, we have seen issues where the threshold hasn't necessarily worked.

Generally speaking, it is safe to ignore Event ID 56208. However, if Event ID 56208 fires during periods of low business activity (weekends or holidays), it is certainly a concern. In such a case, watching Event ID 56206 and Event ID 56207 can help you understand whether the aggregate volumes are increasing or decreasing. If the volumes are decreasing, we recommend waiting until the queues drain. To avoid placing additional load on the Monitoring Server in the meantime, we suggest not generating reports on your reporting server.
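As a rough way to reason about "increasing or decreasing", the sketch below classifies a series of per-interval counts by comparing the average of the later half with the average of the earlier half. It assumes you have already collected one numeric count per interval from those events; what each event records and how to extract it are not covered here, and the function name and sample numbers are hypothetical.

```python
def queue_trend(counts):
    """Classify per-interval counts as increasing, decreasing, or flat.

    Compares the mean of the second half of the series with the mean of
    the first half. Returns "unknown" when there are fewer than 2 samples.
    """
    if len(counts) < 2:
        return "unknown"
    mid = len(counts) // 2
    first = sum(counts[:mid]) / mid
    second = sum(counts[mid:]) / (len(counts) - mid)
    if second > first:
        return "increasing"
    if second < first:
        return "decreasing"
    return "flat"


# Hypothetical samples taken at regular intervals
print(queue_trend([900, 700, 400, 100]))  # prints: decreasing
print(queue_trend([10, 20, 40]))          # prints: increasing
```

A "decreasing" result corresponds to the case where waiting for the queues to drain is reasonable; a sustained "increasing" result during quiet periods is the case worth investigating.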

If there are indeed issues in the environment, then depending on the type of issue you may see other events in the Lync Server event log from the source LS Storage Service. It is also possible that your reporting server reports will not have data for the current or prior day. In such cases, we recommend that you engage Microsoft Support.

Comments (5)

  1. soder says:

    Sri Todi: Is there a way to change the default behavior (10 entries/sec limit), or is it hard-coded and not something we can change?

    1. Sri Todi says:

      @Soder The value isn’t hard-coded but defined in a database, and hence can certainly be tweaked. However, we recommend that you investigate and resolve the root cause of the issue. Identifying how frequently each MS-DiagnosticID occurs in the Monitoring Server can certainly reveal potential issues in the environment.

      1. soder says:

        If I have a large meeting pool, and all 1,000 participants leave the conference at the same time (when it's finished, for example), it will cause a CDR report tsunami and, as a result, trigger this throttling. Am I right?

        1. Sri Todi says:

          Soder – There may be, say, ~1,000 reports generated. Since the throttle is based on the MS-DiagnosticId, even presuming that all users have the same DiagnosticID, at 10 reports/sec you will need just about 100 seconds, which is nothing very concerning. If you are concerned, please work with Premier Support to investigate the root cause in your environment.
