Microsoft Exchange Diagnostics service crashing with events 4999, 1007 and 7031 after a CU upgrade

After upgrading Exchange 2013 to CU6 from any earlier CU level, you may notice the events below being logged every 30-45 minutes. This is a rare scenario, but you may encounter similar symptoms after an upgrade. During the upgrade, because of some modification in the registry hives, ExchangeDiagnosticsDailyPerformanceLog went missing from Task Scheduler and Performance Monitor, and was also missing from the location C:\Windows\System32\Tasks\Microsoft\Windows\PLA. As a result, the MSExchangeDiagnostics service started crashing very frequently and failed to create the daily performance logs. Since the PID is not changing, it is a first-chance exception.

 

Log Name:      Application
Source:        MSExchangeDiagnostics
Date:          9/29/2014 2:47:37 PM
Event ID:      1007
Task Category: General
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      SRV1.contoso.local
Description:
Failed to create or start performance logs with error: System.ArgumentException: Value does not fall within the expected range.
   at PlaLibrary.DataCollectorSetClass.start(Boolean Synchronous)
   at Microsoft.Exchange.Diagnostics.PerformanceLogger.PerformanceLogSet.StartLog(Boolean synchronous)
   at Microsoft.Exchange.Diagnostics.PerformanceLogger.PerformanceLogMonitor.CheckPerflogStatus(). Performance log: ExchangeDiagnosticsPerformanceLog.

 

 

Log Name:      System
Source:        Service Control Manager
Date:          9/29/2014 2:20:27 PM
Event ID:      7031
Task Category: None
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      SRV1.contoso.local
Description:
The Microsoft Exchange Diagnostics service terminated unexpectedly.  It has done this 577 time(s).  The following corrective action will be taken in 60000 milliseconds: Restart the service.

 

 

Log Name:      Application
Source:        MSExchange Common
Date:          9/29/2014 3:42:18 PM
Event ID:      4999
Task Category: General
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      SRV1.contoso.local
Description:
Watson report about to be sent for process id: 16436, with parameters: E12IIS, c-RTL-AMD64, 15.00.0995.029, M.E.Diagnostics.Service, M.E.Diagnostics.PerformanceLogger, M.E.D.P.PerformanceLogSet.StartLog, System.ArgumentException, 95c6, 15.00.0995.012.
ErrorReportingEnabled: True
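
A quick way to confirm the missing task described at the start of this post is to look for the task files in the PLA folder. The following is only a minimal Python sketch, not an official tool: the function name missing_pla_tasks is hypothetical, it assumes each data collector set appears as a file with the same name in that folder, and reading the folder will probably require an elevated prompt.

```python
from pathlib import Path

# Location named in this post; adjust if Windows is installed elsewhere.
PLA_TASKS_DIR = Path(r"C:\Windows\System32\Tasks\Microsoft\Windows\PLA")

# Data collector set names that appear in this post.
EXPECTED_TASKS = (
    "ExchangeDiagnosticsDailyPerformanceLog",
    "ExchangeDiagnosticsPerformanceLog",
)


def missing_pla_tasks(tasks_dir=PLA_TASKS_DIR, expected=EXPECTED_TASKS):
    """Return the expected task names that have no file in tasks_dir."""
    tasks_dir = Path(tasks_dir)
    return [name for name in expected if not (tasks_dir / name).is_file()]


if __name__ == "__main__":
    for name in missing_pla_tasks():
        print("Missing PLA task:", name)
```

On the affected server in this scenario, the check would be expected to report ExchangeDiagnosticsDailyPerformanceLog as missing.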

 

Before troubleshooting, first verify whether the Task Scheduler service has been disabled by a GPO. Since ExchangeDiagnosticsDailyPerformanceLog was missing from Performance Monitor and Task Scheduler, I tried importing the ExchangeDiagnosticsDailyPerformanceLog template from the Performance Monitor of a working Exchange 2013 CU6 server as a workaround. It didn’t fix the problem completely, but the steps are worth reading to understand the cause of the problem.

 

I first exported the ExchangeDiagnosticsDailyPerformanceLog template as ExchangeDiagnosticsDailyPerformanceLog.xml from the working Exchange 2013 CU6 server. When I tried to import ExchangeDiagnosticsDailyPerformanceLog.xml on the affected server, the wizard did not pick up the XML file and showed a blank console after I browsed to the saved location. It seemed that the name ExchangeDiagnosticsDailyPerformanceLog.xml was already in use by another data collector. I renamed the file from ExchangeDiagnosticsDailyPerformanceLog.xml to ExchangeDiagnosticsDailyPerf.xml, and this time I was able to pick the template from the saved location. However, when I used ExchangeDiagnosticsDailyPerformanceLog as the data collector name, it threw an error, as shown in the screenshot below.

 

 

 

I changed the data collector name from ExchangeDiagnosticsDailyPerformanceLog to ExchangeDiagnosticsDailyPerf and was able to import the ExchangeDiagnosticsDailyPerf.xml template. Because the template imported properly, it also created the ExchangeDiagnosticsDailyPerf task in Task Scheduler. The daily performance log then started being generated at the location C:\Program Files\Microsoft\Exchange Server\V15\Logging\Diagnostics\DailyPerformanceLogs. However, events 1007 and 4999 were still being logged every 30-45 minutes.

 

 

 

This testing clearly shows that a cached entry for the default daily performance log name “ExchangeDiagnosticsDailyPerformanceLog” is present somewhere. Because the service cannot find ExchangeDiagnosticsDailyPerformanceLog in perfmon and Task Scheduler, it logs errors 1007 and 4999.

  

Resolution:

First, ensure that the Templates key is present under PLA:

HKLM\Software\Microsoft\PLA\Templates

 

 

I found the cached entry for the default daily performance log name “ExchangeDiagnosticsDailyPerformanceLog” at the following registry location:

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Schedule\TaskCache\Tree\Microsoft\Windows\PLA\ExchangeDiagnosticsDailyPerformanceLog
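
To check for this cached entry without opening regedit, something like the Python sketch below could be used. It is read-only and makes several assumptions: the helper names task_cache_subkey and task_cache_entry_exists are hypothetical, the script runs on the Exchange server itself with enough rights to read the key, and it uses the standard winreg module, which exists only on Windows (elsewhere the check simply returns None).

```python
import sys

# Parent of the cached entry shown above, relative to HKEY_LOCAL_MACHINE.
TASK_CACHE_PLA = (
    r"SOFTWARE\Microsoft\Windows NT\CurrentVersion\Schedule"
    r"\TaskCache\Tree\Microsoft\Windows\PLA"
)


def task_cache_subkey(collector_name):
    """Build the TaskCache subkey path for a data collector set name."""
    return TASK_CACHE_PLA + "\\" + collector_name


def task_cache_entry_exists(collector_name):
    """Return True or False on Windows, or None where winreg is unavailable."""
    if sys.platform != "win32":
        return None
    import winreg  # Windows-only standard library module

    try:
        key = winreg.OpenKey(
            winreg.HKEY_LOCAL_MACHINE,
            task_cache_subkey(collector_name),
            0,
            winreg.KEY_READ | winreg.KEY_WOW64_64KEY,  # native 64-bit view
        )
    except FileNotFoundError:
        return False
    winreg.CloseKey(key)
    return True


if __name__ == "__main__":
    name = "ExchangeDiagnosticsDailyPerformanceLog"
    print(name, "cached:", task_cache_entry_exists(name))
```

The sketch only reports whether the key exists; it does not modify the registry.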

 

 

Important: This section contains steps that tell you how to modify the registry. Serious problems might occur if you modify the registry incorrectly, so make sure that you follow these steps carefully. For added protection, back up the registry before you modify it. Then you can restore the registry if a problem occurs. For more information, see Microsoft Knowledge Base article 322756, "How to back up and restore the registry in Windows" (https://support.microsoft.com/kb/322756/ ).

 

I deleted the ExchangeDiagnosticsDailyPerformanceLog entry from the above location and deleted the imported 'ExchangeDiagnosticsDailyPerf' collector from perfmon.

I then made sure that only ExchangeDiagnosticsPerformanceLog and Server Manager Performance Monitor remained under the user-defined data collector sets in perfmon, and rebooted the server. After the reboot, a new default ExchangeDiagnosticsDailyPerformanceLog was recreated automatically in Task Scheduler and Performance Monitor, and the events were no longer generated.

Note: If both ExchangeDiagnosticsDailyPerformanceLog and ExchangeDiagnosticsPerformanceLog are missing, delete both entries from the above registry location and then reboot.
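
The rule in the note can be expressed as a small planning helper. This is an illustrative Python sketch only: the function name stale_cache_entries is hypothetical, and the two input lists are assumed to have been collected by hand (names cached under the TaskCache key, and names actually visible in perfmon and Task Scheduler). It deletes nothing and only lists the candidates.

```python
def stale_cache_entries(cached, registered):
    """Return names cached in TaskCache but absent from perfmon/Task Scheduler."""
    registered = set(registered)
    return sorted(name for name in set(cached) if name not in registered)


if __name__ == "__main__":
    cached = [
        "ExchangeDiagnosticsDailyPerformanceLog",
        "ExchangeDiagnosticsPerformanceLog",
    ]
    # Scenario from this post: only the daily collector is missing.
    print(stale_cache_entries(cached, ["ExchangeDiagnosticsPerformanceLog"]))
    # Scenario from the note: both collectors are missing.
    print(stale_cache_entries(cached, []))
```

Back up the registry before removing any entry that the helper lists.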

 

Cheers!

Anil