Changes in the TimerJobHistory table maintenance introduced in November 2017 CU for SharePoint 2013


[The blog post below is based on the information collected by Derek Smay and Alexander Peter, two Microsoft Premier Field Engineers located in the US]

 

Overview

One of the changes in the November 2017 CU for SharePoint 2013 affects the maintenance of the [dbo].[TimerJobHistory] table in the SharePoint configuration database, which can experience significant growth if it is not trimmed on a regular schedule.

Historically SharePoint 2013 preserved the timerjob history for all timerjobs which ran in the last 7 days. Starting with November 2017 CU for SharePoint 2013 (build 15.0.4981.1002), the maintenance of the [dbo].[TimerJobHistory] table has been updated in a way that it will never keep more than 10,000 job records.

To maintain the [dbo].[TimerJobHistory] table SharePoint only relied on the Delete Job History timer job, which is responsible to purge records older than 7 days. The default schedule is to run once a day but it is possible to change the schedule by updating the frequency in the Central Administration Site. The Delete Job History timer job, invokes the [dbo].[proc_DeleteOldTimerJobHistory] stored procedure which does the actual work.  If this timer job is not enabled or does not complete successfully, the [dbo].[TimerJobHistory] table can grow to hundreds of millions of records and hundreds of Gigabytes, which can impact performance and availability.

With November 2017 CU, an additional stored procedure, [dbo].[proc_DeleteOldTimerJobHistoryByRows], was added, which limits the number of rows in the [dbo].[TimerJobHistory] table table to 10,000 records. Be aware that both of these stored procedures actually maintain three different tables: [dbo].[TimerJobHistory], [dbo].[TimerScheduledJobs] and [dbo].[TimerRunningJobs]. Of these three, the largest table is [dbo].[TimerJobHistory].

The new stored procedure is not called from any specific timerjob - it is executed as the final step of each single timer job running on your system. That means the maintenance of the [dbo].[TimerJobHistory] does no longer rely the successful execution and on proper schedule of a single timerjob.

This is the logic built into the execution of SharePoint timer jobs after November 2017 CU:

  1. Execute the logic of the timer job including the execution of one or more stored procs to complete intended task of the timer job
  2. Execute the [dbo].[proc_AddTimerJobHistory] proc to document the execution of the completed timer job to the [dbo].[TimerJobHistory] table
  3. Execute the [dbo].[proc_DeleteOldTimerJobHistory] proc to limit the number of records in [dbo].[TimerJobHistory] table to a maximum of 10,000 by removing the oldest records.

This new enhancement eliminates the possibility of the SharePoint configuration database becoming bloated.  By imposing a ceiling on the [dbo].[TimerJobHistory] record count, a number of benefits are recognized such as disk space consumption for the database and its backups, I/O activity, backup impact on network throughput as well as maintenance activities (i.e. CHECKDB, Index, etc).

After running November 2017 CU you should never have to worry about [dbo].[TimerJobHistory] to exceed 10,000 rows. And indeed, this will clearly keep this table in check BUT be careful if there was a problem with the maintenance of your [dbo].[TimerJobHistory] table before installing of November 2017 (or later) CU:

 

Potential Problem

After installing November 2017 CU each and every timerjob will be responsible to maintain the size of the [dbo].[TimerJobHistory] table. In case that the [dbo].[TimerJobHistory] had several million rows due to the fact that the Delete Job History timer job was disabled or did not work correctly the new maintenance stored procedure will take a very, very long time to complete. As several SharePoint timerjobs will start and end each second that will cause this stored procedure to run in parallel on your SQL server in multiple different SQL sessions.

In this case you can see your SQL server to run on high CPU on each of the available cores till the [dbo].[TimerJobHistory] has been trimmed to the new maximum size.

To prevent this from happening you should ensure that the the Delete Job History job has been running successfully. If you have a SharePoint Config database that is large in size, check the row count and size of [dbo].[TimerJobHistory] table.

 

SharePoint 2016?

The code change introduced in November 2017 CU for SharePoint 2013 is not included in SharePoint 2016 and is currently not planned to be ported to SharePoint 2016. The reason is that the maintenance of the [dbo].[TimerJobHistory] table in SharePoint 2016 is done differently than in SharePoint 2013. E.g. the Delete Job History job does bulk deletes of 1000 records rows rather than on row after the other and it only keeps records for 3 days compared to 7 days in SharePoint 2013.

But be aware that the proper maintenance of the the [dbo].[TimerJobHistory] table in SharePoint 2016 requires that the Delete Job History job runs successful on a regular basis.

Comments (6)

Cancel reply

  1. Alan Hu says:

    Hooray! Another thing should also be addressed is the number of rows in table TimerScheduledJobs and TimerRunningJobs, in some extreme cases they can be huge too.

    For instance, in one SP2013 farm we have 3000+ site collections and RecordPoint creates 3 timer jobs for each site collection, two of these timer jobs are scheduled to run every minute. It has a huge impact on the farm performance and also cause large SQL transaction logs piling up.

    1. Hi Alan,
      feel free to open a support case to request a Design Change Request for this or use UserVoice to request such a change:
      https://sharepoint.uservoice.com/
      Cheers,
      Stefan

  2. PoWaShell says:

    Hi Stefan,

    I have deploy CU February 2018, which includes the TimerJobHistory table maintenance changes and I have noticed the 10000 record limit doesn’t keep much in terms of timeframe (just a few hours worth of jobs in my case as we have 5 web application running in the farm). If timer jobs fail, it becomes difficult to troubleshoot. Any recommendations for flagging issues with jobs, other than monitoring ULS?

    Thanks in advance,

    Paul

    1. Hi Paul,
      the things you noticed have been reported by several customers and that was the reason that the design was changed again in April CU:
      https://blogs.technet.microsoft.com/stefan_gossner/2018/04/12/changes-in-the-timerjobhistory-table-maintenance-introduced-in-april-2018-cu-for-sharepoint-2013/
      The recommendation is to upgrade to April 2018 CU to get the updated functionality.
      Cheers,
      Stefan

      1. PoWaShell says:

        OK! Thanks for the heads-up!

        Pau;

      2. AndySteiner says:

        Hi Stefan
        We’re currently facing a huge issue on our productive environment (12 SP servers, 30’000 users) after patching to CU March 18 yesterday evening. PSConfig ran on 4 of 12 servers yet and took hours each time. The configDB timerjobhistory table still contains more than 70mio items, decreasing verly slowely. SQL server is under heavy load and we fear that the system goes down completely.
        Is there any “quick fix” avilable on this issue? Can we temporarely stop timer services or other jobs on the sharepoint servers to overcome the issue?
        Advice is much apprechiated.
        Best regards
        Andy

Skip to main content