ConfigMgr: Troubleshooting High CPU utilization on the SQL server and Parallelism


 

Hello Folks, This post is about troubleshooting CPU utilization of SQL Server hosting the ConfigMgr database. It talks about the general queries to find out the causes and talks specifically about parallelism.

 

To check which query is taking the Max CPU time run the following query

 
 
SELECT TOP 50
 
ObjectName = 
OBJECT_SCHEMA_NAME(qt.objectid,dbid) + '.' + OBJECT_NAME(qt.objectid, 
qt.dbid)
 
,TextData = 
qt.text
 
,DiskReads = 
qs.total_physical_reads -- The worst reads, disk reads
 
,MemoryReads = 
qs.total_logical_reads --Logical Reads are memory 
reads
 
,Executions = 
qs.execution_count
 
,TotalCPUTime = 
qs.total_worker_time
 ,AverageCPUTime = 
qs.total_worker_time/qs.execution_count
 ,DiskWaitAndCPUTime = 
qs.total_elapsed_time
 
,MemoryWrites = 
qs.max_logical_writes
 
,DateCached = 
qs.creation_time
 
,DatabaseName = DB_Name(qt.dbid)
 ,LastExecutionTime = 
qs.last_execution_time
 FROM sys.dm_exec_query_stats AS qs
 CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS qt
 ORDER BY qs.total_worker_time DESC

It’s also good to look at the tasks which are having the maximum wait time. You can do this by running the following:

 

select top 10 * from sys.dm_os_wait_stats order by wait_time_ms desc

 

In my case this returned a very high value for the wait_type CXPACKET.

 

CXPACKET/Parallelism Explanation:

When a parallel operation is created for SQL Query, there are multiple threads for a single query. Each query deals with a different set of the data (or rows). Due to some reasons, one or more of the threads lag behind, creating the CXPACKET Wait Stat. There is an organizer/coordinator thread (thread 0), which takes waits for all the threads to complete and gathers result together to present on the client’s side. The organizer thread has to wait for the all the threads to finish before it can move ahead. The Wait by this organizer thread for slow threads to complete is called CXPACKET wait.

Note that not all the CXPACKET wait types are bad. You might experience a case when it totally makes sense. There might also be cases when this is unavoidable. If you remove this particular wait type for any query, then that query may run slower because the parallel operations are disabled for the query.

Parallelism: SQL Server provides parallel queries to optimize query execution and index operations for computers that have more than one microprocessor (CPU). Because SQL Server can perform a query or index operation in parallel by using several operating system threads, the operation can be completed quickly and efficiently.

For more info on parallelism: http://msdn.microsoft.com/en-us/library/ms178065(v=sql.105).aspx

When is Parallelism required?

We cannot discuss reducing the CXPACKET wait without talking about the server workload type.

OLTP: On Pure OLTP system, where the transactions are smaller and queries are not long but very quick usually, set the “Maximum Degree of Parallelism” to 1 (one). This way it makes sure that the query never goes for parallelism and does not incur more engine overhead.

 
EXEC sys.sp_configure N'max degree of parallelism', 
N'1'
GO
RECONFIGURE WITH OVERRIDE
GO

Data-warehousing / Reporting server: As queries will be running for long time, it is advised to set the “Maximum Degree of Parallelism” to 0 (zero). This way most of the queries will utilize the parallel processor, and long running queries get a boost in their performance due to multiple processors.

 
EXEC sys.sp_configure N'max degree of parallelism', 
N'0'
GO
RECONFIGURE WITH OVERRIDE
GO

ConfigMgr DB operations are normal OLTP queries so it’s good to go with less parallelism. In my case it had a value of zero which means the queries will utilize the available processors (in my case 16 core).

The moment we reduced the value to 1 we saw the drop in the performance and the CXPACKET was no longer the wait_type with the maximum wait.

 

 

Hope this helps !!

Umair Khan

Support Escalation Engineer | Microsoft System Center ConfigMgr 

Disclaimer:
This posting is provided "AS IS" with no warranties and confers no rights.

Comments (7)

  1. Scott, you are correct here. For CONFIGMGR from experience I think we should not be greatly affected by this but I am aware of the specifics on a few places where we do need Parallelism and that improves perf. Yes this would require some monitoring and
    caution.

  2. Bharat Chand says:

    Nice explanation Umair

  3. Michael Ray says:

    When you say "The moment we reduced the value to 1 we saw the drop in the performance and the CXPACKET was no longer the wait_type with the maximum wait." do you mean that performance improved after setting the value to 1?

  4. Scott says:

    I am sorry but recommending setting an OLTP MAX DOP setting to 1 is very dangerous. Both Brent Ozar and Paul Randall both recommend you NOT to do this, (or take extreme caution) You are better off altering the Cost Threshold For Parallelism in the first
    instance.

  5. Marcos Liebich says:

    Is there an guidance for MDOP? I have ConfigMgr DB having a lot of deadlocks now, it was set to 0. I used a MDOP calculator that I found that recomended to change it to 8 but I still see deadlocks.

  6. Supriyo says:

    The first and the most common step if you suspect high CPU utilization (or are alerted for it) is to login to the physical server and check the Windows Task Manager. The Performance tab will show the high utilization as shown below:

    Next, we need to determine which process is responsible for the high CPU consumption. The Processes tab of the Task Manager will show this information:

    For this types of situation we have to remember that CPU consume time in two modes as

    1) Kernal Mode
    2) User Mode

    These two mode can be seen by "Performance Monitor" by monitoring "%Privilege Time" and "%User Time" counter.

    Remember that "%Privileged time" is not based on 100%.It is based on number of processors.If you see 200 for sqlserver.exe and the system has 8 CPU then CPU consumed by sqlserver.exe is 200 out of 800 (only 25%).If "% Privileged Time" value is more than 30%
    then it's generally caused by faulty drivers or anti-virus software. In such situations make sure the BIOS and filter drives are up to date and then try disabling the anti-virus software temporarily to see the change.

    If "% User Time" is high then there is something consuming of SQL Server.
    There are several known patterns which can be caused high CPU for processes running in SQL Server including

    Some of the most common causes for High CPU in SQL Server are..........

    For more details click here...

    http://sqlserverstudymaterials.blogspot.in/2016/03/troubleshooting-high-cpu-utilization.html

  7. Nicole Garris says:

    Thanks for this blog, it helped solve our problem. We were having a problem with high CPU on our SQL Server 2014 for SCCM 2012. For about 25 minutes every hour, CPU usage would go up to around 80%, with extremely high CXPacket waits. We resolved the issue
    by resetting SQL Server's Cost Threshold for Optimization from 5 to 200. Now CPU goes up to less than 30% every hour, for about 5 minutes. I'm still tweaking the cost threshold to determine the optimum setting for our current environment.

    Our SQL Server is assigned only 4 CPUs (its a virtual machine).

    Perhaps this should be a recommendation for SQL Server configuration for SCCM. It is well known that the default SQL Server setting of 5 is too low, but it hasn't caused a problem for any of our other databases/applications.

    Here is the SQL I ran to change the cost threshold:

    EXEC sp_configure 'show advanced options', 1 ;
    GO
    RECONFIGURE
    GO
    EXEC sp_configure 'cost threshold for parallelism', 200 ;
    GO
    RECONFIGURE
    GO

Skip to main content