Part 2: Resource Governor CPU Demo on multiple CPUs


In some cases CPU demo described in the previous post may behave not as expected.


One such case is doing the demo on multiple CPUs. Even though it appears as Resource Governor is “not working”, it is “by design”. Let’s take a look at what happens in this case.


What is our promise about CPU bandwidth distribution?


When you specify percentage on the pool we will ensure average CPU bandwidth distribution on per scheduler basis for CPU bound workloads. Note, emphasis: average, per scheduler, CPU bound. What does this mean?


For example, you have a 2 CPU machine, and SQL Server runs on both of the CPUs. If you haven’t set affinity mask as was indicated in the Setup step of the previous post,



select scheduler_id, cpu_id, status, is_online from sys.dm_os_schedulers


will get you something like:



scheduler_id cpu_id      status                                                       is_online


———— ———– ———————————————————— ———


0            0           VISIBLE ONLINE                                               1


1            1           VISIBLE ONLINE                                               1


257          0           HIDDEN ONLINE                                                1


255          0           VISIBLE ONLINE (DAC)                                         1


Important item to note here is that we have 2 online schedulers to service normal user requests (scheduler_id < 255). You can compare it with the case when affinity mask is set to 1. You will still have 2 schedulers (with scheduler_id < 255), but one of them will be offline.


Now, for simplicity, let’s configure 2 groups and 2 pools as follows:



CREATE RESOURCE POOL pSlow


WITH (MAX_CPU_PERCENT = 30)


 


CREATE RESOURCE POOL pFast


WITH (MAX_CPU_PERCENT = 70)


 


CREATE WORKLOAD GROUP gSlow


USING pSlow


 


CREATE WORKLOAD GROUP gFast


USING pFast


GO


And corresponding classifier function as



CREATE FUNCTION f1()


RETURNS SYSNAME WITH SCHEMABINDING


BEGIN


      DECLARE @val sysname


      if ‘UserSlow’ = SUSER_SNAME()


            SET @val = ‘gSlow’;


      else if ‘UserFast’ = SUSER_SNAME()


            SET @val = ‘gFast’;


      return @val;


END


GO


Create logins, assign function and make changes effective 



CREATE LOGIN UserFast WITH PASSWORD = ‘UserFastPwd’, CHECK_POLICY = OFF


CREATE LOGIN UserSlow WITH PASSWORD = ‘UserSlowPwd’, CHECK_POLICY = OFF


GO


 


ALTER RESOURCE GOVERNOR WITH (CLASSIFIER_FUNCTION = dbo.f1)


GO


 


ALTER RESOURCE GOVERNOR RECONFIGURE


Add group counters for CPU Usage % and run 1 CPU intensive query UserSlow, then 1 with UserFast. You will observe a picture similar to this:



But how can explain that the resource governor does not obey 70:30 distribution we specified for the resource pools? The answer is quite simple: the queries landed on different schedulers (= different CPUs) and thus, there is nothing to share between them and each of them is taking a full CPU. How can you verify that?


First of all, verify the classification 



— verify the classification


select


      sess.session_id, sess.group_id, CONVERT(NCHAR(20), grps.name) as group_name


from


      sys.dm_exec_sessions as sess join


      sys.dm_resource_governor_workload_groups as grps on sess.group_id = grps.group_id


where


      session_id > 50


GO


Where you will see something like



session_id group_id    group_name


———- ———– ——————–


52         2           default            


51         265         gSlow              


54         266         gFast              


Everything is normal here. Now, let’s take a look at how currently executing requests landed on schedulers:



select


      r.session_id,


      CONVERT(NCHAR(20), wg.name) as group_name,


      t.scheduler_id,


      r.status


from sys.dm_exec_requests r


      join sys.dm_os_tasks t on r.task_address = t.task_address


      join sys.dm_resource_governor_workload_groups wg on r.group_id = wg.group_id


where


      r.session_id > 50


GO


Which will get



session_id group_name           scheduler_id status


———- ——————– ———— ——————————


54         gFast                0            runnable


52         default              0            running


51         gSlow                1            running


What you can see here is that gFast and gSlow is working on different schedulers, thus confirming our earlier statement. Thus Red and Blue line corresponds to utilization of 1 full CPU or 50%, since it is normalized to number of CPUs available on the box.


Is this a problem with Resource Governor in general? Certainly not if you have more executing requests than you have CPUs or if queries are running in parallel. But you should keep this in mind if you are trying to give 90% of CPU bandwidth to a single query running serially compared to a parallel query or number of serial queries in another group/pool.


A simple way of thinking about this is: is my workload capable of taking as much CPU as I told it to? Remember, when you set MAX_CPU_PERCENT on the pool, you allowing workloads belonging to the pool to take this much bandwidth on each scheduler. In case of the demo, you are working with a single query workload which produces this initially unexpected result.  

Comments (2)

  1. Anonymous says:

    If you are looking for some demos to play and learn about Resource Governor, you should have a look to

  2. Dinesh Vishe says:

    * Manage a single database spread over 200+ cpu cores?….