In some cases CPU demo described in the previous post may behave not as expected.
One such case is doing the demo on multiple CPUs. Even though it appears as Resource Governor is “not working”, it is “by design”. Let’s take a look at what happens in this case.
What is our promise about CPU bandwidth distribution?
When you specify percentage on the pool we will ensure average CPU bandwidth distribution on per scheduler basis for CPU bound workloads. Note, emphasis: average, per scheduler, CPU bound. What does this mean?
For example, you have a 2 CPU machine, and SQL Server runs on both of the CPUs. If you haven’t set affinity mask as was indicated in the Setup step of the previous post,
will get you something like:
Important item to note here is that we have 2 online schedulers to service normal user requests (scheduler_id < 255). You can compare it with the case when affinity mask is set to 1. You will still have 2 schedulers (with scheduler_id < 255), but one of them will be offline.
Now, for simplicity, let’s configure 2 groups and 2 pools as follows:
And corresponding classifier function as
Create logins, assign function and make changes effective
Add group counters for CPU Usage % and run 1 CPU intensive query UserSlow, then 1 with UserFast. You will observe a picture similar to this:
But how can explain that the resource governor does not obey 70:30 distribution we specified for the resource pools? The answer is quite simple: the queries landed on different schedulers (= different CPUs) and thus, there is nothing to share between them and each of them is taking a full CPU. How can you verify that?
First of all, verify the classification
Where you will see something like
Everything is normal here. Now, let’s take a look at how currently executing requests landed on schedulers:
Which will get
What you can see here is that gFast and gSlow is working on different schedulers, thus confirming our earlier statement. Thus Red and Blue line corresponds to utilization of 1 full CPU or 50%, since it is normalized to number of CPUs available on the box.
Is this a problem with Resource Governor in general? Certainly not if you have more executing requests than you have CPUs or if queries are running in parallel. But you should keep this in mind if you are trying to give 90% of CPU bandwidth to a single query running serially compared to a parallel query or number of serial queries in another group/pool.
A simple way of thinking about this is: is my workload capable of taking as much CPU as I told it to? Remember, when you set MAX_CPU_PERCENT on the pool, you allowing workloads belonging to the pool to take this much bandwidth on each scheduler. In case of the demo, you are working with a single query workload which produces this initially unexpected result.