High CPU utilization can be troubleshot from a couple of different standpoints. In the first instance, you might have a single process consuming CPU, essentially a runaway process. But what if there is no runaway process and CPU utilization still runs high, sometimes at 100% for an extended period of time? In that case you may be experiencing a processor bottleneck instead. Processor bottlenecks occur when the processor is so busy that it cannot respond to requests for a significant period of time.
The major indicators of a processor bottleneck are identifiable using Performance Monitor. There are two main counters to be aware of:
| Counter Name | Indicator | Values to Consider |
|---|---|---|
| Processor (_Total) \ % Processor Time | Processor utilization | Sustained values > 90% on a single-processor machine, or > 80% on a multiprocessor machine, should be investigated. |
| System \ Processor Queue Length | Current depth of the thread scheduler Ready queue | If the number of Ready threads per processor is > 2 with some frequency, this may indicate a processor bottleneck. |
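As a rough illustration, the thresholds in the table can be expressed as a small check. This is a minimal sketch, not a definitive tool: the names `CounterSample` and `check_sample` are made up for this example, and the counter values are assumed to have been collected already (for instance, exported from Performance Monitor). Because System \ Processor Queue Length reports one queue for the whole machine, the sketch divides it by the processor count to get Ready threads per processor.

```python
from dataclasses import dataclass


@dataclass
class CounterSample:
    """One reading of the two counters (hypothetical container for this sketch)."""
    processor_time_pct: float  # Processor (_Total) \ % Processor Time
    queue_length: int          # System \ Processor Queue Length


def utilization_threshold(processor_count: int) -> float:
    """Return the utilization threshold from the table: 90% single, 80% multi."""
    return 90.0 if processor_count == 1 else 80.0


def check_sample(sample: CounterSample, processor_count: int) -> dict:
    """Flag a sample against the table's thresholds."""
    if processor_count < 1:
        raise ValueError("processor_count must be at least 1")
    # The queue length is system-wide, so normalize it per processor.
    ready_per_processor = sample.queue_length / processor_count
    return {
        "high_cpu": sample.processor_time_pct > utilization_threshold(processor_count),
        "long_queue": ready_per_processor > 2,
        "ready_per_processor": ready_per_processor,
    }


if __name__ == "__main__":
    # Example: a 4-processor server at 85% utilization with 12 Ready threads.
    print(check_sample(CounterSample(85.0, 12), processor_count=4))
```

A single flagged sample is only a hint; the table asks for sustained utilization and a queue that is long "with some frequency", so a check like this is best applied across many samples.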
Interpretation of these two counters should be performed with the following considerations:
- Processor state is sampled hundreds of times per second, and the results are accumulated over the interval for which the data is gathered. If that interval is short, the data may be skewed.
- Applications capable of absorbing any free processor cycles (for example, screen savers or web page animation controls) should be excluded from the analysis, as they can also skew the data.
- Processor Queue Length is a point-in-time value based on the last sample of the processor state, not an average over the interval.
- Most threads are in a voluntary Wait state for much of the time. The remaining threads that are actively trying to run (a subset of the total number of existing threads) form the upper limit on the number of threads you can see in the processor queue.
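Because a single reading can mislead, it helps to look at a longer run of samples. The sketch below is illustrative only: the function name `summarize_window` and the sample lists are assumptions for this example. It averages % Processor Time over the window and reports how often the per-processor queue depth exceeded 2. What counts as "with some frequency" is left to the reader, since no cutoff is defined here.

```python
from statistics import mean


def summarize_window(cpu_pct_samples, queue_samples, processor_count,
                     queue_limit_per_processor=2.0):
    """Summarize a window of counter samples.

    Returns (average % Processor Time, fraction of queue samples whose
    per-processor depth exceeded the limit).
    """
    if not cpu_pct_samples or not queue_samples:
        raise ValueError("both sample lists must be non-empty")
    if processor_count < 1:
        raise ValueError("processor_count must be at least 1")

    average_cpu = mean(cpu_pct_samples)
    # Each queue reading is a point-in-time value, so count how many
    # readings were over the limit rather than averaging them away.
    over_limit = sum(
        1 for q in queue_samples
        if q / processor_count > queue_limit_per_processor
    )
    return average_cpu, over_limit / len(queue_samples)


if __name__ == "__main__":
    cpu = [90.0, 100.0, 80.0, 70.0]
    queue = [1, 6, 5, 0]
    avg, fraction = summarize_window(cpu, queue, processor_count=2)
    print(f"average CPU: {avg:.1f}%, queue over limit in {fraction:.0%} of samples")
```

Counting how often the queue is over the limit, instead of averaging the queue length, keeps occasional spikes from hiding behind long idle stretches.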
A combination of high processor utilization and a lengthy processor queue may be an indication that the processor is overloaded. Remember, however, that a misbehaving application can cause the same symptoms. For example, an application thread caught in an infinite loop can drive processor utilization to 100% (although you would most likely catch this in Task Manager). Sudden spikes can also make it appear that you have a processor bottleneck when you do not. Having a performance baseline for the server is key to understanding and detecting abnormal conditions and taking corrective action in a timely fashion.
Any normal workload that drives the CPU to 100% utilization on a consistent basis for an extended period of time should be investigated. Even if the queue of Ready threads is short, there is obviously some pressure on the CPU. A faster processor would provide relief. If the workload is multi-threaded, moving from uniprocessor to multiprocessor may offer additional relief. Dividing the workload among a cluster of machines may be another option.
Lastly, consider whether the workload can be shifted to a time when less work is being done on the machine, which may help to ease the pressure on the CPU. A common example is backups that run during business hours. These operations can cause severe I/O and network pressure as well as CPU utilization, so moving them to evenings or weekends helps to reduce the load during business hours.
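To make the baseline idea concrete, here is one simple way to compare current utilization against a baseline. This is a sketch under stated assumptions, not a recommended monitoring rule: the function name `exceeds_baseline` and the cutoff of three standard deviations are choices made for this example, not anything prescribed by Performance Monitor.

```python
from statistics import mean, stdev


def exceeds_baseline(baseline_pct, current_pct, k=3.0):
    """Return True if the current average utilization is more than k
    standard deviations above the baseline average.

    k=3.0 is an arbitrary cutoff chosen for this example.
    """
    if len(baseline_pct) < 2:
        raise ValueError("need at least two baseline samples")
    if not current_pct:
        raise ValueError("need at least one current sample")
    limit = mean(baseline_pct) + k * stdev(baseline_pct)
    return mean(current_pct) > limit


if __name__ == "__main__":
    # Hypothetical % Processor Time readings from a normal business day.
    baseline = [30.0, 35.0, 40.0, 35.0, 30.0, 40.0]
    print(exceeds_baseline(baseline, [95.0, 98.0, 100.0]))  # far above baseline
    print(exceeds_baseline(baseline, [36.0, 38.0, 41.0]))   # within normal range
```

The baseline should come from the same server under a comparable workload; comparing a backup window against a quiet weekend baseline would flag perfectly normal behaviour.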
That brings us to the end of this post. In our next post, we’ll cover the different ways in which a processor executes instructions. Until next time …
- Technet Article on Processor Bottlenecks (mainly geared towards Exchange but nevertheless a good read!)