Summary: Guest Blogger Kevin Kline talks about using Windows PowerShell to monitor SQL Server performance.
Hey, Scripting Guy! I am wondering about using Windows PowerShell to work with SQL performance counters. Is this something that can be done?
Microsoft Scripting Guy Ed Wilson here. We have a new guest blogger today. Kevin Kline is the technical strategy manager for SQL Server Solutions at Quest Software, a leading provider of award-winning tools for database management and application monitoring.
Kevin is a founding board member and former president of the international Professional Association for SQL Server (PASS) and frequently contributes to database technology magazines, websites, and discussion forums. Kevin was the recipient of the PASS 2009 Lifetime Achievement Award.
Tuning SQL Server: Windows PowerShell + PerfMon <> 2P. It’s = P2!
Performance tuning is one of the most rewarding and interesting parts of the IT pro’s job. Don’t like performance tuning? Then you’re missing out on an activity that provides endless variety, opportunity for innovation, and a chance to show the boss that you rock.
So what’s keeping you from jumping into performance tuning? Whenever I talk to customers—in every environment from small four-person IT shops to huge Fortune 50 enterprises—about why they don’t spend more time doing performance tuning, it usually comes down to two obstacles:
- The tyranny of the urgent: Boiled down to its essence, IT pros have too many fires to put out to actually get on top of performance. The analogy I like to use is, “When you barely have enough time to bail the water out of your sail boat to keep it from sinking, you certainly won’t have time to adjust the sails for better speed.”
- Skills: Or as my buddy Jermaine says “mad skilz.” Simply put, IT pros have so much to do and so little time to do it that they don’t have time to learn how to do one more thing. Our brains are full and learning something new not only requires a big investment in brain power, it will very likely lead to more responsibility without removing any of our other responsibilities. So not only do we not have the time to learn performance tuning, we don’t have the time for all of the follow-up responsibilities we will have to commit to in the future. The analogy I like to use is, “Ignorance is bliss, or at least less painful than not being ignorant.”
But there’s a tradeoff that PowerShell plus Performance Monitor (PerfMon, also known as System Monitor and, most recently, Windows Reliability and Performance Monitor) offers you to mitigate the obstacles mentioned above.
First, and in answer to the tyranny of the urgent, when you get a handle on performance tuning, urgent issues crop up less often—a lot less often. When you don’t know how to collect and interpret performance issues, every snag seems like an emergency. But many times, those red flags turn out to be red herrings instead. Instead of going into emergency firefighter mode every now and then, you’re forced to put on your fire hat every time the phone rings. Too much stress! Second, and in response to acquiring mad skilz, Scripting Guy Ed Wilson and I have teamed up to make performance tuning a “no-skilz required” activity. Why learn a whole new set of skills when you can learn just 20 percent of the content but get 80 percent of the productivity boost of being an expert? Invoke the power of the 80/20 rule by reusing these lessons freely.
What’s our objective?
We’re going to teach you how to:
- Perform a quick health check of SQL Server.
- Find the best performance counters for SQL Server performance tuning.
- Develop robust monitoring solutions.
Once you’ve followed our guidance here, performance tuning on SQL Server is no longer a black art. That’s because you’ll be able to shine the bright light of good instrumentation on your Windows and SQL Server instances.
Step 1: Perform a quick health check of SQL Server
Knowing which PerfMon counters to keep an eye on is half the battle. Maybe even more than half the battle. I’m going to start off by teaching you the handful of PerfMon counters, as well as their usage and values, that enable you to perform a quick health check of Windows and SQL Server.
A little later in the article, I’ll also show you where you can get a huge best practices collection for SQL Server PerfMon counters if you’re interested in looking at more than a handful of counters or are ready to go to the next step of hardcore performance tuning.
As a practical recommendation from a long-time SQL Server expert, SQL Server usually encounters problems in IO, memory, and then CPU (in order of commonality). The reason that SQL Server consumes an inordinate amount of IO, memory, and CPU is usually because of poor-quality SQL code, poor database design, or inadequate hardware (in order of commonality). Notice that poor code and poor design are usually the problem. So that means that if you usually fix performance issues by throwing more hardware at the problem, you’ll eventually end where you started—with an underperforming application. This is because poor code and poor design will always use more resources than they have available.
So here’s your quick list of PerfMon counters to determine SQL Server health relating to IO, memory, and CPU. These three sections are analogous to every visit to the medical clinic that starts with pulse, blood pressure, and temperature checks. If one of these is out of whack, something is definitely wrong. We haven’t necessarily diagnosed an illness, but we know there’s definitely a problem.
When assessing the top level of IO health, latency is your best quick health check. Latency means the amount of time measured between the initiation of an IO operation and its completion. Though these numbers can vary widely due to all sorts of variations in the underlying storage systems, here’s a good place to start:
Physical Disk: Average Disk/sec Reads
Physical Disk: Average Disk/sec Write
The lower these values, the better. Microsoft recommends that a well-tuned IO subsystem should deliver IOs at 5 milliseconds or below on the disks holding the transaction log files and at 20 milliseconds or less on the disks holding the data files. (The transaction logs and database files are on separate disks, right?) In real world systems, I’ve seen applications perform well when reads and writes average a bit higher than their white paper recommended values, but these are great rules of thumb. Numbers a lot higher than the recommendations means that the IO subsystem is under stress.
A lot of IT pros new to SQL Server get quite alarmed when they see that SQL Server is gobbling up all of a server’s available memory. Not to worry—that’s by design. By default, SQL Server is configured to grab all of a server’s available physical memory (even if it doesn’t immediately use it all), and then give it back to other Windows processes whenever they ask for it. That way, SQL Server can optimize large blocks of memory for queries and major processes, such as big reporting jobs and backup processes.
When doing a quick check of SQL Server memory, I like to corroborate the findings by checking more than one indicator. In this case, these two counters provide an excellent quick indicator of memory pressure inside SQL Server:
SQL Server: Memory Manager >> Free List Stalls/sec
This counter monitors the number of requests per second where data requests stall because no buffers are available. Any value greater than 2 indicates that SQL Server needs more memory:
SQL Server: Memory Manager >> Memory Grants Pending
This counter shows the total number of processes per second waiting for a workspace memory grant. Numbers greater than 0 indicate a lack of memory.
It’s not very common for SQL Server, when running OLTP workloads, to use up major amounts of CPU. Even a busy SQL Server will usually use between 25-45 percent of CPU when responding to a transaction heavy workload. SQL Server will use more CPU for BI applications, but even then it usually consumes an added 20-30 percent of CPU. To find out the total amount of CPU being used and the proportion of that being used by SQL Server, check these counters:
Processor(_total): % Processor Time
This is the percentage of elapsed time the processor spends executing work (in other words, nonidle threads). On a box dedicated to SQL Server, I raise a red flag if this is frequently above 80 percent:
Process (sqlservr): % Processor Time
This shows the percentage of processor time spent exclusively on SQL Server process threads. Combining this value with the Processor:% Processor Time value will show you conclusively the overall CPU utilization of the server overall compared to how much SQL Server is using.
As I pointed out earlier in the memory section, it’s sometimes very useful to corroborate a finding with overmeasures to ensure you have a clear picture of the issue. Any quick and easy check of CPU pressure to add to the mix follows:
System: Processor Queue Length
It’s sometimes a bit of hassle trying to figure out CPU utilization based on percentages when you factor in multiple cores, hyperthreading, and virtualization. So it’s often valuable not to check a metric based on percentages, but on the raw number of threads waiting for access to that resource. In this case, the processor queue length is a great resource because it represents the number of threads waiting for CPU: anything about 12 is a red flag; values between 9 to 12 per CPU are good or fair; 5–8 is better; and 4 or less is best.
Step 2: Find the best performance counters for SQL Server performance tuning
In the previous section, I’ve pointed out a couple PerfMon counters that quickly assess the health of your SQL Server’s IO, memory, and CPU. These quick checks basically show whether your SQL Server is overextended or not. But they don’t reveal the root cause that is contributing to overconsumption of IO, memory, or CPU. For example, you might have memory issues on a SQL Server because stored procedures running on the server are constantly recompiling and are never able to stay in the cache for long. Similarly, a SQL Server might by showing high IO consumption when, at the root of the problem, the transactions running on the server are constantly blocking each other and preventing each other from completing quickly. There are PerfMon counters for that, too.
I’ve spent a lot of time building the ultimate list of SQL Server PerfMon counters and the troubleshooting scenarios when they’re most useful. Rather than run through all of them here, please take a look at my website to take a deeper dive.
Step 3: Develop robust monitoring solutions
Now that you know what to look for, it’s time to use Windows PowerShell to regularly poll your SQL Servers for these PerfMon values. I strongly encourage you to use Windows PowerShell not just for an occasional spot check of these values but to build a regular monitoring solution that runs at a rather frequent polling interval—say, 5 or 15 minutes, saving the data as you go along. That way you can save long-term performance information and look for trends and problem areas in your system.
To get PerfMon information about using Windows PowerShell, we’ll use the Get-Counter cmdlet. Without parameters, this cmdlet retrieves a handy set of summary information about the server. However, we want to retrieve performance information from specific computers. We do that by providing the name of the server along with the path of the counter, as I’ve shown earlier, using the –Computer parameter:
PS > $computer = $ENV:Computername
PS > Get-Counter "\$computerprocessor(_total)% processor time"
6/11/2011 11:16:44 AM \...processor(_total)% processor time :
Don’t know the path to the performance counter you want? Then use the -ListSet parameter to search for just the right counter or set of counters. To see all counter sets, use an asterisk as the parameter value.
When building your database of monitored values, export the retrieved counter information using the Export-Counter cmdlet to save your data in a format that other tools can read, such as the .blg format used by Windows Performance Monitor. That way you can use the nice graphic tools in Windows Performance Monitor for easy graphic analysis of performance and problems.
Thanks, Kevin, that was exactly what I was looking for.
I invite you to follow me on Twitter and Facebook. If you have any questions, send email to me at firstname.lastname@example.org, or post your questions on the Official Scripting Guys Forum. See you tomorrow. Until then, peace.
Ed Wilson, Microsoft Scripting Guy