The Case of the Mysterious Black Box

11/18/2009

I haven’t had any performance analysis challenges lately, but there is a lot of confusion about how to measure SAN performance. To many, a SAN is a proverbial “mysterious black box” that perplexes anyone who tries to measure its performance with any certainty. This blog entry covers how I measure the performance of SANs and tries to unlock the mysteries of the black box.

SAN hardware and configurations vary greatly between our customers and SAN vendors. Some carve out disk arrays that are striped across only a few spindles, while others stripe the entire disk array across all of the spindles.

Before I go too deep, let’s get some terminology out of the way. This is *my* interpretation of these terms and may differ from the industry.

Disk Terminology

The following terms are used frequently in the disk industry.

Spindle: A physical hard disk. The *real* disk installed in the hardware.

LUN (Logical Unit Number): This is the physical disk as it is presented to the Windows operating system. Windows thinks it is a single spindle, but it could be many spindles masked by the hardware.

Physical Disk: The same as a LUN as far as Windows is concerned.

Logical Disk: A drive letter or mount point mapped to a physical disk.

SANs Consolidate and Maximize Your Disk Investment, but…

SANs are popular because they consolidate spindles, allowing you to maximize your investment. Imagine you have a large file server, a large database, and a few other high-end servers, all with high-end disk arrays dedicated to them. In many cases, one or more of the large, dedicated disk arrays is underutilized while another is overutilized. The SAN consolidates all of the spindles into a single location and spreads the load across all of them to maximize utilization. While this is a great way to spread the load, it’s a gamble with performance.

SANs Gamble with Disk Performance

Since SANs in many cases spread the load across all of the spindles, they are gambling that no single spindle will be contended for any length of time. If that does happen, the SAN tries to compensate with cache and other technologies, and it does a great job of handling this. Unfortunately, SANs allow the SAN administrator to hand out *many* LUNs from the same disk array. This ends up optimizing for disk capacity rather than disk performance. In many cases that I have witnessed myself, I have found so much contention that the LUN performance is reduced to the speed of a floppy drive, with response times of about 900ms!

Disk Analysis Basics

When it comes to analyzing the performance of any disk in Windows, our best approach is the simplest one: monitor the disks for response times. Specifically, monitor “\LogicalDisk(*)\Avg. Disk sec/Read” and “\LogicalDisk(*)\Avg. Disk sec/Write” for values greater than 15ms (0.015) on average. Even 5400rpm drives are able to respond within 15ms with no cache in front of them. Spindles typically come with a specification showing their seek times. Effectively, if the response times are greater than the seek times, then the spindle is likely falling behind. With that said, there might be a more advanced reason for the high response times. For example, what if the bytes per I/O is 2MB? An I/O that large takes longer to transfer, which would explain the long response times.
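To make the 2MB point concrete, here is a rough back-of-the-envelope sketch. The 100 MB/s sustained throughput figure is purely an assumption for illustration, not a measurement of any particular disk or SAN, and the function name is hypothetical. It ignores seek time and queueing and only estimates the time spent moving the data.

```python
def transfer_time_ms(bytes_per_io: float, throughput_mb_per_sec: float) -> float:
    """Estimate the time (in ms) to move one I/O's payload.

    Ignores seek time, rotational latency, and queueing; it only divides
    the payload size by an assumed sustained throughput.
    """
    bytes_per_sec = throughput_mb_per_sec * 1024 * 1024
    return bytes_per_io / bytes_per_sec * 1000.0


# Assumed throughput of 100 MB/s (illustrative only).
ASSUMED_MB_PER_SEC = 100.0

small_io = transfer_time_ms(16 * 1024, ASSUMED_MB_PER_SEC)         # 16K I/O
large_io = transfer_time_ms(2 * 1024 * 1024, ASSUMED_MB_PER_SEC)   # 2MB I/O

print(f"16K I/O: {small_io:.2f} ms of transfer time")
print(f"2MB I/O: {large_io:.2f} ms of transfer time")
```

Under that assumption, the 2MB I/O spends roughly 20ms on transfer alone, which is already past the 15ms threshold before any seek time is counted. That is why it is worth checking the bytes per I/O before blaming the disk.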

My PAL tool checks for this and throws warning alerts for response times greater than 15ms and critical alerts for response times greater than 25ms.
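The thresholds above can be sketched as a small classifier. This is not PAL's actual code, just a minimal illustration of the 15ms and 25ms cut-offs. The function and constant names are hypothetical, and the input is assumed to be an average counter value in seconds, which is how the sec/Read and sec/Write counters report it.

```python
WARNING_SEC = 0.015   # 15ms, the warning threshold from the text
CRITICAL_SEC = 0.025  # 25ms, the critical threshold from the text


def classify_response_time(avg_sec: float) -> str:
    """Classify an average disk response time given in seconds."""
    if avg_sec > CRITICAL_SEC:
        return "critical"
    if avg_sec > WARNING_SEC:
        return "warning"
    return "ok"


# Example values: a healthy disk, a slow disk, and the 900ms case.
for value in (0.008, 0.020, 0.900):
    print(f"{value * 1000:.0f} ms -> {classify_response_time(value)}")
```

Note that a value of exactly 15ms is treated as acceptable here, since the text only flags values greater than the threshold.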

Response times measured by “\LogicalDisk(*)\Avg. Disk sec/Read” and “\LogicalDisk(*)\Avg. Disk sec/Write” are our primary and most authoritative indicator of disk performance. With that said, they can’t tell us the entire story. When you see high disk response times, don’t go running to the SAN administrator(s) with guns blazing. Work with them on their terms by providing details of your research. This shows that you are trying to help and that you understand their needs.

The Needs of the SAN Administrator

SAN administrators like to have the following details because these are metrics that they can typically do something about. Try to gather this information over a long period of time and have a chart or graph prepared when presenting your findings.

· Disk read and write response times, which you can get from “\LogicalDisk(*)\Avg. Disk sec/Read” and “\LogicalDisk(*)\Avg. Disk sec/Write”. For example, Microsoft BizTalk Server database LUNs should get less than 15ms sustained response times for both read and write I/O.

· IOPS (I/Os per second), which you can get from “\LogicalDisk(*)\Disk Transfers/sec”. This helps with determining how many dedicated spindles are needed to supply the demand for IOPS for the LUN. For example, BizTalk database LUNs should get at least 200 IOPS sustained, with 500 IOPS or more preferred. Why? Because my 5400rpm USB drive can do 200 IOPS of random write I/O with 5ms response times on average. Why have a SAN if a USB drive can outperform it?

· Bytes per I/O, which you can get from “\LogicalDisk(*)\Avg. Disk Bytes/Read” and “\LogicalDisk(*)\Avg. Disk Bytes/Write”. The average size of the I/Os will tell the SAN administrators how to format the blocksize of disk partitions. For example, BizTalk database LUNs typically do about 16K per read or write operation on average with frequent spikes of 64K. This is why we typically recommend formatting the disks to a 64K blocksize. For more information on blocksizes, see “Disk Partition Alignment Best Practices for SQL Server”.

· Ratio of reads to writes, which you can get from “\LogicalDisk(*)\Disk Reads/sec” and “\LogicalDisk(*)\Disk Writes/sec”. SAN cache can typically be adjusted for a read/write ratio, and this information helps to adjust that setting. For example, the BizTalk database LUN for the MsgBoxDb is about 50% reads and 50% writes, while the LUN for the DTADb is about 99% writes and 1% reads.

· What is using the disks, which you can get using Process Monitor (a Mark Russinovich SysInternals tool owned by Microsoft), Microsoft xPerf (part of the Windows Performance Toolkit), or Resource Monitor, which is part of Windows 7. This helps determine whether the server’s own workload is really using the disks or whether anti-virus or backup software is hammering them unnecessarily. In any case, it’s important to know whether the I/O is mission critical or routine and unnecessary. See my earlier blog post, “The Case of the Relentless Cookie Monster”, for more information on how to use the Process Monitor tool for disk analysis.

Once you have the information above and present it to the SAN administrators, they should be able to configure the SAN to meet these needs.
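As a rough illustration of how the counter data above might be boiled down before handing it to a SAN administrator, here is a minimal sketch. It assumes you have already collected samples of the counters into memory. The `DiskSample` class, the function names, and the per-spindle IOPS figure are all hypothetical and for illustration only; they are not part of any Microsoft tool.

```python
import math
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class DiskSample:
    """One sample of the LogicalDisk counters for a single disk."""
    reads_per_sec: float        # Disk Reads/sec
    writes_per_sec: float       # Disk Writes/sec
    avg_bytes_per_read: float   # Avg. Disk Bytes/Read
    avg_bytes_per_write: float  # Avg. Disk Bytes/Write


def summarize(samples: List[DiskSample]) -> Dict[str, float]:
    """Compute average IOPS, read percentage, and average bytes per I/O."""
    if not samples:
        raise ValueError("at least one sample is required")
    total_reads = sum(s.reads_per_sec for s in samples)
    total_writes = sum(s.writes_per_sec for s in samples)
    total_io = total_reads + total_writes
    # Weight the byte sizes by how many I/Os of each kind occurred.
    total_bytes = sum(
        s.reads_per_sec * s.avg_bytes_per_read
        + s.writes_per_sec * s.avg_bytes_per_write
        for s in samples
    )
    return {
        "avg_iops": total_io / len(samples),
        "read_pct": (total_reads / total_io * 100.0) if total_io else 0.0,
        "avg_bytes_per_io": (total_bytes / total_io) if total_io else 0.0,
    }


def spindles_needed(target_iops: float, iops_per_spindle: float) -> int:
    """Estimate dedicated spindles for a target IOPS (ignores RAID overhead)."""
    return math.ceil(target_iops / iops_per_spindle)


samples = [
    DiskSample(100, 100, 16 * 1024, 16 * 1024),
    DiskSample(300, 100, 64 * 1024, 16 * 1024),
]
summary = summarize(samples)
print(summary)
# 150 IOPS per spindle is an assumed figure, not a vendor specification.
print(spindles_needed(target_iops=500, iops_per_spindle=150))
```

The spindle estimate is deliberately naive: it ignores RAID write penalties and cache, both of which the SAN administrator will know far better than the server owner. The point is only to arrive with numbers rather than complaints.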

Conclusion

SANs are critical assets for disk management, but it seems that many disks presented to mission critical systems just can’t perform when the SAN is oversaturated or not optimized for the kind of I/O the server needs. SAN administrators don’t instinctively know what the needs of your system are, so provide them with the information they need using the counter metrics mentioned above. I hope this helps to remove the veil of the mysterious black box.

Thanks again to Shane Creamer who originally taught me the disk analysis basics.

Windows 7 and Disk Analysis

Oh, and by the way, Windows 7 and Windows Server 2008 R2 have an incredible performance tool called the Resource Monitor. Just go to Task Manager and click the Resource Monitor button. Click the Disk tab and you will see the processes most active with disk I/O and their average response times for each file they are using. Imagine a SQL Server showing average response times for specific MDF and LDF files. This was a feature I asked for and Microsoft listened.