Using Microsoft SQLIO to Baseline LOB Application Disk Performance

When implementing, replatforming, upgrading, or simply making technology stack changes to an LOB application like SAP, PeopleSoft, Siebel, EBS, or Dynamics, special attention must be paid to configuring and optimizing the disk subsystem hosting the system’s data. After all, experience tells us that this is a common culprit in end-user performance problem scenarios. To forgo tuning and validating the disk subsystem post-change is to risk serious performance and availability issues. Yet many of the methods and tools commonly used for baselining disk performance are often tied to the application package being deployed or to the underlying RDBMS. Because of their complexity and the investment needed in terms of time, these methods often takes a back seat to other matters. What we REALLY need is a simple way to validate that a SAN or other disk subsystem can push a certain I/O load or support a particular throughput level. For example, if I know my newly deployed or updated SAN can move data at 1.2 GB/second, or process 10,000 64kb reads every second, I’d be really comfortable using that SAN to support an SAP ERP system that’s only expected to process 4,000 reads a second. So…..let’s talk about a easy-to-use tool and method of testing and validating disk subsystem performance that comes into play before the database and application are ever even introduced into the equation. The tool – Microsoft’s SQLIO.

   

What is SQLIO?

SQLIO is a tool provided for many years from Microsoft, developed long ago to help the SQL Server technical team characterize different computing platforms. As such, it’s described as “a disk workload generator that is designed to simulate some aspects of the I/O workload of Microsoft SQL Server.” The key here is “some,” of course, as simulating a true production database is much more complex than executing a command-line utility with a couple of switches enabled. But, as many of my very own enterprise customers know, Microsoft’s SQLIO utility is a worthwhile tool for a number of reasons:

   

  • It’s simple to set up and execute, small in terms of footprint, and eminently portable
  • It’s highly configurable, easily allowing for testing the impact that different sized reads or writes, sequential and direct, have on a particular disk configuration
  • It provides the core information a solutions or technical architect requires when it comes to disk subsystem testing – MB per second throughput, and the number of I/Os processed per second
  • SQLIO also can report CPU utilization, making it easy to correlate the relationship between disk loads and processor utilization

The “older” standalone version of SQLIO was not generally available to the public; Microsoft’s Hardware and Services partners were authorized to use the tool provided it was not left in the hands of the customer. Another tool – SQLIOStress – was generally available to anyone through public download – but this tool was a bit more complex and focused more specifically on emulating MS SQL Server. HOWEVER, in 2005 a publicly available version of SQLIO (offered as an msi file) was introduced. You may download it from https://www.microsoft.com/downloads/en/details.aspx?familyid=9a8b005b-84e4-4f24-8d65-cb53442d9e19&displaylang=en. Keep in mind that while the remainder of this blog posting focuses on the older standalone version of SQLIO, the methods of using the tools are similar.

 

How to Baseline a Disk Subsystem Using SQLIO

Once the disk subsystem has been setup, changed, or provisioned online (in the case of Amazon S3, Azure, etc), the next step is to begin setting up the specific disk groups and LUNs as directed by the LOB or vendor Storage Architect. After the LUNs are carved up, they are typically presented (made accessible) to the LOB database server and other servers that will ultimately use the disk subsystem; afterwards, the database and application installations can begin. However, before the RDBMS and applications are installed, I recommend using SQLIO to baseline your disk subsystem:

  • Connect a Windows client to the disk subsystem; a server-based OS will be required if more than a handful of LUNs will be tested. If the system is eventually going to run a Windows OS, you may as well use the Database Server as your SQLIO client.
  • Format the LUNS that will eventually house data (i.e. your SQL data files or Oracle-on-Windows tablespaces). If your OS will eventually be configured for a particular blocksize, be sure to format the data partitions for the samesize, too (keeping in mind that different OSs inherently exhibit different performance characteristics from one another, but that the overall the test will be truer apples-to-apples if this important configuration parameter is consistent).
  • Copy the SQLIO executable and its configuration files to the client machine, i.e. to a directory named C:\SQLIO.
  • Edit the run.cmd configuration file to reflect the blocksize to be tested, whether the test will be read or write-oriented, sequential or direct, and so on. Edit the second configuration file (pfile.txt) to reflect the disk drive letter(s) and size of the data files against which the testing will be performed. I typically configure multiple multi-GB data files spread out in a way that’s similar if not identical to what will eventually be in production.

   

The fact that SQLIO allows you to define the time your particular test run will execute makes it a much better alternative to its hard-coded counterparts that require an hour to run. SQLIO also allows you to control queue depth per thread; thus, you can easily execute a “smoke test” where a disk subsystem is truly saturated, or emulate a more realistic “average load test” where the disk queue might be 1, 2, 4, 8, etc (whatever is deemed appropriate) in front of each LUN.

   

For simple baseline testing or even “delta” testing between two disk subsystem configurations, I like to execute each of eight discrete SQLIO tests for 60 seconds at a time. The SQLIO configuration file is edited to run through sequential read, random read, sequential write, and random write 8kb blocksize operations (4 different tests). I then repeat the same set of tests for 64kb block sizes, for 8 tests total. The fact that only eight minutes of wall-clock time is consumed for these eight minutes of testing makes SQLIO a compelling tool set indeed (it’s important to remember that the first time the tests are executed, though, the “data” files against which the tests will run will actually be created, making the first test longer in duration than subsequent tests).

   

Analyzing SQLIO’s output is easy. Since the SQLIO test-run output is space-delimited, dump one or multiple test runs into an Excel spreadsheet for subsequent analysis. I typically set up different columns for specific disk configurations (i.e. iterations in tweaking the configuration’s disk controller setup, group configuration, number of spindles per group or LUN, and so on), OS configuration changes (i.e. changes to block sizes or the number of logical drives). I then dump the test run output underneath each vertical column, so as I scan horizontally across my spreadsheet I can observe changes in MB/second throughput and total I/Os processed per second. It’s a simple way to help reinforce that my changes to a particular disk subsystem or OS configuration are indeed resulting in adequate (hopefully improved!) disk throughput and performance.

   

At the end of the day, with my SQLIO output in hand it’s easy for me to prove to a customer that their particular disk subsystem is ready for prime-time. Sure, the actual database configuration or application itself might be horribly tuned (another discussion altogether!). When performance problems crop up and the customer is tempted to start finger-pointing, I’ll be able to keep them focused away from the disk and OS….and in cases where an LOB application is already live, SQLIO gives me the ability to determine if the disk system is configured well relative to the load it is supporting. Either way, the comparative value – through iterative tuning and testing, or by comparing my results to those of previous SQLIO baseline engagements – saves all of us a lot of time, and helps us focus our collective tuning efforts on the technology layers where that incremental effort may actually provide real value.

   

I've got some "Using SQLIO tips and tricks" that might be useful to some of our readers - let me know if that's interesting to you, and I'll get those posted as well.

   

Enjoy the long weekend!

george