My name is Flavio Muratore and I am a Senior Support Escalation Engineer with the Windows Core Team. One subject we haven’t written much about in the Core team blog is “disk performance”.
Today I would like to talk a little bit about measuring Physical Disk IO Latency with Windows Performance Monitor (perfmon). Most likely you have some experience with Perfmon; it has been around since the NT days.
You have probably heard general statements about what are acceptable disk latency measurements: “Less than 10 milliseconds is good and more than 20 milliseconds is bad”. Although these rules of thumb are used to simplify analysis, they do not apply in all cases and may lead to incorrect conclusions. Let’s check how this really works so we can understand these numbers.
Summary: The IO latency measured in perfmon includes all the time spent in the hardware layers as well as the time spent in the Microsoft Port Driver queue (Storport.sys for SCSI). If the running processes generate a large storport queue, the measured latency increases, as IO has to wait before getting dispatched to the hardware layers.
What is disk IO latency?
We can define disk IO latency as: A measure of the time delay from the time a disk IO request is created, until the time the disk IO request is completed.
What counters in Windows Performance Monitor show the physical disk latency?
“Physical disk performance object -> Avg. Disk sec/Read counter” – Shows the average read latency.
“Physical disk performance object -> Avg. Disk sec/Write counter” – Shows the average write latency.
“Physical disk performance object -> Avg. Disk sec/Transfer counter” – Shows the combined averages for both read and writes.
The “_Total” instance is an average of the latencies for all physical disks in the computer.
Each other instance represents an individual Physical Disk.
Note: Do not confuse these with Disk Transfers/sec, which is a completely different counter: it reports the number of IO operations per second, not latency.
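These counters report their values in seconds, so a reading of 0.010 corresponds to the "10 milliseconds" from the rule of thumb. The short Python sketch below only illustrates that relationship. The helper names (counter_path, seconds_to_ms) are hypothetical and not part of any Windows API; the path layout is the usual perfmon counter path syntax of object, instance and counter.

```python
# Counter names are taken from the list above.
# The helper names are illustrative assumptions, not a Windows API.
LATENCY_COUNTERS = (
    "Avg. Disk sec/Read",
    "Avg. Disk sec/Write",
    "Avg. Disk sec/Transfer",
)


def counter_path(counter, instance="_Total"):
    """Build a perfmon-style counter path for the PhysicalDisk object."""
    return "\\PhysicalDisk({})\\{}".format(instance, counter)


def seconds_to_ms(value_in_seconds):
    """The latency counters report seconds; convert to milliseconds."""
    return value_in_seconds * 1000.0


if __name__ == "__main__":
    for name in LATENCY_COUNTERS:
        print(counter_path(name))
    # A sample value of 0.010 seconds is 10 milliseconds.
    print(seconds_to_ms(0.010))
```

Passing an instance name instead of the default "_Total" would select a single physical disk; the exact instance names depend on the machine.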
Where does the performance data come from?
For the “physical disk performance object”, the data is captured at the “Partition Manager” level in the storage stack.
Keep in mind Perfmon does not create any performance data per se; it only consumes data provided by other subsystems within Windows.
Where is the Partition Manager in the Storage Stack?
A simplified explanation of the Windows Storage Stack follows.
When an application creates an IO request, it sends it to the Windows IO Subsystem (at the top of the stack). The IO will then make its way all the way down the stack (to the Hardware Disk Subsystem) and then come all the way back up. During this process, each layer will perform its function and then hand over the IO to the next layer.
So what are we really measuring with the Physical disk performance object -> Avg. Disk sec/Transfer (or /Read, or /Write) counter?
We are measuring all the time spent below the partition manager level.
When the Partition Manager sends the IO request down the stack, we time stamp it. When the request arrives back, we time stamp it again and calculate the difference between the two time stamps. That difference is the latency.
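The time-stamping idea can be sketched as a toy model in Python. This is not the actual Partition Manager code. The class and method names are hypothetical, and computing the average as total latency divided by completed IOs is an assumption about how an "Avg. Disk sec/Transfer" style value is derived. Time stamps are plain integers in microseconds to keep the arithmetic exact.

```python
class LatencyTracker:
    """Toy model: stamp each IO on the way down, stamp it again on completion."""

    def __init__(self):
        self._sent = {}            # io_id -> time stamp taken when sent down the stack
        self._total_latency = 0    # sum of latencies of completed IOs (microseconds)
        self._completed = 0        # number of completed IOs

    def send_down(self, io_id, now_us):
        """Record the time stamp taken when the IO goes down the stack."""
        self._sent[io_id] = now_us

    def complete(self, io_id, now_us):
        """Record completion and return the latency of this IO."""
        latency = now_us - self._sent.pop(io_id)
        self._total_latency += latency
        self._completed += 1
        return latency

    def average_latency_us(self):
        """Average latency over all completed IOs (0.0 if none completed)."""
        if self._completed == 0:
            return 0.0
        return self._total_latency / self._completed


if __name__ == "__main__":
    tracker = LatencyTracker()
    tracker.send_down(1, 0)
    tracker.send_down(2, 1000)
    print(tracker.complete(1, 4000))     # IO 1 took 4000 us
    print(tracker.complete(2, 13000))    # IO 2 took 12000 us
    print(tracker.average_latency_us())  # average of the two: 8000.0 us (8 ms)
```

Everything that happens between the two time stamps, including any waiting in queues, ends up in the measured value.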
This means we are accounting for the time spent in the following components:
Class Driver – manages the device type, such as disks, tapes, etc.
Port Driver – manages the transport protocol, such as SCSI, FC, SATA, etc.
Device Miniport Driver – This is the device driver for the Storage Adapter. It is supplied by the vendor of the device (for example, a RAID controller or an FC HBA).
Disk Subsystem – This includes everything below the Device Miniport Driver. It could be as simple as a cable connected to a single physical hard disk, or as complex as a Storage Area Network.
How does disk queuing affect the measured latency in Perfmon?
A disk subsystem can accept only a limited number of IOs at a given time. The excess IO gets queued until the disk can accept IO again. The time IO spends in the queues below the Partition Manager is included in the Perfmon physical disk latency measurements. As queues grow larger and IO has to wait longer, the measured latency also grows.
There are multiple queues below the Partition Manager level:
Microsoft Port Driver Queue – SCSIport or Storport queue.
Vendor Supplied Device Driver Queue – OEM device driver.
Hardware Queues – such as disk controller queue, SAN switches queue, array controller queue, hard disk queue.
Although not a queue, we also account for the time the hard disk spends actively servicing the IO and the travel time all the way back to the partition manager level to be marked as completed.
Finally, the Port Driver Queue (Storport.sys for SCSI) deserves special attention.
The Port Driver is the last Microsoft component to touch an IO before we hand it off to the vendor supplied Device Miniport Driver.
If the Device Miniport Driver can’t accept any more IO because its queue and/or the hardware queues below are saturated, we will start accumulating IO on the Port Driver Queue. The size of the Microsoft Port Driver queue is limited only by the available system memory (RAM) and can grow very large, causing large measured latency.
The time the IO spends in this queue is included in the disk latency reported by perfmon.
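To make the effect concrete, here is a minimal Python sketch of a deliberately simplified model. It assumes the hardware accepts a fixed number of IOs at once (device_slots), that every IO takes the same service time, and that the application issues all of its IOs at the same instant. All names and numbers are made up for illustration and do not describe any real device or driver.

```python
def simulate_burst(outstanding_ios, device_slots, service_time_ms):
    """Return (avg_queue_wait_ms, avg_measured_latency_ms) for IOs issued at t=0.

    IOs beyond device_slots wait in the (modelled) port driver queue until a
    slot frees up. The measured latency is queue wait plus service time.
    """
    if outstanding_ios <= 0 or device_slots <= 0:
        raise ValueError("outstanding_ios and device_slots must be positive")

    queue_waits = []
    latencies = []
    for i in range(outstanding_ios):
        # IOs are serviced in waves of device_slots; wave k starts at k * service time.
        start_ms = (i // device_slots) * service_time_ms
        queue_waits.append(start_ms)
        latencies.append(start_ms + service_time_ms)

    count = len(latencies)
    return sum(queue_waits) / count, sum(latencies) / count


if __name__ == "__main__":
    # The hardware needs 5 ms per IO in every case; only the queue depth changes.
    for ios in (4, 16, 64):
        wait, latency = simulate_burst(ios, device_slots=4, service_time_ms=5)
        print(ios, wait, latency)
    # prints:
    # 4 0.0 5.0
    # 16 7.5 12.5
    # 64 37.5 42.5
```

In this model the hardware is equally fast in all three runs, yet the average measured latency rises from 5 ms to 42.5 ms purely because of the time spent waiting in the queue.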
To keep the queue under control you have to tune your applications to limit the maximum number of outstanding I/O operations they generate. That’s a subject for another blog post.
For SCSI Disks (FC/RAID) you can enable Storport tracing to measure the latency below the Port Driver level. This measurement does not include the time spent in the Storport queue or anything above it. Essentially, this is the lowest level at which we can monitor latency inside Windows before the IO is handed over to third-party components. Check this excellent blog from the NTdebug team for details.
“Storport ETW Logging to Measure Requests Made to a Disk Unit”
Senior Support Escalation Engineer
Microsoft Enterprise Platforms Support