Hello, Jeff “The Dude” Stokes here for another post.
The other day I was implementing a pure Microsoft VDI solution in my lab (blog coming soon), leveraging the latest and greatest Hyper-V on Server 2012 R2. While doing so, I noticed I was getting abysmal performance from a four-disk enclosure of RAID-class prosumer hard drives. The enclosure was configured to use hardware RAID 5 and connected over USB 3.
My disk performance was looking something like this:
(Note: on Windows Server 2012 R2, to enable the disk performance counters you must run diskperf -Y from an elevated command prompt.)
For the unseasoned performance analyst out there, I’ll point out that my USB 3 enclosure was going slow: 2.396 seconds to do a transaction. Holey Moley! We’re looking for enterprise class hardware to generally perform transactions at 15ms or less, consumer grade maybe 20ms max? On top of that, we’re getting only about 1.7 MB/sec of throughput while doing this. Why is it so slow?!
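If you want to sanity check a number like that against the rules of thumb above, here is a minimal Python sketch. The 15ms and 20ms ceilings are the ones from this post; the function name and the labels it returns are made up for illustration, and it assumes the response time is given in seconds, as in the 2.396 figure.

```python
# Rule-of-thumb latency ceilings from the post, in milliseconds per transfer.
ENTERPRISE_MAX_MS = 15.0
CONSUMER_MAX_MS = 20.0


def rate_latency(seconds_per_transfer: float) -> str:
    """Classify an average disk response time given in seconds."""
    ms = seconds_per_transfer * 1000.0
    if ms <= ENTERPRISE_MAX_MS:
        return "enterprise-class"
    if ms <= CONSUMER_MAX_MS:
        return "consumer-grade"
    return "too slow"


# The USB 3 enclosure under load: 2.396 seconds per transaction.
print(rate_latency(2.396))
```

Anything that lands in the last bucket is worth a closer look before users start noticing.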
At the time I was running 3 VMs at the same time, each with static 40 GB VHDs, deploying Windows 7 and some stuff from a MDT 2013 share on another LUN. This is not a light load, but it should go a bit quicker than that. Funny enough, when I stopped Hyper-V VMs residing on the enclosure, I got something like this when running a speed test:
As you can see, this looks much better, more um, sane? Now we are pushing 154 MB/sec writes at an average of 566ms, as opposed to 1.7 MB/sec total with 2.4 seconds (2400ms) of latency.
So, I did some troubleshooting:
o Swapped USB 3 cable
o Tried the on-board USB 3 ports, and two different brands/chipsets of Add-on cards
o Updated BIOS on MB
o Updated Firmware on the enclosure
o Updated USB 3 drivers
o Searched the web for super-secret reg keys to make USB 3 ‘go faster’ (doesn’t exist so save yourself the time)
o Talked to developers of the USB 3 stack for Microsoft who traced it and saw no issues
o Tore my hair out and complained to the wife about it
By the way, for those interested, USB 3 tracing is wicked easy if you follow this blog:
So after verifying it wasn’t USB 3 (I was pretty sure about that), I thought to myself, what the heck is going on here? I then started tinkering in the lab some more, had an epiphany on VHD vs. VHDX, and searched and found this document:
It says, AND I QUOTE, Page 157:
“The VHDX format also provides the following performance benefits (each of these is detailed later in this guide):
· Improved alignment of the virtual hard disk format to work well on large sector disks.
· Larger block sizes for dynamic and differential disks, which allows these disks to attune to the needs of the workload.
· 4 KB logical sector virtual disk that allows for increased performance when used by applications and workloads that are designed for 4 KB sectors.
· Efficiency in representing data, which results in smaller file size and allows the underlying physical storage device to reclaim unused space. (Trim requires pass-through or SCSI disks and trim-compatible hardware.)
When you upgrade to Windows Server 2012, we recommend that you convert all VHD files to the VHDX format due to these benefits. The only scenario where it would make sense to keep the files in the VHD format is when a virtual machine has the potential to be moved to a previous release of the Windows Server operating system that supports Hyper-V.”
These two points stuck in my head: “Improved alignment of the virtual hard disk format to work well on large sector disks.” and “4 KB logical sector virtual disk that allows for increased performance when used by applications and workloads that are designed for 4 KB sectors.”
Huh. How ‘bout that.
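To see why alignment matters on a large sector disk, here is a small Python sketch. It is a general illustration, not a trace of what the VHD stack actually did on my box: it assumes 4096-byte physical sectors, and the function names are hypothetical. A write that does not cover whole physical sectors forces the drive to read the sector, patch it, and write it back.

```python
PHYSICAL_SECTOR = 4096  # bytes; assumed physical sector size of a large sector drive


def physical_sectors_touched(offset: int, length: int) -> int:
    """Count the physical sectors spanned by a write of `length` bytes at `offset`."""
    first = offset // PHYSICAL_SECTOR
    last = (offset + length - 1) // PHYSICAL_SECTOR
    return last - first + 1


def needs_read_modify_write(offset: int, length: int) -> bool:
    """True when the write does not cover whole physical sectors."""
    return offset % PHYSICAL_SECTOR != 0 or length % PHYSICAL_SECTOR != 0


# A 4 KB write that starts on a physical sector boundary.
print(physical_sectors_touched(0, 4096), needs_read_modify_write(0, 4096))      # 1 False
# The same write shifted by one 512-byte logical sector.
print(physical_sectors_touched(512, 4096), needs_read_modify_write(512, 4096))  # 2 True
```

The shifted write touches twice as many physical sectors and needs a read before each partial write, which is the kind of overhead better alignment is meant to avoid.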
So I fired up Fsutil and verified my disk format type (4k, 512e or 512 bytes per sector, more on this here: http://en.wikipedia.org/wiki/Advanced_Format) by doing the following:
fsutil fsinfo ntfsinfo F:
And hey, it’s NOT Advanced Format according to fsutil. But the spec sheet for the drive says otherwise (512e, to be exact). Who is correct here? Turns out drivers can hand back data that looks like garbage, and when that happens the reported value falls back to 512. So I know the drives SHOULD show 4096 for Bytes Per Physical Sector and 512 for Bytes Per Sector, but they don’t. I should carry on as if they did, since that combination is the 512e standard, which means I need to upgrade the VHDs to VHDX!
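For reference, the three sector formats boil down to two numbers, the logical and the physical bytes per sector. A minimal Python sketch of that mapping follows; the function name and labels are my own, and as noted above the values a driver reports may not match the spec sheet.

```python
def classify_sector_format(bytes_per_sector: int, bytes_per_physical_sector: int) -> str:
    """Map logical/physical sector sizes to the usual Advanced Format names."""
    if bytes_per_sector == 512 and bytes_per_physical_sector == 512:
        return "512 native"
    if bytes_per_sector == 512 and bytes_per_physical_sector == 4096:
        return "512e"  # 4 KB physical sectors emulating 512-byte logical sectors
    if bytes_per_sector == 4096 and bytes_per_physical_sector == 4096:
        return "4K native"
    return "unknown"


# What the reported values said in my case vs. what the spec sheet says.
print(classify_sector_format(512, 512))
print(classify_sector_format(512, 4096))
```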
Easy peasy in PowerShell 3.0: Convert-VHD on Server 2012/2012 R2 will do it, or you can use the GUI.
Parameter Set: Default
Convert-VHD [-Path] <String> [-DestinationPath] <String> [-AsJob] [-BlockSizeBytes <UInt32> ] [-ComputerName <String> ] [-DeleteSource] [-ParentPath <String> ] [-Passthru] [-VHDType <VhdType> ] [-Confirm] [-WhatIf] [ <CommonParameters>]
Then just reconfigure the VM to talk to the new file instead of the old one and blamo, now you’re cookin with gas here!
After converting from VHD to VHDX, my performance (again, with RAID 5, so there is a serious write penalty) looked like this:
Above you can see that we are running at a total of about 42.4 MB/second. The response time is still a little slow, but the throughput has gone from 1.7 MB/second to 42.4 MB/second. Much better. Latency has just about halved.
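To put those two throughput numbers in perspective, here is a quick back-of-the-envelope calculation in Python. The 40 GB figure is the size of my static VHDs; treating 1 GB as 1024 MB and assuming the disk sustains the measured rate for the whole transfer are simplifications, and the function names are hypothetical.

```python
def speedup(before_mb_s: float, after_mb_s: float) -> float:
    """Ratio of the new throughput to the old one."""
    return after_mb_s / before_mb_s


def minutes_to_write(size_gb: float, mb_per_s: float) -> float:
    """Minutes needed to move size_gb at a sustained rate (assumes 1 GB = 1024 MB)."""
    return size_gb * 1024 / mb_per_s / 60


print(f"{speedup(1.7, 42.4):.1f}x faster")
print(f"40 GB at 1.7 MB/s:  {minutes_to_write(40, 1.7):.0f} minutes")
print(f"40 GB at 42.4 MB/s: {minutes_to_write(40, 42.4):.0f} minutes")
```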
If you were wondering if that’s all that VHDX is good for, search no further!
· Supports virtual disk storage capacity of up to 64 TB.
· Protects against data corruption during power failures by logging updates to the VHDX metadata structures.
· Improves alignment of the virtual hard disk format to work well on large sector disks.
· Supports TRIM on direct attached/SCSI hardware that supports TRIM, which results in smaller file size and allows the underlying physical storage device to reclaim unused space.
Dr. Jeff’s Deep in the Weeds Section
Note that if you are in this boat and want to REALLY figure out what’s going on under the hood, Neal Christiansen at Microsoft was kind enough to give this advice on attaching WinDbg or LiveKD to the Windows kernel and figuring it out:
“You want to break in: nt!FsRtlGetSectorSizeInformation and follow what happens when the PhysicalGeometry is queried.
Note that there is a lot of validation of the returned information because we have seen garbage come back from this call. If incorrect information is returned then we fall back to a 512b sector device.”
There you have it. Our performance greatly increased just by using the latest Hyper-V disk type. We didn’t even have to do anything else. We also discovered this performance problem by knowing what our performance baselines should be. Without them, we’d have to rely on the users to complain about how slow the new VDI environment is, which would probably get some folks in hot water on this brand new solution... if this was production. So the next question is... do you know your baselines 🙂 ? That’s it for now.
Jeff ‘the Dr. is in’ Stokes