Summary: Gregg O’Brien, a Microsoft Premier Field Engineer from Canada, walks us through some lessons he learned when testing Storage Spaces, a new feature in Windows Server 2012 that’s meant to provide enterprise-level storage performance, flexibility, and scale. The moral of his story: for heavy workloads requiring high-speed random I/O and the best resiliency possible, use mirrored spaces instead of parity spaces. Enjoy!
I was recently building a lab based around Storage Spaces, a new feature in Windows Server 2012. My plan was to host a Hyper-V over SMB configuration for some virtual machines. I decided to build the configuration with a Parity Stripe so that I could maximize the use of the storage I had, while still maintaining some form of resiliency. I installed 3 x 1TB SATA drives and configured the storage space as required. I created the share for the VMs and started using the share to store VMs from a Hyper-V cluster I had built previously.
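To put rough numbers on the "maximize the use of the storage" reasoning, here is a minimal sketch of the usable capacity each layout would give from the same three 1TB drives. It assumes idealized textbook ratios (a two-way mirror keeps two copies of everything, single parity gives up one drive's worth of space) and ignores any metadata or formatting overhead. The function name and layout labels are made up for this illustration and are not part of any Storage Spaces API.

```python
def usable_capacity_tb(disk_count, disk_size_tb, layout):
    """Idealized usable capacity in TB for a pool of identical disks.

    Assumed ratios (not measured from Storage Spaces itself):
      simple -> all raw capacity, no resiliency
      mirror -> two-way mirror, half of raw capacity
      parity -> single parity, one disk's worth of capacity holds parity
    """
    raw = disk_count * disk_size_tb
    if layout == "simple":
        return raw
    if layout == "mirror":
        return raw / 2
    if layout == "parity":
        return raw * (disk_count - 1) / disk_count
    raise ValueError(f"unknown layout: {layout}")


if __name__ == "__main__":
    # The lab in this post: 3 x 1TB SATA drives.
    for layout in ("simple", "mirror", "parity"):
        print(layout, usable_capacity_tb(3, 1.0, layout), "TB")
```

Under those assumptions, parity leaves about half a terabyte more usable space than a mirror on this hardware, which is why it looked like the attractive choice at first.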
While using the VMs, I noticed that they were very slow. I started investigating and saw that all of the VMs had performance issues when writing to disk. Digging further, I traced the problem to slow disk performance on the storage space itself. I tested this by copying a 1.5GB file unbuffered (xcopy /J) to the parity space. Speed topped out at 25MB/s. I decided to test each disk individually, so I broke the stripe and tested each disk with the same 1.5GB file. Speeds reached as high as 140MB/s. Quite the difference. I then tried a mirrored space, and that also performed at roughly 140MB/s.
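To make that gap concrete, here is a quick back-of-the-envelope calculation of how long the 1.5GB test copy would take at each of the two observed speeds. It assumes 1GB = 1000MB and treats the peak rates as if they were sustained for the whole copy, so the results are estimates rather than measured times.

```python
def copy_seconds(size_mb, rate_mb_per_s):
    """Time to copy size_mb at a constant rate, in seconds."""
    return size_mb / rate_mb_per_s


FILE_MB = 1.5 * 1000  # assumption: 1GB = 1000MB

parity_s = copy_seconds(FILE_MB, 25)    # parity space, observed peak rate
mirror_s = copy_seconds(FILE_MB, 140)   # single disk or mirrored space

if __name__ == "__main__":
    print(f"parity: {parity_s:.1f} s")
    print(f"mirror: {mirror_s:.1f} s")
    print(f"slowdown: {parity_s / mirror_s:.1f}x")
```

In other words, the parity space took about a minute for a copy that the other configurations finished in roughly eleven seconds.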
It seemed that everything I could think of worked flawlessly, except for the parity space. But why?
Well, I asked around within Microsoft and was pointed to this document: Storage Spaces - Designing for Performance. Within this document I found the following key piece of info:
“The caveat of a parity space is low write performance compared to that of a simple or mirrored storage space, since existing data and parity information must be read and processed before a new write can occur. Parity spaces are an excellent choice for workloads that are almost exclusively read-based, highly sequential, and require resiliency, or workloads that write data in large sequential append blocks (such as bulk backups).”
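The "read and processed before a new write can occur" part is easier to see in code. Below is a toy sketch of a classic XOR-parity stripe, the textbook technique behind single-parity layouts. It is not how Storage Spaces is actually implemented (the class, its method names, and the I/O counters are all invented for this illustration), but it shows why one small write turns into two reads plus two writes.

```python
def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


class ToyParityStripe:
    """Hypothetical in-memory stripe: N data blocks plus one XOR parity block."""

    def __init__(self, data_blocks):
        self.data = list(data_blocks)
        self.parity = bytes(len(self.data[0]))
        for block in self.data:
            self.parity = xor_bytes(self.parity, block)
        self.reads = 0
        self.writes = 0

    def small_write(self, index, new_block):
        """Overwrite one data block using read-modify-write."""
        old_block = self.data[index]      # read existing data
        self.reads += 1
        old_parity = self.parity          # read existing parity
        self.reads += 1
        # Remove the old data from the parity, then fold in the new data.
        new_parity = xor_bytes(xor_bytes(old_parity, old_block), new_block)
        self.data[index] = new_block      # write new data
        self.writes += 1
        self.parity = new_parity          # write new parity
        self.writes += 1

    def rebuild(self, lost_index):
        """Recover one lost data block from the parity and the survivors."""
        block = self.parity
        for i, other in enumerate(self.data):
            if i != lost_index:
                block = xor_bytes(block, other)
        return block


if __name__ == "__main__":
    stripe = ToyParityStripe([b"\x01\x02", b"\x10\x20"])
    stripe.small_write(0, b"\xff\x00")
    print("reads:", stripe.reads, "writes:", stripe.writes)
    print("rebuilt block 0:", stripe.rebuild(0))
```

A two-way mirror, by contrast, only has to write the new block to each copy, with no reads first. Large sequential writes that cover a whole stripe can compute the parity from the new data alone, which fits the quote's point about bulk backups.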
So really, the product is working as intended; I was just attempting to use it for the wrong thing! The article does go on to say that a parity space of 3 drives could be improved by using two dedicated SSDs for the journal, but at that point the parity space would more than likely cost more than the mirrored space and still might not approach the speed that the mirrored space provides.
Moral of the story: For heavy workloads requiring high speed, random I/O and the best resiliency possible, use the mirrored space as opposed to the parity space.