Hello, Claus here again. This time we are going to take a look at a couple of the key enhancements to Storage Spaces Direct that are arriving in Windows Server Technical Preview 4, namely Multi-Resilient Virtual Disks and ReFS Real-Time Tiering. Combined, these two features solve two key issues that we have in Windows Server 2012/R2 Storage Spaces. The first issue is that parity spaces only work well for archival/backup workloads – they do not perform well enough for workloads such as virtual machines. The second issue is that the tiering mechanism is an ‘after the fact tiering’ in that the system collects information about hot and cold user data, but only moves this user data in and out of the faster tier as a scheduled task using this historical information.
I suggest reading my blog post on Software Storage Bus and the Storage Bus Cache, which contains important information about how the Software Storage Bus and Storage Bus Cache work, both of which sit underneath the virtual disks and file systems.
Multi-Resilient Virtual Disks
A multi-resilient virtual disk is a virtual disk that has one part that is a mirror and another part that is erasure coded (parity).
Figure 1 Virtual disk with both mirror and parity tier.
To arrive at this configuration, the administrator defines two tiers, just like in Windows Server 2012 R2. This time, however, the tiers are defined by their resiliency setting rather than the media type. Let’s take a look at a PowerShell example for a system with SATA SSD and SATA HDD (the Technical Preview 4 deployment guide also includes an example for an all-flash system with NVMe + SATA SSD):
# 1: Enable Storage Spaces Direct
Enable-ClusterStorageSpacesDirect
# 2: Create storage pool
New-StoragePool -StorageSubSystemFriendlyName *cluster* -FriendlyName S2D -ProvisioningTypeDefault Fixed -PhysicalDisk (Get-PhysicalDisk | ? CanPool -eq $true)
# The below step is not needed in a flat (single tier) storage configuration
Get-StoragePool S2D | Get-PhysicalDisk | ? MediaType -eq SSD | Set-PhysicalDisk -Usage Journal
# 3: Define Storage Tiers
$MT = New-StorageTier -StoragePoolFriendlyName S2D -FriendlyName MT -MediaType HDD -ResiliencySettingName Mirror -PhysicalDiskRedundancy 2
$PT = New-StorageTier -StoragePoolFriendlyName S2D -FriendlyName PT -MediaType HDD -ResiliencySettingName Parity -PhysicalDiskRedundancy 2
# 4: Create Virtual Disk
New-Volume -StoragePoolFriendlyName S2D -FriendlyName <VirtualDiskName> -FileSystem CSVFS_ReFS -StorageTiers $MT,$PT -StorageTierSizes 100GB,900GB
The first two steps enable Storage Spaces Direct and create the storage pool. In the third step we define the two tiers. Notice that we use the “ResiliencySettingName” parameter in the definition of the tiers, where the MT tier has “ResiliencySettingName” set to “Mirror” and the PT tier has “ResiliencySettingName” set to “Parity”. When we subsequently create the virtual disk we specify the size of each tier, in this case 100GB of mirror and 900GB of parity, for a total virtual disk size of 1TB. ReFS uses this information to control its write and tiering behavior (which I will discuss in the next section).
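To confirm that the tiers ended up with the intended resiliency settings, the definitions can be read back. This is a minimal sketch, assuming the pool and tier names from the example above and that Get-StorageTier in your build exposes the ResiliencySettingName and PhysicalDiskRedundancy properties; $vdName is a placeholder for the name you used in step 4:

```powershell
# Tier definitions from step 3 (the names MT and PT come from the example above)
Get-StorageTier -FriendlyName MT, PT |
    Format-Table FriendlyName, ResiliencySettingName, PhysicalDiskRedundancy, MediaType

# Tiers that belong to the virtual disk created in step 4
$vdName = '<VirtualDiskName>'   # placeholder: replace with your virtual disk name
Get-VirtualDisk -FriendlyName $vdName | Get-StorageTier |
    Format-Table FriendlyName, ResiliencySettingName, @{ Label = 'SizeGB'; Expression = { $_.Size / 1GB } }
```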
The overall footprint of this virtual disk on the pool is 100GB * 3 (for three-copy mirror) + 900GB * 1.75 (for 4+3 erasure coding, which stores 7 units for every 4 units of data), which totals ~1.9TB. Compare this to a similarly sized three-copy mirror virtual disk, which would have a footprint of 3TB.
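The footprint arithmetic generalizes to other tier sizes. Here is a minimal sketch, assuming that a k+m erasure code stores (k+m)/k bytes in the pool per byte of user data, so 4+3 works out to 7/4 = 1.75. Get-PoolFootprint is a hypothetical helper for illustration, not a built-in cmdlet:

```powershell
# Hypothetical helper, not a built-in cmdlet: estimates the pool footprint
# of a virtual disk with one mirror tier and one parity tier.
function Get-PoolFootprint {
    param(
        [Parameter(Mandatory)] [double] $MirrorBytes,   # usable size of the mirror tier
        [Parameter(Mandatory)] [double] $ParityBytes,   # usable size of the parity tier
        [int] $MirrorCopies  = 3,                       # three-copy mirror
        [int] $DataSymbols   = 4,                       # the "4" in 4+3 erasure coding
        [int] $ParitySymbols = 3                        # the "3" in 4+3 erasure coding
    )
    # Assumption: a k+m code stores (k+m)/k bytes per byte of user data.
    $parityOverhead = ($DataSymbols + $ParitySymbols) / $DataSymbols
    $MirrorBytes * $MirrorCopies + $ParityBytes * $parityOverhead
}

$multiResilient = Get-PoolFootprint -MirrorBytes 100GB -ParityBytes 900GB
$mirrorOnly     = Get-PoolFootprint -MirrorBytes 1000GB -ParityBytes 0

# Report both footprints in GB
[math]::Round($multiResilient / 1GB)
[math]::Round($mirrorOnly / 1GB)
```

With 100GB of mirror and 900GB of parity this gives 1875 (GB), against 3000 (GB) for a three-copy mirror of the same usable size.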
Also notice that we specified “MediaType” as HDD for both tiers. If you are used to Windows Server 2012 R2 you might think that this is an error – but it is actually on purpose. For all intents and purposes the “MediaType” is irrelevant, as the SSD devices are already used by the Software Storage Bus and Storage Bus Cache as discussed in this blog post.
ReFS Real-Time Tiering
Now that we have created a multi-resilient virtual disk, let’s discuss how ReFS operates on this virtual disk. ReFS always writes into the mirror tier. If the write is an update to data sitting in the parity tier, then the new write still goes into the mirror tier and the old data in the parity tier is invalidated. This behavior ensures that writes are always written as a mirror operation, which performs best, especially for random IO workloads like virtual machines, and requires the least CPU resources.
Figure 2 ReFS write and data rotation
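The write rule described above can be illustrated with a toy model. This is only a sketch of the rule as stated in this post, not of how ReFS is implemented internally; the two hashtables and the Write-ToyBlock function are hypothetical names made up for the illustration:

```powershell
# Toy model only; it does not reflect how ReFS is implemented internally.
$mirrorTier = @{}                      # block number -> current data in the mirror tier
$parityTier = @{ 7 = 'old contents' }  # pretend block 7 was rotated to parity earlier

function Write-ToyBlock {
    param([int]$Block, [string]$Data)
    $mirrorTier[$Block] = $Data    # every write lands in the mirror tier
    $parityTier.Remove($Block)     # a stale parity-tier copy, if any, is invalidated
}

Write-ToyBlock -Block 7 -Data 'new contents'   # update to data that lived in the parity tier
Write-ToyBlock -Block 8 -Data 'fresh data'     # brand-new write

$mirrorTier.Count   # 2: both writes ended up in the mirror tier
$parityTier.Count   # 0: the old copy of block 7 is no longer valid
```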
The write will actually land in the Storage Bus Cache below the file system and virtual disk. The beauty of this arrangement is that there is not a fixed relation between the mirror tier and the caching devices, so if you happen to define a virtual disk with a mirror tier that is much larger than the actual working set for that virtual disk you are not wasting valuable resources.
ReFS will rotate data from the mirror tier into the parity tier in larger sequential chunks as needed and perform the erasure coding computation upon data rotation. As the data rotation occurs in larger chunks, it will skip the write-back cache and be written directly to the capacity devices, which is OK since it is sequential IO, which has much less impact, especially on rotational disks. Also, the larger writes overwrite entire parity stripes, eliminating the need to do the read-modify-write cycles that smaller writes to a parity Space would otherwise incur.
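To see why full-stripe writes matter, it helps to count device I/Os. The sketch below uses textbook accounting for a k+m parity stripe, not measurements of Storage Spaces: a small in-place update reads the old data and old parity symbols and then writes the new ones, while a full-stripe write only writes. Get-StripeWriteCost is a hypothetical helper, and the 4+3 defaults are taken from the layout mentioned earlier:

```powershell
# Hypothetical helper: textbook I/O accounting for one k+m parity stripe.
function Get-StripeWriteCost {
    param(
        [int]$DataSymbols   = 4,   # assumed from the 4+3 layout mentioned earlier
        [int]$ParitySymbols = 3
    )
    # Read-modify-write of a single data symbol:
    # read old data + old parities, then write new data + new parities.
    $rmwPerSymbol = 2 * (1 + $ParitySymbols)

    [pscustomobject]@{
        SmallWriteIOs = $rmwPerSymbol * $DataSymbols    # rewriting the stripe one symbol at a time
        FullStripeIOs = $DataSymbols + $ParitySymbols   # one write per symbol, no reads
    }
}

$cost = Get-StripeWriteCost
$cost
```

Under these assumptions, rewriting a 4+3 stripe one symbol at a time costs 32 device I/Os, while a single full-stripe write costs 7.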
They say you cannot have your cake and eat it too. In this case, however, you can have capacity efficiency with multi-resilient virtual disks and good performance with ReFS real-time tiering. These features are introduced in Technical Preview 4, and we expect to continue to improve performance as we move towards Windows Server 2016.