Storage Spaces Direct with Persistent Memory

Howdy, Claus here again, this time with Dan Lovinger.

At our recent Ignite conference we had some very exciting results and experiences to share around Storage Spaces Direct and Windows Server 2016. One of the more exciting ones that you may have missed was an experiment we did on a set of systems built with the help of Mellanox and Hewlett-Packard Enterprise’s NVDIMM-N technology.

What’s exciting about NVDIMM-N is that it is part of the first wave of new memory technologies referred to as Persistent Memory (PM), sometimes also referred to as Storage Class Memory (SCM ). A PM device offers persistent storage – stays around after the server resets or the power drops – but can be on the super high speed memory bus, and accessible at the granularity (bytes not blocks!) and latencies we’re more familiar with for memory. In the case of NVDIMM-N it is literally memory (DRAM) with the addition of natively persistent storage, usually NAND flash, and some power capacity and to allow capture of the DRAM to that persistent storage regardless of conditions.

These 8 HPE ProLiant DL380 Gen9 nodes had Mellanox CX-4 100Gb adapters connected through a Mellanox Spectrum switch and 16 8GiB NVDIMM-N modules along with 4 NVMe flash drives – each – for an eye-watering 1TiB of NVDIMM-N around the cluster.

Of course, being storage nerds, what did we do: we created three-way mirrored Storage Spaces Direct virtual disks over each type of storage – NVMe and, in their block personality, the NVDIMM-N – and benched them off. Our partners in SQL Server showed it like this:PMperf

What we’re seeing here are simple, low intensity DISKSPD loads – equal in composition – which lets us highlight the relative latencies of each type of storage. In the first pair of 64K IO tests we see the dramatic difference which gets PM up to the line rate of the 100Gb network before NVME is even 1/3rd of the way there. In the second we can see how PM neutralizes the natural latency of going all the way into a flash device – even as efficient and high speed as our NVMe devices were – and provides reads at less than 180us to the 99th percentile – 99% of the read IO was over three times faster for three-way mirrored, two node fault tolerant storage!

We think this is pretty exciting! Windows Server is on a journey to integrate Persistent Memory and this is one of the steps along the way. While we may do different things with it in the future, this was an interesting experiment to point to where we may be able to go (and more!).

Let us know what you think.

Claus and Dan.

p.s. if you’d like to see the entire SQL Server 2016 & HPE Persistent Memory presentation at Ignite (video available!), follow this link: https://myignite.microsoft.com/sessions/2767