Storage Spaces Direct with Samsung Z-SSD™

Hello, Claus here again.

Today we are going to take a look at a new device from Samsung, the SZ985, which is marketed as an ultra-low-latency NVMe SSD based on Samsung Z-NAND flash memory and a new NVMe controller. It offers ~3GB/s throughput and random reads in 20µs, with capacities up to 3.2TB and an endurance rating of 30 drive writes per day (DWPD).
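
To put the DWPD figure in perspective, here is a quick sketch of the arithmetic in Python. The function name is my own, and I am assuming the 30 DWPD rating applies equally to every capacity point, which the numbers above do not confirm.

```python
def daily_write_budget_tb(capacity_gb, dwpd):
    """Return the rated write volume per day in decimal terabytes.

    DWPD (drive writes per day) is the number of times the full
    drive capacity can be written each day within the rating.
    """
    return capacity_gb * dwpd / 1000


# 800GB is the model used in this test; 3200GB is the largest capacity mentioned.
# Assumption: both carry the same 30 DWPD rating.
budget_800 = daily_write_budget_tb(800, 30)
budget_3200 = daily_write_budget_tb(3200, 30)
print(f"800GB drive:  {budget_800} TB/day")
print(f"3.2TB drive:  {budget_3200} TB/day")
```

Under that assumption, an 800GB device would be rated for roughly 24TB of writes per day.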

We added two Z-SSD devices to each server in a 4-node cluster, each node configured with the following hardware:

  • 2x Intel® Xeon® E5-2699v4 (22 cores @ 2.2 GHz)
  • 128GiB DDR4 DRAM
  • 2x 800GB Samsung Z-SSD
  • 20x SATA SSD
  • 1x Mellanox CX-3 Pro 2x40Gb
  • BIOS configuration
    • BIOS performance profile
    • C States disabled
    • Hyper-threading on
    • Speedstep/Turbo on

We deployed Windows Server 2016 Storage Spaces Direct and VMFleet with:

  • 4x 3-way mirror CSV volumes
  • Cache configured for read/write
  • 44 VMs per node, each with
    • DISKSPD v2.0.17
    • 5GB working set (~900GB total)
    • 1 IO thread
    • 8 QD
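
As a rough sanity check on this configuration, the aggregate working set and the maximum number of outstanding I/Os follow from simple multiplication. The snippet below is only an illustrative Python sketch with names of my own choosing; it does not read anything from VMFleet or DISKSPD.

```python
# Back-of-the-envelope totals for the VMFleet configuration described above.
# All names are illustrative; the values are copied from the lists in this post.
NODES = 4
VMS_PER_NODE = 44
WORKING_SET_GB_PER_VM = 5
THREADS_PER_VM = 1
QUEUE_DEPTH_PER_THREAD = 8

total_vms = NODES * VMS_PER_NODE
total_working_set_gb = total_vms * WORKING_SET_GB_PER_VM
# Upper bound on I/Os in flight when every VM keeps its queue full.
max_outstanding_ios = total_vms * THREADS_PER_VM * QUEUE_DEPTH_PER_THREAD

print(f"VMs: {total_vms}")
print(f"Working set: {total_working_set_gb} GB")
print(f"Max outstanding I/Os: {max_outstanding_ios}")
```

That works out to 176 VMs and 880GB of working set, which is where the ~900GB figure comes from.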

First, we took a look at a 100% read scenario. The graph shows that observed latency at the top of the storage stack stayed relatively constant as we ramped IOPS. The high point is 200µs at 200K IOPS, with latency staying between 100µs and 150µs at 400K+ IOPS.

The graph below shows CPU utilization increasing linearly as we ramp up IOPS, which is expected.

Second, we took a look at a 90% read and 10% write scenario, which is more common. Writes have to be performed on multiple nodes to ensure resiliency, which involves network communication, so they are a bit slower than local read operations. Still, write latency stayed under 1ms at 1M+ IOPS, and read latency stayed very close to what was seen with 100% read.
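
To make the cost of mirrored writes a bit more concrete, here is a small Python sketch of how a read/write mix translates into device-level operations. It assumes every write is committed to all three copies of the 3-way mirror and every read is served from a single copy. The function name and the 1M IOPS input are only illustrative, and the sketch ignores the cache and the network entirely.

```python
def backend_ops(front_end_iops, read_percent, mirror_copies=3):
    """Estimate device-level reads and writes for a mirrored volume.

    Assumes each read is served by one copy and each write is
    committed to every mirror copy. Caching effects are ignored.
    """
    reads = front_end_iops * read_percent // 100
    writes = front_end_iops - reads
    return reads, writes * mirror_copies


reads, mirrored_writes = backend_ops(1_000_000, 90)
print(f"Device reads:  {reads}")
print(f"Device writes: {mirrored_writes}")
```

Under these assumptions, the 10% of operations that are writes account for three device writes each, spread across the nodes.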

Similar to the 100% read scenario, CPU utilization increases linearly as we increase IOPS pressure on the system in the 90% read and 10% write scenario.

It is good to see the innovation in driving down latency in flash storage to the benefit of relational database servers, like SQL Server, and caches, like the Storage Spaces Direct cache. I look forward to seeing these devices in our Windows Server Software-Defined datacenter solutions.

What do you think?

Until next time

Claus