Hello, Claus here again. It has been a while since I last posted, and a few things have changed. Windows Server has moved into the Windows and Devices Group, and we have moved to a new building with a better café but a worse view 😊. On a personal note, I can be seen waddling through the hallways, as I have had foot surgery.
At Microsoft Ignite 2016, I did a demo at the 28-minute mark of the Meet Windows Server 2016 and System Center 2016 session. I showed how Storage Spaces Direct can deliver massive amounts of IOPS to many virtual machines with various storage QoS settings. I encourage you to watch it if you haven’t already, or go watch it again 😊. In the demo, we used a 16-node cluster connected over iWARP using 40GbE Chelsio T580CR adapters, showing 6M+ read IOPS. Since then, Chelsio has released their 100GbE T6 adapter, and we wanted to take a peek at what kind of network throughput would be possible with this new adapter.
We used the following hardware configuration:
- 4 nodes of Dell R730xd
- 2x E5-2660v3 2.6GHz 10c/20t
- 256GiB DDR4 2133MHz (16x 16GiB DIMMs)
- 2x Chelsio T6 100Gb NIC (PCIe 3.0 x16), one port connected on each, QSFP28 passive copper cabling
- Performance Power Plan
- 4x 3.2TB NVMe Samsung PM1725 (PCIe 3.0 x8)
- 4x SSD + 12x HDD (not in use: all load from Samsung PM1725)
- Windows Server 2016 + Storage Spaces Direct
- Cache: Samsung PM1725
- Capacity: SSD + HDD (not in use: all load from cache)
- 4x 2TB 3-way mirrored virtual disks, one per cluster node
- 20 Azure A1-sized VMs (1 vCPU, 1.75GiB RAM) per node
- OS High Performance Power Plan
- DISKSPD workload generator
- VM Fleet workload orchestrator
- 80 virtual machines with 16GiB file in VHDX
- 512KiB 100% random read at a queue depth of 3 per VM
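To put the configuration above in perspective, here is a small back-of-the-envelope sketch in Python. It only multiplies out the numbers from the list; the function and variable names are made up for illustration, and the raw cache figure ignores mirroring, metadata and any reserve, so treat it as a rough upper bound rather than usable capacity.

```python
# Back-of-the-envelope totals derived from the configuration list above.
# All names here are illustrative; only the numbers come from the post.

NODES = 4
VMS_PER_NODE = 20
QUEUE_DEPTH_PER_VM = 3
FILE_SIZE_GIB = 16        # test file inside each VM's VHDX
NVME_PER_NODE = 4
NVME_CAPACITY_TB = 3.2    # per Samsung PM1725, decimal TB


def fleet_totals():
    """Return aggregate figures for the whole 4-node cluster."""
    total_vms = NODES * VMS_PER_NODE
    return {
        "total_vms": total_vms,
        # I/Os in flight across the fleet at any moment
        "outstanding_ios": total_vms * QUEUE_DEPTH_PER_VM,
        # Combined size of all test files (before mirroring)
        "working_set_gib": total_vms * FILE_SIZE_GIB,
        # Raw NVMe capacity, ignoring mirroring and reserves
        "raw_nvme_tb": NODES * NVME_PER_NODE * NVME_CAPACITY_TB,
    }


if __name__ == "__main__":
    for name, value in fleet_totals().items():
        print(f"{name}: {value}")
```

The 80 test files add up to 1,280GiB, which is small next to the raw NVMe capacity in the cluster. That fits with the note in the list that the SSDs and HDDs were not in use during the run.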
We did not configure DCB (PFC) in our deployment, since it is not required in iWARP configurations.
Below is a screenshot from the VMFleet Watch-Cluster window, which reports IOPS, bandwidth and latency.
As you can see, the aggregate bandwidth exceeded 83GB/s, which is very impressive. Each VM realized more than 1GB/s of throughput, and the average read latency stayed below 1.5ms.
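If you want to sanity-check those numbers yourself, the sketch below relates bandwidth, I/O size, queue depth and latency using Little's Law (outstanding I/Os = IOPS x latency). It is a rough estimate, not how VM Fleet computes its output. The helper names are hypothetical, and because the post does not say whether "GB" means 10^9 or 2^30 bytes, the sketch evaluates both as an assumption.

```python
# Rough consistency check of the reported results using Little's Law.
# Helper names are hypothetical; inputs come from the post (83GB/s, 80 VMs,
# 512KiB I/Os, queue depth 3 per VM). The byte size of "GB" is an assumption.

IO_SIZE_BYTES = 512 * 1024
TOTAL_VMS = 80
OUTSTANDING_IOS = TOTAL_VMS * 3


def per_vm_throughput(aggregate, vms):
    """Average throughput per VM, in the same unit as the aggregate."""
    return aggregate / vms


def implied_iops(bandwidth_bytes_per_s, io_size_bytes):
    """IOPS needed to sustain the bandwidth at a fixed I/O size."""
    return bandwidth_bytes_per_s / io_size_bytes


def littles_law_latency_ms(outstanding_ios, iops):
    """Average latency implied by Little's Law, in milliseconds."""
    return outstanding_ios / iops * 1000.0


if __name__ == "__main__":
    print(f"per-VM throughput: {per_vm_throughput(83, TOTAL_VMS):.4f} GB/s")
    for label, unit in (("GB = 10^9 bytes", 10**9), ("GB = 2^30 bytes", 2**30)):
        iops = implied_iops(83 * unit, IO_SIZE_BYTES)
        latency = littles_law_latency_ms(OUTSTANDING_IOS, iops)
        print(f"{label}: ~{iops:,.0f} IOPS, ~{latency:.2f} ms average latency")
```

With 240 I/Os in flight, either reading of "GB" gives an implied average latency of roughly 1.4 to 1.5ms at 83GB/s, which is in line with what the Watch-Cluster window reported.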
Let me know what you think.
Until next time