This document discusses one of several design patterns that you can use for building out your Infrastructure as a Service fabric infrastructure. This document covers the converged architecture pattern, which is defined by converging networks for each of the traffic profiles used in a Microsoft IaaS solution.
This document is part of the Microsoft Infrastructure as a Service Foundations series. For more information about the series, see Chapter 1: Microsoft Infrastructure as a Service Foundations.
Adam Fazio – Microsoft
David Ziembicki – Microsoft
Joel Yoker – Microsoft
Artem Pronichkin – Microsoft
Jeff Baker – Microsoft
Michael Lubanski – Microsoft
Robert Larson – Microsoft
Steve Chadly – Microsoft
Alex Lee – Microsoft
Yuri Diogenes – Microsoft
Carlos Mayol Berral – Microsoft
Ricardo Machado – Microsoft
Sacha Narinx – Microsoft
Tom Shinder – Microsoft
Jim Dial – Microsoft
Windows Server 2012 R2
System Center 2012 R2
Windows Azure Pack – October 2014 feature set
Microsoft Azure – October 2014 feature set
The goal of the Infrastructure-as-a-Service (IaaS) Foundations series is to help enterprise IT and cloud service providers understand, develop, and implement IaaS infrastructures. This series provides comprehensive conceptual background, a reference architecture and a reference implementation that combines Microsoft software, consolidated guidance, and validated configurations with partner technologies such as compute, network, and storage architectures, in addition to value-added software features.
The IaaS Foundations Series utilizes the core capabilities of the Windows Server operating system, Hyper-V, System Center, Windows Azure Pack and Microsoft Azure to deliver on-premises and hybrid cloud Infrastructure as a Service offerings.
As part of the Microsoft IaaS Foundations series, this document discusses one of several design patterns that you can use for building out your Infrastructure as a Service fabric infrastructure. This document covers the converged architecture pattern, which is defined by converging the networks for each of the traffic profiles used in a Microsoft IaaS solution.
This section contains an architectural example that is based on the Microsoft Infrastructure as a Service converged-pattern. This example will provide guidance about the hardware that is required to build the converged pattern reference architecture by using high-level, non-OEM–specific system models.
The converged pattern comprises advanced blade servers that utilize a converged-network and storage-network infrastructure (often referred to as converged network architecture) to support a high availability Hyper-V failover-cluster fabric infrastructure. This infrastructure pattern provides the performance of a large-scale Hyper-V host infrastructure and the flexibility of utilizing software-defined networking capabilities at a higher system density than can be achieved through traditional non-converged architectures.
Although many aspects of the converged and non-converged architectures are the same, this section outlines the key differences between the two patterns. The following diagram outlines an example logical structure of components that follow the converged architectural pattern.
Figure 1 Converged architecture pattern
In the converged pattern, the physical converged network adapters (CNAs) are teamed, and they present network adapters and Fibre Channel HBAs to the parent operating system. From the perspective of the parent operating system, it appears that network adapters and Fibre Channel HBAs are installed. Teaming and other settings are configured at the hardware level.
As in the non-converged pattern, compute infrastructure remains the primary element that provides fabric scale to support a large number of workloads. The converged fabric infrastructure likewise consists of an array of hosts that have the Hyper-V role enabled, which lets the fabric achieve scale in the form of a large-scale failover cluster.
Figure 2 provides an overview of the compute layer of the private cloud fabric infrastructure.
Figure 2 Compute minimum configuration
The compute infrastructure of the converged pattern is similar to that of the non-converged pattern. The exception is storage connectivity: the Hyper-V host clusters utilize FCoE or iSCSI to connect to storage over a high-speed, converged network architecture.
2.1.1 Hyper-V Host Infrastructure
As in non-converged infrastructures, the server infrastructure comprises a minimum of four hosts and a maximum of 64 hosts in a single Hyper-V failover-cluster instance. Although Windows Server 2012 R2 failover clustering supports a minimum of two nodes, a configuration at that scale does not provide sufficient reserve capacity to achieve cloud attributes such as elasticity and resource pooling.
Converged infrastructures typically utilize blade servers and enclosures to provide compute capacity. In large-scale deployments in which multiple resource pools exist across multiple blade enclosures, we recommend that no single blade enclosure contain more than 25 percent of the nodes of any one cluster.
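The sizing guidelines above can be expressed as a simple layout check. The following is a minimal sketch, not a supported tool: the host and enclosure names and the mapping format are hypothetical, and the 25 percent guideline is applied to every cluster, although the guidance states it for large-scale deployments.

```python
from collections import Counter

MIN_HOSTS = 4                # minimum hosts per cluster in this pattern
MAX_HOSTS = 64               # maximum hosts in a single failover cluster
MAX_ENCLOSURE_SHARE = 0.25   # at most 25 percent of a cluster per enclosure


def check_cluster_layout(host_to_enclosure):
    """Return a list of guideline violations for one cluster.

    host_to_enclosure maps a host name to the blade enclosure that holds it.
    The names and the mapping format are hypothetical.
    """
    problems = []
    host_count = len(host_to_enclosure)
    if host_count < MIN_HOSTS:
        problems.append(f"cluster has {host_count} hosts; minimum is {MIN_HOSTS}")
    if host_count > MAX_HOSTS:
        problems.append(f"cluster has {host_count} hosts; maximum is {MAX_HOSTS}")
    per_enclosure = Counter(host_to_enclosure.values())
    for enclosure, count in sorted(per_enclosure.items()):
        share = count / host_count
        if share > MAX_ENCLOSURE_SHARE:
            problems.append(
                f"{enclosure} holds {count} of {host_count} hosts "
                f"({share:.0%}); guideline is at most 25%"
            )
    return problems


# Eight hosts spread evenly across four enclosures: 25 percent each.
balanced = {f"host{i:02d}": f"enclosure{i % 4}" for i in range(8)}
# Eight hosts packed into two enclosures: 50 percent each.
packed = {f"host{i:02d}": f"enclosure{i % 2}" for i in range(8)}

print(check_cluster_layout(balanced))
print(check_cluster_layout(packed))
```

The balanced layout produces no findings, while the packed layout is flagged once per enclosure. Losing one enclosure in the packed layout would remove half of the cluster nodes at once.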
When you are designing the fabric network for the Windows Server 2012 R2 Hyper-V failover cluster, it is important to provide the necessary hardware and network throughput to provide resiliency and Quality of Service (QoS). Resiliency can be achieved through availability mechanisms, and QoS can be provided through dedicated network interfaces or through a combination of hardware and software QoS capabilities.
Figure 3 provides an overview of the network layer of the private cloud fabric infrastructure.
Figure 3 Network minimum configuration
2.2.1 Host Connectivity
During the design of the network topology and associated network components of the private cloud infrastructure, the following key considerations apply:
- Provide adequate network port density: Designs should contain top-of-rack switches with sufficient density to support all host network interfaces.
- Provide adequate interfaces to support network resiliency: Designs should contain a sufficient number of network interfaces to establish redundancy through NIC teaming.
- Provide network Quality of Service: Having dedicated cluster networks is an acceptable way to achieve QoS. However, the use of high-speed network connections in combination with either hardware-defined or software-defined network QoS policies provides a more flexible solution.
For Microsoft IaaS pattern designs, the baseline network connectivity for the fabric architecture is a minimum of two 10 GbE converged network adapters (CNAs) and one out-of-band (OOB) management connection. The two CNAs are used for cluster traffic, and the third interface is available as a management interface.
To provide resiliency, additional interfaces can be added and teamed by using a network adapter teaming solution from your OEM hardware. We recommend that you provide redundant network communication between all private cluster nodes.
Host connectivity in a private cloud infrastructure should support the following types of communication, which are required by Hyper-V and the failover clusters that make up the fabric:
- Host management
- Virtual machine
- Live migration
- FCoE or iSCSI
- Intra-cluster communication and CSV
In a converged network architecture, LAN and storage traffic utilize Ethernet as the transport. Fibre Channel over Ethernet (FCoE) and iSCSI are possible choices for the converged infrastructure pattern. Although using SMB 3.0 could also be considered a converged architecture, it is broken out into a separate design pattern, the Software-defined architecture pattern.
The converged pattern refers to either FCoE or iSCSI approaches. Proper network planning is critical in a converged design. Use of Quality of Service (QoS), VLANs, and other isolation or reservation approaches is strongly recommended, so that storage and LAN traffic is appropriately balanced.
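To illustrate how a reservation approach balances storage and LAN traffic on a shared link, the following minimal sketch divides the capacity of one 10 GbE port across the traffic types listed earlier in proportion to a minimum-bandwidth weight. The weights are hypothetical placeholders, not recommended values, and the sketch does not configure any hardware or software QoS feature.

```python
LINK_GBPS = 10  # capacity of one 10 GbE converged network adapter port

# Hypothetical minimum-bandwidth weights; these are not recommended values.
weights = {
    "host management": 5,
    "virtual machine": 30,
    "live migration": 20,
    "FCoE or iSCSI": 35,
    "intra-cluster and CSV": 10,
}


def guaranteed_gbps(class_weights, link_gbps):
    """Split link capacity across traffic classes in proportion to weight."""
    total = sum(class_weights.values())
    return {name: link_gbps * w / total for name, w in class_weights.items()}


shares = guaranteed_gbps(weights, LINK_GBPS)
for name, gbps in shares.items():
    print(f"{name:<24}{gbps:>5.1f} Gbps minimum under contention")
```

A weight-based reservation only takes effect when the link is congested. When the link is idle, any traffic class can use the spare capacity, which is what makes this approach more flexible than dedicated interfaces.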
Storage provides the final component for workload scaling, and as with any workload, it must be designed properly to provide the required performance and capacity for overall fabric scale. In a converged fabric infrastructure, connectivity to the storage uses an Ethernet-based approach, such as iSCSI or FCoE.
Figure 4 provides an overview of the storage infrastructure for the converged pattern.
Figure 4 Storage minimum configuration
2.3.1 Storage Connectivity
If the operating system volume of the host system resides on direct-attached storage, an internal SATA or SAS controller is required. The controller is not required if the design utilizes the SAN for all system-storage requirements, including boot from SAN for the host operating system (Fibre Channel and iSCSI boot are supported in Windows Server 2012 R2).
Depending on the storage protocol and devices that are used in the converged storage design, the following adapters are required to allow shared storage access:
- If using Fibre Channel SAN, two or more converged network adapters (CNAs)
- If using iSCSI, two or more 10-gigabit Ethernet (10 GbE) network adapters or iSCSI HBAs
Windows Server 2012 R2 Hyper-V supports the ability to present SAN storage to the guest workloads that are hosted on the fabric infrastructure by using virtual Fibre Channel adapters. Virtual SANs are logical equivalents of virtual network switches within Hyper-V, and each Virtual SAN maps to a single physical Fibre Channel uplink. To support multiple CNAs, a separate Virtual SAN must be created per physical Fibre Channel CNA and mapped exactly to its corresponding physical topology.
When configurations use multiple CNAs, MPIO must be enabled within the virtual machine workload itself. Virtual SAN assignment should follow a similar pattern as Hyper-V virtual switch assignment in that, if there are different classifications of service within the SAN, it should be reflected within the fabric.
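The mapping rules above can be sketched as a small planning helper. This is an illustration only: the port, virtual SAN, and adapter names are hypothetical, and giving each virtual machine one virtual Fibre Channel adapter per Virtual SAN is an assumption made for the example.

```python
def plan_virtual_sans(cna_ports):
    """Create one Virtual SAN per physical Fibre Channel CNA port.

    Returns a mapping of Virtual SAN name to the single physical port it uses.
    """
    return {f"vSAN-{port}": port for port in cna_ports}


def plan_vm_adapters(vm_name, virtual_sans):
    """Give the VM one virtual Fibre Channel adapter per Virtual SAN (assumed)."""
    adapters = [f"{vm_name}-vfc-{san}" for san in sorted(virtual_sans)]
    return {
        "adapters": adapters,
        # More than one path to storage means MPIO is needed inside the guest.
        "mpio_required_in_guest": len(adapters) > 1,
    }


# Hypothetical host with two physical CNA ports.
sans = plan_virtual_sans(["cna0", "cna1"])
vm_plan = plan_vm_adapters("sql01", sans)

print(sans)
print(vm_plan)
```

With two CNA ports, the plan contains two Virtual SANs and two virtual adapters, so MPIO is flagged as required inside the guest. A host with a single port would produce one adapter and no MPIO requirement.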
All physical Fibre Channel equipment must support NPIV. Hardware vendors must also provide drivers that display the Designed for Windows logo for all Fibre Channel CNAs, unless drivers are provided in-box. If zoning that is based on physical Fibre Channel switch ports is part of the fabric design, all physical ports must be added to allow for virtual machine mobility scenarios across hosts in the fabric cluster.
Although virtual machines can support iSCSI boot, boot from SAN is not supported over the virtual Fibre Channel adapter and should not be considered as part of workload design.
2.3.2 Storage Infrastructure
The key attribute of the storage infrastructure for the converged pattern is the use of a traditional SAN infrastructure, but it is accessed through an Ethernet transport for the fabric, fabric management, and workload layers. The primary reason to adopt or maintain this design is to preserve existing investments in a SAN infrastructure or to maintain the current level of flexibility and capabilities that a SAN-based storage-array architecture provides, while consolidating to a single Ethernet network infrastructure.
For Hyper-V failover-cluster and workload operations in a converged infrastructure, the fabric components utilize the following types of storage:
- Operating system: Non-shared physical boot disks (direct-attached storage or SAN) for the fabric management host servers (unless using boot from SAN)
- Cluster witness: Shared witness disk or file share to support the failover cluster quorum
- Cluster Shared Volumes (CSV): One or more shared CSV LUNs for virtual machines (Fibre Channel or iSCSI), as presented by the SAN
- Guest clustering [optional]: Shared Fibre Channel, shared VHDX, or shared iSCSI LUNs for guest clustering
Figure 5 provides a conceptual view of this architecture for the converged pattern.
Figure 5 Converged architecture pattern
The fabric and fabric management host controllers require sufficient storage to account for the operating system and paging files. In Windows Server 2012 R2, we recommend that virtual memory be configured as “Automatically manage paging file size for all drives.”
Although boot from SAN over Fibre Channel or iSCSI storage is supported in Windows Server 2012 R2 and Windows Server 2012, it is common practice to configure local storage in each server for this purpose, given the configuration of standard non-converged servers. In these cases, local storage should include, as a minimum, two disks that are configured as RAID 1 (mirror), with an optional global hot spare.
To provide quorum for the server infrastructure, we recommend that you utilize a quorum configuration of Node and Disk Majority, which requires a cluster witness disk. In converged pattern configurations, we recommend that a 1 GB witness disk that is formatted as NTFS be provided for all fabric and fabric management clusters. This provides resiliency and prevents partition-in-time scenarios within the cluster.
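The effect of the witness disk on quorum can be shown with simple vote arithmetic. The following minimal sketch assumes a static vote count in which each node and the witness disk hold one vote, and it ignores any dynamic vote adjustment that the cluster may perform at run time. The function names are hypothetical.

```python
def votes_needed(node_count, has_witness=True):
    """Votes required for quorum: a strict majority of all votes."""
    total_votes = node_count + (1 if has_witness else 0)
    return total_votes // 2 + 1


def tolerated_vote_losses(node_count, has_witness=True):
    """How many votes can be lost while the cluster still keeps quorum."""
    total_votes = node_count + (1 if has_witness else 0)
    return total_votes - votes_needed(node_count, has_witness)


for nodes in (4, 64):
    with_witness = tolerated_vote_losses(nodes, has_witness=True)
    without_witness = tolerated_vote_losses(nodes, has_witness=False)
    print(f"{nodes} nodes: tolerates {with_witness} lost votes with a witness, "
          f"{without_witness} without")
```

For a cluster with an even number of nodes, such as the four-host minimum of this pattern, the witness vote makes the total odd. The cluster can then lose half of its nodes and still hold a majority, as long as the witness remains available.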
Windows Server 2012 R2 provides multiple host access to a shared disk infrastructure through CSV. For converged patterns, the SAN should be configured to provide adequate storage for virtual machine workloads. Because workload virtual disks often exceed multiple gigabytes, we recommend that where it is supported by the workload, dynamically expanding disks be used to provide higher density and more efficient use of storage.
Additional SAN capabilities such as thin provisioning of LUNs can assist with the consumption of physical space. However, this functionality should be evaluated to help make sure that workload performance is not affected.
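When dynamically expanding disks or thin-provisioned LUNs are used, the space promised to workloads can exceed the physical space available. The following minimal sketch compares the two for one CSV LUN. All disk sizes and the LUN capacity are hypothetical example figures.

```python
def summarize_capacity(disks, lun_capacity_gb):
    """Compare provisioned and used space against physical LUN capacity.

    disks is a list of (maximum_size_gb, currently_used_gb) tuples, one per
    dynamically expanding virtual disk stored on the LUN.
    """
    provisioned = sum(max_gb for max_gb, _ in disks)
    used = sum(used_gb for _, used_gb in disks)
    return {
        "provisioned_gb": provisioned,
        "used_gb": used,
        "overcommit_ratio": provisioned / lun_capacity_gb,
        "used_fraction": used / lun_capacity_gb,
        "overcommitted": provisioned > lun_capacity_gb,
    }


# Hypothetical virtual disks: (maximum size in GB, space in use in GB).
example_disks = [(127, 40), (127, 35), (500, 120), (250, 60)]
summary = summarize_capacity(example_disks, lun_capacity_gb=600)

for key, value in summary.items():
    print(f"{key}: {value}")
```

In this example the LUN is less than half full, but the virtual disks could grow to more than its physical capacity. An overcommitted LUN needs free-space monitoring so that it can be extended before the disks expand into space that does not exist.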
For the purposes of Hyper-V failover clustering, a CSV must be configured in Windows as a basic disk that is formatted as NTFS (FAT and FAT32 are not supported for CSV). A CSV cannot be used as a witness disk, and it cannot have Windows Data Deduplication enabled. Although supported, ReFS should not be used on a CSV that hosts Hyper-V workloads.
There is no restriction on the number of virtual machines that an individual CSV can support, because metadata updates on a CSV are orchestrated on the server side and run in parallel, which avoids interruption and increases scalability.
Performance considerations fall primarily on the IOPS that the SAN provides, given that multiple servers from the Hyper-V failover-cluster stream I/O to a commonly shared LUN. Providing more than one CSV to the Hyper-V failover cluster within the fabric can increase performance, depending on the SAN configuration.
To support guest clustering, LUNs can be presented to the guest operating system through iSCSI or Fibre Channel. Configurations for the converged pattern should include sufficient space on the SAN to support a small number of LUNs that support workloads with high availability requirements that must be satisfied within the guest virtual machines and associated applications.