This document discusses the network infrastructure components that are relevant for a Microsoft IaaS infrastructure and provides guidelines and requirements for building an IaaS network infrastructure using Microsoft products and technologies.
This document is part of the Microsoft Infrastructure as a Service Foundations series.
For more information about the Microsoft Infrastructure as a Service Foundations series, please see Chapter 1: Microsoft Infrastructure as a Service Foundations
Adam Fazio – Microsoft
David Ziembicki – Microsoft
Joel Yoker – Microsoft
Artem Pronichkin – Microsoft
Jeff Baker – Microsoft
Michael Lubanski – Microsoft
Robert Larson – Microsoft
Steve Chadly – Microsoft
Alex Lee – Microsoft
Carlos Mayol Berral – Microsoft
Ricardo Machado – Microsoft
Sacha Narinx – Microsoft
Tom Shinder – Microsoft
Cheryl McGuire – Microsoft
Joe Davies – Microsoft
Windows Server 2012 R2
System Center 2012 R2
Windows Azure Pack – October 2014 feature set
Microsoft Azure – October 2014 feature set
The goal of the Infrastructure-as-a-Service (IaaS) Foundations series is to help enterprise IT and cloud service providers understand, develop, and implement IaaS infrastructures. This series provides comprehensive conceptual background, a reference architecture and a reference implementation that combines Microsoft software, consolidated guidance, and validated configurations with partner technologies such as compute, network, and storage architectures, in addition to value-added software features.
The IaaS Foundations Series utilizes the core capabilities of the Windows Server operating system, Hyper-V, System Center, Windows Azure Pack and Microsoft Azure to deliver on-premises and hybrid cloud Infrastructure as a Service offerings.
As part of Microsoft IaaS Foundations series, this document discusses the network infrastructure components that are relevant for a Microsoft IaaS infrastructure and provides guidelines and requirements for building a network infrastructure using Microsoft products and technologies. These components can be used to compose an IaaS solution based on private clouds, public clouds (for example, in a hosting service provider environment) or hybrid clouds. Each major section of this document will include sub-sections on private, public and hybrid infrastructure elements. Discussions of public cloud components are scoped to Microsoft Azure services and capabilities.
A variety of designs and new approaches to data center networks have emerged in recent years. The objective in most cases is to improve resiliency and performance while optimizing the network for highly virtualized environments.
2.1.1 Network Architecture Patterns
Many network architectures include a hierarchical design with three or more tiers including:
- Core tier
- Aggregation (or distribution) tier
- Access tier
Designs are driven by the port bandwidth and quantity that are required at the edge, in addition to the ability of the core and aggregation tiers to provide higher speed uplinks to aggregate traffic. Additional considerations include Ethernet broadcast boundaries and limitations and loop-avoidance technologies.
The core tier is the high-speed backbone for the network architecture. The core typically comprises two modular switch chassis to provide a variety of service and interface module options. The data center core tier might interface with other network modules.
The aggregation (or distribution) tier consolidates connectivity from multiple access tier switch uplinks. This tier is commonly implemented in:
- End-of-row switches
- Centralized wiring closet, or
- Main distribution frame room
The aggregation tier provides high-speed switching and more advanced features, like Layer 3 routing and other policy-based networking capabilities. The aggregation tier must have redundant, high-speed uplinks to the core tier for high availability.
The access tier provides device connectivity to the data center network. This tier is commonly implemented by using Layer 2 Ethernet switches—typically through blade chassis switch modules or top-of-rack (ToR) switches. The access tier must provide:
- Redundant connectivity for devices
- Required port features
- Adequate capacity for access (device) ports and uplink ports
The access tier can also provide features that are related to NIC Teaming, such as Link Aggregation Control Protocol (LACP). Certain teaming solutions might require LACP switch features.
Figure 1 illustrates two three-tier network models—one provides 10 GbE to devices and the other provides 1 GbE to devices.
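One way to reason about the uplink requirement described above is the oversubscription ratio of an access switch: the total device-facing bandwidth divided by the total uplink bandwidth. The following is a minimal sketch; the function name, port counts, and speeds are hypothetical and are not taken from Figure 1.

```python
def oversubscription_ratio(device_ports, device_gbps, uplink_ports, uplink_gbps):
    """Return device-facing bandwidth divided by uplink bandwidth for one access switch."""
    if uplink_ports <= 0 or uplink_gbps <= 0:
        raise ValueError("an access switch needs at least one uplink")
    downstream = device_ports * device_gbps   # bandwidth offered to devices
    upstream = uplink_ports * uplink_gbps     # bandwidth toward the aggregation tier
    return downstream / upstream


# Hypothetical top-of-rack switches (illustrative numbers only).
print(oversubscription_ratio(48, 1, 2, 10))    # 48 x 1 GbE ports, 2 x 10 GbE uplinks
print(oversubscription_ratio(48, 10, 4, 40))   # 48 x 10 GbE ports, 4 x 40 GbE uplinks
```

A ratio above 1 means that the devices can collectively offer more traffic than the uplinks can carry, which is acceptable only when the devices are not expected to transmit at line rate simultaneously.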
Flat Network
A flat network topology is adequate for very small networks. In a flat network design, there is no hierarchy. Each networking device has essentially the same job, and the network is not divided into layers or modules. A flat network topology is easy to design and implement, and it is easy to maintain if the network stays small.
When the network grows, however, a flat network is undesirable. The lack of hierarchy makes troubleshooting difficult because instead of being able to concentrate troubleshooting efforts in just one area of the network, you might have to inspect the entire network.
Network Virtualization (Software-Defined Networking)
Network virtualization introduces the concept of virtual networks (which are not the same as virtual private networks, or VPNs), each composed of one or more virtual subnets. With virtual networks, the exact physical location of an IP subnet is decoupled from the virtual network topology. As a result, you can move subnets to the cloud while preserving your existing IP addresses and topology, so that existing services continue to work and remain unaware of the physical location of the subnets.
While you can move on-premises subnets into the cloud using Hyper-V Network Virtualization, you will still need to configure your routing infrastructure to support the new location of a formerly on-premises network ID.
Hyper-V Network Virtualization in Windows Server 2012 R2 provides policy-based, software-controlled network virtualization that reduces the management overhead that enterprises face when they expand dedicated Infrastructure as a Service (IaaS) clouds. In addition, it provides cloud hosting service providers with greater flexibility and scalability for managing virtual machines so that they can achieve higher resource utilization.
2.1.2 Network Performance and Low Latency
Data Center Bridging
Data-center bridging (DCB) refers to enhancements to Ethernet LANs that are used in data center environments. These enhancements consolidate the various forms of a network into a single technology, known as a converged network adapter or CNA. In a virtualized environment, Hyper-V in Windows Server 2012 R2 and Windows Server 2012 can utilize data-center bridging hardware to converge multiple types of network traffic onto a single network adapter.
Data-center bridging is a hardware mechanism that classifies and dispatches network traffic, supporting far fewer traffic flows. It converges various types of traffic, including network, storage, management, and live migration traffic. It can also classify network traffic that does not originate from the networking stack (for example, iSCSI traffic that is accelerated in hardware when the system does not use the Microsoft iSCSI initiator).
Advantages of DCB in a cloud networking infrastructure include:
- Simpler management, because you need to deploy and manage only one fabric.
- Fewer points of network connection that might fail.
- Lower cost, because there are fewer NICs, cables, and switches, as well as reduced power requirements.
Virtual Machine Queue (VMQ)
The virtual machine queue (VMQ) feature allows the network adapter of the host to pass DMA packets directly into individual virtual machine memory stacks. VMQ allows the single network adapter of the host to appear as multiple network adapters to the virtual machines, to allow each virtual machine its own dedicated network adapter. The result is less data in the buffers of the host and an overall performance improvement in I/O operations, thus allowing for greater virtual machine density per IaaS host server.
Windows Server 2012 R2 dynamically distributes the incoming network traffic to host processors, based on processor use and network load. In times of heavy network load, dynamic VMQ automatically uses more processors. In times of light network load, dynamic VMQ relinquishes those same processors.
IPsec Task Offload
IPsec protects network communication by authenticating and encrypting some or all of the content of network packets. IPsec Task Offload in Windows Server 2012 R2 utilizes the hardware capabilities of server network adapters to offload IPsec processing. This significantly reduces the CPU overhead of IPsec encryption and decryption, which frees processor capacity for the virtual machines running on host servers in a cloud infrastructure. This enables you to increase virtual machine density on each host server.
In Windows Server 2012 R2, IPsec Task Offload is extended to virtual machines. Customers who use virtual machines and want to help protect their network traffic by using IPsec can utilize the IPsec hardware offload capability that is available in server network adapters. Doing so frees up CPU cycles to perform more application-level work and leaves the per-packet encryption and decryption to hardware. Extending IPsec Task Offload to virtual machines enables you to further increase virtual machine density on host servers.
Quality of Service (QoS)
Quality of Service (QoS) is a set of technologies that provide you the ability to cost-effectively manage network traffic in network environments. There are three options for deploying QoS in Windows Server:
- Data center bridging. This is performed by hardware, and it is good for iSCSI environments. However, it requires hardware investments and it can be complex.
- Policy-based QoS. Historically present in Windows Server, this capability is managed with Group Policy. Challenges include that it does not provide the required capabilities within the Microsoft iSCSI initiator or Hyper-V environments.
- Hyper-V QoS. This capability works well for virtual machine workloads and virtual network adapters on servers running Hyper-V. However, it requires careful planning and an implementation strategy because it is not managed with Group Policy. (Networking is managed somewhat differently in Virtual Machine Manager.)
Windows Server 2012 R2 includes the ability to assign a maximum bandwidth to a virtual machine or service. This feature is important for hosting service providers and companies that honor SLA clauses that promise a minimum network bandwidth to customers, as machines that are not constrained by maximum bandwidth could monopolize the pipe. It is equally important to enterprises that require predictable network performance when they run virtualized server workloads on shared hardware.
In addition to the ability to limit maximum bandwidth, QoS in Windows Server 2012 R2 provides a new bandwidth management feature: minimum bandwidth. Unlike maximum bandwidth, which is a bandwidth cap, minimum bandwidth is a bandwidth floor, and it assigns a certain amount of bandwidth to a specific type of traffic. This enables the cloud service provider (whether commercial or enterprise) to make good on network SLAs. It is possible to implement minimum and maximum bandwidth limits simultaneously.
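The interaction between a bandwidth floor and a bandwidth cap can be illustrated with a small allocation sketch. This is not how Hyper-V QoS is implemented; it is a simplified model that assumes each flow first receives its minimum (or its demand, if lower) and that leftover capacity is then shared equally up to each flow's cap. The function name, tenant names, and figures are hypothetical.

```python
def allocate_bandwidth(link_mbps, flows):
    """Allocate link capacity to flows given as name -> (demand, floor, cap) in Mbps.

    Step 1: each flow receives min(demand, cap, floor), its guaranteed minimum.
    Step 2: leftover capacity is shared equally among flows that still want
            more, never exceeding min(demand, cap) for any flow.
    """
    limit = {name: min(demand, cap) for name, (demand, floor, cap) in flows.items()}
    alloc = {name: min(limit[name], floor) for name, (demand, floor, cap) in flows.items()}
    if sum(alloc.values()) > link_mbps:
        raise ValueError("guaranteed minimums exceed link capacity")

    remaining = link_mbps - sum(alloc.values())
    hungry = {name for name in flows if alloc[name] < limit[name]}
    while remaining > 1e-9 and hungry:
        share = remaining / len(hungry)          # equal share for this round
        for name in list(hungry):
            grant = min(share, limit[name] - alloc[name])
            alloc[name] += grant
            remaining -= grant
            if limit[name] - alloc[name] <= 1e-9:
                hungry.discard(name)             # flow reached its cap or demand
    return alloc


# Hypothetical tenants on a 10 Gbps link: (demand, minimum, maximum) in Mbps.
tenants = {
    "tenant_a": (8000, 2000, 4000),
    "tenant_b": (9000, 3000, 10000),
    "tenant_c": (500, 1000, 10000),
}
print(allocate_bandwidth(10000, tenants))
```

In this example, tenant_a is held to its cap, tenant_c receives only what it asks for, and tenant_b absorbs the remaining capacity while never falling below its floor.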
Remote Direct Memory Access (SMB Direct)
SMB Direct (SMB over RDMA) is a storage protocol in Windows Server 2012 R2. It enables direct memory-to-memory data transfers between server and storage, with minimal CPU usage, while using standard RDMA-capable network adapters. SMB Direct is supported on three types of RDMA technology: iWARP, InfiniBand, and RoCE. In Windows Server 2012 R2 there are more scenarios that can take advantage of RDMA connectivity including CSV redirected mode and live migration.
SMB over RDMA provides several advantages to a cloud networking infrastructure:
- Much faster live migrations of virtual machines, which enables scenarios when cloud servers need to be taken offline for service or when load needs to be rebalanced among host servers
- Significantly reduced processor overhead related to network traffic, which makes it possible to increase virtual machine density per host server
- Makes it possible to place virtual disk files into network file shares, which enables you to scale the compute and storage infrastructure independently
Receive Segment Coalescing
Receive segment coalescing improves the scalability of the servers by reducing the overhead for processing a large amount of network I/O traffic. It accomplishes this by coalescing multiple inbound packets into a large buffer. This scalability increase enables you to increase virtual machine density per host server.
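The coalescing idea can be sketched in a few lines. The following is a conceptual model only: real receive segment coalescing is performed by the network adapter and applies additional conditions that are not modeled here. The `Segment` type, the `coalesce` function, and the `max_bytes` limit are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    flow: tuple      # (src_ip, src_port, dst_ip, dst_port)
    seq: int         # sequence number of the first payload byte
    payload: bytes


def coalesce(segments, max_bytes=65535):
    """Merge back-to-back, in-order segments of the same flow into larger buffers."""
    merged = []
    for seg in segments:
        last = merged[-1] if merged else None
        if (last is not None
                and last.flow == seg.flow
                and last.seq + len(last.payload) == seg.seq      # contiguous data
                and len(last.payload) + len(seg.payload) <= max_bytes):
            last.payload += seg.payload          # extend the existing buffer
        else:
            merged.append(Segment(seg.flow, seg.seq, seg.payload))
    return merged


flow_a = ("10.0.0.1", 50000, "10.0.0.2", 445)
flow_b = ("10.0.0.3", 50001, "10.0.0.2", 445)
incoming = [
    Segment(flow_a, 0, b"x" * 1000),
    Segment(flow_a, 1000, b"x" * 1000),
    Segment(flow_a, 2000, b"x" * 500),
    Segment(flow_b, 0, b"y" * 100),
]
print(len(incoming), "segments in,", len(coalesce(incoming)), "buffers out")
```

The host then processes one larger buffer per flow instead of several small packets, which is the source of the reduced per-packet overhead.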
Receive-Side Scaling
Receive-side scaling (RSS) spreads network receive interrupts over multiple processors, so a single processor is not required to handle all network I/O interrupts, which was common in earlier versions of Windows Server. You can select which processors will be used to handle RSS requests, including processors beyond the first 64, which allows you to utilize high-end computers that have a large number of logical processors to support ultra-high density virtual machine configurations.
RSS works with NIC Teaming to remove a limitation in earlier versions of Windows Server, where you had to choose between using hardware drivers or RSS. RSS also works for User Datagram Protocol (UDP) traffic, and it can be managed and debugged by using WMI and Windows PowerShell.
Virtual Receive-Side Scaling
Windows Server 2012 R2 includes support for virtual receive-side scaling (vRSS), which, much like standard RSS, allows virtual machines to distribute network processing loads across multiple virtual processors to increase network throughput within virtual machines.
Virtual receive-side scaling is only available on virtual machines running the Windows Server 2012 R2 and Windows 8.1 operating systems, and it requires VMQ support on the physical adapter. Virtual receive-side scaling is disabled by default if the VMQ-capable adapter is less than 10 Gbps.
To take advantage of virtual receive-side scaling, SR-IOV must not be enabled on the virtual machine network interface, because SR-IOV bypasses the virtual networking stack. Virtual receive-side scaling coexists with NIC Teaming, live migration, and Network Virtualization using Generic Routing Encapsulation (NVGRE).
Single Root I/O Virtualization (SR-IOV)
The Single Root I/O Virtualization (SR-IOV) standard was introduced by the Peripheral Component Interconnect Special Interest Group (PCI-SIG). SR-IOV works with system support for virtualization technologies that provides remapping for interrupts and DMA, and it lets SR-IOV–capable devices be assigned directly to a virtual machine, thus enabling the virtual machine NIC to bypass the Hyper-V networking stack.
Hyper-V in Windows Server 2012 R2 enables support for SR-IOV–capable network devices, and it allows the direct assignment of a network adapter’s SR-IOV function to a virtual machine. This increases network throughput and reduces network latency, while reducing the host CPU overhead that is required for processing network traffic. The result is that a physical adapter is assigned to a virtual machine and that the virtual machine bypasses the virtualization network stack to communicate with the SR-IOV adapter.
SR-IOV is not compatible with NIC Teaming, SMB Direct (RDMA), and TCP Chimney.
SR-IOV can be useful in a cloud network infrastructure where the commercial or enterprise cloud service provider wants to provide the highest possible network throughput and the lowest possible latency to consumers of the cloud service. This would typically be surfaced to the consumer as a value-added service.
2.1.3 Network High Availability and Resiliency
NIC Teaming
To increase reliability and performance in virtualized environments, Windows Server 2012 R2 and Windows Server 2012 include built-in support for network adapter hardware that is NIC Teaming–capable. NIC Teaming is also known as load balancing and failover (LBFO).
NIC Teaming allows multiple network adapters to be placed into a team for the purposes of bandwidth aggregation and traffic failover. This helps maintain connectivity in the event of a network component failure.
NIC Teaming is compatible with all networking capabilities in Windows Server 2012 R2 except:
- Policy-Based QoS
- TCP chimney (not available in Windows Server 2012 R2)
- 802.1X authentication
From a scalability perspective, NIC teaming supports:
- A minimum of two NICs per team
- A maximum of 32 network adapters per team
- An unlimited number of teams per host
Not all traffic will benefit from NIC Teaming. The most noteworthy exception is storage traffic, where iSCSI should be handled by MPIO, and SMB should be backed by SMB Multichannel. However, when a single set of physical network adapters is used for storage and networking traffic, teaming for storage traffic is acceptable and encouraged.
NIC Teaming Modes
Establishing NIC Teaming requires you to set:
- Teaming mode
- Distribution mode
Two basic sets of algorithms are used for the teaming modes. These are exposed in the UI as three options—a switch-independent mode, and two switch-dependent modes: Static Teaming and Dynamic Teaming (LACP).
- Switch-independent mode: These algorithms make it possible for team members to connect to different switches because the switch does not know that the interface is part of a team. These modes do not require the switch to participate in the teaming. This is recommended for Hyper-V deployments.
- Switch-dependent modes: These algorithms require the switch to participate in the teaming. All interfaces of the team are connected to the same switch.
There are two common choices for switch-dependent modes of NIC Teaming:
- Static teaming (based on IEEE 802.3ad): This mode requires configuration on the switch and on the host to identify which links form the team. Because this is a statically configured solution, there is no additional protocol to assist the switch and host to identify incorrectly plugged cables or other errors that could cause the team to fail. Typically, this mode is supported by server-class switches.
- Dynamic teaming (based on IEEE 802.1ax): This mode works by using LACP to dynamically identify links that are connected between the host and a specific switch. Typical server-class switches support IEEE 802.1ax, but most require administration to enable LACP on the port. Because allowing almost completely dynamic IEEE 802.1ax operation on a switch poses security challenges, these switches require the switch administrator to configure which switch ports are allowed to be members of such a team.
Switch-dependent modes result in inbound and outbound traffic that approach the practical limits of the aggregated bandwidth and therefore represent an advantage over switch-independent modes.
Aside from teaming modes, three algorithms are used for traffic distribution within NIC Teaming in Windows Server 2012 R2 and Windows Server 2012. These are exposed in the UI under the Load Balancing mode as:
- Dynamic
- Hyper-V Switch Port
- Address Hash
The Dynamic traffic distribution algorithm (sometimes referred to as adaptive load balancing) adjusts the distribution of load continuously in an attempt to more equitably carry the load across team members. This produces a higher probability that all the available bandwidth of the team can be used.
This mechanism provides the benefits of multiple distribution schemes.
This mode is particularly useful when virtual machine queues (VMQs) are used, and it is recommended for Hyper-V deployments where guest teaming is not enabled.
The Hyper-V Switch Port method can be used when virtual machines have independent MAC addresses that can be used as the basis for dividing traffic. There is an advantage in using this scheme in virtualization, because the adjacent switch always sees a given source MAC address on only one connected interface. This causes the switch to balance the egress load (the traffic from the switch to the host) across multiple links, based on the destination MAC address of the virtual machine.
Like Dynamic mode, this mode is particularly useful when virtual machine queues (VMQs) are used, because a queue can be placed on the specific network adapter where the traffic is expected to arrive.
This mode might not be granular enough to get a well-balanced distribution, and it will always limit a single virtual machine to the bandwidth that is available on a single interface.
Windows Server uses the Hyper-V Switch Port as the identifier instead of the source MAC address, because a virtual machine in some instances might be using more than one MAC address.
The Address Hash method creates a hash value that is based on components of the packet and then assigns packets that have that hash value to one of the available interfaces. This keeps all packets from the same TCP stream on the same interface. Components that can be used as inputs to the hashing function include:
- Source and destination MAC addresses
- Source and destination IP addresses, with or without considering the MAC addresses (2-tuple hash)
- Source and destination TCP ports, usually used with the IP addresses (4-tuple hash)
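The address-hash idea can be sketched as follows. This is a conceptual illustration only: the hash function that Windows Server actually uses is not described in this document, so CRC32 stands in for it here, and the function name, adapter names, and addresses are hypothetical.

```python
import zlib


def pick_team_member(src_ip, src_port, dst_ip, dst_port, team):
    """Map a 4-tuple to one team member so that a flow always uses the same interface."""
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    # CRC32 is a stand-in hash chosen for this sketch, not the algorithm used by NIC Teaming.
    return team[zlib.crc32(key) % len(team)]


team = ["NIC1", "NIC2"]  # hypothetical team members
first = pick_team_member("10.0.0.1", 49152, "10.0.0.2", 445, team)
again = pick_team_member("10.0.0.1", 49152, "10.0.0.2", 445, team)
print(first == again)  # the same TCP stream always maps to the same interface
```

Because the mapping depends only on the packet fields, every packet of one TCP stream follows the same path, while different streams can land on different team members.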
Guest Virtual Machine NIC Teaming
NIC Teaming in Windows Server 2012 R2 allows virtual machines to have virtual network adapters that are connected to more than one virtual switch and still have connectivity, even if the network adapter that is under that virtual switch is disconnected. This is particularly important when you are working with a feature such as SR-IOV, whose traffic does not go through the Hyper-V virtual switch.
By using the virtual machine teaming option, you can set up two virtual switches, each of which is connected to its own SR-IOV–capable network adapter. NIC Teaming then works in one of the following ways:
- Each virtual machine can use one or both SR-IOV network adapters, and if a network adapter disconnection occurs, it will fail over from the primary virtual function to the backup virtual function.
- Each virtual machine can have SR-IOV from one network adapter and a non-SR-IOV interface to the other switch. If the network adapter that is associated with the SR-IOV NIC becomes disconnected, the traffic can fail over to the other switch without losing connectivity.
NIC Teaming Feature Compatibility
Information about the compatibility of the NIC Teaming feature in Windows Server 2012 R2 is provided in the following whitepaper on Microsoft TechNet: Windows Server 2012 R2 NIC Teaming (LBFO) Deployment and Management.
2.1.4 Network Isolation and Security
With Windows Server 2012 R2, you can configure servers running Hyper-V to enforce network isolation among any set of arbitrary isolation groups, which are typically defined for individual customers or sets of workloads.
Windows Server 2012 R2 provides isolation and security capabilities for multitenancy by offering the following features:
- Multitenant virtual machine isolation through VLANs and private virtual LANs (pVLANs)
- Protection from Address Resolution Protocol (ARP) and Neighbor Discovery protocol spoofing
- Protection against Dynamic Host Configuration Protocol (DHCP) spoofing with DHCP guard
- Isolation and metering by using virtual port access control lists (Port ACLs)
- The ability to use the Hyper-V virtual switch trunk mode to direct traffic from multiple VLANs to a single network adapter in a virtual machine
- Resource metering
- Windows PowerShell and Windows Management Instrumentation (WMI)
The following sections will discuss each of these features in more detail.
Currently, VLANs are the mechanism that most organizations use to help support tenant isolation and reuse address space. A VLAN uses explicit tagging (VLAN ID) in the Ethernet frame headers, and it relies on Ethernet switches to enforce isolation and restrict traffic to network nodes that have the same VLAN ID. The VLAN is the logical equivalent of a network segment, which defines the Ethernet broadcast domain.
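The explicit tagging mentioned above is defined by IEEE 802.1Q: a 4-byte tag that carries the tag protocol identifier 0x8100 followed by a 3-bit priority, a 1-bit drop-eligible indicator, and a 12-bit VLAN ID. The following sketch builds and parses such a tag; the function names are hypothetical and the code is for illustration, not a packet-processing implementation.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q tag


def build_vlan_tag(vlan_id, priority=0, dei=0):
    """Return the 4-byte 802.1Q tag for the given VLAN ID and priority."""
    if not 1 <= vlan_id <= 4094:
        raise ValueError("usable VLAN IDs are 1 through 4094")
    if not 0 <= priority <= 7 or dei not in (0, 1):
        raise ValueError("priority is 3 bits and DEI is 1 bit")
    tci = (priority << 13) | (dei << 12) | vlan_id
    return struct.pack("!HH", TPID_8021Q, tci)   # network byte order


def parse_vlan_tag(tag):
    """Split a 4-byte 802.1Q tag into its fields."""
    tpid, tci = struct.unpack("!HH", tag)
    if tpid != TPID_8021Q:
        raise ValueError("not an 802.1Q tag")
    return {"priority": tci >> 13, "dei": (tci >> 12) & 1, "vlan_id": tci & 0x0FFF}


tag = build_vlan_tag(100, priority=5)
print(tag.hex())            # 8100a064
print(parse_vlan_tag(tag))
```

The 12-bit VLAN ID field is the reason that a single Ethernet network cannot carry more than a few thousand VLANs.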
Trunk Mode to Virtual Machines
With a VLAN, a set of host machines or virtual machines appear to be on the same physical network segment or broadcast domain, independent of their actual physical locations. By using the Hyper-V virtual switch trunk mode, traffic from multiple VLANs can be directed to a single network adapter in a virtual machine that could previously receive traffic from only one VLAN. As a result, traffic from different VLANs is consolidated, and a virtual machine can listen to multiple VLANs. This feature can help you analyze and troubleshoot network traffic and enforce multitenant security in your data center through methods such as network intrusion detection.
Private VLANs
VLAN technology is traditionally used to subdivide a network and provide network isolation for individual nodes that share a common physical network infrastructure. Windows Server 2012 R2 introduces support for private VLANs, a technique used with VLANs to provide isolation between two virtual machines that are on the same VLAN.
When a virtual machine does not have to communicate with other virtual machines, you can use private VLANs to isolate it from other virtual machines that are in your data center. By assigning each virtual machine in a private VLAN only one primary VLAN ID and one or more secondary VLAN IDs, you can put the secondary private VLANs into one of three modes, as shown in the following table.
| PVLAN mode | Description |
|---|---|
| Isolated | Communicates only with promiscuous ports in the PVLAN |
| Promiscuous | Communicates with all ports in the PVLAN |
| Community | Communicates with ports in the same community and any promiscuous ports in the PVLAN |

Table 1 Private VLAN modes
Private VLANs can be used to create an environment where VMs can interact only with the Internet and have no visibility into other VMs’ network traffic. To accomplish this, put all VMs (actually, their Hyper-V switch ports) into the same PVLAN in isolated mode. As a result, using only two VLAN IDs (primary and secondary), all VMs are isolated from each other.
Private VLANs could be useful in the following scenarios:
- Lack of free primary VLAN numbers in the data center or on physical switches. (The 12-bit VLAN ID allows at most 4,094 usable VLANs, and possibly fewer depending on the hardware that is used.)
- Isolating multiple tenants from each other in community VLANs while still providing centralized services (such as Internet routing) to all of them simultaneously (by using promiscuous mode).
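The reachability rules of the three private VLAN modes can be expressed as a small decision function. This is a sketch of the rules only, assuming both ports share the same primary VLAN; the function name and the dictionary layout are hypothetical and do not correspond to any Hyper-V API.

```python
def pvlan_can_communicate(a, b):
    """Decide whether two ports in the same primary VLAN can exchange traffic.

    Each port is a dict with a 'mode' of 'isolated', 'promiscuous', or
    'community'. Community ports also carry a 'community' value (their
    secondary VLAN ID).
    """
    if a["mode"] == "promiscuous" or b["mode"] == "promiscuous":
        return True                       # promiscuous ports reach every port
    if a["mode"] == "community" and b["mode"] == "community":
        return a["community"] == b["community"]
    return False                          # isolated ports reach only promiscuous ports


gateway = {"mode": "promiscuous"}
tenant1_vm = {"mode": "community", "community": 201}
tenant2_vm = {"mode": "community", "community": 202}
locked_vm = {"mode": "isolated"}

print(pvlan_can_communicate(tenant1_vm, gateway))     # True
print(pvlan_can_communicate(tenant1_vm, tenant2_vm))  # False
print(pvlan_can_communicate(locked_vm, locked_vm))    # False
```

This mirrors the scenarios above: tenants in different communities stay isolated from each other while all of them can still reach a shared promiscuous port, such as an Internet gateway.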
ARP and Neighbor Discovery Spoofing Protection
The Hyper-V virtual switch helps provide protection against a malicious virtual machine stealing IP addresses from other virtual machines through ARP spoofing (also known as ARP poisoning in IPv4). In this type of man-in-the-middle attack, a malicious virtual machine sends a fake ARP message, which associates its own MAC address with an IP address that it does not own.
Unsuspecting virtual machines send network traffic that is targeted to that IP address to the MAC address of the malicious virtual machine, instead of to the intended destination. For IPv6, Windows Server 2012 R2 helps provide equivalent protection for Neighbor Discovery spoofing.
This is a mandatory option for hosting companies when the virtual machine is not under control of the fabric or cloud administrators.
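Conceptually, this kind of protection amounts to checking each ARP or Neighbor Discovery message against the addresses that a given switch port is known to own. The sketch below illustrates that check only; it is not the Hyper-V virtual switch implementation, and the function name, binding table, and addresses are hypothetical.

```python
def arp_message_allowed(port_bindings, port_id, sender_ip, sender_mac):
    """Allow an ARP message only if the port owns the address it is claiming.

    port_bindings maps a switch port ID to (mac, set_of_ip_addresses).
    """
    binding = port_bindings.get(port_id)
    if binding is None:
        return False                      # unknown port: nothing may be claimed
    mac, ips = binding
    return sender_mac.lower() == mac.lower() and sender_ip in ips


# Hypothetical bindings for two virtual machine ports.
bindings = {
    "port-1": ("00:15:5d:00:00:01", {"10.0.0.11"}),
    "port-2": ("00:15:5d:00:00:02", {"10.0.0.12"}),
}

# Legitimate announcement from port-1.
print(arp_message_allowed(bindings, "port-1", "10.0.0.11", "00:15:5D:00:00:01"))  # True
# Spoofing attempt: port-2 claims the IP address that belongs to port-1.
print(arp_message_allowed(bindings, "port-2", "10.0.0.11", "00:15:5d:00:00:02"))  # False
```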
Router Advertisement and Redirection Protection
In Windows Server 2012 R2, the Hyper-V virtual switch helps protect against router advertisement and redirection messages that come from an unauthorized virtual machine pretending to be a router. In this situation, a malicious virtual machine attempts to be a router for other virtual machines. If a virtual machine accepts the network routing path, the malicious virtual machine can perform man-in-the-middle attacks, for example, to steal passwords from SSL connections.
Rogue DHCP Server Protection
In a DHCP environment, a rogue DHCP server could intercept client DHCP requests and provide incorrect address information. The rogue DHCP server could cause traffic to be routed to a malicious intermediary that inspects all traffic before forwarding it to the legitimate destination.
To protect against this man-in-the-middle attack, you can use DHCP Guard to designate which Hyper-V virtual switch ports can have DHCP servers connected to them. DHCP server traffic from other Hyper-V virtual switch ports is automatically dropped. In this way, the Hyper-V virtual switch helps protect against a rogue DHCP server that attempts to provide IP addresses that would cause traffic to be rerouted.
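The filtering decision behind this kind of guard can be sketched as follows, assuming that the guard is evaluated per virtual switch port and that DHCP server messages are recognized by their UDP source port (DHCP servers send from UDP port 67). The function and port names are hypothetical; this is an illustration of the idea, not the DHCP Guard implementation.

```python
DHCP_SERVER_UDP_PORT = 67  # DHCP servers send their replies from UDP port 67


def should_forward(ingress_port, udp_source_port, dhcp_server_ports):
    """Drop DHCP server messages that arrive from a port not designated for a DHCP server."""
    is_dhcp_server_message = udp_source_port == DHCP_SERVER_UDP_PORT
    return (not is_dhcp_server_message) or ingress_port in dhcp_server_ports


authorized = {"port-dhcp"}  # hypothetical port where the legitimate DHCP server is connected

print(should_forward("port-dhcp", 67, authorized))    # True: legitimate DHCP server
print(should_forward("port-tenant", 67, authorized))  # False: rogue DHCP server is dropped
print(should_forward("port-tenant", 68, authorized))  # True: DHCP client traffic is unaffected
```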
Port ACLs and Network Metering
Port ACLs provide a mechanism for isolating networks and metering network traffic for a virtual port on the Hyper-V virtual switch. By using port ACLs, you can control which IP addresses or MAC addresses can (or cannot) communicate with a virtual machine. For example, you can use port ACLs to enforce isolation of a virtual machine by letting it talk only to the Internet, or communicate only with a predefined set of addresses.
You can also configure multiple port ACLs for a virtual port. Each port ACL consists of a source or destination network address and a permit, deny, or meter action. The metering capability also reports the number of times that traffic to or from a virtual machine was attempted from a restricted (deny) address.
By using the metering capability, you can measure network traffic that is going to or from a specific IP address or MAC address, which lets you report on traffic that is sent or received from the Internet or from network storage arrays. This information can then be used in chargeback or showback scenarios.
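The permit, deny, and meter actions can be modeled with a short sketch. It makes two assumptions that this document does not state: that the most specific matching prefix wins, and that traffic matching no rule is allowed. The `PortAcl` class and its attribute names are hypothetical and do not correspond to a Hyper-V API.

```python
import ipaddress
from collections import Counter


class PortAcl:
    """Toy model of port ACLs with allow, deny, and meter actions for one virtual port."""

    def __init__(self, rules):
        # rules: list of (cidr, action); action is "allow", "deny", or "meter"
        self.rules = [(ipaddress.ip_network(cidr), action) for cidr, action in rules]
        self.metered_bytes = Counter()     # bytes counted per metered prefix
        self.denied_attempts = Counter()   # blocked packets per denied prefix

    def check(self, remote_ip, size):
        """Return True if a packet to or from remote_ip may pass, updating the counters."""
        addr = ipaddress.ip_address(remote_ip)
        matches = [(net, action) for net, action in self.rules if addr in net]
        if not matches:
            return True                    # assumption: no matching rule means allow
        # Assumption: the most specific (longest) prefix decides the action.
        net, action = max(matches, key=lambda m: m[0].prefixlen)
        if action == "deny":
            self.denied_attempts[str(net)] += 1
            return False
        if action == "meter":
            self.metered_bytes[str(net)] += size
        return True


acl = PortAcl([
    ("0.0.0.0/0", "meter"),     # count all other (for example, Internet) traffic
    ("10.0.0.0/8", "deny"),     # block the rest of the data center
    ("10.1.2.3/32", "allow"),   # except one predefined address
])
print(acl.check("8.8.8.8", 1500), acl.check("10.9.9.9", 100), acl.check("10.1.2.3", 100))
print(dict(acl.metered_bytes), dict(acl.denied_attempts))
```

The two counters correspond to the two kinds of information described above: metered traffic volumes that can feed chargeback or showback reports, and the number of attempts involving a restricted address.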
Virtual Switch Extended Port ACLs
In Windows Server 2012 R2, extended port ACLs can be configured on the Hyper-V virtual switch to allow and block network traffic to and from the virtual machines that are connected to the same virtual switch. In these cases, network traffic configurations on the physical network cannot manage traffic between virtual machines because the traffic never leaves the virtual switch.
Service Providers Note:
Service providers can greatly benefit from extended port ACLs because they can be used to enforce security policies between resources in the fabric infrastructure. The use of extended port ACLs is useful in multitenant environments, such as those provided by service providers. Tenants can also enforce security policies through extended port ACLs to isolate their resources.
Port ACLs are required on physical top-of-rack switches due to the many-to-one nature of virtual machine connectivity. Extended port ACLs could potentially decrease the number of security policies that are required for the large number of servers in a service provider or large enterprise IaaS fabric infrastructure.
Network Virtualization
Isolating the virtual machines for different departments or customers can be a challenge on a shared network. When entire networks of virtual machines must be isolated, the challenge becomes even greater. Traditionally, VLANs have been used to isolate networks, but VLANs are very complex to manage on a large scale.
The primary drawbacks of VLANs include:
- Complex reconfiguration of production switches is required whenever virtual machines or isolation boundaries must be moved.
- Reconfiguration of the physical network for the purposes of adding or modifying VLANs increases the risk of a system-wide outage.
- VLANs have limited scalability because typical switches support no more than 1,000 VLAN IDs (while the specification allows a maximum of 4,094 usable IDs).
- VLANs cannot span multiple Ethernet subnets, which limits the number of nodes in a single VLAN and restricts the placement of virtual machines based on physical location.
Hyper-V Network Virtualization (HNV) enables you to isolate network traffic from different business units or customers on a shared infrastructure, without having to use VLANs. HNV also lets you move virtual machines while preserving their IP addresses. You can even use network virtualization to transparently integrate these private networks into a preexisting infrastructure on another site.
HNV permits multiple virtual networks, even those with overlapping IP addresses, to be deployed on the same physical network. You can set policies that isolate traffic in a dedicated virtual network, independently of the physical infrastructure.
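To see why overlapping addresses do not collide, it can help to picture the virtualization policy as a lookup keyed on both a virtual subnet identifier and the tenant-visible (customer) address. The sketch below is a simplified model under that assumption; the identifiers and addresses are invented, and real HNV policy records carry more fields than shown here.

```python
# Policy table: (virtual subnet ID, customer address) -> provider address.
# All IDs and addresses below are made-up illustration values.
policy = {
    (5001, "10.0.0.5"): "192.168.1.10",  # tenant A's virtual machine
    (6001, "10.0.0.5"): "192.168.1.12",  # tenant B's VM, same customer address
}


def lookup_provider_address(vsid: int, customer_ip: str):
    """Return the physical (provider) address hosting the VM, or None.

    A missing entry means the address is not reachable inside that
    virtual subnet, which is what keeps tenants isolated in this model.
    """
    return policy.get((vsid, customer_ip))


tenant_a_host = lookup_provider_address(5001, "10.0.0.5")
tenant_b_host = lookup_provider_address(6001, "10.0.0.5")
```

Both tenants use 10.0.0.5, yet each lookup resolves to a different physical host because the virtual subnet ID is part of the key.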
The potential benefits of network virtualization include:
- Tenant network migration to the cloud with minimum reconfiguration or effect on isolation.
- Preservation of internal IP addresses when tenants move workloads onto shared clouds for IaaS, thus minimizing the configuration changes that are needed for IP addresses, DNS names, security policies, and virtual machine configurations.
- You can use VLANs to manage traffic in the physical infrastructure if the topology is primarily static.
- Simplified network design and improved server and network resource use. The rigidity of VLANs and the dependency of virtual machine placement on a physical network infrastructure result in overprovisioning and underuse.
- Placement of server workloads is simplified because migration and placement of workloads are independent of the underlying physical network configurations.
- Works with current hardware (servers, switches, appliances) to promote performance; however, network adapters that use NVGRE are recommended.
- Full management through Windows PowerShell and WMI, although a policy management server such as System Center Virtual Machine Manager is highly recommended.
HNV also supports dynamic IP address learning, which allows network virtualization to learn the manually assigned or DHCP-assigned addresses that are in use on the virtual network. In environments that use System Center Virtual Machine Manager, when a host learns a new IP address, it notifies Virtual Machine Manager, which adds the address to the centralized policy. This allows rapid dissemination and reduces the overhead that is associated with distributing the network virtualization routing policy. In Windows Server 2012 R2, HNV is also supported in configurations that use NIC Teaming.
3.1.1 Azure Virtual Networks
Microsoft Azure enables you to put virtual machines on a virtual network that is contained within the Microsoft Azure physical network infrastructure. Microsoft Azure Virtual Networks enable you to create secure site-to-site connectivity and private virtual networks. Virtual Networks provide functionality similar to what you find when you connect virtual machines to a virtual switch in Hyper-V.
Some capabilities of Microsoft Azure Virtual Networks include:
- Site-to-site network connectivity between Microsoft Azure and your on-premises network. This is possible by using a VPN gateway located on the Azure Virtual Network and a supported VPN gateway device (including Windows Server 2012 RRAS) on-premises.
- Simple name resolution services are included within Azure Virtual Networks when virtual machines within the same cloud service are contained within the same virtual network (supports host name resolution).
- Simple name resolution services are included within Azure Virtual Networks when virtual machines within different cloud services are contained within the same virtual network (supports FQDN-based name resolution only).
- Hybrid name resolution services enable you to specify an on-premises DNS server or a dedicated DNS server that is running elsewhere.
- Persistent DHCP-provided addresses for virtual machines, so that the internal IP addresses of your virtual machines do not change, even when you restart a virtual machine.
- Ability to join virtual machines that are running in Microsoft Azure to your on-premises domain.
- Point-to-site connections (SSTP remote access client VPN connections) that enable individual workstations to establish VPN connectivity to Microsoft Azure Virtual Networks.
Microsoft Azure virtual networks have the following properties:
- Virtual machines can have only one IP address (or one IP plus a virtual IP, if they are load-balanced).
- Every virtual machine receives an IP address through DHCP; static IP addresses are not supported.
- Virtual machines on the same virtual network can communicate directly.
- Virtual machines on different virtual networks cannot communicate directly unless you configure Virtual Network to Virtual Network site-to-site VPN connections (discussed later).
- Egress (outbound) traffic from Microsoft Azure is charged.
- Ingress traffic to Microsoft Azure is free (not charged).
- All virtual machines by default have Internet access and are automatically configured to use a default gateway to provide that access.
- There is only one virtual gateway per virtual network.
- Virtual networks and subnets in Microsoft Azure must utilize private (RFC 1918) IP address ranges.
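The private-range requirement in the list above can be checked before an address space is submitted. The sketch below uses Python's standard `ipaddress` module and tests against the three RFC 1918 blocks explicitly; the function name `is_rfc1918` is a name chosen for this example.

```python
from ipaddress import ip_network

# The three private IPv4 blocks defined by RFC 1918.
RFC1918_BLOCKS = [
    ip_network("10.0.0.0/8"),
    ip_network("172.16.0.0/12"),
    ip_network("192.168.0.0/16"),
]


def is_rfc1918(cidr: str) -> bool:
    """Return True if the whole CIDR range lies inside an RFC 1918 block."""
    net = ip_network(cidr, strict=True)
    if net.version != 4:
        return False  # RFC 1918 only covers IPv4
    return any(net.subnet_of(block) for block in RFC1918_BLOCKS)
```

For example, `is_rfc1918("10.1.0.0/16")` is true, while `is_rfc1918("172.32.0.0/16")` is false because 172.32.0.0 falls just outside the 172.16.0.0/12 block.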
Virtual Networks provide a way for you to enable direct communications between virtual machines that might be located within different cloud services. This removes the need for virtual machines that are located within different cloud services to reach each other through the Internet.
For more information about Azure Virtual Networks, please see Virtual Network Overview.
3.1.2 Affinity Groups
Affinity groups are a way to group your virtual machines by proximity to each other in the Azure datacenter in order to achieve optimal performance. When you place virtual machines into an affinity group, Azure keeps all of the virtual machines in that affinity group as physically close to each other as possible. This can reduce latency and increase performance, while potentially lowering costs.
Affinity groups are defined at the subscription level and the name of each affinity group must be unique within the subscription. Each affinity group you create is tied to a Region (which is the Location). Specify the same region when creating your affinity group and your virtual network.
The Region represents where the Virtual Network overlay will be. Anything you deploy to the virtual network will be physically located in the Region. If you want to further designate that you want your resources in close proximity physically to each other within the same region, you can specify an affinity group for those particular resources. That means that not only are those resources in the same physical region, they are very close to each other in the regional datacenter.
Affinity groups:
- Aggregate compute and storage services
- Provide the fabric controller with the information it needs to keep those services in the same data center and, further, in the same cluster
- Enable reduced latency, because the fabric controller is told that the virtual machines should be kept together
- Improve performance for accessing storage from the compute nodes
- Reduce costs, because virtual machines are placed in the same cluster and communication between data centers is not required
Azure provides name resolution for virtual machines that reside within the same cloud service or the same Azure Virtual Network. It is important to understand that the Azure name resolution mechanism is scoped based on cloud service or virtual network:
- If the virtual machines are contained within the same cloud service, then name resolution can be done at the host name level (the entire FQDN is not required)
- If the virtual machines are not contained within the same cloud service, but are contained within the same Azure Virtual Network, then name resolution can be done using an FQDN (host name only resolution is not available)
If you require name resolution across different cloud services, those cloud services must be located on the same Azure Virtual Network. If not, then you’ll need to use your own DNS server.
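The scoping rules above amount to a small decision procedure. The sketch below expresses it in code; the `Vm` type, its field names, and the returned labels are hypothetical and exist only to make the rules concrete.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Vm:
    name: str
    cloud_service: str
    virtual_network: Optional[str] = None  # None if not on a virtual network


def resolution_scope(a: Vm, b: Vm) -> str:
    """Describe how VM a can resolve VM b with Azure-provided DNS."""
    if a.cloud_service == b.cloud_service:
        return "hostname"  # short host name is sufficient
    if a.virtual_network is not None and a.virtual_network == b.virtual_network:
        return "fqdn"      # different cloud services, same virtual network
    return "own-dns"       # Azure-provided resolution does not apply


web = Vm("web01", cloud_service="svc-web", virtual_network="vnet-prod")
app = Vm("app01", cloud_service="svc-app", virtual_network="vnet-prod")
db = Vm("db01", cloud_service="svc-db", virtual_network="vnet-data")
```

Here `web` and `app` share a virtual network but not a cloud service, so FQDN resolution applies, while `web` and `db` would need a DNS server that you manage.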
If you require cross-premises name resolution and you want to register additional DNS records of your own, then you will need to use your own DNS solution and not the Azure-provided solution.
Features of the built-in Azure DNS server:
- Little or no configuration is required in order to use the Azure-provided DNS service.
- Hostname resolution is provided between virtual machines within the same cloud service
- FQDN resolution is provided between VMs within the same Azure Virtual Network if the virtual machines are contained within different cloud services
- You can create your own host names, which are based on the name you assign to the virtual machine
- Standard DNS lookups are supported.
Considerations for using the built-in Azure DNS server:
- Name resolution between virtual networks is not available
- Use of multiple hostnames for the same virtual machine is not supported
- Cross-premises name resolution is not available
- Reverse lookup (PTR) records are not available
- The Azure-created DNS suffix cannot be modified
- No manual DNS record registration into the Azure-provided DNS
- WINS and NetBIOS are not supported
- Hostnames must be RFC 3696 section 2-compatible (they must use only 0-9, a-z, and '-', and cannot start or end with a '-')
- DNS query traffic is throttled per VM. If your application performs frequent DNS queries on multiple target names, it is possible that some queries may time out. If your application requires a large number of DNS queries that exceed the quota, then consider deploying your own DNS server within the cloud service or Azure Virtual Network.
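The host name rule in the list above (only 0-9, a-z, and '-', with no leading or trailing '-') can be checked with a short regular expression. This sketch covers only the character rules stated here; it does not enforce any length limit, and the function name is chosen for this example.

```python
import re

# One alphanumeric character, optionally followed by alphanumerics or
# hyphens, as long as the final character is alphanumeric.
_HOSTNAME_RE = re.compile(r"[a-z0-9](?:[a-z0-9-]*[a-z0-9])?")


def is_valid_hostname(name: str) -> bool:
    """Check a host name against the character rules described above."""
    return _HOSTNAME_RE.fullmatch(name) is not None
```

Under these rules `web-01` is accepted, while `-web`, `web-`, and `web_01` are rejected.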
Microsoft Azure Traffic Manager allows you to control the distribution of user traffic to your specified endpoints, which can include Azure virtual machines, websites, and other endpoints. Traffic Manager works by applying policy to Domain Name System (DNS) queries for the domain names of your Internet resources. The Azure virtual machines or websites can be running in different datacenters across the world.
Traffic Manager can help you:
- Improve availability of critical applications by monitoring your endpoints in Azure and providing automatic failover capabilities when an Azure cloud service, Azure website, or other location goes down.
- Improve responsiveness for high performing applications by directing end-users to the endpoint with the lowest network latency from the client.
- Upgrade and perform service maintenance without downtime by supporting extended scenarios for hybrid cloud and on-premises deployments including the “burst-to-cloud,” “migrate-to-cloud,” and “failover-to-cloud” scenarios. For planned maintenance, disable the endpoint in Traffic Manager and wait for the endpoint to drain existing connections. Then update the service on that endpoint and test it, then re-enable it in Traffic Manager.
- Distribute traffic for large, complex deployments by using nested Traffic Manager profiles where a Traffic Manager profile can have another Traffic Manager profile as an endpoint. You can create configurations to optimize performance and distribution for larger, more complex deployments. For more information, see Nested profiles.
In a hybrid networking scenario where virtual machines located on an Azure Virtual Network need to communicate with devices located in other locations, you have the following options:
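As a conceptual model of the behavior described above, the sketch below picks the DNS answer for a client by skipping endpoints that are disabled or unhealthy and choosing the lowest-latency endpoint among the rest. This is only an illustration of the idea: the `Endpoint` type, the latency figures, and the selection function are assumptions, not the Traffic Manager service logic or API.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    dns_name: str
    healthy: bool      # result of endpoint monitoring
    enabled: bool      # set to False while performing maintenance
    latency_ms: float  # hypothetical latency as seen from the client


def pick_endpoint(endpoints):
    """Return the DNS name to hand back to the client, or None."""
    candidates = [e for e in endpoints if e.enabled and e.healthy]
    if not candidates:
        return None
    return min(candidates, key=lambda e: e.latency_ms).dns_name


# Made-up endpoints for illustration.
west_europe = Endpoint("eu.example.net", healthy=True, enabled=True, latency_ms=40)
east_us = Endpoint("us.example.net", healthy=True, enabled=True, latency_ms=120)
endpoints = [west_europe, east_us]
```

Setting `west_europe.enabled = False` models the planned-maintenance step: subsequent lookups return `us.example.net` until the endpoint is re-enabled.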
- Point to Site VPN (remote access VPN client connection)
- Site to Site VPN
- VNet to VNet VPN
- Dedicated WAN Link
The following sections will discuss each of these options.
4.1.1 Point to Site VPN
A point-to-site connection (typically referred to as a remote access VPN client connection) enables you to connect individual devices to the public cloud service provider’s network in the same way you connect remote devices to your corporate network over a VPN client connecting to an on-premises VPN server.
For example, suppose you have a hybrid cloud infrastructure administrator working from home. The administrator could establish a point-to-site VPN connection from his computer at home to the public cloud service provider’s network that hosts the virtual machines for his organization.
Microsoft Azure supports point-to-site connectivity using the Secure Socket Tunneling Protocol (SSTP). The VPN client connection is done using the native Windows VPN client. When the connection is established to a VPN gateway that is connected to your Azure Virtual Network, the VPN client can access any of the virtual machines over the network connection using any application protocol.
In order to authenticate VPN clients, certificates must be created and exported. You must generate a self-signed root certificate and client certificates chained to the self-signed root certificate. You can then install the client certificates with private key on every client computer that requires connectivity.
At this time only self-signed certificates are supported.
For more information on point-to-site connections to Microsoft Azure Virtual Networks, see About Secure Cross-Premises Connectivity.
4.1.2 Site to Site VPN
A site-to-site VPN connection enables you to connect entire networks together. Each side of the connection hosts at least one VPN gateway, which essentially acts as a router between the on-premises and off-premises networks. The routing infrastructure on the corporate network is configured to use the IP address of the local VPN gateway to reach the network IDs that are located on the public cloud provider's network, which hosts the virtual machines that are part of the hybrid cloud solution.
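The routing arrangement described above can be pictured as a longest-prefix-match lookup in which the cloud-hosted network IDs point at the local VPN gateway. The sketch below models that lookup; the prefixes and next-hop labels are invented for illustration and would differ in a real deployment.

```python
from ipaddress import ip_address, ip_network

# Hypothetical corporate routing table: (destination prefix, next hop).
routes = [
    (ip_network("0.0.0.0/0"), "internet-edge"),
    (ip_network("10.0.0.0/8"), "corporate-core"),
    # Address space assumed to be assigned to the Azure Virtual Network.
    (ip_network("10.200.0.0/16"), "local-vpn-gateway"),
]


def next_hop(destination: str) -> str:
    """Return the next hop for the most specific matching route."""
    addr = ip_address(destination)
    matching = [(net, hop) for net, hop in routes if addr in net]
    # The default route matches everything, so `matching` is never empty.
    best_net, best_hop = max(matching, key=lambda r: r[0].prefixlen)
    return best_hop
```

Traffic for 10.200.4.9 is sent to the local VPN gateway because the /16 route is more specific than the /8 corporate route, while other 10.x destinations stay on the corporate network.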
Unlike point-to-site connections to a Microsoft Azure Virtual Network, site-to-site connections do not require you to establish a separate connection for each client computer on your local network to access resources in the virtual network. However, a potential limitation of the site-to-site VPN configuration is that machines need to be located on the corporate network in order to access the Azure Virtual Network over the site-to-site VPN connection.
An exception to the requirement that a machine be physically located on the corporate network exists when an off-premises machine is connected to the corporate network over a remote access VPN client connection. In this scenario, the VPN client behaves much like any other computer located on the corporate network.
You must have an externally facing IPv4 address and a VPN device or RRAS to configure a site-to-site VPN connection. Site-to-site connections to Microsoft Azure Virtual Networks do not support on-premises NAT.
Multi-Site VPN Connectivity
You can create a multi-site VPN that connects multiple on-premises sites to a single Azure virtual network gateway. This solves a previous issue with Azure Virtual Networks where only a single site to site connection could be created. If other corporate locations needed to connect to the Azure Virtual Network, the connection needed to be routed through the corporate WAN infrastructure and out through the single on-premises gateway that connected to the Azure Virtual Network.
With multi-site VPN connectivity, you are no longer limited to a single site-to-site connection, and multiple corporate locations can connect to the same Azure Virtual Network without having to route through the corporate WAN. The limit is now 10 site-to-site connections.
VNet to VNet VPN
Connecting an Azure virtual network (VNet) to another Azure virtual network is very similar to connecting a virtual network to an on-premises location. Both connectivity types use a virtual network gateway to provide a secure tunnel using IPsec. The VNets you connect can be in different subscriptions and different regions. You can combine VNet to VNet communication with multi-site configurations. This lets you establish network topologies that combine cross-premises connectivity with inter-virtual network connectivity, as shown in the diagram below.
Figure 2: Site to site and multi-site VPN connections
Capabilities enabled by VNet to VNet connectivity include:
- Geo-replication or synchronization without traversing the Internet
- Using Azure Load Balancer and Microsoft or third-party clustering technology, you can set up highly available workloads with geo-redundancy across multiple Azure regions. One important example is setting up SQL Server AlwaysOn Availability Groups that spread across multiple Azure regions.
- Within the same region, you can set up multi-tier applications with multiple virtual networks connected together with strong isolation and secure inter-tier communication
- If you have multiple Azure subscriptions, you can connect workloads from different subscriptions together between virtual networks
- For enterprises or service providers, it is now possible to enable cross organizational communication with VPN technology within Azure
4.1.3 Dedicated WAN Link
A dedicated WAN link is a permanent telco connection that is established directly between the on-premises network and the cloud infrastructure service provider’s network. Unlike the site-to-site VPN, which represents a virtual link layer connection over the Internet, the dedicated WAN link enables you to create a true link layer connection between your corporate network and the service provider’s network.
Microsoft Azure ExpressRoute lets you create private connections between your on-premises data centers or co-location environment and Azure Virtual Networks in the Azure data centers. With ExpressRoute, you can establish connections to an Azure Virtual Network at an ExpressRoute partner co-location facility or directly connect to Azure from your existing WAN network (such as an MPLS VPN provided by a Network Service Provider).
As a dedicated WAN link, ExpressRoute connections do not go over the public Internet. Compared with typical connections over the Internet, ExpressRoute connections offer:
- Higher security
- More reliability
- Faster speeds
- Lower latencies
In some cases, using ExpressRoute connections to transfer data between on-premises and Azure can also yield significant cost benefits.
The figure below shows a logical representation of connectivity between your infrastructure and Azure. You must order a circuit to connect your infrastructure to Azure through a connectivity provider. A connectivity provider can be either a Network Service Provider or an Exchange Provider.
In the diagram, a circuit represents a redundant pair of logical cross connections between your network and Azure, configured in an Active-Active configuration. The circuit is partitioned into two sub-circuits to isolate traffic.
The following traffic is isolated:
- Traffic is isolated between your premises and Azure compute services. Azure compute services, namely virtual machines (IaaS) and cloud services (PaaS) deployed within a virtual network are covered.
- Traffic is isolated between your premises and Azure services hosted on public IP addresses. The services that are supported can be found here: Supported Azure Services.
You can choose to enable one or both types of connectivity through your circuit. You will be able to connect to all supported Azure services through the circuit only if you configure both options mentioned above.
Note the following:
- If you connect to Azure through a network service provider, the network service provider takes care of configuring routes to all the services. Work with your network service provider to have routes configured appropriately.
- If you are connecting to Azure through an exchange provider location, you will need a pair of physical cross-connections, and on each cross-connection you will need to configure a pair of BGP sessions (one for public peering and one for private peering) in order to have a highly available link.
4.2.1 Integrating on-premises name resolution
Integrating on-premises name resolution is used in a hybrid cloud infrastructure where applications span on-premises networks and cloud infrastructure service provider’s networks. You can configure the virtual machines located on an Azure Virtual Network to use DNS servers that are located on premises, or you can create virtual machines on an Azure Virtual Network that host corporate DNS services and are part of the corporate DNS replication topology. This makes name resolution for both on-premises and cloud based resources available to all machines that support hybrid applications.
4.2.2 Name resolution for external hosts
Name resolution for external hosts is required when there is no direct link, such as a site-to-site VPN or dedicated WAN link, to the Azure Virtual Network. However, in this scenario you still want to enable some components of hybrid applications to live in the public cloud and yet keep some components on premises.
Communications between components located on an Azure Virtual Network and those on premises can be done over the Internet. If on-premises components need to initiate connections to the off-premises components, they must use Internet host name resolution to reach those components. Likewise, if components in the public cloud infrastructure service provider’s network need to initiate connections to those that are located on premises, they would need to do so over the Internet by using a public IP address that can forward the connections to the components on the on-premises network. This means that you would need to publish the on-premises components to the Internet, although you could create access controls that limit the incoming connections to only those virtual machines that are located in the public cloud infrastructure services network.
Windows Server 2012 R2 provides HNV Gateway server services to support site-to-site virtual private networks, Network Address Translation (NAT), and forwarding between physical locations to support multitenant hosting solutions that leverage network virtualization. This allows service providers and organizations that use HNV to support end-to-end communication from the corporate network or the Internet to the data center running HNV.
Without such gateway devices, virtual machines in a virtualized network are completely isolated from the outside, and they cannot communicate with non-network-virtualized systems such as other systems in the corporate network or on the Internet. In Windows Server 2012 R2 and Windows Server 2012, HNV Gateway can encapsulate and decapsulate NVGRE packets, based on the centralized network virtualization policy. It can then perform the gateway-specific functionality on the resulting native customer address (CA) packets, such as IP forwarding and routing, NAT, or site-to-site tunneling.
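To make the encapsulation step more concrete, the sketch below builds and parses the 8-byte GRE header that NVGRE (RFC 7637) places in front of the tenant's Ethernet frame. It shows the header layout only, with the Key Present bit set, the Transparent Ethernet Bridging protocol type, and a key made of a 24-bit virtual subnet ID plus an 8-bit flow ID. The function names are chosen for this example, and the outer IP header and inner frame are omitted.

```python
import struct

GRE_KEY_PRESENT = 0x2000  # K bit set; checksum and sequence bits clear
PROTO_TEB = 0x6558        # Transparent Ethernet Bridging


def pack_nvgre_header(vsid: int, flow_id: int = 0) -> bytes:
    """Build the GRE header carrying a 24-bit VSID and 8-bit flow ID."""
    if not 0 <= vsid < 2**24:
        raise ValueError("VSID must fit in 24 bits")
    if not 0 <= flow_id < 2**8:
        raise ValueError("flow ID must fit in 8 bits")
    key = (vsid << 8) | flow_id
    return struct.pack("!HHI", GRE_KEY_PRESENT, PROTO_TEB, key)


def unpack_nvgre_header(header: bytes):
    """Return (vsid, flow_id) from the first 8 bytes of an NVGRE packet."""
    flags, proto, key = struct.unpack("!HHI", header[:8])
    if flags != GRE_KEY_PRESENT or proto != PROTO_TEB:
        raise ValueError("not an NVGRE header")
    return key >> 8, key & 0xFF


header = pack_nvgre_header(vsid=5001, flow_id=7)
```

A gateway that decapsulates the packet would read the virtual subnet ID from this header, consult the virtualization policy for that subnet, and then route or translate the inner packet.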