Hybrid Cloud Infrastructure Solution for Enterprise IT - Design

Published: August 23, 2013
Version: 1.1
Abstract: This article outlines the hybrid cloud computing infrastructure design decisions that were made by a fictional organization based on a common set of enterprise IT requirements.  The article also explains the rationale for why the company made the design decisions that they did. This article is part of the Hybrid Cloud Infrastructure Solution for Enterprise IT guidance set.


Table of Contents

1.0 Introduction
2.0 Conceptual Design
       2.1 Reference Model
       2.2 Principles
       2.3 Patterns
3.0 Physical Design
       3.1 Overview
       3.2 Service Account Acquisition and Billing
       3.3 Network
       3.4 Storage
       3.5 Compute
       3.6 Management and Support
4.0 Summary 


For the latest information, see the Hybrid Cloud Infrastructure Solution for Enterprise IT article set. To provide feedback on this article, leave a comment at the bottom of this article or send e-mail to SolutionsFeedback@Microsoft.com.  To easily save, edit, or print your own copy of this article, please read How to Save, Edit, and Print TechNet Articles. When the contents of this article are updated, the version is incremented and changes are entered into the change log. The online version is the current version. This article discusses technologies listed in the Technologies Discussed section of this article.

1.0 Introduction

This article is one of several articles that are included in an integrated article set called the Hybrid Cloud Infrastructure Solution for Enterprise IT. If you haven’t already, before reading this article, please read the Overview article within this article set, as it provides an overview of the article set as a whole, introduces the problem domain for the solution, the audience that it is written for, and defines the articles contained within it.

This article details which specific public cloud provider, products, technologies, and configuration options were selected, out of the hundreds of individually available options, as well as why the options were selected to meet the unique requirements for the example organization defined in the Scenario Definition article of the article set.  Consequently, this article assumes that you have already read the Scenario Definition article, so if you have not, please do so before reading this article.

For organizations that have similar requirements and constraints as the fictitious organization defined in the Scenario Definition article, the lab-tested design and rationale in this article can help decrease both the implementation time and risk of implementing a hybrid cloud computing infrastructure solution. This article is most helpful to those responsible for evaluating and selecting public cloud service providers and integrating resources from public cloud service providers with their own private cloud resources.

If, instead of or in addition to understanding one example hybrid cloud computing infrastructure design and the rationale for the design options selected for it, you’d like to understand all of the relevant individual design configuration options for hybrid cloud infrastructure solutions so that you can determine which options are most appropriate for your own unique requirements, then it’s recommended that you read the Hybrid Cloud Infrastructure Design Considerations article. While the Hybrid Cloud Infrastructure Design Considerations article details all available Microsoft cloud computing services, product and technology design options and considerations for hybrid cloud infrastructure solutions, it does not provide any example designs or recommendations for specific requirements.

Many people will find it helpful to read both the Design article (the article you're reading now), as well as the Design Considerations article for the hybrid cloud infrastructure problem domain. Others will only find it necessary to read one article or the other. Though the two articles are related, there are no dependencies between them.

2.0  Conceptual Design

In the Scenario Definition article of this article set, Contoso IT clearly defined the requirements for the public cloud provider that they selected, as well as the specific requirements for their pilot project. Next, it defined a conceptual architecture that would meet and support its requirements. The conceptual design provided Contoso a vendor-agnostic architectural foundation that helped it ultimately select the best cloud providers and services, products and technologies to implement its conceptual design. The sections that follow detail the different aspects of Contoso’s conceptual design.

2.1 Reference Model

The first design artifact that Contoso decided to create for its hybrid cloud infrastructure was a vendor-agnostic reference model. They wanted the reference model to serve as a common definition of the capabilities and functions that their hybrid cloud infrastructure would provide. Contoso was already using the Microsoft Cloud Services Foundation Reference Model (CSFRM), which is vendor-agnostic even though it comes from Microsoft, as a foundation for its IT environment as a whole, so it first defined which aspects of the model were applicable to its hybrid cloud infrastructure.

The figure below illustrates the Microsoft CSFRM.  The components with yellow borders are the components that are most applicable to Contoso's hybrid cloud infrastructure. The Physical Design section of this article provides design details for the yellow-bordered components in the figure below.

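[Figure: The Microsoft Cloud Services Foundation Reference Model (CSFRM); the components most applicable to Contoso's hybrid cloud infrastructure have yellow borders]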

Note:
Further explanation of the Microsoft CSFRM is not included in this article. If you’re interested in understanding it further however, you’re encouraged to read the Microsoft Cloud Services Foundation Reference Model (CSFRM) article, which is part of the Microsoft Cloud Services Foundation Reference Architecture (CSFRA) article set.

2.2  Principles

After Contoso IT defined its reference model, it defined principles for integrating infrastructure services from a public cloud provider with its private cloud. They defined principles as “guidelines” for their physical designs to adhere to, and knew when they defined them that they were aspirational, because achieving them would take time and effort.  Additionally, they knew that fully achieving some of the principles might require technological offerings that are not yet available.

Just as Contoso had applied the principles from The Microsoft Cloud Services Foundation Reference Architecture - Principles, Concepts, and Patterns article to their private cloud implementation, they applied the principles for the hybrid cloud problem domain from the Hybrid Cloud Infrastructure Design Considerations article to their hybrid cloud infrastructure, as described in the sections that follow.

2.2.1 Utilize the Public Cloud First

Statement:

Use the public cloud to host service components, unless there is a compelling reason not to.

Rationale:

In most cases, it's easier, faster, and more cost-effective to host service components with public cloud providers than it is to host them on a private cloud. In this article, service components are any physical hardware or software that, together, are used to provide a service. Individual service components may support several different services within an organization. Each service component may be hosted on a private or public cloud. Research in Microsoft's The Economics of the Cloud whitepaper indicates that it can be up to 10 times more cost-effective to host applications with public cloud providers.

This principle isn't one of the Microsoft Cloud Services Foundation Reference Architecture principles, but Contoso IT defined it for their hybrid infrastructure to help them begin a mindset shift within their organization. For the last several years, their mode of operation has been to ask themselves "Why should we host any service component on the public cloud, and if so, which components?" They wanted to shift the mindset of their staff dramatically to a mode of operation where they ask themselves "Why wouldn't we host every service component on the public cloud, and if not, which components wouldn't we host there and why?" Contoso IT expects that this principle will become easier to achieve over time, as their understanding and confidence in their chosen public service provider grows.

Implications:

Some service components have privacy, regulatory compliance, or both types of requirements that can't be met by public cloud providers. Some applications are designed with application patterns or availability level requirements that aren't supported by some public cloud providers.  Contoso IT does not apply this principle to any application which has any of these constraints.  It will however, re-evaluate such applications over time, since both public cloud infrastructure service provider capabilities and privacy and regulatory compliance requirements change over time.

In the initial phase of their hybrid infrastructure, Contoso IT only applied this principle to new service component implementations, but in later phases, will apply this principle to service components currently hosted on their private cloud too.  When doing so, they'll evaluate additional factors, such as the cost and risk to migrate existing service components to their strategic public cloud service provider.
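The way this principle and its implications combine can be sketched as a small placement rule. This is only an illustrative sketch: the `ServiceComponent` fields, the `choose_host` function, and the phase numbering are hypothetical names introduced here, not part of Contoso's tooling or of any Microsoft product.

```python
from dataclasses import dataclass


@dataclass
class ServiceComponent:
    """Hypothetical description of a service component (all fields are assumptions)."""
    name: str
    has_privacy_requirements: bool = False
    has_regulatory_requirements: bool = False
    uses_unsupported_pattern: bool = False
    needs_unsupported_availability: bool = False
    is_new: bool = True


def choose_host(component: ServiceComponent, phase: int = 1) -> str:
    """Return "public" unless one of the constraints in the Implications applies."""
    blocked = (
        component.has_privacy_requirements
        or component.has_regulatory_requirements
        or component.uses_unsupported_pattern
        or component.needs_unsupported_availability
    )
    if blocked:
        # Constrained components stay private and are re-evaluated over time.
        return "private"
    if phase == 1 and not component.is_new:
        # In the initial phase the principle covers only new implementations,
        # so existing components stay on the private cloud.
        return "private"
    return "public"
```

In later phases, a fuller version of this rule would also weigh the cost and risk of migrating an existing component, which this sketch leaves out.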

2.2.2 Perception of Infinite Capacity

Statement:

From the consumer’s perspective, a cloud service should provide capacity on demand, only limited by the amount of capacity the consumer is willing to pay for.

Rationale:

For this and each of the remaining principles in the following sections, Contoso decided it had the same rationale for applying each principle to its hybrid cloud infrastructure that it did for applying them to its private cloud infrastructure, so it didn't specify any unique rationale for the remaining principles.

Implications:

Contoso IT required that the strategic public service provider it selected support a mechanism to scale its resources, based on its capacity needs. Further, it required that the provider support Contoso's ability to add this capacity at any time. Contoso IT also required that its public provider support automated, programmatic, and scriptable interfaces that allowed its existing on-premises systems to add the capacity. After Contoso IT's initial hybrid infrastructure implementation phase, it will evaluate existing service components and prioritize migration to the public cloud provider for service components that have regular, high-peak demand spikes.

2.2.3 Perception of Continuous Service Availability

Statement:

From the consumer’s perspective, a cloud service should be available on demand from anywhere, on any device, and at any time.

Implications:

Contoso IT required that the strategic public cloud infrastructure service provider it selected meet its availability requirements of at least 99.9%. To support its business continuity requirements, it required the provider to have datacenters in multiple geographic locations.
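To make the 99.9% figure concrete, the short calculation below converts an availability percentage into the downtime it permits. The function name and the 30-day month are assumptions made for illustration; an actual provider SLA defines its own measurement period.

```python
def allowed_downtime_minutes(availability_percent: float, days: int = 30) -> float:
    """Maximum downtime, in minutes, that still meets the availability target."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - availability_percent / 100)


# For a 30-day month, 99.9% availability leaves about 43.2 minutes of downtime.
print(round(allowed_downtime_minutes(99.9), 1))  # 43.2
```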

2.2.4 Optimization of Resource Usage

Statement:

The cloud should automatically make efficient and effective use of infrastructure resources.

Implications:

In support of its Utilize the Public Cloud First principle, Contoso IT will optimize usage of its private cloud resources for existing service components that are unable to be hosted in a public cloud.  They will host all new service components in the public cloud that are able to be hosted there.  Over time, Contoso IT will prioritize migration of existing service components that can be hosted in the public cloud, based on their other principles.

2.2.5 Incentivize Desired Behavior

Statement:

Enterprise IT service providers must ensure that their consumers understand the cost of the IT resources that they consume so that the organization can optimize its resources and minimize its costs.

Implications:

Contoso IT currently "shows" the consumption costs of its private cloud services to its internal consumers. Showing the consumption numbers to its business unit consumers has already proved beneficial to Contoso's overall IT costs, as it has brought unnecessary consumption to light and has already helped them start minimizing such consumption. As a result, Contoso IT decided that it would also show its public cloud consumption costs to consumers.

2.2.6 Create a Seamless User Experience

Statement:

Within an organization, consumers should be oblivious as to who the provider of cloud services is, and should have similar experiences with all services provided to them.

Implications:

Contoso IT has spent several years integrating and standardizing its systems to provide seamless user experiences for its users, and didn't want to go back to multiple authentication mechanisms and inconsistent user interfaces when it decided to incorporate resources from a public provider into its hybrid services.  Contoso IT required that the strategic public service provider it selected allow it to integrate authentication and authorization mechanisms with its on-premises services, and also that the provider have network, monitoring, and provisioning mechanisms that allowed Contoso IT to integrate the public provider's systems with its own on-premises systems.

2.3 Patterns

Patterns are specific, reusable ideas that have been proven solutions to commonly occurring problems. Just as Contoso had applied the patterns from The Microsoft Cloud Services Foundation Reference Architecture - Principles, Concepts, and Patterns article to their private cloud implementation, they applied the patterns for the hybrid cloud problem domain from the Hybrid Cloud Infrastructure Design Considerations article to their hybrid cloud infrastructure, as described in the sections that follow.

2.3.1 Resource Pooling

Problem: When dedicated infrastructure resources are used to support each service independently, their capacity is typically underutilized. This leads to higher costs for both the provider and the consumer.

Solution: Contoso IT has existing resource pools for its private cloud. Contoso's existing private cloud is partitioned into separate systems management partition resource pools today, but to date, Contoso has had no reason to create separate service class partition resource pools. Contoso IT decided that it would treat the resources at the public provider as a security service class partition resource pool, as security-related concerns would, at least initially, be a key factor in whether service components were hosted at the public cloud infrastructure service provider's network or on the private cloud.

After its hybrid cloud infrastructure implementation, Contoso had two separate service class partition resource pools, one within its private cloud, which would host medium and high business impact information, and one in its public cloud, which would, at least initially, host only low business impact information. As Contoso IT migrates more of its service components to its public provider in the future, it will re-evaluate how it defines its resource pools, and anticipates that at a minimum, it may create additional service class partitions for different performance, availability, and continuity requirements.  It fully expects to leverage public cloud resources to help it enable different service class partition resource pools in the future.
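The initial split between the two service class partition resource pools can be expressed as a simple mapping from business impact to pool. The enum and the pool names below are hypothetical labels chosen for this sketch; the article does not name the pools.

```python
from enum import Enum


class BusinessImpact(Enum):
    LOW = "LBI"
    MEDIUM = "MBI"
    HIGH = "HBI"


def resource_pool_for(impact: BusinessImpact) -> str:
    """Map a business impact level to the pool that initially hosts it."""
    if impact is BusinessImpact.LOW:
        # Only low business impact information goes to the public provider at first.
        return "public-cloud-pool"
    # Medium and high business impact information stays in the private cloud.
    return "private-cloud-pool"
```

Keeping the rule in one place would make it easier to extend later, for example when additional partitions are added for performance, availability, or continuity requirements.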

2.3.2 Scale Unit

Problem: Purchasing individual servers, storage arrays, network switches, and other cloud infrastructure resources requires procurement, installation, and configuration overhead for each individual resource.

Solution: In their private cloud, Contoso's compute scale units consist of multiple servers that they purchase pre-configured in a rack from their supplier.  At their public provider however, since they pay for every resource they add and have no wait time for new capacity as they do when adding capacity to their private cloud, the scale unit for their public cloud service components is one virtual machine.  As they near capacity thresholds, they simply add and remove individual virtual machines, as necessary.

2.3.3 Capacity Plan

Problem: Eventually every cloud infrastructure runs out of physical capacity. This can cause performance degradation of services, the inability to introduce new services, or both.

Solution: Contoso IT selected a public cloud infrastructure service provider that offered them, essentially, unlimited capacity. As a result, the capacity plan for their public cloud service components was quite simple. Initially, they had planned to monitor the utilization of the service component, define a capacity threshold, and when the threshold was met, add more capacity.

The public cloud provider they selected however, provided them the capability to auto-scale the number of virtual machines used to host their service components based on thresholds that they defined, so they configured this for their public cloud service components.  The provider then auto-scaled, both up and down, the number of virtual machines that hosted Contoso's service components, in accordance with the thresholds that they defined.
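The threshold-driven behavior described above, with one virtual machine as the scale unit, can be sketched as follows. This is not the provider's auto-scale feature or its API. It is only a minimal illustration of the decision logic, and the CPU metric, the threshold values, and the instance limits are all assumed for the example.

```python
def desired_instance_count(
    current: int,
    cpu_percent: float,
    scale_out_at: float = 75.0,  # hypothetical upper threshold
    scale_in_at: float = 25.0,   # hypothetical lower threshold
    minimum: int = 2,            # hypothetical floor
    maximum: int = 10,           # hypothetical ceiling
) -> int:
    """Add or remove one virtual machine (the scale unit) when a threshold is crossed."""
    if cpu_percent > scale_out_at:
        target = current + 1
    elif cpu_percent < scale_in_at:
        target = current - 1
    else:
        target = current
    # Keep the result within the configured instance limits.
    return max(minimum, min(maximum, target))
```

For example, `desired_instance_count(2, 90.0)` asks for a third virtual machine, while `desired_instance_count(2, 10.0)` stays at two because of the assumed floor.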

2.3.4 Health Model

Problem: If any component used to provide a service fails, it can cause performance degradation or unavailability of services.

Solution: Initially, Contoso IT perceived the definition of health models for hybrid services to be more difficult than defining health models for services where all components were hosted on premises.  Part of this was due to their fear of the unknown.  They believed they had more control over everything on premises than they did with their public cloud provider, and they also knew more about how everything worked in their private cloud than they did with their public cloud provider.  After they had defined multiple hybrid service health models though, it occurred to them that they didn't need to understand how all the hardware at their public provider worked, and that was actually a benefit to them.  This made their health model definition easier.

Contoso IT selected a public cloud infrastructure service provider that provided them with a service level agreement (SLA) that included an availability level that their provider committed to meet each month. Contoso IT soon realized that as long as their provider met this SLA, they weren't extremely concerned with how their provider met the SLA, only that it did meet the SLA.

This allowed Contoso to define much simpler service health models for its hybrid services than for services whose components ran only on Contoso's private cloud infrastructure. In its first phase, Contoso was using the infrastructure as a service (IaaS) functionality from its public provider. The provider also offered a platform as a service (PaaS) capability, which Contoso planned to utilize in the future. Contoso expects that their health model definition will be even simpler when using PaaS, as there will be even less functionality they will need to manage and remediate.

2.3.5 Application

Problem: Not all applications are optimized for cloud infrastructures and may not be able to be hosted on cloud infrastructures.

Solution: Initially, Contoso IT decided that it would only migrate application service components to the public cloud that were designed with the stateless Application pattern. They knew that their public provider's infrastructure was designed with the Upgrade Domain and Physical Fault Domain patterns, so they knew that their stateless applications would continue to be resilient in the event of individual virtual machine failures at their provider.

2.3.6 Cost Model

Problem: Consumers tend to use more resources than they really need if there's no cost to them for doing so.

Solution: Contoso IT will show its internal consumers their consumption costs for public cloud resources, just as it does for its private cloud resources, to help it meet its Incentivize Desired Behavior principle.  They will not show the public provider's direct cost to their consumers however.  Instead, they will add additional costs to the public provider's costs, before showing the cost to their internal consumers.  This is because Contoso IT provides additional functionality to the public cloud virtual machines. 

This additional functionality has a cost.  Contoso IT currently has a per virtual machine cost for single sign-on functionality that is incorporated into the cost of its private cloud virtual machines.  It will add this same cost to its public cloud virtual machines.  This combined cost is what it will show to its internal consumers.  In the future, as Contoso adds backup capability, monitoring capability, and other capability to the public cloud resource it consumes, it will further increase the cost of public cloud virtual machine consumption for its internal consumers accordingly.
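The cost shown to internal consumers is therefore the provider's cost plus Contoso IT's own per virtual machine charges. A minimal sketch of that calculation follows. The function name, the surcharge dictionary, and the amounts in the usage note are hypothetical, since the article gives no actual figures.

```python
from decimal import Decimal


def showback_cost(
    provider_cost: Decimal,
    vm_count: int,
    per_vm_surcharges: dict[str, Decimal],
) -> Decimal:
    """Provider cost plus Contoso IT's per virtual machine surcharges."""
    per_vm_total = sum(per_vm_surcharges.values(), Decimal("0"))
    return provider_cost + per_vm_total * vm_count


# Today the only surcharge is single sign-on; backup or monitoring
# could be added to the dictionary later.
surcharges = {"single sign-on": Decimal("5.00")}
```

With these assumed numbers, `showback_cost(Decimal("120.00"), 3, surcharges)` yields `Decimal("135.00")`: 120.00 from the provider plus 3 × 5.00 for single sign-on.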

3.0 Physical Design

Contoso found the Hybrid Cloud Design Considerations article helpful when defining their physical architecture too.  Though the article did discuss Microsoft technologies, it also helped them get a better understanding of the considerations that they should factor into their hybrid cloud infrastructure design.  The guidance in the article helped them evaluate different cloud providers, cloud services, products, and technologies to determine which would best help them meet their requirements and comply with the conceptual design they created for their hybrid cloud infrastructure.

After evaluating several providers and their services, they concluded that Windows Azure would best meet their projected long and short-term requirements, as defined in the Scenario Definition article. In addition, Contoso already had a strong relationship with Microsoft, which also influenced their decision to begin the hybrid cloud infrastructure project using Windows Azure Infrastructure Services.

The Hybrid Cloud Design Considerations article helped Contoso understand which cloud services, products, and technologies would best help it implement components from the reference model that it defined.  The cloud services, products, and technologies that they selected to implement their reference model components are listed in the table below.

Reference Model Entity: Product/Technology or External Service
Network (support and services)
  • Windows Azure Virtual Networks
  • Windows Server 2012 DNS services
Authentication (support and services)
  • Active Directory Domain Services (AD DS)
  • Windows Azure Active Directory (WAAD)
Directory (support and services)
  • Active Directory Domain Services (AD DS)
  • Windows Azure Active Directory (WAAD)
Compute (support and services)
  • Windows Azure Virtual Machines
Storage (support and services)
  • Windows Azure blob storage
Network infrastructure
  • On-premises VPN gateway
  • Existing on-premises network infrastructure
  • Windows Azure Virtual Networks
Compute Infrastructure
  • Existing on-premises compute infrastructure
  • Windows Azure Virtual Machines
Storage infrastructure
  • Existing on-premises storage infrastructure
  • Windows Azure storage infrastructure

After selecting the public cloud provider, cloud services, products, and technologies that they would implement their hybrid cloud infrastructure with, Contoso continued their physical design.

3.1 Overview

Contoso IT broke their hybrid cloud infrastructure project into multiple phases. Phase 1 of their project focused on the integration of critical infrastructure components between their private cloud and the public cloud provider, and the migration of a simple line of business application to the public cloud. In Phase 1 of the project, Contoso IT selected a simple two-tier application to migrate to the public cloud and to validate the necessary infrastructure integration with their public cloud infrastructure service provider.

Both tiers of the application were previously hosted on their private cloud infrastructure, but in Phase 1, they migrated the web tier to the public cloud infrastructure service provider, and decided that the database tier would remain on-premises. They felt this would give them an opportunity to validate their infrastructure design and facilitate future application migrations to the public cloud.

The application includes a front-end web tier and a back-end database tier. Contoso classified the application as a low business impact (LBI) application because it didn't contain any personally-identifiable information (PII), nor did it have any regulatory compliance requirements. They chose an LBI application because they wanted to start with a low risk asset as they learn more about planning, designing, implementing and managing a hybrid cloud infrastructure. In subsequent phases of the project, they will integrate more robust management and automation capabilities with their public cloud infrastructure service provider, and migrate additional applications to the public cloud provider.

The figure below is a diagram of the physical design for Contoso's hybrid cloud infrastructure solution that includes the LBI application that they migrated in Phase 1.

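[Figure: Physical design of Contoso's hybrid cloud infrastructure solution, including the LBI application migrated in Phase 1]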

Contoso IT has already completed Phase 1 of their project. The sections that follow explain which design decisions they made and why. Contoso IT made their design decisions by selecting the design options and considerations from the Hybrid Cloud Design Considerations article that best helped them meet the requirements that they had defined in the Scenario Definition article. The decisions are grouped to align with both the CSFRM and the considerations order listed in the Hybrid Cloud Design Considerations article, as Contoso IT found the order in the article to be a useful design sequence. With that said, they did iterate through most of their design decisions several times before finalizing them because of what they learned while developing the solution.

3.2 Service Account Acquisition and Billing

Contoso had several payment plan options from which they could choose when using Azure Infrastructure services. They decided to begin their hybrid cloud infrastructure project with a "pay as you go" plan. After they determine what their cost model will be, Contoso will move to the "pay monthly for six months" plan.  Once they move into full production they will finalize on the "pay monthly for twelve months" plan. They made these decisions based on wanting to begin the project slowly and decided a conservative approach would work best early on. As they gain more experience, they will move to payment plans that increase their financial level of commitment to the public cloud infrastructure service provider.

To simplify billing management, Contoso IT decided to use a single billing account.  The billing account will be assigned to the Director of Contoso IT. They also decided that the single billing account will pay for multiple subscriptions. The reason for this is that using a single subscription for multiple projects can be challenging from an organizational and billing perspective. The Windows Azure management portal provides no method for viewing only the resources used by a single project, and there is no way to automatically break out billing on a per-project basis. Contoso IT was aware that they could somewhat alleviate organizational issues by giving similar names to all services and resources that are associated with a project (for example, HRHostedSvc, HRDatabase, HRStorage). However, they decided that would not help them with billing.

Due to challenges related to granularity of access, organization of resources, and project billing, Contoso decided to create multiple subscriptions and associate each subscription with a different project. Another reason why Contoso created multiple subscriptions was to separate the development and production environments that they plan to deploy in the future. A development subscription will allow administrative access to developers while the production subscription will allow administrative access only to operations personnel.

Finally, Contoso realized that separate subscriptions would provide greater clarity in billing, greater organizational clarity when managing resources, and greater control over who has administrative access to a project. They were however, aware that this approach can be more costly than using a single subscription for all of their projects.
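Because each subscription belongs to exactly one project, per-project billing reduces to summing costs by subscription. The sketch below illustrates that roll-up. The subscription identifiers, the mapping, and the record format are hypothetical and do not correspond to any Windows Azure billing export format.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical mapping of subscription identifiers to projects.
SUBSCRIPTION_TO_PROJECT = {
    "sub-hr-dev": "HR",
    "sub-hr-prod": "HR",
    "sub-web-prod": "Web",
}


def cost_per_project(usage_records):
    """Sum (subscription_id, cost) records into a total per project."""
    totals = defaultdict(Decimal)  # Decimal() starts each total at 0
    for subscription_id, cost in usage_records:
        project = SUBSCRIPTION_TO_PROJECT[subscription_id]
        totals[project] += cost
    return dict(totals)


records = [
    ("sub-hr-dev", Decimal("10.50")),
    ("sub-hr-prod", Decimal("40.00")),
    ("sub-web-prod", Decimal("25.25")),
]
```

With a single shared subscription, the same breakdown would depend on naming conventions alone, which is the limitation Contoso wanted to avoid.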

DESIGN DECISIONS:
- Begin with a "pay as you go" payment plan
- Use a single billing account managed by the Director of Contoso IT
- Establish separate subscriptions that are based on project
- Begin with a single subscription

3.3 Network

Contoso has a well-designed network infrastructure that is managed by its networking team and they have determined that for the most part, they will have to make very few changes to their underlying networking infrastructure and support model when integrating it with a public cloud infrastructure provider's network. There were a few decisions they had to make however, regarding connectivity to resources hosted on the public cloud infrastructure service provider's network. These decisions included:

  • Network connectivity between on-premises and off-premises resources
  • Inbound connectivity to the public cloud infrastructure service network
  • Load balancing of inbound connections to virtual machines in the public cloud infrastructure service provider's network
  • Name resolution for the public infrastructure service provider's network

The remainder of this section discusses design decisions Contoso made in each of these areas.

3.3.1 Network Connectivity Between On-Premises and Off-Premises Resources

Contoso needed to connect the corporate network to resources located on Azure Virtual Networks. They have decided to use a site-to-site VPN connection to connect the networks. The reason for this is that Contoso needs full bidirectional connectivity to support connections from the front-end web servers on the Azure Virtual Network to the back-end database servers that are located on-premises.  Additionally, they required enterprise-level name resolution for resources connecting both to and from the Azure Virtual Network and they also required connectivity between domain controllers existing on the Azure Virtual Network and the on-premises domain controllers.
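One practical check before establishing a site-to-site connection is that the on-premises and virtual network address ranges do not overlap, since routing between the two sites generally depends on distinct ranges. The article does not list Contoso's address spaces, so the function name and the ranges in the usage note are purely illustrative assumptions.

```python
import ipaddress


def overlapping_ranges(on_premises, virtual_network):
    """Return every (on-premises, virtual network) pair of CIDR ranges that overlap."""
    overlaps = []
    for a in on_premises:
        for b in virtual_network:
            net_a = ipaddress.ip_network(a)
            net_b = ipaddress.ip_network(b)
            if net_a.overlaps(net_b):
                overlaps.append((a, b))
    return overlaps
```

For example, `overlapping_ranges(["10.0.0.0/16"], ["10.1.0.0/16"])` returns an empty list, while pairing `10.0.0.0/16` with `10.0.1.0/24` reports a conflict.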

DESIGN DECISIONS:
- Establish a site-to-site VPN between the corporate network and the Azure Virtual Network

3.3.2 Inbound Connectivity to the Public Cloud Infrastructure Service Network

After examining their options for inbound connectivity to the Internet-facing servers on the Azure Virtual Network, they decided that all inbound access for client systems would be done over the Internet. This included client systems that connect from both the Internet and from Contoso's on-premises network. The reason they chose this option was because it simplified their DNS configuration, as they would not need to deploy a split DNS infrastructure.

Contoso decided that inbound access to resources hosted in the Azure Virtual Network from management systems would not be done over the Internet. Because management access requires access to the operating systems themselves, Contoso decided that all management activity must be sourced from management workstations and services within the corporate network. Management workstations and services must be physically located on the corporate network, or they must be connected to the corporate network over VPN or DirectAccess. This decision was driven by Contoso's security team, who required that all management access be done from authorized systems located on the corporate network, as this prevents access from the Internet to the operating systems themselves.
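The management access rule amounts to accepting a management connection only when its source address belongs to the corporate network. The sketch below shows that test in isolation. The corporate range is a made-up example, and in practice the rule would be enforced by network controls rather than application code.

```python
import ipaddress

# Hypothetical corporate address range; the article does not specify Contoso's.
CORPORATE_RANGES = [ipaddress.ip_network("10.0.0.0/16")]


def is_authorized_management_source(source_ip: str) -> bool:
    """True only if the source address lies inside a corporate network range."""
    address = ipaddress.ip_address(source_ip)
    return any(address in network for network in CORPORATE_RANGES)
```

A workstation connected over VPN or DirectAccess would pass this check as long as it receives an address from one of the listed ranges.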

DESIGN DECISIONS:
-All client access to resources located on the Azure Virtual Network was done over the Internet
-All management access to resources located on the Azure Virtual Network was done from the corporate network
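
The two decisions above amount to a simple access policy. The following minimal Python sketch illustrates that policy only; the corporate address range, the port set, and the function names are assumptions made for illustration and do not come from Contoso's actual configuration.

```python
import ipaddress

# Hypothetical corporate address range; the article does not list the real one.
CORPORATE_NETWORKS = [ipaddress.ip_network("10.0.0.0/8")]

# Ports treated as management access: RDP, remote PowerShell over HTTPS, SSH.
MANAGEMENT_PORTS = {3389, 5986, 22}


def is_corporate(source_ip: str) -> bool:
    """Return True when the source address belongs to the corporate network."""
    address = ipaddress.ip_address(source_ip)
    return any(address in network for network in CORPORATE_NETWORKS)


def is_allowed(source_ip: str, port: int) -> bool:
    """Apply the two design decisions to a single inbound connection."""
    if port in MANAGEMENT_PORTS:
        # Management access must originate from the corporate network.
        return is_corporate(source_ip)
    # Client access to the application (HTTPS) is accepted from any source.
    return port == 443
```

In this sketch a VPN or DirectAccess client is assumed to receive a corporate address, so it passes the same check as a workstation that is physically on the corporate network.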

3.3.3 Load Balancing of Inbound Connections to Virtual Machines in a Public Infrastructure Service Network

Contoso IT required high availability for services running on the Azure Virtual Network. In order to provide high availability and load balancing for front-end web systems, they decided to use the Azure Virtual Network's built-in load balancing feature. This decision was made based on the fact that all inbound connections from client systems would be done over the Internet and not through the corporate network. Windows Network Load Balancing was considered, but since it's not supported in Azure Virtual Networks and because it requires virtual machines to have more than one IP address (which is not supported in Windows Azure), Windows Network Load Balancing was not used for load balancing.

Another reason Contoso made the decision to use Azure Virtual Network's built-in load balancing feature is that Windows Azure includes an SLA that guarantees 99.95% uptime when two or more instances of the service component are installed (such as two instances of the front-end web servers or two domain controllers). Contoso is aware that Traffic Manager is another option they have available to them for load balancing, but decided not to use it at this time since it was in early preview when they were going through their design process.  They will evaluate it further in subsequent project phases.

DESIGN DECISIONS:
-The Azure Virtual Network's built-in load balancer (load balanced endpoints) was used to enable high availability and load balancing for front-end web servers
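
To put the 99.95% SLA figure in perspective, the short Python sketch below converts an uptime percentage into the downtime it permits. The 30-day measurement window and the function name are assumptions for illustration; the provider's SLA terms define the actual measurement period.

```python
def allowed_downtime_minutes(sla_percent: float, days: int = 30) -> float:
    """Return the downtime, in minutes, that an uptime SLA permits over a period."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - sla_percent / 100)


# Over an assumed 30-day month, 99.95% uptime leaves roughly 21.6 minutes.
print(round(allowed_downtime_minutes(99.95), 1))
```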

3.3.4 Name Resolution for the Public Infrastructure Service Network

Contoso IT needed name resolution support for client systems that access virtual machines on the Azure Virtual Network. Since the decision was made that all client access to cloud based resources would be done over the Internet, they decided that they would use name resolution services that are external to both the Azure Virtual Network and the corporate network. DNS entries for resources on the Azure Virtual Network that are accessible over the Internet are maintained on a public DNS server. Client systems located both on and off premises use the public DNS resource records to access the services located on the Azure Virtual Network, although the client systems on the on-premises network use an internal DNS server that contains the same resource records that the public DNS server contains.

Contoso IT also required name resolution support for management systems so that they could access Azure Virtual Network hosted virtual machines. Since they decided to only allow management access to virtual machines on the Azure Virtual Network through the corporate network, they decided to enter the actual virtual machine names into the corporate DNS database. This decision was also supported by the fact that some or most of the resources hosted on the Azure Virtual Network are domain members, and thus are able to take advantage of Active Directory secure Dynamic DNS.

The figure below shows the DNS name resolution process for internal and external users. In the figure you can see that an external user queries a public DNS server that is responsible for hosting public DNS records for the service. The public DNS server returns the public IP address used to access the service in Windows Azure and then the client application connects to that IP address. The figure also shows an internal user sending a query to an internal DNS server that also hosts the public resource records required to reach the service over the Internet. Note that both the internal and external users receive the same public IP address required to reach the service located on Windows Azure. There is a zone transfer that takes place between the internal DNS server hosting the resource records for the service and the publicly accessible server (zone transfer is not depicted in the figure).

The name resolution process for administrative access to the virtual machines located on the Azure Virtual Network is a bit different. In this case, the figure shows that when an administrator wants to connect to a virtual machine on the Azure Virtual Network (for example, by using Remote Desktop Protocol (RDP)), the query is sent to their Active Directory integrated DNS server, which contains resource records for the actual names of the virtual machines located on the Azure Virtual Network. The IP address returned to the administrator's machine is the actual IP address used by the virtual machine on the Azure Virtual Network. The application then connects to that IP address over the site-to-site VPN connection.

[Figure: DNS name resolution process for internal users, external users, and administrators]

DESIGN DECISIONS:
-Public DNS records were created to allow client Internet access to Azure Virtual Network hosted resources
-Zone transfer was enabled for the public DNS entries between the public DNS server and DNS servers on the corporate network
-Private DNS records were created to allow management access to Azure Virtual Network hosted resources
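
The resolution paths described in this section can be sketched with two small lookup tables. This is only an illustration: the host names and addresses below are hypothetical, and a dictionary merge stands in for the zone transfer.

```python
# Hypothetical records; names and addresses are placeholders, not Contoso data.
PUBLIC_ZONE = {"app.contoso.example": "203.0.113.20"}
PRIVATE_ZONE = {"web01.corp.contoso.example": "10.10.1.4"}


def public_dns_lookup(name):
    """Public DNS server: knows only the Internet-facing service records."""
    return PUBLIC_ZONE.get(name)


def internal_dns_lookup(name):
    """Internal DNS server: private records plus a copy of the public records."""
    internal_view = {**PUBLIC_ZONE, **PRIVATE_ZONE}
    return internal_view.get(name)


# Internal and external users receive the same public address for the service.
print(public_dns_lookup("app.contoso.example") == internal_dns_lookup("app.contoso.example"))
```

Only the internal server can resolve the actual virtual machine names, which matches the decision to restrict management access to the corporate network.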

3.4 Storage

The Contoso intranet has a mature storage infrastructure that includes Fibre Channel and iSCSI SANs as well as Windows Server 2012 Storage Spaces.  In Phase 1 of their hybrid cloud infrastructure project, Contoso IT had no requirement for using Azure storage other than to store the virtual disk files for their virtual machines.  In future phases, however, they fully intend to evaluate implementing some of the hybrid storage scenarios discussed in the Hybrid Cloud Design Considerations article.

3.5 Compute

Contoso needed to make compute design decisions that center on the virtual machines that were hosted on-premises and on the Azure Virtual Network. Contoso also needed to take into account issues related to the Virtual Machine service provided by Windows Azure.

Contoso made decisions centered around the following issues when designing the hybrid cloud infrastructure’s compute components: 

  • Operating system and service images
  • On-premises physical and virtual server images and disks
  • Virtual disk formats and types
  • Virtual machine customization
  • Virtual machine access
  • Virtual machine service availability
  • Virtual machine service scalability

The remainder of this section discusses design decisions Contoso made in each of these areas.

3.5.1 Operating System and Service Images

In Phase 1 of Contoso IT's hybrid cloud infrastructure project, they migrated the front-end web tier of a low business impact (LBI) application to the Azure Virtual Network. Previously both the front-end web and database tiers were hosted on virtual machines on their corporate network. The application was custom-developed by Contoso's in-house developers. Contoso IT needed to decide if they wanted to move these virtual machines to the Azure Virtual Network or create new virtual machines on an Azure Virtual Network and then install the applications on those new virtual machines.

Contoso decided to create new virtual machines on an Azure Virtual Network and then install the front-end web tier component of their LBI application on the Azure Virtual Network. They made this decision for three primary reasons:

  1. The on-premises virtual machines were using VHDX files and Azure Infrastructure Services did not support VHDX files.  Contoso did not want to convert the VHDX files to VHD files, which Azure Infrastructure Services does support.
  2. The front-end web tier component was easy to install and configure.
  3. They didn't have to wait for their multi-gigabyte virtual machine files to upload to Azure storage.

In the future, Contoso will move existing virtual machines to an Azure Virtual Network if it finds the overhead of moving the virtual machines to be less than the overhead of creating and configuring new virtual machines in Windows Azure.

Since Azure Infrastructure Services allows its consumers to place their virtual machines in several different datacenters, Contoso decided to place the virtual machines in an Azure datacenter that is closest to the back-end database tier in its private cloud. They made this decision based on their understanding that network performance is best when Azure Infrastructure Services hosted virtual machines are closest to the end-users or servers that they need to communicate with.

For similar performance reasons, Contoso placed all the virtual machines into the same affinity group, as that put all members of the web tier into the same datacenter.

DESIGN DECISIONS:
-Contoso created new virtual machines for the front-end web tier on the Azure Virtual Network
-All virtual machines hosted in Azure Infrastructure Services were made members of the same affinity group
-All virtual machines were placed in an Azure datacenter closest to the on-premises network

3.5.2 On-Premises Physical and Virtual Server Images and Disks Design Decisions

Contoso has organizational standards for server operating systems that host services on their network. They use sysprepped disk images that contain the settings that have been approved by operations and security.  This helps them save time and reduce the risk of configuration errors when deploying new operating systems to virtual machines. However, during the first phase of their hybrid cloud infrastructure project, they decided to create new virtual machines on the Azure Virtual Network to support their LBI application based on the rationale discussed in the Operating System and Service Images section of this article.

For this reason, they decided not to use the "gold" images that are currently deployed on their private cloud. New virtual machines deployed on the Azure Virtual Network were manually configured to comply with corporate standards. In the future they will consider porting their "gold" images to Azure storage and use them as the base for new virtual machines created in Azure Infrastructure Services.

Because the servers in the front-end web tier accepted inbound connections from the Internet, those servers were hosted on Contoso's on-premises DMZ segment prior to moving their roles to the Azure Virtual Network. Security configuration for the web tier server virtual machines was managed by the Group Policy settings for the DMZ organizational unit (OU) in Contoso's Windows Server Active Directory. After the move to the Azure Virtual Network, Contoso continued to apply security settings to the virtual machines via Group Policy by placing the web tier virtual machines hosted on the Azure Virtual Network into their DMZ OU in Active Directory.

DESIGN DECISIONS:
-In the first phase of the hybrid cloud infrastructure project, Contoso did not use corporate "gold" images in Azure Infrastructure Services
-Contoso placed the web tier virtual machines hosted in Azure Infrastructure Services into their DMZ OU

3.5.3 Virtual Disk Formats and Types Design Decisions

Contoso's current cloud infrastructure uses only VHDX files. They are aware that currently Azure Infrastructure Services supports only the VHD format. They are aware that they can perform disk conversion, but they prefer to wait until Azure Infrastructure Services supports the VHDX format so that they do not need to incur the overhead of converting virtual disk files.

Contoso used Microsoft's recommendation to put the operating system on an operating system disk and application data on data disks. Also per Microsoft's recommendation, Contoso decided not to put data on caching disks.

DESIGN DECISIONS:
-Contoso will create new virtual machines on Azure Infrastructure Services and use the native VHD disk format supported by the service
-They will use the Microsoft recommended disk types

3.5.4 Virtual Machine Customization Design Decisions

Contoso's virtual machines are configured with a number of different virtual hardware designs, based on the particular service being supported by the virtual machine or the preferences of the operator who set up the virtual machine. They realized that Azure Infrastructure Services requires users to pick from a number of pre-defined virtual hardware configurations and that the level of customization available on the on-premises cloud infrastructure is not available in Azure Infrastructure Services. Contoso reviewed the virtual hardware configuration for virtual machines running on-premises and mapped these existing configurations to those available on Azure Infrastructure Services. When Contoso moves on-premises services to Azure Infrastructure Services, they will use the mapping table to determine which compute instance size they will use to host the service on Azure Infrastructure Services.

DESIGN DECISIONS:
-Contoso mapped current virtual machine hardware configuration to those available in Azure Infrastructure Services
-New virtual machines created on Azure Infrastructure Services were configured based on the hardware mapping
-The front-end web tier virtual machines were deployed with the "small" virtual machine configuration
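
A mapping table of this kind can be expressed as a lookup that picks the smallest pre-defined size that covers an on-premises configuration. The Python sketch below is illustrative only: the size catalog is an assumption and should be checked against the provider's current documentation, and the function name is hypothetical.

```python
from typing import NamedTuple, Optional


class Size(NamedTuple):
    name: str
    cores: int
    memory_gb: float


# Illustrative catalog, ordered from smallest to largest (assumed values).
SIZES = [
    Size("Small", 1, 1.75),
    Size("Medium", 2, 3.5),
    Size("Large", 4, 7.0),
    Size("Extra Large", 8, 14.0),
]


def map_to_size(cores: int, memory_gb: float) -> Optional[Size]:
    """Return the smallest size that meets both requirements, or None."""
    for size in SIZES:
        if size.cores >= cores and size.memory_gb >= memory_gb:
            return size
    return None  # No pre-defined size is large enough.


print(map_to_size(1, 1.5).name)  # Small
```

An on-premises virtual machine that exceeds every pre-defined size returns None, which flags it for redesign rather than a direct move.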

3.5.5 Virtual Machine and Application Access Design Decisions

Contoso needed a way to access virtual machines on Azure Virtual Networks for management purposes. They decided to use the Remote Desktop Protocol (RDP) and remote PowerShell to access Windows operating systems, as this is what they use to manage virtual machines on a per-server level on the corporate network. For Linux-based virtual machines that they plan to deploy in the future, they will use the Secure Shell Protocol (SSH) for remote management.  They will only allow RDP and remote PowerShell access from the corporate network or from machines that are virtually connected to the corporate network.  These connections reach the Azure Virtual Network over their site-to-site VPN connection with Azure Infrastructure Services.

Azure Infrastructure Services automatically creates a port forwarding rule that enables remote PowerShell and RDP for Windows operating systems and SSH access for Linux operating systems from the Internet to virtual machines created on an Azure Virtual Network. To reduce the attack surface for Internet-connected servers, Contoso security policy prohibits Internet access to the operating systems over remote PowerShell, RDP and SSH. For this reason Contoso IT decided to disable the default port forwarding rule created by Azure Infrastructure Services on all Internet-connected virtual machines.

Access to the application running on the web-tier servers is encrypted with the secure sockets layer (SSL) protocol, since users send corporate credentials to log on to the application. In order to allow this, Contoso IT created port forwarding rules that allowed TCP port 443 to the web-tier servers running on Azure Infrastructure Services. No port forwarding rules were created for the Active Directory domain controllers because no Internet-initiated connections are made to them.

DESIGN DECISIONS:
-RDP was used to perform per server management access to Windows virtual machines on Azure Infrastructure Services Virtual Networks
-The default remote PowerShell and RDP port forwarding rules were removed 
-A port forwarding rule was created that allowed TCP port 443 to all of the front-end web servers
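
The endpoint changes described in this section can be sketched as a filter over a virtual machine's port forwarding rules. This Python sketch is a model only; the `Endpoint` type, the endpoint names, and the public port numbers are hypothetical and do not represent the Azure management API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    name: str
    protocol: str
    public_port: int
    private_port: int


# Private ports for RDP, remote PowerShell over HTTPS, and SSH.
BLOCKED_PRIVATE_PORTS = {3389, 5986, 22}


def harden(endpoints):
    """Drop management endpoints and make sure an HTTPS endpoint exists."""
    kept = [e for e in endpoints if e.private_port not in BLOCKED_PRIVATE_PORTS]
    if not any(e.protocol == "tcp" and e.private_port == 443 for e in kept):
        kept.append(Endpoint("HTTPS", "tcp", 443, 443))
    return kept


# Hypothetical defaults on a newly created Windows virtual machine.
defaults = [
    Endpoint("RemoteDesktop", "tcp", 50123, 3389),
    Endpoint("PowerShell", "tcp", 5986, 5986),
]
print([e.name for e in harden(defaults)])  # ['HTTPS']
```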

3.5.6 Virtual Machine and Service Availability Design Decisions

Contoso used a hardware load balancer to increase the service availability for their multi-tier applications, typically for the front-end web servers. They wanted to continue to use load balancing technology when moving applications to Azure Infrastructure Services. For this reason, they used the built-in load balancing feature included in Azure Infrastructure Services.

In addition to using load balancing to help ensure service availability, Contoso wanted to make sure that all components were available when the public cloud infrastructure virtualization hosts were being serviced or updated. To solve this problem, Contoso decided that all virtual machines participating in the same service will belong to the same availability set.

DESIGN DECISIONS:
-Contoso used the Azure Infrastructure Services built-in load balancing feature for inbound connections to the service
-All machines participating in the same service were assigned to the same availability set
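
The reasoning behind availability sets can be illustrated with a small model in which members of a set are spread across update domains that are serviced one at a time. The sketch below is a simplification under stated assumptions: the round-robin placement, the default of five update domains, and the function names are illustrative and are not a description of the provider's actual placement logic.

```python
def assign_update_domains(vm_names, domain_count=5):
    """Spread virtual machines across update domains in round-robin order."""
    return {name: index % domain_count for index, name in enumerate(vm_names)}


def survivors_during_update(assignment, domain_being_updated):
    """Return the virtual machines that keep running while one domain is serviced."""
    return sorted(
        name for name, domain in assignment.items() if domain != domain_being_updated
    )


web_tier = assign_update_domains(["web1", "web2"])
print(survivors_during_update(web_tier, 0))  # ['web2']
```

With two or more members in the set, servicing any single update domain in this model always leaves at least one member running.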

3.5.7 Virtual Machine Service Scalability

The application Contoso put into Azure Infrastructure Services had two Internet-connected web tier servers when it was hosted on-premises. Contoso IT noticed that application performance was sometimes degraded during that time because the web tier servers were overwhelmed with connection traffic. They had considered adding two more web tier servers while the web tier was hosted on their private cloud to improve application performance, but they never did so.

Since Contoso moved the web tier to Azure Infrastructure Services, they took advantage of the Azure auto-scaling feature. They created four virtual machines for their web tier, started two of the four, and defined the range of running instances to always be between two and four.  They then configured the auto-scaling feature so that it would add an instance to the tier when processor utilization was more than 80% for more than ten minutes, and then automatically remove instances when processor utilization dropped below 60% for more than ten minutes.  This not only allowed Contoso to meet capacity needs without manual intervention, it also saved them money, as they were only charged for the running instances.

Contoso put all of the web tier servers in its pilot application into the same Cloud Service. The reason for this decision is that auto-scaling virtual machines can only be done within the same service because it's configured within the service.

DESIGN DECISIONS:
-Contoso IT configured the web tier servers of the pilot application to auto-scale between two and four servers
-Auto-scaling parameters were set to a range of two to four servers with a processor utilization range set to 60%-80%
-All servers in the web tier were placed into the same cloud service
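
The scaling rule described in this section can be modeled in a few lines of Python. This is a sketch of the rule as stated, not of the Azure auto-scaling implementation; the one-sample-per-minute input and the function name are assumptions for illustration.

```python
MIN_INSTANCES, MAX_INSTANCES = 2, 4
SCALE_UP_CPU, SCALE_DOWN_CPU = 80.0, 60.0
SUSTAIN_MINUTES = 10


def simulate(cpu_by_minute, start=MIN_INSTANCES):
    """Return the running instance count after each one-minute CPU sample."""
    instances = start
    high = low = 0  # Consecutive minutes above or below the thresholds.
    history = []
    for cpu in cpu_by_minute:
        high = high + 1 if cpu > SCALE_UP_CPU else 0
        low = low + 1 if cpu < SCALE_DOWN_CPU else 0
        if high > SUSTAIN_MINUTES and instances < MAX_INSTANCES:
            instances += 1  # Sustained load: start one more instance.
            high = 0
        elif low > SUSTAIN_MINUTES and instances > MIN_INSTANCES:
            instances -= 1  # Sustained idle: stop one instance.
            low = 0
        history.append(instances)
    return history


# Eleven busy minutes add an instance; eleven quiet minutes remove it again.
print(simulate([90] * 11 + [50] * 11)[-1])  # 2
```

The counters reset after each scaling action, so the tier changes by at most one instance per sustained period and never leaves the two-to-four range.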

3.6 Management and Support

For the servers it hosted with its public cloud provider, Contoso had to make several important design decisions around how it would manage and support them. The major design decisions they made were in the following areas:

  • Consumer and provider portal
  • Usage and billing
  • Service reporting
  • Infrastructure service provider authentication and authorization
  • Application authentication and authorization
  • Backup services and disaster recovery

The following sections discuss the design decisions Contoso made and the rationale behind those decisions.

3.6.1 Consumer and Provider Portal

Though Contoso has a private cloud infrastructure, it does not yet have what it considers to be a "private cloud."  For example, it still does not provide its consumers the level of self-service and automation that no longer requires any human intervention. Because of this, they did not provide a consumer-facing self-service portal to their users that automatically triggers the creation of virtual machines. They plan to do so when they have completed their on-premises private cloud offering. Until then, they will take resource requests via their IT web site, where business units can make requests for resources in the private cloud infrastructure.

Contoso needed access to Azure Infrastructure Services in order to provision and configure virtual machines. While they had several options, they decided to begin their hybrid cloud project by using the Azure Portal to manage their Azure Infrastructure Services environment. Contoso was aware that they had other options, such as PowerShell management and System Center App Controller, but they decided that during the first phase of the project, they wanted to keep things as simple as possible. In the next phase of their project, they will seek to create a seamless management experience by using App Controller or a similar solution. In addition, prior to the next phase of the hybrid cloud project, Contoso will gain experience with PowerShell management of their Azure Infrastructure Services resources.

DESIGN DECISIONS:
-Contoso did not provide a consumer self-service portal to its customers during the first phase of the hybrid cloud infrastructure project
-They used the Azure Portal for the first phase of the project

3.6.2 Usage and Billing

As mentioned in section 3.2 (Service Account Acquisition and Billing Considerations for Public Cloud Infrastructure Design Decisions), Contoso began with a conservative approach in terms of resource acquisition in Azure Infrastructure Services, but plans to systematically grow their financial investment in Azure Infrastructure Services as they gain more experience with the service. In order to both simplify and clarify usage reporting, Contoso used a single billing account that will be responsible for multiple subscriptions. This gave them visibility into usage costs for different service offerings that use the hybrid cloud infrastructure.

During the initial phase of the project, Contoso IT used showback to report the costs to the business group responsible for the application. As they deploy more applications into the hybrid cloud infrastructure, they will continue to use showback with multiple business groups. Over time, they will engage in discussions with the financial leadership at Contoso to determine if chargeback is appropriate for some or all of the business groups that use the hybrid cloud infrastructure.

DESIGN DECISIONS:
-Contoso used a single billing account
-Multiple subscriptions will be used in the future with one subscription per service offering 
-Showback was used instead of chargeback during the initial phase of the project
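
A showback report of the kind described here is essentially a per-group cost roll-up. The Python sketch below shows one way to compute it; the record layout, business group names, hours, and hourly rates are all hypothetical values chosen for illustration and are not actual prices or Contoso data.

```python
from collections import defaultdict


def showback(usage_records):
    """Total the cost of usage records for each business group."""
    totals = defaultdict(float)
    for record in usage_records:
        totals[record["business_group"]] += record["hours"] * record["hourly_rate"]
    return dict(totals)


# Hypothetical usage records for one reporting period.
records = [
    {"business_group": "Sales", "hours": 100, "hourly_rate": 0.5},
    {"business_group": "Sales", "hours": 200, "hourly_rate": 0.5},
    {"business_group": "Finance", "hours": 10, "hourly_rate": 0.25},
]
print(showback(records))
```

With showback, the totals are reported to each business group for visibility only; under chargeback, the same totals would be billed to the groups.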

3.6.3 Service Reporting

Contoso needed reports on service availability for the public cloud infrastructure service side of their hybrid cloud infrastructure. Contoso wanted to provide an SLA to consumers of the hybrid cloud service and reports needed to provide information about uptime for each component of the service as well as the service as a whole. At the time they had a monitoring and reporting system that they used on-premises and the plan is to integrate that system with the Windows Azure reporting system if they find that it is possible. If so, they will carry that out in the next phase of the project.

During the first phase of the hybrid cloud project, Contoso took advantage of the no-cost System Center Advisor software as a service offering from Microsoft to provide service configuration monitoring and reporting that will supplement the reporting capabilities included with Azure Infrastructure Services. Contoso will consider wider adoption of System Center Advisor after they gain more experience with the product during the first phase. In addition, they will use the service dashboard to determine the health of the Azure Infrastructure Services services in the region where the virtual machines are located.

DESIGN DECISIONS:
-Contoso used the Azure Infrastructure Services built-in service reports
-They used System Center Advisor to monitor service components hosted on Azure Infrastructure Services
-They used the service dashboard to monitor the health of the overall Azure Infrastructure Services services in their location

3.6.4 Public Cloud Infrastructure Service Provider Authentication and Authorization

When working with a public cloud infrastructure service provider’s system, Contoso needed to decide what authentication and authorization/access control options they would implement. Design decisions they needed to make in this area included:

  • Authentication
  • Authorization
  • Account management

NOTE:
This section (3.6.4) addresses only authentication, authorization and account management issues for access to the Azure service. This section does not cover authentication, authorization and account management issues related to access to the application itself, which runs on the web tier servers located in Azure Infrastructure Services. The next section (3.6.5) will discuss authentication, authorization and account management issues related to accessing the application.

The following sections discuss the design decisions Contoso made in these areas and the rationale for those decisions.

3.6.4.1 Authentication

Contoso needed to authenticate to the provider’s system to gain access to its system's resources. They considered their options and decided that the most efficient and secure way to provide account access to the Azure Portal and Azure Infrastructure Services hosted resources was to integrate their on-premises Active Directory accounts with Windows Azure. This enabled enterprise level account management since it integrated their on-premises account provisioning and de-provisioning systems with Azure's authentication systems.

Contoso decided that they would synchronize their on-premises authentication system (Active Directory) with Azure by taking advantage of Windows Azure Active Directory, but that, as an extra measure of security, they would not sync their on-premises passwords with Windows Azure Active Directory. Because they weren't synchronizing their on-premises passwords, they also federated their Active Directory with Windows Azure Active Directory by using Active Directory Federation Services, which enabled a claims-based authentication system that could be used to log on to the Azure Portal. This allowed them to use the same username and password to sign in to Windows Azure services that they used to sign in to their on-premises services, without ever needing to exchange passwords with Windows Azure.

The rationale for using Active Directory Federation Services' claims-based authentication mechanism is that since user passwords aren't being synchronized with Windows Azure Active Directory, there has to be some way for Windows Azure to authenticate the user using the on-premises user account. This is accomplished by using DirSync to synchronize the user accounts. Now that Windows Azure Active Directory "knows" about the user accounts, it needs a way to validate the passwords sent by the user. This is accomplished through AD FS. When Windows Azure Active Directory gets a log on request, it sends the claim to the on-premises AD FS server, which is responsible for validating the claim. This enables Contoso to sync their user accounts and not passwords with Windows Azure Active Directory.

After synchronizing their local accounts with Windows Azure Active Directory, Contoso decided to only allow accounts in their Active Directory to have administrative access to Azure Infrastructure Services hosted resources. This means that when Active Directory accounts are deleted or disabled on-premises, they automatically no longer have access to Windows Azure services. They believe this functionality will prove very helpful when employees with access to their Windows Azure services leave the company.

DESIGN DECISIONS:
-Contoso synchronized their on premises Active Directory with Windows Azure Active Directory using DirSync
-Only Active Directory accounts will be allowed administrative access to Azure Infrastructure Services hosted resources for management

3.6.4.2 Authorization and Access Control

Contoso needed to decide what authorization capabilities they would enforce for access to resources contained on-premises and in the public cloud. Since Contoso didn't have a self-service portal (other than their IT web resource request page), its consumers request resources through that web page and their resources are configured for them manually by a member of the IT department. A full self-service portal will be implemented when Contoso completes its private cloud IaaS implementation, and automation will be built into the resource requisitioning process, which will integrate both on-premises and off-premises components.

Contoso reviewed its options for acquiring resources for the public cloud components of the hybrid cloud infrastructure and decided that during the first phase of the project they would just extend their current on-premises approach to requests for Azure Infrastructure Services resources. They accept requests for infrastructure resources through their service desk, and Contoso IT deploys components on their on-premises infrastructure or on the public cloud, based on the unique requirements of each application. Only specific members of the Contoso IT group were given administrative access to Azure Infrastructure Services in the first phase of the project. Later, these members of the IT group will train the rest of the group on how to manage Azure Infrastructure Services resources.

DESIGN DECISIONS:
-Contoso extended its resource request approach to public cloud components of the hybrid cloud infrastructure
-A subset of the Contoso IT organization was trained on managing Azure Infrastructure Services resources
-The subset of the Contoso IT organization will train the rest of the IT group in the next phase of the project

3.6.4.3 Account Management

Contoso needed an account management system to ensure that only valid accounts of authorized users had access to the hybrid cloud infrastructure. They had a mature account management system that was integrated with their on-premises human resources systems. They needed to extend the efficiencies they have with their on-premises system to access to Azure Infrastructure Services hosted resources. As discussed in section 3.6.4.1 (Authentication), they accomplished this goal by integrating their on-premises Active Directory system with Azure Infrastructure Services so that local accounts were required for management access.

Role based access control was important to Contoso security and they implemented this at many levels of their on-premises network. They also required the ability to assign role-based access control to users of their public cloud resources. They decided to assign the head of Contoso IT to the Service Administrator role, and a subset of Contoso IT members to the Co-Administrator role. They did this based on the fact that the Director of Contoso IT makes decisions on who has administrative access and therefore should have the same responsibility for assigning this access to the Azure Infrastructure Services hosted resources.

Contoso was aware that Co-Administrators are allowed to add other Co-Administrators. Therefore, they put a process in place where the Service Administrator is responsible for reviewing the Co-Administrator list on a daily basis to confirm that only authorized users are listed as Co-Administrators. If the Service Administrator is not available due to vacation, sick leave, or any other reason, the Service Administrator's manager will be tasked with this check. The reason for this is that it is possible that a Co-Administrator might add an account that a user could use if they leave the company, which would allow them to retain access privileges when they should no longer have them.

DESIGN DECISIONS:
-On-premises account management was extended to off-premises access to the Azure Portal via claims-based federation
-The Director of Contoso IT was assigned the Service Administrator role
-The Director of Contoso IT will assign Co-Administrator access to a subset of the IT organization
-Contoso IT put a process in place to check on the list of Co-Administrators
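
The daily Co-Administrator check is a comparison between the accounts currently listed and the accounts that are authorized. The Python sketch below shows that comparison only; the account names are hypothetical, and retrieving the current list from the provider is outside the scope of the example.

```python
def review_co_administrators(current, authorized):
    """Return the listed Co-Administrators that are not on the authorized list."""
    current_set = {name.lower() for name in current}
    authorized_set = {name.lower() for name in authorized}
    return sorted(current_set - authorized_set)


# Hypothetical accounts for illustration.
listed = ["Alice@contoso.example", "mallory@contoso.example"]
approved = ["alice@contoso.example", "bob@contoso.example"]
print(review_co_administrators(listed, approved))  # ['mallory@contoso.example']
```

Any account returned by the check would be investigated and removed by the Service Administrator.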

3.6.5 Application Authentication and Authorization

Contoso needed to make design decisions around application authentication and authorization. While there are a number of authentication and authorization options available for the applications that Contoso will run in Azure Infrastructure Services, in the majority of cases for Contoso those applications will be dependent on Active Directory. For this reason, Contoso needed to make design decisions for applications that run some or all of their components in Azure Infrastructure Services.

Design decisions for application authentication and authorization revolved around the following issues:

  • Active Directory domain controller on an Azure Virtual Network
  • Read-only domain controllers
  • Domain Controller Locator 
  • Domain, forest and global catalog
  • Active Directory name resolution and Geo-distribution
  • Active Directory Federation Services

NOTE:
This section (3.6.5) addresses authentication, authorization and account management issues related to access to the application that has its web tier hosted in Azure Infrastructure Services. This section does not address issues related to authentication, authorization and account management for access to the Azure services themselves. Those issues are discussed in section 3.6.4.

The following sections discuss the design decisions Contoso made in these areas and the rationale for those decisions.

3.6.5.1 Active Directory Domain Controllers in the Public Cloud Infrastructure Provider's Network

Contoso IT needed to decide if they were going to put any domain controllers in an Azure Virtual Network to support applications that require Active Directory for authentication. In addition, they needed to decide what type of domain controller to place on the Azure Virtual Network. Contoso decided to put two domain controllers on the Azure Virtual Network to support user authentication for access to hybrid applications. This allowed users to authenticate even if the site-to-site VPN connection to the on-premises domain controllers failed.

New domain controllers were created on the Azure Virtual Network. Contoso was aware that Azure Infrastructure Services supports VM-Generation ID, which would have enabled them to create domain controllers on-premises and move them to an Azure Virtual Network, but decided that creating new domain controllers on the Azure Virtual Network would be the lower overhead option for the first phase of the project.

Given that the decision was made to put domain controllers on the Azure Virtual Network, Contoso needed to decide which disk types to use in Azure Infrastructure Services virtual machines to support the domain controller role. Contoso decided to adopt Microsoft's recommendation that the operating system be stored on an OS disk and all Active Directory-related files be stored on a Data Disk. This recommendation is based on the fact that OS disks enable write caching by default and Data Disks disable write caching by default. Active Directory assumes that its writes are committed to disk rather than held in a write cache, so the Data Disk default matches that assumption.

DESIGN DECISIONS:
-Contoso put two domain controllers on each Azure Virtual Network to support user authentication to applications
-Those domain controllers belonged to the same domain and forest as the on-premises production domain
-The new domain controllers were created on the Azure Virtual Network
-Domain controllers used the disk type recommendations provided by Microsoft
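
The disk recommendation above can be checked mechanically before a domain controller is promoted. The following is a minimal sketch, assuming the OS disk is drive C: and the Data Disk is drive F:. The drive letters and folder paths are hypothetical and are not taken from Contoso's configuration.

```python
from pathlib import PureWindowsPath


def ad_paths_on_os_disk(ad_paths, os_drive="C:"):
    """Return the names of Active Directory paths that sit on the OS disk.

    ad_paths maps a label (for example "database") to a Windows path.
    An empty result means every path is on a disk other than the OS disk.
    """
    return sorted(
        name
        for name, path in ad_paths.items()
        if PureWindowsPath(path).drive.upper() == os_drive.upper()
    )


# Hypothetical planned layout for a new domain controller.
planned = {
    "database": r"F:\NTDS",
    "logs": r"F:\NTDS\Logs",
    "sysvol": r"C:\Windows\SYSVOL",  # left on the OS disk, so it is flagged
}

print("Paths still on the OS disk:", ad_paths_on_os_disk(planned))
```

In this example only the SYSVOL path is flagged, which would prompt moving it to the Data Disk before promotion.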

The figure below provides a high level overview for application authentication. Notice that both the Internet-based client and the on-premises client connect to the web tier servers over the Internet. The web tier servers then send the credentials for authentication to the domain controllers on the Azure Virtual Network. The red arrows show the directory replication paths.

img10

3.6.5.2 Read-Only Domain Controller

Contoso decided not to use read-only domain controllers because they knew that not all applications are compatible with read-only domain controllers and they did not want to incur the overhead of testing each application to determine if it was compatible with read-only domain controllers. This helped Contoso avoid the need to deploy both read-only and read-write domain controllers.

DESIGN DECISIONS:
-Contoso did not deploy read-only domain controllers

3.6.5.3 Domain Controller Locator

Contoso wanted to minimize the impact of Active Directory replication traffic because they have to pay for Windows Azure egress traffic. To achieve this goal, they decided to configure a subnet and site for each Azure Virtual Network that they create in Azure Infrastructure Services. After creating the sites, they configured site links with costs to ensure that on-premises domain controllers see the sites that contain Azure Virtual Network-located domain controllers as very high cost. They limited replication from the Azure Virtual Network-located domain controllers to weekdays only and created a replication policy that enforced this. Finally, Contoso decided to enable compression for Active Directory replication to decrease the amount of traffic (and egress cost) that they sent over the Azure Virtual Network.

DESIGN DECISIONS:
-Contoso created a subnet and site for each Azure Virtual Network
-High-cost site links were set for links to domain controllers located on an Azure Virtual Network
-Replication policies were created that limited outbound replication to weekdays only
-Compression was enabled for Active Directory Replication
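
The site, site link, and schedule decisions above can be illustrated with a small model. This is a sketch under stated assumptions, not Active Directory's implementation: the site names, address ranges, and link costs are hypothetical, and the functions only mimic how a subnet maps an address to a site, how a lower link cost is preferred, and how a weekday-only schedule would be evaluated.

```python
import ipaddress
from datetime import datetime

# Hypothetical subnet-to-site assignments (one site per Azure Virtual Network).
SITE_SUBNETS = {
    "OnPremises-HQ": ipaddress.ip_network("10.0.0.0/16"),
    "OnPremises-Branch": ipaddress.ip_network("10.2.0.0/16"),
    "Azure-VNet1": ipaddress.ip_network("10.1.0.0/16"),
}

# Hypothetical site link costs; the link to the Azure site is deliberately high.
SITE_LINK_COSTS = {
    frozenset({"OnPremises-HQ", "OnPremises-Branch"}): 100,
    frozenset({"OnPremises-HQ", "Azure-VNet1"}): 1000,
}


def site_for_address(address):
    """Return the site whose subnet contains the address, or None."""
    ip = ipaddress.ip_address(address)
    matches = [(net.prefixlen, site) for site, net in SITE_SUBNETS.items() if ip in net]
    # Prefer the most specific (longest prefix) subnet when several match.
    return max(matches)[1] if matches else None


def cheapest_neighbor(site):
    """Return the directly linked site with the lowest link cost, or None."""
    candidates = [
        (cost, next(iter(link - {site})))
        for link, cost in SITE_LINK_COSTS.items()
        if site in link
    ]
    return min(candidates)[1] if candidates else None


def is_replication_allowed(when):
    """Weekday-only schedule: Monday (0) through Friday (4)."""
    return when.weekday() < 5


print(site_for_address("10.1.4.7"))
print(cheapest_neighbor("OnPremises-HQ"))
print(is_replication_allowed(datetime(2013, 8, 24, 12, 0)))
```

With these sample values, a client in the Azure address range resolves to the Azure site, while an on-premises domain controller prefers its lower-cost on-premises partner over the Azure-located domain controllers.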

3.6.5.4 Domain, Forest and Global Catalog

Contoso had several options to choose from for deciding what type of domain and forest configuration to use for domain controllers and resources located on an Azure Virtual Network. They decided that the Azure security model was strong enough to enable them to put full read-write domain controllers that are members of the corporate on-premises domain and forest on the Azure Virtual Network. They decided to deploy domain controllers that are part of the same domain and forest as the on-premises network because it is easier to manage and deploy and enables the greatest level of application compatibility.

Contoso IT also decided to make the domain controllers located on the Azure Virtual Network global catalog servers. The reasons for this are that Contoso wanted to support Universal Groups, enable log on for their multiple-domain forest, and enable them to take advantage of universal group membership caching.

DESIGN DECISIONS:
-All domain controllers on the Azure Virtual Network were configured as global catalog servers
-All domain controllers on the Azure Virtual Network were read/write domain controllers
-All domain controllers on the Azure Virtual Network were members of the on-premises network forest and domain

3.6.5.5 Active Directory Name Resolution and Geo-Distribution

Contoso needed to decide how to handle name resolution for resources located on the Azure Virtual Network and resources located on-premises. While Contoso was aware that Azure Infrastructure Services provided some native DNS capabilities, those capabilities were not robust enough to support Active Directory's DNS requirements, such as support for SRV records. For this reason, Contoso decided to configure the domain controllers on the Azure Virtual Networks to also be Active Directory-integrated DNS servers.

DESIGN DECISIONS:
-All domain controllers located on the Azure Virtual Network were configured as AD-integrated DNS servers
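
To show why SRV record support matters, the sketch below builds the LDAP SRV record names that a client looks up when locating a domain controller: the site-specific name first, then the domain-wide name. The DNS domain name and site name used here are hypothetical.

```python
def dc_locator_srv_names(dns_domain, site=None):
    """Return the LDAP SRV record names to query, most specific first."""
    names = []
    if site:
        # Site-specific record: domain controllers registered for this site.
        names.append(f"_ldap._tcp.{site}._sites.dc._msdcs.{dns_domain}")
    # Domain-wide record: any domain controller in the domain.
    names.append(f"_ldap._tcp.dc._msdcs.{dns_domain}")
    return names


# Hypothetical domain and site names.
for name in dc_locator_srv_names("corp.contoso.com", site="Azure-VNet1"):
    print(name)
```

A DNS service that cannot host or return these records cannot support domain controller location, which is why the domain controllers themselves were made DNS servers.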

3.6.5.6 Active Directory Federation Services (ADFS)

While Contoso didn't require federated access to applications hosted in Azure Infrastructure Services during the first phase of the hybrid cloud project, they did decide to use on-premises Active Directory accounts for logging in to Azure Infrastructure Services. In order to accomplish this, they federated their on-premises Active Directory infrastructure with Windows Azure Active Directory. Contoso plans to investigate application authentication federation when they move forward in the future to use Azure PaaS services. In addition, they plan on enabling same sign-on to Azure services by federating their on-premises directory with Azure Active Directory.

DESIGN DECISIONS:
-They will use ADFS for federating with Azure Active Directory to enable same sign-on

The final Active Directory authentication and authorization design can be summarized by the figure below. Contoso synchronized on-premises accounts with Azure directory services so that on-premises Active Directory accounts can be used to log on to the Azure portal. Active Directory Federation Services (ADFS) is used to enable Azure directory services to forward credentials for verification with the Contoso Active Directory domain controllers. Organizational Units (OUs) are used to manage the configuration of both the web tier servers and the domain controllers in the Azure Virtual Network.

img8

3.6.6 Backup Service and Disaster Recovery

In order to make sure that the company is able to recover from failures, Contoso IT needed to make design decisions around data backup and recovery in the following areas:

  • Backup services
  • Disaster recovery
  • Operating system update

The following sections discuss the design decisions Contoso made in these areas and the rationale for those decisions.

3.6.6.1 Backup Services

Contoso had a well-established backup system in place for on-premises virtual machines and application data attached to those virtual machines. They decided that they will continue to use that system for on-premises resources. However, they will not use the same system to back up virtual machines and application data for resources running on Azure Infrastructure Services because they do not want to incur the egress traffic costs.

The pilot application's web tier hosts stateless services that are simple to restore. Therefore, there is little to gain by backing up the operating systems, and there is no application data to back up. In the next phase of the project, Contoso will consider using Windows Azure Backup Services. By using Windows Azure Backup Services they will be able to back up systems on Azure Infrastructure Services without incurring additional costs from standing up more virtual machines in the Azure Virtual Network to perform backup services.

DESIGN DECISIONS:
-Contoso did not back up the stateless machines that run the pilot application's web tier

3.6.6.2 Disaster Recovery

Though disaster recovery is one of the key usage scenarios Contoso plans to evaluate in the future, they did not evaluate it during their pilot.

3.6.6.3 Operating System Update

Contoso needed to decide how they would update the Windows operating systems in the web tier virtual machines. They could have used the public Windows Update system or Windows Server Update Services (WSUS). They have a WSUS implementation on-premises and have update policies in place for DMZ systems on-premises. They decided to update the pilot application's web tier servers with their existing, on-premises WSUS servers. This will not impact Contoso IT's network traffic costs because the traffic is inbound to the web tier servers over the Azure Virtual Network, rather than outbound, and they're only charged for outbound traffic. The updates will, however, use available bandwidth over the site-to-site VPN connection, and therefore updates of the web tier servers will be scheduled for off-hours.

DESIGN DECISIONS:
-Contoso used their on-premises WSUS system to update the web tier virtual machines
-They set update policies so that updates take place during off-hours
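
The off-hours scheduling decision can be expressed as a simple time-window check. The following is a minimal sketch, assuming an off-hours window of 22:00 to 05:00. The window is a hypothetical value chosen for illustration and is not Contoso's actual policy.

```python
from datetime import time


def in_off_hours(t, start=time(22, 0), end=time(5, 0)):
    """Return True if time t falls inside the off-hours window.

    The window includes start and excludes end, and may wrap past midnight.
    """
    if start <= end:
        return start <= t < end
    # Window wraps past midnight, for example 22:00 to 05:00.
    return t >= start or t < end


print(in_off_hours(time(23, 30)))  # late evening
print(in_off_hours(time(12, 0)))   # midday
```

An update policy would only permit installation on the web tier servers while this check is true, which keeps update traffic off the site-to-site VPN connection during business hours.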

4.0 Summary

This article defined the design decisions, and the rationale for those decisions, that a fictitious organization made for its hybrid cloud infrastructure design. It selected options from the many available options that are discussed in the Hybrid Cloud Infrastructure Design Considerations article, based on the requirements that are defined in the Scenario Definition article. For organizations that have requirements and constraints similar to those of the organization discussed in the Scenario Definition article, the design decisions and rationale in this article can help decrease both the time and the risk of implementing a hybrid cloud solution. If you would like to see how Contoso implemented the hybrid cloud solution, please see the Implementation guide. If you would like to learn about other cloud architectural solutions, please visit the Cloud and Datacenter Solutions Hub.

5.0 Technologies Discussed in this Article

Windows Server 2012 DNS services
Active Directory Domain Services
Windows Azure Active Directory
Windows Azure Virtual Machines
Windows Azure Cloud Services
Windows Azure Storage
Windows Azure Storage Services
Windows Azure Recovery Service
Windows Azure Virtual Network 

6.0 Authors and Reviewers

Authors:
Thomas W. Shinder - Microsoft
Jim Dial - Microsoft

Reviewers:  
Yuri Diogenes - Microsoft
John Dawson - Microsoft

7.0 Change Log

Version   Date        Change Description
1.0       8/23/2013   Initial posting and editing complete.
1.1       9/30/2013   Changed title to hybrid cloud and changed document reference from IT to cloud