Published: June 20, 2013
Abstract: This article defines the cloud services foundation problem domain, which includes the operational processes and technical capabilities that are necessary to provide cloud computing services within an organization.
Table of Contents
- 1.1 Reference Models
- 2.0 Cloud Services Foundation Reference Model
- 2.1 Service Delivery
- 2.2 Service Operations
- 2.3 Management and Support
This article is one of several articles that are included in the Cloud Services Foundation Reference Architecture (CSFRA) article set. An article set is a collection of interrelated articles that are intended to be read as a whole, like the chapters of a book. This article assumes that you have already read the Overview article for the set, which defines what a reference architecture is, what the Cloud Services Foundation is, and describes the relationship of this article to the other articles in the set. If you have not already read the Overview article, we recommend that you do so before you read this article.
This article defines the Cloud Services Foundation Reference Model (CSFRM). The CSFRM expands on the definition of the term cloud services foundation that is included in the Overview article of this article set.
Reference models share the following characteristics:
- They represent a problem domain
- They are often defined for problem domains that are not well understood, that different people understand in different ways, or that are complex enough that they must be decomposed into lower-level entities to promote common understanding
- They often consist of a diagram of entities, the relationships between the entities, and descriptive text that clearly defines each entity and relationship in the diagram
- They are typically vendor/product-agnostic and standards-agnostic to allow for various implementations that are based on them
- They provide common terminology in the problem domain for which they’re created
- They can serve as a foundation for designing and implementing solutions in the same problem domain for which they were created
The problem domain for the Cloud Services Foundation Reference Model (CSFRM) is cloud services foundation. Although the term is defined extensively in the Overview article of this article set, the short definition is:
The minimum amount of vendor-agnostic hardware and software technical capabilities and operational processes necessary to provide information technology (IT) services that exhibit cloud characteristics, or simply, cloud services.
It’s important to note that although the problem domain is the foundation for providing cloud services, it does not include cloud services.
In addition to the attributes of reference models already listed, the CSFRM serves as a framework that can be used to help cloud services providers answer the following questions:
- What kinds of service level requirements should I define before I either design or implement a new cloud service or technical capabilities that support or enable cloud services?
- What kinds of operational processes do I require to operate a cloud service over its lifetime?
- What technical capabilities do I require to host, support, or manage cloud services?
- How will the services I provide be offered and presented to my consumers?
Figure: Cloud Services Foundation Reference Model
Many of the terms in the CSFRM diagram might look familiar to you, and you might wonder whether and how this model differs from what you have in your existing environment. Before you make any assumptions about what anything in the CSFRM diagram represents, we recommend that you read this article in its entirety for a clear explanation of everything that is represented in the diagram. After you do, the differences between what’s represented in the CSFRM diagram and what you have in your existing environment should be clear. When terms from the CSFRM diagram are used in the remainder of this article, they are capitalized to indicate that the words represent an entity from the diagram rather than being used in their ordinary sense.
The CSFRM diagram contains three types of entities:
- Subdomains: The large blue and green boxes, some of which contain components
- Components: The small boxes inside many of the subdomains
- Relationships: The arrows between subdomains
Subdomains exist in the CSFRM to:
- Divide the cloud services foundation problem domain so that each subdomain can be defined separately.
- Enable a collection of components to be referred to collectively. For example, the components in the Infrastructure subdomain are Infrastructure components.
- Enable a relationship entity to represent the relationship between all of the components in a subdomain and the components of other subdomains. As a result, the relationships between subdomains also apply collectively to the components that each subdomain contains. The relationships are represented by arrows in the model. The verb next to each arrow describes the relationship between the components in the subdomain that the arrow points from and the components in the subdomain that the arrow points to. For example, you could say that the Service Delivery subdomain components define the Service Operations subdomain components.
Although each component represents a unique entity, all components in subdomains that are represented with the same color represent the same component type. Because there are two colors of subdomains that are represented in the CSFRM diagram, there are two types of components:
- Process: Green subdomains contain components that represent IT operational processes or service requirements or both. While many of the processes that are represented in the CSFRM are similar to those found in various information technology service management (ITSM) frameworks, there is no deliberate attempt to align the CSFRM processes to any particular framework, because different organizations use different frameworks. If you attempt to correlate the CSFRM processes to processes in the ITSM framework that you use, you likely will not find one-to-one mappings for all of the processes, nor will the definitions of the CSFRM processes necessarily align with the definitions in your framework of choice. The processes in the CSFRM therefore represent only the definition that this article provides for each component. Within an organization, the CSFRM processes can be manual, automated, or a combination of both. All processes and requirements for the components in process subdomains should be defined before designing or providing a cloud service to consumers. There are two process subdomains in the CSFRM: the Service Delivery subdomain and the Service Operations subdomain.
- Technical Capabilities: Blue subdomains contain technical capabilities components, which represent the functionality that is provided by hardware devices or software applications or both. On their own, individual technical capabilities do not exhibit cloud characteristics, as defined in the Overview article of this article set; a provider aggregates the functionality of several of them to provide cloud services. There are four technical capabilities subdomains in the CSFRM: Management and Support, Infrastructure, Platform, and Software. While capabilities such as physical structures or containers, cooling, and electricity are necessary to support technical capabilities, they are not represented in the CSFRM because they’re part of a separate problem domain, one that typically has a separate audience from the CSFRM.
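The entities described above can be sketched as a small data model. The following Python sketch is illustrative only: the class names `Subdomain` and `Relationship` and the function `expand` are assumptions made for this example and are not part of the CSFRM, and only a few components from each subdomain are listed.

```python
from dataclasses import dataclass, field


@dataclass
class Subdomain:
    name: str
    component_type: str  # "process" or "technical capability"
    components: list[str] = field(default_factory=list)


@dataclass(frozen=True)
class Relationship:
    source: str  # name of the subdomain the arrow points from
    verb: str    # verb shown next to the arrow
    target: str  # name of the subdomain the arrow points to


def expand(relationship, subdomains):
    """Apply a subdomain-level relationship to every pair of components."""
    source = subdomains[relationship.source]
    target = subdomains[relationship.target]
    return [
        f"{a} {relationship.verb} {b}"
        for a in source.components
        for b in target.components
    ]


# Only a few components from each subdomain are listed, for brevity.
subdomains = {
    "Service Delivery": Subdomain(
        "Service Delivery", "process",
        ["Capacity Management", "Service Level Management"],
    ),
    "Service Operations": Subdomain(
        "Service Operations", "process",
        ["Request Fulfillment", "Change Management"],
    ),
}
defines = Relationship("Service Delivery", "defines", "Service Operations")
statements = expand(defines, subdomains)
```

Because the relationship is declared once between subdomains, it applies collectively to every component pair, which mirrors how a single arrow in the diagram stands for many component-level relationships.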
The remaining sections of this article define each of the process and technical capabilities subdomains and their components in detail. In this article, the definitions for each subdomain and component are brief. The CSFRA article set, to which this article belongs, also includes detailed planning articles for many of the process and technical capabilities subdomains that are represented in the CSFRM.
Solution guidance that uses Microsoft products and technologies to implement the technical capabilities that are defined in this article and solution guidance for implementing various cloud services are provided separately from this article set and are available at the Cloud and Datacenter Solutions Hub.
The components in the Service Delivery subdomain serve as the conduit for translating consumer requirements into cloud services and for the provider to manage delivery of the services against those requirements throughout the service lifecycle. This subdomain contains components that represent:
- Service level requirements that must be defined when:
- Designing technical capabilities implementations where the provider owns or manages the technical capabilities that enable the service
- Evaluating services provided by an external provider that owns or manages the technical capabilities that enable the service
- Processes that must be defined to ensure that a service meets its service level requirements throughout its lifecycle
It’s critical to define and measure cloud services requirements as specifically as possible to ensure ongoing customer satisfaction with the service. The components in this subdomain are defined in the following sections.
This component represents the following:
- Ongoing communication with consumers to determine if new services should be added or if existing services should be changed or deprecated. The list of services available to consumers is maintained in a service catalog, which is discussed further in the Consumer and Provider Portal component section of this article.
- The functional and service level requirements for services. Service level requirements are each represented in detail by other components in the Service Delivery subdomain.
- Defining and measuring customer satisfaction level targets for a service
- The plan for how the functional and service level requirements and customer satisfaction levels will be met. The plan typically includes clearly defined requirements and consistently applied processes from the Service Delivery and Service Operations subdomains, and often relies heavily on the Service Reporting component to monitor adherence to service level requirements. The Business Relationship Management component has a close relationship to the Service Lifecycle Management and Service Level Management components.
This component represents the following:
- The capacity requirements for scale and performance that the service must meet at daily, weekly, monthly, and annual intervals. Both peak and average requirements should be specified for each interval. Capacity requirements should also call out any specific peak times such as month-end, specific holidays, or year-end capacity requirements.
- The plan for how the requirements are to be met. The plan defines how technical capabilities that enable the service are scaled to meet capacity, whether the capabilities are owned and managed by the organization or an external provider or both. The plan applies the organization’s Request Fulfillment process and depends on the Fabric Management and Deployment and Provisioning components to provision new capacity.
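As a rough illustration of the capacity requirements described above, the following Python sketch derives peak and average values for one interval and estimates how many units of capacity the peak implies. The function names, the sample values, and the per-unit capacity of 100 are all hypothetical.

```python
import math
from statistics import mean


def capacity_requirement(samples):
    """Return the peak and average load for one interval."""
    if not samples:
        raise ValueError("at least one sample is required")
    return {"peak": max(samples), "average": mean(samples)}


def units_needed(peak_load, unit_capacity):
    """Return how many capacity units are needed to serve the peak load."""
    return math.ceil(peak_load / unit_capacity)


# Hypothetical requests-per-second peaks for each day of one week.
weekly_samples = [120, 135, 128, 410, 140, 132, 125]
requirement = capacity_requirement(weekly_samples)
servers = units_needed(requirement["peak"], unit_capacity=100)
```

The gap between the peak and the average in this sample shows why both values should be specified: sizing for the average alone would leave the service short on its busiest day.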
This component represents the following:
- The availability level that the service adheres to after it’s available for use by consumers. This is often expressed as the percentage of time that the service is available during some timeframe, such as per month. The availability level should specify whether it includes planned downtime. The service provider might further qualify the requirement with text such as “during certain hours or days” or “under normal conditions”.
- The plan for how the requirements are to be met. In defining the availability level that the service can adhere to and at what cost, the service provider typically evaluates its ability to provide:
- Availability: This is generally thought of as the availability level of the service under normal conditions and includes the expected downtime from normal failures, such as a single-server failure.
- Continuity: This is generally thought of as the availability level of the service under abnormal or disaster conditions, and includes the expected downtime from abnormal failures, such as an entire data center failure.
The cost to provide service levels under both normal and abnormal failure conditions can vary greatly. The provider generally iterates its design to reach an ultimate availability requirement that strikes the right balance between the availability level consumers desire and what they’re willing to pay for that availability level. To meet availability and continuity requirements for a service, the plan may utilize technical capabilities owned and managed by the organization or an external provider or both.
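An availability percentage can be converted into a downtime budget, which often makes the requirement easier to discuss with consumers. The following Python sketch assumes a 30-day month and treats all downtime alike; whether planned downtime counts against the budget is a decision the provider must state separately. The function names are hypothetical.

```python
MINUTES_PER_DAY = 24 * 60


def allowed_downtime_minutes(availability_percent, period_days=30):
    """Return the downtime budget implied by an availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    total_minutes = period_days * MINUTES_PER_DAY
    return total_minutes * (100 - availability_percent) / 100


def measured_availability(downtime_minutes, period_days=30):
    """Return the availability percentage achieved over the period."""
    total_minutes = period_days * MINUTES_PER_DAY
    return 100 * (total_minutes - downtime_minutes) / total_minutes


budget = allowed_downtime_minutes(99.9)
achieved = measured_availability(downtime_minutes=30)
```

Under these assumptions, a 99.9 percent target allows roughly 43 minutes of downtime in a 30-day month, so a month with 30 minutes of downtime stays within the target.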
This component represents the following:
- The confidentiality, integrity, and availability requirements of the data that is processed by the service. While each service might have unique requirements, a standard set of requirements that is used throughout the organization generally applies as well.
- The plan for how these requirements are to be met by using the Access Management, Authentication, Authorization, and Data Protection components.
This component represents the following:
- The regulatory compliance requirements for the service. The specific compliance requirements vary across countries and industries and might apply to every service in an organization, specific services in an organization, or specific data in individual services. Examples of regulations and standards that might influence the design or selection of a public service are the Health Insurance Portability and Accountability Act (HIPAA), the Payment Card Industry Data Security Standard (PCI DSS), the Sarbanes-Oxley Act (SOX), European Union Directive 95/46/EC, International Convergence of Capital Measurement and Capital Standards (Basel II), the United Kingdom Data Protection Act 1998, and others.
- The plan for how these requirements are to be met by using the Access Management, Authorization, Authentication, Data Protection, and Service Reporting components.
This component represents the following:
- Definitions of the cost to provide the service and what consumers pay for the service. The cost to provide the service can’t be finalized without incorporating the cost to meet requirements from the following components:
- Capacity Management
- Availability and Continuity Management
- Information Security Management
- Regulatory Policy and Compliance Management
- The plan for how these requirements are to be met. The service provider must continually evaluate its costs to provide the service and adjust the cost of the service to its consumers as appropriate. A key goal for most service providers is to minimize their cost of delivering a service to their consumers. The Usage and Billing component is a key enabler for recovering costs to provide the service, while efficient Service Operations processes and the Process Automation component are key enablers for decreasing the cost of providing a service.
This component represents the following:
- Requirements for many of the processes that are defined in the Service Operations subdomain components in this article.
- Requirements for many of the technical capabilities that are defined in the Management and Support subdomain components in this article.
- The plan for how these requirements are to be met. While the processes and technical capability requirements might differ across services, they generally do not dramatically differ, because most organizations define a standard, and then try to adhere to it as much as possible.
This component is a key enabler for customer satisfaction and results in the service level agreement (SLA) for the service, which is created from the outcomes of many of the components in this subdomain. This component has a close relationship to both the Business Relationship Management component and all Service Operations components.
This component represents the overall lifecycle processes of each service from requirements definition through design and implementation, operations with the Service Operations components, and eventual retirement. This component has a close relationship to both the Business Relationship Management component and all Service Operations components.
The components in the Service Operations subdomain represent the processes that are applied to each service to ensure that it continuously meets the requirements that are defined by the components in the Service Delivery subdomain. Although many organizations define each of these components as standardized processes, the specific application of the processes often varies across services. The Management and Support components support the components of this subdomain.
The following sections provide high-level detail about each of the process components in this subdomain. While many of the processes might be similar to the processes used to operate services without cloud characteristics in an organization, there are a few notable differences when operating services with cloud characteristics:
- Processes are typically highly automated to minimize human error and labor
- Consistent application of the processes is critical to meeting service level requirements. While this might state the obvious, many organizations that don’t provide cloud services to their consumers might not be as diligent in their operational processes as organizations that do provide cloud services.
This component defines how consumer requests for service are fulfilled. The process defines requirements for how the Consumer and Provider Portal, Deployment and Provisioning, and Fabric Management components are used to fulfill requests, and also defines how Infrastructure, Platform, and Software technical capabilities will be acquired from external vendors. This process is a key enabler for meeting the requirements represented by the Capacity Management component.
This component defines how the requirements for the service from the Information Security Management component are applied and complied with. The Authentication, Authorization, and Data Protection components support the Access Management component.
This component defines:
- How software and hardware assets that provide technical capabilities are managed by the organization, from asset acquisition to asset retirement.
- The assets that enable services and the configuration items (CIs) for each asset that are managed by the Change Management component for each service. Changes to some CIs of an asset may impact the SLA for the service, while others do not. Only the CIs that impact the SLA of the service are managed by the Change Management component.
- The value of the desired state for each CI of each service, how the current value of the CI will be evaluated against the desired state, and what the mitigation process is when the current value of the CI is not the desired state value.
The outcomes of this component define requirements for the Service Management and Configuration Management components and are utilized by the Change Management component.
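The comparison of current CI values against their desired state can be sketched as follows. This is a minimal Python illustration, assuming that CIs are simple key-value pairs; the CI names and values are hypothetical, and the mitigation step that follows detection is left out.

```python
def find_drift(desired, current):
    """Return every CI whose current value differs from its desired state."""
    return {
        ci: {"desired": want, "current": current.get(ci)}
        for ci, want in desired.items()
        if current.get(ci) != want
    }


# Hypothetical CIs for one service.
desired_state = {"tls_min_version": "1.2", "replica_count": 3, "patch_level": "2013-06"}
current_state = {"tls_min_version": "1.2", "replica_count": 2, "patch_level": "2013-06"}

drift = find_drift(desired_state, current_state)
```

Each entry in `drift` identifies a CI that needs mitigation, which in practice might mean raising a change request or triggering an automated repair.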
This component represents the daily, weekly, monthly, and as-needed tasks that are required for the service to meet SLAs throughout its lifecycle. As many tasks as possible are often automated with the Process Automation component.
This component represents the process that enables the provider to determine and weigh the benefits of making a change to the proposed state of CIs of a service against the risk of not being able to meet the SLA metrics for the service. The change requests for which the benefits outweigh the risks are approved. This process is supported by the Service Level Management and Configuration Management components, and utilizes the CIs defined by the Configuration Management component.
This component represents knowledge about the service that is to be gathered, analyzed, stored, and shared throughout the organization. This knowledge is often maintained by the Service Management component and used by the Incident and Problem Management component.
This component represents the process that defines how change requests that are approved by the Change Management component are tested and deployed to production with the Deployment and Provisioning component, while still enabling the service to meet its SLAs.
This component represents the process for:
- Resolving incidents: Incidents are disruptive or potentially disruptive events to a service. The goal of this process is to resolve the incident with maximum speed and minimum service disruption; identifying the underlying root cause of the incident is not a goal of this process.
- Identifying and resolving problems: Problems are the underlying root cause of one or more incidents. Problems are often more difficult to identify and resolve than incidents are, but their resolution can prevent future incidents.
The Service Management component is used to help identify both resolutions to incidents and recurring incidents that require problem resolution. The individual who resolves the incident might be a member of the service’s support staff, or might be a consumer who used the Consumer and Provider Portal component to resolve the issue themselves. Problems are often resolved by a subject matter expert or a team of subject matter experts who have a deep understanding of the technical capabilities that enable the service.
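One simple way to spot recurring incidents that may point to an underlying problem is to count incidents by symptom. The following Python sketch assumes that each incident record carries a `symptom` field and that three or more occurrences warrant problem investigation; both the field name and the threshold are hypothetical.

```python
from collections import Counter


def problem_candidates(incidents, threshold=3):
    """Return symptoms that recur often enough to suggest an underlying problem."""
    counts = Counter(incident["symptom"] for incident in incidents)
    return sorted(symptom for symptom, n in counts.items() if n >= threshold)


# Hypothetical incident records.
incidents = [
    {"id": 1, "symptom": "disk full"},
    {"id": 2, "symptom": "login timeout"},
    {"id": 3, "symptom": "disk full"},
    {"id": 4, "symptom": "disk full"},
]
candidates = problem_candidates(incidents)
```

Counting by symptom only flags candidates; identifying and resolving the root cause still requires the subject matter expertise described above.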
The technical capabilities components within the Management and Support subdomain manage and support on-premises technical capabilities that host, manage, or support private, externally consumed public, or hybrid cloud services. The requirements that these capabilities meet are defined by both the Service Delivery and Service Operations components in the environment. The components also provide data to the Consumer and Provider Portal component that enables consumers to monitor whether the provider adhered to the SLA requirements defined by the Service Level Management component. When you select technologies to implement these components, keep in mind that a single technology may provide the functionality of multiple components, that multiple technologies may be required to provide a single component, and that multiple technologies may provide the same component in different ways.
This component is the consumers' interface to the services that are available in their environment. It includes:
- The service catalog, which lists the available services, the SLAs that the services offer, and the consumption cost for the services. New services aren’t listed in the service catalog until they meet SLA requirements that are defined by the Service Level Management component. An organization may consume a service from an external provider that manages the technical capabilities that enable the service. The consuming organization may aggregate the service capabilities with technical capabilities that it manages and then provide the service to consumers within its own organization. This is often referred to as a hybrid deployment model for a service. Alternatively, an organization may provide a service to consumers within its own organization that is enabled solely with technical capabilities managed by the organization. This is often referred to as a private cloud deployment model for a service.
- A self-service interface to the Deployment and Provisioning and Fabric Management components that allows consumers to provision new capacity as required.
- A reporting interface that exposes information from both the Usage and Billing and Service Monitoring components.
- Incident and problem resolution data from the Service Management component so that consumers can resolve issues themselves if they choose.
This component also provides a portal for the provider to manage the services that they provide to their consumers.
This component collects usage data and presents it in the Consumer and Provider Portal component. This enables consumers to understand the number of consumption units that they used for a service over a given time period and what that usage cost. Many enterprise IT organizations might use this data to provide “show back” reports instead of actual billing to consumers, so that consumers can understand how many of the organization’s resources they consume each month. Hosting service provider (HSP) organizations use this data for customer billing.
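A show-back report of the kind described above can be sketched by aggregating consumption units per consumer and applying a unit price. In the following Python sketch, the record layout, the consumer names, and the flat unit price are assumptions made for illustration; real pricing models are usually more involved.

```python
from collections import defaultdict
from decimal import Decimal


def show_back(usage_records, unit_price):
    """Total the consumption units per consumer and price them."""
    totals = defaultdict(int)
    for consumer, units in usage_records:
        totals[consumer] += units
    return {
        consumer: {"units": units, "cost": units * unit_price}
        for consumer, units in totals.items()
    }


# Hypothetical (consumer, consumption units) records for one month.
records = [("finance", 100), ("hr", 40), ("finance", 20)]
usage_report = show_back(records, unit_price=Decimal("0.05"))
```

`Decimal` is used for the price so that monetary values are not subject to binary floating-point rounding. The same totals could feed either a show-back report or actual billing.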
This component consumes Service Monitoring data and produces reports that describe the actual service level metric values exhibited by a service over regular time intervals. The report data can be compared to SLAs to determine whether the service met its SLAs during the reporting interval that is specified in the SLA. The data from this component is provided to consumers through the Consumer and Provider Portal component. This component is a primary enabler for the Service Level Management and Business Relationship Management components.
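Comparing reported metric values to SLA targets can be sketched as follows. This Python example assumes that each SLA target is either a minimum or a maximum bound on a metric; the metric names and values are hypothetical.

```python
def sla_report(targets, measured):
    """Return, per metric, whether the measured value met its target.

    Each target is a (bound, value) pair, where bound is "min" or "max".
    """
    results = {}
    for metric, (bound, target) in targets.items():
        value = measured[metric]
        met = value >= target if bound == "min" else value <= target
        results[metric] = {"target": target, "measured": value, "met": met}
    return results


# Hypothetical SLA targets and values measured over one reporting interval.
targets = {"availability_percent": ("min", 99.9), "response_ms": ("max", 200)}
measured = {"availability_percent": 99.95, "response_ms": 240}
sla_results = sla_report(targets, measured)
```

In this sample the availability target is met while the response-time target is missed, which is the kind of result a consumer would expect to see in the portal for the reporting interval.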
This component monitors service levels of all technical capabilities that are used to provide each cloud service. The Service Reporting component consumes the data from this component. Optimally, the Service Monitoring component is able to integrate with the Service Management component so that it can auto-generate incidents based on defined criteria.
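Auto-generating incidents from monitoring data based on defined criteria can be sketched as a threshold check. In the following Python sketch, the criteria are simple upper limits per metric, and the capability names, metric names, and limits are hypothetical; a real integration would hand the resulting records to whatever technology implements the Service Management component.

```python
def generate_incidents(samples, criteria):
    """Return one incident per sample that exceeds its defined limit."""
    incidents = []
    for sample in samples:
        limit = criteria.get(sample["metric"])
        if limit is not None and sample["value"] > limit:
            incidents.append({
                "capability": sample["capability"],
                "metric": sample["metric"],
                "value": sample["value"],
                "limit": limit,
            })
    return incidents


# Hypothetical criteria and monitoring samples.
criteria = {"cpu_percent": 90, "disk_used_percent": 85}
samples = [
    {"capability": "compute-01", "metric": "cpu_percent", "value": 97},
    {"capability": "storage-01", "metric": "disk_used_percent", "value": 60},
]
new_incidents = generate_incidents(samples, criteria)
```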
This component supports most of the Service Operations components and integrates data from the following components:
- Incident and Problem Management
- Configuration Management
- Service Monitoring
- Service Reporting
The data integrated by the Service Management component is typically exposed through the Consumer and Provider Portal component so that various individuals can view it.
This component is the definitive store of all of the CIs in the environment and their relationships. The component might be a physical or logical store that correlates data across several different configuration management databases (CMDBs), each of which manages a subset of the CI data. Ideally, the component can also monitor CIs within the environment to identify, report, and repair CIs when their values change from their defined desired states. This component directly supports the Change Management and Asset and Configuration Management components.
This component denies or grants access to technical capabilities components based on access controls that are assigned to entities that have been authenticated by the Authentication component. As a result, this component has a relationship to the Authentication component that enables it to confirm whether the entity that requested authorization to a resource has been authenticated. Federation of the Authorization component across different cloud service providers is particularly valuable for providing the most seamless experience to end users of services, because it minimizes the number of authorization strategies that must be synchronized across cloud service providers. This component directly supports the Access Management and Information Security Management components.
This component validates that an entity is who or what it claims to be based on some form of proof. In its simplest form, proof can be a user name and password. Authentication could also rely on digital certificates for proof. The entity that’s authenticated might be a human user or a software application that has to interact with another software application. Federation of the Authentication component across different cloud service providers is particularly valuable for providing the most seamless experience to end users of services, because it minimizes the number of times that they must authenticate to various services.
This component stores entities and their attributes in a directory. The entities could be of a variety of types such as computers or user identities. The component also provides a mechanism to query the directory for the attributes of entities. For example, the Authentication component might query a directory for the user name and password attributes of entities that attempt to authenticate to it.
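The interaction between the Directory, Authentication, and Authorization components can be sketched as follows. This Python example is a minimal illustration under stated assumptions: the `Directory` class, the `authenticate` and `authorize` functions, and the permission names are hypothetical, and the directory stores a salted password hash rather than the password itself, which is a design choice of this sketch and not something the CSFRM prescribes.

```python
import hashlib
import hmac
import os


def hash_password(password, salt):
    """Derive a salted hash so that the directory never stores the password."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


class Directory:
    """Stores entities and their attributes, and answers attribute queries."""

    def __init__(self):
        self._entries = {}

    def add(self, name, password, permissions):
        salt = os.urandom(16)
        self._entries[name] = {
            "salt": salt,
            "hash": hash_password(password, salt),
            "permissions": set(permissions),
        }

    def lookup(self, name):
        return self._entries.get(name)


def authenticate(directory, name, password):
    """Validate the proof (a password) that an entity presents."""
    entry = directory.lookup(name)
    if entry is None:
        return False
    candidate = hash_password(password, entry["salt"])
    return hmac.compare_digest(entry["hash"], candidate)


def authorize(directory, name, is_authenticated, permission):
    """Grant access only to authenticated entities that hold the permission."""
    entry = directory.lookup(name)
    if not is_authenticated or entry is None:
        return False
    return permission in entry["permissions"]


directory = Directory()
directory.add("alice", "correct horse", ["provision_vm"])
is_alice = authenticate(directory, "alice", "correct horse")
may_provision = authorize(directory, "alice", is_alice, "provision_vm")
```

Authorization takes the authentication result as an input, which reflects the dependency described above: an entity that has not been authenticated is denied regardless of the access controls assigned to it.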
This component protects data in the event of data corruption or loss or underlying storage capability failures, or any combination of these events. It directly supports data retention policies that support the Information Security Management, Availability and Continuity Management, and Regulatory Policy and Compliance Management components.
This component executes the outcomes of the Release and Deployment Management component for each service to provision new units of capacity on an Infrastructure fabric. As a result, this component often interacts with the Fabric Management component to provision new Infrastructure components to support service capacity needs. This component directly supports the Capacity Management and Request Fulfillment components.
This component coordinates automated processes across multiple Management and Support and Infrastructure components. It ensures that processes are completed in accordance with their defined tasks. This component directly supports automation of many of the Service Delivery and Service Operations components.
This component controls all Infrastructure components through the Virtualization component. The Infrastructure components aren’t referred to as a “fabric” unless they’re managed with a Fabric Management component. This component directly supports the Capacity Management and Request Fulfillment components and directly interacts with the Network Support component to ensure network accessibility of Infrastructure components. The Deployment and Provisioning component often interacts with this component to support service capacity needs.
This component includes functionality that enables the use of the network protocols that Infrastructure components use to communicate with each other and with other devices. It typically includes functionality such as Dynamic Host Configuration Protocol (DHCP) for Internet Protocol (IP) address assignment and management, Domain Name System (DNS) for name and IP address resolution, and Preboot Execution Environment (PXE) to enable a network interface-based boot of the Compute component without direct-attached storage (DAS) or an operating system. This capability directly supports the Infrastructure components.
The components within the Infrastructure subdomain represent the technical capabilities that you manage that are required to host on-premises Platform, Software, and Management and Support technical capabilities components. You might use these components to enable IaaS services that you provide to consumers in the public, private, community, or hybrid deployment models [1]. You may also consume IaaS services provided by an external provider that manages the infrastructure components for IaaS services. The requirements that these components meet are driven by all subdomains within the CSFRM, but are generally heavily standardized, both to facilitate automation in the environment and to optimize volume purchases of hardware and software.
This component abstracts some, but rarely all, of the functionality of the individual Compute, Network, and Storage components. For example, a physical server has basic input/output system (BIOS) settings. In modern servers, one of the available BIOS settings enables you to turn on or off hardware-assisted virtualization functionality, which typically must be turned on for the server to run any modern hypervisor software. The hypervisor is a key enabler to provide a virtual machine service. Even if the hardware-assisted virtualization setting could be virtualized, it likely wouldn’t be, because if the setting weren’t enabled, the virtual machine service couldn’t even be provided. As a result, the functionality that is virtualized is only the functionality that is necessary to provide the requested service to the consumer. Even though virtualization is not required to provide cloud services, it is a key enabler for doing so, and therefore, is represented as a component in the CSFRM.
This component represents physical servers, which include resources such as processors, graphics processing units (GPUs), random access memory (RAM), network interfaces, and the storage necessary to host hypervisor software for the physical server.
This component represents the physical network switches, routers, firewalls, and cabling. It also represents logical networking constructs, including virtual local area networks (VLANs), access control lists (ACLs), quality of service (QoS), and the network interfaces defined in converged network architectures.
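To make the logical constructs above more concrete, here is a minimal sketch of how an access control list scoped to a VLAN might be evaluated. The `AclRule` type and `evaluate` function are hypothetical, and the first-match-wins rule with an implicit final deny is a common convention that is assumed here, not something this article specifies.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class AclRule:
    action: str   # "permit" or "deny"
    source: str   # source network in CIDR notation
    vlan_id: int  # VLAN that the rule applies to


def evaluate(rules, source_ip, vlan_id):
    """Return the action of the first matching rule (assumed convention)."""
    address = ipaddress.ip_address(source_ip)
    for rule in rules:
        if rule.vlan_id == vlan_id and address in ipaddress.ip_network(rule.source):
            return rule.action
    return "deny"  # assumed implicit deny when no rule matches


rules = [
    AclRule("deny", "10.0.0.0/24", vlan_id=10),
    AclRule("permit", "10.0.0.0/8", vlan_id=10),
]
```

Because the first match wins, the narrower deny rule must be listed before the broader permit rule for it to have any effect.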
This component represents physical storage that Compute devices access over a network. The data often stored on this component includes virtual machine hard disk files, consumer data, or both.
Although the Infrastructure and Management and Support components that are necessary to provide services in the infrastructure as a service (IaaS) service model¹ exist in the CSFRM, they are not represented solely to enable IaaS services. They're represented because they're necessary to enable any type of cloud service.
Platform components are aggregated with Infrastructure and Management and Support components to provide platform as a service (PaaS) services¹. This subdomain does not include any technical capabilities components because Platform components are not critical to providing services in every service model, whereas the other technical capabilities components that are represented in the CSFRM are. Platform technical capabilities components are, however, critical to providing platform services. If you provide platform services in your environment, either with platform technical capabilities that you manage or with platform technical capabilities managed by an external provider, you can add Platform components to the CSFRM, as appropriate. Examples of platform services that you might provide in your organization are data, media, and service bus. Platform components are sometimes provided as services that are consumed by Software capabilities, but they may also be consumed directly by end users.
Software components are aggregated with Infrastructure, Management and Support, and sometimes Platform components to provide software as a service (SaaS) services¹. This subdomain does not include any components because Software components are not critical to providing services in every service model, whereas the other components that are represented in the model are. Software technical capabilities components are, however, critical to providing software services. If you provide software services in your environment, either with software technical capabilities that you manage or with software technical capabilities managed by an external provider, you can add Software components to the CSFRM, as appropriate. Examples of software services that you might provide in your organization are email, calendaring, customer relationship management, enterprise resource planning, unified communications, and collaboration. Software services are usually consumed by end users.
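The way subdomains aggregate into service models can be summarized as a small lookup. The following sketch encodes only the relationships stated in this article; the names `SERVICE_MODELS` and `missing_subdomains` are hypothetical, and Platform is left out of the SaaS requirement because the article says it is only sometimes needed.

```python
# Subdomains that the article describes as necessary for any cloud service.
FOUNDATION = frozenset({"Infrastructure", "Management and Support"})

# Subdomains aggregated for each service model, as described in the text.
SERVICE_MODELS = {
    "IaaS": FOUNDATION,
    "PaaS": FOUNDATION | {"Platform"},
    "SaaS": FOUNDATION | {"Software"},  # Platform is optional for SaaS
}


def missing_subdomains(service_model, available):
    """List the subdomains still needed to offer the given service model."""
    return sorted(SERVICE_MODELS[service_model] - set(available))
```

For example, an environment that has only Infrastructure and Management and Support components can offer IaaS services, but `missing_subdomains("PaaS", ...)` for that environment reports `Platform` as the gap.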
This article introduced the Cloud Services Foundation Reference Model, which defines common terminology for subdomains and components within the cloud services foundation problem domain. You might have some or all of the components in your environment today, but you might not use them in a manner that enables you to provide services that exhibit cloud characteristics. Providing cloud services requires more than common terminology, though. It also requires a guiding set of principles, concepts, and patterns that can be applied to the implementation of a cloud services foundation. This information is provided in the Principles, Patterns, and Concepts article in this article set. We recommend that you read that article next. You may also want to return to the Cloud Services Foundation Reference Architecture - Overview article or obtain additional architectural and solutions content by visiting the Cloud and Datacenter Solutions Hub.
¹ Refers to terminology from the NIST Definition of Cloud Computing. Terminology from this definition is used throughout this article but is not defined further within this article.
|1.0||6/20/2013||Initial posting and editing.|