Rethinking Enterprise Storage - Imagining the possibilities with hybrid cloud storage

04/02/2015

Recently, Microsoft published a book titled Rethinking Enterprise Storage – A Hybrid Cloud Model. The book takes a close look at an innovative infrastructure storage architecture called hybrid cloud storage.

Over the past several weeks on this blog, we have published excerpts from each chapter of this book in a series of posts. Last week we published an excerpt from Chapter 6. This week we provide an excerpt from Chapter 7, Imagining the possibilities with hybrid cloud storage. After this post we will wrap things up and provide you with a handy glossary of terms for hybrid cloud storage. We think this is valuable information for all IT professionals, from executives responsible for determining IT strategies to administrators who manage systems and storage. We would love to hear from you, and we encourage your comments, questions, and suggestions.

As you read this material, we also want to remind you that the Microsoft StorSimple 8000 series provides customers with an innovative and game-changing hybrid cloud storage architecture, and it is quickly becoming a standard for many global corporations that are deploying hybrid cloud storage. You can learn more about the StorSimple 8000 series here: www.microsoft.com/storsimple

Here are the chapters we have excerpted in this blog series:

Chapter 1 Rethinking enterprise storage

Chapter 2 Leapfrogging backup with cloud snapshots

Chapter 3 Accelerating and broadening disaster recovery protection

Chapter 4 Taming the capacity monster

Chapter 5 Archiving data with the hybrid cloud

Chapter 6 Putting all the pieces together

Chapter 7 Imagining the possibilities with hybrid cloud storage

That’s a Wrap! Summary and glossary of terms for hybrid cloud storage

So, without further ado, here is an excerpt from Chapter 7 of Rethinking Enterprise Storage – A Hybrid Cloud Model.

Chapter 7 Imagining the possibilities with hybrid cloud storage

Cloud computing has enormous potential as an infrastructure technology, but as the new kid on the IT block, it has a lot of catching up to do to match the enormous legacy of on-premises technologies and practices. Therein lies the importance of hybrid cloud computing—the integration of on-premises and cloud infrastructures to achieve the best of both worlds. Over the coming decade, hybrid cloud computing will surprise people in the directions it takes and the solutions that sprout from it. This chapter takes a hypothetical look at the future of hybrid cloud storage and what role the Microsoft hybrid cloud storage (HCS) solution may play in it.

Thanks to VMs, everything done in data centers today can be done in the cloud tomorrow

Servers, storage, networking, and management applications can all be provided in Infrastructure-as-a-Service (IaaS) offerings from cloud service providers (CSPs). The services that CSPs offer are continually evolving, with much of the progress coming in the form of instantly available virtual environments. For example, Microsoft Azure enables IT teams to easily and quickly create and manage VMs, storage, and virtual network connections using Windows PowerShell scripts or the Microsoft Azure browser management portal.

VMs have become the granular infrastructure building blocks of corporate data centers.

With VMs everywhere on premises and VMs everywhere in the cloud, it follows that effective VM portability across the hybrid cloud boundary will be an important enabler of hybrid infrastructures. IT teams want to copy successful VM implementations between their data centers and the cloud, where they can be run with different goals and under different circumstances. VM portability provides the flexibility to change how processing is done and is a guard against being locked in by any one CSP. System Center 2012 App Controller is a management tool that automates the process of uploading and installing VMs in Microsoft Azure, and it is an excellent example of the progress being made to integrate cloud and on-premises data centers.

There will always be differences between the things that on-premises and cloud datacenters do best. Certain types of applications and data are likely going to stay on premises, while others can only be justified economically in the cloud. Then there will be everything else that probably could be run either on premises or in the cloud. The final decision on those will be made based on cost, reliability, and security.

Infrastructure virtualization

Abstracting physical systems as VMs started a revolution in computing that continues today with cloud computing. The initial breakthrough technology for VMs was VMware’s ESX hypervisor, a special operating system that allowed other guest operating systems to run on it as discrete, fully-functioning systems. IT teams used ESX to consolidate many server instances onto a single physical server, dramatically reducing the number of physical servers and their associated footprint, power, and cooling overhead. There are several hypervisors in use today, including VMware’s ESXi and Hyper-V, which is part of both Windows Server 2012 and Microsoft Azure.

But virtualization technologies have been around for much longer than ESX. The technology was first invented for mainframes, and both virtual networking and virtual storage were well-established when the first VMs from VMware were introduced. Virtualization is one of the most important technologies in the history of computing and will continue to be.

In addition to VMs, virtual switches (v-switches) and virtual storage appliances (VSAs) were also developed to run on hypervisors in server systems. Of these three virtualized infrastructure technologies, VSAs have been the least successful at imitating the functionality of their corresponding hardware systems. This is not so hard to understand considering the performance challenges of running a storage system developed for specialized hardware on a PC-based hypervisor.

However, hypervisors in the cloud are different, simply by virtue of where they run, and are much more likely to attract the interest of storage vendors. The most successful infrastructure transitions tend to be those that require the least amount of change, and storage vendors will want to sell VSA versions of their on-premises products to ensure that customers making the transition to the cloud will continue to use their technologies, whether they are products or services.

Orchestrating clouds

IT teams are always looking for ways to manage their infrastructures more efficiently. Installation wizards simplify deployment by defining resources and initiating operations and connections between them. But wizards are usually limited in scope to automating the deployment of a single product or service. The next level of automation that coordinates the deployment of multiple products and services is called orchestration. Orchestration automates multiple complex setup and configuration tasks for technologies that work together to form a solution.

Orchestration is an excellent example of a relatively new technology area with enormous potential for managing hybrid cloud computing environments. As orchestration matures, the breadth and depth of the automation it provides will expand. Eventually, orchestration may be able to create complete virtual data centers by defining all the VMs, storage volumes, virtual networks, data management processes, and policies on both sides of the hybrid cloud boundary.

For example, orchestration could be used to create a group of VM servers for several departments, the VLANs they use to access a CiS system on-premises, the storage volumes on the CiS system, the Microsoft Azure storage bucket to expand the capacity for these file servers, and the cloud snapshot policies for daily data protection and end of month data archiving.
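One way to picture the example above is as a declarative plan that an orchestration tool walks in dependency order. The following is only a minimal sketch: the resource names, the dependencies between them, and the provision callback are hypothetical illustrations, not part of the Microsoft HCS solution or any real orchestration product.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical plan: each resource maps to the resources it depends on.
PLAN = {
    "azure_storage_bucket": set(),
    "cis_volumes": {"azure_storage_bucket"},
    "vlans": set(),
    "file_server_vms": {"vlans", "cis_volumes"},
    "daily_snapshot_policy": {"cis_volumes"},
    "monthly_archive_policy": {"cis_volumes"},
}


def deployment_order(plan):
    """Return resource names in an order that satisfies every dependency."""
    return list(TopologicalSorter(plan).static_order())


def orchestrate(plan, provision):
    """Call provision(name) for each resource, dependencies first."""
    completed = []
    for name in deployment_order(plan):
        provision(name)  # assumed callback that creates one resource
        completed.append(name)
    return completed
```

Calling orchestrate(PLAN, print) would simply print the resources in a valid deployment order. A real tool would replace the callback with calls into each product's own management interface.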

Managing data growth in a hybrid cloud

Organizations using hybrid cloud designs will expect to limit the costs of running their own corporate data centers. Considering the cost of storage and the rate of data growth, it follows that most of the data growth needs to be absorbed by cloud storage while maintaining steady storage capacity levels on premises. The Microsoft HCS solution provides an excellent way to limit capacity growth on-premises by deduplicating primary storage and using the cloud as a tier for low-priority data. There is nothing particularly futuristic about that, however, because the solution does that already.
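The combination of deduplication and cloud tiering can be illustrated with a toy model. This is a sketch under stated assumptions, not a description of how the CiS system is implemented: the HybridStore class is hypothetical, blocks are fingerprinted with SHA-256, and "low priority" is approximated as "oldest written."

```python
import hashlib


class HybridStore:
    """Toy model: deduplicate blocks locally, tier the oldest to the cloud."""

    def __init__(self, local_capacity_blocks):
        self.local_capacity = local_capacity_blocks
        self.local = {}   # fingerprint -> block; dict order is oldest first
        self.cloud = {}   # fingerprint -> block; stands in for cloud storage
        self.volume = []  # logical volume: ordered list of fingerprints

    def write(self, block: bytes) -> str:
        fp = hashlib.sha256(block).hexdigest()
        self.volume.append(fp)
        if fp in self.local or fp in self.cloud:
            return fp  # duplicate data: store only a reference
        self.local[fp] = block
        self._tier()
        return fp

    def _tier(self):
        # Assumption: the oldest unique block is the lowest priority.
        while len(self.local) > self.local_capacity:
            oldest = next(iter(self.local))
            self.cloud[oldest] = self.local.pop(oldest)

    def read(self, index: int) -> bytes:
        fp = self.volume[index]
        return self.local[fp] if fp in self.local else self.cloud[fp]
```

In this model, on-premises capacity stays fixed at local_capacity_blocks no matter how much data is written, which is the behavior the paragraph above describes.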

Another way to limit the storage footprint on-premises is to migrate applications to the cloud. Just as the Microsoft HCS solution migrates lower-priority data to the cloud, the applications that are migrated from the on-premises data center to the cloud could also have lower priorities, and less sensitivity to the effects of migration.

Data portability in the hybrid cloud

Hybrid cloud designs will accommodate diverse and changing workloads by providing computing resources for unique and temporary projects, as well as increasing capacity for expanding applications on premises. In theory, the flexibility of hybrid cloud computing will enable IT teams to move applications and data according to changes in business priorities.

For these things to transpire, it must be possible to transfer both the applications and their data across the hybrid cloud boundary. Data portability is an aspect of hybrid cloud technology that will likely see a lot of development in the years to come. The sections that follow discuss aspects of data portability in hybrid clouds and how the Microsoft HCS solution could provide it.

Migrating applications and copying data

Data and the applications that use it must be located in the same data center for performance reasons. Because the amount of data is usually much larger than the application software, the migration process will be effectively gated by the time it takes to complete the data migration.

Migration time is the amount of time it takes between shutting an application down in one location and starting it in another. Unlike disaster recovery (DR), where some of the most recently changed data might be lost, migrations are expected to completely copy all data. In other words, there is no recovery point where migrations are concerned and there will always be some amount of data to copy for an application migration.

Copying a lot of data takes a long time, even when there is a lot of network bandwidth available. So, if the goal is to minimize migration time, it follows that minimizing data transfer times is paramount. Data protection technologies that copy only recently changed data to a remote site, such as replication, continuous data protection (CDP), and cloud snapshots, may be useful. There will undoubtedly be several different approaches developed to decrease migration time in the coming years.
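A rough calculation shows why copying only recently changed data matters. The figures here are illustrative assumptions, not measurements: a 10 TB data set, 50 GB of recently changed data, a 1 Gbps link, and 80 percent effective link utilization.

```python
def transfer_hours(data_gb, bandwidth_mbps, efficiency=0.8):
    """Estimate hours to copy data_gb gigabytes over a bandwidth_mbps link.

    efficiency is an assumed fraction of the nominal bandwidth that is
    actually usable after protocol overhead and contention.
    """
    bits = data_gb * 8 * 1000**3
    usable_bits_per_second = bandwidth_mbps * 1000**2 * efficiency
    return bits / usable_bits_per_second / 3600


full_copy = transfer_hours(10_000, 1000)  # entire 10 TB data set
changed_only = transfer_hours(50, 1000)   # 50 GB of recent changes
print(f"full copy: {full_copy:.1f} h, changed data only: {changed_only * 60:.1f} min")
```

Under these assumptions the full copy takes roughly 28 hours, while copying only the changed data takes about 8 minutes.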

The Microsoft HCS solution could be used for application/data migrations someday. Here’s a quick overview of how it might work: After pausing the application, the IT team would take a cloud snapshot, which would upload the most recently written data and the most recent metadata map. After the cloud snapshot completes, they would start a CiS VSA in the cloud that would access and read the metadata map and copy the data to where it would be accessed by the application running in the cloud. This is not much of a departure from the way deterministic recoveries are done today with the Microsoft HCS solution.
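The sequence just described can be sketched as follows. This is a hypothetical model only: the function names, the use of SHA-256 fingerprints, and the representation of the metadata map as an ordered list of fingerprints are assumptions made for illustration and do not reflect the actual CiS or Microsoft Azure interfaces.

```python
import hashlib


def fingerprint(block: bytes) -> str:
    return hashlib.sha256(block).hexdigest()


def cloud_snapshot(volume_blocks, bucket):
    """Upload only blocks missing from the bucket.

    Returns (metadata_map, uploaded_count), where metadata_map lists the
    fingerprint of every block in volume order.
    """
    metadata_map = []
    uploaded = 0
    for block in volume_blocks:
        fp = fingerprint(block)
        if fp not in bucket:
            bucket[fp] = block  # stands in for an upload to cloud storage
            uploaded += 1
        metadata_map.append(fp)
    return metadata_map, uploaded


def restore_in_cloud(metadata_map, bucket):
    """Stand-in for the cloud VSA: rebuild the volume from the map."""
    return [bucket[fp] for fp in metadata_map]


def migrate(volume_blocks, bucket):
    # Assumes the caller has already paused the application.
    metadata_map, uploaded = cloud_snapshot(volume_blocks, bucket)
    cloud_volume = restore_in_cloud(metadata_map, bucket)
    return cloud_volume, uploaded
```

Because earlier cloud snapshots have already uploaded most blocks, the final snapshot in this model transfers only the data written since the previous one, which is what keeps the migration time short.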

Can you get there from here?

Cloud data centers can do many of the same things that on-premises data centers do, but they are distinctly different. One way they differ is the method used to access, write, and read data. The most popular cloud storage service is object storage, which provides buckets for storing data. The CiS system in the Microsoft HCS solution accesses data stored in Microsoft Azure storage buckets using the Microsoft Azure Storage API.

By comparison, data center storage is typically accessed through remote file systems and block storage interfaces. This means VMs and VSAs in the cloud might need to use a different data access method than they do on premises. This is not an insurmountable challenge, but it’s not necessarily trivial either, and its solution has to be built into VMs, VSAs, or cloud services.

Virtual disks as a porting medium

One solution to data portability is to copy the VM’s VMDK or VHD file across the hybrid cloud boundary. The hypervisors managing those VMDK/VHD files could also manage this task or special import/export processes could be used. That said, storage developers have always found ways to add value where data transfers are concerned and will likely find ways to offload this work to participate in hybrid cloud data migration scenarios too.

Emulating on-premises storage methods as a service

A different approach to solving the problem of different data access methods on-premises and in the cloud is to provide cloud storage services that emulate on-premises storage: in other words, services that allow data to be stored and accessed in the cloud by applications using the same storage methods as servers running on-premises. Amazon Elastic Block Store (EBS) and Microsoft Azure Drive are examples of this type of storage emulation.

One potential application for block storage access in the cloud is storage-based replication. As discussed in Chapter 3, “Accelerating and broadening disaster recovery protection,” storage-based replication copies new storage blocks from the primary site to the secondary site, where servers at the secondary site can access them. Presumably, replication to the cloud would use a vendor’s VSA in the cloud to receive updated blocks from their on-premises storage system. Storing the replicated blocks might work best with emulated block storage.

Recovery in the cloud

Considering the difficulties many IT teams have with DR, it’s not surprising that one of the most compelling aspects of cloud computing is the potential for conducting DR in the cloud. The ability to have an inexpensive, ready-made recovery site on demand is very appealing. Recovering data in the cloud allows the IT team to work in a green field environment and accommodates a certain number of mistakes and retries. As a result, there is a great deal for the IT team to be excited about if DR has been broken and deemed unfixable.

Predicting recovery time objectives (RTOs) and recovery point objectives (RPOs) for hybrid cloud computing will likely become much more important than they are today because they will be factored into service level agreements (SLAs) for cloud services. The good news is that practicing recoveries will be much easier in the cloud due to the instant availability of resources to test with and the ability to isolate those tests so they don’t interfere with production operations.

To learn more about the possibilities with Microsoft Hybrid Cloud Storage, visit www.microsoft.com/storsimple and be sure to download your copy of Rethinking Enterprise Storage: A Hybrid Cloud Model.