Your IT Sprawl shouldn’t cost you millions

Harry, the TechNet editor, mentioned to me an interesting article by Rick Delgado on IT sprawl and the unseen costs associated with the problem. He asked me to come up with some practical magic to avoid or at least ease this problem, so here it is. As with every complex problem, I’ll break this one down to simplify it!

The problem restated

Too many VMs are not only a burden on your datacentre, if you are still doing IT services that way. They also mean too much maintenance, too much software licensing and, more importantly, too much boring human effort in trying to manage it all.

Have you got a cloud?

This might mean you are running your own datacentre as a cloud, or you are using someone else to do this for you. In either case a cloud by definition (check out page 2 of this definition from the US standards authority NIST) should provide two things: chargeback and pay-per-use. If you haven’t got those, you haven’t got a cloud.

Let’s assume for a moment you don’t have these things, which in my view is the worst-case scenario described in the sprawl article: if it’s free then there’s no incentive to be efficient. If you are the datacentre admin you are probably worried about capacity, and you may also have no real idea which VMs and services belong to whom, which are critical and which are just copies for dev and test. You need to establish what you have got. The key tool here is the Microsoft Assessment and Planning (MAP) Toolkit, which will scan your entire datacentre and report back on what you have - to you, not to Microsoft. It will also look at non-Microsoft technologies, like VMware and Linux, and report on those too if you let it. This then gives you costs, and you can start a conversation with the business stakeholders about what is critical and what isn’t, and so start to create order out of chaos. Of course, many organisations have well-established SLAs for their services, and for them the situation is not as chaotic.

Without scalability, however, you can still end up with a sort of temporal sprawl in that all your VMs are not working all of the time. You could be more efficient at meeting those SLAs by scaling down services when they are idle so that others can use the spare capacity when needed. The classic scenario is to quiesce some of your ERP system VMs overnight when it’s quiet to allow background jobs like the overnight data warehouse build to run on the same hardware.
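The overnight hand-over described above can be sketched in a few lines of Python. This is only an illustration: the quiet window (22:00 to 06:00) and the workload names are assumptions, and a real implementation would call whatever automation tooling you use to actually pause and start the VMs.

```python
from datetime import time

# Hypothetical quiet window; pick hours that match your own usage patterns.
QUIET_START = time(22, 0)
QUIET_END = time(6, 0)


def in_quiet_window(now: time) -> bool:
    """True when 'now' falls in the overnight window, which wraps past midnight."""
    return now >= QUIET_START or now < QUIET_END


def workload_for(now: time) -> str:
    """Decide which (hypothetical) workload gets the shared hardware."""
    return "data-warehouse-build" if in_quiet_window(now) else "erp"


if __name__ == "__main__":
    for t in (time(12, 0), time(23, 30), time(5, 59), time(6, 0)):
        print(t.strftime("%H:%M"), "->", workload_for(t))
```

The point of keeping the decision in one small function is that the schedule becomes something you can test and change, rather than something buried in a runbook.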

My other top tip is to automate everything: rather than leaving hard-to-create VMs lying around, have a process to create them when they are needed and kill them when they are not. Think of VMs as vegetables not flowers. Don’t lovingly look after them; chop them up and put them in the frying pan!

Finally, architect groups of VMs as services and tag them as such. Take SharePoint as an example: describe each tier in terms of the minimum and maximum number of VMs in it (say three web front ends, two servers and the VMs comprising the availability group on the back end). You can then design a maintenance schedule to spin these up and down as needed while still keeping the whole service alive. Or you could simply delete the old VMs and configure new nodes in their place by getting smart with tools like the free Microsoft Deployment Toolkit and PowerShell Desired State Configuration.
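To make the idea of describing tiers by minimum and maximum VM counts concrete, here is a minimal Python sketch. The tier names, the minimums and the `plan` helper are all hypothetical; only the maximums loosely follow the SharePoint example above. It shows the principle rather than any particular product’s service template format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    min_vms: int  # fewest VMs that keep the tier alive
    max_vms: int  # most VMs the tier is allowed to scale to

    def clamp(self, wanted: int) -> int:
        """Bring a requested VM count back inside the tier's limits."""
        return max(self.min_vms, min(self.max_vms, wanted))


# Hypothetical service definition; adjust names and numbers to your own farm.
SHAREPOINT = [
    Tier("web-front-end", min_vms=1, max_vms=3),
    Tier("application", min_vms=1, max_vms=2),
    Tier("sql-availability-group", min_vms=2, max_vms=2),
]


def plan(tiers, wanted):
    """Return a VM count per tier; tiers not mentioned stay at their minimum."""
    return {t.name: t.clamp(wanted.get(t.name, t.min_vms)) for t in tiers}


if __name__ == "__main__":
    # Ask for zero front ends overnight: the plan still keeps one of each alive.
    print(plan(SHAREPOINT, {"web-front-end": 0, "application": 0}))
    # Ask for more than the maximum during a busy period: the plan caps it.
    print(plan(SHAREPOINT, {"web-front-end": 5, "application": 2}))
```

Because the limits live with the service definition, a maintenance schedule can ask for any count it likes and the service as a whole never drops below what keeps it running.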

If you have done all of this then you nearly have a cloud. All you are missing is chargeback, which in a private cloud based on Microsoft technologies would mean implementing it in System Center Service Manager.
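The arithmetic behind chargeback is simple once every VM has an owner. The Python sketch below is purely illustrative and is not how System Center Service Manager does it: the VM sizes, the hourly rates (in pence) and the usage records are all made up.

```python
from collections import defaultdict

# Hypothetical price list, in pence per VM-hour, to avoid floating-point money.
HOURLY_RATE_PENCE = {"small": 5, "large": 20}


def chargeback(usage):
    """Total the bill per owner from (owner, vm_size, hours_run) records."""
    bill = defaultdict(int)
    for owner, size, hours in usage:
        bill[owner] += HOURLY_RATE_PENCE[size] * hours
    return dict(bill)


if __name__ == "__main__":
    usage = [
        ("finance", "large", 720),   # one large VM left running all month
        ("dev-test", "small", 100),  # small VMs created and killed on demand
        ("dev-test", "small", 40),
    ]
    for owner, pence in chargeback(usage).items():
        print(f"{owner}: £{pence / 100:.2f}")
```

Even with invented numbers, the contrast between a VM that runs all month and VMs that exist only while they are needed is what gives the business an incentive to be efficient.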

Of course by the time you have done this work you are actually creating your own cloud, but what if all of this is too much work for a fire-fighting, overworked IT department?

My final piece of advice would be to move the non-critical workloads to the cloud, where the business will be billed for what it uses. Typically, this would be the dev and test workloads, as well as legacy services that can’t easily be reworked.

You already have a cloud

In theory all should be well, as you and your business are charged for what you are using. In practice sprawl can still occur, and the best way to eliminate it is to move to a world of services, specifically platform and software as a service, rather than simply having someone else run your VMs for you. If you are still running VMs then you still have to manage them, and while there are tools like Azure Automation to do the management and Operations Management Suite to monitor workloads wherever they are, you aren’t really getting the benefit of the cloud.

Using Microsoft technologies as an example, why run SharePoint or Exchange in a VM when there is Office 365? When it comes to the various websites you have, it is simply more efficient to run them on Azure App Service, which removes the need to worry about the OS, scalability and availability and lets you focus on maintaining the site itself.

Conclusion

What I have described is the sort of journey a business might go on to achieve more and more from a given budget. Where we might go next is to think about containers and Docker, but in some sense this is a distraction: containers are essentially a more advanced virtualisation technology, so you are still doing the managing and the capacity planning.

This might all sound like gloom and doom for the IT professional, but what it really means is that we get to work on the interesting stuff and that the boring stuff is now done by machine.

Introduction to the Microsoft Private Cloud on the Microsoft Virtual Academy

Build a Private Cloud with Windows Server and System Center jump start on the Microsoft Virtual Academy