Struggling with CADE, McKinsey / Uptime Metric (RE-POST)

The following posting originally appeared on Michael Manos' Loose Bolts blog.

 

This is a re-post of my original May 5th blog post on my tortured thoughts around the CADE data center metric put forward by McKinsey. It is relevant to my next post, so I am placing it here for your convenience.

I guess I should start out this post with the pre-emptive statement that, as a key performance indicator, I support the use of CADE or other metrics that tie both facilities and IT into a single number. In fact, we have used a similar metric internally at Microsoft. But at the end of the day I believe that any such metric must be useful and actionable. Maybe it’s because I have to worry about Operations as well. Maybe it’s because I don’t think you can roll up the total complexity of running a facility into one metric. In short, I don’t think dictating yet another metric, especially one that doesn’t lend itself to action, is helpful.

As some of you know, I recently gave keynote speeches at both DataCenter World and the 2008 Uptime Symposium. Part of those speeches included a simple query of the audience: how many people are measuring energy efficiency in their facilities? Please keep in mind that the combined audience of both engagements numbered between 2,000 and 2,400 data center professionals. Arguably these are the 2,400 who really view data centers as a serious business within their organizations. These are folks whose full-time jobs are running and supporting data center environments for some of the most important companies around the world. At each conference, fewer than 10% of them raised their hands. Many in the industry, including Ken Brill at the Uptime Institute, the Green Grid, and others, have been preaching about measurement for at least the last three years, so the fact that fewer than 10% of the industry has accepted this best practice is troublesome.

Whether you believe in measuring PUE or DCiE, you need to be measuring *something* in order to get even one variable of the CADE metric. This lack of instrumentation and/or process within the firms most motivated to have it says a lot about how little success this metric is likely to have over time. It follows that if they are not measuring efficiency, they likely don’t understand their total facility utilization (electrically speaking). The IT side may have an easier way of getting variables for system utilization, but how many firms have host-level performance agents in place?
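
To make "measuring something" concrete, here is a minimal sketch of the two facility ratios mentioned above, using the Green Grid definitions (PUE is total facility power divided by IT power, and DCiE is its reciprocal). The function names and the meter readings are hypothetical, chosen only for illustration.

```python
def pue(total_facility_kw: float, it_load_kw: float) -> float:
    """Power Usage Effectiveness: total facility power / IT equipment power."""
    if it_load_kw <= 0:
        raise ValueError("IT load must be positive")
    return total_facility_kw / it_load_kw


def dcie(total_facility_kw: float, it_load_kw: float) -> float:
    """Data Center infrastructure Efficiency: the reciprocal of PUE."""
    if total_facility_kw <= 0:
        raise ValueError("total facility power must be positive")
    return it_load_kw / total_facility_kw


# Hypothetical readings: one meter at the utility feed, one at the UPS output.
total_kw = 1500.0
it_kw = 1000.0
print(f"PUE  = {pue(total_kw, it_kw):.2f}")
print(f"DCiE = {dcie(total_kw, it_kw):.1%}")
```

Both numbers come from the same two readings, which is the point: without a meter on the total feed and one on the IT load, neither can be computed.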

I want to point out that I am speaking to the industry in general. Companies like ours, who are investing hundreds of millions of dollars, get the challenges and requirements in this space. It’s not a nice-to-have, it’s a requirement. But when you extend this to the rest of the industry, there is a massive gap.

Here are some interesting scenarios that, when extended to the industry, may break or complicate the CADE metric:

  • As you cull dead servers from your environment, your data center utilization will drop accordingly, and as a result the metric will remain unchanged. The components of CADE are not independent: when dead servers are removed, average server utilization goes up while data center utilization goes down proportionally, so there is no net change. If anything, PUE goes up, which means the metric may actually get worse. Keep in mind that all results are good when kept in context of one another.
  • Hosting providers like Savvis, Equinix, DuPont Fabros, Digital Realty Trust, and the army of others will be exempt from participating. They will need to report back-of-house numbers (effectively PUE) to their customers, but they do not have access to their customers’ server information. It seems to me that CADE reporting in hosted environments will be difficult if not impossible. Because the design of their facilities will need to play a large part in the calculation, effective tracking becomes difficult. Additionally, at what level will overall utilization be measured?
  • If hosters are exempted, CADE has a very limited application or shelf life. You have to own the whole problem for it to be effective.
  • As I mentioned, I think CADE has strong possibilities for those firms who own their entire stack. But most of the data centers in the world would probably not fall into the “all-in” scenario bucket.
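
The dead-server scenario in the first bullet can be sketched with some arithmetic. This is an illustration under stated assumptions, not McKinsey's reference calculation: it takes CADE as facility energy efficiency (DCiE) × facility utilization × IT utilization, treats the IT energy efficiency factor as 1.0, assumes a dead server draws the same power as a live one, and models facility overhead as a fixed part plus a part proportional to IT load. All numbers and names are hypothetical.

```python
def cade_components(live, dead, kw_per_server, live_cpu_util,
                    capacity_kw, fixed_overhead_kw, overhead_ratio):
    """Return (facility_energy_eff, facility_util, it_util, cade)."""
    servers = live + dead
    it_load_kw = servers * kw_per_server
    # Assumed overhead model: fixed losses plus cooling that scales with IT load.
    total_kw = it_load_kw + fixed_overhead_kw + overhead_ratio * it_load_kw
    facility_energy_eff = it_load_kw / total_kw   # DCiE, i.e. 1 / PUE
    facility_util = it_load_kw / capacity_kw      # IT load vs. design capacity
    it_util = live * live_cpu_util / servers      # dead servers count as 0% busy
    cade = facility_energy_eff * facility_util * it_util
    return facility_energy_eff, facility_util, it_util, cade


# Hypothetical site: 60 kW of IT capacity, 0.3 kW per server.
site = dict(kw_per_server=0.3, live_cpu_util=0.25, capacity_kw=60.0,
            fixed_overhead_kw=10.0, overhead_ratio=0.5)

before = cade_components(live=80, dead=20, **site)  # 20 dead servers still racked
after = cade_components(live=80, dead=0, **site)    # dead servers removed

for label, (fee, fu, iu, cade) in (("before", before), ("after", after)):
    print(f"{label}: DCiE={fee:.3f} facility_util={fu:.2f} "
          f"it_util={iu:.2f} CADE={cade:.4f}")
```

Under these assumptions, the rise in IT utilization and the fall in facility utilization cancel exactly, and because the fixed overhead is now spread over less IT load, DCiE falls (PUE rises) and the overall figure ends up slightly lower, even though removing dead servers was the right operational move.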

I can’t help but think we are putting the cart before the horse in this industry. CADE may be a great way to characterize data center utilization, but it’s completely useless if the industry isn’t even measuring the basics. I have come to the realization that this industry does a wonderful job of telling its members WHAT to do, but fails to follow up with the HOW. CADE is meant for higher-level consumption, specifically by those execs who lack the technical skill sets to make heads or tails of efficiencies and how they relate to overall operations. For them, this metric is perfect. But we have a considerable way to go before the industry at large gets there.

Regardless, I strongly suggest each and every one of you adopt the takeaway from the Symposium: measure, measure, measure.