One of my favorite sayings has always been, “You can’t manage what you don’t monitor.” The best I can do for giving credit is that Candle Corporation built a marketing campaign around this concept in the late ’90s; I still have the shirt. When you think about why we do what we do, it becomes fairly easy to see that we serve applications to people. Building on that quote, let’s talk about how we monitor users and applications. In this three-part blog I will discuss the monitoring options we have and share how to set them up in System Center. When possible I will highlight additional opportunities to automate the process and cross-connect systems to be more DevOps focused.
In Part 1 let’s talk about what options are available to monitor users and applications. For the purposes of this discussion let’s use an example three-tier application. It has a Web UI as the front end and a Web Service as the middle tier, which acts as the data access layer to the third tier, a SQL database. I’m not advocating any single type of design or architecture; this one just simplifies the discussion, so plug in other architectures as you see fit.
Why We Monitor
For those who need more than the statement “we can’t manage what we don’t monitor,” let’s talk briefly about the main reasons to monitor. I am always a fan of oversimplifying, so I’ll do that here as well: we monitor to solve problems. You could argue there are many reasons outside of problem resolution, and I could certainly agree, but I could also argue back that there are two types of problem solving, one proactive and the other reactive. In the proactive world we may monitor to collect data that helps with capacity planning or increases customer satisfaction. You may say those aren’t problems, but aren’t they just different types of problems? Reactive issues are more self-explanatory, in that we are looking to resolve an immediate complaint, concern or challenge.
As you ponder what this crazy person is talking about, I want you to keep these processes in mind as the discussion moves forward. I also propose that every problem resolution process includes three basic steps. Alerting is where we learn we have an issue. Triaging is where we understand the severity and complexity of the issue. Finally, Diagnosing is the step or steps where we take information in and decide what the resolution is. In some cases diagnosing leads to further testing, data collection and more diagnosing, which is a separate process altogether.
How We Monitor
As we talk about monitoring in general, we need to think about which options or areas are available. An overly simplistic view is that there are three basic categories of monitoring: synthetic, infrastructure and application level, as the following diagram depicts. Where the three areas meet in the middle is the sweet spot of monitoring, but it is not the only spot, nor a required one.
A brief description of each is in order. Synthetic monitoring is a category of monitors that act like users or other applications. These can be geographically dispersed around the world, or they can be internal to the monitored environment. These types of monitors gather two overarching kinds of information and alerts. The first is availability: because they act as a user, they can alert you when a transaction or application fails to respond. The other is response time: synthetic monitors can time how long a transaction or application takes to respond and alert when it is too slow.
Infrastructure monitoring may seem self-explanatory, but I’ll offer a brief explanation for this blog. In infrastructure monitoring I’ve seen management protocols like WMI and SNMP play a key role, but I’ve also seen performance counters and event logs used to build great monitoring.
The last area we’ll discuss is application level monitoring. This, like the others, takes many forms. At the deep code level it can be as simple as developers adding special code to collect user data or application failures, or it can be monitors that hook into applications at startup and automatically instrument them or attach handlers. In either case the basic idea is that we understand the application’s health through its failures and performance, preferably at the code level.
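To make the synthetic category a little more concrete, here is a minimal Python sketch of the idea. It is not how System Center implements it, just an illustration of a monitor acting like a user and reporting the two things described above: availability and response time. The names `ProbeResult`, `http_probe` and `run_synthetic_check`, and the 2-second slowness threshold, are all hypothetical choices of mine for this example.

```python
import time
import urllib.request
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ProbeResult:
    available: bool          # did the transaction respond successfully?
    response_seconds: float  # how long the attempt took
    alert: Optional[str]     # None when nothing needs attention


def http_probe(url: str, timeout: float = 5.0) -> None:
    """Act like a user: request the page and raise on any failure."""
    # urlopen raises on connection errors, timeouts and HTTP error statuses.
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()


def run_synthetic_check(
    probe: Callable[[], None],
    slow_after_seconds: float = 2.0,  # assumed threshold, tune per application
) -> ProbeResult:
    """Run one synthetic transaction and classify the outcome."""
    start = time.perf_counter()
    try:
        probe()
    except Exception as exc:
        # Availability alert: the transaction did not complete.
        elapsed = time.perf_counter() - start
        return ProbeResult(False, elapsed, f"unavailable: {exc}")

    elapsed = time.perf_counter() - start
    if elapsed > slow_after_seconds:
        # Response time alert: it answered, but too slowly.
        return ProbeResult(True, elapsed, f"slow response: {elapsed:.2f}s")
    return ProbeResult(True, elapsed, None)
```

A check against the Web UI tier of the example application might then look like `run_synthetic_check(lambda: http_probe("https://example.com/"))`, run on a schedule from wherever your users are. Passing the probe in as a function keeps the timing and alerting logic separate from what is being tested, so the same check could wrap a web service call or a database query instead.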
In Part 2 we’ll walk through the different options for synthetic monitoring available in System Center. We’ll create the monitors and set the alerting levels we care about.
In Part 3 we’ll dig deeper into the application level, describe exactly what is happening and why, and set up examples.