Actionable Alerts for Web applications in Operations Manager 2007

With Operations Manager 2007, monitoring Web-based applications with synthetic transactions became much easier thanks to the Web application monitoring template. This template replaced the Web Sites and Services Management Pack in MOM 2005 and takes advantage of the model-based approach and state-centric monitoring available in OpsMgr. One side effect is that the Alerts raised when the watcher node detects a problem have a blank description. In this article, I will describe some tips to help you make the Alerts more meaningful and actionable.


Problem with the Alert description

Since I work on monitoring of Web-based applications, I often hear from customers using the Web application template that the default alert descriptions for Web applications are useless. When a Web application synthetic transaction fails, whether because of an error status code or a failed DNS lookup, the Web application alert does not indicate anything useful.


To reproduce the problem, I created a simple Web application with a non-existent URL, accepting all the default settings. Within a couple of minutes, I received two alerts in the Alert view: one for the Web request and one for the Web application.


[Image: AlertWebRequest.jpg]


As you can see, neither of them had a good description. So what was the real cause of the problem? I had to click on the Health Explorer and review the monitor tree.


[Image: AlertStatusCode]


Now you can see that the Alert was raised because the HTTP status code is 404 (Not Found). As a user, I have to look through the alerts, open the Health Explorer and then identify the unit monitor causing the alert. That’s too many steps. And if I am forwarding this alert to a separate system from which I cannot launch the Health Explorer, I am out of luck.



How can I generate an actionable Alert instead?

The problem is detected by a specific unit monitor that does not alert by default. In the above case, it is the Status code monitor. I can make it generate an actionable alert in a few simple steps:


1. Right-click the monitor in the Health Explorer and open its properties.


2. Enable the Alert for that monitor.


3. Set the Alert description as follows:


Status code is $Data/Context/RequestResults/RequestResult["1"]/BasePageData/StatusCode$


4. Reset the monitor, close the existing alerts and wait for the alert to fire again.


Now I get a new Alert for status code with the detail of the failure that I can act upon. That is much more useful.
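To make the substitution more concrete, here is a minimal Python sketch of how a parameter like the one above could resolve against the alert context. This is not how OpsMgr implements it. The sample XML is hypothetical (only the element names come from the path in the description), and I am assuming that the "1" picks the first request in the Web application.

```python
import xml.etree.ElementTree as ET

# Hypothetical alert context. Only the element names are taken from the
# parameter path; the real layout should be checked in the Health Explorer.
SAMPLE_CONTEXT = """<DataItem>
  <RequestResults>
    <RequestResult>
      <BasePageData>
        <StatusCode>404</StatusCode>
      </BasePageData>
    </RequestResult>
  </RequestResults>
</DataItem>"""


def status_code(context_xml, request_position=1):
    """Return the StatusCode text of the request at a 1-based position.

    Assumption: the "1" in the alert parameter refers to the first request.
    """
    root = ET.fromstring(context_xml)
    path = (
        f"RequestResults/RequestResult[{request_position}]"
        "/BasePageData/StatusCode"
    )
    node = root.find(path)
    return node.text if node is not None else None


def render_description(context_xml):
    """Mimic the alert description 'Status code is <value>'."""
    return f"Status code is {status_code(context_xml)}"


print(render_description(SAMPLE_CONTEXT))
```

The point of the sketch is only that the description is a path into the context data, so any value present in that data can be surfaced in the alert text.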




How did I find the magic description string for the alert? There is no magic to this: the information is in the context of the Alert, as you can see in the State Change Event shown above in the Health Explorer, and it includes the status code of the request. (For more details on authoring alert descriptions, please refer to the Authoring guide.) To find the exact string in the context, use the following steps:


[Image: FullResultStatusCode]


1. Go to the Authoring space and edit the settings for the Web application. In the Web application editor, click the Run Test link.


2. Once the test has run, click ‘View Full results’.


3. Click the Raw tab to see the full details.


4. Identify the Status Code field (or any other field you are interested in) that you want to see in the Alert description, and note the field's name in the XML.


5. Construct the parameterized alert description from the request ID and the field name. For the status code, it looks like this:


$Data/Context/RequestResults/RequestResult["%ReqID%"]/BasePageData/StatusCode$
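If you have several requests or fields to cover, a small helper can build these strings for you. The Python sketch below is only a convenience of my own, not part of OpsMgr. The function name `alert_parameter` is hypothetical, it uses straight quotes around the request ID, and it assumes the field sits under BasePageData as the status code does. Copy any other field name from the Raw tab rather than guessing it.

```python
def alert_parameter(req_id, field="StatusCode"):
    """Build the alert description parameter for one request and one field.

    req_id: the request ID to substitute for %ReqID%.
    field:  the field name as it appears in the Raw tab XML
            (assumed here to live under BasePageData).
    """
    return (
        f'$Data/Context/RequestResults/RequestResult["{req_id}"]'
        f"/BasePageData/{field}$"
    )


# Description text for the first request's status code.
description = "Status code is " + alert_parameter("1")
print(description)
```

Paste the printed text into the Alert description box of the monitor, as in step 3 of the earlier procedure.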


Now you know a little-documented trick for creating actionable alert descriptions. Do you think this was useful?


If we know this, why did we not make the Web application alert description reflect this information by default? For that, check my next blog post.