SCOM Service Alert Testing

Caution
Test the script(s), processes and/or data file(s) thoroughly in a test environment, and customize them to meet the requirements of your organization before attempting to use them in a production capacity.  (See the legal notice here)

 

Note: The workflow sample mentioned in this article can be downloaded from the Opalis project on CodePlex:  https://opalis.codeplex.com

 

Overview

The “SCOM Service Alert Testing” sample is designed to use service testing as a way to make alerts in SCOM “sticky”. When an alert in SCOM is marked as Resolved, the top-level workflow checks whether the alert is for a service which has a “test” configured for it. If one exists, the test is run, and if the service is down (i.e. the alert is still valid) the alert is re-opened. The net effect is that an alert for a service won’t stay in a Resolved state if the service is in fact down. This prevents the accidental resolution of active alerts should a user select a large number of entries and “bulk acknowledge” them in Operations Manager.
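
The decision logic described above can be sketched in a few lines of Python. This is only an illustration of the pattern, not the Opalis workflow itself: the Alert class, its field names, the service_tests mapping and handle_updated_alert are all hypothetical names, and the real sample is built from Opalis activities rather than code.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Alert:
    """Hypothetical stand-in for a SCOM alert; field names are assumptions."""
    service: str
    resolution_state: str  # "New" or "Resolved" in this sketch


def handle_updated_alert(
    alert: Alert,
    service_tests: Dict[str, Callable[[], bool]],
) -> str:
    """Re-open a Resolved alert when its service test says the service is down.

    Each test in service_tests returns True when the service is up.
    """
    if alert.resolution_state != "Resolved":
        return "ignored"          # only Resolved alerts are of interest
    test = service_tests.get(alert.service)
    if test is None:
        return "no test"          # nothing configured: exit early
    if test():
        return "stays resolved"   # service is up, the resolution was valid
    alert.resolution_state = "New"  # service is down: make the alert "sticky"
    return "reopened"
```

For example, calling handle_updated_alert(Alert("DNS", "Resolved"), {"DNS": lambda: False}) re-opens the alert, because the (fake) DNS test reports the service as down.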

 

Top-Level Workflows

The basic top-level workflow is very simple. It looks for updated (not new) alerts from SCOM. One could scope down the filter to just service alerts as well as apply other filters so the workflow doesn’t run for alerts where no service test exists. However, the sample will simply exit after the second activity should there be no test for a given alert.

Link logic dictates the branching of SCOM alerts into service tests. One could edit the link-level conditions for each branch such that a given service alert gets routed precisely to a given test. The tests are encapsulated in child workflows that are run as triggers without a wait (“fire and forget”).

The Service Tests are very simple in the sample, but one could perform elaborate verification procedures within them. The sample is only meant as a pattern one could follow to build such a solution.

 

1. Monitor for Service Alerts

This workflow was designed to accept updated alerts from SCOM and route these alerts to service tests. The service tests are contained within child workflows. Possible updates to this workflow could include:

  1. More precise filtering of the trigger condition. Right now all updated alerts that are Resolved are selected. One may opt for a different (potentially more precise) trigger condition. Only “Updated” alerts are chosen (vs. new alerts) because new alerts won’t be created in a Resolved state. The use case is specifically focused on dealing with alerts that have been updated by users.

  2. Additional service tests can be created. One would simply copy the child workflow patterns to create a new service test. Once a test workflow is created, it can be linked into the parent using a link condition like the three shown in the sample.

  3. Link logic (the arrows that route alerts to the service tests) could be refined to be more precise. One would want to make certain any given alert ran the proper test associated with it.
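
As a rough illustration of the link logic discussed in the list above, the sketch below pairs each link condition with the child workflow it leads to. The alert dictionary, its "name" key and the substring conditions are assumptions made up for this example; in the sample the conditions are configured on the links in the Opalis workflow, and the actual conditions it uses are not described here.

```python
from typing import Callable, Dict, List, Tuple

# Each entry pairs a link condition with the child workflow it leads to.
Link = Tuple[Callable[[Dict[str, str]], bool], str]

# Hypothetical conditions: a simple substring match on the alert name.
LINKS: List[Link] = [
    (lambda a: "dns" in a["name"].lower(), "2. Test DNS Service"),
    (lambda a: "web" in a["name"].lower(), "3. Test Web Server"),
    (lambda a: "email" in a["name"].lower(), "4. Test Email Server"),
]


def route_alert(alert: Dict[str, str], links: List[Link] = LINKS) -> List[str]:
    """Return the child workflows whose link condition matches the alert."""
    return [target for condition, target in links if condition(alert)]
```

An alert that matches no condition yields an empty list, which corresponds to the workflow simply exiting when no test exists. Tightening the conditions (for example, matching on an exact monitor name instead of a substring) is what "more precise" link logic would look like in this picture.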

 

2. Test DNS Service

This workflow performs a simple test of a DNS service. The incoming data could be made more dynamic (for example, the SCOM alert could be re-queried for additional details or the data could be sent into the child workflow via Custom Start parameters).

3. Test Web Server

This workflow performs a simple test of a web server. The incoming data could be made more dynamic (for example, the SCOM alert could be re-queried for additional details or the data could be sent into the child workflow via Custom Start parameters).

4. Test Email Server

This workflow performs a simple test of an email server. The incoming data could be made more dynamic (for example, the SCOM alert could be re-queried for additional details or the data could be sent into the child workflow via Custom Start parameters).
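
The sample's tests are built from Opalis activities and their exact checks are not shown here, so the following is only a minimal sketch of what a "simple test" might look like outside Opalis: a TCP connection attempt to the service's port. The function name tcp_port_open, the host names and the choice of ports are assumptions for illustration.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and connects; the context
        # manager closes the socket again as soon as the check is done.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

With hypothetical host names, the three tests could then be tcp_port_open("dns01.example.com", 53), tcp_port_open("web01.example.com", 80) and tcp_port_open("mail01.example.com", 25). A check like this only shows that the port accepts connections; a more elaborate test would issue a real DNS query, HTTP request or SMTP handshake.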

 

 
