SCOM Alert Updater Service - connector example updating SCOM alerts

Hi,

Mark Manty, Premier Field Engineer with Microsoft, here with another example for System Center Operations Manager.

Quick note on updated build for SCOM 2016: AUC updated example

Today we are going to walk through an example Windows service that is a SCOM connector.

Customers who use product connectors often find they need to modify custom fields and place important data there for their incident management systems to use. Additionally, connector subscription criteria are limited, so we often need more granular control over which alerts are sent across a connector to create incidents. Reducing floods of incidents created from SCOM alerts is a priority for most organizations.

My example connector service provides this capability and is simple to install and maintain. I have also provided the source code, so anyone interested can add features or functionality.

PFE customers may already be familiar with Boris Yanushpolsky’s Alert Update Connector. This is an example I developed that provides similar functionality. Building the service requires Visual Studio 2010 and the SCOM R2 agent installed on the system where you build the service. The SCOM agent does not have to report to a management group.

I ran this in a test environment, and it does not have to run on an RMS or MS to work. It must meet the requirements below in order to run successfully:

  • .NET 4.0 Framework
  • SCOM R2 Console (User Interface installed) or SCOM 2012 Console installed
  • Run as account that has administrative rights to the SCOM Management Group

Let’s dive in.

Select Administration – Settings in the SCOM Console and use the New… button to create new resolution states named Processed and SendAlert. Set Processed to 251; SendAlert can be 252, for example.

 


Note that the Processed (251) resolution state is hard-coded in the service. This is the resolution state the connector service assigns to alerts it has processed. Use the connector configuration file (more on this file later) to change the resolution state to the SendAlert (252) value if you want to pass an alert on to another connector that then forwards it to your ticketing system.

Create Connector Configuration file

One of the items in the alertupdaterservice.exe.config file is ConfigurationFilePath. It contains the path and name of the file that stores the alerts to modify.

Below is the text to copy and paste into your Connector.xml file.

The first alert source is an example with all of the property names you can modify: ResolutionState and CustomField1 through CustomField10.

<ConnectorConfig GlobalResolutionState="251">

  <AlertSources>

    <AlertSource Id="b59f78ce-c42a-8995-f099-e705dbb34fd4" Type="Monitor" UserName="Domain\username">

      <PropertiesToModify>

        <Property Name="ResolutionState" NewValue="252" GroupIdFilter="" />

        <Property Name="CustomField1" NewValue="$ServerName$" GroupIdFilter="" />

        <Property Name="CustomField2" NewValue="$Id$" GroupIdFilter="" />

        <Property Name="CustomField3" NewValue="$Name$" GroupIdFilter="" />

        <Property Name="CustomField4" NewValue="$ManagementGroup$" GroupIdFilter="" />

        <Property Name="CustomField5" NewValue="$MonitoringObjectDisplayName$" GroupIdFilter="" />

        <Property Name="CustomField6" NewValue="$MonitoringObjectFullName$" GroupIdFilter="" />

        <Property Name="CustomField7" NewValue="$NetbiosComputerName$" GroupIdFilter="" />

        <Property Name="CustomField8" NewValue="$NetbiosDomainName$" GroupIdFilter="" />

        <Property Name="CustomField9" NewValue="$PrincipalName$" GroupIdFilter="" />

        <Property Name="CustomField10" NewValue="$Severity$" GroupIdFilter="" />

      </PropertiesToModify>

    </AlertSource>

    <AlertSource Id="308c0379-f7f0-0a81-a947-d0dbcf1216a7" Type="Monitor" UserName="Domain\username">

      <PropertiesToModify>

        <Property Name="ResolutionState" NewValue="252" GroupIdFilter="" />

        <Property Name="CustomField1" NewValue="$ServerName$" GroupIdFilter="" />

      </PropertiesToModify>

    </AlertSource>

  </AlertSources>

</ConnectorConfig>

 

The two AlertSource configuration items above are examples of how to configure a given rule/monitor to update custom fields and change your resolution state of these alerts to SendAlert.

GroupIdFilter can be used to apply the NewValue only if the alerting object belongs to the specified group.
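To make the shape of the configuration file concrete, here is a minimal Python sketch that reads a trimmed copy of the Connector.xml above and works out which properties apply to an alert. It is an illustration only and not the service's own code: the function names are hypothetical, and the GroupIdFilter handling (an empty filter always applies, otherwise the alerting object must be a member of the group) is my assumption about the intended semantics.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the Connector.xml example. The GroupIdFilter value on
# CustomField1 is the sample group Id used elsewhere in this post.
CONNECTOR_XML = r"""
<ConnectorConfig GlobalResolutionState="251">
  <AlertSources>
    <AlertSource Id="b59f78ce-c42a-8995-f099-e705dbb34fd4" Type="Monitor" UserName="Domain\username">
      <PropertiesToModify>
        <Property Name="ResolutionState" NewValue="252" GroupIdFilter="" />
        <Property Name="CustomField1" NewValue="$ServerName$" GroupIdFilter="9e249559-e166-0e92-1bfc-fea90a63f843" />
      </PropertiesToModify>
    </AlertSource>
  </AlertSources>
</ConnectorConfig>
"""


def load_alert_sources(xml_text):
    """Return {alert source Id: [(property name, new value, group filter), ...]}."""
    root = ET.fromstring(xml_text)
    sources = {}
    for source in root.iter("AlertSource"):
        props = [
            (p.get("Name"), p.get("NewValue"), p.get("GroupIdFilter") or None)
            for p in source.iter("Property")
        ]
        sources[source.get("Id").lower()] = props
    return sources


def properties_for(sources, rule_id, member_group_ids):
    """Pick the properties that apply to an alert raised by rule_id.

    Assumed semantics: an empty GroupIdFilter always applies; otherwise the
    alerting object must belong to the named group.
    """
    return [
        (name, value)
        for name, value, group in sources.get(rule_id.lower(), [])
        if group is None or group in member_group_ids
    ]
```

Calling properties_for with an empty set of group Ids returns only the ResolutionState change; passing the group Id from the filter also returns the CustomField1 change.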

Here are the properties on the alerts that were modified using the service with the connector configuration file above.

 

 

Question:

How do I find the Id for the alerting monitors and rules that I want to modify?

You can use PowerShell, or read the management pack XML files and dig through them to find the Id of the monitor or rule. If you are a Premier customer you could use Boris's ConnectoryConfiguration tool to create your configuration file.

Open Operations Manager Shell and run this command.

get-alert -criteria 'Name = ''Health Service Heartbeat Failure'''

This works only if matching alerts currently exist in the management group.

The data returned will have a line for MonitoringRuleId.  This is the value you can specify in the AlertSource section for Id.

MonitoringRuleId : b59f78ce-c42a-8995-f099-e705dbb34fd4

 

What if you do not have any recent alerts that you can query with PowerShell to find the Id?

get-monitor -criteria 'DisplayName = ''Health Service Heartbeat Failure'''

Id                     : b59f78ce-c42a-8995-f099-e705dbb34fd4

You can use a similar command for rules with the get-rule command.

To get the GroupIdFilter you can use PowerShell to get the Group ID similarly.

get-monitoringobjectgroup

Find the group you want and find the associated Id.

Id               : 9e249559-e166-0e92-1bfc-fea90a63f843

 

I started working on a user interface tool for selecting monitors and rules and building the connector configuration file, but I have not completed it yet. I may blog about it in the future once it is finished, if I hear back that it would be useful and a good use of my time.

Editing the alertupdaterservice.exe.config file

RootManagementServerName: replace localhost with the RMS server name or one of your SCOM 2012 management servers. If the service is running on the RMS or one of your management servers, you can specify localhost.

PollingIntervalInSeconds: number of seconds between polls for new alerts to process. The default is 10 seconds.

ConfigurationFilePath: the path to your configuration file that holds the alert Ids along with the fields to update.

ExcludedResolutionStates: the value should be 255, which is the Closed resolution state.

FailoverMS: a semicolon-delimited list of management servers (if you are using SCOM 2012) that the service will attempt to connect to should the RootManagementServerName connection fail.

<?xml version="1.0" encoding="utf-8"?>

<configuration>

<appSettings>

<add key="RootManagementServerName" value="localhost" />

<add key="PollingIntervalInSeconds" value="10" />

<add key="ConfigurationFilePath" value="C:\AUCS\Connector.xml" />

<add key="ExcludedResolutionStates" value="255"/>

<add key="FailoverMS" value="SCOM2012MS1;SCOM2012MS2" />

</appSettings>

</configuration>
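As a rough illustration of the kind of checks the service makes when it loads these settings, here is a Python sketch that parses the appSettings above and validates them. This is not the service's code: load_service_settings and its error strings are hypothetical, and treating ExcludedResolutionStates as a single integer is an assumption based on the example value of 255.

```python
import xml.etree.ElementTree as ET

# Copy of the appSettings example (XML declaration omitted for brevity).
APP_CONFIG = r"""
<configuration>
  <appSettings>
    <add key="RootManagementServerName" value="localhost" />
    <add key="PollingIntervalInSeconds" value="10" />
    <add key="ConfigurationFilePath" value="C:\AUCS\Connector.xml" />
    <add key="ExcludedResolutionStates" value="255"/>
    <add key="FailoverMS" value="SCOM2012MS1;SCOM2012MS2" />
  </appSettings>
</configuration>
"""


def load_service_settings(xml_text):
    """Return (settings, errors); errors is empty when every check passes."""
    root = ET.fromstring(xml_text)
    raw = {item.get("key"): (item.get("value") or "").strip() for item in root.iter("add")}
    settings, errors = {}, []

    # Required text values.
    for key in ("RootManagementServerName", "ConfigurationFilePath"):
        if raw.get(key):
            settings[key] = raw[key]
        else:
            errors.append(key + " missing or empty")

    # Numeric values (assumption: each holds a single integer).
    for key in ("PollingIntervalInSeconds", "ExcludedResolutionStates"):
        try:
            settings[key] = int(raw.get(key, ""))
        except ValueError:
            errors.append(key + " not numeric")

    # FailoverMS is optional: a semicolon-delimited list of server names.
    settings["FailoverMS"] = [s.strip() for s in raw.get("FailoverMS", "").split(";") if s.strip()]
    return settings, errors
```

Returning a list of errors instead of raising on the first one lets every problem in the file be reported in a single pass.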

 

Installing Alert Updater Service

Run command prompt as administrator.

Change to the directory where you have your files copied, in this case c:\aucs

Run below command to install the service.

C:\AUCS>C:\Windows\Microsoft.NET\Framework64\v4.0.30319\InstallUtil.exe c:\aucs\AlertUpdaterService.exe

Note: if you get a System.BadImageFormatException error, you are trying to install the 32-bit version of the service using the 64-bit framework, or the 64-bit version of the service with the 32-bit .NET Framework.

 

Change “AlertUpdaterService” to automatic start and enter the credentials of an account that is in the SCOM administrators group.

 

 

 

Change logon credentials and save.

 

Start the Alert Updater Service.

Open the SCOM Console and select Administration – Product Connectors – Internal Connectors.

Bring up the properties of the Alert Updater Connector and select the Add… button to create the new subscription for the alerts you want this connector to subscribe to.

 

 

Enter a subscription name such as “All Alerts alert updater” and select Next.

 

 

Select all the groups you want to subscribe to and select Next.

 

Select Next to allow new targets to automatically be added when new ones are created.

 

Select the criteria you want and select Create. In this case I am selecting alerts of all severities and priorities from all categories, but only those with a resolution state of New.

 

Select Apply and OK to save your changes.

Now, in your active alerts view, you should see that most of your active alerts are in either the Processed or the SendAlert state. Alerts are created with the New resolution state. As the connector service processes them, it changes the resolution state to Processed or to the custom resolution state that you specify in your connector configuration file.

 

Service failover

What if you are running this with SCOM 2012 and the management server you are connected to fails?

Here is what happens if you have FailoverMS servers listed in your configuration file.

First you get an error event (which one depends on when the failure occurs).

In this case we get a 7010 error event.

Service failed. Please restart service: The client has been disconnected from the server. Please call ManagementGroup.Reconnect() to reestablish the connection.

Then we see a 5002 information event indicating that a failover MS is being used.

Changing connection to failover MS: SCOM2012MS1. Will attempt to connect to new MS.

Then we see 50 and 55 events indicating that the connections were established.

(50) Successfully connected to SDK service on: SCOM2012MS1

(55) Successfully established connector connection

Within a few minutes you should see a 700 informational event indicating that alerts are being processed. This is a heartbeat event showing that the service is working. You could monitor for a missed 700 event within 10 minutes and alert if it is not found, since that could indicate the service is not working.

 

Service EventCodes list

All events are logged to Application event log with event source of “AlertUpdaterService”. Below is a list of possible event codes that may be logged by the service. You could add or remove from the source to suit your needs. You can also create a custom management pack to monitor the service for running state and other errors that may occur.

10 = Successfully stopped service.

13 = Service failed to stop within timeout period

20 = Successfully Loaded Service Configuration

21 = RootManagementServer Configuration invalid - Check service configuration file for errors

22 = PollingIntervalInSeconds Configuration invalid - Check service configuration file for errors

23 = PollingIntervalInSeconds Configuration not numeric - Check service configuration file for errors

24 = ConfigurationFilePath Configuration invalid - Check service configuration file for errors

25 = ExcludedResolutionStates Configuration invalid - Check service configuration file for errors

26 = ExcludedResolutionStates Configuration not numeric - Check service configuration file for errors

29 = LoadServiceConfiguration() Catch All Failure: (+ error message)

Properties to modify configuration entries:

30 = Successfully Loaded Properties to Modify Configuration

31 = Detected Change in configuration file reloading.

32 = Problem Reloading configuration file Please restart service: (+ message)

33 = Could not configure File Watcher, so will have to restart service if you change configuration file: (+ message)

35 = LoadAlertPropertiesToModify() Configuration File NOT found: (filename)

38 = Failed to write log file for Properties to Modify Configuration: (+ error message) (Warning)

39 = LoadAlertPropertiesToModify() Catch all failure: (+ error message) (Critical)

50 = Successfully connected to SDK service on: (RMS Server Name)

51 = Failed in LoadServiceConfiguration(). Check event log for additional details. Exiting service

52 = Failed in LoadAlertPropertiesToModify(). Check event log for additional details. Exiting service

53 = Failed Loading configuration files. Check event log for additional details. Exiting service: (+ message)

55 = Successfully established connector connection

100 = Could not locate Monitoring Connector. Will attempt to create connector. (Informational)

101 = Failed to create connector Exiting: (+ Message)

102 = Failed to get monitoring connector Exiting: (+ message)

119 = Successfully created Alert Updater Connector (only occurs the first time the service runs; a 55 event is logged otherwise). Creates the connector in SCOM.

700 = Processing Alerts Now. (Can use this with missed event notification to monitor if service is healthy when running.)

710 = Failed parsing Properties:(string parsing)

711 = Failed replacing Properties updating alert(ignore most cases): (string parsing) : (error information)

5000 = There are No backup Management Servers to retry connection.

5001 = Successfully loaded list of failover Management servers:  + (Management server list read from registry HKLM\Software\AUCS\FailoverMS MSList String value)

5002 = Changing connection to failover MS:  + new ManagementServer + Will attempt to connect to new MS.

5010 = Failed last attempt to reconnect to Management Group. Fatal error check health of RMS/MS and restart service.

5011 = Failed last attempt to reconnect MS list to Management Group. Fatal error check health of RMS/MS and restart service.

7010 = Service failed. Please restart service: + (error message)

7012 = Service failed. Please restart service: + (error message)

 

Alert Parameter Replacement

Below is a list of the parameters you can use to populate custom fields with the service. Replacement parameters begin and end with ‘$’. Some of these may return null values, so you may find that you do not use them, or you may want to tweak the code to return additional or different values.

 

ReplacementParameter Name Description
$ServerName$ ServerName Computer name that the alert was generated for
$WebConsoleURL$ Link to the alert if you have SCOM Web Console
$Category$ Category Gets the category of the alert.
$ConnectorId$ ConnectorId Gets or sets the globally unique identifier (GUID) for the connector corresponding to the monitoring alert.
$ConnectorStatus$ ConnectorStatus Gets the status of this alert relative to the connector.
$Context$ Context Gets the context of the alert.
$CustomField1$ CustomField1 Gets or sets the value of the custom field 1 for the alert.
$CustomField10$ CustomField10 Gets or sets the value of the custom field 10 for the alert.
$CustomField2$ CustomField2 Gets or sets the value of the custom field 2 for the alert.
$CustomField3$ CustomField3 Gets or sets the value of the custom field 3 for the alert.
$CustomField4$ CustomField4 Gets or sets the value of the custom field 4 for the alert.
$CustomField5$ CustomField5 Gets or sets the value of the custom field 5 for the alert.
$CustomField6$ CustomField6 Gets or sets the value of the custom field 6 for the alert.
$CustomField7$ CustomField7 Gets or sets the value of the custom field 7 for the alert.
$CustomField8$ CustomField8 Gets or sets the value of the custom field 8 for the alert.
$CustomField9$ CustomField9 Gets or sets the value of the custom field 9 for the alert.
$Description$ Description Gets the description of the alert.
$Id$ Id Overridden. Gets the globally unique identifier (GUID) for the alert.
$IsMonitorAlert$ IsMonitorAlert Gets a Boolean value that determines whether the alert was generated by a monitor.
$LastModified$ LastModified Gets the last time, in DateTime format, that the alert was modified.
$LastModifiedBy$ LastModifiedBy Gets the name of the user that last modified the alert.
$LastModifiedByNonConnector$ LastModifiedByNonConnector Gets the last time, in DateTime format, the alert was modified by something other than a connector.
$MaintenanceModeLastModified$ MaintenanceModeLastModified Gets the time, in DateTime format, that the maintenance mode of this alert was last modified.
$ManagementGroup$ ManagementGroup Gets the Management Group that the object is in. (inherited from MonitoringBase)
$ManagementGroupId$ ManagementGroupId Gets the globally unique identifier (GUID) for the Management Group that the object is in. (inherited from MonitoringBase)
$ManagementGroupName$ Management Group name from the alert
$MonitoringClassId$ MonitoringClassId Gets the globally unique identifier (GUID) of the non-abstract monitoring class of the associated monitoring object.
$MonitoringObjectDisplayName$ MonitoringObjectDisplayName Gets the display name of the monitoring object that is associated with the alert.
$MonitoringObjectFullName$ MonitoringObjectFullName Gets the full name of the monitoring object that is associated with the alert.
$MonitoringObjectHealthState$ MonitoringObjectHealthState Gets the health state of the monitoring object associated with this alert.
$MonitoringObjectId$ MonitoringObjectId Overridden. Gets the globally unique identifier (GUID) for the monitoring object that is associated with this alert.
$MonitoringObjectInMaintenanceMode$ MonitoringObjectInMaintenanceMode Gets a value indicating whether the monitoring object associated with the alert is in maintenance mode.
$MonitoringObjectName$ MonitoringObjectName Gets the name of the monitoring object that is associated with this alert.
$MonitoringObjectPath$ MonitoringObjectPath Gets the path to the monitoring object that is associated with this alert.
$MonitoringRuleId$ MonitoringRuleId Overridden. Gets the globally unique identifier (GUID) for the rule associated with the alert.
$Name$ Name Gets the name of the alert.
$NetbiosComputerName$ NetbiosComputerName Gets the name of the computer that raised this alert.
$NetbiosDomainName$ NetbiosDomainName Gets the domain of the computer that raised this alert.
$Owner$ Owner Gets or sets the owner of the alert.
$Parameters$ Parameters Gets a collection of parameters for the alert.
$PrincipalName$ PrincipalName Gets the principal name of the computer that this alert was created for.
$Priority$ Priority Gets the priority of the alert.
$ProblemId$ ProblemId Gets the globally unique identifier (GUID) of the associated monitor if the IsMonitorAlert property is true; otherwise, gets or sets the GUID for the problem.
$RepeatCount$ RepeatCount Gets the repeat count of this alert.
$ResolutionState$ ResolutionState Gets the resolution state of the alert.
$ResolvedBy$ ResolvedBy Gets the user who resolved this alert.
$Severity$ Severity Gets the severity of the alert.
$SiteName$ SiteName Gets the site name of the alert.
$StateLastModified$ StateLastModified Gets the time, in DateTime format, that the state of this alert was last modified.
$TicketId$ TicketId Gets or sets a string identifier for the ticket of the alert.
$TimeAdded$ TimeAdded Gets the time, in DateTime format, the alert was added to the system.
$TimeRaised$ TimeRaised Gets the time, in DateTime format, the alert was raised.
$TimeResolutionStateLastModified$ TimeResolutionStateLastModified Gets the last time, in DateTime format, the resolution state of the alert was modified.
$TimeResolved$ TimeResolved Gets the time, in DateTime format, the alert was resolved.
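
To show how replacement parameters like those in the table above can be expanded, here is a small Python sketch. It is only an illustration under my own assumptions, not the service's implementation: the expand function is hypothetical, the alert is modelled as a plain dictionary keyed by the names in the table, null values become empty strings, and unknown parameters are left untouched.

```python
import re

# A replacement parameter is a name wrapped in dollar signs, e.g. $ServerName$.
TOKEN = re.compile(r"\$([A-Za-z0-9]+)\$")


def expand(template, alert):
    """Replace each $Name$ token in template with the matching alert value."""

    def substitute(match):
        name = match.group(1)
        if name not in alert:
            return match.group(0)  # unknown parameter: leave the token as-is
        value = alert[name]
        return "" if value is None else str(value)

    return TOKEN.sub(substitute, template)
```

For example, expand("$ServerName$ / $Severity$", {"ServerName": "web01.contoso.com", "Severity": "Error"}) gives "web01.contoso.com / Error".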

 

Example Source Code

I am including the example service along with example source code to allow you to see how a SCOM connector could be developed.

I created this project with Visual Studio 2010.

Note that I built this on a system that had the SCOM 2007 R2 CU5 updated Console installed. This is required to build the service, as it uses the SCOM SDK binaries located in the C:\Program Files\System Center Operations Manager 2007\SDK Binaries directory (or on the drive you installed the console to).

While this was built using the SCOM 2007 SDK binaries, I did test successfully on SCOM 2007 R2 and SCOM 2012 RC1.

 

Extract the files from the zip file and run Visual Studio 2010. Locate the AlertUpdaterService solution file and open it.

When the service is started, the OnStart method sets the global variable bStopped to false, indicating that the service is running. It then creates a new thread that contains the alert updater logic and exits.

The OnStop method sets the bStopped variable to true so that the ConnectorMain thread knows it should stop processing. OnStop then waits for the thread to complete processing before it exits, so the service can stop running.
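
The start/stop pattern described above can be sketched in Python as follows. This is an analogy, not a port of the .NET source: the class and method names are hypothetical, a threading.Event stands in for the bStopped flag, and the actual alert processing is replaced with a counter.

```python
import threading


class AlertUpdaterSketch:
    """Worker thread controlled by a stop flag, in the spirit of OnStart/OnStop."""

    def __init__(self, poll_seconds=10.0):
        self._stopped = threading.Event()  # plays the role of bStopped
        self._poll_seconds = poll_seconds
        self._worker = None
        self.iterations = 0

    def on_start(self):
        """Clear the stop flag, start the worker thread, and return at once."""
        self._stopped.clear()
        self._worker = threading.Thread(target=self._connector_main, daemon=True)
        self._worker.start()

    def on_stop(self, timeout=30.0):
        """Set the stop flag and wait for the worker; True if it exited in time."""
        self._stopped.set()
        self._worker.join(timeout)
        return not self._worker.is_alive()

    def _connector_main(self):
        while not self._stopped.is_set():
            self.iterations += 1  # placeholder for one polling pass
            # Sleep for the polling interval, but wake early when stopping.
            self._stopped.wait(self._poll_seconds)
```

Waiting on the event instead of sleeping means a stop request does not have to wait out the rest of the polling interval.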

The service initializes a file watcher on the configuration file that contains the alert information to update. Should the watcher fail to initialize, the service logs an event indicating that you have to restart the service for any changes to take effect. Normally the service reloads the configuration file when you change it and logs events indicating that the reload occurred.

The ConnectorMain method loads the service configuration and the alert configuration data to get the management server name to connect to and the other settings needed to run the service. Should it fail to load the data, the service will not run, and you should review your configuration files to fix the errors.

 

Should the service lose its connection to SCOM, it will retry the connection against the Root Management Server if running on SCOM 2007 R2. If this is SCOM 2012 and you specified a FailoverMS list in the configuration file, it will attempt to connect to the servers in that list in the order specified. Should it fail to connect to any failover MS in your list, it will attempt to connect to the original RMS/MS one more time before failing out.

It also resets the connection retry count if the last retry was more than 30 minutes ago. You can change this value by modifying AddMinutes(30) in the source to the value you want and rebuilding the project.

If you are running SCOM 2007 R2, you could populate the FailoverMS list with your RMS server name, delimited with ;, repeated as many times as you would like the retry to occur before the service fails.
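
The reconnect order and the retry-count reset described above can be sketched in Python like this. It is a simplified model and not the service's code: reconnect, RetryCounter, and the connect callback are hypothetical names, and a failed connection is represented by ConnectionError.

```python
from datetime import datetime, timedelta


def reconnect(original_server, failover_servers, connect):
    """Try each failover server in order, then the original server once more.

    connect(name) is a caller-supplied function that returns a connection or
    raises ConnectionError. Returns (server name, connection).
    """
    last_error = None
    for server in list(failover_servers) + [original_server]:
        try:
            return server, connect(server)
        except ConnectionError as exc:
            last_error = exc
    raise ConnectionError("no management server reachable: %s" % last_error)


class RetryCounter:
    """Counts retries, starting over when the last retry is older than the window."""

    def __init__(self, window=timedelta(minutes=30)):
        self.window = window
        self.count = 0
        self.last_retry = None

    def record(self, now):
        """Register a retry at time now and return the current count."""
        if self.last_retry is not None and now - self.last_retry > self.window:
            self.count = 0
        self.count += 1
        self.last_retry = now
        return self.count
```

With an empty failover list, reconnect simply retries the original server once, which matches the SCOM 2007 R2 case.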

 

In CallMain we connect to the management group and get the connector, or create it if it is not found. The results are logged to the event log should anything fail. Once we have the connector, we use it to get any alerts that it subscribes to.

 

 

I added a heartbeat that writes to the event log. It can be used with a missed event monitor to alert when the service is not processing alerts. You can comment this out, or use it as a monitoring event to help validate that the service is still actively processing alerts. It logs an event approximately every two minutes.
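
A heartbeat like this boils down to a small interval check inside the polling loop. Here is a Python sketch of that idea, assuming a two-minute interval. The Heartbeat class is hypothetical and not taken from the service source, and the caller supplies the current time in seconds (for example from time.monotonic()).

```python
class Heartbeat:
    """Decides when the next heartbeat event is due."""

    def __init__(self, interval_seconds=120.0):
        self.interval_seconds = interval_seconds
        self._last_beat = None

    def due(self, now):
        """Return True, and restart the interval, when a heartbeat should be logged."""
        if self._last_beat is None or now - self._last_beat >= self.interval_seconds:
            self._last_beat = now
            return True
        return False
```

In a polling loop you would call due() on every pass and write the 700 event only when it returns True.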

 

 

Get the web console URL from SCOM and acknowledge alerts so we do not process them again. Then we loop on all the alerts that the connector received since last iteration.

We skip processing the excluded resolution state.

We also populate the server name should server name be a replacement parameter that the connector needs to update.
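
The per-alert loop described above can be modelled in Python roughly as follows. This is a sketch under assumptions, not the actual source: alerts are plain dictionaries, process_batch and the overrides mapping are hypothetical, and 251 is the Processed resolution state mentioned earlier in this post.

```python
PROCESSED = 251  # Processed resolution state used by the service


def process_batch(alerts, excluded_states, overrides):
    """Update each alert in place and return the list of alerts that were updated.

    alerts:          list of dicts with MonitoringRuleId and ResolutionState keys
    excluded_states: set of resolution states to skip (for example {255})
    overrides:       {rule Id: {property name: new value}} from the configuration
    """
    updated = []
    for alert in alerts:
        if alert["ResolutionState"] in excluded_states:
            continue  # for example, closed alerts are left alone
        changes = {"ResolutionState": PROCESSED}
        # Configured properties win over the default Processed state.
        changes.update(overrides.get(alert["MonitoringRuleId"], {}))
        alert.update(changes)
        updated.append(alert)
    return updated
```

Alerts with no matching entry in the configuration still move to Processed, which is what keeps them from being picked up again by a subscription that only selects alerts in the New state.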

 

 

We populate string values should any replacement parameters be specified.

 

 

Here we set custom fields with the replacement parameter specified.

 

 

We now log events to the event log indicating status, set the alert's connector Id to null, and update the alert with a string stating whether it was updated with or without changes. That string shows up in the alert's properties history.

 

I hope you enjoy this example SCOM connector service and let me know if anyone is interested in a User interface tool to configure your connector.xml configuration file with alerts and replacement parameters. If I have time I will work on that example next.

Disclaimer:

This example is provided “AS IS” with no warranty expressed or implied. Run at your own risk. The opinions and views expressed in this blog are those of the author and do not necessarily state or reflect those of Microsoft.

 

AlertUpdaterServiceBlog.zip