Programmatically resetting SCOM Monitors

Resetting SCOM monitors programmatically through the SDK is useful for several purposes, such as:

  • Resetting old (outdated) monitor states.
    For very old monitor states there might be no alert available anymore (because someone closed the alert without fixing the root cause; see the post on Alert management scenarios), and maybe not even any state change events. Resetting this monitor will either generate a new alert (if the issue still exists) or return the MonitoringObject to a healthy state.
  • Explicitly resetting a monitor, e.g. as part of a SCOM alert housekeeping script.

There are a lot of blog posts and scripts available (see the scenario section below) that already cover this topic. One of my favorite posts on it was written by SCOMURR. But none of these posts really explains the necessary components, their relationships and the required .NET methods. That's what I want to catch up on in this post.

Reset workflow

What happens when we reset a monitor?

[Figure: HealthStateReset_Workflow]

When we force a monitor instance to reset its state:

  • a specific reset task will be created
  • the task will be sent to the hosting HealthService (Agent)
  • the HealthService tries to reset the monitor instance and
  • returns the result to the caller

As you can see from the code examples below, we can retrieve the result of this task to know exactly what happened and whether the reset was successful.
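To illustrate how such a task result might be evaluated, here is a minimal sketch. It assumes that the returned object exposes a `Status` property (with values such as `Succeeded` or `Failed`) and an `ErrorMessage` property; please verify this against your SDK version. `Get-ResetOutcome` is a hypothetical helper name and not part of the SDK.

```powershell
# Hypothetical helper: condenses one or more task results into a short summary.
# Assumption: each result has a Status property (e.g. 'Succeeded', 'Failed')
# and an ErrorMessage property. Check this against your SDK version.
function Get-ResetOutcome {
    param(
        [Parameter(Mandatory = $true)]
        [object[]] $TaskResult
    )

    foreach ($result in $TaskResult) {
        # Casting to string lets the comparison work for enum values and plain strings alike
        $status = [string]$result.Status
        [pscustomobject]@{
            Succeeded = ($status -eq 'Succeeded')
            Status    = $status
            Error     = [string]$result.ErrorMessage
        }
    }
}
```

A call like `Get-ResetOutcome -TaskResult $MonitoringTaskResult` only makes sense if a reset was actually triggered, so check that the variable is not empty first.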

 

What kind of monitor do we need to reset?

Before we start: Which kind of monitors should we deal with? SCOM uses different kinds of monitors:

  • Unit monitors
  • Dependency rollup monitors
  • Aggregate (rollup) monitors

My personal opinion on this might be debatable and I am happy to hear your feedback. IMHO it should be sufficient to reset only unit monitors and let SCOM reset the dependency and aggregate rollup monitors itself during its normal processing. But if you want to reset rollup monitors, you can use the exact same methods to do so.
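If you want to restrict a script to unit monitors, a small filter can do the job. The following is only a sketch: it assumes that monitor objects expose an `XmlTag` property with the values `UnitMonitor`, `AggregateMonitor` or `DependencyMonitor`, and `Select-UnitMonitor` is a hypothetical name.

```powershell
# Hypothetical filter: passes through unit monitors only.
# Assumption: monitor objects expose an XmlTag property whose value is
# 'UnitMonitor', 'AggregateMonitor' or 'DependencyMonitor'.
filter Select-UnitMonitor {
    if ([string]$_.XmlTag -eq 'UnitMonitor') {
        $_
    }
}
```

Typical usage would be `Get-SCOMMonitor | Select-UnitMonitor`. A type check against the SDK's `UnitMonitor` class should work as well; the property comparison is used here because it does not require the SDK assemblies to be loaded.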

Can all (unit) monitors be reset?

That's a very good question. Generally yes, but there are a few exceptions. In some special situations we have non-hosted objects that are monitored agentlessly. When we try to reset such a monitor, the Management Server might have problems sending the necessary task to the right HealthService. Windows cluster virtual server objects are a good example. The monitor "Free space%" for Windows cluster disks cannot be reset manually due to these special circumstances.
So the answer to the question is: No, there are a few monitors that cannot be reset manually!

 

Components needed to reset a monitor

[Figure: HealthStateResetComponents]

As you can see from the picture, you need at least 3 objects to reset a monitor:

  • MonitoringObject (class instance)
    You need the specific class instance for which you want to reset a monitor.
  • Monitor
    You need the specific monitor instance whose health state you want to reset.
  • MonitorState
    You need the current state(s) of the monitor instance for the specific MonitoringObject to check whether this monitor is in an unhealthy state. Knowing how long it has been in this state can also be quite useful.
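The check on the MonitorState can be wrapped in a small helper. This is a minimal sketch that only relies on the `HealthState` and `LastTimeModified` properties used in the examples of this post; `Test-MonitorNeedsReset` and its parameter names are hypothetical.

```powershell
# Hypothetical helper: decides whether a monitor state qualifies for a reset.
# It only reads the HealthState and LastTimeModified properties of the state object.
function Test-MonitorNeedsReset {
    param(
        [Parameter(Mandatory = $true)]
        $MonitorState,

        # States last modified at or before this point in time count as outdated
        [Parameter(Mandatory = $true)]
        [datetime] $OlderThan
    )

    $isUnhealthy = 'Error', 'Warning' -contains [string]$MonitorState.HealthState
    $isOutdated  = $MonitorState.LastTimeModified -le $OlderThan

    return ($isUnhealthy -and $isOutdated)
}
```

For example, `Test-MonitorNeedsReset -MonitorState $MonitorState -OlderThan (Get-Date).AddDays(-7)` returns `$true` only for Error or Warning states that have not changed for at least a week. Make sure both timestamps refer to the same time zone.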

 

Options to reset a monitor

[Figure: HealthStateResetOptions]

From the picture you can see that there are at least two different ways of resetting a specific monitor:

Option 1: By using the ResetMonitoringState($Monitor) method of the MonitoringObject

# Get monitor
$monitor = Get-SCOMMonitor -Id $MyMonitorID

# Get monitoring object
$monitoringObject = Get-SCOMMonitoringObject -Id $MyMonitoringObjectId

# Create ManagementPackMonitor collection, needed by the GetMonitoringStates method
$MonitorsToReset = New-Object "System.Collections.Generic.List[Microsoft.EnterpriseManagement.Configuration.ManagementPackMonitor]"
$MonitorsToReset.Add($monitor)

# Get the newest MonitorState
$MonitorState = $monitoringObject.GetMonitoringStates($MonitorsToReset)[0]

# Reset only if the monitor is unhealthy and its state is older than the custom timestamp
if (($MonitorState.HealthState -eq "Error" -or $MonitorState.HealthState -eq "Warning") -and ($MonitorState.LastTimeModified -le $dtmCustomTimeStamp))
{
    $MonitoringTaskResult = $monitoringObject.ResetMonitoringState($monitor)
}
# $MonitoringTaskResult contains detailed information about the reset

Option 2: By using the Reset($ResetTime) method of the MonitoringState object

# Get monitor
$monitor = Get-SCOMMonitor -Id $MyMonitorID

# Get monitoring object
$monitoringObject = Get-SCOMMonitoringObject -Id $MyMonitoringObjectId

# Create ManagementPackMonitor collection, needed by the GetMonitoringStates method
$MonitorsToReset = New-Object "System.Collections.Generic.List[Microsoft.EnterpriseManagement.Configuration.ManagementPackMonitor]"
$MonitorsToReset.Add($monitor)

# Get the newest MonitorState
$MonitorState = $monitoringObject.GetMonitoringStates($MonitorsToReset)[0]

# Reset only if the monitor is unhealthy and its state is older than the custom timestamp
if (($MonitorState.HealthState -eq "Error" -or $MonitorState.HealthState -eq "Warning") -and ($MonitorState.LastTimeModified -le $dtmCustomTimeStamp))
{
    $MonitoringTaskResult = $MonitorState.Reset($ResetTimeOutSeconds)
}
# $MonitoringTaskResult contains detailed information about the reset

 

Monitor reset scenarios

Based on the reset options defined above, you can now imagine a multitude of different monitor reset scenarios:

Based on a specific monitor

  • Get the specific monitor
  • Get the target class
  • Get all instances of that class
  • Get the HealthState for this particular instance of the monitor
    • If it is unhealthy (and optionally matches other criteria), reset the monitor using one of the two provided options

A function that resets all instances of a specific monitor can be found in the TechNet Gallery.
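The steps above can be sketched as follows. This is not the function from the gallery but a minimal illustration under some assumptions: `Reset-StaleMonitorState` and its parameter names are hypothetical, the caller has already retrieved the class instances, and the function only uses the two methods shown under option 1 (`GetMonitoringStates` and `ResetMonitoringState`). It requires PowerShell 3.0 or later.

```powershell
# Hypothetical sketch: resets one monitor on all passed instances whose state
# is unhealthy and has not changed since the given point in time.
function Reset-StaleMonitorState {
    param(
        [Parameter(Mandatory = $true)] [object[]] $Instance,   # class instances (MonitoringObjects)
        [Parameter(Mandatory = $true)] $Monitor,               # the monitor to reset
        [Parameter(Mandatory = $true)] $MonitorList,           # collection containing $Monitor, as passed to GetMonitoringStates
        [Parameter(Mandatory = $true)] [datetime] $OlderThan
    )

    foreach ($obj in $Instance) {
        # Current state of the monitor for this particular instance
        $state = $obj.GetMonitoringStates($MonitorList)[0]
        if ($null -eq $state) { continue }

        $isUnhealthy = 'Error', 'Warning' -contains [string]$state.HealthState
        if ($isUnhealthy -and $state.LastTimeModified -le $OlderThan) {
            # Emit one object per reset so the caller can inspect the task result
            [pscustomobject]@{
                Instance = $obj.DisplayName
                Result   = $obj.ResetMonitoringState($Monitor)
            }
        }
    }
}
```

Passing the instances in as a parameter keeps the sketch independent of how they are retrieved; `$Monitor` and `$MonitorList` would be the `$monitor` and `$MonitorsToReset` variables from the option 1 example.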

Based on a specific alert

  • Get the Alert
  • From the Alert object get the monitor object and the class instance (MonitoringObject)
  • Check, if MonitoringObject is available
  • Get the HealthState for this particular instance of the monitor
    • If it is unhealthy (and optionally matches other criteria), reset the monitor using one of the two provided options

A function that resets all unhealthy monitors for closed alerts can be found in the TechNet Gallery.
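As a rough sketch of the first two steps, the following filter extracts the monitor ID and the MonitoringObject ID from closed monitor alerts. It assumes that alert objects expose the properties `IsMonitorAlert`, `ResolutionState`, `MonitoringRuleId` (which holds the monitor ID for monitor-based alerts) and `MonitoringObjectId`; `Select-ClosedMonitorAlertTarget` is a hypothetical name and not the gallery function.

```powershell
# Hypothetical filter: turns closed monitor alerts into ID pairs for a reset.
# Assumption: alerts expose IsMonitorAlert, ResolutionState, MonitoringRuleId
# (the monitor ID for monitor-based alerts) and MonitoringObjectId.
filter Select-ClosedMonitorAlertTarget {
    # Resolution state 255 means "Closed"
    if ($_.IsMonitorAlert -and $_.ResolutionState -eq 255) {
        [pscustomobject]@{
            MonitorId          = $_.MonitoringRuleId
            MonitoringObjectId = $_.MonitoringObjectId
        }
    }
}
```

The resulting IDs can then be passed to `Get-SCOMMonitor -Id` and `Get-SCOMMonitoringObject -Id` as shown in the two options above. Alerts generated by rules are skipped because there is no monitor to reset for them.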

Based on a specific class instance (aka MonitoringObject)

  • Get the MonitoringObject based on the ID
  • Check, if MonitoringObject is available
  • Get the Monitors related to the instance and keep only the unit monitors
    • Get the HealthState for each particular instance of the monitor
    • If it is unhealthy (and optionally matches other criteria), reset the monitor using one of the two provided options

Based on all or a specific class

  • Get the specific class (or all classes), excluding singleton classes (i.e. Groups)
  • For each class get its class instances
    • Check, if MonitoringObject is available (actively monitored)
    • Get the Monitors related to the instance and class and keep only the unit monitors
      • Get the HealthState for each particular instance of the monitor
      • If it is unhealthy (and optionally matches other criteria), reset the monitor using one of the two provided options

This scenario is covered by the PS function reset-customscomunitmonitorforclass. You can find this function in the TechNet Gallery.