How grooming and auto-resolution work in the OpsMgr 2007 Operational database




Here is a simplified view of how alerts are groomed.

 

Grooming of the Operations DB is called once per day at 12:00am by the rule “Partitioning and Grooming”.  You can search for this rule in the Authoring space of the console, under Rules.  It is targeted to the “Root Management Server” and is part of the System Center Internal Library.

 

It calls the “p_PartitioningAndGrooming” stored procedure, which calls p_Grooming, which calls p_GroomNonPartitionedObjects (alerts are not partitioned).  That procedure inspects the PartitionAndGroomingSettings table and executes each stored procedure listed there.  The alerts stored procedure in that table is p_AlertGrooming, which contains the following SQL statement:


    SELECT AlertId INTO #AlertsToGroom
    FROM dbo.Alert
    WHERE TimeResolved IS NOT NULL
    AND TimeResolved < @GroomingThresholdUTC
    AND ResolutionState = 255

 

 

So the criteria for what is groomed are pretty simple: the alert must be in a resolution state of “Closed” (255), and it must have been resolved longer ago than the grooming threshold, which is 7 days by default (or your custom setting, stored in the table above).

 

We won’t groom any alerts that are in New (0) or in any custom resolution state (custom ID #).  Those first have to be set to “Closed” (255), either by a monitor returning to healthy and auto-resolving its alert, by direct user interaction, by our built-in auto-resolution mechanism, or by your own custom script.
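As a rough illustration of these criteria, the following Python sketch mirrors the three conditions in the p_AlertGrooming SELECT.  It is only a model, not how OpsMgr is implemented: the Alert class, the is_groomable function and the retention_days parameter are hypothetical names, and it assumes the grooming threshold is simply "now minus the retention period".

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

CLOSED = 255  # resolution state "Closed"
NEW = 0       # resolution state "New"


@dataclass
class Alert:
    """Hypothetical in-memory stand-in for a row of dbo.Alert."""
    resolution_state: int
    time_resolved: Optional[datetime]  # None models a NULL TimeResolved


def is_groomable(alert: Alert, now_utc: datetime, retention_days: int = 7) -> bool:
    """Mirror the three WHERE conditions of the grooming SELECT."""
    # Assumption: the threshold is "now minus the retention period".
    threshold = now_utc - timedelta(days=retention_days)
    return (
        alert.time_resolved is not None
        and alert.time_resolved < threshold
        and alert.resolution_state == CLOSED
    )


now = datetime(2010, 1, 31)
old_closed = Alert(CLOSED, now - timedelta(days=10))
recent_closed = Alert(CLOSED, now - timedelta(days=2))
old_new = Alert(NEW, None)

print(is_groomable(old_closed, now))     # True
print(is_groomable(recent_closed, now))  # False
print(is_groomable(old_new, now))        # False
```

Only the first alert qualifies: it is closed and was resolved more than 7 days ago.  The alert still in New is never selected, no matter how old it is.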

 

Ok – that covers grooming.


However – I can see that brings up the question – how does auto-resolution work?

 


 


 

 


The auto-resolution settings in the console UI specifically state “alerts in the new resolution state”.  I don’t think that is completely correct:


Auto-resolution is driven by the rule “Alert Auto Resolve Execute All”, which runs p_AlertAutoResolveExecuteAll once per day at 4:00am.  This calls p_AlertAutoResolve twice: once with a parameter value of “0” and once with “1”.


Here is the SQL statement:


IF (@AutoResolveType = 0)
    BEGIN
        SELECT @AlertResolvePeriodInDays = [SettingValue]
        FROM dbo.[GlobalSettings]
        WHERE [ManagedTypePropertyId] = dbo.fn_ManagedTypePropertyId_MicrosoftSystemCenterManagementGroup_HealthyAlertAutoResolvePeriod()

        SET @AutoResolveThreshold = DATEADD(dd, -@AlertResolvePeriodInDays, getutcdate())
        SET @RootMonitorId = dbo.fn_ManagedTypeId_SystemHealthEntityState()

        -- We will resolve all alerts that have green state and are un-resolved
        -- and haven't been modified for N number of days.
        INSERT INTO @AlertsToBeResolved
        SELECT A.[AlertId]
        FROM dbo.[Alert] A
        JOIN dbo.[State] S
            ON A.[BaseManagedEntityId] = S.[BaseManagedEntityId] AND S.[MonitorId] = @RootMonitorId
        WHERE A.[LastModified] < @AutoResolveThreshold
        AND A.[ResolutionState] <> 255
        AND S.[HealthState] = 1

 

<snip>

 

    ELSE IF (@AutoResolveType = 1)
    BEGIN
        SELECT @AlertResolvePeriodInDays = [SettingValue]
        FROM dbo.[GlobalSettings]
        WHERE [ManagedTypePropertyId] = dbo.fn_ManagedTypePropertyId_MicrosoftSystemCenterManagementGroup_AlertAutoResolvePeriod()

        SET @AutoResolveThreshold = DATEADD(dd, -@AlertResolvePeriodInDays, getutcdate())

        -- We will resolve all alerts that are un-resolved
        -- and haven't been modified for N number of days.
        INSERT INTO @AlertsToBeResolved
        SELECT A.[AlertId]
        FROM dbo.[Alert] A
        WHERE A.[LastModified] < @AutoResolveThreshold
        AND ResolutionState <> 255


 


So we are basically checking that the resolution state DOES NOT EQUAL 255, not specifically that it is “New” (0), as the wording in the UI would lead you to believe.  Alerts in custom resolution states are auto-resolved too.

 

Then, there are simply two types of auto-resolution: 

 

  1. Resolve ALL alerts, no matter what the source (rule or monitor), as long as they have not been modified in the last 30 days (30 days is the default value).
  2. Resolve all MONITOR based alerts where the targeted object has returned to a healthy state and the alert has not been modified in the last 7 days (7 days is the default value).
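The two passes can be sketched as a single predicate.  This is a simplified Python model under stated assumptions, not the product's code: the Alert class, should_auto_resolve, and the healthy_days and all_days parameters are hypothetical names, the health check follows the HealthState = 1 join in the excerpt above, and whatever sits in the snipped part of the stored procedure is not modeled.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

CLOSED = 255   # resolution state "Closed"
HEALTHY = 1    # the HealthState value the sproc compares against
UNHEALTHY = 3  # assumption: any value other than 1 behaves the same here


@dataclass
class Alert:
    """Hypothetical stand-in for a dbo.Alert row plus its entity's health."""
    resolution_state: int
    last_modified: datetime
    entity_health_state: int


def should_auto_resolve(alert: Alert, now_utc: datetime,
                        healthy_days: int = 7, all_days: int = 30) -> bool:
    """Return True if either auto-resolve pass would select this alert."""
    if alert.resolution_state == CLOSED:
        return False  # both passes require ResolutionState <> 255
    age = now_utc - alert.last_modified
    # Pass "0": entity is healthy and the alert is untouched for healthy_days.
    if alert.entity_health_state == HEALTHY and age > timedelta(days=healthy_days):
        return True
    # Pass "1": any open alert untouched for all_days, regardless of health.
    return age > timedelta(days=all_days)


now = datetime(2010, 3, 1)
examples = {
    "healthy, custom state 55, untouched 10 days": Alert(55, now - timedelta(days=10), HEALTHY),
    "unhealthy, New, untouched 10 days": Alert(0, now - timedelta(days=10), UNHEALTHY),
    "unhealthy, New, untouched 45 days": Alert(0, now - timedelta(days=45), UNHEALTHY),
    "healthy, already Closed, untouched 45 days": Alert(CLOSED, now - timedelta(days=45), HEALTHY),
}
for label, alert in examples.items():
    print(label, "->", should_auto_resolve(alert, now))
```

The first example shows that a custom resolution state does not protect an alert.  The third shows that an alert on a still-unhealthy object is closed anyway once it has gone untouched for 30 days.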

 

Comments (19)

  1. Kevin Holman says:

    @Keithk2 –

    On monitors – these get autoresolved just like rules (30 days) – except if the alert source (the monitor state) is healthy, then 7 days.

    You are correct – in that it creates a problem if a monitor based alert is resolved after 30 days – and the alert source (the monitor) is still unhealthy.  I suppose, if you haven't fixed the root cause of the monitor state after 30 days…. the shame is on you, not the autoresolution.  🙂

    Yes – sometimes what generates notifications will be modified in a service pack, R2 update, or even a CU.  I know that based on customer feedback and requests, there have been some changes made here.  However, if you don't subscribe to closed alerts, then you won't get alert notifications about them being closed.

  2. Anonymous says:

    Kevin, most of the monitors have a parameter called "Auto-Resolve Alert". When it's set to False, the monitor's alert in New state should NOT be automatically resolved after the source becomes healthy, or resolved after 30 days when the source remains unhealthy, should it?  I assume the parameter overrides the system "Auto-Resolve Alerts" settings.  Thanks for your feedback.

  3. Anonymous says:

    This appears to be present up to RC-SP1 version, build 6.0.6246.0   In the Task Status console view

  4. Keithk2 says:

    One other thing.  We just started getting email notifications from alerts resolved by "auto-resolve" only in the last few weeks.  A few weeks prior to that we went to CU5 from CU1.  I know you have pretty good knowledge of these updates in CU5, therefore can you confirm any connections?  

  5. Kevin Holman says:

    Auto-resolve setting of a monitor has no bearing.  These stored procs will modify the alerts as designed based on the criteria specified above.

  6. Keithk2 says:

    Kevin,

    Sounds good for rules, but monitors?  Correct me if I am wrong, but if a monitor's alert gets resolved while the state of the monitor is still unhealthy, then the alert won't get raised again even though the monitor is still unhealthy.  This sounds like a problem for the same reason we should never manually close an alert for a monitor, since the health state is not reset.  Appreciate clarification on this.

    Thanks in advance Kevin,

    Keith

  7. Kevin Holman says:

    @Suresh –

    No – you misunderstand.  We never groom alerts in any active resolution state.  We ONLY groom alerts in resolution state 255 (closed).  That should be CRYSTAL clear from the first half of this article.  

    We RESOLVE alerts (set the alert to 255) based on the criteria in the article.  THEN, the grooming will groom it out based on your alert retention setting in the console (7 days by default).

  8. Kevin Holman says:

    @shashikanth –

    This is one of the key differences between monitors and rules.  Monitors have a health state, and when the state goes back to healthy – there is a link between the monitor state and any alert generated – where the default behavior is to auto-close the alert if the monitor that generated it returns to healthy.

    However, there is no "auto-close" of alerts from rules, since there is no health state association.  Alerts from rules are generated by a write action within the rule definition.  We do EVENTUALLY auto-resolve alerts from rules, as long as they meet the retention criteria and have not been modified in that time.  Our default for rules is to autoresolve them after 30 days of zero attention/modification, which is described above.  This is also configurable.  The idea is that if you haven't done anything about the issue after a month, then you probably aren't going to, so we will close the alert in order to free up resources and de-clutter the console.

  9. Anonymous says:

    This is a continuation of my other post, on general alert grooming: How grooming and auto-resolution

  10. Keithk2 says:

    Thanks Kevin for the clarification.  Agreed on the first point, but how nice would it be if it could reset health as well, or even have auto-resolution settings for rules and monitors separately.  Good to know what to expect though at this point and appreciate your timely feedback!

  11. Kevin Holman says:

    Well, I don’t know of any reasons that you couldn’t change the time on that workflow… auto-resolution isn’t really a big deal. That need makes perfect sense.

  12. Kevin Holman says:

    What would be the reason you’d want to change that? In general, auto-resolving alerts should be a relatively quick SQL operation, and I am not sure under what context one would need to change this.

  13. Anonymous says:

    Thanks for educating me Kevin!!!

    Need some clarification on your statement "Resolve all alerts where the object has returned to a healthy state in “N” days".

    Do you mean the agent health for a particular server, or the health of the monitor or rule for that alert? I am asking because, if the agent is healthy, monitors will change the alert state to closed automatically once the issue is resolved. At the same time, this is not possible with rules.

    Could you please clarify this statement?

    Thanks,

    Suresh

  14. susaa says:

    Kevin,

    Good Day!!!

    Can you confirm if this document is applicable for SCOM 2007 R2?

    From the doc(as per above screenshot), I understand 1. Any alert which is set to Resolution State (0, 255 and custom states(e.g,55,56)) will be groomed after 30 days and 2. Any alert which is set to Resolution State (0, 255 and custom states(e.g,55,56)) and if the particular agent is healthy then it will be groomed in 7 days.

    Is my understanding correct?

    Please help me.

    Thanks in Advance!!!

  15. shashikanth says:

    Hi

    I am able to find the "Auto-Resolve Alert" override setting for a monitor, but not for rule alerts.  How do I change the settings of a rule so that its alerts do not close automatically?  The requirement is that alerts generated by a rule should not auto-close.

    Shashikanth

  16. ExchAdminInd says:

    Hi

    Can we change scheduled synchronise time of 4:00AM of rule "AlertAutoResolveExecuteAll"?

  17. Hypoport says:

    Hi Kevin, first of all thank you for the above article. I just saw your answer to Vikas' question and I have to say that I want to change the time from 4:00 AM to 11:00 AM too. The reason for this is that our Incident Manager gets woken up at night by these closed alert messages.

  18. Zolon Farsane says:

    Kevin,

    What is the supportability if the “auto resolve” stored proc is edited to not auto-resolve alerts generated by a specific rule with a specific resolution state?

    As example, I have a custom scripted monitor that generates an alert based on an event. It can be 3+ days before another event comes in, and the issue is still active. This same script closes the alert if the next received event passes the criteria for it to be resolved. As I have a resolution method for these specific alerts, I need them to be ignored by the Auto-Resolve sproc.

    1. Kevin Holman says:

      We do not support any customer modifications to our stored procedures. You’d have to understand any update we make would also overwrite your custom sproc.

      For something like this – I recommend a different approach. A custom script that runs once per day – and updates some field in the alerts…. therefore their lastmodified time will be bumped and they will never auto-resolve.
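To see why touching the alert works, here is a small Python simulation of that approach.  It is only a sketch: the function name and its parameters are hypothetical, it assumes the 30-day default, and it models the daily check as "untouched for more than 30 days".  It does not show how to update an alert in OpsMgr itself.

```python
from datetime import datetime, timedelta


def day_of_auto_resolve(start, touch_every_days, horizon_days=90, resolve_after_days=30):
    """Return the day number on which an open alert would be auto-resolved, or None.

    touch_every_days: how often a (hypothetical) script modifies the alert;
    0 means the alert is never touched.
    """
    last_modified = start
    for day in range(1, horizon_days + 1):
        now = start + timedelta(days=day)
        # Daily auto-resolve check: untouched for longer than the period?
        if now - last_modified > timedelta(days=resolve_after_days):
            return day
        # The custom script runs after the check and bumps LastModified.
        if touch_every_days and day % touch_every_days == 0:
            last_modified = now
    return None


start = datetime(2010, 1, 1)
print(day_of_auto_resolve(start, touch_every_days=0))  # 31
print(day_of_auto_resolve(start, touch_every_days=1))  # None
```

An untouched alert crosses the threshold on day 31, while an alert that is modified once per day never gets old enough to be selected.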