Monitor the operations database grooming procedure (v2)

UPDATE: Monitoring the operational database grooming procedure is now included in the SCOM management pack since version 6.1.7672.0. I will keep this post up as an example of how to create a monitor based on a script that queries a database.

Okay, so it’s only been two days since I posted the original monitor.  But, I’ve been thinking about this one after the original post and now have what I think is a better solution to monitoring the operations groom procedure.  It’s still worthwhile to read the first couple paragraphs in the original post, as I talk about why we should monitor the groom procedure and common causes of failures.

What I wanted to avoid with the initial post was a dependency on another MP and an assumption that customers are running the SQL MP, so I did not specifically target the SQL DB Engine, which might have been a better choice.  But, still, it’s not exactly a precise target class, because we still need to create the monitor in a disabled state and override-enable for the instance.  Not to mention, things get a little more complicated if the database server is clustered.

With all this in mind, I decided to write up this new post with instructions to create this workflow to run on the Root Management Server.  This also gives us an opportunity to change the monitor implementation to include three states.  Reason being is, sometimes grooming will fail for other reasons relating to database performance or availability, and then succeed the next day.

Rather than creating a critical state and generating an alert in this single failure scenario, I thought it would be better to only change state to warning at the first failure.  Then if the groom process fails two or more times, change state to critical and generate a critical alert.  I think this makes more sense, as we certainly don’t need to see any premature alerts for a condition that may not be directly related to grooming.

So here we go!

 

Create the Monitor

Time Script Three State Monitor
image

Configure general properties as shown.
image

Configure schedule as shown.
image 

 

 

Configure script information as shown, then click Parameters. (script is at end of article)
image

Configure script parameters as shown, with a space between the two parameters, entering your database server or cluster virtual name.  If you gave the database a custom name, enter that database name instead of OperationsManager.
image

 

Configure unhealthy expression as shown.
image

Configure degraded expression as shown.
image

Configure healthy expression as shown.
image

Configure health mapping as shown.
image

Configure alert settings as shown.
image

 

State change and alert flow

Warning state change event for one failed groom interval
Capture

Critical state change event for 2 or more failed groom intervals
Capture

Critical alert for two or more failed groom intervals
Capture

Upgrade state when groom succeeds once after critical
Capture

Returns to healthy for two or more successful groom intervals
Capture

 

The script

'OperationalDatabaseGroomingProcedureMonitor3State.vbs

Option Explicit
'Declarations
Dim objCN,objRS,strQuery,strStatusSum
Dim oArgs,oAPI,oBag
Dim strDBServer,strDatabase

'Define local event constants
Const EVENT_TYPE_ERROR = 1
Const EVENT_TYPE_WARNING = 2
Const EVENT_TYPE_INFORMATION = 4

'Create objects
Set oAPI = CreateObject("MOM.ScriptAPI")
Set oArgs = WScript.Arguments
Set oBag = oAPI.CreatePropertyBag()

'Define parameters
strDBServer = oArgs(0)
strDatabase = oArgs(1)

'Set DB connection
Set objCN = CreateObject("ADODB.Connection")
objCN.Open "Provider=SQLOLEDB.1;Integrated Security=SSPI;Persist Security Info=False;Initial Catalog=" & _
strDatabase & ";Data Source=" & strDBServer & ""

strQuery = "SELECT SUM(StatusCode) AS StatusSum " & _
"FROM InternalJobHistory " & _
"WHERE CONVERT(varchar, TimeStarted, 101) IN " & _
"(CONVERT(varchar, DATEADD(day, - 1, GETUTCDATE()), 101), " & _
"CONVERT(varchar, GETUTCDATE(), 101)) " & _
"AND Command = 'Exec dbo.p_GroomPartitionedObjects and dbo.p_Grooming'"

'Query DB
Set objRS = objCN.Execute(strQuery)

'Set variables
strStatusSum = objRS ("StatusSum")

'Submit Property Bag
Call oBag.AddValue("DBServer",strDBServer)
Call oBag.AddValue("Database",strDatabase)
Call oBag.AddValue("StatusSum",strStatusSum)

'Healthy state
If strStatusSum => 2 Then
Call oBag.AddValue("State","Healthy")
'Log event to Operations Manager log. For testing only.
Call oAPI.LogScriptEvent("OperationalDatabaseGroomingProcedureMonitor3State.vbs",100,EVENT_TYPE_INFORMATION,"Script executed " & _
"with StatusSum " & strStatusSum)

'Warning state
ElseIf strStatusSum = 1 Then
Call oBag.AddValue("State","Warning")
Call oBag.AddValue("Details","Operational database grooming has failed one time in the last two days.")
'Log event to Operations Manager log. For testing only.
Call oAPI.LogScriptEvent("OperationalDatabaseGroomingProcedureMonitor3State.vbs",100,EVENT_TYPE_WARNING,"Script executed " & _
"with StatusSum " & strStatusSum)

'Critical state
ElseIf strStatusSum = 0 Then
Call oBag.AddValue("State","Critical")
Call oBag.AddValue("Details","Operational database grooming has failed two or more times. " & _
"Check grooming by running the following SQL query against the operations database: " & VBCRLF & VBCRLF & _
"SELECT * FROM InternalJobHistory ORDER BY InternalJobHistoryId DESC")
'Log event to Operations Manager log. For testing only.
Call oAPI.LogScriptEvent("OperationalDatabaseGroomingProcedureMonitor3State.vbs",100,EVENT_TYPE_ERROR,"Script executed " & _
"with StatusSum " & strStatusSum)

End If

'Return property values
Call oAPI.Return(oBag)