UPDATE: Monitoring the operational database grooming procedure is now included in the SCOM management pack since version 6.1.7672.0. I will keep this post up as an example of how to create a monitor based on a script that queries a database.
Often times we’re not aware of, or concerned about grooming in Operations Manager. After all, it’s an internal process that is managed by the RMS and we might expect an alert if anything goes awry with the operations database. Right?
The System Center Core Monitoring management pack does have the Operational Database Space Free (%) monitor, which will generate an alert if the database is running low on space; 40%=Warning and 20%=Critical, by default. This is our first indicator that something is not right, because we should size the operations database accordingly to accommodate a steady operational state. This is the main reason we recommend not enabling auto growth on the operations database.
Is it good enough to know when the operations database is running out of space? Not for me. Because running out of disk space is usually a belated indication and symptom of another issue.
Common causes of database growth
The only time the operations database will grow beyond what is expected during steady operational state, is when we introduce some noise. The issue is usually one or more of the following:
* Poorly written monitor was introduced
* Too much data collected by new rule
* New management pack(s) imported
* Adding agents to management group
* Transaction Log is not sized accordingly
At the end of the day it’s the grooming procedure that isn’t running successfully. Unless there was some issue not directly related to the grooming process (performance or availability problem), if the grooming procedure fails one time it is likely that it will continue failing until an administrator intervenes.
The last bullet above, Transaction Log is not sized accordingly, is the reason grooming will continue to fail after the first failure. Because when grooming fails, it’s almost always because the transaction log ran out of space.
If the transaction log runs out of free space during the first failed groom interval, it certainly will not have enough space during subsequent intervals to successfully groom the data that created the large drop in the first place.
So why not monitor the grooming procedure?
Out of the box, there is no monitoring of the grooming procedures that keeps the operational database clean and performing well. I hope this is included in core monitoring at some point, but for now we’ll need to create a simple workflow to monitor grooming.
I could create some sort of extended management pack, which would properly discover the operations database server and include the associated monitor, but I’d rather show you how to do it so you can create this in your own custom management pack. Most customers already have some sort of extended core monitoring management pack, and who needs another MP with a single workflow?
Before going further, please see the improved monitor instructions here.
Create a unit monitor
Enable the new monitor
Scope the Authoring space to Windows Server Operating System and find your new monitor.
'Define local event constants
Const EVENT_TYPE_ERROR = 1
Const EVENT_TYPE_WARNING = 2
Const EVENT_TYPE_INFORMATION = 4
Set oAPI = CreateObject("MOM.ScriptAPI")
Set oArgs = WScript.Arguments
Set oBag = oAPI.CreatePropertyBag()
strDBServer = "."
strDatabase = oArgs(0)
'Set DB connection
Set objCN = CreateObject("ADODB.Connection")
objCN.Open "Provider=SQLOLEDB.1;Integrated Security=SSPI;Persist Security Info=False;Initial Catalog=" & strDatabase & ";Data Source=" & strDBServer & ""
strQuery = "SELECT InternalJobHistoryId, TimeStarted, StatusCode, Command FROM InternalJobHistory " & _
"WHERE (CONVERT(varchar, TimeStarted, 101) = CONVERT(varchar, GETDATE(), 101)) and Command = 'Exec dbo.p_GroomPartitionedObjects and dbo.p_Grooming' " & _
"ORDER BY InternalJobHistoryId DESC"
Set objRS = objCN.Execute(strQuery)
strInternalJobHistoryId = objRS ("InternalJobHistoryId")
strTimeStarted = objRS ("TimeStarted")
strStatusCode = objRS ("StatusCode")
strCommand = objRS ("Command")
'Submit Property Bag
If strStatusCode = 1 Then
Call oBag.AddValue("Details","Operational database grooming return status code " & strStatusCode & " for procedure " & strCommand & ", " & _
"which was started at " & strTimeStarted & ". Check Internal Job History Id " & strInternalJobHistoryId & ".")
'Return property values
'Log event to Operations Manager log. For testing only.
'Call oAPI.LogScriptEvent("OperationalDatabaseGroomingProcedureMonitor.vbs",100,EVENT_TYPE_INFORMATION,"Operational database grooming return status code " & strStatusCode & " for procedure " & strCommand & ", " & _
' "which was started at " & strTimeStarted & ". Check Internal Job History Id " & strInternalJobHistoryId & ".")