No Alert from SQL MP when clustered services go down
I recently ran into the following issue:
The SQL Server Management Pack has several monitors that watch the various SQL services.
However, on a SQL cluster, if one of these services is taken offline, we don't get an alert from SQL (we do get a cluster alert saying that a cluster is offline), and the monitor stays healthy.
This happens because:
- By default, a Basic Service Monitor will only monitor services whose startup type is Automatic
- On a Clustered SQL instance, the service startup type will be set to Manual
To fix this, set the "Alert only if startup type is automatic" override to "False" on the affected monitors for the clustered SQL instances.
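For reference, here is a rough sketch of what such an override might look like in the XML of an unsealed override management pack. The override ID, the `SQL2008Mon` alias, and the `Custom.SQL.ClusteredInstances.Group` context are hypothetical placeholders, and the `CheckStartupType` parameter name is assumed from the monitor configuration element discussed in the notes; compare it with an override exported from your own console before relying on it.

```xml
<Overrides>
  <!-- Hypothetical ID, alias and context: replace them with the names used in your environment. -->
  <!-- Assumes the overridable parameter is exposed as CheckStartupType. -->
  <MonitorConfigurationOverride ID="Custom.SQL.Override.DBEngine.CheckStartupType"
                                Context="Custom.SQL.ClusteredInstances.Group"
                                Enforced="false"
                                Monitor="SQL2008Mon!Microsoft.SQLServer.2008.DBEngine.ServiceMonitor"
                                Parameter="CheckStartupType">
    <Value>false</Value>
  </MonitorConfigurationOverride>
</Overrides>
```

A similar override would be needed for each monitor in the list under NOTES that applies to your clustered instances.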
Now, the health state changes when the service is down and we are properly alerted.
NOTES:
The SQL Monitors affected by this are:
- Microsoft.SQLServer.2008.DBEngine.ServiceMonitor
- Microsoft.SQLServer.2008.ReportingServices.ServiceMonitor
- Microsoft.SQLServer.2008.AnalysisServices.ServiceMonitor
- Microsoft.SQLServer.2008.IntegrationServices.ServiceMonitor
- Microsoft.SQLServer.2008.DBEngine.FullTextSearchServiceMonitor
- Microsoft.SQLServer.2008.Agent.ServiceMonitor
- Microsoft.SQLServer.2005.DBEngine.ServiceMonitor
- Microsoft.SQLServer.2005.ReportingServices.ServiceMonitor
- Microsoft.SQLServer.2005.AnalysisServices.ServiceMonitor
- Microsoft.SQLServer.2005.IntegrationServices.ServiceMonitor
- Microsoft.SQLServer.2005.DBEngine.FullTextSearchServiceMonitor
- Microsoft.SQLServer.2005.Agent.ServiceMonitor
You must have at least version 6.0.6441.0 of the SQL Server Management Pack for this to work. The latest version is 6.0.6460.0 and can be downloaded here.
If you manually create a Basic Service Monitor in the OpsMgr console, the "Alert only if startup type is automatic" override will not work. You'll need to export the MP and edit the XML to add <CheckStartupType>true</CheckStartupType> to the monitor configuration (this is already done in the latest SQL MP):
Before change (not working):
<ComputerName>$Target/Property[Type="Windows!Microsoft.Windows.Computer"]/NetworkName$</ComputerName>
<ServiceName>Messenger</ServiceName>
</Configuration>
</UnitMonitor>
After change (working):
<ComputerName>$Target/Property[Type="Windows!Microsoft.Windows.Computer"]/NetworkName$</ComputerName>
<ServiceName>Messenger</ServiceName>
<CheckStartupType>true</CheckStartupType>
</Configuration>
</UnitMonitor>