Availability Reporting MP Issue - Servers do not appear on reports, or their data remains static

I wanted to take the time to share a recently observed and reported problem which affects all versions of the Availability Reporting Management Pack for MOM 2005 (upto and including MP version 05.0.5000.0001).

Symptoms Observed

One or more servers expected to be displayed on the Availability report do not appear even after all pre-requisites are met (for example, servers have been restarted, events expected are known to be collected),

OR

One or more servers do appear on the Availability report but the Availability and Reliability statistics are not updated.

Cause

A MOM agent has been re-installed on one or more servers exhibiting the symptoms.

How to determine if your Availability Reporting installation is affected

Use SQL Query Analyzer or SQL Workbench to perform the following TSQL query against the SystemCenterReporting database:-

SELECT uidServer AS ComputerID,nvcServerName AS RenamedServer 
FROM MRAS_Server A WHERE A.uidServer NOT IN (SELECT ComputerID
FROM SC_ComputerDimension_View)

If this query returns results, observe the RenamedServer rows returned and determine if is possible that these servers have had the MOM agent reinstalled since Availability Reporting installation was performed. Are these servers missing from the Availability report or if they appear on the report do their Availability statistics remain constant?
If the query does not return results, fortunately you're not affected.

How to rectify the problem

This is unfortunately a bug in the Availability Management pack code. However the problem can be worked around by removing what are now orphaned rows from a SQL table (rows in the SystemCenterReporting.dbo.MRAS_server which reference a former idComputer GUID in the SystemCenter.dbo.SC_ComputerDimension_View view).
**** I strongly recommend this operation be performed after a backup of the SystemCenterReporting database is taken ****

Here is a background to how the symptoms occur and how to workaround them.

When the ReliabilityAnalysisReporting (MRASLoader.exe) Windows task runs it invokes a series of SQL stored procedures. One of these stored procedures (MRAS_pcServerSync) inserts the current agent list from the SC_ComputerDimension_View (Note the word INSERT, the current logic does not update the table which is the problem) into the MRAS_Server table. It is from this table the server names are sourced for Availability Reporting.
If a MOM agent is therefore removed and reinstalled (typically can occur through routine maintenance or server rebuild) the old GUID remains the MRAS_Server table while the updated GUID is correct the in the SC_ComputerDimension_View. The Availability reporting calculations cannot therefore correlate any of the incoming event criteria against the old computer GUID.

Here is the TSQL DML Query which will remove the orphaned entries from the MRAS_Server table:-

Use SystemCenterReporting

GO 

DELETE FROM MRAS_Server WHERE uidServer NOT IN
(SELECT ComputerID FROM SC_ComputerDimension_View)

 

Once the query is executed, on the next execution of the ReliabilityAnalysisReporting task the computer name/s and GUID/s will be transferred to the MRAS_Server table.

The issue is currently under investigation by Microsoft and I will provide an ETA on a fix a soon as I have it. I KB article documenting the issue will be forthcoming shortly
If you do find that you are affected by this problem please comment this posting with the number of servers you saw affected. It will certainly give me an indication of what I suspect may be a wide-spread problem.

Clive Eastwood
Supportability Program Manager - System Center Operations Manager