Not able to see all servers in my Availability Reporting reports for MOM 2005

Seeing these symptoms?

Even after rebooting and waiting a long period of time for availability data to be collected from servers, they still don't appear on reports

OR

Servers appear on reports but their availability metrics appear static between report generation periods

 

Which versions of the Availability MP are affected

All, upto and including 05.0.5000.0001

 

This can happen because

The MOM agent was removed from the managed server and then reinstalled at some point.

 

How can I tell if I'm affected?

If you run the following Transact SQL (TSQL) query against your SystemCenterReporting database and it DOES return rows you are affected by this issue

Use SystemCenterReporting

Select uidServer as ComputerID,nvcServerName as OrphanedServer from mras_server A where A.uidServer not in (Select ComputerID from sc_computerdimension_table)

 

Execute the query.

Why did it happen?

The act of removing the MOM agent from a managed computer together with removal from the OnePoint database means that the idComputer GUID used within the OnePoint database for uniquely identifying the computer is removed. Reinstalling the MOM agent will mean a new idComputer GUID is generated.

This same idComputer GUID is transferred to the datawarehouse, so if an agent changed idComputer GUID the data associated with the previous installation of the agent is essentially orphaned in the warehouse (although it will be groomed when the time comes).  This is pretty much the same for the Availability Reporting MP but this is also coupled with a bug in the Availability MP which won't allow the newly installed agent data to be recognised.

 

Is a fix available from Microsoft

Microsoft is aware of the problem, but a fix for this is currently not available.

 

How can I fix it and what are the ramifications?

If the above query does return rows it is most likely the Orphaned server column contains the server names of those servers missing from reports or containing stale data. The cause is a synchronization problem in one of the Availability reporting SQL stored procedures executed by the AvailabilityAnalysisReporting (MRASLOADER) Windows scheduled task.  The MP maintains its own table of servers within the SystemCenterReporting table MRAS_Servers. As new managed agents are installed they become present in the SC_ComputerDimension_View in the warehouse. The server list used by Availability reporting in the MRAS_Servers table is built from this view. However, changes in the idComputer GUID are not updated (idComputer GUID in the OnePoint db is analogous to ComputerID in the SystemCenterReporting SC_ComputerDimension_View which is analogous to the uidServer in the MRAS_Servers table)

The problem can be remodied by removing what are essentially now orphaned entries in the MRAS_Server table for those reinstalled agents, and then letting the synchronization procedure copy the computer names and GUID's to the MRAS_Server table again on the next execution of the AvailabilityAnalysisReporting (MRASLOADER) Windows scheduled task.

Before using the following procedure to remove the old MRAS_Server entries I STRONGLY recommend a backup of the SystemCenterReporting database.

 

Use SQL Workbench or SQL Query analyzer to connect to the SystemCenterReporting database

Create a new query and enter the following (I've added comments for clarity with -- , these can be removed):

--Make sure current db is the systemcenterreporting db

Use SystemCenterReporting

--Create temporary table containing the guids of the agents previously removed

Select uidServer into #Orphans From MRAS_Server a Where a.uidServer Not In (Select ComputerID From SC_ComputerDimension_View)

 

Declare @uidToRemove uniqueidentifier

Declare uidCur cursor forward_only for

Select uidServer From #Orphans

-- cursor to read each of the guids from the temp table

Open uidCur

fetch next from uidCur into @uidToRemove

while @@FETCH_STATUS = 0

Begin

-- MRAS_pcDeleteServer removes the old data for the previous instance of the GUID

exec MRAS_pcDeleteServer @uidToRemove

Delete From MRAS_Server Where uidServer=@uidToRemove

Fetch next from uidCur into @uidToRemove

End

 

Drop table #Orphans

-- Query Ends Here.

 

Excute the query.

 

This should return the same number of rows affected as the Select query above.

Once this is done, either run the AvailabilityAnalysisReporting (MRASLOADER) Windows scheduled task, or wait for it to run on its next scheduled execution. This will copy the updated server entries for previously orphaned servers back into the MRAS_Server table with the correct GUID's.  At this point you will have to schedule a reboot of the affected servers in order to set the start point for Availability reporting collection and then await collection of enough events for report data for those servers to appearon reports.