The 31552 event, or “why is my data warehouse server consuming so much CPU?”


<!--[if lt IE 9]>

<![endif]-->

Comments (34)
  1. Kevin Holman says:

    @Mahmood –

    You just have to run it and keep running it until it catches up.  It would be good to know what broke it – you should run reports and figure out what monitors wen nuts and caused it to get out of synch.

  2. Kevin Holman says:

    Hey Keith – shoot me an email contact via the blog with your email, and lets have a chat.  đź™‚

  3. Kevin Holman says:

    @Gene –

    I have seen the manual runs taken over 24 hours.  Just let it go.  It all depends on how much we have to process, and how far behind we are.  You just keep on running it, over and over – until it is caught up.  It is painful – but handled once it is done.  

  4. Anonymous says:

    Thx Kevin..wll do!  its at 5 hrs now and still going.  Will post again tomorrow morning and hope its done.

  5. Kevin Holman says:

    @HC –

    I have not seen that condition, and I would recommend opening a case with Microsoft for that.  Please keep us posted on what the issues was.

  6. Anonymous says:

    Tank you! You saved my day!!!

    Running SCOM 2007 R2/ SQL 2008 R2 Enterprice /w CLU5

  7. Anonymous says:

    When I try to run the maintenance manually for the Performance data set, it comes back in about 2 seconds with a value of 0 for MaintenanceWorkAttemptedInd and NULL for LastOptimizationActionSuccessfulCompletionDateTime.  I disabled the scheduled maintenance, bounced the agent on the RMS, and waited overnight, same thing happens.  Any ideas?  This is what I'm running:

    USE [OperationsManagerDW]

    DECLARE @DataSet uniqueidentifier

    SET @DataSet = (SELECT DatasetId FROM StandardDataset WHERE SchemaName = 'Performance')

    EXEC StandardDatasetMaintenance @DataSet

  8. Kevin Holman says:

    Yes – using SQL standard edition can cause an occasional 31552 during large maintenance operations.

    I did NOT say "people shouldn't run SQL standard edition".  It is supported…. I am just saying that while index operations occur, changes to the database can be blocked, which can cause this error to randomly occur.  It isnt an issue.  It is an explanation.

    You have to ignore these errors if they are random, and dont repeat.  It is only a cause of concern if they do not recover from the situation.  If it recovers, it will change state back to healthy and close the alert.

  9. Anonymous says:

    I had the same issue due to the ACS collector.  I used Daniele's fix and it worked, but now the manual SQL run of the Standard dataset maintenance has been running for over 4 hrs!!  I am worried!

  10. Anonymous says:

    Hi I am facing issue with SCOM, It is giving error code 26319 with the following description. Can anybody help me urgently to solve the issue.

    The description for Event ID 26319 from source OpsMgr SDK Service cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.

    If the event originated on another computer, the display information had to be saved with the event.

    The following information was included with the event:

    GetDataWarehouseStoredProcedureResult
    uuid:c10a2b09-e8de-45eb-bf92-ba4ae6ca724d;id=69
    The creator of this fault did not specify a Reason.
    System.ServiceModel.FaultException`1[Microsoft.EnterpriseManagement.Common.UnknownDatabaseException]: The creator of this fault did not specify a Reason. (Fault Detail is equal to A network-related or instance-specific error occurred while establishing a connection
    to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: Named Pipes Provider, error: 40 – Could not open a connection to SQL Server)).

  11. Kevin Holman says:

    @Gora –
    It is pretty clear from your event, that your database server is not reachable or not responding.

  12. Murad says:

    kevin,

    You mentioned under the "The most common issues causing the 31552 events seen are:" section that

    "Using SQL standard edition, you might see these randomly at night, during maintenance as online indexing is not supported using SQL standard edition."

    can cause this "31552" error message so are you saying that people shouldn't run SCOM 2007 R2 on Standard SQL version? If, yes how come Microsoft SCOM supported configuration document dosn't say that? If Standard SQL is supported and we know that SCOM relays on online indexing then how can I remove this dependancy to eliminate these errors?

    Thanks

    Murad

  13. HC says:

    Hi Kevin

    I really needs your help – Right now we are facing the problem described in this blog (we have had the problem before and have used your blog with success) but this times there seems to be too much data to process, because when I’m running the “StandardDatasetMaintenance” job it runs for around 3 hours and then the SQL comes with the following error:

    Msg 777971002, Level 16, State 1, Procedure StandardDatasetMaintenance, Line 142

    Sql execution failed. Error 777971002, Level 16, State 1, Procedure StandardDatasetAggregate, Line 424, Message: Sql execution failed. Error 777971002, Level 16, State 1, Procedure StateAggregate, Line 341, Message: Sql execution failed. Error 666, Level 16, State 2, Procedure StateAggregate, Line 196, Message: The maximum system-generated unique value for a duplicate group was exceeded for index with partition ID 422213163220992. Dropping and re-creating the index may resolve this; otherwise, use another clustering key.

    Have you seen this error before?  or do you have any ideas on what it could be?

    Do you know if there is a way to run the “StandardDatasetMaintenance” so it only process smaller chunks of data?

    Right now I’m running a case with Microsoft on this issue, but it could be nice if you have some input for me  :o)

  14. HC says:

    Hi Kevin

    I already have a Microsoft case running – I will let you know as soon we solve the problem.

  15. HC says:

    Hi Kevin

    Now we have solved the problem – I find this blog that solved it for me. nocentdocent.wordpress.com/…/troubleshooting-statechangeevent-grow

    The problem:

    On the 4th of December we have had a problem with an ACS collector – that problem have generated ~160.000 state changes between 03:00 AM and 04:00 AM.

    At that point we were unable to process all these state changes even when we run the “StandardDatasetMaintenance” described in your blog here. It where running for around 3 hours (the TempDB grow to ~150 GB while it’s running) and then the SQL came with the error I posted earlier.

    How we solved it:

    By reading the blog I posted in the beginning of this comment, I was able to find the ~160.000 rows that where generated on the 4th between 03:00 AM and 04:00 AM, and after that I was able to delete them manually (Totally unsupported – but it is working).

    After I have deleted all the rows we were able to run the “StandardDatasetMaintenance” manually a few times and then enable the rule again – Now everything is running smooth again.

    Thanks to Vikram Sahay from Microsoft support – he really help on this problem

  16. Keithk2 says:

    We have too much state data (852 days) and am concerned if we just manually run the StandardDatasetMaintenance our trans log or temp db will fill up.  Is there anyway to chunk up the grooming on our data warehouse?  I tried to change the grooing period in the OpsMgr UI to 750 days so that when I run the StandardDatasetMaintenance it will groom out all data older than 750 days and then work backwards.  Unfortunatly the UI doesn't allow me to change the grooming period longer than 60 days.

  17. Kitaab says:

    Dear Kevin,

    the query has been running for over 24 hours now and it is filling the TempDB Log which has grown beyond 100 GB now and we are running out of space.. Can we stop the query will that have any affect .

    We want to do so the clear the Tempdb space and then start the query again , but i am afraid  what query daid for past 24 hours ..

  18. Kitaab says:

    Dear Kevin,

    The Query has been running for 40GB, can we stop and restart the query , also would you consider 5.5GB data in ManagedEntityProperty Table as Excessive.

  19. Mahmood says:

    Dear Kevin,

    My Data Warehouse database is pretty new i.e. 60 days old which is partly in production and I'm facing similar issue

    When i run this query it is more than 24hrs and still haven't completed….

    Database is hosted on 3 node cluster which is Hex Core – 2 Processor with 48GB RAM

    I'm not sure if its fine or something else is wrong

    Regards

    Mahmood

  20. Otto Rogowski says:

    To start fixing this SCOM error

    go to Administration> Run As Configuration > Accounts

    In the Accounts pane there are 2 accounts you need to reset the username and password for Data Warehouse SQL Server Authentication Account and Reporting SDK SQL Server Authentication Account.By default these accounts are set to nothing but a single space. reset both accounts using the same account and password

    Apply the changes and checked the OpsMgr eventlog. Even though the EventID’s 31552 will still appearing the frequency will be far less.

    Even though the OpsMgr databases are maintained automatically, it doesn’t hurt to run every now an then a Stored Procedure to update the statistics: sp_updatestats. Run it against the Data Warehouse database (OperationsManagerDW) and the output showed that many tables were updated.

  21. Anonymous says:

    There are many articles on tweaking certain registry settings for SCOM agents, Gateways, and Management

  22. Murad says:

    Hi Kevin, I am running SCOM 2012 R2 and seeing the following Event logs (31551; 31552; 31561) .These showed up yesterday 10-02-2014 around 4:00AM. I’ve never had this issue before and no new changes were made in SCOM recently. However, I did migrate SCOM
    DBs from a physical SQL server to Virtual last month.
    Looking at the event log data, it seems like it’s related to SQL having connection issues due to AD or network.

    I would really appreciate your thoughts/input on this issue(s)

    Here are the Eventlogs:
    Log Name: Operations Manager
    Source: Health Service Modules
    Date: 10/2/2014 4:01:46 AM
    Event ID: 31551
    Task Category: Data Warehouse
    Level: Error
    Keywords: Classic
    User: N/A
    Computer: MS1SCOMP03
    Description:
    Failed to store data in the Data Warehouse. The operation will be retried.
    Exception ‘SqlException’: Connection Timeout Expired. The timeout period elapsed while attempting to consume the pre-login handshake acknowledgement. This could be because the pre-login handshake failed or the server was unable to respond back in time. The
    duration spent while attempting to connect to this server was – [Pre-Login] initialization=83; handshake=14915;

    One or more workflows were affected by this.

    Workflow name: Microsoft.SystemCenter.DataWarehouse.CollectPerformanceData
    Instance name: MS1SCOMP03
    Instance ID: {3895AC2A-12FA-09A7-7870-B70E160EF194}
    ———————————————————————-
    Description:
    Failed to store data in the Data Warehouse.
    Exception ‘SqlException’: Connection Timeout Expired. The timeout period elapsed while attempting to consume the pre-login handshake acknowledgement. This could be because the pre-login handshake failed or the server was unable to respond back in time. The
    duration spent while attempting to connect to this server was – [Pre-Login] initialization=83; handshake=14915;

    One or more workflows were affected by this.

    Workflow name: Microsoft.SystemCenter.DataWarehouse.StandardDataSetMaintenance
    Instance name: Event data set
    Instance ID: {8833B838-31E4-7E89-E6F7-782FB737F040}

    ——————————————————————————-

    Failed to enumerate (discover) Data Warehouse objects and relationships among them. The operation will be retried.
    Exception ‘SqlException’: Connection Timeout Expired. The timeout period elapsed while attempting to consume the pre-login handshake acknowledgement. This could be because the pre-login handshake failed or the server was unable to respond back in time. The
    duration spent while attempting to connect to this server was – [Pre-Login] initialization=83; handshake=14915;

    One or more workflows were affected by this.

    Workflow name: Microsoft.SystemCenter.DataWarehouse.Discovery.StandardDataSet
    Instance name: APM data set
    Instance ID: {CF68AF05-5A3C-9FD0-EE89-0533223A2DC5}
    Management group: SCOMPROD

  23. arif says:

    hala faul

  24. Megat says:

    Hi ..When I run step number 5, I got this :

    Caution: Changing any part of an object name could break scripts and stored procedures.
    Cannot grant, deny, or revoke permissions to sa, dbo, entity owner, information_schema, sys, or yourself.
    Cannot grant, deny, or revoke permissions to sa, dbo, entity owner, information_schema, sys, or yourself.
    Cannot grant, deny, or revoke permissions to sa, dbo, entity owner, information_schema, sys, or yourself.

    Please advise.

  25. Kevin Holman says:

    @Megat –

    You don’t have *SysAdmin* level rights to your SQL instance. You need this to run these manually, or you need to execute the query as the DW account that you used during installation.

  26. Megat says:

    Hi Kevin..

    Thanks. It went well. Btw, mine was executed within seconds. FYI, our DW size is 500GB in size.
    Cheers

  27. Kevin Holman says:

    @Megat –

    If it executes in seconds – it probably didn’t run.

    Did you run the query to see how far behind your aggregations are?

  28. Megat says:

    I didn’t. But now when I ran it ,I got result as below ( arrgh…cannot copy paste here) :-

    SchemaName AggregationType Count_OutstandingAggregations LastGroomingDate
    State Daily 1 Today
    Perf Hourly 1 Today
    Exchange2010 Daily 1 Today
    Exchange2010 Daily 1 Today
    Exchange2010 Daily 1 Today
    Exchange2010 Daily 1 Today
    Exchange2010 Daily 1 Today
    CM Daily 1 Today

    **Note: Today refers to today’s date. it quite long for me to type, so i just use short form.

  29. Sunil_03 says:

    Hi Kevin,

    I am getting 31552 with below description though it is immediately followed by 31554 , but delayed some times(healthy event)

    Failed to store data in the Data Warehouse.
    Exception ‘SqlException’: Sql execution failed. Error 547, Level 16, State 0, Procedure ManagementPackGroom, Line 364, Message: The DELETE statement conflicted with the REFERENCE constraint "FK_RelationshipStage_RelationshipType". The conflict occurred in database
    "OperationsManagerDW", table "dbo.RelationshipStage", column ‘RelationshipTypeRowId’.

    One or more workflows were affected by this.

    Workflow name: Microsoft.SystemCenter.DataWarehouse.Synchronization.Configuration

    Instance name: 3a8ef40b-d651-48ca-9279-df69f5ef815b
    Instance ID: {745AB694-607F-01E3-3A0F-DF5462DF7548}

    Any inputs on how to fix the same would be appreciated as we are getting 2115,31553, 29181 further due to this issue

    Regards
    Sunil Mittal

  30. Morgan Tiley says:

    I know this was from some time ago but I’m hoping for a little help. I’ve done everything listed above and 48 hours 39 minutes later it is still running….. I’m willing to let it go and wait but I’d like to be sure it is doing something. Any way to check?
    It appears d to be about 30 days behind according to the query. Should the results of the query change as the maintenance progresses or are none of the transaction committed until it finishes? Help?

  31. Mathieu Lacoste says:

    @Morgan – Run the query to see if the number of aggregation is decreasing every once in a while. I expect mine to run for several days since I’m like nearly 800 behind.

    1. David says:

      Hi Kevin,

      Great blog………

      If I run the first query on the blog – its look like I have duplicated Schema Names as I have few Exchange2010 and comtrade mp, so when I run the second query this fails with this error

      Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, , >= or when the subquery is used as an expression.
      Msg 512, Level 16, State 1, Line 16

      How do I remove or fix ?

      I thinking this is a MS support ticket

  32. David says:

    Hi Kevin,

    Great blog………

    If I run the first query on the blog – its look like I have duplicated Schema Names as I have few Exchange2010 and comtrade mp, so when I run the second query this fails with this error

    Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, = or when the subquery is used as an expression.
    Msg 512, Level 16, State 1, Line 16

    How do I remove or fix ?

    I thinking this is a MS support ticket

  33. Jaggu1684 says:

    Receiving the below event 31552 on one our Mgmt servers continuously, Anyhelp would be appreciated.

    Failed to store data in the Data Warehouse.
    Exception ‘SqlException’: Sql execution failed. Error 777971002, Level 13, State 1, Procedure PortStateProcessStaging, Line 711, Message: Sql execution failed. Error 1205, Level 13, State 52, Procedure PortStateProcessStaging, Line 93, Message: Transaction (Process ID 71) was deadlocked on lock | communication buffer resources with another process and has been chosen as the deadlock victim. Rerun the transaction.

    One or more workflows were affected by this.

    Workflow name: Microsoft.SystemCenter.DataWarehouse.StandardDataSetMaintenance
    Instance name: Process Monitoring: Process Network Ports data warehouse dataset
    Instance ID: {D33349D5-5F96-D666-BCD9-AE36AC079496}
    Management group: XXXXX

Comments are closed.

Skip to main content