Recommended registry tweaks for SCOM 2016 management servers

Comments (14)
  1. stephen lisko says:

    Kevin,

    Thanks for posting this article. It looks like the recommendations are the same as for SCOM 2012 management servers. So if I am planning to upgrade my existing SCOM configuration from 2012 to 2016, will the registry tweaks that I currently have set on my management servers remain, or will I have to re-apply them?

    Thanks

    1. Kevin Holman says:

      Historically, the “upgrades” actually do an uninstall/reinstall, so it is possible registry entries will get wiped out. I’d absolutely go back and re-verify these after an upgrade.

  2. Ronnie says:

    Thanks for the writeup.

    Do you know why these are not set by default?

  3. Breezer says:

    Thanks for the info! Helps a lot!

  4. Birdal says:

    Hi Kevin,
    we are seeing Event IDs 15002 and 15004 on both Gateway Servers.
    Should these registry keys be set only on the Gateway Servers, only on the Management Servers, or on both the Management Servers and the Gateway Servers?
    Best Regards
    Birdal

    Event IDs:
    Event ID: 15002
    Task Category: Pool Manager
    Level: Error
    Keywords: Classic
    User: N/A
    Computer:
    Description:
    The pool member cannot send a lease request to acquire ownership of managed objects assigned to the pool because half or fewer members of the pool acknowledged the most recent initialization check request. The pool member will continue to send an initialization check request.

    Management Group:
    Management Group ID: {C601BF31-FBEC-4CD4-12F9-814C98AFF83E}
    Pool Name:
    Pool ID: {9EE78DB3-4D6C-DA05-608F-3B79294E3AFB}
    Pool Version: 3075036988681890219
    Number of Pool Members: 3
    Number of Observer Only Pool Members: 1
    Number of Instances: 2

    Log Name: Operations Manager
    Source: HealthService
    Date: 19.07.2017 17:22:02
    Event ID: 15004
    Task Category: Pool Manager
    Level: Error
    Keywords: Classic
    User: N/A
    Computer:
    Description:
    The pool member no longer owns any managed objects assigned to the pool because half or fewer members of the pool have acknowledged the most recent lease request. The pool member has unloaded the workflows for managed objects it previously owned.

    Management Group:
    Management Group ID: {C601BF31-FBEC-4CD4-12F9-814C98AFF83E}
    Pool Name:
    Pool ID: {9EE78DB3-4D6C-DA05-608F-3B79294E3AFB}
    Pool Version: 3075036988681890219
    Number of Pool Members: 3
    Number of Observer Only Pool Members: 1
    Number of Instances: 2

  5. venkatesh says:

    I have a Data Warehouse issue in my new SCOM 2016 environment, where the RMS is running Server 2016 Datacenter edition. The reg key path you mentioned for modifying the DW command timeout is not present. I have both the DB and the DW on a single server, and the only reg key path I see is HKLM\SOFTWARE\Microsoft\Microsoft Operations Manager\3.0\Setup, where all the DB and DW details are present. Can you let me know where I should add Command Timeout Seconds to get rid of the 31551 and 31552 events?

  6. jpriver says:

    Thank you Kevin! great information as always!

  7. Manny Kang says:

    Hi Kevin, I hope all is well. We are seeing a number of 29181 events on the Management Servers, all related to SnapshotSynchronizationWorkItem timeout issues. The article (https://blogs.technet.microsoft.com/silvana/2014/09/04/eventid-29181-snapshotsynchronization-not-taking-place/) suggests implementing the key HKLM\Software\Microsoft\Microsoft Operations Manager\3.0\Config Service with a new DWORD CommandTimeoutSeconds. Is this still required in 2016?

    Thanks

    1. Kevin Holman says:

      I generally don’t recommend extending snapshot timeout. Most of the time extending a timeout is like placing a band aid… on a gushing wound that really needs stitches.

      I would first try and understand what’s unique about your environment that is causing snapshot to fail.

      First – how long does it run before it fails?
      Does it fail often and then complete with success?
      Is this SCOM 2016?
      What OS?

      Snapshot runs once per day – and should complete with success. It is normal for it to fail a few times every night, but it should have a successful completion, once per day.

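      -- Recent snapshot synchronization work items, newest first (WorkItemStateId 20 = Succeeded, 10 = Failed)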
      SELECT * FROM cs.workitem
      WHERE WorkItemName like '%snap%'
      ORDER BY WorkItemRowId DESC

      1. Manny Kang says:

        Hi Kevin,

        We see the issue on a random basis. Running the query, we have 2 out of 11 Management Servers reporting the issue:

        Microsoft.EnterpriseManagement.ManagementConfiguration.DataAccessLayer.DataAccessException: Snapshot data transfer operation failed batch write at Microsoft.EnterpriseManagement.ManagementConfiguration.Engine.SnapshotSynchronizationWorkItem.CheckBatchWriteErrors() at Microsoft.EnterpriseManagement.ManagementConfiguration.Engine.SnapshotSynchronizationWorkItem.TransferData(SnapshotProcessWatermark initialWatermark) at Microsoft.EnterpriseManagement.ManagementConfiguration.Engine.SnapshotSynchronizationWorkItem.ExecuteSharedWorkItem() at Microsoft.EnterpriseManagement.ManagementConfiguration.Interop.SharedWorkItem.ExecuteWorkItem() ———————————– Microsoft.EnterpriseManagement.ManagementConfiguration.DataAccessLayer.DataAccessException: Data access operation failed Server stack trace: at Microsoft.EnterpriseManagement.ManagementConfiguration.DataAccessLayer.DataAccessOperation.ExecuteSynchronously(Int32 timeoutSeconds, WaitHandle stopWaitHandle) at Microsoft.EnterpriseManagement.ManagementConfiguration.SqlConfigurationStore.ConfigurationStore.ExecuteOperationSynchronously(IDataAccessConnectedOperation operation, String operationName) at Microsoft.EnterpriseManagement.ManagementConfiguration.SqlConfigurationStore.ConfigurationStore.WriteConfigurationSnapshot(IConfigurationSnapshotDataSet dataSet) at System.Runtime.Remoting.Messaging.StackBuilderSink._PrivateProcessMessage(IntPtr md, Object[] args, Object server, Object[]& outArgs) at System.Runtime.Remoting.Messaging.StackBuilderSink.AsyncProcessMessage(IMessage msg, IMessageSink replySink) Exception rethrown at [0]: at System.Runtime.Remoting.Proxies.RealProxy.EndInvokeHelper(Message reqMsg, Boolean bProxyCase) at System.Runtime.Remoting.Proxies.RemotingProxy.Invoke(Object NotUsed, MessageData& msgData) at Microsoft.EnterpriseManagement.ManagementConfiguration.Engine.WriteConfigurationSnapshotDelegate.EndInvoke(IAsyncResult result) at Microsoft.EnterpriseManagement.ManagementConfiguration.Engine.SnapshotSynchronizationWorkItem.SnapshotBatchWritten(IAsyncResult asyncResult) ———————————– System.Data.SqlClient.SqlException (0x80131904): Execution Timeout Expired. The timeout period elapsed prior to completion of the operation or the server is not responding. Execution Timeout Expired. The timeout period elapsed prior to completion of the operation or the server is not responding. 
—> System.ComponentModel.Win32Exception (0x80004005): The wait operation timed out Server stack trace: at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction) at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose) at System.Data.SqlClient.TdsParserStateObject.ReadSniError(TdsParserStateObject stateObj, UInt32 error) at System.Data.SqlClient.TdsParserStateObject.ReadSniSyncOverAsync() at System.Data.SqlClient.TdsParserStateObject.TryReadNetworkPacket() at System.Data.SqlClient.TdsParserStateObject.TryPrepareBuffer() at System.Data.SqlClient.TdsParserStateObject.TryReadByte(Byte& value) at System.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady) at System.Data.SqlClient.TdsParser.Run(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj) at System.Data.SqlClient.TdsParser.ProcessAttention(TdsParserStateObject stateObj) at System.Data.SqlClient.TdsParserStateObject.WriteSni(Boolean canAccumulate) at System.Data.SqlClient.TdsParserStateObject.WritePacket(Byte flushMode, Boolean canAccumulate) at System.Data.SqlClient.TdsParserStateObject.WriteByteArray(Byte[] b, Int32 len, Int32 offsetBuffer, Boolean canAccumulate, TaskCompletionSource`1 completion) at System.Data.SqlClient.TdsParser.WriteUnterminatedValue(Object value, MetaType type, Byte scale, Int32 actualLength, Int32 encodingByteSize, Int32 offset, TdsParserStateObject stateObj, Int32 paramSize, Boolean isDataFeed) at System.Data.SqlClient.TdsParser.WriteBulkCopyValue(Object value, SqlMetaDataPriv metadata, TdsParserStateObject stateObj, Boolean isSqlType, Boolean isDataFeed, Boolean isNull) at System.Data.SqlClient.SqlBulkCopy.ReadWriteColumnValueAsync(Int32 col) at System.Data.SqlClient.SqlBulkCopy.CopyColumnsAsync(Int32 col, TaskCompletionSource`1 source) at System.Data.SqlClient.SqlBulkCopy.CopyRowsAsync(Int32 rowsSoFar, Int32 totalRows, CancellationToken cts, TaskCompletionSource`1 source) at System.Data.SqlClient.SqlBulkCopy.CopyBatchesAsyncContinued(BulkCopySimpleResultSet internalResults, String updateBulkCommandText, CancellationToken cts, TaskCompletionSource`1 source) at System.Data.SqlClient.SqlBulkCopy.CopyBatchesAsync(BulkCopySimpleResultSet internalResults, String updateBulkCommandText, CancellationToken cts, TaskCompletionSource`1 source) at System.Data.SqlClient.SqlBulkCopy.WriteToServerInternalRestContinuedAsync(BulkCopySimpleResultSet internalResults, CancellationToken cts, TaskCompletionSource`1 source) at System.Data.SqlClient.SqlBulkCopy.WriteToServerInternalRestAsync(CancellationToken cts, TaskCompletionSource`1 source) at System.Data.SqlClient.SqlBulkCopy.WriteToServerInternalAsync(CancellationToken ctoken) at System.Data.SqlClient.SqlBulkCopy.WriteRowSourceToServerAsync(Int32 columnCount, CancellationToken ctoken) at System.Data.SqlClient.SqlBulkCopy.WriteToServer(IDataReader reader) at Microsoft.EnterpriseManagement.ManagementConfiguration.DataAccessLayer.SqlBulkInsertOperation.ExecuteSynchronously(IDataReader reader) at System.Runtime.Remoting.Messaging.StackBuilderSink._PrivateProcessMessage(IntPtr md, Object[] args, Object server, Object[]& outArgs) at 
System.Runtime.Remoting.Messaging.StackBuilderSink.AsyncProcessMessage(IMessage msg, IMessageSink replySink) Exception rethrown at [0]: at System.Runtime.Remoting.Proxies.RealProxy.EndInvokeHelper(Message reqMsg, Boolean bProxyCase) at System.Runtime.Remoting.Proxies.RemotingProxy.Invoke(Object NotUsed, MessageData& msgData) at Microsoft.EnterpriseManagement.ManagementConfiguration.DataAccessLayer.SqlBulkInsertOperation.AsyncExecute.EndInvoke(IAsyncResult result) at Microsoft.EnterpriseManagement.ManagementConfiguration.DataAccessLayer.SqlBulkInsertOperation.CommandCompleted(IAsyncResult asyncResult) ClientConnectionId:c9685527-804f-437c-959a-059a2ddc2fce Error Number:-2,State:0,Class:11

        The WorkItemStateID is 10 and the duration is under 70 seconds.

        It’s SCOM 2016 UR2, on Server 2016, with a SQL 2016 RTM back-end cluster (AlwaysOn).

        It’s all very random; sometimes we get multiple Management Servers having the issue, other times just a couple.

        Could it be network related? I.e., are the SQL timeouts caused by network-side blips?

        Thanks

        1. Kevin Holman says:

          Snapshot only runs once a day. Random failures are FINE as long as it completes every day.

          What was the output of the SQL query?

          1. Manny Kang says:

            Hi Kevin,

            So the query returns a WorkItemStateID of 20 (Succeeded) for 2 Management Servers and a value of 10 (Failed) for 3 Management Servers.

            But for those servers that have a value of 10 (Failed), it looks as though they do complete, as the CompletedDateTime field is populated.

            The error is:

            Microsoft.EnterpriseManagement.ManagementConfiguration.DataAccessLayer.DataAccessException: Snapshot data transfer operation failed batch write at Microsoft.EnterpriseManagement.ManagementConfiguration.Engine.SnapshotSynchronizationWorkItem.CheckBatchWriteErrors() at Microsoft.EnterpriseManagement.ManagementConfiguration.Engine.SnapshotSynchronizationWorkItem.TransferData(SnapshotProcessWatermark initialWatermark) at Microsoft.EnterpriseManagement.ManagementConfiguration.Engine.SnapshotSynchronizationWorkItem.ExecuteSharedWorkItem() at Microsoft.EnterpriseManagement.ManagementConfiguration.Interop.SharedWorkItem.ExecuteWorkItem() ———————————– Microsoft.EnterpriseManagement.ManagementConfiguration.DataAccessLayer.DataAccessException: Data access operation failed Server stack trace: at Microsoft.EnterpriseManagement.ManagementConfiguration.DataAccessLayer.DataAccessOperation.ExecuteSynchronously(Int32 timeoutSeconds, WaitHandle stopWaitHandle) at Microsoft.EnterpriseManagement.ManagementConfiguration.SqlConfigurationStore.ConfigurationStore.ExecuteOperationSynchronously(IDataAccessConnectedOperation operation, String operationName) at Microsoft.EnterpriseManagement.ManagementConfiguration.SqlConfigurationStore.ConfigurationStore.WriteConfigurationSnapshot(IConfigurationSnapshotDataSet dataSet) at System.Runtime.Remoting.Messaging.StackBuilderSink._PrivateProcessMessage(IntPtr md, Object[] args, Object server, Object[]& outArgs) at System.Runtime.Remoting.Messaging.StackBuilderSink.AsyncProcessMessage(IMessage msg, IMessageSink replySink) Exception rethrown at [0]: at System.Runtime.Remoting.Proxies.RealProxy.EndInvokeHelper(Message reqMsg, Boolean bProxyCase) at System.Runtime.Remoting.Proxies.RemotingProxy.Invoke(Object NotUsed, MessageData& msgData) at Microsoft.EnterpriseManagement.ManagementConfiguration.Engine.WriteConfigurationSnapshotDelegate.EndInvoke(IAsyncResult result) at Microsoft.EnterpriseManagement.ManagementConfiguration.Engine.SnapshotSynchronizationWorkItem.SnapshotBatchWritten(IAsyncResult asyncResult) ———————————– System.Data.SqlClient.SqlException (0x80131904): Execution Timeout Expired. The timeout period elapsed prior to completion of the operation or the server is not responding. Execution Timeout Expired. The timeout period elapsed prior to completion of the operation or the server is not responding. 
—> System.ComponentModel.Win32Exception (0x80004005): The wait operation timed out Server stack trace: at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction) at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose) at System.Data.SqlClient.TdsParserStateObject.ReadSniError(TdsParserStateObject stateObj, UInt32 error) at System.Data.SqlClient.TdsParserStateObject.ReadSniSyncOverAsync() at System.Data.SqlClient.TdsParserStateObject.TryReadNetworkPacket() at System.Data.SqlClient.TdsParserStateObject.TryPrepareBuffer() at System.Data.SqlClient.TdsParserStateObject.TryReadByte(Byte& value) at System.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady) at System.Data.SqlClient.TdsParser.Run(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj) at System.Data.SqlClient.TdsParser.ProcessAttention(TdsParserStateObject stateObj) at System.Data.SqlClient.TdsParserStateObject.WriteSni(Boolean canAccumulate) at System.Data.SqlClient.TdsParserStateObject.WritePacket(Byte flushMode, Boolean canAccumulate) at System.Data.SqlClient.TdsParserStateObject.WriteByteArray(Byte[] b, Int32 len, Int32 offsetBuffer, Boolean canAccumulate, TaskCompletionSource`1 completion) at System.Data.SqlClient.TdsParser.WriteUnterminatedValue(Object value, MetaType type, Byte scale, Int32 actualLength, Int32 encodingByteSize, Int32 offset, TdsParserStateObject stateObj, Int32 paramSize, Boolean isDataFeed) at System.Data.SqlClient.TdsParser.WriteBulkCopyValue(Object value, SqlMetaDataPriv metadata, TdsParserStateObject stateObj, Boolean isSqlType, Boolean isDataFeed, Boolean isNull) at System.Data.SqlClient.SqlBulkCopy.ReadWriteColumnValueAsync(Int32 col) at System.Data.SqlClient.SqlBulkCopy.CopyColumnsAsync(Int32 col, TaskCompletionSource`1 source) at System.Data.SqlClient.SqlBulkCopy.CopyRowsAsync(Int32 rowsSoFar, Int32 totalRows, CancellationToken cts, TaskCompletionSource`1 source) at System.Data.SqlClient.SqlBulkCopy.CopyBatchesAsyncContinued(BulkCopySimpleResultSet internalResults, String updateBulkCommandText, CancellationToken cts, TaskCompletionSource`1 source) at System.Data.SqlClient.SqlBulkCopy.CopyBatchesAsync(BulkCopySimpleResultSet internalResults, String updateBulkCommandText, CancellationToken cts, TaskCompletionSource`1 source) at System.Data.SqlClient.SqlBulkCopy.WriteToServerInternalRestContinuedAsync(BulkCopySimpleResultSet internalResults, CancellationToken cts, TaskCompletionSource`1 source) at System.Data.SqlClient.SqlBulkCopy.WriteToServerInternalRestAsync(CancellationToken cts, TaskCompletionSource`1 source) at System.Data.SqlClient.SqlBulkCopy.WriteToServerInternalAsync(CancellationToken ctoken) at System.Data.SqlClient.SqlBulkCopy.WriteRowSourceToServerAsync(Int32 columnCount, CancellationToken ctoken) at System.Data.SqlClient.SqlBulkCopy.WriteToServer(IDataReader reader) at Microsoft.EnterpriseManagement.ManagementConfiguration.DataAccessLayer.SqlBulkInsertOperation.ExecuteSynchronously(IDataReader reader) at System.Runtime.Remoting.Messaging.StackBuilderSink._PrivateProcessMessage(IntPtr md, Object[] args, Object server, Object[]& outArgs) at 
System.Runtime.Remoting.Messaging.StackBuilderSink.AsyncProcessMessage(IMessage msg, IMessageSink replySink) Exception rethrown at [0]: at System.Runtime.Remoting.Proxies.RealProxy.EndInvokeHelper(Message reqMsg, Boolean bProxyCase) at System.Runtime.Remoting.Proxies.RemotingProxy.Invoke(Object NotUsed, MessageData& msgData) at Microsoft.EnterpriseManagement.ManagementConfiguration.DataAccessLayer.SqlBulkInsertOperation.AsyncExecute.EndInvoke(IAsyncResult result) at Microsoft.EnterpriseManagement.ManagementConfiguration.DataAccessLayer.SqlBulkInsertOperation.CommandCompleted(IAsyncResult asyncResult) ClientConnectionId:ae009e7a-0ce8-449b-a092-fd27ed6d52c3 Error Number:-2,State:0,Class:11

            So is it safe to say we can ignore any occurrence where WorkItemStateID is 10 and CompletedDateTime is populated, and only really act on WorkItemStateIDs 12 (Abandoned) and 15 (Timeout)?

            Thanks

            1. Kevin Holman says:

              No, it is not safe to say that. 10s are bad. But again, it is ok to have snapshot job failures, AS LONG AS you get at least one “20” per day, which means that it was able to complete with success, once per day.
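
              If you want a quick check for that, a variation of the query above works. This is just a sketch against the same cs.workitem table and the WorkItemStateId / CompletedDateTime columns mentioned earlier in this thread (verify the column names in your own database, and adjust the date math if your timestamps are stored in local time rather than UTC):

              SELECT COUNT(*) AS SuccessfulSnapshotsLastDay
              FROM cs.workitem
              WHERE WorkItemName like '%snap%'
              AND WorkItemStateId = 20  -- 20 = Succeeded
              AND CompletedDateTime > DATEADD(DAY, -1, GETUTCDATE())

              If that returns at least 1 each day, the snapshot job is healthy even if you also see some failed (10) rows.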

  8. Manny Kang says:

    Thanks. The issue we are seeing is that the SnapshotSynchronization engine work item never seems to recover after the 25-hour period, so, for example, a 29180 does not get generated. Other workflows, such as the GetNextWorkItem engine work item, will generate a 29181 and recover with a 29180.

    What I am unclear about is when we get the alert Microsoft.SystemCenter.ManagementConfigurationService.SnapshotWorkItemMonitor (generated by the monitor Snapshot Sync state): I see WorkItemStateId 10 for the server when running the query. Will the SnapshotSynchronization engine work item automatically rerun? Even after restarting the Config Service I don’t see a 29180 logged for the SnapshotSynchronization engine work item. Is this normal behavior?

    It’s random why we see this on some servers and not others.
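
    In the meantime, to see whether the SnapshotSynchronization work item gets attempted again at all after a failure, I am just looking at the most recent snapshot rows with a narrower version of the earlier query (a rough sketch using the same cs.workitem columns referenced above; column names may need adjusting in your environment):

    SELECT TOP 20 WorkItemRowId, WorkItemStateId, CompletedDateTime
    FROM cs.workitem
    WHERE WorkItemName like '%snap%'
    ORDER BY WorkItemRowId DESC
    -- looking for a 20 (Succeeded) row newer than the failed (10) rows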

    Thanks again

Comments are closed.
