I recently had an interesting customer issue.
We were deploying a new management group to test how SCOM performance is affected as we scale up the number of agents. This particular management group only had the default MPs from installing SCOM, and the Base OS MPs. Nothing more.
When we scaled up to ~2000 agents, we took a checkpoint of performance. The console was zippy, and the management servers were having no issues. However, when we analyzed performance on the database, we saw really high CPU.
Zooming into a smaller time chunk, the CPU was pretty wild:
What we found was that the customer had moved the SCOM databases to a different server from the one they were originally installed on. When they did this, they did not fully follow the TechNet instructions to ensure that SQL Broker and CLR are enabled.
You can check this:
SELECT is_broker_enabled FROM sys.databases WHERE name='OperationsManager'
SELECT * FROM sys.configurations WHERE name = 'clr enabled'
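Both should return a value of “1” to show they are enabled. For the second query, look at the value_in_use column, which shows the setting that is actually in effect.
Changing these values is covered here: https://technet.microsoft.com/en-ca/library/hh278848.aspx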
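For reference, here is a minimal sketch of what enabling both settings typically looks like. It assumes the operational database is named OperationsManager (as in the check queries above) and that the SCOM services on all management servers have been stopped first, since enabling the broker needs exclusive access to the database. Treat the linked article as the authoritative procedure; this is only an illustration.

```sql
-- Sketch only: assumes the database name OperationsManager and that the
-- SCOM services on the management servers are stopped beforehand.

-- Enable SQL Service Broker. The database is switched to single-user mode
-- so the ALTER can get exclusive access, then switched back.
ALTER DATABASE OperationsManager SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
ALTER DATABASE OperationsManager SET ENABLE_BROKER;
ALTER DATABASE OperationsManager SET MULTI_USER;

-- Enable CLR at the instance level and apply the change.
EXEC sp_configure 'clr enabled', 1;
RECONFIGURE;
```

Afterwards, rerun the two check queries to confirm that both return 1, then start the SCOM services again.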
Always make sure you handle the other changes necessary when moving a database, and don’t forget to add the sysmessages back, documented here: Event 18054 errors in the SQL application log – in SCOM 2012 R2 deployments
After making these changes, the impact was significant: average CPU consumption dropped from 50% to 11%.
24 hour snapshot:
One hour snapshot:
Whenever you visit a SCOM customer, or inherit a SCOM environment whose full history you don’t know, these settings might not be enabled, and the customer might not even be aware they are impacted, especially if their agent count is low. There are other symptoms you’d see, such as regular expressions failing in the logs without CLR enabled, and agent discovery not working without SQL Broker, but this is always a good thing to inspect when reviewing the health of a deployment.