Infinite recycling of the flush health state cache task

When troubleshooting the OpsMgr agent, you may see recommendations to “clear the cache”. This is found here

image

And explained in the following TechNet article

How and When to Clear the Cache
https://technet.microsoft.com/en-us/library/hh212884.aspx

However, when the Flush Health Service State and Cache task is run against an OpsMgr agent, the task might re-run in a loop, leading to increased I/O and no state changes for this agent. When this infinite recycling of the flush health state cache task occurs, you will find repeated health service restarts on the agent, like this

image

The re-running of the task is a timing issue. It occurs when, after the cache has been cleared, the agent recycles its service before it has managed to send a task-success message to the management server.

At this moment Microsoft is reconsidering this function for the next release. For now there are two options. You can wait until the task finishes; how long that takes depends on the timing issue, but it will finish eventually (the maximum timeout is 24 hours). Or, to stop the task from re-running, you can uninstall and reinstall the agent on the affected servers.

At this moment it is recommended not to use this task. If necessary, flush the cache manually. Take note, though: as you can read in the TechNet article, flushing the cache should in general be the final step when troubleshooting issues with the agent, before uninstalling and reinstalling the agent.

“Resolving” an alert on an agent by flushing the health state is not a good way to manage things; it is really intended as a last-ditch troubleshooting effort.

This task deletes the local agent cache, which holds data such as the current health state and the rule/monitor configuration for that agent. Deleting it causes the agent to discard its current health state and configuration and to request a new configuration from its management server. In other words, it forces the agent to reset its current health view and start again.
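If you do need to flush the cache by hand instead of using the task, the usual sequence is: stop the agent's health service, move the cache folder out of the way, and start the service again. Below is a minimal Python sketch of that sequence, not an official tool. It assumes the agent service is named HealthService and that you pass in the path of the agent's "Health Service State" folder yourself, because the location depends on the agent version and install directory. The function name and the injectable `run` parameter are illustrative choices so the logic can be tried without touching a real service.

```python
import subprocess
import time
from pathlib import Path

# Assumption: the agent's Windows service is named "HealthService".
SERVICE_NAME = "HealthService"


def run_command(args):
    """Run a command and raise CalledProcessError on a non-zero exit code."""
    subprocess.run(args, check=True)


def flush_cache_manually(state_dir, run=run_command):
    """Stop the agent service, set the cache folder aside, restart the service.

    state_dir: path of the agent's "Health Service State" folder (assumed to
               be supplied by the caller; it is not discovered automatically).
    run:       callable that executes a command list; replaceable for dry runs.

    Returns the path of the renamed folder, or None if no folder existed.
    """
    state_dir = Path(state_dir)

    # If the stop command fails, an exception is raised here and the
    # cache folder is never touched while the service may still be running.
    run(["net", "stop", SERVICE_NAME])

    backup = None
    try:
        if state_dir.exists():
            # Rename instead of deleting, so the old cache can still be inspected.
            backup = state_dir.with_name(f"{state_dir.name}.old-{int(time.time())}")
            state_dir.rename(backup)
    finally:
        # Always try to bring the agent back, even if the rename failed.
        run(["net", "start", SERVICE_NAME])
    return backup
```

Run it from an elevated prompt on the agent machine, for example `flush_cache_manually(r"<agent install dir>\Health Service State")`, replacing the placeholder with the real folder. Once the agent has fetched a new configuration and is healthy again, the renamed folder can be deleted.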

This may be useful if there is a problem with the local agent cache DB, and we have previously seen problems with the operating system DB engine we use.

These have since been fixed by our respective operating system teams:

Management servers or assigned agents unexpectedly appear as unavailable in the Operations Manager console in Windows Server 2003 or Windows Server 2008
https://support.microsoft.com/kb/981263

In fact the main reason this task was included in our product was to help automate the workaround proposed in that KB.

Hope this helps!

 

Thanks to Brian McDermott and Dirk van Coeverden (dirkv(at)microsoft.com)