One of the most common scenarios in server monitoring is monitoring CPU and memory. The OS MP includes default monitors that check total processor utilization and available MBs of memory. These two are key performance counters when we start troubleshooting any performance-related issue on servers.
One of the challenges with these monitors is that we get alerted when CPU or memory is under pressure, but to troubleshoot you need to know which process is actually causing it. This key information is missing from the alerts. Consider the situation where you get those alerts while you are away from the office: in that case you have no pointers from Operations Manager to diagnose the situation.
In Operations Manager we can attach a task to a monitor to:
- Gather more information (Diagnostic Tasks)
- Take action to resolve the problem (Recovery Tasks)
So all we need to do is attach a task that retrieves the top processes when these monitors go unhealthy.
Tracking High CPU Utilization
Here I have one server under load, and both the memory and processor monitors are in an unhealthy state.
To find what is consuming the CPU cycles, we can run the List Top CPU Consuming Processes task from the Windows Server Operating System state view under Microsoft Windows Server.
So all we need to do is attach this task from the Windows.Server.Library MP to the Total CPU Utilization monitor as a recovery task.
Here is the XML snippet that does this:
Here is Health Explorer after the recovery is attached:
Now you can go back and track which processes were causing high processor utilization on your servers from State Changes in Health Explorer.
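As a rough illustration of what "top CPU consuming processes" means, here is a minimal Python sketch. It is not the script the built-in task runs. It assumes two snapshots of cumulative CPU seconds per process, taken a known interval apart, and ranks processes by their share of CPU over that interval. The function name `top_cpu` and the snapshot layout are hypothetical.

```python
def top_cpu(before, after, interval_s, cpus=1, n=5):
    """Rank processes by CPU share between two snapshots.

    before/after: dict mapping pid -> (name, cumulative_cpu_seconds).
    This layout is an assumption made for the sketch.
    """
    usage = []
    for pid, (name, cpu_after) in after.items():
        if pid not in before:
            # Process started between snapshots; no baseline to compare with.
            continue
        delta = cpu_after - before[pid][1]
        # Share of the total CPU time available across all cores.
        pct = 100.0 * delta / (interval_s * cpus)
        usage.append((name, pid, round(pct, 1)))
    # Highest consumers first.
    return sorted(usage, key=lambda row: row[2], reverse=True)[:n]


if __name__ == "__main__":
    before = {1: ("a.exe", 10.0), 2: ("b.exe", 5.0)}
    after = {1: ("a.exe", 12.0), 2: ("b.exe", 9.0), 3: ("c.exe", 1.0)}
    for row in top_cpu(before, after, interval_s=5, cpus=2):
        print(row)
```

A list like this, captured at the moment the monitor changes state, is what makes the state change history useful for diagnosis later.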
Tracking Memory Pressure
The same technique can be applied to track memory pressure. The only problem is that we don't have a built-in task for this, so I have modified the script that collects CPU cycles so that it returns memory consumption for the processes instead.
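The modified script itself ships in the attached MPs and is not reproduced here. As a hedged illustration of the same idea, the Python sketch below parses the CSV output of the Windows `tasklist /fo csv /nh` command and returns the processes with the largest memory usage. The function names and sample data are made up for the example, and the parser assumes the memory column looks like "1,234 K", which can vary by locale.

```python
import csv
import io


def parse_tasklist_csv(text):
    """Parse `tasklist /fo csv /nh` output into (name, pid, mem_kb) tuples."""
    rows = []
    for fields in csv.reader(io.StringIO(text)):
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        name, pid, mem = fields[0], fields[1], fields[4]
        # Keep only the digits of the memory column, e.g. "1,204,332 K".
        digits = "".join(ch for ch in mem if ch.isdigit())
        if not digits or not pid.isdigit():
            continue
        rows.append((name, int(pid), int(digits)))
    return rows


def top_by_memory(rows, n=5):
    """Return the n rows with the highest memory usage, largest first."""
    return sorted(rows, key=lambda row: row[2], reverse=True)[:n]


# Invented sample output, for illustration only.
SAMPLE = '''"System Idle Process","0","Services","0","8 K"
"sqlservr.exe","1520","Services","0","1,204,332 K"
"w3wp.exe","4312","Services","0","512,004 K"
"notepad.exe","7788","Console","1","14,220 K"
'''

if __name__ == "__main__":
    for name, pid, mem_kb in top_by_memory(parse_tasklist_csv(SAMPLE), n=3):
        print(f"{name} (PID {pid}): {mem_kb} KB")
```

On a live server the sample string could be replaced with the captured output of `tasklist /fo csv /nh`. A recovery attached to the memory monitor would record a similar list in Health Explorer.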
Now whenever you have a problem with CPU or memory utilization, you will be able to see the process causing it from Health Explorer.
Three MPs are attached: one for Windows Server 2003, one for 2008 and 2008 R2, and one for 2012 and 2012 R2.