A Post-it note is like an addendum, no?
As an Operations engineer, how many times do you get notified for a service restart?
Did you know about Service Recovery actions, or SCOM Recovery Tasks?
Why weren't SCOM Recovery Tasks added for many of the common Microsoft applications?
Today, let's discuss some actions that help reduce the manual work required to resolve service issues.
Let's explain the basics
- Windows Server services have a Recovery tab in their properties in Services.msc.
- Does your monitoring tool allow for recovery actions?
Recovery actions are configured on the Recovery tab of a service's properties.
Here's an example using the SCOM agent service.
NOTE: The service is set to restart on each of the 3 failure options, with a 1-minute delay before each restart.
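The same Recovery tab settings can also be scripted with the built-in `sc.exe failure` command. Below is a minimal Python sketch, not a definitive implementation, that builds that command for three restarts spaced 1 minute apart. The function names are hypothetical, `HealthService` is assumed to be the SCOM agent's service name, and the one-day reset period is an assumption that does not come from the example above.

```python
import subprocess


def build_recovery_command(service_name, restarts=3, delay_ms=60_000, reset_seconds=86_400):
    """Build an `sc.exe failure` command that restarts a service on failure."""
    # One "restart/<delay in ms>" pair per failure slot (first, second, subsequent).
    actions = "/".join(f"restart/{delay_ms}" for _ in range(restarts))
    return [
        "sc.exe", "failure", service_name,
        "reset=", str(reset_seconds),  # assumed: failure count resets after one day
        "actions=", actions,
    ]


def apply_recovery(service_name):
    """Run the command. Requires an elevated session on a Windows host."""
    return subprocess.run(build_recovery_command(service_name), check=True)


if __name__ == "__main__":
    # Assumed service name for the SCOM agent; only prints the command, does not run it.
    print(" ".join(build_recovery_command("HealthService")))
```

Keeping the command builder separate from `apply_recovery` lets you review the exact command line before changing anything on a server.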
Let's take it one step further and restart the service from another tool (insert your monitoring tool here).
In SCOM, taking an action after identifying the problem can be handled in different ways:
- Service state relates to health, which is typically handled by monitors, so restart automation falls under Recovery Tasks.
- Actions can be configured in monitors as a 'Recovery Task', or in rules as a response.
- Rule Response
Active Directory Domain Services (AD DS)
Now that we understand the methods available, let's get to the Addendum.
The Active Directory Domain Services Addendum MP adds Recovery Tasks to the AD DS service monitors.
NOTE: This is for the newer v10.0.x.y management packs that support AD DS 2012-2016.
Specifically, the pack has 12 Recovery Tasks for DFS, NTDS, DFSR, IsmServ, KDC, NetLogon, NTFRS, W32Time, Group Policy, DNS Client, ADWS, and DNS.
The recovery tasks verify service state, start 'not running' services, and include the option to recalculate health.
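To illustrate the verify-then-start logic, here is a minimal Python sketch under stated assumptions. It is not the addendum pack's actual task code: the function names are hypothetical, and it parses the English `STATE` line from `sc.exe query`. Recalculating health is a SCOM-side action and is not shown.

```python
import subprocess


def query_state(service_name, run=subprocess.run):
    """Return the state word from `sc.exe query`, for example RUNNING or STOPPED."""
    result = run(["sc.exe", "query", service_name], capture_output=True, text=True)
    for line in result.stdout.splitlines():
        parts = line.split()
        # Expected line shape: "STATE : 4 RUNNING"
        if parts and parts[0] == "STATE":
            return parts[-1]
    return "UNKNOWN"


def recover_service(service_name, run=subprocess.run):
    """Verify the service state and request a start if it is not running."""
    if query_state(service_name, run) == "RUNNING":
        return "already running"
    # Any other state is treated as 'not running' in this simplified sketch.
    run(["sc.exe", "start", service_name], capture_output=True, text=True)
    return "start requested"
```

The `run` parameter exists only so the logic can be exercised with a stand-in instead of calling `sc.exe` on a real server.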
My goal is automation that helps anyone work smarter, not harder, and avoids being woken up at 2 a.m. just to restart a service.