Using a recovery in OpsMgr – Basic


Comments (27)
  1. Anonymous says:

    Do you have a procedure for the scenario that you talked about in your last paragraph?  I need to attempt recovery about three times, then raise an alert that can be forwarded to the responsible tech.  Any help will be appreciated.

    send response to:


  2. Tony.Nes says:

    Here is what we did to address the scenario Kevin talks about in the last paragraph. Hope this helps:

    ' Start the service (in this case I am playing with the Windows Update Service)
    strServiceName = "wuauserv"
    strComputer = "."
    Set objWMIService = GetObject("winmgmts:{impersonationLevel=impersonate}!\\" & strComputer & "\root\cimv2")
    Set colListOfServices = objWMIService.ExecQuery("Select * from Win32_Service Where Name = '" & strServiceName & "'")
    For Each objService in colListOfServices
        objService.StartService()
    Next

    ' Sleep 10 seconds
    WScript.Sleep 10000

    ' Check to see if service is running
    Set colRunningServices = objWMIService.ExecQuery _
        ("Select State from Win32_Service Where Name = '" & strServiceName & "'")

    ' Write the status of running / not running to the event log
    Set WshShell = CreateObject("WScript.Shell")
    For Each objService in colRunningServices
        If objService.State <> "Running" Then
            strCommand = "eventcreate /T Error /ID 111 /SO _DW_SCOM_SrvcMntr /L Application /D " & _
                Chr(34) & "Net start of the Windows Update Service has failed." & Chr(34)
        Else
            strCommand = "eventcreate /T Information /ID 100 /SO _DW_SCOM_SrvcMntr /L Application /D " & _
                Chr(34) & "Net start of the Windows Update Service has succeeded." & Chr(34)
        End If
        WshShell.Run strCommand
    Next

  3. Tony.Nes says:

    Correction to my last comment: … The comment made by Dipsg can be found here:

    Essentially both posts are worth looking at when addressing the last paragraph of this article.

  4. Kevin Holman says:

    That is not possible, unfortunately.

    This is because the alert is generated by the state change.  The recovery is also kicked off in response to a state change.  These are simultaneous processes which run in parallel and are not connected, so there is no way to make the recovery output dump into the alert, because the alert has already fired.

    What you can do is have the recovery also log another event that contains the output data, then alert on that event in a separate workflow instead of alerting on the monitor.

    1. Martyn Baggaley says:

      Hi Kevin,

      It would be very beneficial to know where the diagnostic output is saved in the SCOM database.
      For example, where is the output that the diagnostic task ‘List Top CPU Consuming processes’ produces?

      Also, is there a way I can retrieve the diagnostic output through PowerShell?

  5. Kevin Holman says:

    I don't, and I don't know offhand of any community examples. Basically the logic is that you would write a script that attempts to restart the service three times with a sleep in between; if it still doesn't start, you can have the script create an event in the OpsMgr event log.  Then have a rule watching for this event and generate an alert.  You could always have the recovery just do a "NET START & NET START & NET START", but running these back to back isn't as good as a script, which can sleep, analyze the service state, kill processes, etc. That all depends on your scripting skills and testing.
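
The retry logic described above could be sketched in VBScript roughly as follows. This is a minimal sketch under stated assumptions, not a tested recovery: the service name, event IDs, script name, attempt count and wait time are placeholder values, and it assumes the script runs under cscript on an agent where the MOM.ScriptAPI COM object is registered, with LogScriptEvent severity 0 meaning Information and 1 meaning Error.

```vbscript
Option Explicit

' Placeholder values for illustration only; adjust to your environment.
Const SERVICE_NAME = "wuauserv"       ' example service used elsewhere in this thread
Const MAX_ATTEMPTS = 3                ' number of start attempts
Const WAIT_MS = 10000                 ' pause after each attempt, in milliseconds
Const EVENT_ID_OK = 100               ' arbitrary ID for "service is running"
Const EVENT_ID_FAILED = 111           ' arbitrary ID for the alerting rule to match
Const SCRIPT_NAME = "RestartServiceWithRetry.vbs"

Dim objWMI, objAPI, intAttempt

Set objWMI = GetObject("winmgmts:{impersonationLevel=impersonate}!\\.\root\cimv2")
Set objAPI = CreateObject("MOM.ScriptAPI")   ' assumption: OpsMgr agent is installed

' Returns True when the service reports State = "Running".
Function IsRunning()
    Dim colServices, objService
    IsRunning = False
    Set colServices = objWMI.ExecQuery( _
        "Select State from Win32_Service Where Name = '" & SERVICE_NAME & "'")
    For Each objService In colServices
        If objService.State = "Running" Then IsRunning = True
    Next
End Function

' Asks the service to start; the result is checked afterwards via IsRunning.
Sub TryStart()
    Dim colServices, objService
    Set colServices = objWMI.ExecQuery( _
        "Select * from Win32_Service Where Name = '" & SERVICE_NAME & "'")
    For Each objService In colServices
        objService.StartService()
    Next
End Sub

' Attempt the start up to MAX_ATTEMPTS times, sleeping after each attempt.
For intAttempt = 1 To MAX_ATTEMPTS
    If IsRunning() Then Exit For
    TryStart
    WScript.Sleep WAIT_MS
Next

' Log the outcome so that a separate rule can alert on the failure event.
If IsRunning() Then
    objAPI.LogScriptEvent SCRIPT_NAME, EVENT_ID_OK, 0, _
        SERVICE_NAME & " is running."
Else
    objAPI.LogScriptEvent SCRIPT_NAME, EVENT_ID_FAILED, 1, _
        SERVICE_NAME & " did not start after " & MAX_ATTEMPTS & " attempts."
End If
```

A separate rule would then watch the Operations Manager event log for the failure event ID and raise the alert. It is worth checking the source and ID of the event that actually gets written on a test agent before building that rule.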

    1. Rinku says:

      Hi Kevin,

      In the Exchange MP, all Exchange services are monitored by one unit monitor. Do I need to create a recovery task for each Exchange service?

      How can we automate a unit monitor that monitors many services?

  6. Tony.Nes says:

    Looks like the "best practice" means of addressing the last paragraph is to use Orchestrator. Here is a good write-up of the process. … Also, see the comment made by Dipsg on 17 Sep 2013 4:19 AM

  7. HuyN says:


    Great write-up… Is it possible to include the results from the Diagnostic and Recovery tasks in the alert description?  This would be very useful when our support team gets a ticket for an alert such as Total CPU Utilization Percentage, because the ticket could then include the list of top CPU consuming processes without anyone opening Health Explorer.



  8. silent says:


    Thanks for this great article!

    Is it possible to send an email notification when a recovery task was executed?

    It's OK that SCOM starts the service, but I want to know when a service was stopped and SCOM restarted it.

  9. Mondeo Dave says:


    Is it possible to get SCOM to run a batch file as a recovery task?  I have a group of 6 services on a server.  If one of the services stops, then all 6 services should stop and then restart in a particular order.  I have been given a .bat file that works when run manually on the box: the batch file runs and all services stop and restart in the correct order.  I have added the batch file as a recovery task and set it to run automatically, but it fails to run.

    The .bat file has been copied locally to the server.   The full path to the file I used was c:\folder\restartservices.bat.

    The working directory is c:\folder.  I don't know where I'm going wrong.  Can you offer advice on this please?

  10. Hugh Scott says:

    How do I tell if a recovery task actually executed?  I have one configured and it appears to work most of the time, but occasionally it appears to fail and I'm trying to research why.

  11. Hugh Scott says:

    So, it appears that there is a view in the Operations Manager database (SCOM 2007 R2) called RecoveryJobStatusView.  It's not readily apparent to me how it's intended to be used, but I can clearly see that my recovery task was executed (based on the timing of the TimeStarted field, adjusted for UTC).

    The Output field is less than helpful:

    <DataItem type="System.CommandOutput" time="2013-04-19T09:52:16.1126762-04:00" sourceHealthServiceId="DC410D98-C067-3351-D2A0-1E1E4CF6069D"><StdOut></StdOut><StdErr></StdErr><ExitCode>0</ExitCode><ProcessError></ProcessError></DataItem>

    There are no errors to indicate that the command failed (it's an OS command that runs).  But I find no evidence that the task actually kicked off.

    A little help?  Some guidance?
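
For reading an Output value like the one quoted above outside the console, a small VBScript sketch can load the XML and print each field. It assumes the MSXML2.DOMDocument COM object is available on the machine and uses a shortened copy of the quoted DataItem; nothing in it is specific to SCOM, and the variable names are illustrative.

```vbscript
Option Explicit

Dim strOutput, objXml, objNode, strName

' Shortened copy of the Output value quoted above (time and
' sourceHealthServiceId attributes left out for brevity).
strOutput = "<DataItem type=""System.CommandOutput"">" & _
            "<StdOut></StdOut><StdErr></StdErr><ExitCode>0</ExitCode>" & _
            "<ProcessError></ProcessError></DataItem>"

Set objXml = CreateObject("MSXML2.DOMDocument")
objXml.async = False   ' parse synchronously so loadXML returns the final result

If Not objXml.loadXML(strOutput) Then
    WScript.Echo "Could not parse Output: " & objXml.parseError.reason
    WScript.Quit 1
End If

' Print each element of the command output, flagging empty or missing ones.
For Each strName In Array("ExitCode", "StdOut", "StdErr", "ProcessError")
    Set objNode = objXml.selectSingleNode("/DataItem/" & strName)
    If objNode Is Nothing Then
        WScript.Echo strName & ": (element missing)"
    ElseIf Len(objNode.text) = 0 Then
        WScript.Echo strName & ": (empty)"
    Else
        WScript.Echo strName & ": " & objNode.text
    End If
Next
```

An ExitCode of 0 with an empty ProcessError suggests the command process was launched and exited cleanly without printing anything; it does not by itself show that the command had the intended effect.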

  12. Frode says:

    Great article Kevin!

    Do you know how often this check is performed, and is that configurable?

  13. Hari says:

    Hi Kevin, is there a way to run a Linux command as part of the recovery task for the monitor? A little help would be great!

  14. pardeep says:

    Is it possible to do agentless monitoring of a Windows Server machine that is in a workgroup?

  15. Pavan.M says:

    Hi Kevin,
    I have an event-based monitor, and I need to create a recovery based on another event ID. For example, the monitor will generate an alert when it sees event ID 1234, and it has to recover and close the alert when event ID 5678 occurs. Is this possible? If yes, can you tell me how to achieve it?

  16. Rinku says:

    Hi Kevin,
    In the Exchange MP, all Exchange services are monitored by one unit monitor. Do I need to create a recovery task for each Exchange service?

    How can we automate a unit monitor that monitors many services?

    1. Brian Frahm says:

      Kevin – not sure this is 100% on topic… I have a SCOM2K12 MP in which a state goes unhealthy if a specific event ID (let’s say 27) is logged in the Application log. It goes back to healthy if a different event ID (let’s say 28) is logged. Generally, SCOM will generate an alert when Event ID 27 fires. Sometimes, event ID 28 will fire within say 30 seconds of event ID 27. Is it possible that SCOM doesn’t poll the state health frequently enough to fire a notification? Could it maybe be checking every X minutes before generating a notification?

  17. Ashutosh Kumar says:

    Hi Kevin,

    I need help configuring a recovery task in SCOM for the Web Application Availability Monitor.

    Scenario: we have multiple application endpoints which are hosted on different App Pools in IIS. We need to set up a recovery task in SCOM so that if there is an alert from the Web Application Availability Monitor for a particular endpoint, it restarts the App Pool which hosts that particular endpoint and not the other App Pools.

    Environment: SCOM MS in Azure, Agents in In-house DC

    Just wanted to check whether this kind of customization is possible in SCOM. If yes, please guide me on how to set it up.

  18. Jeevan says:

    Hi Kevin, could you please suggest which account we should use for the action account: “Local System Action Account” or a “Domain account”? We ask because we don't see any result from these recovery tasks when using the “Local System Action Account”. We have multiple domains.

    1. Kevin Holman says:

      I *always* recommend use of Local System for the default action account.

      Local system has full rights to start/stop services on a computer – so it won’t be a rights issue.

      1. jeevan says:

        Hi Kevin,
        Does this require any trust between the different domains? I see that recovery tasks work well on machines in the same domain as the SCOM MS but do not run on servers in a different domain.

        1. Kevin Holman says:

          No. The recoveries run as the default agent action account, the same as the monitoring workflows. You should use Local System.

          1. jeevan says:

            Hi Kevin,
            How can we enable “Recalculate monitor state after recovery finishes”?

  19. Hi Kevin,
    Thanks for a great article, as always.
    I configured a monitor in SCOM to check that a log file's modified time is less than 10 minutes old and to generate an alert otherwise, which works fine.
    I also configured a recovery task for this monitor to restart the application service and wrote a VBScript for it. The script works fine when run from an elevated CMD prompt, but from a normal CMD prompt it gives an access denied error.
    I am not sure whether SCOM 2012 R2 runs the recovery task script with elevated privileges, as my recovery script is not running successfully (the service that is supposed to restart does not).
    Also, where can I find the logs to check whether the recovery task ran properly in SCOM?

    Nilesh Gavali.

  20. Nilesh Gavali says:

    Hello Kevin,
    I configured a monitor on a log file: if the log file has not been updated in the last 20 minutes, it triggers an alert and runs a diagnostic task to stop one of the services and then start it.
    The challenge is that we need to know how recovery tasks are sequenced, as we have a first recovery task to stop the service and a second task to start the same service.
    The recovery tasks are running, but in the wrong order: instead of stopping the service first, it runs the start service task and then the stop service task.
    There is no way we can make out which sequence it will follow.
    I need help to clarify this.

Comments are closed.
