TMG Service recovery actions

If the Firewall service crashes several times within a short period, it does not automatically restart after the fourth crash. Yet if you review the recovery settings for the Firewall service in the Service Control Manager, the service appears to be configured to restart after all failures.

[Screenshot: recovery settings of the Firewall service in the Service Control Manager]

After each of the first three failures, you will see this error in the event log:

Log Name: System
Source: Service Control Manager
Date: 3/4/2013 1:36:24 PM
Event ID: 7031
Level: Error
Computer: TMG.domain.local
Description:
The Microsoft Forefront TMG Firewall service terminated unexpectedly. It has done this 3 time(s). The following corrective action will be taken in 60000 milliseconds: Restart the service.

This is in line with the expected behavior.

However, after the fourth failure the service will no longer restart and you will see this error in the event log:

Log Name: System
Source: Service Control Manager
Date: 3/4/2013 1:45:34 PM
Event ID: 7034
Level: Error
Computer: TMG.domain.local
Description:
The Microsoft Forefront TMG Firewall service terminated unexpectedly. It has done this 4 time(s).

The behavior may appear inconsistent and unexpected, but it is actually by design.

During TMG installation, the service is configured to restart automatically only after the first three crashes in a 24-hour period. This draws the system administrator's attention to the fact that something is going wrong with the service and needs investigating. It can be considered similar to IIS Rapid Fail Protection: it avoids a situation where the service is restarted and then crashes again straight away.

Checking the service configuration under the registry key HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\fwsrv shows the following:

[Screenshot: registry configuration of the fwsrv service]

The number of configured recovery actions is actually four: the first three are "Restart the service" and the fourth is "Do nothing". This results in the behavior described above.

The Windows Service Control Manager UI is limited to displaying only the first 3 actions and therefore gives the wrong impression of the configured actions.
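
Because the UI hides the fourth action, it can help to decode the stored recovery configuration directly. The following is a minimal Python sketch, not a supported tool. It assumes the recovery settings are stored in a binary registry value named FailureActions with the usual 32-bit layout: a 20-byte header (reset period in seconds, two placeholder DWORDs, the action count, one more placeholder DWORD) followed by one 8-byte entry per action (type, delay in milliseconds). The function name decode_failure_actions and the sample bytes are hypothetical; the sample is constructed to match the configuration described above and is not a dump from a real TMG server.

```python
import struct

# Action type codes from the Windows SC_ACTION_TYPE enumeration.
ACTION_NAMES = {
    0: "Do nothing",
    1: "Restart the service",
    2: "Restart the computer",
    3: "Run a program",
}

def decode_failure_actions(blob: bytes):
    """Decode a FailureActions binary value (assumed layout, see the text above)."""
    # Header: five little-endian DWORDs. Only the reset period and the
    # action count are used here; the other three are placeholders.
    reset_period, _msg, _cmd, count, _ptr = struct.unpack_from("<5I", blob, 0)
    actions = []
    for i in range(count):
        # Each action entry is two DWORDs: action type and delay in milliseconds.
        action_type, delay_ms = struct.unpack_from("<2I", blob, 20 + 8 * i)
        name = ACTION_NAMES.get(action_type, f"Unknown ({action_type})")
        actions.append((name, delay_ms))
    return reset_period, actions

# Hypothetical sample: 24-hour reset period, three restarts, then "Do nothing".
sample = struct.pack("<5I", 86400, 0, 0, 4, 20)
sample += struct.pack("<2I", 1, 60000) * 3
sample += struct.pack("<2I", 0, 0)

reset_period, actions = decode_failure_actions(sample)
print(f"Reset period: {reset_period} s")
for number, (name, delay_ms) in enumerate(actions, start=1):
    print(f"Action {number}: {name} after {delay_ms} ms")
```

On a real server the bytes would come from the registry key shown above rather than from struct.pack, and the layout should be checked against the actual value before relying on the output.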

If you have good reasons to configure the service to restart after subsequent failures, you can do so by running the following command at an elevated command prompt:

Sc.exe failure fwsrv reset= 86400 actions= restart/60000/restart/60000/restart/60000

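This configures three restart actions, each restarting the service after 60 seconds (60000 milliseconds), and resets the failure count after 86400 seconds (24 hours) without a failure. The last configured action also determines the behavior for all subsequent crashes.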

To revert to the default TMG behavior, run the following command from an elevated command prompt:

Sc.exe failure fwsrv reset= 86400 actions= restart/60000/restart/60000/restart/60000//

This reconfigures the Service Control Manager to restart the Firewall service after the first three crashes but to take no action after the fourth and subsequent crashes.
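
The difference between the two configurations can be illustrated with a small Python model of the rule described above: the Nth failure within the reset period uses the Nth configured action, and failures beyond the end of the list reuse the last action. This is a simplified sketch for illustration only. The function name action_for_failure and the strings "restart" and "none" are hypothetical labels, the action list is assumed to be non-empty, and the model ignores the reset of the failure count after 24 hours without a failure.

```python
def action_for_failure(actions, failure_number):
    """Return the action applied to the Nth failure (1-based) within one reset period."""
    # Failures beyond the end of the list reuse the last configured action.
    index = min(failure_number, len(actions)) - 1
    return actions[index]

# The two configurations set by the commands above ("none" means take no action).
restart_always = ["restart", "restart", "restart"]
tmg_default = ["restart", "restart", "restart", "none"]

# Compare both configurations for the first six failures.
for n in range(1, 7):
    print(n, action_for_failure(restart_always, n), action_for_failure(tmg_default, n))
```

With three restart actions, every failure maps to a restart. With the default TMG list, the fourth and later failures map to the final "none" entry, which matches the event 7034 behavior shown earlier.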

 

Author:

Gianni Bragante

Support Engineer - Microsoft Forefront Edge Security Team

Reviewer:

Ian Parramore

Sr. Escalation Engineer - Microsoft Forefront Edge Security Team