Exchange 2007 managed services might time out during certificate revocation checks


Introduction

We have been working on a problem that surfaced with the release of Exchange 2007 Rollup 5. A number of customers reported that some of their Exchange 2007 managed services did not start automatically after the rollup was applied, although they would start manually. In most of these cases the computer in question was not connected to the Internet. Investigation showed that the problem was one of timing: all of the certificates associated with a service must be validated within the Windows Service Control Manager (SCM) timeout. On computers that were connected to the Internet, the cause appeared to be network latency, and the problem occurred only intermittently.

Why this happens

To sum it up: the problem did not manifest itself until Exchange 2007 managed binaries were signed with two certificates (because the original one was expiring). For Exchange 2007 RTM this occurred when RU5 was released. Because the .NET Framework Common Language Runtime (CLR) now attempted to validate two certificates by connecting to http://crl.microsoft.com, the process took longer, to the point where the SCM timeout would elapse and services would fail to start automatically. In environments with limited or no Internet connectivity, startup might fail every time. In other cases, it might fail intermittently, depending on current network state and load.

Workarounds

KB article 944752, "Exchange 2007 managed code services do not start after you install an update rollup for Exchange 2007," is being revised to contain only the recommended workaround, which is to modify the services’ configuration files. The modification prevents the CLR from going out to the Internet in the first place. This is accomplished by adding a section to the managed executable’s configuration file, as outlined in "Bypassing the Authenticode Signature Check on Startup" (MSDN .NET Security Blog, Shawn Farkas).

Security concerns

The first question people usually ask is whether modifying the .config files compromises server security.

In the case of Exchange Server 2007, this does not relax security at all.  From a security point of view, the setting says "assume that the Authenticode signature is invalid."

Let’s say that you have these three assemblies:

  • An assembly without an Authenticode signature
  • An assembly with an invalid Authenticode signature
  • An assembly with a valid signature that has this option set in its configuration file

All three behave exactly the same.  The CLR will load them and will not give Publisher evidence to the assembly.  Since the assembly doesn’t get the Publisher evidence, any trust decisions that were being made based upon the validity of the signature will no longer apply – so if the assembly is trusted only because of its signature, it will lose its trust status due to the config switch.  Exchange 2007 assemblies, however, do not rely on this mechanism alone to determine whether an assembly should run, so disabling it does not compromise Exchange.

To make this .config file modification

The resolution to this problem involves editing the configuration files associated with the Exchange services to use a switch that was added in CLR 2.0 SP1 (which is present by default in .NET Framework 3.5). You can update .NET Framework 2.0 or 3.0 by installing the fix from http://support.microsoft.com/kb/942027/. That hotfix contains the fix outlined in http://support.microsoft.com/default.aspx/kb/936707.

Before continuing with this procedure, save a copy of your existing configuration files to a safe location. If a configuration file contains an error, the applicable service will fail to start.

Create configuration files for all managed code Exchange 2007 services to resolve this issue. Please note that we do not suggest creating or modifying the .config files unless your server is actually affected by this problem.


To create an application configuration file that contains this configuration setting, follow these steps:

1. Create a file, and name it <ApplicationName>.exe.config.
2. In a text editor, open the file that you created in step 1.
3. Add the following code to the file:

<configuration>
  <runtime>
    <generatePublisherEvidence enabled="false"/>
  </runtime>
</configuration>

4. Save the file to the applicable directory (see below for a list of files and locations – each .config file must be saved in the same folder as the affected executable).

If the configuration file already exists for a specific service, just add the <generatePublisherEvidence enabled="false"/> line to the runtime section of the file.
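
The create-or-update procedure above can be sketched in Python. This is a minimal sketch, not a tested Exchange tool: the function names add_setting and ensure_config are hypothetical, and it assumes the file is UTF-8 text with a plain <runtime> opening tag. It edits the text directly rather than re-serializing the XML, so existing comments and formatting are left alone, and it refuses to guess when it sees a <runtime> form it does not recognize.

```python
import os
import re
import xml.etree.ElementTree as ET

SETTING_LINE = '<generatePublisherEvidence enabled="false"/>'
NEW_FILE = (
    "<configuration>\n"
    "  <runtime>\n"
    "    " + SETTING_LINE + "\n"
    "  </runtime>\n"
    "</configuration>\n"
)


def add_setting(text):
    """Return config text that contains the setting; other content is untouched."""
    if "generatePublisherEvidence" in text:
        return text  # already mentioned; check its value by hand
    match = re.search(r"<runtime\s*>", text)
    if match:
        # Insert right after the opening <runtime> tag.
        new_text = text[:match.end()] + "\n    " + SETTING_LINE + text[match.end():]
    elif re.search(r"<runtime\b", text):
        # e.g. <runtime/> or <runtime attr="...">: safer to edit by hand.
        raise ValueError("unrecognized <runtime> form")
    else:
        # No <runtime> section yet: add one just before </configuration>.
        end = text.rfind("</configuration>")
        if end == -1:
            raise ValueError("no </configuration> end tag found")
        block = "  <runtime>\n    " + SETTING_LINE + "\n  </runtime>\n"
        new_text = text[:end] + block + text[end:]
    ET.fromstring(new_text)  # raises ParseError if the result is not well-formed
    return new_text


def ensure_config(path):
    """Create the file if missing, otherwise update it. Returns True if written."""
    if not os.path.exists(path):
        new_text = NEW_FILE
    else:
        with open(path, encoding="utf-8-sig") as handle:
            old_text = handle.read()
        new_text = add_setting(old_text)
        if new_text == old_text:
            return False
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(new_text)
    return True
```

As with the manual procedure, back up the existing .config files first, and run something like this only on servers that are actually affected.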

Exchange 2007 Services/Apps which come with a .config file that might need to be updated:

Bin\EdgeTransport.exe
Bin\ExBPA.exe
Bin\ExBPACmd.exe
Bin\ExTRA.exe
Bin\Microsoft.Exchange.Cluster.ReplayService.exe
Bin\Microsoft.Exchange.EdgeSyncSvc.exe
Bin\Microsoft.Exchange.Monitoring.exe
Bin\Microsoft.Exchange.Search.ExSearch.exe
Bin\Microsoft.Exchange.ServiceHost.exe
Bin\MSExchangeMailboxAssistants.exe
Bin\MSExchangeMailSubmission.exe
Bin\MSExchangeTransportLogSearch.exe
ClientAccess\PopImap\Microsoft.Exchange.Imap4.Exe
ClientAccess\PopImap\Microsoft.Exchange.Pop3.Exe

Exchange 2007 Services for which you need to create a new .config file (unless it was already created for another reason):

Bin\Microsoft.Exchange.AntispamUpdateSvc.exe
Bin\MsExchangeFDS.exe
Bin\MSExchangeTransport.exe

Troubleshooting

If a service fails to start after modifying or creating the config files, the most likely reason is an XML syntax error or an incorrect value. In both of these cases the service will fail to start and you’ll get an error similar to this example from the Exchange 2007 Edge Transport Service:

Event Type:      Error
Event Source:    MSExchangeTransport
Event Category:  Process
Event ID:        14004
Description:
The worker process has failed to load application configuration file: System.Configuration.ConfigurationErrorsException: Configuration system failed to initialize ---> System.Configuration.ConfigurationErrorsException: The 'generatePublisherEvidence' start tag on line 4 does not match the end tag of 'runtime'. Line 5, position 6. (C:\Program Files\Microsoft\Exchange Server\Bin\edgetransport.exe.config line 5) ---> System.Xml.XmlException: The 'generatePublisherEvidence' start tag on line 4 does not match the end tag of 'runtime'. Line 5, position 6.
   at System.Xml.XmlTextReaderImpl.Throw(Exception e)
   at System.Xml.XmlTextReaderImpl.ThrowTagMismatch(NodeData startTag)
   at System.Xml.XmlTextReaderImpl.ParseEndElement()
   at System.Xml.XmlTextReaderImpl.ParseElementContent()
   at System.Xml.XmlTextReaderImpl.Skip()
   at System.Configuration.XmlUtil.StrictSkipToNextElement(ExceptionAction action)
   at System.Configuration.BaseConfigurationRecord.ScanSectionsRecursive(XmlUtil xmlUtil, String parentConfigKey, Boolean inLocation, String locationSubPath, OverrideModeSetting overrideMode, Boolean skipInChildApps)
   at System.Configuration.BaseConfigurationRecord.ScanSections(XmlUtil xmlUtil)
   at System.Configuration.BaseConfigurationRecord.InitConfigFromFile()
   --- End of inner exception stack trace ---
   at System.Configuration.ConfigurationSchemaErrors.ThrowIfErrors(Boolean ignoreLocal)
   at System.Configuration.BaseConfigurationRecord.ThrowIfParseErrors(ConfigurationSchemaErrors schemaErrors)
   at System.Configuration.ClientConfigurationSystem.EnsureInit(String configKey)
   --- End of inner exception stack trace ---
   at System.Configuration.ConfigurationManager.GetSection(String sectionName)
   at System.Configuration.ConfigurationManager.get_AppSettings()
   at Microsoft.Exchange.Transport.TransportAppConfig.GetConfigBool(String label, Boolean defaultValue)
   at Microsoft.Exchange.Transport.TransportAppConfig.ResourceManagerConfig.Load()
   at Microsoft.Exchange.Transport.TransportAppConfig.Load()
   at Microsoft.Exchange.Transport.Main.Program.Run(String[] args)
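
An error like the one above can usually be caught before the service is restarted by checking the edited file for well-formedness. The following Python sketch is one way to do that; the function name check_config_text is hypothetical, and the assumption that "true" and "false" are the only accepted values for enabled is ours, not something stated in this post.

```python
import xml.etree.ElementTree as ET


def check_config_text(text):
    """Return None if the text looks usable, otherwise a short problem description."""
    try:
        root = ET.fromstring(text)
    except ET.ParseError as error:
        line, column = error.position
        return "not well-formed XML at line %d, column %d: %s" % (line, column, error)
    element = root.find("runtime/generatePublisherEvidence")
    if element is None:
        return "no generatePublisherEvidence element under runtime"
    # Assumption: "true" and "false" are the only accepted values.
    if element.get("enabled") not in ("true", "false"):
        return "unexpected enabled value: %r" % element.get("enabled")
    return None


def check_config_file(path):
    """Read a .config file (UTF-8, with or without a byte order mark) and check it."""
    with open(path, encoding="utf-8-sig") as handle:
        return check_config_text(handle.read())
```

A missing "/" at the end of the generatePublisherEvidence tag, which is the mistake behind the event shown above, is reported here as a mismatched tag.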

Additional information

We’re continuing to investigate this problem and will provide more information as it becomes available.

Ed Beck, Nino Bilic



Comments (22)
  1. paul says:

    Thank you for this fix. I have this problem in a pre-prod system and don’t mind making this change there. But will this be in a hotfix later? I do not want to have to update this on a large number of servers, or make sure it is changed on every new Exchange 2007 server that is built.

  2. Exchange says:

    Paul,

    We are tentatively planning to have a permanent solution for this, yes, as a manual solution is, well… not very cool. We’ll definitely share the details as plans solidify.

  3. Robert says:

    How do you go about getting that level of detail in the logs?  I just got:

    7009

    Timeout (30000 milliseconds) waiting for the Microsoft Exchange Transport service to connect.

    and

    7000

    The Microsoft Exchange Transport service failed to start due to the following error:

    The service did not respond to the start or control request in a timely fashion.

  4. Exchange says:

    Robert,

    If services fail to start in a timely fashion (which is what happens when the CRL checks time out), you will get the events that you got, yes. The event pasted in the blog post is an example of what you can see if the .config file has bad syntax. In that case, we log a more detailed event when parsing the .config file. So you would not necessarily see it unless you had made a change to the .config file and there was a syntax problem with it.

  5. Robert says:

    Ah, I get it. Thought I was missing the <SuperDuperErrorDebugging="true"/>  setting… :-)

  6. Ramiro says:

    It’s very frustrating having to make this change on every server :(

    Even copying files or making a script…

  7. Ramiro says:

    Is this manual solution better than configuring proxy settings using proxycfg.exe? What’s the best practice?

    We are testing this in a lab environment, but we’d like to apply this in production as well.

    Thanks!

  8. EdBeck says:

    Ramiro,

    The changes only need to be made on non-Internet connected Exchange servers.  I’d like to note though that even though the timeout issue that this solution resolves is most often seen with non-Internet connected Exchange Servers, there have been reports of this problem occurring on Exchange servers which are connected to the Internet; this is attributed to network latency.

    This method is the best practice that we’re recommending.

    I’d like to take this opportunity to point out Guillaume Bordier’s TechNet blog in which he has created a script to automate the .config file change.  His script has not been tested by the Exchange Team, but you can read his post, and get the script at http://blogs.technet.com/gbordier/archive/2008/07/11/exchange-server-2007-rollups-nighmares.aspx.

  9. Ian Addis says:

    We found this problem on an internet connected server where the services failed to load automatically or manually.  The problem is that although the exchange services are published via ISA Server, the web proxy config was done via a manual setting in IE.  As a result, the Framework had no rights to access the internet.

    Even though I fixed this by publishing WPAD to this server, some services still failed to load automatically, presumably because of some delay in achieving an Internet connection.  The services would, however, start manually.

    I made the change listed here and it now works as expected.  I wasted a good 4 hours on it though, not a good advert for Exchange 2007!

    Ian Addis

    Dorset Software

  10. Matt says:

    For the sake of others easily finding this magical solution, can you maybe post the error messages for all the services that fail, so that other suckers like me can find it in less than 1 day, as this is not only a pain to implement but also a pain to find…

    i.e.

    ----------------------------------------------------------------

    Event Type: Error

    Event Source: Service Control Manager

    Event Category: None

    Event ID: 7009

    Date: 8/6/2008

    Time: 2:42:20 PM

    User: N/A

    Computer: EX1

    Description:

    Timeout (30000 milliseconds) waiting for the Microsoft Exchange Mailbox Assistants service to connect.

    For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.

    ----------------------------------------------------------------

    Event Type: Error

    Event Source: Service Control Manager

    Event Category: None

    Event ID: 7000

    Date: 8/6/2008

    Time: 2:42:20 PM

    User: N/A

    Computer: EX1

    Description:

    The Microsoft Exchange Mailbox Assistants service failed to start due to the following error:

    The service did not respond to the start or control request in a timely fashion.

    For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.

  11. Ed Beck says:

    Good point Matt,

    I just signed off on the updated KB article, though it may be a few more days before it’s public.  Here’s the Symptoms section of the new KB.

    By the way, we’re updating the existing KB (http://support.microsoft.com/kb/944752/) not publishing a new one.

    SYMPTOMS
    After you install an update rollup for Microsoft Exchange Server 2007, the Exchange 2007 managed code services may not start. Additionally, the following events are logged in the System log:

    Event Type: Error
    Event Source: Service Control Manager
    Event ID: 7000
    Description: The Microsoft Exchange EdgeSync service failed to start due to the following error:
    The service did not respond to the start or control request in a timely fashion.

    Event Type: Information
    Event Source: Microsoft Exchange Server
    Event ID: 5001
    Description: Bucket 77004151, bucket table 5, EventType e12, P1 c-rtl-amd64, P2 08.00.0733.000, P3 msexchangetransport, P4 unknown, P5 unknown, P6 s.serviceprocess.timeoutexception, P7 0, P8 08.00.0733.000, P9 NIL, P10 NIL.

    Event Type: Error
    Event Source: Service Control Manager
    Event ID: 7000
    Description: The Microsoft Exchange Transport Log Search service failed to start due to the following error:
    The service did not respond to the start or control request in a timely fashion.

    Event Type: Error
    Event Source: Service Control Manager
    Event ID: 7009
    Description: Timeout (30000 milliseconds) waiting for the Microsoft Exchange Transport Log Search service to connect.

    The following events are logged in the Application log:

    Event Type: Error
    Event Source: MSExchange Common
    Event Category: General
    Event ID: 4999
    Description:
    Watson report about to be sent to dw20.exe for process id: 1448, with parameters: E12, c-RTL-AMD64, 08.00.0733.000, MSExchangeTransport, unknown, unknown, S.ServiceProcess.TimeoutException, 0, 08.00.0733.000

    Event Type: Error
    Event Source: Microsoft Exchange Server
    Event ID: 5000
    Description:
    EventType e12, P1 c-rtl-amd64, P2 08.00.0733.000, P3 msexchangetransport, P4 unknown, P5 unknown, P6 s.serviceprocess.timeoutexception, P7 0, P8 08.00.0733.000, P9 NIL, P10 NIL.

    Note Depending on the Exchange 2007 server role, the events may display time-outs for other Exchange Server services.

  12. justin says:

    Wow I wish Premier Support Services would have known about this.  We spent over 4 hours on the phone with them. It probably took 3 hours before they found KB944752.  The tech chose method #3.

    As far as using the proxycfg.exe method: proxycfg.exe is not available in Win2K8.  Win2K8 uses "netsh winhttp set proxy <proxyname>" instead, which we used to test the fix on one of our servers.   PSS is supposed to report back to us which method is best.

    IMHO this issue is silly and a final solution should get pushed in an update soon.  It is not unreasonable to expect Exchange servers to be isolated from the Internet.  It is just good practice to only enable access to what is really needed.    

  13. Oscar Soto Casali MVP DS says:

    Exchange 2007 SP1 Rollup 3 is also affected by this issue.

    I’ve corrected it by increasing the service start timeout as explained in

    http://support.microsoft.com/kb/944752 method 3

    thanks

  14. EdBeck says:

    Justin,

    Thanks for your feedback on your support experience.  I just sent mail to a couple of our support distribution lists (very broad distribution lists) with your feedback and pointing to this post.

    We are planning to have a permanent solution for this, yes, as a manual solution is, well… not very cool. We’ll definitely share the details as plans solidify.

    One more update for now, I’ve modified KB 944752 to include only this method as the recommended workaround.  We’re pushing to make it public as soon as we can, probably early next week.

    Thanks again,

    Ed

  15. GoneCrazy says:

    Ed, it would be nice if this article matched the solution described in 944752.  Kinda makes you wonder which advice to follow.

    PS.  Trying very hard not to say something nasty about this.

  16. Ed Beck says:

    Hello GoneCrazy,

    I’ve been updating 944752 and have been pushing it through the KB process. I signed off on it yesterday and expect it to be public early next week.

    Please see the “Workarounds” section of the original blog post for more information on what the new version of the KB contains.  

  17. PEKMEZ David says:

    Hi,

    Just had the same error on the two nodes of an NLB cluster,

    Exchange 2007 on Windows 2008 (2 SCC and 2 HUB/CAS), update from Exchange 2007 SP1 to SP1 Rollup 3.

    I just updated the .config files and everything worked perfectly :)

    Thanks !

  18. John Colton says:

    I encountered this same issue after applying the "Update Rollup 3 for Exchange Server 2007 Service Pack 1 (KB949870)" to a new Exchange 2007 CAS role server.  However, I was able to resolve this issue simply by adding the URL "http://crl.microsoft.com/" to the Trusted Sites zone on that server.  Could the fix be that simple?

  19. Kevin Robson says:

    I have an issue with the Microsoft Exchange Transport Service failing to start when configured to Log On as Network Service account. It starts fine using the Local System account? The Exchange server is in a lab and has Internet connectivity – I can browse to  http://crl.microsoft.com/ from the server. Initially, I had issues with getting the majority of Exchange Services to start but resolved this by modifying/creating the .config files.

    Does anyone know what issues running the Exchange Transport Service under the Local System account might cause later down the line? I’m happy(ish) to do this in the lab but wouldn’t move into production until I was confident.

  20. Sameer Patel says:

    Kevin

    Can you run a couple of utilities from live.sysinternals.com (Filemon & Regmon) to see if the Network Service account has appropriate rights on file shares and the registry? In the past we have seen this as one of the common causes where Local System would work and the Network Service account would fail.

  21. Nishanth says:

    In our case, we added an entry for crl.microsoft.com pointing to 127.0.0.1 in the hosts file and things were back in business again.

  22. Dazzle says:

    I tried the script first with no luck. I then manually edited the config files of all the service executables. I got some of the services running, but then I had to create new config files for the other Exchange services.

    All appears to be ok now.
