The failure code on the certificate was 0x800B010A (A certificate chain could not be built to a trusted root authority.)

Last week I was helping a customer with some OpsMgr certificate issues with their monitoring Agents in a non-trusted domain. More info on Monitoring an Agent in a non-trusted domain can be found here: https://blogs.technet.com/smsandmom/archive/2008/09/10/opsmgr-2007-monitoring-an-agent-in-a-non-trusted-domain.aspx

These were the events in the Operations Manager event log:

Event Type:       Warning
Event Source:   OpsMgr Connector
Event Category:               None
Event ID: 20067
Date:                    6/17/2009
Time:                    3:33:31 PM
User:                    N/A
Computer:         computername
Description:
A device at IP 192.168.1.1:5723 attempted to connect but the certificate presented by the device was invalid.  The connection from the device has been rejected.  The failure code on the certificate was 0x800B010A (A certificate chain could not be built to a trusted root authority.).

For more information, see Help and Support Center at https://go.microsoft.com/fwlink/events.asp.

Event Type:       Warning
Event Source:   OpsMgr Connector
Event Category:               None
Event ID: 21002
Date:                    6/17/2009
Time:                    3:33:31 PM
User:                    N/A
Computer:         computername
Description:
The OpsMgr Connector could not accept a connection from 192.168.1.1:5723 because mutual authentication failed.

For more information, see Help and Support Center at https://go.microsoft.com/fwlink/events.asp.

Event Type:       Error
Event Source:   OpsMgr Connector
Event Category:               None
Event ID: 20070
Date:                    6/17/2009
Time:                    3:33:31 PM
User:                    N/A
Computer:         computername
Description:
The OpsMgr Connector connected to MS01.support.local, but the connection was closed immediately after authentication occurred.  The most likely cause of this error is that the agent is not authorized to communicate with the server, or the server has not received configuration.  Check the event log on the server for the presence of 20000 events, indicating that agents which are not approved are attempting to connect.

For more information, see Help and Support Center at https://go.microsoft.com/fwlink/events.asp.

Event Type:       Error
Event Source:   OpsMgr Connector
Event Category:               None
Event ID: 21016
Date:                    6/17/2009
Time:                    3:33:33 PM
User:                    N/A
Computer:         computername
Description:
OpsMgr was unable to set up a communications channel to MS01.support.local and there are no failover hosts.  Communication will resume when MS01.support.local is available and communication from this computer is allowed.

For more information, see Help and Support Center at https://go.microsoft.com/fwlink/events.asp

So it was clear that the agent could not communicate with the Management Server in the non-trusted domain using certificates, and we needed to check whether the certificates were OK. In this case it turned out that Certutil was our friend ;-). Certutil.exe is a command-line program that is installed as part of Certificate Services in the Windows Server 2003 family (and higher). Here are the steps we took to verify that there was a certificate issue, and how we solved it.

Issue:
Agents that need a certificate to communicate with the Management Server generate “A certificate chain could not be built to a trusted root authority” errors (event IDs 20067, 20070, 21016) in the Operations Manager event log.

Reason:
Wrong proxy settings, so the (Intermediate) Root CA could not be contacted.

See the following lines in the output of certutil -urlfetch -verify <cert.cer>:

Failed "AIA" Time: 0
Error retrieving URL: The server name or address could not be resolved 0x80072ee7 (WIN32: 12007)
https://cert.domain.local/aia/SUPPORT.WEB%20ROOT%20CA.crt

For the complete certutil output, see the attachment certutil_output.txt.
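The two numbers in that error line, 0x80072ee7 and WIN32: 12007, are the same error written two ways. As a small illustration (my own sketch, not part of the certutil tooling), the Python below splits an HRESULT into its standard fields. The function name decode_hresult and the FACILITY_NAMES table are just names picked for this example.

```python
def decode_hresult(hresult: int) -> dict:
    """Split a 32-bit HRESULT into its severity, facility, and code fields."""
    return {
        "failure": bool((hresult >> 31) & 1),  # top bit set means failure
        "facility": (hresult >> 16) & 0x7FF,   # 11-bit facility field
        "code": hresult & 0xFFFF,              # low 16 bits: the error code
    }


# Only the two facilities that show up in this post.
FACILITY_NAMES = {7: "FACILITY_WIN32", 11: "FACILITY_CERT"}

for value in (0x80072EE7, 0x800B010A):
    fields = decode_hresult(value)
    name = FACILITY_NAMES.get(fields["facility"], "unknown")
    print(f"0x{value:08X}: facility {fields['facility']} ({name}), code {fields['code']}")

# 0x80072EE7 decodes to facility 7 (FACILITY_WIN32) with code 12007,
# which is the "WIN32: 12007" shown by certutil.
# 0x800B010A decodes to facility 11 (FACILITY_CERT) with code 266.
```

So the certutil failure is a plain Win32 networking error (name resolution), while the code in event 20067 comes from the certificate facility.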

Steps to solve issue:

  1. Check for event IDs (20067, 20070, 21016)
  2. Export the certificate from the Local\Computer\Personal\Certificate folder
    Save as DER encoded binary X.509 (.CER) file.
  3. Run certutil -urlfetch -verify <cert.cer> on the .cer file exported in step 2.
  4. Search the certutil output for errors, like “Error retrieving URL: The server name or address could not be resolved 0x80072ee7 (WIN32: 12007)”
  5. Open Internet Explorer and paste the URL that cannot be resolved. If you cannot download the *.crt file, look at your proxy settings. These should be empty or correct.
  6. Correct proxy settings.
  7. [not sure if step 7 is really needed] Remove the certificates from the Local\Computer\Personal\Certificate folder and the Local\Computer\Operations Manager\Certificate folder
  8. Import the certificate again into the Local\Computer\Personal\Certificate folder
    You can run certutil -urlfetch -verify <cert.cer> again to see if there are still any errors.
  9. Run MOMCertImport <nameofcertexport>.pfx again.
  10. Check the event log for the restart of the HealthService (it is restarted after running MOMCertImport) and check that everything is OK now ;-)
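Step 4 gets tedious when the certutil output is long, so here is a rough Python sketch that scans the output for "Error retrieving URL" lines. It assumes the layout shown in the excerpt earlier in this post (error line first, failing URL on the next line); other certutil versions may format this differently. The names find_url_errors and ERROR_LINE are mine, not part of any tool.

```python
import re

# Matches the error line format shown in the excerpt in this post, e.g.
# "Error retrieving URL: <message> 0x80072ee7 (WIN32: 12007)"
ERROR_LINE = re.compile(
    r"^Error retrieving URL: (?P<message>.*?) (?P<hresult>0x[0-9a-fA-F]{8})"
)


def find_url_errors(output: str) -> list:
    """Return one dict per 'Error retrieving URL' line found in the output."""
    lines = [line.strip() for line in output.splitlines()]
    errors = []
    for index, line in enumerate(lines):
        match = ERROR_LINE.match(line)
        if match:
            # Assumption: the URL that failed is printed on the following line.
            url = lines[index + 1] if index + 1 < len(lines) else ""
            errors.append({
                "message": match.group("message"),
                "hresult": match.group("hresult"),
                "url": url,
            })
    return errors


# The three lines quoted from the certutil output in this post.
sample = '''Failed "AIA" Time: 0
Error retrieving URL: The server name or address could not be resolved 0x80072ee7 (WIN32: 12007)
https://cert.domain.local/aia/SUPPORT.WEB%20ROOT%20CA.crt
'''

for error in find_url_errors(sample):
    print(error["hresult"], error["url"])
```

To use it on a real agent, redirect the output of step 3 to a text file and pass the file contents to find_url_errors. Each URL it reports is a candidate for the browser check in step 5.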

certutil_output.txt