Agent discovery and push troubleshooting in OpsMgr 2007


OpsMgr 2007 Agent troubleshooting:

There is a GREAT graphical display of the Agent discovery and push process, taken from:

http://blogs.technet.com/momteam/archive/2007/12/10/how-does-computer-discovery-work-in-opsmgr-2007.aspx

 

Agent Prerequisites:

  1. Supported Operating System Version (see below)
  2. Windows Installer 3.1
  3. MSXML 6 Parser

Agent push requirements (including firewall ports):

  • The account being used to push the agent must have local admin rights on the targeted agent machine.
  • The following ports must be open:
    • RPC endpoint mapper: port 135, TCP/UDP
    • *RPC/DCOM high ports (2000/2003 OS): ports 1024-5000, TCP/UDP
    • *RPC/DCOM high ports (2008 OS): ports 49152-65535, TCP/UDP
    • NetBIOS name service: port 137, TCP/UDP
    • NetBIOS session service: port 139, TCP/UDP
    • SMB over IP: port 445, TCP
    • MOM Channel: port 5723, TCP
  • The following services must be set:
    • Netlogon: startup type Auto, started and running
    • **Remote Registry: startup type Auto, started and running
    • Windows Installer: startup type Manual, started and running
    • Automatic Updates: startup type Auto, started and running

 

*The RPC/DCOM High ports are required for RPC communications.  This is generally why we don’t recommend/support agent push in a heavily firewalled environment, because opening these port ranges creates a potential security issue that negates the firewall boundary.  For more information:

http://support.microsoft.com/kb/154596/

http://support.microsoft.com/default.aspx?scid=kb;EN-US;929851

Important: Don’t change the RPC high ports without having a deep understanding of your environment and the potential impact!

 

**Not required for agent push, but required for some management packs.

  • The management server must be able to connect to the remote agent machine via WMI and execute the WMI query "Select * from Win32_OperatingSystem".  WMI must be running, healthy, and allowing remote connections.
  • The management server must be able to connect to the targeted agent machine via \\servername\c$
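The TCP side of the port requirements above can be spot-checked from the management server before attempting a push. The following is only a minimal sketch in Python: the function name `check_tcp_ports` and the `PUSH_TCP_PORTS` table are hypothetical names made up for this illustration, and the sketch does not probe the dynamic RPC ranges, the UDP ports, WMI, or the c$ share.

```python
import socket

# TCP ports taken from the push requirements listed above.
# The dynamic RPC/DCOM ranges and the UDP ports are not probed here.
PUSH_TCP_PORTS = {
    135: "RPC endpoint mapper",
    139: "NetBIOS session service",
    445: "SMB over IP",
}


def check_tcp_ports(host, ports, timeout=3.0):
    """Return {port: True/False}: True if a TCP connection to host:port succeeded."""
    results = {}
    for port in ports:
        try:
            # create_connection resolves the name and performs the TCP handshake.
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            # Covers refused connections, timeouts, and name resolution failures.
            results[port] = False
    return results
```

Run from the management server, a call such as `check_tcp_ports("targetserver", PUSH_TCP_PORTS)` (with a placeholder host name) returns a dictionary keyed by port number. A `False` entry only says the TCP connection did not succeed from where the script ran; it does not distinguish a firewall block from a stopped service.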

Logging:

  • When an agent push from a management server fails, a log is written to \Program Files\System Center OpsMgr\AgentManagement\AgentLogs\ on the Management Server.
  • When using agent push, the log on the agent is not enabled by default (like MOM 2005).  If you manually install an agent using the MSI, it will place a verbose logfile at C:\documents and settings\%user%\local settings\temp\momagent.log

To troubleshoot agent push with a verbose log – you need to enable verbose MSI logging:    http://support.microsoft.com/kb/314852/en-us
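As a rough illustration of what enabling verbose MSI logging involves, the sketch below generates the text of a .reg file for the Windows Installer logging policy described in KB 314852 (the `Logging` string value set to `voicewarmupx`). The function and constant names are hypothetical, chosen only for this example; check the key and value against the KB article before importing anything on a production server.

```python
# Key and value as documented in KB 314852 for verbose Windows Installer logging.
MSI_POLICY_KEY = r"HKEY_LOCAL_MACHINE\SOFTWARE\Policies\Microsoft\Windows\Installer"
VERBOSE_LOGGING_FLAGS = "voicewarmupx"


def render_msi_logging_reg(flags=VERBOSE_LOGGING_FLAGS):
    """Return the text of a .reg file that sets the Installer 'Logging' policy value."""
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{MSI_POLICY_KEY}]",
        f'"Logging"="{flags}"',
        "",
    ]
    # .reg files conventionally use Windows (CRLF) line endings.
    return "\r\n".join(lines)
```

Writing the returned text to a file and importing it on the target machine turns the policy on. Since verbose logging applies to every MSI installation on that machine, it is usually worth removing the value again once the push has been diagnosed.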

Common Agent Push errors:

Below are some common push failures.   Also see my troubleshooting table for more details: Console based Agent Deployment Troubleshooting table

The MOM Server detected that the following services on computer "(null);NetLogon" are not running. These services are required for push agent installation. To complete this operation, either start the required services on the computer or install the MOM agent manually by using MOMAgent.msi located on the product CD. Operation: Agent Install

Remote Computer Name: dc1.opsmgr.net

Install account: OPSMGR\localadmin

Error Code: C000296E

Error Description: Unknown error 0xC000296E

Solution: Netlogon service is not running.  It must be set to auto/started

The MOM Server detected that the Windows Installer service (MSIServer) is disabled on computer "dc1.opsmgr.net". This service is required for push agent installation. To complete this operation on the computer, either set the MSIServer startup type to "Manual" or "Automatic", or install the MOM agent manually by using MOMAgent.msi located on the product CD.

Operation: Agent Install

Install account: OPSMGR\localadmin

Error Code: C0002976

Error Description: Unknown error 0xC0002976

Solution:  Windows Installer service is not running or set to disabled – set this to manual or auto and start it.

The Agent Management Operation Agent Install failed for remote computer dc1.opsmgr.net.

Install account: OPSMGR\localadmin

Error Code: 80070643

Error Description: Fatal error during installation.

Microsoft Installer Error Description:

For more information, see Windows Installer log file "C:\Program Files\System Center Operations Manager 2007\AgentManagement\AgentLogs\DC1AgentInstall.LOG

C:\Program Files\System Center Operations Manager 2007\AgentManagement\AgentLogs\DC1MOMAgentMgmt.log" on the Management Server.

Solution:  Enable the Automatic Updates service, install the agent, then disable the Automatic Updates service again if desired.
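The three failures above can be collected into a small lookup that pulls the code out of a push failure message and returns the fix described in this post. This is only a sketch: the names `KNOWN_PUSH_ERRORS`, `extract_error_code`, and `suggest_fix` are invented for the example, and the table contains only the codes covered here. Note that 80070643 is a generic "fatal error during installation" code, so the Automatic Updates fix is just the cause described above, not the only possible one.

```python
import re

# Error codes and fixes as described in this post; not an exhaustive list.
KNOWN_PUSH_ERRORS = {
    "C000296E": "Netlogon service is not running; set it to auto and start it.",
    "C0002976": "Windows Installer service is stopped or disabled; "
                "set it to manual or auto and start it.",
    "80070643": "Enable the Automatic Updates service, install the agent, "
                "then disable the service again if desired.",
}

# Matches lines such as "Error Code: C000296E" (an optional 0x prefix is tolerated).
_ERROR_CODE_RE = re.compile(r"Error Code:\s*(?:0x)?([0-9A-Fa-f]{8})")


def extract_error_code(message):
    """Return the 8-digit hex code from an 'Error Code:' line, or None if absent."""
    match = _ERROR_CODE_RE.search(message)
    return match.group(1).upper() if match else None


def suggest_fix(message):
    """Return the fix from this post for the error in message, or None if unknown."""
    code = extract_error_code(message)
    return KNOWN_PUSH_ERRORS.get(code)
```

Passing the full text of a failed push task to `suggest_fix` returns one of the three solutions above, or `None` for codes this post does not cover, in which case the AgentLogs folder on the Management Server is the next place to look.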

 

 

Additional Info:

There are sub-components to the OpsMgr Agent installer service:

1.       The service is a standard NT service. It also handles registration and un-registration of the DCOM object that contains the logic for handling MSI/MSP.

2.       The DCOM object takes directives from the module on the OpsMgr server and asynchronously installs, uninstalls, and updates OpsMgr. It also returns the list of currently installed QFEs and verifies prerequisites, such as channel connectivity, before completing the agent install. It handles multi-homing of the agent and reads agent parameters such as version, install dir, etc.

3.       RPC is used to establish a connection to the target machine, and SMB is used to copy the source files over.

4.       WMI is used to check prerequisites.

Agents Inside a Trust Boundary

Discovery:
Discovery requires that the TCP 135 (RPC), RPC range, and TCP 445 (SMB) ports remain open and that the SMB service is enabled.

Installation:
After a target device has been discovered, an agent can be deployed to it. Agent installation requires:

  • Opening Remote procedure call (RPC) ports beginning with endpoint mapper TCP 135 and the Server Message Block (SMB) port TCP/UDP 445.
  • Enabling the File and Printer Sharing for Microsoft Networks and the Client for Microsoft Networks services (this ensures that the SMB port is active).
  • If Windows Firewall is enabled, the Group Policy settings "Allow remote administration exception" and "Allow file and printer sharing exception" must be set to allow unsolicited incoming messages from the IP addresses and subnets of the primary and secondary Management Servers for the agent. For more information, see How to Configure the Windows Firewall to Enable Management of Windows-Based Computers from the Operations Manager 2007 Operations Console.
  • An account that has local administrator rights on the target computer.
  • Windows Installer 3.1. To install, see article 893803 in the Microsoft Knowledge Base (http://go.microsoft.com/fwlink/?LinkId=86322).
  • Microsoft Core XML Services (MSXML) 6, available on the Operations Manager product installation media in the \msxml subdirectory.

Ongoing Management:
Ongoing management of an agent requires that the TCP 135 (RPC), RPC range, and TCP 445 (SMB) ports remain open and that the SMB service remains enabled.

Supported Operating systems for an Agent:

See:  Operations Manager 2007 R2 Supported Configurations

 

Comments (30)

  1. Kevin Holman says:

    So…. no ports are required to perform a manual agent install.  You would assume that you have access to the desktop of that machine – and therefore the firewall is irrelevant.

    Now – for the manually installed agent to COMMUNICATE through the firewall, that is a different story.  Only tcp_5723 is required for agent communication.  This communication channel is initiated FROM the Agent, TO the Management Server.  Once the channel is opened from the agent – the communication is bi-directional.

    The only additional ports – are ones for Active directory, if you are using AD/Kerberos authentication.  This is assumed working if the machine is a member of a domain, and their authenticating DC is on the other side of the firewall.  If using certs, this is irrelevant.

  2. Anonymous says:

    Tried pushing out to the server again this morning and the push was successful, yet I can find no change in the environment variables.

  3. Kevin Holman says:

    Not sure what you mean by live. Yes, it is still applicable to SCOM 2012.

    As to your issue – if the agent reports that it didn’t find policy in AD, it didn’t find policy in AD. 🙂 Check the SCP that is supposed to be created when enabling AD integration. There is a tool to create the container. The LDAP rule only populates the groups. There is good documentation on this and lots of blog articles as well on configuring AD integration.

  4. Phil Marcum says:

    Hey sir thanks for taking the time to respond, again. Yes the dev_scom_HSvcSCP_SG container along with Domain Local security groups and containers for each of the management servers were created during the ADI setup. In terms of reference materials I have SCOM 2012 Unleashed 2nd edition and just about all of the known articles in terms of configuring ADI per a documented procedure. So I’m thinking I’ve overlooked some minor detail in terms of getting this to work.

    In viewing a reference site I see mention of a rule called AD rule for Domain. I have version 7.1.10226.0 of the Default management pack installed and when performing a search for this rule I see an AD rule for Domain: mydomain.com, ManagementServer: domainms but I don’t see the rule displaying polling info. Again not sure what I’m missing.

  5. Kevin Holman says:

    the account used is the management server action account – by default – this is in the UI.  Unless – you chose an optional account and entered that in – then it will use those credentials one time, and discard them.

  6. Anonymous says:

    I uninstalled an agent off of  a server 2K and when I re-installed an agent I get Error Code: 80070643 Error Description: Fatal error during installation.  Auto Updates is set to auto and is running.  What else could throw the 80070643 error?  

  7. Kevin Holman says:

    All communications are *initially originated* FROM the agent TO the management server, however, once the communication channel is open from the agent – the communication is bi-directional.  Therefore – it depends on how your chosen firewall works – as to whether you need to open communication in both directions, or only from agent to MS.  When in doubt, just open both directions for this single port.

  8. Phil Marcum says:

    Hey is this still a live topic? Is there a similar article for SCOM 2012? Hoping I can get some direction as I’ve got an agent deployment issue and it’s been a week today with no resolution from the MS SCOM forums.

    My thread:
    http://social.technet.microsoft.com/Forums/en-US/8d0b6317-54c6-4c6c-bc0a-52b6e02a63b7/scom-2012-r2-agent-deployment-uninstall-old-and-install-new?forum=operationsmanagerdeployment

    Basically I’m trying to deploy the 2012 R2 agent via the following startup script:

    msiexec /i \\server.mydomain.com\opsmgragent\%Processor_Architecture%\MOMAgent.msi /qn /l*v c:\scom2012r2mmainstall.log USE_SETTINGS_FROM_AD=1 USE_MANUALLY_SPECIFIED_SETTINGS=0 ACTIONS_USE_COMPUTER_ACCOUNT=0 ACTIONSUSER=svc_dscom ACTIONSDOMAIN=mydomain ACTIONSPASSWORD=mypassword! SET_ACTIONS_ACCOUNT=1 AcceptEndUserLicenseAgreement=1

    The script is deployed as a group policy which is assigned to a security group housing the servers for that particular environment. So there’s a DEV group and a PROD group.

    The policy has been created and filtered to the security group. On a Windows 2012 R2 test server I see that the Microsoft Monitoring Agent has been successfully installed and its icon appears in the Control Panel. When clicked I don’t get any management info.

    The following entries appear in the logs on this server:

    Event ID: 2011 The Health Service did not find any policy in Active Directory
    Event ID: 2003 No management groups were started. This may either be because no management groups are currently configured or a configured management group failed to start. The Health Service will wait for policy from Active Directory configuring a management group to run.

    The HealthService is Running and I’ve configured AD Integration in the SCOM console using the following query: (&(objectCategory=group)(name=DSCOM_ADI))

    In checking the server I come across the C:\Program Files\Microsoft Monitoring Agent folder but no log. What am I overlooking?

    Any responses appreciated.

  9. Kevin Holman says:

    What’s in the OpsMgr event log on the DC?  Try a manual agent install and see if it will complete.

    Turn up MSI logging and then post or email the logfile.

  10. Richard says:

    Kevin, as usual, invaluable information, thanks so much.

    -The learning curve continues.

  11. Tim says:

    I can do a discovery and install of an agent on my domain controller but the server stays in "Pending Management" and doesn’t go anywhere.  If I check on the machine where I pushed the agent the files are present in c:program…system center op…

    What is going on?  I have uninstalled antivirus on both machines and I can’t figure out why the agent isn’t viewable anywhere in the System Center Console…Please help

  12. RichardS says:

    What direction do the firewall rules need to be? Are they all uni-directional from SCOM to app servers?

  13. parag waghmare says:

    Thank you very much. Your blog is very helpful in my current working environment.

  14. John Bradshaw says:

    Thx Kevin.

    What to do when the fatal code still can’t be overcome, after following the fix scenario above?

    Error Code: 80070643

    Error Description: Fatal error during installation.

    Thx,

    John Bradshaw

  15. Philip says:

    Very helpful.  How do you determine what account is being used to do the push?

  16. Dominique says:

    any similar information for Unix Agent?

  17. Jon says:

    Hi Kevin,

    Your blogs have been invaluable to me setting up OpsMgr 2007.  Thanks for all the great info.  

    I’m trying to find a comprehensive list of ports required to perform a manual agent install through a firewall, as well as ports required for ongoing monitoring.  All machines are part of the domain, just separated by the firewall.  Can you help with this? My security admins will not allow me to open the wide ranges of RPC ports required for push installs.  

  18. Neel says:

    we provided admin access to SCOM even though we are not able to see reporting console in SCOM after installing the same.

  19. Johan says:

    I have a scenario where for workgroup servers, we install the certs and agents manually but when the server appears in the console, it shows up as "Not Monitored" even after several days.

    The event logs do not show any cert issues, though.

    Any ideas ?

  20. Kyle says:

    Hey Kevin,

    Thank you for writing this article, its helped me quite a bit. I do however keep running into the following issue while pushing agents via agent push script. My script will discover all of the machines i am trying to push to, and will attempt to install the agent but i get the following two errors.

    The Operations Manager Server failed to open service control manager on computer [FQDN of Server]. Therefore, the Server cannot complete configuration of agent on the computer.

    Operation: Agent Install

    Install account: [My install account]

    Error Code: 800706BA

    Error Description: The RPC server is unavailable.

    [this basically means the server is down, or unreachable at the moment. Normal in this environment as servers are shipped from and to sites]

    and

    The Operations Manager Server cannot process the install/uninstall request for computer [FQDN of Server] due to failure of operating system version verification.

    Operation: Agent Install

    Install account: [my install account]

    Error Code: 80070005

    Error Description: Access is denied.

    This is the error that is bothering me. I am able to use that same install account and install the agent on these machines, after logging directly into the server as the install account. I am also able to push agents from the RMS server using that install account. I checked to see if there were logs under agent logs and i cannot really read them very well, i also couldn't find a log for each of the servers that get this error.

    The account is a domain administrator

    The antivirus is turned off when installing the agent

    The account is a local administrator on the machines

    The server service is started on the machines that are domain controllers

    The windows firewall is turned off

    Im lost on what else i could possibly check, any suggestions would be very helpful.

    Thank you very much for your time!

  21. Mariusz says:

    Good info!  

    Thanks again.  Got an issue with pushed agent install.  It was placed on Pending Management list.  Consecutive re-installs and troubleshooting on the client side did not give any results.  

    In my case MOM Channel Port number: 5723 stopped responding on MS.  I saw only connections from RMS and nothing from agents.  The telnet to MS host on port 5723 then 'netstat -an | findstr 5723' on MS host themselves proved it.

    Restarting the System Center Management (HealthService) service on the MS host caused the MS to re-connect with monitored agents, and the agent install succeeded.

  22. Anonymous says:

    A small but useful link collection to use for configuration and upgrading System Center Operations

  23. Anonymous says:

    Here are some more links from my private collections. These links are very useful to administrate, configure

  24. Anonymous says:

    These are the top Microsoft Support solutions for the most common issues experienced when using System

  25. Anonymous says:

    Top Microsoft Support solutions for the most common issues experienced when you use System Center 2012

  26. TonyO says:

    Hi, I’m very new to SCOM and am getting the following error when trying to discover.
    The Operations Manager Server could not execute WMI Query "Select * from Win32_OperatingSystem" on computer EDAPP100.dev.construction.enet.

    Operation: Agent Install

    Install account: ENETCONSTRUCTServiceMOM_MSA_C

    Error Code: 800706BA

    Error Description: The RPC server is unavailable.
