Monitoring UNIX/Linux with OpsMgr 2016




Comments (15)

  1. ronald van den berg says:

    Hi Kevin, I have some additions to this article.

    First, the new option ‘Use Run-As Credentials’ sounds fantastic, but it isn’t. It only works if you associate the UNIX/Linux profile the way you describe. If you have more than one set of UNIX/Linux accounts, you cannot associate all accounts with all targeted objects, just one. So in my case I associate the accounts with groups, but of course a server is not yet a member of the group when there is no agent on it. And using the same accounts on all domains is not allowed for us.

    Second, the commands for installing, upgrading and uninstalling have changed compared to older versions.
    If you try to uninstall an agent that has not been upgraded to 1.6 (or higher?), it tries to run a non-existent uninstall command.
    So you first need to upgrade the agent before you can uninstall it.

    The upgrade, installation and uninstall require new commands to be added to sudoers; check your /var/log/secure for that. Maybe it hasn’t changed for all distributions, but I tested with CentOS, and that requires these commands to be added since it no longer uses rpm in the command lines:

    scomuser ALL=(root) NOPASSWD: /bin/sh -c sh /tmp/scx-scomuser/scx-[0-9].[0-9].[0-9]-[0-9][0-9][0-9].universalr.[0-9].x[0-9][0-9].sh --upgrade --force; EC=$?; cd /tmp; rm -rf /tmp/scx-scomuser; exit $EC
    scomuser ALL=(root) NOPASSWD: /bin/sh -c sh /tmp/scx-scomuser/scx-[0-9].[0-9].[0-9]-[0-9][0-9][0-9].universalr.1.x[0-9][0-9].sh --install; EC=$?; cd /tmp; rm -rf /tmp/scx-scomuser; exit $EC
    scomuser ALL=(root) NOPASSWD: /bin/sh -c /opt/microsoft/scx/bin/uninstall
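    Because the entries above pin the bundle file name with single-digit character classes, a new agent version with a differently shaped version string would silently stop matching. A minimal sketch for checking a file name against that glob before a push, assuming Python's `fnmatch` is a close enough approximation of sudo's own matching (it is not guaranteed to be identical); the bundle names below are illustrative, not taken from a real download:

```python
from fnmatch import fnmatchcase

# File-name part of the glob used in the sudoers entries above.
BUNDLE_GLOB = "scx-[0-9].[0-9].[0-9]-[0-9][0-9][0-9].universalr.[0-9].x[0-9][0-9].sh"


def bundle_matches(filename: str, pattern: str = BUNDLE_GLOB) -> bool:
    """Return True if the bundle file name fits the sudoers-style glob."""
    return fnmatchcase(filename, pattern)


if __name__ == "__main__":
    # Hypothetical bundle names, used only to exercise the pattern.
    for name in (
        "scx-1.6.2-338.universalr.1.x64.sh",   # single-digit version parts
        "scx-1.6.10-338.universalr.1.x64.sh",  # two-digit patch level
    ):
        print(name, bundle_matches(name))
```

    If a name you expect to push does not match, the sudoers pattern needs widening before the upgrade or install will be allowed.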

    Gr,
    Ronald

  2. Rich Manship says:

    I’ve searched high and low to find anyone suffering the same issue as me. I’ve followed your guide to the letter, but the three Linux agents I’ve installed (one CentOS, two Ubuntu) only report on the heartbeat and other basic checks that don’t really seem to use information taken from the monitored server itself. I haven’t found an answer to this, or even anyone with the same issue. Any help, suggestions or places to look would be incredible.

    1. Kevin Holman says:

      Have you imported the correct MPs for these operating systems?

  3. Scott Banyas says:

    Is there a distinguishing alert for Linux/UNIX, similar to the Windows ICMP alert (Failed to Connect to Computer), that fires after a Heartbeat Failure?

    It is one thing for the agent to have issues, but a much bigger issue if the computer is not reachable via ping.

    1. Kevin Holman says:

      No, not built in. Microsoft considers HB failures the same as “server down” even though we know that’s not realistic. Many customers institute a ping solution; the problem is that many environments today block ICMP, so this isn’t always a reliable method.
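      Where ICMP is blocked, one common workaround is to test a TCP port instead of pinging. A minimal sketch, assuming the agent's WS-Management listener is on its usual port 1270 and that the port is open from wherever the check runs; the function name and timeout are illustrative, and this is not something SCOM provides out of the box:

```python
import socket


def tcp_reachable(host: str, port: int = 1270, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

      A host that fails the heartbeat but still accepts a connection on port 22 or 1270 is more likely to have an agent problem than to be down.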

  4. Eric says:

    I don’t believe this post is being monitored. I would LOVE to hear of ANYONE who is successfully monitoring Linux systems with SCOM 2016… in particular on SLES 11. Operations Manager is great. SCOM NOT SO!

    1. Kevin Holman says:

      I have customers doing this. What’s the problem?

  5. Javier says:

    Hi Kevin, I hope you can help me. I’m installing the SCOM 2016 agent on SUSE Enterprise 12. I run scx-1.6.2-338.sles.12.x64.sh over PuTTY and everything looks OK, but when I look in /etc/opt/microsoft/scx/ssl/, scx-host-[hostname].pem has not been created; I can only see scx.pem. How can I generate scx-host-[hostname].pem so that the certificate can be signed on the MS?

    Thanks a lot

  6. rob1974 says:

    There’s not much documentation about multihomed UNIX servers, and I don’t really have the time to write a proper blog post, so briefly: if you are upgrading your SCOM 2012 environment to SCOM 2016 side by side, multihoming works fine as well.

    Just discover the UNIX/Linux servers that are already in the old environment. Make sure you have imported the xplat certs (see “Configure the Xplat certificates” in the blog above) from the 2012 resource pool to all servers in the 2016 resource pool.
    The agent doesn’t get upgraded in the process, so this actually works without changing your sudoers.
    However, you do want to upgrade to the latest version, so you need to change sudoers to push the upgrade (also described in Kevin’s post).

  7. Hi Kevin,

    I am trying to accomplish an automated solution for this. Do you have any input? We want to roll out the software using Chef (a manual install, as seen from SCOM), which means we have to find a way to copy and sign the agent certificates before we run the discovery wizard, and that is a nightmare.

    Any thoughts on this?

    Martin

  8. Kittu says:

    Hi Kevin,

    Could you kindly help me with the issue below?

    The customer wants two thresholds, warning and critical, for UNIX file systems, similar to memory and processor (achieved by importing the UNIX Extended MP). How can I do this, given that the logical disk monitors have only one threshold option?

  9. Michał Sacharewicz says:

    There is one small but important error in your guide:

    You have defined a single account for monitoring (scxmon) and then defined it as a single UNIX/Linux RunAs Account instance with sudo elevation enabled. Then you have assigned this single RunAs Account to both the “Action” and “Privileged” RunAs Profiles.

    This won’t work. When SCOM launches any command, that command is tagged with whether to use the “Action” or the “Privileged” RunAs Profile, and if the UNIX/Linux RunAs Account assigned to that profile has sudo elevation enabled, sudo will always be used, regardless of real need.

    This clearly conflicts with a sudoers configuration that narrows the use of sudo by the SCX agent. In simple terms, virtually all of the unprivileged commands will fail to run because sudo is not allowed for them.

    The proper way is to configure the same Linux account as two separate “UNIX/Linux RunAs Account” instances, one with sudo elevation enabled and the other without. Then assign the first one to the Privileged Account RunAs Profile, and the second to the Action Account RunAs Profile.

    This has been tested (and in fact discovered) in a real deployment, and the first visible result of this misconfiguration was that neither Apache nor MySQL installations were discovered and displayed in the corresponding views (“Apache HTTP Servers” and “MySQL Servers” respectively).
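    The interaction described above can be illustrated with a toy model. This is only a sketch of the reasoning, not SCOM's actual logic or API: the class, the function names, the sample commands and the one-entry allow-list are all hypothetical, and the model deliberately ignores whether a command would need root to succeed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunAsAccount:
    user: str
    sudo: bool  # stands in for the "sudo elevation" setting on the account


def runs_ok(command: str, account: RunAsAccount, sudoers_allowed: set) -> bool:
    """Toy rule: an account with sudo enabled always wraps the command in sudo,
    so the command must be allowed by sudoers; without sudo it runs directly."""
    if account.sudo:
        return command in sudoers_allowed
    return True


# Illustrative allow-list; a real sudoers file would hold more entries.
SUDOERS = {"/opt/microsoft/scx/bin/uninstall"}


def simulate(profiles: dict) -> dict:
    """Run one sample workflow per profile and report which commands succeed."""
    workflows = [
        ("Action", "cat /proc/meminfo"),  # hypothetical unprivileged command
        ("Privileged", "/opt/microsoft/scx/bin/uninstall"),
    ]
    return {cmd: runs_ok(cmd, profiles[name], SUDOERS) for name, cmd in workflows}


if __name__ == "__main__":
    one_account = RunAsAccount("scxmon", sudo=True)
    print(simulate({"Action": one_account, "Privileged": one_account}))
    print(simulate({"Action": RunAsAccount("scxmon", sudo=False),
                    "Privileged": RunAsAccount("scxmon", sudo=True)}))
```

    In this model the single sudo-enabled account makes the unprivileged command fail, while splitting it into two account instances lets both workflows run.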

  10. Hi Kevin,

    I have added a test Linux VM that is an Oracle Linux mix (don’t ask). It was discovered fine and is being monitored by SCOM as a Universal Linux agent, but we are not getting any logical/physical disk or network adapter discovery, even after 24 hours.

  11. I’ve got a curious problem that I think is related to SCOM performance rather than to the Linux agent deployment itself, but it concerns the agent deployment process, so I’ll ask here:

    I’ve got an intermittent problem with agent deployment. Both Invoke-SCXDiscovery and Install-SCXAgent fail 95% of the time because of the failure of a single task: “UNIX/Linux Discovery Enumerate Available Agents Task”.

    I’ve reviewed the task and it succeeds occasionally, but in most cases it just runs for several minutes and then dies or times out. Yet the internal PowerShell script takes about a second when run manually; it just lists a few directories and enumerates agent scripts into some XML.

    I assume that the issue must be that my SCOM server hits some PowerShell queue limit and the task script waits forever in that queue. But I lack insight into where to look for such SCOM internals, how to debug it and how to confirm the diagnosis.

    Would you be able to provide me with some help in that regard?

  12. leogeo80 says:

    Hi Kevin,

    Could I configure a monitor for CPU and RAM with two states for the Linux servers?
