In summary, we went over various security concerns when deploying SCOM. Although there are a bunch listed, there are two that I believe could take down an organization in a hurry: poor run as account distribution or a SCOM admin’s account being compromised. Poor run as account distribution could allow for quick lateral movement through Tier 1 as well as possibly exposing all of your data to the attacker (depending on the permissions of said run as account). A compromised SCOM admin account could allow someone to use SCOM as a deployment mechanism to compromise your environment. These are both big deals in my opinion, and when it comes to designing for security, these are the two risks that you need to design around.
As such, I think there’s some wisdom in treating this application as though it’s a Tier 0 asset. It should be protected carefully. If you wanted to strictly follow Microsoft’s model, you may end up putting separate management groups in Tier 0, Tier 1 (and possibly multiple management groups here), Red Forest, etc. While I suspect that there will be some in the cyber community that disagree with me (and if so, I would appreciate the feedback), I personally am not sure that this is worth the effort. For one, it massively over-complicates administration, as you’re managing multiple SCOM environments. You have to have multiple lines of accountability for SCOM alerts. You now have alerts coming from multiple management groups, and will have to tune across multiple management groups as well. You will now deal with multiple sets of reports as well as multiple sets of notifications. You also haven’t really mitigated any risk to your Tier 1 environment in particular, and that is where all your data is stored. In the cyber community, we stress Tier 0 protection, and that’s good, as compromising Tier 0 is by far the easiest way for an attacker to own your environment, but Tier 1 protection is just as critical, as that is where your assets are contained. Remember that in SCOM architecture, the agents should be running as local system, so on their own, they don’t pose a threat no matter where SCOM is located. The risk comes from how you distribute your run as accounts and how you allow Operations Manager to be managed.
With that in mind, my ideal architecture would be to place my management group in the Red Forest. To backtrack a bit, a red forest (RF) is an untrusted domain. If you implement an RF, you will use IPsec to prevent management of your resources in any other way. That means no more RDP to a server from your standard desktop. You will use a privileged access workstation (PAW) that is joined to the red forest, as IPsec will prevent you from accessing your server environment by any other route. The RF has no internet access or email, so it’s not prone to being infected by malware, and since it’s not trusted, it’s much harder to laterally move into it (though side note, if you’re using the same passwords in the RF as you are in prod, then you do effectively have the same hash, which could in theory be compromised).
This does present some challenges. For one, to manage the production environment, you’ll need to set up gateways in production and configure certificate authentication between the management group and the gateways. This isn’t terribly difficult to do, but it can introduce a few points of failure, namely agent and gateway failover, as these do not occur automatically. Jimmy Harper wrote a nice piece on how to do this, and since (as I understand it) the commands are the same in 2016 as they are in 2012, I’ll simply link it here. Since agent deployment is not a static thing, you’ll likely need to run the agent failover PowerShell as a scheduled task on the management server on a periodic basis, and add the gateway failover scripts as a part of any new gateway deployment. At least in this scenario, you’re only managing one management group, and you’ve mitigated the risks associated with SCOM administration. Second, you are still at risk from run as account distribution issues. I recommended in part 2 of this series that these be audited. This isn’t hard to do from the admin console. The bottom line is that this audit needs to be performed with some frequency, as a poorly distributed run as account can lead to a very quick compromise. Third, this may present issues depending on how you created your RF. I’m definitely approaching it as more of a management network for the entire domain. If you have an RF that is for DA credentials only, then this really isn’t the best option, and you’re going to need to put SCOM in your Tier 1 management network. That type of network will be joined to the production domain, so care will need to be taken to protect it against credential theft.
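As a rough illustration of what a recurring run as account audit looks for, here is a minimal Python sketch. It assumes you have already pulled each account’s distribution out of SCOM by some means of your own. The record fields (`account`, `mode`, `targets`), the `allowed` mapping, and all of the sample names are hypothetical and are not a SCOM export format. The sketch flags accounts that use the “less secure” option (credentials sent to every managed computer) and accounts distributed to machines that are not on an approved list.

```python
from dataclasses import dataclass, field


@dataclass
class RunAsDistribution:
    """One run as account and where its credentials are distributed."""
    account: str
    mode: str  # "less_secure" or "more_secure" (hypothetical labels)
    targets: set = field(default_factory=set)


def audit_distribution(records, allowed):
    """Return a list of (account, problem) findings.

    `allowed` maps an account name to the set of computers that are
    supposed to receive its credentials.
    """
    findings = []
    for rec in records:
        if rec.mode == "less_secure":
            # Credential goes everywhere, so per-host checks are moot.
            findings.append((rec.account, "distributed to all managed computers"))
            continue
        extra = rec.targets - allowed.get(rec.account, set())
        for host in sorted(extra):
            findings.append((rec.account, f"unexpected target: {host}"))
    return findings


# Hypothetical sample data, for illustration only.
records = [
    RunAsDistribution("svc-sql-monitor", "more_secure", {"SQL01", "SQL02", "WEB01"}),
    RunAsDistribution("svc-ad-monitor", "less_secure"),
    RunAsDistribution("svc-web-monitor", "more_secure", {"WEB01"}),
]
allowed = {
    "svc-sql-monitor": {"SQL01", "SQL02"},
    "svc-web-monitor": {"WEB01"},
}

findings = audit_distribution(records, allowed)
for account, problem in findings:
    print(f"{account}: {problem}")
```

An account with no entry in `allowed` has every one of its targets reported, which errs on the side of noise rather than silence. That seemed like the safer default for a security audit, but it is a design choice you may want to change.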
Legacy Protocols and Operating Systems
This is the last thing that comes to mind in architecting a secure SCOM environment. It goes without saying that leaving legacy protocols enabled potentially exposes you to all sorts of attacks. I think at this point, we are all familiar with the consequences of leaving SMB1 enabled. Though the exact cost to IT organizations remains unknown, the estimates range from hundreds of millions to around $4B for WannaCry alone, much of which could have been prevented if organizations had moved away from Windows XP well before the 2017 attack was launched.
That, sadly, is not the only legacy protocol out there. Others include NTLM V1, LANMAN, Digest Authentication, and older versions of TLS. All of these are, at this point, on official deprecation lists from Microsoft. Turning them off certainly presents risks to older applications, so there’s some value in eliminating older apps, or at the very least restricting where these protocols can be used.
SCOM, for the record, does not require any of these protocols, so I highly recommend removing them, as well as deprecating older operating systems such as Server 2008/2012.
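To make that recommendation a little more concrete, here is a small Python sketch of the kind of check you might run over settings collected from each server. The setting names are simplified, hypothetical labels (not real registry paths), the sample host is made up, and the sketch does no collection of its own. The one real-world detail it relies on is that an LmCompatibilityLevel of 5 refuses LM and NTLM V1, while lower levels still accept at least one of them.

```python
# Each check returns True when the collected value means a legacy
# protocol is still usable. Keys are hypothetical labels, not registry paths.
LEGACY_CHECKS = {
    "smb1_enabled": lambda v: v is True,
    "tls10_enabled": lambda v: v is True,
    "tls11_enabled": lambda v: v is True,
    "digest_auth_enabled": lambda v: v is True,
    # Level 5 refuses LM and NTLM V1; anything lower still accepts NTLM V1.
    "lm_compatibility_level": lambda v: v < 5,
}


def legacy_protocols_in_use(settings):
    """Return a list of findings for one host's collected settings."""
    flagged = []
    for name, is_legacy in LEGACY_CHECKS.items():
        if name not in settings:
            # Missing data is reported rather than assumed to be safe.
            flagged.append(f"{name}: not collected")
        elif is_legacy(settings[name]):
            flagged.append(f"{name}: {settings[name]}")
    return flagged


# Hypothetical sample host, for illustration only.
host = {
    "smb1_enabled": False,
    "tls10_enabled": True,
    "tls11_enabled": False,
    "digest_auth_enabled": False,
    "lm_compatibility_level": 3,
}

for finding in legacy_protocols_in_use(host):
    print(finding)
```

Running something like this before and after a SCOM deployment would also show whether a protocol that was turned on temporarily for the installer was turned back off afterwards.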
It is worth noting that there are known issues between the SCOM installer and certain legacy protocols. This has been reported to the product team, and I’m hoping that Microsoft fixes it at some point. Until then, you may need to turn the affected protocols on only for the purpose of deployment.
- RC4, documented on my blog.
- TLS – I haven’t observed this, but have been told that the installer can have issues if older versions of TLS are not turned on. This does make sense as TLS 1.2 support was added in a recent SCOM 2016 UR.
- NTLM V1 – Again, I haven’t observed this, but I do know there is an active investigation regarding the SCOM reporting piece requiring NTLM V1 for install.