Finally Remove NTLMv1 with Project VAST

Are you old enough to remember parachute pants, VCRs, and boom boxes? How about the Mosaic browser, Banyan VINES, and Token Ring networking? Do you still use any of these things? Probably not. But chances are your organization uses a protocol that is equally old.

You wouldn’t wear leather armor on a modern battlefield. And you shouldn’t expect 25-year-old technology to stand up to a six-month-old attack technique.

Hey, it’s Jon once again, with this month’s installment about Project VAST (the Visual Auditing Security Tool). In this edition, we need to talk about the elephant in the room; we need to talk about NTLM and what you can do about this ancient and deprecated protocol. Yes, this protocol is probably in your environment and yes it is a problem. But with some diligent work and some help from Project VAST, you can deal with it effectively.

Quick Review: What is NTLM?

Once upon a time, before Active Directory (AD), before Windows 2000, before Microsoft’s implementation of Kerberos, there was NTLM (okay, I’ll stop with the reminiscing 😊 ). NTLM stands for NT LAN Manager, a suite of Microsoft protocols used for authentication and integrity. You may know NTLM as a challenge-response protocol. By today’s standards, an NTLM challenge-response is pretty simple:

- Client sends user name to resource server in plain text
- Resource server generates a nonce (a random number) and sends it to the client as the challenge
- Client encrypts the nonce with the hash of the user’s password (which it has cached in memory at logon) and returns the result to the server as its response
- The server proxies the user name, the challenge it sent to the client, and the client’s response to the challenge, to the DC (or other authoritative server) for confirmation
- The DC/authority retrieves the hash from its local Security Accounts Manager (SAM) database and uses it to encrypt the challenge it received from the server
- The DC/authority compares the response it computed against the one from the client; identical responses result in successful authentication to the resource server

(Ref: MSDN at https://msdn.microsoft.com/en-us/library/windows/desktop/aa378749(v=vs.85).aspx)

Notice anything here? NTLM was particularly innovative and effective, in its heyday, because it never transmitted the user’s password or its hash across the network, where it could be easily stolen.
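To make the flow concrete, here is a minimal Python sketch of a challenge-response exchange. It is a simulation under stated assumptions, not real NTLM: SHA-256 and HMAC stand in for NTLM’s actual hashing and encryption, and the account, password, and function names are all hypothetical. What it does show faithfully is the shape of the exchange: only the user name, the nonce, and the response ever cross the “wire.”

```python
import hashlib
import hmac
import secrets

# Stand-in for the DC's account database: user name -> password hash.
# ASSUMPTION: SHA-256 is used only to stay inside the standard library;
# real NTLM derives and uses its hash differently.
ACCOUNT_DB = {"alice": hashlib.sha256(b"correct horse").digest()}


def make_challenge() -> bytes:
    """Resource server: generate a fresh random nonce for this logon attempt.

    The length here is an arbitrary choice for the sketch.
    """
    return secrets.token_bytes(8)


def client_response(password: str, challenge: bytes) -> bytes:
    """Client: combine the challenge with the hash of the user's password.

    ASSUMPTION: HMAC-SHA256 stands in for NTLM's real response function.
    """
    password_hash = hashlib.sha256(password.encode()).digest()
    return hmac.new(password_hash, challenge, hashlib.sha256).digest()


def dc_verify(user: str, challenge: bytes, response: bytes) -> bool:
    """DC: recompute the response from the stored hash and compare the two."""
    stored_hash = ACCOUNT_DB.get(user)
    if stored_hash is None:
        return False
    expected = hmac.new(stored_hash, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)
```

Calling `dc_verify("alice", c, client_response("correct horse", c))` with a challenge `c` from `make_challenge()` succeeds, while a wrong password or an unknown user fails. Note that `client_response` needs only the hash, not the password itself, which is exactly why a stolen hash is so valuable.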

Why is NTLM still in use if we have Kerberos?

Good question. Though NTLM has been largely replaced by Kerberos in AD (and several other protocols for Internet-based authentication), NTLM is still in widespread use because it fills in some gaps where Kerberos is not possible. Kerberos relies upon a trusted-third-party scenario; in AD, this third-party is the Key Distribution Center (KDC) portion of a DC. But suppose you have an old application that can’t support Kerberos, or that you have a member server that lives outside of your AD forest, or that you need to call a resource in a manner unsupported by a Service Principal Name (SPN), such as an IP address. In these scenarios, you need something else and that something else is generally NTLM.

OK, so what’s the problem?

Well done – you’re asking all the right questions.  😊  The problem is multi-faceted. First, NTLMv1 computes its challenge response with the Data Encryption Standard (DES) symmetric-key encryption algorithm. DES was considered secure when it was invented and when NTLMv1 adopted it. But given today’s graphics processing units and freely available password-cracking tools, DES is easily broken.

As easy as NTLM is to crack, that’s not its biggest weakness. Its biggest weakness is its vulnerability to credential theft attacks such as Pass the Hash (PtH). While there are many similar attacks, the underlying strategy is generally the same: steal a “secret” (such as an NTLM hash) from an end-point where it has been cached in memory (recall the third step of challenge-response), and use it to access resources to which one would otherwise not be granted access. (To be clear, PtH is not a Windows-specific attack.)

A thorough discussion of PtH and credential theft is not really my focus here. (For a thorough discussion, check out my friend Mark Simos’s excellent video and work at https://aka.ms/pth.) For now, think of PtH as akin to using a fake ID. The account whose context is being used is still the account of record; but in its context, an attacker has stolen another “ID” (or hash) and used it to grant itself access to a resource that is prohibited (or to which the original account has not been authorized). In the physical world, maybe this “resource” is a bar or six-pack of beer; in AD terms, it’s generally a computing resource or data set.

By using free tools (the same tools used by attackers), one can easily see the problem. Take a look at these commands executed with Windows Credential Editor (WCE).

In a nutshell, we have just attacked NTLM by stealing very powerful credentials from memory and re-using them. This attack required no knowledge of NTLM, Kerberos or AD. And, in the example above, the attacker was able to achieve domain dominance.

What to do about NTLM?

We haven’t talked about NTLM versions yet. Prior to the ascendancy of AD and Kerberos, Microsoft released two (or more precisely three) revisions to the NTLM standard:

  • NT Lan Manager
  • NTLMv1
  • NTLMv2

All three have long been deprecated; it’s safe to say that NT Lan Manager (“LanMan” for short) offers no real protection, NTLMv1 offers limited protection, and NTLMv2, being the latest revision, is the best of the three. In terms of resistance to cracking, LanMan and NTLMv1 cannot be used securely (short of encrypting the transport with something like IPsec); only NTLMv2 may be considered secure from this standpoint.

To be clear, all three are vulnerable to credential theft attacks like PtH. (To be very clear, Kerberos is vulnerable to similar attacks like Pass the Ticket.) That said, there are three compelling reasons for removing NTLMv1 from your environment, even if you leave NTLMv2 for backward compatibility (and you will probably need to, at least for a while).

First, the challenge input is far easier for an attacker to anticipate in NTLMv1, as the response is computed over nothing more than a fixed-length random server challenge (8 bytes in the protocol specification). NTLMv2, on the other hand, also mixes in client-supplied data of variable length.

Second, for challenge encryption, recall that NTLMv1 uses DES encryption, whereas NTLMv2 uses the stronger HMAC-MD5. As of now, it’s not feasible to brute-force HMAC-MD5. (Quantum computing will change everything, but that’s a different story.)
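For the curious, the HMAC-MD5 step can be sketched in a few lines of Python. This follows my reading of the published NTLM protocol specification (MS-NLMP) and is an illustration, not a full implementation: the NT hash is taken as an input rather than derived from a password, and the client blob (timestamp, client nonce, target information) is treated as opaque bytes. The function names are my own.

```python
import hashlib
import hmac


def ntowf_v2(nt_hash: bytes, user: str, domain: str) -> bytes:
    """Derive the NTLMv2 key.

    HMAC-MD5, keyed with the NT hash, over the upper-cased user name
    concatenated with the domain name, encoded as UTF-16LE.
    """
    identity = (user.upper() + domain).encode("utf-16-le")
    return hmac.new(nt_hash, identity, hashlib.md5).digest()


def nt_proof_str(key: bytes, server_challenge: bytes, client_blob: bytes) -> bytes:
    """Compute the NTLMv2 proof value.

    HMAC-MD5, keyed with the NTLMv2 key, over the server challenge followed by
    the variable-length client blob. ASSUMPTION: the blob is supplied by the
    caller; this sketch does not build its internal layout.
    """
    return hmac.new(key, server_challenge + client_blob, hashlib.md5).digest()
```

The point to notice is the second argument list: because the client blob is folded into the keyed hash, two logons by the same account against the same server challenge still produce different responses.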

Third, even if you’re somehow not concerned about the security vulnerabilities inherent in NTLMv1, you’ll have to remove it to use Windows 10 Credential Guard (and you should absolutely use Credential Guard). Because NTLMv1 is so much less secure, the only protocols that Credential Guard supports are Kerberos and NTLMv2. You’ll have to address the sources of NTLMv1 before using Credential Guard; else it won’t end well when you disallow NTLMv1.

NTLM versions are easily configurable via Group Policy (GPO) at Computer Configuration\Windows Settings\Security Settings\Local Policies\Security Options\Network security: LAN Manager authentication level, with six different options ranging from 0 (least secure) to 5 (most secure).

A rule of thumb that I encourage customers to embrace is to start auditing with level three. At level three, clients send NTLMv2 responses only, while DCs still accept the older protocol versions. Eventually, we’ll want to get to level five, but we’re not ready yet. We need to audit, and that’s where VAST comes in.
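As a quick reference, the six levels can be written down as a small lookup table. The sketch below reflects my understanding of Microsoft’s documentation for this policy (which is stored in the LmCompatibilityLevel registry value); the type and function names are hypothetical, and I use “NTLMv1” where the policy text simply says “NTLM.”

```python
from typing import NamedTuple


class LmLevel(NamedTuple):
    policy_text: str          # wording shown in the Group Policy editor
    client_sends: frozenset   # what a client at this level will send
    dc_accepts: frozenset     # what a DC at this level will accept


ALL = frozenset({"LM", "NTLMv1", "NTLMv2"})

LM_COMPATIBILITY_LEVELS = {
    0: LmLevel("Send LM & NTLM responses",
               frozenset({"LM", "NTLMv1"}), ALL),
    1: LmLevel("Send LM & NTLM - use NTLMv2 session security if negotiated",
               frozenset({"LM", "NTLMv1"}), ALL),
    2: LmLevel("Send NTLM response only",
               frozenset({"NTLMv1"}), ALL),
    3: LmLevel("Send NTLMv2 response only",
               frozenset({"NTLMv2"}), ALL),
    4: LmLevel("Send NTLMv2 response only. Refuse LM",
               frozenset({"NTLMv2"}), frozenset({"NTLMv1", "NTLMv2"})),
    5: LmLevel("Send NTLMv2 response only. Refuse LM & NTLM",
               frozenset({"NTLMv2"}), frozenset({"NTLMv2"})),
}


def dc_accepts(level: int, protocol: str) -> bool:
    """Return True if a DC configured at `level` accepts `protocol`."""
    return protocol in LM_COMPATIBILITY_LEVELS[level].dc_accepts
```

For example, `dc_accepts(3, "NTLMv1")` is true and `dc_accepts(5, "NTLMv1")` is false, which is why level three is a safe place to audit from and level five is where we want to end up.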

The Challenge (so to speak)

At least three native methods of logging NTLM traffic exist. For our project and because we need to explicitly log the version, we’ll focus on Event ID 4624, “An account was successfully logged on.”

The good news is that this is a very high-value, verbose event. It clearly tells us the account, originating Workstation name and its IP address (to be exact, this is the workstation name and IP of the computer that last chained the NTLM request). Critically for us here, 4624 also shows us the Package Name, which in this case is NTLMv1. So we can construct a pretty good story out of this data: The built-in Administrator account authenticated using NTLMv1 to the local machine from computer Workstation1, which has IP 192.168.2.61.

Inspecting this event is efficient enough, but look closely. In my lab (as in most environments), I have thousands of events to comb through – in this case over 28,000 of them on this single machine. This is partly because 4624 tracks all successful logons – not just those using NTLM.
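To illustrate the filtering problem, here is a small Python sketch that picks the NTLMv1 logons out of a pile of 4624 records. The records are plain dictionaries, and the field names (EventID, Account, WorkstationName, IpAddress, LmPackageName) and the “NTLM V1” value are assumptions modeled on how I have seen these events exported; check them against your own data. The sample records are illustrative, with the first one borrowed from the example event described above.

```python
# Illustrative records only; field names and values are assumptions.
SAMPLE_EVENTS = [
    {"EventID": 4624, "Account": "Administrator", "WorkstationName": "Workstation1",
     "IpAddress": "192.168.2.61", "LmPackageName": "NTLM V1"},
    {"EventID": 4624, "Account": "jdoe", "WorkstationName": "Workstation2",
     "IpAddress": "192.168.2.62", "LmPackageName": "NTLM V2"},
    {"EventID": 4624, "Account": "jdoe", "WorkstationName": "-",
     "IpAddress": "192.168.2.62", "LmPackageName": "-"},  # e.g. a Kerberos logon
]


def is_ntlm_v1(event: dict) -> bool:
    """Return True for a successful logon (4624) whose package name is NTLMv1."""
    if event.get("EventID") != 4624:
        return False
    # Normalize spacing and case so "NTLM V1" and "NTLMv1" both match.
    package = (event.get("LmPackageName") or "").replace(" ", "").upper()
    return package == "NTLMV1"


def ntlm_v1_logons(events: list) -> list:
    """Keep only the NTLMv1 logons from a list of event records."""
    return [event for event in events if is_ntlm_v1(event)]
```

Run against `SAMPLE_EVENTS`, `ntlm_v1_logons` keeps only the Administrator logon from Workstation1. The logic is trivial; the hard part, as we’ll see, is doing this across every machine in the environment.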

Once again – what we have here is a problem of big data.

Enter Project VAST

We need to deal with this data in two ways. First, we clearly need to aggregate it. Windows Event Forwarding and SIEMs do this well enough. But in my experience, aggregation is simply not enough. After all, combing through this amount of data, even after aggregation, is not always very realistic; too few organizations are successful using aggregation alone. This brings me to the second necessary factor here: we need to make the big data set consumable and truly actionable.

We’ll start, as we always do in Project VAST, with Azure Log Analytics. Once we have the data aggregated in Azure, we can create Kusto queries to view the data and control the output.

In AD, we weren’t able to easily filter natively to only NTLM authentications; recall that 4624 is a successful logon, regardless of protocol. In Azure Log Analytics, we can easily query on the AuthenticationPackageName field, as we’ve done here. Still, in just 24 hours, we have 209 NTLM logons to sift through. Overall that’s better, but we haven’t rendered the data really actionable yet. After all, we want folks to make well-informed, data-centric decisions about their security budget.

A Closer Look

Let’s take a look at Project VAST’s NTLM tab. Recall that here we are exporting the Kusto query out of Azure Log Analytics and importing it into Power BI. This configuration allows Power BI to query Azure Log Analytics directly with no need for intermediary data sources. The NTLM tab in Project VAST allows us to visualize the 4624 data and filter the display in a number of ways.

Start by focusing your attention on the NTLM Version filter that I’ve marked above with the red arrow. Because 4624 includes the Package Name attribute, we can filter to either V1 or V2. For the reasons we’ve discussed earlier in this article, Brian and I have made the decision to default VAST filtering on this page to NTLMv1 only. This will help you focus on the less secure authentication traffic patterns.

Directly above the NTLM Version filter is a filter titled isAdmin. Because Project VAST both queries your AD for members of built-in groups (like Domain Admins and Server Operators, for example) and also allows you to specify Administrative accounts, we can filter to only admin accounts, only non-admin accounts or (as we’ve done here) neither. This view is therefore displaying the NTLMv1 authentication traffic for both admin and non-admin accounts. This is a good place to start.

Below the filter for NTLM Version, have a look at NTLM Auth by Account (Top 5). As in tabs we’ve discussed previously, the yellow bars represent NTLM traffic from non-Admin accounts; the red, for Admin.

If you’ve read the previous entries, this look and feel should be becoming familiar. In the upper left-hand corner, we have represented the data flows for the top five sources of NTLM authentications. In my lab, which is small, there are only two: 192.168.2.57 and 192.168.2.56, each sending authentications against DCs 1 and 2.

Now that we have an idea of the most significant culprits and authentication flows, let’s drill down to some truly actionable data. By clicking on one of the accounts in NTLM Auth by Account, we can examine data that solely pertains to that account and the other filters that we have applied. Let’s click on svc7.

On display now is only the data pertaining to svc7’s NTLMv1 authentication flows within our data set. We can easily see the host IP, the authenticating DCs, timestamp information, and some raw data. We now have the understanding that we need in order to take action – starting with determining the process running on 192.168.2.56 responsible for NTLMv1 traffic. Next we’ll work with application owners, vendors, or infrastructure teams to change the traffic over to NTLMv2 or Kerberos.
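The same two views, the top accounts and the per-account drill-down, are easy to reason about as plain aggregations. The Python sketch below is not how Project VAST or Power BI implements them; it is a hypothetical illustration over dictionary records, with assumed field names (Account, IpAddress, Computer) and made-up sample rows that reuse the account and host IPs from my lab.

```python
from collections import Counter

# Illustrative NTLMv1 logon records; "Computer" is the authenticating DC.
SAMPLE = [
    {"Account": "svc7", "IpAddress": "192.168.2.56", "Computer": "DC1"},
    {"Account": "svc7", "IpAddress": "192.168.2.56", "Computer": "DC2"},
    {"Account": "svc7", "IpAddress": "192.168.2.56", "Computer": "DC1"},
    {"Account": "Administrator", "IpAddress": "192.168.2.57", "Computer": "DC1"},
]


def top_accounts(events, admin_accounts, n=5):
    """Return (account, logon count, is_admin) for the n busiest accounts."""
    counts = Counter(event["Account"] for event in events)
    return [(account, count, account in admin_accounts)
            for account, count in counts.most_common(n)]


def flows_for_account(events, account):
    """Count logons per (source IP, DC) pair for a single account."""
    return Counter((event["IpAddress"], event["Computer"])
                   for event in events if event["Account"] == account)
```

Here `top_accounts(SAMPLE, {"Administrator"})` ranks svc7 first with three logons, and `flows_for_account(SAMPLE, "svc7")` shows that all of them originate from 192.168.2.56. That source host is where the investigation starts.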

In other words, we will have surfaced a vulnerability and then mitigated it – making for a nice story of progress as well as return on investment along our security roadmap. And like all journeys, our work with NTLMv1 won’t last indefinitely. Once we’re satisfied that we’ve mitigated all of our NTLMv1 traffic (i.e., the NTLM tab in Project VAST, filtered to V1, is blank), then it’s time to change our GPO setting to five. All new NTLM traffic will have to use NTLMv2, since at level five the DCs refuse LanMan and NTLMv1 responses.

That wraps it up for Project VAST and NTLM auditing for now. Good luck, let us know how we can help and, as always, happy auditing.