It turns out that weird things can happen when you mix Windows Server 2003 and Windows Server 2012 R2 domain controllers

 

We have been getting quite a few calls lately where Kerberos authentication fails intermittently and users are unable to log on.  By itself, that’s a type of call that we’re used to and we help our customers with all the time.  Most experienced AD admins know that this can happen because of broken AD replication, unreachable DCs on your network, or a variety of other environmental issues that all of you likely work hard to avoid as much as possible - because let’s face it, the last thing any admin wants is to have users unable to log in – especially intermittently.

Anyway, we’ve been getting more calls than normal about this lately, and that led us to take a closer look at what was going on.  What we found is that there’s a problem that can manifest when you have Windows Server 2003 and Windows Server 2012 R2 domain controllers serving the same domain.  Since many of you are trying very hard to get rid of your last Windows Server 2003 domain controllers, you might be running into this.  In the case of the customers that called us, the login issues were actually preventing them from being able to complete their migration to Windows Server 2012 R2.

We want all of our customers to be running their Active Directory on the latest supported OS version, which is frankly a lot more scalable, robust, and powerful than Windows Server 2003.  We realize that upgrading an enterprise environment is not easy, and much less so when your users start to have problem during your upgrade.  So we’re just going to come out and say it right up front:

We are working on a hotfix for this issue, but it’s going to take us some time to get it out to you. In the meantime, here are some details about the problem and what you can do right now.

Symptoms include:

When a user tries to log on to their computer, the logon may fail with “unknown username or bad password”

If you look in the system event log, you may notice Kerberos event IDs 4 that look like this:

Event ID: 4
Source: Kerberos
Type: Error
"The Kerberos client received a KRB_AP_ERR_MODIFIED error from the server host/myserver.domain.com. This indicates that the password used to encrypt the Kerberos service ticket is different than that on the target server. Commonly, this is due to identically named machine accounts in the target realm (domain.com), and the client realm. Please contact your system administrator."

 

Why this happens:

The Kerberos client depends on a “salt” from the KDC in order to create the AES keys on the client side. These AES keys are used to hash the password that the user enters on the client, and protect it in transit over the wire so that it can’t be intercepted and decrypted. The “salt” refers to information that is fed into the algorithm used to generate the keys, so that the KDC is able to verify the password hash and issue tickets to the user.

When a Windows 2012 R2 DC is promoted in an environment where Windows 2003 DCs are present, there is a mismatch in the encryption types that are supported on the KDCs and used for salting. Windows Server 2003 DCs do not support AES and Windows Server 2012 R2 DCs don’t support DES for salting.

You might be wondering why these encryption types matter.  As computer hardware gets more powerful, older encryption methods become easier and easier to break.  Thus, we are constantly incorporating newer, more powerful encryption into Windows and Kerberos in order to help protect your user passwords (and your data and your network).

 

Workaround:

If users are having the problem:

Restart the computer that is experiencing the issue. This recreates the AES key as the client machine or member server reaches out to the KDC for Salt. Usually, this will fix the issue temporarily.

How to prevent this from happening:

Option 1: Query against Active Directory the list of computers which are about to change their machine account password and proactively reset their password against a Windows Server 2012 R2 DC and follow that by a reboot.

There’s an advantage to doing it this way: since you are not disabling any encryption type and keeping things set at the default, you shouldn’t run into any other other authentication related issue as long as the machine account password is reset successfully.

Unfortunately, doing this will mean a reboot of machines that are about to change their passwords, so plan on doing this during non-business hours when you can safely reboot workstations.

We’ve created a quick PowerShell script that you can run to do this.

Sample PS script:

> Import-module ActiveDirectory

> Get-adcomputer -filter * -properties PasswordLastSet | export-csv machines.csv

Reset Password:

nltest /SC_CHANGE_PWD:<DomainName>

Option 2: Disable machine password change or increase duration to 120 days.

You should not run into this issue at all if password change is disabled. Normally we don’t recommend doing this since machine account passwords are a core part of your network security and should be changed regularly. However because it’s an easy workaround, the best mitigation right now is to set it to 120 days. That way you buy time while you wait for the hotfix.

If you go with this approach, make sure you set your machine account password duration back to normal after you’ve applied the hotfix that we’re working on.

Here’s the relevant Group Policy settings to use for this option:

Computer Configuration\Windows Settings\Security Settings\Local Polices\Security Options

Domain Member: Maximum machine account password age:

Domain Member: Disable machine account password changes:

Option 3: Disable AES in the environment by modifying Supported Encryption Types for Kerberos using Group Policy. This tells your domain controllers to use RC4-HMAC as the encryption algorithm, which is supported in both Windows Server 2003 and Windows Server 2012 and Windows Server 2012 R2.

You may have heard that we had a security advisory recently to disable RC4 in TLS. Such attacks don’t apply to Kerberos authentication, but there is ongoing research in RC4 which is why new features such as Protected Users do not support RC4. Deploying this option on a domain computer will make it impossible for Protected Users to sign on, so be sure to remove the Group Policy once the Windows Server 2003 DCs are retired.

The advantage to doing this is that once the policy is applied consistently, you don’t need to chase individual workstations.However, you’ll still have to reset machine account passwords and reboot computers to make sure they have new RC4-HMAC keys stored in Active Directory.

You should also make sure that the hotfix https://support.microsoft.com/kb/2768494  is in place on all of your Windows 7 clients and Windows Server 2008 R2 member servers, otherwise they may have other issues.

Remember if you take this option, then after the hotfix for this particular issue is released and applied on Windows Server 2012 R2 KDCs, you will need to modify it again in order to re-enable AES in the domain. The policy needs to be changed again and all the machines will require reboot.

Here are the relevant group policy settings for this option:

Computer Configuration\Windows Settings\Security Settings\Local Polices\Security Options

Network Security: Configure encryption types allowed for Kerberos:

Be sure to check:  RC4_HMAC_MD5

If you have unix/linux clients that use keytab files that were configured with DES enable:  DES_CBC_CRC, DES_CBC_MD5

Make sure that AES128_HMAC_SHA1, and AES256_HMAC_SH1 are NOT Checked

 

--The Directory Services Team