Hi Rob here, I am a Support Escalation Engineer in Directory Services out of Charlotte, NC, USA. We work a lot of Kerberos authentication failure issues. Since Kerberos is typically the first authentication method attempted, it ends up having authentication failures more often. One of the great things about Windows is that the product seems to just work without too much customization that is needed by the customer. However I wanted to create a blog to try and demystify how Kerberos authentication works. I plan on writing several Kerberos blogs in the near future to include Kerberos Delegation aka Double-hop, how to troubleshoot Kerberos authentication.
There are some general terms that you might not be familiar, so let’s run through them quickly.
Kerberos defines two different types of accounts (or Principals). The two different names given to these types of accounts are User Principal Name (UPN), and Service Principal Name (SPN). We would typically relate these two types of principals to Active Directory users and computers.
Only user accounts have a UPN defined on their account. When looking at a user account if you click on the Account tab, the UPN is derived from the combining of the two fields listed for “User logon name”. A User Principal Name must be unique across the entire forest otherwise when the KDC goes to look up the Users Account via UPN it will get back more than one account and cause authentication failures for all users that have the same UPN. The UPN of an Active Directory object is an attribute of the object, and can only hold a single value. The attribute name is userPrincipalName. An example of a UPN is: firstname.lastname@example.org.
Service Principal Names MUST be unique across the entire Active Directory forest, and can be assigned to either User accounts or Computer accounts. Only computer accounts automatically have Service Principal Names defined.
Service Principal Names define what services run under the accounts security context. For example some of the services that a computer might have are File server / CIFS (Common Internet File System), if it is a domain controller it is going to have an LDAP SPN, and Active Directory Replication SPN, and FRS SPN. Service Principal Names can be defined on user accounts when a Service or application is running under that users Security context. Typically these types of user accounts are known as “Service Accounts”. It is very import that you understand that Service Principal Names MUST be unique throughout the entire Active Directory forest.
Some typical scenarios when a user account has a Service Principal Name defined are:
- When SQL Server Service is using a user account or “Service Account” instead of the default of “LocalSystem”. An example is: MSSQLSvc/sqlsrvr.contoso.com:1433
- When an IIS Web Application Pool is running as a specified user account rather than as the default of “Network Service”. An example is: http/websrvr.contoso.com
Some tools that can be used to list the Service Principal names on an Active Directory object are: LDP, LDIFDE (These two are great utilities if you want to query LDAP for all objects that have the SPN defined on them.), next is ADSIEdit, or SetSPN. The last two are great utilities if you want to see what SPNs are registered on a given object.
KDC (Key Distribution Center): The KDC is a service that should only be running on a domain controller. The service name is “Kerberos Key Distribution Center”. Basically the KDC is the service that is responsible for authenticating users when Kerberos is used. The KDC implements two server components. Authentication Server (AS), and Ticket Granting Server (TGS).
Authentication Server (AS): The KDC role that verifies the identity of the principal and issues the Ticket Granting Ticket (TGT) to the principal upon successful authentication. Holding a valid TGT allows the principal to request a Service ticket from the Ticket Granting Server (TGS). There will be a TGT in the Credentials Cache for each domain the principal has accessed resources in. An example of this would be: a user in contoso.com domain wanted access to a file server in emea.contoso.com the user would have a TGT for contoso.com, and emea.contoso.com
Ticket Granting Server (TGS): The KDC role that issues a service ticket when a principal requests connection to a Kerberos service. You must have a Ticket Granting Ticket (TGT) for the Active Directory domain before you can be issued a service ticket in that Active Directory domain. Although the KDC issues the service ticket it does not talk directly to the service that the principal is requesting the ticket for. Once a service ticket has been issued by the Ticket Granting Server, the service ticket is put into the principal’s credentials cache for later use. When the principal needs to connect to the requested service the service ticket is used from the credentials cache and sent to the service it is attempting to connect to.
There are only two different types for tickets that the KDC issues.
- Ticket Granting Ticket (TGT)
- Service tickets from the Ticket Granting Server.
Kerberos Ticketing Process:
How Kerberos works can be very difficult to keep straight. There is a lot of decrypting and encrypting of authentication data. I have laid out the entire ticketing process here in two formats. If you are just trying to understand at a high level of how Kerberos authentication works I would suggest that you keep to the number lists below. If you already know the high level Kerberos ticketing process and are looking for more detail on how Kerberos authentication works I would suggest that you look at the bulleted list under each numbered list below.
Image is taken from the Kerberos TechNet article
1. The client sends a KRB_AS_REQ to the KDC and more specifically the Authentication Server to request a Ticket Granting Ticket (TGT).
- The AS_REQ is built on the client machine using the current computer time and encrypting it with the users Password hash. There is some other information within the AS_REQ packet that includes the UPN of the Principal. This information is called “Authentication Data”
2. Once the KDC verifies the users Authentication Data, it responds back to the client with a KRB_AS_REP to the client with a TGT and session key for the TGT.
- The KDC can decrypt the AS_REQ because the principal’s password hash is stored within Active Directory; it then verifies that the timestamp in the AS_REQ to make sure that the systems are within 5 minutes of time skew. This process validates that the principal authenticating knows the users account and password.
- The TGT session key is used in subsequent service ticket requests.
3. The client is then able to request service tickets since it has a valid TGT for the Active Directory domain. The client then sends a KRB_TGS_REQ to the KDC and more specifically the Ticket Granting Server to request a Service Ticket. Keep in mind that it not only sends the Service Ticket Request, but also a copy of the TGT that it was given earlier.
- The principal is going to build an authenticator that is encrypted with the Session key of the TGT. This authenticator data is the User Principal Name, as well as the current system timestamp. The TGS_REQ has the following information: The Service Principal name that they want access to, and the TGT from the previous step.
4. Once the KDC has verified the validity of the TGT that is included with the Service Ticket request it responds back to the client with a KRB_TGS_REP with the Service Ticket and service session key.
- After successful decryption of the TGT, the Ticket Granting Server has access to the TGS Session key that the authenticator data is encrypted with. Decrypting the Authenticator data helps to prove the user principal who was issued the TGT is who sent the Service Ticket Request. Also with decrypting the Authenticator data, the KDC is able to verify that system time is within 5 minutes of time skew and is not a replay attack.
- An LDAP query is done for the Service Principal Name (SPN) that the ticket is being requested for from the Ticket Information. The LDAP Search is doing a query on the servicePrincipalName attribute and is done against a domain partition.
- The TGS generates a Unique Service Session Key. This session key is going to be used by the principal and service.
- The KDC then creates the service ticket with the following information within it:
- A copy of the unique session key
- v Authorization data that is copied from the principals TGT that was given in the request. The Authorization data is going to have the Principals SID, all group information, again this information is known as the Privileged Attribute Certificate or PAC.
- Once the service ticket information is compiled the entire service ticket is encrypted with the Services User Key (password hash).
- The ticket information and the principal’s copy of the service session key are encrypted with the Ticket Granting Server session key.
5. Next the client sends the Service Ticket to the Service/Application as a KRB_AP_REQ. You will typically see this embedded in the type of packet that the service uses. Like file shares use SMB, for example.
- The information in the KRB_AP_REQ is the service ticket obtained from the TGS, and an authenticator encrypted with the session key for the service.
6. After authentication succeeds The Service responds back to the client with a KRB_AP_REP.
- When the client receives KRB_AP_REP, it decrypts the service’s authenticator with the service session key it shares with the service and compares the time returned by the service with the time in the client’s original authenticator. If the times match, the client knows that the service is genuine.
- After the server authenticates the client using Kerberos authentication, the Privilege Attribute Certificate or PAC is taken from the service ticket and used to create the user’s access token. This PAC is verified against a domain controller through a NetLogon call to verify the PAC Signature.
As you can see, the KDC does not participate directly in the authentication of users to the end service with Kerberos. The KDC is known as the trusted 3rd party in this type of authentication. It is known this way because it is the only service that knows the passwords of the user and the service.
Kerberos delegation is the act of principal (Service) impersonating another principal (user) to gain access to a 3rd principal (service). In other words, User connecting to an IIS Server that queries a SQL Server as the user who is requesting some data from the web server.
There are two different types of delegation.
- Unconstrained (Windows 2000/2003/2008 Servers can do this type of delegation.)
- Constrained (Windows 2003/2008 Servers in 2003 Domain Functional Level (or higher) can do this type of delegation.)
We are going to cover Kerberos Delegation in detail in another blog entry.
Tools used to view and troubleshoot Kerberos:
KerbTray: This is a great utility GUI based utility that shows up in the system tray that allows you to view all your Kerberos tickets as well as being able to purge them. The purge feature is done by right clicking the green ticket in the system tray and selecting “Purge Tickets”. The Purge Tickets options delete all Kerberos tickets.
KList: This is a great command line tool that lists Kerberos tickets as well as being able to purge Kerberos tickets. The nice thing about this tool is that you can selectively purge Kerberos tickets rather than deleting all tickets like the KerbTray utility does.
Network Captures: Network capturing utilities can be indispensable when troubleshooting a Kerberos authentication issue. As we say here “the truth is on the wire”. Most network capture utilities have very good Kerberos parsers included. Some of our favorites here are Network Monitor 3, WireShark, and Ethereal.
Kerberos Event logging: The operating system by default does not create event log entries for Kerberos authentication events. You can however turn this feature by reviewing the following KB article:
262177 How to enable Kerberos event logging – http://support.microsoft.com/default.aspx?scid=kb;EN-US;262177
You would enable this feature on the client machines and any other machines participating in Kerberos delegation.
Note: I would caution you on enabling this feature. There are some events that you will see that are really not Kerberos errors – such as 0x12 KDC_ERR_CLIENT_REVOKED, 0xD KDC_ERR_BADOPTION, or 0x34 KRB_ERR_RESPONSE_TOO_BIG. We have had cases where the customer enabled this from a previous case and never turned it back off. Since they were now sensitive to all Kerberos errors they have opened up a new case just to be asked to turn off the logging because the events were not really errors.
There are some basic dependencies that need to be in place for Kerberos Authentication to succeed.
For Kerberos to function correctly, the supporting infrastructure must be sound. Since Passwords are used to encrypt data within Tickets it is imperative that when a user or computer changes their password that Active Directory replication is able to send these changes throughout the environment.
Proper name resolution is required. For Kerberos to function it has to be able to resolve the proper IP Addresses for the KDC as well as the Service Principal you are attempting to access. Both DNS and NetBIOS name resolution must be setup correctly. Many cases identified as Kerberos issues were caused by bad records in DNS, broken WINS replication, or HOSTS/LMHOSTS files with invalid data. DNS SRV records for _kerberos will need to be in place, for both the _tcp and _udp DNS sub-domains. Check the configured DNS suffixes and search order as well. A typical problem that we find is that the DNS Zone has been configured for WINS Lookup. When this happens the wrong DNS FQDN is found for the service the user is attempting to connect to, which then causes the application to ask for a service principal for the wrong FQDN.
All machines participating in Kerberos authentication need to be within 5 minutes of time. By default Windows will use the Windows Time (w32time) service, but can use a third party NTP client. We don’t care what time it is as long as all computers in the forest agree to within 5 minutes of one another.
We need to ensure that we have good connectivity. TCP and UDP port 88 must be open from clients to domain controllers. A common problem is that routers will arbitrarily fragment UDP packets; when this happens the Kerberos ticket request packets are discarded by the KDC. Windows Vista and Windows Server 2008 now default to using TCP for kerberos ticket requests. Typically you work around this issue by implementing the following KB article:
244474 How to force Kerberos to use TCP instead of UDP in Windows Server 2003, in Windows XP, and in Windows 2000 – http://support.microsoft.com/default.aspx?scid=kb;EN-US;244474
LDAP queries will be made by the DC / KDC for Service Principal Name records. Duplicate computer names, usernames, etc, or manually registered duplicate SPN’s anywhere in the forest can cause Kerberos errors. There is an event that is created when this happens, but it is only logged on the domain controller that attempted to find the service principal. It is a Kerberos Event 7.
Other Kerberos information:
How the Kerberos Version 5 Authentication Protocol Works http://technet2.microsoft.com/WindowsServer/en/library/4a1daa3e-b45c-44ea-a0b6-fe8910f92f281033.mspx
Kerberos Authentication in Windows Server 2003 http://technet2.microsoft.com/windowsserver/en/technologies/featured/kerberos/default.mspx
MIT’s references for Kerberos
The Kerberos Consortium
– Rob Greene