The NT Token cache on the web server – Maybe you didn’t know this even existed…
Consider this scenario:
You are setting up ADFS in a federated scenario with SharePoint configured as a token-based application.
The initial setup has miscellaneous configuration errors that you correct along the way. You test again and find some more configuration issues further down the line. Each time you correct something – you try to get to the web site with your client machine and your test account. Each time – you are getting closer and closer.
You finally make it to the SharePoint page and you are happy…No errors from the Federation Servers and things went as expected. Maybe not 100% the way you expected – but you just need to make some minor changes with the SharePoint permissions, then you are ready to test out some different claims and other items you have on your list.
OK – careful right here. Something happened under the hood here – and it’s important!
Let’s talk about what just happened when you finally made it to SharePoint error free. The ADFS token-based web agent wrote an NT token on the web server, and this user (identified by their identity claim) will find and reuse that same token on subsequent requests to applications on this box for the next 60 minutes.
See where you can run into trouble during initial setup and testing? Let's continue with this example…
Let's assume that when you first accessed the site (successfully) – you had a UPN identity claim and Group Claim A (which mapped to Windows Group A). The agent wrote a token with the SID of Windows Group A on the web server. You then realize you need to test Group Claim B, which is associated with Windows Group B. You take all the correct steps – add the test user to a different group on the account side, log off/log on, test again.
Hmmm – you are still getting the permissions associated with Group Claim A/Windows Group A. You start checking your configuration and looking at logs – you see Group Claim B getting passed as it should from the FS-A to the FS-R – but when the request reaches the web server, you don’t have the permissions you associated with Group Claim B.
Now the doubt starts to creep in…Just when you thought you had the hang of this claim thing 😉
You check/double/triple check your configuration – maybe you configure Group Claim C – same thing! What changed, you ask yourself? Trying a different user on the account side probably never occurred to you (I know it never does to me when I’m in this place).
Hopefully you read this (and remember the NT Token Cache) before you spin your wheels too long with a scenario like the one I described here. The cached token for that identity still carries the SID of Windows Group A, so the new group claim has no effect until the cache entry expires. The example I gave above is not the only way you can get in trouble here – it’s just one of the ways.
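To make the behavior concrete, here is a toy Python model of a per-identity cache with a fixed entry lifetime. It is only a sketch of the idea described above, not the web agent's actual code: the `TokenCache` class, its method names, and the sample user are all hypothetical, and the fake clock is there just so the example runs instantly.

```python
import time

# Models the 60-minute default described in this post (expressed in seconds).
CACHE_ENTRY_LIFETIME = 60 * 60


class TokenCache:
    """Toy per-identity token cache (hypothetical; not the ADFS agent's code)."""

    def __init__(self, lifetime=CACHE_ENTRY_LIFETIME, clock=time.monotonic):
        self.lifetime = lifetime
        self.clock = clock
        self._entries = {}  # identity claim -> (created_at, frozenset of groups)

    def get_token(self, identity, group_claims):
        now = self.clock()
        entry = self._entries.get(identity)
        if entry is not None and now - entry[0] < self.lifetime:
            # Cache hit: the groups in the incoming claims are ignored.
            return entry[1]
        # Cache miss or expired entry: build a fresh "token" from current claims.
        groups = frozenset(group_claims)
        self._entries[identity] = (now, groups)
        return groups


fake_now = [0.0]  # controllable clock so no real waiting is needed
cache = TokenCache(clock=lambda: fake_now[0])

first = cache.get_token("user@example.com", ["Group A"])
fake_now[0] += 10 * 60   # ten minutes later the same user presents Group B
second = cache.get_token("user@example.com", ["Group B"])
other = cache.get_token("other@example.com", ["Group B"])  # a different user
fake_now[0] += 60 * 60   # now past the entry lifetime
third = cache.get_token("user@example.com", ["Group B"])

print(sorted(first), sorted(second), sorted(other), sorted(third))
# prints: ['Group A'] ['Group A'] ['Group B'] ['Group B']
```

The second lookup is the trap from the scenario: same identity, new claims, old token. A different user, or the same user after the entry expires, gets a token built from the current claims.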
If you enable debug logging on the web server, you will see a message indicating that a cache entry has been found.
When you are in the lab and are going to be making changes, testing, then more changes, and more testing – you may want to consider reducing the CacheEntryLifetime from the default (60 minutes) to the minimum (1 minute, entered as 60, which implies the value is in seconds). To do this – add the following registry values to the web server at this location:
CacheEntryLifetime – dword – 60 decimal
CacheScavengeInterval – dword – 60 decimal
Reboot the server for these changes to take effect.
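If you would rather script the change than click through regedit, something like the following Python sketch could work. It uses the standard `winreg` module; the function name and the `subkey_path` parameter are my own, and the key path is deliberately left for you to supply. I am also assuming the key lives under HKEY_LOCAL_MACHINE and that the values are in seconds, so treat this as a starting point and check it against your own server.

```python
# Values from this post, both written as decimal DWORDs.
# Assumption: the unit is seconds, so 60 corresponds to the 1-minute minimum.
LAB_CACHE_VALUES = {
    "CacheEntryLifetime": 60,
    "CacheScavengeInterval": 60,
}


def apply_lab_cache_values(subkey_path, values=LAB_CACHE_VALUES):
    """Write each value as a REG_DWORD under HKEY_LOCAL_MACHINE\\<subkey_path>.

    subkey_path is supplied by the caller; it is not hard-coded here.
    Assumes the key belongs under HKEY_LOCAL_MACHINE.
    """
    import winreg  # Windows-only standard library module

    with winreg.CreateKeyEx(
        winreg.HKEY_LOCAL_MACHINE, subkey_path, 0, winreg.KEY_SET_VALUE
    ) as key:
        for name, value in values.items():
            winreg.SetValueEx(key, name, 0, winreg.REG_DWORD, value)


# Example call (run elevated on the lab web server, with your own key path):
# apply_lab_cache_values(r"<registry key path for the web agent>")
```

The script only writes the values, so the reboot mentioned above is still needed afterwards.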
Now – you can continue your testing without hitting this type of problem. Keep in mind – this is for lab environments only. I have no idea what this would do to a busy production web server from a performance standpoint.
A complete list of all the cache settings is located here
This blog certainly raises some questions (for me anyway) - when I tried to test things to verify and provide more detailed information, I got into a major rat hole…
I’ll follow up with more detailed information like:
- How the debug logs look – how to verify this is what you are hitting
- Shadow account existence – and the account partner “resource account” setting
I think more detailed items on the subject are needed here. I’m going to put this out for now and build on it later.