Integrating SharePoint 2013 with Azure Active Directory – Part 2 The Custom Claims Provider

In Part 1 of this series, we went through how to configure SharePoint to use ACS and Azure Active Directory (AAD) as our Identity Provider. Once that is complete you will have a working end-to-end solution in which you can authenticate, get authorized and work in the site. What you also have is the standard out-of-the-box experience, which means you have the “echo chamber” people picker experience (i.e. whatever you type in, people picker will “echo” back out that it is a valid value for every claim you have mapped).

One of the things we love about AAD though is the Graph API – an out of the box API and REST endpoint that we can use to query the directory. This is really one of the big value propositions for using AAD with SharePoint; unlike a lot of other SAML directories, this one has a queryable directory ready to go. In this post I’ll cover some of the Graph API programming that I did, but I won’t focus exclusively on that because there are or will be lots of Graph API code samples out there (try starting here if you are lost: https://msdn.microsoft.com/en-us/library/windowsazure/hh974476.aspx). Of more importance I think is for you to understand some of the features of the Graph API and how that impacts how one might typically write a custom claims provider. With that background, I’ll explain then the reasoning I used in deciding how to implement the different features of the custom claims provider.

Let’s talk first about some of those features of the Graph API and AAD because it should provide better context as to why I made certain implementation choices. The first release has these constraints that you need to be aware of (some of which I covered in Part 1):

  • When a user is authenticated through ADFS to AAD to ACS, the role claims for the user are not sent back.
  • The claims sent back for a user are part of a fixed schema. You cannot assume that every claim you define in ADFS will be sent back. Instead, only the properties in the AAD schema are returned; you can see the complete schema here: https://msdn.microsoft.com/en-us/library/windowsazure/dn195587.aspx.
  • The Graph API only supports “AND” queries, not “OR” queries. In SharePoint a user typically enters their query, and we return results that match across any of the mapped claims. Since you can’t create a query that says where foo=”James” OR bar=”James”, you have to issue a separate query to Graph for every attribute you want to query on, which can obviously lead to performance / latency issues.
  • There is not any way to do wildcard searches with Graph API; it supports equals, less than or equal to, and greater than or equal to. Of course, users are used to typing “Ste” and getting back “Steve”, “Stephen”, “Stephanie”, etc. so this can be a tough one, but I do have one thought on this problem and have incorporated it into the custom claims provider I’m going to be describing here.
  • You are effectively forced to use UPN as the identity claim. This is because all queries for user attributes to the Graph API must include either the user’s UPN or objectID, and the objectID is not useful if you are human, so I will focus here on using UPN.

Okay, now that we know the parameters under which we will be working, let’s talk about the implementation of the custom claims provider. What I would suggest first is that you get your development environment where you will be creating your custom claims provider set up and ready to go to build Graph applications. In order to do that there are a few steps you need to do:

  1. Download the WCF Data Services for OData 3 from https://www.microsoft.com/download/en/details.aspx?id=29306.
  2. Get the CreateServicePrincipal.ps1 PowerShell script that is included with the sample application here: https://code.msdn.microsoft.com/Write-Sample-App-for-79e55502
  3. Download and install the PowerShell tools described here: https://technet.microsoft.com/en-us/library/jj151815.aspx
  4. Run the CreateServicePrincipal.ps1 PowerShell script to create your ServicePrincipal. The script will output a set of values at the end like TenantDomainName, AppPrincipalId, Password, etc. that will be required for getting the access token you need to query the Graph API. Once you run the script you should see output like this:

TenantDomainName: dreamswirls.onmicrosoft.com
TenantContextId: 06c83d0b-d384-4dcd-a8fc-a347d83a37b4
AppPrincipalId: 1f438108-ba1b-427f-9ce9-3a99d1aa2103
Password: e+XalkeKfs6Ax1Fqj++OP4mcS8PKQDHwzeG7rB7LiM=
Audience URI: <1f3cd808-21b-4d7f-9ee9-8298f1aa2103@06832a0b-d3dd-4e3d-a0fc-a3478ad2b7b4>

Now that you have details needed to connect to the Graph API, we can set about the first task, which is to address the fact that the role claims for the user don’t come through. The way we’ll tackle solving that is with claims augmentation, by implementing the FillClaimsForEntity method in our provider. What I’ll need to do is to get the user’s UPN claim; as I explained in Part 1 of this series, we created a claim rule in ACS that sends us the UPN claim. Unfortunately, even in SharePoint 2013 we still can’t get the user’s claims directly from the parameters provided when the FillClaimsForEntity method is invoked. Instead then we’ll use the method I previously described here: https://blogs.technet.com/b/speschka/archive/2011/03/29/how-to-get-all-user-claims-at-claims-augmentation-time-in-sharepoint-2010.aspx. This method works in SharePoint 2013 as well as SharePoint 2010. That may make this a good time to call out one other point – this entire scenario works both for SharePoint 2010 and SharePoint 2013. In the attachment that accompanies this post, the zip file will contain a Word document with this posting, a complete solution compiled for SharePoint 2010 and Visual Studio 2010, and a complete solution compiled for SharePoint 2013 and Visual Studio 2012.

To get back on track now…the first thing you need to keep in mind is that this code should only fire if the current user is a SAML claims user, so I use code like this to make sure that is the case:

string upn = string.Empty;

//get the claim provider manager
SPClaimProviderManager cpm = SPClaimProviderManager.Local;

//get the current user so we can get to the "real" original issuer
SPClaim curUser = SPClaimProviderManager.DecodeUserIdentifierClaim(entity);

//get the original issuer for the user
SPOriginalIssuerType loginType = SPOriginalIssuers.GetIssuerType(curUser.OriginalIssuer);

//we only need to do this for SAML users, so see if that's what our user is
if ((loginType == SPOriginalIssuerType.TrustedProvider) ||
    (loginType == SPOriginalIssuerType.ClaimProvider))
{
    //run the code to get the UPN
}

Okay, now that I know I’m working with a SAML user, here’s how I’ll get their UPN from the token sent to SharePoint:

//get the request envelope with the claims information
 
string rqst = System.ServiceModel.OperationContext.Current.RequestContext.RequestMessage.ToString();
 
//create an Xml document for parsing the results and load the data
XmlDocument xDoc = new XmlDocument();
xDoc.LoadXml(rqst);
 
//create a new namespace table so we can query
XmlNamespaceManager xNS = new XmlNamespaceManager(xDoc.NameTable);
 
//add the namespaces we'll be using
//namespace URIs are compared as exact strings, so they must use the http scheme
xNS.AddNamespace("s", "http://www.w3.org/2003/05/soap-envelope");
xNS.AddNamespace("trust", "http://docs.oasis-open.org/ws-sx/ws-trust/200512");
xNS.AddNamespace("saml", "urn:oasis:names:tc:SAML:1.0:assertion");
 
//get the list of claim nodes
XmlNode xNode = xDoc.SelectSingleNode("s:Envelope/s:Body/trust:RequestSecurityToken/trust:OnBehalfOf/saml:Assertion/saml:AttributeStatement/saml:Attribute[@AttributeName = 'upn']", xNS);

//get the match and pull in the UPN
if (xNode != null)
 upn = xNode.FirstChild.InnerText;
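
If you want to check the XPath above without a running farm, here is a minimal sketch that wraps the same parsing logic in a method and feeds it a sample envelope. The class name UpnExtractorSketch and the SampleEnvelope helper are made up for illustration, and the envelope is a heavily trimmed, hypothetical one; a real request contains many more elements. Note that the namespace URIs have to match the ones in the envelope character for character, which is why they use the http scheme here.

```csharp
using System;
using System.Xml;

public static class UpnExtractorSketch
{
    // Same XPath as in the post, split across lines for readability.
    private const string UpnXPath =
        "s:Envelope/s:Body/trust:RequestSecurityToken/trust:OnBehalfOf/" +
        "saml:Assertion/saml:AttributeStatement/saml:Attribute[@AttributeName = 'upn']";

    // Returns the UPN found in the envelope, or an empty string if there is none.
    public static string ExtractUpn(string envelopeXml)
    {
        XmlDocument xDoc = new XmlDocument();
        xDoc.LoadXml(envelopeXml);

        XmlNamespaceManager xNS = new XmlNamespaceManager(xDoc.NameTable);
        xNS.AddNamespace("s", "http://www.w3.org/2003/05/soap-envelope");
        xNS.AddNamespace("trust", "http://docs.oasis-open.org/ws-sx/ws-trust/200512");
        xNS.AddNamespace("saml", "urn:oasis:names:tc:SAML:1.0:assertion");

        XmlNode xNode = xDoc.SelectSingleNode(UpnXPath, xNS);
        if (xNode == null || xNode.FirstChild == null)
            return string.Empty;

        return xNode.FirstChild.InnerText;
    }

    // Hypothetical, trimmed-down envelope used only to exercise the XPath.
    public static string SampleEnvelope(string upn)
    {
        return
            "<s:Envelope xmlns:s=\"http://www.w3.org/2003/05/soap-envelope\">" +
            "<s:Body>" +
            "<trust:RequestSecurityToken xmlns:trust=\"http://docs.oasis-open.org/ws-sx/ws-trust/200512\">" +
            "<trust:OnBehalfOf>" +
            "<saml:Assertion xmlns:saml=\"urn:oasis:names:tc:SAML:1.0:assertion\">" +
            "<saml:AttributeStatement>" +
            "<saml:Attribute AttributeName=\"upn\">" +
            "<saml:AttributeValue>" + upn + "</saml:AttributeValue>" +
            "</saml:Attribute>" +
            "</saml:AttributeStatement>" +
            "</saml:Assertion>" +
            "</trust:OnBehalfOf>" +
            "</trust:RequestSecurityToken>" +
            "</s:Body>" +
            "</s:Envelope>";
    }
}
```

Calling UpnExtractorSketch.ExtractUpn(UpnExtractorSketch.SampleEnvelope("someone@example.com")) should hand back the UPN that was placed in the sample, and an envelope without a upn attribute should give you an empty string.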

Now that I’ve got the UPN, I can go ahead and query the Graph API to get a list of groups the user belongs to; I’ll add each matching group as a Role claim for the user. I’ll show the code here and then walk through it:

//see if we got the upn
if (!string.IsNullOrEmpty(upn))
{
    //make a request now to get the list of groups for this user
    DataClasses.UsersAndGroups groups = GetUsersAndGroups(upn, AadObjectType.GroupMembership);

    //add each group we got back as a role claim
    foreach (DataClasses.Group g in groups.AllGroups.GroupList)
    {
        claims.Add(new SPClaim(ROLE_CLAIM, g.DisplayName,
            Microsoft.IdentityModel.Claims.ClaimValueTypes.String,
            SPOriginalIssuers.Format(SPOriginalIssuerType.TrustedProvider,
                SPTrustedIdentityTokenIssuerName)));
    }
}

Let’s look at the code a little more closely now. The first thing to call out is the code to retrieve the list of groups for this user. In this case I’m calling a method called GetUsersAndGroups and I’m basically saying for this UPN I want a list of all the group memberships. This is all code I wrote myself so it’s included in my provider. Within that code it uses the ServicePrincipal information I described earlier in this post to query the Graph API. In this case I’m providing the Url to the specific user and adding the memberOf attribute; the resulting Url looks something like this: https://graph.windows.net/dreamswirls.onmicrosoft.com/users/speschka@dreamswirls.com/memberOf?api-version=2013-04-05 (the “api-version” is an additional query string parameter required for all calls into Graph).

That leads into the second point, which is that I’m just using the REST endpoint directly to retrieve all my data from AAD. I also have an access token that I need to include in my request headers. I have a class-level variable in which I store that, so I start out by seeing if it’s been retrieved yet and if not, I go get it so I don’t have to make that additional web call each time. 

I also have code in the method that GetUsersAndGroups calls that will empty out that class-level variable for the access token if I get an error that suggests the access token is expired. After GetUsersAndGroups calls that method, it checks whether the access token is empty; if so, it does a one-time retry to get the data again. When the code executes again it will go out and request a new access token and then get back to work. It’s not foolproof of course, but will work adequately for most cases.
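
To make the caching and one-time retry a bit more concrete, here is a minimal sketch of the pattern. It is not the code from the attached solution: the class name GraphClientSketch and the two delegates are hypothetical stand-ins for the real web calls (one requests a token, the other sends a request with it), and I am assuming an expired token surfaces as an UnauthorizedAccessException, which may not match how your own code detects it.

```csharp
using System;

public class GraphClientSketch
{
    // Hypothetical stand-in for the web call that requests a new access token.
    private readonly Func<string> requestToken;

    // Hypothetical stand-in for the Graph call: (url, token) => response body.
    // Assumed to throw UnauthorizedAccessException when the token has expired.
    private readonly Func<string, string, string> send;

    // Class-level cache so we don't request a token on every call.
    private string accessToken;

    public GraphClientSketch(Func<string> requestToken, Func<string, string, string> send)
    {
        this.requestToken = requestToken;
        this.send = send;
    }

    // Makes one attempt; clears the cached token if it looks expired.
    private string TryGet(string url)
    {
        if (string.IsNullOrEmpty(accessToken))
            accessToken = requestToken();

        try
        {
            return send(url, accessToken);
        }
        catch (UnauthorizedAccessException)
        {
            accessToken = null;
            return null;
        }
    }

    // Makes the call and, if the token was emptied out, retries exactly once.
    public string Get(string url)
    {
        string result = TryGet(url);

        if (result == null && string.IsNullOrEmpty(accessToken))
            result = TryGet(url);

        return result;
    }
}
```

The retry is deliberately limited to one attempt; if the second call also fails, the caller gets null back rather than looping forever against a directory that is refusing the credentials.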

When I do get data back, I store it in a custom class that I created to store results for both Users and Groups called DataClasses.UsersAndGroups. I enumerate through all the groups and add them as role claims to the user. Here’s the final thing to point out in this chunk of code – note how I am using the new SPClaim method to create the claim I’m augmenting, instead of the CreateClaim method. The reason for that is that this provider must be the default because it is issuing identity claims. The way to do that is with the new SPClaim method, as I’ve documented previously here: https://blogs.technet.com/b/speschka/archive/2010/05/25/replacing-the-out-of-box-name-resolution-in-sharepoint-2010-part-2.aspx. When users use the people picker to add users and groups to SharePoint groups, my custom claim provider is adding the claims in this way too, so when I add claims via augmentation I need to add them the same way. If I didn’t, a role claim for “foo” added with CreateClaim would not have the same claim value as one added with new SPClaim. As a result, someone who got the “foo” role claim added via augmentation would NOT have access to content.

Finally, remember one of the things I pointed out above, that you don’t get all claims back, just the fixed schema that is defined by AAD. In this case I decided that all I really need for my farm is going to be the identity claim and role claims. I got the identity claim at authentication time and I got the role claims in the code above. If I had some other set of claims though that I really needed for my farm, then I would also grab them in my FillClaimsForEntity method and add them there. The number of claims you use though also has an impact on the Fill… methods that I will describe below.

Okay, there was some additional explanation in there that is relevant to all of the Graph API queries I’m using, so the explanation for the FillClaimsForEntity was a little longer than the rest will be. Now I’ve mentioned that I am of course the default claims provider in this scenario, and that means I’m going to also need to implement both FillResolve methods and the FillSearch method. I’ve also implemented other methods like FillHierarchy, FillClaimTypes, etc., but there’s nothing specific to using AAD about them so I won’t cover those here. You can always look in the source code included with this post if you are curious about what’s in them.

Let’s talk about FillSearch next. Now in this case, some user is going to type in a string and expect to find every claim that has a value that starts with what they typed in; using the example I gave above, they type in “Ste” and they would be shown identities like “Steve”, “Stephen”, “Stephanie”, etc. Of course though, there are the problems I described above – we can’t issue “OR” queries and there isn’t support for wildcard queries…so what do we do? Well first of all, as I mentioned earlier, I decided to limit the claims I am going to use in this farm to two: identity claim and role claim. The fact that I can’t issue an OR query means that whenever FillSearch is called, I’m going to have to fire off two queries to the Graph API – one to look for a matching identity and one for a matching group.

This is also why it’s impactful if you need more claims – the more you have, the more queries you have to execute each time someone types a new letter in the people picker. Make sure you understand the behavior in SharePoint 2013 – after you type the 3rd character or more and pause, SharePoint will search for names. Each character you type after that and pause, SharePoint will do another search. You can imagine the number of queries you could end up firing off here, especially if you have multiple claim types to query against.

I also made another design choice here, in that I decided to make these queries look at displayName for matches. In a perfect world I could look at first name, last name, displayName, emailAddress, UPN, etc., but last time I looked things aren’t perfect, so I chose what I think is the most common use case. A user can still type in “Ste”, and I can hopefully bring back “Steve Peschka”, “Stephen Jackson”, “Stephanie Phillips”, etc.

Now the next problem is the wildcard issue – how can I find all those users I just listed if a user only types in “Ste”? Well, as I hinted above, I ended up coding my own solution for this. I will tell you now that it is not perfect, but, you should know my feelings on perfection by now. So instead I wrote a function that I called GetWildcardQuery. It takes a field name and query criteria, and spits out a string that can be used with the $filter operator in the Graph API. So how does it work? Well it’s what I call a classic Steve string munging routine. It takes the input string and figures out what the logical “end” of that string is and adds one character. 

This is of course easier explained with some examples. So suppose a user typed in “Ste”; the logical end of that string would be “Stz”, so when I add one it returns “Stf” (i.e. I want everything between “Ste” and “Stf”). That allows me to create a query that says greater than or equal to “Ste”, and less than or equal to “Stf”. The most obvious problem here is that anything that starts with “Stf” would also be a match, even though it should not be. However, you can’t just say less than or equal to “Stz”, because then it would not match on “Stza”, “Stzaa”, and so forth and so on. So the lesser of all evils is “Stf”. Other examples (that you will find in the source code as my test cases) include “sz” should return “t”, “srz” should return “ss”, “sar” should return “sas”, “frzz” should return “fs”, “frizz” should return “frj”. It’s not beyond my imagination that someone will say “hey, that’s wrong, here’s why”, but at least you know how I developed this logic and why.
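
Here is a minimal sketch of that string munging, reconstructed from the examples above rather than copied from the attached source, so treat it as an approximation. The class name WildcardFilterSketch and the helper GetUpperBound are my own names for illustration; the exact filter text (OData-style ge / le operators with single-quoted values) is an assumption about what the $filter operator expects, and dropping the upper bound when the input is nothing but “z” characters is a choice I made for the sketch.

```csharp
using System;

public static class WildcardFilterSketch
{
    // Works out the upper bound for a "starts with" search:
    // drop any trailing z's, then bump the last remaining character by one.
    public static string GetUpperBound(string criteria)
    {
        if (string.IsNullOrEmpty(criteria))
            throw new ArgumentException("criteria must not be empty", "criteria");

        // a trailing 'z' can't be bumped within a-z, so remove it first
        string trimmed = criteria.TrimEnd('z', 'Z');

        // assumption: input made up only of z's has no sensible upper bound
        if (trimmed.Length == 0)
            return null;

        char last = trimmed[trimmed.Length - 1];
        return trimmed.Substring(0, trimmed.Length - 1) + (char)(last + 1);
    }

    // Builds a filter of the form: field ge 'Ste' and field le 'Stf'
    public static string GetWildcardQuery(string fieldName, string criteria)
    {
        string upper = GetUpperBound(criteria);

        // double up single quotes so a name like O'Brien doesn't break the filter
        string filter = fieldName + " ge '" + criteria.Replace("'", "''") + "'";

        if (upper != null)
            filter += " and " + fieldName + " le '" + upper.Replace("'", "''") + "'";

        return filter;
    }
}
```

With this sketch, GetWildcardQuery("displayName", "Ste") produces displayName ge 'Ste' and displayName le 'Stf', and GetUpperBound reproduces the test cases listed above (“sz” gives “t”, “frizz” gives “frj”, and so on).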

So now that you understand how I chose to deal with the a) lack of an OR query and b) lack of a wildcard query, let’s get back to the SharePoint stuff. It’s pretty simple really – in my FillSearch method I just execute two queries, and if I find matches then I add them to the list. The relevant code here looks like this:

//query for users
DataClasses.UsersAndGroups people = QueryDirectory(searchPattern, AadObjectType.Users);

if (people.AllUsers.UserList.Count > 0)
{
    //add results to picker
}

//query for groups
DataClasses.UsersAndGroups groups = QueryDirectory(searchPattern, AadObjectType.Groups);

if (groups.AllGroups.GroupList.Count > 0)
{
    //add results to picker
}

The only thing worth calling out here is that QueryDirectory is just a little stub for querying based on displayName. It uses that field name with the wildcard filter method and then queries the directory as shown previously. Essentially it is just two lines of code that look like this:

string criteria = GetWildcardQuery("displayName", filter);
results = GetUsersAndGroups(criteria, objectType);
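
For reference, here is a rough sketch of how a filter string like that might end up in a request Url. This is an assumption about the Url shape rather than an excerpt from the provider: the class GraphUrlSketch and its method are hypothetical, the api-version value is the one used earlier in this post, and the filter is escaped because it contains spaces and quotes.

```csharp
using System;

public static class GraphUrlSketch
{
    // Builds a Url such as
    // https://graph.windows.net/{tenant}/{collection}?api-version=2013-04-05&$filter={escaped filter}
    // where collection is assumed to be "users" or "groups".
    public static string BuildQueryUrl(string tenant, string collection, string filter)
    {
        return "https://graph.windows.net/" + tenant + "/" + collection
            + "?api-version=2013-04-05"
            + "&$filter=" + Uri.EscapeDataString(filter);
    }
}
```

A call such as BuildQueryUrl("dreamswirls.onmicrosoft.com", "users", criteria) would then be the Url that gets sent along with the access token in the request headers.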

Here’s an example then of what the picker looks like using my wildcard logic:

There are more entries than I can show in this screenshot but you get the idea – I typed in “fre” and it found a user called Freddy, a group called French Club, and a group called Frequent Flyers. So, not perfect, but not bad either.

Now let’s talk about the two FillResolve methods. They are somewhat similar to FillSearch but not exactly. The first FillResolve method I’ll talk about is the one that’s called when you use the Search dialog and includes the SPClaim as a parameter. Now what’s nice about this FillResolve method is that by looking at the SPClaim parameter, I can tell whether this is an identity claim or a role claim, so I only have to send off one query to the Graph API. Other than that it looks very much like search; the main difference though is that I know this should only ever return one entity, since it represents a selection made from the picker, so I always just pick the first match if there is one (which there always should be):

if (resolveInput.ClaimType == USER_CLAIM)
{
    DataClasses.UsersAndGroups users = GetUsersAndGroups(resolveInput.Value, AadObjectType.User);

    if (users.AllUsers.UserList.Count > 0)
    {
        //add a PickerEntity to the list
    }
}
else
{
    DataClasses.UsersAndGroups groups = QueryDirectory(resolveInput.Value, AadObjectType.Groups);

    if (groups.AllGroups.GroupList.Count > 0)
    {
        //add a PickerEntity to the list
    }
}

For the last FillResolve it’s really just like search all over again…someone types in some assortment of characters and clicks the Resolve button. It could be a user, it could be a group, who knows. So I just treat it exactly like search:

//query for users
DataClasses.UsersAndGroups people = QueryDirectory(resolveInput, AadObjectType.Users);
 
if (people.AllUsers.UserList.Count > 0)
{
 //add results to list
}
 
//query for groups
DataClasses.UsersAndGroups groups = QueryDirectory(resolveInput, AadObjectType.Groups);

if (groups.AllGroups.GroupList.Count > 0)
{
 //add results to list
}

So that really covers the gist of the custom claims provider implementation. There are a few other methods implemented and an event receiver to install the provider, but those are all just standard custom claims provider code snippets, nothing specific to AAD so I didn’t bother adding them to this already very long post. Once it’s been compiled and deployed I just went back and modified the SPTrustedIdentityTokenIssuer to make this the default claims provider, as I’ve previously described here: https://blogs.technet.com/b/speschka/archive/2010/04/28/how-to-override-the-default-name-resolution-and-claims-provider-in-sharepoint-2010.aspx. After that you’re good to go. 

I tested this provider in what I believe are all the important scenarios – selecting a site collection administrator, using both an identity claim and role claim in a web app policy, using an identity and role claim in SharePoint groups, and with the permission checker dialog. All have passed with flying colors, so this is now ready for you to try out if you wish – enjoy!

AAD_Provider_Code.zip