The Azure Custom Claim Provider for SharePoint Project Part 2

In Part 1 of this series, I briefly outlined the goals for this project, which at a high level is to use Windows Azure table storage as a data store for a SharePoint custom claims provider. The claims provider is going to use the CASI Kit to retrieve the data it needs from Windows Azure in order to provide people picker (i.e. address book) and type in control name resolution functionality. 

In Part 3 I create all of the components used in the SharePoint farm. That includes a custom component based on the CASI Kit that manages all the communication between SharePoint and Azure. There is a custom web part that captures information about new users and gets it pushed into an Azure queue. Finally, there is a custom claims provider that communicates with Azure table storage through a WCF application – via the CASI Kit custom component – to enable the type in control and people picker functionality.

Now let’s expand on this scenario a little more.

This type of solution plugs in pretty nicely to a fairly common scenario, which is when you want a minimally managed extranet. So for example, you want your partners or customers to be able to hit a website of yours, request an account, and then be able to automatically “provision” that account…where “provision” can mean a lot of different things to different people. We’re going to use that as the baseline scenario here, but of course, let our public cloud resources do some of the work for us.

Let’s start by looking at the cloud components we’re going to develop ourselves:

  • A table to keep track of all the claim types we’re going to support
  • A table to keep track of all the unique claim values for the people picker
  • A queue where we can send data that should be added to the list of unique claim values
  • Some data access classes to read and write data from Azure tables, and to write data to the queue
  • An Azure worker role that is going to read data out of the queue and populate the unique claim values table
  • A WCF application that will be the endpoint through which the SharePoint farm communicates to get the list of claim types, search for claims, resolve a claim, and add data to the queue

Now we’ll look at each one in a little more detail.

Claim Types Table

The claim types table is where we’re going to store all the claim types that our custom claims provider can use. In this scenario we’re only going to use one claim type, which is the identity claim – that will be email address in this case. You could use other claims, but to simplify this scenario we’re just going to use the one. In Azure table storage you add instances of classes to a table, so we need to create a class to describe the claim types. Again, note that you can add instances of different class types to the same table in Azure, but to keep things straightforward we’re not going to do that here. The class this table is going to use looks like this:

using Microsoft.WindowsAzure.StorageClient;

namespace AzureClaimsData
{
    public class ClaimType : TableServiceEntity
    {
        public string ClaimTypeName { get; set; }
        public string FriendlyName { get; set; }

        //parameterless constructor, needed when entities are read back from table storage
        public ClaimType() { }

        public ClaimType(string ClaimTypeName, string FriendlyName)
        {
            //claim types can contain forward slashes, so encode before using as a key
            this.PartitionKey = System.Web.HttpUtility.UrlEncode(ClaimTypeName);
            this.RowKey = FriendlyName;

            this.ClaimTypeName = ClaimTypeName;
            this.FriendlyName = FriendlyName;
        }
    }
}

I’m not going to cover all the basics of working with Azure table storage because there are lots of resources out there that have already done that. So if you want more details on what a PartitionKey or RowKey is and how you use them, your friendly local Bing search engine can help you out. The one thing that is worth pointing out here is that I am URL encoding the value I’m storing for the PartitionKey. Why is that? Well in this case, my PartitionKey is the claim type, which can take a number of formats: urn:foo:blah, https://www.foo.com/blah, etc. In the case of a claim type that includes forward slashes, Azure cannot store the PartitionKey with those values. So instead we encode them out into a friendly format that Azure likes. As I stated above, in our case we’re using the email claim so the claim type for it is http://schemas.xmlsoap.org/ws/2005/05/identity/claims/emailaddress.
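To make the encoding step concrete, here is a minimal sketch of turning a claim type into a key-safe value and back again. The class and method names (ClaimKeyEncoding, ToPartitionKey, FromPartitionKey) are hypothetical and not part of the attached project, and the sketch uses System.Net.WebUtility rather than the System.Web.HttpUtility call shown above so that it has no dependency on System.Web.

```csharp
using System;
using System.Net;

// Hypothetical helper, for illustration only.
public static class ClaimKeyEncoding
{
    // Encode a claim type so it contains no forward slashes
    // and can be used as a PartitionKey.
    public static string ToPartitionKey(string claimType)
    {
        if (claimType == null) throw new ArgumentNullException("claimType");
        return WebUtility.UrlEncode(claimType);
    }

    // Reverse the encoding to recover the original claim type.
    public static string FromPartitionKey(string partitionKey)
    {
        if (partitionKey == null) throw new ArgumentNullException("partitionKey");
        return WebUtility.UrlDecode(partitionKey);
    }
}
```

Whichever encoder you pick, use the same one when writing entities and when building queries. Different encoders can differ in details of their output, and a PartitionKey comparison is an exact string match.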

Unique Claim Values Table

The Unique Claim Values table is where all the unique claim values we get are stored. In our case, we are only storing one claim type – the identity claim – so by definition all claim values are going to be unique. However I took this approach for extensibility reasons. For example, suppose down the road you wanted to start using Role claims with this solution. Well it wouldn’t make sense to store the Role claim “Employee” or “Customer” or whatever a thousand different times; for the people picker, it just needs to know the value exists so it can make it available in the picker. After that, whoever has it, has it – we just need to let it be used when granting rights in a site. So, based on that, here’s what the class looks like that will store the unique claim values:

using Microsoft.WindowsAzure.StorageClient;

namespace AzureClaimsData
{
    public class UniqueClaimValue : TableServiceEntity
    {
        public string ClaimType { get; set; }
        public string ClaimValue { get; set; }
        public string DisplayName { get; set; }

        //parameterless constructor, needed when entities are read back from table storage
        public UniqueClaimValue() { }

        public UniqueClaimValue(string ClaimType, string ClaimValue, string DisplayName)
        {
            //the claim type contains forward slashes, so encode before using as a key
            this.PartitionKey = System.Web.HttpUtility.UrlEncode(ClaimType);
            this.RowKey = ClaimValue;

            this.ClaimType = ClaimType;
            this.ClaimValue = ClaimValue;
            this.DisplayName = DisplayName;
        }
    }
}

There are a couple of things worth pointing out here. First, like the previous class, the PartitionKey uses a UrlEncoded value because it will be the claim type, which will have the forward slashes in it. Second, as I frequently see when using Azure table storage, the data is denormalized because there isn’t a JOIN concept like there is in SQL. Technically you can do a JOIN in LINQ, but so many things that are in LINQ have been disallowed when working with Azure data (or perform so badly) that I find it easier to just denormalize. If you folks have other thoughts on this throw them in the comments – I’d be curious to hear what you think. So in our case the display name will be “Email”, because that’s the claim type we’re storing in this class.

The Claims Queue

The claims queue is pretty straightforward – we’re going to store requests for “new users” in that queue, and then an Azure worker process will read them off the queue and move the data into the unique claim values table. The primary reason for doing this is that writes to Azure table storage can sometimes have noticeable latency, but sticking an item in a queue is pretty fast. Taking this approach means we can minimize the impact on our SharePoint web site.

Data Access Classes

One of the rather mundane aspects of working with Azure table storage and queues is you always have to write your own data access class. For table storage, you have to write a data context class and a data source class. I’m not going to spend a lot of time on that because you can read reams about it on the web, plus I’m also attaching my source code for the Azure project to this posting so you can look at it all you want.

There is one important thing I would point out here though, which is just a personal style choice. I like to break all my Azure data access code out into a separate project. That way I can compile it into its own assembly, and I can use it even from non-Azure projects. For example, in the sample code I’m uploading you will find a Windows form application that I used to test the different parts of the Azure back end. It knows nothing about Azure, other than it has a reference to some Azure assemblies and to my data access assembly. I can use it in that project and just as easily in my WCF project that I use to front-end the data access for SharePoint.

Here are some of the particulars about the data access classes though:

  • I have a separate “container” class for the data I’m going to return – the claim types and the unique claim values. What I mean by a container class is that I have a simple class with a public property of type List<>. I return this class when data is requested, rather than just a List<> of results. The reason I do that is because when I return a List<> from Azure, the client only gets the last item in the list (when you do the same thing from a locally hosted WCF it works just fine). So to work around this issue I return claim types in a class that looks like this:

using System.Collections.Generic;

namespace AzureClaimsData
{
    public class ClaimTypeCollection
    {
        public List<ClaimType> ClaimTypes { get; set; }

        public ClaimTypeCollection()
        {
            ClaimTypes = new List<ClaimType>();
        }
    }
}

And the unique claim values return class looks like this:

using System.Collections.Generic;

namespace AzureClaimsData
{
    public class UniqueClaimValueCollection
    {
        public List<UniqueClaimValue> UniqueClaimValues { get; set; }

        public UniqueClaimValueCollection()
        {
            UniqueClaimValues = new List<UniqueClaimValue>();
        }
    }
}

  • The data context classes are pretty straightforward – nothing really brilliant here (as my friend Vesa would say). The one for claim types looks like this:

using System.Linq;
using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.StorageClient;

namespace AzureClaimsData
{
    public class ClaimTypeDataContext : TableServiceContext
    {
        public static string CLAIM_TYPES_TABLE = "ClaimTypes";

        public ClaimTypeDataContext(string baseAddress, StorageCredentials credentials)
            : base(baseAddress, credentials)
        { }

        public IQueryable<ClaimType> ClaimTypes
        {
            get
            {
                //this is where you configure the name of the table in Azure Table Storage
                //that you are going to be working with
                return this.CreateQuery<ClaimType>(CLAIM_TYPES_TABLE);
            }
        }
    }
}

  • In the data source classes I do take a slightly different approach to making the connection to Azure. Most of the examples I see on the web want to read the credentials out with some reg settings class (that’s not the exact name, I just don’t remember what it is). The problem with that approach here is that I have no Azure-specific context because I want my data class to work outside of Azure. So instead I just create a Setting in my project properties and in that I include the account name and key that is needed to connect to my Azure account. So both of my data source classes have code that looks like this to create that connection to Azure storage:

        private static CloudStorageAccount storageAccount;
        private ClaimTypeDataContext context;

        //static constructor so it only fires once
        static ClaimTypesDataSource()
        {
            try
            {
                //get storage account connection info
                string storeCon = Properties.Settings.Default.StorageAccount;

                //extract account info; this relies on the account name being the
                //second setting in the string and the account key being the third
                string[] conProps = storeCon.Split(";".ToCharArray());

                string accountName = conProps[1].Substring(conProps[1].IndexOf("=") + 1);
                string accountKey = conProps[2].Substring(conProps[2].IndexOf("=") + 1);

                storageAccount = new CloudStorageAccount(new StorageCredentialsAccountAndKey(accountName, accountKey), true);
            }
            catch (Exception ex)
            {
                Trace.WriteLine("Error initializing ClaimTypesDataSource class: " + ex.Message);
                throw;
            }
        }

        //new constructor
        public ClaimTypesDataSource()
        {
            try
            {
                this.context = new ClaimTypeDataContext(storageAccount.TableEndpoint.AbsoluteUri, storageAccount.Credentials);
                this.context.RetryPolicy = RetryPolicies.Retry(3, TimeSpan.FromSeconds(3));
            }
            catch (Exception ex)
            {
                Trace.WriteLine("Error constructing ClaimTypesDataSource class: " + ex.Message);
                throw;
            }
        }
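
The static constructor above picks the account name and key out of the setting by position, so it depends on the order of the parts in the string. As a hedged alternative, here is a small sketch that looks the values up by name instead. The class name StorageConnectionString is hypothetical, and the setting names AccountName and AccountKey are an assumption about what the StorageAccount setting contains (the usual "DefaultEndpointsProtocol=...;AccountName=...;AccountKey=..." layout).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, for illustration only.
public static class StorageConnectionString
{
    // Split "name=value;name=value" pairs into a case-insensitive dictionary.
    public static IDictionary<string, string> Parse(string connectionString)
    {
        if (connectionString == null) throw new ArgumentNullException("connectionString");

        var result = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        string[] parts = connectionString.Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries);

        foreach (string part in parts)
        {
            // Split on the FIRST '=' only: account keys are base64 text
            // and can end with '=' padding characters.
            int eq = part.IndexOf('=');
            if (eq <= 0) continue; // skip segments with no name

            string name = part.Substring(0, eq).Trim();
            string value = part.Substring(eq + 1).Trim();
            result[name] = value;
        }

        return result;
    }
}
```

With something like this, the account name would come from Parse(storeCon)["AccountName"] and the key from Parse(storeCon)["AccountKey"], regardless of where they appear in the string.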

 

  • Each data source class includes a method to add a new item: one for a claim type, and one for a unique claim value. It’s very simple code that looks like this:

        //add a new item
        public bool AddClaimType(ClaimType newItem)
        {
            bool ret = true;

            try
            {
                this.context.AddObject(ClaimTypeDataContext.CLAIM_TYPES_TABLE, newItem);
                this.context.SaveChanges();
            }
            catch (Exception ex)
            {
                Trace.WriteLine("Error adding new claim type: " + ex.Message);
                ret = false;
            }

            return ret;
        }

One important difference to note in the Add method for the unique claim values data source is that it doesn’t throw an error or return false when there is an exception saving changes. That’s because I fully expect that people will, mistakenly or otherwise, try to sign up multiple times. Once we have a record of their email claim though, any subsequent attempt to add it will throw an exception, because the PartitionKey and RowKey combination already exists. Since Azure doesn’t provide us the luxury of strongly typed exceptions, and since I don’t want the trace log filling up with pointless goo, I don’t worry about it when that situation occurs.
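
The "duplicates are fine" behavior can be sketched without Azure at all. The example below is an illustration under stated assumptions, not the code from the attached project: InMemoryClaimValueTable is a hypothetical in-memory stand-in for the table, keyed on the PartitionKey and RowKey pair, and ClaimValueWriter.AddIfMissing is a hypothetical name for the add method. Unlike the real method described above, the sketch can catch a specific exception type because the stand-in throws one.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory stand-in for the unique claim values table.
public class InMemoryClaimValueTable
{
    private readonly Dictionary<Tuple<string, string>, string> rows =
        new Dictionary<Tuple<string, string>, string>();

    public int Count { get { return rows.Count; } }

    // Dictionary.Add throws ArgumentException when the key already exists,
    // which stands in for the error a duplicate insert produces.
    public void Insert(string partitionKey, string rowKey, string displayName)
    {
        rows.Add(Tuple.Create(partitionKey, rowKey), displayName);
    }
}

public static class ClaimValueWriter
{
    // Returns true whether the value was newly added or was already present.
    public static bool AddIfMissing(InMemoryClaimValueTable table,
        string partitionKey, string rowKey, string displayName)
    {
        try
        {
            table.Insert(partitionKey, rowKey, displayName);
        }
        catch (ArgumentException)
        {
            // Repeat sign-up: the value is already stored, so there is
            // nothing to do and nothing worth logging.
        }

        return true;
    }
}
```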

  • Searching for claims is a little more interesting, only to the extent that it exposes again some things that you can do in LINQ, but not in LINQ with Azure. I’ll add the code here and then explain some of the choices I made:

        public UniqueClaimValueCollection SearchClaimValues(string ClaimType, string Criteria, int MaxResults)
        {
            UniqueClaimValueCollection results = new UniqueClaimValueCollection();
            UniqueClaimValueCollection returnResults = new UniqueClaimValueCollection();

            //cache lifetime, in minutes
            const int CACHE_TTL = 10;

            try
            {
                //look for the current set of claim values in cache
                if (HttpRuntime.Cache[ClaimType] != null)
                    results = (UniqueClaimValueCollection)HttpRuntime.Cache[ClaimType];
                else
                {
                    //not in cache so query Azure

                    //Azure doesn't support starts with, so pull all the data for the claim type
                    var values = from UniqueClaimValue cv in this.context.UniqueClaimValues
                                  where cv.PartitionKey == System.Web.HttpUtility.UrlEncode(ClaimType)
                                  select cv;

                    //you have to assign it first to actually execute the query and return the results
                    results.UniqueClaimValues = values.ToList();

                    //store it in cache
                    HttpRuntime.Cache.Add(ClaimType, results, null,
                        DateTime.Now.AddMinutes(CACHE_TTL), TimeSpan.Zero,
                        System.Web.Caching.CacheItemPriority.Normal,
                        null);
                }

                //now query based on criteria, for the max results
                returnResults.UniqueClaimValues = (from UniqueClaimValue cv in results.UniqueClaimValues
                           where cv.ClaimValue.StartsWith(Criteria)
                           select cv).Take(MaxResults).ToList();
            }
            catch (Exception ex)
            {
                Trace.WriteLine("Error searching claim values: " + ex.Message);
            }

            return returnResults;
        }

The first thing to note is that you cannot use StartsWith against Azure data. So that means you need to retrieve all the data locally and then use your StartsWith expression. Since retrieving all that data can be an expensive operation (it’s effectively a table scan to retrieve all rows), I do that once and then cache the data. That way I only have to do a “real” recall every 10 minutes. The downside is that if users are added during that time then we won’t be able to see them in the people picker until the cache expires and we retrieve all the data again. Make sure you remember that when you are looking at the results.

Once I actually have my data set, I can do the StartsWith, and I can also limit the number of records I return. By default SharePoint won’t display more than 200 records in the people picker so that will be the maximum number I plan to ask for when this method is called. But I’m including it as a parameter here so you can do whatever you want.
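
The load-once, cache, then filter locally pattern described above can be reduced to a few lines that run without Azure or HttpRuntime. This is a sketch under stated assumptions: CachedPrefixSearch and its members are hypothetical names, the loadAll delegate stands in for the full Azure query, and a plain field replaces the ASP.NET cache. It also compares with StringComparison.OrdinalIgnoreCase, which is a deliberate difference from the StartsWith(Criteria) call in the method above, and it is not thread safe.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch of "pull everything once, cache it, filter locally".
public class CachedPrefixSearch
{
    private readonly Func<List<string>> loadAll; // stands in for the Azure query
    private readonly TimeSpan ttl;
    private List<string> cached;
    private DateTime expiresUtc;

    // Number of times the expensive load has run; useful for checking the cache.
    public int LoadCount { get; private set; }

    public CachedPrefixSearch(Func<List<string>> loadAll, TimeSpan ttl)
    {
        if (loadAll == null) throw new ArgumentNullException("loadAll");
        this.loadAll = loadAll;
        this.ttl = ttl;
    }

    public List<string> Search(string criteria, int maxResults)
    {
        DateTime now = DateTime.UtcNow;

        // Reload only when nothing is cached yet or the cached copy has expired.
        if (cached == null || now >= expiresUtc)
        {
            cached = loadAll();
            expiresUtc = now + ttl;
            LoadCount++;
        }

        // The prefix match and the result cap both happen in memory.
        return cached
            .Where(v => v.StartsWith(criteria, StringComparison.OrdinalIgnoreCase))
            .Take(maxResults)
            .ToList();
    }
}
```

A caller would construct it once with a ten minute TimeSpan and then call Search("al", 200) for each people picker request; only the first call inside each window pays for the full load.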

The Queue Access Class

Honestly there’s nothing super interesting here. Just some basic methods to add, read and delete messages from the queue.

Azure Worker Role

The worker role is also pretty nondescript. It wakes up every 10 seconds and looks to see if there are any new messages in the queue. It does this by calling the queue access class. If it finds any items in there, it splits the content out (which is semicolon delimited) into its constituent parts, creates a new instance of the UniqueClaimValue class, and then tries adding that instance to the unique claim values table. Once it does that it deletes the message from the queue and moves to the next item, until it reaches the maximum number of messages that can be read at one time (32), or there are no more messages remaining.
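
Here is a rough sketch of one pass of that loop, using an in-memory Queue<string> in place of the Azure queue so it can run anywhere. The type names (NewUserMessage, QueueMessageParser) are hypothetical, and the field order inside a message (claim type, then claim value, then display name) is an assumption; the post only says the content is semicolon delimited. The real worker role would also write each parsed item to the unique claim values table and delete the message from the queue.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical shape of one "new user" request taken from the queue.
public class NewUserMessage
{
    public string ClaimType { get; set; }
    public string ClaimValue { get; set; }
    public string DisplayName { get; set; }
}

public static class QueueMessageParser
{
    // Most messages handled in a single pass.
    public const int MaxBatchSize = 32;

    // Assumed layout: claimType;claimValue;displayName
    public static bool TryParse(string body, out NewUserMessage message)
    {
        message = null;
        if (string.IsNullOrEmpty(body)) return false;

        string[] parts = body.Split(';');
        if (parts.Length != 3) return false;

        message = new NewUserMessage
        {
            ClaimType = parts[0],
            ClaimValue = parts[1],
            DisplayName = parts[2]
        };
        return true;
    }

    // Read until the queue is empty or the batch limit is reached.
    // Messages that do not parse are removed and skipped.
    public static List<NewUserMessage> DrainBatch(Queue<string> queue)
    {
        var processed = new List<NewUserMessage>();
        int read = 0;

        while (queue.Count > 0 && read < MaxBatchSize)
        {
            string body = queue.Dequeue();
            read++;

            NewUserMessage message;
            if (TryParse(body, out message))
                processed.Add(message);
        }

        return processed;
    }
}
```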

WCF Application

As described earlier, the WCF application is what the SharePoint code talks to in order to add items to the queue, get the list of claim types, and search for or resolve a claim value. Like a good trusted application, it has a trust established between it and the SharePoint farm that is calling it. This prevents any kind of token spoofing when asking for the data. At this point there isn’t any finer grained security implemented in the WCF itself. For completeness, the WCF was tested first in a local web server, and then moved up to Azure where it was tested again to confirm that everything works.

So that’s the basics of the Azure components of this solution. Hopefully this background explains what all the moving parts are and how they’re used. In the next part I’ll discuss the SharePoint custom claims provider and how we hook all of these pieces together for our “turnkey” extranet solution. The files attached to this posting contain all of the source code for the data access class, the test project, the Azure project, the worker role and WCF projects. It also contains a copy of this posting in a Word document, so you can actually make out my intent for this content before the rendering on this site butchered it.

Azure.zip