Okay, I'm going to start out by saying I don't know how many parts there will be to this little series, but they will come as content worth talking about shows up. Friends…in today's post I'm just going to cover some basic design principles about SharePoint 2013 that I want to make sure stay top of mind when you are thinking through your logical architecture for your SharePoint farms.
To begin with, I'm going to start with something called the SharePoint PLA. This is a concept that was developed internally at Microsoft and has made or will be making its way out into the public. Well, actually I know that Gokhan talked about it in some detail at the SharePoint Conference in 2012 (see SPC192 if you have access to that content; sorry, don't ask me where to find it if you don't, because I don't know). The PLA stands for Product Line Architecture (or, ironically, Pretty Lousy Acronym – what a co-winky-dink). I'm not going to really do a deep dive into the specifics of it because it would take forever, but there are some basic tenets underlying it that we focus on – 1 farm, 1 web application, 1 zone. Think about that for a minute and how it relates to previous versions of SharePoint. One farm has always been my personal preference anyways, until business needs dictate otherwise. One web app is definitely a change though, if for no other reason than that we've told people for many years that you should have separate web apps for things like the My Site host. One zone is honestly more of a thing that I'm throwing in there, and I'll explain more about why I say that in a bit.
So first of all, WHY do we recommend this approach? There are probably many reasons, but I'll cite the two most important ones here: 1) this is the basic design of Office 365, and as such, it's a design that gets the most work and scrutiny by our test teams. More test coverage means fewer bugs and a greater likelihood of y'all being happy campers with your SharePoint implementation. 2) It's fundamentally less complicated to design and implement, which overall should lead to simplified operations and management and, longer term, a lower total cost of ownership (TCO).
Okay, so HOW do we go about doing this? Well we made significant investments in the host header site collections in SharePoint 2013 precisely to support this scenario. (P.S. Please don't toss rocks at me for calling it "host header site collections" vs. "host named site collections" vs. whatever else; I call it what I call it, I'm sure you can figure out what I mean). As I mentioned above, this is the basic environment that is used with Office 365, which supports millions of customers, so we did as much as we could to ensure that they are an industrial strength solution. The implementation of it is pretty straightforward really, as you're building your SharePoint farm. You would typically do something like this:
- Create a new web application; you can do this still in Central Admin, just leave the host header field blank when you create it.
- Create your new site collections using PowerShell – this is required in order to make them host header site collections.
- Remove the path-based "/sites" inclusion from the web app.
- Create a root site collection; you can do this in Central Admin as well. It will be a path-based site, but you should set up security on it such that you do NOT give any users access to this site collection.
- Create managed paths for the host header site collections
- Turn on self service site creation on the web app (so users can create My Sites)
Some of these details are pretty straightforward; others maybe not so much. So let's carry this forward further with an example – suppose I want to create three site collections like this:
- https://collab.contoso.com
- https://portal.contoso.com
- https://my.contoso.com
First note that I DID make each of these sites SSL. As I've mentioned many times in many places, in SharePoint 2013 you really always want to use SSL in production because of the extensive use of OAuth. What that means is that I need to get a wildcard SSL certificate for *.contoso.com and add it to my single web application in IIS and it will work for all of my site collections. Now, in addition to these sites, I also want to have the following inclusions available (remember by the way that an inclusion for ANY host header site collection is available to ALL of them):
- /sites (for creating new collab site collections)
- /personal (for creating personal sites)
So assume you've done the first aspect of this, which is to create the new web app in central admin. Now you can use PowerShell to create your host header site collections like so:
New-SPSite -Url https://collab.contoso.com -OwnerAlias contoso\speschka -HostHeaderWebApplication https://hh.contoso.com -Name "Collab" -Template "STS#0" -OwnerEmail firstname.lastname@example.org
New-SPSite -Url https://portal.contoso.com -OwnerAlias contoso\speschka -HostHeaderWebApplication https://hh.contoso.com -Name "Portal" -Template "BLANKINTERNETCONTAINER#0" -OwnerEmail email@example.com
New-SPSite -Url https://my.contoso.com -OwnerAlias contoso\speschka -HostHeaderWebApplication https://hh.contoso.com -Name "My Sites" -Template "SPSMSITEHOST#0" -OwnerEmail firstname.lastname@example.org
In this case assume that when I created the web app, it had a virtual IP in DNS associated with the name "hh.contoso.com", and that was the public URL that I used for the web app when I created it in central admin – that's why I used that as the HostHeaderWebApplication parameter. Also note, there seems to be some confusion on whether the web app itself needs to be a "host header web application". It does not. It should not. It will not…work if it has a host header. That's why I say, you can create it in central admin, just leave the host header field blank when you do so.
Okay, so all of my host header root site collections are created now. Next, I'm going to add my root site collection for the web app and remove the path-based /sites inclusion. When that's done, I'll go ahead and add the inclusions for my host header site collections; we'll do that with this bit of PowerShell:
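Here is a minimal sketch of that step, assuming the web application URL https://hh.contoso.com from the commands above and the standard New-SPManagedPath, Remove-SPManagedPath and Get-SPManagedPath cmdlets run from the SharePoint 2013 Management Shell. Treat it as an illustration rather than the exact script used here; the root site collection is left to Central Admin as described.

```powershell
# Assumption: run in the SharePoint 2013 Management Shell on a farm server.
$webAppUrl = "https://hh.contoso.com"

# Remove the path-based /sites inclusion from the web application.
# (This can also be done in Central Admin under the web app's managed paths.)
Remove-SPManagedPath -Identity "sites" -WebApplication $webAppUrl -Confirm:$false

# Add wildcard inclusions for host header site collections.
# These apply to ALL host header site collections, not just one.
New-SPManagedPath -RelativeURL "sites" -HostHeader
New-SPManagedPath -RelativeURL "personal" -HostHeader

# List the host header managed paths to confirm both inclusions exist.
Get-SPManagedPath -HostHeader
```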
So that was pretty easy. Now that I have that done, I'm going to go ahead as well and create my enterprise search center site collection. You would probably put it in your portal application, but in my case I'm just going to put it in the path of my collab site collection:
New-SPSite -Url https://collab.contoso.com/sites/search -OwnerAlias contoso\speschka -HostHeaderWebApplication https://hh.contoso.com -Name "Search" -Template "SRCHCEN#0" -OwnerEmail email@example.com
With that done, NOW I go and create my service applications. One of the reasons I waited until now was so that I could get my My Site stuff setup – the My Site host site collection (my.contoso.com), the inclusion for personal sites at /personal, and self service site creation turned on. When I get to creating the UPA, I'll configure it so that it says the My Site host is https://my.contoso.com, and the path for personal sites is /personal. Set up all the other service applications and now you're pretty much ready to go.
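If you would rather script the self service site creation piece than click through Central Admin, something along these lines should do it. It is only a sketch, assuming the same https://hh.contoso.com web application and the SelfServiceSiteCreationEnabled property on the web application object:

```powershell
# Assumption: run in the SharePoint 2013 Management Shell on a farm server.
$webApp = Get-SPWebApplication "https://hh.contoso.com"

# Turn on self service site creation so users can create their My Sites.
$webApp.SelfServiceSiteCreationEnabled = $true
$webApp.Update()

# Read the setting back to confirm it was saved.
(Get-SPWebApplication "https://hh.contoso.com").SelfServiceSiteCreationEnabled
```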
Sorry, one other random point to note here – if you need to get a list of all of the site collection template IDs to use when creating these sites in PowerShell, just run the Get-SPWebTemplate cmdlet. I just piped it to a text file so I could pull out what I need, i.e. Get-SPWebTemplate > templates.txt.
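If the full dump is more than you want to wade through, a filtered version like this sketch may be easier to read. It assumes you only care about the SharePoint 2013 (compatibility level 15) templates, and the name patterns are just the templates used in this post:

```powershell
# Assumption: run in the SharePoint 2013 Management Shell on a farm server.
# Show only 2013-mode templates whose IDs match the ones used in this post.
Get-SPWebTemplate -CompatibilityLevel 15 |
    Where-Object {
        $_.Name -like "STS#*" -or
        $_.Name -like "SRCHCEN#*" -or
        $_.Name -like "SPSMSITEHOST#*" -or
        $_.Name -like "BLANKINTERNETCONTAINER#*"
    } |
    Sort-Object Name |
    Format-Table Name, Title -AutoSize
```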
Let's talk about zones now. As I alluded to above, we really just want one zone, but there can be some problems with that. First let's discuss what uses the default zone: first and foremost – crawling. You ALWAYS want to crawl the default zone. I know our documentation is not really up to snuff on this particular point, but trust me when I say that for SharePoint 2010 and 2013, you should always crawl the default zone. If you don't, certain things will break. Sorry I don't have a complete list, but for example, contextual scopes like this list, this site, etc. are always served out of the default zone. Secondly, links for site feeds in team sites are always rendered out of the default zone. If you have multiple zones and your users are not in the default zone, then they will have links rendered for them that may not be correct (not so much an issue with search, but it is with team site feeds). Also, for what it's worth, the original design of the App Model had apps only available out of the default zone as well. For that one specifically, we've made some feature enhancements in the March PU, but if you want to go with an app domain mapping per zone, you also need to have a load balancing solution that supports forwarding requests to a specific port number; Windows Load Balancing, for example, does not do this.
Okay – so if you're using Windows authentication then all of this is probably going to be pretty straightforward to keep on a single zone. Where this does get more difficult is if your users are using something other than Windows authentication. Why is this a problem? Well, it's not really a problem, but more like a major annoyance. I say that because the out of the box behavior will be that every time a user visits a site, they will be prompted to select what kind of authentication they want to use. If all of your users are using SAML authentication for example, then there is no good reason why they should have to select the SAML authentication provider when they visit the site. But that's exactly what will happen in this scenario because you've got Windows authentication on the zone (because the crawler requires it), and SAML authentication on the zone (because your users require it).
The solution in this case is really just to replace the default identity provider selector page. I actually wrote a blog post about this last release, and the information is still valid for SharePoint 2013. You can find the details here: http://blogs.technet.com/b/speschka/archive/2011/04/30/bypassing-the-multi-authentication-provider-selection-page-in-sharepoint-2010.aspx. What you'll want to do differently from the example there is look for the crawler when it hits your page. If the request is NOT coming from the crawler, then you should just automatically redirect the user, along with all the query string parameters, to the appropriate login page (the pages are different for SAML auth vs. FBA, but I'm guessing you can figure this out with a simple Fiddler capture if needed). In terms of determining whether the request is coming from the crawler, I think the easiest way to do that is to look at the user-agent header. You can find the value that SharePoint uses by opening up regedit on any of your SharePoint servers and looking here – HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\15.0\Search\Global\Gathering Manager – for the UserAgent value. The value you will find in there by default for SharePoint 2013 is Mozilla/4.0 (compatible; MSIE 4.01; Windows NT; MS Search 6.0 Robot). So you can just look at the incoming Request.Headers collection to see a) whether it has a User-Agent key (it almost always will), and b) if so, whether it matches the value used by the crawler. If it does NOT match, then you know it's not the crawler and you should just forward the request to the appropriate authentication page.
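To make the comparison concrete, here is a rough PowerShell sketch of the check. The real test belongs in your custom login page code, so treat this purely as an illustration of the logic. The function names Get-CrawlerUserAgent and Test-IsCrawlerRequest are made up for this example, and the case-insensitive exact match is my assumption about how strict you want the comparison to be.

```powershell
# Default crawler User-Agent for SharePoint 2013, as quoted above.
$DefaultCrawlerUserAgent = 'Mozilla/4.0 (compatible; MSIE 4.01; Windows NT; MS Search 6.0 Robot)'

# Hypothetical helper: read the UserAgent value from the registry path above,
# falling back to the default when the key or value is not present.
function Get-CrawlerUserAgent {
    param([string]$Fallback = $DefaultCrawlerUserAgent)
    $key = 'HKLM:\SOFTWARE\Microsoft\Office Server\15.0\Search\Global\Gathering Manager'
    try {
        $item = Get-ItemProperty -Path $key -Name UserAgent -ErrorAction Stop
        if ($item.UserAgent) { return $item.UserAgent }
    } catch {
        # Key or value not found (for example, not a SharePoint server).
    }
    return $Fallback
}

# Hypothetical helper: returns $true only when the request's User-Agent
# matches the crawler's value (case-insensitive, surrounding spaces ignored).
function Test-IsCrawlerRequest {
    param(
        [string]$UserAgent,
        [Parameter(Mandatory = $true)][string]$CrawlerUserAgent
    )
    if ([string]::IsNullOrWhiteSpace($UserAgent)) { return $false }
    return [string]::Equals($UserAgent.Trim(), $CrawlerUserAgent.Trim(),
        [System.StringComparison]::OrdinalIgnoreCase)
}

# Usage: anything that is not the crawler gets redirected to the login page.
$crawlerUa = Get-CrawlerUserAgent
Test-IsCrawlerRequest -UserAgent 'Mozilla/5.0 (Windows NT 10.0)' -CrawlerUserAgent $crawlerUa
```

A missing or empty User-Agent is treated as "not the crawler" here, which means those requests get redirected like any other user request.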
So with that, I'll wrap up part 1. Hopefully the concepts of single farm, single web app, single zone make sense and you can apply them if at all possible in your environments.