Today call me the Sultan of Scale, because we are talking about SharePoint Scale and the list view threshold. When you excitedly upgrade your farm to SharePoint 2010 you may head on over to a large list and try and access it or do an administrative operation such as add a column. Suddenly you are met with this:
You do some digging around and find out that there is a setting in Central Administration where you can change the list view threshold for a web application. Aha! If you just change that setting up to 100,000 you have no more problems. Hold on a minute. Let me introduce you to the list view threshold. It is like your good friend, the designated driver. The guy that ruins your night by taking your keys away when you are too drunk to drive and you can’t go get cream cheese hot dogs at 2am. You just tried to raise the BAC level at which he takes the keys away from .8 to 16. The keys don’t get taken away and your server farm crashes, ok maybe not but you may cause a performance decrease for other users.
This graph shows the impact of changing the list view threshold on a farm with a large list where users are frequently performing operations that exceed the list view threshold. As you can see, the impact of just changing the setting to 10,000 is significant.
SharePoint 2010 has some great scalability improvements over SharePoint 2007. There are improvements in search, columns, and metadata that make it faster and easier to store access millions of items in a list, but to design such large lists you still need to think about a few things. This post covers a few basics, for lots of in depth information check out this article I authored: Designing large lists and maximizing list performance.
Why does SharePoint have a list view threshold and how does it help you besides throwing an error? Dealing with a large amount of items takes time and resources. Fortunately SharePoint provides methods to help you perform common operations efficiently such as using indexes to filter and metadata navigation fallback to return a smaller set of results when a request is too large. But some operations such as poor performing queries (ones that try and get everything from a list or filter without indexes) or operations that affect every list item such as adding a column take time and resources. On a small list this doesn’t matter because there are so few items that the operation is fast. As the list size increases these operations take longer and use more resources. Rather than let these operations run unbounded, the list view threshold prevents them. You can think of the list view threshold as a check engine light letting you know that you should change the query and how data is accessed or perform the operation when farm usage is low.
But why is the limit set at 5,000? When users perform operations against a database, SQL Server locks rows to prevent conflicts from concurrent operations. Once operations start to affect more than 5,000 items the entire table gets locked because it is more efficient than locking the individual rows. SharePoint stores all list items for an entire content database in a single table
When queries are made on approximately more than 5,000 items (even if it is a read) that table will get locked and any other operations for all lists in that content database must wait until the current operation completes. The default value of 5,000 was carefully chosen and Microsoft recommends not changing it. Lowering the limit will result in minimal performance improvements, and raising it even to 10,000 will decrease throughput if operations are commonly performed that would otherwise be prevented.
So what can you do about it? You have errors and errors are bad. Errors occur in a few different places.
Accessing list views that contain too many items
Attempting to perform privileged operations (adding a column, creating an index, adding content types)
Custom code that accesses a large list
You can configure views to filter on indexed columns or move items into folders so there are less than 5,000 items in the root of any one folder or the root of a list. One example view is a my documents view. This is an easy view to configure because it uses system metadata that will always be set for every item. In this view you filter by Modified By and/or the Created By columns. To create this view in the filters section of the settings for a view select the Modified By column and set the value to [ME] and then set a second filter with OR on the Created By column with the value also set to [ME]. The Created By column should be used as well as Modified By when multiple users edit the same documents. Modified By is not a multiple user column so this view will not necessarily show all of the documents a user has ever modified. In this example both columns should be indexed because it is an OR operation.
You should also activate the metadata navigation site feature. Just activating the feature will cause list view issues to go away because fallback will kick in and return the most recent results rather than throwing a list view threshold exception. It is also a very useful feature for filtering items in a list so consider configuring it so users can use it. To activate the metadata navigation feature go to site settings for a particular site, go to manage site features, and activate the metadata navigation feature. You can then go to the list or library settings for a large list and click on the metadata navigation settings link. Columns that can be used as hierarchies and key filters will be available. Once you add these columns for metadata navigation indexes will automatically be created. If a list is already over the list view threshold this operation will be blocked so an administrator must perform the action or a daily time window should be configured so the operation can be performed in low usage hours.
Next you can configure a daily time window so privileged operations can be performed when the farm has low usage. A daily time window can be configured per web application in Central Administration. When users receive a list view exception they will be notified that they can perform the operation during the time window. If user load is light from 8:00pm-1:00am then you can specify a time window for that time frame. This allows users to perform operations such as adding columns and content types to large lists and you get to control the time frame when users can perform these operations.
For custom code there are a few general guidelines to follow. Use indexes to filter SPQueries to amounts less than the list view threshold. For example if you are using an SPQuery to find a specific item in a list based on a particular column make sure it is indexed. If there is the possibility that an SPQuery may return more results than the list view then the content iterator class should be used to work through the content without hitting the list view threshold. This is especially important if you are doing operations on all of the content in a list or all of the content in a site or site collection.
Here is an example of performing an SPQuery to get all items in a list based on a non-indexed field. Of course this is a bad idea on a large list and this design really should not even be considered, but here is an example of how you might do it:
SPQuery query = new SPQuery();
query.Query = "<Where><Eq><FieldRef Name=\"MyUnindexedField\"/><Value Type=\"Text\">FieldValue</Value></Eq></Where>";
SPListItemCollection items = list.GetItems(query);
// Do what you need with the collection of items
Not only is this querying on an unindexed field, it is not limiting the results. So on a large list, this query will be blocked because the field is not indexed and too many results may be returned.
Here is an example using the content iterator:
If you can add an index on MyUnindexedField, then you can use NVP (Name Value Pair) paging on that index:
Content Iterator Example
SPQuery query = new SPQuery();
query.Query = "<Where><Eq><FieldRef Name=\"MyIndexedField\"/><Value Type=\"Text\">FieldValue</Value></Eq></Where>" + ContentIterator.ItemEnumerationOrderByNVPField;
ContentIterator ci = new ContentIterator();
// Do what you need with each item
delegate(SPListItem item, Exception e)
// Handle an exception that was thrown while iterating
// return true for ContentIterator to rethrow the exception
In a worst case scenario you may need to temporarily raise the list view threshold. Rather than doing this for an entire web application, you can disable the limit for a single list until issues can be resolved. This is not recommended, but there may be specific cases that this may be necessary. SPList objects have an EnableThrottling setting that can be set to false to prevent operations from being blocked by the list view threshold.
Now you may have not have even upgraded yet and are wondering what you can do before upgrade. Here are a few ideas:
Run the pre upgrade checker to identify large lists (lists with more than 5,000 items)
Minimize large list upgrade issues by:
Filtering list views to amounts less than 5,000 items using an indexed column
Organizing content into folders that contain less than 5,000 items (this is not necessary if views and web parts are filtered properly)
Checking web parts and other custom code that query lists to ensure that they do not perform operations that will affect more items than the list view threshold
In conclusion the list view threshold is your friend. It will help you identify poor performing operations and help protect performance for all users. Don’t change the list view threshold, with proper planning you can minimize list view exceptions and improve the performance of your farm. And remember, friends don’t let friends run unbounded queries.
Quentin Christensen, Program Manager, Sultan of Scale