SharePoint 2013 Monitor and Tune Content Feed

I have not seen much guidance on monitoring and tuning content feeding for SharePoint 2013. This post covers how to isolate the bottleneck in your feeding chain. In a future post, I will cover how to address these bottlenecks, although in most cases you will simply need to add more instances of a component or more hardware resources. If you are familiar with the content feed monitoring and tuning process for FAST Search Server 2010 for SharePoint, the process is basically the same for SharePoint 2013. This article is a starting point for monitoring the performance of content feeding in SharePoint 2013. The focus is on the SharePoint 2013 counters, as guidance for system-level counters is already provided elsewhere.

Note: some of the thresholds are rough estimates and may be replaced by more formal product documentation or field experience in the future.

 

The main components in the feeding chain are:

  • Crawl Database

  • Crawl Component

  • Content Processing Component

  • Index Component

Here are some useful counters to monitor for each component.

Crawl Database (key hardware resource: Disk I/O)

  • Logical Disk\*

Crawl Component (key hardware resource: Processor)

  • Search Gatherer Projects - SharePointServerSearch\Transactions Waiting

  • Search Gatherer Projects - SharePointServerSearch\Transactions In Progress

  • Search Gatherer Projects - SharePointServerSearch\Transactions Completed

  • Search Gatherer - SharePointServerSearch\Threads Accessing Network

  • Search Gatherer - SharePointServerSearch\Threads Filtering

  • Search Gatherer - SharePointServerSearch\Threads Idle

Content Processing Component (key hardware resource: Processor)

  • Search Submission Service\# Pending Items

  • Search Flow Statistics\# Items Queued For Processing

  • Search Flow Statistics\Input Queue Empty Time

  • Search Flow Statistics\Input Queue Full Time

Index Component (key hardware resource: Disk I/O)

  • Logical Disk\*

 

When we begin content tuning, we need to isolate the bottleneck in the feeding chain. I find it useful to approach this as a two-step process. First, we want to determine whether the bottleneck occurs before the search subsystem (upstream) or after hitting the search subsystem (downstream). You could also look at these two pieces as content acquisition and content processing. Second, we narrow the problem down to a specific component within that half of the chain.

Content Acquisition consists of:

  • Content Source (SharePoint site, web site, BCS connection, etc)

  • Crawl Component 

  • Crawl Database

 

Content Processing consists of:

  • Content Processing Component

  • Index Component

 

A good first counter to look at is Search Gatherer Projects - SharePointServerSearch\Transactions Waiting for all of the Crawl Components. If this counter is low (less than a few thousand), content processing is keeping up with content acquisition. If the counter is high and/or consistently rising, then we are pushing more data than Content Processing can keep up with. This is also visible in the Crawl Health report by viewing the queue length.
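The check described above can be sketched in a few lines of Python. This is only an illustration, not part of any SharePoint tooling: the function name is hypothetical, the samples are assumed to be readings of Transactions Waiting that you collected yourself (for example from a Performance Monitor log) and summed across all Crawl Components, and "a few thousand" is arbitrarily approximated as 3000.

```python
from statistics import mean

# Assumption: "a few thousand" is approximated as 3000. Tune this for your farm.
WAITING_HIGH = 3000


def acquisition_outpaces_processing(samples, high=WAITING_HIGH):
    """Return True when Transactions Waiting looks high or consistently rising.

    samples: readings in chronological order, summed across all Crawl Components.
    """
    if not samples:
        raise ValueError("need at least one sample")
    # "High": the average reading is at or above the threshold.
    if mean(samples) >= high:
        return True
    # "Consistently rising" (crude check): no reading drops below the previous
    # one, and the last reading is strictly above the first.
    never_drops = all(later >= earlier for earlier, later in zip(samples, samples[1:]))
    return len(samples) > 1 and never_drops and samples[-1] > samples[0]
```

A True result points toward Content Processing as the slow half of the chain; a False result points toward Content Acquisition. The rising check is deliberately simple, so use enough samples to cover a meaningful part of the crawl.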

In the case where Content Processing is slow, we have two possibilities: the Content Processing Components are the bottleneck or the Index Components are the bottleneck. In either case, both Search Submission Service\# Pending Items and Search Flow Statistics\# Items Queued For Processing will be high (greater than a few hundred).

  • If Content Processing Components are the bottleneck

      1. Processor utilization will be high on servers running Content Processing Components

      2. Search Flow Statistics\Input Queue Full Time will be low (less than about 1000)

  • If Index Components are the bottleneck

      1. There will be high disk I/O and/or latency on servers running Index Components

      2. Search Flow Statistics\Input Queue Full Time will be high (greater than about 1000)
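As a rough sketch, the counter-based part of this decision could be written as follows. The function name and return strings are hypothetical, "a few hundred" is arbitrarily approximated as 300, and the 1000 cut-off is the rough estimate given above. The hardware symptoms (processor utilization, disk I/O and latency) are not modeled here and should still be used to confirm the result.

```python
# Assumptions: "a few hundred" is approximated as 300, and 1000 is the rough
# Input Queue Full Time threshold from the text. Both are estimates.
QUEUE_HIGH = 300
FULL_TIME_HIGH = 1000


def classify_processing_bottleneck(pending_items, items_queued, input_queue_full_time,
                                   queue_high=QUEUE_HIGH, full_time_high=FULL_TIME_HIGH):
    """Suggest which downstream component is the likely bottleneck.

    pending_items:          Search Submission Service\\# Pending Items
    items_queued:           Search Flow Statistics\\# Items Queued For Processing
    input_queue_full_time:  Search Flow Statistics\\Input Queue Full Time
    """
    # Both queue counters are expected to be high when Content Processing is slow.
    if pending_items <= queue_high or items_queued <= queue_high:
        return "content processing does not look backed up"
    # A high Input Queue Full Time points at the Index Components.
    if input_queue_full_time > full_time_high:
        return "index component"
    return "content processing component"
```

For example, `classify_processing_bottleneck(900, 750, 200)` suggests the Content Processing Components, which you would then confirm by checking processor utilization on those servers.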

When Content Acquisition is slow, there are three possible bottlenecks: the Content Source, the Crawl Database, and the Crawl Component.

  • If Crawl Database is the bottleneck

      1. Search Gatherer Projects - SharePointServerSearch\Transactions Completed will be high (greater than a few hundred)

      2. There will be high disk queue length / disk latency on Crawl DB

  • If Crawl Component is the bottleneck (not common)

      1. There will be high processor utilization on Crawl Component servers

      2. There will be no disk latency issue on Crawl DB

  • If Content Source is the bottleneck

      1. Search Gatherer - SharePointServerSearch\Threads Accessing Network will be close to Search Gatherer - SharePointServerSearch\Threads Filtering

      2. There will be low processor utilization on Crawl Component servers

      3. There will be no disk latency issues on Crawl DB
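The upstream checks can be combined in a similar sketch. Everything here is an assumption made for illustration: the function name, the boolean inputs (you decide yourself whether Crawl DB disk latency and Crawl Component processor utilization count as "high"), the value 300 for "a few hundred", and the 20% tolerance used to interpret "close to".

```python
def classify_acquisition_bottleneck(transactions_completed,
                                    crawl_db_disk_latency_high,
                                    crawl_cpu_high,
                                    threads_accessing_network,
                                    threads_filtering,
                                    completed_high=300,   # assumed "a few hundred"
                                    closeness=0.2):       # assumed meaning of "close to"
    """Suggest which upstream component is the likely bottleneck."""
    # Crawl Database: Transactions Completed is high and the Crawl DB disk is struggling.
    if crawl_db_disk_latency_high:
        if transactions_completed > completed_high:
            return "crawl database"
        return "inconclusive"
    # Crawl Component (not common): busy processors, healthy Crawl DB disk.
    if crawl_cpu_high:
        return "crawl component"
    # Content Source: the two thread counters are close to each other,
    # with idle processors and a healthy Crawl DB disk.
    largest = max(threads_accessing_network, threads_filtering)
    difference = abs(threads_accessing_network - threads_filtering)
    if largest > 0 and difference <= closeness * largest:
        return "content source"
    return "inconclusive"
```

An "inconclusive" result simply means the counters do not match any of the three patterns listed above, in which case the thresholds or the sampling window probably need another look.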