FAST Search for SharePoint 2010 Crawler Troubleshooting

Being in the field for Microsoft has taught me some valuable troubleshooting skills for multiple applications. One of the most common issues that people ask me for help with is FAST Search for
SharePoint 2010 crawler troubleshooting. I have given a presentation on this for some Search user groups, which was posted on YouTube.

We need to break your issue down into two sides, SharePoint 2010 and FAST, and narrow down the problem by eliminating extraneous information that is likely to obscure the real cause. Divide and
conquer, as The Art of War suggests.

SharePoint 2010 and outside influences:

In my experience, 90+% of the root causes of content ingestion issues with SharePoint 2010 and FS4SP are related to the crawler, misconfiguration/architecture, content sources/WFEs, or SQL.
Here is what I would recommend checking, in order:

1. Disable TCP offloading, or TOE, on the network interface cards for all SharePoint search component servers and for all FAST servers, per:
https://support.microsoft.com/kb/2570111

Leaving TCP offloading enabled not only slows crawling, it also causes query timeouts if it is not disabled on the FAST search/QR nodes.

2. SQL: have your SQL admins check for any locked tables, specifically in the FAST content DB and the crawl database(s). Locked tables cause all
sorts of problems down the line, notably huge spikes in the performance counters discussed below. Multiple crawl databases are a good
idea to support scaling. It is not recommended to have more than 25 million items in a single crawl database, or more than 10 crawl databases per
Search Service Application.

https://technet.microsoft.com/en-us/library/ff599536(v=office.14).aspx –Add (or remove) a crawl database

https://technet.microsoft.com/en-us/library/hh292622(v=office.14).aspx –Best practices for SQL Server 2008 in a SharePoint Server 2010 farm

https://technet.microsoft.com/en-us/library/cc298801(v=office.14).aspx –Storage and SQL Server capacity planning
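As a quick sanity check on the crawl database guidance in step 2, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the two limits come from the recommendation above, while the function names and the example item counts are made up for this sketch.

```python
# Limits quoted in step 2 above.
MAX_ITEMS_PER_CRAWL_DB = 25_000_000
MAX_CRAWL_DBS_PER_SSA = 10


def crawl_dbs_needed(total_items: int) -> int:
    """Minimum number of crawl databases for a corpus of total_items (hypothetical helper)."""
    if total_items < 0:
        raise ValueError("total_items must be non-negative")
    # Integer ceiling division; always keep at least one crawl database.
    return max(1, -(-total_items // MAX_ITEMS_PER_CRAWL_DB))


def fits_one_ssa(total_items: int) -> bool:
    """True if the corpus stays within the per-SSA crawl database guidance."""
    return crawl_dbs_needed(total_items) <= MAX_CRAWL_DBS_PER_SSA


if __name__ == "__main__":
    # Example corpus sizes are illustrative only.
    for items in (10_000_000, 60_000_000, 300_000_000):
        print(items, crawl_dbs_needed(items), fits_one_ssa(items))
```

If the second function returns False, the corpus is larger than the guidance allows for a single Search Service Application, which is an architecture conversation rather than a tuning one.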

 

3. Exclude the FAST data directory from antivirus scanning:

%FASTSEARCH%\data\*

And the SharePoint Crawler gatherer temp directory:

C:\Users\WSS_Search_svc\AppData\Local\Temp\gthrsvc_SPSearch4

*The name will vary, obviously, but this is essentially the Temp directory in the search service account's user profile.

More info: https://support.microsoft.com/kb/952167

 

4. Is your content source fast enough to crawl? Your crawler will be starved if your content source is throttling. This is a case for redirecting crawler traffic to a dedicated WFE:

https://technet.microsoft.com/en-us/library/dd335962(v=office.14).aspx

 

5. Check performance on the crawlers

  1. Check CPU. SP crawler components will spike a system out at 80%; if this is the case, add more CPU/resources, more crawl components, or more
    crawl DBs.
  2. Check these perfmon counters multiple times to see what the feeding chain is doing.

Batches Total - The total number of batches that were submitted for processing.

Good for a baseline; we want to see this number progress along with Batches Success.

Batches Success - The number of batches that have been successfully indexed.

Over time we want to see this increment. If it is flatlining there is a problem, but you knew that.

Batches Failed - The number of batches that failed prior to indexing.

Failed = bad. There must be a reason; SQL issues? At this point the problem is beyond the content, and the batch is waiting for resources from the crawler/SQL on down.

Batches Open - The number of batches that are currently being processed and that were neither successful nor have they failed.

The gatherer has collected what it needs, and we are now waiting on SQL for some checks. High numbers here could reflect SQL issues.

Batches Ready - The number of batches that have been queued up and are not yet submitted to FAST Search Server 2010 for SharePoint.

This number should always be low unless there is some traffic from FAST.

Batches Submitted - The number of batches that have been submitted to FAST Search Server 2010 for SharePoint but that have not yet completed processing.

High numbers here usually point to a FAST problem; see below.
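The counter guidance above can be turned into a rough checklist. The sketch below assumes you have already captured two samples of the counters (for example, copied by hand from Perfmon) into plain dictionaries keyed by counter name. The function name and the "high" threshold are hypothetical; what counts as high depends entirely on your farm.

```python
from typing import Dict, List

# Hypothetical threshold: tune this to what is normal in your environment.
HIGH_WATERMARK = 100


def diagnose(first: Dict[str, int], last: Dict[str, int],
             high: int = HIGH_WATERMARK) -> List[str]:
    """Compare an earlier and a later counter sample and list what looks wrong."""
    findings = []
    if last["Batches Success"] <= first["Batches Success"]:
        findings.append("Batches Success is flat: nothing is being indexed")
    if last["Batches Failed"] > first["Batches Failed"]:
        findings.append("Batches Failed is increasing: look for a cause such as SQL")
    if last["Batches Open"] >= high:
        findings.append("Batches Open is high: possible SQL issue")
    if last["Batches Ready"] >= high:
        findings.append("Batches Ready is high: batches are queuing before FAST")
    if last["Batches Submitted"] >= high:
        findings.append("Batches Submitted is high: check the FAST side")
    return findings


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    earlier = {"Batches Total": 500, "Batches Success": 480, "Batches Failed": 0,
               "Batches Open": 5, "Batches Ready": 2, "Batches Submitted": 13}
    later = {"Batches Total": 900, "Batches Success": 480, "Batches Failed": 0,
             "Batches Open": 6, "Batches Ready": 3, "Batches Submitted": 411}
    for line in diagnose(earlier, later):
        print(line)
```

Comparing two samples rather than reading one is the point: most of these counters only mean something as a trend.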

  

FAST:

The FAST side is pretty similar to the legacy ESP product, and many of these discussions are well documented in blogs and forums.

 

  1. Infotool is your friend. If you need to open a support case, this is the first item CTS will ask for. https://technet.microsoft.com/en-us/library/ee943528(v=office.14).aspx
  2. The most common issue here is IOPS. The indexers use an amazing amount of disk IO, so much so that it is recommended to place the indexer
    data directories on direct attached storage. Check Perfmon for IO problems. https://technet.microsoft.com/en-us/library/cc300400.aspx
  3. Document processor count is surprisingly high on the list as well; by default the number of document processors is way too small. Most of my medium-sized enterprise customers have 40-62 DPs, depending on their installation. https://technet.microsoft.com/en-us/library/ff381247(v=office.14).aspx
  4. Check for indexing issues. Check %FASTSEARCH%\var\log\syslog from the admin node, and run “indexerinfo status -a” to make sure your indexers are healthy and turning over. https://technet.microsoft.com/en-us/library/ee943511(v=office.14).aspx

Healthy example: note time_since_last_index and status. During feeding, time_since_last_index should be low and
status should be something other than idle:

<?xml version="1.0"?>

<indexer hostname="host1.contoso.com" port="13050" cluster="webcluster" column="0" row="0" factory_type="filehash" preferred time_since_last_index="3034" >

<documents size="7901944935537.000000" total="10318975" indexed="10318975" not_indexed="0"/>

<column_role state="Master" backups="1"/> <index_frequence min="0.250000" max="2745.937500"/>

<partition id="0" index_id="1366738633614254000 " status="indexing" type="dynamic" timestamp_indexed="1366738645" indexed_per_seconds>

<documents active="5792" total="5792"/>

....
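If you save the indexerinfo output to a file, the two values called out above can be pulled out with a short script. The following is a minimal sketch, not a supported tool: the sample XML is a trimmed, well-formed stand-in that uses only attributes visible in the excerpt above, the function names are made up, and the threshold is a hypothetical number in whatever unit indexerinfo reports.

```python
import xml.etree.ElementTree as ET

# Trimmed, hypothetical stand-in for real indexerinfo output.
SAMPLE = """<?xml version="1.0"?>
<indexer hostname="host1.contoso.com" port="13050" cluster="webcluster"
         column="0" row="0" time_since_last_index="3034">
  <documents total="10318975" indexed="10318975" not_indexed="0"/>
  <partition id="0" status="indexing" type="dynamic">
    <documents active="5792" total="5792"/>
  </partition>
</indexer>
"""


def summarize(xml_text: str) -> dict:
    """Extract the fields discussed above from one <indexer> element."""
    root = ET.fromstring(xml_text)
    docs = root.find("documents")  # direct child only, not the per-partition one
    return {
        "host": root.get("hostname"),
        "time_since_last_index": float(root.get("time_since_last_index", "nan")),
        "not_indexed": int(docs.get("not_indexed", "0")) if docs is not None else None,
        "partition_status": [p.get("status") for p in root.iter("partition")],
    }


def looks_healthy_while_feeding(summary: dict, max_since_last_index: float = 3600) -> bool:
    """Hypothetical rule: some partition is not idle and the last index is recent."""
    busy = any(status != "idle" for status in summary["partition_status"])
    return busy and summary["time_since_last_index"] <= max_since_last_index


if __name__ == "__main__":
    info = summarize(SAMPLE)
    print(info)
    print("healthy while feeding:", looks_healthy_while_feeding(info))
```

Real output contains more attributes and more partitions than this stand-in, so treat the parsing as a starting point and adjust it to what your indexers actually return.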

     5. Review the indexer and contentdistributor log files under %FASTSEARCH%\var\log\syslog for full queues

     6. I have also seen HUGE FiXML files being written, and the indexer will choke when this happens, so check for that as well. Look in
%FASTSEARCH%\data\data_fixml\ for the newest files being written and verify that they are under 50 MB. If they are larger, you will need to lower the maximum file
size that is crawled and/or create an exclude rule for the offending site. https://technet.microsoft.com/en-us/library/ff473168(v=office.14).aspx –Create exclude crawl rules
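Checking the FiXML directory by hand gets tedious on a large index, so here is a minimal sketch of the same check in Python. It assumes the 50 MB guideline from item 6 and the data_fixml path given there; the function names and the number of files inspected are arbitrary choices for this example.

```python
import os
from typing import List, Tuple

LIMIT_BYTES = 50 * 1024 * 1024  # the roughly 50 MB guideline from item 6


def newest_files(root: str, count: int = 20) -> List[Tuple[str, int, float]]:
    """Return (path, size, mtime) for the most recently modified files under root."""
    entries = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # the file was removed or locked while scanning
            entries.append((path, st.st_size, st.st_mtime))
    entries.sort(key=lambda entry: entry[2], reverse=True)  # newest first
    return entries[:count]


def oversized(entries: List[Tuple[str, int, float]],
              limit: int = LIMIT_BYTES) -> List[Tuple[str, int]]:
    """Keep only the files larger than limit bytes."""
    return [(path, size) for path, size, _mtime in entries if size > limit]


if __name__ == "__main__":
    # %FASTSEARCH% is expanded by expandvars on Windows, where FS4SP runs.
    fixml_root = os.path.expandvars(r"%FASTSEARCH%\data\data_fixml")
    for path, size in oversized(newest_files(fixml_root)):
        print(f"{size / 2**20:.1f} MB  {path}")
```

Run it on an indexer node while feeding; any file it prints is a candidate for the exclude rule or file size limit described above.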

     7. Use our performance and capacity tuning guide to reduce “MaxSubmittedBatches” and “MaxSubmittedPUDocs” based on your farm configuration, according to the formula detailed in the article below.

MaxSubmittedBatches = The total number of batches from each crawl component.

MaxSubmittedPUDocs = The total number of Partial Updates (Security Changes) from each crawl component.

Performance and capacity tuning (FAST Search Server 2010 for SharePoint)

https://technet.microsoft.com/en-us/library/gg604781.aspx

 

 

If this does not resolve your issue, contact your local TAM. My team has multiple services to resolve FS4SP issues, as well as proactive services for your environment.