Crawl taking indefinitely long to complete

I came across this a couple of times and, unfortunately, there is no easy way out of it.

Here is the behavior description:

A crawl of your site, full or incremental, does not finish, even after a long time.

If you try to access the SQL Server's tempdb, you will most likely be unable to get a lock on the database.

The last item appearing in the crawl log was registered more than 2 or 3 hours ago, yet the crawl status is still "Crawling Incremental" or "Crawling Full".

There are no mssdmn.exe processes running on the indexer server, and the memory occupied by mssearch.exe is in the range of 150-200 MB.

If you try to restart the Office SharePoint Search service using services.msc, the service times out while stopping.

You can try to stop or pause the crawl using the UI, but the page will time out. Upon refresh, the content source still shows as crawling.
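The process-related symptom in the list above can be checked quickly from a small console program run on the indexer. This is only a sketch: the class name CrawlSymptomCheck is made up for illustration, and it assumes that the working set is a reasonable stand-in for "memory occupied" as you would read it in Task Manager.

```csharp
using System;
using System.Diagnostics;

class CrawlSymptomCheck
{
    static void Main()
    {
        // Filter daemons: in the scenario described here, none are running
        // even though the content source still shows as crawling.
        Process[] daemons = Process.GetProcessesByName("mssdmn");
        Console.WriteLine("mssdmn processes running: " + daemons.Length);

        // Search service process: report its working set in MB (assumption:
        // working set approximates the "memory occupied" figure).
        foreach (Process p in Process.GetProcessesByName("mssearch"))
        {
            long mb = p.WorkingSet64 / (1024 * 1024);
            Console.WriteLine("mssearch.exe (PID " + p.Id + "): " + mb + " MB working set");
        }
    }
}
```

Zero mssdmn processes together with a mssearch.exe working set of roughly 150-200 MB would match the behavior described, but it is not proof on its own; check the other symptoms as well.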

A fix for this behavior was already published in the August Cumulative Update:

KB 956056   

Description of the SharePoint Server 2007 hotfix package (Coreserver.msp): August 26, 2008  https://support.microsoft.com/default.aspx?scid=kb;EN-US;956056

(A full crawl may take several weeks to be completed. Additionally, the crawl may stop responding, and you cannot stop it or cancel it.)

If you have already installed this fix and still experience the behavior, you are most likely in the scenario described here.

What could possibly cause this, you ask?

Well, you can run into this behavior if one or more lists (document libraries) on your site lack a default view.

This means that the list's default URL, instead of pointing to https://site/subsite/list/forms/allitems.aspx, points to https://site/subsite/. As a result, all the anchors for the items in that list are built starting from the root of the site. Depending on the number of affected lists and the number of items in each, this leads to an immense number of rows in tempdb while the server attempts to index the list items, referencing them recursively from the root of the site. The direct effect is a dramatic drop in tempdb performance, which effectively stalls the crawl.

This might happen if you deleted the default view by mistake, or created the list through migration tools that did not create all the aspects of the list (as in, they just created the SPList object and stuffed the SPListItems into it).

How do you check whether you have such lists?

Here is a snippet of code that can help you achieve that:

/*
This source code is freeware and is provided on an "as is" basis without warranties of any kind,
whether express or implied, including without limitation warranties that the code is free of defect,
fit for a particular purpose or non-infringing. The entire risk as to the quality and performance of the code is with the end user.
*/

…………..

    // Requires: using Microsoft.SharePoint; using Microsoft.SharePoint.Administration;
    // args[0] is the URL of the web application to inspect
    SPWebApplication spwa = SPWebApplication.Lookup(new Uri(args[0]));
    foreach (SPSite osite in spwa.Sites)
    {
        foreach (SPWeb oweb in osite.AllWebs)
        {
            Console.WriteLine(oweb.Url + "==============");
a:;         // this label will serve in case we implement the fix
            foreach (SPList olist in oweb.Lists)
            {
                // Only visible lists are checked
                if (olist.Hidden == false)
                {
                    try
                    {
                        Console.WriteLine("Title: " + olist.Title);
                        if (olist.DefaultView == null)
                        {
                            Console.WriteLine("ERROR: No default list view found");
                        }
                        else
                        {
                            Console.WriteLine("DefaultViewURL: " + olist.DefaultViewUrl);
                            Console.WriteLine("DefaultViewTitle: " + olist.DefaultView.Title);
                        }
                    }
                    catch (Exception e)
                    {
                        Console.WriteLine("ERROR------\n" + e.Message + "\n");
                    }
                }
            }
            Console.WriteLine("=========");
            oweb.Dispose();
        }
        osite.Dispose();
    }

…………...

If the code reports that some of the lists have a null default view, the only way to fix it (other than deleting the document library and recreating it) is through the object model:

Add the following lines to the code, extending the block that reports the missing default view:

{
    Console.WriteLine("ERROR: No default list view found");

    // Requires: using System.Collections.Specialized; (for StringCollection)
    SPView spView = olist.GetUncustomizedViewByBaseViewId(1); // 1 = All Documents Base Template
    StringCollection viewColl = spView.ViewFields.ToStringCollection();
    // The last argument (true) makes the new view the default view
    olist.Views.Add("All Documents", viewColl, spView.Query, spView.RowLimit, spView.Paged, true);

    Console.WriteLine("ERROR: fixed");

    goto a; // the collection was modified, so the running enumeration might not continue; restart it from the label
}
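If you would rather avoid the label and the goto, one alternative is to split the work into two passes: first collect the IDs of the affected lists, then repair them once the enumeration is over. The following is only a sketch under that assumption. The class and method names (DefaultViewRepair, RestoreMissingDefaultViews) are made up for illustration, and the try/catch from the snippet above is left out for brevity.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.Specialized;
using Microsoft.SharePoint;

static class DefaultViewRepair
{
    // Hypothetical helper (not part of the original snippet): restores a
    // default view on every visible list of the given web that lacks one,
    // and returns the number of lists that were repaired.
    public static int RestoreMissingDefaultViews(SPWeb web)
    {
        // Pass 1: read only. Remember the IDs of the affected lists.
        List<Guid> broken = new List<Guid>();
        foreach (SPList list in web.Lists)
        {
            if (!list.Hidden && list.DefaultView == null)
            {
                broken.Add(list.ID);
            }
        }

        // Pass 2: modify. web.Lists is no longer being enumerated here,
        // so adding a view cannot invalidate a running enumeration.
        foreach (Guid id in broken)
        {
            SPList list = web.Lists[id];
            SPView baseView = list.GetUncustomizedViewByBaseViewId(1); // 1 = All Documents base view, as above
            StringCollection fields = baseView.ViewFields.ToStringCollection();
            list.Views.Add("All Documents", fields, baseView.Query,
                           baseView.RowLimit, baseView.Paged, true); // true = make it the default view
            Console.WriteLine("Fixed: " + list.Title);
        }
        return broken.Count;
    }
}
```

It would be called once per web, inside the foreach over osite.AllWebs and before oweb.Dispose(), for example: int repaired = DefaultViewRepair.RestoreMissingDefaultViews(oweb);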

For the currently running crawl, there is no easy way to stop it. You either have to wait until it eventually finishes (because, as said, it is not stalled, just extremely slow) or forfeit the existing index and stop the Office Server Search service on the indexer.

NOTE: Stopping the Office Server Search service on the indexer will result in losing the index and having to re-crawl all the content.