FAST Search and Custom Scopes

 

 

A filter can be set when creating a new search scope or added to an existing one.

To add a filter to an existing search scope:

$scope = Get-SPEnterpriseSearchQueryScope -SearchApplication "FAST Query SSA" -Identity "ABCD"
 
$scope.Filter = 'or(path:starts-with("https://mysite.com/ABCD/"), path:starts-with("https://mysite.com/ABCD/"))'
 
$scope.Update()

 

The same, with a single path prefix:

$scope = Get-SPEnterpriseSearchQueryScope -SearchApplication "FAST Query SSA" -Identity "ABCD"

$scope.Filter = 'path:starts-with("https://mysite.com/ABCD/")'
 
$scope.Update()

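To create a new search scope with the filter already applied, pass the same expression to the -ExtendedSearchFilter parameter of New-SPEnterpriseSearchQueryScope:

-ExtendedSearchFilter 'path:starts-with("https://mysite.com/ABCD/")'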

 

Sometimes, however, a FAST Search query returns no results for a particular scope. The cause is that the SharePoint crawler and the FAST Search Enterprise Crawler populate the 'path' managed property differently. The SharePoint crawler stores the full URL, while the Enterprise Crawler stores only the URL path relative to the base URL. For example:

Result 1 (from SharePoint crawler content):
<FIELD NAME="path">https://xxxx.xxxxx.xxx/sites/xxxx/benefits/xxxxx/Pages/xxxxx.aspx</FIELD>

Result 2 (from Enterprise Crawler content):
<FIELD NAME="path">/xxxxx/xxxxx/xxxxx/xxxx/xxxxx.cfm</FIELD>

 

If we take an FQL trace, we can see the difference.

 

Not working ('path' holds only the relative URL):

https://xxxxxxx.gov/xxxx/xxxx/xxxx/xxxx/xxxx.cfm?REV=None</FIELD><FIELD NAME="url">https://xxxxxxx.gov/xxxx/xxxx/xxxx/xxxx/xxxx.cfm</FIELD><FIELD NAME="domain">xxxx.xxxx.xxxx</FIELD><FIELD NAME="tld">gov</FIELD><FIELD NAME="path">/xxxx/xxxx/xxxx/xxxx/xxxx.cfm</FIELD><FIELD NAME="processingtime">2013-02-13T09:02:10Z</FIELD><FIELD NAME="docdatetime">2013-01-22T23:26:13Z</FIELD><FIELD NAME="size">13360</FIELD><FIELD NAME="docvector">[business rules, 1][vacation donation, 0.912871][benefit plans, 0.912871][donated vacation, 0.763763][vacation business, 0.707107][vacation, 0.591608][payroll services, 0.57735][hrm, 0.5][donation recipient, 0.5][labor relations, 0.5][vacation donors, 0.5][vacation recipient, 0.5][employee, 0.447214][pals, 0.408248][laboratory director, 0.408248]</FIELD>

 

Working ('path' holds the full URL):

<FIELD NAME="languages">en</FIELD><FIELD NAME="charset">utf-8</FIELD><FIELD NAME="urls">https://xxxxxxx.gov/xxxx/xxxx/xxxx/xxxx/Pages/Vacation.aspx</FIELD><FIELD NAME="url">https://xxxxxxx.gov/xxxx/xxxx/xxxx/xxxx/Pages/Vacation.aspx</FIELD><FIELD NAME="domain">portal.ornl.gov</FIELD><FIELD NAME="tld">gov</FIELD><FIELD NAME="path">https://xxxxxxx.gov/xxxx/xxxx/xxxx/xxxx/Pages/Vacation.aspx</FIELD><FIELD NAME="processingtime">2013-02-13T04:05:49Z</FIELD><FIELD NAME="docdatetime"/><FIELD NAME="size">53180</FIELD><FIELD NAME="docvector">[vacation, 1]</FIELD><FIELD NAME="docaclsystemid">win</FIELD><FIELD NAME="author">Willoughby, Thomas G.</FIELD><FIELD NAME="createdby"/><FIELD NAME="fileextension">ASPX</FIELD><FIELD NAME="isdocument">true</FIELD><FIELD NAME="modifiedby">Willoughby, Thomas G.</FIELD><FIELD NAME="account"/><FIELD NAME="assignedto"/>

 

The issue is that we were querying the 'path' property with a scope filter that starts with the domain:

$scope.Filter = 'or(path:starts-with("https://xxxx.xxxx.gov/abcd/"), path:starts-with("https://xxxxx.xxxxx.gov/abcd/"))'

A scope filter that starts with the domain therefore returned no results for Enterprise Crawler content, because that crawler stores only the relative path portion of the URL in 'path'.

Instead of building the scope on the 'path' managed property, use the 'urls' managed property. It stores the full URL consistently for both SharePoint crawler and Enterprise Crawler content:

Result 1 (from SharePoint crawler content):
<FIELD NAME="urls">https://xxxx.xxxx.gov/xxxx/xxxx.aspx</FIELD>

Result 2 (from Enterprise Crawler content):
<FIELD NAME="urls">https://xxxx.xxxx.gov/xxxx/xxxx.cfm</FIELD>

So the scope filter over urls for the domain will correctly match.
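
As a rough sketch of the change described above, the filter can be rebuilt against 'urls' instead of 'path'. New-UrlsScopeFilter is a hypothetical helper written for this post, not a SharePoint cmdlet; the scope name, SSA name and URL are the placeholder values used earlier. The lines that touch the scope are commented out because they only run on a SharePoint server with FAST Search configured.

```powershell
# Hypothetical helper (not part of SharePoint): builds an FQL filter that
# matches any of the given URL prefixes against the 'urls' managed property.
function New-UrlsScopeFilter {
    param([Parameter(Mandatory = $true)][string[]]$Prefix)

    # One starts-with term per prefix
    $terms = @($Prefix | ForEach-Object { 'urls:starts-with("{0}")' -f $_ })

    # A single term needs no or() wrapper
    if ($terms.Count -eq 1) { return $terms[0] }
    return 'or({0})' -f ($terms -join ', ')
}

$filter = New-UrlsScopeFilter -Prefix 'https://mysite.com/ABCD/'

# Apply it to the existing scope (same cmdlet and names as the earlier examples):
# $scope = Get-SPEnterpriseSearchQueryScope -SearchApplication "FAST Query SSA" -Identity "ABCD"
# $scope.Filter = $filter
# $scope.Update()
```

Building the string in single-quoted pieces also avoids the nested double-quote problem that is easy to hit when typing the filter by hand.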

The easiest way to confirm that this was the issue was to use one of the following tools to examine the XML query output for results from the collection crawled by the FAST Search Enterprise Crawler:
https://fastforsharepoint.codeplex.com/
https://fs4splogger.codeplex.com/

These tools let us easily examine the contents of the managed property the scope was created against, so we could understand why the scope was failing for documents from a specific content source.
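
If the XML output is saved as text, the same comparison can be scripted. The sketch below is only an illustration: the two sample fragments are made up in the shape of the results shown earlier, Get-FieldValue is a hypothetical helper, and FQL starts-with is approximated with a plain string prefix check.

```powershell
# Illustrative result fragments (placeholder URLs, shaped like the output above)
$results = @(
    '<FIELD NAME="urls">https://xxxx.xxxx.gov/xxxx/xxxx.aspx</FIELD><FIELD NAME="path">https://xxxx.xxxx.gov/xxxx/xxxx.aspx</FIELD>',
    '<FIELD NAME="urls">https://xxxx.xxxx.gov/xxxx/xxxx.cfm</FIELD><FIELD NAME="path">/xxxx/xxxx.cfm</FIELD>'
)

# Hypothetical helper: returns the value of one named FIELD, or $null if absent
function Get-FieldValue {
    param([string]$Xml, [string]$Name)
    $pattern = '<FIELD NAME="{0}">([^<]*)</FIELD>' -f [regex]::Escape($Name)
    $m = [regex]::Match($Xml, $pattern)
    if ($m.Success) { return $m.Groups[1].Value }
    return $null
}

# Prefix the scope filter would test for (assumed value)
$prefix = 'https://xxxx.xxxx.gov/xxxx/'

$report = foreach ($r in $results) {
    $path = Get-FieldValue -Xml $r -Name 'path'
    $urls = Get-FieldValue -Xml $r -Name 'urls'
    New-Object PSObject -Property @{
        Path        = $path
        PathMatches = $path.StartsWith($prefix)   # would a filter on 'path' match?
        UrlsMatches = $urls.StartsWith($prefix)   # would a filter on 'urls' match?
    }
}
$report | Format-Table -AutoSize
```

A row where UrlsMatches is true but PathMatches is false points to content whose 'path' holds only a relative URL.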