One of the great things about working with search is seeing all of the facets of the product. One of the more intriguing aspects I have seen is faceted search inside Microsoft Office SharePoint Server (MOSS). So what is faceted search? To understand that, we have to look at what facets are. I like to think of facets as the data behind the data. In short, this is the metadata, and all of your digital information has it. Metadata can be simple, like date, file type, or author, or complex, like case number, project status, or patient ID. When we work with information inside organizations, we tend to be very descriptive in defining the metadata taxonomy to ensure that it meets the needs of our business.
So in a nutshell, faceted search simply uses the facets (or metadata) of your data to surface other potentially related documents. Anyone can run a basic faceted search: it is typically the tag followed by a colon and a value, e.g. genre: rock or filetype: .pptx. However, a full faceted search can be very complex to set up as a single query, so you can add faceted search to your MOSS implementation with some code I recently learned about. The code creates a set of web parts that provide an intuitive way to refine search results by category (facet). The facets are implemented using the SharePoint API and stored within the native SharePoint metadata store.
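To make the idea of refining by facet concrete, here is a minimal sketch in Python. It is not the web part code and does not call the SharePoint API; the sample results and the helper names `facet_counts` and `refine` are made up for illustration. It only shows the two operations a facet panel needs: counting values per facet, and narrowing the result set when a value is picked.

```python
from collections import Counter

# Hypothetical search results: each hit carries its metadata (facets).
results = [
    {"title": "Q3 roadmap", "filetype": "pptx", "author": "Kim"},
    {"title": "Budget", "filetype": "xlsx", "author": "Kim"},
    {"title": "Launch plan", "filetype": "pptx", "author": "Lee"},
]


def facet_counts(hits, facet):
    """Count how many hits carry each value of the given facet."""
    return Counter(hit[facet] for hit in hits if facet in hit)


def refine(hits, facet, value):
    """Keep only the hits whose facet equals the chosen value."""
    return [hit for hit in hits if hit.get(facet) == value]


# Counts that a facet panel could show next to each value.
counts = facet_counts(results, "filetype")

# Result set after the user clicks the "pptx" value.
pptx_hits = refine(results, "filetype", "pptx")
```

In a real deployment the counts would come from the search index rather than an in-memory list, but the refinement step is conceptually the same as typing filetype: .pptx by hand.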
The nice thing about looking at search from this angle is that it doesn't matter how facets are crawled into the metadata store. Any core MOSS functionality will work the same: indexing through the BDC, external web sites via HTTP, or local SharePoint sites, libraries, and lists. As soon as the content is indexed and a metadata tag is assigned, it is available for facets.
An updated version of the Faceted Search code for MOSS is available on CodePlex: http://www.codeplex.com/FacetedSearch
Version 2 includes the following:
- Multi-threaded processing. The first thread processes up to 500 facets synchronously, while the second thread runs asynchronously against up to ~30,000 facets
- Client-side refresh (not AJAX) that updates only the Facets web part without a page refresh
- Web part connections to pass Facet settings to the Bread Crumbs
- Extended facet schema now supports:
  - Facet icons. A default icon per facet name, complemented by an icon per facet value
  - Friendly names for facet values
  - Exclusions. Allows excluding a facet when its values match a pattern
  - Built-in wildcard matching, especially useful for exclusions
- Improved search syntax, with added support for sentences and quoted phrases
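The friendly names, exclusions, and wildcard matching listed above can be pictured with a small Python sketch. This is not the project's actual facet schema, which I have not reproduced here; the `facet_config` layout, its `friendly` and `exclude` keys, and the `display_values` helper are assumptions for illustration only. Wildcards are handled with Python's standard `fnmatch` module.

```python
from fnmatch import fnmatchcase

# Hypothetical facet configuration; the field names are illustrative only.
facet_config = {
    "filetype": {
        "friendly": {"pptx": "PowerPoint", "xlsx": "Excel"},
        "exclude": ["tmp*", "~*"],  # wildcard patterns for values to hide
    }
}


def display_values(facet, values, config):
    """Drop excluded values, then map the rest to friendly names."""
    settings = config.get(facet, {})
    patterns = settings.get("exclude", [])
    friendly = settings.get("friendly", {})
    shown = []
    for value in values:
        # Skip any value that matches one of the exclusion patterns.
        if any(fnmatchcase(value, pattern) for pattern in patterns):
            continue
        # Fall back to the raw value when no friendly name is defined.
        shown.append(friendly.get(value, value))
    return shown


labels = display_values(
    "filetype", ["pptx", "tmp1", "xlsx", "docx"], facet_config
)
```

Here `tmp1` is hidden by the `tmp*` pattern, while `docx` has no friendly name and is shown as-is.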