How to View SharePoint Online Crawl Logs in Office 365


PFE Srinivas Varukala discusses a recent customer request to view the SharePoint Online crawl logs.


Update 9-8-2017:

This article will soon be obsolete, since we no longer allow the creation of eDiscovery cases in SharePoint Online.

Please refer to:

https://blogs.msdn.microsoft.com/svarukala/2017/08/09/how-to-download-or-view-sharepoint-online-search-crawl-logs-using-security-compliance-center/

You can view the crawl logs in SharePoint Online by using the Office 365 eDiscovery site template and granting the eDiscovery managers security group permission to read crawl logs from your tenant.

Since I was not able to find a step-by-step guide for this, I thought this blog post would be helpful to community members with the same need.

Below are the steps that you need to follow:

1. Log in to your Office 365 tenant, then navigate to your SharePoint admin center. If you don’t already have a Site Collection based on the eDiscovery template, create a new Private Site Collection based on the eDiscovery Center site template. I am not going to document the detailed steps as they are well documented here. Below is a screenshot of the “new site collection” creation form.

Sharepoint Online Create Site Collection Using eDiscovery Center Template

2. It is very important to closely follow the steps outlined in the KB article when creating an eDiscovery Center, particularly steps 1 through 4. I will also summarize them here:

a. eDiscovery managers need the necessary permissions to search for content in SharePoint Online sites and Exchange Online mailboxes, place content on hold, and export the search results. A good way to assign permissions to a group of people is to create a security group in Exchange Online, add members to the security group, and then assign eDiscovery-related permissions to the security group in SharePoint Online and in Exchange Online.

b. Once the security group is created, assign the following permissions to it:

i. Assign owner permissions to the eDiscovery center site that was created in step 1 in this article.

ii. Add the security group as the Site Collection Administrator for all the SharePoint Site Collections in your tenant that contain searchable content.

iii. Assign the security group read permissions to the crawl logs for your SharePoint Online organization. This lets eDiscovery managers view and download crawl log errors. To complete this step:

In the Office 365 admin center, choose Admin > SharePoint.

In the SharePoint admin center, click Search.

On the Search administration page, click Crawl Log Permissions.

In the Crawl Log Permissions box, type the name of the eDiscovery manager’s security group, and then click OK.

3. Log in to the eDiscovery Center site using the eDiscovery manager account, then create a new case. You can either use the handy ‘Create new case’ button on the home page, or click the ‘Cases’ link in the left-side quick launch navigation and then use the ‘Click to create new case’ link on that page.

Sharepoint Online Create New eDiscovery Case

4. Step 3 essentially creates a new subsite based on the eDiscovery case site template. In my case, I created a new case named ‘TestCase’. The URL to my case looks like: https://mytenant.sharepoint.com/sites/crawldisco/testcase.

Sharepoint Online New eDiscovery Case

5. Navigate to your case site and create a new eDiscovery set, either from the left-side navigation or using the ‘+new item’ link under the ‘Identify and Hold’ section.

a. Give it a name. In my case, I named it ‘TestSet’.

b. In the Sources section, click the ‘Add & Manage Sources’ link, which opens a popup box. In the Locations section of this popup, add all the Site Collection URLs that contain searchable content and for which you want to collect the crawl logs. After adding all the sources, click OK. Here is a screenshot from my tenant:

Sharepoint Online eDiscovery Set Name

c. You can click on the ‘Get Statistics’ button to show the Items count and size.

d. You can also click the ‘Preview Results’ button, which opens a popup; then click the ‘SharePoint’ tab to view the results.

e. Finally, ‘Save’ your eDiscovery set.

6. Now create a new Query. You can do this from left-side navigation or using the +new item link under ‘Search and Export’.

a. Give it a name. In my case, I named it ‘TestQuery’.

b. For ‘Sources’, click ‘Modify Query Scope’ and, in the popup, select the eDiscovery set that was created in step 5.

c. Now click on the ‘Search’ button to see the statistics (items and size) updated under Sources and also check the results listed in the SharePoint tab down below.

d. Save your query.

7. Open your Query and click the ‘Export’ button at the bottom of the page. This leads to the ‘Export: New item’ page. On this page, we don’t have to change anything, as we are not concerned with Exchange mailbox logs or the versions of SharePoint documents.

Note: If you check the ‘Include versions for SharePoint documents’ checkbox, be aware that this could increase the file size of the export, depending on the size of the libraries that have versioning enabled.

8. Click OK on this page.

9. Click Download Report.

10. If you are exporting content for the first time on a computer, you will be prompted to install the Discovery Download Manager. Click Yes.

11. When you are finished exporting the report, click Close.

12. Reports called SharePoint Results.csv, Exchange Results.csv, Export Errors.csv, Search Results SharePoint Index Errors.csv, and Exchange Index Errors.csv will be created on your computer.

13. The ‘Search Results SharePoint Index Errors.csv’ file contains the Search crawl/index errors. This is a nice way to proactively check for any search indexing issues. It is also a good troubleshooting mechanism.

Sharepoint Online eDiscovery Index Errors
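As a quick sanity check after the export in step 12, a short script can confirm that all five reports were written. This is only a sketch: it assumes the reports sit directly in a single download folder, and the folder path shown is hypothetical. The file names are the ones listed in step 12.

```python
from pathlib import Path

# Report names as listed in step 12 of this post.
EXPECTED_REPORTS = [
    "SharePoint Results.csv",
    "Exchange Results.csv",
    "Export Errors.csv",
    "Search Results SharePoint Index Errors.csv",
    "Exchange Index Errors.csv",
]


def missing_reports(export_dir):
    """Return the expected report names that are not files in export_dir."""
    folder = Path(export_dir)
    return [name for name in EXPECTED_REPORTS if not (folder / name).is_file()]


if __name__ == "__main__":
    # Hypothetical download folder (an assumption); replace with your own.
    for name in missing_reports("C:/Exports/TestQuery"):
        print("Missing report:", name)
```

If your export lands in a nested folder structure, point `missing_reports` at the subfolder that actually holds the CSV files.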

Below are the columns that are found in the CSV file:

1. DocId

2. URL

3. Error Code

4. Error Type

5. Error Message

6. Last Crawl Attempt (Date)

7. Source (URL)

8. Source Name
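To get a quick overview of a large export, you could count the rows per error code and type. The sketch below assumes the CSV header text matches the ‘Error Code’ and ‘Error Type’ column names listed above exactly; the sample rows, DocIds, and URLs are made up purely for illustration.

```python
import csv
import io
from collections import Counter


def count_errors(csv_file):
    """Count rows per (Error Code, Error Type) pair in an open CSV file."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        counts[(row["Error Code"], row["Error Type"])] += 1
    return counts


# Made-up rows for illustration only; a real export has all eight columns.
sample = io.StringIO(
    "DocId,URL,Error Code,Error Type,Error Message\n"
    "1,https://example.invalid/a,17,Warning,The object was deleted.\n"
    "2,https://example.invalid/b,17,Warning,The object was deleted.\n"
    "3,https://example.invalid/c,497,Container Error,Unrecognized HTTP response\n"
)

# Print the most frequent error code/type pairs first.
for (code, error_type), n in count_errors(sample).most_common():
    print(code, error_type, n)
```

To run this against a real export, pass a file opened with `open(path, newline="", encoding="utf-8-sig")` instead of the sample. The encoding is an assumption and may need adjusting for your file.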

Sample data for the Error Code, Error Type and Error Message columns:

| Error Code | Error Type | Error Message |
| --- | --- | --- |
| 17 | Warning | The object was deleted. |
| 15 | Container Error | This item and all items under it will not be crawled because the owner has set the NoCrawl flag to prevent it from being searchable (SearchID = GUID VALUE) |
| 497 | Container Error | An unrecognized HTTP response was received when attempting to crawl this item. Verify whether the item can be accessed using your browser. ( SearchID = GUID VALUE) |
| 21 | Warning | This item was truncated because the parsed output was greater than the maximum number of allowed characters. (Max output size of 2000000 has been reached while parsing!; ; -1 (0): Parsing error parsing invalid JSON input stream.; SearchID = GUID VALUE) |
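Several of the sample messages above end with a SearchID, which may be handy to have on its own when following up on a specific error. The sketch below pulls it out of an Error Message value. It assumes that the ‘GUID VALUE’ placeholder stands for a GUID in the usual 8-4-4-4-12 hexadecimal form; the GUID in the example is made up.

```python
import re

# Assumption: "GUID VALUE" is a GUID in the usual 8-4-4-4-12 hex form.
SEARCH_ID = re.compile(
    r"SearchID\s*=\s*([0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})"
)


def extract_search_id(message):
    """Return the SearchID GUID found in an error message, or None."""
    match = SEARCH_ID.search(message)
    return match.group(1) if match else None


# Message text from the sample data, with a made-up GUID for illustration.
example = (
    "An unrecognized HTTP response was received when attempting to crawl "
    "this item. ( SearchID = 01234567-89ab-cdef-0123-456789abcdef)"
)
print(extract_search_id(example))
```

Messages without a SearchID, such as ‘The object was deleted.’, simply return `None`.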

Posted by MSPFE editor Rhoderick Milne.