Yesterday I stumbled upon an interesting issue together with my former colleague Hans Herrgott, and decided it is worth writing about to help anyone who might struggle with the same problem in the future.
SharePoint Search cannot crawl SharePoint content sources. The SharePoint sites can be successfully accessed using a browser on the crawl server, but the crawl component cannot reach the content during any kind of crawl. The crawl log reports the following error:
The start address https://portal.contoso.com cannot be crawled.
Context: Application 'Search_Service_Application', Catalog 'Portal_Content'
Details: Crawling of this item failed, HTTP 504: Gateway Timeout. Try accessing the item using a browser on the crawl machine. If URL is accessible through the browser, it is possible that the crawl targets for that host are not configured correctly. Please contact the Host Administrator for assistance.
A WinHTTP system proxy was set. This prevented the crawl component from reaching the URL configured in the content source. WinHTTP proxy settings are per-machine, not per-user, and are separate from the proxy settings in Microsoft Internet Explorer, which is why the site still opens in a browser on the crawl server while the crawl itself fails.
This issue was caused by a system proxy configuration on the server that is running the crawl component. It can be resolved by opening a command prompt on that server and entering "netsh winhttp reset proxy". This command clears the WinHTTP proxy configuration on the affected machine. If the WinHTTP proxy is required for other reasons, a bypass (exclusion) list can be configured instead. This list should include all SharePoint Web Application URLs.
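The steps above can be sketched as the following commands, run from an elevated command prompt on the crawl server. This is a minimal sketch, not a definitive procedure: the proxy address proxy.contoso.com:8080 and the *.contoso.com bypass pattern are placeholders assumed for illustration, so substitute the values that apply to your environment. Only one of options A and B is needed.

```bat
rem Display the current machine-wide WinHTTP proxy configuration.
netsh winhttp show proxy

rem Option A: remove the WinHTTP proxy entirely (direct access).
netsh winhttp reset proxy

rem Option B: keep the proxy, but bypass it for the SharePoint URLs.
rem proxy.contoso.com:8080 and *.contoso.com are hypothetical placeholders.
netsh winhttp set proxy proxy-server="proxy.contoso.com:8080" bypass-list="*.contoso.com;<local>"

rem Confirm the change before starting a new crawl.
netsh winhttp show proxy
```

Running "netsh winhttp show proxy" before and after makes it easy to confirm that the change took effect. Entries in the bypass list are separated by semicolons, and the bypass list should cover every Web Application URL used as a start address in the content sources.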
See Netsh Commands for Windows Hypertext Transfer Protocol (WINHTTP) for further information about the netsh tool.
Please share your thoughts and experiences in the comments. Any feedback on this post is highly welcome.
Thank you for rating this article if you liked it or it was helpful for you.