eG Monitoring
 

Measures reported by OSSsrchGatherTest

The search architecture consists of search components and databases. Crawling is the process of gathering content for search. The Crawl component connects to the content sources using out-of-the-box or custom connectors, collects crawled properties and metadata from the crawled items, and passes the crawled items to the Content Processing component. It also stores tracking and historical information about crawled items, such as documents and URLs, in the Crawl database.

The Content Processing component transforms the crawled items and sends them to the Index component. This component also maps crawled properties to managed properties. Additionally, it stores the unprocessed information it extracts in the Link database, so that the Analytics component can use that information for search and usage analytics.

The Index component receives the processed items from the Content Processing component and writes them to the search index. This component also handles incoming queries, retrieves information from the search index and sends back the result set to the Query Processing component.

Since the success of the search function depends on how well the Crawl and Content Processing components perform their duties, administrators need to watch for irregularities in the functioning of these two components, so that anomalies are detected promptly and corrected before they can stall searching.

This test monitors the Crawl (gatherer) component and the Content Processing component, and reports any issues in their performance.

Outputs of the test: One set of results for the Microsoft SharePoint server that is being monitored

Descriptor: Microsoft SharePoint server

The measures made by this test are as follows:

| Measurement | Description | Measurement Unit | Interpretation |
|---|---|---|---|
| Documents_crawl | Indicates the number of documents in the crawl history since the gatherer service was started. | Number | Too many documents in the crawl history may unnecessarily clutter the Crawl database. If the value of this measure rises consistently, you may want to clean up the crawl history to conserve storage space. |
| Doc_crawl | Indicates the number of documents currently waiting in the crawl queue. | Number | If the value of this measure increases with time, it could mean that the Crawl component is crawling content very slowly. |
| Links_waiting | Indicates the current number of links that are waiting to be processed. | Number | If the value of this measure increases with time, it could mean that the Content Processing component is experiencing bottlenecks when processing links. |
| Links_processed | Indicates the number of links that were processed since the gatherer service was started. | Number | |
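
The interpretations of Doc_crawl and Links_waiting both rely on spotting a value that "increases with time". One way to check this against a series of readings is sketched below. This is only an illustration, not part of the eG product: the function name is_rising, the min_points parameter, and the sample readings are all hypothetical, and the least-squares slope is just one reasonable way to judge a trend.

```python
from typing import Sequence


def is_rising(samples: Sequence[float], min_points: int = 4) -> bool:
    """Return True if the readings trend upward.

    Hypothetical helper: fits a least-squares line through the readings
    (x = sample index, y = measure value) and reports a rise only when the
    slope is positive and the last reading is above the first.
    """
    n = len(samples)
    if n < min_points:
        # Too few readings to judge a trend.
        return False
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    cov = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(samples))
    var = sum((i - mean_x) ** 2 for i in range(n))
    slope = cov / var
    return slope > 0 and samples[-1] > samples[0]


# Made-up Doc_crawl readings from successive test executions.
growing_queue = [120, 135, 150, 170, 190]
fluctuating_queue = [120, 90, 130, 80, 100]

print(is_rising(growing_queue))
print(is_rising(fluctuating_queue))
```

A steadily growing queue is flagged, while a queue that merely fluctuates is not. In practice the number of readings and any threshold on the slope would have to be tuned to the environment being monitored.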