Measures reported by SPFlowStatTest
Content processing in SharePoint is performed by the content processing component (CPC) and the index component.
The CPC uses flows and operators to process content. Flows define how to process content, queries, and results, and each flow processes one item at a time. Flows consist of operators and connections organized as graphs. This is where activities such as language detection, word breaking, security descriptor handling, content enrichment (web service callout), entity and metadata extraction, deep link extraction, and many others take place. A flow has branches that handle different operations, such as inserts, deletes, and partial updates.
Once the CPC has processed the content, the index component receives the processed items and writes them to the search index. The index component also handles incoming queries, retrieves information from the search index, and sends the result set back to the query processing component.
Whether the CPC fails to process content rapidly or the index component writes to the index slowly, it is the end user's experience with SharePoint search that suffers. To ensure that SharePoint delivers a fast and flawless search experience, administrators should be able not only to detect slowdowns before they impact query processing, but also to tell where a slowdown originated: in the CPC or in the index component. The SPFlowStatTest test helps answer this question. It monitors the flows on the CPC, tracks the documents that are queued waiting to be processed by the flows, and reports how quickly the CPC and the index component process the enqueued content. In the process, the test points to potential bottlenecks in content processing and helps isolate their source.
Output of the test: One set of results for the SharePoint server monitored
The measures made by this test are as follows:
| Measurement | Description | Measurement Unit | Interpretation |
|---|---|---|---|
| item_total | Indicates the total number of items placed on input queues. | Number | |
| item_queue | Indicates the number of items that are currently in queues in front of input operators and are ready for processing. | Number | A high value or a consistent increase in the value of this measure is indicative of bottlenecks in content processing. |
| thread_active | Indicates the number of threads that are currently active. | Number | |
| input_queue_empty | Indicates the total time spent waiting for space to become available on input queues. | Milliseconds | If this value is high (say, over a thousand), it indicates that the CPC is taking a long time to process the contents of the input queues and free up the queues. You may then want to check the processor usage on the CPC. If this is very high, it is a clear indication that the CPC is stressed and could be the key contributor to the slowdown in content processing. On the other hand, if the value of this measure is low (say, less than a thousand), it indicates that the input queues are being cleared very quickly, which implies that the CPC is processing content quickly. In this case, check the disk I/O and latency on the index component. If these are high, the index component is stressed and is unable to handle the load imposed by the CPC. You can then conclude that the bottleneck lies with the index component. |
| input_queue_full | Indicates the total number of client polls since the start of the component. | Number | Each time a client refreshes the session to check for callbacks, this measure is incremented. |
| client_submit | Indicates the total number of submits performed by clients since the start of the component. | Number | |
| document_skipped | Indicates the total number of documents skipped in the submission service before being delivered to the content processing component. | Number | Ideally, the value of this measure should be zero. A high value is a cause for concern, as it indicates that too many crawled documents are not reaching the CPC for processing because the submission service disregards them. In such a case, investigate the reasons further. |
| document_timeout | Indicates the total number of documents that timed out in the submission service. | Number | A low value is desired for this measure. A high value implies that the search index may not include many crawled documents, as they timed out in the submission queue itself. This in turn may result in ineffective search queries. You may hence want to reset the timeout value for documents in the submission service. |
| flows_feeding | Indicates the current number of flows used for feeding. | Number | The CPC uses flows and operators to process content. Flows define how to process content, queries, and results, and each flow processes one item at a time. The current number of flows is hence an indicator of the number of documents being processed by the CPC. |
| pending_item | Indicates the current number of items delivered to the content processing component for which no callback has yet been received. | Number | A high value or a consistent rise in the value of this measure could indicate a bottleneck in content processing. |
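The interpretation guidance for input_queue_empty, item_queue, and pending_item can be sketched as a small triage routine. This is only an illustrative sketch, not part of the test itself: the `Sample` type, the function names, and the CPU and disk latency cut-offs are assumptions made for the example, since the guidance above only says "very high" and "high". The 1000 ms threshold follows the "say, over a thousand" wording, and the CPU and disk readings are assumed to be collected separately from the CPC and index component hosts.

```python
from dataclasses import dataclass
from typing import Sequence

# From the "say, over a thousand" guidance for input_queue_empty (milliseconds).
QUEUE_WAIT_THRESHOLD_MS = 1000
# Hypothetical cut-offs: the guidance only says "very high" / "high".
CPU_HIGH_PERCENT = 85.0
DISK_LATENCY_HIGH_MS = 25.0


@dataclass
class Sample:
    input_queue_empty: float      # ms, as reported by the test
    cpc_cpu_percent: float        # processor usage on the CPC host (assumed input)
    index_disk_latency_ms: float  # disk latency on the index host (assumed input)


def locate_bottleneck(s: Sample) -> str:
    """Return 'cpc', 'index', 'inconclusive', or 'none' for one sample."""
    if s.input_queue_empty > QUEUE_WAIT_THRESHOLD_MS:
        # Queues are slow to free up: suspect the CPC if its CPU is stressed.
        if s.cpc_cpu_percent >= CPU_HIGH_PERCENT:
            return "cpc"
        return "inconclusive"
    # Queues clear quickly: the CPC keeps up, so look at the index component.
    if s.index_disk_latency_ms >= DISK_LATENCY_HIGH_MS:
        return "index"
    return "none"


def is_rising(values: Sequence[float], min_points: int = 3) -> bool:
    """True if the last min_points readings are strictly increasing.

    A simple stand-in for the "consistent increase" check suggested for
    item_queue and pending_item.
    """
    if len(values) < min_points:
        return False
    tail = values[-min_points:]
    return all(a < b for a, b in zip(tail, tail[1:]))
```

For example, `locate_bottleneck(Sample(1500, 95, 5))` points to the CPC, while `is_rising([5, 8, 13])` flags a steadily growing queue. In practice, the thresholds would need tuning to the environment being monitored.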