eG Monitoring  

Measures reported by EndecaDataLayerTest

Endeca Server is the control center for Endeca data stores. It uses a database-like operational model to manage the Endeca data stores running on the machine, and it creates and controls these data stores using a set of commands. Each Endeca data store is serviced by a Dgraph process. The Dgraph uses proprietary data structures and algorithms that allow it to provide real-time responses to client requests. It stores the data files that are created when data is loaded into it. Once the data files are stored, the Dgraph receives client requests via the application tier, queries the data files, and returns the results. Any slowdown in processing client requests indicates a serious processing bottleneck on the server, probably because the Dgraph is taking too long to process the data files. Administrators should therefore continuously track the time the Dgraph takes to process the data in the data files and return results to client requests.

This test auto-discovers the data files of the Endeca server and reports the time taken to flush, commit, and merge the data into each data file. The test also reports the time taken to release data from each data file and to abort any I/O activity on each data file. Using this test, administrators can isolate the data files on which data is taking too long to be flushed or committed.

Outputs of the test: One set of results for every data file on the Endeca search application being monitored.
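As a rough illustration of what "one set of results for every data file" means, the sketch below models a single result set as a small Python record holding the five measures this test reports. The class name `DataLayerMeasures`, the data file names, and all values are hypothetical and are not taken from the eG product or from a real Endeca installation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataLayerMeasures:
    """One result set for a single data file; every time is in seconds."""
    data_file: str      # hypothetical descriptor name for the data file
    flush_avg: float    # time taken to flush data
    merge_avg: float    # time taken to merge data
    abort_avg: float    # time taken to abort
    commit_avg: float   # time taken to commit data
    release_avg: float  # time taken to release data


# Made-up sample values: one record per auto-discovered data file.
results = [
    DataLayerMeasures("datafile_a", 0.12, 0.40, 0.01, 0.03, 0.02),
    DataLayerMeasures("datafile_b", 0.95, 0.38, 0.02, 0.04, 0.01),
]

# Index the result sets by data file for easy lookup.
by_file = {r.data_file: r for r in results}
```

Keeping one record per data file mirrors the way the test reports its output and makes per-file comparisons straightforward.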

The measures made by this test are as follows:

Measurement: Flush_avg
Description: Indicates the time taken to flush the data from this data file.
Measurement Unit: Seconds
Interpretation: The value of this measure depends on how frequently the data must be flushed. Compare the value of this measure across data files to identify the data file on which flushing the data took too long to complete.

Measurement: Merge_avg
Description: Indicates the time taken to merge the data in this data file.
Measurement Unit: Seconds
Interpretation: Ideally, the value of this measure should be low. A high value could indicate bottlenecks while merging. Compare the value of this measure across data files to identify the data file on which merging the data took too long to complete.

Measurement: Abort_avg
Description: Indicates the time taken to abort a process on this data file.
Measurement Unit: Seconds
Interpretation: The value of this measure should be low. Compare the value of this measure across data files to identify the data file on which aborts took too long to complete.

Measurement: Commit_avg
Description: Indicates the time taken to commit the data to this data file.
Measurement Unit: Seconds
Interpretation: Changes made during the process must be made permanent immediately, so the value of this measure should be very low. Compare the value of this measure across data files to identify the data file on which data commits took too long to complete.

Measurement: Release_avg
Description: Indicates the time taken to release the data from this data file.
Measurement Unit: Seconds
Interpretation: The value of this measure should be very low. Compare the value of this measure across data files to identify the data file on which releasing the data took too long to complete.
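The cross-file comparison recommended for each measure can be sketched as a short Python script. This is only an illustration under stated assumptions: the function `slowest_per_measure`, the dictionary layout, the data file names, and the sample values are all hypothetical, and the script does not use any eG or Endeca API. It assumes the measure values have already been collected by some other means.

```python
from typing import Dict, Tuple

# Measure names as reported by this test; all values are in seconds.
MEASURES = ("Flush_avg", "Merge_avg", "Abort_avg", "Commit_avg", "Release_avg")


def slowest_per_measure(
    results: Dict[str, Dict[str, float]]
) -> Dict[str, Tuple[str, float]]:
    """For each measure, return the data file with the highest value."""
    slowest = {}
    for measure in MEASURES:
        # Collect this measure from every data file that reported it.
        candidates = {
            data_file: values[measure]
            for data_file, values in results.items()
            if measure in values
        }
        if candidates:
            worst = max(candidates, key=candidates.get)
            slowest[measure] = (worst, candidates[worst])
    return slowest


# Made-up sample: one set of measures per data file.
sample = {
    "datafile_a": {"Flush_avg": 0.12, "Merge_avg": 0.40, "Abort_avg": 0.01,
                   "Commit_avg": 0.03, "Release_avg": 0.02},
    "datafile_b": {"Flush_avg": 0.95, "Merge_avg": 0.38, "Abort_avg": 0.02,
                   "Commit_avg": 0.04, "Release_avg": 0.01},
}

report = slowest_per_measure(sample)
for measure, (data_file, seconds) in report.items():
    print(f"{measure}: slowest on {data_file} ({seconds:.2f} s)")
```

The highest value only identifies the slowest data file relative to the others. Whether that value is actually a problem still depends on the interpretation given for each measure, for example how frequently data must be flushed.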