eG Monitoring
 

Measures reported by CassNodeDetTest

Cassandra is a distributed database system with a shared-nothing architecture. A single logical database is spread across a cluster of nodes, so data must be distributed evenly among all participating nodes. Each node is responsible for a portion of the data.

If a node becomes unreachable, the portion of data it holds may become unavailable unless that data is replicated to other nodes. It is therefore important to monitor for nodes that are unavailable or unreachable. The CassNodeDetTest test helps administrators keep constant track of the nodes.

This test helps administrators determine how many nodes are unavailable or unreachable. It also reports the number of nodes that joined the cluster, left the cluster, and migrated to another cluster. Using this information, administrators can identify nodes that are frequently unreachable and investigate the cause, so that those nodes do not become unreachable too often.
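How CassNodeDetTest collects these counts is not described here. As a rough illustration only, the sketch below assumes the node list comes from text shaped like `nodetool status` output, where each data row starts with a two-letter code (status U/D followed by state N/L/J/M). The function name `summarize_nodes`, the sample text, and the mapping of "Down" to "unreachable" are assumptions made for this example, not part of the eG test.

```python
import re

# Data rows start with a two-letter code: status (U=Up, D=Down)
# followed by state (N=Normal, L=Leaving, J=Joining, M=Moving).
ROW = re.compile(r"^([UD])([NLJM])\s+(\S+)")

# Hypothetical mapping from state letters to the groups used in this sketch.
STATE_NAMES = {"N": "available", "L": "leaving", "J": "joining", "M": "moving"}


def summarize_nodes(status_text):
    """Group node addresses by condition (assumed mapping, not eG's own logic)."""
    groups = {"available": [], "joining": [], "leaving": [],
              "moving": [], "unreachable": []}
    for line in status_text.splitlines():
        match = ROW.match(line.strip())
        if match is None:
            continue  # skip headers, separators, and blank lines
        status, state, address = match.groups()
        if status == "D":
            # Assumption: a node reported as Down is counted as unreachable.
            groups["unreachable"].append(address)
        else:
            groups[STATE_NAMES[state]].append(address)
    return groups


# Made-up sample in the general shape of `nodetool status` output.
SAMPLE = """\
Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load      Tokens  Owns   Host ID  Rack
UN  10.0.0.1   1.1 GiB   256     25.0%  aaaa     rack1
DN  10.0.0.2   1.0 GiB   256     25.0%  bbbb     rack1
UJ  10.0.0.3   0.2 GiB   256     ?      cccc     rack1
UL  10.0.0.4   1.3 GiB   256     25.0%  dddd     rack1
"""

counts = {name: len(addrs) for name, addrs in summarize_nodes(SAMPLE).items()}
print(counts)
```

Keeping the addresses in each group, and not only the counts, mirrors the way a count can be paired with a list of node IP addresses for follow-up.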

Outputs of the test: One set of results for the target Cassandra Database node being monitored.

The measures made by this test are as follows:

| Measurement | Description | Measurement Unit | Interpretation |
|---|---|---|---|
| Available_nodes | Indicates the number of nodes available in the cluster. | Number | The detailed diagnosis of this measure lists the IP addresses of the nodes that are available in the cluster. |
| Joining_nodes | Indicates the number of nodes that joined the cluster. | Number | The detailed diagnosis of this measure lists the IP addresses of the nodes that joined the cluster. |
| Leaving_nodes | Indicates the number of nodes that left the cluster. | Number | The detailed diagnosis of this measure lists the IP addresses of the nodes that left the cluster. |
| Moving_nodes | Indicates the number of nodes that were migrating from this cluster to another cluster. | Number | The detailed diagnosis of this measure lists the IP addresses of the nodes that migrated to another cluster. |
| Unreach_nodes | Indicates the number of nodes that were unreachable in the cluster. | Number | The detailed diagnosis of this measure lists the IP addresses of the nodes that are unreachable. By identifying the nodes that are unreachable, administrators can investigate why they are unreachable and troubleshoot the issues related to those nodes. |
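To single out nodes that are unreachable repeatedly, not just once, the unreachable IP addresses from successive measurement periods can be tallied. The following is a minimal sketch under stated assumptions: the function name `frequently_unreachable`, the `min_hits` threshold, and the sample addresses are all hypothetical, and the input is assumed to be one list of unreachable IP addresses per measurement period.

```python
from collections import Counter


def frequently_unreachable(samples, min_hits):
    """Return the IPs that were unreachable in at least `min_hits` samples.

    `samples` is a list with one entry per measurement period; each entry
    is the list of IP addresses reported as unreachable in that period.
    """
    hits = Counter()
    for unreachable in samples:
        # Use a set so an address listed twice in one period counts once.
        hits.update(set(unreachable))
    return sorted(ip for ip, count in hits.items() if count >= min_hits)


# Made-up history covering four measurement periods.
history = [
    ["10.0.0.2"],
    ["10.0.0.2", "10.0.0.3"],
    [],
    ["10.0.0.2"],
]

print(frequently_unreachable(history, min_hits=3))
```

A suitable value for `min_hits` depends on how often the test runs and how much instability is acceptable in the environment.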