eG Monitoring
 

Measures reported by IgniteCaRebaTest

When a new node joins the cluster, some of the partitions are relocated to the new node so that the data remains distributed equally in the cluster. This process is called cache rebalancing. If an existing node permanently leaves the cluster and backups are not configured, you lose the partitions stored on this node. When backups are configured, one of the backup copies of the lost partitions becomes a primary partition and the rebalancing process is initiated.
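The join and leave behaviour described above can be illustrated with a toy model. This is only a sketch: the functions `join` and `leave` and the plain dictionaries are hypothetical, and the "take from the most loaded node" rule is a stand-in that is not Apache Ignite's actual affinity function.

```python
from collections import Counter

# Toy model only. Partition ownership is a dict {partition_id: node_name}.
# The balancing rule here is an assumption for illustration, not Ignite's
# real affinity function.

def join(assignment, new_node):
    """Give new_node a fair share by taking partitions from the most loaded nodes.

    Assumes new_node does not already own any partition.
    Returns (new_assignment, list_of_moved_partition_ids).
    """
    result = dict(assignment)
    counts = Counter(result.values())
    # Floor of the even share once the new node is counted.
    target = len(result) // (len(counts) + 1)
    moved = []
    while len(moved) < target:
        # Most loaded node; ties are broken by node name.
        donor = max(sorted(counts), key=counts.get)
        partition = min(p for p, n in result.items() if n == donor)
        result[partition] = new_node
        counts[donor] -= 1
        moved.append(partition)
    return result, moved


def leave(primary, backup, gone):
    """Model a node leaving permanently.

    primary and backup are dicts {partition_id: node_name}; backup may be empty.
    Returns (new_primary, lost_partition_ids).
    """
    new_primary, lost = {}, []
    for partition, owner in primary.items():
        if owner != gone:
            new_primary[partition] = owner
        elif backup.get(partition) not in (None, gone):
            # A backup copy on a surviving node becomes the primary.
            new_primary[partition] = backup[partition]
        else:
            # No usable backup: the partition is lost.
            lost.append(partition)
    return new_primary, lost
```

With eight partitions split evenly between two nodes, a third node joining receives two partitions in this model, one from each existing node, while the other six stay where they are.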

The rebalancing process is key to balancing the cluster nodes and ensuring that data is properly spread across the nodes in the cluster. It is therefore important to monitor the rebalancing process to ensure that it is working efficiently and that any issues are addressed before they can affect application performance.

This test monitors the rebalancing process and provides key statistics, such as speed and size, which help administrators draw insights into the efficiency of the rebalancing process.

Outputs of the test: One set of results for each Apache Ignite Server

The measures made by this test are as follows:

| Measurement | Description | Measurement Unit | Interpretation |
|---|---|---|---|
| rebClearngPartitionsLeft | Indicates the total number of partitions that need to be cleared before the rebalancing process can start. | Number | This gives an indication of the time left before rebalancing starts. If too many partitions must be cleared before the rebalancing process can start, the overall process time will be high. |
| rebalancedKeys | Indicates the total number of cache keys that have already been rebalanced during the last few rebalancing sessions. | Number | If a large percentage of keys has already been rebalanced, the rebalancing process should complete quickly. If the process is still taking a long time, you may need to investigate. |
| rebalancingBytesRate | Indicates the average speed of data transfer between the nodes during the rebalancing process. | MB/Sec | If the rebalancing speed is trending upwards over a number of measurements, it might be a cause for concern. |
| rebalancingKeysRate | Indicates the average number of keys transferred per second between the nodes. | Number | |
| rebalancingPartitions | Indicates the number of partitions on the current node that are being rebalanced. | Number | If too many partitions are being rebalanced on a given node, the process might be slow. |
| rebalancingStartTime | Indicates the time when rebalancing of local partitions started for the cache. This metric returns 0 if the local partitions do not participate in the rebalancing. | Number | If rebalancing has been running for a very long time, the start time helps administrators determine how long it has been in progress. |
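Two of the interpretations above lend themselves to a small worked example: turning rebalancingStartTime into an elapsed duration, and checking whether a series of rebalancingBytesRate samples is trending upwards. The sketch below is illustrative only. The function names are hypothetical, the three-sample window is an arbitrary choice, and it assumes the start time is reported as milliseconds since the Unix epoch, which the text above does not state.

```python
from datetime import datetime, timezone


def rebalance_elapsed_seconds(start_time_ms, now=None):
    """Return seconds elapsed since rebalancing started.

    Assumes start_time_ms is milliseconds since the Unix epoch (an assumption).
    Returns None when the value is 0, which the measure reports when the
    local partitions do not participate in rebalancing.
    """
    if start_time_ms == 0:
        return None
    now = now or datetime.now(timezone.utc)
    return now.timestamp() - start_time_ms / 1000.0


def is_trending_up(samples, window=3):
    """Return True if the last `window` samples are strictly increasing.

    `samples` is a list of successive measurements, oldest first.
    """
    recent = samples[-window:]
    if len(recent) < window:
        return False  # not enough measurements to call it a trend
    return all(earlier < later for earlier, later in zip(recent, recent[1:]))
```

A strictly increasing window is a deliberately simple definition of a trend. A moving average or a fitted slope would be less sensitive to a single noisy measurement.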