eG Monitoring
 

Measures reported by IgniteCachesTest

Apache Ignite is a memory-based distributed caching and processing platform that can be used to cache, persist, and process big data. It offers various features for the Fast Data paradigm, such as in-memory distributed caching, high-performance distributed computations, fault-tolerant services, node auto-discovery, and map-reduce processing. To support these features, Ignite provides components such as the Data Grid (distributed caches), the Compute Grid (distributed processing), and the Service Grid (distributed services). The core of Ignite is its cache. An Ignite cache is a key-value store very similar to the hashtable, hashmap, or map data structures found in various programming languages. The only requirement for creating a cache in Ignite is to define a name for it.
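To make the key-value analogy concrete, here is a minimal sketch of named caches backed by ordinary Python dictionaries. It is a toy in-process model and not the Apache Ignite API; the `CacheRegistry` class, the `get_or_create_cache` method, and the `orders` cache name are all hypothetical names chosen for illustration.

```python
# Toy in-process model of named key-value caches; NOT the Apache Ignite API.
class CacheRegistry:
    """Holds named caches, each a plain dict mapping keys to values."""

    def __init__(self):
        self._caches = {}

    def get_or_create_cache(self, name):
        # A name is the only thing required to create a cache.
        if not name:
            raise ValueError("cache name must be a non-empty string")
        # setdefault returns the existing dict if the name is already known.
        return self._caches.setdefault(name, {})


registry = CacheRegistry()
orders = registry.get_or_create_cache("orders")
orders["order-1"] = {"item": "book", "qty": 2}    # put
same_cache = registry.get_or_create_cache("orders")
print(same_cache.get("order-1"))                   # get that finds the entry
print(same_cache.get("order-2"))                   # get that finds nothing
```

Asking for the same name twice returns the same underlying store, which is why the second lookup sees the entry written through the first handle.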

Because the cache is the core of Ignite and its data storage, it is important to monitor it periodically so that any issues can be managed proactively.

This test monitors the caches created on an Ignite cluster and collects key statistics such as the number of cache entries, evictions, updates, and hit and miss percentages. These statistics indicate the health of each cache, and trend analysis of them can provide key information on cache performance.
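Many of these statistics are cumulative counts, so trend analysis usually starts by converting successive samples into per-interval increases. The sketch below shows one possible way to do that and to flag an upward trend. The function names, the reset handling, and the sample values are assumptions made for illustration; they are not part of the test itself.

```python
def interval_deltas(samples):
    """Turn cumulative counter samples into per-interval increases.

    A negative difference is treated as a counter reset (for example after
    a cache restart); in that case the new value itself is used as the delta.
    """
    deltas = []
    for prev, curr in zip(samples, samples[1:]):
        diff = curr - prev
        deltas.append(diff if diff >= 0 else curr)
    return deltas


def is_trending_up(values, min_points=3):
    """Return True if a least-squares line through the samples slopes upward.

    Only the sign of the slope is needed, so just the numerator of the
    slope formula is computed (the denominator is always positive).
    """
    n = len(values)
    if n < min_points:
        return False
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    return numerator > 0


# Hypothetical cumulative miss counts; the drop to 20 simulates a restart.
miss_samples = [100, 150, 230, 20, 60]
print(interval_deltas(miss_samples))   # [50, 80, 20, 40]

# Hypothetical average get times, in seconds, from four measurement periods.
get_times = [0.010, 0.012, 0.015, 0.021]
print(is_trending_up(get_times))       # True
```

A plain slope check is deliberately simple; a production alert would normally also require a minimum slope or a minimum number of consecutive increases to avoid reacting to noise.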

Outputs of the test: One set of results for each Apache Ignite Server

The measures made by this test are as follows:

| Measurement | Description | Measurement Unit | Interpretation |
|---|---|---|---|
| cacheEvictions | Indicates the total number of evictions from the cache since the cache was created. | Number | A healthy number of evictions should be maintained. If this number is low, the cache will eventually become full and administrators may need to increase its size. |
| cacheGets | Indicates the total number of get operations initiated on the cache since the cache was started. | Number | |
| cacheHits | Indicates the number of successful retrievals from the cache since the cache was created. | Number | Taken together, the get, hit, and miss measures provide a good indication of cache performance. If there are too many misses, administrators or application programmers may have to adopt a different caching strategy. |
| cacheMisses | Indicates the number of unsuccessful attempts to retrieve an entry from the cache. | Number | |
| cacheMissPercentage | Indicates the percentage of cache misses against the total number of get operations. | Percentage | |
| cachePuts | Indicates the total number of operations in which new entries were created in the cache since the cache was created. | Number | The addition of new entries should be accompanied by some degree of cache evictions and removals; otherwise the cache will keep growing. Administrators need to keep an eye on all cache additions and removals. |
| cacheRemovals | Indicates the total number of operations in which an entry was removed from the cache. | Number | |
| nonNullCacheSize | Indicates the total number of values in the cache that are not null. | Number | This is the real measure of cache size, as there may be many null values that do not consume any memory. |
| cacheSize | Indicates the total size of the cache in MB. | MB | |
| cacheTxCommits | Indicates the total number of transactions committed in the cache since the cache was created. | Number | Ideally, all transactions should be committed and completed successfully. If transactions are being rolled back repeatedly and the count of commits is low, administrators or application programmers may need to redesign the cache data or cache operations to minimize operation failures. |
| cacheTxRollbacks | Indicates the total number of transactions rolled back in the cache since the cache was created. | Number | Ideally, all transactions should be committed and completed successfully. If transactions are being rolled back repeatedly and the count of commits is low, administrators or application programmers may need to redesign the cache data or cache operations to minimize operation failures. |
| keySize | Indicates the total number of keys in the cache at any given time. | Number | |
| keysToRebalanceLeft | Indicates the number of keys on the current node that are yet to be rebalanced to other nodes. | Number | Key rebalancing provides a level of resilience against cache failure. |
| averageGetTime | Indicates the average time it takes to execute a get operation. | Seconds | This value should ideally remain steady over a number of transactions. An upward trend in the time taken is a cause for concern and can be a sign of deteriorating cache performance. |
| averagePutTime | Indicates the average time it takes to execute a put operation. | Seconds | This value should ideally remain steady over a number of transactions. An upward trend in the time taken is a cause for concern and can be a sign of deteriorating cache performance. |
| averageRemoveTime | Indicates the average time it takes to execute a remove operation. | Seconds | |
| averageTxCommitTime | Indicates the average time it takes to commit a transaction on the cache. | Seconds | |
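Several of the interpretations above combine more than one measure. The sketch below shows one way such derived indicators could be computed from a single set of results. The `cache_health` function, its output keys, and the sample values are hypothetical. The growth figure assumes, as the table states, that cachePuts counts only newly created entries; if puts also include overwrites of existing keys, the figure is only an upper bound.

```python
def cache_health(m):
    """Derive a few combined indicators from one set of reported measures.

    `m` is a dict keyed by the measure names used in the table above.
    """
    gets = m["cacheGets"]
    misses = m["cacheMisses"]
    commits = m["cacheTxCommits"]
    rollbacks = m["cacheTxRollbacks"]
    total_tx = commits + rollbacks
    return {
        # Misses as a percentage of all get operations.
        "missPercentage": 100.0 * misses / gets if gets else 0.0,
        # Rollbacks as a percentage of all finished transactions.
        "rollbackPercentage": 100.0 * rollbacks / total_tx if total_tx else 0.0,
        # Entries added minus entries taken out (assumption: puts are new entries).
        "netEntryGrowth": m["cachePuts"] - m["cacheRemovals"] - m["cacheEvictions"],
    }


# Hypothetical values for one measurement period.
sample = {
    "cacheGets": 1000,
    "cacheHits": 800,
    "cacheMisses": 200,
    "cachePuts": 500,
    "cacheRemovals": 100,
    "cacheEvictions": 150,
    "cacheTxCommits": 95,
    "cacheTxRollbacks": 5,
}
print(cache_health(sample))
```

With these sample values the miss percentage is 20.0, the rollback percentage is 5.0, and the net entry growth is 250, which would suggest a cache that is still growing.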