eG Monitoring
 

Measures reported by HanaGCStatsTest

Multiversion Concurrency Control (MVCC) is a concept that ensures transactional data consistency by isolating transactions that access the same data at the same time. To do so, multiple versions of a record are kept in parallel. Issues with MVCC are usually caused by a high number of active versions. An old version of a data record is no longer needed once it is no longer part of a snapshot that can be seen by any running transaction. Such versions are obsolete and need to be removed from time to time to free up memory. This process is called Garbage Collection (GC) or Version Consolidation. A running transaction can block garbage collection. The consequence is a high number of active versions, which can lead to system slowdown or out-of-memory issues.
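The interaction between snapshots and version cleanup described above can be illustrated with a toy model. The sketch below is not SAP HANA's implementation. The `VersionStore` class and all of its methods are hypothetical names introduced only to show why an open transaction keeps old versions alive.

```python
from dataclasses import dataclass, field


@dataclass
class VersionStore:
    """Toy MVCC store (hypothetical): each key keeps a chain of (commit_ts, value)."""
    versions: dict = field(default_factory=dict)
    active_snapshots: list = field(default_factory=list)
    clock: int = 0

    def write(self, key, value):
        # Every write commits immediately and adds a new version to the chain.
        self.clock += 1
        self.versions.setdefault(key, []).append((self.clock, value))

    def begin(self):
        # A transaction sees everything committed up to the current clock value.
        self.active_snapshots.append(self.clock)
        return self.clock

    def end(self, snapshot_ts):
        self.active_snapshots.remove(snapshot_ts)

    def read(self, key, snapshot_ts):
        visible = [v for ts, v in self.versions.get(key, []) if ts <= snapshot_ts]
        return visible[-1] if visible else None

    def garbage_collect(self):
        """Remove versions no active snapshot can see; return the number removed."""
        # The oldest active snapshot limits what may be removed.
        horizon = min(self.active_snapshots, default=self.clock)
        removed = 0
        for key, chain in self.versions.items():
            keep_from = 0
            for i, (ts, _) in enumerate(chain):
                if ts <= horizon:
                    keep_from = i  # newest version still visible at the horizon
            removed += keep_from
            self.versions[key] = chain[keep_from:]
        return removed

    def active_version_count(self):
        return sum(len(chain) for chain in self.versions.values())


store = VersionStore()
store.write("row", "v1")
long_tx = store.begin()            # long-running transaction pins version v1
for n in range(2, 6):
    store.write("row", f"v{n}")

blocked = store.garbage_collect()  # nothing can be removed while long_tx is open
store.end(long_tx)
freed = store.garbage_collect()    # obsolete versions are removed after it ends
```

In this model the version count stays high for as long as `long_tx` is open, which mirrors the blocking behavior described in the paragraph above.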

Garbage collection is used to remove old versions of data objects from the system. Afterward, transactions cannot reference these old versions. References to these objects are kept in history (cleanup) files, which are processed by the garbage collector. Cleanup files contain deleted information that is kept because of MVCC isolation requirements. When the transaction completes, garbage collection uses the cleanup files to finally remove the data.

There are different kinds of garbage collection in SAP HANA environments, such as row store version consolidation, column store version consolidation, memory garbage collection, persistence garbage collection, LOB garbage collection, liveCache garbage collection, and calculation engine garbage collection. Problems with garbage collection can in turn lead to critical issues such as increased memory requirements, increased disk space utilization, and performance degradation up to system standstills. Hence, it is imperative to monitor the garbage collection processes.

The purpose of this test is to collect metrics related to garbage collection in each volume and to provide insight into whether the garbage collection process is running and working efficiently. Using these metrics, administrators can determine the count of started, processed, and queued jobs, as well as the rate of queued and processed jobs, which helps them investigate bottleneck conditions before the entire system starts running out of space.
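As an illustration of how a job count relates to a job rate, the sketch below divides a count by the length of the measurement period. It assumes a rate is simply the count over the period in seconds. The helper name `jobs_per_second` and the sample numbers are hypothetical and do not come from the test itself.

```python
def jobs_per_second(job_count: int, period_seconds: float) -> float:
    """Convert a per-period job count into a per-second rate (assumed formula)."""
    if period_seconds <= 0:
        raise ValueError("measurement period must be positive")
    return job_count / period_seconds


# Hypothetical sample: a 300-second measurement period for one volume.
processed_rate = jobs_per_second(600, 300)   # processed jobs per second
queued_rate = jobs_per_second(900, 300)      # queued jobs per second

# A positive difference means jobs are queued faster than they are processed.
backlog_growth = queued_rate - processed_rate
```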

Outputs of the test: One set of results for each volume ID in the target database server instance being monitored

Descriptor: Volume ID

The measures made by this test are as follows:

| Measurement | Description | Measurement Unit | Interpretation |
|---|---|---|---|
| GCWaiters | Indicates the number of garbage collection waiters in this volume. | Number | |
| StartedJobs | Indicates the number of cleanup jobs that were started by the garbage collection process in this volume during the last measurement period. | Number | |
| ProcessedJobs | Indicates the number of undo files that were processed for cleanup by the garbage collector in this volume during the last measurement period. | Number | Undo files contain the information needed for transaction rollback, and these files are removed when the transaction completes. If data is deleted but must still be accessible because of MVCC isolation, the corresponding information is written to cleanup files. At the end of the transaction, cleanup files are passed to history management. Garbage collection uses the cleanup files to finally remove the data. Undo files and cleanup files may be cached and reused for performance reasons. |
| QueuedJobs | Indicates the total number of garbage collection queue loads in this volume during the last measurement period. | Number | A low value is desired for this measure. A high value indicates problems in garbage collection, which can be caused by long-running transactions that block the garbage collection process. This leads to an accumulation of GC jobs in the queue and can eventually cause the version store to overflow, with loss of critical information. |
| ProcessedJobsRate | Indicates the rate at which undo files were processed for cleanup by the garbage collector in this volume during the last measurement period. | Processed/second | |
| QueuedJobsRate | Indicates the rate of all garbage collection queue loads in this volume during the last measurement period. | Queued/second | A high value for this measure is an indication of performance problems. |
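The interpretation guidance for QueuedJobs and QueuedJobsRate could be turned into a simple per-volume check. The following is a minimal sketch, not eG's alerting logic. The `VolumeGCStats` type, the `flag_volumes` function, the default threshold of 1000, and the comparison of the two rates are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class VolumeGCStats:
    """One set of results per volume ID (field names are hypothetical)."""
    volume_id: int
    queued_jobs: int
    queued_jobs_rate: float
    processed_jobs_rate: float


def flag_volumes(stats, max_queued_jobs=1000):
    """Return {volume_id: [reasons]} for volumes that look unhealthy."""
    findings = {}
    for s in stats:
        reasons = []
        # Assumed threshold: the source only says a low value is desired.
        if s.queued_jobs > max_queued_jobs:
            reasons.append("QueuedJobs above threshold")
        # Assumed heuristic: jobs arrive faster than they are processed.
        if s.queued_jobs_rate > s.processed_jobs_rate:
            reasons.append("queue growing faster than it is processed")
        if reasons:
            findings[s.volume_id] = reasons
    return findings


sample = [
    VolumeGCStats(volume_id=1, queued_jobs=40, queued_jobs_rate=0.5, processed_jobs_rate=0.8),
    VolumeGCStats(volume_id=2, queued_jobs=5200, queued_jobs_rate=9.0, processed_jobs_rate=2.5),
]
report = flag_volumes(sample)
```

A suitable threshold depends on the workload, so any real value would need to be tuned against the normal behavior of the monitored instance.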