eG Monitoring
 

Measures reported by OraRacChkPntTest

The checkpoint process is responsible for updating file headers in the database datafiles. A checkpoint occurs when Oracle writes new or updated blocks (called dirty blocks) from the buffer cache in memory to the database datafiles. A checkpoint therefore keeps the database buffer cache and the database datafiles synchronized. This synchronization is part of the mechanism that Oracle uses to ensure that the database can always be recovered.

Checkpointing is an important Oracle activity that records the highest system change number (SCN) for which all data blocks with changes at or below that SCN are known to have been written to the datafiles. If a failure occurs and cache recovery follows, only the redo records containing changes at SCNs higher than the checkpoint SCN need to be applied during recovery.
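The recovery rule described above can be illustrated with a minimal sketch. It is not Oracle's recovery code; the RedoRecord type, the records_to_apply function, and the sample SCN values are all hypothetical and exist only to show that redo at or below the checkpoint SCN can be skipped.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RedoRecord:
    """Hypothetical, simplified redo record used only for illustration."""
    scn: int      # system change number of the change
    change: str   # free-text description of the change


def records_to_apply(redo_log, checkpoint_scn):
    """Return the redo records whose SCN is higher than the checkpoint SCN.

    Changes at or below the checkpoint SCN are assumed to be in the
    datafiles already, so they are skipped.
    """
    return [record for record in redo_log if record.scn > checkpoint_scn]


# Made-up sample data: four changes, with the checkpoint recorded at SCN 105.
redo_log = [
    RedoRecord(100, "update block A"),
    RedoRecord(105, "update block B"),
    RedoRecord(110, "insert into block C"),
    RedoRecord(120, "update block A"),
]

to_apply = records_to_apply(redo_log, checkpoint_scn=105)
print([record.scn for record in to_apply])  # prints [110, 120]
```

In this sketch, the more recent the checkpoint, the fewer redo records remain to be applied after a failure.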

Key checkpoint-related activities may generate wait events. For instance, SQL statements may have to wait until the DBWR (database writer) finishes writing dirty blocks from the buffer cache to the datafiles. If too many such wait events occur on an instance, the performance of the Oracle cluster may deteriorate. It is therefore essential to closely monitor the checkpoint-related wait events and the activity responsible for them.

This test auto-discovers the wait event types related to the checkpoint process, and reports the number of events of each type that have occurred in each instance of an Oracle RAC.
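How the test collects its data internally is not documented here. As a rough sketch only, assume that wait counts are available as cumulative totals per instance and event type, in the way Oracle's GV$SYSTEM_EVENT view exposes INST_ID, EVENT, and TOTAL_WAITS. The count for a measurement period would then be the difference between two successive snapshots. The function name waits_in_period and the sample figures below are hypothetical.

```python
def waits_in_period(previous, current):
    """Compute per-period wait counts from two cumulative snapshots.

    Both arguments map (instance_id, event_name) to a cumulative wait count.
    """
    deltas = {}
    for key, total in current.items():
        delta = total - previous.get(key, 0)
        # A negative delta suggests that the counter was reset (for example,
        # by an instance restart); use the current total in that case.
        deltas[key] = total if delta < 0 else delta
    return deltas


# Made-up snapshots for a two-instance cluster.
previous = {
    (1, "checkpoint completed"): 40,
    (1, "log file switch (checkpoint incomplete)"): 7,
    (2, "checkpoint completed"): 12,
}
current = {
    (1, "checkpoint completed"): 46,
    (1, "log file switch (checkpoint incomplete)"): 7,
    (2, "checkpoint completed"): 3,   # lower than before: counter was reset
    (2, "log file switch (checkpoint incomplete)"): 2,
}

period_counts = waits_in_period(previous, current)
for (instance_id, event_name), count in sorted(period_counts.items()):
    print(instance_id, event_name, count)
```

Treating a negative difference as a counter reset is a design choice of this sketch, not documented behavior of the test.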

The measures reported by this test are as follows:

Measurement: Check_point_count
Description: Indicates the number of wait events of this type that have occurred on this instance during the last measurement period.
Measurement Unit: Number
Interpretation: Ideally, the value of this measure should be low. A consistent increase in this value is a cause of concern, as it indicates that a checkpoint-related activity is not getting completed, resulting in the generation of numerous wait events and degrading the overall performance of the Oracle RAC. Compare the value of this measure across the event types to determine which type of wait event has occurred most frequently on an instance.
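The two checks suggested in the interpretation, comparing event types and watching for a consistent increase, can be sketched as follows. This is an illustration only. The function names, the three-point window, and the sample values are assumptions, and "consistent increase" is read here as a strictly rising value over the most recent measurement periods.

```python
def most_frequent_event(counts_by_event):
    """Return the event type with the highest Check_point_count value."""
    return max(counts_by_event, key=counts_by_event.get)


def is_consistently_increasing(history, min_points=3):
    """Return True if the last min_points values rise strictly each period."""
    if len(history) < min_points:
        return False
    recent = history[-min_points:]
    return all(later > earlier for earlier, later in zip(recent, recent[1:]))


# Made-up Check_point_count values for one instance in the last period.
counts_by_event = {
    "checkpoint completed": 6,
    "log file switch (checkpoint incomplete)": 14,
}
print(most_frequent_event(counts_by_event))

# Made-up histories of one event type over successive measurement periods.
print(is_consistently_increasing([2, 3, 5, 9]))  # prints True
print(is_consistently_increasing([4, 6, 5, 7]))  # prints False
```

A longer window (a larger min_points) makes the trend check less sensitive to short bursts of wait events.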