eG Monitoring
 

Measures reported by MsSqlAlsRcvryPtTest

The Recovery Point Objective (RPO) is the amount of data loss that is acceptable, expressed as the point in time up to which data can be recovered. When a failover is detected, administrators want the secondary database to take over from the primary database quickly. If a large quantity of data has not yet been transferred from the primary database to the secondary database, users have to wait longer to access the databases during failover. There is also often some minimal data loss while a failover is in progress, typically because of the time lag in synchronization between the primary and secondary databases. A long lag indicates that synchronization is taking too long to complete, which in turn forces users to wait for a prolonged period to access the databases. To avoid such scenarios, it is essential to monitor the recovery point objective of the SQL server. The MsSqlAlsRcvryPtTest test helps administrators in this regard.

This test reports the amount of log that has not yet been synchronized with the secondary database and the amount of hardened log that is yet to be applied to the secondary database. In addition, the test reports how long log records waited in the redo queue before being rolled forward to the secondary database. Administrators can thus be alerted proactively and can fine-tune the time taken to roll the log forward to the secondary database, so that synchronization completes quickly and without trouble.

The measures reported by this test are as follows:

| Measurement | Description | Measurement Unit | Interpretation |
|---|---|---|---|
| Log_bytes_flushed | Indicates the rate at which log bytes were flushed to the secondary database to complete synchronization since the last recovery point. | Flushes/sec | If the value of this measure increases consistently, the potential data loss can increase indefinitely. |
| Log_snd_qsize | Indicates the amount of log that has not yet been sent from this database to the secondary database to complete synchronization. | KB | Ideally, the value of this measure should be zero. A high value indicates that this much data is unavailable in the secondary database during failover, which means that customers would experience data loss equal to the value of this measure. |
| Redo_qsize | Indicates the total number of kilobytes of hardened log that currently remain to be applied to the secondary database to roll it forward. | KB | A low value is desired for this measure. |
| Redo_rate | Indicates the rate at which log records were rolled forward on the secondary database from this database. | KB/sec | |
| Pending_log_rcvrytime | Indicates the time duration for which the log records waited in the redo queue before being rolled forward to the secondary database. | Secs | Ideally, the value of this measure should be low. |
| Pending_log_flushtime | Indicates the time duration for which the logs were in the send queue before being flushed completely to the secondary database. | Secs | |
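
The Interpretation guidance above can be sketched as a small check on one sample of measures. This is only an illustrative sketch and not how the test itself is implemented. The function names `drain_time_secs` and `review_sample`, the threshold defaults, and the sample values are all hypothetical. The estimate assumes that the time to clear the redo queue is roughly the queue size divided by the redo rate; the documentation above does not state how the test derives its time measures.

```python
from typing import Dict, List, Optional


def drain_time_secs(queue_kb: float, rate_kb_per_sec: float) -> Optional[float]:
    """Estimate the seconds needed to drain a queue at a steady rate.

    Assumption: time = queue size / rate. Returns 0.0 for an empty queue and
    None when the rate is not positive, since no finite estimate exists.
    """
    if queue_kb <= 0:
        return 0.0
    if rate_kb_per_sec <= 0:
        return None
    return queue_kb / rate_kb_per_sec


def review_sample(sample: Dict[str, float],
                  max_redo_kb: float = 10240.0,
                  max_wait_secs: float = 60.0) -> List[str]:
    """Apply the Interpretation column to one sample of measures.

    The thresholds are hypothetical defaults, not values recommended by eG.
    """
    findings = []
    # Log_snd_qsize should ideally be zero: anything unsent is potential data loss.
    if sample["Log_snd_qsize"] > 0:
        findings.append(
            "Log_snd_qsize is non-zero: %.0f KB not yet sent" % sample["Log_snd_qsize"])
    # Redo_qsize should be low.
    if sample["Redo_qsize"] > max_redo_kb:
        findings.append(
            "Redo_qsize is high: %.0f KB waiting to be applied" % sample["Redo_qsize"])
    # Pending_log_rcvrytime should be low.
    if sample["Pending_log_rcvrytime"] > max_wait_secs:
        findings.append(
            "Pending_log_rcvrytime is high: %.0f secs" % sample["Pending_log_rcvrytime"])
    return findings


# Hypothetical sample; the values are invented for illustration.
sample = {
    "Log_snd_qsize": 2048.0,        # KB
    "Redo_qsize": 5000.0,           # KB
    "Redo_rate": 250.0,             # KB/sec
    "Pending_log_rcvrytime": 20.0,  # Secs
}

estimated_redo_secs = drain_time_secs(sample["Redo_qsize"], sample["Redo_rate"])
findings = review_sample(sample)
```

With this sample, only the non-zero send queue is flagged, and the redo queue would take an estimated 20 seconds to clear at the reported redo rate. In practice, thresholds would be chosen to match the RPO agreed for the databases being monitored.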