eG Monitoring
 

Measures reported by PgRpoTest

Recovery Point Objective (RPO) is the maximum amount of data that you can afford to lose if a PostgreSQL database server crashes. In a high availability setup, the master and the replica servers should always be in sync. If the master database server crashes before data is synced with the replica servers, the data that has not yet been synced will be lost. To avoid such data loss, administrators should periodically track how far each replica server lags behind the master, i.e., the amount of data that still has to be synced for the replica server and the master to be in sync. The PostgreSQL Replication RPO test helps administrators in this regard.

In a high availability setup, this test auto-discovers the replica servers connected to the target PostgreSQL database server (which is the master) and, for each replica server, reports the amount of data that is yet to be synced from the master. Using this test, administrators can identify the replica server that is most vulnerable to data loss in the event of a server crash and fine-tune backup schedules accordingly.
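The comparison across replica servers described above can be sketched in Python as follows. This is only an illustration: the function name and the replica names are hypothetical, and the lag values are made-up sample data, not output from the test.

```python
from typing import Dict


def most_vulnerable_replica(lag_mb_by_replica: Dict[str, float]) -> str:
    """Return the name of the replica with the largest unsynced data volume (MB)."""
    if not lag_mb_by_replica:
        raise ValueError("no replica servers reported")
    # The replica that lags the most would lose the most data if the master crashed now.
    return max(lag_mb_by_replica, key=lag_mb_by_replica.get)


# Hypothetical per-replica values of the replication lag measure, in MB.
sample = {"replica-a": 2.5, "replica-b": 40.0, "replica-c": 0.0}
worst = most_vulnerable_replica(sample)
```

With the sample values shown, `worst` is `replica-b`, the replica with the largest lag.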

Note:

This test will report metrics only if the monitored target PostgreSQL database server is the master in a high availability setup.

Outputs of the test: One set of results for each replica server connected to the target PostgreSQL server

The measures made by this test are as follows:

| Measurement | Description | Measurement Unit | Interpretation |
|---|---|---|---|
| Replication_lag_size | Indicates the amount of data that is yet to be synced to this replica server. | MB | Compare the value of this measure across replica servers to identify the replica server that is most vulnerable to data loss. |

The detailed diagnosis of this measure lists the Process ID, OID of the user, User name, Application name, Sent location and Replay location.
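The fields in the detailed diagnosis resemble the columns of PostgreSQL's pg_stat_replication view, where the sent and replay locations are log sequence numbers (LSNs) written as two hexadecimal numbers separated by a slash. How this test derives its value is not documented here, so the following Python sketch is an assumption: it takes the lag to be the gap between the sent and replay locations and treats 1 MB as 1024 * 1024 bytes. The function names are hypothetical.

```python
def lsn_to_bytes(lsn: str) -> int:
    """Convert a PostgreSQL LSN such as '16/B374D848' to an absolute byte position."""
    high, low = lsn.split("/")
    # The part before the slash is the high 32 bits, the part after it the low 32 bits.
    return (int(high, 16) << 32) | int(low, 16)


def lag_mb(sent_location: str, replay_location: str) -> float:
    """Data sent by the master but not yet replayed on the replica, in MB (assumed 1024 * 1024 bytes)."""
    lag_bytes = lsn_to_bytes(sent_location) - lsn_to_bytes(replay_location)
    # Clamp at zero so a momentarily inconsistent reading never yields a negative lag.
    return max(lag_bytes, 0) / (1024 * 1024)
```

For example, `lag_mb("0/3000000", "0/2000000")` evaluates to 16.0. On the server side, PostgreSQL 10 and later provide `pg_wal_lsn_diff()` for the same byte arithmetic.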