eG Monitoring
 

Measures reported by AlibabaEngineTest

ApsaraDB for RDS is a stable, reliable, and scalable online database service. It is designed based on the Apsara Distributed File System and high-performance SSD storage media of Alibaba Cloud. It supports five database engines: MySQL, SQL Server, PostgreSQL, PPAS (compatible with Oracle), and MariaDB.

ApsaraDB RDS for MySQL is developed based on a branch of the MySQL source code. It provides basic features, such as instance management, account management, database management, backup and restoration, control access, Transparent Data Encryption (TDE), and data migration. ApsaraDB RDS for MySQL also provides advanced features and functions such as ApsaraDB for MyBase dedicated cluster, Read-only RDS instances to which read requests can be offloaded in times of an overload, Read/write splitting, a dedicated proxy, and the Database Autonomy Service that supports intelligent diagnostics and optimization at the instance level.

At the architecture level, ApsaraDB RDS for MySQL supports the InnoDB storage engine of MySQL. InnoDB is a general-purpose storage engine that balances high reliability and high performance. Quick query processing is a quality that any user would expect from a storage engine that promises high performance levels. To ensure that users stay loyal to the MySQL engine therefore, administrators must first measure the query processing ability of the engine, proactively capture slowness (if any), and accurately identify the type of SQL queries/statements (select, insert, update, replace, delete etc.) that are the slowest. If SELECT statements-i.e., read requests-are being processed slowly, then the administrators should try to fathom the reason for this slowness and fix it!

Poor buffer pool usage is one of the common causes for read requests to be processed slowly. The buffer pool is an area in main memory where InnoDB caches table and index data as it is accessed. The buffer pool permits frequently used data to be accessed directly from memory, which speeds up processing. If the buffer pool is inadequately sized, it will not be able to hold many frequently accessed objects in memory, owing to which, it will be unable to service read requests promptly. This can result in a spike in direct disk accesses; these are resource-intensive operations that can significantly slowdown query processing. To avoid this, administrators need to monitor buffer usage continuously, proactively detect sub-optimal usage, and promptly initiate measures to improve buffer usage and reduce disk accesses.

Besides query performance, the resilience of the storage engine to failures should also be checked from time-to-time. This is why, it is important that administrators monitor reads and writes to redo logs on a regular basis.

To perform all of the key monitoring tasks discussed above, administrators can periodically run the ApsaraDB MySQL Engine Monitoring-RDS Test. For every RDS instance for MySQL, this test monitors the speed with which the InnoDB storage engine executes queries, and in the process, reveals slowness (if any) in query processing. Moreover, by monitoring the SQL statements executed by the engine on each instance, the test reveals the level of SQL activity on the engine and how quickly the engine processes these statements. The types of SQL queries/statements that are being processed slowly can be pinpointed in the process. Also, the test periodically monitors the buffer pool and alerts administrators if a majority of read requests are serviced by direct disk accesses and not by the buffer pool. This way, the AlibabaEngineTest test enables administrators to determine whether poor buffer pool usage is the reason why read queries are processed slowly by an instance. Furthermore, reads and writes to redo logs per instance are also monitored, and abnormalities (if any)-eg., sluggishness in reading from or writing to logs, forced writes etc.-are promptly brought to the attention of administrators.

Outputs of the test : One set of results for every RDS instance for MySQL.

The measures made by this test are as follows:

Measurement Description Measurement Unit Interpretation
Average_qps Indicates the rate at which this instance processes queries. Queries/Sec A high value is desired for this measure. A low value signifies slowness in query processing. Compare the value of this measure across RDS instances to know which instance is processing queries slowly.
Average_tps Indicates the rate at which transactions are processed by this instance. Transactions/Sec A high value is desired for this measure. A low value signifies slowness in transaction processing. Compare the value of this measure across RDS instances to know which instance is processing transactions slowly.
Buffer_read_ratio Indicates the rate at which this instance read from the InnoDB buffer pool. Reads/Sec  
Buffer_usage_ratio Indicates the percentage of read requests to this instance that were serviced by the buffer pool. Percent Ideally, the value of this measure should be over 80%. A low value is indicative of poor cache usage. This in turn can cause queries to be served by directly accessing the disk, thereby degrading query performance and overall database health. To avoid this, the buffer pool should be well-sized, so that it can store many objects in memory. This will reduce the likelihood of cache misses, thereby improving query performance significantly.
Buffer_dirty_ratio Indicates the percentage of dirty data blocks in the buffer pool of this instance. Percent InnoDB performs certain tasks in the background, including flushing of dirty pages from the buffer pool. Dirty pages are those that have been modified but are not yet written to the data files on disk. Ideally, very few dirty data blocks should only remain in the buffer pool at a given point in time. This is because, if changes are not written to the data files in disk at a brisk pace, the risk of data loss becomes higher if the database instance crashes suddenly.

Buffer pool flushing is initiated when the percentage of dirty pages reaches the low water mark value defined by the innodb_ max_ dirty_ pages_pct_lwm variable. The default low water mark is 10% of buffer pool pages. A innodb_max_dirty_pages_ pct_lwm value of 0 disables this early flushing behaviour.

The purpose of the innodb_max_dirty_ pages_ pct_lwm threshold is to control the percentage dirty pages in the buffer pool and to prevent the amount of dirty pages from reaching the threshold defined by the innodb_max_dirty_pages_pct variable, which has a default value of 90. InnoDB aggressively flushes buffer pool pages if the percentage of dirty pages in the buffer pool reaches the innodb_max_dirty_pages_pct threshold.

By tuning these variables, you can make sure that dirty page flushing happens frequently, thereby ensuring data integrity.
InnoDB_read_volume Indicates the rate at which data this instance reads data. KB/Sec  
InnoDB_write_volume Indicates the rate at which this instance writes data. KB/Sec  
Buffer_read_freqcy Indicates the rate at which this instance reads data from the buffer pool. Reads/Sec A high value for this measure is indicative of a healthy buffer pool.
Buffer_write_freqcy Indicates the rate at which this instance writes data to the buffer pool. Writes/Sec Changes should be written quickly to the in-memory buffer, so that the buffer pool serves read requests using updated data, and not obsolete data. So, the value of this measure should be high.
Log_write_freqcy Indicates the rate at which write requests to the redo log files on disk are processed by this instance. Requests/Sec  
Physical_log_freqcy Indicates the rate of writes performed by this instance on the redo log files on disk. Writes/Sec  
Fsync_log_freqcy Indicates the rate at which this instance wrote to the log files on disk by calling fsync. Writes/Sec Using fsync, you can force redo logs to be written to disk. An fsync is required to keep changes to your data safe, so it’s not in an in memory buffer when power is lost.

This process should ideally be fast. That is why, a high value is desired for this measure.
No_of_table_created Indicates the number of temporary tables automatically created in the hard drive during MySQL statement execution. Number In MySQL, a temporary table is a special type of table that allows you to store a temporary result set, which you can reuse several times in a single session.

A temporary table is very handy when it is impossible or expensive to query data that requires a single SELECT statement with the JOIN clauses. In this case, you can use a temporary table to store the immediate result and use another query to process it.
Delete_state Indicates the rate at which DELETE statements were executed by this instance. Executions/Sec If the value of the Average query rate measure increases consistently for an instance, then compare the value of this measure with that of the Average of insert statement execution rate, Average of insert select statement execution rate, Average of replace statement execution rate, Average of replace select statement execution rate, Average of select statement execution rate, and Average of update statement execution rate measures of that instance to know what type of queries/statements are slowing down query processing.
Insert_state Indicates the rate at which INSERT statements were executed by this instance. Executions/Sec If the value of the Average query rate measure increases consistently for an instance, then compare the value of this measure with that of the Average of delete statement execution rate, Average of insert select statement execution rate, Average of replace statement execution rate, Average of replace select statement execution rate, Average of select statement execution rate, and Average of update statement execution rate measures of that instance to know what type of queries/statements are slowing down query processing.
Insert_select_state Indicates the rate at which INSERT_SELECT statements were executed by this instance. Executions/Sec If the value of the Average query rate measure increases consistently for an instance, then compare the value of this measure with that of the Average of delete statement execution rate, Average of insert statement execution rate, Average of replace statement execution rate, Average of replace select statement execution rate, Average of select statement execution rate, and Average of update statement execution rate measures of that instance to know what type of queries/statements are slowing down query processing.
Replace_state Indicates the rate at which REPLACE statements were executed by this instance. Executions/Sec If the value of the Average query rate measure increases consistently for an instance, then compare the value of this measure with that of the Average of delete statement execution rate, Average of insert statement execution rate, Average of insert select statement execution rate, Average of replace select statement execution rate, Average of select statement execution rate, and Average of update statement execution rate measures of that instance to know what type of queries/statements are slowing down query processing.
Replace_select_state Indicates the rate at which REPLACE_SELECT statements were executed by this instance. Executions/Sec If the value of the Average query rate measure increases consistently for an instance, then compare the value of this measure with that of the Average of delete statement execution rate, Average of insert statement execution rate, Average of insert select statement execution rate, Average of replace statement execution rate, Average of select statement execution rate, and Average of update statement execution rate measures of that instance to know what type of queries/statements are slowing down query processing.
Select_state Indicates the rate at which SELECT statements were executed by this instance. Executions/Sec If the value of the Average query rate measure increases consistently for an instance, then compare the value of this measure with that of the Average of delete statement execution rate, Average of insert statement execution rate, Average of insert select statement execution rate, Average of replace statement execution rate, Average of replace select statement execution rate, and Average of update statement execution rate measures of that instance to know what type of queries/statements are slowing down query processing.
Update_state Indicates the rate at which UPDATE statements were executed by this instance. Executions/Sec If the value of the Average query rate measure increases consistently for an instance, then compare the value of this measure with that of the Average of delete statement execution rate, Average of insert statement execution rate, Average of insert select statement execution rate, Average of replace statement execution rate, Average of replace select statement execution rate, and Average of select statement execution rate measures of that instance to know what type of queries/statements are slowing down query processing.
Line_read Indicates the rate at which rows are read by this instance from the InnoDB tables. Rows/Sec Use these measures to understand whether/not any instance experiences slowness when working with InnoDB tables. If so, then these metrics will also point you to the exact type of operation that was the slowest-insert, delete, update, or select.
Line_update Indicates the rate at which rows in InnoDB tables are updated by this instance. Rows/Sec
Line_delete Indicates the rate at which this instance deletes rows from InnoDB tables. Rows/Sec
Line_insert Indicates the rate at which this instance inserts rows into InnoDB tables. Rows/Sec
Line_write Indicates how many rows are written to log files in disk every second by this instance. Rows/Sec A high value is desired for this measure.