|
Measures reported by AlibabaRDSTest
ApsaraDB for RDS is a stable, reliable, and scalable online database service. Based on Apsara Distributed File System and high-performance SSD storage of Alibaba Cloud, ApsaraDB for RDS supports the MySQL, SQL Server, PostgreSQL, PPAS (highly compatible with Oracle), and MariaDB database engines. It provides a portfolio of solutions for disaster recovery, backup, restoration, monitoring, and migration to facilitate database operations and maintenance.
The first step to using RDS is to create an RDS instance. An instance is a virtualized database server on which you can create and manage multiple databases. If a cloud user complains that they are unable to access their database on an RDS instance, administrators need to quickly figure out why: Is the instance hosting the database down? Is the instance rebooting? Is it being deleted? Or is it locked? Moreover, administrators also need to ensure that each instance is sized with adequate CPU, memory, network, and storage resources, so that no instance experiences performance degradation. If an instance does, administrators should be able to identify the resource-starved instances and right-size them before users notice any slowness. The AlibabaRDSTest test helps with this and much more!
This test tracks the availability, operational state, and lock mode of every RDS instance, and alerts administrators to unavailable instances, instances that are currently in an abnormal state, and locked instances. Additionally, the test reports the CPU, memory, connection, disk space, and I/O capacity of each instance, and also measures how every instance uses the allocated capacity. In the process, the test pinpoints which instance is hogging which resource. With the help of these diagnostics, administrators can proactively identify and promptly eliminate issues hampering the overall performance of and user experience with the virtual database server instances.
Outputs of the test: One set of results for each RDS instance.
The measures made by this test are as follows:
| Measurement |
Description |
Measurement Unit |
Interpretation |
| Status |
Indicates the current status of this RDS instance. |
|
The values that this measure can report and their corresponding numeric values are discussed in the table below:
| Measure Value |
Numeric Value |
| Creating |
1 |
| Running |
2 |
| DBInstanceClassChanging |
3 |
| Transing |
4 |
| EngineVersionUpgrading |
5 |
| TransingToOthers |
6 |
| GuardDBInstanceCreating |
7 |
| Expired and being recycled |
8 |
| Importing |
9 |
| ImportingFromOthers |
10 |
| DBInstanceNetTypeChanging |
11 |
| GuardSwitching |
12 |
| Ins_cloning |
13 |
| Rebooting |
14 |
| Deleting |
15 |
The Measure Values discussed in the table are described in detail below:
Creating: The instance is being created.
Running: The instance is running.
DBInstanceClassChanging: The instance is being upgraded or downgraded.
Transing: The instance is being migrated.
EngineVersionUpgrading: The database engine version of the instance is being upgraded.
TransingToOthers: The data of the instance is being migrated to another instance.
GuardDBInstanceCreating: A disaster recovery instance is being created for the instance.
Importing: Data is being imported into the instance.
ImportingFromOthers: Data is being imported into the instance from another instance.
DBInstanceNetTypeChanging: The network type of the instance is being changed.
GuardSwitching: The instance is undergoing a disaster-triggered failover.
Ins_cloning: The instance is being cloned.
Rebooting: The instance is restarting.
Deleting: The instance is being deleted.
Note:
This measure reports the Measure Values listed in the table above to indicate the current state of an RDS instance. In the graph of this measure, however, the state is indicated using the numeric equivalents only.
The detailed diagnosis of this measure reveals additional details of the RDS instance, such as its type, version, instance class, port number, connection address, network type, VPC, and the name of the zone to which it belongs. |
| Instance_type |
Indicates the type of this instance. |
|
The values that this measure can report and their corresponding numeric values are discussed in the table below:
| Measure Value |
Numeric Value |
Description |
| Primary |
1 |
The primary instance role |
| Readonly |
2 |
The read-only instance role |
| Guard |
3 |
The disaster recovery instance role |
| Temp |
4 |
The temporary instance role |
Note:
This measure reports the Measure Values listed in the table above to indicate the role assigned to the RDS instance. In the graph of this measure, however, the role is indicated using the numeric equivalents only. |
| Instance_class |
Indicates the instance family/class to which this instance belongs. |
|
The values that this measure can report and their corresponding numeric values are discussed in the table below:
| Measure Value |
Numeric Value |
| Shared DB |
1 |
| General instance |
2 |
| Dedicated instance |
3 |
| Dedicated host |
4 |
The Measure Values discussed in the table are described in detail below:
| Measure Value |
Description |
| Shared DB |
A shared instance exclusively occupies the allocated memory resources, but shares CPU and storage resources with the other shared instances that are deployed on the same physical host. CPU resources are highly reused among shared instances that are deployed on the same physical host. This maximizes cost-effectiveness. Shared instances may compete for resources. |
| General instance |
A general-purpose instance exclusively occupies the allocated memory resources, but shares CPU and storage resources with the other general-purpose instances that are deployed on the same physical host. CPU resources are moderately reused among general-purpose instances that are deployed on the same physical host. This increases cost-effectiveness. The storage capacity of a general-purpose instance is independent of the number of CPU cores and memory capacity. You can flexibly configure the storage capacity based on your business requirements. |
| Dedicated instance |
A dedicated instance exclusively occupies the allocated CPU and memory resources. Its performance remains stable and is not affected by the other instances that are deployed on the same physical host. |
| Dedicated host |
The top configuration of the dedicated instance family is dedicated host. A dedicated host instance occupies all the resources on the physical host where it is deployed. |
Note:
This measure reports the Measure Values listed in the table above to indicate the instance family. In the graph of this measure, however, the instance family is indicated using the numeric equivalents only. |
| Lock_mode |
Indicates the lock mode of this instance. |
|
The values that this measure can report and their corresponding numeric values are discussed in the table below:
| Measure Value |
Numeric Value |
| Unlock |
1 |
| Manual lock |
2 |
| Lock by expiration |
3 |
| Lock by restoration |
4 |
| Lock by disk quota |
5 |
The Measure Values discussed in the table are described in detail below:
| Measure Value |
Description |
| Unlock |
The instance is not locked. |
| Manual lock |
The instance has been manually locked. |
| Lock by expiration |
The instance has been automatically locked upon expiration. |
| Lock by restoration |
The instance has been automatically locked before a rollback. |
| Lock by disk quota |
The instance has been automatically locked because the storage capacity is exhausted. |
Note:
This measure reports the Measure Values listed in the table above to indicate the lock mode of an instance. In the graph of this measure, however, the lock mode is indicated using the numeric equivalents only. |
| Connection_mode |
Indicates the access mode of this instance. |
|
The values that this measure can report and their corresponding numeric values are discussed in the table below:
| Measure Value |
Numeric Value |
| Standard |
1 |
| High security |
2 |
Note:
This measure reports the Measure Values listed in the table above to indicate the connection mode of an instance. In the graph of this measure, however, the connection mode is indicated using the numeric equivalents only. |
| Total_memory |
Indicates the memory configuration of this instance. |
MB |
|
| Total_storage |
Indicates the total storage capacity of this instance. |
MB |
|
| DBMax_count |
Indicates the maximum number of databases that can be created on this instance. |
Number |
|
| DB_account_count |
Indicates the maximum number of accounts that can be created on this instance. |
Number |
|
| Availability |
Indicates whether or not this instance is currently available. |
Percent |
While the value 100 indicates that the instance is available, the value 0 denotes that the instance is unavailable. |
| Maximum_iops |
Indicates the maximum number of I/O requests this instance can process per second. |
Number |
|
| Maximum_conn |
Indicates the maximum number of concurrent connections this instance can handle. |
Number |
|
| Total_cpu |
Indicates the total number of CPU cores allocated to this instance. |
Number |
|
| Disk_usage |
Indicates the amount of disk space that this instance is currently utilizing. |
MB |
Ideally, the value of this measure should be much lower than the value of the Total storage measure. If this measure value is close to or is rapidly approaching the value of the Total storage measure, it implies that the instance is fast exhausting its storage capacity. This can be detrimental to the performance of the instance. To prevent a storage crunch, you may want to configure the instance with additional storage space. Alternatively, you can compare the values of the Space occupied by data files, Space occupied by log files, Space occupied by backups, Space occupied by SQL data, and Cold backup data measures to understand what type of data is consuming storage space. You can then see if data of any of these types can be deleted, so as to make more storage space available for critical data. |
| Data_size |
Indicates the amount of storage space of this instance that is occupied by data files. |
MB |
If the Percent usage measure of an instance is close to 100%, compare the values of these measures for that instance to determine which type of files is contributing to the storage crunch: data files, log files, backup files, SQL data files, or files in cold backup. |
| Log_size |
Indicates the amount of storage space of this instance that is occupied by log files. |
MB |
| Backup_size |
Indicates the amount of storage space of this instance that is occupied by backups. |
MB |
| Sql_size |
Indicates the amount of storage space of this instance that is occupied by SQL data. |
MB |
| Cold_backup |
Indicates the amount of storage space of this instance that is occupied by cold backups. |
MB |
| Current_iops |
Indicates the rate at which this instance processes I/O operations. |
Operations/Sec |
If the value of this measure is close to the value of the Maximum I/O requests measure for any instance, it means that the I/O load on that instance is very high. To ensure that the instance does not reject or drop I/O requests, first make sure that the instance has adequate processing power to meet the demand (i.e., sufficient CPU, memory, storage space, etc.), and then increase the limit set for the number of I/O requests that the instance can process per second. |
| Network_inbound |
Indicates the average amount of data traffic flowing into this instance. |
KB |
|
| Network_outbound |
Indicates the average amount of data traffic flowing out of this instance. |
KB |
|
| Total_badwidth |
Indicates the network throughput of this instance. |
KB |
Compare the value of this measure across instances to identify the precise instance that is consuming bandwidth excessively. |
| Total_session |
Indicates the current number of connections to this instance. |
Number |
If the value of this measure is close to the Maximum concurrent connections measure for any instance, it implies that very soon the instance may not be able to entertain new connections. Under such circumstances, you may want to check to see if there are any idle connections to the instance and terminate them, so that the instance can handle more connections. The count of idle connections is the difference between the value of the Total current connections measure and the Total currently active connections measure. Alternatively, you can also increase the concurrent connection limit of the instance. |
| Active_session |
Indicates the count of connections to this instance that are currently active. |
Number |
Ideally, the value of this measure should be equal to the value of the Total current connections measure. If it is much lower than the value of the Total current connections measure, it means that many connections to the instance are idle/inactive. By identifying and removing such connections, you can increase the connection handling capacity of the instance. |
| Used_memory |
Indicates the amount of memory currently used by this instance. |
MB |
Ideally, the value of this measure should be much lower than that of the Total memory measure. |
| Free_memory |
Indicates the amount of memory that this instance is not using currently. |
MB |
For best performance, the value of this measure should be high. |
| Free_disk |
Indicates the amount of storage space that this instance is not using currently. |
MB |
For best performance, the value of this measure should be high. |
| Cpu_utilization |
Indicates the percentage of allocated CPU resources that is used by this instance. |
Percent |
A value close to 100% is a cause for concern, as it denotes that the instance is hogging the CPU resources. If the instance is a shared instance or a general-purpose instance, then excessive CPU utilization by that instance can cause the other instances on the same physical host to contend for the remaining CPU resources. In this case, you may want to increase the CPU capacity of the host. |
| Memory_utilization |
Indicates the percentage of allocated memory resources that is used by this instance. |
Percent |
A value close to 100% is a cause for concern, as it denotes that the instance is rapidly running out of memory. Without enough memory, the instance may fail to service user requests to it. To avoid this, make sure you size the instance with sufficient memory resources. |
| Disk_utilization |
Indicates the percentage of allocated disk space that is used by this instance. |
Percent |
If the Percent usage measure of an instance is close to 100%, compare the values of the Space occupied by data files, Space occupied by log files, Space occupied by backups, Space occupied by SQL data, and Cold backup data measures for that instance to determine which type of files is contributing to the storage crunch: data files, log files, backup files, SQL data files, or files in cold backup. |
| IOPS_utilization |
Indicates the percentage of I/O resources that is used by this instance. |
Percent |
A value close to 100% is a cause for concern, as it denotes that the instance is rapidly approaching the I/O request limit configured for it. To ensure that the instance services I/O requests to it without rejecting them, you may want to consider increasing the maximum number of I/O requests that instance can handle. |
| Conn_utilization |
Indicates the percentage of connections used by this instance. |
Percent |
A value close to 100% is a cause for concern, as it implies that very soon the instance may not be able to entertain new connections. Under such circumstances, you may want to check to see if there are any idle connections to the instance and terminate them, so that the instance can handle more connections. The count of idle connections is the difference between the value of the Total current connections measure and the Total currently active connections measure. Alternatively, you can also increase the concurrent connection limit of the instance. |
|
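The interpretations above can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the test itself: it assumes the measures of one instance have been exported into a plain dictionary keyed by measure name, and the helper names (`status_name`, `idle_connections`, `near_limit`) and the 90% threshold are hypothetical choices made for this example. The status codes are taken from the Status table above.

```python
# Numeric equivalents of the Status measure, as listed in the Status table.
STATUS_BY_CODE = {
    1: "Creating",
    2: "Running",
    3: "DBInstanceClassChanging",
    4: "Transing",
    5: "EngineVersionUpgrading",
    6: "TransingToOthers",
    7: "GuardDBInstanceCreating",
    8: "Expired and being recycled",
    9: "Importing",
    10: "ImportingFromOthers",
    11: "DBInstanceNetTypeChanging",
    12: "GuardSwitching",
    13: "Ins_cloning",
    14: "Rebooting",
    15: "Deleting",
}

# Percentage measures that the table flags as a concern when close to 100%.
UTILIZATION_MEASURES = (
    "Cpu_utilization",
    "Memory_utilization",
    "Disk_utilization",
    "IOPS_utilization",
    "Conn_utilization",
)


def status_name(code):
    """Translate a numeric Status value from a graph back to its Measure Value."""
    return STATUS_BY_CODE.get(code, f"Unknown ({code})")


def idle_connections(total_session, active_session):
    """Idle connections = current connections minus currently active connections."""
    return max(total_session - active_session, 0)


def near_limit(measures, threshold=90.0):
    """Return the utilization measures at or above the threshold.

    The 90% default is an assumption for illustration; the documentation
    only says that values "close to 100%" are a cause for concern.
    """
    return [
        name
        for name in UTILIZATION_MEASURES
        if measures.get(name, 0.0) >= threshold
    ]


if __name__ == "__main__":
    # Hypothetical sample values for a single RDS instance.
    sample = {
        "Status": 14,
        "Total_session": 120,
        "Active_session": 45,
        "Cpu_utilization": 95.0,
        "Disk_utilization": 40.0,
    }
    print(status_name(sample["Status"]))
    print(idle_connections(sample["Total_session"], sample["Active_session"]))
    print(near_limit(sample))
```

A missing utilization measure is treated as 0 here so that partial data does not raise an error; a stricter script might prefer to report missing measures separately.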