eG Monitoring
 

Measures reported by AlibabaRedisDBTest

ApsaraDB for Redis is a database service that is compatible with native Redis protocols. It supports a hybrid of memory and hard disks for data persistence. ApsaraDB for Redis provides a highly available hot standby architecture and can scale to meet requirements for high-performance and low-latency read/write operations.

To ensure that every Redis instance in use on the cloud delivers on its promise of high performance at all times, administrators must continuously monitor the status, resource usage, query processing ability, and overall health of every instance, swoop down on potential issues, and eliminate them before they impact user experience with that instance. This is where the AlibabaRedisDBTest helps!

This test auto-discovers the Redis instances that have been configured, and reports the status of each instance. This way, the test points administrators to inactive, unavailable, and error-prone instances. Additionally, the test measures how quickly an instance processes queries, and in the process, sheds light on probable bottlenecks in query processing. The CPU, memory, connection, and bandwidth usage of each instance is also monitored, so that instances experiencing serious resource contention can be identified quickly. If the contention persists, then administrators can consider resizing the instances to resolve it. This way, the test helps accurately isolate problematic instances, so that administrators can quickly fix those problems and ensure that the critical database service is uninterrupted.

Outputs of the test: One set of results for every Redis instance.

The measures made by this test are as follows:

Measurement Description Measurement Unit Interpretation
Instance_status Indicates the current status of this instance.   The values that this measure can report and their corresponding numeric values are discussed in the table below:

Measure Value Numeric Value
Normal 1
Creating 2
Changing 3
Flushing 4
Transforming 5
BackupRecovering 6
MinorVersionUpgrading 7
NetworkModifying 8
SSLModifying 9
MajorVersionUpgrading 10
Released 11
Inactive 12
Unavailable 13
Error 14

The Measure Values discussed in the table are described in detail below:

  • Normal: The instance runs as expected.

  • Creating: The instance is being created.

  • Changing: The configuration of the instance is being changed.

  • Inactive: The instance is disabled.

  • Flushing: The data of the instance is being flushed.

  • Released: The instance is released.

  • Transforming: The instance is being transformed.

  • Unavailable: The service is unavailable.

  • Error: Failed to create the instance.

  • Migrating: The instance is being migrated.

  • BackupRecovering: The instance is being backed up or restored.

  • MinorVersionUpgrading: The minor version is being upgraded.

  • NetworkModifying: The network is being changed.

  • SSLModifying: The SSL feature is being changed.

  • MajorVersionUpgrading: The major version is being upgraded and the service is available.

Note:

This measure reports the Measure Values listed in the table above to indicate the current state of a Redis instance. In the graph of this measure, however, the same is indicated using the numeric equivalents only.

Use the detailed diagnosis of this measure to view the complete details of the Redis instance.

Is_rds Indicates whether the instance is managed by Relational Database Service (RDS).   The values that this measure can report and their corresponding numeric values are discussed in the table below:

Measure Value Numeric Value
True 1
False 0

Note:

This measure reports the Measure Values listed in the table above to indicate whether or not the target instance is managed by RDS. In the graph of this measure, however, the same is indicated using the numeric equivalents only.

Order_renewal Indicates whether there was an order of renewal with configuration change that had not taken effect.   After a subscription instance expires, you must renew the instance within days after the expiration to continue the use of the instance. To avoid service interruption caused by an expired subscription, we recommend that you manually renew the instance or enable auto-renewal before the instance expires.

You can change the specifications of a subscription instance before or after the instance expires. Higher specifications are charged more than lower specifications. For example, the price of an 8 GB read/write splitting instance with 5 read replicas is higher than that of a 16 GB cluster instance. If you want to change a 16 GB cluster instance to an 8 GB read/write splitting instance with 5 read replicas, you must upgrade the instance.

If the specification change you initiated at the time of renewal is not effected, then the value of this measure will be False. If the change is effected, then this measure will report the value True.

The values that this measure can report and their corresponding numeric values are discussed in the table below:

Measure Value Numeric Value
True 1
False 0

Note:

This measure reports the Measure Values listed in the table above to indicate whether or not the configuration change initiated at the time of instance renewal has been applied. In the graph of this measure, however, the same is indicated using the numeric equivalents only.

Storage_capacity Indicates the storage capacity of this instance. MB  
Average_qps Indicates the rate at which this instance processes queries. Queries/Sec A high value is desired for this measure. A low value signifies slowness in query processing. Compare the value of this measure across Redis instances to know which instance is processing queries slowly.
Max_bandwidth Indicates the maximum bandwidth that this instance can support. MB/Sec If network resources are sufficient, the bandwidth is unlimited for ApsaraDB for Redis instances. However, if network resources are insufficient, the maximum bandwidth takes effect for the instances.
Max_connection Indicates the maximum number of connections that this instance can support. Number  
Account_status Indicates the current status of the account of this instance.   The values that this measure can report and their corresponding numeric values are discussed in the table below:

Measure Value Numeric Value
Available 1
Unavailable 2

Note:

This measure reports the Measure Values listed in the table above to indicate the account status of an instance. In the graph of this measure however, the same is indicated using the numeric equivalents only.

Account_type Indicates the account type of this instance.   The values that this measure can report and their corresponding numeric values are discussed in the table below:

Measure Value Numeric Value
Normal 1
Super 2

Note:

This measure reports the Measure Values listed in the table above to indicate the account type of an instance. In the graph of this measure however, the same is indicated using the numeric equivalents only.

Account_privilege Indicates the permissions of this instance's account.   The values that this measure can report, their descriptions, and their corresponding numeric values are discussed in the table below:

Measure Value Description Numeric Value
RoleReadOnly This account has read-only permissions. 1
RoleReadWrite This account has read and write permissions. 2
RoleRepl This account has replication permissions. 3

Note:

This measure reports the Measure Values listed in the table above to indicate the account permissions of an instance. In the graph of this measure however, the same is indicated using the numeric equivalents only.

Backup_status Indicates the status of this instance's backup.   The values that this measure can report and their corresponding numeric values are discussed in the table below:

Measure Value Numeric Value
Success 1
Failed 0

Note:

This measure reports the Measure Values listed in the table above to indicate the backup status of an instance. In the graph of this measure however, the same is indicated using the numeric equivalents only.

Use the detailed diagnosis of this measure to view the status of each backup of this instance. This way, you can identify the backup that failed.

Memory_usage Indicates the percentage of memory used by this instance. Percent If the value of this measure is close to 100% for any instance, it implies that the instance is running out of memory. You may want to consider resizing such instances to avoid memory contention.
Connection_usage Indicates the percentage of connections used by this instance. Percent If the value of this measure is close to 100% for any instance, it means that the instance is about to reach its connection limit. Once the limit is reached, the instance will not be able to accept any new connections. To avoid this outcome, you may want to consider increasing the connection limit of the instance.
Write_bandwidth Indicates the percentage of bandwidth consumed by this instance when performing write operations. Percent If the value of this measure is close to 100%, it means that that instance has spent almost its entire bandwidth limit on write operations. Without adequate bandwidth resources, read operations may slow down. To avoid this, you may want to consider increasing the maximum bandwidth that the instance can use.
Read_bandwidth Indicates the percentage of bandwidth consumed by this instance when performing read operations. Percent If the value of this measure is close to 100%, it means that that instance has spent almost its entire bandwidth limit on read operations. Without adequate bandwidth resources, write operations may slow down. To avoid this, you may want to consider increasing the maximum bandwidth that the instance can use.
Write_speed Indicates the intranet write speed of this instance. KB/Sec Ideally, the value of this measure should be high. A low value indicates that the instance is slow in performing write operations over the intranet.
Read_speed Indicates the intranet read speed of this instance. KB/Sec Ideally, the value of this measure should be high. A low value indicates that the instance is slow in performing read operations over the intranet.
Failed_operation Indicates the count of operations that failed on this instance's KVStore. Number Redis is an in-memory, non-relational key-value store (KVStore). This means that it stores data based on keys and values; think of it as a giant dictionary that uses words and their definitions to store information. The keys (or words) are required in order to retrieve their values (definitions).

Ideally, the value of this measure should be 0. A non-zero value indicates that one or more operations have failed on this instance's key-value store. This can be detrimental to the health of the instance.
CPU_usage Indicates the percentage of CPU resources used by this instance. Percent If the value of this measure is close to 100% for any instance, it implies that the instance is consuming CPU resources excessively. You may want to consider resizing such instances to avoid CPU contention.
Used_memory Indicates the amount of memory currently used by this instance. MB Compare the value of this measure across instances to identify the instance that is consuming maximum memory.
Used_connection Indicates the count of connections currently in use for this instance. Number Compare the value of this measure across instances to identify the instance that is supporting the maximum number of connections.

For such an instance, compare the value of this measure with that of the Max_connection measure to figure out if that instance is about to reach its connection limit. If so, then consider increasing the connection limit of that instance, so as to avoid unnecessary contention for connections.
QPS_count Indicates the number of queries that this instance executes every second. Number Compare the value of this measure across instances to know which instance is running the maximum number of queries each second. For such an instance, compare the value of this measure with the maximum QPS configured for that instance to understand whether that instance is capable of running more queries, or is about to exhaust its query processing power. In the latter case, you may want to increase the QPS of the instance to make sure that the instance continues to process queries without a glitch.
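
The numeric status codes and the used-versus-maximum comparisons described above can be illustrated with a short Python sketch. This is not part of the test or of any eG or Alibaba Cloud API. The function names, the 90% threshold, and the choice of which states count as problematic are assumptions made for illustration only. The status codes themselves are copied from the Instance_status table on this page.

```python
# Numeric codes as listed in the Instance_status table on this page.
INSTANCE_STATUS = {
    1: "Normal", 2: "Creating", 3: "Changing", 4: "Flushing",
    5: "Transforming", 6: "BackupRecovering", 7: "MinorVersionUpgrading",
    8: "NetworkModifying", 9: "SSLModifying", 10: "MajorVersionUpgrading",
    11: "Released", 12: "Inactive", 13: "Unavailable", 14: "Error",
}

# Assumption: the states this sketch treats as needing attention.
PROBLEM_STATES = {"Inactive", "Unavailable", "Error"}


def decode_status(code):
    """Translate a numeric value seen in the graph back to its Measure Value."""
    return INSTANCE_STATUS.get(code, "Unknown")


def usage_percent(used, maximum):
    """Express a used count (e.g. Used_connection) as a percentage of its
    limit (e.g. Max_connection)."""
    if maximum <= 0:
        raise ValueError("maximum must be positive")
    return 100.0 * used / maximum


def needs_attention(status_code, used_connection, max_connection, threshold=90.0):
    """Flag an instance that is in a problem state or is close to its
    connection limit. The 90% default threshold is a hypothetical value."""
    if decode_status(status_code) in PROBLEM_STATES:
        return True
    return usage_percent(used_connection, max_connection) >= threshold


if __name__ == "__main__":
    print(decode_status(13))
    print(usage_percent(9500, 10000))
    print(needs_attention(1, 9500, 10000))
```

The same comparison can be applied to QPS_count against the maximum QPS configured for an instance by passing those two values to usage_percent.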