eG Monitoring  

Measures reported by RedisClusterTest

Redis Cluster provides a way to run a Redis installation where data is automatically sharded across multiple Redis nodes.

Redis Cluster also provides some degree of availability during partitions; in practical terms, this is the ability to continue operating when some nodes fail or are unable to communicate. However, the cluster stops operating in the event of larger failures (for example, when the majority of masters are unavailable).

Redis Cluster does not use consistent hashing, but a different form of sharding where every key is conceptually part of what we call a hash slot. Every node in a Redis Cluster is responsible for a subset of the hash slots, so for example you may have a cluster with 3 nodes, where:


  • Node A contains hash slots from 0 to 5500.

  • Node B contains hash slots from 5501 to 11000.

  • Node C contains hash slots from 11001 to 16383.

This makes it easy to add nodes to and remove nodes from the cluster.
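To make the slot mapping concrete, here is a minimal sketch of how a key could be mapped to a hash slot and then to one of the three example nodes. It assumes the rule from the Redis Cluster specification (CRC16 of the key, XMODEM variant, modulo 16384) and ignores hash tags; the names `key_hash_slot`, `owning_node`, and `EXAMPLE_RANGES` are hypothetical and exist only for this illustration.

```python
import binascii

HASH_SLOTS = 16384  # slots are numbered 0..16383, as in the example above

# Slot ranges of the three-node example (inclusive bounds).
EXAMPLE_RANGES = {"A": (0, 5500), "B": (5501, 11000), "C": (11001, 16383)}


def key_hash_slot(key: bytes) -> int:
    """Return the hash slot for a key: CRC16 (XMODEM) modulo 16384.

    Assumption: hash tags ({...} sections) are not handled in this sketch.
    """
    # binascii.crc_hqx uses the CRC-CCITT polynomial 0x1021; an initial
    # value of 0 gives the XMODEM variant.
    return binascii.crc_hqx(key, 0) % HASH_SLOTS


def owning_node(slot: int) -> str:
    """Return the example node whose range contains the given slot."""
    for node, (low, high) in EXAMPLE_RANGES.items():
        if low <= slot <= high:
            return node
    raise ValueError(f"slot {slot} is outside 0..{HASH_SLOTS - 1}")


if __name__ == "__main__":
    slot = key_hash_slot(b"123456789")
    print(slot, owning_node(slot))
```

Because each node owns a contiguous range in this example, moving slots between nodes only changes the table, not the way a key is hashed.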

In order to remain available when a subset of master nodes fail or are not able to communicate with the majority of nodes, Redis Cluster uses a master-slave model where every hash slot has from 1 (the master itself) to N replicas (N-1 additional slave nodes).

In our example cluster with nodes A, B, C, if node B fails the cluster is not able to continue, since we no longer have a way to serve hash slots in the range 5501-11000.

However when the cluster is created (or at a later time) we add a slave node to every master, so that the final cluster is composed of A, B, C that are master nodes, and A1, B1, C1 that are slave nodes. This way, the system is able to continue if node B fails.

Because node B1 replicates B, if B fails, the cluster will promote node B1 as the new master and will continue to operate correctly.

However, note that if nodes B and B1 fail at the same time, Redis Cluster is not able to continue to operate.
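The failure scenarios above can be sketched as a simple check: a slot range stays servable as long as its master or at least one of its slaves is still alive. This is only an illustration of the A/B/C example; the function name and data layout are hypothetical, and the sketch deliberately ignores the separate requirement that a majority of masters remain reachable.

```python
# Example topology from the text: each master has exactly one slave.
REPLICAS_OF = {"A": ["A1"], "B": ["B1"], "C": ["C1"]}


def cluster_can_continue(failed):
    """Return True if every slot range still has a live node to serve it.

    `failed` is a set of node names that are currently down.
    """
    for master, slaves in REPLICAS_OF.items():
        candidates = [master] + slaves
        # If the master and all of its slaves are down, the slot range
        # owned by that master can no longer be served.
        if all(node in failed for node in candidates):
            return False
    return True


if __name__ == "__main__":
    print(cluster_can_continue({"B"}))        # B1 can be promoted
    print(cluster_can_continue({"B", "B1"}))  # nobody left for B's slots
```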

To avoid this, administrators must monitor the Redis cluster, understand how many master nodes it is composed of, track the status of the hash slots assigned to each node, and be promptly alerted if any hash slot fails. To achieve this, administrators can use the RedisClusterTest.

For a cluster-enabled Redis instance, this test reports the composition of the cluster in terms of the number of master nodes and hash slots assigned to the cluster. In addition, the test tracks the status of the hash slots, and notifies administrators if any hash slot fails. Moreover, the test also alerts administrators if any node is added to or removed from the cluster.

Outputs of the test: One set of results for the cluster-enabled instance

The measures made by this test are as follows:

Measurement Description Measurement Unit Interpretation
cluster_enabled Indicates whether/not the cluster feature is enabled for the target Redis instance.   If the instance is cluster-enabled, then this measure will report the value Yes. For a cluster-disabled instance, this measure will report the value No.

The numeric values that correspond to these measure values are discussed in the table below:

Measure Value Numeric Value
Yes 1
No 0


Note:

This measure reports the Measure Values listed in the table above to indicate whether/not the target instance is cluster-enabled. The graph of this measure however, indicates the same using the numeric equivalents only.
cluster_my_epoch Indicates the Config Epoch of this node. Number Basically the epoch is a logical clock for the cluster and dictates that given information wins over one with a smaller epoch.

An Epoch is used in order to give incremental versioning to events. When multiple nodes provide conflicting information, it becomes possible for another node to understand which state is the most up to date.

Every master always advertises its configEpoch in ping and pong packets along with a bitmap advertising the set of slots it serves. Slave nodes also advertise the configEpoch field in ping and pong packets, but in the case of slaves the field represents the configEpoch of its master as of the last time they exchanged packets.

A new configEpoch is created during slave election. Slaves trying to replace failing masters increment their epoch and try to get authorization from a majority of masters. When a slave is authorized, a new unique configEpoch is created and the slave turns into a master using the new configEpoch.
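As a rough illustration of the "greater epoch wins" rule described for this measure, the sketch below resolves conflicting ownership claims for a single hash slot by picking the claim with the highest configEpoch. The `SlotClaim` type and `resolve_owner` function are hypothetical names for this example, not part of Redis or of the test.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SlotClaim:
    node: str          # hypothetical node name
    config_epoch: int  # configEpoch advertised along with the claim


def resolve_owner(claims):
    """Return the claim with the greatest configEpoch for one hash slot."""
    return max(claims, key=lambda claim: claim.config_epoch)


if __name__ == "__main__":
    # B advertised the slot with epoch 3; B1 was later promoted with epoch 4.
    winner = resolve_owner([SlotClaim("B", 3), SlotClaim("B1", 4)])
    print(winner.node)
```

Since a newly promoted slave obtains a new, unique configEpoch, its claim outranks the stale claim of the master it replaced.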
cluster_slots_ok Indicates the number of hash slots in the cluster that are in the OK state. Number If the value of this measure is the same as the value of the cluster_slots_assigned measure, it means that all hash slots mapped to all nodes in the cluster are working correctly.

On the other hand, if the value of this measure is much lower than the value of the cluster_slots_assigned measure, it means that the hash slots mapped to some nodes are in the FAIL or PFAIL state. You may want to look up the values of the Number of hash slots in PFAIL state and Number of hash slots in FAIL state measures to confirm this.

cluster_slots_pfail Indicates the number of hash slots that are mapped to a node in PFAIL state. Number Ideally, the value of this measure should be very low or 0.

A node flags another node with the PFAIL flag when the node is not reachable for more than NODE_TIMEOUT time. Both master and slave nodes can flag another node as PFAIL, regardless of its type.

Note that those hash slots still work correctly, as long as the PFAIL state is not promoted to FAIL by the failure detection algorithm. PFAIL only means that we are currently not able to talk with the node, but this may be just a transient error.

cluster_slots_fail Indicates the number of hash slots that are mapped to a node in FAIL state. Number Every node sends gossip messages to every other node including the state of a few random known nodes. Every node eventually receives a set of node flags for every other node. This way every node has a mechanism to signal other nodes about failure conditions they have detected.

A PFAIL condition is escalated to a FAIL condition when the following set of conditions are met:

  • Some node, say node A, has another node B flagged as PFAIL.
  • Node A collected, via gossip sections, information about the state of B from the point of view of the majority of masters in the cluster.
  • The majority of masters signaled the PFAIL or FAIL condition within NODE_TIMEOUT * FAIL_REPORT_VALIDITY_MULT time. (The validity factor is set to 2 in the current implementation, so this is just two times the NODE_TIMEOUT time).

    If all the above conditions are true, Node A will:

  • Mark the node as FAIL.
  • Send a FAIL message to all the reachable nodes.


    Ideally therefore, the value of this measure should be 0.
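The escalation conditions listed for this measure can be sketched as a small decision function. This is a simplified illustration, not the Redis implementation: the function name, its parameters, and the use of plain timestamps in seconds are assumptions, and the caller decides whether node A's own view is included among the reports.

```python
FAIL_REPORT_VALIDITY_MULT = 2  # validity factor quoted in the text


def should_mark_fail(locally_pfail, report_times, master_count, now, node_timeout):
    """Decide whether node A should escalate node B from PFAIL to FAIL.

    locally_pfail: True if A itself has B flagged as PFAIL.
    report_times:  maps a master's name to the time (in seconds) at which A
                   last received that master's PFAIL/FAIL report about B.
    master_count:  number of masters in the cluster.
    """
    if not locally_pfail:
        return False
    # Only reports received within NODE_TIMEOUT * FAIL_REPORT_VALIDITY_MULT count.
    window = node_timeout * FAIL_REPORT_VALIDITY_MULT
    fresh = sum(1 for t in report_times.values() if now - t <= window)
    majority = master_count // 2 + 1
    return fresh >= majority


if __name__ == "__main__":
    reports = {"A": 100.0, "C": 101.0}
    print(should_mark_fail(True, reports, master_count=3, now=110.0, node_timeout=15.0))
```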
cluster_stats_messages_received Indicates the number of messages received via the cluster node-to-node binary bus. Number All the cluster nodes are connected using a TCP bus and a binary protocol, called the Redis Cluster Bus. Every node is connected to every other node in the cluster using the cluster bus. Nodes use a gossip protocol to propagate information about the cluster in order to discover new nodes, to send ping packets to make sure all the other nodes are working properly, and to send cluster messages needed to signal specific conditions. The cluster bus is also used in order to propagate Pub/Sub messages across the cluster and to orchestrate manual failovers when requested by users (manual failovers are failovers which are not initiated by the Redis Cluster failure detector, but by the system administrator directly).
cluster_state Indicates the current state of the cluster.   This measure can report any of the following values:

  • OK: If the node is able to receive queries, then this measure will report the value OK.
  • Fail: If there is at least one hash slot that is unbound (no node associated) or in an error state (the node serving it is flagged with the FAIL flag), or if the majority of masters can't be reached by this node, then this measure will report the value Fail.


    The numeric values that correspond to the measure values discussed above are as follows:

    Measure Value Numeric Value
    Fail 0
    OK 1


    Note:

    This measure reports the Measure Values listed in the table above to indicate the cluster state. The graph of this measure however, indicates the same using the numeric equivalents only.
    cluster_known_nodes Indicates the number of nodes in the cluster. Number To know the details of the nodes in the cluster, use the detailed diagnosis of this measure.
    cluster_current_epoch Indicates the local current Epoch variable Number Basically the epoch is a logical clock for the cluster and dictates that given information wins over one with a smaller epoch.

    An Epoch is used in order to give incremental versioning to events. When multiple nodes provide conflicting information, it becomes possible for another node to understand which state is the most up to date.

    The currentEpoch is a 64 bit unsigned number.

    At node creation, every Redis Cluster node, whether slave or master, sets its currentEpoch to 0.

    Every time a packet is received from another node, if the epoch of the sender (part of the cluster bus messages header) is greater than the local node epoch, the currentEpoch is updated to the sender epoch.

    Because of these semantics, eventually all the nodes will agree to the greatest currentEpoch in the cluster.

    This information is used when the state of the cluster is changed and a node seeks agreement in order to perform some action.

    Currently this happens only during slave promotion.
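The currentEpoch update rule described for this measure amounts to keeping the maximum epoch seen so far. A minimal sketch, with a hypothetical `NodeEpoch` class standing in for a node's local state:

```python
class NodeEpoch:
    """Tracks the local currentEpoch of one (hypothetical) cluster node."""

    def __init__(self):
        # Every node, master or slave, starts with currentEpoch 0.
        self.current_epoch = 0

    def on_packet(self, sender_epoch):
        """Adopt the sender's epoch if it is greater than the local one."""
        if sender_epoch > self.current_epoch:
            self.current_epoch = sender_epoch


if __name__ == "__main__":
    node = NodeEpoch()
    for epoch in (5, 3, 8):
        node.on_packet(epoch)
    print(node.current_epoch)
```

Because the value never decreases, nodes that keep exchanging packets converge on the greatest currentEpoch in the cluster.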
    cluster_slots_assigned Indicates the total number of hash slots assigned to the cluster. Number  
    cluster_size Indicates the number of master nodes in the cluster. Number  
    cluster_added_nodes Indicates the number of nodes added to the cluster. Number Use the detailed diagnosis of this measure to know which nodes were recently added to the cluster.
    cluster_deleted_nodes Indicates the number of nodes deleted from the cluster. Number Use the detailed diagnosis of this measure to know which nodes were recently deleted from the cluster.
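Several of the measures above share their names with fields of the reply to the Redis `CLUSTER INFO` command. Assuming that reply is available as text, the sketch below parses it and derives the slot-health checks discussed for cluster_state, cluster_slots_ok, and cluster_slots_fail. The sample values and the helper names `parse_cluster_info` and `slot_alerts` are made up for illustration; how the test itself collects these values is not stated here.

```python
# Hypothetical sample of a CLUSTER INFO reply (values are invented).
SAMPLE = """cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:6
cluster_my_epoch:2
"""


def parse_cluster_info(raw):
    """Turn 'name:value' lines into a dict, converting digits to int."""
    info = {}
    for line in raw.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue
        name, _, value = line.partition(":")
        info[name] = int(value) if value.isdigit() else value
    return info


def slot_alerts(info):
    """Return a list of human-readable problems found in the parsed reply."""
    alerts = []
    if info.get("cluster_state") != "ok":
        alerts.append("cluster_state is not ok")
    if info.get("cluster_slots_fail", 0) > 0:
        alerts.append("some hash slots are mapped to a node in FAIL state")
    if info.get("cluster_slots_ok") != info.get("cluster_slots_assigned"):
        alerts.append("cluster_slots_ok differs from cluster_slots_assigned")
    return alerts


if __name__ == "__main__":
    print(slot_alerts(parse_cluster_info(SAMPLE)))
```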