|
Measures reported by LCOverViewTest
A Linux cluster is a connected array of Linux computers, or nodes, that work together and can be viewed and managed as a single system. The redundancy of cluster components eliminates single points of failure. A Linux cluster may consist of connected servers, storage devices, or virtualized containers, and helps reduce downtime and deliver high availability of IT services and mission-critical workloads. Compared to a single computer, a Linux cluster can provide faster processing speed, larger storage capacity, better data integrity, greater reliability, and wider availability of resources. Clusters are usually dedicated to specific functions, such as load balancing, high availability, high performance, storage, or large-scale processing.
Pacemaker is an open source high-availability cluster resource manager that runs on a set of nodes. It provides a framework for managing the availability of resources, which are services on a host that need to be kept highly available. Pacemaker is responsible for providing maximum availability for your cluster services/resources by detecting and recovering from node-level and resource-level failures. It uses the messaging and membership capabilities provided by Corosync to keep resources available on any of the cluster nodes. Pacemaker supports a maximum of 16 nodes per cluster. If nodes are offline or under maintenance for a long duration, delays may be noticed during failover. This may lead to a poor user experience and, in some cases, loss of data. To avoid this, it is essential to monitor the overall status of the target cluster round the clock. The LCOverViewTest helps administrators keep a round-the-clock vigil on the target cluster.
This test reports the current status of the target Linux cluster and the total number of nodes configured in the cluster. Using this test, administrators can precisely identify the number of Pacemaker nodes and Pacemaker remote nodes that are online, offline, in standby mode, or under maintenance. This test also reports the number of resource groups and resources available in the cluster. Using this test, administrators can isolate those Pacemaker nodes that are frequently offline and those that are frequently put under maintenance.
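The per-state breakdown described above can be sketched as a simple tally. This is a minimal illustration, not the test's actual implementation: the record fields (`online`, `standby`, `maintenance`, `remote`) are hypothetical, and the sketch assumes each node is counted in exactly one state, with offline taking precedence over maintenance and maintenance over standby.

```python
from collections import Counter


def classify(node):
    """Return a single state label for one node record (hypothetical schema)."""
    if not node.get("online", False):
        return "offline"
    if node.get("maintenance", False):
        return "maintenance"
    if node.get("standby", False):
        return "standby"
    return "online"


def tally(nodes):
    """Count nodes per (kind, state), where kind is 'pacemaker' or 'remote'."""
    counts = Counter()
    for node in nodes:
        kind = "remote" if node.get("remote", False) else "pacemaker"
        counts[(kind, classify(node))] += 1
    return counts


# Illustrative data only; the node names are made up.
sample_nodes = [
    {"name": "node1", "online": True},
    {"name": "node2", "online": True, "standby": True},
    {"name": "node3", "online": False},
    {"name": "remote1", "online": True, "remote": True, "maintenance": True},
]

for (kind, state), count in sorted(tally(sample_nodes).items()):
    print(f"{kind:10s} {state:12s} {count}")
```

Each `(kind, state)` pair corresponds loosely to one of the node-count measures in the table that follows. Whether the real test treats these states as mutually exclusive is not stated in this document.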
Outputs of the test: One set of results for the Linux cluster being monitored
The measures made by this test are as follows:
| Measurement |
Description |
Measurement Unit |
Interpretation |
| Cluster_status |
Indicates the current status of the cluster. |
|
The values that this measure reports and their numeric equivalents are listed in the table below:
| Measure Value |
Numeric Value |
| Ok |
100 |
| Abnormal |
0 |
Note:
By default, this measure reports the Measure Values listed above to indicate the current status of the cluster. The graph of this measure, however, represents the status using the numeric equivalents only, i.e., 0 and 100.
|
| Noof_nodes |
Indicates the total number of nodes configured in the cluster. |
Number |
|
| Noof_online_nodes |
Indicates the number of Pacemaker nodes that are currently online in the cluster. |
Number |
The detailed diagnosis of this measure reveals the names of the Pacemaker nodes that are online. |
| Noof_standby_nodes |
Indicates the number of Pacemaker nodes that are in standby mode in the cluster. |
Number |
The detailed diagnosis of this measure displays the names of the Pacemaker nodes that are in standby mode. |
| Noof_mainta_nodes |
Indicates the number of Pacemaker nodes that are in maintenance mode in the cluster. |
Number |
The detailed diagnosis of this measure displays the names of the Pacemaker nodes that are in maintenance mode. |
| Noof_offline_nodes |
Indicates the number of Pacemaker nodes that are offline in the cluster. |
Number |
The detailed diagnosis of this measure reveals the names of the Pacemaker nodes that are offline. |
| Noof_pr_online_nodes |
Indicates the number of Pacemaker remote nodes that are online in the cluster. |
Number |
The detailed diagnosis of this measure displays the names of the Pacemaker remote nodes that are online. |
| Noof_pr_standby_nodes |
Indicates the number of Pacemaker remote nodes that are in standby mode in the cluster. |
Number |
The detailed diagnosis of this measure displays the names of the Pacemaker remote nodes that are in standby mode. |
| Noof_pr_mainta_nodes |
Indicates the number of Pacemaker remote nodes that are in maintenance mode in the cluster. |
Number |
The detailed diagnosis of this measure displays the names of the Pacemaker remote nodes that are in maintenance mode. |
| Noof_pr_offline_nodes |
Indicates the number of Pacemaker remote nodes that are offline in the cluster. |
Number |
The detailed diagnosis of this measure displays the names of the Pacemaker remote nodes that are offline. |
| Noof_res_groups |
Indicates the number of available resource groups in the cluster. |
Number |
The detailed diagnosis of this measure lists the names of the available resource groups. |
| Noof_resources |
Indicates the total number of resources available in the resource groups. |
Number |
The detailed diagnosis of this measure lists the names of the configured resources. |
|
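As a sketch of how these measures might be consumed downstream, the snippet below converts Cluster_status values to the numeric equivalents listed in the table and counts how often each node name appears in the detailed diagnosis of Noof_offline_nodes across successive measurement periods. The function names and the list-of-lists input shape are assumptions made for illustration; they are not part of the test or of any product API.

```python
from collections import Counter

# Numeric equivalents of Cluster_status, as listed in the table above.
STATUS_TO_NUMERIC = {"Ok": 100, "Abnormal": 0}


def status_series(values):
    """Map reported status values to the numbers used in the measure's graph."""
    return [STATUS_TO_NUMERIC[value] for value in values]


def offline_frequency(samples):
    """Count in how many measurement periods each node was reported offline.

    `samples` is a hypothetical list with one entry per measurement period;
    each entry holds the node names taken from the detailed diagnosis.
    """
    frequency = Counter()
    for names in samples:
        frequency.update(set(names))  # count a node at most once per period
    return frequency


# Illustrative data only; the node names are made up.
history = [["node2"], [], ["node2", "node3"]]
for name, periods in offline_frequency(history).most_common():
    print(name, periods)
```

Nodes with the highest counts are candidates for closer inspection, in line with the goal of isolating nodes that are frequently offline. The same counting approach could be applied to the detailed diagnosis of the maintenance-mode measures.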