eG Monitoring
 

Measures reported by FCCluDiskSumTest

One of the most important aspects to plan for before configuring a failover cluster is storage. Sufficient storage space must be available to the cluster resources at all times, so that these critical resources do not fail for lack of free space in the cluster storage. Administrators should therefore periodically track space usage in the cluster storage, check whether the cluster disks are being used effectively, determine how much free space is available in the used and unused cluster disks, and assess whether the available space is sufficient to handle the current and future workload of the cluster. To monitor space usage in the cluster storage and make informed storage management decisions, administrators can use the FCCluDiskSumTest test.

This test monitors the cluster storage and presents a quick summary of space usage across the used and unused cluster disks that are part of the storage. In the process, the test reveals how much free space is available in the used and unused disks; using this metric, administrators can determine whether the cluster has enough free space to meet current and future demands. If it does not, the measures reported by this test can help administrators decide what needs to be done to avert resource failures: Should more physical disk resources be added to the cluster to handle the current and anticipated load? Should space be cleared in the used cluster disks to make room for more data? Can better management of unused disks help conserve storage space?
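The arithmetic behind such a summary can be illustrated with a minimal Python sketch. It is not the test's actual implementation: the `ClusterDisk` record, the `summarize` and `percent_free` helpers, and the sample disk values are all hypothetical, and the assumption that each percentage is simply free space divided by total capacity follows from the measure descriptions rather than from any documented formula.

```python
from dataclasses import dataclass


@dataclass
class ClusterDisk:
    """Hypothetical record for one disk in the cluster storage."""
    name: str
    in_use: bool        # True if a cluster resource currently uses the disk
    capacity_gb: float
    free_gb: float


def percent_free(free_gb, capacity_gb):
    """Free space as a percentage of capacity (0.0 when capacity is zero)."""
    if capacity_gb == 0:
        return 0.0
    return round(100.0 * free_gb / capacity_gb, 2)


def summarize(disks):
    """Aggregate per-disk figures into summary values for used and unused disks."""
    used = [d for d in disks if d.in_use]
    unused = [d for d in disks if not d.in_use]
    tot_space = sum(d.capacity_gb for d in used)
    tot_space_on = sum(d.capacity_gb for d in unused)
    tot_free = sum(d.free_gb for d in used)
    tot_free_on = sum(d.free_gb for d in unused)
    return {
        "Total_disk": len(disks),
        "Avail_disk": len(unused),
        "Used_disk": len(used),
        "Tot_space": tot_space,
        "Tot_space_on": tot_space_on,
        "Tot_free": tot_free,
        "Tot_free_on": tot_free_on,
        "Tot_fre_per": percent_free(tot_free, tot_space),
        "Tot_fre_peron": percent_free(tot_free_on, tot_space_on),
    }


if __name__ == "__main__":
    # Illustrative values only; not taken from a real cluster.
    sample = [
        ClusterDisk("Cluster Disk 1", True, 500, 50),
        ClusterDisk("Cluster Disk 2", True, 500, 25),
        ClusterDisk("Cluster Disk 3", False, 1000, 900),
    ]
    for measure, value in summarize(sample).items():
        print(f"{measure}: {value}")
```

With these sample values, the used disks hold 75 GB free out of 1000 GB, while the single unused disk holds 900 GB free out of 1000 GB, which is the kind of imbalance the summary is meant to expose.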

The measures reported by this test are as follows:

| Measurement | Description | Measurement Unit | Interpretation |
|---|---|---|---|
| Total_disk | Indicates the total number of disks in the cluster storage. | Number | The detailed diagnosis of this measure, if enabled, lists the disks in the cluster storage, along with the current state, path, and usage of each cluster disk. This way, disks that are running out of space can be isolated, so that efforts to increase the capacity of such disks can be initiated. |
| Avail_disk | Indicates the number of cluster disks that are not currently used by any cluster resource (i.e., service/application). | Number | If the number of Unused cluster disks exceeds the number of Used disks in cluster, it could indicate over-utilization of a few disks. In such a situation, compare the value of the Percentage of space free in used cluster disks measure with that of the Percentage of space free in unused cluster disks measure. If this comparison reveals that the used disks have very little free space compared with the unused disks, it is a clear indicator that the storage resources have not been properly managed. You may want to consider reducing the load on some of the used disks by assigning the unused disks to services/applications that generate more data and hence consume more space. To know which disks in the cluster storage are currently not used, use the detailed diagnosis of the Unused cluster disks measure. To know which disks in the cluster storage are currently in use, use the detailed diagnosis of the Used disks in cluster measure. |
| Used_disk | Indicates the number of cluster disks that are currently used by a cluster resource. | Number | |
| Tot_space | Indicates the total capacity of all the used disks in the cluster. | GB | |
| Tot_space_on | Indicates the total capacity of all the unused disks in the cluster. | GB | |
| Tot_free | Indicates the total amount of space in the used cluster disks that is currently available for use. | GB | |
| Tot_free_on | Indicates the total amount of space in the unused cluster disks that is currently available for use. | GB | |
| Tot_fre_per | Indicates the percentage of space that is free in the used cluster disks. | Percent | For optimal cluster performance, the values of both Tot_fre_per and Tot_fre_peron should be high. If both are low, the cluster is critically low on space; if the situation persists or worsens, the clustered resources will fail. To prevent this, you can clear space on both the used and unused disks. If many disks are unused, you can also map data-intensive services/applications to these disks, so that the load on the used disks is reduced. You may also want to consider adding more physical disk resources to the cluster to increase its total storage capacity. |
| Tot_fre_peron | Indicates the percentage of space that is free in the unused cluster disks. | Percent | |
| Online_disk | Indicates the number of cluster disks that are currently online. | Number | |
| Offline_disk | Indicates the number of cluster disks that are currently offline. | Number | |
| Failed_disk | Indicates the number of cluster disks that are currently in a failed state. | Number | |
| Pending_disk | Indicates the number of cluster disks that are currently in a pending state. | Number | |
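
The interpretation guidance for the disk-count and free-space measures can be expressed as a simple rule check. The Python sketch below is only one way to encode that guidance: the `interpret` function, the finding labels, and the 20 percent threshold for "low" free space are assumptions made for illustration, since the guidance itself gives no numeric cut-off.

```python
def interpret(measures, low_free_pct=20.0):
    """Return advisory findings for a dict of measure values.

    low_free_pct is a hypothetical threshold below which free space is
    treated as "low"; it should be tuned to the environment.
    """
    findings = []
    used_pct = measures["Tot_fre_per"]
    unused_pct = measures["Tot_fre_peron"]

    if used_pct < low_free_pct and unused_pct < low_free_pct:
        # Both percentages are low: the cluster as a whole is short on space.
        findings.append(
            "low_space: clear space on used and unused disks, "
            "or add physical disk resources to the cluster"
        )
    elif measures["Avail_disk"] > measures["Used_disk"] and used_pct < low_free_pct:
        # More unused than used disks, and only the used disks are nearly full.
        findings.append(
            "imbalance: assign unused disks to services/applications "
            "that generate more data"
        )
    return findings


if __name__ == "__main__":
    # Illustrative values only; not taken from a real cluster.
    sample = {
        "Avail_disk": 3,
        "Used_disk": 2,
        "Tot_fre_per": 7.5,
        "Tot_fre_peron": 90.0,
    }
    for finding in interpret(sample):
        print(finding)
```

Checking the low-space condition first keeps the two findings mutually exclusive, so the imbalance advice is raised only when the unused disks still have room to take on load.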