Measures reported by HdpNameNodeTest
To provide uninterrupted storage services to end-users, the administrator of a Hadoop cluster should make sure that there is sufficient space in the cluster to fulfill users' current and future storage needs. If the cluster suddenly runs out of space, critical data can no longer be stored in the cluster, nor can the cluster serve the subsequent data requirements of applications. The reliability of the cluster thus becomes questionable.
To avoid this, administrators must continuously track the space usage across the DataNodes in the cluster, proactively detect a potential space crunch, determine the reason for the sudden/steady depletion of space, and eliminate it. This is where the HdpNameNodeTest test helps!
This test monitors how storage space is used across a cluster, and proactively alerts administrators if it finds that the cluster is running out of space. The test then turns the administrator's attention to the probable causes of the resource crunch - is it because some DataNodes are experiencing disk failures and are hence unable to provide storage space? Is it because there are one too many stale nodes - i.e., unavailable DataNodes - in the cluster? Is too much space being used for non-DFS purposes? Is the cache hogging space? Or was the cluster not correctly sized to begin with?
Outputs of the test: One set of results for the target Hadoop cluster
The measures reported by this test are as follows:
| Measurement | Description | Measurement Unit | Interpretation |
| --- | --- | --- | --- |
| Total_size | Indicates the total storage capacity of the cluster across all DataNodes. | TB | |
| Used_size | Indicates the amount of storage space in the cluster that is currently in use. | TB | If this value is equal to or close to the value of the Total_size measure, it implies that the cluster is running out of storage space. |
| Free_size | Indicates the amount of storage space in the cluster that is currently unused. | TB | Ideally, this value should be higher than that of the Used_size measure. |
| Used_percent | Indicates the percentage of storage space in the cluster that is currently utilized. | Percent | If this value is close to 100%, it is a cause for concern, as it indicates that the cluster is rapidly running short of storage space. If the reason for this anomaly is not diagnosed and eliminated quickly, clients could be denied read/write access to the cluster.<br><br>Some of the common causes for the lack of free storage space in the cluster, and the means to resolve them, are detailed below:<br><br>Frequent volume failures: If DataNodes in the cluster experience disk failures frequently, then such DataNodes and their disks will no longer be available to provide storage services. This can cause a sudden and significant dip in storage space availability. Check the Volume_failures measure of this test to determine if disk failures are occurring frequently. If so, proceed to identify the problem DataNodes and disks and clear their issues, so that more storage space is available for data in the cluster.<br><br>Stale DataNodes: DataNodes whose heartbeat messages have not been received by the NameNode for more than a specified interval are marked as stale nodes. If you have configured reads and/or writes to not happen on stale nodes, then clients cannot use the free storage space on such nodes. This also reduces the overall free storage space available in the cluster. Check the Stale_nodes measure reported by this test to figure out if any nodes have been marked as stale. If many nodes have been tagged as stale, you can lift the read/write restriction on stale nodes to eliminate this storage bottleneck. Alternatively, you can increase the 'heartbeat not received' interval for DataNodes, so that fewer nodes are tagged as stale.<br><br>Excessive space usage by less-critical data: Sometimes, storage space can be used excessively for non-DFS purposes. Likewise, the cache can also hog storage space. In such situations, try limiting space usage for non-DFS purposes. Also, monitor cache usage over time; if this reveals that the cache is seldom used, try reducing the cache size to conserve storage space.<br><br>Under-sized cluster: If a cluster is not sized commensurate to its current and anticipated load, then naturally, sooner or later, the cluster will run out of storage space. To avoid this, you may have to resize the cluster by adding more DataNodes to it. |
| Non_dfs_used | Indicates the amount of storage space used for non-DFS purposes. | TB | Data in the directory set against the dfs.data.dir or dfs.datanode.data.dir property in the hdfs-site.xml file is typically used for DFS purposes; data in any other directory is meant for non-DFS purposes only. This measure reveals how much space is consumed by such data. If this value is close to, or is rapidly approaching, the value of the Total_size measure, you may want to consider freeing up space in non-DFS directories by deleting data from them. |
| Cache_capacity | Indicates the total cache capacity across all DataNodes in the cluster. | GB | This will give you a good idea of how much storage space has been allocated for cache usage. |
| Cache_used | Indicates the amount of cache space that is currently in use. | GB | To determine how well the cache is being utilized, track the value of this measure over time. In the process, if you find that only a fraction of the cache capacity is used for storing cached objects, you may want to consider reducing the cache size. This way, you can make more room in the cluster for critical application data. |
| Total_loads | Indicates the current number of connections to the cluster. | Number | This is a good indicator of the current connection load on the cluster. |
| Total_files | Indicates the number of files and directories currently stored in the cluster. | Number | |
| Volume_failures | Indicates the rate at which volume failures occurred in this cluster. | Failures/Sec | If the value of this measure is high, it implies that some DataNodes are experiencing frequent disk failures. Identifying and clearing these disk issues can help increase storage space in the cluster. |
| Stale_nodes | Indicates the number of stale DataNodes in the cluster. | Number | If the NameNode does not receive heartbeat messages from a DataNode for more than a configured duration, that DataNode is marked as a stale node. A stale DataNode is avoided during lease/block recovery, and can be conditionally avoided for reads and for writes. If such conditional exclusions are in place, then even if storage space is free on stale nodes, clients cannot read from or write data to them - in other words, the free storage space on stale nodes will be unusable.<br><br>So, if the cluster is experiencing a space crunch and this measure reports a high value, you can do one or both of the following to make more storage space available for the use of clients:<br><br>Increase the duration beyond which a DataNode that has not sent heartbeat messages to the NameNode is considered a stale node. For this purpose, use the dfs.namenode.stale.datanode.interval property in the hdfs-default.xml file.<br><br>Allow reads and writes to occur on stale nodes by removing the conditions preventing them. For this, use the dfs.namenode.avoid.read.stale.datanode and dfs.namenode.avoid.write.stale.datanode properties in the hdfs-default.xml file. |
| Threads_lock_waits | Indicates the number of threads waiting to acquire the FSNamesystem lock. | Number | The NameNode ensures consistency of the distributed file system by limiting concurrent namespace access to a single writer or multiple readers. This single global lock is known as the FSNamesystem lock. The lock does not block concurrent readers, only writers.<br><br>A steady increase in the value of this measure over time is a sign that some thread has been holding this lock for too long, because of which more threads are in contention for it. If a thread holds this lock for a long time, it may sometimes bring every NameNode operation to a halt. To resolve this bottleneck, you first need to know which operation initiated the lock, and then investigate why that operation has been holding it for so long. For that, you can configure the NameNode to write a log message whenever the FSNamesystem read or write lock is held for longer than a configured threshold. By default, dfs.namenode.write-lock-reporting-threshold and dfs.namenode.read-lock-reporting-threshold are set to 1 second and 5 seconds respectively. Additionally, by setting dfs.namenode.lock.detailed-metrics.enabled to true (by default it is false), the NameNode will include the operation that initiated the lock hold. You can then analyze the logged messages to identify the operation that acquired the lock and held it for longer than the configured threshold. |
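The stale-node remedies described for the Stale_nodes measure translate to a few properties. The defaults live in hdfs-default.xml, but overrides are typically placed in hdfs-site.xml; the values below are illustrative, not recommendations:

```xml
<!-- hdfs-site.xml: stale-node handling (illustrative values) -->
<property>
  <!-- Heartbeat silence, in milliseconds, after which a DataNode is marked
       stale. Raising this tags fewer nodes as stale (30000 is the stock
       default). -->
  <name>dfs.namenode.stale.datanode.interval</name>
  <value>60000</value>
</property>
<property>
  <!-- Set to false to allow reads from stale DataNodes. -->
  <name>dfs.namenode.avoid.read.stale.datanode</name>
  <value>false</value>
</property>
<property>
  <!-- Set to false to allow writes to stale DataNodes. -->
  <name>dfs.namenode.avoid.write.stale.datanode</name>
  <value>false</value>
</property>
```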
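Similarly, the FSNamesystem lock logging described for the Threads_lock_waits measure can be sketched as an hdfs-site.xml fragment. Note that in recent Apache Hadoop releases the threshold properties carry an `-ms` suffix; the values below simply restate the defaults mentioned above (1 s write, 5 s read) in milliseconds:

```xml
<!-- hdfs-site.xml: log FSNamesystem lock holds that exceed a threshold -->
<property>
  <name>dfs.namenode.write-lock-reporting-threshold-ms</name>
  <value>1000</value>
</property>
<property>
  <name>dfs.namenode.read-lock-reporting-threshold-ms</name>
  <value>5000</value>
</property>
<property>
  <!-- Also record which operation initiated the lock hold. -->
  <name>dfs.namenode.lock.detailed-metrics.enabled</name>
  <value>true</value>
</property>
```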
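The capacity-related measures above (Total_size, Used_size, Free_size, Used_percent) can be cross-checked against the NameNode's own summary, obtained with the standard `hdfs dfsadmin -report` command. The sketch below parses a sample of that report's summary section; the figures are illustrative, not from a real cluster, so the snippet is self-contained. On a live cluster you would pipe the real command output instead.

```shell
# Sample summary section of `hdfs dfsadmin -report` (illustrative figures).
report='Configured Capacity: 109951162777600 (100 TB)
DFS Used: 87960930222080 (80 TB)
DFS Remaining: 21990232555520 (20 TB)
DFS Used%: 80.00%'

# Extract the headline figures with awk, splitting each line on ": ".
cap=$(printf '%s\n' "$report" | awk -F': ' '/^Configured Capacity/ {print $2}')
pct=$(printf '%s\n' "$report" | awk -F': ' '/^DFS Used%/ {print $2}')

echo "Capacity: $cap"   # -> Capacity: 109951162777600 (100 TB)
echo "Used:     $pct"   # -> Used:     80.00%
```

If the reported `DFS Used%` tracks the Used_percent measure and sits near 100%, the causes listed in the table (volume failures, stale nodes, non-DFS usage, cache, under-sizing) are the places to look first.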