Measures reported by HdpDatNodMemTest
HDFS supports writing to off-heap memory managed by the DataNodes. The DataNodes flush in-memory data to disk asynchronously, thus removing expensive disk I/O and checksum computations from the performance-sensitive I/O path; such writes are hence called Lazy Persist writes.
If one or more DataNodes are using their off-heap memory poorly and are instead writing data directly to disk, the I/O overheads of the cluster will increase significantly. Likewise, if any DataNode takes too long to flush in-memory data to disk, data could be lost at the time of a node restart. This is why administrators need to continuously track the blocks written to, evicted from, and flushed to disk by every DataNode, measure the time each DataNode takes to write in-memory data to disk, and accurately isolate poorly performing DataNodes.
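As a rough illustration of how such tracking works, the sketch below converts two snapshots of a DataNode's cumulative Lazy Persist counters into per-second rates. The counter names (e.g. `RamDiskBlocksWrite`) mirror Hadoop's DataNode JMX metrics but should be treated as assumptions here, as should the sample numbers.

```python
from typing import Dict

def lazy_persist_rates(prev: Dict[str, float],
                       curr: Dict[str, float],
                       interval_s: float) -> Dict[str, float]:
    """Convert two snapshots of cumulative counters into per-second rates."""
    return {name: (curr[name] - prev.get(name, 0.0)) / interval_s
            for name in curr}

# Hypothetical counter snapshots taken 60 seconds apart.
prev = {"RamDiskBlocksWrite": 1200, "RamDiskBlocksWriteFallback": 30}
curr = {"RamDiskBlocksWrite": 1800, "RamDiskBlocksWriteFallback": 90}

rates = lazy_persist_rates(prev, curr, 60.0)
print(rates["RamDiskBlocksWrite"])          # blocks/sec written to off-heap memory -> 10.0
print(rates["RamDiskBlocksWriteFallback"])  # blocks/sec that fell back to disk -> 1.0
```

In a real deployment, the snapshots would be scraped periodically from each DataNode's JMX endpoint rather than hard-coded.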
The HdpDatNodMemTest test helps with this!
For each DataNode in a Hadoop cluster, this test reveals how well that node uses its off-heap memory. In the process, the test accurately pinpoints those DataNodes that are not utilizing their off-heap memory effectively. Additionally, the test evaluates how quickly (or slowly) every DataNode flushes in-memory data to disk, thus shedding light on DataNodes that are slow to persist their data. This also enables administrators to proactively detect weaknesses that might adversely impact data reliability and integrity during a node restart.
Outputs of the test: One set of results for each DataNode in the target Hadoop cluster
The measures made by this test are as follows:
| Measurement | Description | Measurement Unit | Interpretation |
| --- | --- | --- | --- |
| Block_write_rate | Indicates the rate at which this DataNode wrote blocks to off-heap memory. | Blocks/Sec | A high value is indicative of effective usage of the off-heap memory, and is hence desired. |
| Blocks_not_satisfied | Indicates the rate at which blocks written to memory by this DataNode were not satisfied and failed over to disk. | Blocks/Sec | Ideally, the value of this measure should be low, as blocks that fail over to disk incur the very disk I/O overheads that Lazy Persist writes are meant to avoid. |
| Data_write_rate | Indicates the rate at which data was written to off-heap memory by this DataNode. | MB/Sec | A high value is indicative of effective usage of the off-heap memory, and is hence desired. |
| Block_read_rate | Indicates the rate at which blocks were read from off-heap memory by this DataNode. | Blocks/Sec | If read requests are served from off-heap memory and not from disk, it results in huge savings in terms of processing overheads. This means that ideally, the value of this measure should be high. |
| Block_evict_rate | Indicates the rate at which blocks were evicted from off-heap memory by this DataNode. | Blocks/Sec | Blocks need to be evicted from off-heap memory at short intervals, so that there is room for new blocks. The more blocks that are evicted every second, the more off-heap memory is released for new entries. A high value is hence desired for this measure. |
| Block_evicted_wo_read | Indicates the rate at which this DataNode evicted in-memory blocks without their ever being read from memory. | Blocks/Sec | |
| Avg_Blk_Inmemory_time | Indicates the average time blocks spent in memory before this DataNode evicted them. | Milliseconds | Frequently accessed blocks should spend more time in memory, whereas blocks that are seldom used should be evicted. If blocks spend too much time in memory on average, you may want to tweak the eviction policies to make sure that there is always room in memory for new blocks. Typically, the following block priorities govern eviction: <br>**Single access priority**: The first time a block is loaded from HDFS, it is given single access priority, which means it is part of the first group considered during evictions. Scanned blocks are more likely to be evicted than blocks that are used more frequently. <br>**Multi access priority**: If a block in the single access priority group is accessed again, it is assigned multi access priority, which moves it to the second group considered during evictions, making it less likely to be evicted. <br>**In-memory access priority**: If the block belongs to a column family configured with the in-memory option, its priority is changed to in-memory access priority regardless of its access pattern. This group is the last one considered during evictions, but is not guaranteed never to be evicted. Catalog tables are configured with in-memory access priority. |
| Blk_delbef_pers_disk | Indicates the rate at which blocks in the off-heap memory of this DataNode were deleted before being persisted to disk during the measure period. | Blocks/Sec | A very high value for this measure is a cause for concern, because blocks that are deleted before being written to disk can cause data loss at the time of a node restart. |
| Data_writedisk_lazywrite | Indicates the rate at which this DataNode wrote data to disk using the lazy writer. | MB/Sec | A high value for this measure and the next reduces the likelihood of data loss during a node restart. |
| Blks_write_to_disk | Indicates the rate at which this DataNode wrote blocks to disk using the lazy writer. | Blocks/Sec | A high value for this measure and the previous one reduces the likelihood of data loss during a node restart. |
| Avg_blkwritetime_lazy | Indicates the average time this DataNode took to write data to disk using the lazy writer. | Milliseconds | Ideally, the value of this measure should be low. A high value or a consistent increase in this value is a cause for concern, as it implies that the DataNode is flushing writes to disk very slowly. This can result in data loss at the time of a node restart. |
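As a rough sketch of how these measures might be combined into an alert, the snippet below flags DataNodes whose fallback ratio (Blocks_not_satisfied relative to Block_write_rate) or average lazy-write latency crosses a threshold. The threshold values and the sample measures are purely illustrative assumptions, not recommendations.

```python
from typing import Dict, List

def flag_datanodes(measures: Dict[str, Dict[str, float]],
                   fallback_ratio_max: float = 0.05,   # assumed threshold
                   lazy_write_ms_max: float = 500.0    # assumed threshold
                   ) -> List[str]:
    """Return the names of DataNodes whose measures breach the thresholds."""
    flagged = []
    for node, m in measures.items():
        writes = m["Block_write_rate"]
        # A node that writes nothing to memory is treated as fully falling back.
        ratio = m["Blocks_not_satisfied"] / writes if writes else 1.0
        if ratio > fallback_ratio_max or m["Avg_blkwritetime_lazy"] > lazy_write_ms_max:
            flagged.append(node)
    return flagged

# Illustrative per-DataNode measures (hypothetical values).
measures = {
    "dn1": {"Block_write_rate": 10.0, "Blocks_not_satisfied": 0.1,
            "Avg_blkwritetime_lazy": 120.0},
    "dn2": {"Block_write_rate": 10.0, "Blocks_not_satisfied": 2.0,
            "Avg_blkwritetime_lazy": 120.0},
}
print(flag_datanodes(measures))  # dn2's fallback ratio (0.2) exceeds the threshold
```

In practice, such thresholds would be tuned per cluster from historical baselines rather than fixed constants.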