Configuration of HdpFsTest
This test monitors the following:
In the process, the test sheds light on latencies in communication and processing that could be slowing down uploads/downloads between the primary and secondary nodes in the cluster.
The default parameters associated with this test are as follows:
Specify the port at which the NameNode accepts client connections in the PORT text box. NameNode is the master node in the Apache Hadoop HDFS Architecture that maintains and manages the blocks present on the DataNodes (slave nodes). NameNode is a very highly available server that manages the File System Namespace and controls access to files by clients. By default, the NameNode's client connection port is 8020.
The eG agent collects metrics using Hadoop's WebHDFS REST API. While some of these API calls pull metrics from the NameNode, some others get metrics from the resource manager. To run API commands on the NameNode and pull metrics, the eG agent needs access to the NameNode's web port.
To determine the correct web port of the NameNode, do the following:
Open the hdfs-default.xml file in the hadoop/conf/app directory.
Look for the dfs.namenode.http-address parameter in the file.
This parameter is configured with the IP address and base port on which the DFS NameNode web user interface listens. The format of this configuration is: <IP_Address>:<Port_Number>. Given below is a sample configuration:
192.168.10.100:50070
Configure the <Port_Number> in the specification as the Name Node Web Port. In the case of the above sample configuration, this will be 50070.
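The lookup steps above can also be scripted. What follows is a minimal sketch, assuming the file uses the standard Hadoop configuration layout of property elements with name and value children. The function names get_property and namenode_web_port and the inline sample XML are illustrative only and are not part of the eG product; a real script would read the file from disk instead.

```python
import xml.etree.ElementTree as ET

# Illustrative sample only; it mirrors the sample configuration shown above.
SAMPLE_XML = """
<configuration>
  <property>
    <name>dfs.namenode.http-address</name>
    <value>192.168.10.100:50070</value>
  </property>
</configuration>
"""

def get_property(xml_text, name):
    """Return the value of the named Hadoop property, or None if absent."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == name:
            return prop.findtext("value", "").strip()
    return None

def namenode_web_port(xml_text):
    """Extract the port from dfs.namenode.http-address (<IP_Address>:<Port_Number>)."""
    address = get_property(xml_text, "dfs.namenode.http-address")
    if address is None:
        return None
    # rpartition splits on the last colon, so the port is the final piece.
    _host, _sep, port = address.rpartition(":")
    return int(port)

print(namenode_web_port(SAMPLE_XML))
```

The number printed is the value to enter as the Name Node Web Port.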
In some Hadoop configurations, a simple authentication user name may be required for running API commands and collecting metrics from the NameNode. When monitoring such Hadoop installations, specify the name of the simple authentication user here. If no such user is available or required, leave this parameter at its default value, none.
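For context, WebHDFS conveys a simple authentication user through the user.name query parameter of each request. The sketch below shows how such a URL could be assembled, and how a value of none would translate into omitting the parameter. How the eG agent builds its requests internally is not documented here, so treat this as an assumption; the function name webhdfs_url and the user hdfs are hypothetical.

```python
from urllib.parse import quote, urlencode

def webhdfs_url(host, web_port, path, op, user="none"):
    """Build a WebHDFS REST URL; add user.name only when a user is configured."""
    params = {"op": op}
    # "none" mirrors the default value of the parameter described above.
    if user and user.lower() != "none":
        params["user.name"] = user
    return f"http://{host}:{web_port}/webhdfs/v1{quote(path)}?{urlencode(params)}"

# Without a simple authentication user
print(webhdfs_url("192.168.10.100", 50070, "/", "GETFILESTATUS"))
# With a hypothetical simple authentication user named "hdfs"
print(webhdfs_url("192.168.10.100", 50070, "/", "GETFILESTATUS", user="hdfs"))
```

Pasting either URL into a browser or an HTTP client is a quick way to check whether the NameNode accepts requests with or without a user name.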
The eG agent collects metrics using Hadoop's WebHDFS REST API. While some of these API calls pull metrics from the NameNode, some others get metrics from the resource manager. The YARN Resource Manager Service (RM) is the central controlling authority for resource management and makes resource allocation decisions.
To pull metrics from the resource manager, the eG agent first needs to connect to the resource manager. For this, you need to configure this test with the IP address/host name of the resource manager and its web port. Use the Resource Manager IP and Resource Manager Web Port parameters to configure these details.
To determine the IP/host name and web port of the resource manager, do the following:
Open the yarn-site.xml file in the /opt/mapr/hadoop/hadoop-2.x.x/etc/hadoop directory.
Look for the yarn.resourcemanager.webapp.address parameter in the file.
This parameter is configured with the IP address/host name and web port of the resource manager. The format of this configuration is: <IP_Address_or_Host_Name>:<Port_Number>. Given below is a sample configuration:
192.168.10.100:8080
Configure the <IP_Address_or_Host_Name> in the specification as the Resource Manager IP, and the <Port_Number> as the Resource Manager Web Port. In the case of the above sample configuration, the Resource Manager IP will be 192.168.10.100 and the Resource Manager Web Port will be 8080.
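The split described above can be sketched as follows. The function name split_webapp_address is illustrative only. The URL printed at the end uses the cluster metrics endpoint of the YARN ResourceManager REST API as a connectivity check; whether the eG agent calls this particular endpoint is an assumption.

```python
def split_webapp_address(address):
    """Split '<IP_Address_or_Host_Name>:<Port_Number>' into (host, port)."""
    host, sep, port = address.strip().rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError(f"expected <host>:<port>, got {address!r}")
    return host, int(port)

# Value of yarn.resourcemanager.webapp.address from the sample configuration
rm_ip, rm_web_port = split_webapp_address("192.168.10.100:8080")
print("Resource Manager IP:", rm_ip)
print("Resource Manager Web Port:", rm_web_port)

# Cluster metrics endpoint of the ResourceManager REST API
print(f"http://{rm_ip}:{rm_web_port}/ws/v1/cluster/metrics")
```

Rejecting values without a numeric port up front avoids entering a malformed address into the test configuration.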
In some Hadoop configurations, a simple authentication user name may be required for running API commands and collecting metrics from the Resource Manager. When monitoring such Hadoop installations, specify the name of the simple authentication user here. If no such user is available or required, leave this parameter at its default value, none.
If multiple components of the same component type are awaiting configuration, then an APPLY TO OTHER COMPONENTS check box will appear in this page. Clicking on this check box will allow you to apply the configuration to all/selected components of that type.
Once the necessary values have been provided, clicking on the UPDATE button will register the changes made.
When changing the configuration for specific servers, a “*” beside the text box corresponding to a parameter signifies that its value has to be manually configured by the user. Parameter values that need to be configured will typically be prefixed with a “$” or contain a series of “*”. A value of “none” indicates that the corresponding parameter value can be changed if required.