|
Configuration of KuberJobTest
This test auto-discovers the namespaces configured in the Kubernetes system and, for each namespace, reports the count of Jobs in different operational states. In the process, the test brings failed and slow Jobs to light. The detailed diagnostics of the test describe the failed and slow Jobs and also provide the reason why the Jobs failed. Administrators can use this information to troubleshoot the failure effectively. Additionally, the test reports the status of Pods created by the Jobs, and alerts administrators if any Job resulted in Pod failures.
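The per-namespace counting described above can be sketched as follows. This is only an illustration, not the eG agent's implementation: it assumes Job objects shaped like the JSON that the Kubernetes `batch/v1` API returns, and the state names (`active`, `succeeded`, `failed`, `pending`) and function names are hypothetical, not the test's actual measure names.

```python
from collections import defaultdict


def job_state(job):
    """Classify one Job dict (shaped like a batch/v1 Job) into a coarse state."""
    status = job.get("status", {})
    # A finished Job carries a condition of type Complete or Failed with status "True".
    for cond in status.get("conditions", []):
        if cond.get("status") == "True":
            if cond.get("type") == "Failed":
                return "failed"
            if cond.get("type") == "Complete":
                return "succeeded"
    # status.active is the number of Pods currently running for the Job.
    if status.get("active", 0) > 0:
        return "active"
    return "pending"  # hypothetical label for Jobs with no running Pods yet


def count_jobs_by_state(jobs):
    """Return {namespace: {state: count}} for a list of Job dicts."""
    counts = defaultdict(
        lambda: {"active": 0, "succeeded": 0, "failed": 0, "pending": 0}
    )
    for job in jobs:
        namespace = job["metadata"]["namespace"]
        counts[namespace][job_state(job)] += 1
    return dict(counts)
```

For a failed Job, the `reason` and `message` fields of the `Failed` condition are where the API reports why the Job failed.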
The default parameters associated with this test are as follows:
To run this test and report metrics, the eG agent needs to connect to the Kubernetes API on the master node and run API commands. To enable this connection, the eG agent has to be configured with either of the following:
If only a single master node exists in the cluster, then configure the eG agent with the IP address of the master node.
If the target cluster consists of more than one master node, then you need to configure the eG agent with the IP address of the load balancer that is managing the cluster. In this case, the load balancer will route the eG agent's connection request to any available master node in the cluster, thus enabling the agent to connect with the API server on that node, run API commands on it, and pull metrics.
By default, this parameter will display the LOAD BALANCER / MASTER NODE IP that you configured when manually adding the Kubernetes cluster for monitoring, using the Kubernetes Cluster Preferences page in the eG admin interface. Refer to the Monitoring the Kubernetes Cluster document for details on managing the cluster using the eG admin interface.
Whenever the eG agent runs this test, it uses the IP address that is displayed (by default) against this parameter to connect to the Kubernetes API. If there is any change in this IP address at a later point in time, then make sure that you update this parameter with it, by overriding its default setting.
By default, the Kubernetes cluster is SSL-enabled, so the eG agent connects to the Kubernetes API over HTTPS by default. Accordingly, this flag is set to Yes by default.
If the cluster is not SSL-enabled in your environment, then set this flag to No.
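To illustrate how the IP address and the SSL flag combine, here is a minimal sketch that builds the URL of the Kubernetes endpoint listing Jobs across all namespaces (`/apis/batch/v1/jobs`). The function name and the default port of 6443 are assumptions made for this example; the port your API server or load balancer listens on may differ.

```python
def build_jobs_url(host, ssl_flag="Yes", port=6443):
    """Build the Jobs list URL from the configured IP and the Yes/No SSL flag.

    port=6443 is an assumed default, not a value taken from the test's parameters.
    """
    use_ssl = ssl_flag.strip().lower() == "yes"
    scheme = "https" if use_ssl else "http"
    return f"{scheme}://{host}:{port}/apis/batch/v1/jobs"
```

For example, `build_jobs_url("192.168.10.20")` yields an HTTPS URL, while passing `ssl_flag="No"` switches the scheme to plain HTTP.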
The eG agent requires an authentication bearer token to access the Kubernetes API, run API commands on the cluster, and pull metrics of interest. Refer to the Monitoring the Kubernetes Cluster document for the procedure to generate the AUTHENTICATION TOKEN.
Typically, once you generate the token, you can associate it with the target Kubernetes cluster when manually adding that cluster for monitoring using the eG admin interface. Refer to the Monitoring the Kubernetes Cluster document for details on managing the cluster using the eG admin interface.
By default, this parameter will display the AUTHENTICATION TOKEN that you provided in the Kubernetes Cluster Preferences page of the eG admin interface, when manually adding the cluster for monitoring.
Whenever the eG agent runs this test, it uses the token that is displayed (by default) against this parameter to access the API and pull metrics. If, for any reason, you generate a new authentication token for the target cluster at a later point in time, make sure you update this parameter with the change. To do so, copy the new token and paste it against this parameter.
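As a rough sketch of how a bearer token accompanies an API call, the snippet below attaches the token to a request using Python's standard library. The function name is hypothetical, and this is not how the eG agent itself is implemented. Stripping whitespace guards against a stray newline picked up when the token is copied and pasted.

```python
import urllib.request


def build_api_request(url, token):
    """Create a GET request that presents the bearer token to the Kubernetes API."""
    headers = {
        # The API server expects the token in the Authorization header.
        "Authorization": f"Bearer {token.strip()}",
        "Accept": "application/json",
    }
    return urllib.request.Request(url, headers=headers)
```

The request would then be sent with `urllib.request.urlopen`, supplying an SSL context that trusts the cluster's certificate authority when the cluster is SSL-enabled.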
By default, the JOB AGE IN SECONDS parameter is set to 300. This means that, by default, this test will count any Job that runs for more than 300 seconds as a long-running Job. You can override this default setting by specifying a different duration (in seconds) here.
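The threshold check can be sketched as below. The example assumes the Job's age is measured from its `status.startTime` field, which the Kubernetes API reports as a UTC timestamp such as `2024-01-01T00:00:00Z`; whether the test measures age in exactly this way is an assumption, and the function names are hypothetical.

```python
from datetime import datetime, timezone


def job_age_seconds(start_time, now):
    """Seconds elapsed between a Job's startTime string and a timezone-aware 'now'."""
    started = datetime.strptime(start_time, "%Y-%m-%dT%H:%M:%SZ")
    started = started.replace(tzinfo=timezone.utc)
    return (now - started).total_seconds()


def is_long_running(start_time, now, threshold_seconds=300):
    """True when the Job has been running for more than the threshold."""
    return job_age_seconds(start_time, now) > threshold_seconds
```

With the default of 300 seconds, a Job that started ten minutes ago is flagged, while one that started exactly five minutes ago is not, because the comparison is strictly "over" the threshold.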
If the eG agent connects to the Kubernetes API on the master node via a proxy server, then provide the IP address of the proxy server in the PROXY HOST text box. If no proxy is used, then the default setting of this parameter, none, need not be changed.
Provide a valid user name and password against the PROXY USERNAME and PROXY PASSWORD parameters, respectively. Then, confirm the password by retyping it in the CONFIRM PASSWORD text box. These parameters are applicable only if the eG agent uses a proxy server to connect to the Kubernetes cluster, and that proxy server requires authentication.
If no proxy server is used, or if the proxy server used does not require authentication, then the default setting of these parameters, none, need not be changed.
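The way the proxy settings interact can be sketched as follows: no proxy when the host is left at none, an unauthenticated proxy when only the host is set, and credentials embedded in the proxy URL otherwise. The function name and the `port` argument are assumptions made for this illustration; no proxy port value is given in the parameter descriptions above.

```python
from urllib.parse import quote


def build_proxy_url(host, port, username="none", password="none"):
    """Return a proxy URL, or None when the proxy host is left at 'none'.

    'port' is a hypothetical argument; supply whatever port your proxy listens on.
    """
    if host.strip().lower() == "none":
        return None  # the agent connects to the API directly
    if username.strip().lower() == "none":
        return f"http://{host}:{port}"  # proxy without authentication
    # Percent-encode credentials so characters such as '@' or ':' stay unambiguous.
    user = quote(username, safe="")
    pwd = quote(password, safe="")
    return f"http://{user}:{pwd}@{host}:{port}"
```

The resulting URL could be passed to `urllib.request.ProxyHandler({"https": proxy_url})` to route API requests through the proxy.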
The DD FREQUENCY refers to the frequency with which detailed diagnosis measures are to be generated for this test. The default is 1:1. This indicates that, by default, detailed measures will be generated every time this test runs, and also every time the test detects a problem. You can modify this frequency, if you so desire. Also, if you intend to disable the detailed diagnosis capability for this test, you can do so by specifying none against DD FREQUENCY.
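A small sketch of how a DD FREQUENCY value might be interpreted is given below. It assumes the value is either none or two whole numbers separated by a colon, with the first applying to normal runs and the second to runs in which a problem is detected; this reading is inferred from the description of the 1:1 default, and the function name is hypothetical.

```python
def parse_dd_frequency(value):
    """Parse a DD FREQUENCY string such as '1:1'; return None when set to 'none'."""
    text = value.strip().lower()
    if text == "none":
        return None  # detailed diagnosis is disabled for the test
    normal_every, problem_every = text.split(":")
    # Assumed meaning: generate detailed measures every Nth normal run
    # and every Mth run in which a problem is detected.
    return int(normal_every), int(problem_every)
```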
To make diagnosis more efficient and accurate, eG embeds an optional detailed diagnostic capability. With this capability, the eG agents can be configured to run detailed, more elaborate tests as and when specific problems are detected. To enable the detailed diagnosis capability of this test for a particular server, choose the On option against DETAILED DIAGNOSIS. To disable the capability, choose the Off option.
The option to selectively enable/disable the detailed diagnosis capability will be available only if the following conditions are fulfilled:
When changing the configuration for specific servers, a “*” beside the text box corresponding to a parameter signifies that its value has to be manually configured by the user. Parameter values that need to be configured will typically be prefixed with a “$” or contain a series of “*”. A value of “none” indicates that the corresponding parameter value can be changed if required.
|