eG Monitoring
 

Measures reported by KuberCompntTest

Master components/services make global decisions about the cluster (for example, scheduling), and detect and respond to cluster events. These services are as follows:

  • kube-apiserver: This exposes the Kubernetes API and serves as the front end of the control plane.

  • kube-scheduler: This watches newly created pods that have no node assigned, and selects a node for them to run on.

  • kube-controller-manager: This runs processes called controllers. These controllers include:

    • Node controller: Responsible for noticing and responding when nodes go down.

    • Replication controller: Responsible for maintaining the correct number of pods for every replication controller object in the system.

    • Endpoints controller: Populates the Endpoints object (that is, joins Services & Pods).

    • Service Account and Token controllers: Create default accounts and API access tokens for new namespaces.

  • etcd: A consistent and highly available key-value store used as Kubernetes’ backing store for all cluster data.

The failure of any of these services can impact the business. For instance, if the kube-scheduler is not running, newly created pods will not be assigned to any node and therefore will not run. Without the kube-controller-manager, cluster state cannot be managed. Such anomalies can threaten the availability of the cluster and deny users access to critical applications/services running on the cluster. To avoid this, administrators must keep track of the state of each of the master services. This is where eG Enterprise helps.

Using the KuberApiAccessTest test, administrators can periodically check whether the kube-apiserver service is running. With the help of the KuberCompntTest test, administrators can keep tabs on the running state of the other master services, namely the scheduler, etcd, and the controller-manager. If any of these services is down, the KuberCompntTest test promptly alerts administrators to the failure of the corresponding service. This way, the test enables administrators to rapidly troubleshoot the abnormal state of a critical master service, restore the service to normalcy, and assure users of uninterrupted access to containerized business applications.
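How the test collects this state is not described here. As a rough illustration only, the sketch below assumes the state is derived from the Kubernetes ComponentStatus API (the data behind `kubectl get componentstatuses -o json`, deprecated since Kubernetes v1.19) and maps each component's "Healthy" condition to the numeric values this test reports. The function name `component_states` and the sample payload are hypothetical and are not part of eG Enterprise.

```python
# Numeric values reported by the Status measure of this test.
RUNNING, NOT_RUNNING, UNKNOWN = 1, 0, 2


def component_states(component_status_list):
    """Map each component name to a numeric state.

    `component_status_list` is a dict shaped like a ComponentStatusList,
    for example the parsed output of `kubectl get componentstatuses -o json`.
    """
    states = {}
    for item in component_status_list.get("items", []):
        name = item["metadata"]["name"]
        # Look for the condition of type "Healthy"; it may be absent.
        healthy = next(
            (c for c in item.get("conditions", []) if c.get("type") == "Healthy"),
            None,
        )
        if healthy is None or healthy.get("status") == "Unknown":
            states[name] = UNKNOWN
        elif healthy.get("status") == "True":
            states[name] = RUNNING
        else:
            states[name] = NOT_RUNNING
    return states


# Hypothetical sample payload, for illustration only.
sample = {
    "kind": "ComponentStatusList",
    "items": [
        {"metadata": {"name": "scheduler"},
         "conditions": [{"type": "Healthy", "status": "True"}]},
        {"metadata": {"name": "controller-manager"},
         "conditions": [{"type": "Healthy", "status": "False"}]},
        {"metadata": {"name": "etcd-0"}, "conditions": []},
    ],
}

print(component_states(sample))
```

Treating a missing "Healthy" condition as Unknown rather than Not Running is a design choice of this sketch, so that missing data is not reported as a confirmed outage.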

Outputs of the test: One set of results for the Kubernetes cluster being monitored.

The measures made by this test are as follows:

Measurement | Description | Measurement Unit | Interpretation
Status | Indicates the current state of this service. | | The values that this measure can report and their corresponding numeric values are listed in the table below:

Numeric Value    Measure Value
1                Running
0                Not Running
2                Unknown
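For scripts that consume this measure (for example, when reading numeric values off a graph or an export), the mapping above could be captured as follows. This is a minimal sketch: the names `ServiceStatus`, `measure_value`, and `needs_attention` are hypothetical and are not part of any eG Enterprise API.

```python
from enum import IntEnum


class ServiceStatus(IntEnum):
    """Numeric values of the Status measure, as listed in the table above."""
    NOT_RUNNING = 0
    RUNNING = 1
    UNKNOWN = 2


LABELS = {
    ServiceStatus.RUNNING: "Running",
    ServiceStatus.NOT_RUNNING: "Not Running",
    ServiceStatus.UNKNOWN: "Unknown",
}


def measure_value(numeric_value):
    """Return the Measure Value label for a numeric value.

    Raises ValueError for a number that is not in the table.
    """
    return LABELS[ServiceStatus(numeric_value)]


def needs_attention(numeric_value):
    """True when the service is reported as anything other than Running."""
    return ServiceStatus(numeric_value) is not ServiceStatus.RUNNING


for value in (1, 0, 2):
    print(value, measure_value(value), needs_attention(value))
```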

If this measure reports the value Not Running or Unknown, then use the detailed diagnosis of this measure to determine why. You can also use the /var/log/kube-scheduler.log file on the master to troubleshoot issues with the scheduler. Likewise, use the /var/log/kube-controller-manager.log file on the master to troubleshoot issues with the controller-manager.
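When working through these log files, it can help to filter for error-level entries first. The sketch below assumes the logs use the klog line format common to Kubernetes components, in which each line starts with a severity letter (I, W, E, or F) followed by the month and day. The helper names and the sample lines are hypothetical; the log location and format may differ on your cluster.

```python
import re

# klog header: severity letter (I, W, E, F), then mmdd, then a timestamp.
KLOG_HEADER = re.compile(r"^([IWEF])(\d{4}) (\d{2}:\d{2}:\d{2}\.\d+)")


def error_lines(lines):
    """Return only the lines logged at Error (E) or Fatal (F) severity."""
    selected = []
    for line in lines:
        match = KLOG_HEADER.match(line)
        if match and match.group(1) in ("E", "F"):
            selected.append(line.rstrip("\n"))
    return selected


def error_lines_from_file(path):
    """Apply error_lines() to a log file such as /var/log/kube-scheduler.log."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return error_lines(handle)


# Hypothetical log lines, for illustration only.
sample_log = [
    "I0312 10:15:30.000001       1 serving.go:331] Generated self-signed cert in-memory",
    "E0312 10:15:32.123456       1 leaderelection.go:330] error retrieving resource lock",
    "W0312 10:15:33.000002       1 authentication.go:349] No authentication-kubeconfig provided",
    "F0312 10:15:34.000003       1 server.go:152] unable to load configuration",
]

for line in error_lines(sample_log):
    print(line)
```

On the master, `error_lines_from_file("/var/log/kube-controller-manager.log")` would apply the same filter to the controller-manager log, provided the file exists at that path and is readable.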

Note:

By default, this measure reports the Measure Values listed above to indicate the state of a master service. In the graph of this measure, however, these states are represented using their numeric equivalents only.