eG Monitoring
 

Measures reported by KuberHPATest

Horizontal Pod Autoscaling allows you to define rules that scale the number of replicas up or down in deployments, replica sets, or replication controllers, based on CPU utilization and, optionally, external and custom metrics. For instance, if you have a containerized application that consumes a lot of CPU under load, you can configure a Horizontal Pod Autoscaler to scale up the Deployment automatically, so that additional replicas of this application (Pods) are created to provide extra capacity when CPU utilization exceeds a target level. Likewise, you can configure the Horizontal Pod Autoscaler to scale down a Deployment, so that replica Pods are automatically terminated to release CPU resources when actual CPU utilization drops below a target level.

Typically, when creating a horizontal autoscaler, you can specify the target utilization value of the metric - this can be a raw value or an average value. Optionally, you can also specify the following:

  • The maximum number of replicas the autoscaler can scale up to;

  • The minimum number of replicas the autoscaler can scale down to.
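As an illustration, the settings described above correspond to fields of a HorizontalPodAutoscaler definition. The Python sketch below builds such a definition as a plain dictionary laid out like an autoscaling/v2 manifest. The helper name build_hpa_definition, the object names web-hpa and web, and all sample values are hypothetical and are used only for this example.

```python
def build_hpa_definition(name, deployment, target_cpu_percent,
                         max_replicas, min_replicas=1):
    """Return a dict laid out like an autoscaling/v2 HorizontalPodAutoscaler.

    Hypothetical helper for illustration; it only assembles the fields.
    """
    # The maximum replica count cannot be less than the minimum.
    if max_replicas < min_replicas:
        raise ValueError("max_replicas cannot be less than min_replicas")
    return {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": name},
        "spec": {
            # The workload whose replica count is scaled.
            "scaleTargetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": deployment,
            },
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            # Target average CPU utilization, as a percentage of requested CPU.
            "metrics": [{
                "type": "Resource",
                "resource": {
                    "name": "cpu",
                    "target": {
                        "type": "Utilization",
                        "averageUtilization": target_cpu_percent,
                    },
                },
            }],
        },
    }


# Sample values only: scale the "web" Deployment between 2 and 8 replicas.
hpa = build_hpa_definition("web-hpa", "web", 60, max_replicas=8, min_replicas=2)
```

Serializing the resulting dictionary to YAML or JSON would give a definition in the shape that Kubernetes expects; whether the sample limits suit a given workload is a separate tuning question.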

Whenever the autoscaler runs, the controller manager obtains the actual metrics from the resource metrics API (for per-pod resource metrics), the custom metrics API (for metrics other than CPU and memory that are associated with a Pod), or the external metrics API (for metrics that are not associated with any object in the Kubernetes system, e.g., an external queuing system such as the AWS SQS service), as the case may be. It then does the following:

  • For per-pod resource metrics (like CPU), the controller fetches the metrics from the resource metrics API for each Pod targeted by the HorizontalPodAutoscaler. Then, if a target utilization value is set, the controller calculates the utilization value as a percentage of the equivalent resource request on the containers in each pod. If a target raw value is set, the raw metric values are used directly. The controller then takes the mean of the utilization or the raw value (depending on the type of target specified) across all targeted pods, and produces a ratio, which will be used to scale the number of desired replicas.

  • For per-pod custom metrics, the controller functions similarly to per-pod resource metrics, except that it works with raw values, not utilization values.

  • For object metrics and external metrics, a single metric is fetched, which describes the object in question. This metric is compared to the target value, to produce a ratio as above.

If actual resource usage exceeds the targeted value, then the autoscaler uses the ratio it computes to scale up the replicas. On the other hand, if the actual resource usage falls below the targeted value, then the autoscaler uses the ratio it computes to scale down.
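The calculation described above can be sketched in a few lines of Python. This is a simplified illustration, not the controller's actual code: it assumes the commonly documented rule that the desired replica count is the current count multiplied by the ratio of the current metric value to the target value, rounded up, and it omits details such as tolerance and stabilization. The function names and sample numbers are hypothetical.

```python
import math


def average_utilization(usage_millicores, request_millicores):
    """Mean CPU utilization across Pods, each as a percentage of its request."""
    percentages = [
        100 * used / requested
        for used, requested in zip(usage_millicores, request_millicores)
    ]
    return sum(percentages) / len(percentages)


def desired_replicas(current_replicas, current_value, target_value,
                     min_replicas=1, max_replicas=None):
    """Scale the replica count by the ratio current_value / target_value."""
    # A ratio above 1 scales up; a ratio below 1 scales down.
    desired = math.ceil(current_replicas * current_value / target_value)
    # Keep the result within the configured scaling limits.
    desired = max(desired, min_replicas)
    if max_replicas is not None:
        desired = min(desired, max_replicas)
    return desired


# Three Pods, each requesting 500 millicores (sample values only).
current = average_utilization([300, 500, 400], [500, 500, 500])
replicas = desired_replicas(3, current, 50, min_replicas=1, max_replicas=10)
```

With these sample numbers, average utilization is above the 50 percent target, so the sketch asks for more than the current three replicas; lowering max_replicas would cap that result.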

The efficiency of the autoscaler therefore depends on whether it can compute scales successfully, and on how prudently you set the scaling limits (i.e., the minimum and maximum replica counts for the autoscaler) and the target utilization values. Sometimes, the autoscaler may fail to compute scales. At other times, user errors may restrict scalability, or environmental issues may prevent scaling from happening at all. At such times, the success of scaling hinges on the administrator's ability to promptly detect, diagnose, and fix the bottlenecks to scaling. The Horizontal Pod Autoscaler by Namespaces test helps administrators do exactly this.

The test auto-discovers the Horizontal Pod Autoscalers defined in each namespace. For each autoscaler in a namespace, the test reports whether that autoscaler can actually perform scaling, reveals whether its scalability is constricted by its configuration, and alerts administrators if the autoscaler is unable to compute the scales. This way, the test enables administrators to promptly capture problems impeding efficient autoscaling. If minimum and maximum replica counts were specified as part of the autoscaler definition, the test also reports these numbers, so that administrators can quickly figure out whether changing these values can enhance scalability. Moreover, by enabling administrators to track current CPU utilization levels alongside the target utilization levels, the test not only helps them compute the scaling ratio themselves, but also helps them decide whether the target needs to be reset. Furthermore, by reporting the desired and current replica counts, the test reveals whether the autoscaler has successfully scaled the replica count to the desired level.

Outputs of the test: One set of results for each autoscaler in each namespace of the Kubernetes cluster being monitored.

The measures made by this test are as follows:

Measurement: Hpa_age
Description: Indicates the age of this autoscaler.
Interpretation: The value of this measure is expressed in number of days, hours, and minutes.

Measurement: Scales_state
Description: Indicates whether this autoscaler is allowed to scale.
Interpretation: This measure reports the value Yes if the autoscaler is able to fetch and update scales. The value No is reported if backoff conditions, e.g., a CrashLoopBackOff that causes a Pod to start and crash in a loop, are preventing scaling. The value Unknown is reported if the state cannot be determined.

The numeric values that correspond to these measure values are as follows:

Numeric Value   Measure Value
1               Yes
2               No
3               Unknown

Note:

By default, this measure reports the Measure Values discussed above to indicate whether an autoscaler is allowed to scale. In the graph of this measure, however, the state is indicated using the numeric equivalents only.

If this measure reports the value No or Unknown, then use the detailed diagnosis of this measure to know what prevented the autoscaler from performing scaling.

Measurement: Activescale_state
Description: Indicates whether this autoscaler is enabled and is able to calculate the desired scales.
Interpretation: This measure reports the value Yes if the autoscaler is able to fetch metrics and compute the scales. The value No is reported if there are problems with fetching metrics. The value Unknown is reported if the state cannot be determined.

The numeric values that correspond to these measure values are as follows:

Numeric Value   Measure Value
1               Yes
2               No
3               Unknown

Note:

By default, this measure reports the Measure Values discussed above to indicate whether an autoscaler is able to fetch metrics. In the graph of this measure, however, the state is indicated using the numeric equivalents only.

If this measure reports the value No or Unknown, then use the detailed diagnosis of this measure to know why the autoscaler could not fetch metrics.

Measurement: Limitscale_state
Description: Indicates whether this autoscaler's ability to scale is restricted by a maximum / minimum replica count specification.
Interpretation: This measure reports the value Yes if you have to raise or lower the minimum or maximum replica count for the autoscaler to perform scaling. The value No is reported if the requested scaling is allowed. The value Unknown is reported if the state cannot be determined.

The numeric values that correspond to these measure values are as follows:

Numeric Value   Measure Value
1               Yes
2               No
3               Unknown

Note:

By default, this measure reports the Measure Values discussed above to indicate whether an autoscaler is restricted by its minimum/maximum replica count specification. In the graph of this measure, however, the state is indicated using the numeric equivalents only.

If this measure reports the value No or Unknown, then use the detailed diagnosis of this measure to know why the autoscaler could not scale.

Measurement: Minimum_replica
Description: Shows the lower limit for the number of Pods that can be set by this autoscaler. (Default: 1)
Measurement Unit: Number
Interpretation: If the value of this measure is the same as that of the Current_replica measure, then the autoscaler will not be able to scale down until the minimum replica count is decreased in the autoscaler definition. Under such circumstances, you will find that the Limitscale_state (Is scaling limited?) measure reports the value Yes.

Measurement: Maximum_replica
Description: Shows the upper limit for the number of Pods that can be set by this autoscaler.
Measurement Unit: Number
Interpretation: The value of this measure cannot be less than the value of the Minimum_replica measure.

If the value of this measure is the same as that of the Current_replica measure, then the autoscaler will not be able to scale up until the maximum replica count is increased in the autoscaler definition. Under such circumstances, you will find that the Limitscale_state (Is scaling limited?) measure reports the value Yes.

Measurement: Targetcpu
Description: Indicates the target average CPU utilization (represented as a percentage of requested CPU) set for this autoscaler.
Measurement Unit: Percent
Interpretation: If a target utilization is not set in the autoscaler's definition, then the default autoscaling policy will be used.

Measurement: Currentcpu
Description: Indicates the actual average CPU utilization across all Pods targeted by this autoscaler.
Measurement Unit: Percent
Interpretation: If the value of this measure is greater than that of the Targetcpu measure, the autoscaler will automatically scale up the replica Pod count to the desired level or up to the maximum replica count (whichever limit is reached first).

If the value of this measure is lower than that of the Targetcpu measure, the autoscaler will automatically scale down the replica Pod count to the desired level or down to the minimum replica count (whichever limit is reached first).

Measurement: Desired_replica
Description: Indicates the number of replicas to which this autoscaler can scale up or scale down.
Measurement Unit: Number

Measurement: Current_replica
Description: Indicates the number of replicas currently managed by this autoscaler.
Measurement Unit: Number
Interpretation: If the value of this measure is not equal to that of the Desired_replica measure, it could mean one of the following:

  • Autoscaling has failed;

  • The minimum / maximum replica count specification in the autoscaler definition is restricting scalability.

In the case of the former, you will have to investigate the reasons for the failure. In the case of the latter, check the value of the Minimum_replica and Maximum_replica measures and see if changing them will improve scalability of the autoscaler.
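The interpretation guidance for the replica measures can be condensed into a small decision routine. The Python sketch below is illustrative only: the function names, the dictionary layout, and the returned messages are hypothetical and are not part of the eG product or of the Kubernetes API. Only the measure names and the numeric state values (1, 2, 3) are taken from the measures listed above.

```python
from enum import IntEnum


class MeasureState(IntEnum):
    """Numeric equivalents shown in the graphs of the state measures."""
    YES = 1
    NO = 2
    UNKNOWN = 3


def state_label(numeric_value):
    """Translate a numeric state value (1, 2, 3) into Yes / No / Unknown."""
    return MeasureState(numeric_value).name.capitalize()


def diagnose_replicas(measures):
    """Suggest where to look when current and desired replica counts differ.

    `measures` is a hypothetical dict keyed by the measure names of this test.
    """
    current = measures["Current_replica"]
    desired = measures["Desired_replica"]
    if current == desired:
        return "Current replica count matches the desired count."
    # Already at the upper limit while more replicas are wanted.
    if desired > current and current >= measures["Maximum_replica"]:
        return "Scale-up is capped: consider raising Maximum_replica."
    # Already at the lower limit while fewer replicas are wanted.
    if desired < current and current <= measures["Minimum_replica"]:
        return "Scale-down is capped: consider lowering Minimum_replica."
    # The limits are not the cause, so the scaling itself needs investigation.
    return "Autoscaling may have failed: check the detailed diagnosis."


# Sample values only.
sample = {"Current_replica": 8, "Desired_replica": 10,
          "Minimum_replica": 2, "Maximum_replica": 8}
message = diagnose_replicas(sample)
```

In this sample, the current count already equals the maximum, so the routine points at the limit rather than at a scaling failure.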