Component Administration
 

Configure Thresholds - Specific Thresholds

The first step in configuring thresholds is to select an appropriate threshold policy. The state of a measurement is determined by the threshold settings that eG Enterprise applies to it.

  • Specifying thresholds

    The next step towards configuring the thresholds is to specify threshold values in the Configure Thresholds page, using the threshold policies discussed in the previous section. eG Enterprise can generate alerts when a measure is in an abnormal state, that is, when the value of the measure is extremely low or extremely high. To support this, eG Enterprise provides multiple thresholding: you can set a Minimum threshold, a Maximum threshold, or both for the same measure.

    Setting a Minimum/Maximum threshold involves providing threshold values according to the thresholding policies mentioned above. Choose one of the configuration types (Default or Specific), and then choose the test for which thresholds have to be configured, based on the thresholding policy of your choice.

    If you wish to set a threshold for a specific measure, click on the measure to open this page. You can then configure the thresholds for the chosen measure based on the thresholding policies discussed below.

    To set a Static Threshold, enter the specific values in the Critical, Major and Minor text boxes. If you do not want a static threshold, check the None option in the Static section.

    Similarly, to set an Automatic Threshold, choose the percentage for Critical, Major and Minor. If you do not want an automatic threshold, check the None option in the Automatic section. Finally, click the Update button to save the changes.

  • eG Enterprise supports the following threshold policies:

    • Static thresholding
    • Automatic thresholding
    • Auto-Static thresholding
    • None

    • Static thresholding policy

      For many metrics, thresholds can be set statically. For instance, based on the service level expectations and agreements, IT managers can set thresholds for metrics such as network availability, CPU usage, and latency. Application availability and response time can also be handled in the same manner. For example, availability should be 100% whenever the metric is measured. If not, a violation should be detected. Likewise, a network latency of several seconds is usually an indicator of a problem, no matter what time of day the measurement is made at.

      To enable administrators to set static baselines for time-invariant measures such as the ones discussed above, the eG Enterprise system includes the static thresholding policy.

      To illustrate how static thresholding works, consider the example of the CPU utilization of a host. The CPU utilization measure should never exceed a prescribed limit. Therefore, static threshold limits have to be explicitly defined for the CPU utilization measure. The measure graph, when clicked, depicts the static threshold values of the CPU utilization measure alongside its actual values.

      To set the static threshold for a chosen measure, select the Specific Values option from the list box against the Static option in the Configure Thresholds page that appears, and provide the threshold values of your choice. Refer to the sections below to know more about how to set the static thresholds.
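
      As a rough illustration of how static limits can map a measured value to a state, the sketch below checks a value against Critical, Major and Minor limits, with optional Maximum and Minimum sets. It is not eG Enterprise code: the names `Levels` and `classify`, the strict comparisons, and the rule that the most severe breached level wins are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Levels:
    """Hypothetical container for one set of static limits (None = not set)."""
    critical: Optional[float] = None
    major: Optional[float] = None
    minor: Optional[float] = None


def classify(value: float,
             maximum: Optional[Levels] = None,
             minimum: Optional[Levels] = None) -> str:
    """Return 'critical', 'major', 'minor' or 'normal' for a measured value.

    Assumption: a Maximum limit is violated when value > limit and a
    Minimum limit is violated when value < limit.
    """
    # Check the most severe level first so that it takes precedence.
    for severity in ("critical", "major", "minor"):
        upper = getattr(maximum, severity) if maximum is not None else None
        lower = getattr(minimum, severity) if minimum is not None else None
        if upper is not None and value > upper:
            return severity
        if lower is not None and value < lower:
            return severity
    return "normal"


# CPU utilization with Maximum limits of 95 / 90 / 80 percent.
cpu_limits = Levels(critical=95, major=90, minor=80)
print(classify(97.0, maximum=cpu_limits))  # critical
print(classify(50.0, maximum=cpu_limits))  # normal
```

      Passing only `minimum` covers measures such as availability, where a value that is too low is the problem; passing both covers the combined Minimum/Maximum case.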

    • Automatic thresholding policy

      In infrastructures where a metric varies with time, a static threshold value cannot serve as a reliable basis for judging performance. For example, consider a web server hosting a web site. The number of TCP connections to the web site could be rather high on a particular day and low on another. Similarly, it could be high during the working hours and low during the nights. In such situations where measurement values change with the time of the day, it is very difficult to set accurate maximum and minimum limits manually. In such cases, the threshold value for this metric also has to be time variant.

      Even when a metric is not time variant, its value may change from one server to another. For example, a high-end datacenter server may be able to handle hundreds of users, whereas a low-end standard server may be able to handle only a few tens of users. In such cases too, it is extremely laborious and time consuming to determine what the normal values are for each and every server.

      To handle such situations, eG Enterprise includes an automatic thresholding capability. eG Enterprise applies tried and tested statistical quality control techniques to the past values of a metric and automatically sets the upper and lower bounds for that metric. In this approach, for example, the threshold values for a metric between 9am-10am tomorrow are based on the values of the metric for the same time period over the past days (the number of past days to be considered is configurable).

      Note:

      In the MANAGER_SETTINGS section of the file <EG INSTALL DIR>/manager/config/eg_db.ini, a variable “ThresholdCheckPeriod” exists. The value of this variable defines how far back the manager will look into past history when computing automatic thresholds for a measurement. The default value of this variable is 14 days (i.e., 2 weeks). You can change this value, if required.
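
      The text above does not say which statistical quality control technique eG Enterprise uses. Purely as an illustration, the sketch below applies one classic technique, control limits of mean plus or minus k standard deviations, to the values seen in the same time slot on previous days. The function name `auto_threshold`, the `sensitivity` multiplier and the input layout are assumptions for this example; the default `lookback_days=14` only mirrors the default history period mentioned in the note.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def auto_threshold(same_slot_history: Sequence[float],
                   sensitivity: float = 3.0,
                   lookback_days: int = 14) -> Tuple[float, float]:
    """Return (lower, upper) bounds for one time slot, e.g. 9am-10am.

    same_slot_history holds one value per past day for that slot,
    oldest first. Only the most recent lookback_days values are used.
    """
    recent = list(same_slot_history)[-lookback_days:]
    if len(recent) < 2:
        raise ValueError("need at least two past values to estimate spread")
    centre = mean(recent)
    spread = stdev(recent)  # sample standard deviation
    return centre - sensitivity * spread, centre + sensitivity * spread


# Connections seen between 9am and 10am on the last five days.
lower, upper = auto_threshold([100, 110, 90, 105, 95], sensitivity=2.0)
print(round(lower, 2), round(upper, 2))
```

      Running the same computation per time slot yields bounds that rise and fall with the metric's daily pattern, which is the behavior described for automatic thresholds.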

      With eG's auto-thresholding capability, the threshold varies with time, just as the metric value does. Whenever a deviation from this automatic baseline (threshold) is detected, an alert is triggered. Because the baseline is set automatically from the metric's own history, administrators are informed of problems well before they become critical enough to impact the end-user experience.

      Automatic thresholding is ideal for time varying metrics such as number of requests to a web server, the workload on a database server, queue lengths of requests waiting for processing, etc.

      The measure graphs provided by eG Enterprise's monitor interface bring out the differences between static and automatic thresholding more clearly. The graph in Figure 8 depicts the threshold limits that were automatically assigned to the Current connections measure. Notice that the measured data is strongly periodic and that the threshold automatically computed by eG Enterprise follows the same pattern as the measurement values.

      To set the automatic threshold for a chosen measure, select the Specific Values option from the list box against the Automatic option in the Configure Thresholds page that appears, and set the desired values by adjusting the sensitivity sliders. Refer to the sections below to know more about the computation of the automatic threshold values.

    • Auto-static combination threshold policy

      Automatic thresholds are ideal for metrics that are time variant. Often, the same metric may vary significantly from one server to another and from time to time. Consider a staging environment with a web server. Typically, there is no load on the web server and the automatic threshold is set accordingly. When someone logs in, the threshold will be breached and an alert may be raised by the system. This is a false alert because one user logging in does not signify a situation of interest to an IT manager. This scenario shows that while automatic thresholding reduces the effort involved in configuring the monitoring tool (because IT managers do not have to configure thresholds for every metric and server), it does not eliminate false alerts.

      Therefore, eG Enterprise allows IT managers to use a combination of static and automatic thresholds. A static threshold applied along with an automatic threshold provides a realistic boundary that has to be crossed before an alert is triggered.

      You can set the auto-static combination for a measure by picking the Specific Values option from the list boxes against both the Static and Automatic options in the Configure Thresholds page.
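
      One way to read the combination described above is that a value must cross both the automatic bound and the static bound before an alert is raised. The sketch below encodes that reading for an upper limit only; the function name `should_alert` and its parameters are hypothetical, and the actual evaluation inside eG Enterprise may differ.

```python
def should_alert(value: float, auto_upper: float, static_upper: float) -> bool:
    """Assumed auto-static rule: alert only if both upper bounds are crossed."""
    return value > auto_upper and value > static_upper


# Staging web server: history shows almost no load, so the automatic
# bound is tiny, while the static bound marks the smallest load worth
# an alert. All numbers here are illustrative.
print(should_alert(1, auto_upper=0.5, static_upper=20))   # False: a single login
print(should_alert(25, auto_upper=0.5, static_upper=20))  # True: real load
```

      In the staging example, the single login breaches the automatic bound but not the static one, so the false alert is suppressed.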

    • None

      If the threshold policy for a measurement is None, the eG agent stops tracking the state of this measurement (i.e., the agent continues to collect values for this measurement but does not generate any alarms relating to it).

    • Using the above-mentioned thresholding policies, you can set either the default or specific thresholds for the chosen test.

  • Selecting an Alarm Policy

    The final step in configuring the thresholds for a measure is to select a suitable alarm policy. The alarm policy specification indicates when alarms should be generated by the eG manager. Choose an alarm policy from the Policy list box in the Alarm Policy section before updating the thresholds for the chosen measure. The priority assigned to an alarm depends on the threshold configuration and its corresponding alarm policy specification. By default, if the number of violations in a time window matches the alarm policy specification (e.g., 4 threshold violations out of 6 consecutive measurements), the following rules determine the alarm priority:

    • If all violations are critical, then the alarm priority is critical
    • If all violations are major, then the alarm priority is major
    • If all violations are minor, then the alarm priority is minor
    • If the number of critical violations is greater than the number of major, and the number of critical violations is greater than the number of minor violations, then the alarm priority is critical
    • If the number of major violations is greater than or equal to the number of critical violations, and the number of major violations is greater than the number of minor violations, then the alarm priority is major
    • In all other cases, the alarm priority is minor

    Finally, click the Update button to save the changes.
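
    The default priority rules listed above can be sketched as a small function. This is an illustration rather than eG Enterprise code: the name `alarm_priority`, the state labels, and the reading of "matches the alarm policy specification" as "at least `required` violations among the last `window` measurements" are assumptions.

```python
from collections import Counter
from typing import Optional, Sequence

VIOLATIONS = ("critical", "major", "minor")


def alarm_priority(states: Sequence[str],
                   required: int = 4,
                   window: int = 6) -> Optional[str]:
    """Return the alarm priority, or None if the policy is not met.

    states lists the state of each measurement, oldest first; any label
    outside VIOLATIONS (for example 'normal') counts as no violation.
    """
    recent = list(states)[-window:]
    counts = Counter(s for s in recent if s in VIOLATIONS)
    if sum(counts.values()) < required:
        return None  # not enough violations in the window: no alarm

    critical, major, minor = (counts[s] for s in VIOLATIONS)
    # The three "all violations are X" rules are special cases of the
    # comparisons below, so they need no separate branches.
    if critical > major and critical > minor:
        return "critical"
    if major >= critical and major > minor:
        return "major"
    return "minor"


print(alarm_priority(["critical", "critical", "major", "major", "minor", "normal"]))  # major
```

    In the example, two critical and two major violations tie, so the "greater than or equal to" in the major rule decides the outcome.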