eG Monitoring
 

History of Alarms

This page is opened when the user selects the History of Alarms option under the Alarms menu. You can also access this page by clicking on the Click here for more events >> link in the Event Analysis section of the Monitor Home page. Through this page, a user can view the entire history of all alarms generated by the eG Enterprise System. The user can browse the alarm history based on specific selection criteria.

  • To narrow the alarm history to specific alarms, first pick an option from the Analysis By list. The available options are as follows:

    • Component: This is the default selection in the Analysis By list. Owing to this default setting, the HISTORY OF ALARMS page displays the alarm history of all managed components in the environment, by default. If you proceed with the default selection, then, you will find that the Component Type and Component lists are populated with all the managed component types and components (respectively) in the environment. If you want to view the alarm history of a particular component-type, pick that type from the Component Type list. Likewise, if you want to view the alarm history of a particular managed component, pick the name of that component from the Component list. If the Component list has too many components to choose from, then, you can condense the list by first picking a Component Type; this will make sure that the Component list consists of only those managed components that are of the chosen type. You can then easily pick the component of your choice from the Component list.
    • Zone: Selecting this option from the Analysis By list will invoke a Zone list. Select a particular zone from this list, if you want to view the history of alarms related to that zone. An Include Subzone flag also appears. By setting this flag to Yes, you can make sure that the alarm history also includes those alarms that are associated with the sub-zones of the chosen zone.
      Once a Zone is selected, the Component Type and Component lists will be populated with those types and components (respectively) that are part of the selected zone. To view the alarm history of a particular component-type that is part of a zone, pick that type from the Component Type list. Similarly, to view the alarm history of a component that is part of a zone, pick that component from the Component list. If the Component list still has too many components to choose from, then, you can condense the list further by first picking a Component Type; this will make sure that the Component list consists of those components in the selected zone that are of the chosen type. You can then easily pick the component of your choice from the Component list.

      Note that the 'Zone' option will not be available in the 'Analysis By' list if no zones are configured in the environment.
    • Segment: If this option is chosen from the Analysis By list, a Segment list will additionally appear. To view the alarm history pertaining to a specific segment, pick a segment from the Segment list.
      Once a Segment is selected, the Component Type and Component lists will be populated with those types and components (respectively) that are part of the selected segment. To view the alarm history of a particular component-type that is part of a segment, pick that type from the Component Type list. Similarly, to view the alarm history of a component that is part of a segment, pick that component from the Component list. If the Component list still has too many components to choose from, then, you can condense the list further by first picking a Component Type; this will make sure that the Component list consists of those components in the selected segment that are of the chosen type. You can then easily pick the component of your choice from the Component list.

      Note that the 'Segment' option will not be available in the 'Analysis By' list if no segments are configured in the environment.
    • Service: If this option is chosen from the Analysis By list, a Service list will additionally appear. To view the alarm history pertaining to a specific service, pick a service from the Service list.
      Once you choose a Service, the Component Type and Component lists will be populated with those types and components (respectively) that are engaged in the delivery of the said service. If you want to view the alarm history of a particular component-type that is part of the selected service offering, then, pick that type from the Component Type list. Similarly, if you want to view the alarm history of a component that supports the selected service offering, pick that component from the Component list. If the Component list still has too many components to choose from, then, you can condense the list further by first picking a Component Type; this will make sure that the Component list consists of those components in the selected service that are of the chosen type. You can then easily pick the component of your choice from the Component list.

      Note that the 'Service' option will not be available in the 'Analysis By' list if no services are configured in the environment.

  • Next, to view the alarms that have remained unresolved for a time period that is in excess of a specified duration, select the greater than option from the Duration is list, enter a value in the adjacent text box, and then select a unit of time from the list box alongside. For example, to view the history of the alarms that have remained unresolved for over 2 hours, select the greater than option, enter 2 in the text box alongside, and select hours from the list box adjacent to it.
  • Similarly, you can view the history of alarms that have remained unresolved for less than a specified duration. To do so, select the lesser than option from the Duration is list, specify a value in the adjacent text box, and select a unit of time from the list box.
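The effect of the Duration is filter can be pictured with a minimal Python sketch. It is not how the eG manager implements the filter: the alarm records, their field names, and the strict (exclusive) comparison at the boundary are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical alarm records; field names and values are illustrative only.
alarms = [
    {"component": "web01", "start": datetime(2024, 1, 1, 9, 0), "end": datetime(2024, 1, 1, 9, 30)},
    {"component": "db01", "start": datetime(2024, 1, 1, 8, 0), "end": datetime(2024, 1, 1, 11, 15)},
    {"component": "app01", "start": datetime(2024, 1, 1, 10, 0), "end": datetime(2024, 1, 1, 12, 0)},
]

# Units of time as they might appear in the list box (assumed names).
UNITS = {
    "minutes": timedelta(minutes=1),
    "hours": timedelta(hours=1),
    "days": timedelta(days=1),
}


def filter_by_duration(records, comparison, value, unit):
    """Keep the alarms whose duration is greater or lesser than value * unit."""
    threshold = value * UNITS[unit]
    if comparison == "greater than":
        return [a for a in records if a["end"] - a["start"] > threshold]
    if comparison == "lesser than":
        return [a for a in records if a["end"] - a["start"] < threshold]
    raise ValueError(f"unknown comparison: {comparison!r}")


# Alarms that stayed unresolved for over 2 hours.
over_two_hours = filter_by_duration(alarms, "greater than", 2, "hours")
print([a["component"] for a in over_two_hours])  # ['db01']
```

In this sketch an alarm that lasted exactly 2 hours (app01) matches neither option; whether the product treats the boundary the same way is not stated in this page.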
  • You can even choose to view the details of past alarms that are of a particular priority, by selecting that priority from the Priority list.
  • For viewing the details of alarms that were generated during a specific time window, select a fixed Timeline, or choose Any to provide a date/time range.
  • If you want to view the alarm history of components with names that embed a specified string, enter the string to search for in the Component Search text box.
  • If need be, you can search for specific alarms by providing all or part of the alarm description in the Description search text box. Wildcard patterns are also supported. For instance, to look for alarms with a description that begins with the word 'service', you can enter service* in the Description search text box.
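To illustrate how a pattern such as service* differs from a plain partial description, here is a small Python sketch. It uses the standard fnmatch module as a stand-in for the product's own matcher, so the details are assumptions: case-insensitive comparison, substring matching when no asterisk is present, and fnmatch's extra pattern characters (? and [...]), which this page does not mention.

```python
from fnmatch import fnmatchcase


def matches_description(description, pattern):
    """Return True if an alarm description matches the search text.

    Assumptions for illustration: matching ignores case, and a search
    text without '*' is treated as a partial (substring) match.
    """
    text = description.lower()
    wanted = pattern.lower()
    if "*" not in wanted:
        return wanted in text
    return fnmatchcase(text, wanted)


print(matches_description("Service is unavailable", "service*"))      # True
print(matches_description("Web service is unavailable", "service*"))  # False
print(matches_description("Web service is unavailable", "service"))   # True
```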
  • By default, you cannot view the acknowledgement/deletion history of alarms in the HISTORY OF ALARMS page. Accordingly, the Show acknowledgements flag is set to No by default. To view the acknowledgement/deletion history of alarms, set this flag to Yes.
  • By default, the alarm history will not provide information on the users who are responsible for fixing the problems indicated by an alarm - i.e., the users who have been assigned the server/device on which an alarm has been raised. To ensure that every alarm displayed in the HISTORY OF ALARMS page is accompanied by this useful user information, do the following:

    • Edit the eg_ui.ini file in the <EG_INSTALL_DIR>\manager\config directory.
    • Set the Show_Users flag in the [ALARM_HISTORY] section of the file to true.
    • Save the file.

    When this is done, the alarm history will include an additional User(s) column, which displays the names of the users responsible for fixing the problems indicated by each alarm. With this information, help desk managers can use the alarm history page not only to instantly identify the problems that have remained unresolved for the longest time, but also to pinpoint the help desk personnel who failed to resolve those problems or took a long time to do so; the efficiency of the help desk staff can thus be ascertained. A User(s) list will also appear, set to All by default. If need be, you can pick a particular user name from this list and click the Show Alarms button to view the history of alarms associated with the chosen user alone.
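The Show_Users change described above can be sketched in Python with the standard configparser module. The sketch works on an in-memory excerpt rather than the real eg_ui.ini; the excerpt's starting value (false) and its key=value layout are assumptions. For the real file, editing by hand as described above is the safer route, because configparser discards comments when it rewrites a file.

```python
import configparser
import io

# Hypothetical excerpt of eg_ui.ini; the real file holds many more sections,
# and the starting value shown here is an assumption.
SAMPLE = """\
[ALARM_HISTORY]
Show_Users=false
"""

parser = configparser.ConfigParser()
parser.optionxform = str  # keep key case; configparser lowercases keys by default
parser.read_string(SAMPLE)

# Turn on the User(s) column for the alarm history.
parser["ALARM_HISTORY"]["Show_Users"] = "true"

# Serialize the result without spaces around '=' to mirror the excerpt's style.
buffer = io.StringIO()
parser.write(buffer, space_around_delimiters=False)
print(buffer.getvalue().strip())
```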

  • By picking an option from the Sort by list, you can indicate the order in which the resulting alarm history should be sorted. By default, the alarm history is sorted in the descending order of the Start Time of the alarms.
  • In addition, you can configure the number of event records to be displayed per page of the event history. By default, 15 records are displayed per page. To display more records, select an appropriate value from the Events per page list.
  • Finally, click the Show Alarms button to generate the history of events.
  • The alarm history provides details of every alarm, such as its start time, duration, the name of the component, and the test and corresponding measure experiencing the problem. Every row of alarm information is accompanied by a colored indicator that denotes the corresponding alarm priority: critical alarms are shown in red, major alarms in orange, and minor alarms in pink. An alarm with the end time set to current denotes a problem that has still not been fixed.

    Earlier, in the HISTORY OF ALARMS page, the Service name accompanied only those alarms that were generated on the Web Site or the Web Transactions layers of Web servers delivering web site services. In reality, however, many environments monitor non-web site services as well. Moreover, regardless of the type of service (i.e., web site or non-web site), the performance of a service may be impacted by issues with any service component, not just Web servers. This is why, every time a service component experiences a performance dip, many users need to know which service was affected, regardless of the type of service or component. To address this need, the SERVICE(S) column of the additional alarm details displays the affected service.

  • Sometimes, a single alarm raised by the eG manager could have undergone many transitions/changes during the specified Timeline. An alarm can change under any of the following circumstances:

    • A change in the alarm priority: This could be a switch to a higher or lower priority.
    • A change in the alarm description: For example, originally, a usage-related alarm may have been raised on disk 'D' of a server. Later, disk 'C' of the same server might have experienced a space crunch, causing another alarm to be raised.
    • A change in the list of impacted services

    Using the HISTORY OF ALARMS page, you can also view the history of transitions experienced by a particular alarm. For this, just click on an alarm in the HISTORY OF ALARMS page. If the alarm has not undergone any transitions, then the Alarm transitions window that appears will simply display the details of the alarm that was clicked on. On the other hand, if the alarm experienced one or more transitions during the given Timeline, then the Alarm transitions window will provide the details of each transition. These details include the alarm priority at the time of the transition, the component name, test, and alarm description during the transition, when the transition began (start time), when it ended (end time), and the total duration of the transition. The End time of a transition can also be interpreted as the time at which the next transition began. If the End time column displays no value for a transition, it implies that the transition is still active, and that no further transitions have occurred since. Accordingly, the Duration column will display the value 'Current'. On the other hand, if all alarm transitions have a definite End time, it could indicate that the alarm has been closed. Using the details provided in the Alarm transitions window, you can understand how many transitions have occurred for an alarm in a specified time window, and what they are. To focus only on the state (critical/major/minor) changes that an alarm experienced, click on the left-arrow button to the right of the Alarm transitions window. Alternatively, you can click on any of the alarm transitions in this window. This will invoke a distribution pie chart that reveals the percentage of time, during the total transition period, that the alarm spent in the critical, major, and minor states, and thus how alarm priorities changed during the entire transition period.
    If an alarm finally closes after undergoing multiple transitions, then, in the HISTORY OF ALARMS page, such an alarm will be assigned the highest priority across all its transitions. Moreover, the Start time of this alarm in the HISTORY OF ALARMS page will be the start time of its first transition, and the End time will be the end time of its last transition.
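The roll-up of a closed alarm's transitions, and the time split shown by the pie chart, can be sketched in Python as follows. The transition records, field names, and priority ranking table are hypothetical; the sketch only mirrors the rules stated above (highest priority across transitions, start of the first transition, end of the last) and assumes every transition has an end time.

```python
from datetime import datetime

# Ranking used to find the highest priority (assumed ordering of the three states).
RANK = {"minor": 1, "major": 2, "critical": 3}

# Hypothetical transitions of one closed alarm.
transitions = [
    {"priority": "minor", "start": datetime(2024, 1, 1, 9, 0), "end": datetime(2024, 1, 1, 9, 30)},
    {"priority": "critical", "start": datetime(2024, 1, 1, 9, 30), "end": datetime(2024, 1, 1, 10, 30)},
    {"priority": "major", "start": datetime(2024, 1, 1, 10, 30), "end": datetime(2024, 1, 1, 11, 0)},
]


def summarize_alarm(items):
    """Collapse the transitions of a closed alarm into one history row."""
    ordered = sorted(items, key=lambda t: t["start"])
    highest = max(ordered, key=lambda t: RANK[t["priority"]])["priority"]
    return {"priority": highest, "start": ordered[0]["start"], "end": ordered[-1]["end"]}


def time_share(items):
    """Percentage of the total transition period spent in each state."""
    seconds = {state: 0.0 for state in RANK}
    for t in items:
        seconds[t["priority"]] += (t["end"] - t["start"]).total_seconds()
    total = sum(seconds.values())
    return {state: round(100 * spent / total, 1) for state, spent in seconds.items()}


summary = summarize_alarm(transitions)
shares = time_share(transitions)
print(summary["priority"], summary["start"], summary["end"])
print(shares)  # {'minor': 25.0, 'major': 25.0, 'critical': 50.0}
```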

    Note:

    If a user who has been assigned VMs alone logs into the eG monitoring console, then in the HISTORY OF ALARMS page and in the Alarm transitions window, every issue related to each of the assigned VMs will be specially tagged with the component TYPE, VM.

  • Note:

    Typically, whenever an alarm is raised for a problem at the host-level of a component, the HISTORY OF ALARMS page and the Alarm transitions window automatically set the Component type to Host system, even if the affected component is, say, an Oracle Database server or a Web server. As a result, users cannot determine the exact Component type of the affected component from the alarm information. Moreover, many users want host-level alarms to indicate the operating system of the host, as this simplifies troubleshooting. Therefore, you can choose whether the Comp Type column of the HISTORY OF ALARMS page and the Alarm transitions window displays the actual Component type, Host system, or the Operating system for host-level alarms.

    To replace the Host system in the alarms with the corresponding Component type or the Operating system, do the following:

    • Edit the eg_ui.ini file in the <EG_INSTALL_DIR>\manager\config directory.
    • In the [HOST_SYSTEM] section of this file, set the Show_HostSystem flag to one of the following values:
      • HostSystem, if you want the component type to be displayed as Host system for the host-level alarms.
      • CompType, if you want to display the actual component type of the affected component. This is the default setting.
      • OS, if you want to display the operating system of the host.
    • Finally, save the file.
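The three Show_HostSystem values described above can be pictured with a short Python sketch that reads the flag from an in-memory excerpt and derives the text shown in the Comp Type column. The excerpt, the helper function, and the sample component type and operating system are all hypothetical, and treating the values as case-sensitive is an assumption.

```python
import configparser

# Hypothetical excerpt of eg_ui.ini with the flag set to OS.
SAMPLE = """\
[HOST_SYSTEM]
Show_HostSystem=OS
"""


def comp_type_label(setting, component_type, operating_system):
    """Text shown in the Comp Type column for a host-level alarm (illustrative)."""
    if setting == "HostSystem":
        return "Host system"
    if setting == "CompType":
        return component_type
    if setting == "OS":
        return operating_system
    raise ValueError(f"unsupported Show_HostSystem value: {setting!r}")


parser = configparser.ConfigParser()
parser.optionxform = str  # keep key case; configparser lowercases keys by default
parser.read_string(SAMPLE)
setting = parser["HOST_SYSTEM"]["Show_HostSystem"]

# Sample inputs: an Oracle Database server running on a Linux host (made up).
label = comp_type_label(setting, "Oracle Database", "Linux")
print(label)  # Linux
```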

  • In large environments, the same set of components is often assigned to multiple users for monitoring. In such environments, some or all of the users with monitoring rights to a component might want to post their comments on an alarm related to that component. If acknowledgement rights are granted to all these users, then each of them can log in to the monitor interface and provide an acknowledgement description for the same alarm. eG Enterprise maintains a history of the acknowledgement descriptions provided by multiple users with rights to monitor a single component, and enables these users to access this historical acknowledgement information using multiple mechanisms. The HISTORY OF ALARMS page, for starters, enables users to view the acknowledgement history associated with a problem that occurred in the recent or distant past. If any alarm listed in this page is associated with one or more acknowledgements, then the details of the same will be automatically displayed below that alarm in this page.
  • Typically, in large, multi-user environments, multiple users may be granted the privilege to monitor a single component. In such environments, any of these users can delete an alarm raised on that component without the knowledge of the others, thereby causing confusion. To avoid this confusion, eG Enterprise provides users with multiple mechanisms for tracking deleted alarms. The HISTORY OF ALARMS page, for one, indicates whether a past alarm was deleted or not. The deletion details are displayed along with the acknowledgement history of an alarm in this page, and include the user who deleted the alarm, the reason for the deletion, and the date and time of deletion.
  • The page also includes a GRAPH icon, which, when clicked, allows you to view the graph of the corresponding measure for the last one hour. If the detailed diagnosis capability has been enabled for the eG installation, then problem measures for which detailed diagnosis is available will be accompanied by the DIAGNOSIS icon. When this icon is clicked, the detailed diagnosis of the measure will appear, shedding more light on the problem condition. By default, the graph and detailed diagnosis information will be displayed in the same window as the event history. If you want to view the graph and detailed diagnosis in a separate window, click on the check box preceding the symbol, and then click on the GRAPH or DIAGNOSIS icons.
  • You can save the event history in the CSV format by clicking on the CSV button in this page. To save it as a PDF document, click on the PDF icon.
  • The NEXT and PREVIOUS buttons, and the hyperlinked page numbers are provided to enable you to easily browse the alarm information that runs across pages.