eG Monitoring
 

Measures reported by Exc2013EvtLogTest

Managed availability, also known as Active Monitoring or Local Active Monitoring, is the integration of built-in monitoring and recovery actions with the Exchange high availability platform. It's designed to detect and recover from problems as soon as they occur and are discovered by the system. Unlike previous external monitoring solutions and techniques for Exchange, managed availability doesn't try to identify or communicate the root cause of an issue. It's instead focused on recovery aspects that address three key areas of the user experience:

  • Availability: Can users access the service?
  • Latency: How is the experience for users?
  • Errors: Are users able to accomplish what they want?

Managed availability is an internal process that runs on every Exchange 2013 server. It polls and analyzes hundreds of health metrics every second. When it finds a problem, it fixes the problem automatically in most cases. Some issues, however, cannot be fixed by managed availability on its own; it escalates these to an administrator by logging events. Using the Exc2013EvtLogTest test, administrators can scan the event logs for information, warning, or error messages logged by the Managed Availability process, and capture the critical errors and warnings that Exchange could not self-heal using the Managed Availability engine.
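The triage idea described above can be sketched in a few lines of Python. This is only an illustration, not how the test itself is implemented: the `EventRecord` shape, the `needs_attention` helper, and the sample messages are all hypothetical, and real event log entries carry different fields. The sketch assumes that Warning and Error entries are the ones an administrator needs to review.

```python
from dataclasses import dataclass


@dataclass
class EventRecord:
    """Hypothetical, simplified view of one event log entry."""
    level: str    # "Information", "Warning", or "Error"
    message: str  # free-text description (made up for this sketch)


# Assumption: these levels mark issues that were escalated to an administrator.
ESCALATED_LEVELS = {"Warning", "Error"}


def needs_attention(events):
    """Return the events whose level is Warning or Error, in original order."""
    return [event for event in events if event.level in ESCALATED_LEVELS]


# Illustrative sample data only; not real Managed Availability output.
sample_events = [
    EventRecord("Information", "Health check completed"),
    EventRecord("Warning", "Recovery action was delayed"),
    EventRecord("Error", "Component still unhealthy after recovery attempt"),
]

escalated = needs_attention(sample_events)
for event in escalated:
    print(f"{event.level}: {event.message}")
```

Keeping the escalated levels in one set makes it easy to adjust the filter, for example to review Error entries only.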

The measures reported by this test are as follows:

Measurement: Alert_val
Description: Indicates the type of event that was captured and logged by the managed availability process in the event log during the last measurement period.
Interpretation: The values that this measure can take and their corresponding numeric values are as follows:

  Measure Value    Numeric Value
  Information      0
  Warning          1
  Error            2

  Note: By default, this measure reports one of the Measure Values listed in the table above. In the graph of this measure, however, the alert status is indicated by the corresponding numeric equivalents only.

Measurement: Summary_val
Description: Indicates the total number of errors that the Managed Availability process logged in the event logs.
Measurement Unit: Number
Interpretation: A high value is a cause for concern.
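
For readers who post-process these measures, the following Python sketch shows how the Measure Values of Alert_val map to the numeric values plotted in the graph, and how an error total comparable to Summary_val could be computed. The names `ALERT_NUMERIC`, `alert_to_numeric`, and `count_errors`, and the sample list, are hypothetical; only the mapping (Information 0, Warning 1, Error 2) comes from the table above.

```python
# Mapping taken from the Alert_val table: Measure Value -> Numeric Value.
ALERT_NUMERIC = {"Information": 0, "Warning": 1, "Error": 2}


def alert_to_numeric(measure_value):
    """Translate an Alert_val Measure Value into its graph value."""
    return ALERT_NUMERIC[measure_value]


def count_errors(levels):
    """Count the Error entries in a sequence of event levels."""
    return sum(1 for level in levels if level == "Error")


# Illustrative input only: one event level per measurement period.
observed_levels = ["Information", "Error", "Warning", "Error"]

graph_points = [alert_to_numeric(level) for level in observed_levels]
error_total = count_errors(observed_levels)

print(graph_points)
print(error_total)
```

An unknown Measure Value raises a `KeyError` here, which is a deliberate choice in this sketch so that unexpected input is noticed rather than silently plotted.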