eG Monitoring
 

Measures reported by AWSRoute53Test

With Amazon Route 53, you can get a website or web application up and running. One of the key functions of Route 53 is health checking: it sends automated requests over the internet to a resource, such as a web server or an email server, to verify that the resource is reachable, available, and functional. You can also choose to receive notifications when a resource becomes unavailable and to route internet traffic away from unhealthy resources.

Here's an overview of how health checking works:

  1. You create a health check and specify values that define how you want the health check to work.

  2. Route 53 starts to send requests to the endpoint at the interval that you specified in the health check.

  3. If the endpoint responds to the requests, Route 53 considers the endpoint to be healthy and takes no action.

  4. If the endpoint does not respond to a request, Route 53 starts to count the number of consecutive requests that the endpoint does not respond to.

  5. If the count reaches the value that you specified for the failure threshold, Route 53 considers the endpoint unhealthy.

  6. If the endpoint starts to respond again before the count reaches the failure threshold, Route 53 resets the count to 0.
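The counting logic in steps 3 to 6 can be sketched as follows. This is a minimal illustration, not Route 53's implementation: the class name `HealthCheckState` and the helper `replay` are hypothetical, and the sketch assumes that a single successful response restores the healthy state, which the steps above do not specify.

```python
from dataclasses import dataclass


@dataclass
class HealthCheckState:
    """Hypothetical model of the counting steps above (not Route 53 code)."""

    failure_threshold: int
    consecutive_failures: int = 0
    healthy: bool = True

    def record(self, responded: bool) -> bool:
        """Record one request result; return True while the endpoint counts as healthy."""
        if responded:
            # Steps 3 and 6: a response resets the consecutive-failure count to 0.
            self.consecutive_failures = 0
            # Assumption (not stated in the steps): one response restores "healthy".
            self.healthy = True
        else:
            # Step 4: count consecutive requests that got no response.
            self.consecutive_failures += 1
            # Step 5: reaching the failure threshold marks the endpoint unhealthy.
            if self.consecutive_failures >= self.failure_threshold:
                self.healthy = False
        return self.healthy


def replay(responses, failure_threshold):
    """Feed a sequence of request results through a fresh state object."""
    state = HealthCheckState(failure_threshold)
    return [state.record(responded) for responded in responses]
```

For example, with a failure threshold of 3, two missed requests followed by a response leave the endpoint healthy, while three missed requests in a row mark it unhealthy.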

If a health check fails, administrators should be notified promptly so that they can investigate the reasons for the failure and initiate appropriate remedial measures. This is where the AWSRoute53Test is most useful.

This test discovers all the health checks configured for each AWS region and alerts administrators if any health check fails. To help administrators determine how often during a given measure period a health check reported abnormalities with an endpoint, the test also reports the percentage of time for which each health check reported that its endpoint was healthy. This points administrators to the problem-prone areas of their infrastructure.
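As a rough illustration of the percentage measure, the sketch below computes the share of a measure period during which a health check reported a healthy endpoint. How the test actually derives this value is not described here; the function name `percent_healthy` is hypothetical, and the sketch assumes status samples taken at equal intervals, so that the fraction of healthy samples stands in for the fraction of time.

```python
def percent_healthy(samples):
    """Return the percentage of samples in which the endpoint was reported healthy.

    `samples` is a sequence of booleans, one per equally spaced observation
    (an assumption of this sketch): True means "healthy", False means "unhealthy".
    """
    if not samples:
        raise ValueError("at least one sample is required")
    healthy_count = sum(1 for is_healthy in samples if is_healthy)
    return 100.0 * healthy_count / len(samples)
```

With four observations of which one is unhealthy, `percent_healthy([True, True, False, True])` evaluates to 75.0.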

Outputs of the test: One set of results for each health check ID in each AWS region.

First-level descriptor: AWS Region

Second-level descriptor: Health check ID
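The two-level descriptor layout can be pictured as a nested mapping from AWS region to health check ID to measures. The sketch below is only an illustration of that shape: the function name `group_by_descriptor`, the input tuple layout, and the health check IDs used in the usage note are hypothetical.

```python
from collections import defaultdict


def group_by_descriptor(results):
    """Group flat (region, health_check_id, measures) tuples into a nested mapping.

    First level: AWS region. Second level: health check ID.
    The tuple layout is an assumption made for this sketch.
    """
    grouped = defaultdict(dict)
    for region, health_check_id, measures in results:
        grouped[region][health_check_id] = measures
    # Convert to a plain dict so that missing regions raise KeyError.
    return dict(grouped)
```

For instance, passing `[("us-east-1", "hc-example-1", {"Health_check_status": "Healthy"})]` yields one entry under `us-east-1`, keyed by the hypothetical ID `hc-example-1`.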

The measures made by this test are as follows:

Measurement: Health_check_status
Description: Indicates the status of the endpoint of this health check.
Interpretation: If the health check reports that the endpoint is in a healthy state, the value of this measure is Healthy. If the health check finds that the endpoint is in an abnormal state, the value of this measure is Unhealthy.

The numeric values that correspond to these measure values are listed in the table below:

Measure Value    Numeric Value
Healthy          0
Unhealthy        1

Note:

By default, the test reports the Measure Values in the table above to indicate the status of a health check. In the graph of this measure, however, the status is indicated using only the numeric equivalents.

Measurement: Health_percnt_healthy
Description: Indicates the percentage of time for which this health check reported that its endpoint is healthy.
Measurement Unit: Percent
Interpretation: A very low value for this measure indicates a problem-prone endpoint.
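When reading the graph of Health_check_status, the numeric values have to be translated back to the measure values listed above. A minimal sketch of that lookup follows; the names `STATUS_TO_NUMERIC`, `NUMERIC_TO_STATUS`, and `status_from_graph_value` are hypothetical and not part of the product.

```python
# Mapping taken from the Measure Value / Numeric Value table above.
STATUS_TO_NUMERIC = {"Healthy": 0, "Unhealthy": 1}

# Reverse lookup for interpreting points plotted in the measure graph.
NUMERIC_TO_STATUS = {number: status for status, number in STATUS_TO_NUMERIC.items()}


def status_from_graph_value(value):
    """Translate a numeric graph value back to its measure value."""
    try:
        return NUMERIC_TO_STATUS[value]
    except KeyError:
        raise ValueError(f"unknown Health_check_status value: {value!r}") from None
```

A graph point of 1 therefore reads as Unhealthy, and a point of 0 as Healthy.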