eG Monitoring
 

Measures reported by AnsibleTowerTest

Red Hat® Ansible® Tower helps you scale IT automation, manage complex deployments and speed productivity. Ansible Tower also helps you keep the inventories and projects in sync. The Tower is centred around the idea of organizing Projects (which run your playbooks via Jobs) and Inventories (which describe the servers on which your playbooks should be run) inside of Organizations.

In Tower-automated environments, jobs play a vital role in performing operations such as updating and synchronizing inventories and projects, performing system upgrades, and updating applications delivered via the Tower. To ensure peak performance of the Tower, administrators should therefore continuously track its job health. If job health deteriorates for any reason, the overall performance, data synchronization, reliability, and data integrity of the Tower deteriorate as well. Administrators can use the AnsibleTowerTest test to detect declining job health proactively, before it affects operations.

This test continuously monitors the Ansible Tower and reports the number of hosts that failed while executing jobs and the number of jobs that failed. These statistics show administrators how well or badly jobs are performing on the Tower. The test also reports synchronization failures of inventories and projects that occur while jobs are launched. This helps administrators detect synchronization issues, if any, and take remedial action immediately.

Outputs of the test: One set of results for the Ansible Tower being monitored.

The measures made by this test are as follows:

| Measurement | Description | Measurement Unit | Interpretation |
|---|---|---|---|
| Total_hosts | Indicates the total number of hosts in the Tower. | Number | |
| Successhosts | Indicates the number of hosts that are successfully performing the jobs. | Number | |
| Failed_hosts | Indicates the number of hosts that failed to perform the jobs. | Number | |
| Jobs_health | Indicates the percentage of jobs that were performed successfully on the Tower. | Percent | Ideally, the value of this measure should be high. |
| Total_jobs | Indicates the total number of jobs launched on the Tower. | Number | This measure is a good indicator of the workload on the Tower. |
| Success_jobs | Indicates the number of jobs that completed successfully on the Tower. | Number | A high value is desired for this measure. |
| Failed_jobs | Indicates the number of jobs that failed on the Tower. | Number | The value of this measure should be zero. A non-zero value indicates that one or more jobs failed; this is a cause for concern and requires investigation. |
| Total_projects | Indicates the total number of projects in the Tower. | Number | |
| Failed_projects | Indicates the number of times the project syncing process failed during job execution. | Number | Ideally, the value of this measure should be zero. A non-zero value may lead to serious synchronization issues and is a cause for concern. |
| Credentials | Indicates the number of credentials available on the Tower. | Number | |
| Inventory_Details | Indicates the total number of inventories in the Tower. | Number | |
| Inventory_sync_failure | Indicates the number of times the inventory syncing process failed during job execution. | Number | Ideally, the value of this measure should be zero. A non-zero value is a cause for concern. |
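
The interpretation guidance above can be illustrated with a short Python sketch. It is not part of the eG product, and it rests on assumptions: that Jobs_health corresponds to Success_jobs divided by Total_jobs, that a 95% health floor is a reasonable alert threshold, and that an idle Tower with no jobs counts as healthy. The function names (jobs_health_percent, review_measures) and the sample values are hypothetical.

```python
from typing import Dict, List

# Measures whose ideal value is zero, per the Interpretation column.
ZERO_IDEAL = ("Failed_jobs", "Failed_projects", "Inventory_sync_failure")


def jobs_health_percent(success_jobs: int, total_jobs: int) -> float:
    """Percentage of launched jobs that completed successfully.

    Assumption: Jobs_health = Success_jobs / Total_jobs * 100.
    """
    if total_jobs <= 0:
        # Assumption: with no jobs launched, nothing failed, so report 100.
        return 100.0
    return 100.0 * success_jobs / total_jobs


def review_measures(measures: Dict[str, float], min_health: float = 95.0) -> List[str]:
    """Return one finding per measure that deviates from its ideal value.

    min_health is a hypothetical threshold, not an eG default.
    """
    findings = []
    for name in ZERO_IDEAL:
        value = measures.get(name, 0)
        if value > 0:
            findings.append(f"{name} = {value}: expected 0, investigate")
    health = jobs_health_percent(
        measures.get("Success_jobs", 0), measures.get("Total_jobs", 0)
    )
    if health < min_health:
        findings.append(f"Jobs_health = {health:.1f}%: below {min_health:.1f}%")
    return findings


# Hypothetical sample values for one measurement period.
sample = {
    "Total_jobs": 40,
    "Success_jobs": 37,
    "Failed_jobs": 3,
    "Failed_projects": 0,
    "Inventory_sync_failure": 1,
}

for finding in review_measures(sample):
    print(finding)
```

With the sample values, the sketch flags Failed_jobs, Inventory_sync_failure, and a Jobs_health of 92.5%. Failed_hosts is left out of the zero-ideal list because the table gives no interpretation for it.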