|
Measures reported by EsxUptimeTest
In most virtualized environments, it is essential to monitor the uptime of critical vSphere/ESX servers in the infrastructure. By tracking the uptime of each server, administrators can determine the percentage of time that server has been up. Comparing this value with service level targets helps administrators identify the most trouble-prone areas of the infrastructure. In some environments, administrators schedule periodic reboots of their servers. If a specific server has been up for an unusually long time, this can indicate that the scheduled reboot task is not working on that server. The EsxUptime test included in the eG agent monitors the uptime of critical vSphere/ESX servers in a virtualized infrastructure.
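As a rough illustration of comparing uptime against a service level target, the sketch below estimates an availability percentage from a series of per-period uptime values in seconds. The function name, the sample values, and the 99.9% target are hypothetical and are not part of the eG agent. The estimate is conservative, because a period in which a reboot occurred counts only the time since that reboot.

```python
def availability_percent(uptime_samples, measurement_period):
    """Estimate the percentage of time a server was up.

    uptime_samples: per-period uptime values in seconds (assumed input).
    measurement_period: length of each measurement period in seconds.
    """
    if measurement_period <= 0:
        raise ValueError("measurement_period must be positive")
    if not uptime_samples:
        raise ValueError("at least one sample is required")
    # Cap each sample at the period length so a stray large value
    # cannot push the estimate above 100%.
    total_up = sum(min(s, measurement_period) for s in uptime_samples)
    return 100.0 * total_up / (measurement_period * len(uptime_samples))


# Hypothetical data: four 300-second periods, one containing a reboot.
samples = [300, 300, 120, 300]
pct = availability_percent(samples, 300)
sla_target = 99.9  # illustrative target, not taken from the product
meets_target = pct >= sla_target
```

With shorter measurement periods, the time lost to each reboot is measured more precisely, so the estimate tracks actual availability more closely.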
The measures reported by this test are as follows:
| Measurement | Description | Measurement Unit | Interpretation |
|---|---|---|---|
| Rebooted | Indicates whether the server was rebooted during the last measurement period. | Boolean | A value of 1 means that the server was rebooted during the last measurement period. By checking the time periods when this metric changes from 0 to 1, an administrator can determine when this server was rebooted. The detailed diagnosis of this measure, if enabled, provides the details of the last reboot of the ESX host. These details include the shutdown date/time, the reboot date/time, the shutdown duration (in minutes), and whether the host has been configured for maintenance or not. |
| Uptime | Indicates how long the system has been up since the last time this test ran. | Secs | If the server has not been rebooted during the last measurement period and the agent has been running continuously, this value will be equal to the measurement period. If the server was rebooted during the last measurement period, this value will be less than the measurement period of the test. For example, if the measurement period is 300 seconds and the server was rebooted 120 seconds ago, this metric will report a value of 120 seconds. The accuracy of this metric depends on the measurement period: the smaller the measurement period, the greater the accuracy. |
| Total_uptime | Indicates the total time that the server has been up since its last reboot. | | This measure displays the number of years, months, days, hours, minutes and seconds since the last reboot. Administrators may wish to be alerted if a server has been running without a reboot for a very long period. Setting a threshold for this metric allows administrators to detect such conditions. |
|
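The relationship between the three measures can be sketched as follows. This is a minimal illustration and not the eG agent's implementation: it assumes the only input is the number of seconds since the host last booted, and `derive_measures`, `format_duration`, and `exceeds_reboot_threshold` are hypothetical names. For simplicity the duration is broken down only into days, hours, minutes, and seconds; a years and months breakdown would need the actual boot date.

```python
from dataclasses import dataclass


@dataclass
class UptimeMeasures:
    rebooted: int      # 1 if a reboot happened in the last period, else 0
    uptime_secs: int   # seconds up within the last measurement period
    total_uptime: str  # human-readable time since the last reboot


def format_duration(total_secs):
    """Break a duration in seconds into days, hours, minutes, seconds."""
    days, rem = divmod(total_secs, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{days}d {hours}h {minutes}m {seconds}s"


def derive_measures(secs_since_boot, measurement_period):
    """Derive the three measures from the time since the last boot."""
    # A boot more recent than one period ago implies a reboot in that period.
    rebooted = 1 if secs_since_boot < measurement_period else 0
    # Uptime is capped at the measurement period.
    uptime = min(secs_since_boot, measurement_period)
    return UptimeMeasures(rebooted, uptime, format_duration(secs_since_boot))


def exceeds_reboot_threshold(secs_since_boot, max_days):
    """True if the host has run longer than max_days without a reboot."""
    return secs_since_boot > max_days * 86400


# The example from the table: 300-second period, reboot 120 seconds ago.
recent = derive_measures(120, 300)
# A host that has been up for a little over one day.
steady = derive_measures(90061, 300)
```

In this sketch, `recent` reports a reboot with 120 seconds of uptime, while `steady` reports no reboot and an uptime equal to the full 300-second period. A check such as `exceeds_reboot_threshold(secs_since_boot, 30)` is one way to flag hosts whose scheduled reboot may not be running; the 30-day limit is only an example.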