Measures reported by VmotionLogTest
VMware VMotion technology, unique to VMware, leverages the complete virtualization of servers, storage and networking to migrate an entire running virtual machine instantaneously from a failing/underperforming server to a healthy one, to ensure high availability of the virtual machine. The entire state of a virtual machine is encapsulated by a set of files stored on shared storage, and VMware's VMFS cluster file system allows both the source and the target ESX Server to access these virtual machine files concurrently. The active memory and precise execution state of a virtual machine can then be rapidly transmitted over a high speed network. Since the network is also virtualized by ESX Server, the virtual machine retains its network identity and connections, ensuring a seamless migration process.
To troubleshoot errors in migration, administrators typically use the VMotion history log file available in the /proc/vmware/migration/history directory. The VmotionLogTest parses the VMotion history log and reports key statistics that reveal the level of efficiency of the VMware VMotion technology.
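The roll-up from individual migrations to the per-period counters listed in the table below can be pictured with a small sketch. It is illustrative only: the `MigrationRecord` type and the `summarize` function are hypothetical names, the records are assumed to be already parsed (the layout of the history log itself is not described here), and it assumes that every failed migration counts as exactly one error and that Total_time covers all attempts.

```python
from dataclasses import dataclass


@dataclass
class MigrationRecord:
    """One already-parsed history entry (hypothetical shape, not the real log format)."""
    outgoing: bool      # True: VM left this ESX server; False: VM arrived on it
    succeeded: bool     # False: the migration ended in an error
    total_secs: float   # duration of this migration attempt


def summarize(records):
    """Aggregate per-migration records into the counters named in the table."""
    succeeded = [r for r in records if r.succeeded]
    return {
        "Total_migrations": len(records),
        "Migrations_to_other_servers": sum(1 for r in succeeded if r.outgoing),
        "Migrations_from_other_servers": sum(1 for r in succeeded if not r.outgoing),
        # Assumption: one error per failed migration.
        "Migration_errors": sum(1 for r in records if not r.succeeded),
        # Assumption: failed attempts also contribute to the total time.
        "Total_time": sum(r.total_secs for r in records),
    }
```

Whether the real test counts failed attempts in Total_migrations or Total_time is not stated in this document, so treat those two lines as placeholders to adjust.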
| Measurement |
Description |
Measurement Unit |
Interpretation |
| Total_migrations |
The total number of migrations that occurred on the ESX server during the last measurement period |
Number |
  |
| Migrations_to_other_servers |
The total number of virtual machines that were successfully migrated from this ESX server to other servers during the last measurement period |
Number |
  |
| Migrations_from_other_servers |
The total number of virtual machines that were successfully migrated from other ESX servers to this server during the last measurement period |
Number |
  |
| Migration_errors |
The total number of migration errors that occurred on the ESX server during the last measurement period |
Number |
Ideally, this value should be 0. |
| Total_time |
The total time taken by all migrations |
Secs |
The total migration time is the interval between the moment a migration is initiated and the moment the original VM can finally be discarded, at which point the source host can be taken down for maintenance, upgrade, or repair. If the value of this measure is very high, investigate where the migration process is spending its time. Comparing the values of the Precopy_time, Cpt_transfer_time, Cpt_load_time, Page_in_time, and Downtime measures gives a fair idea of where the migration bottleneck lies. |
| Precopy_time |
The total time taken by all migrations for pre-copying |
Secs |
In a pre-copy approach, pages of memory are iteratively copied from the source host to the destination host without stopping the execution of the virtual machine being migrated. Because pre-copying is resource-intensive, VMware administrators need to determine how much time it should consume and when it should be stopped. If the VM being migrated never modifies memory, a single pre-copy of each memory page suffices to transfer a consistent image to the destination. However, if the VM continuously dirties pages faster than they can be copied, all pre-copy work is in vain, and the migration should immediately stop the VM and copy the remaining pages. |
| Cpt_transfer_time |
The total time taken by all migrations for copying checkpoints to the destination host |
Secs |
Typically, live migration involves creating a checkpoint of the entire state of a virtual machine while it is running, copying the checkpoint to another host, starting a second copy of the virtual machine on that host, and then stopping the first copy on the original host. Very high values of the Cpt_transfer_time and Cpt_load_time measures indicate that copying checkpoints to the destination host and loading them there took very long. |
| Cpt_load_time |
The total time taken by all migrations for loading checkpoints |
Secs |
See the interpretation of the Cpt_transfer_time measure. |
| Page_in_time |
The time taken to transfer memory pages to a destination host while migrating VMs |
Secs |
A very high value of this measure could indicate that page transfer is taking too much time. |
| Downtime |
The total downtime during migration |
Secs |
Downtime is the period during which the service is unavailable because no instance of the VM is executing. An effective migration minimizes downtime, so a very high value of this measure indicates problems in migration that require further investigation. |
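The comparison suggested in the interpretation of Total_time can be sketched as a simple ranking of the phase measures. This is a minimal illustration, not part of the test itself: `phase_shares` is a hypothetical helper, and the input values are whatever the test reported for one measurement period. Shares are computed against the sum of the phase measures rather than against Total_time, because this document does not state that the phases add up exactly to Total_time.

```python
# Phase measures named in the table above.
PHASES = (
    "Precopy_time",
    "Cpt_transfer_time",
    "Cpt_load_time",
    "Page_in_time",
    "Downtime",
)


def phase_shares(measures):
    """Return (phase, seconds, share of the phase sum) tuples, slowest first.

    `measures` maps measure names to seconds; missing phases count as 0.
    """
    values = {name: float(measures.get(name, 0.0)) for name in PHASES}
    total = sum(values.values())
    if total == 0:
        return []  # nothing was migrated, so there is nothing to rank
    ranked = sorted(values.items(), key=lambda item: item[1], reverse=True)
    return [(name, secs, secs / total) for name, secs in ranked]
```

The first tuple in the result names the phase that dominates migration time and is the natural place to start investigating.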
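The pre-copy reasoning in the Precopy_time row can be made concrete with a toy model. The sketch below assumes constant copy and dirty rates, which real workloads rarely have, and every name in it (`precopy_rounds`, `stop_threshold`, `max_rounds`) is hypothetical rather than a VMware setting. Each round resends the pages dirtied during the previous round; if the dirty rate is at least the copy rate, the set of pages to resend never shrinks.

```python
def precopy_rounds(total_pages, copy_rate, dirty_rate, stop_threshold, max_rounds=30):
    """Simulate iterative pre-copy with constant rates (pages per second).

    Returns (converged, rounds_run, pages_left_for_stop_and_copy).
    """
    if copy_rate <= 0:
        raise ValueError("copy_rate must be positive")
    to_send = total_pages
    for rounds in range(1, max_rounds + 1):
        seconds = to_send / copy_rate
        # Pages dirtied while this round was being copied, capped at memory size.
        to_send = min(total_pages, dirty_rate * seconds)
        if to_send <= stop_threshold:
            return True, rounds, to_send
        if dirty_rate >= copy_rate:
            # The dirty set cannot shrink, so further rounds are wasted work.
            return False, rounds, to_send
    return False, max_rounds, to_send
```

With 1000 pages, a copy rate of 100 pages per second and a dirty rate of 10, the pages left to send fall by a factor of ten each round. With a dirty rate of 150 the model gives up after the first round, which matches the advice to stop and copy immediately.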