eG Monitoring
 

Measures reported by VmhCpuTest

This test reports real-time CPU utilization statistics for every ESX server world, and helps identify the worlds that consume excessive CPU resources. A world is a critical process executing on the VMKernel. Typically, this test reports the CPU usage of the following worlds:

  • console - The world for the service console; it always runs on physical CPU.
  • system - The worlds needed to perform various system services.
  • drivers - The worlds needed to execute drivers.

| Measurement | Description | Measurement Unit | Interpretation |
|---|---|---|---|
| Used | Indicates the percentage of physical CPU used by this world. | Percent | A very high value for this measure indicates excessive CPU utilization by the world. The CPU utilization may be high because a few processes are consuming a lot of CPU, or because too many processes are contending for a limited resource. A high value for the console world indicates that one or more service console processes are consuming CPU resources excessively. The detailed diagnosis capability, which is available only for the console world, lists the CPU-intensive console processes. Similarly, a significant spike in the value for the system world indicates that one or more system services are taking up considerable CPU resources. Likewise, abnormalities in the CPU usage of the drivers world indicate that a driver-related process is the culprit. |
| Cpu_usage_Mhz | Indicates the CPU usage of this world in MHz. | MHz | |
| System | Indicates the percentage of time this world spent at the ESX VMKernel to process interrupts and to perform other system activities. | Percent | An unusually high value indicates a problem and may be due to too many system-level tasks executing simultaneously. |
| Busy_wait | Indicates the percentage of time the world spent in the wait state, i.e., IDLE, waiting for interrupt, etc. | Percent | While the "Ready" metric denotes the time when the world is waiting for CPU to execute, the busy wait state represents the time when the world is waiting for some event to happen before it is ready to execute. |
| Max_limited | Indicates the percentage of time the ESX Server VMKernel deliberately did not run the world because that would violate the world's limit setting. | Percent | A Limit refers to the maximum CPU the host can make available to this world. When a world is ready to run but is prevented from running because doing so would violate its limit, that time is reported by this measure and is not included in the value of the Ready measure. A high value for this metric indicates that the world has been trying to get additional CPU resources but has not been able to do so because the CPU usage for this world or its resource pool (if it is part of a resource pool) has reached the allocated limit. |
| Ready | Indicates the percentage of time the world was ready to run (i.e., it had instructions to execute) but was not able to because of processor contention. | Percent | This metric should typically be low, generally 5% or less. The more time a world spends waiting to run, the more lag there is in responsiveness within the world. |
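
The interpretation guidance for these measures can be expressed as a simple check. The following is a minimal sketch, assuming the measures for one world have already been collected into a dictionary keyed by measure name. The function name `flag_cpu_issues` and the thresholds for Used and Max_limited are illustrative assumptions; only the 5% guideline for Ready comes from the description above.

```python
# Only READY_MAX_PCT reflects the documented guideline (5% or less).
READY_MAX_PCT = 5.0
USED_HIGH_PCT = 90.0          # assumption: what counts as a "very high" Used value
MAX_LIMITED_HIGH_PCT = 10.0   # assumption: what counts as a "high" Max_limited value


def flag_cpu_issues(world, measures):
    """Return a list of warnings for one world's CPU measures (all in percent)."""
    warnings = []
    if measures.get("Ready", 0.0) > READY_MAX_PCT:
        warnings.append(f"{world}: Ready above {READY_MAX_PCT}% (processor contention)")
    if measures.get("Max_limited", 0.0) > MAX_LIMITED_HIGH_PCT:
        warnings.append(f"{world}: held back by its CPU limit")
    if measures.get("Used", 0.0) > USED_HIGH_PCT:
        warnings.append(f"{world}: excessive CPU utilization")
    return warnings


# Hypothetical sample values, not real measurements.
sample = {"Used": 42.0, "Ready": 7.5, "Max_limited": 0.0}
print(flag_cpu_issues("console", sample))
```

Ready and Max_limited are checked separately because time lost to a limit is not counted in Ready, so the two measures point to different remedies: relieving processor contention in one case, revisiting the limit setting in the other.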

For the console world, the following additional measures are available:

| Measurement | Description | Measurement Unit | Interpretation |
|---|---|---|---|
| Run_queue_length | Indicates the instantaneous length of the queue in which threads are waiting for the processor cycle. This length does not include the threads that are currently being executed. | Number | A value consistently greater than 2 indicates that many processes could be simultaneously contending for the CPU resources. |
| Num_blocked_procs | Indicates the number of processes blocked for I/O, paging, etc. | Number | A high value could indicate an I/O problem on the host (e.g., a slow disk). |
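
Because Run_queue_length is an instantaneous value, a single reading above 2 says little; the guidance concerns values that are consistently high. The sketch below shows one way to test for that over a window of recent samples. The function name `run_queue_consistently_high` and the 80% cut-off for "consistently" are assumptions made for illustration; only the threshold of 2 comes from the table above.

```python
def run_queue_consistently_high(samples, threshold=2, min_fraction=0.8):
    """Return True if at least min_fraction of the samples exceed threshold.

    samples is a list of Run_queue_length readings from consecutive
    measurement periods. min_fraction is an assumed definition of
    "consistently", not a documented value.
    """
    if not samples:
        return False
    above = sum(1 for s in samples if s > threshold)
    return above / len(samples) >= min_fraction


# Hypothetical readings, not real measurements.
print(run_queue_consistently_high([3, 4, 3, 5, 3]))  # True: every sample exceeds 2
print(run_queue_consistently_high([0, 1, 6, 0, 1]))  # False: a single spike
```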

Note:

On multi-processor systems, only the per-processor CPU usage (i.e., the Used measure) is reported. The Total descriptor reports the average CPU usage across processors.
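
As an illustration of the Total descriptor, the sketch below averages per-processor Used values. It assumes that "average" means a simple arithmetic mean; the function name `total_used` and the processor labels are hypothetical.

```python
def total_used(per_cpu_used):
    """Return the arithmetic mean of per-processor Used percentages.

    per_cpu_used maps a processor label to its Used value in percent.
    """
    if not per_cpu_used:
        raise ValueError("no processor measurements supplied")
    return sum(per_cpu_used.values()) / len(per_cpu_used)


# Hypothetical values for a four-processor host.
per_cpu = {"cpu0": 80.0, "cpu1": 40.0, "cpu2": 20.0, "cpu3": 60.0}
print(total_used(per_cpu))  # 50.0
```

An average of this kind can mask a single saturated processor, so the per-processor values are worth checking alongside the Total.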