eG Monitoring
 

Measures reported by NutAhvComputeTest

The Nutanix Acropolis host and the VMs operating on it share the compute and storage resources of the Nutanix platform. If one or more VMs on a host hog these resources, the performance of the other VMs on that host suffers, and so does the performance of the host itself. Likewise, resource contention at the host level can adversely impact VM performance. To ensure that the host and its VMs perform at peak capacity at all times, administrators should track how the AHV host and its VMs use the physical resources, proactively detect potential resource contention, and precisely pinpoint its cause: is the AHV host itself using resources excessively, or are one or more VMs on the host resource-hungry? The NutAhvComputeTest test helps achieve this.

This test reports how the physical CPU and memory resources of an AHV host are used, and alerts you to erratic usage patterns. When resource usage is abnormal, the test also points you to the resource-hungry VMs on the host, and thus reveals what is causing the usage anomaly: the VMs, or resource-intensive processing at the host level.
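The triage described above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not the test's actual implementation: the `VmUsage` record, the `likely_cause` function, and both threshold values are hypothetical names and numbers chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class VmUsage:
    """Hypothetical record: one VM and its share of host CPU, in percent."""
    name: str
    cpu_percent: float


def likely_cause(host_cpu_percent, vms, host_threshold=90.0, vm_threshold=50.0):
    """Return (cause, vm_names) for a host CPU reading.

    Both thresholds are assumptions made for this sketch; they are not
    values defined by the test.
    """
    if host_cpu_percent < host_threshold:
        return "none", []
    # Mirror the detailed diagnosis: look only at the top-10 consumers.
    top10 = sorted(vms, key=lambda v: v.cpu_percent, reverse=True)[:10]
    hogs = [v.name for v in top10 if v.cpu_percent >= vm_threshold]
    if hogs:
        return "vm", hogs
    # No single VM stands out, so suspect host-level processing.
    return "host", []
```

For instance, a host at 97% with one VM using 72% would be attributed to that VM, while the same host reading with only lightly loaded VMs would point to host-level processing.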

The measures made by this test are as follows:

| Measurement | Description | Measurement Unit | Interpretation |
|---|---|---|---|
| CPU_CORES | Indicates the number of CPU cores on the host. | Number | |
| CPU_SOCKET | Indicates the number of CPU sockets on the host. | Number | |
| CPU_FREQUENCY | Indicates the frequency per CPU core. | GHz | |
| CPU_CAPACITY | Indicates the total CPU capacity of this host across all cores. | GHz | |
| CPU_USAGE | Indicates the percentage of CPU resources used by the host and the VMs. | Percent | A value close to 100% is a cause for concern, as it signals a potential CPU contention on the host. In such a case, use the detailed diagnosis of this measure to view the top-10 CPU-consuming VMs on the host. From this, you can instantly identify the VM that is hogging the CPU resources. If no VM appears to be consuming CPU excessively, then you can conclude that resource-intensive processing at the host-level is causing the contention. |
| MEM_CAPACITY | Indicates the total memory capacity of the host. | GB | |
| MEM_USED | Indicates the total amount of memory used by the VMs and the host. | GB | A low value is desired for this measure. |
| FREE_MEMORY | Indicates the amount of physical memory still unused on the host. | GB | A high value is desired for this measure. |
| MEMORY_USAGE | Indicates the percentage of memory used by the VMs and the host. | Percent | A value close to 100% is indicative of excessive memory utilization. In such a situation, use the detailed diagnosis of this measure to view the top-10 memory-consuming VMs on the host. From this, you can instantly identify the VM that is hogging the memory resources. If no VM appears to be consuming memory excessively, then you can conclude that memory-intensive processing at the host-level is causing the contention. |
| FREE_MEMORY_PERC | Indicates the percentage of memory that is still unused on the host. | Percent | Ideally, the value of this measure should be high. A consistent drop in this value is indicative of excessive memory usage. In such a case, use the detailed diagnosis of the Memory utilization measure to isolate the cause of the memory drain. |
| OPLOG_DISK_SIZE | Indicates the current size of the oplog. | GB | The oplog is similar to a filesystem journal and is built as a staging area to handle bursts of random writes, coalesce them, and then sequentially drain the data to the extent store. A portion of the metadata disk is reserved for the oplog, and you can change the size through the nCLI. |
| OPLOG_DISK_PERC | Indicates the percentage of allocated space that is used by the oplog. | Percent | A value close to 100% indicates that the oplog is running out of space. This can happen if data is rapidly written to the oplog but is not drained from the log just as quickly. You may want to consider resizing the oplog to ensure that there is always room for writing more data. |
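
The capacity, usage, and percentage measures above are plausibly related by simple arithmetic. The Python sketch below spells out one such reading. The formulas are assumptions inferred from the measure descriptions (for example, that CPU_CAPACITY is cores multiplied by per-core frequency, and that free memory is capacity minus used memory); the source does not state how the test actually computes these values, and the function names are hypothetical.

```python
def cpu_capacity_ghz(cpu_cores, cpu_frequency_ghz):
    """Assumed relation: total capacity = cores x per-core frequency."""
    return cpu_cores * cpu_frequency_ghz


def memory_measures(mem_capacity_gb, mem_used_gb):
    """Derive the free and percentage memory measures from capacity and usage.

    Assumes FREE_MEMORY = MEM_CAPACITY - MEM_USED, which the measure
    descriptions suggest but do not state outright.
    """
    if mem_capacity_gb <= 0:
        raise ValueError("memory capacity must be positive")
    usage_percent = 100.0 * mem_used_gb / mem_capacity_gb
    return {
        "FREE_MEMORY": mem_capacity_gb - mem_used_gb,
        "MEMORY_USAGE": usage_percent,
        "FREE_MEMORY_PERC": 100.0 - usage_percent,
    }
```

Under these assumptions, a host with 256 GB of memory and 192 GB in use would report 64 GB free, 75% memory usage, and 25% free memory; MEMORY_USAGE and FREE_MEMORY_PERC always sum to 100.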