eG Monitoring
 

Measures reported by IGELSystemTest

This test collects various metrics pertaining to the CPU and memory usage of every processor supported by each IGEL Endpoint.

Outputs of the test: One set of results for every IGEL Endpoint:processor combination

The measures made by this test are as follows:

Measurement | Description | Measurement Unit | Interpretation
Cpu_util | Indicates the percentage of CPU utilized by the processor. | Percent | A high value could signify a CPU bottleneck. CPU utilization may be high because a few processes are consuming a lot of CPU, or because too many processes are contending for a limited resource. The detailed diagnosis of this test reveals the top-10 CPU-intensive processes on the IGEL Endpoint.
System_cpu_util | Indicates the percentage of CPU time spent on system-level processing. | Percent | An unusually high value indicates a problem and may be due to too many system-level tasks executing simultaneously.
Run_queue_length | Indicates the instantaneous length of the queue in which threads wait for processor cycles. This length does not include the threads that are currently executing. | Number | A value consistently greater than 2 indicates that many processes could be simultaneously contending for the processor.
Num_blocked_procs | Indicates the number of processes blocked for I/O, paging, etc. | Number | A high value could indicate an I/O problem on the IGEL Endpoint (e.g., a slow disk).
Swap_memory | Denotes the committed amount of virtual memory. This corresponds to the space reserved for virtual memory in the on-disk paging file(s). | MB | An unusually high swap usage can indicate a memory bottleneck. Check the memory utilization of individual processes to identify those with the highest memory consumption, and tune their memory usage and allocations accordingly.
Free_memory | Indicates the free memory available. | MB | This measure typically indicates the amount of memory available for use by applications running on the target IGEL Endpoint.
Scan_rate | Indicates the memory scan rate. | Pages/Sec | A high value is indicative of memory thrashing. Excessive thrashing can be detrimental to the performance of the IGEL Endpoint.
Steal_time | Indicates the percentage of time a virtual processor waits for a real CPU while the hypervisor is servicing another virtual processor. | Percent | This measure is applicable only to Windows VMs provisioned via VMware vSphere ESX.

A low value is desired for this measure.

A high value for this measure indicates that a particular virtual processor is waiting longer for real CPU resources. If this condition is left unattended, it can stall the tasks performed by the virtual processor and cause its overall performance to deteriorate significantly, badly impacting the user experience with the target server.

The impact of stolen CPU always manifests as slowness, but it can also have more profound effects on your infrastructure. Here are some examples:

Slower page load times

Slower database query times

Slower processing of reports

Increased queue size of asynchronous tasks because of an inability to process them quickly

Increased IaaS bill due to launching more servers to handle the same amount of load

To avoid such eventualities, administrators should either terminate the virtual machine immediately and launch a replacement, or upgrade the VM with more CPU resources.
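
The table above flags a Run_queue_length value "consistently greater than 2" as a sign of processor contention. The following Python sketch shows one possible way to apply such a rule to a series of collected readings. It is only an illustration: the function name, the sample values, and the choice to treat "consistently" as "every one of the most recent 5 readings" are assumptions, not part of the eG product or of this test.

```python
from typing import Sequence


def consistently_above(samples: Sequence[float], threshold: float, window: int = 5) -> bool:
    """Return True if each of the most recent `window` samples exceeds `threshold`.

    The window size is an assumed interpretation of "consistently";
    the table itself does not define one.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if len(samples) < window:
        # Not enough history to call the condition consistent.
        return False
    return all(value > threshold for value in samples[-window:])


# Hypothetical Run_queue_length readings, oldest first.
sustained = [1, 3, 4, 3, 5, 4]
single_spike = [1, 1, 6, 1, 1, 1]

print(consistently_above(sustained, threshold=2))     # sustained contention
print(consistently_above(single_spike, threshold=2))  # isolated spike only
```

Requiring every reading in the window to exceed the threshold keeps a single spike from being reported as contention. A shorter window reacts faster but is more sensitive to brief bursts.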