eG Monitoring
 

Measures reported by VmgWinMemoryTest

To understand the metrics reported by this test, it is essential to understand how memory is handled by the Operating System. On any system, memory is partitioned into a part that is available for user processes, and another that is available to the OS kernel. The kernel memory area is divided into several parts, with the two major parts (called “pools”) being a nonpaged pool and a paged pool. The nonpaged pool is a section of memory that cannot, under any circumstances, be paged to disk. The paged pool is a section of memory that can be paged to disk. (Just being stored in the paged pool doesn't necessarily mean that something has been paged to disk. It just means that it has either been paged to disk or it could be paged to disk.) Sandwiched directly in between the nonpaged and paged pools (although technically part of the nonpaged pool) is a section of memory called the “System Page Table Entries”, or “System PTEs”. The VmgWinMemoryTest tracks critical metrics corresponding to the System PTEs and the pool areas of kernel memory.

Measurement Description Measurement Unit Interpretation
Free_sys_page_table Indicates the number of page table entries not currently in use by the guest. Number The maximum number of System PTEs that a server can have is set when the server boots. On heavily used servers, you can run out of System PTEs. You can use the registry to increase the number of System PTEs, but doing so encroaches on the paged pool area, and you could then run out of paged pool memory. Running out of either one is bad, and the goal should be to tune your server so that you run out of both at the exact same time. Typically, the value of this metric should be above 3000.
Page_read_rate Indicates the average number of times per second the disk was read to resolve hard page faults. Reads/Sec A hard page fault occurs when a program requests data that is not in physical memory. In this case, the operating system finds the data on disk and restores it to physical memory.

By tracking the variations in this measure over time, you can keep tabs on hard page faults.
Page_write_rate Indicates the average number of times per second pages are written to disk to free up physical memory. Writes/Sec
Page_input_rate Indicates the number of times per second that a process needed to access a piece of memory that was not in its working set, meaning that the guest had to retrieve it from the page file. Pages/Sec  
Page_output_rate Indicates the number of times per second the guest decided to trim a process's working set by writing some memory to disk in order to free up physical memory for another process. Pages/Sec This value is a critical measure of the memory utilization on a guest. If this value never increases, then there is sufficient memory in the guest. Instantaneous spikes of this value are acceptable, but if the value itself starts to rise over time or with load, it implies that there is a memory shortage on the guest.
Pool_nonpaged_data Indicates the total size of the kernel memory nonpaged pool. MB The kernel memory nonpaged pool is an area of guest memory (that is, memory used by the guest operating system) for kernel objects that cannot be written to disk, but must remain in memory as long as the objects are allocated. Typically, there should be no more than 100 MB of nonpaged pool memory being used.
Pool_paged_data Indicates the total size of the paged pool. MB If the paged pool starts to run out of space (when it is 80% full by default), the guest will automatically take some memory away from the System File Cache and give it to the paged pool.

This makes the System File Cache smaller. However, the System File Cache is critical, and so it will never reach zero. Hence, a significant increase in the paged pool size is a problem.

This metric is a useful indicator of memory leaks in a guest. A memory leak occurs when the guest allocates more memory to a process than the process gives back to the pool. Any type of process can cause a memory leak. If the amount of paged pool data keeps increasing even though the workload on the guest remains constant, it is an indicator of a memory leak.
Commited_bytes_in_use Indicates the committed bytes as a percentage of the Commit Limit. Percent Whenever this measure exceeds 80-90%, the guest is close to its Commit Limit, and application requests to allocate virtual memory (which is backed by physical memory and the page file) are at risk of failing. This ratio can be reduced by increasing the physical memory or the page file.
Pool_nonpaged_failures Indicates the number of times allocations from the nonpaged pool have failed. Number Generally, a non-zero value indicates a shortage of physical memory.
Pool_paged_failures Indicates the number of times allocations from the paged pool have failed. Number A non-zero value indicates a shortage of physical memory.
Copy_read_hits Indicates the percentage of read I/O that is served from the system cache rather than from disk. Percentage This is an important counter for applications like the Citrix Provisioning server that stream large volumes of data. If the RAM cache of the server is not sufficiently large, many of the I/O requests will be served from the disk, and not from the system cache. This will reduce performance. Hence, it is critical to monitor this metric. The higher the value, the better the performance you can expect from the server.
Copy_reads_sec Indicates the number of copy reads per second, which shows how many cache hits you are actually getting. Reads/Sec A copy read is a file read operation that is satisfied by a memory copy from a page in the cache to the application's buffer. The LAN redirector uses this method for retrieving information from the cache, as does the LAN server for small transfers. This method is also used by the disk file systems.
Page_fault_rate Indicates the rate at which page faults occurred. Faults/Sec Page faults occur in the threads executing in a process. A page fault occurs when a thread refers to a virtual memory page that is not in its working set in main memory. If the page is on the standby list and hence already in main memory, or if the page is in use by another process with which the page is shared, then the page fault will not cause the page to be fetched from disk. Excessive page faults could result in decreased performance.
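
The rules of thumb in the Interpretation column can be sketched as a small Python check. This is only an illustration: the function and constant names are hypothetical and not part of any eG interface, the input is assumed to be a plain dictionary keyed by the measure names listed above, and the 80% commit threshold is taken from the lower end of the 80-90% range. The "steadily growing" test is a deliberately crude stand-in for spotting a paged pool that keeps increasing under constant workload.

```python
from typing import Dict, List, Sequence

# Thresholds taken from the Interpretation column above. All names in this
# sketch are hypothetical; they are not part of any eG interface.
MIN_FREE_SYSTEM_PTES = 3000   # Free_sys_page_table should be above this
MAX_NONPAGED_POOL_MB = 100    # Pool_nonpaged_data should not exceed this
COMMIT_WARN_PERCENT = 80      # lower end of the 80-90% range


def committed_percent(committed_bytes: int, commit_limit_bytes: int) -> float:
    """Express committed bytes as a percentage of the commit limit."""
    if commit_limit_bytes <= 0:
        raise ValueError("commit limit must be positive")
    return 100.0 * committed_bytes / commit_limit_bytes


def check_sample(sample: Dict[str, float]) -> List[str]:
    """Return one warning per measure in the sample that breaches a rule of thumb.

    Measures missing from the sample are treated as healthy.
    """
    warnings: List[str] = []
    if sample.get("Free_sys_page_table", float("inf")) <= MIN_FREE_SYSTEM_PTES:
        warnings.append("Free_sys_page_table is not above 3000")
    if sample.get("Pool_nonpaged_data", 0) > MAX_NONPAGED_POOL_MB:
        warnings.append("Pool_nonpaged_data exceeds 100 MB")
    if sample.get("Commited_bytes_in_use", 0) > COMMIT_WARN_PERCENT:
        warnings.append("Commited_bytes_in_use exceeds 80%")
    for name in ("Pool_nonpaged_failures", "Pool_paged_failures"):
        if sample.get(name, 0) != 0:
            warnings.append(f"{name} is non-zero")
    return warnings


def steadily_growing(values: Sequence[float], min_samples: int = 4) -> bool:
    """True if each value is larger than the one before it.

    Applied to successive Pool_paged_data readings taken under a constant
    workload, this is a crude signal of a possible memory leak.
    """
    if len(values) < min_samples:
        return False
    return all(later > earlier for earlier, later in zip(values, values[1:]))


if __name__ == "__main__":
    # Made-up readings, purely for illustration.
    reading = {"Free_sys_page_table": 2500, "Pool_paged_failures": 2}
    for warning in check_sample(reading):
        print(warning)
    print(steadily_growing([100, 110, 125, 140]))
```

A real deployment would more likely rely on thresholds configured in the monitoring tool itself; the sketch only makes the decision rules explicit.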