eG Monitoring
 

Measures reported by FCCluSharedVolTest

A Cluster Shared Volume (CSV) is a disk or pool of disks that is accessible by each node in a Hyper-V cluster as if it were a logical disk on the system. Each node in the cluster can connect to the CSV simultaneously. This gives you a common storage location for the VM disk and machine configuration, which can be passed to another node in the event of a node failure without manually mounting a volume or copying files.

To use CSV, a Hyper-V VM is configured and the associated virtual hard disk(s) are created on or copied to a CSV disk. Multiple VHDs can be placed on a CSV that in turn are associated with multiple VMs which can be running on different nodes in the cluster.

Since multiple VMs access a CSV simultaneously, the I/O load on the CSV is bound to increase with the number of VMs sharing it. To maximize CSV and VM performance, administrators should make sure that the I/O load is evenly distributed across the CSVs. To keep an eye on the I/O load on each CSV and to instantly identify overloaded CSVs, administrators can use the FCCluSharedVolTest test.

This test auto-discovers the CSVs and closely monitors the I/O load on each CSV, measures the rate at which every CSV processes the load, and thus points to those CSVs that are overloaded or are experiencing processing bottlenecks.
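As an illustration of what "overloaded" could mean when the load is compared across CSVs, here is a minimal Python sketch. It is not how the test itself is implemented: the function name `overloaded_csvs`, the input dictionary, and the `factor` threshold of 1.5 are all hypothetical, chosen only to show the idea of flagging a CSV whose load is well above the average of its peers.

```python
from statistics import mean


def overloaded_csvs(load_by_csv, factor=1.5):
    """Return the names of CSVs whose load exceeds `factor` times the mean load.

    `load_by_csv` maps a CSV name to a load figure (for example, a throughput
    value in Kbps). Both the mapping and the threshold are assumptions made
    for this example.
    """
    if not load_by_csv:
        return []
    average = mean(load_by_csv.values())
    return sorted(
        name for name, load in load_by_csv.items() if load > factor * average
    )


# Hypothetical readings for three CSVs.
loads = {"Volume1": 2000.0, "Volume2": 500.0, "Volume3": 500.0}
print(overloaded_csvs(loads))
```

A relative threshold is used here because an even distribution is the stated goal; an absolute limit per CSV would be an equally reasonable choice.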

Note:

This test is only applicable to Microsoft Hyper-V servers running Windows 2008.

Outputs of the test: One set of results will be reported for every CSV on the server.

The measures made by this test are as follows:

Measurement Description Measurement Unit Interpretation
Read_throughput Indicates the rate at which this CSV reads data from the disk in the Direct I/O Mode or in the Block Level Redirected I/O Mode. Kbps These measures include both Direct I/O and Block Level Redirected I/O. In Direct I/O Mode, I/O operations from the application on the cluster node can be sent directly to the storage. They therefore bypass the NTFS or ReFS volume stack. In Block Level Redirected I/O Mode, I/O passes through the local CSVFS proxy file system stack and is written directly to Disk.sys on the coordinator node. As a result, it avoids traversing the NTFS/ReFS file system stack twice.

The technologies that let CSV-enabled volumes operate require one cluster node that's responsible for the coordination of file access. This cluster node is called the coordinator node, with each individual LUN having its own coordinator node.

If the node being monitored is a coordinator node, then these measures include the following:

  • the rate at which this CSV reads/writes (as the case may be) data directly to the storage, in the Direct I/O Mode.

  • the rate at which this CSV reads/writes I/O redirected by all slave nodes in the cluster directly to the storage, in the Block Level Redirected I/O Mode.

If the node being monitored is a non-coordinator node, then these measures include the following:

  • the rate at which this CSV reads/writes (as the case may be) data directly to the storage, in the Direct I/O Mode.

  • the rate at which this CSV reads/writes (as the case may be) I/O to the disk by redirecting the I/O to the coordinator node, in the Block Level Redirected I/O Mode.

Write_throughput Indicates the rate at which this CSV writes data to the disk in the Direct I/O Mode or in the Block Level Redirected I/O Mode. Kbps
Total_throughput Indicates the rate at which this CSV reads data from and writes data to the disk in the Direct I/O Mode or in the Block Level Redirected I/O Mode. Kbps The value of this measure is the sum of the values of the Read_throughput and Write_throughput measures.

This is a good indicator of the level of direct I/O activity on a CSV. By comparing the value of this measure across CSVs, you can figure out which CSV is experiencing the maximum direct traffic. If this maximum value is abnormally high for that CSV, you may want to investigate the cause.
Redir_read_throughput If the node being monitored is a coordinator node, then this measure indicates the rate at which this CSV reads data from the physical disk via NTFS, in the File System Redirected Mode. If the node being monitored is a non-coordinator node, then this measure indicates the rate at which this CSV reads data from the disk by redirecting the I/O to the coordinator node via SMB, in the File System Redirected Mode. Kbps The technologies that let CSV-enabled volumes operate require one cluster node that's responsible for the coordination of file access. This cluster node is called the coordinator node, with each individual LUN having its own coordinator node.

That node can be any of your cluster hosts, with each host having an equal chance of being given the job. While this responsibility doesn't come into play often—typically, Hyper-V interacts with its disk files directly, not necessarily through a coordinator node—it's important for certain types of actions. One of those actions is copying VHD files to a LUN. Hyper-V transparently redirects the file copy through the coordinator node.

I/O redirection can also occur if slave nodes in a cluster are unable to access the disk directly. In this case, the slave nodes will redirect the I/O to the coordinator node via the SMB Client protocol. The coordinator node then processes the redirected I/O it receives using the SMB Server protocol. This redirection is performed in the File System Redirected Mode only. In File System Redirected Mode, I/O on a cluster node is redirected at the top of the CSV pseudo-file system stack over SMB to the disk. This traffic is written to the disk via the NTFS or ReFS file system stack on the coordinator node.

From this, we can conclude that for a CSV attached to a coordinator node, the value of the Redir_read_throughput measure will represent the rate at which the read I/Os redirected by all slave nodes in the cluster are received and processed by this CSV in the File System Redirected Mode. For a CSV on a slave/non-coordinator node, the value of this measure will indicate the rate at which that CSV redirected the read I/Os to the coordinator node and read data from the disk. In the case of a slave node, the value of this measure will also include the rate at which VHD files are read from that CSV to be written/copied to a CSV on the coordinator node.

The value of the Redir_write_throughput measure for a CSV attached to a coordinator node will include:

  • the rate at which the write I/Os redirected by all slave nodes in the cluster are received and processed by this CSV in the File System Redirected Mode.

  • the rate at which the VHD files are copied to the LUN.

For a slave/non-coordinator node, on the other hand, the value of the Redir_write_throughput measure will represent only the rate at which that CSV redirects write I/Os to the coordinator node and writes data to the disk, in the File System Redirected Mode.

Redir_write_throughput If the node being monitored is a coordinator node, then this measure indicates the rate at which this CSV writes data to the physical disk via NTFS, in the File System Redirected Mode. If the node being monitored is a non-coordinator node, then this measure indicates the rate at which this CSV writes data to the disk by redirecting the I/O to the coordinator node via SMB, in the File System Redirected Mode. Kbps
Redir_throughput If the node being monitored is a coordinator node, then this measure indicates the rate at which this CSV reads and writes data on the physical disk via NTFS, in the File System Redirected Mode. If the node being monitored is a non-coordinator node, then this measure indicates the rate at which this CSV reads and writes data on the disk by redirecting the I/O to the coordinator node via SMB, in the File System Redirected Mode. Kbps This is the sum of the values of the Redir_read_throughput and Redir_write_throughput measures.

This is a good indicator of the level of redirected I/O activity on a CSV. By comparing the value of this measure across CSVs, you can figure out which CSV is experiencing the maximum redirected traffic. If this maximum value is abnormally high for that CSV, you may want to investigate the cause.
Redir_read_iops If the node being monitored is a coordinator node, then this measure indicates the number of read operations performed by this CSV via NTFS, in the File System Redirected Mode. If the node being monitored is a non-coordinator node, then this measure indicates the number of read operations performed by this CSV by redirecting the read requests to the coordinator node via SMB, in the File System Redirected Mode. Number
Redir_write_iops If the node being monitored is a coordinator node, then this measure indicates the number of CSV writes to the disk via NTFS, in the File System Redirected Mode. If the node being monitored is a non-coordinator node, then this measure indicates the number of CSV writes to the disk by redirecting the write requests to the coordinator node via SMB, in the File System Redirected Mode. Number
Redir_iops Indicates the number of I/O reads and writes performed by this CSV on the disk via NTFS, in the File System Redirected Mode. Number The value of this measure is the sum of the values of the Redir_read_iops and Redir_write_iops measures.

This is a good indicator of the level of I/O activity in the File System Redirected Mode.
Csv_read_iops Indicates the number of disk reads performed by this CSV in the Direct I/O Mode or in the Block Level Redirected I/O Mode. Reads/Sec These measures include both Direct I/O and Block Level Redirected I/O. In Direct I/O Mode, I/O operations from the application on the cluster node can be sent directly to the storage. They therefore bypass the NTFS or ReFS volume stack. In Block Level Redirected I/O Mode, I/O passes through the local CSVFS proxy file system stack and is written directly to Disk.sys on the coordinator node. As a result, it avoids traversing the NTFS/ReFS file system stack twice.

If the node being monitored is a coordinator node, then these measures will include the following:

  • the number of read/write operations (as the case may be) performed by this CSV directly on the storage, in the Direct I/O Mode.

  • the number of read/write operations (as the case may be) performed by this CSV in response to read/write requests redirected to it by all slave nodes in the cluster, in the Block Level Redirected I/O Mode.

If the node being monitored is a non-coordinator node, then these measures will include the following:

  • the number of read/write operations (as the case may be) performed by this CSV directly on the storage, in the Direct I/O Mode.

  • the number of read/write operations (as the case may be) performed by this CSV by redirecting read/write requests to the coordinator node, in the Block Level Redirected I/O Mode.

Csv_write_iops Indicates the number of write operations performed by this CSV in the Direct I/O Mode or in the Block Level Redirected I/O Mode. Writes/Sec
Csv_iops Indicates the total number of I/O operations performed by this CSV in the Direct I/O Mode or in the Block Level Redirected I/O Mode. Operations/Sec The value of this measure is the sum of the values of the Csv_read_iops and Csv_write_iops measures.

This is a good indicator of the level of I/O activity on the CSV in the Direct I/O Mode or in the Block Level Redirected I/O Mode.
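The sum relationships and the cross-CSV comparison described in the table above can be sketched in Python as follows. This is only an illustration under stated assumptions: the `CsvSample` class, its field names (which mirror the measure names in lower case), the `busiest` helper, and the sample values are all hypothetical and do not represent an eG API or real readings.

```python
from dataclasses import dataclass


@dataclass
class CsvSample:
    """One set of results for a single CSV (hypothetical container)."""
    name: str
    read_throughput: float         # Kbps, Direct / Block Level Redirected I/O
    write_throughput: float        # Kbps, Direct / Block Level Redirected I/O
    redir_read_throughput: float   # Kbps, File System Redirected Mode
    redir_write_throughput: float  # Kbps, File System Redirected Mode

    @property
    def total_throughput(self) -> float:
        # Total_throughput = Read_throughput + Write_throughput
        return self.read_throughput + self.write_throughput

    @property
    def redir_throughput(self) -> float:
        # Redir_throughput = Redir_read_throughput + Redir_write_throughput
        return self.redir_read_throughput + self.redir_write_throughput


def busiest(samples, measure):
    """Return the sample with the highest value of the named measure."""
    return max(samples, key=lambda sample: getattr(sample, measure))


# Made-up values for two CSVs, used only to exercise the helpers.
samples = [
    CsvSample("Volume1", 1200.0, 800.0, 10.0, 5.0),
    CsvSample("Volume2", 300.0, 200.0, 400.0, 250.0),
]
print(busiest(samples, "total_throughput").name)
print(busiest(samples, "redir_throughput").name)
```

The same pattern would apply to the Redir_iops and Csv_iops measures, which the table also defines as sums of their read and write components.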