Measures reported by XenDiskIOTest
XenServer provides support for a broad range of storage hardware. The term Storage Repository (SR) is used to describe a particular storage target on which Virtual Disk Images (VDIs) are stored. A VDI is a disk abstraction that contains the contents of a disk as presented to a virtual machine. XenServer allows these VDIs to be supported on a large number of SR types, including local disks, NFS filers, Fibre Channel disks and shared iSCSI LUNs. The SR abstraction allows advanced storage features such as thin provisioning, VDI snapshots, and fast cloning to be exposed on storage targets that support them.
If a XenServer host cannot read from or write to an SR, or takes too long to do so, the provisioning and maintenance of virtual disk images (creation, deletion, cloning, connecting, resizing, etc.) can be unduly delayed. This, in turn, can significantly slow down VM accesses. To ensure that the user experience with VMs remains good, administrators should continuously monitor the I/O throughput of each storage repository (SR) supported by a XenServer host and quickly isolate the slow SRs. This is where the XenDiskIOTest test helps. By continuously measuring and reporting how well each SR handles read and write requests, this test pinpoints slow SRs, prompting administrators to investigate the reasons for the slowness and fix them.
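As a rough illustration of isolating a slow SR, the sketch below ranks SRs by their latest throughput reading. The SR names and values are made up for the example, and `rank_slowest` is a hypothetical helper, not part of XenServer or of this test.

```python
from typing import Dict, List

# Hypothetical latest throughput readings (MB/Sec) per SR; names and values are invented.
samples: Dict[str, Dict[str, float]] = {
    "local-sr": {"tot_throughputs": 120.0},
    "nfs-sr": {"tot_throughputs": 35.0},
    "iscsi-sr": {"tot_throughputs": 80.0},
}


def rank_slowest(readings: Dict[str, Dict[str, float]], measure: str) -> List[str]:
    """Return SR names ordered from the lowest to the highest value of `measure`."""
    return sorted(readings, key=lambda sr: readings[sr][measure])


ranking = rank_slowest(samples, "tot_throughputs")
print("SR with the lowest throughput:", ranking[0])
```

Sorting in ascending order puts the SR with the lowest throughput first, which is the one worth investigating before the others.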
Note:
The performance metrics reported by this test are enabled by default in the XenServer 6.1.0 Performance and Monitoring Supplemental Pack. In XenServer 6.2.0, however, these metrics are part of the core product but are disabled by default for performance reasons related to XenCenter. This means that, when monitoring XenServer 6.2.0, this test will not report any metrics by default. To make sure that the test reports metrics in such cases, do the following:
Log in to the XenServer host as the root user.
Enable the metrics by issuing the following command from the CLI:
xe-enable-all-plugin-metrics true
The measures made by this test are as follows:
| Measurement | Description | Measurement Unit | Interpretation |
|---|---|---|---|
| tot_throughputs | Indicates the throughput of this SR. | MB/Sec | A high value indicates high throughput and rapid I/O processing by the SR. Compare the value of this measure across SRs to identify the SR with the lowest throughput. |
| read_throughputs | Indicates the rate at which the host reads data from this SR. | MB/Sec | Ideally, the value of this measure should be high. A consistent drop in the value of this measure indicates a read bottleneck in the SR. Compare the value of this measure across SRs to identify the SR that is slowest in processing read requests. |
| write_throughputs | Indicates the rate at which the host writes data to this SR. | MB/Sec | Ideally, the value of this measure should be high. A consistent drop in the value of this measure indicates a write bottleneck in the SR. Compare the value of this measure across SRs to identify the SR that is slowest in processing write requests. |
| tot_io_requests | Indicates the rate at which I/O operations are performed by this SR. | Requests/Sec | This measure is a good indicator of the I/O processing capacity of the SR, so a high value is desired. A consistent drop in this value could indicate a processing bottleneck. In such a situation, compare the values of the read_io_requests and write_io_requests measures of the corresponding SR to determine whether the bottleneck lies in reading data from the SR or in writing to it. |
| read_io_requests | Indicates the rate at which this SR services read requests. | Requests/Sec | Ideally, the value of this measure should be high. A steady drop in this value indicates a slowdown in processing read requests. Compare the value of this measure across SRs to find out which SR is the slowest in responding to read requests. |
| write_io_requests | Indicates the rate at which this SR services write requests. | Requests/Sec | Ideally, the value of this measure should be high. A steady drop in this value indicates a slowdown in processing write requests. Compare the value of this measure across SRs to find out which SR is the slowest in responding to write requests. |
| iowait | Indicates the percentage of time the host's CPU was waiting for this SR to complete I/O processing. | Percent | A high value for this measure indicates that the SR is taking too long to complete I/O processing. This hints at a probable processing bottleneck with the SR. |
| latency | Indicates the average time taken by this SR to process I/O requests. | Milliseconds | A high value for this measure is a cause for concern, as it indicates that the SR takes too long to process I/O. Compare the value of this measure across SRs to identify the SR with the highest latency. |
| queue_size | Indicates the average number of I/O requests to this SR that are in queue for processing. | Number | If the value of this measure grows consistently, it indicates that the SR is unable to process requests quickly enough to clear the queue. The SR with the maximum number of queued requests could be experiencing a serious I/O processing bottleneck. To identify this SR, compare the value of this measure across SRs. |
| in_flight | Indicates the number of I/O requests to this SR that are currently being processed. | Number | |
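The interpretation guidance for tot_io_requests and queue_size can be sketched in code. The example below is only an illustration under stated assumptions: the readings are invented, `is_monotonic` and `locate_bottleneck` are hypothetical helpers, and a "consistent" drop or growth is assumed to mean a strict fall or rise across the last four readings, since the test itself does not define a threshold.

```python
from typing import Dict, List, Sequence


def is_monotonic(values: Sequence[float], window: int = 4, rising: bool = False) -> bool:
    """True if the last `window` readings strictly fall (or strictly rise when rising=True)."""
    recent = list(values)[-window:]
    if len(recent) < window:
        return False  # not enough history to call it a trend
    pairs = list(zip(recent, recent[1:]))
    if rising:
        return all(later > earlier for earlier, later in pairs)
    return all(later < earlier for earlier, later in pairs)


def locate_bottleneck(history: Dict[str, Sequence[float]]) -> List[str]:
    """Apply the table's guidance to one SR's readings (oldest first)."""
    findings: List[str] = []
    # Only look at reads vs. writes once the overall request rate is dropping.
    if is_monotonic(history["tot_io_requests"]):
        if is_monotonic(history["read_io_requests"]):
            findings.append("reads")
        if is_monotonic(history["write_io_requests"]):
            findings.append("writes")
    # A steadily growing queue means requests arrive faster than they are cleared.
    if is_monotonic(history["queue_size"], rising=True):
        findings.append("queue growing")
    return findings


# Invented readings for a single SR, oldest first.
history = {
    "tot_io_requests": [900, 850, 700, 620],
    "read_io_requests": [500, 500, 505, 498],
    "write_io_requests": [400, 350, 195, 122],
    "queue_size": [2, 5, 9, 14],
}

print(locate_bottleneck(history))
```

With these sample readings, the total request rate falls steadily while reads stay roughly flat, so the sketch attributes the slowdown to writes and also flags the growing queue. A real deployment would likely need a tolerance for noise instead of a strict comparison.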