eG Monitoring
 

Measures reported by IBMV7000DriveTest

IBM Storwize V7000 enclosures currently support SSD, SAS, and Nearline SAS drive types. Each SAS drive has two ports (two PHYs), and I/O can be issued down both paths simultaneously. If even a single drive lags behind in I/O processing, the overall I/O performance of the storage system will suffer. It is therefore imperative that administrators watch for slowness in drives and proactively detect potential I/O processing bottlenecks, so that end users do not have to deal with slowness when reading from or writing to the storage system. The IBMV7000DriveTest test helps administrators with this.

For each drive in the IBM Storwize V7000 storage system, this test reports the load on the drive and how well the drive handles that load. This way, overloaded drives and those experiencing processing slowdowns can be identified quickly and accurately.

The measures reported by this test are as follows:

| Measurement | Description | Measurement Unit | Interpretation |
|---|---|---|---|
| RO_drive | Indicates the rate at which read operations were performed on this drive. | Reads/Sec | Comparing the values of RO_drive and WO_drive across drives helps you identify overloaded drives and could shed light on irregularities in load balancing across the drives. |
| WO_drive | Indicates the rate at which write operations were performed on this drive. | Writes/Sec | See the interpretation for RO_drive. |
| RB_drive | Indicates the rate at which data blocks were read from this drive. | MB/Sec | By comparing the values of RB_drive and WB_drive across drives, you can identify the drive that is the slowest in reading and writing. The reason for the slowness has to be determined and eliminated to ensure the high availability and performance of the storage system. |
| WB_drive | Indicates the rate at which data blocks were written to this drive. | MB/Sec | See the interpretation for RB_drive. |
| RE_drive | Indicates the average time taken by this drive to respond to read requests. | Millisec | A low value is desired for this measure. A high value indicates slowness in responding to read requests. The least responsive drive can be identified by comparing the value of this measure across drives. |
| WE_drive | Indicates the average time taken by this drive to respond to write requests. | Millisec | A low value is desired for this measure. A high value indicates slowness in responding to write requests. To find out which drive is taking an unreasonably long time to service write requests, compare the value of this measure across drives. |
| PRE_drive | Indicates the maximum time taken by this drive to respond to a read request during the last measurement period. | Millisec | Compare the value of this measure across the drives to identify the drive that was the slowest in responding to read requests during the last measurement period. If the same drive tops this comparison consistently, it could indicate a read I/O processing bottleneck in that drive. |
| PWE_drive | Indicates the maximum time taken by this drive to respond to a write request during the last measurement period. | Millisec | Compare the value of this measure across the drives to identify the drive that was the slowest in responding to write requests during the last measurement period. If the same drive tops this comparison consistently, it could indicate a write I/O processing bottleneck in that drive. |
| PRO_drive | Indicates the maximum time a read request was waiting in the queue before being sent to this drive during the last measurement period. | Millisec | The value of this measure includes the time spent by the read request in the queue and the time taken by the drive to execute the request. By comparing the value of this measure with that of the Peak read external response measure, you can understand where the read requests spent the most time: in the queue, or in the drive being processed. |
| PWO_drive | Indicates the maximum time a write request was waiting in the queue before being sent to this drive during the last measurement period. | Millisec | The value of this measure includes the time spent by the write request in the queue and the time taken by the drive to execute the request. By comparing the value of this measure with that of the Peak write external response measure, you can understand where the write requests spent the most time: in the queue, or in the drive being processed. |
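
The cross-drive comparisons described above can be scripted once the measure values have been exported. The following is only a minimal sketch under stated assumptions, not part of the eG product: the sample values, the `outliers` and `peak_read_split` helpers, and the "twice the median" threshold are all hypothetical. It also assumes that PRE_drive is the Peak read external response measure, and that subtracting it from PRO_drive gives a rough estimate of queue wait time (the two peaks may come from different requests, so treat the result as indicative only).

```python
from statistics import median

# Hypothetical sample values for three drives; real values would come
# from the measures reported by the test.
SAMPLE = {
    "drive0": {"RO_drive": 120.0, "RE_drive": 4.0, "PRE_drive": 9.0, "PRO_drive": 11.0},
    "drive1": {"RO_drive": 115.0, "RE_drive": 5.0, "PRE_drive": 10.0, "PRO_drive": 12.0},
    "drive2": {"RO_drive": 410.0, "RE_drive": 21.0, "PRE_drive": 30.0, "PRO_drive": 95.0},
}


def outliers(samples, measure, factor=2.0):
    """Return drives whose value for `measure` exceeds `factor` times the
    median across all drives (the factor is an arbitrary assumption)."""
    mid = median(s[measure] for s in samples.values())
    return sorted(d for d, s in samples.items() if s[measure] > factor * mid)


def peak_read_split(sample):
    """Estimate (queue_wait_ms, drive_ms) for the peak read request.

    Assumes PRO_drive covers queue time plus execution time and PRE_drive
    covers execution at the drive only; the result is a rough estimate.
    """
    drive_ms = sample["PRE_drive"]
    queue_ms = max(sample["PRO_drive"] - drive_ms, 0.0)
    return queue_ms, drive_ms


if __name__ == "__main__":
    print("Overloaded (reads/sec):", outliers(SAMPLE, "RO_drive"))
    print("Slow read response:", outliers(SAMPLE, "RE_drive"))
    for name, sample in SAMPLE.items():
        queue_ms, drive_ms = peak_read_split(sample)
        print(f"{name}: ~{queue_ms} ms queued, ~{drive_ms} ms in the drive")
```

The same helpers would apply to the write-side measures (WO_drive, WE_drive, PWE_drive, PWO_drive) by substituting the measure names. A relative threshold is used here because the text above recommends comparing drives against each other rather than against fixed limits.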