eG Monitoring
 

Measures reported by OraExaCelDIOTest

When database performance issues are related to I/O load on the Exadata storage servers, typically there will be increased latencies in the I/O-related wait events, and increased database time in the User I/O or System I/O wait classes.

A cell disk with a processing bottleneck will not be able to process user requests for data quickly, thereby causing prolonged delays in data access for users. Similarly, a cell disk that is overloaded will not be able to perform at peak capacity, thus affecting the user experience with the storage server. Administrators hence have to continuously track the load on and the processing speed of each of the cell disks, so that potential overload conditions and probable processing delays can be detected proactively and pre-emptively treated. The OraExaCelDIOTest test helps administrators with this.

This test monitors the level of traffic on each cell disk created on an Oracle Exadata Storage Server, and helps isolate irregularities in load balancing across the cell disks. The test also helps identify which cell disk, if any, is experiencing a processing bottleneck, so that the bottleneck can be cleared before users complain of slowdowns.
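As a rough illustration of the load-balancing comparison described above, the sketch below takes the values of one measure for every cell disk and reports the busiest disk along with any disk that deviates noticeably from the average. It is a minimal sketch only: the function name `find_imbalance`, the 25% tolerance, and the sample disk names and values are assumptions made for illustration and are not part of the eG product.

```python
from statistics import mean

def find_imbalance(values, tolerance=0.25):
    """Return (busiest_disk, outliers) for one measure across cell disks.

    values:    hypothetical mapping of cell disk name -> measure value,
               for example largeBlockReadData in MB/sec.
    tolerance: assumed fraction of the mean beyond which a disk is flagged.
    """
    if not values:
        return None, []
    avg = mean(values.values())
    # Disk that reports the highest value for this measure.
    busiest = max(values, key=values.get)
    if avg == 0:
        # No traffic at all, so no disk can deviate from the mean.
        return busiest, []
    # Disks whose value differs from the mean by more than the tolerance.
    outliers = sorted(
        disk for disk, value in values.items()
        if abs(value - avg) / avg > tolerance
    )
    return busiest, outliers

# Illustrative values only; not taken from a real storage server.
sample = {
    "CD_00_cell01": 120.0,
    "CD_01_cell01": 118.0,
    "CD_02_cell01": 240.0,
    "CD_03_cell01": 122.0,
}
busiest, outliers = find_imbalance(sample)
print(busiest, outliers)
```

The same comparison can be repeated for any of the data-rate, request-rate, or utilization measures listed below; the tolerance would need tuning for the environment.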

Outputs of the test: One set of results for each cell disk on the target Oracle Exadata Storage Server being monitored

Descriptor: Cell disk

The measures made by this test are as follows:

Measurement Description Measurement Unit Interpretation
largeBlockReadData Indicates the rate at which data was read in large blocks from this cell disk. MB/sec

These measures are a good indicator of the read I/O processing ability of the cell disks.

Compare the values of these measures across the cell disks to identify the cell disk that reads the most data in large blocks or small blocks.

smallBlockReadData Indicates the rate at which data was read in small blocks from this cell disk. MB/sec

Compare the value of this measure with the data transmitted over Ethernet interfaces to figure out the type of interface through which the maximum amount of data was received.

receivedEthernetData Indicates the rate at which data was received from Ethernet interfaces. MB/sec

These measures are a good indicator of the read I/O processing ability of the cell disks.

Compare the values of these measures across the cell disks to identify the cell disk that reads the most data in large blocks or small blocks.

largeBlockWriteData Indicates the rate at which data was written in large blocks to this cell disk. MB/sec

These measures are a good indicator of the write I/O processing ability of the cell disks.

Compare the values of these measures across the cell disks to identify the cell disk that writes the most data in large blocks or small blocks.

smallBlockWriteData Indicates the rate at which data was written in small blocks to this cell disk. MB/sec

These measures are a good indicator of the write I/O processing ability of the cell disks.

Compare the values of these measures across the cell disks to identify the cell disk that writes the most data in large blocks or small blocks.

scrupJobReadData Indicates the rate at which data was read from this cell disk by the scrubbing job. MB/sec

Scrub I/O occurs when Oracle Exadata System Software automatically inspects and repairs the hard disks. It is performed periodically when the hard disks are idle and mostly results in large disk reads, which should be throttled automatically if the disk becomes I/O bound.

ioErrors Indicates the number of I/O errors recorded for this cell disk per minute. Errors/min

Ideally, the value of this measure should be zero.

Compare the value of this measure across cell disks to identify the cell disk that is prone to errors.

largeBlockReadRequests Indicates the number of read requests to read large blocks from this cell disk per second. Requests/sec

Compare the values of these measures across the cell disks to identify the cell disk that processes the most read requests for large blocks or small blocks.

smallBlockReadRequests Indicates the number of read requests to read small blocks from this cell disk per second. Requests/sec

Compare the values of these measures across the cell disks to identify the cell disk that processes the most read requests for large blocks or small blocks.

largeBlockWriteRequests Indicates the number of write requests to write large blocks to this cell disk per second. Requests/sec

Compare the values of these measures across the cell disks to identify the cell disk that processes the most write requests for large blocks or small blocks.

smallBlockWriteRequests Indicates the number of write requests to write small blocks to this cell disk per second. Requests/sec

Compare the values of these measures across the cell disks to identify the cell disk that processes the most write requests for large blocks or small blocks.

scrubbingJobReadRequests Indicates the number of requests per second to read data from this cell disk by the scrubbing job. Requests/sec

 

largeBlockAvgReadLatency Indicates the average time taken to read large blocks from this cell disk per request. Milliseconds/request

A low value is desired for this measure. A sudden or gradual increase in the value of this measure indicates that the I/O processing ability of the cell disk is declining. Administrators need to analyze the reason behind such an increase and rectify it as early as possible.

smallBlockAvgReadLatency Indicates the average time taken to read small blocks from this cell disk per request. Milliseconds/request

A low value is desired for this measure. A sudden or gradual increase in the value of this measure indicates that the I/O processing ability of the cell disk is declining. Administrators need to analyze the reason behind such an increase and rectify it as early as possible.

largBlockAvgWriteLatency Indicates the average time taken to write large blocks to this cell disk per request. Milliseconds/request

A low value is desired for this measure. A sudden or gradual increase in the value of this measure indicates that the I/O processing ability of the cell disk is declining. Administrators need to analyze the reason behind such an increase and rectify it as early as possible.

smalBlockAvgWriteLatency Indicates the average time taken to write small blocks to this cell disk per request. Milliseconds/request

A low value is desired for this measure. A sudden or gradual increase in the value of this measure indicates that the I/O processing ability of the cell disk is declining. Administrators need to analyze the reason behind such an increase and rectify it as early as possible.

deviceUtilization Indicates the percentage of disk resources utilized for this cell disk. Percent

A high value indicates that this cell disk is using most of the disk resources available to it.

Compare the value of this measure across cell disks to figure out the cell disk that is utilizing maximum disk resources.

largeRequestDeviceUtil Indicates the percentage of disk resources utilized by large requests for this cell disk. Percent

Compare the value of this measure across cell disks to figure out the cell disk that is utilizing the maximum of disk resources for processing large requests.

smallRequestDeviceUtil Indicates the percentage of disk resources utilized by small requests for this cell disk. Percent

Compare the value of this measure across cell disks to figure out the cell disk that is utilizing the maximum of disk resources for processing small requests.
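The latency measures above call for watching out for sudden or gradual increases. One possible way to classify a series of latency readings for a single cell disk is sketched below. This is a minimal sketch under stated assumptions: the function name `latency_trend`, the thresholds (`spike_factor`, `growth_factor`), the window size, and the sample readings are all hypothetical and would need tuning; the eG product does not necessarily detect trends this way.

```python
from statistics import mean

def latency_trend(samples, spike_factor=2.0, growth_factor=1.2, window=3):
    """Classify chronological latency readings (ms/request) for one cell disk.

    Returns 'sudden', 'gradual', 'steady', or 'insufficient data'.
    All thresholds are assumed values chosen for illustration.
    """
    if len(samples) < 2 * window:
        return "insufficient data"
    # Sudden increase: the latest reading far exceeds the average of all
    # earlier readings.
    if samples[-1] > spike_factor * mean(samples[:-1]):
        return "sudden"
    # Gradual increase: the most recent window averages noticeably higher
    # than the earliest window.
    if mean(samples[-window:]) > growth_factor * mean(samples[:window]):
        return "gradual"
    return "steady"

# Illustrative readings only, for example largeBlockAvgReadLatency over time.
print(latency_trend([2.0, 2.1, 1.9, 2.0, 2.1, 6.5]))
print(latency_trend([2.0, 2.2, 2.4, 2.7, 3.0, 3.3]))
print(latency_trend([2.0, 2.1, 1.9, 2.0, 2.1, 2.0]))
```

Comparing window averages rather than individual readings keeps a single noisy sample from being reported as a gradual decline.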