eG Monitoring
 

Measures reported by OraRacAsmDiskIOTest

Oracle Automatic Storage Management (ASM) is a volume manager and a file system for Oracle database files that supports single-instance Oracle Database and Oracle Real Application Clusters (Oracle RAC) configurations. ASM is Oracle’s recommended storage management solution and provides an alternative to conventional volume managers, file systems, and raw devices.

ASM uses disk groups to store datafiles; an ASM disk group is a collection of disks that ASM manages as a unit. Within a disk group, ASM exposes a file system interface for Oracle database files. The contents of the files stored in a disk group are evenly distributed, or striped, across its disks to eliminate hot spots and to provide uniform performance.

You need to periodically monitor the read-write activity on each disk in a disk group to make sure that I/O load is uniformly balanced across all disks in a group. The OraRacAsmDiskIOTest test helps you do just that. At pre-configured intervals, this test monitors the I/O activity on each disk in every disk group of an Oracle cluster, reveals I/O-intensive and error-prone disks, and brings irregularities in load balancing to the fore.
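As an illustration of how per-second measures can be derived at each interval, the sketch below converts two successive samples of cumulative per-disk counters into rates. It rests on assumptions not stated in this document: that the underlying counters are cumulative, and that the test works this way. The `DiskSample` class and the `rates` function are hypothetical names used only for this example.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class DiskSample:
    """Cumulative I/O counters for one disk at one point in time (hypothetical shape)."""
    reads: int
    writes: int
    read_errors: int
    write_errors: int


def rates(prev: DiskSample, curr: DiskSample, interval_secs: float) -> Dict[str, float]:
    """Convert two cumulative samples into per-second rates over the interval."""
    if interval_secs <= 0:
        raise ValueError("interval_secs must be positive")

    def per_sec(before: int, after: int) -> float:
        # Assumption: a counter that goes backwards has been reset,
        # so the delta is reported as 0 rather than as a negative rate.
        return max(after - before, 0) / interval_secs

    return {
        "Reads": per_sec(prev.reads, curr.reads),
        "Writes": per_sec(prev.writes, curr.writes),
        "ReadErrors": per_sec(prev.read_errors, curr.read_errors),
        "WriteErrors": per_sec(prev.write_errors, curr.write_errors),
    }


# Example: two samples of the same disk taken 60 seconds apart.
previous = DiskSample(reads=1000, writes=500, read_errors=0, write_errors=0)
current = DiskSample(reads=1600, writes=800, read_errors=3, write_errors=0)
sample_rates = rates(previous, current, interval_secs=60)
```

The keys of the returned dictionary mirror the measure names this test reports, so the result for one disk can be read as one set of results for a DiskGroup:Disk pair.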

Outputs of the test: One set of results for each DiskGroup:Disk pair in the Oracle cluster being monitored

The measures made by this test are as follows:

Measurement | Description | Measurement Unit | Interpretation
Reads | Indicates the rate at which reads occur on this disk. | Reads/Sec | Compare the values of the Reads and Writes measures across the disks in a disk group to identify the I/O-intensive disks in that group. In the process, you can also determine whether the I/O load is evenly balanced across all the disks in the group. If you notice irregularities in load balancing, you may want to consider adding more disks to the group.
Writes | Indicates the rate at which writes occur on this disk. | Writes/Sec | See Reads.
ReadErrors | Indicates the number of errors that occur per second while reading from this disk. | ReadErrors/Sec | The value 0 is desired for both the ReadErrors and WriteErrors measures. A non-zero value indicates I/O errors. By comparing the values of these measures across disks and across disk groups, you can not only identify the error-prone disks and groups, but also determine whether most of the errors occurred while reading or while writing.
WriteErrors | Indicates the number of errors that occur per second when writing to this disk on this cluster node. | WriteErrors/Sec | See ReadErrors.
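
The comparisons suggested in the Interpretation column can be sketched as follows. This is only a minimal illustration, not part of the test itself: the `interpret` function, the 25% `tolerance` threshold, and the disk names are assumptions made for the example. A disk is treated as I/O-intensive when its combined read and write rate exceeds the disk group average by more than the tolerance, and any non-zero error rate is reported.

```python
from statistics import mean
from typing import Dict, List


def interpret(group: Dict[str, Dict[str, float]],
              tolerance: float = 0.25) -> Dict[str, List[str]]:
    """Flag I/O-intensive and error-prone disks within one disk group.

    `group` maps a disk name to its measures: Reads, Writes,
    ReadErrors, and WriteErrors (all per-second rates).
    """
    findings: Dict[str, List[str]] = {
        "io_intensive": [], "read_errors": [], "write_errors": []
    }
    if not group:
        return findings

    # Average combined I/O rate across all disks in the group.
    avg_io = mean(m["Reads"] + m["Writes"] for m in group.values())

    for disk, m in sorted(group.items()):
        total_io = m["Reads"] + m["Writes"]
        # Hypothetical rule: more than `tolerance` above the group average.
        if avg_io > 0 and total_io > avg_io * (1 + tolerance):
            findings["io_intensive"].append(disk)
        # The desired value for both error measures is 0.
        if m["ReadErrors"] > 0:
            findings["read_errors"].append(disk)
        if m["WriteErrors"] > 0:
            findings["write_errors"].append(disk)
    return findings


# Example with made-up values for three disks of one disk group.
data_group = {
    "DATA_0000": {"Reads": 100.0, "Writes": 50.0, "ReadErrors": 0.0, "WriteErrors": 0.0},
    "DATA_0001": {"Reads": 105.0, "Writes": 45.0, "ReadErrors": 0.0, "WriteErrors": 0.0},
    "DATA_0002": {"Reads": 300.0, "Writes": 150.0, "ReadErrors": 0.2, "WriteErrors": 0.0},
}
report = interpret(data_group)
```

In this example, DATA_0002 carries three times the I/O of the other two disks and is the only disk reporting read errors, so it appears under both `io_intensive` and `read_errors`, while `write_errors` stays empty. The threshold is arbitrary; a suitable value depends on the workload and the number of disks in the group.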