eG Monitoring
 

Measures reported by EqlRaidStTest

The disks in EqualLogic are automatically protected with RAID (RAID 10, RAID 5, or RAID 50) and hot spares.

This test monitors this protective shield by periodically checking the status of the RAID and the number of hot spares available, and promptly reporting RAID failures.

The measures made by this test are as follows:

Measurement | Description | Measurement Unit | Interpretation
Status | Indicates the current state of the RAID. | | This measure reports the current state of the RAID as follows:
  • Ok
  • Degraded
  • Verifying
  • Reconstructing
  • Failed
  • Catastrophic Loss
  • Expanding
  • Mirroring

The numeric values that correspond to the above-mentioned states are as follows:

State Numeric Value
Ok 1
Degraded 2
Verifying 3
Reconstructing 4
Failed 5
Catastrophic Loss 6
Expanding 7
Mirroring 8

Note:

By default, this measure reports the current state of the RAID using the state names listed above. In the graph of this measure, however, each state is represented by its numeric equivalent from the table above.
Number_of_spares | Indicates the number of disks that are currently allotted as spares in the RAID. | Number | If a drive fails in a RAID array that includes redundancy (that is, any RAID level except RAID 0), it is desirable to get the drive replaced immediately so the array can be returned to normal operation. There are two reasons for this: fault tolerance and performance. If the array is running in degraded mode due to a drive failure, then until the drive is replaced, most RAID levels will be running with no fault protection at all: a RAID 1 array is reduced to a single drive, and a RAID 3 or RAID 5 array becomes equivalent to a RAID 0 array in terms of fault tolerance. At the same time, the performance of the array will be reduced, sometimes substantially.

An extremely useful RAID feature that helps alleviate this problem is the use of hot spares. Additional drives are attached to the controller and left in a "standby" mode. If a failure occurs, the controller can use the spare drive as a replacement for the bad drive. Moreover, with a controller that supports hot sparing, rebuild will be automatic. If the controller detects that a drive has gone down, it disables it, and immediately rebuilds the data onto the hot spare.
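
The state table and the hot-spare discussion above can be illustrated with a short Python sketch. The numeric values are taken from the table in this section; everything else is an assumption made for illustration and not part of eG Monitoring: the names RaidStatus, state_name, and needs_attention are hypothetical, as is the policy of flagging an array when it is Degraded, Failed, or in Catastrophic Loss, or when fewer than a chosen minimum number of spares remain.

```python
from enum import IntEnum


class RaidStatus(IntEnum):
    """Numeric equivalents of the RAID states, as listed in the table above."""
    OK = 1
    DEGRADED = 2
    VERIFYING = 3
    RECONSTRUCTING = 4
    FAILED = 5
    CATASTROPHIC_LOSS = 6
    EXPANDING = 7
    MIRRORING = 8


# Display names exactly as they appear in the state list.
STATE_NAMES = {
    RaidStatus.OK: "Ok",
    RaidStatus.DEGRADED: "Degraded",
    RaidStatus.VERIFYING: "Verifying",
    RaidStatus.RECONSTRUCTING: "Reconstructing",
    RaidStatus.FAILED: "Failed",
    RaidStatus.CATASTROPHIC_LOSS: "Catastrophic Loss",
    RaidStatus.EXPANDING: "Expanding",
    RaidStatus.MIRRORING: "Mirroring",
}


def state_name(value: int) -> str:
    """Translate a graphed numeric value back to its state name."""
    return STATE_NAMES[RaidStatus(value)]


def needs_attention(status_value: int, number_of_spares: int,
                    min_spares: int = 1) -> bool:
    """Hypothetical alerting policy (an assumption, not eG's built-in logic).

    Flags the array when it has lost redundancy or data, or when too few
    hot spares remain to absorb the next drive failure automatically.
    """
    status = RaidStatus(status_value)
    critical = {
        RaidStatus.DEGRADED,
        RaidStatus.FAILED,
        RaidStatus.CATASTROPHIC_LOSS,
    }
    return status in critical or number_of_spares < min_spares
```

For example, state_name(4) gives "Reconstructing", and needs_attention(1, 0) is true because an array that is Ok but has no spares left cannot rebuild automatically after the next drive failure. The min_spares threshold is a placeholder and should be chosen to suit the array's RAID policy.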