eG Monitoring
 

Measures reported by NAUSDUnGrpDiskTest

This test monitors the disks that do not belong to any RAID group in the NetApp Unified Storage system, such as spare disks, and reports the following:

  • The number of disks that are currently zeroing
  • The number of disks that are offline and the number of broken disks
  • The extent to which media scrubbing has been completed on those disks

The measures reported by this test are as follows:

Measurement: Zeroing_disks
Description: Indicates the number of disks that are currently zeroing in this storage system.
Measurement Unit: Number
Interpretation: Disk zeroing formats a disk by filling it with zeroes, that is, by overwriting its existing contents with zeroes before the disk is used. It is usually a time-consuming background operation that initializes spare disks before they can be used.

Measurement: Offline_disks
Description: Indicates the number of disks that are currently offline in this storage system.
Measurement Unit: Number
Interpretation: The operating system takes unresponsive or semi-responsive disks offline and reconstructs their data from the associated parity disks. This puts a strain on the performance of the associated RAID group. Offline disks that cannot be recovered are failed.

Measurement: Broken_disks
Description: Indicates the number of disks whose RAID status is Broken in this storage system.
Measurement Unit: Number
Interpretation: A disk may be broken because of disk failure, labeling issues, or because it was intentionally marked for physical removal. Broken disks affect the performance of their constituent RAID group and put the system at risk of losing data if spares are unavailable.

Measurement: Media_scrub_completed
Description: Indicates the average percentage of media scrubbing that is currently completed across all spare disks in this storage system.
Measurement Unit: Percent
Interpretation: Media scrubbing is a continuous background process. Its purpose is to detect and correct media errors, in order to minimize the chance of storage system disruption caused by a media error while the storage system is in degraded or reconstruction mode. By default, Data ONTAP runs continuous background media scrubbing on all storage system disks. If a media error is found, Data ONTAP uses RAID to reconstruct the data and repairs the error. Because of the media scrubbing process, the disk LEDs may blink on an apparently idle storage system, and some CPU activity may occur even when no user workload is present.
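
The four measures above are simple aggregates over the disks that are not part of any RAID group. As a rough illustration only, the Python sketch below derives them from a list of per-disk records. The `UngroupedDisk` record, its field names, the `summarize` function, and the state labels (`"spare"`, `"zeroing"`, `"offline"`, `"broken"`) are assumptions made for this example; they are not the eG agent's actual data model or a NetApp API.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class UngroupedDisk:
    """Hypothetical record for one disk outside any RAID group."""
    name: str
    state: str                               # assumed labels: "spare", "zeroing", "offline", "broken"
    media_scrub_pct: Optional[float] = None  # scrub progress in percent, if known


def summarize(disks: List[UngroupedDisk]) -> Dict[str, Optional[float]]:
    """Aggregate per-disk records into the four measures described above."""
    counts = {"zeroing": 0, "offline": 0, "broken": 0}
    for disk in disks:
        if disk.state in counts:
            counts[disk.state] += 1

    # Media_scrub_completed is an average over spare disks only;
    # disks with no known scrub progress are left out of the average.
    scrub = [
        d.media_scrub_pct
        for d in disks
        if d.state == "spare" and d.media_scrub_pct is not None
    ]
    average_scrub = sum(scrub) / len(scrub) if scrub else None

    return {
        "Zeroing_disks": counts["zeroing"],
        "Offline_disks": counts["offline"],
        "Broken_disks": counts["broken"],
        "Media_scrub_completed": average_scrub,
    }


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = [
        UngroupedDisk("disk1", "spare", 40.0),
        UngroupedDisk("disk2", "spare", 60.0),
        UngroupedDisk("disk3", "zeroing"),
        UngroupedDisk("disk4", "offline"),
        UngroupedDisk("disk5", "broken"),
    ]
    for measure, value in summarize(sample).items():
        print(f"{measure}: {value}")
```

In this sketch, `Media_scrub_completed` is returned as `None` when no spare disk reports scrub progress, rather than as 0, so that "no data" is not mistaken for "no scrubbing completed". That choice is an assumption of the example, not documented behavior of the test.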