eG Monitoring
 

Measures reported by EMCRAIDSysStTest

The storage processor enables administrators to:

  • create RAID groups
  • bind LUNs
  • execute CLI commands
  • perform read/write operations from an external server to the SAN
Excessive usage of, or heavy I/O load on, a single storage processor can cause a marked deterioration in the overall performance of the storage sub-system, and typically points to severe deficiencies in the load-balancing algorithm that drives the storage processors. Using the EMCRAIDSysStTest test, administrators can monitor the current state, usage, and load of each storage processor on the storage system, quickly detect an overload condition, pinpoint the storage processor that is bearing the brunt of it, and promptly initiate measures to resolve the issue, so as to ensure optimal performance of the storage system.

The measures made by this test are as follows:

Measurement Description Measurement Unit Interpretation
operationalStatus Indicates the current operational state of this storage processor.   The values that this measure can report and their corresponding numeric values are discussed in the table below:

Numeric Value Measure Value
0 OK
1 In Service
2 Power Mode
3 Completed
4 Starting
5 Dormant
6 Other
7 Unknown
8 Stopping
9 Stressed
10 Stopped
11 Supporting Entity in Error
12 Degraded or Predicted Failure
13 Predictive Failure
14 Lost Communication
15 No Contact
16 Aborted
17 Error
18 Non-Recoverable Error

Note:

By default, this measure reports the Measure Values listed above to indicate the operational state of a storage processor. In the graph of this measure, however, operational states are represented using their numeric equivalents only.
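Because graphs of this measure show only numeric values, a small lookup can help translate a graphed value back into its state name. The following is a minimal sketch, not part of the eG product: the dictionary simply mirrors the table above, and `describe_operational_status` is a hypothetical helper name.

```python
# Numeric-to-state mapping copied from the operationalStatus table above.
OPERATIONAL_STATUS = {
    0: "OK",
    1: "In Service",
    2: "Power Mode",
    3: "Completed",
    4: "Starting",
    5: "Dormant",
    6: "Other",
    7: "Unknown",
    8: "Stopping",
    9: "Stressed",
    10: "Stopped",
    11: "Supporting Entity in Error",
    12: "Degraded or Predicted Failure",
    13: "Predictive Failure",
    14: "Lost Communication",
    15: "No Contact",
    16: "Aborted",
    17: "Error",
    18: "Non-Recoverable Error",
}


def describe_operational_status(value: int) -> str:
    """Return the state name for a graphed numeric value (hypothetical helper)."""
    return OPERATIONAL_STATUS.get(value, f"Unrecognized ({value})")
```

For example, `describe_operational_status(9)` returns `"Stressed"`, while a value outside the table is reported as unrecognized rather than raising an error.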

detailedStatus Describes the current operational state of this storage processor.   This measure will be reported only if the API provides a detailed operational state.

Typically, the detailed state will describe why the storage processor is in a particular operational state. For instance, if the operationalStatus measure reports the value Stopping for a storage processor, then this measure will explain why that storage processor is being stopped.

The values that this measure can report and their corresponding numeric values are discussed in the table below:

Numeric Value Measure Value
0 Online
1 Success
2 Power Saving Mode
3 Write Protected
4 Write Disabled
5 Not Ready
6 Removed
7 Rebooting
8 Offline
9 Failure

Note:

By default, this measure reports the Measure Values listed above to indicate the detailed operational state of a storage processor. In the graph of this measure, however, detailed operational states are represented using their numeric equivalents only.

dataTransmitted Indicates the rate at which data was transmitted by this storage processor. MB/Sec  
iops Indicates the rate at which I/O operations were performed on this storage processor. IOPS Compare the value of this measure across storage processors to know which storage processor handled the maximum number of I/O requests and which handled the least. If the gap between the two is very high, then it indicates serious irregularities in load-balancing across storage processors.

You may then want to look at the reads and writes measures to understand what to fine-tune: the load balancing of read requests or that of write requests.
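The cross-processor comparison described above can be sketched as follows. This is an illustration only: the function name, the processor labels ("SP A", "SP B"), and the choice to express the gap as a share of total IOPS are all assumptions, and what counts as a "very high" gap is left to the administrator.

```python
from __future__ import annotations


def iops_imbalance(iops_by_sp: dict[str, float]) -> tuple[str, str, float]:
    """Return (busiest SP, least busy SP, gap as a fraction of total IOPS).

    `iops_by_sp` maps a storage processor label to its iops measure value.
    """
    if not iops_by_sp:
        raise ValueError("at least one storage processor is required")
    busiest = max(iops_by_sp, key=iops_by_sp.get)
    quietest = min(iops_by_sp, key=iops_by_sp.get)
    total = sum(iops_by_sp.values())
    # Avoid dividing by zero when no processor is handling any I/O.
    gap_share = (iops_by_sp[busiest] - iops_by_sp[quietest]) / total if total else 0.0
    return busiest, quietest, gap_share


busiest, quietest, gap = iops_imbalance({"SP A": 9000.0, "SP B": 3000.0})
```

The same function can be applied to the reads and writes measures to see which type of request accounts for the imbalance.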

reads Indicates the rate at which read operations were performed on this storage processor. Reads/Sec Compare the value of this measure across storage processors to know which storage processor handled the maximum number of read requests and which handled the least.
writes Indicates the rate at which write operations were performed on this storage processor. Writes/Sec Compare the value of this measure across storage processors to know which storage processor handled the maximum number of write requests and which handled the least.
dataReads Indicates the rate at which data is read from this storage processor. MB/Sec Compare the value of these measures across storage processors to identify the slowest storage processor in terms of servicing read and write requests (respectively).
dataWritten Indicates the rate at which data is written to this storage processor. MB/Sec
avgReadSize Indicates the amount of data read from this storage processor per I/O operation. MB/Op Compare the value of these measures across storage processors to identify the slowest storage processor in terms of servicing read and write requests (respectively).
avgWriteSize Indicates the amount of data written to this storage processor per I/O operation. MB/Op
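As a rough cross-check, an average I/O size can be derived from the throughput and operation-rate measures. The sketch below assumes that average size is simply throughput divided by operation rate over the same interval; the source does not state how the test itself computes avgReadSize and avgWriteSize, and `avg_io_size_mb` is a hypothetical helper.

```python
def avg_io_size_mb(throughput_mb_per_sec: float, ops_per_sec: float) -> float:
    """Approximate MB per operation from a MB/Sec value and an ops/sec value."""
    if ops_per_sec <= 0:
        # No operations in the interval, so there is no meaningful average.
        return 0.0
    return throughput_mb_per_sec / ops_per_sec


# Example: dataReads of 40 MB/Sec with reads of 640 Reads/Sec.
read_size = avg_io_size_mb(40.0, 640.0)
```

In this example each read moves 0.0625 MB (64 KB) on average.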
readHits Indicates the percentage of read requests that were serviced by the cache of this storage processor. Percent A high value is desired for this measure. A very low value is a cause for concern, as it indicates that the cache is being used poorly; this in turn implies that a large share of requests require direct disk accesses, which are expensive operations.
writeHits Indicates the percentage of write requests that were serviced by the cache of this storage processor. Percent
highWaterFlushes Indicates the count of times data was flushed out of the write cache of this storage processor because a high watermark was violated. Number To regulate cache usage, watermark levels can be set using Navisphere Manager. Suppose the Low Watermark (LWM) is set at 60% and the High Watermark (HWM) at 80%. In this scenario, the Clariion algorithms will try to keep cache occupancy between 60% and 80%, since those are defined as the low and high watermarks.

If, for some reason, cache occupancy exceeds the HWM (80% in this example), Forced Flushing kicks in, disabling the write cache on the Clariion.

idleWaterFlushes Indicates the count of times data was flushed out of the write cache of this storage processor via idle cache flushing. Number When a host writes data to a Clariion disk via the Clariion's cache, the Clariion writes that data to cache and acknowledges to the host that the data has been written to disk. The data may actually still be sitting in the cache, or be in the process of being written to disk, when this acknowledgement goes out. Data is transferred from the cache to the disk in 64 KB chunks.

Sometimes, because of large chunks of data coming in from the host, Idle Cache Flushing is not able to maintain the Low Watermark (LWM); in those cases, Watermark Cache Flushing kicks in.

lowWaterFlushes Indicates the count of times data was flushed out of the write cache of this storage processor because a low watermark was violated. Number Sometimes, because of large chunks of data coming in from the host, Idle Cache Flushing is not able to maintain the Low Watermark (LWM); in those cases, Watermark Cache Flushing kicks in.
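The three flush counters above correspond to three ranges of write-cache occupancy. The sketch below is a simplified model of that relationship, not the Clariion's actual algorithm: it assumes the 60% and 80% watermarks used as an example earlier, and the function name and returned labels are illustrative.

```python
def flush_regime(occupancy_pct: float, lwm: float = 60.0, hwm: float = 80.0) -> str:
    """Classify write-cache occupancy against the low and high watermarks."""
    if not 0.0 <= occupancy_pct <= 100.0:
        raise ValueError("occupancy must be a percentage between 0 and 100")
    if lwm >= hwm:
        raise ValueError("the low watermark must be below the high watermark")
    if occupancy_pct > hwm:
        # Above the HWM: forced flushing takes over.
        return "forced flushing"
    if occupancy_pct > lwm:
        # Idle flushing could not hold the LWM: watermark flushing kicks in.
        return "watermark flushing"
    # At or below the LWM: idle flushing is keeping up.
    return "idle flushing"
```

With the default watermarks, an occupancy of 50% falls under idle flushing, 70% under watermark flushing, and 85% under forced flushing.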
writeFlushes Indicates the number of requests to flush the write cache of this storage processor. Number  
writeCacheFlushed Indicates the amount of data flushed out of the write cache of this storage processor. KB  
queueArrivals Indicates the number of times a user request arrived while at least one other request was being processed by this storage processor. Number  
queueLength Indicates the count of queue length by arrivals for this storage processor. Number A consistent increase in the value of this measure could indicate a processing bottleneck.
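One simple way to spot a consistent increase in queueLength is to check whether the most recent samples rise without interruption. This is a minimal sketch under stated assumptions: the window of five samples is arbitrary, the sample values are invented for illustration, and `is_consistently_increasing` is a hypothetical helper rather than an eG feature.

```python
from __future__ import annotations

from typing import Sequence


def is_consistently_increasing(samples: Sequence[float], window: int = 5) -> bool:
    """Return True if the last `window` samples are strictly increasing."""
    if window < 2:
        raise ValueError("window must cover at least two samples")
    if len(samples) < window:
        # Not enough history to judge a trend.
        return False
    recent = samples[-window:]
    return all(earlier < later for earlier, later in zip(recent, recent[1:]))


# Hypothetical queueLength readings from successive measurement periods.
rising = is_consistently_increasing([3, 4, 6, 9, 12])
```

A single dip within the window resets the signal, so this check reacts only to sustained growth and ignores isolated spikes.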
dirtyPagesPct Indicates the percentage of dirty pages currently in cache, that is, pages that have been modified in the SP’s write cache, but that have not yet been written to disk. Percent A high percentage of dirty pages means the cache is handling many write requests.