Measures reported by NetCluJobTypeTest
A job is any asynchronous task performed on the NetApp Cluster. Jobs are typically long-running volume operations such as copy, move, and mirror. You can monitor, pause, stop, and restart jobs, and configure them to run on specified schedules.
Jobs that you can manage fall into three categories: server-affiliated, cluster-affiliated, and private.
- Server-Affiliated jobs: These jobs are queued by the management framework to a specific node to be run.
- Cluster-Affiliated jobs: These jobs are queued by the management framework to any node in the cluster to be run.
- Private jobs: These jobs are specific to a node and do not use the replicated database (RDB) or any other cluster mechanism.
Jobs are placed into a job queue and run when resources are available. If the jobs in the queue are not processed quickly, the queue grows long and the resulting overload slows down the NetApp Cluster. When this happens, administrators need to quickly determine which job types are contributing to the overload, and why: are jobs of this type failing frequently owing to errors, or is the cluster not adequately configured to handle these jobs? The Job Status test helps administrators answer these questions.
This test auto-discovers the types of jobs in the queue and, for each job type, reports the number of jobs that were successful, running, rescheduled, failed, and so on. In this way, the test highlights job types that fail often, job types that take too long to complete, and the probable reasons for these problems.
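The per-type counting described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the test's actual implementation: the `Job` record, its field names, and the sample job types and names are assumptions made for the example. Only the state names are taken from the measures reported by this test.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    """Hypothetical job record; the field names are assumed for illustration."""
    name: str
    job_type: str  # the descriptor that jobs are grouped by
    state: str     # e.g. "Success", "Running", "Queued", "Failure"


def count_states_by_type(jobs):
    """Return {job_type: Counter({state: count})} for the given jobs."""
    counts = defaultdict(Counter)
    for job in jobs:
        counts[job.job_type][job.state] += 1
    return dict(counts)


# Made-up sample data, not output from a real cluster.
sample_jobs = [
    Job("vol_move_01", "Volume Move", "Running"),
    Job("vol_move_02", "Volume Move", "Failure"),
    Job("vol_move_03", "Volume Move", "Failure"),
    Job("mirror_01", "Mirror", "Success"),
]

per_type = count_states_by_type(sample_jobs)
```

Grouping by job type first and by state second mirrors how the test reports one set of measures per discovered job type. A state that has no jobs simply reads as zero from the `Counter`.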
The measures made by this test are as follows:
| Measurement | Description | Measurement Unit | Interpretation |
|---|---|---|---|
| Success | Indicates the number of jobs of this job type that completed successfully. | Number | A high value is desired for this measure. |
| Initial | Indicates the number of jobs of this job type that had been created but were yet to be queued. | Number | |
| Running | Indicates the number of jobs of this job type that were running after being picked up by an instance of the Job Manager. | Number | |
| Waiting | Indicates the number of jobs of this job type that were waiting for another job to complete. | Number | A high value for this measure may indicate an endlessly running job. Such a job needs to be terminated; otherwise, it may cause a performance bottleneck. |
| Queued | Indicates the number of jobs of this job type that were queued for execution. | Number | Queued jobs may run immediately or may be scheduled to run at a later time. |
| Pausing | Indicates the number of jobs of this job type that were in the process of pausing after being requested to pause. | Number | |
| Paused | Indicates the number of jobs of this job type that were paused indefinitely. | Number | |
| Quitting | Indicates the number of jobs of this job type that had been requested to terminate and were shutting down. | Number | |
| Quit | Indicates the number of jobs of this job type that had been requested to terminate. | Number | |
| Reschedule | Indicates the number of jobs of this job type that were rescheduled. | Number | |
| Error | Indicates the number of times an internal error occurred while processing jobs of this job type. | Number | Ideally, the value of this measure should be zero. The detailed diagnosis of this measure, if enabled, lists the name of the vServer, the name of the job, the priority of the job, the description of the job, and the progress of the job. |
| Failure | Indicates the number of jobs of this job type that failed to execute. | Number | A low value is desired for this measure. The detailed diagnosis of this measure, if enabled, lists the name of the vServer, the name of the job, the priority of the job, the description of the job, and the progress of the job. |
| Dead | Indicates the number of jobs of this job type that exceeded the drop-dead time and are being removed from the queue. | Number | The detailed diagnosis of this measure, if enabled, lists the name of the vServer, the name of the job, the priority of the job, the description of the job, and the progress of the job. |
| Unknown | Indicates the number of jobs of this job type that were in the Unknown state. | Number | The detailed diagnosis of this measure, if enabled, lists the name of the vServer, the name of the job, the priority of the job, the description of the job, and the progress of the job. |
| Restart | Indicates the number of jobs of this job type that were restarted. | Number | |
| Dormant | Indicates the number of jobs of this job type that were inactive while waiting on some external event. | Number | |
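The interpretation guidance in the table can be turned into a simple check on the measures of one job type. The sketch below is a hypothetical helper, not part of the test itself: the function name, the dictionary layout, and the default limits are assumptions chosen for illustration. Appropriate limits depend on the environment.

```python
def review_measures(measures, waiting_limit=5, failure_limit=0):
    """Return a list of findings for one job type's measures.

    `measures` maps a measure name (as listed in the table) to its count.
    The default limits are placeholder assumptions, not recommended values.
    """
    findings = []
    # Error: ideally, the value of this measure should be zero.
    if measures.get("Error", 0) > 0:
        findings.append("Error: internal errors occurred; check the detailed diagnosis")
    # Failure: a low value is desired.
    if measures.get("Failure", 0) > failure_limit:
        findings.append("Failure: jobs failed to execute; check the detailed diagnosis")
    # Waiting: a high value may point to an endlessly running job.
    if measures.get("Waiting", 0) > waiting_limit:
        findings.append("Waiting: many jobs are blocked; look for a long-running job")
    return findings


# Made-up measure values for a single job type.
example = {"Success": 12, "Waiting": 8, "Error": 1, "Failure": 0}
findings = review_measures(example)
```

With these made-up values, the helper flags the Error and Waiting measures and leaves Failure alone, because its count does not exceed the assumed limit.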