eG Monitoring

Measures reported by OraRacJobsTest

This test monitors Oracle jobs and reports the number of jobs that have failed and the number that are broken. The detailed diagnosis capability offered by this test enables administrators to investigate failed and broken jobs further by revealing the complete details of those jobs.

Outputs of the test: One set of results for the Oracle cluster being monitored

The measures made by this test are as follows:

| Measurement | Description | Measurement Unit | Interpretation |
|---|---|---|---|
| Num_failed_jobs | Indicates the number of jobs that failed. | Number | Ideally, the value of this measure should be 0. Any value greater than 0 is a cause for concern, as it indicates that at least one job has failed. To find out which jobs have failed, use the detailed diagnosis capability of this measure. Typically, if a job fails, Oracle attempts to run the job again 16 times, at fixed time intervals. You are advised to investigate the reason for the failure and fix it before Oracle completes its 16th attempt. This is because, if the 16th attempt also fails, Oracle flags the job as a ‘broken job’, which can then be executed only manually. |
| Num_broken_jobs | Indicates the number of jobs that are broken. | Number | Ideally, the value of this measure should be 0. Any value greater than 0 is a problem, as it indicates the existence of one or more broken jobs. A job is considered broken only if the 16th attempt made by Oracle to run the job fails. To find out which jobs are broken, use the detailed diagnosis capability of this measure. Once the jobs are identified, you can run them manually through the DBMS_JOB.RUN procedure after logging in as the owner of each job. |
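
The distinction between failed and broken jobs described above can be illustrated with a short Python sketch. It is not the implementation of this test. It assumes that job details come from Oracle's DBA_JOBS view, using its FAILURES and BROKEN columns, and it treats a job as "failed" when it has recorded failures but has not yet been flagged as broken. The names JobRow, summarize and run_statement are hypothetical helpers introduced only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Assumed query: FAILURES and BROKEN are columns of the DBA_JOBS view,
# but the query actually issued by the test is not documented here.
JOBS_QUERY = "SELECT job, schema_user, failures, broken FROM dba_jobs"


@dataclass
class JobRow:
    """One row of the assumed query (hypothetical container)."""
    job: int
    schema_user: str
    failures: Optional[int]  # may be NULL for a job that has never failed
    broken: str              # 'Y' or 'N'


def summarize(rows: List[JobRow]) -> Tuple[List[JobRow], List[JobRow]]:
    """Split rows into (failed, broken) lists.

    Assumption: a broken job is reported only as broken, and a failed job
    is one with at least one recorded failure that is not yet broken.
    """
    broken = [r for r in rows if r.broken == "Y"]
    failed = [r for r in rows if r.broken != "Y" and (r.failures or 0) > 0]
    return failed, broken


def run_statement(job_id: int) -> str:
    """Build the PL/SQL block that runs a job manually via DBMS_JOB.RUN."""
    return f"BEGIN DBMS_JOB.RUN({int(job_id)}); END;"


if __name__ == "__main__":
    # Made-up sample rows; in practice they would come from JOBS_QUERY.
    sample = [
        JobRow(1, "HR", 0, "N"),
        JobRow(2, "HR", 3, "N"),
        JobRow(3, "FIN", 16, "Y"),
    ]
    failed_jobs, broken_jobs = summarize(sample)
    print("failed:", len(failed_jobs), "broken:", len(broken_jobs))
    for row in broken_jobs:
        # Run as the owner of the job, as advised above.
        print(row.schema_user, run_statement(row.job))
```

The generated statement would be executed in a session logged in as the owner of the job, for example through SQL*Plus or a database driver of your choice. The sketch only builds the text and does not connect to a database.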