eG Monitoring
 

Measures reported by OraDeadKilProcsTest

If one or more sessions or processes on the Oracle server are obstructing the execution of other sessions/processes, it is quite natural for administrators to want to kill the blocking sessions/processes to ensure the smooth execution of critical database transactions. Typically, these ‘dead’ sessions/processes continue to consume resources until the PMON process automatically cleans them up. If cleanup is delayed, the Oracle instance cannot release the objects and resources locked by the dead sessions/processes, possibly for long periods. In such situations, administrators often resort to killing these dead sessions/processes at the operating system level to hasten the release of valuable resources. Before attempting the OS-level kill, administrators should first determine which sessions/processes are currently ‘dead’ and how long they have been ‘dead’. This can be ascertained using the OraDeadKilProcsTest test.

This test auto-discovers the dead processes/sessions and reports the current cleanup state of each. In addition, the test reveals how long each process/session has remained dead and the count of processes that are being blocked by it. This way, administrators can determine whether cleanup is occurring on schedule and, if not, how badly the delay is affecting other processes. Administrators can also judge whether an OS-level process kill is justified.

Output of the test: One set of results for each deadprocessaddress_deadsessionaddress pair on the Oracle instance monitored.

The measures made by this test are as follows:

Measurement | Description | Measurement Unit | Interpretation
Process_state | Indicates the current cleanup state of this process. | | The values that this measure can report and their corresponding numeric values are listed below:

Measure Value | Description | Numeric Value
unsafe to attempt | Occurs for a killed session that has not been moved, so no cleanup can occur on it yet | 1
cleanup pending | Occurs for a dead process / killed session that can be cleaned up, but PMON has not yet made an attempt | 2
resources freed | Occurs for a dead process / killed session where all children have been freed, but the process / killed session itself is not yet freed | 3
resources freed – pending ack | Occurs for a killed session where all children have been freed, but the session itself cannot be freed until the owner has acknowledged it | 4
partial cleanup | Occurs if some of the children have been cleaned up | 5

Note:

By default, this measure reports the above-mentioned Measure Values to indicate the current cleanup state of a dead process. In the graph of this measure, however, these states are represented using their numeric equivalents only.
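
For scripts that post-process this measure, the mapping between Measure Values and numeric values can be captured in a small lookup. The following Python sketch is illustrative only: the names PROCESS_STATE_VALUES and state_to_numeric are hypothetical and are not part of eG or Oracle, and the normalization of case, spacing, and dash characters is an assumption about how the state text might arrive.

```python
# Numeric equivalents of the Process_state measure values, as documented above.
PROCESS_STATE_VALUES = {
    "unsafe to attempt": 1,
    "cleanup pending": 2,
    "resources freed": 3,
    "resources freed - pending ack": 4,
    "partial cleanup": 5,
}


def state_to_numeric(state: str) -> int:
    """Return the numeric value graphed for a Process_state measure value.

    Assumption: the incoming text may differ in case, spacing, or dash style,
    so it is normalized before the lookup.
    """
    normalized = " ".join(state.replace("\u2013", "-").lower().split())
    try:
        return PROCESS_STATE_VALUES[normalized]
    except KeyError:
        raise ValueError(f"unknown Process_state value: {state!r}") from None
```

An unrecognized state raises ValueError rather than returning a default, so that a new or misspelled state is noticed instead of being silently graphed as a valid one.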

Dead_time | Indicates how long it has been since this process was marked dead or this session was marked killed. | Secs | A consistent increase in the value of this measure is a cause for concern, as it indicates that auto-cleanup has not occurred. This can cause the dead process/session to continue consuming resources and blocking objects, thereby degrading server performance.
Num_blocked | Indicates the count of processes that are blocked by this process. | Number | A high value indicates that the dead process is impeding the execution of many other processes, some of which may be mission-critical.

If the Dead_time of such a process is also very high, it is a matter of great concern and must be looked into immediately.

In such circumstances, you may want to consider killing the process at the OS level. On a Unix system, you can issue the kill -9 <PID> command at the shell prompt to kill the process at that level.
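
The interpretation above combines two signals: a Dead_time value that keeps rising and a high Num_blocked value. A minimal Python sketch of that triage logic follows. The thresholds, the function names, and the idea of passing in a list of successive Dead_time samples are all assumptions made for illustration; eG does not prescribe them. The force_kill helper uses Python's standard os.kill with signal.SIGKILL, which sends the same signal as kill -9 on Unix.

```python
import os
import signal
from typing import Sequence

# Hypothetical thresholds; tune them for your environment.
DEAD_TIME_LIMIT_SECS = 300
NUM_BLOCKED_LIMIT = 5


def is_consistently_increasing(samples: Sequence[float]) -> bool:
    """True if each Dead_time sample is larger than the one before it."""
    return len(samples) >= 2 and all(b > a for a, b in zip(samples, samples[1:]))


def kill_is_justified(dead_time_samples: Sequence[float], num_blocked: int) -> bool:
    """Flag a dead process for an OS-level kill when cleanup appears stalled
    (Dead_time keeps rising and has reached the limit) and many processes
    are blocked by it."""
    if not dead_time_samples:
        return False
    stalled = (
        is_consistently_increasing(dead_time_samples)
        and dead_time_samples[-1] >= DEAD_TIME_LIMIT_SECS
    )
    return stalled and num_blocked >= NUM_BLOCKED_LIMIT


def force_kill(pid: int) -> None:
    """Send SIGKILL to the given OS process ID (Unix only)."""
    os.kill(pid, signal.SIGKILL)
```

Under these assumed thresholds, kill_is_justified([120, 240, 360], 8) returns True, while kill_is_justified([120, 240, 360], 1) returns False because few processes are blocked. The sketch assumes the caller has already resolved the operating system PID of the dead process; that lookup is outside its scope.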