eG Monitoring
 

Measures reported by PitbossTest

Pitboss is a watchdog daemon that controls the processes on a NetScaler appliance. If pitboss detects a failing process, it tries to restart it. If pitboss fails to restart the process in 5 attempts, then on the sixth failure the NetScaler undergoes a full reboot. If processes fail often enough to force frequent process restarts and appliance reboots, the performance of the NetScaler appliance degrades with each reboot. To detect such failures early and identify the processes that are failing, administrators can use the PitbossTest test.
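The restart policy described above can be illustrated with a small Python sketch. This is a toy model only, not pitboss itself: the class name WatchdogSketch, the method on_failure, the action strings, and the process name example_proc are all hypothetical, and the only fact taken from the text is the limit of 5 restart attempts followed by a reboot on the sixth failure.

```python
from dataclasses import dataclass, field
from typing import Dict

# From the text: up to 5 restart attempts, then a full reboot on the sixth failure.
MAX_RESTART_ATTEMPTS = 5


@dataclass
class WatchdogSketch:
    """Toy model of the restart policy (hypothetical, not the real pitboss)."""

    # Maps a process name to the number of failures seen so far.
    failures: Dict[str, int] = field(default_factory=dict)

    def on_failure(self, process: str) -> str:
        """Record one failure and return the action this model would take."""
        count = self.failures.get(process, 0) + 1
        self.failures[process] = count
        if count > MAX_RESTART_ATTEMPTS:
            return "reboot_appliance"
        return "restart_process"


watchdog = WatchdogSketch()
# Simulate six consecutive failures of one (hypothetical) process.
actions = [watchdog.on_failure("example_proc") for _ in range(6)]
print(actions)
```

In this model the first five failures each yield a process restart and the sixth yields an appliance reboot, which is the situation the test is meant to surface early.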

Using this test, administrators can determine the number of times a pitboss watch was added to a process and deleted from a process. In addition, this test reports the number of times a process reached the maximum number of restarts, thus causing pitboss to restart the NetScaler appliance.

For this test to run and report metrics, the NetScaler device should be configured to write a Syslog file to a remote Syslog server, where the details of all interactions with the NetScaler appliance will be logged. To know how to configure the Syslog server where this Syslog file should be created, click here.

Outputs of the test: One set of results for the NetScaler appliance being monitored

The measures made by this test are as follows:

Measurement | Description | Measurement Unit | Interpretation
pb_watch_added | Indicates the number of times pitboss watch was added on a process with the process ID. | Number | A high value for this measure is a cause of concern as this may indicate that a large number of processes running on the NetScaler appliance are failing.
pb_watch_deleted | Indicates the number of times pitboss watch was deleted from a process with the process ID. | Number |
pb_system_restarts | Indicates the number of times the process with pid had reached the maximum number of restarts thus leading to system reboot. | Number | Ideally, the value of this measure should be zero.
pb_process_restarts | Indicates the number of times the process with pid had reached the maximum number of restarts thus leading to process reboot. | Number | Ideally, the value of this measure should be zero.
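
The interpretation guidance for these measures can be expressed as a simple check. The following Python sketch is illustrative only: the function interpret, the threshold WATCH_ADDED_THRESHOLD, and the sample values are assumptions made for this example, since the test description does not state what counts as a "high" value for pb_watch_added.

```python
from typing import Dict, List

# Hypothetical threshold: the documentation only says "a high value",
# so this number is an assumption to be tuned per environment.
WATCH_ADDED_THRESHOLD = 10


def interpret(measures: Dict[str, int]) -> List[str]:
    """Return findings based on the interpretation guidance for each measure."""
    findings = []
    if measures.get("pb_watch_added", 0) > WATCH_ADDED_THRESHOLD:
        findings.append(
            "pb_watch_added is high: many processes may be failing"
        )
    # Both restart measures should ideally be zero.
    for name in ("pb_system_restarts", "pb_process_restarts"):
        if measures.get(name, 0) > 0:
            findings.append(f"{name} is non-zero: ideally it should be zero")
    return findings


# Illustrative values only, not real readings from an appliance.
sample = {
    "pb_watch_added": 25,
    "pb_watch_deleted": 3,
    "pb_system_restarts": 0,
    "pb_process_restarts": 2,
}
for finding in interpret(sample):
    print(finding)
```

No rule is applied to pb_watch_deleted here because the measures list gives no interpretation for it.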