eG Monitoring
 

Measures reported by WinNewSvcTest

Many server applications in Windows environments run as background services. For each running service, there may be one or more associated processes. In some Windows environments, a service may suddenly become unresponsive, causing the corresponding application to stop temporarily. This may happen because one or more processes associated with the service have become unresponsive or are consuming excessive physical resources. To identify the exact service/process combination that is unresponsive, use the Windows Service Details test.

This test checks the availability of the service that corresponds to an application and reports, for each service, the number of running processes, the thread and handle counts, the rate at which I/O data is read and written, and the CPU and memory utilization. Using this test, administrators can easily identify process overheads and investigate their cause. By proactively monitoring these processes, administrators are better placed to prevent the applications in their environment from becoming unresponsive.
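As an illustration of the per-service roll-up described above, the following is a minimal sketch, not the way eG collects these values. It assumes two snapshots of per-process counters taken a known number of seconds apart; the `ProcSample` structure, the `summarize` function, and all field names are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class ProcSample:
    """One process observed at a sampling instant (hypothetical structure)."""
    service: str
    pid: int
    handles: int
    threads: int
    read_bytes: int   # cumulative bytes read by the process
    write_bytes: int  # cumulative bytes written by the process


def summarize(prev, curr, interval_sec):
    """Roll per-process samples up into per-service totals and I/O rates."""
    prev_by_pid = {p.pid: p for p in prev}
    out = defaultdict(lambda: {"procs": 0, "handles": 0, "threads": 0,
                               "read_kb_sec": 0.0, "write_kb_sec": 0.0})
    for p in curr:
        s = out[p.service]
        s["procs"] += 1
        s["handles"] += p.handles
        s["threads"] += p.threads
        old = prev_by_pid.get(p.pid)
        if old is not None:
            # Rate = change in the cumulative counter over the interval, in KB/sec.
            s["read_kb_sec"] += (p.read_bytes - old.read_bytes) / 1024 / interval_sec
            s["write_kb_sec"] += (p.write_bytes - old.write_bytes) / 1024 / interval_sec
    return dict(out)
```

Processes that appear only in the second snapshot contribute to the counts but not to the rates, because no earlier counter is available to compare against. The sketch also assumes that a process ID is not reused between the two snapshots.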

Outputs of the test: One set of results for the target Cassandra Database node being monitored.

The measures made by this test are as follows:

| Measurement | Description | Measurement Unit | Interpretation |
|---|---|---|---|
| Cpu_util | Indicates the percentage of CPU utilized by this service. | Percent | |
| Memory_util | Indicates the percentage of memory utilized by this service. | Percent | |
| Handle_count | Indicates the number of handles opened by the processes of this service. | Number | By closely tracking the handle usage of these processes over time, you can identify potential handle leaks. |
| IO_data_oper_rate | Indicates the rate at which I/O operations were performed by all the processes to read the data for this service. | Operations/sec | |
| IO_data_rate | Indicates the rate at which data was read by all the processes running for this service. | KB/sec | |
| IO_read_data_rate | Indicates the rate of I/O reads done by all the processes of this service. | KB/sec | Comparing the values of these I/O measures across services helps identify the service that is running the most I/O-intensive processes. |
| IO_write_data_rate | Indicates the rate of I/O writes done by all the processes of this service. | KB/sec | |
| No_of_threads | Indicates the number of threads that are currently open for this service. | Number | |
| Num_procs_running | Indicates the number of processes that are currently running for this service. | Number | The detailed diagnosis of this measure lists the names of the processes associated with the service and the status of each process. |
| Page_fault_rate | Indicates the rate at which page faults are happening for this service. | Faults/sec | Page faults occur in the threads executing in a process. A page fault occurs when a thread refers to a virtual memory page that is not in its working set in main memory. If the page is on the standby list and hence already in main memory, or if the page is in use by another process with which the page is shared, the page fault will not cause the page to be fetched from disk. Excessive page faults could result in decreased performance. Compare values across services to determine which service is causing the most page faults. |
| Virtual_memory_used | Indicates the amount of virtual memory utilized by this service. | MB | Comparing the value of this measure across services reveals the service that is draining the virtual memory space. |
| Availability | Indicates the availability of this service. | Percent | A value of 100 indicates that the specified service has been configured and is currently executing. A value of 0 indicates that the specified service has been configured on the server but is not running at this time. A value of -1 indicates that the service has not been configured on the target system. |
| Service_state | Indicates the current state of this service. | | The values that this measure can report and their corresponding numeric values are listed in the table below: |

| Numeric Value | Measure Value |
|---|---|
| 1 | Running |
| 2 | StartPending |
| 3 | Stopped |
| 4 | Stop Pending |
| 5 | Paused |
| 6 | Pause Pending |

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the service state. However, in the graph of this measure, the service state is represented using the numeric equivalents only.
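When reading the numeric values from a graph or an export, a small lookup can translate them back into the labels documented above. The following is a minimal sketch: the names `SERVICE_STATES`, `describe_state`, and `describe_availability` are illustrative and are not part of any eG API. The labels are copied as they appear in the state table.

```python
# Numeric value -> Measure Value, as listed for the Service_state measure.
SERVICE_STATES = {
    1: "Running",
    2: "StartPending",
    3: "Stopped",
    4: "Stop Pending",
    5: "Paused",
    6: "Pause Pending",
}


def describe_state(value):
    """Return the documented label for a Service_state numeric value."""
    return SERVICE_STATES.get(value, f"Unknown ({value})")


def describe_availability(value):
    """Interpret the Availability measure (100, 0, or -1)."""
    if value == 100:
        return "configured and currently executing"
    if value == 0:
        return "configured but not running"
    if value == -1:
        return "not configured on the target system"
    return f"unexpected value ({value})"
```

For example, `describe_state(3)` returns `"Stopped"`, and `describe_availability(0)` returns `"configured but not running"`.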