eG Monitor
 

Measures reported by FCCluSerAppTest

A variety of different services or applications can be configured for high availability in a failover cluster. While some services/applications are cluster-aware - i.e., are applications that function in a co-ordinated way with other cluster components - some others are cluster-unaware - i.e., are applications that do not interact with the cluster at all.

The cluster-aware applications that administrators can choose from when configuring high availability are as follows:

  • DFS Namespace Server: Provides a virtual view of shared folders in an organization. When a user views the namespace, the folders appear to reside on a single hard disk. Users can navigate the namespace without needing to know the server names or shared folders that are hosting the data.

  • DHCP Server: Automatically provides client computers and other TCP/IP-based network devices with valid IP addresses.

  • Distributed Transaction Coordinator (DTC): Supports distributed applications that perform transactions. A transaction is a set of related tasks, such as updates to databases, that either succeed or fail as a unit.

  • File Server: Provides a central location on your network where you can store and share files with users.

  • Internet Storage Name Service (iSNS) Server: Provides a directory of iSCSI targets.

  • Message Queuing: Enables distributed applications that are running at different times to communicate across heterogeneous networks and with computers that may be offline.

  • Other Server: Provides a client access point and storage only.

  • Print Server: Manages a queue of print jobs for a shared printer.

  • Remote Desktop Connection Broker (formerly TS Session Broker): Supports session load balancing and session reconnection in a load-balanced remote desktop server farm. RD Connection Broker is also used to provide users access to RemoteApp programs and virtual desktops through RemoteApp and Desktop Connection.

  • Virtual Machine: Runs on a physical computer as a virtualized computer system. Multiple virtual machines can run on one computer.

  • WINS Server: Enables users to access resources by a NetBIOS name instead of requiring them to use IP addresses that are difficult to recognize and remember.

To configure high-availability for services/applications that are cluster-unaware, administrators can use the Generic Application, Generic Script, and Generic Service options.

When configuring fail-over for a service/application, you need to assign an IP address to that service/application. You can also add storage to a clustered service/application, or even associate additional resources with the service/application.

When a service/application fails over, administrators may need to know which cluster node that service/application has switched to. Likewise, administrators will also need to know if fail-over was unsuccessful for a service/application, and if so, why. Is it because the cluster disk used by the service/application has run out of space? Is it because the IP address of the service/application conflicts with another IP address in the environment? Is it because the service/application has been deliberately stopped or taken offline? The FCCluSerAppTest test provides administrators with answers to all these questions.

For each service/application that has been configured for high availability, this test reports the current state of that service/application, thus enabling administrators to figure out whether fail-over was successful. The test additionally reports the IP state and server state of each service/application and tracks the space usage in the storage mapped to it, thus pointing administrators to the probable cause of service failures. The resources added to every service/application and the current state of those resources are also revealed, so that administrators can determine whether the offline state of a resource is causing the dependent service/application to fail.
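The diagnostic reasoning described above can be sketched in a few lines of Python. This is only an illustration, not part of the eG product: the `ServiceSample` container, the `probable_causes` function, and the 10% free-space threshold are all hypothetical, and the numeric state values are taken from the state tables documented later on this page.

```python
from dataclasses import dataclass
from typing import List

# Numeric state values, as listed in the state tables on this page.
FAILED = 0
OFFLINE = 20


@dataclass
class ServiceSample:
    """One measurement of a clustered service/application (hypothetical container)."""
    state: int            # State measure, numeric form
    ser_state: int        # Ser_state measure, numeric form
    ip_state: int         # Ip_state measure, numeric form
    owner_changed: bool   # Owner_change measure (Yes -> True, No -> False)
    tot_fre_per: float    # Tot_fre_per measure, in percent
    resource_failed: int  # Resource_failed measure


def probable_causes(s: ServiceSample, min_free_pct: float = 10.0) -> List[str]:
    """Return hints worth checking when a service/application is in the Failed state.

    min_free_pct is an assumed threshold chosen for illustration only.
    """
    if s.state != FAILED:
        return []
    hints = []
    if not s.owner_changed:
        hints.append("owner unchanged: fail-over has not occurred")
    if s.ser_state in (FAILED, OFFLINE):
        hints.append("server (DNS name) is offline or failed")
    if s.ip_state in (FAILED, OFFLINE):
        hints.append("IP address is offline or failed")
    if s.tot_fre_per < min_free_pct:
        hints.append("cluster disks are low on free space")
    if s.resource_failed > 0:
        hints.append("one or more associated resources have failed")
    return hints
```

A caller would build one `ServiceSample` per service/application from the latest measurements and review the returned hints alongside the detailed diagnosis of the test.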

The measures made by this test are as follows:

Measurement | Description | Measurement Unit | Interpretation
State | Indicates the current state of this service/application. | | The values that this measure can report and the states they indicate are listed in the table below:

State             Measure Value
Online            100
Partially online  95
Online pending    90
Inherited         80
Initializing      70
Pending           60
Offline Pending   50
Unknown           40
Offline           20
Failed            0

Note:

By default, this measure reports the States listed in the table above to indicate the current state of this service/application. The graph of this measure, however, represents these states using the numeric equivalents (Measure Values) listed in the table above.

If this measure reports the value Failed for a service/application, it is a clear indicator that the service/application could not be failed over. In such a situation, you can check the value of the Ser_state, Ip_state, Resource_failed, and Tot_fre_per measures of that service/application to know what could have caused it to fail.

For further diagnosis, you can also use the detailed diagnosis reported by this test, which reveals the resources associated with the service/application and the current state of each resource.

Ser_state | Indicates the current state of the server created in the cluster for this service/application. | | When using the Failover Cluster Manager to configure high availability for a service/application, you are required to provide a fully qualified DNS name for the service/application being configured and assign an IP address to it. This measure reports the current state of that DNS name. To know which name was assigned to the service/application, use the detailed diagnosis of this measure.

The values that this measure can report and the states they indicate have been listed in the table below:

State            Measure Value
Online           100
Online pending   90
Inherited        80
Initializing     70
Pending          60
Offline Pending  50
Unknown          40
Offline          20
Failed           0


Note:

By default, this measure reports the States listed in the table above to indicate the current state of the server created in the cluster for this service/application. The graph of this measure, however, represents these states using the numeric equivalents (Measure Values) listed in the table above.
Ip_state | Indicates the current status of the IP address assigned to this service/application. | | The values that this measure can report and the states they indicate are listed in the table below:

State            Measure Value
Online           100
Online pending   90
Inherited        80
Initializing     70
Pending          60
Offline pending  50
Unknown          40
Offline          20
Failed           0


Note:

By default, this measure reports the States listed in the table above to indicate the current status of the IP address assigned to this service/application. The graph of this measure, however, represents these states using the numeric equivalents (Measure Values) listed in the table above.
Owner_change | Indicates whether or not the owner of this service/application has changed since the last measurement period. | | The values that this measure can report and their corresponding numeric values are listed in the table below:

Measure Value  Numeric Value
No             0
Yes            1


Note:

By default, this measure reports the Measure Values listed in the table above to indicate whether or not the owner of this service/application has changed. The graph of this measure, however, represents the same using the numeric equivalents listed in the table above.

If this measure reports the value No for a service/application, and the State measure of that service/application reports the value Failed, then it clearly indicates that fail-over has not occurred for that service/application.

To know which node currently owns the service/application, use the detailed diagnosis of this measure.
Tot_space | Indicates the total capacity of all cluster disks mapped to this service/application. | MB | Use the detailed diagnosis of this measure to know which cluster disks are attached to a service/application, the current status of the disks, and the usage of each disk.
Tot_free | Indicates the total amount of free space in all cluster disks mapped to this service/application. | MB | Ideally, the value of this measure should be high.
Tot_fre_per | Indicates the percentage of space that is free in the cluster disks mapped to this service/application. | Percent | Ideally, the value of this measure should be high. Compare the value of this measure across services/applications to know which service/application has the least free space. You may want to free up space in the cluster disks mapped to that service/application, so as to prevent service/application failure owing to lack of space.
Othr_res | Indicates the number of other resources that are online in this service/application. | Number | Use the detailed diagnosis of this measure to know the name, type, and owner of all the resources associated with a service/application.
Resource_online | Indicates the number of resources associated with this service/application that are currently online. | Number | Use the detailed diagnosis of this measure to know the name, type, state, and owner of the online resources associated with a service/application.
Resource_offline | Indicates the number of resources associated with this service/application that are currently offline. | Number | Use the detailed diagnosis of this measure to know the name, type, state, and owner of the offline resources associated with a service/application.
Resource_failed | Indicates the number of resources associated with this service/application that have currently failed. | Number | Ideally, the value of this measure should be 0. If this measure reports a non-zero value, you can use the detailed diagnosis of this measure to know the name, type, state, and owner of each of the failed resources associated with a service/application.
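
When reading graphs or exported data, it can help to translate the numeric values back into state names and to cross-check the free-space percentage. The Python sketch below does both. It is a minimal illustration only: the helper names `state_name` and `free_percent` are hypothetical, the mapping is copied from the State table on this page, and the formula assumes that Tot_fre_per is simply Tot_free expressed as a percentage of Tot_space.

```python
from typing import Optional

# State names and numeric equivalents, copied from the State table on this page.
STATE_VALUES = {
    "Online": 100,
    "Partially online": 95,
    "Online pending": 90,
    "Inherited": 80,
    "Initializing": 70,
    "Pending": 60,
    "Offline Pending": 50,
    "Unknown": 40,
    "Offline": 20,
    "Failed": 0,
}

# Reverse lookup: numeric graph value -> state name.
VALUE_TO_STATE = {value: name for name, value in STATE_VALUES.items()}


def state_name(value: float) -> str:
    """Translate a numeric graph value into its state name."""
    return VALUE_TO_STATE.get(int(value), "Unrecognized")


def free_percent(tot_space_mb: float, tot_free_mb: float) -> Optional[float]:
    """Free space as a percentage of capacity (assumed formula for Tot_fre_per)."""
    if tot_space_mb <= 0:
        return None  # no cluster disk is mapped, so a percentage is undefined
    return 100.0 * tot_free_mb / tot_space_mb
```

For example, `state_name(95)` yields "Partially online", and `free_percent(2000, 500)` yields 25.0 for a service/application with 2000 MB of capacity and 500 MB free.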