eG Monitoring
 

Measures reported by UCSCsPSUsTest

A Cisco UCS Blade Server Chassis can be fitted with up to four 2500 Watt hot-swappable power supplies.

Because issues in the power supply units (PSUs) can adversely impact the performance of the blades in a chassis, administrators need to detect power-related issues promptly and rectify them before any irreparable damage is done. This test aids in the timely detection of the following anomalies related to PSUs:

  • Abnormalities in the overall PSU health;

  • Operational deficiencies;

  • Critical performance setbacks;

  • Unrecoverable power/thermal/voltage failures;

  • Disturbing rise in temperature;

  • Input/output voltage, current, and power that exceed permissible limits.

The measures made by this test are as follows:

Measurement | Description | Measurement Unit | Interpretation
OperState | Indicates the overall status of this PSU present in this chassis. | - | This measure reports the status of the PSUs and their numeric equivalents as shown in the table:

Numeric Value State
0 Unknown
1 Operable
2 Inoperable
3 Degraded
4 Powered-off
5 Power-problem
6 Removed
7 Voltage-problem
8 Thermal-problem
9 Performance-problem
10 Accessibility-problem
11 Identity-unestablishable
12 Bios-post-timeout
13 Disabled
51 Fabric-conn-problem
52 Fabric-unsupported-conn
81 Config
82 Equipment-problem
83 Decommissioning
84 Chassis-limit-exceeded
101 Discovery
102 Discovery-failed
103 Identify
104 Post-failure
105 Upgrade-problem
106 Peer-comm-problem
107 Auto-upgrade

Note:

By default, this measure reports the above-mentioned States while indicating the status of the PSUs in this chassis. However, the graph of this measure will be represented using the corresponding numeric equivalents of the states as mentioned in the table above.

The detailed diagnosis of this measure provides the Time, ID, PID, Revision, Serial Number and Vendor attributes for the PSUs in this chassis.
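Because graphs plot this measure using numeric equivalents, anyone post-processing those values outside the console has to translate them back into state names. The following is a minimal Python sketch of such a lookup, built only from the table above (the Operability measure uses the same codes). The names `OPER_STATES`, `describe_oper_state` and `is_healthy` are illustrative and are not part of any eG or Cisco API, and treating only Operable as healthy is an assumption made for this example.

```python
# Numeric equivalents of the OperState states, copied from the table above.
OPER_STATES = {
    0: "Unknown", 1: "Operable", 2: "Inoperable", 3: "Degraded",
    4: "Powered-off", 5: "Power-problem", 6: "Removed",
    7: "Voltage-problem", 8: "Thermal-problem", 9: "Performance-problem",
    10: "Accessibility-problem", 11: "Identity-unestablishable",
    12: "Bios-post-timeout", 13: "Disabled",
    51: "Fabric-conn-problem", 52: "Fabric-unsupported-conn",
    81: "Config", 82: "Equipment-problem", 83: "Decommissioning",
    84: "Chassis-limit-exceeded",
    101: "Discovery", 102: "Discovery-failed", 103: "Identify",
    104: "Post-failure", 105: "Upgrade-problem", 106: "Peer-comm-problem",
    107: "Auto-upgrade",
}


def describe_oper_state(value: int) -> str:
    """Return the state name for a numeric value, or a placeholder if unlisted."""
    return OPER_STATES.get(value, f"Unlisted ({value})")


def is_healthy(value: int) -> bool:
    """Assumption for this sketch: only 'Operable' (1) counts as healthy."""
    return value == 1
```

Note that the codes are not contiguous (13 is followed by 51, 84 by 101), so a dictionary lookup is safer here than indexing into a list.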
Operability | Indicates the operating state of this PSU present in this chassis. | - | This measure reports the operating state of the PSUs and their numeric equivalents as shown in the table:

Numeric Value State
0 Unknown
1 Operable
2 Inoperable
3 Degraded
4 Powered-off
5 Power-problem
6 Removed
7 Voltage-problem
8 Thermal-problem
9 Performance-problem
10 Accessibility-problem
11 Identity-unestablishable
12 Bios-post-timeout
13 Disabled
51 Fabric-conn-problem
52 Fabric-unsupported-conn
81 Config
82 Equipment-problem
83 Decommissioning
84 Chassis-limit-exceeded
101 Discovery
102 Discovery-failed
103 Identify
104 Post-failure
105 Upgrade-problem
106 Peer-comm-problem
107 Auto-upgrade

Note:

By default, this measure reports the above-mentioned States while indicating the operating state of the PSUs in this chassis. However, the graph of this measure will be represented using the corresponding numeric equivalents of the states as mentioned in the table above.
Performance | Indicates the current performance status of this PSU present in this chassis. | - | This measure reports the current performance status of the PSUs and their numeric equivalents as shown in the table:

Numeric Value State
0 Unknown
1 Ok
2 Upper-non-recoverable
3 Upper-critical
4 Upper-non-critical
5 Lower-non-critical
6 Lower-critical
7 Lower-non-recoverable

Note:

By default, this measure reports the above-mentioned States while indicating the performance status of the PSUs in this chassis. However, the graph of this measure will be represented using the corresponding numeric equivalents of the states as mentioned in the table above.
Power | Indicates the power status of this PSU present in this chassis. | - | This measure reports the power status of the PSUs and their numeric equivalents as shown in the table:

Numeric Value State
0 Unknown
1 On
2 Test
3 Off
4 Online
5 Offline
6 Offduty
7 Degraded
8 Power-save
9 Error

Note:

By default, this measure reports the above-mentioned States while indicating the power status of the PSUs in this chassis. However, the graph of this measure will be represented using the corresponding numeric equivalents of the states, i.e., 0 to 9, as mentioned in the table above.
Presence | Indicates the current state of this PSU present in this chassis. | - | This measure reports the current state of the PSUs and their numeric equivalents as shown in the table:

Numeric Value State
0 Unknown
1 Empty
10 Equipped
11 Missing
12 Mismatch
13 Equipped-not-primary
20 Equipped-identity-unestablishable
30 Inaccessible
40 Unauthorized

Note:

By default, this measure reports the above-mentioned States while indicating the current state of the PSUs in this chassis. However, the graph of this measure will be represented using the corresponding numeric equivalents of the states as mentioned in the table above.
ThermalState | Indicates the thermal state of this PSU present in this chassis. | - | This measure reports the thermal state of the PSUs and their numeric equivalents as shown in the table:

Numeric Value State
0 Unknown
1 Ok
2 Upper-non-recoverable
3 Upper-critical
4 Upper-non-critical
5 Lower-non-critical
6 Lower-critical
7 Lower-non-recoverable

Note:

By default, this measure reports the above-mentioned States while indicating the thermal state of the PSUs in this chassis. However, the graph of this measure will be represented using the corresponding numeric equivalents of the states as mentioned in the table above.
Voltage | Indicates the voltage state of this PSU present in this chassis. | - | This measure reports the voltage state of the PSUs and their numeric equivalents as shown in the table:

Numeric Value State
0 Unknown
1 Ok
2 Upper-non-recoverable
3 Upper-critical
4 Upper-non-critical
5 Lower-non-critical
6 Lower-critical
7 Lower-non-recoverable

Note:

By default, this measure reports the above-mentioned States while indicating the voltage state of the PSUs in this chassis. However, the graph of this measure will be represented using the corresponding numeric equivalents of the states as mentioned in the table above.
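The Performance, ThermalState and Voltage measures share the same eight numeric equivalents, and the numbers do not rise with severity: 2 (Upper-non-recoverable) is more serious than 4 (Upper-non-critical). A script that consumes the graphed values therefore should not compare them numerically. The sketch below is one possible way to classify them in Python; `ThresholdState` and `severity` are hypothetical names, and the severity labels returned are an assumption of this example rather than anything defined by the test.

```python
from enum import IntEnum


class ThresholdState(IntEnum):
    """Numeric equivalents shared by Performance, ThermalState and Voltage."""
    UNKNOWN = 0
    OK = 1
    UPPER_NON_RECOVERABLE = 2
    UPPER_CRITICAL = 3
    UPPER_NON_CRITICAL = 4
    LOWER_NON_CRITICAL = 5
    LOWER_CRITICAL = 6
    LOWER_NON_RECOVERABLE = 7


def severity(value: int) -> str:
    """Classify a numeric state by the severity suffix of its name.

    The labels returned here are an assumption made for this sketch.
    """
    state = ThresholdState(value)  # raises ValueError for values outside 0-7
    if state is ThresholdState.OK:
        return "ok"
    if state is ThresholdState.UNKNOWN:
        return "unknown"
    if state.name.endswith("NON_RECOVERABLE"):
        return "non-recoverable"
    if state.name.endswith("NON_CRITICAL"):
        return "non-critical"
    return "critical"
```

For example, `severity(4)` classifies Upper-non-critical as `"non-critical"`, while `severity(2)` classifies Upper-non-recoverable as `"non-recoverable"` despite its smaller number.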
Ambient_temp | Indicates the internal temperature of the PSUs present in this chassis. | Celsius | A high temperature is undesirable, as it may severely damage the PSUs and thereby cause performance bottlenecks in the Cisco UCS chassis.
Input210v | Indicates the input voltage of the PSUs present in the chassis. | Volts | Any value higher than 210 volts could indicate a problem condition that may require further investigation.
Output12v | Indicates the 12 V output voltage of the PSUs present in the chassis. | Volts | Any value higher than 12 volts could indicate a problem condition that may require further investigation.
Output3v3 | Indicates the 3.3 V output voltage of the PSUs present in the chassis. | Volts | Any value higher than 3.3 volts could indicate a problem condition that may require further investigation.
Output_current | Indicates the output current of the PSUs present in the chassis. | Amps | Ideally, the value of this measure should be low. A sudden or consistent increase in this value could warrant an investigation.
Output_power | Indicates the output power of the PSUs present in the chassis. | Watts | Ideally, the value of this measure should be low. A sudden or consistent increase in this value could warrant an investigation.
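The voltage interpretations above amount to comparing each reading with a nominal limit (210, 12 and 3.3 volts). As a rough illustration only, the check could be sketched in Python as follows. The function name `over_limit`, the `readings` dictionary and the optional `tolerance` parameter are assumptions made for this example; they do not describe how the test itself evaluates thresholds.

```python
from typing import List, Mapping

# Nominal limits in volts, taken from the Interpretation column above.
NOMINAL_LIMITS = {"Input210v": 210.0, "Output12v": 12.0, "Output3v3": 3.3}


def over_limit(readings: Mapping[str, float], tolerance: float = 0.0) -> List[str]:
    """Return the measures whose reading exceeds limit * (1 + tolerance).

    'tolerance' is a hypothetical allowance (0.05 means 5%); it is not a
    setting of the test described in this document.
    """
    return [
        name
        for name, limit in NOMINAL_LIMITS.items()
        if name in readings and readings[name] > limit * (1 + tolerance)
    ]
```

A call such as `over_limit({"Input210v": 215.0, "Output12v": 12.0, "Output3v3": 3.2})` flags only `Input210v`, since a reading equal to the limit is not treated as exceeding it.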