eG Monitoring
 

Measures reported by MLXTempSensorTest

The operating temperature of a Mellanox switch is an important factor in its overall operability. To avoid a temperature-related system failure, the switch must always run within its permissible operating temperature range.

Monitoring the temperature of the switch is important because a temperature above the permitted limit can cause both temporary and permanent damage to the electronics. The temperature can also rise even when the fans are working properly, so fan status alone may not be enough to ensure that the switch is functioning properly. In such cases, administrators are alerted so that they can investigate further and fix the issue before switch operation is impacted.

This test reports temperature statistics for the switch. Using this test, administrators can track temperature changes and take corrective action before the heat harms other components or affects the functioning of the switch.

Outputs of the test: One set of results for each temperature sensor in the Mellanox switch.

Descriptor: Temperature Sensor

The measures made by this test are as follows:

Measurement: SensorValue
Description: Indicates the value reported by the temperature sensor, which denotes the current temperature of the switch.
Measurement Unit: Celsius
Interpretation: If the temperature exceeds the tolerance limit, the switch needs to be shut down to protect its circuits from heat damage.

Measurement: OperationalStatus
Description: Indicates the operational status of the temperature sensor, that is, whether the sensor is working or not.
Interpretation: The values reported by this measure and their numeric equivalents are listed in the table below:

Measure Value      Numeric Value
Ok                 1
Unavailable        2
Non operational    3

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the current status of the temperature sensor. The graph of this measure, however, represents the status of the sensor using the numeric equivalents only.
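
The following is a minimal Python sketch, not part of the eG product, of how a script that consumes these two measures might translate the numeric equivalents plotted on the graph back into the Measure Values from the table above, and flag a sensor that needs attention. The names OperationalStatus, describe_status and needs_attention are hypothetical, and the tolerance limit is assumed to be supplied by the caller, since this page does not state one.

```python
from enum import IntEnum


class OperationalStatus(IntEnum):
    """Numeric equivalents of the OperationalStatus measure (from the table above)."""
    OK = 1
    UNAVAILABLE = 2
    NON_OPERATIONAL = 3


# Display labels exactly as they appear in the Measure Value column.
STATUS_LABELS = {
    OperationalStatus.OK: "Ok",
    OperationalStatus.UNAVAILABLE: "Unavailable",
    OperationalStatus.NON_OPERATIONAL: "Non operational",
}


def describe_status(numeric_value: int) -> str:
    """Translate a numeric value shown on the graph into its Measure Value."""
    try:
        return STATUS_LABELS[OperationalStatus(numeric_value)]
    except ValueError:
        # Any value outside the documented table is reported as unknown.
        return f"Unknown ({numeric_value})"


def needs_attention(sensor_value_celsius: float,
                    status: int,
                    tolerance_limit_celsius: float) -> bool:
    """Return True if the sensor is not Ok or the temperature exceeds the limit.

    tolerance_limit_celsius is an assumption: a caller-supplied threshold,
    for example taken from the hardware specifications of the switch.
    """
    sensor_not_ok = status != OperationalStatus.OK
    too_hot = sensor_value_celsius > tolerance_limit_celsius
    return sensor_not_ok or too_hot
```

For example, describe_status(2) returns "Unavailable", and needs_attention returns True for any sensor whose status is not Ok, regardless of the temperature it reports.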