Hp-Ux HW Monitor
There are many ways to check ambient conditions such as temperature, and to create alarms based on changes in them.
Note that temperatures are measured on the airflow coming IN to the system rather than the airflow LEAVING
the system.
In practice, you do not need to know the exact temperature, since the EMS warnings are triggered when
action is needed to prevent the system from shutting down. If you are getting the notifications, then you
need to do something about cooling. The hardware knows which temperatures are acceptable (they may
vary between server types) and triggers the events when the limits are reached.
In any case, how you measure the temperature depends on the firmware and hardware. Here are
some of the possibilities for checking the current temperature values:
a) cstm can report it in some versions.
b) EMS will raise alerts whenever an exceptional condition is met. These can be captured by OVO
and many other tools, including the open-source Munin-node and Nagios.
An example:
# sfmconfig -a -l -t SystemTemp -d NULL
Caption DeviceID Status
HP_SystemTemperatureCollection NULL OK
d) Access via the console can provide some details (how much depends on the model and firmware).
Here is one from an rx2600:
MP:CM> ps
System Power state: On
Temperature : Normal
Here is a result from an HP Education HP-UX 11.31 server running the March 2010 Update:
# cprop -summary -c "Temperature"
[Component]: Temperature
[Table]: Temperature
-------------------------------------------------------
[Instance]: 1
****************************************************
[Hash ID]: Temperature:2796559075
[Status]: OK
[Sensor]: TempSensorInfo 1: Proc 0 ThermTrip
[Location]: CPU board
[Temp]:
[Threshold]:
****************************************************
[Instance]: 2
****************************************************
[Hash ID]: Temperature:2740717654
[Status]: OK
[Sensor]: TempSensorInfo 2: Proc 1 ThermTrip
[Location]: CPU board
[Temp]:
[Threshold]:
****************************************************
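A listing like this is easy to reduce to a single health figure for a monitoring script. The following is a minimal sketch, assuming the `[Status]:` line layout shown above. The function name `count_not_ok` is hypothetical, and the sample text is a shortened copy of the output above rather than a live reading.

```shell
#!/bin/sh
# count_not_ok (hypothetical helper): read "cprop -summary -c Temperature"
# output on stdin and print how many [Status] lines report anything but OK.
count_not_ok() {
    awk '
        $1 == "[Status]:" && $2 != "OK" { bad++ }
        END { print bad + 0 }'
}

# Shortened copy of the listing above, used here instead of a live system.
cprop_sample='[Instance]: 1
[Status]: OK
[Sensor]: TempSensorInfo 1: Proc 0 ThermTrip
[Instance]: 2
[Status]: OK
[Sensor]: TempSensorInfo 2: Proc 1 ThermTrip'

printf '%s\n' "$cprop_sample" | count_not_ok
```

On a real server the same function would be fed by piping the cprop command shown above into it. A non-zero count means at least one sensor needs a closer look.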
If "IPMI over LAN" is enabled in the MP configuration, you can do the following:
# ipmitool -I lan -H -P "" sdr type Temperature
Ambient Temp     | D8h | ok | 23.1 | 16 degrees C
Processor 0 Temp | D9h | ok |  3.1 | 53 degrees C
Processor 1 Temp | DAh | ns |  3.2 | Disabled
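Because the sensor listing is pipe-delimited, it can be turned into name/value pairs with a few lines of awk. This is only a sketch, assuming the five-column layout shown above. The function name `parse_sdr_temps` is hypothetical, and the sample is copied from the output above rather than read from a live MP.

```shell
#!/bin/sh
# parse_sdr_temps (hypothetical helper): read "sdr type Temperature" output
# on stdin and print "<sensor name>=<degrees C>" for each sensor that reports
# a numeric reading. Sensors without one (e.g. "Disabled") are skipped.
parse_sdr_temps() {
    awk -F'|' '
        NF >= 5 {
            name = $1
            reading = $5
            gsub(/^[ \t]+|[ \t]+$/, "", name)      # trim padding around the name
            gsub(/^[ \t]+|[ \t]+$/, "", reading)   # trim padding around the reading
            if (reading ~ /^-?[0-9]+ degrees C$/) {
                split(reading, parts, " ")
                print name "=" parts[1]
            }
        }'
}

# Sample copied from the listing above, used here instead of a live MP.
sdr_sample='Ambient Temp | D8h | ok | 23.1| 16 degrees C
Processor 0 Temp | D9h | ok | 3.1 | 53 degrees C
Processor 1 Temp | DAh | ns | 3.2 | Disabled'

printf '%s\n' "$sdr_sample" | parse_sdr_temps
```

With the sample above this prints `Ambient Temp=16` and `Processor 0 Temp=53`. The disabled sensor is dropped because it has no numeric reading.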
Finally, take a look at envd(1M). The envd daemon provides a means for the system to respond to
environmental conditions detected by hardware. Such responses are typically designed to maintain
file system integrity and prevent data loss.
envd works with two threshold levels for environmental temperature: critical and emergency
(OVERTEMP_CRIT and OVERTEMP_EMERG). By default, when the emergency threshold is reached,
envd issues a shutdown.
In general terms:
OVERTEMP_CRIT = over 30 degrees Celsius
OVERTEMP_EMERG = over 34 degrees Celsius
(the usual response to this is for envd to complete a graceful shutdown)
There is another event at about 40 degrees Celsius which will cause the platform monitor to simply
remove DC power from the system (a non-graceful halt), but unless temperatures go up VERY quickly
you should not reach that one.
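To make the three levels concrete, here is a small sketch that maps an ambient reading to the level it would fall into. It assumes the approximate 30/34/40 figures quoted above, which are general guidance rather than values read from any particular server. The function name `classify_ambient` and the labels NORMAL and POWER_OFF are hypothetical, and only whole degrees are handled.

```shell
#!/bin/sh
# classify_ambient (hypothetical helper): print the level that an ambient
# temperature, given in whole degrees Celsius, falls into. The limits are
# the approximate figures quoted in the text, not values from the hardware.
classify_ambient() {
    temp=$1
    if [ "$temp" -ge 40 ]; then
        echo POWER_OFF          # platform monitor removes DC power (assumed at 40 and up)
    elif [ "$temp" -gt 34 ]; then
        echo OVERTEMP_EMERG     # envd normally performs a graceful shutdown
    elif [ "$temp" -gt 30 ]; then
        echo OVERTEMP_CRIT
    else
        echo NORMAL
    fi
}

classify_ambient 16
classify_ambient 32
```

The two calls print NORMAL and OVERTEMP_CRIT. This is only useful for reasoning about readings taken by other means. The real decisions are made by the hardware and envd, not by a script.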
You cannot change any of these values, nor should you attempt to. HP has determined the
safe operating envelopes for the systems, and changing them would probably invalidate the warranty.