Lopsa Feb 2016

OOB Management and You!

Chris Layton
R&D Linux System
Engineering @ ORNL
laytoncc@ORNL.gov
linux@misterx.org
http(s)://misterx.org/LOPSA_Feb_2016.pdf

Terminology

OOB/LOM
IPMI
${WHAT_VENDORS_CALL_THEIR_STUFF}

What is it!
- Sometimes called lights-out management (LOM), out-of-band (OOB) management is a way to access
a system independently of the main hardware's operating system.
- Usually a device running Linux that serves some form of remote access based around the IPMI
standard.

IPMI
(Definition from Wikipedia)

Intelligent Platform Management Interface (IPMI) is
a set of computer interface specifications developed
from a patent (US 6,367,035 B1 by Adrian White)
for an autonomous computer subsystem that
provides management and monitoring capabilities
independently of the host system's CPU, firmware
(BIOS or UEFI) and operating system.
The current revision is IPMI v2.0. The main contributors to the
project are Intel (publisher), Hewlett Packard, NEC, and Dell.

What You Might See Vendors Call Their Version of It...

HP iLO

Dell - iDRAC

Supermicro SIM

Intel - AMT AKA Active Management Technology

https://fsf.org/blogs/community/active-management-technology

IBM RSA (remote supervisor adapter), IMM2

Cray - Intelligent Server Control Board (iSCB)

Why not just call it a Dell/HP/IBM/etc. IPMI device?!

Good question!

The answer is likely because they want a cool name for their own version.

And let's not forget the Extra Special Sauce! (aka paid features!)

Paid Feature Examples

Remote KVM (Dell)

Remote Config Backup

Group System Control (ILO Fed)

AD auth/Two Factor

Easier Virt-media controls

Console save and playback

Heads up: what you see on other systems or in internet
walkthroughs may not be free!

Wait ...did you say Special Sauce?!

Virtual Media Management

Has a few limitations but very useful

OOB Visibility to useful hardware information

(Dell) Intel ME (Management Engine), G13, Enterprise license

PCI-E Bus Bandwidth Usage (SystemBoardIOUsageStat)

Advanced Power/Heat details

Extended Asset Tags with detailed information

REST API with Redfish (HP iLO 4 / Gen 9 )

Two Factor Authentication

MORE

Or, at a more basic level, OOB devices are...

A small independent computer to help manage your server.
They are ready to go regardless of the state of
the main system.
They can be accessed in many different ways via
several different tools.

Tools

Tool (Very) Short List

ipmitool (open - has hooks to add features on Dells via the delloem flag)

Keep in mind the tool can only do so much. To get more functionality, learn and
use RAW commands!

ipmicmd (open - part of the http://openipmi.sourceforge.net/ toolkit)

ipmitool (Dell version)

freeipmi (open - multiple tools here)

bmc-config - useful to set basic security-related settings on a device.

ipmi-oem - adds lots of vendor-specific commands

And about 25 more...

racadm (Dell)

bmc (Dell and others)

hponcfg (HP)

ribcl (HP)

Ipmicfg (Supermicro)

.(insert one of the many others here..)
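
To make the RAW commands note above a little more concrete, here is a minimal sketch that decodes the reply to `ipmitool raw 0x06 0x01` (Get Device ID in the IPMI spec). The helper name `parse_device_id` is made up for this example, and the byte layout is my reading of the spec, so verify it against your own BMC before trusting it.

```bash
#!/bin/bash
# Sketch: decode the reply to `ipmitool raw 0x06 0x01` (Get Device ID).
# parse_device_id is a hypothetical helper name, not part of ipmitool.
parse_device_id() {
    # $1: hex bytes as printed by ipmitool raw, e.g. " 20 01 03 16 02 ..."
    set -- $1                             # split the reply into $1..$N
    local fw_major=$(( 0x$3 & 0x7f ))     # byte 3: major firmware rev (low 7 bits)
    local fw_minor=$4                     # byte 4: minor firmware rev, BCD encoded
    local ipmi_major=$(( 0x$5 & 0x0f ))   # byte 5: IPMI version, BCD, low nibble = major
    local ipmi_minor=$(( 0x$5 >> 4 ))     #         high nibble = minor
    local mfg_id=$(( 0x$9$8$7 ))          # bytes 7-9: manufacturer ID, LSB first
    local prod_id=$(( 0x${11}${10} ))     # bytes 10-11: product ID, LSB first
    echo "firmware=${fw_major}.${fw_minor} ipmi=${ipmi_major}.${ipmi_minor} manufacturer=${mfg_id} product=${prod_id}"
}
```

On a live system this would be called as `parse_device_id "$(ipmitool raw 0x06 0x01)"`. The same numbers are available from `ipmitool mc info`; the point is only to show what a raw reply looks like once you read it against the spec.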

What these tools can do...

power control (including Power/BTU tuning)

Set boot devices (one time or permanent)

Set/Save BIOS settings (OOB controller and system)

Re-install OS (via virtual media or setting to PXE)

Handle parts of an automated deployment

Alerting / Monitoring

Fencing / STONITH Node management

Inventory via asset tag or other fields

And more.

Remote access

Serial over LAN

Virtual Console/Serial

Web Interface

VNC

Access System Information/Stats (Via Agents)

Basic Environmental Monitoring

BMC (OOB) Standardization

Information on Custom system devices (like GPUs)

Power Management

Virtual Media

Update a system's BIOS and firmware

And more...

The ones I listed just scratch the surface of what you can do!

Wow, This Sounds Easy... It's Not

TOOLS CAN LIE , CHANGE, and MISDIRECT!

The Dell G13 issue with ipmitool

The version on the vendor site did not support G13 systems!

Varying support for the IPMI standard means tools
can mislead you about what they can do.

Firmware updates can change features/output.

MORE

Tool Usage Examples

Saving a BMC Config (local system)

# bmc-config -o &> /tmp/ipmi
# diff /tmp/ipmi /tmp/ipmi.new
282c282
<     IP_Address                     192.168.1.62
---
>     IP_Address                     216.37.64.99
288c288
<     Default_Gateway_IP_Address     0.0.0.0
---
>     Default_Gateway_IP_Address     216.37.64.1

# bmc-config --commit -n /tmp/ipmi

ipmitool delloem example

# ipmitool delloem powermonitor
Power Tracking Statistics
Statistic      : Cumulative Energy Consumption
Start Time     : Wed Sep 16 18:59:23 2015
Finish Time    : Tue Feb 2 15:22:40 2016
Reading        : 352.0 kWh

Statistic      : System Peak Power
Start Time     : Wed Sep 16 18:59:23 2015
Peak Time      : Wed Sep 16 20:30:16 2015
Peak Reading   : 492 W

Statistic      : System Peak Amperage
Start Time     : Wed Sep 16 18:59:23 2015
Peak Time      : Wed Nov 11 09:24:01 2015
Peak Reading   : 4.4 A

Right Tool For the Job Example

ipmitool power check

Dell 6320 (hint: wrong tool for the job)

Power is                        : on
Power drawn by this node        : unimplemented in tool for this platform
Power limit for this node       : UNIMPLEMENTED
Power drawn by whole chassis    : unimplemented in tool for this platform
Power limit for this chassis    : UNIMPLEMENTED
Chassis power limiting enabled  : UNIMPLEMENTED

Dell 6220

Power is                        : on
Power drawn by this node        : 123 Watts
Power limit for this node       : ---
Power drawn by whole chassis    : 381 Watts
Power limit for this chassis    : 0
Chassis power limiting enabled  : no
Chassis emergency throttling    : limit power via hardware

Right Tool (racadm)


#Avg.LastDay=146 W | 498 Btu/hr
#Avg.LastHour=146 W | 498 Btu/hr
#Avg.LastWeek=146 W | 498 Btu/hr
#Cap.ActivePolicy.BtuHr=N/A
#Cap.ActivePolicy.Name=N/A
#Cap.ActivePolicy.Watts=N/A
Cap.BtuHr=1102 btu/hr
Cap.Enable=Disabled
#Cap.MaxThreshold=323 W | 1104 Btu/hr
#Cap.MinThreshold=99 W | 337 Btu/hr
Cap.Percent=100
Cap.Watts=323 W
#EnergyConsumption=379.117 KWh | 1293926 Btu
#EnergyConsumption.Clear=******** (Write-Only)
#EnergyConsumption.StarttimeStamp=Fri May 08 05:25:26 2015
#Max.Amps=0.0 Amps
#Max.Amps.Timestamp=Wed Dec 31 18:00:00 1969
#Max.LastDay=182 W | 621 Btu/hr
#Max.LastDay.Timestamp=Wed Oct 14 01:40:11 2015
#Max.LastHour=165 W | 563 Btu/hr
#Max.LastHour.Timestamp=Wed Oct 14 10:32:31 2015
#Max.LastWeek=189 W | 645 Btu/hr
#Max.LastWeek.Timestamp=Thu Oct 08 10:19:09 2015
#Max.Power=334 W | 1140 Btu/hr
#Max.Power.Timestamp=Fri May 8 06:19:59 2015
#Max.PowerClear=******** (Write-Only)
#Min.LastDay=146 W | 498 Btu/hr
#Min.LastDay.Timestamp=Tue Oct 13 11:31:55 2015
#Min.LastHour=146 W | 498 Btu/hr
#Min.LastHour.Timestamp=Wed Oct 14 10:29:35 2015
#Min.LastWeek=145 W | 495 Btu/hr
#Min.LastWeek.Timestamp=Wed Oct 07 21:20:27 2015
#Realtime.Amps=0.0 Amps
#Realtime.Power=147 W | 502 Btu/hr
#Status=1

Why is racadm the right tool in this case!?

Excellent question.
My answer:
In many, but not all, instances I find that the
more a device deviates from the IPMI standard (aka
extra vendor special sauce), the greater the
need is to use vendor tools.

Racadm inventory example


------------------------SOFTWARE INVENTORY-----------------------
ComponentType = FIRMWARE
ElementName = Integrated Dell Remote Access Controller
FQDD = iDRAC.Embedded.1-1
InstallationDate = NA
Rollback Version = 2.14.14.12
------------------------------------------------------------------
ComponentType = FIRMWARE
ElementName = Integrated Dell Remote Access Controller
FQDD = iDRAC.Embedded.1-1
InstallationDate = 2015-09-16T19:59:46Z
Current Version = 2.14.14.12
------------------------------------------------------------------
ComponentType = FIRMWARE
ElementName = Intel(R) Ethernet 10G X520 LOM 00:8C:FA:F0:65:66
FQDD = NIC.Embedded.2-1-1
InstallationDate = 2015-09-16T20:03:02Z
Current Version = 16.5.0
------------------------------------------------------------------
ComponentType = FIRMWARE
ElementName = Intel(R) Ethernet 10G X520 LOM 00:8C:FA:F0:65:64
FQDD = NIC.Embedded.1-1-1
InstallationDate = 2015-09-16T20:02:59Z
Current Version = 16.5.0
------------------------------------------------------------------
ComponentType = BIOS
ElementName = BIOS
FQDD = BIOS.Setup.1-1
InstallationDate = NA
Rollback Version = 1.0.3
------------------------------------------------------------------
ComponentType = BIOS
ElementName = BIOS
FQDD = BIOS.Setup.1-1
InstallationDate = NA
Available Version = 1.0.3
------------------------------------------------------------------
ComponentType = BIOS
ElementName = BIOS
FQDD = BIOS.Setup.1-1
InstallationDate = 2015-09-16T20:49:43Z
Current Version = 1.0.3
------------------------------------------------------------------
ComponentType = FIRMWARE
ElementName = SAS2008 FW v0.94
FQDD = RAID.Mezzanine.1A-1
InstallationDate = 2015-09-16T20:02:51Z
Current Version = 00.00.00.00
------------------------------------------------------------------
ComponentType = APPLICATION
ElementName = Lifecycle Controller
FQDD = USC.Embedded.1:LC.Embedded.1
InstallationDate = 2015-09-16T19:59:49Z
Current Version = 2.14.14.12
------------------------------------------------------------------
ComponentType = APPLICATION
ElementName = Dell 32 Bit uEFI Diagnostics, version 4239, 4239A22, 4239.30
FQDD = Diagnostics.Embedded.1:LC.Embedded.1
InstallationDate = 2015-09-16T23:16:52Z
Current Version = 4239A22
------------------------------------------------------------------
ComponentType = APPLICATION
ElementName = Dell OS Driver Pack, 15.05.10, A00
FQDD = DriverPack.Embedded.1:LC.Embedded.1
InstallationDate = 2015-09-16T23:16:52Z
Current Version = 15.05.10
------------------------------------------------------------------
ComponentType = APPLICATION
ElementName = OS COLLECTOR 1.1, OSC_1.1, A00
FQDD = OSCollector.Embedded.1
InstallationDate = 2015-09-16T23:16:52Z
Current Version = OSC_1.1
------------------------------------------------------------------
ComponentType = FIRMWARE
ElementName = System CPLD
FQDD = CPLD.Embedded.1
InstallationDate = 2015-09-16T19:59:49Z
Current Version = 1.0.0

How to Determine a Card's IPMI Capabilities (using ipmitool dcmi discover)

Ways to Access OOB

Physical Connections

Shared Ethernet port, aka side-band - can have limited bandwidth (aka no/limited GUI access)

Dedicated Ethernet port (either on-board or on daughter cards)

Shared multi-node port (blades are a common place to see this)

Also via

Local call to the device via a kernel module and a relevant tool (NOTE: NO AUTH NEEDED IF ROOT).

Remote via open source IPMI tools or vendor proprietary standard tools.

Web GUI (watch out for Java version issues here)

SoL

SSH

Telnet (yes, this really won't die!)

In-Band Access to an OOB Device

Via an IPMI kernel module.

# Load the module and see if things look good
[root@pbnj ~]# modprobe ipmi_si && dmesg | grep ipmi_si
[ 853.247997] ipmi_si: probing via SMBIOS
[ 853.248002] ipmi_si: SMBIOS: io 0xca8 regsize 1 spacing 4 irq 0
[ 853.248030] ipmi_si: Adding SMBIOS-specified kcs state machine
[ 853.248041] ipmi_si: Trying SMBIOS-specified kcs state machine at i/o address 0xca8, slave address 0x20, irq 0
[ 853.310122] ipmi_si ipmi_si.0: Found new BMC (man_id: 0x0015d9, prod_id: 0x1134, dev_id: 0x20)
[ 853.310152] ipmi_si ipmi_si.0: IPMI kcs interface initialized
# Run your local commands..
[root@pbnj ~]# ipmitool -d 0 fru
FRU Device Description : Builtin FRU Device (ID 0)
 Board Mfg Date        : Sun Dec 31 19:00:00 1995
 Board Mfg             : Super Micro
 Board Product         : IPMI2.0
 Board Serial          :
 Board Part Number     : AOC-IPMI20-E
 Product Manufacturer  : Super Micro
 Product Name          : IPMI2.0
 Product Part Number   : AOC-IPMI20-E
 Product Version       : 1.0
 Product Serial        :

That same command done remotely

# ipmitool -I lanplus -U $USER -P $PASS -H 216.37.76.254 fru
FRU Device Description : Builtin FRU Device (ID 0)
 Board Mfg Date        : Sun Dec 31 19:00:00 1995
 Board Mfg             : Super Micro
 Board Product         : IPMI2.0
 Board Serial          :
 Board Part Number     : AOC-IPMI20-E
 Product Manufacturer  : Super Micro
 Product Name          : IPMI2.0
 Product Part Number   : AOC-IPMI20-E
 Product Version       : 1.0
 Product Serial        :

A Dell Deployment Example

Rack and stack the system (power plugged in and OOB connected to the network).
All Dell DRACs come set to a single 192.x.x.x IP. Using the racadm tool and some shell code, we can
set this to DHCP.
ID the system for use in the inventory system by:
Setting the system name or identifier in the asset tag field (ipmitool, racadm, or other).
Retrieving and recording the SN and other core items via ipmitool and/or racadm.
Determine the Dell model (ipmitool fru print) to decide what updates the system gets in the next
step, and record it. This can also be done via a racadm inventory command (more accurate).

Reboot the system

Set a one-time PXE boot (ipmitool chassis bootdev pxe)
ipmitool power cycle
Boot to a custom image (via PXE, Foreman, etc.) where the following happens:
Update the system with all vendor patches
Set the permanent boot device to vdisk
Create the RAID array
Burn-in tests
Benchmarks (network and hardware)
Set device settings to the standard for the platform type:

RAID controller

BIOS

Many others...
Record relevant status and settings in the CMDB
If anything fails, stop and notify the admin via LCD flash or similar (racadm)

System is now ready for OS install

Using one of several automation techniques, set the PXE server to boot an OS install for the
MAC retrieved during the previous steps.
Set the device to a one-time PXE boot via ipmitool (or other).
The system boots to the kickstart/preseed set in the PXE config for the system's MAC address.
The OS install finishes and regular CFGMGMT takes over (you're using CFGMGMT, right!?)

BAM, you have a server!
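
The "one-time PXE, then power cycle" step in the flow above can be wrapped in a few lines of shell. This is only a sketch: the `bmc` and `pxe_once_and_cycle` function names and the `BMC_HOST`, `BMC_USER`, and `DRY_RUN` variables are invented for the example. It relies on ipmitool's `-E` option, which reads the password from the `IPMI_PASSWORD` environment variable so it does not land on the command line.

```bash
#!/bin/bash
# Sketch: one-time PXE boot followed by a power cycle.
# bmc, pxe_once_and_cycle, BMC_HOST, BMC_USER and DRY_RUN are hypothetical names.
bmc() {
    # Build the ipmitool command line; -E takes the password from $IPMI_PASSWORD.
    set -- ipmitool -I lanplus -H "$BMC_HOST" -U "$BMC_USER" -E "$@"
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"          # dry run: only show what would be executed
    else
        "$@"
    fi
}

pxe_once_and_cycle() {
    # Without options=persistent, bootdev applies to the next boot only.
    bmc chassis bootdev pxe && bmc chassis power cycle
}
```

Running it with `DRY_RUN=1` first is a cheap way to check the host and user before anything is actually power cycled.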


With a few more steps...

For the folks still here.

Take a look at the next couple of slides...and tell me what you think it is...

  657 root     13736 S   //lib/systemd/systemd-journald
  662 root      3312 S   //lib/systemd/systemd-udevd
  676 root         0 SW< [cryptodev_queue]
  689 root      3828 S   /bin/sh
  721 root         0 SW  [kjournald]
  722 root         0 SW  [kjournald]
  723 root         0 SW  [kjournald]
  726 root         0 SW  [kjournald]
  740 root         0 SW  [kkcs]
  783 root         0 SWN [jffs2_gcd_mtd7]
  843 root         0 SW  [dell_fpdrv thre]
  848 root         0 SW< [bond0]
  905 root      1776 S   /sbin/watchdog
  934 messageb  2840 S   /usr/bin/dbus-daemon --system --address=systemd: --n
 1000 root         0 SW< [loop7]
 1004 root         0 SW< [kdmflush]
 1005 root         0 SW< [kcryptd_io]
 1006 root         0 SW< [kcryptd]
 1012 root         0 SW  [kjournald]
 1031 root         0 SW  [kjournald]
 1099 root     11488 S   /sbin/aim
 1102 root     14492 S   /usr/sbin/dsm_sa_datamgrd
 1103 root     12408 S   /avct/sbin/os
 1104 root      6096 S   /usr/bin/syscallagent
 1108 root     14364 S   /usr/sbin/dsm_sa_popproc lclpop
 1109 root     52760 S   {SoftTimer} /bin/fullfw
 1142 root     14188 S   /usr/sbin/dsm_sa_popproc lmpop
 1285 root     12020 S   /avct/sbin/pm
 1288 root     14212 S   /usr/sbin/dsm_sa_snmpd
 1289 root         0 SW  [MSD-0]
 1338 root     13840 S   /usr/sbin/dsm_sa_eventmgrd
 1351 root      2232 S   /sbin/syslogd -m 0
 1354 root     12112 S   /usr/bin/fmgr
 1426 root     11636 S   /usr/bin/tm
 1766 root         0 SW  [MSD1-0]
 1818 root         0 SW< [sh_pbi_wq]
 1822 root         0 SW  [UsbEventMonitor]
 1823 root         0 SW  [PchDeviceRemova]
 1957 root     34420 S   {START} /avct/sbin/osinet
 1961 root      6204 S   /sbin/vfk
 1967 root      3824 S   {dhclient_daemon} /bin/sh /sbin/dhclient_daemon
 1968 root      1888 S   /usr/sbin/ifplugd -i bond0 -afqIn -u0 -d0 -miff
 1974 root      8148 S   /bin/ipmi_gateway
 1986 root      6792 S   /bin/fb_vnc_server
 1990 root      9396 S   /bin/fb_source
 1996 root      8072 S   /usr/sbin/raclogd
 2020 root      9324 S   /bin/jdaemon
 2066 root     20276 S   /sbin/avct_server
 2070 root      7516 S   /sbin/vkvm_pm
 2163 root      6612 S   /sbin/sshd -g 60
 2214 root     16356 S   /bin/maserserver
 2215 root      3752 S   {cfgbkup.sh} /bin/sh /etc/sysapps_script/cfgscripts/
 2220 root      4820 S   /usr/sbin/mrcached
 2249 root      3752 S   /sbin/crond -b
 2291 root     26604 S   /usr/local/bin/appweb --config /var/run/appweb.conf
 2398 root      9288 S   /bin/mctpd
 2411 root     30996 S   /usr/sbin/ipmiextd
 2422 root     24072 S   /usr/sbin/dsm_sa_popproc iracpop

Active Internet connections (w/o servers)

Proto Recv-Q Send-Q Local Address      Foreign Address    State
tcp        0      0 127.0.0.1:33452    127.0.0.1:8195     ESTABLISHED
tcp        0      0 127.0.0.1:8195     127.0.0.1:33453    ESTABLISHED
tcp        0      0 127.0.0.1:8195     127.0.0.1:33497    ESTABLISHED
tcp        0      0 127.0.0.1:199      127.0.0.1:40284    ESTABLISHED
tcp      699      0 127.0.0.1:33497    127.0.0.1:8195     ESTABLISHED
tcp        0      0 127.0.0.1:33450    127.0.0.1:8195     ESTABLISHED
tcp        0      0 127.0.0.1:33451    127.0.0.1:8195     ESTABLISHED
tcp        0      0 127.0.0.1:8195     127.0.0.1:33449    ESTABLISHED
tcp        0      0 127.0.0.1:199      127.0.0.1:40283    ESTABLISHED
tcp        0      0 127.0.0.1:8195     127.0.0.1:33450    ESTABLISHED
tcp        0      0 127.0.0.1:8195     127.0.0.1:33452    ESTABLISHED
tcp        0      0 127.0.0.1:33453    127.0.0.1:8195     ESTABLISHED
tcp        0      0 127.0.0.1:8195     127.0.0.1:33451    ESTABLISHED
tcp        0      0 127.0.0.1:35155    127.0.0.1:8195     ESTABLISHED
tcp        0      0 127.0.0.1:8195     127.0.0.1:35155    ESTABLISHED
tcp        0      0 127.0.0.1:40284    127.0.0.1:199      ESTABLISHED
tcp        0      0 127.0.0.1:40283    127.0.0.1:199      ESTABLISHED
tcp    61652      0 127.0.0.1:33449    127.0.0.1:8195     ESTABLISHED

Kernel IPv6 routing table

Destination                    Next Hop  Flags Metric Ref  Use Iface
::1/128                        ::        U     0      478  1   lo
::1/128                        ::        U     256    0    0   lo
fe80::28c:faff:feeb:c9ea/128   ::        U     0      0    1   lo
fe80::28c:faff:feeb:c9ea/128   ::        U     0      0    1   lo
fe80::/64                      ::        U     256    0    0   eth1
fe80::/64                      ::        U     256    0    0   eth2
ff00::/8                       ::        U     256    0    0   eth1
ff00::/8                       ::        U     256    0    0   eth2

Kernel IP routing table

Destination  Gateway     Genmask        Flags  MSS Window irtt Iface
0.0.0.0      172.23.4.1  0.0.0.0        UG     0   0      0    bond0
172.23.4.0   0.0.0.0     255.255.255.0  U      0   0      0    bond0

What are we looking at exactly?

What this means:

OOM situations (see code at end for an example memory check for Dells)
Local OS tools that interact with the OOB/LOM device locking up

Java (see item 1)

Too many remote users overrunning resources.

CPU spikes causing monitoring time-outs.

Heat! The Silent Killer! Many of these cards (especially
older ones) are not in optimal airflow locations on the MB.
Network saturation (some LOMs are 100Mb or shared
with the primary NIC and allow full GUI access)

Misuse...(talk in and of itself)

Anything else that can go wrong with a Linux based OS.

Security

Security

IPMI has a spotty security history. Consider the following points:

Vendors try to force 8-character OOB device passwords on systems
with defaults of 5 characters or fewer
(some are just ADMIN for the user/password).

The remote user account exposure in Supermicro
(and a few others) via the PSBLock exploit.

If a box has been root compromised, assume the OOB
is compromised too, due to unauthenticated local access to the
OOB device.

But I bet those exploits like PSBLock are hard to do...

If you have port 49152 on your network
and you have SM systems, you could
be compromised!

Let's Talk OOB Compromise What-Ifs

From a competitor

Hacks OOB and turns off systems at an opportune time.

Remote drive wipe via custom virtual media. DBAN OOB-BOMB

Changes the boot device and password-locks the BIOS to prevent booting.

From a hacker

Compromises OOB and uses it to mount remote media with a nefarious payload.

Uses power cycle and SoL to reboot the machine into single-user
mode (you password-protect GRUB, right?!)
Just scratching the surface of the nasty things an OOB hack can
do to your system!

Securing OOB

Easy

Remove Gateway (could cause issues in some network environments)

Use lanplus (aka IPMI v2; it's been around since 2004ish) for connections

On shared ports use VLANS for OOB access

Wipe/Reset OOB during decommission / compromise

Check OOB Firmware with same frequency as hardware and always for major exploits!

Prevent passwords from appearing in shell history.

Moderate

Use an OOB/LOM proxy.

If no proxy is present and the network is not confirmed secure, consider locking down with the IPMI firewall if all
devices support it

Disable G-ARP (default on most)

Disable cipher 0 for IPMI V2 connections

Ensure you are using a strong cipher in the client, and lock weak ones out at the OOB device (bmc-config)

Log and audit traffic leaving your management network. Very few reasons it should!

Include OOB devices in security audits/reports (NMAP NSE is a good tool to help with this!)

Complicated

Lock down SoL access

Serial-over-LAN key based auth

Get OOB configuration under configuration management and enforce changes there.

Use of two-factor auth or a secure RADIUS/LDAP combo
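
As a rough audit aid for the "disable cipher 0" item above, the sketch below reads `ipmitool lan print` output and reports whether the privilege string still leaves cipher suite 0 usable. `cipher0_status` is a hypothetical helper name. It assumes, as the ipmitool man page describes for `cipher_privs`, that the first character of "Cipher Suite Priv Max" maps to cipher suite 0 and that `X` means unused; some ipmitool versions may map positions differently, so treat the result as a hint.

```bash
#!/bin/bash
# Sketch: flag BMCs whose privilege string still allows cipher suite 0.
# cipher0_status is a hypothetical helper; it reads `ipmitool lan print` on stdin.
cipher0_status() {
    awk -F': *' '
        /^Cipher Suite Priv Max/ {
            found = 1
            # Assumption: first character corresponds to cipher suite 0.
            if (substr($2, 1, 1) == "X") print "blocked"; else print "ALLOWED"
        }
        END { if (!found) print "unknown" }
    '
}
```

A typical call would be `ipmitool -I lanplus -H "$BMC_HOST" -U "$BMC_USER" -E lan print 1 | cipher0_status`, where the host and user variables are placeholders. With `-E` the password comes from `IPMI_PASSWORD`, which also keeps it out of shell history.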

OOB Proxy Design

Iptables to access via DNAT passthrough (bad idea?!)

Two-factor/LDAP, if needed, can go here

Conserver for easy access to SoL devices.

Use of NX or other remote desktop for access to
web interfaces (security implications)
Tool suite for accessing and working with OOB
devices
Wrapper scripts to allow ease of tool use and to
make raw commands easier if in use in your
environment

OOB Infrastructure Design Considerations

Set Reasonable Expectations for OOB Use, Like...

Serial Over Lan

Hardware Determination

System Event Log (SEL)

Sensors

IPMI Hard Reset

Boot Order Control

Power Control

And Standardize as much as you can !

Speaking of Standardization..

When talking design and standardization, one
must always consider

Configuration Management

The tools lend themselves very easily to integration with Ansible,
Puppet, Chef, and others!

Monitoring with OOB

SNMP

Traps

Useful OIDs with items like MTU, routes, stats, IPs, etc.

Custom scripts to parse tool output

Check_MK plugins (SNMP based)

Open Source Tools to poll sensors

Many many many many more...
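
For the "custom scripts to parse tool output" item above, here is a small sketch that filters `ipmitool sensor list` output down to sensors that are not healthy. `failing_sensors` is a made-up name, and the column order (name, reading, unit, status) plus the idea that `ok`, `ns`, and `na` are the harmless states are assumptions based on typical ipmitool output; check a sample from your own hardware.

```bash
#!/bin/bash
# Sketch: print sensors whose status column is something other than ok/ns/na.
# failing_sensors is a hypothetical helper; it reads `ipmitool sensor list` on stdin.
failing_sensors() {
    awk -F'|' '
        {
            # Strip the padding ipmitool puts around each column.
            for (i = 1; i <= NF; i++) gsub(/^[ \t]+|[ \t]+$/, "", $i)
        }
        NF >= 4 && $4 != "ok" && $4 != "ns" && $4 != "na" {
            print $1 ": " $2 " " $3 " (" $4 ")"
        }
    '
}
```

Dropped into cron or a monitoring check, empty output means nothing to report, and any output line is something to alert on.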

Not All OOB Devices Are Equal...

10-year-old IPMI 2 daughter card, dual Xeon
Command being timed: "ipmitool -U $DEFUALTSFUN -P $NOTADMIN -H cluster04ipmi sensor list"
User time (seconds): 0.00
System time (seconds): 0.00
Percent of CPU this job got: 0%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:07.70
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 1096
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 328
Voluntary context switches: 244
Involuntary context switches: 1
Swaps: 0
File system inputs: 0
File system outputs: 8
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0

6-year-old IPMI 2 onboard card, Atom 510 system

Command being timed: "ipmitool -U $REDACTED -P $GANKED -H bowie-ipmi sensor list"
User time (seconds): 0.00
System time (seconds): 0.00
Percent of CPU this job got: 2%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.22
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 1396
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 43
Voluntary context switches: 190
Involuntary context switches: 1
Swaps: 0
File system inputs: 0
File system outputs: 8
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0

Other Interesting Uses for OOB

Vendor Auto-Generated Replacement Tickets

Many OOB devices can report on things like:

CPQHLTH-MIB::cpqHeFltTolPowerSupplySparePartNum.0.1 = STRING: "754377-001"
CPQHLTH-MIB::cpqHeFltTolPowerSupplySparePartNum.0.2 = STRING: "754377-001"

Use SNMP traps or remote OID queries to create an
auto-generated report for vendors with everything
they need to get you a new part.
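
A report like that can start from a one-line filter. The sketch below turns snmpwalk-style lines such as the two above into a readable parts list. `spare_parts` is a made-up name, and reading the two OID indexes as chassis and bay is my assumption about the CPQHLTH-MIB table layout.

```bash
#!/bin/bash
# Sketch: turn snmpwalk output for ...SparePartNum OIDs into a parts list.
# spare_parts is a hypothetical helper; it reads snmpwalk output on stdin.
spare_parts() {
    # Capture the two trailing OID indexes and the quoted part number.
    sed -n 's/^.*SparePartNum\.\([0-9]*\)\.\([0-9]*\) = STRING: "\(.*\)"$/chassis \1 bay \2: \3/p'
}
```

Lines that do not match the pattern are dropped, so the filter can sit behind a wider walk of the health MIB.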

RCA

Use in conjunction with OS troubleshooting to
help see things the OS might miss due to
kernel panic/lockup/bad hair day, like:

Hardware failures

Temp spikes

Facility power issues seen through
dips/spikes in sensors.

Power supply failures at peak power usage
(overloading the remaining power supply(ies))

Watchdog errors/lockup issues

Boot the system into a diagnostic image with virtual
media.

Alt Event Logging

Use `bmc-device` with '--platform-event=' to create SEL logs for
events like failing hardware or RCA-type events, for parsing in the
event the system is locked or reboots.
You can also use ipmitool to do a similar thing remotely or locally.
A local example would be to script around smartctl or RAID vendor
tools with something like this:
ipmitool -I lanplus -U ADMIN -P ADMIN -H cluster05-ipmi raw 0x04 0x02 0x04 0x0d 0x01 0x6f 0x02 00 00
which generates a SEL entry, viewed via
ipmitool -I lanplus -U ADMIN -P ADMIN -H cluster05-ipmi sel list
1 | 02/01/2016 | 06:28:17 | Drive Slot #0x01 | Predictive Failure | Asserted
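
To make the raw bytes above less magic, the sketch below builds the same argument string for any drive slot number. `drive_predictive_failure_args` is an invented name, and the byte meanings in the comments are my reading of the Platform Event Message layout in the IPMI spec, so double-check them before wiring this into production alerts.

```bash
#!/bin/bash
# Sketch: build ipmitool raw arguments for a "drive slot predictive failure" SEL entry.
# drive_predictive_failure_args is a hypothetical helper name.
drive_predictive_failure_args() {
    local slot=$1   # sensor number to report, e.g. 1 for Drive Slot #0x01
    # 0x04 0x02 : NetFn Sensor/Event, Platform Event Message
    # 0x04      : event message revision (IPMI 2.0 format)
    # 0x0d      : sensor type "Drive Slot"
    # 0x6f      : assertion, sensor-specific event type
    # 0x02      : event data 1, offset 02h (predictive failure)
    printf 'raw 0x04 0x02 0x04 0x0d 0x%02x 0x6f 0x02 0x00 0x00\n' "$slot"
}
```

The output is meant to be appended to an ipmitool call, for example `ipmitool -I lanplus -H cluster05-ipmi -U ADMIN -E $(drive_predictive_failure_args 1)`, triggered when a smartctl or RAID health check fails.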

Server LCD Bling!

It's common to see a server ID on the front of a
system with an LCD display, but you can also
show (with a little scripting):

CPU load (via proc/mpstat/or other)

(OpenStack) VMs running on node (virsh list | wc -l)

RAM usage (VSZ or RSS)

Active users

Network utilization

Or you can just rotate through these every X
minutes. The options are almost endless!

Save a Buck...or several..

Thanks to kilowatt-hour measurements on
some OOB devices, you could use them to measure
savings while testing different power tunings on
your system!
This could also allow you to graph power usage in

BTU

Kilowatt-hours

System watts
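
The unit conversions behind such graphs are simple enough to script. The sketch below converts watts to BTU/hr and prices an energy reading; the function names and the electricity rate are placeholders, and the 3.412142 BTU/hr per watt factor is the standard conversion rather than anything vendor specific.

```bash
#!/bin/bash
# Sketch: helpers for graphing or costing OOB power readings.
# watts_to_btu_hr and kwh_cost are hypothetical helper names.
watts_to_btu_hr() {
    # 1 W is about 3.412142 BTU/hr; round to a whole number like racadm does.
    awk -v w="$1" 'BEGIN { printf "%.0f\n", w * 3.412142 }'
}

kwh_cost() {
    # $1: energy in kWh, $2: price per kWh (your own rate, assumed here)
    awk -v kwh="$1" -v rate="$2" 'BEGIN { printf "%.2f\n", kwh * rate }'
}
```

For example, `watts_to_btu_hr 146` lines up with the 146 W | 498 Btu/hr pair in the racadm output earlier, and `kwh_cost 352.0 0.10` prices the cumulative reading from the delloem example at an assumed 0.10 per kWh. Comparing two such readings taken before and after a power tuning change gives the savings.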

Shameless Plug for Free Stuff

Scripting to Investigate an OOB Card

My IPMI reporting tool:
http://gitlab.misterx.org/MisterX/OOB_Tools/blob/master/ipmi_print.sh

Example output of the script:
http://gitlab.misterx.org/MisterX/OOB_Tools/tree/master/Report_Examples

This has pre/post PSBLock firmware on a SM BMC Atom system.


Checks

System Power is ON
Hardware Vendor via OUIDB (detecting all vendors is a work in progress)
SoL is responding
If vendor default logins are in use
Can run one report or all via flags

Lightweight System Dependencies


I am currently working on collecting a Manufacturer and Product ID database
to be able to quickly ID Vendors/cards.
Just starting to scratch the surface...lots to improve on!

===============================================
= IPMI REPORT FOR IP 192.168.1.50 POWER:On    =
= Script Version : 0.8                        =
= Report System Date :01/24/2016 02:32:56     =
= IPMITOOL VERSION: 1.8.11                    =
= IPMI Date : 01/24/2016 15:57:46             =
= MAC System Vendor (VIA OUIDB) : SuperMicro  =
!!Login Defaults Present : Supermicro!!
===============================================
=                 FRU Output                  =
===============================================
FRU Device Description : Builtin FRU Device (ID 0)
 Board Mfg Date        : Sun Dec 31 19:00:00 1995
 Board Mfg             : Supermicro
 Board Serial          :
 Product Serial        :

===============================================
=            Chassis Status Output            =
===============================================
System Power         : on
Power Overload       : false
Power Interlock      : inactive
Main Power Fault     : false
Power Control Fault  : false
Power Restore Policy : always-off
Last Power Event     :
Chassis Intrusion    : inactive
Front-Panel Lockout  : inactive
Drive Fault          : false
Cooling/Fan Fault    : false

===============================================
=              Lan Print Output               =
===============================================
---- IPMI LAN Interface Information---
Set in Progress         : Set Complete
Auth Type Support       : NONE MD2 MD5 PASSWORD
Auth Type Enable        : Callback : MD2 MD5 PASSWORD
                        : User     : MD2 MD5 PASSWORD
                        : Operator : MD2 MD5 PASSWORD
                        : Admin    : MD2 MD5 PASSWORD
                        : OEM      : MD2 MD5 PASSWORD
IP Address Source       : Static Address
IP Address              : 192.168.1.50
Subnet Mask             : 255.255.255.0
MAC Address             : 00:25:90:3d:8e:4d
SNMP Community String   : public
IP Header               : TTL=0x00 Flags=0x00 Precedence=0x00 TOS=0x00
BMC ARP Control         : ARP Responses Enabled, Gratuitous ARP Disabled
Default Gateway IP      : 0.0.0.0
Default Gateway MAC     : 00:00:00:00:00:00
Backup Gateway IP       : 0.0.0.0
Backup Gateway MAC      : 00:00:00:00:00:00
802.1q VLAN ID          : Disabled
802.1q VLAN Priority    : 0
RMCP+ Cipher Suites     : 1,2,3,6,7,8,11,12
Cipher Suite Priv Max   : aaaaXXaaaXXaaXX
                        : X=Cipher Suite Unused
                        : c=CALLBACK
                        : u=USER
                        : o=OPERATOR
                        : a=ADMIN
                        : O=OEM
---- IPMI LAN Interace Stats ---
IP Rx Packet              : 18437
IP Rx Header Errors       : 0
IP Rx Address Errors      : 0
IP Rx Fragmented          : 0
IP Tx Packet              : 46340
UDP Rx Packet             : 47360
RMCP Rx Valid             : 47360
UDP Proxy Packet Received : 0
UDP Proxy Packet Dropped  : 0

===============================================
=            Chassis Status Output            =
= OPTIONS : always-on,previous,always-off     =
===============================================
Supported chassis power policy: always-off

===============================================
=               MC Info Output                =
===============================================
---- Management Controller Info---
Device ID                 : 32
Device Revision           : 1
Firmware Revision         : 3.16
IPMI Version              : 2.0
Manufacturer ID           : 47488
Manufacturer Name         : Unknown (0xB980)
Product ID                : 2566 (0x0a06)
Product Name              : Unknown (0xA06)
Device Available          : yes
Provides Device SDRs      : no
Additional Device Support :
    Sensor Device
    SDR Repository Device
    SEL Device
    FRU Inventory Device
    IPMB Event Receiver
    IPMB Event Generator
    Chassis Device
---- Management Controller Show Enables ---
Receive Message Queue Interrupt     : enabled
Event Message Buffer Full Interrupt : disabled
Event Message Buffer                : disabled
System Event Logging                : enabled
OEM 0                               : disabled
OEM 1                               : disabled
OEM 2                               : disabled
---- Management Controller Watchdog Info ---
Watchdog Timer Use:     BIOS/POST (0x02)
Watchdog Timer Is:      Stopped
Watchdog Timer Actions: No action (0x00)
Pre-timeout interval:   0 seconds
Timer Expiration Flags: 0x00
Initial Countdown:      6553 sec
Present Countdown:      6553 sec

In Closing

Testing and Learning

Want to learn? Try an IPMI simulator!

https://www.mankier.com/1/ipmi_sim

https://gist.github.com/bot11/a34ff0008cae75bd662d

Try a cheap motherboard with OOB

Be careful not to get one that is too old or the features will be
VERY limited. I recommend an iLO or DRAC as new as
you can afford (and with a license if you can get one).

If using daughter cards, be wary of firmware being
locked to the motherboard it was used on.

Questions, Statements ,
Comments ?

No I dont know why they antenna..

Drac Memory Usage Code


#!/bin/bash
# http://gitlab.misterx.org/snippets/3
#
# A quick hack for me to clean up (a lot) later and integrate into DRAC monitoring perhaps...
# Yeah ..that regex foo is weak....but it works for now ..it's on the long list of things to improve here.
# This gets VSZ from the process list running on the iDRAC.
# Requires racadm from Dell, awk, and a DRAC on the remote system.
# Set vars
TOTAL=0
HOST=
USER=
PASSWORD=
# Walk the process list on the DRAC device and sum the VSZ column (field 3)
for b in $(/opt/dell/srvadmin/bin/racadm5 -r "${HOST}" -u "${USER}" -p "${PASSWORD}" racdump | grep -E '[0-9]+ [a-z].*[0-9].*[a-z]' | awk '{print $3}'); do
    TOTAL=$((TOTAL + b))
done
# Print out totals
echo "${TOTAL} KB"
echo "$(awk "BEGIN {print ${TOTAL}/1024}") MB"
# example output
# >bash /bin/drac_memory.sh
# 769132 KB
# 751.105 MB

Useful Links

My Reporting Script :
http://gitlab.misterx.org/MisterX/OOB_Tools/blob/master/ipmi_print.sh
Example Reports from My Script :
http://gitlab.misterx.org/MisterX/OOB_Tools/tree/master/Report_Examples

Slides : http(s)://misterx.org/LOPSA_Feb_2016.pdf
HP Redfish API
http://www8.hp.com/us/en/products/servers/proliant/restful-interface-tool.html
IPMI Security : https://github.com/zenfish/ipmi
Dell Out-Of-Band Enhancements (G13 systems)
http://en.community.dell.com/techcenter/extras/m/white_papers/20440944/download
IPMI Interface Spec (great for hacking together RAW commands)
http://www.intel.com/content/www/us/en/servers/ipmi/ipmi-second-gen-interface-specv2-rev1-1.html
