apama{001,002} (RETIRED)

https://opendcim.engineering.redhat.com/devices.php?DeviceID=7792

This hardware was retired on 30JAN2020 to make room for new vossi (rex replacements after LA office closure).

Summary

apama001 and apama002 are two nodes in an HP SL4540 Gen8 chassis. Both nodes are running Xenial because the systems hit a Red Screen of Death (RSOD) when booting rpm-based distros. It's possible to get past it by changing how grub boots the kernel, but it's not worth the effort. See Quirks below for more info.

apama001 is running RHEL and the plan is to use it as a docker host. apama002 is an OSD node in the longrunningcluster.

The iLOs are accessible, but the Java KVM is mostly useless: without an Enterprise License it quits after POST.

The systems were updated to the latest firmware (2016.04.0-SPP2016040) the week of 1MAY2016 using hpsum on CentOS.

Hardware Specs

Component  Count     Manufacturer   Model                            Capacity       Notes
Chassis    N/A       HP             ProLiant SL4540 Gen8             N/A
Mainboard  N/A       HP             Not Specified                    N/A
CPU        2         Intel          Xeon(R) CPU E5-2407 0 @ 2.20GHz  N/A            ARK
RAM        12 DIMMs  Not Specified  Not Specified                    4GB            48GB total
HDD        1x        HP             MM0500GBKAK                      500GB
HDD        25x       HP             MB3000GBKAC                      3TB            Labelled as Logical Volumes in smartctl output
SSD        0
NIC        2         Intel          I350 Gigabit Network Connection  1Gb            Connected but not used
NIC        1         Mellanox       MT27500                          10Gbps/40Gbps  10Gb link in use
BMC        1         HP             iLO                                             Firmware 1.13

Quirks

I was able to install Fedora 23 on apama001 but still got the Red Screen of Death and “Illegal Opcode” after grub tried to boot the OS. At first it looked as though rpm-based distros had trouble finding the root partition by UUID. See the UPDATE below.

What fixed it was finding the proper root drive and manually booting to it in a grub rescue prompt. See below.

grub> ls (hd0,msdos1)/boot

vmlinuz-4.2.3-300.fc23.x86_64 System.map-4.2.3-300.fc23.x86_64 config-4.2.3-300.fc23.x86_64 initramfs-4.2.3-300.fc23.x86_64.img initramfs-0-rescue-ba8dc4c42e5a4ec483a635961112dfd8.img vmlinuz-0-rescue-ba8dc4c42e5a4ec483a635961112dfd8 initrd-plymouth.img

grub> set root=(hd0 <tab>
Possible partitions are:

Device hd0: No known filesystem detected - Sector size 512B - Total size 488386584KiB

Partition hd0,msdos1: Filesystem type ext* - Last modification time 2016-05-03 18:11:21 Tuesday, UUID 4390d2e3-3154-4855-8ad4-8417c430d982 - Partition start at 1024KiB - Total size 488385536KiB
grub> set root=(hd0,msdos1)
grub> linux /boot/vmlinuz-4.2.3-300.fc23.x86_64 root=/dev/sda1
grub> initrd /boot/initramfs-4.2.3-300.fc23.x86_64.img
grub> boot

For CentOS, this had to be modified a bit:

grub> set root=(hd0,msdos1)
grub> linux /vmlinuz-3.10.0-327.el7.x86_64 root=/dev/sdz3 console=ttyS1,115200
grub> initrd /initramfs-3.10.0-327.el7.x86_64.img
grub> boot

There's probably a permanent fix that can be applied in grub.cfg, but I haven't looked into it.

UPDATE 2NOV2016
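Turns out this isn't a UUID or disk issue at all. Stopping grub from using the 16-bit boot mode fixes it: with sixteenbit set to an empty string in /etc/grub.d/10_linux, the generated entries use linux/initrd instead of linux16/initrd16, which matches the manual boot commands above.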

[root@apama001 ~]# diff /etc/grub.d/10_linux.orig /etc/grub.d/10_linux
109c109
< 	sixteenbit="16"
---
> 	sixteenbit=""


# Then regenerate the config. On BIOS installs of Fedora/RHEL/CentOS it normally lives under /boot/grub2.
grub2-mkconfig -o /boot/grub2/grub.cfg
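
The edit shown in the diff can be scripted so it is repeatable on both nodes. The following is a minimal sketch, not the procedure that was actually used: disable_sixteenbit is a hypothetical helper name, and it assumes the assignment appears exactly as sixteenbit="16" in the file. It keeps a .orig backup like the one in the diff, but clears the execute bit on it, because grub2-mkconfig runs every executable file it finds in /etc/grub.d.

```shell
#!/bin/sh
# Sketch only: disable_sixteenbit is a hypothetical helper, not part of grub2.
# It rewrites sixteenbit="16" to sixteenbit="" in the given file and keeps
# a backup copy with a .orig suffix next to it.
disable_sixteenbit() {
    file="$1"
    # Keep the original so the change can be diffed or reverted.
    cp -p "$file" "$file.orig" || return 1
    # Make the backup non-executable so grub2-mkconfig does not run it too.
    chmod a-x "$file.orig" || return 1
    # Preserve leading whitespace and only touch the exact assignment.
    # Writing into the existing file keeps its mode (it must stay executable).
    sed 's/^\([[:space:]]*\)sixteenbit="16"/\1sixteenbit=""/' "$file.orig" > "$file"
}

# Example (run as root), followed by regenerating grub.cfg with grub2-mkconfig:
#   disable_sixteenbit /etc/grub.d/10_linux
#   diff /etc/grub.d/10_linux.orig /etc/grub.d/10_linux
```

Note that a grub2 package update may replace /etc/grub.d/10_linux, in which case the change would need to be reapplied.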

To open a Java KVM successfully, log into the iLO web UI in another tab. Then browse to https://apama{001|002}.ipmi.sepia.ceph.com/html/java_irc.html?lang=en

Checking Disk SMART data

# If needed...
wget http://downloads.linux.hpe.com/SDR/downloads/MCP/ubuntu/pool/non-free/hpacucli_9.40.1-1._amd64.deb
dpkg -i hpacucli_9.40.1-1._amd64.deb

# Then...
smartctl -x -A -d sat+cciss,0 /dev/sdX
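
With 25 data disks per node, it may be convenient to generate the smartctl command for every disk rather than typing each one. This is a rough sketch under assumptions: smart_cmds is a hypothetical helper, and it assumes the disks are numbered contiguously from 0 behind one controller (the command above only shows index 0). It prints the commands instead of running them so they can be reviewed first.

```shell
#!/bin/sh
# Sketch only: smart_cmds is a hypothetical helper. It prints (does not run)
# one smartctl command per physical disk index behind the Smart Array controller.
smart_cmds() {
    dev="$1"      # device node on the controller, e.g. /dev/sdX (placeholder)
    count="$2"    # number of physical disk indexes to cover, starting at 0
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "smartctl -x -A -d sat+cciss,$i $dev"
        i=$((i + 1))
    done
}

# Example: review the commands, then pipe them to sh as root to run them.
#   smart_cmds /dev/sdX 25
#   smart_cmds /dev/sdX 25 | sh
```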