====== apama{001,002} (RETIRED) ======

https://opendcim.engineering.redhat.com/devices.php?DeviceID=7792

This hardware was retired on 30JAN2020 to make room for new [[hardware:vossi]] (rex replacements after LA office closure).

===== Summary =====

apama001 and apama002 are two nodes in an HP SL4540 Gen8 chassis.

Both nodes are running Xenial because the systems [[http://h20564.www2.hp.com/hpsc/doc/public/display?docId=mmr_kc-0111501|RSOD]] (Red Screen of Death) when running rpm-based distros. It's possible to get past it by working around it in grub, but it wasn't judged worth the effort. See Quirks below for more info.

apama001 is running RHEL and the plan is to use it as a docker host.

apama002 is an OSD node in the [[services:longrunningcluster]].

The iLOs are accessible, but the Java KVM is mostly useless: without an Enterprise License it quits after POST.

The systems were updated to the latest firmware (2016.04.0-SPP2016040) the week of 1MAY2016 using hpsum on CentOS.

===== Hardware Specs =====

|            ^ Count    ^ Manufacturer  ^ Model                            ^ Capacity       ^ Notes                                          ^
^ Chassis    | N/A      | HP            | ProLiant SL4540 Gen8             | N/A            |                                                |
^ Mainboard  | N/A      | HP            | Not Specified                    | N/A            |                                                |
^ CPU        | 2        | Intel         | Xeon(R) CPU E5-2407 0 @ 2.20GHz  | N/A            | [[http://ark.intel.com/products/64614/Intel-Xeon-Processor-E5-2407-10M-Cache-2_20-GHz-6_40-GTs-Intel-QPI|ARK]] |
^ RAM        | 12 DIMMs | Not Specified | Not Specified                    | 4GB            | 48GB total                                     |
^ HDD        | 1x       | HP            | MM0500GBKAK                      | 500GB          |                                                |
^ HDD        | 25x      | HP            | MB3000GBKAC                      | 3TB            | Labelled as Logical Volumes in smartctl output |
^ SSD        | 0        |               |                                  |                |                                                |
^ NIC        | 2        | Intel         | I350 Gigabit Network Connection  | 1Gb            | Connected but not used                         |
^ NIC        | 1        | Mellanox      | MT27500                          | 10Gbps/40Gbps  | 10Gb link in use                               |
^ BMC        | 1        | HP            | iLO                              |                | Firmware 1.13                                  |

===== Quirks =====

I was able to install Fedora 23 on apama001 but still got a Red Screen of Death and "Illegal Opcode" when grub tried to boot the OS. Something about rpm-based distros doesn't like searching for the root partition using UUID.
**See UPDATE below**

What fixed it was finding the proper root drive and manually booting to it from a grub rescue prompt:

  grub> ls (hd0,msdos1)/boot
  vmlinuz-4.2.3-300.fc23.x86_64 System.map-4.2.3-300.fc23.x86_64 config-4.2.3-300.fc23.x86_64
  initramfs-4.2.3-300.fc23.x86_64.img initramfs-0-rescue-ba8dc4c42e5a4ec483a635961112dfd8.img
  vmlinuz-0-rescue-ba8dc4c42e5a4ec483a635961112dfd8 initrd-plymouth.img
  grub> set root=(hd0
  Possible partitions are:
  Device hd0: No known filesystem detected - Sector size 512B - Total size 488386584KiB
  Partition hd0,msdos1: Filesystem type ext* - Last modification time 2016-05-03 18:11:21 Tuesday, UUID 4390d2e3-3154-4855-8ad4-8417c430d982 - Partition start at 1024KiB - Total size 488385536KiB
  grub> set root=(hd0,msdos1)
  grub> linux /boot/vmlinuz-4.2.3-300.fc23.x86_64 root=/dev/sda1
  grub> initrd /boot/initramfs-4.2.3-300.fc23.x86_64.img
  grub> boot

For CentOS, this had to be modified a bit:

  grub> set root=(hd0,msdos1)
  grub> linux /vmlinuz-3.10.0-327.el7.x86_64 root=/dev/sdz3 console=ttyS1,115200
  grub> initrd /initramfs-3.10.0-327.el7.x86_64.img
  grub> boot

There's probably a permanent fix that can be applied in grub.cfg, but I haven't looked into it.

**UPDATE 2NOV2016** \\
Turns out this isn't a UUID or disk issue at all. Changing the boot mode to not use 16-bit fixes it.

  [root@apama001 ~]# diff /etc/grub.d/10_linux.orig /etc/grub.d/10_linux
  109c109
  < sixteenbit="16"
  ---
  > sixteenbit=""
  # Then,
  grub2-mkconfig -o /boot/grub/grub.cfg

-----

To open a Java KVM successfully, log into the iLO web UI in another tab. Then browse to %%https://apama{001|002}.ipmi.sepia.ceph.com/html/java_irc.html?lang=en%%

===== Checking Disk SMART data =====

  # If needed...
  wget http://downloads.linux.hpe.com/SDR/downloads/MCP/ubuntu/pool/non-free/hpacucli_9.40.1-1._amd64.deb
  dpkg -i hpacucli_9.40.1-1._amd64.deb
  # Then...
  smartctl -x -A -d sat+cciss,0 /dev/sdX
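
A minimal sketch for checking several drives with the smartctl command above. It assumes the number after ''cciss,'' selects the physical disk on the controller (as the smartctl man page describes) and that indexes start at 0. Neither assumption was verified on this hardware. The function name ''smart_cmds'' and the drive count passed to it are hypothetical. The function only prints the commands, so they can be reviewed before anything is run.

```shell
#!/bin/sh
# Print (do not run) one smartctl command per physical-drive index.
#   $1 = block device exposed by the controller, e.g. /dev/sda
#   $2 = how many cciss indexes to walk (assumption: numbering starts at 0)
smart_cmds() {
    dev="$1"
    count="$2"
    i=0
    while [ "$i" -lt "$count" ]; do
        # Same flags as the single-drive command; only the index changes.
        printf 'smartctl -x -A -d sat+cciss,%s %s\n' "$i" "$dev"
        i=$((i + 1))
    done
}
```

For example, ''smart_cmds /dev/sdX 4'' prints the commands for indexes 0 through 3. Piping that output to ''sh'' would execute them.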