This shows you the differences between two versions of the page.
hardware:apama [2016/03/01 22:48] dgalloway adding table |
hardware:apama [2020/01/30 15:55] (current) djgalloway |
||
---|---|---|---|
Line 1: | Line 1: | ||
- | ====== apama{001,002} ====== | + | ====== apama{001,002} (RETIRED) ====== |
https://opendcim.engineering.redhat.com/devices.php?DeviceID=7792 | https://opendcim.engineering.redhat.com/devices.php?DeviceID=7792 | ||
+ | |||
+ | This hardware was retired on 30JAN2020 to make room for new [[hardware:vossi]] (rex replacements after LA office closure). | ||
===== Summary ===== | ===== Summary ===== | ||
- | apama001 and apama002 are two nodes in an HP SL4540 Gen8 chassis. | + | apama001 and apama002 are two nodes in an HP SL4540 Gen8 chassis. Both nodes are running Xenial because the systems hit a [[http://h20564.www2.hp.com/hpsc/doc/public/display?docId=mmr_kc-0111501|RSOD]] (Red Screen of Death) when running rpm-based distros. It's possible to work around it in grub, but it's not worth the effort. See Quirks below for more info. |
- | As of this writing, they need to be reinstalled with (probably*) Fedora and added to the [[services:longrunningcluster]]. | + | apama001 is running RHEL and the plan is to use it as a docker host. apama002 is an OSD node in the [[services:longrunningcluster]]. |
- | The iLOs are currently not accessible. Probably not cabled or switch port config required. | + | The iLOs are accessible, but the Java KVM is mostly useless because, without an Enterprise License, it quits after POST. |
- | //* Fedora because it's good to have multiple OSes running OSDs in the same cluster.// | + | The systems were updated to the latest firmware (2016.04.0-SPP2016040) the week of 1MAY2016 using hpsum on CentOS. |
===== Hardware Specs ===== | ===== Hardware Specs ===== | ||
+ | | ^ Count ^ Manufacturer ^ Model ^ Capacity ^ Notes ^ | ||
+ | ^ Chassis | N/A | HP | ProLiant SL4540 Gen8 | N/A | | | ||
+ | ^ Mainboard | N/A | HP | Not Specified | N/A | | | ||
+ | ^ CPU | 2 | Intel | Xeon(R) CPU E5-2407 0 @ 2.20GHz | N/A | [[http://ark.intel.com/products/64614/Intel-Xeon-Processor-E5-2407-10M-Cache-2_20-GHz-6_40-GTs-Intel-QPI|ARK]] | | ||
+ | ^ RAM | 12 DIMMs | Not Specified | Not Specified | 4GB | 48GB total | | ||
+ | ^ HDD | 1x | HP | MM0500GBKAK | 500GB | | | ||
+ | ^ HDD | 25x | HP | MB3000GBKAC | 3TB | Labelled as Logical Volumes in smartctl output | | ||
+ | ^ SSD | 0 | | | | | | ||
+ | ^ NIC | 2 | Intel | I350 Gigabit Network Connection | 1Gb | Connected but not used | | ||
+ | ^ NIC | 1 | Mellanox | MT27500 | 10Gbps/40Gbps | 10Gb link in use | | ||
+ | ^ BMC | 1 | HP | iLO | | Firmware 1.13 | | ||
- | ===== Hardware Specs ===== | + | ===== Quirks ===== |
- | | ^ Count ^ Manufacturer ^ Model ^ Capacity ^ Notes ^ | + | I was able to install Fedora 23 on apama001 but still got a Red Screen of Death and "Illegal Opcode" when grub tried to boot the OS. <del>Something about rpm-based distros doesn't like searching for the root partition using UUID.</del> **See UPDATE below** |
- | ^ Chassis | N/A | | | N/A | | | + | |
- | ^ Mainboard | N/A | | | N/A | | | + | What fixed it was finding the proper root drive and manually booting to it in a grub rescue prompt. See below. |
- | ^ CPU | | | | N/A | http://link.to.ARK.com | | + | |
- | ^ RAM | DIMMs | | | | | | + | <code> |
- | ^ HDD | | | | | | | + | grub> ls (hd0,msdos1)/boot |
- | ^ SSD | | | | | | | + | |
- | ^ NIC | | | | | | | + | vmlinuz-4.2.3-300.fc23.x86_64 System.map-4.2.3-300.fc23.x86_64 config-4.2.3-300.fc23.x86_64 initramfs-4.2.3-300.fc23.x86_64.img initramfs-0-rescue-ba8dc4c42e5a4ec483a635961112dfd8.img vmlinuz-0-rescue-ba8dc4c42e5a4ec483a635961112dfd8 initrd-plymouth.img |
+ | |||
+ | grub> set root=(hd0 <tab> | ||
+ | Possible partitions are: | ||
+ | |||
+ | Device hd0: No known filesystem detected - Sector size 512B - Total size 488386584KiB | ||
+ | |||
+ | Partition hd0,msdos1: Filesystem type ext* - Last modification time 2016-05-03 18:11:21 Tuesday, UUID 4390d2e3-3154-4855-8ad4-8417c430d982 - Partition start at 1024KiB - Total size 488385536KiB | ||
+ | grub> set root=(hd0,msdos1) | ||
+ | grub> linux /boot/vmlinuz-4.2.3-300.fc23.x86_64 root=/dev/sda1 | ||
+ | grub> initrd /boot/initramfs-4.2.3-300.fc23.x86_64.img | ||
+ | grub> boot | ||
+ | </code> | ||
+ | |||
+ | For CentOS, this had to be modified a bit: | ||
+ | |||
+ | <code> | ||
+ | grub> set root=(hd0,msdos1) | ||
+ | grub> linux /vmlinuz-3.10.0-327.el7.x86_64 root=/dev/sdz3 console=ttyS1,115200 | ||
+ | grub> initrd /initramfs-3.10.0-327.el7.x86_64.img | ||
+ | grub> boot | ||
+ | </code> | ||
+ | |||
+ | There's probably a permanent fix that can be applied in grub.cfg, but I haven't looked into it. | ||
+ | |||
+ | **UPDATE 2NOV2016** | ||
+ | \\ Turns out this isn't a UUID or disk issue at all. Changing the boot mode to not use 16-bit fixes it. | ||
+ | <code> | ||
+ | [root@apama001 ~]# diff /etc/grub.d/10_linux.orig /etc/grub.d/10_linux | ||
+ | 109c109 | ||
+ | < sixteenbit="16" | ||
+ | --- | ||
+ | > sixteenbit="" | ||
+ | |||
+ | |||
+ | # Then, | ||
+ | grub2-mkconfig -o /boot/grub/grub.cfg | ||
+ | </code> | ||
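The one-line change in the diff above could be scripted roughly as follows. This is only a sketch: ''disable_sixteenbit'' is a hypothetical helper name (not part of grub), it assumes GNU sed, and it assumes the file contains the literal ''sixteenbit="16"'' assignment shown in the diff.

```shell
#!/bin/sh
# Sketch only: disable_sixteenbit is a hypothetical helper, not a grub tool.
# It blanks the sixteenbit="16" assignment in the file passed as $1.
disable_sixteenbit() {
    file="$1"
    # Keep a backup using the same .orig suffix seen in the diff above,
    # but never overwrite an existing backup.
    [ -e "$file.orig" ] || cp -p "$file" "$file.orig" || return 1
    # Assumes GNU sed (-i edits in place); only the literal
    # sixteenbit="16" assignment is rewritten.
    sed -i 's/sixteenbit="16"/sixteenbit=""/' "$file"
}
```

As root, this would be called as ''disable_sixteenbit /etc/grub.d/10_linux'', followed by the grub2-mkconfig step shown above.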
+ | ----- | ||
+ | To open a Java KVM successfully, log into the iLO web UI in another tab. Then browse to %%https://apama{001|002}.ipmi.sepia.ceph.com/html/java_irc.html?lang=en%% | ||
- | **Per node** | + | ===== Checking Disk SMART data ===== |
- | * 48GB RAM | + | <code> |
- | * 2x Intel(R) Xeon(R) CPU E5-2407 0 @ 2.20GHz | + | # If needed... |
- | * 1x 500GB HP HDD | + | wget http://downloads.linux.hpe.com/SDR/downloads/MCP/ubuntu/pool/non-free/hpacucli_9.40.1-1._amd64.deb |
- | * 26x 3TB HP HDD | + | dpkg -i hpacucli_9.40.1-1._amd64.deb |
- | These also have 10Gb SFP+ ports that are cabled but may need switch port configuration. Probably useful to have before re-adding to cluster. | + | # Then... |
+ | smartctl -x -A -d sat+cciss,0 /dev/sdX | ||
+ | </code> |
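To check more than one drive, the smartctl command above can be repeated with a different index after ''cciss,''. The sketch below only prints the commands so they can be reviewed before running. ''print_smart_cmds'' is a hypothetical helper name, the assumption that drives are indexed contiguously from 0 has not been verified on these nodes, and the range of indexes smartctl accepts depends on the smartmontools version.

```shell
#!/bin/sh
# Sketch only: print_smart_cmds is a hypothetical helper.
# It prints one smartctl command per drive index, reusing the
# "-x -A -d sat+cciss,N" form from the command above.
print_smart_cmds() {
    dev="$1"      # block device behind the controller, e.g. /dev/sda
    count="$2"    # number of drives to query (assumed to be indexed from 0)
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "smartctl -x -A -d sat+cciss,$i $dev"
        i=$((i + 1))
    done
}
```

For example, ''print_smart_cmds /dev/sda 3'' prints the commands for indexes 0 through 2; piping the output to ''sh'' would run them.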