====== Officinalis | o{01..10} ======
===== Summary =====
Ten nodes donated by Intel for performance testing.
===== Hardware Specs =====
| ^ Count ^ Manufacturer ^ Model ^ Capacity ^ Notes ^
^ Chassis | 1U | Quanta | D52B-1U | N/A | |
^ Mainboard | N/A | Quanta | S5B-MB (LBG-1G) | N/A | |
^ CPU | 2 | Intel | Intel(R) Xeon(R) Platinum 8276M CPU @ 2.20GHz | 112 threads total (2 x 28 cores, 2 threads per core) | [[https://ark.intel.com/content/www/us/en/ark/products/192471/intel-xeon-platinum-8276m-processor-38-5m-cache-2-20-ghz.html|ARK]] |
^ RAM | 12 DIMMs | Micron | 36ASF4G72PZ-2G6H1 | 32GB | 384GB Total |
^ SSD | 1 | Intel | SSDSC2KB960G8 | 960GB | For OS |
^ NVMe | 2 | Intel | SSDPE21K750GA | 750GB | For OSD journals? |
^ NVMe | 8 | Intel | SSDPE2KX080T8 | 8TB | For OSDs |
^ NIC | 2 NICs x 2 ports | Intel | XXV710 | 25Gb | All 4 ports cabled and bonded |
^ BMC | 1 | Quanta | N/A | N/A | Reachable at $host.ipmi.sepia.ceph.com using usual IPMI credentials or admin:cmb9.admin |
===== PXE/Reimaging =====
These machines are configured in DHCP to receive ''/var/lib/tftpboot/grub/grub-x86_64.efi'' from the Cobbler host when PXE booting. I had trouble PXE booting in BIOS mode on these machines, so we're using UEFI.
Our usual ''cephlab_rhel.ks'' kickstart is not set up for UEFI, so Anaconda will stop and report that the Storage Configuration needs editing.
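To kick off a reimage remotely, the BMC can be asked for a one-time UEFI PXE boot. The following is a minimal sketch, not the procedure actually used on these nodes: it assumes short host names such as ''o01'', assumes ''ipmitool'' is installed where you run it, and uses ''IPMI_USER'' as a hypothetical variable for the login. It only prints the commands so they can be reviewed before running.

```shell
# Build the BMC hostname for a node, following the
# $host.ipmi.sepia.ceph.com pattern from the hardware table.
bmc_host() {
    printf '%s.ipmi.sepia.ceph.com\n' "$1"
}

# Print (do not run) the ipmitool commands that request a one-time
# UEFI PXE boot and then power-cycle the node.
# IPMI_USER is a hypothetical variable, not part of the lab setup.
# -E tells ipmitool to read the password from $IPMI_PASSWORD.
# The body runs in a subshell so its variables stay local.
pxe_uefi_cmds() (
    host=$(bmc_host "$1")
    user=${IPMI_USER:-USER}
    base="ipmitool -I lanplus -H $host -U $user -E"
    printf '%s chassis bootdev pxe options=efiboot\n' "$base"
    printf '%s chassis power cycle\n' "$base"
)
```

Running ''pxe_uefi_cmds o01'' prints two lines; pipe them to ''sh'' once they look right and ''IPMI_PASSWORD'' is exported.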
==== Network Config ====
These nodes are connected to their own QFX5200 (s/n WH0218170419 [formerly WH3619030401]) uplinked and managed by Red Hat IT. For an example of how to report an outage, see https://redhat.service-now.com/surl.do?n=INC1201508.
There is an Ansible module that is supposed to let you create bonds, but it requires NetworkManager-glib, which isn't in CentOS 8, so I found and used https://github.com/linux-system-roles/network instead.
Here's the command I ran from the ''examples'' dir:
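<code bash>
# o01..o09: rewrite the bond address in place, then apply the playbook to that one node
for num in {1..9}; do
  sed -i "s/172.21.3..*/172.21.3.$num\/20/g" officinalis.yml
  ansible-playbook -e ansible_python_interpreter=/usr/bin/python3 officinalis.yml --limit o0${num}*
done
# o10 is handled separately because its hostname has no leading zero
sed -i "s/172.21.3..*/172.21.3.10\/20/g" officinalis.yml
ansible-playbook -e ansible_python_interpreter=/usr/bin/python3 officinalis.yml --limit o10*
</code>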
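The command above treats o10 as a special case because of the leading zero in the other host names. As a hedged sketch (the helper name ''node_cidr'' is made up for illustration), the address can instead be derived from the short host name, which lets one loop cover all ten nodes:

```shell
# Map a short host name (o01..o10) to its bond address in 172.21.3.0/20.
# node_cidr is a hypothetical helper, not something that exists in the lab.
# The body runs in a subshell so its variables stay local.
node_cidr() (
    num=${1#o}     # strip the leading "o":        o07 -> 07
    num=${num#0}   # strip a single leading zero:  07 -> 7, 10 stays 10
    printf '172.21.3.%s/20\n' "$num"
)

# Possible use from the examples dir (left commented out on purpose):
# for n in 01 02 03 04 05 06 07 08 09 10; do
#     sed -i "s|172\.21\.3\..*|$(node_cidr "o$n")|" officinalis.yml
#     ansible-playbook -e ansible_python_interpreter=/usr/bin/python3 \
#         officinalis.yml --limit "o$n*"
# done
```

Using ''|'' as the sed delimiter avoids escaping the ''/'' in the prefix length.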
Here's the yml I used:
<code yaml>
---
- hosts: officinalis
  become: true
  vars:
    network_connections:
      - name: ens20f0
        persistent_state: absent
      - name: ens20f1
        persistent_state: absent
      - name: ens49f0
        persistent_state: absent
      - name: ens49f1
        persistent_state: absent
      # Create a bond profile
      - name: bond0
        state: up
        type: bond
        ip:
          address: 172.21.3.10/20
          gateway4: 172.21.15.254
          dns:
            - 172.21.0.1
            - 172.21.0.2
          dns_search:
            - front.sepia.ceph.com
        bond:
          mode: 802.3ad
        mtu: 1450
      # enslave an ethernet to the bond
      - name: ens20f0
        state: up
        type: ethernet
        master: bond0
      # enslave an ethernet to the bond
      - name: ens20f1
        state: up
        type: ethernet
        master: bond0
      # enslave an ethernet to the bond
      - name: ens49f0
        state: up
        type: ethernet
        master: bond0
      # enslave an ethernet to the bond
      - name: ens49f1
        state: up
        type: ethernet
        master: bond0
  roles:
    - linux-system-roles.network
</code>
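After the playbook runs, it may be worth confirming that the bond actually came up in 802.3ad mode with all four ports enslaved. The sketch below reads the kernel's bonding status file. The function name ''bond_check'' is made up for illustration, and the expected count of 4 is taken from the hardware table above.

```shell
# Check a bonding status file for 802.3ad mode and the expected number
# of enslaved interfaces. bond_check is a hypothetical helper.
#   $1: status file (default /proc/net/bonding/bond0)
#   $2: expected number of slaves (default 4)
# The body runs in a subshell, so "exit 1" only makes the function
# return non-zero and does not end the calling shell.
bond_check() (
    file=${1:-/proc/net/bonding/bond0}
    expected=${2:-4}
    if ! grep -q '^Bonding Mode: IEEE 802.3ad' "$file"; then
        echo "bond is not in 802.3ad mode"
        exit 1
    fi
    # grep -c prints 0 and exits non-zero when nothing matches
    slaves=$(grep -c '^Slave Interface:' "$file" || true)
    if [ "$slaves" -ne "$expected" ]; then
        echo "expected $expected slaves, found $slaves"
        exit 1
    fi
    echo "ok: 802.3ad with $slaves slaves"
)
```

On a node, calling ''bond_check'' with no arguments checks ''bond0'' against the four cabled ports. It does not verify the MTU or the LACP state on the switch side.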
==== List of Outages ====
The QFX5200 serving the Officinalis lab goes down on a regular basis. This table will keep track of dates and tickets.
^ Date ^ Ticket ^ Notes ^
| 12/10/2019 | https://redhat.service-now.com/surl.do?n=PNT0731289 | |
| 3/2/2020 | https://redhat.service-now.com/surl.do?n=INC1201508 | |
| 6/16/2020 | https://redhat.service-now.com/surl.do?n=RITM0706076 | |
| 3/16/2021 | https://redhat.service-now.com/surl.do?n=INC1672259 | Added redundant link after this one\\ https://redhat.service-now.com/surl.do?n=RITM0884572 |
| 5/11/2021 | https://redhat.service-now.com/surl.do?n=INC1758733 | |