Ten nodes donated by Intel for performance testing.
Component | Count | Manufacturer | Model | Capacity | Notes |
---|---|---|---|---|---|
Chassis | 1U | Quanta | D52B-1U | N/A | |
Mainboard | N/A | Quanta | S5B-MB (LBG-1G) | N/A | |
CPU | 2 | Intel | Intel(R) Xeon(R) Platinum 8276M CPU @ 2.20GHz | 112 | ARK |
RAM | 12 DIMMs | Micron | 36ASF4G72PZ-2G6H1 | 32GB | 384GB Total |
SSD | 1 | Intel | SSDSC2KB960G8 | 1TB | For OS |
NVMe | 2 | Intel | SSDPE21K750GA | 1TB | For OSD journals? |
NVMe | 8 | Intel | SSDPE2KX080T8 | 8TB | For OSDs |
NIC | 2 NICs 2 x ports | Intel | XXV710 | 25Gb | All 4 ports cabled and bonded |
BMC | 1 | Quanta | N/A | N/A | Reachable at $host.ipmi.sepia.ceph.com using usual IPMI credentials or admin:cmb9.admin |
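
As a rough sketch of how the BMC naming in the table can be used, the helpers below build the `$host.ipmi.sepia.ceph.com` name and ask the BMC for its power state with `ipmitool`. The function names and the example node name `o05` are hypothetical. The user name is passed in by the caller, and the password is read from the `IPMI_PASSWORD` environment variable via `ipmitool`'s `-E` flag rather than being written into the command.

```shell
# Build the BMC hostname for a node, following the $host.ipmi.sepia.ceph.com
# pattern from the table. The node name (e.g. "o05") is a hypothetical example.
bmc_fqdn() {
    printf '%s.ipmi.sepia.ceph.com\n' "$1"
}

# Query chassis power state over IPMI-over-LAN.
#   $1 = node name, $2 = IPMI user name
# -E makes ipmitool read the password from the IPMI_PASSWORD environment variable.
bmc_power_status() {
    ipmitool -I lanplus -H "$(bmc_fqdn "$1")" -U "$2" -E chassis power status
}
```

For example, `IPMI_PASSWORD=... bmc_power_status o05 admin` would query the BMC of a node named `o05`, assuming that is how the node is named in DNS.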
These machines are configured in DHCP to receive /var/lib/tftpboot/grub/grub-x86_64.efi from the Cobbler host when PXE booting. I had trouble PXE booting these machines in BIOS mode, so we're using UEFI.
Our usual cephlab_rhel.ks kickstart is not set up for UEFI, so Anaconda will stop and say the Storage Configuration needs editing.
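
If the kickstart were ever extended to handle both firmware types, one building block would be detecting how the installer was booted. The sketch below is not part of cephlab_rhel.ks; it relies only on the fact that Linux exposes `/sys/firmware/efi` when booted via UEFI. The function name and its optional path argument (there so it can be exercised without real firmware) are hypothetical.

```shell
# Report the firmware boot mode of the running system.
#   $1 = directory to check (defaults to /sys/firmware/efi); the override
#        exists only so the function can be tested on any machine.
boot_mode() {
    if [ -d "${1:-/sys/firmware/efi}" ]; then
        echo UEFI
    else
        echo BIOS
    fi
}
```

A kickstart `%pre` script could call something like this to choose between a UEFI and a BIOS partition layout, though that wiring is left out here.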
These nodes are connected to their own QFX5200 (s/n WH0218170419 [formerly WH3619030401]) uplinked and managed by Red Hat IT. For an example of how to report an outage, see https://redhat.service-now.com/surl.do?n=INC1201508.
There is an Ansible module that is supposed to let you create bonds, but it requires NetworkManager-glib, which isn't in CentOS 8. So I found and used https://github.com/linux-system-roles/network instead.
Here's the command I ran from the examples dir:
for num in {1..9}; do sed -i "s/172.21.3..*/172.21.3.$num\/20/g" officinalis.yml; ansible-playbook -e ansible_python_interpreter=/usr/bin/python3 officinalis.yml --limit o0${num}*; done; sed -i "s/172.21.3..*/172.21.3.10\/20/g" officinalis.yml; ansible-playbook -e ansible_python_interpreter=/usr/bin/python3 officinalis.yml --limit o10*
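
The one-liner above is easier to follow when split into steps. The sketch below is a restructured version of the same idea, not the command that was actually run: the function names are hypothetical, GNU `sed -i` is assumed, and the dots in the address pattern are escaped so they only match literal dots.

```shell
# Map a node number (1-10) to the --limit pattern used above: 3 -> o03*, 10 -> o10*.
node_limit() {
    printf 'o%02d*\n' "$1"
}

# Map a node number to its bond address in the 172.21.3.x/20 range.
node_cidr() {
    printf '172.21.3.%d/20\n' "$1"
}

# Rewrite the bond address inside the playbook in place (GNU sed assumed).
#   $1 = playbook file, $2 = node number
set_bond_address() {
    sed -i "s|172\.21\.3\..*|$(node_cidr "$2")|" "$1"
}

# Run the playbook once per node, each time with that node's address.
#   $1 = playbook file (defaults to officinalis.yml)
configure_nodes() {
    playbook=${1:-officinalis.yml}
    for num in $(seq 1 10); do
        set_bond_address "$playbook" "$num"
        ansible-playbook -e ansible_python_interpreter=/usr/bin/python3 \
            "$playbook" --limit "$(node_limit "$num")"
    done
}
```

Zero-padding the node number in `node_limit` removes the need for the separate trailing command for node 10.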
Here's the yml I used:
---
- hosts: officinalis
  become: true
  vars:
    network_connections:
      - name: ens20f0
        persistent_state: absent
      - name: ens20f1
        persistent_state: absent
      - name: ens49f0
        persistent_state: absent
      - name: ens49f1
        persistent_state: absent

      # Create a bond profile
      - name: bond0
        state: up
        type: bond
        ip:
          address: 172.21.3.10/20
          gateway4: 172.21.15.254
          dns:
            - 172.21.0.1
            - 172.21.0.2
          dns_search:
            - front.sepia.ceph.com
        bond:
          mode: 802.3ad
        mtu: 1450

      # enslave an ethernet to the bond
      - name: ens20f0
        state: up
        type: ethernet
        master: bond0

      # enslave an ethernet to the bond
      - name: ens20f1
        state: up
        type: ethernet
        master: bond0

      # enslave an ethernet to the bond
      - name: ens49f0
        state: up
        type: ethernet
        master: bond0

      # enslave an ethernet to the bond
      - name: ens49f1
        state: up
        type: ethernet
        master: bond0
  roles:
    - linux-system-roles.network
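
After the playbook has run, the kernel's bonding status file is a quick way to confirm the result. The sketch below assumes the standard `/proc/net/bonding/bond0` layout (`Bonding Mode:` and `Slave Interface:` lines); the function names and the optional file argument are hypothetical and exist only for illustration and testing.

```shell
# Print the bonding mode reported by the kernel for a bond.
#   $1 = status file (defaults to /proc/net/bonding/bond0)
bond_mode() {
    sed -n 's/^Bonding Mode: //p' "${1:-/proc/net/bonding/bond0}"
}

# Count the interfaces currently enslaved to the bond.
#   $1 = status file (defaults to /proc/net/bonding/bond0)
bond_slave_count() {
    grep -c '^Slave Interface:' "${1:-/proc/net/bonding/bond0}"
}
```

With the configuration above, `bond_mode` would be expected to mention 802.3ad and `bond_slave_count` to print 4, one for each of ens20f0, ens20f1, ens49f0 and ens49f1.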
The QFX5200 serving the Officinalis lab goes down on a regular basis. This table will keep track of dates and tickets.
Date | Ticket | Notes |
---|---|---|
12/10/2019 | https://redhat.service-now.com/surl.do?n=PNT0731289 | |
3/2/2020 | https://redhat.service-now.com/surl.do?n=INC1201508 | |
6/16/2020 | https://redhat.service-now.com/surl.do?n=RITM0706076 | |
3/16/2021 | https://redhat.service-now.com/surl.do?n=INC1672259 | Added redundant link after this one https://redhat.service-now.com/surl.do?n=RITM0884572 |
5/11/2021 | https://redhat.service-now.com/surl.do?n=INC1758733 | |