This shows you the differences between two versions of the page.
hardware:officinalis [2019/09/26 17:05] djgalloway [Table] |
hardware:officinalis [2022/06/01 17:40] (current) djgalloway [Table] |
||
---|---|---|---|
Line 8: | Line 8: | ||
^ Mainboard | N/A | Quanta | S5B-MB (LBG-1G) | N/A | | | ^ Mainboard | N/A | Quanta | S5B-MB (LBG-1G) | N/A | | | ||
^ CPU | 2 | Intel | Intel(R) Xeon(R) Platinum 8276M CPU @ 2.20GHz | 112 | [[https://ark.intel.com/content/www/us/en/ark/products/192471/intel-xeon-platinum-8276m-processor-38-5m-cache-2-20-ghz.html|ARK]] | | ^ CPU | 2 | Intel | Intel(R) Xeon(R) Platinum 8276M CPU @ 2.20GHz | 112 | [[https://ark.intel.com/content/www/us/en/ark/products/192471/intel-xeon-platinum-8276m-processor-38-5m-cache-2-20-ghz.html|ARK]] | | ||
- | ^ RAM | 2 DIMMs | Micron | 36ASF4G72PZ-2G6H1 | 32GB | 384GB Total | | + | ^ RAM | 12 DIMMs | Micron | 36ASF4G72PZ-2G6H1 | 32GB | 384GB Total | |
^ SSD | 1 | Intel | SSDSC2KB960G8 | 1TB | For OS | | ^ SSD | 1 | Intel | SSDSC2KB960G8 | 1TB | For OS | | ||
^ NVMe | 2 | Intel | SSDPE21K750GA | 1TB | For OSD journals? | | ^ NVMe | 2 | Intel | SSDPE21K750GA | 1TB | For OSD journals? | | ||
^ NVMe | 8 | Intel | SSDPE2KX080T8 | 8TB | For OSDs | | ^ NVMe | 8 | Intel | SSDPE2KX080T8 | 8TB | For OSDs | | ||
- | ^ NIC | 2 NICs 2 x ports | Intel | XXV710 | 25Gb | Not used | | + | ^ NIC | 2 NICs 2 x ports | Intel | XXV710 | 25Gb | All 4 ports cabled and bonded | |
- | ^ BMC | 1 | Quanta | N/A | N/A | Reachable at $host.ipmi.sepia.ceph.com using usual IPMI credentials. | | + | ^ BMC | 1 | Quanta | N/A | N/A | Reachable at $host.ipmi.sepia.ceph.com using usual IPMI credentials or admin:cmb9.admin | |
===== PXE/Reimaging ===== | ===== PXE/Reimaging ===== | ||
Line 19: | Line 19: | ||
Our usual cephlab_rhel.ks kickstart is not set up to do UEFI so Anaconda will stop and say the Storage Configuration needs editing. | Our usual cephlab_rhel.ks kickstart is not set up to do UEFI so Anaconda will stop and say the Storage Configuration needs editing. | ||
+ | |||
+ | ==== Network Config ==== | ||
+ | These nodes are connected to their own QFX5200 (s/n WH0218170419 [formerly WH3619030401]) uplinked and managed by Red Hat IT. For an example of how to report an outage, see https://redhat.service-now.com/surl.do?n=INC1201508. | ||
+ | |||
+ | Ansible has a module that is supposed to create bonds, but it requires NetworkManager-glib, which isn't available in CentOS 8, so I used the https://github.com/linux-system-roles/network role instead. | ||
+ | |||
+ | Here's the command I ran from the ''examples'' dir: | ||
+ | |||
+ | <code> | ||
+ | # Set each node's bond0 address in the playbook, then run it against that node only (o01 through o10). | ||
+ | for num in {1..10}; do | ||
+ |   sed -i "s/172\.21\.3\..*/172.21.3.${num}\/20/" officinalis.yml | ||
+ |   ansible-playbook -e ansible_python_interpreter=/usr/bin/python3 officinalis.yml --limit "$(printf 'o%02d*' "$num")" | ||
+ | done | ||
+ | </code> | ||
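The command above derives both the `--limit` host pattern and the bond address from a single counter. A minimal dry-run sketch of that mapping follows, assuming the nodes are named o01 through o10 as the `--limit` patterns suggest. The function name `officinalis_target` is hypothetical, and nothing here edits the playbook or touches the nodes.

```shell
# Hypothetical helper: map a node number (1-10) to the --limit pattern and
# the bond0 address used for that node. Prints "<pattern> <address>".
officinalis_target() {
    printf 'o%02d* 172.21.3.%d/20\n' "$1" "$1"
}

# Dry run: list what each iteration would target, without running sed
# or ansible-playbook.
num=1
while [ "$num" -le 10 ]; do
    officinalis_target "$num"
    num=$((num + 1))
done
```

Reviewing this list before the real run is a cheap way to confirm that node 10 gets `o10*` rather than `o010*`.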
+ | |||
+ | Here's the yml I used: | ||
+ | |||
+ | <code> | ||
+ | --- | ||
+ | - hosts: officinalis | ||
+ |   become: true | ||
+ |   vars: | ||
+ |     network_connections: | ||
+ | ||
+ |       - name: ens20f0 | ||
+ |         persistent_state: absent | ||
+ | ||
+ |       - name: ens20f1 | ||
+ |         persistent_state: absent | ||
+ | ||
+ |       - name: ens49f0 | ||
+ |         persistent_state: absent | ||
+ | ||
+ |       - name: ens49f1 | ||
+ |         persistent_state: absent | ||
+ | ||
+ |       # Create a bond profile | ||
+ |       - name: bond0 | ||
+ |         state: up | ||
+ |         type: bond | ||
+ |         ip: | ||
+ |           address: 172.21.3.10/20 | ||
+ |           gateway4: 172.21.15.254 | ||
+ |           dns: | ||
+ |             - 172.21.0.1 | ||
+ |             - 172.21.0.2 | ||
+ |           dns_search: | ||
+ |             - front.sepia.ceph.com | ||
+ |         bond: | ||
+ |           mode: 802.3ad | ||
+ |         mtu: 1450 | ||
+ | ||
+ |       # enslave an ethernet to the bond | ||
+ |       - name: ens20f0 | ||
+ |         state: up | ||
+ |         type: ethernet | ||
+ |         master: bond0 | ||
+ | ||
+ |       # enslave an ethernet to the bond | ||
+ |       - name: ens20f1 | ||
+ |         state: up | ||
+ |         type: ethernet | ||
+ |         master: bond0 | ||
+ | ||
+ |       # enslave an ethernet to the bond | ||
+ |       - name: ens49f0 | ||
+ |         state: up | ||
+ |         type: ethernet | ||
+ |         master: bond0 | ||
+ | ||
+ |       # enslave an ethernet to the bond | ||
+ |       - name: ens49f1 | ||
+ |         state: up | ||
+ |         type: ethernet | ||
+ |         master: bond0 | ||
+ | ||
+ |   roles: | ||
+ |     - linux-system-roles.network | ||
+ | </code> | ||
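After the playbook runs, the bond can be checked on a node by reading the kernel's bonding status file. This is a minimal sketch, assuming the standard Linux `/proc/net/bonding/bond0` layout, in which each enslaved port appears on a `Slave Interface:` line. The function name `bond_slave_count` is hypothetical, and the optional path argument exists only so the function can be pointed at a saved copy of the file.

```shell
# Hypothetical helper: count the interfaces enslaved to a bond by counting
# "Slave Interface:" lines in the bonding status file.
bond_slave_count() {
    status_file=${1:-/proc/net/bonding/bond0}
    grep -c '^Slave Interface:' "$status_file"
}

# On a node, print the count only if the bond actually exists.
if [ -r /proc/net/bonding/bond0 ]; then
    echo "bond0 slaves: $(bond_slave_count)"
fi
```

With all four XXV710 ports cabled and bonded as described in the hardware table, the count on a healthy node should be 4.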
+ | |||
+ | ==== List of Outages ==== | ||
+ | The QFX5200 serving the Officinalis lab goes down regularly. This table tracks the dates and tickets. | ||
+ | |||
+ | ^ Date ^ Ticket ^ Notes ^ | ||
+ | | 12/10/2019 | https://redhat.service-now.com/surl.do?n=PNT0731289 | | | ||
+ | | 3/2/2020 | https://redhat.service-now.com/surl.do?n=INC1201508 | | | ||
+ | | 6/16/2020 | https://redhat.service-now.com/surl.do?n=RITM0706076 | | | ||
+ | | 3/16/2021 | https://redhat.service-now.com/surl.do?n=INC1672259 | Added redundant link after this one\\ https://redhat.service-now.com/surl.do?n=RITM0884572 | | ||
+ | | 5/11/2021 | https://redhat.service-now.com/surl.do?n=INC1758733 | | |