hardware:ivan

Differences

This shows you the differences between two versions of the page.

hardware:ivan [2022/05/10 13:43]
djgalloway [Table]
hardware:ivan [2022/06/08 15:17] (current)
djgalloway
Line 2: Line 2:
 ===== Summary =====
 The Ceph Foundation purchased 7 more servers to join the [[services:longrunningcluster]]. The three primary goals were:
-  - Faster networking between hosts
+  - Faster networking (25Gbps) between hosts
   - Large NVMe devices as OSDs
   - 12TB HDDs (largest up until now was 4TB)
Line 20: Line 20:
 ^ NVMe       | 2        | Intel         | SSDPE2KE016T8 | 1.6TB     | For large NVMe OSDs                           |
 ^ NVMe       | 1        | Intel         | SSDPE21M375GA | 375GB     | Carved up as logical volumes for OSD journals |
-^ NIC        | 2 ports  | Intel         | X722          | 10Gb      | 1 port cabled for front iface                 |
+^ NIC        | 2 ports  | Intel         | X722          | 10Gb      | 1 port cabled BUT DISABLED. See below.        |
 ^ NIC        | 2 ports  | Mellanox      | ConnectX-4    | 25Gb      | For ''back'' / storage traffic                |
 ^ BMC        | 1        | Supermicro    | N/A           | N/A       | Reachable at $host.ipmi.sepia.ceph.com        |
Line 26: Line 26:
 
 ===== OSD/Block Device Information =====
-The ivan have 9x 12TB HDD, 2x 1.5TB NVMe, and 1x 350GB NVMe.
-
-The 12TB were added to so we can say we're testing on drives larger than 8TB.
-
-The smaller NVMe device is split into eleven equal logical volumes. One for each OSD's journal.
+I used the Orchestrator to deploy OSDs on the ivan hosts (I did this one by one to avoid a mass data rebalance all to one rack).
 
 <code>
-root@ivan04:~# lsblk
-NAME                 MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
-sda                    8:0    0 894.3G  0 disk
-`-sda1                 8:1    0 894.3G  0 part
-  `-md0                9:0    0 894.1G  0 raid1 /
-sdb                    8:16   0 894.3G  0 disk
-`-sdb1                 8:17   0 894.3G  0 part
-  `-md0                9:0    0 894.1G  0 raid1 /
-sdc                    8:32   0  10.9T  0 disk
-sdd                    8:48   0  10.9T  0 disk
-sde                    8:64   0  10.9T  0 disk
-sdf                    8:80   0  10.9T  0 disk
-sdg                    8:96   0  10.9T  0 disk
-sdh                    8:112  0  10.9T  0 disk
-sdi                    8:128  0  10.9T  0 disk
-sdj                    8:144  0  10.9T  0 disk
-sdk                    8:160  0  10.9T  0 disk
-sr0                   11:0    1   841M  0 rom
-nvme0n1              259:0    0 349.3G  0 disk
-`-nvme0n1p1          259:3    0 349.3G  0 part
-  |-journals-sdc     253:0    0    31G  0 lvm
-  |-journals-sdd     253:1    0    31G  0 lvm
-  |-journals-sde     253:2    0    31G  0 lvm
-  |-journals-sdf     253:3    0    31G  0 lvm
-  |-journals-sdg     253:4    0    31G  0 lvm
-  |-journals-sdh     253:5    0    31G  0 lvm
-  |-journals-sdi     253:6    0    31G  0 lvm
-  |-journals-sdj     253:7    0    31G  0 lvm
-  |-journals-sdk     253:8    0    31G  0 lvm
-  |-journals-nvme1n1 253:9    0    31G  0 lvm
-  `-journals-nvme2n1 253:10   0    31G  0 lvm
-nvme1n1              259:1    0   1.5T  0 disk
-nvme2n1              259:2    0   1.5T  0 disk
-</code>
-
-==== How to partition/re-partition the NVMe device ====
-Here's my bash history that can be used to set up a reesi machine's NVMe card.
-
-<code>
-ansible -a "sudo parted -s /dev/nvme0n1 mktable gpt" reesi*
-ansible -a "sudo parted /dev/nvme0n1 unit '%' mkpart foo 0 50" reesi*
-ansible -a "sudo parted /dev/nvme0n1 unit '%' mkpart foo 51 100" reesi*
-ansible -a "sudo pvcreate /dev/nvme0n1p1" reesi*
-ansible -a "sudo vgcreate journals /dev/nvme0n1p1" reesi*
-for disk in sd{a..l}; do ansible -a "sudo lvcreate -L 31G -n $disk journals" reesi*; done
+root@reesi001:~# cat ivan_osd_spec.yml
+service_type: osd
+service_id: osd_using_paths
+placement:
+  hosts:
+    - ivan01
+    - ivan02
+    - ivan03
+    - ivan04
+    - ivan05
+    - ivan06
+    - ivan07
+spec:
+  data_devices:
+    paths:
+    - /dev/sdc
+    - /dev/sdd
+    - /dev/sde
+    - /dev/sdf
+    - /dev/sdg
+    - /dev/sdh
+    - /dev/sdi
+    - /dev/sdj
+    - /dev/sdk
+    - /dev/nvme1n1
+    - /dev/nvme2n1
+  db_devices:
+    paths:
+    - /dev/nvme0n1
 </code>
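The spec in the new revision lists all seven hosts, while the text says the OSDs were deployed one host at a time. The page does not say how that was done. One possible approach, sketched here purely as an assumption, is to regenerate the spec with a growing host list and re-apply it after each step. The `write_osd_spec` helper and the loop are hypothetical; the script only writes `ivan_osd_spec.yml` in the current directory and prints the `ceph orch apply` command that would follow, so it can be run without a cluster.

```shell
#!/bin/sh
# Sketch only: rebuild the OSD spec with one more host per step.
# write_osd_spec is a hypothetical helper, not part of Ceph.
set -eu

# write_osd_spec OUTFILE HOST...
# Writes the same spec as on the wiki page, restricted to the given hosts.
write_osd_spec() {
  out="$1"; shift
  {
    printf 'service_type: osd\nservice_id: osd_using_paths\nplacement:\n  hosts:\n'
    printf '    - %s\n' "$@"
    printf 'spec:\n  data_devices:\n    paths:\n'
    printf '    - /dev/%s\n' sdc sdd sde sdf sdg sdh sdi sdj sdk nvme1n1 nvme2n1
    printf '  db_devices:\n    paths:\n    - /dev/nvme0n1\n'
  } > "$out"
}

hosts=""
for n in 01 02 03 04 05 06 07; do
  hosts="$hosts ivan$n"
  # Word splitting of $hosts is intentional: one argument per host.
  write_osd_spec ivan_osd_spec.yml $hosts
  # Printed rather than executed; preview with --dry-run before applying for real.
  echo "next: ceph orch apply -i ivan_osd_spec.yml --dry-run  (hosts:$hosts)"
done
```

In practice you would pause after each step, apply the spec without `--dry-run`, and wait for the cluster to settle before adding the next host.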
  
Line 89: Line 69:
 
 ===== Installation Quirks/Difficulties =====
+==== Networking ====
+Initially, I wanted to have the 1Gb interface cabled on VLAN100 and the 25Gb interfaces cabled to VLAN101 (back.sepia.ceph.com). Up until now I have never really used VLAN101. I was able to get both NICs up, IPs assigned, and the servers could reach each other. The LRC could also reach these servers on their 25Gb/''back'' interfaces.
+
+I added the hosts to the cluster using the ''back'' IPs. The cluster became very unhappy complaining about slow OPs. Come to find out the ivan servers couldn't get **out** from their ''back'' interfaces so the OSDs defaulted back to the 1Gb link.
+
+I reached out to Red Hat IT to have the 25Gb network ports switched over to VLAN100. After that, I struggled to get eno1 (the 1Gb interface) to **not** come up on boot since I didn't need it anymore.
+
+Finally I figured out: <code>
+# cat /etc/systemd/network/10-eno1.network
+[Match]
+Name=eno1
+
+[Network]
+DHCP=no
+</code>
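For anyone repeating this on another host, the unit file above can be staged and checked before it is copied into place. This is a minimal sketch under stated assumptions: the `./networkd-staging` directory is hypothetical, and the install and verify commands are only printed, not executed.

```shell
#!/bin/sh
# Sketch only: stage the networkd unit from the wiki page for manual install.
set -eu

# Hypothetical staging directory; the real file lives in /etc/systemd/network/.
stage_dir="./networkd-staging"
unit="$stage_dir/10-eno1.network"

mkdir -p "$stage_dir"
# Quoted heredoc delimiter: the content is written literally, with no expansion.
cat > "$unit" <<'EOF'
[Match]
Name=eno1

[Network]
DHCP=no
EOF

echo "staged $unit"
echo "to install: sudo cp $unit /etc/systemd/network/ && sudo systemctl restart systemd-networkd"
echo "to verify:  networkctl status eno1"
```

Running `ip route get` against a cluster address afterwards is one way to confirm that traffic leaves through the 25Gb interface rather than eno1.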
+
 ==== CentOS 8 ====
 I could not for the life of me get ivan05 to install using the Ubuntu preseed below. Its settings are identical to the rest of the machines. I remember someone (I think GregF?) suggest in a CLT call that we should have a mixture of OSes in the LRC so I decided to use CentOS8 instead.
hardware/ivan.1652190212.txt.gz · Last modified: 2022/05/10 13:43 by djgalloway