===== Summary =====
The Ceph Foundation purchased 7 more servers to join the [[services:longrunningcluster]]. The three primary goals were:
  - Faster networking (25Gbps) between hosts
  - Large NVMe devices as OSDs
  - 12TB HDDs (largest up until now was 4TB)

===== Hardware Specs =====
|           ^ Count   ^ Manufacturer ^ Model                                       ^ Capacity ^ Notes ^
^ Chassis   | 2U      | Supermicro   | SSG-6029P-E1CR12L                           | N/A      | |
^ Mainboard | N/A     | Supermicro   | X11DPH-T                                    | N/A      | |
^ CPU       | 2       | Intel        | Intel(R) Xeon(R) Silver 4215R CPU @ 3.20GHz | N/A      | [[https://ark.intel.com/content/www/us/en/ark/products/199349/intel-xeon-silver-4215r-processor-11m-cache-3-20-ghz.html|ARK]] |
^ RAM       | 4 DIMMs | SK Hynix     | HMAA4GR7AJR8N-XN                            | 32GB     | 128GB total |
^ SSD       | 2       | Intel        | SSDSC2KG960G8 (S4510)                       | 960GB    | Software RAID1 for OS |
^ HDD       | 9       | Seagate      | ST12000NM002G                               | 12TB     | SAS 7200RPM for OSDs |
^ NVMe      | 2       | Intel        | SSDPE2KE016T8                               | 1.6TB    | For large NVMe OSDs |
^ NVMe      | 1       | Intel        | SSDPE21M375GA                               | 375GB    | Carved up as logical volumes for OSD journals |
^ NIC       | 2 ports | Intel        | X722                                        | 10Gb     | 1 port cabled BUT DISABLED. See below. |
^ NIC       | 2 ports | Mellanox     | ConnectX-4                                  | 25Gb     | For ''back'' / storage traffic |
^ BMC       | 1       | Supermicro   | N/A                                         | N/A      | Reachable at $host.ipmi.sepia.ceph.com |
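
The BMCs respond to standard IPMI tooling. A minimal sketch (the ''ADMIN'' user and the password below are placeholders, not the actual lab credentials):
<code>
# Check power/chassis state of a single ivan node via its BMC
ipmitool -I lanplus -H ivan01.ipmi.sepia.ceph.com -U ADMIN -P <password> chassis status

# Open a Serial-over-LAN console (exit with ~.)
ipmitool -I lanplus -H ivan01.ipmi.sepia.ceph.com -U ADMIN -P <password> sol activate
</code>
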
===== OSD/Block Device Information =====
I used the Orchestrator to deploy OSDs on the ivan hosts (one host at a time, to avoid a mass data rebalance into a single rack).
<code>
root@reesi001:~# cat ivan_osd_spec.yml
service_type: osd
service_id: osd_using_paths
placement:
  hosts:
    - ivan01
    - ivan02
    - ivan03
    - ivan04
    - ivan05
    - ivan06
    - ivan07
spec:
  data_devices:
    paths:
      - /dev/sdc
      - /dev/sdd
      - /dev/sde
      - /dev/sdf
      - /dev/sdg
      - /dev/sdh
      - /dev/sdi
      - /dev/sdj
      - /dev/sdk
      - /dev/nvme1n1
      - /dev/nvme2n1
  db_devices:
    paths:
      - /dev/nvme0n1
</code>
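
For completeness, this is roughly how such a spec gets applied with the orchestrator (a sketch, not a transcript of the exact commands run; since OSDs were brought up one host at a time, the ''hosts'' list above was presumably grown incrementally between applies):
<code>
# Preview which OSDs the spec would create, without deploying anything
ceph orch apply -i ivan_osd_spec.yml --dry-run

# Apply the spec for real; the orchestrator then creates OSDs on the listed hosts/devices
ceph orch apply -i ivan_osd_spec.yml
</code>
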
===== Installation Quirks/Difficulties =====
==== Networking ====
Initially, I wanted to have the 1Gb interface cabled on VLAN100 and the 25Gb interfaces cabled to VLAN101 (back.sepia.ceph.com). Up until now, I had never really used VLAN101. I was able to get both NICs up, IPs assigned, and the servers could reach each other. The LRC could also reach these servers on their 25Gb/''back'' interfaces.

I added the hosts to the cluster using the ''back'' IPs. The cluster became very unhappy, complaining about slow OPs. Come to find out, the ivan servers couldn't get **out** from their ''back'' interfaces, so the OSDs defaulted back to the 1Gb link.
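
A quick way to see which interface traffic is actually leaving from is to ask the kernel for its route to some host on the front network (the address below is a placeholder, not a real Sepia IP):
<code>
# If this prints "dev eno1", traffic is egressing the 1Gb link
# instead of the 25Gb back interface
ip route get 10.0.0.1
</code>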
+ | |||
+ | I reached out to Red Hat IT to have the 25Gb network ports switched over to VLAN100. After that, I struggled to get eno1 (the 1Gb interface) to **not** come up on boot since I didn't need it anymore. | ||
+ | |||

Finally, I figured it out:
<code>
# cat /etc/systemd/network/10-eno1.network
[Match]
Name=eno1

[Network]
DHCP=no
</code>
==== CentOS 8 ====
I could not for the life of me get ivan05 to install using the Ubuntu preseed below. Its settings are identical to the rest of the machines. I remember someone (I think GregF?) suggesting in a CLT call that we should have a mixture of OSes in the LRC, so I decided to use CentOS 8 instead.