====== reesi{001..006} ======

===== Summary =====
The Sepia lab has 6 storage hosts, purchased in December 2017 to replace the miras used in the [[services:LONGRUNNINGCLUSTER]].

===== Purchasing details =====
https://docs.google.com/spreadsheets/d/1-CqEo-927duC0sClr78WcZgVIFdY55iE_K8oU2G7aSw

Racking ticket: https://redhat.service-now.com/surl.do?n=PNT0117459

===== Hardware Specs =====
|             ^ Count   ^ Manufacturer ^ Model ^ Capacity ^ Notes ^
^ Chassis     | 2U      | Supermicro | SSG-6028R-E1CR12H | N/A | |
^ Mainboard   | N/A     | Supermicro | X10DRH-iT | N/A | |
^ CPU         | 1       | Intel | Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz | N/A | [[https://ark.intel.com/products/92986/Intel-Xeon-Processor-E5-2620-v4-20M-Cache-2_10-GHz|ARK]] |
^ RAM         | 4 DIMMs | Samsung | M393A2G40EB1-CRC | 16GB | 64GB total |
^ SSD         | 2       | Intel | SSDSC2BB150G7 (S3520) | 150GB | Software RAID1 for OS |
^ HDD         | 11      | Seagate | ST4000NM0025 | 4TB | SAS 7200RPM for OSDs |
^ HDD         | 1       | HGST | HUH721212AL5200 | 12TB | SAS 7200RPM added 1AUG2019 at Brett's request. |
^ NVMe        | 1       | Micron | MTFDHBG800MCG-1AN1ZABYY | 800GB | Carved up as logical volumes on two partitions: 400GB as an OSD and the other 400GB divided by 12 for HDD OSD journals |
^ NIC         | 2 ports | Intel | X540-AT2 | 10Gb | RJ45 (not used) |
^ NIC         | 2 ports | Intel | 82599ES | 10Gb | 1 port cabled per system on front VLAN |
^ BMC         | 1       | Supermicro | N/A | N/A | Reachable at $host.ipmi.sepia.ceph.com |

===== OSD/Block Device Information =====
The reesi have 11x 4TB HDD, 1x 12TB HDD, and 1x 800GB NVMe. The 12TB drives were added so we can say we're testing on drives larger than 8TB.
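
As a rough sanity check on the NVMe sizing noted in the hardware table (half the card divided among the 12 HDD OSDs), the sketch below computes the largest whole-GiB logical volume that fits 12 times into the first NVMe partition. The helper name ''journal_lv_size_gib'' is hypothetical, and the 372 GiB input is an assumption taken from the ''nvme0n1p1'' size that ''lsblk'' reports on reesi001, rounded down.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of any tool): largest whole-GiB LV size
# that lets `count` equal volumes fit in a partition of `part_gib` GiB.
# Shell integer division rounds down, which is what we want here.
journal_lv_size_gib() {
    local part_gib=$1 count=$2
    echo $(( part_gib / count ))
}

# Assumption: nvme0n1p1 shows as 372.6G in lsblk; use the whole-GiB part.
journal_lv_size_gib 372 12
```

With these inputs the result is 31, which lines up with the 31G volumes in the ''journals'' volume group.
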

The NVMe device is split into two roughly equal partitions:
  - The first is split into 12 logical volumes that serve as block.db for each HDD OSD
  - The second is used as an SSD OSD

  root@reesi001:~# lsblk
  NAME               MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
  sda                  8:0    0   3.7T  0 disk
  └─ceph--28f7427e--5558--4ffd--ae1a--51ec3042759a-osd--block--63c0a64e--a0d2--4daf--87ec--c4b00b9f3ab9 252:12   0   3.7T  0 lvm
  sdb                  8:16   0   3.7T  0 disk
  └─ceph--9ddb7c35--20a6--4099--9127--947141c5452e-osd--block--7233f174--6402--4094--a48b--9aaabf508cb2 252:13   0   3.7T  0 lvm
  sdc                  8:32   0   3.7T  0 disk
  └─ceph--cbfd182d--02e9--4c5a--b06e--7497d6109c87-osd--block--58eecdfe--b984--46da--a37a--cd04867b4e3f 252:14   0   3.7T  0 lvm
  sdd                  8:48   0   3.7T  0 disk
  └─ceph--9632797e--c4d3--45df--a2bc--03e466c16224-osd--block--20171f57--5931--4402--bbe8--a7f703aa47db 252:16   0   3.7T  0 lvm
  sde                  8:64   0   3.7T  0 disk
  sdf                  8:80   0   3.7T  0 disk
  sdg                  8:96   0   3.7T  0 disk
  sdh                  8:112  0   3.7T  0 disk
  sdi                  8:128  0   3.7T  0 disk
  sdj                  8:144  0   3.7T  0 disk
  sdk                  8:160  0   3.7T  0 disk
  sdl                  8:176  0   3.7T  0 disk
  sdm                  8:192  0 139.8G  0 disk
  ├─sdm1               8:193  0   4.7G  0 part
  │ └─md2              9:2    0   4.7G  0 raid1 /boot
  ├─sdm2               8:194  0 116.4G  0 part
  │ └─md1              9:1    0 116.4G  0 raid1 /
  └─sdm3               8:195  0  14.9G  0 part  [SWAP]
  sdn                  8:208  0 139.8G  0 disk
  ├─sdn1               8:209  0   4.7G  0 part
  │ └─md2              9:2    0   4.7G  0 raid1 /boot
  └─sdn2               8:210  0 116.4G  0 part
    └─md1              9:1    0 116.4G  0 raid1 /
  nvme0n1            259:0    0 745.2G  0 disk
  ├─nvme0n1p1        259:3    0 372.6G  0 part
  │ ├─journals-sda   252:0    0    31G  0 lvm
  │ ├─journals-sdb   252:1    0    31G  0 lvm
  │ ├─journals-sdc   252:2    0    31G  0 lvm
  │ ├─journals-sdd   252:3    0    31G  0 lvm
  │ ├─journals-sde   252:4    0    31G  0 lvm
  │ ├─journals-sdf   252:5    0    31G  0 lvm
  │ ├─journals-sdg   252:6    0    31G  0 lvm
  │ ├─journals-sdh   252:7    0    31G  0 lvm
  │ ├─journals-sdi   252:8    0    31G  0 lvm
  │ ├─journals-sdj   252:9    0    31G  0 lvm
  │ ├─journals-sdk   252:10   0    31G  0 lvm
  │ └─journals-sdl   252:11   0    31G  0 lvm
  └─nvme0n1p2        259:4    0 365.2G  0 part
    └─ceph--9f7b3261--4778--47f9--9291--55630a41c262-osd--block--90e64c51--3344--47ce--a390--7931be9f95f1 252:15   0 365.2G  0 lvm

==== How to partition/re-partition the NVMe device ====
Here's my
bash history that can be used to set up a reesi machine's NVMe card.

  # Create a GPT label, then two partitions of roughly half the device each
  ansible -a "sudo parted -s /dev/nvme0n1 mktable gpt" reesi*
  ansible -a "sudo parted /dev/nvme0n1 unit '%' mkpart foo 0 50" reesi*
  ansible -a "sudo parted /dev/nvme0n1 unit '%' mkpart foo 51 100" reesi*
  # Put the first partition under LVM as the "journals" volume group
  ansible -a "sudo pvcreate /dev/nvme0n1p1" reesi*
  ansible -a "sudo vgcreate journals /dev/nvme0n1p1" reesi*
  # One 31G logical volume per HDD, named after the drive letter
  for disk in sd{a..l}; do ansible -a "sudo lvcreate -L 31G -n $disk journals" reesi*; done

===== Replacing Drives =====
Like the [[hardware:mira]], the drive letters do **not** correspond to drive bays, so ''/dev/sda'' isn't necessarily in Drive Bay 1. Keep this in mind when zapping/setting up OSDs. Also, drive ''/dev/sda'' does not necessarily have its block.db (and WAL) on ''/dev/journals/sda''.

To identify the drive you're replacing, watch the front of the system while running ''dd if=/dev/$DRIVE of=/dev/null'', where ''$DRIVE'' is that drive. The bay showing constant read activity is the one you want.

===== Set up a new OSD with journal on NVMe logical volume =====
  ceph-volume lvm create --bluestore --data /dev/sda --block.db journals/sda

**Example of successfully deployed OSD**

  root@reesi001:~# ceph-volume lvm list
  ====== osd.94 ======
    [block]    /dev/ceph-28f7427e-5558-4ffd-ae1a-51ec3042759a/osd-block-63c0a64e-a0d2-4daf-87ec-c4b00b9f3ab9
        type                      block
        osd id                    94
        cluster fsid              28f7427e-5558-4ffd-ae1a-51ec3042759a
        cluster name              ceph
        osd fsid                  63c0a64e-a0d2-4daf-87ec-c4b00b9f3ab9
        db device                 /dev/journals/sda
        encrypted                 0
        db uuid                   X2SlQ5-0zx2-CuHJ-GEbJ-5JS8-ly5O-emmFI9
        cephx lockbox secret
        block uuid                Xvjsm3-95vU-KNmw-5DuK-i3cx-fmic-xfSK2w
        block device              /dev/ceph-28f7427e-5558-4ffd-ae1a-51ec3042759a/osd-block-63c0a64e-a0d2-4daf-87ec-c4b00b9f3ab9
        crush device class        None
    [  db]    /dev/journals/sda
        type                      db
        osd id                    94
        cluster fsid              28f7427e-5558-4ffd-ae1a-51ec3042759a
        cluster name              ceph
        osd fsid                  63c0a64e-a0d2-4daf-87ec-c4b00b9f3ab9
        db device                 /dev/journals/sda
        encrypted                 0
        db uuid                   X2SlQ5-0zx2-CuHJ-GEbJ-5JS8-ly5O-emmFI9
        cephx lockbox secret
        block uuid                Xvjsm3-95vU-KNmw-5DuK-i3cx-fmic-xfSK2w
        block device              /dev/ceph-28f7427e-5558-4ffd-ae1a-51ec3042759a/osd-block-63c0a64e-a0d2-4daf-87ec-c4b00b9f3ab9
        crush device class        None

===== Checking NVMe Card SMART Data =====
  nvme smart-log /dev/nvme0n1

===== Updating BIOS =====
TBD