The Sepia lab has 6 storage hosts purchased in December 2017 to replace the miras used in the LONGRUNNINGCLUSTER.
Spreadsheet: https://docs.google.com/spreadsheets/d/1-CqEo-927duC0sClr78WcZgVIFdY55iE_K8oU2G7aSw
Racking ticket: https://redhat.service-now.com/surl.do?n=PNT0117459
| Component | Count | Manufacturer | Model | Capacity | Notes |
|---|---|---|---|---|---|
| Chassis | 1 | Supermicro | SSG-6028R-E1CR12H | N/A | 2U |
| Mainboard | N/A | Supermicro | X10DRH-iT | N/A | |
| CPU | 1 | Intel | Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz | N/A | ARK |
| RAM | 4 DIMMs | Samsung | M393A2G40EB1-CRC | 16GB | 64GB total |
| SSD | 2 | Intel | SSDSC2BB150G7 (S3520) | 150GB | Software RAID1 for OS |
| HDD | 11 | Seagate | ST4000NM0025 | 4TB | SAS 7200RPM for OSDs |
| HDD | 1 | HGST | HUH721212AL5200 | 12TB | SAS 7200RPM added 1AUG2019 at Brett's request. |
| NVMe | 1 | Micron | MTFDHBG800MCG-1AN1ZABYY | 800GB | Carved up as logical volumes on two partitions. 400GB as an OSD and the other 400GB divided by 12 for HDD OSD journals |
| NIC | 2 ports | Intel | X540-AT2 | 10Gb | RJ45 (not used) |
| NIC | 2 ports | Intel | 82599ES | 10Gb | 1 port cabled per system on front VLAN |
| BMC | 1 | Supermicro | N/A | N/A | Reachable at $host.ipmi.sepia.ceph.com |
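The BMCs answer standard IPMI at the hostname pattern in the table above. A hedged example using ipmitool (the username/password are placeholders, not the lab's actual credentials):

```
# Query the power state of reesi001 via its BMC over the LAN interface.
# ADMIN/<password> are placeholders; use the lab's IPMI credentials.
ipmitool -I lanplus -H reesi001.ipmi.sepia.ceph.com -U ADMIN -P <password> chassis power status
```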
The reesi have 11x 4TB HDD, 1x 12TB HDD, and 1x 800GB NVMe.
The 12TB drives were added so that we can say we're testing on drives larger than 8TB.
The NVMe device is split into two equal partitions:
```
root@reesi001:~# lsblk
NAME                                                                                                  MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda                                                                                                     8:0    0   3.7T  0 disk
└─ceph--28f7427e--5558--4ffd--ae1a--51ec3042759a-osd--block--63c0a64e--a0d2--4daf--87ec--c4b00b9f3ab9 252:12   0   3.7T  0 lvm
sdb                                                                                                     8:16   0   3.7T  0 disk
└─ceph--9ddb7c35--20a6--4099--9127--947141c5452e-osd--block--7233f174--6402--4094--a48b--9aaabf508cb2 252:13   0   3.7T  0 lvm
sdc                                                                                                     8:32   0   3.7T  0 disk
└─ceph--cbfd182d--02e9--4c5a--b06e--7497d6109c87-osd--block--58eecdfe--b984--46da--a37a--cd04867b4e3f 252:14   0   3.7T  0 lvm
sdd                                                                                                     8:48   0   3.7T  0 disk
└─ceph--9632797e--c4d3--45df--a2bc--03e466c16224-osd--block--20171f57--5931--4402--bbe8--a7f703aa47db 252:16   0   3.7T  0 lvm
sde                                                                                                     8:64   0   3.7T  0 disk
sdf                                                                                                     8:80   0   3.7T  0 disk
sdg                                                                                                     8:96   0   3.7T  0 disk
sdh                                                                                                     8:112  0   3.7T  0 disk
sdi                                                                                                     8:128  0   3.7T  0 disk
sdj                                                                                                     8:144  0   3.7T  0 disk
sdk                                                                                                     8:160  0   3.7T  0 disk
sdl                                                                                                     8:176  0   3.7T  0 disk
sdm                                                                                                     8:192  0 139.8G  0 disk
├─sdm1                                                                                                  8:193  0   4.7G  0 part
│ └─md2                                                                                                 9:2    0   4.7G  0 raid1 /boot
├─sdm2                                                                                                  8:194  0 116.4G  0 part
│ └─md1                                                                                                 9:1    0 116.4G  0 raid1 /
└─sdm3                                                                                                  8:195  0  14.9G  0 part  [SWAP]
sdn                                                                                                     8:208  0 139.8G  0 disk
├─sdn1                                                                                                  8:209  0   4.7G  0 part
│ └─md2                                                                                                 9:2    0   4.7G  0 raid1 /boot
└─sdn2                                                                                                  8:210  0 116.4G  0 part
  └─md1                                                                                                 9:1    0 116.4G  0 raid1 /
nvme0n1                                                                                               259:0    0 745.2G  0 disk
├─nvme0n1p1                                                                                           259:3    0 372.6G  0 part
│ ├─journals-sda                                                                                      252:0    0    31G  0 lvm
│ ├─journals-sdb                                                                                      252:1    0    31G  0 lvm
│ ├─journals-sdc                                                                                      252:2    0    31G  0 lvm
│ ├─journals-sdd                                                                                      252:3    0    31G  0 lvm
│ ├─journals-sde                                                                                      252:4    0    31G  0 lvm
│ ├─journals-sdf                                                                                      252:5    0    31G  0 lvm
│ ├─journals-sdg                                                                                      252:6    0    31G  0 lvm
│ ├─journals-sdh                                                                                      252:7    0    31G  0 lvm
│ ├─journals-sdi                                                                                      252:8    0    31G  0 lvm
│ ├─journals-sdj                                                                                      252:9    0    31G  0 lvm
│ ├─journals-sdk                                                                                      252:10   0    31G  0 lvm
│ └─journals-sdl                                                                                      252:11   0    31G  0 lvm
└─nvme0n1p2                                                                                           259:4    0 365.2G  0 part
  └─ceph--9f7b3261--4778--47f9--9291--55630a41c262-osd--block--90e64c51--3344--47ce--a390--7931be9f95f1 252:15 0 365.2G  0 lvm
```
Here's the bash history that can be used to set up a reesi machine's NVMe card:
ansible -a "sudo parted -s /dev/nvme0n1 mktable gpt" reesi*
ansible -a "sudo parted /dev/nvme0n1 unit '%' mkpart foo 0 50" reesi*
ansible -a "sudo parted /dev/nvme0n1 unit '%' mkpart foo 51 100" reesi*
ansible -a "sudo pvcreate /dev/nvme0n1p1" reesi*
ansible -a "sudo vgcreate journals /dev/nvme0n1p1" reesi*
for disk in sd{a..l}; do ansible -a "sudo lvcreate -L 31G -n $disk journals" reesi*; done
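To sanity-check the result across all hosts, the volume group and its 12 journal LVs can be listed the same way. A quick sketch using standard LVM tools (adjust the host pattern as needed):

```
# Confirm the journals VG exists and holds one 31G LV per HDD
ansible -a "sudo vgs journals" reesi*
ansible -a "sudo lvs journals" reesi*
```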
Like the miras, the drive letters do not correspond to drive bays, so /dev/sda isn't necessarily in drive bay 1. Keep this in mind when zapping/setting up OSDs. Also, /dev/sda may not have its WAL/DB block on /dev/journals/sda.
While watching the front of a system, run `dd if=/dev/$DRIVE of=/dev/null` (where $DRIVE is the drive you're replacing) and identify the drive by its activity light.
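A minimal sketch of that identification step (the device name here is only an example; substitute the drive you're checking):

```
# Read the suspect drive to make its activity LED blink; Ctrl-C once spotted.
# /dev/sdc is an example device name, not necessarily the one you want.
sudo dd if=/dev/sdc of=/dev/null bs=1M
```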
Then create the OSD, pointing its DB at the matching journal LV:

```
ceph-volume lvm create --bluestore --data /dev/sda --block.db journals/sda
```
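If the drive previously hosted an OSD, it will likely need to be zapped before re-creating. A hedged sketch using ceph-volume (the device and LV names are examples; confirm the actual mapping first, since letters and bays don't line up):

```
# Check which data device and DB LV belong to the OSD being replaced
sudo ceph-volume lvm list

# Wipe the old data device (removing its LVs) and clear the matching DB LV
sudo ceph-volume lvm zap --destroy /dev/sda
sudo ceph-volume lvm zap journals/sda
```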
Example of a successfully deployed OSD:
```
root@reesi001:~# ceph-volume lvm list

====== osd.94 ======

  [block]    /dev/ceph-28f7427e-5558-4ffd-ae1a-51ec3042759a/osd-block-63c0a64e-a0d2-4daf-87ec-c4b00b9f3ab9

      type                      block
      osd id                    94
      cluster fsid              28f7427e-5558-4ffd-ae1a-51ec3042759a
      cluster name              ceph
      osd fsid                  63c0a64e-a0d2-4daf-87ec-c4b00b9f3ab9
      db device                 /dev/journals/sda
      encrypted                 0
      db uuid                   X2SlQ5-0zx2-CuHJ-GEbJ-5JS8-ly5O-emmFI9
      cephx lockbox secret
      block uuid                Xvjsm3-95vU-KNmw-5DuK-i3cx-fmic-xfSK2w
      block device              /dev/ceph-28f7427e-5558-4ffd-ae1a-51ec3042759a/osd-block-63c0a64e-a0d2-4daf-87ec-c4b00b9f3ab9
      crush device class        None

  [  db]     /dev/journals/sda

      type                      db
      osd id                    94
      cluster fsid              28f7427e-5558-4ffd-ae1a-51ec3042759a
      cluster name              ceph
      osd fsid                  63c0a64e-a0d2-4daf-87ec-c4b00b9f3ab9
      db device                 /dev/journals/sda
      encrypted                 0
      db uuid                   X2SlQ5-0zx2-CuHJ-GEbJ-5JS8-ly5O-emmFI9
      cephx lockbox secret
      block uuid                Xvjsm3-95vU-KNmw-5DuK-i3cx-fmic-xfSK2w
      block device              /dev/ceph-28f7427e-5558-4ffd-ae1a-51ec3042759a/osd-block-63c0a64e-a0d2-4daf-87ec-c4b00b9f3ab9
      crush device class        None
```
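To inspect a single OSD instead of everything on the host, `ceph-volume lvm list` also accepts a device or LV path. A quick sketch (/dev/sda is just an example):

```
# Show only the OSD backed by this data device
sudo ceph-volume lvm list /dev/sda
```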
NVMe drive health/wear can be checked with nvme-cli:

```
nvme smart-log /dev/nvme0n1
```

TBD