====== reesi{001..006} ======
===== Summary =====
The Sepia lab has 6 storage hosts, purchased in December 2017 to replace the miras used in the [[services:LONGRUNNINGCLUSTER]].
===== Purchasing details =====
https://docs.google.com/spreadsheets/d/1-CqEo-927duC0sClr78WcZgVIFdY55iE_K8oU2G7aSw
Racking ticket: https://redhat.service-now.com/surl.do?n=PNT0117459
===== Hardware Specs =====
| ^ Count ^ Manufacturer ^ Model ^ Capacity ^ Notes ^
^ Chassis | 2U | Supermicro | SSG-6028R-E1CR12H | N/A | |
^ Mainboard | N/A | Supermicro | X10DRH-iT | N/A | |
^ CPU | 1 | Intel | Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz | N/A | [[https://ark.intel.com/products/92986/Intel-Xeon-Processor-E5-2620-v4-20M-Cache-2_10-GHz|ARK]] |
^ RAM | 4 DIMMs | Samsung | M393A2G40EB1-CRC | 16GB | 64GB total |
^ SSD | 2 | Intel | SSDSC2BB150G7 (S3520) | 150GB | Software RAID1 for OS |
^ HDD | 11 | Seagate | ST4000NM0025 | 4TB | SAS 7200RPM for OSDs |
^ HDD | 1 | HGST | HUH721212AL5200 | 12TB | SAS 7200RPM added 1AUG2019 at Brett's request. |
^ NVMe | 1 | Micron | MTFDHBG800MCG-1AN1ZABYY | 800GB | Split into two partitions of roughly 400GB each: one is used as an OSD; the other is divided into 12 logical volumes for the HDD OSD journals |
^ NIC | 2 ports | Intel | X540-AT2 | 10Gb | RJ45 (not used) |
^ NIC | 2 ports | Intel | 82599ES | 10Gb | 1 port cabled per system on front VLAN |
^ BMC | 1 | Supermicro | N/A | N/A | Reachable at $host.ipmi.sepia.ceph.com |
===== OSD/Block Device Information =====
Each reesi has 11x 4TB HDDs, 1x 12TB HDD, and 1x 800GB NVMe.
The 12TB drives were added so we can say we're testing on drives larger than 8TB.
The NVMe device is split into two roughly equal partitions:
- ''nvme0n1p1'': an LVM volume group (''journals'') holding 12 logical volumes, one block.db per HDD OSD
- ''nvme0n1p2'': used as an SSD OSD
root@reesi001:~# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 3.7T 0 disk
└─ceph--28f7427e--5558--4ffd--ae1a--51ec3042759a-osd--block--63c0a64e--a0d2--4daf--87ec--c4b00b9f3ab9 252:12 0 3.7T 0 lvm
sdb 8:16 0 3.7T 0 disk
└─ceph--9ddb7c35--20a6--4099--9127--947141c5452e-osd--block--7233f174--6402--4094--a48b--9aaabf508cb2 252:13 0 3.7T 0 lvm
sdc 8:32 0 3.7T 0 disk
└─ceph--cbfd182d--02e9--4c5a--b06e--7497d6109c87-osd--block--58eecdfe--b984--46da--a37a--cd04867b4e3f 252:14 0 3.7T 0 lvm
sdd 8:48 0 3.7T 0 disk
└─ceph--9632797e--c4d3--45df--a2bc--03e466c16224-osd--block--20171f57--5931--4402--bbe8--a7f703aa47db 252:16 0 3.7T 0 lvm
sde 8:64 0 3.7T 0 disk
sdf 8:80 0 3.7T 0 disk
sdg 8:96 0 3.7T 0 disk
sdh 8:112 0 3.7T 0 disk
sdi 8:128 0 3.7T 0 disk
sdj 8:144 0 3.7T 0 disk
sdk 8:160 0 3.7T 0 disk
sdl 8:176 0 3.7T 0 disk
sdm 8:192 0 139.8G 0 disk
├─sdm1 8:193 0 4.7G 0 part
│ └─md2 9:2 0 4.7G 0 raid1 /boot
├─sdm2 8:194 0 116.4G 0 part
│ └─md1 9:1 0 116.4G 0 raid1 /
└─sdm3 8:195 0 14.9G 0 part [SWAP]
sdn 8:208 0 139.8G 0 disk
├─sdn1 8:209 0 4.7G 0 part
│ └─md2 9:2 0 4.7G 0 raid1 /boot
└─sdn2 8:210 0 116.4G 0 part
└─md1 9:1 0 116.4G 0 raid1 /
nvme0n1 259:0 0 745.2G 0 disk
├─nvme0n1p1 259:3 0 372.6G 0 part
│ ├─journals-sda 252:0 0 31G 0 lvm
│ ├─journals-sdb 252:1 0 31G 0 lvm
│ ├─journals-sdc 252:2 0 31G 0 lvm
│ ├─journals-sdd 252:3 0 31G 0 lvm
│ ├─journals-sde 252:4 0 31G 0 lvm
│ ├─journals-sdf 252:5 0 31G 0 lvm
│ ├─journals-sdg 252:6 0 31G 0 lvm
│ ├─journals-sdh 252:7 0 31G 0 lvm
│ ├─journals-sdi 252:8 0 31G 0 lvm
│ ├─journals-sdj 252:9 0 31G 0 lvm
│ ├─journals-sdk 252:10 0 31G 0 lvm
│ └─journals-sdl 252:11 0 31G 0 lvm
└─nvme0n1p2 259:4 0 365.2G 0 part
└─ceph--9f7b3261--4778--47f9--9291--55630a41c262-osd--block--90e64c51--3344--47ce--a390--7931be9f95f1 252:15 0 365.2G 0 lvm
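The 31G size of each ''journals'' logical volume follows from the size of ''nvme0n1p1'' in the ''lsblk'' output above. A rough sketch of the arithmetic, with variable names that are only illustrative:

```shell
# Size of nvme0n1p1 as reported by lsblk (372.6G), written in tenths of a GiB
# so the calculation stays in shell integer math.
part_gib_tenths=3726
num_osds=12   # one block.db LV per HDD OSD (sda..sdl)

# Largest whole-GiB LV size that fits num_osds times into the partition.
lv_gib=$(( part_gib_tenths / 10 / num_osds ))
echo "LV size: ${lv_gib}G, total allocated: $(( lv_gib * num_osds ))G"
```

Twelve 31G volumes use 372G, which leaves a little headroom in the 372.6G partition for LVM metadata.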
==== How to partition/re-partition the NVMe device ====
Here's my bash history that can be used to set up a reesi machine's NVMe card.
ansible -a "sudo parted -s /dev/nvme0n1 mktable gpt" reesi*
ansible -a "sudo parted /dev/nvme0n1 unit '%' mkpart foo 0 50" reesi*
ansible -a "sudo parted /dev/nvme0n1 unit '%' mkpart foo 51 100" reesi*
ansible -a "sudo pvcreate /dev/nvme0n1p1" reesi*
ansible -a "sudo vgcreate journals /dev/nvme0n1p1" reesi*
for disk in sd{a..l}; do ansible -a "sudo lvcreate -L 31G -n $disk journals" reesi*; done
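The two ''mkpart'' commands above end the first partition at 50% and start the second at 51%, so the partitions are not exactly equal. The sketch below estimates the resulting sizes from the 745.2G that ''lsblk'' reports for ''nvme0n1''. The function name ''nvme_split'' is made up for this example, and the numbers ignore partition alignment.

```shell
# Estimate partition sizes for a 0-50% / 51-100% split of a disk.
# Usage: nvme_split <disk size in GiB>
nvme_split() {
    awk -v total="$1" 'BEGIN {
        p1  = total * (50 - 0) / 100     # first partition: 0% to 50%
        p2  = total * (100 - 51) / 100   # second partition: 51% to 100%
        gap = total - p1 - p2            # the 1% between the two partitions
        printf "p1=%.1fG p2=%.1fG unallocated=%.1fG\n", p1, p2, gap
    }'
}

nvme_split 745.2
```

The estimate for the second partition lands close to the 365.2G shown by ''lsblk'', which is why ''nvme0n1p2'' is slightly smaller than ''nvme0n1p1''.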
===== Replacing Drives =====
Like the [[hardware:mira]], the drive letters do **not** correspond to drive bays, so ''/dev/sda'' isn't necessarily in Drive Bay 1. Keep this in mind when zapping/setting up OSDs. Also, the OSD on ''/dev/sda'' does not necessarily have its block.db (and WAL) on ''/dev/journals/sda''.
To find the physical drive, run ''dd if=/dev/$DRIVE of=/dev/null'' (where ''$DRIVE'' is the drive you're replacing) while watching the front of the system. The bay showing sustained drive activity holds that drive.
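A small wrapper can make the ''dd'' read safer to run: it refuses anything that is not a block device, prints the drive's serial number so it can be checked against the label on the pulled drive, and stops reading after a fixed time. This is only a sketch. The function name ''blink_drive'' and the 30 second default are assumptions, not an existing lab tool.

```shell
# Read from a drive for a bounded time to generate visible drive activity.
# Usage: blink_drive sda [seconds]
blink_drive() {
    drive="$1"
    seconds="${2:-30}"
    dev="/dev/${drive}"

    if [ ! -b "$dev" ]; then
        echo "not a block device: $dev" >&2
        return 1
    fi

    # Print name, serial and size for cross-checking against the drive label.
    lsblk -d -n -o NAME,SERIAL,SIZE "$dev"

    # Read-only access; timeout ends the read after $seconds
    # (exit status 124 means the time limit was reached).
    timeout "$seconds" dd if="$dev" of=/dev/null bs=1M
}
```

For example, ''blink_drive sde 60'' as root reads from ''/dev/sde'' for up to a minute.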
===== Set up a new OSD with journal on NVMe logical volume =====
ceph-volume lvm create --bluestore --data /dev/sda --block.db journals/sda
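When several drives need new OSDs, it can help to print the command for review before running anything. The helper below is a hypothetical sketch. It assumes the block.db volume is named after the data disk (''journals/sda'' for ''/dev/sda''), which matches the command above but, as noted under Replacing Drives, is not guaranteed on an existing host.

```shell
# Print (do not run) the ceph-volume command for one HDD OSD.
# Usage: osd_create_cmd sda
osd_create_cmd() {
    disk="$1"
    case "$disk" in
        sd[a-l]) ;;   # the twelve HDD OSD drives
        *) echo "unexpected disk name: $disk" >&2; return 1 ;;
    esac
    echo "ceph-volume lvm create --bluestore --data /dev/${disk} --block.db journals/${disk}"
}

osd_create_cmd sda
```

Check that the logical volume is not already in use by another OSD before running the printed command.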
**Example of successfully deployed OSD**
root@reesi001:~# ceph-volume lvm list
====== osd.94 ======
[block] /dev/ceph-28f7427e-5558-4ffd-ae1a-51ec3042759a/osd-block-63c0a64e-a0d2-4daf-87ec-c4b00b9f3ab9
type block
osd id 94
cluster fsid 28f7427e-5558-4ffd-ae1a-51ec3042759a
cluster name ceph
osd fsid 63c0a64e-a0d2-4daf-87ec-c4b00b9f3ab9
db device /dev/journals/sda
encrypted 0
db uuid X2SlQ5-0zx2-CuHJ-GEbJ-5JS8-ly5O-emmFI9
cephx lockbox secret
block uuid Xvjsm3-95vU-KNmw-5DuK-i3cx-fmic-xfSK2w
block device /dev/ceph-28f7427e-5558-4ffd-ae1a-51ec3042759a/osd-block-63c0a64e-a0d2-4daf-87ec-c4b00b9f3ab9
crush device class None
[ db] /dev/journals/sda
type db
osd id 94
cluster fsid 28f7427e-5558-4ffd-ae1a-51ec3042759a
cluster name ceph
osd fsid 63c0a64e-a0d2-4daf-87ec-c4b00b9f3ab9
db device /dev/journals/sda
encrypted 0
db uuid X2SlQ5-0zx2-CuHJ-GEbJ-5JS8-ly5O-emmFI9
cephx lockbox secret
block uuid Xvjsm3-95vU-KNmw-5DuK-i3cx-fmic-xfSK2w
block device /dev/ceph-28f7427e-5558-4ffd-ae1a-51ec3042759a/osd-block-63c0a64e-a0d2-4daf-87ec-c4b00b9f3ab9
crush device class None
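Because a drive's block.db is not always on the logical volume with the matching name, it is useful to see which db device each OSD actually uses. The filter below is a sketch that assumes the text layout shown in the listing above (an ''osd.N'' header line followed by ''db device'' lines). The function name ''osd_db_map'' is made up for this example.

```shell
# Reduce `ceph-volume lvm list` text output to "osd.N <db device>" pairs.
# Usage: ceph-volume lvm list | osd_db_map
osd_db_map() {
    awk '
        # Header line such as "====== osd.94 ======" starts a new OSD.
        /^ *=+ osd\.[0-9]+ =+ *$/ { osd = $2; seen = 0 }
        # The db device is repeated per section; print it once per OSD.
        /db device/ && !seen      { print osd, $NF; seen = 1 }
    '
}
```

For the listing above this would print ''osd.94 /dev/journals/sda''.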
===== Checking NVMe Card SMART Data =====
nvme smart-log /dev/nvme0n1
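To pull a single value out of the SMART log, a small filter over the ''key : value'' lines is usually enough. This sketch assumes the plain-text output format and a field named ''percentage_used''. Field names and layout can differ between nvme-cli versions, so check them against the output on the host first. The function name ''smart_field'' is made up for this example.

```shell
# Print the value of one "key : value" field from nvme smart-log text output.
# Usage: nvme smart-log /dev/nvme0n1 | smart_field percentage_used
smart_field() {
    awk -F':' -v key="$1" '
        {
            name = $1
            gsub(/^[ \t]+|[ \t]+$/, "", name)   # trim padding around the key
        }
        name == key {
            val = $2
            gsub(/^[ \t]+|[ \t]+$/, "", val)    # trim padding around the value
            print val
        }
    '
}
```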
===== Updating BIOS =====
TBD