ivan{01..07}

Summary

The Ceph Foundation purchased 7 more servers to join the longrunningcluster. The three primary goals were:

  1. Faster networking between hosts
  2. Large NVMe devices as OSDs
  3. 12TB HDDs (largest up until now was 4TB)

Purchasing details

Hardware Specs

Component | Count    | Manufacturer | Model                                       | Capacity | Notes
Chassis   | 2U       | Supermicro   | SSG-6029P-E1CR12L                           | N/A      |
Mainboard | N/A      | Supermicro   | X11DPH-T                                    | N/A      |
CPU       | 2        | Intel        | Intel(R) Xeon(R) Silver 4215R CPU @ 3.20GHz | N/A      | ARK
RAM       | 4 DIMMs  | SK Hynix     | HMAA4GR7AJR8N-XN                            | 32GB     | 128GB Total
SSD       | 2        | Intel        | SSDSC2KG960G8 (S4510)                       | 960GB    | Software RAID1 for OS
HDD       | 9        | Seagate      | ST12000NM002G                               | 12TB     | SAS 7200RPM for OSDs
NVMe      | 2        | Intel        | SSDPE2KE016T8                               | 1.6TB    | For large NVMe OSDs
NVMe      | 1        | Intel        | SSDPE21M375GA                               | 375GB    | Carved up as logical volumes for OSD journals
NIC       | 2 ports  | Intel        | X722                                        | 10Gb     | 1 port cabled BUT DISABLED. See below.
NIC       | 2 ports  | Mellanox     | ConnectX-4                                  | 25Gb     | For back / storage traffic
BMC       | 1        | Supermicro   | N/A                                         | N/A      | Reachable at $host.ipmi.sepia.ceph.com

OSD/Block Device Information

Each ivan host has 9x 12TB HDDs, 2x 1.6TB NVMe devices, and 1x 375GB NVMe device. lsblk reports sizes in binary units, so these show up as 10.9T, 1.5T, and 349.3G.

The 12TB drives were added so we can say we're testing on drives larger than 8TB.

The smaller NVMe device is split into eleven equal logical volumes, one for each OSD's journal.

root@ivan04:~# lsblk
NAME                 MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda                    8:0    0 894.3G  0 disk  
`-sda1                 8:1    0 894.3G  0 part  
  `-md0                9:0    0 894.1G  0 raid1 /
sdb                    8:16   0 894.3G  0 disk  
`-sdb1                 8:17   0 894.3G  0 part  
  `-md0                9:0    0 894.1G  0 raid1 /
sdc                    8:32   0  10.9T  0 disk  
sdd                    8:48   0  10.9T  0 disk  
sde                    8:64   0  10.9T  0 disk  
sdf                    8:80   0  10.9T  0 disk  
sdg                    8:96   0  10.9T  0 disk  
sdh                    8:112  0  10.9T  0 disk  
sdi                    8:128  0  10.9T  0 disk  
sdj                    8:144  0  10.9T  0 disk  
sdk                    8:160  0  10.9T  0 disk  
sr0                   11:0    1   841M  0 rom   
nvme0n1              259:0    0 349.3G  0 disk  
`-nvme0n1p1          259:3    0 349.3G  0 part  
  |-journals-sdc     253:0    0    31G  0 lvm   
  |-journals-sdd     253:1    0    31G  0 lvm   
  |-journals-sde     253:2    0    31G  0 lvm   
  |-journals-sdf     253:3    0    31G  0 lvm   
  |-journals-sdg     253:4    0    31G  0 lvm   
  |-journals-sdh     253:5    0    31G  0 lvm   
  |-journals-sdi     253:6    0    31G  0 lvm   
  |-journals-sdj     253:7    0    31G  0 lvm   
  |-journals-sdk     253:8    0    31G  0 lvm   
  |-journals-nvme1n1 253:9    0    31G  0 lvm   
  `-journals-nvme2n1 253:10   0    31G  0 lvm   
nvme1n1              259:1    0   1.5T  0 disk  
nvme2n1              259:2    0   1.5T  0 disk
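
As a quick sanity check on the journal layout above, here is a small sketch of the arithmetic. The numbers are taken from the lsblk output; rounding the partition down to 349G is my own simplification.

```shell
# Sizes in GiB as lsblk reports them (349.3G rounded down to 349).
partition_gib=349
lv_gib=31
lv_count=$((9 + 2))                      # 9 HDD OSDs + 2 large NVMe OSDs
used_gib=$((lv_count * lv_gib))          # space taken by the journal LVs
free_gib=$((partition_gib - used_gib))   # what is left in the volume group
echo "$lv_count LVs use ${used_gib}G, leaving about ${free_gib}G unallocated"
```

So the eleven 31G volumes do not quite fill the device; a few GiB stay free in the journals volume group.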

How to partition/re-partition the NVMe device

Here's my bash history that can be used to set up an ivan machine's 375GB NVMe card.

ansible -a "sudo parted -s /dev/nvme0n1 mktable gpt" ivan
ansible -a "sudo parted /dev/nvme0n1 unit '%' mkpart foo 0 100" ivan
ansible -a "sudo pvcreate /dev/nvme0n1p1" ivan
ansible -a "sudo vgcreate journals /dev/nvme0n1p1" ivan
for disk in sd{c..k} nvme1n1 nvme2n1; do ansible -a "sudo lvcreate -L 31G -n $disk journals" ivan; done
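
Because every one of these ansible calls runs against all ivan hosts at once, it can be worth printing the lvcreate commands before running them for real. This is only a dry-run sketch: the function name is made up, and the disk list is written out by hand to match sd{c..k} plus the two large NVMe devices.

```shell
# Dry run: print the lvcreate command for each journal LV instead of running it.
journal_lv_commands() {
  for disk in sdc sdd sde sdf sdg sdh sdi sdj sdk nvme1n1 nvme2n1; do
    echo "sudo lvcreate -L 31G -n $disk journals"
  done
}

journal_lv_commands
```

Each LV is named after the OSD device it serves, which is why lsblk shows them as journals-sdc, journals-sdd, and so on.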

Checking NVMe Card SMART Data

nvme smart-log /dev/nvme0n1
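
If you only care about wear, a small filter can pull the percentage_used field out of the smart-log text. This is a sketch, not something that is deployed anywhere: the function name is hypothetical and the sample input below is illustrative, not captured from an ivan host.

```shell
# Print the value of percentage_used from `nvme smart-log` text on stdin.
nvme_percentage_used() {
  awk -F: '/^percentage_used/ { gsub(/[ \t%]/, "", $2); print $2 }'
}

# Illustrative input only. On a real host this would be:
#   sudo nvme smart-log /dev/nvme0n1 | nvme_percentage_used
printf 'critical_warning : 0\npercentage_used : 3%%\n' | nvme_percentage_used
```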

Updating BIOS

TBD

Installation Quirks/Difficulties

Networking

Initially, I wanted to have the 1Gb interface cabled on VLAN100 and the 25Gb interfaces cabled to VLAN101 (back.sepia.ceph.com). Up until now I have never really used VLAN101. I was able to get both NICs up, IPs assigned, and the servers could reach each other. The LRC could also reach these servers on their 25Gb/back interfaces.

I added the hosts to the cluster using the back IPs. The cluster became very unhappy, complaining about slow OPs. It turned out the ivan servers couldn't get out from their back interfaces, so the OSDs fell back to the 1Gb link.

I reached out to Red Hat IT to have the 25Gb network ports switched over to VLAN100. After that, I struggled to get eno1 (the 1Gb interface) to not come up on boot.

Finally, I figured out that this works:

# cat /etc/systemd/network/10-eno1.network 
[Match]
Name=eno1

[Network]
DHCP=no
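
To roll the same file out to other hosts or interfaces, something like the sketch below could generate it. The function name is hypothetical, and the install and restart steps in the comments assume the host uses systemd-networkd, which is what the file path above implies.

```shell
# Print a systemd-networkd unit that matches one interface and disables DHCP on it.
no_dhcp_unit() {
  printf '[Match]\nName=%s\n\n[Network]\nDHCP=no\n' "$1"
}

no_dhcp_unit eno1
# To install (as root), assuming systemd-networkd manages the interface:
#   no_dhcp_unit eno1 > /etc/systemd/network/10-eno1.network
#   systemctl restart systemd-networkd
#   networkctl status eno1
```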

CentOS 8

I could not for the life of me get ivan05 to install using the Ubuntu preseed below. Its settings are identical to the rest of the machines. I remember someone (I think GregF?) suggesting in a CLT call that we should have a mixture of OSes in the LRC, so I decided to use CentOS 8 instead.

That led to its own difficulties. For example, I couldn't ping the back interface from a front interface on another host. This worked fine on Ubuntu. I finally landed on this very helpful post: https://unix.stackexchange.com/a/589133

After running sysctl -w net.ipv4.conf.enp216s0f0.rp_filter=2, I could ping 172.21.18.225 from a front interface on reesi001.
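
Note that sysctl -w only changes the running kernel, so the setting is lost on reboot. A minimal sketch for persisting it follows; the function name and the 90-rp-filter.conf file name are made up for illustration, and I have not confirmed how this was actually persisted on ivan05.

```shell
# Print a sysctl.d line that sets rp_filter for one interface.
# Value 2 is "loose" reverse-path filtering; it is the default here.
rp_filter_conf_line() {
  printf 'net.ipv4.conf.%s.rp_filter = %s\n' "$1" "${2:-2}"
}

rp_filter_conf_line enp216s0f0
# To persist (as root), with a hypothetical file name:
#   rp_filter_conf_line enp216s0f0 > /etc/sysctl.d/90-rp-filter.conf
#   sysctl --system
```

The kernel uses the higher of net.ipv4.conf.all.rp_filter and the per-interface value, so setting 2 on the back interface alone is enough to get loose mode there.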

Ubuntu Preseed

Here is the kickstart template used in cobbler to provision most of the hosts. As mentioned above, it did not work on ivan05 (it would boot to a grub rescue prompt).

## This file is managed by ansible, don't make changes here - they will be overwritten.

# Fetch the os_version from the distro using this profile.
#set os_version = $getVar('os_version','')

# Fetch Ubuntu version (e.g., 14.04)
#set distro_ver = $getVar('distro','').split("-")[1]

# Fetch Ubuntu major version (e.g., 14)
#set distro_ver_major = $distro_ver.split(".")[0]

### Apt setup
# You can choose to install non-free and contrib software.
#d-i apt-setup/non-free boolean true
#d-i apt-setup/contrib boolean true

# Preseeding only locale sets language, country and locale.
d-i debian-installer/locale string en_US

# Keyboard selection.
# Disable automatic (interactive) keymap detection.
d-i console-setup/ask_detect boolean false

# If you select ftp, the mirror/country string does not need to be set.
#d-i mirror/protocol string ftp
d-i mirror/country string manual
d-i mirror/http/hostname string archive.ubuntu.com
d-i mirror/http/directory string /ubuntu
d-i mirror/suite string $os_version

#Removes the prompt about missing modules:
# Continue without installing a kernel?
#d-i base-installer/kernel/skip-install boolean true
# Continue the install without loading kernel modules?
#d-i anna/no_kernel_modules boolean true

# Stop Ubuntu from installing random kernel choice
#d-i base-installer/kernel/image select none

# Controls whether or not the hardware clock is set to UTC.
d-i clock-setup/utc boolean true
#
# # You may set this to any valid setting for $TZ; see the contents of
# # /usr/share/zoneinfo/ for valid values.
d-i time/zone string Etc/UTC

# Controls whether to use NTP to set the clock during the install
d-i clock-setup/ntp boolean true
# NTP server to use. The default is almost always fine here.
d-i clock-setup/ntp-server string pool.ntp.org

### Partitioning
d-i partman/unmount_active boolean true


#----------------------------------------------------------------------#
# Partitioning
d-i partman/early_command string \
	umount /media ; \
	mdadm --stop /dev/md0 ; \
	mdadm --remove /dev/md0 ; \
	mdadm --stop /dev/md127 ; \
	mdadm --remove /dev/md127 ; \
    for partition in /dev/sda* /dev/sdb*; do mdadm --zero-superblock $partition ; dd if=/dev/zero of=$partition bs=1M count=10; done ; \
    echo 1 > /sys/block/sda/device/rescan ; \
    echo 1 > /sys/block/sdb/device/rescan ; \
    ls -C /dev/sd*; \
    sleep 5; \
	exit 0; \


# this only makes partman automatically partition without confirmation:
d-i partman-partitioning/confirm_write_new_label  boolean true
d-i partman-md/device_remove_md     boolean true
d-i partman-md/confirm_nooverwrite  boolean true
d-i partman-md/confirm              boolean true
d-i partman-lvm/device_remove_lvm   boolean true
d-i partman-lvm/confirm_nooverwrite boolean true
d-i partman-lvm/confirm             boolean true
d-i partman/confirm_nooverwrite     boolean true
d-i partman/choose_partition        select  finish
d-i partman/confirm                 boolean true
d-i mdadm/boot_degraded             boolean true

d-i partman-auto/method string raid
d-i partman-auto/disk string /dev/sda /dev/sdb

d-i partman-auto/expert_recipe      string multiraid :: \
    256   512    512   free       $bootable{ } method{ efi } format{ } . \
    1024  10000  -1    raid       format{ } method{ raid } .

# specify how the previously defined partitions will be
# used in the RAID setup.
d-i partman-auto-raid/recipe string     \
    1 2 0 xfs / /dev/sda5#/dev/sdb5 .

d-i partman/choose_partition select Finish partitioning and write changes to disk
d-i partman-efi/non_efi_system boolean true

# Partitioning
#----------------------------------------------------------------------#

#User account.
d-i passwd/root-login boolean false 
d-i passwd/make-user boolean true
d-i passwd/user-fullname string cm
d-i passwd/username string cm
d-i passwd/user-password-crypted password $default_password_crypted
d-i passwd/user-uid string 1100
d-i user-setup/allow-password-weak boolean false
d-i user-setup/encrypt-home boolean false

# Individual additional packages to install
#if $os_version == 'precise'
d-i pkgsel/include string wget ntpdate bash sudo openssh-server
#else if int($distro_ver_major) == 16
d-i pkgsel/include string u-boot-tools pastebinit initramfs-tools wget linux-firmware ntpdate bash devmem2 fbset sudo openssh-server udev-discover gawk gdisk ethtool curl
#else if int($distro_ver_major) == 18
d-i pkgsel/include string u-boot-tools pastebinit initramfs-tools wget linux-firmware ntpdate bash devmem2 fbset sudo openssh-server gawk gdisk ethtool net-tools ifupdown python ntp curl
#else if int($distro_ver_major) >= 20
d-i pkgsel/include string u-boot-tools pastebinit initramfs-tools wget linux-firmware ntpdate bash devmem2 fbset sudo openssh-server gawk gdisk ethtool net-tools ifupdown ntp curl gpg
#else
d-i pkgsel/include string u-boot-tools pastebinit initramfs-tools wget linux-firmware linux-firmware-nonfree ntpdate bash devmem2 fbset sudo openssh-server udev-discover gawk gdisk ethtool curl
#end if

# Whether to upgrade packages after debootstrap.
# Allowed values: none, safe-upgrade, full-upgrade
d-i pkgsel/upgrade select safe-upgrade

# Policy for applying updates. May be "none" (no automatic updates),
# "unattended-upgrades" (install security updates automatically), or
# "landscape" (manage system with Landscape).
d-i pkgsel/update-policy select none

# During installations from serial console, the regular virtual consoles
# (VT1-VT6) are normally disabled in /etc/inittab. Uncomment the next
# line to prevent this.
d-i finish-install/keep-consoles boolean true

# Avoid that last message about the install being complete.
d-i finish-install/reboot_in_progress note

# This command is run just before the install finishes, but when there is
# still a usable /target directory. You can chroot to /target and use it
# directly, or use the apt-install and in-target commands to easily install
# packages and run commands in the target system.

# cephlab_preseed_late lives in /var/lib/cobbler/scripts
# It is passed to the cobbler xmlrpc generate_scripts function where it's rendered.
# This means that snippets or other templating features can be used.
d-i preseed/late_command string \
in-target wget http://$http_server/cblr/svc/op/script/system/$system_name/?script=cephlab_preseed_late -O /tmp/postinst.sh; \
in-target /bin/chmod 755 /tmp/postinst.sh; \
in-target /tmp/postinst.sh;
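
The version handling at the top of the template (distro_ver and distro_ver_major) decides which pkgsel/include line gets rendered. Here is a rough shell equivalent of those two split() calls. The distro name is a hypothetical example; the real cobbler distro names may look different, as long as the version is the second dash-separated field.

```shell
distro="ubuntu-20.04-x86_64"          # hypothetical cobbler distro name
rest=${distro#*-}                     # drop everything up to the first "-"
distro_ver=${rest%%-*}                # second dash-separated field, e.g. 20.04
distro_ver_major=${distro_ver%%.*}    # part before the first ".", e.g. 20
echo "$distro_ver $distro_ver_major"

# Mirrors the ">= 20" branch of the package list in the template.
if [ "$distro_ver_major" -ge 20 ]; then
  echo "would use the package list for Ubuntu 20 and newer"
fi
```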
hardware/ivan.txt · Last modified: 2022/05/11 19:59 by djgalloway