The Ceph Foundation purchased 7 more servers to join the longrunningcluster. The three primary goals were:
Racking ticket: https://redhat.service-now.com/surl.do?n=RITM0880714
Component | Count | Manufacturer | Model | Capacity | Notes |
---|---|---|---|---|---|
Chassis | 2U | Supermicro | SSG-6029P-E1CR12L | N/A | |
Mainboard | N/A | Supermicro | X11DPH-T | N/A | |
CPU | 2 | Intel | Intel(R) Xeon(R) Silver 4215R CPU @ 3.20GHz | N/A | ARK |
RAM | 4 DIMMs | SK Hynix | HMAA4GR7AJR8N-XN | 32GB | 128GB Total |
SSD | 2 | Intel | SSDSC2KG960G8 (S4510) | 1TB | Software RAID1 for OS |
HDD | 9 | Seagate | ST12000NM002G | 12TB | SAS 7200RPM for OSDs |
NVMe | 2 | Intel | SSDPE2KE016T8 | 1.6TB | For large NVMe OSDs |
NVMe | 1 | Intel | SSDPE21M375GA | 375GB | Carved up as logical volumes for OSD journals |
NIC | 2 ports | Intel | X722 | 10Gb | 1 port cabled BUT DISABLED. See below. |
NIC | 2 ports | Mellanox | ConnectX-4 | 25Gb | For back / storage traffic |
BMC | 1 | Supermicro | N/A | N/A | Reachable at $host.ipmi.sepia.ceph.com |
I used the Orchestrator to deploy OSDs on the ivan hosts (I did this one by one to avoid a mass data rebalance all to one rack).

    root@reesi001:~# cat ivan_osd_spec.yml
    service_type: osd
    service_id: osd_using_paths
    placement:
      hosts:
        - ivan01
        - ivan02
        - ivan03
        - ivan04
        - ivan05
        - ivan06
        - ivan07
    spec:
      data_devices:
        paths:
          - /dev/sdc
          - /dev/sdd
          - /dev/sde
          - /dev/sdf
          - /dev/sdg
          - /dev/sdh
          - /dev/sdi
          - /dev/sdj
          - /dev/sdk
          - /dev/nvme1n1
          - /dev/nvme2n1
      db_devices:
        paths:
          - /dev/nvme0n1

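Bringing the hosts in one at a time can be sketched as below. This is a minimal illustration, not the exact procedure that was used: the `make_osd_spec` helper and the per-host `service_id` suffix are hypothetical, and the device paths are copied from the spec above. `ceph orch apply -i <file> --dry-run` previews what the Orchestrator would create before anything is committed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print an OSD service spec limited to a single host,
# so hosts can be added one at a time. Device paths match the spec above.
make_osd_spec() {
    local host="$1"
    cat <<EOF
service_type: osd
service_id: osd_using_paths_${host}
placement:
  hosts:
    - ${host}
spec:
  data_devices:
    paths:
$(printf '      - /dev/sd%s\n' c d e f g h i j k)
      - /dev/nvme1n1
      - /dev/nvme2n1
  db_devices:
    paths:
      - /dev/nvme0n1
EOF
}

# Example (run on a cluster admin node; commented out so the sketch has no side effects):
#   make_osd_spec ivan01 > ivan01_osd_spec.yml
#   ceph orch apply -i ivan01_osd_spec.yml --dry-run   # preview only
#   ceph orch apply -i ivan01_osd_spec.yml             # apply, then let rebalancing settle
```

Generating one spec per host keeps each rebalance limited to a single host's worth of new OSDs, at the cost of leaving several OSD service entries in the Orchestrator instead of one.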
nvme smart-log /dev/nvme0n1
TBD
Initially, I wanted to have the 1Gb interface cabled on VLAN100 and the 25Gb interfaces cabled to VLAN101 (back.sepia.ceph.com). Up until now, I had never really used VLAN101. I was able to get both NICs up with IPs assigned, and the servers could reach each other. The LRC could also reach these servers on their 25Gb/back interfaces.
I added the hosts to the cluster using the back IPs. The cluster became very unhappy, complaining about slow OPs. It turned out the ivan servers couldn't get out from their back interfaces, so the OSDs defaulted back to the 1Gb link.
I reached out to Red Hat IT to have the 25Gb network ports switched over to VLAN100. After that, I struggled to get eno1 (the 1Gb interface) to not come up on boot since I didn't need it anymore.
Finally, I figured out this:

    # cat /etc/systemd/network/10-eno1.network
    [Match]
    Name=eno1

    [Network]
    DHCP=no

I could not for the life of me get ivan05 to install using the Ubuntu preseed below. Its settings are identical to the rest of the machines. I remember someone (I think GregF?) suggesting in a CLT call that we should have a mixture of OSes in the LRC, so I decided to use CentOS 8 instead.
That led to its own difficulties. For example, I couldn't ping the back interface from a front interface on another host. This worked fine on Ubuntu. I finally landed on this very helpful post: https://unix.stackexchange.com/a/589133
After running sysctl -w net.ipv4.conf.enp216s0f0.rp_filter=2, I could ping 172.21.18.225 from a front interface on reesi001.
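Note that `sysctl -w` only changes the running kernel, so the setting is lost on reboot. A value of 2 selects loose reverse-path filtering, where a packet is accepted if its source address is reachable through any interface rather than only the one it arrived on. One way to persist the setting is a drop-in file under /etc/sysctl.d, sketched below. The `persist_rp_filter` helper and the file name `90-rp-filter.conf` are hypothetical and are not necessarily what was deployed on ivan05.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a sysctl drop-in that sets loose reverse-path
# filtering (rp_filter=2) for one interface.
#   $1 = interface name
#   $2 = target directory (defaults to /etc/sysctl.d)
persist_rp_filter() {
    local iface="$1"
    local dir="${2:-/etc/sysctl.d}"
    printf 'net.ipv4.conf.%s.rp_filter = 2\n' "$iface" > "${dir}/90-rp-filter.conf"
}

# Example (as root; commented out so the sketch has no side effects):
#   persist_rp_filter enp216s0f0
#   sysctl --system    # reload all sysctl configuration files
```

The target directory is a parameter so the helper can be tried against a scratch directory before touching /etc/sysctl.d.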
Here is the kickstart (preseed) template used in Cobbler to provision most of the hosts. As mentioned above, it did not work on ivan05, which would boot to a grub rescue prompt.

    ## This file is managed by ansible, don't make changes here - they will be overwritten.
    # Fetch the os_version from the distro using this profile.
    #set os_version = $getVar('os_version','')
    # Fetch Ubuntu version (e.g., 14.04)
    #set distro_ver = $getVar('distro','').split("-")[1]
    # Fetch Ubuntu major version (e.g., 14)
    #set distro_ver_major = $distro_ver.split(".")[0]

    ### Apt setup
    # You can choose to install non-free and contrib software.
    #d-i apt-setup/non-free boolean true
    #d-i apt-setup/contrib boolean true

    # Preseeding only locale sets language, country and locale.
    d-i debian-installer/locale string en_US

    # Keyboard selection.
    # Disable automatic (interactive) keymap detection.
    d-i console-setup/ask_detect boolean false

    # If you select ftp, the mirror/country string does not need to be set.
    #d-i mirror/protocol string ftp
    d-i mirror/country string manual
    d-i mirror/http/hostname string archive.ubuntu.com
    d-i mirror/http/directory string /ubuntu
    d-i mirror/suite string $os_version

    #Removes the prompt about missing modules:
    # Continue without installing a kernel?
    #d-i base-installer/kernel/skip-install boolean true
    # Continue the install without loading kernel modules?
    #d-i anna/no_kernel_modules boolean true

    # Stop Ubuntu from installing random kernel choice
    #d-i base-installer/kernel/image select none

    # Controls whether or not the hardware clock is set to UTC.
    d-i clock-setup/utc boolean true
    # # # You may set this to any valid setting for $TZ; see the contents of
    # # /usr/share/zoneinfo/ for valid values.
    d-i time/zone string Etc/UTC

    # Controls whether to use NTP to set the clock during the install
    d-i clock-setup/ntp boolean true
    # NTP server to use. The default is almost always fine here.
    d-i clock-setup/ntp-server string pool.ntp.org

    ### Partitioning
    d-i partman/unmount_active boolean true
    #----------------------------------------------------------------------#
    # Partitioning
    d-i partman/early_command string \
      umount /media ; \
      mdadm --stop /dev/md0 ; \
      mdadm --remove /dev/md0 ; \
      mdadm --stop /dev/md127 ; \
      mdadm --remove /dev/md127 ; \
      for partition in /dev/sda* /dev/sdb*; do mdadm --zero-superblock $partition ; dd if=/dev/zero of=$partition bs=1M count=10; done ; \
      echo 1 > /sys/block/sda/device/rescan ; \
      echo 1 > /sys/block/sdb/device/rescan ; \
      ls -C /dev/sd*; \
      sleep 5; \
      exit 0; \

    # this only makes partman automatically partition without confirmation:
    d-i partman-partitionining/confirm_write_new_label boolean true
    d-i partman-md/device_remove_md boolean true
    d-i partman-md/confirm_nooverwrite boolean true
    d-i partman-md/confirm boolean true
    d-i partman-lvm/device_remove_lvm boolean true
    d-i partman-lvm/confirm_nooverwrite boolean true
    d-i partman-lvm/confirm boolean true
    d-i partman/confirm_nooverwrite boolean true
    d-i partman/choose_partition select finish
    d-i partman/confirm boolean true
    d-i mdadm/boot_degraded boolean true

    d-i partman-auto/method string raid
    d-i partman-auto/disk string /dev/sda /dev/sdb
    d-i partman-auto/expert_recipe string multiraid :: \
      256 512 512 free $bootable{ } method{ efi } format{ } . \
      1024 10000 -1 raid format{ } method{ raid } .

    # specify how the previously defined partitions will be
    # used in the RAID setup.
    d-i partman-auto-raid/recipe string \
      1 2 0 xfs / /dev/sda5#/dev/sdb5 .

    d-i partman/choose_partition select Finish partitioning and write changes to disk
    d-i partman-efi/non_efi_system boolean true
    # Partitioning
    #----------------------------------------------------------------------#

    #User account.
    d-i passwd/root-login boolean false
    d-i passwd/make-user boolean true
    d-i passwd/user-fullname string cm
    d-i passwd/username string cm
    d-i passwd/user-password-crypted password $default_password_crypted
    d-i passwd/user-uid string 1100
    d-i user-setup/allow-password-weak boolean false
    d-i user-setup/encrypt-home boolean false

    # Individual additional packages to install
    #if $os_version == 'precise'
    d-i pkgsel/include string wget ntpdate bash sudo openssh-server
    #else if int($distro_ver_major) == 16
    d-i pkgsel/include string u-boot-tools pastebinit initramfs-tools wget linux-firmware ntpdate bash devmem2 fbset sudo openssh-server udev-discover gawk gdisk ethtool curl
    #else if int($distro_ver_major) == 18
    d-i pkgsel/include string u-boot-tools pastebinit initramfs-tools wget linux-firmware ntpdate bash devmem2 fbset sudo openssh-server gawk gdisk ethtool net-tools ifupdown python ntp curl
    #else if int($distro_ver_major) >= 20
    d-i pkgsel/include string u-boot-tools pastebinit initramfs-tools wget linux-firmware ntpdate bash devmem2 fbset sudo openssh-server gawk gdisk ethtool net-tools ifupdown ntp curl gpg
    #else
    d-i pkgsel/include string u-boot-tools pastebinit initramfs-tools wget linux-firmware linux-firmware-nonfree ntpdate bash devmem2 fbset sudo openssh-server udev-discover gawk gdisk ethtool curl
    #end if

    # Whether to upgrade packages after debootstrap.
    # Allowed values: none, safe-upgrade, full-upgrade
    d-i pkgsel/upgrade select safe-upgrade

    # Policy for applying updates. May be "none" (no automatic updates),
    # "unattended-upgrades" (install security updates automatically), or
    # "landscape" (manage system with Landscape).
    d-i pkgsel/update-policy select none

    # During installations from serial console, the regular virtual consoles
    # (VT1-VT6) are normally disabled in /etc/inittab. Uncomment the next
    # line to prevent this.
    d-i finish-install/keep-consoles boolean true

    # Avoid that last message about the install being complete.
    d-i finish-install/reboot_in_progress note

    # This command is run just before the install finishes, but when there is
    # still a usable /target directory. You can chroot to /target and use it
    # directly, or use the apt-install and in-target commands to easily install
    # packages and run commands in the target system.
    # cephlab_preseed_late lives in /var/lib/cobbler/scripts
    # It is passed to the cobbler xmlrpc generate_scripts function where it's rendered.
    # This means that snippets or other templating features can be used.
    d-i preseed/late_command string \
      in-target wget http://$http_server/cblr/svc/op/script/system/$system_name/?script=cephlab_preseed_late -O /tmp/postinst.sh; \
      in-target /bin/chmod 755 /tmp/postinst.sh; \
      in-target /tmp/postinst.sh;