LONG_RUNNING_CLUSTER

Summary

A small subset of the mira systems and the apama systems make up a Ceph cluster used primarily to store teuthology logs and files uploaded via ceph-post-file.

Topology

Current as of 2018/04/03 21:03. OSDs are also colocated on the MON hosts.

MONs

reesi{001..006}
mira055
mira070

MGRs

reesi{001..003}
mira049

MDSs

reesi{001,002}
mira021
mira049
mira070

OSD hosts

mira019
mira021
mira031
mira049
mira055
mira060
mira070
mira087
mira093
mira099
mira116
mira120
mira122
apama002
reesi{001..006}

ceph.conf

This file can be saved on your workstation so the workstation can act as an admin node (a usage example follows the config below).

Current as of 2018/04/03 21:03

[global]
fsid = 28f7427e-5558-4ffd-ae1a-51ec3042759a
mon_host = 172.21.6.140, 172.21.6.108, 172.21.2.201, 172.21.2.202, 172.21.2.203, 172.21.2.204, 172.21.2.205
public_network = 172.21.0.0/20

# Setting below for cephmetrics.sepia.ceph.com dashboard use - dgalloway
mon_health_preluminous_compat = true

# ick, we have too many pgs on this cluster.
mon_max_pg_per_osd = 400

[mon]
debug ms = 1
debug mon = 10

[osd]
debug_ms = 1
debug_osd = 10
debug_filestore = 10
setuser_match_path = $osd_data
bluestore cache size = 512000000

[mds]
mds cache size = 500000
mds session timeout = 120
mds session autoclose = 600
debug mds = 4

[mgr]
debug mgr = 20
debug ms = 1

[mon.mira070]
public addr = 172.21.6.108
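
With the config above saved locally and a copy of the client.admin keyring pulled from one of the MONs (reesi001 below is only an example), the cluster can be queried directly from a workstation:

scp root@reesi001.front.sepia.ceph.com:/etc/ceph/ceph.client.admin.keyring .
ceph --conf ./ceph.conf --keyring ./ceph.client.admin.keyring status
ceph --conf ./ceph.conf --keyring ./ceph.client.admin.keyring osd tree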

Upgrading the Cluster

As of this writing, the repo defined in /etc/apt/sources.list.d/ceph.list on the LRC nodes points at the luminous release. The Ceph docs can be followed for this procedure but, basically, update and reboot one host at a time, starting with the MONs, then the MGRs, MDSs, and finally the OSD hosts.
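
A rough per-host sketch, assuming the apt sources on each node already point at the desired release (hostname below is only an example):

ceph osd set noout                        # on an admin host, before starting
# then, for each host in MON -> MGR -> MDS -> OSD order:
ssh root@reesi001.front.sepia.ceph.com
apt-get update && apt-get -y dist-upgrade
reboot
# wait for its daemons to rejoin and `ceph -s` to settle before the next host
ceph osd unset noout                      # once every host is done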

Replace LRC Host's root drive

On non-mon hosts

  1. ceph osd set noout on admin host
    1. ceph osd set noscrub; ceph osd set nodeep-scrub to avoid unnecessary I/O
  2. Stop ceph services on OSD host
    1. stop ceph-osd-all on Ubuntu
    2. service ceph stop osd.# on RHEL
  3. Back up /etc/ceph
    1. scp root@mira###.front.sepia.ceph.com:/etc/ceph/ceph.conf .
  4. umount /var/lib/ceph/osd/*
  5. Back up /var/lib/ceph/osd
    1. scp -r root@mira###.front.sepia.ceph.com:/var/lib/ceph/osd/ .
  6. Reimage the machine
  7. Install ceph packages
    1. If needed, set up the repo file
    2. Also if needed, import repo GPG key wget -qO - http://download.ceph.com/keys/release.asc | sudo apt-key add -
    3. apt-get install ceph ceph-base ceph-common ceph-osd ceph-test libcephfs1 python-cephfs ceph-deploy
  8. Make sure ntpd is configured and enabled
    1. Manually run ntpdate $ntpserver for one-time sync
  9. Configure or disable firewall
  10. Replace /etc/ceph and /var/lib/ceph/osd structures
    1. scp ceph.conf root@mira###.front.sepia.ceph.com:/etc/ceph/
    2. scp -r osd/* root@mira###.front.sepia.ceph.com:/var/lib/ceph/osd/
  11. Set permissions
    1. chown -R ceph:ceph /var/lib/ceph/osd/
    2. chown ceph:ceph /etc/ceph/ceph.conf
  12. Create an ssh key, copy the pubkey to /root/.ssh/authorized_keys on a mon host, and run ceph-deploy gatherkeys $mon where $mon is that mon host (a sketch follows this list)
  13. Copy keys to their appropriate places
    1. For the bootstrap key,
      1. mv ceph.bootstrap-osd.keyring /var/lib/ceph/bootstrap-osd/ceph.keyring
      2. mv ceph.client.admin.keyring /etc/ceph/
      3. chown ceph:ceph /var/lib/ceph/bootstrap-osd/ceph.keyring
  14. reboot
  15. Unset flags from step 1

See Ceph Docs - Stopping without rebalancing
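
A sketch of steps 12 and 13 above (the mon hostname is only an example):

# on the reimaged host
ssh-keygen -t rsa -N '' -f /root/.ssh/id_rsa
ssh-copy-id root@reesi001.front.sepia.ceph.com   # puts the pubkey in the mon's authorized_keys
ceph-deploy gatherkeys reesi001
mv ceph.bootstrap-osd.keyring /var/lib/ceph/bootstrap-osd/ceph.keyring
mv ceph.client.admin.keyring /etc/ceph/
chown ceph:ceph /var/lib/ceph/bootstrap-osd/ceph.keyring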

Add blank disk as OSD

disk=sdX
ceph-disk zap /dev/$disk
ceph-disk prepare /dev/$disk
ceph-disk activate /dev/${disk}1
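
After activation the new OSD should register and start on its own; a quick sanity check:

ceph osd tree   # the new osd should appear under this host and report 'up'
ceph -s         # expect some backfill while data moves onto the new disk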

Replace Failing OSD disk

Evacuating OSD data

If the disk is still relatively healthy and you think it can survive a while longer, you should evacuate the data off it slowly.

  1. On a mon node, ceph osd reweight $osdnum 0.75 (or otherwise drop the current weight by 0.25)
  2. Wait until recovery I/O finishes, then repeat, lowering the weight each step until the OSD is reweighted to 0 (a sketch follows this list)
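
A loose sketch of that drain loop (the weights, sleep interval, and the grep for recovery/backfill activity are heuristics; watching ceph -s by hand works just as well):

osdnum=NN                         # id of the OSD being drained
for w in 0.75 0.50 0.25 0; do
    ceph osd reweight $osdnum $w
    sleep 60                      # give the mons a moment to start remapping PGs
    while ceph -s | egrep -q 'recover|backfill'; do sleep 60; done
done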

Taking the OSD out of the cluster

  1. On a mon node, ceph osd out $id. This ensures the 3 replicas of each PG on the OSD are re-created elsewhere before it is removed.
    1. If any recovery I/O occurs, wait for it to finish
  2. On the OSD host, stop ceph-osd id=$id
    1. Some recovery I/O will occur. This is just the cluster rebalancing. It's fine.
  3. Back on the mon host,
    ceph osd crush remove osd.$id
    ceph osd down osd.$id  # may not be needed as long as osd service is stopped
    ceph osd rm osd.$id
    ceph auth del osd.$id
  4. Unmount the disk from the OSD host
    1. umount /var/lib/ceph/osd/ceph-$id
    2. rm -rf /var/lib/ceph/osd/ceph-$id
  5. Replace the disk
  6. On the OSD host,
    disk=sdX
    ceph-disk zap /dev/$disk
    ceph-disk prepare /dev/$disk
    mkdir /mnt/tmp
    mount /dev/${disk}1 /mnt/tmp
    mkdir /var/lib/ceph/osd/ceph-$(cat /mnt/tmp/whoami)
    chown ceph:ceph /var/lib/ceph/osd/ceph-$(cat /mnt/tmp/whoami)
    umount /mnt/tmp
    ceph-disk activate /dev/${disk}1