===== Summary =====
We have a RHEV instance running on [[hardware:infrastructure#hv_0103|hv{01..04}]] as the main hypervisor nodes.  They're listed as **Hosts** in the RHEV Manager.

The RHEV Hosts are currently running version 4.3.5-1.

The [[http://mgr01.front.sepia.ceph.com|RHEV Manager]] is a [[https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.1/html/self-hosted_engine_guide/|Self-Hosted VM]] inside the cluster.  The username for logging in is ''admin'' and the password is our standard root password.
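
To quickly check the state of the self-hosted engine from a hypervisor, a minimal sketch (assumes the standard ''hosted-engine'' CLI that ships with the RHEV hosts):

<code>
# Run on any of hv{01..04}: shows which host is running the engine VM and its health
hosted-engine --vm-status
</code>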

The Hypervisors (hv{01..04}) and Storage nodes (ssdstore{01..02}) have entries in ''/etc/hosts'' in case of DNS failure.

Note: it is important that the glusterfs package version on the hypervisors does not exceed the version on the storage nodes (i.e., the client must be older than or equal to the server).
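
To verify this before upgrading packages, a minimal sketch (assumes the RPM-based glusterfs packages used on these hosts):

<code>
# On a hypervisor (gluster client), e.g. hv01
rpm -q glusterfs glusterfs-fuse

# On a storage node (gluster server), e.g. ssdstore01
rpm -q glusterfs glusterfs-server

# The client version on hv{01..04} must be <= the server version on ssdstore{01..02}
</code>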

----

===== Creating New VMs =====
==== How-To ====
  - Log in
  - Go to the **Virtual Machines** tab
  - Click **New VM**
  - General Settings
    - **Cluster:** ''hv_cluster''
    - **Operating System:** ''Linux''
    - **Optimized for:** ''Server''
    - **Name:** Whatever you want
    - Descriptions are also nice
  - System
    - **Memory Size:** Up to you (it can take ''4GB'' as input and will convert)
    - **Total Virtual CPUs:** Also up to you
  - High Availability
    - **Highly Available:** Checked (if desired)
    - Set the **Priority**
  - Boot Options
    - Probably PXE then Hard Disk.  This will boot to our Cobbler menu.
    - You could also do CD-ROM then Hard Disk.  Just check **Attach CD** and select the ISO (these are on ''store01.front.sepia.ceph.com:/srv/isos/67ff9a5d-b5da-4a2f-b5ce-2286bc82e3e4/images/11111111-1111-1111-1111-111111111111'' if you want to add one; see the sketch after this list)
  - **OK**
  - Now highlight your new VM
  - At the bottom, **Disks** tab
    - **New**
    - Set the **Size**
    - **Storage Domain:** ''ssdstorage''
    - **Allocation Policy:** ''Preallocated'' if IO performance is important
      - ''Preallocated'' will take longer to create the disk, but IO performance in the VM will be faster
      - ''Thin Provision'' is almost immediate during VM creation but may slow down VM IO performance
    - **OK**
  - At the bottom, **Network Interfaces** tab
    - **New**
    - **Profile** should be ''front'' or ''wan'' (or both, with one of them on a second NIC, if desired)
    - **OK**
  - Now power the VM up (green arrow) and open the console (little computer monitor icon)
  - You can either:
    - Select an entry from the Cobbler PXE menu (if the new VM is **NOT** in the ansible inventory)
      - Make sure you press ''[Tab]'' and delete almost all of the kickstart parameters (the ''ks='' most importantly)
    - Add the host to the ansible inventory (and thus Cobbler, DNS, and DHCP), then set a kickstart in the Cobbler Web UI (see below)
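
If you need to make a new ISO available for the **Attach CD** option, copy it into the ISO domain path above.  A minimal sketch (the ISO filename and ''$YOURUSER'' are placeholders; ownership/permissions may need adjusting afterwards so RHEV can see the image):

<code>
# Copy an ISO into the ISO storage domain on store01 (hypothetical filename)
scp CentOS-7-x86_64-Minimal.iso \
  $YOURUSER@store01.front.sepia.ceph.com:/srv/isos/67ff9a5d-b5da-4a2f-b5ce-2286bc82e3e4/images/11111111-1111-1111-1111-111111111111/
</code>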

=== Using a Kickstart with Cobbler ===
The Sepia Cobbler instance has some kickstart profiles that will automate RHV VM installation.  I think in order to use these, you'd have to get the MAC from the **Network Interfaces** tab in RHV, then put your new VM in the ''ceph-sepia-secrets'' [[https://github.com/ceph/ceph-sepia-secrets/blob/master/ansible/inventory/sepia|ansible inventory]].

Then run:
  - ''%%ansible-playbook cobbler.yml --tags systems%%''
  - ''%%ansible-playbook dhcp-server.yml%%''
  - ''%%ansible-playbook nameserver.yml --tags records%%''

(See https://wiki.sepia.ceph.com/doku.php?id=tasks:adding_new_machines for more info)

In Cobbler, you can browse to the system and set the ''Profile'' and ''Kickstart'':
  * ''dgalloway-ubuntu-vm'' - Installs a basic Ubuntu system using the entire disk and the ''ext4'' filesystem.  I couldn't get ''xfs'' working.
  * ''dgalloway-rhel-vm'' - I don't remember if this one works, but you can try.

=== A note about installing RHEL/CentOS ===
You need to specify the URL for the installation repo as a kernel parameter.  So in the Cobbler PXE menu, when you hit ''[Tab]'', add ''%%ksdevice=link inst.repo=http://172.21.0.11/cobbler/ks_mirror/CentOS-X.X-x86_64%%'', replacing X.X with the appropriate version.

Otherwise you'll end up with an error like ''dracut initqueue timeout'' and the installer dies.
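
For example, a hypothetical append line after hitting ''[Tab]'' (CentOS 7.9 is used purely as an illustration; substitute whatever version the mirror actually carries):

<code>
# ...existing vmlinuz/initrd arguments stay as they are, then append:
ksdevice=link inst.repo=http://172.21.0.11/cobbler/ks_mirror/CentOS-7.9-x86_64
</code>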

==== ovirt-guest-agent ====
After installing a new VM, be sure to install the VM guest agent.  This, at the very least, allows a VM's FQDN and IP address(es) to show up in the RHEV Web UI.
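
A minimal sketch of installing it (assumes the stock distro packaging; package and service names can vary by distro/release):

<code>
# Ubuntu/Debian guests
apt-get install -y ovirt-guest-agent
systemctl enable --now ovirt-guest-agent

# RHEL/CentOS guests (package typically comes from EPEL)
yum install -y ovirt-guest-agent-common
systemctl enable --now ovirt-guest-agent
</code>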
    - (Make sure you make these changes persistent for subsequent reboots.  See https://askubuntu.com/a/921830/906620, for example)
  - Edit network config
  - It's also probably beneficial to remove cloud-init: e.g., ''apt-get purge cloud-init''
    - Even though cloud-init is purged, its grub.d settings still get read.
    - It might work to just delete ''/etc/default/grub.d/50-cloudimg-settings.cfg'', but otherwise:
      - Modify it and get rid of any ''console='' parameters
      - Run ''update-grub'' (see the sketch after this list)
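
A minimal sketch of that cleanup on an Ubuntu guest (assumes the stock ''50-cloudimg-settings.cfg'' path mentioned above):

<code>
# Remove cloud-init entirely
apt-get purge -y cloud-init

# Either delete the leftover grub.d snippet...
rm /etc/default/grub.d/50-cloudimg-settings.cfg
# ...or strip just the console= parameters from it:
#   sed -i 's/console=[^" ]*//g' /etc/default/grub.d/50-cloudimg-settings.cfg

# Regenerate grub.cfg so the change takes effect
update-grub
</code>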
  
===== Troubleshooting =====
==== Emergency RHEV Web UI Access w/o VPN ====
In the event the OpenVPN gateway VM is inaccessible/locked up/whatever, you can open an SSH tunnel (''ssh -D 9999 $YOURUSER@8.43.84.133'') and set your browser's proxy settings to SOCKS5 localhost:9999 to get at the RHEV web UI.  That public IP is on store01 and is a leftover artifact from when store01 ran OpenVPN.
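
If you do this often, a hypothetical ''~/.ssh/config'' stanza saves some typing (the ''rhev-emergency'' alias is made up):

<code>
Host rhev-emergency
    HostName 8.43.84.133
    User YOURUSER
    # Equivalent to ssh -D 9999: opens a SOCKS proxy on localhost:9999
    DynamicForward 9999
</code>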

==== GFIDs listed in ''gluster volume heal ssdstorage info'' forever ====
This is https://bugzilla.redhat.com/show_bug.cgi?id=1361518.

As long as the unsynced entries are GFIDs only and they only appear under the arbiter (senta01) server, you can paste **just** the GFIDs into a ''/tmp/gfids'' file and run the following script:

<code>
#!/bin/bash
# Clears stale AFR xattrs on the arbiter brick for GFIDs stuck in
# "gluster volume heal <vol> info" output (see the BZ above).
set -ex

VOLNAME=ssdstorage

# Pull the full GFIDs (UUID format: 8-4-4-4-12 hex) out of the heal info output
for id in $(gluster volume heal $VOLNAME info | egrep -o '[0-9a-f]{8}-([0-9a-f]{4}-){3}[0-9a-f]{12}'); do
  # Find the matching file on the arbiter brick, skipping the heal index
  file=$(find /gluster/arbiter/.glusterfs -name "$id" -not -path '/gluster/arbiter/.glusterfs/indices/*' -type f)
  # Only touch entries whose two trusted.afr.<volume>* xattrs are zeroed (no real pending heal)
  if [ $(getfattr -d -m . -e hex "$file" | grep "trusted.afr.$VOLNAME" | grep "0x000000" | wc -l) == 2 ]; then
    echo "deleting xattr for gfid $id"
    for i in $(getfattr -d -m . -e hex "$file" | grep "trusted.afr.$VOLNAME" | cut -f1 -d'='); do
      setfattr -x "$i" "$file"
    done
  else
    echo "not deleting xattr for gfid $id"
  fi
done
</code>
  
----
I used to have a summary of steps here but it's safer to just follow the [[https://access.redhat.com/documentation/en-us/red_hat_virtualization/|Red Hat docs]].
  
==== VM has paused due to no storage space error ====
We started seeing this issue on VMs like teuthology, and it looks like it's a known bug.  I updated ''/etc/vdsm/vdsm.conf.d/99-local.conf'' and restarted vdsm (''systemctl restart vdsmd'') as described here:

https://access.redhat.com/solutions/130843
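
For reference, the override is a drop-in along these lines (a sketch only; the watermark values shown are illustrative, not necessarily what we set, so check the linked solution for the recommended numbers):

<code>
# /etc/vdsm/vdsm.conf.d/99-local.conf
[irs]
# Extend thin-provisioned disks sooner and in larger chunks than the defaults
volume_utilization_percent = 25
volume_utilization_chunk_mb = 2048
</code>
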
==== Growing a VM's virtual disk ====
  - Log into the [[https://mgr01.front.sepia.ceph.com|Web UI]]
==== Onlining Hot-Plugged CPU/RAM ====
https://askubuntu.com/a/764621

<code>
#!/bin/bash
# Based on script by William Lam - http://engineering.ucsb.edu/~duonglt/vmware/

# Bring CPUs online
for CPU_DIR in /sys/devices/system/cpu/cpu[0-9]*
do
    CPU=${CPU_DIR##*/}
    echo "Found cpu: '${CPU_DIR}' ..."
    CPU_STATE_FILE="${CPU_DIR}/online"
    if [ -f "${CPU_STATE_FILE}" ]; then
        if grep -qx 1 "${CPU_STATE_FILE}"; then
            echo -e "\t${CPU} already online"
        else
            echo -e "\t${CPU} is new cpu, onlining cpu ..."
            echo 1 > "${CPU_STATE_FILE}"
        fi
    else
        echo -e "\t${CPU} already configured prior to hot-add"
    fi
done

# Bring all new Memory online
for RAM in $(grep line /sys/devices/system/memory/*/state)
do
    echo "Found ram: ${RAM} ..."
    if [[ "${RAM}" == *":offline" ]]; then
        echo "Bringing online"
        echo $RAM | sed "s/:offline$//" | sed "s/^/echo online > /" | source /dev/stdin
    else
        echo "Already online"
    fi
done
</code>