===== Summary =====
We have a RHEV instance running on [[hardware:infrastructure#hv_0103|hv{01..04}]] as the main hypervisor nodes.  They're listed as **Hosts** in the RHEV Manager.

The RHEV Hosts are currently running version 4.3.5-1.
  
The [[http://mgr01.front.sepia.ceph.com|RHEV Manager]] is a [[https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.1/html/self-hosted_engine_guide/|Self-Hosted VM]] inside the cluster.  The username for logging in is ''admin'' and the password is our standard root password.
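
To check which hypervisor is currently running the Manager VM and whether the self-hosted engine is healthy, the ''hosted-engine'' tool can be run on the hosts.  A minimal sketch, assuming the standard RHV self-hosted-engine tooling is installed on hv{01..04}:

<code>
# Run as root on any of the hypervisors (hv01..hv04).
# Prints each host's agent state and score, plus which host the engine VM
# is currently running on.
hosted-engine --vm-status
</code>
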
The Hypervisors (hv{01..04}) and Storage nodes (ssdstore{01..02}) have entries in ''/etc/hosts'' in case of DNS failure.
  
Note: the version of the glusterfs packages on the hypervisors must not exceed the version on the storage nodes (i.e. the client must be older than or equal to the server).
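
A quick way to compare the two before updating packages (a minimal sketch; the package names assume the usual glusterfs client/server split on RHEL/CentOS):

<code>
# On a hypervisor (gluster client), e.g. hv01:
rpm -q glusterfs glusterfs-fuse

# On a storage node (gluster server), e.g. ssdstore01:
rpm -q glusterfs glusterfs-server

# The versions reported on hv{01..04} must be <= those on ssdstore{01..02}.
</code>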
----
  
==== Emergency RHEV Web UI Access w/o VPN ====
In the event the OpenVPN gateway VM is inaccessible or locked up, you can open an SSH tunnel (''ssh -D 9999 $YOURUSER@8.43.84.133'') and set your browser's proxy settings to SOCKS5 localhost:9999 to get at the RHEV web UI.  That public IP is on store01 and is a leftover artifact from when store01 ran OpenVPN.
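
To sanity-check the tunnel before touching browser settings, you can point ''curl'' at the SOCKS proxy (a small sketch; the URL is the Manager address from the Summary section):

<code>
# Terminal 1: open the dynamic (SOCKS5) tunnel via store01's public IP.
ssh -D 9999 $YOURUSER@8.43.84.133

# Terminal 2: confirm the web UI answers through the proxy
# (-k in case the Manager certificate is signed by an internal CA).
curl -kIs --socks5-hostname localhost:9999 https://mgr01.front.sepia.ceph.com/ | head -n1
</code>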

==== GFIDs listed in ''gluster volume heal ssdstorage info'' forever ====
This is https://bugzilla.redhat.com/show_bug.cgi?id=1361518.

As long as the unsynced entries are GFIDs only and they only appear under the arbiter (senta01) server, you can paste **just** the GFIDs into a ''/tmp/gfids'' file and run the following script (note that the script re-extracts the GFIDs from the heal info output rather than reading ''/tmp/gfids''):

<code>
#!/bin/bash
# Clears stale trusted.afr.* xattrs on the arbiter brick for GFID entries that
# never leave the "gluster volume heal <volume> info" output (see the BZ above).
set -ex

VOLNAME=ssdstorage

# Pull the full GFIDs (UUIDs) out of the heal info output.
for id in $(gluster volume heal $VOLNAME info | egrep -o '[0-9a-f]{8}-([0-9a-f]{4}-){3}[0-9a-f]{12}'); do
  # Locate the backing file for this GFID on the arbiter brick.
  file=$(find /gluster/arbiter/.glusterfs -name "$id" -not -path '/gluster/arbiter/.glusterfs/indices/*' -type f)
  # Only clear the xattrs when both trusted.afr.<volume>-client-* values are
  # zeroed, i.e. gluster has no real pending heal operations for the entry.
  if [ "$(getfattr -d -m . -e hex "$file" | grep "trusted.afr.$VOLNAME" | grep -c '0x000000')" -eq 2 ]; then
    echo "deleting xattr for gfid $id"
    for i in $(getfattr -d -m . -e hex "$file" | grep "trusted.afr.$VOLNAME" | cut -f1 -d'='); do
      setfattr -x "$i" "$file"
    done
  else
    echo "not deleting xattr for gfid $id"
  fi
done
</code>
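
After the script finishes, the stuck entries should drop out of the heal output; a quick re-check:

<code>
# The GFID-only entries under senta01 should no longer be listed
# (the self-heal daemon may take a little while to catch up).
gluster volume heal ssdstorage info
</code>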
  
----
I used to have a summary of steps here but it's safer to just follow the [[https://access.redhat.com/documentation/en-us/red_hat_virtualization/|Red Hat docs]].
  
==== VM has paused due to no storage space error ====
We started seeing this issue on VMs like teuthology, and it looks like it's a known bug.  We updated ''/etc/vdsm/vdsm.conf.d/99-local.conf'' and restarted VDSM (''systemctl restart vdsmd'') as described here:

https://access.redhat.com/solutions/130843
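
For reference, this is roughly what the drop-in override might look like.  This is only a sketch: the ''[irs]'' thin-provisioning watermark settings shown here are an assumption, so take the actual section, keys, and values from the linked solution.

<code>
# /etc/vdsm/vdsm.conf.d/99-local.conf -- illustrative sketch only; use the
# section, keys, and values given in the Red Hat solution linked above.
[irs]
# Extend thin-provisioned disks earlier and in larger chunks so VMs stop
# hitting the low-space watermark and pausing.
volume_utilization_percent = 25
volume_utilization_chunk_mb = 2048
</code>
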
==== Growing a VM's virtual disk ====
  - Log into the [[https://mgr01.front.sepia.ceph.com|Web UI]]