services:fog

Differences

This shows you the differences between two versions of the page.

services:fog [2017/12/01 19:50]
djgalloway
services:fog [2023/07/14 22:41] (current)
dmick Added description of reimage flow for both fog and cobbler
Line 9: Line 9:
  
 ===== How-Tos =====
 +==== Adding a new distro ====
 +
 +The [[https://github.com/ceph/ceph-build/pull/1706|Jenkins job]] does this automatically now.
 ==== Capturing OS images ====
-This can be done manually by basically deciphering the bash monster in CEPH_BUILD_LINK_HERE
+This can be done manually by basically deciphering the bash monster in https://github.com/ceph/ceph-build/tree/master/sepia-fog-images
  
   - Navigate to https://jenkins.ceph.com/job/sepia-fog-images/build?delay=0sec
Line 20: Line 23:
 If capturing any image fails, the job is configured to cancel the OS capture and will leave the testnodes locked so you can debug/investigate.
  
-==== Adding new distro ====
+==== Image capture control flow ====
 + 
 +The Jenkins job sepia-fog-images runs on the teuthology host, and
 +
 +  - clones/updates and sets up teuthology to use teuthology-lock
 +  - clones/updates ceph-cm-ansible
 +  - locks machines of the requested type(s) (or uses hosts passed in as arguments), setting their descriptions to "Locked to capture FOG image for Jenkins build ###"
 +  - uses /usr/local/sbin/set-next-server.sh on the store01 DHCP server to set the targets to PXE boot from cobbler (rather than fog) and restarts dhcpd
 +  - sshes to ubuntu@cobbler.front to set the right cobbler profile for the host and enable netboot
 +  - powercycles the hosts in question
 +  - while the hosts are rebooting, uses curl and the FOG API to GET an image id or POST an image template to create the image, and then sets up fog for image capture
 +  - sleeps for 10s to allow the hosts to become inaccessible, then starts polling for the sentinel file /ceph-qa-ready, which is created at the very end of the process (the cobbler install flow is documented below)
 +  - if there's an error or /ceph-qa-ready isn't present, retries for up to 2 hours. If normal completion is seen, sets DHCP back to PXE-from-fog and runs ansible-playbook (from teuthology, against the host) with tools/prep-fog-capture.yml, which removes some files from the prior installation:
 +    * /etc/udev/rules.d/70-persistent-net.rules
 +    * /.cephlab_net_configured
 +    * /ceph-qa-ready
 +  - disables network configuration, kills the /var/lib/ceph mount and removes it from fstab, removes any ssh host keys, unsubscribes from RHEL, removes a katello.facts file, disables periodic dnf makecache jobs, cleans the dnf cache, stops ntp/chrony, and sets the hwclock
 +  - restarts dhcpd
 +  - waits for any in-progress fog images to complete, pausing the teuthology queue if there are any
 +  - powercycles the targets to boot into FOG and capture, and waits for FOG task completion
 +  - runs teuthology-lock --unlock on any locked hosts and unpauses the queue if needed
 + 
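The sleep-then-poll step in the flow above can be sketched in Python as follows. This is only an illustration, not the Jenkins job's actual bash: the function name, the ''check'' callable (for example, something that tests for /ceph-qa-ready over ssh), and the 30-second polling interval are assumptions; only the 10-second initial sleep and the 2-hour limit come from the description.

```python
import time
from typing import Callable


def wait_for_sentinel(
    check: Callable[[], bool],
    timeout_s: float = 2 * 60 * 60,   # "retry for up to 2 hours"
    interval_s: float = 30.0,         # assumed polling interval (not in the source)
    initial_delay_s: float = 10.0,    # "sleeps for 10s" so the host goes away first
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Return True once check() reports the sentinel, False on timeout.

    check() is a hypothetical callable; any exception it raises (for
    example because the host is still rebooting) counts as "not ready".
    """
    sleep(initial_delay_s)
    deadline = clock() + timeout_s
    while True:
        try:
            if check():
                return True
        except Exception:
            # Errors are expected while the host is reinstalling; keep polling.
            pass
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

Passing ''sleep'' and ''clock'' as parameters is a design choice made here so the loop can be exercised without real waiting.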
 +==== Cobbler install flow for reimaging process ====  
 +  - do a normal preseed/kickstart install with cobbler-defined preseed/kickstart files. Some extra definitions:
 +    * a smallish set of packages to install
 +    * grub serial console setup
 +    * install an /etc/rc.local to run once on first reboot
 +    * install with ext4 on the appropriate drive, without swap
 +    * set up subscription manager
 +    * add the cm user with the admin_users' keys and passwordless sudo
 +    * turn off cobbler PXE boot
 +  - after rebooting to the fresh install, /etc/rc.local runs:
 +    * search the nics for any active interfaces, and set them up for DHCP; if they receive no DHCP address, unconfigure them, on the assumption that they're not on any network we should configure
 +    * touch /.cephlab_net_configured when done
 +    * try to get a hostname from reverse DNS and configure it
 +    * generate SSH host keys
 +    * ping the cobbler host to make sure it's reachable
 +    * curl the cblr/svc/op/trig/mode/post/system/<hostname> URL, which will run the /var/lib/cobbler/triggers/install/post/cephlab_ansible.sh script on the cobbler host against the target host
 +  - cephlab_ansible.sh will (running on the cobbler host):
 +    * use scl on CentOS 7 to get python 3.8
 +    * clone/update ceph-cm-ansible and ceph-sepia-secrets (root's ssh key allows access to the latter on github)
 +    * look for port 22 to be open; there is apparently a way this trigger might run before the install is done, and if so, 22 won't be available; when it's run from the /etc/rc.local in step 2 above, it will find 22 open
 +    * create a /var/log/ansible directory and put log output there in a file named <hostname>
 +    * for CentOS 8 Stream, run tools/convert-to-centos-stream.yml
 +    * if the Cobbler profile is named '*-stock', stop there
 +    * run ansible cephlab.yml, skipping users,pubkeys,zap
 +  - cephlab.yml runs
 +    * teuthology.yml for teuthology
 +    * testnodes.yml for testnodes
 +    * container-host.yml for docker/podman installation
 +    * cobbler.yml for cobbler hosts
 +    * similarly for paddles and pulpito
 +    * finally, for testnodes, touches the /ceph-qa-ready sentinel, used by the fog capture process above to notice that the installation is finished and proceed with the capture process.
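The decisions cephlab_ansible.sh makes after the install can be sketched in Python as below. This is an illustrative model only, not the script itself: the function names, the ''http'' scheme, the ''--limit'' argument, and the ''is_centos8_stream'' flag are assumptions. The trigger path, the '*-stock' rule, the skipped tags, and the log location come from the description above.

```python
from fnmatch import fnmatch
from typing import Dict, List, Union


def cobbler_trigger_url(cobbler_host: str, hostname: str) -> str:
    """Build the post-install trigger URL that /etc/rc.local curls.

    The http scheme and the cobbler_host value are assumptions.
    """
    return f"http://{cobbler_host}/cblr/svc/op/trig/mode/post/system/{hostname}"


def plan_post_install(
    profile: str, hostname: str, is_centos8_stream: bool = False
) -> Dict[str, Union[str, List[List[str]]]]:
    """Return the log file and the playbook commands that would be run.

    Commands are returned as argv lists rather than executed, so the
    plan can be inspected; '--limit' is an assumed way to target the host.
    """
    commands: List[List[str]] = []
    if is_centos8_stream:
        commands.append(
            ["ansible-playbook", "tools/convert-to-centos-stream.yml",
             "--limit", hostname]
        )
    # Profiles named '*-stock' stop before the main cephlab.yml run.
    if not fnmatch(profile, "*-stock"):
        commands.append(
            ["ansible-playbook", "cephlab.yml",
             "--skip-tags", "users,pubkeys,zap", "--limit", hostname]
        )
    return {"log_file": f"/var/log/ansible/{hostname}", "commands": commands}
```

For a hypothetical profile named ''example-stock'', the plan contains no cephlab.yml run, so the host is left as a plain install.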
  
-  - Update the friendly distro names in the ''parameters'' dictionary in THE_JENKINS_JOB_CONFIG
-  - Update the Cobbler and FOG image names at the top of JENKINS_JOB_BUILD_SCRIPT
-  - Create an Image in FOG for each machine type
-    - Navigate to http://fog.front.sepia.ceph.com/fog/management/index.php?node=image
-    - Click **Create New Image**
-    - Set **Image Name** to MACHINETYPE_DISTRO_DISTROVERSION. (e.g., mira_centos_7.5)
-    - Set **Image Path** to the image's name (e.g., /images/mira_centos_7.5)
-    - Set **Operating System** to Linux (I'd hope)
-    - Click **Add**
-    - Repeat for each machine type!
services/fog.1512157841.txt.gz · Last modified: 2017/12/01 19:50 by djgalloway