Solutions

Z guest running Rocky Linux 8 doesn’t boot latest kernel after updating the kernel


Operating System and Software

  • Rocky Linux (RHEL) 8
    • IBM Z

Problem

  • After updating the kernel, zipl doesn’t boot the latest kernel

How to Fix

First follow the procedure in the Diagnostic Steps section. If the system matches the situation described there, continue with the steps below.

  1. List all the kernels present on the system

    # rpm -qa kernel
    kernel-4.18.0-147.5.1.el8_1.s390x
    kernel-4.18.0-193.1.2.el8_2.s390x
    
  2. Note down the machine-id for the system

    # cat /etc/machine-id
    1d25f0e3e2e54858ad724962c9f2dc5f
    
  3. List the boot entries

    # ls -1 /boot/loader/entries
    10c858261cdb4a1f9ef318779f2ca873-0-rescue.conf
    10c858261cdb4a1f9ef318779f2ca873-4.18.0-147.5.1.el8_1.s390x.conf
    10c858261cdb4a1f9ef318779f2ca873-4.18.0-147.el8.s390x.conf
    1d25f0e3e2e54858ad724962c9f2dc5f-0-rescue.conf
    1d25f0e3e2e54858ad724962c9f2dc5f-4.18.0-193.1.2.el8_2.s390x.conf
    
  4. Delete the unexpected boot entries or rename the files

    Entries referencing a kernel that is not installed can be safely deleted. Entries referencing an installed kernel but carrying an unexpected machine-id should be renamed to use the expected machine-id.
    In the example above:

    • File 10c858261cdb4a1f9ef318779f2ca873-0-rescue.conf can be deleted, since the corresponding 1d25f0e3e2e54858ad724962c9f2dc5f-0-rescue.conf exists
    • File 10c858261cdb4a1f9ef318779f2ca873-4.18.0-147.el8.s390x.conf can be deleted, since it doesn’t reference an installed kernel
    • File 10c858261cdb4a1f9ef318779f2ca873-4.18.0-147.5.1.el8_1.s390x.conf should be renamed to 1d25f0e3e2e54858ad724962c9f2dc5f-4.18.0-147.5.1.el8_1.s390x.conf, since it references an installed kernel

    # cd /boot/loader/entries
    # rm 10c858261cdb4a1f9ef318779f2ca873-0-rescue.conf
    # rm 10c858261cdb4a1f9ef318779f2ca873-4.18.0-147.el8.s390x.conf
    # mv 10c858261cdb4a1f9ef318779f2ca873-4.18.0-147.5.1.el8_1.s390x.conf 1d25f0e3e2e54858ad724962c9f2dc5f-4.18.0-147.5.1.el8_1.s390x.conf
    
  5. List the boot entries again

    # ls -1 /boot/loader/entries
    1d25f0e3e2e54858ad724962c9f2dc5f-0-rescue.conf
    1d25f0e3e2e54858ad724962c9f2dc5f-4.18.0-147.5.1.el8_1.s390x.conf
    1d25f0e3e2e54858ad724962c9f2dc5f-4.18.0-193.1.2.el8_2.s390x.conf
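
The decisions made in step 4 can be previewed before touching anything. The following is a minimal sketch, not part of the original procedure: plan_entry_cleanup is a hypothetical helper name, and its three arguments (entries directory, expected machine-id, installed kernels written as VERSION-RELEASE.ARCH) are assumptions made for illustration. It only prints suggested rm and mv commands. A rescue entry with an unexpected machine-id and no counterpart is flagged for manual review, because the procedure above does not cover that case.

```shell
#!/bin/sh
# Dry run: print the rm/mv command suggested for each boot entry whose
# file name starts with a machine-id other than the expected one.
# plan_entry_cleanup is a hypothetical helper; it never modifies anything.
#   $1  entries directory     (default: /boot/loader/entries)
#   $2  expected machine-id   (default: contents of /etc/machine-id)
#   $3  installed kernels as VERSION-RELEASE.ARCH, whitespace separated
#       (default: queried from rpm)
plan_entry_cleanup() {
    dir=${1:-/boot/loader/entries}
    want=${2:-$(cat /etc/machine-id)}
    installed=${3:-$(rpm -q kernel --qf '%{VERSION}-%{RELEASE}.%{ARCH}\n')}

    for entry in "$dir"/*.conf; do
        [ -e "$entry" ] || continue
        name=${entry##*/}
        id=${name%%-*}                 # machine-id prefix of the file name
        if [ "$id" = "$want" ]; then
            continue                   # entry already uses the expected id
        fi
        rest=${name#*-}                # e.g. 4.18.0-147.el8.s390x.conf
        version=${rest%.conf}
        target=$dir/$want-$rest

        is_installed=no
        for v in $installed; do
            if [ "$v" = "$version" ]; then is_installed=yes; fi
        done

        if [ -e "$target" ]; then
            echo "rm $entry"           # expected-id counterpart already exists
        elif [ "$is_installed" = yes ]; then
            echo "mv $entry $target"   # installed kernel, wrong machine-id
        elif [ "$version" = 0-rescue ]; then
            echo "# review manually: $entry"
        else
            echo "rm $entry"           # kernel is no longer installed
        fi
    done
}
```

Called with no arguments on the affected system, the function reads /boot/loader/entries, /etc/machine-id and the rpm database. Review the printed commands before running any of them.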
    

Optionally, you may delete all the kernel-related files in /boot that no longer belong to any installed kernel package by following the procedure below:

  1. Build a regular expression of installed kernel versions for reference

    # KERNELS="$(echo $(rpm -q kernel --qf "%{VERSION}-%{RELEASE}\n") | tr ' ' '|')"
    # echo $KERNELS
    4.18.0-147.5.1.el8_1|4.18.0-193.1.2.el8_2
    
  2. List all the kernel-related files in /boot that no longer belong to an installed kernel package

    # cd /boot
    # ls -1 | grep -F 4.18.0 | grep -Ev "$KERNELS"
    config-4.18.0-147.el8.s390x
    initramfs-4.18.0-147.el8.s390x.img
    System.map-4.18.0-147.el8.s390x
    vmlinuz-4.18.0-147.el8.s390x
    
  3. List all the rescue files in /boot that do not correspond to the expected machine-id

    # cd /boot
    # ls -1 | grep rescue | grep -v 1d25f0e3e2e54858ad724962c9f2dc5f
    initramfs-0-rescue-10c858261cdb4a1f9ef318779f2ca873.img
    vmlinuz-0-rescue-10c858261cdb4a1f9ef318779f2ca873
    
  4. Delete all the files listed above in steps 2 and 3
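
Steps 1 to 3 can also be combined into a single listing pass. The sketch below is an illustration under stated assumptions rather than an official tool: stale_boot_files is a hypothetical helper name, its arguments are invented for this example, and it only considers the four file name prefixes that appear in the listings above (config-, initramfs-, System.map-, vmlinuz-). It matches installed versions as fixed strings instead of building a regular expression, and it only prints candidates without deleting anything.

```shell
#!/bin/sh
# List kernel-related files in /boot that belong neither to an installed
# kernel nor to the expected machine-id. Prints paths only, deletes nothing.
# stale_boot_files is a hypothetical helper name.
#   $1  boot directory         (default: /boot)
#   $2  expected machine-id    (default: contents of /etc/machine-id)
#   $3  installed kernels as VERSION-RELEASE, whitespace separated
#       (default: queried from rpm)
stale_boot_files() {
    boot=${1:-/boot}
    want=${2:-$(cat /etc/machine-id)}
    installed=${3:-$(rpm -q kernel --qf '%{VERSION}-%{RELEASE}\n')}

    for file in "$boot"/*; do
        name=${file##*/}
        keep=no
        case $name in
            *-rescue-"$want"*)
                keep=yes ;;            # rescue file of this system
            *-rescue-*)
                ;;                     # rescue file of another machine-id
            config-*|initramfs-*|System.map-*|vmlinuz-*)
                for v in $installed; do
                    # fixed-string match on "-VERSION-RELEASE."
                    case $name in *"-$v."*) keep=yes ;; esac
                done ;;
            *)
                keep=yes ;;            # not a kernel file, leave it alone
        esac
        if [ "$keep" = no ] && [ -f "$file" ]; then
            printf '%s\n' "$file"
        fi
    done
}
```

With the example system above, the output would be the four 4.18.0-147.el8 files and the two rescue files carrying machine-id 10c858261cdb4a1f9ef318779f2ca873. Check the list by hand before removing the files.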

Origin of the Problem

  • When /boot/loader/entries contains entries for more than one machine-id, zipl uses the entries of the machine-id that sorts first alphanumerically, which may not be the machine-id expected for the system.
  • This usually happens after cloning a Z guest. Note that cloning a system is not supported by Red Hat.
  • Another scenario is when the machine-id has been changed by editing /etc/machine-id.

Diagnostic Steps

  1. Verify that the /etc/zipl.conf file is the default for Rocky Linux 8

    # cat /etc/zipl.conf
    [defaultboot]
    defaultauto
    prompt=1
    timeout=5
    target=/boot
    
  2. Note down the machine-id of the system

    # cat /etc/machine-id
    1d25f0e3e2e54858ad724962c9f2dc5f
    
  3. List the zipl boot entries

    # ls -1 /boot/loader/entries
    10c858261cdb4a1f9ef318779f2ca873-0-rescue.conf
    10c858261cdb4a1f9ef318779f2ca873-4.18.0-147.5.1.el8_1.s390x.conf
    10c858261cdb4a1f9ef318779f2ca873-4.18.0-147.el8.s390x.conf
    1d25f0e3e2e54858ad724962c9f2dc5f-0-rescue.conf
    1d25f0e3e2e54858ad724962c9f2dc5f-4.18.0-193.1.2.el8_2.s390x.conf
    
  4. Compare the zipl boot entries with expected machine-id

    In the example above, the listing shows entries for an unexpected machine-id, 10c858261cdb4a1f9ef318779f2ca873.
    Because this machine-id sorts alphanumerically before the expected machine-id, zipl boots an entry belonging to the unexpected machine-id, which explains the issue.
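
The comparison in step 4 can be scripted. This is a minimal sketch, assuming that plain byte-wise ordering (LC_ALL=C sort) is a fair stand-in for the alphanumerical ordering described above; it does not reproduce zipl's own logic. first_entry_machine_id is a hypothetical helper name, and the optional directory argument is an assumption made for illustration.

```shell
#!/bin/sh
# Print the machine-id prefix that sorts first among the boot entries.
# first_entry_machine_id is a hypothetical helper name.
#   $1  entries directory (default: /boot/loader/entries)
first_entry_machine_id() {
    dir=${1:-/boot/loader/entries}
    for entry in "$dir"/*.conf; do
        [ -e "$entry" ] || continue
        name=${entry##*/}
        # everything before the first "-" is the machine-id
        printf '%s\n' "${name%%-*}"
    done | LC_ALL=C sort -u | head -n 1
}
```

If the value printed by first_entry_machine_id differs from the content of /etc/machine-id, the system matches the situation described in this article.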
