LinuxQuestions.org


ronatartifact 01-17-2017 01:33 PM

Stop grub2 from erasing screen and rebooting
 
I have a problem with the CentOS 7 update to 7.3 (the -514 kernel).
It installs a kernel that will not boot.
If I try to do a clean install from a CentOS 7.3 (-514) DVD, the DVD will not load.
The load starts, displays errors and then reboots, all at the speed of light.

Can I do something in grub2 to avoid the instant reboot?

The server that was upgraded still has a working CentOS 7.2 (-327) kernel, so I can get in and change grub2 so that it leaves the errors on the screen.
Some of the errors are "normal" and also appear on the boot of a working CentOS 7.

It appears to be something particular to the server (MSI board) and 7.3, since the same DVD will create a nicely working 7.3 on other hardware and 7.2 runs fine on the same server that will not run the upgraded kernel nor start a fresh install.

dlb101010 01-18-2017 08:55 AM

Quote:

7.2 runs fine on the same server that will not run the upgraded kernel nor start a fresh install
Excuse me if I'm misunderstanding, but have you tried reinstalling the 7.2 that works to see if it will install and still work? From the overall description it sounds like it could be hardware going bad.

Dave

ronatartifact 01-18-2017 07:26 PM

There are now more than 2 bootable kernels on the machine.
The 7.2 (-327) boots fine and the 7.3 (-514) does not.
If I try to do a clean install of 7.3, the DVD just starts up and dies.

If I use the same DVD on another server, it works fine - I just upgraded my main e-mail server using that DVD and the OS installed just fine.
The motherboard where 7.3 does not work is only 2 years old which should be pretty new in the Linux world.
The other server where 7.3 works is much older (2007 or 2008 vintage).

I was hoping to use the newer server instead but I do not want to load 7.2 with the possibility of a software upgrade rendering a key server unbootable.

I am really trying to find out how to fix the boot on the 7.3 installation which was created by the kernel upgrade so that the messages are not erased as soon as they are displayed.

I am hoping that the messages will give some insight into what was done in the latest 7.3 upgrade (-514 version) that prevents the kernel from booting.

This will lead to a bug report on 7.3 unless I can find (Google) something that tells me how to adjust the bios to match what 7.3 expects.

dlb101010 01-20-2017 05:14 PM

I don't know if you've seen this bug report, but it seems to parallel what you're saying and offers a possible workaround. It has to do with: "Kernel 3.10.0-514 does not boot with mlx4_core module on some IBM systems"

HTH,
Dave

ronatartifact 01-26-2017 01:12 PM

I tried to apply the patch from the IBM issue and it did not change the problem as far as I could see.

syg00 01-26-2017 04:43 PM

Doesn't sound like a grub issue at all. Its job is done when it loads the kernel. If the kernel takes a fault that causes a hardware reset, it's not grub's fault.
Have a look in the systemd journal to see if anything got saved for the error.

If you want to see all the boot messages, hitting <Esc> should work once the kernel loads - else edit the grub command line and remove "rhgb" and "quiet".
Won't help with the (hardware) reset clearing the screen though ... you might need to connect a serial console for that.
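
As a rough illustration of that edit, the sketch below takes a kernel command line like the ones in the grub entries, drops "rhgb" and "quiet", and appends console options for a serial console. The function name debug_cmdline is made up for this example, and the serial settings (ttyS0, 115200n8) are assumptions that would have to match the port and speed actually in use.

```shell
# Sketch: turn a stock CentOS 7 kernel command line into a debug-friendly one.
# The serial settings (ttyS0, 115200n8) are assumptions, not taken from the thread.
debug_cmdline() {
    out=""
    for arg in $1; do            # unquoted on purpose: split on whitespace
        case "$arg" in
            rhgb|quiet) ;;       # drop the two options that hide boot messages
            *) out="${out:+$out }$arg" ;;
        esac
    done
    # Kernel messages go to every console= listed; the last one becomes /dev/console.
    printf '%s\n' "${out:+$out }console=tty0 console=ttyS0,115200n8"
}

debug_cmdline "root=/dev/mapper/centos-root ro crashkernel=auto rhgb quiet LANG=en_US.UTF-8"
```

On the machine itself the same change can be made for a single boot by pressing "e" at the grub menu and editing the linux16 line, or persistently with something like grubby --update-kernel=/boot/vmlinuz-3.10.0-514.6.1.el7.x86_64 --remove-args="rhgb quiet" --args="console=tty0 console=ttyS0,115200n8". A second machine has to be attached to the serial port and capturing output for the console= part to be of any use.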

ronatartifact 01-26-2017 05:08 PM

Nothing in the boot logs.
I cannot hit Esc fast enough. I am old enough to remember when the speed of light was a lot slower!

Does the fact that the DVD does not even start the install when you try a fresh install of 7.3 give you any ideas?

The DVD is OK, I created another server with it with no problems.

What files should I post to help get to the bottom of this?

I am hesitant to put the server into production, even though 7.2 has been running on it for a year, until I understand what causes 7.3 not to load.
I have never had a CentOS that failed to run after an update. I am sure that there will be more updates later, and I don't want the server to suddenly fail in production after a normal round of software updates.

Ron

syg00 01-26-2017 05:20 PM

The fact that the DVD doesn't boot is a worry - I would take it that means you have a real hardware conflict.
If there is nothing in journalctl for the failing boot ("journalctl -b -1" from the 7.2 kernel and go to the bottom), no other file is going to help.
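
One caveat on "journalctl -b -1": it only shows a previous boot if journald keeps its logs on disk. With the default Storage=auto setting that depends on /var/log/journal existing, and as far as I know a stock CentOS 7 install does not create that directory. The sketch below just checks for it; the function name journal_storage is invented for the example, and the optional directory argument exists only so the check can be tried against a scratch path.

```shell
# Report whether journald (with Storage=auto) would keep logs across reboots,
# i.e. whether the journal directory exists on disk.
journal_storage() {
    dir="${1:-/var/log/journal}"
    if [ -d "$dir" ]; then
        echo persistent
    else
        echo volatile
    fi
}

journal_storage
```

If this prints "volatile", creating the directory (mkdir -p /var/log/journal) and restarting systemd-journald should make later boots show up in "journalctl --list-boots", after which "journalctl -b -1 -e" jumps to the end of the previous boot. A reset that happens before journald has started may still leave nothing behind.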

So this happens after you select the 7.3 kernel? Just want to make sure it's not the initramfs causing the problem.

ronatartifact 01-26-2017 10:54 PM

The DVD does not display anything on the screen for long enough to see any message.

The normal boot from the hard drive shows the menu of Linux kernels to boot, and if you choose the 7.2, it boots fine.
If you choose the 7.3, it starts to boot but resets back to the BIOS after flashing about 8 to 10 lines on the screen.
Some of these appear to be the same as those flashed when the 7.2 boots correctly.

dmesg seems to get overwritten on each boot. It seems to contain a lot of interesting messages.
On a successful boot it complains about a few things but they do not have any effect on the operation of the system.

[ 0.000000] AGP: No AGP bridge found

[ 0.000000] ACPI BIOS Warning (bug): Optional FADT field Pm2ControlBlock has zero address or length: 0x0000000000000000/0x1 (20130517/tbfadt-603)

[ 0.000000] AGP: No AGP bridge found
[ 0.000000] AGP: Node 0: aperture [bus addr 0x00000000-0x01ffffff] (32MB)
[ 0.000000] AGP: Your BIOS doesn't leave a aperture memory hole
[ 0.000000] AGP: Please enable the IOMMU option in the BIOS setup
[ 0.000000] AGP: This costs you 64MB of RAM

many lines later
[ 0.914401] [Firmware Bug]: cpu 0, try to use APIC500 (LVT offset 0) for vector 0x10400, but the register is already in use for vector 0xf9 on another cpu
[ 0.914437] [Firmware Bug]: cpu 0, IBS interrupt offset 0 not available (MSRC001103A=0x0000000000000100)
[ 0.914469] Failed to setup IBS, -22

many lines later while it appears to be setting up the video
[ 1.856404] ACPI Error: [\_SB_.ALIB] Namespace lookup failure, AE_NOT_FOUND (20130517/psargs-359)
[ 1.856409] ACPI Error: Method parse/execution failed [\_SB_.PCI0.VGA_.ATC0] (Node ffff8804295c6d20), AE_NOT_FOUND (20130517/psparse-536)
[ 1.856416] ACPI Error: Method parse/execution failed [\_SB_.PCI0.VGA_.ATCS] (Node ffff8804295c6cf8), AE_NOT_FOUND (20130517/psparse-536)
[ 1.856611] [drm] Initialized radeon 2.42.0 20080528 for 0000:00:01.0 on minor 0
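
Since dmesg is overwritten on each boot, one way to keep these complaints for later comparison is to save the output of a good boot to a file and filter it. The sketch below is only an illustration: the function name boot_complaints is made up, and the pattern list is an assumption drawn from the messages quoted above, so it will miss anything worded differently.

```shell
# Keep only the lines of a saved dmesg that look like complaints.
# The pattern list is an assumption based on the messages quoted in this post.
boot_complaints() {
    grep -E 'Firmware Bug|ACPI (Error|BIOS Warning)|Failed to'
}

# Example with two of the lines quoted above; only the first one matches.
printf '%s\n' \
    '[ 0.914469] Failed to setup IBS, -22' \
    '[ 1.856611] [drm] Initialized radeon 2.42.0 20080528 for 0000:00:01.0 on minor 0' |
    boot_complaints
```

In practice this would be fed from a saved copy, for example "dmesg > dmesg-327.txt" after a good 7.2 boot followed by "boot_complaints < dmesg-327.txt" (the file name is hypothetical).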


I did find a file, /var/log/grubby, which does mention the 7.3 OS.
The whole section is:
DBG: 28821: Fri Jan 20 16:21:12 2017: command line: --grub2 -c /boot/grub2/grub.cfg --add-kernel=/boot/vmlinuz-3.10.0-514.6.1.el7.x86_64 --copy-default --title CentOS Linux (3.10.0-514.6.1.el7.x86_64) 7 (Core) --args=root=/dev/mapper/centos-root --remove-kernel=TITLE=CentOS Linux (3.10.0-514.6.1.el7.x86_64) 7 (Core) --make-default
DBG: Image entry succeeded:
DBG: menuentry 'CentOS Linux (3.10.0-514.2.2.el7.x86_64) 7 (Core)' --class centos --class gnu-linux --class gnu --class os --unrestricted $menuentry_id_option 'gnulinux-3.10.0-123.el7.x86_64-advanced-76f559c2-d36f-43e6-be89-a57e852e5d0a' {
DBG: load_video
DBG: set gfxpayload=keep
DBG: insmod gzio
DBG: insmod part_msdos
DBG: insmod xfs
DBG: set root='hd0,msdos1'
DBG: if [ x$feature_platform_search_hint = xy ]; then
DBG: search --no-floppy --fs-uuid --set=root --hint-bios=hd0,msdos1 --hint-efi=hd0,msdos1 --hint-baremetal=ahci0,msdos1 --hint='hd0,msdos1' 06b8406b-a595-47aa-b1f3-a4adf11ccaeb
DBG: else
DBG: search --no-floppy --fs-uuid --set=root 06b8406b-a595-47aa-b1f3-a4adf11ccaeb
DBG: fi
DBG: linux16 /vmlinuz-3.10.0-514.2.2.el7.x86_64 root=/dev/mapper/centos-root ro rd.lvm.lv=centos/swap vconsole.font=latarcyrheb-sun16 rd.lvm.lv=centos/root crashkernel=auto vconsole.keymap=us rhgb quiet LANG=en_US.UTF-8
DBG: initrd16 /initramfs-3.10.0-514.2.2.el7.x86_64.img
DBG: }

The 7.2 entries look similar:
DBG: 21337: Fri Oct 28 02:41:31 2016: command line: --grub2 -c /boot/grub2/grub.cfg --add-kernel=/boot/vmlinuz-3.10.0-327.36.3.el7.x86_64 --copy-default --title CentOS Linux (3.10.0-327.36.3.el7.x86_64) 7 (Core) --args=root=/dev/mapper/centos-root --remove-kernel=TITLE=CentOS Linux (3.10.0-327.36.3.el7.x86_64) 7 (Core) --make-default
DBG: Image entry succeeded:
DBG: menuentry 'CentOS Linux (3.10.0-327.18.2.el7.x86_64) 7 (Core)' --class centos --class gnu-linux --class gnu --class os --unrestricted $menuentry_id_option 'gnulinux-3.10.0-123.el7.x86_64-advanced-76f559c2-d36f-43e6-be89-a57e852e5d0a' {
DBG: load_video
DBG: set gfxpayload=keep
DBG: insmod gzio
DBG: insmod part_msdos
DBG: insmod xfs
DBG: set root='hd0,msdos1'
DBG: if [ x$feature_platform_search_hint = xy ]; then
DBG: search --no-floppy --fs-uuid --set=root --hint-bios=hd0,msdos1 --hint-efi=hd0,msdos1 --hint-baremetal=ahci0,msdos1 --hint='hd0,msdos1' 06b8406b-a595-47aa-b1f3-a4adf11ccaeb
DBG: else
DBG: search --no-floppy --fs-uuid --set=root 06b8406b-a595-47aa-b1f3-a4adf11ccaeb
DBG: fi
DBG: linux16 /vmlinuz-3.10.0-327.18.2.el7.x86_64 root=/dev/mapper/centos-root ro rd.lvm.lv=centos/swap vconsole.font=latarcyrheb-sun16 rd.lvm.lv=centos/root crashkernel=auto vconsole.keymap=us rhgb quiet LANG=en_US.UTF-8
DBG: initrd16 /initramfs-3.10.0-327.18.2.el7.x86_64.img
DBG: }

syg00 01-27-2017 02:12 AM

Like I said earlier, might be time for a serial console. Plenty of how-to's on the net.

ronatartifact 01-27-2017 08:42 AM

In the old days before video monitors, I would have had a hardcopy on the console from the start!
I am starting to have an attack of nostalgia.
I probably threw out dozens of serial cables over the years.


All times are GMT -5.