LinuxQuestions.org
Linux - Kernel This forum is for all discussion relating to the Linux kernel.

Old 09-25-2022, 11:46 AM   #16
kalaleq
LQ Newbie
 
Registered: Mar 2021
Posts: 27

Original Poster
Rep: Reputation: Disabled

Hm, kexec just gives me a *truly* blank screen, i.e. no video signal and the monitor goes into standby mode. No useful output.

I *think* i'm using it right, so... no help there. Still stuck.
 
Old 09-25-2022, 12:13 PM   #17
Emerson
LQ Sage
 
Registered: Nov 2004
Location: Saint Amant, Acadiana
Distribution: Gentoo ~amd64
Posts: 7,661

Rep: Reputation: Disabled
OK, are you absolutely sure it does not boot? There are plenty of cases on the Gentoo forums where a custom kernel cannot handle video output and the screen goes blank, so it looks like the kernel is frozen, while in actuality the kernel boots, and in some cases, when X is configured to autoload, even the display comes alive again.
 
Old 09-25-2022, 12:22 PM   #18
kalaleq
LQ Newbie
 
Registered: Mar 2021
Posts: 27

Original Poster
Rep: Reputation: Disabled
Yeah, it's definitely not booting, no sign of activity even when i leave it for minutes, no network, no blinky flashies, nothing, just a blank screen with a video signal but no output.

I tried putting it on the USB stick as well and directing grub to load it from there, nothing. So all signs do seem to point to it being the kernel that's the problem, despite this being the same kernel that's booting my other machines and used to boot this one just fine. I guess my next step is to try installing gentoo or arch on the drive, using that to build a kernel with the necessary driver support, and then using that to boot my old root partition... at least that would get me using the machine again, assuming it works. Then i could comb the kernel configs for differences i guess. I'll report back if that works.

It just doesn't make any sense! I'll bet it turns out to be something stupid i'm overlooking, though.
 
Old 09-25-2022, 01:58 PM   #19
colorpurple21859
LQ Veteran
 
Registered: Jan 2008
Location: florida panhandle
Distribution: Slackware Debian, Fedora, others
Posts: 7,346

Rep: Reputation: 1589
Did you have an initramfs before putting the server drives into the Ryzen, and did you rebuild the initramfs after putting the server drives in?
 
Old 09-25-2022, 04:40 PM   #20
kalaleq
LQ Newbie
 
Registered: Mar 2021
Posts: 27

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by colorpurple21859 View Post
Did you have an initramfs before putting the server drives into the Ryzen, and did you rebuild the initramfs after putting the server drives in?
Yes, but as mentioned not a critical one and i tried booting without it. I've also since rebuilt it.
 
Old 10-05-2022, 04:06 PM   #21
kalaleq
LQ Newbie
 
Registered: Mar 2021
Posts: 27

Original Poster
Rep: Reputation: Disabled
I'm still stymied after 2 weeks of poking at this as time allows.

I've successfully installed gentoo on the SSD (/dev/sda) and have re-installed my own system on /dev/nvme0n1. I can boot gentoo with its own kernel.

I cannot get the gentoo kernel to boot my root drive on the NVMe drive, nor can i get my kernel to boot the gentoo root.

I did notice the gentoo install of grub had some 'echo' commands and i was seeing their output, then nothing else once control passed to the kernel. So i added some similar commands to my own grub install. Now trying to boot my system with my own kernel and grub install at least has some output (from grub) rather than just a blank screen - but still nothing after control passes to the kernel. At that point it just hangs with a solid underline cursor on the next line after the last echo message from grub, and all i can do is power off (ctrl-alt-del does nothing either).

From all of this it's just not clear to me if this is a problem with the kernel, with the root filesystem, or both. I would have expected at least one of the combinations above to do *something*.

FWIW i'm doing things like this from both grubs. (hd0,gpt1) / /dev/sda1 is the gentoo boot partition containing its kernel/initramfs, (hd0,gpt3) / /dev/sda3 is the gentoo root. (hd2,gpt3) / /dev/nvme0n1p3 is the root partition for my own system, which contains the kernel/initramfs under /boot.

# Gentoo kernel, my root:
linux (hd0,gpt1)/vmlinuz-gentoo-XXXX root=/dev/nvme0n1p3 ro
initrd (hd0,gpt1)/initramfs-gentoo-XXXX

# My kernel, gentoo root:
linux (hd3,gpt3)/boot/vmlinuz root=/dev/sda1 ro
initrd (hd3,gpt3)/boot/initramfs

# My kernel, my root:
linux (hd3,gpt3)/boot/vmlinuz root=/dev/nvme0n1p3 ro
initrd (hd3,gpt3)/boot/initramfs

All of these seem to have the same effect.
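
For anyone copying entries like these by hand, here is a rough sketch of a sanity check for the two things that are easiest to get wrong in them: a device spec without a partition number, and a file path without its leading slash. It is Python, it only understands the `(hdN,gptM)/path` form used above (GRUB accepts other forms too), and the names `DEVICE_PATH` and `check_entry_line` are made up for this illustration.

```python
import re

# Matches only the "(hdN,gptM)/path" or "(hdN,msdosM)/path" form used in this thread.
DEVICE_PATH = re.compile(r"^\((hd\d+),(gpt|msdos)(\d+)\)(/\S*)$")

def check_entry_line(line):
    """Return a list of problems found in a GRUB 'linux' or 'initrd' line."""
    parts = line.split()
    if len(parts) < 2 or parts[0] not in ("linux", "initrd"):
        return []  # not a line this sketch knows how to check
    problems = []
    target = parts[1]
    if not DEVICE_PATH.match(target):
        problems.append(f"malformed device/path: {target}")
    # A 'linux' line normally also tells the kernel where the root filesystem is.
    if parts[0] == "linux" and not any(p.startswith("root=") for p in parts[2:]):
        problems.append("no root= argument")
    return problems

# Hypothetical sample lines, not taken from a real grub.cfg.
for sample in (
    "linux (hd0,gpt1)/vmlinuz root=/dev/sda3 ro",
    "linux (hd0,gpt1)vmlinuz root=/dev/sda3 ro",
    "initrd (hd0,gpt)/initramfs",
):
    print(sample, "->", check_entry_line(sample))
```

This only catches typos. It cannot tell whether `hd3` is really the disk you think it is, so running `ls` at the GRUB prompt remains the way to confirm that.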

Note, i've updated my other machines to kernel 6.0.0 and have fully re-built the system with that kernel on the ryzen machine as well. No change.

I was hoping the gentoo install would be a quicker alternative for messing with this than booting the (arch-based) system rescue stick, but for some reason if i try to chroot into my system under gentoo i get a segfault. Maybe something to do with the native architecture code? I may just slap arch on that drive instead as i know that works.

Kind of at my wit's end here! If anyone has any more bright ideas for me to try i'd sure appreciate it.
 
Old 10-05-2022, 04:26 PM   #22
Emerson
LQ Sage
 
Registered: Nov 2004
Location: Saint Amant, Acadiana
Distribution: Gentoo ~amd64
Posts: 7,661

Rep: Reputation: Disabled
Use Gentoo wgetpaste utility and post the link to your kernel .config.

Edit: just saw your segfault problem. Yes, you cannot build with -march=native or any other CPU-specific -march and hope it runs on all CPUs.

Last edited by Emerson; 10-05-2022 at 04:28 PM.
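
To make the -march point concrete: a binary built for one CPU can use instructions that another CPU lacks, and one quick way to see the gap is to compare the `flags` lines of `/proc/cpuinfo` from the two machines. Below is a minimal Python sketch of that comparison, assuming x86-style cpuinfo text. The function names and the sample flag lists are invented for illustration.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()

def missing_flags(build_cpuinfo, run_cpuinfo):
    """Flags the build machine has that the machine running the code lacks."""
    return sorted(cpu_flags(build_cpuinfo) - cpu_flags(run_cpuinfo))

# Invented, heavily shortened samples; real flag lines are much longer.
build_host = "processor\t: 0\nflags\t\t: fpu sse2 avx2 avx512f\n"
run_host = "processor\t: 0\nflags\t\t: fpu sse2 avx2\n"
print(missing_flags(build_host, run_host))
```

On real machines you would pass in the contents of each machine's `/proc/cpuinfo`. An empty result does not prove the binaries are compatible, but a non-empty one is a strong hint about where a segfault or illegal-instruction error comes from.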
 
Old 10-05-2022, 05:18 PM   #23
kalaleq
LQ Newbie
 
Registered: Mar 2021
Posts: 27

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by Emerson View Post
Use Gentoo wgetpaste utility and post the link to your kernel .config.

Edit: just saw your segfault problem. Yes, you cannot build with -march=native or any other CPU-specific -march and hope it runs on all CPUs.
It does boot the gentoo install, it just won't let me then chroot into my own root partition.

I'm going to slap arch on the machine and see if i have any better luck.

My 6.0 config (for my self-built kernel) is at https://pastebin.com/CBaeQUSk FWIW. This is the kernel that's booting all the other machines and used to boot this one fine.
 
Old 10-05-2022, 05:23 PM   #24
Emerson
LQ Sage
 
Registered: Nov 2004
Location: Saint Amant, Acadiana
Distribution: Gentoo ~amd64
Posts: 7,661

Rep: Reputation: Disabled
OK, this config you posted: what does it fail to do? What is the problem with it?
 
Old 10-05-2022, 05:38 PM   #25
kalaleq
LQ Newbie
 
Registered: Mar 2021
Posts: 27

Original Poster
Rep: Reputation: Disabled
Hi Emerson, i think we got a bit lost in this thread. The issue is still as noted previously in the thread: i'm unable to boot one machine (Ryzen 3900X CPU) with my self-built system (kernel and root partition) and am just getting a blank screen after the grub hand-off for any combination involving either my kernel or my root partition (my kernel booting my root, gentoo kernel booting my root, my kernel booting gentoo root). No kernel output at all. I'm really stymied as this was all working beautifully before i opened up this machine and hooked up some server RAID drives to test (they turned out to be fine), took them out and closed it back up again.

I can still boot this machine off various USB sticks and even off arch/gentoo installs on the SSD. Can't boot my self-built system off either NVMe or SSD, despite multiple times wiping the drives and rebuilding. The self-built system is copied from my server which is running just fine; it's also booting other machines in the house and used to boot the ryzen machine just fine before testing those other drives. I essentially copy over the main parts of the root partition via rsync and then create some standard directories... same way i did it when originally setting up the ryzen machine a couple of years ago.

I think the kernel config is a red herring, as like i said it used to work fine. There's some puzzle piece here that i'm just not seeing. I appreciate you trying to help but i'm just not sure what info i can give you that will actually be useful - the lack of *any* kernel output at boot is making this particularly mysterious.
 
Old 10-05-2022, 05:59 PM   #26
Emerson
LQ Sage
 
Registered: Nov 2004
Location: Saint Amant, Acadiana
Distribution: Gentoo ~amd64
Posts: 7,661

Rep: Reputation: Disabled
To troubleshoot a non-booting kernel we need to look at its config, assuming the kernel is actually loaded by the bootloader and there is no bootloader failure. Furthermore, you need to be sure this actually is the kernel you are booting. Gentoo users often get lost at the last step of the kernel install and in actuality boot some other kernel image, resulting in a constant headache for everybody: the config everybody is looking at is fine, it just wasn't used to build the image that is actually booting and failing ...
 
Old 10-05-2022, 07:09 PM   #27
kalaleq
LQ Newbie
 
Registered: Mar 2021
Posts: 27

Original Poster
Rep: Reputation: Disabled
To be clear, the gentoo kernel boots fine with the gentoo root install and it's not the point - the ultimate point is to get my self-built (non-gentoo) system booting again. That's where the mystery is.

I'll just have to keep poking at it until i find some better clue.
 
Old 10-06-2022, 05:22 AM   #28
colorpurple21859
LQ Veteran
 
Registered: Jan 2008
Location: florida panhandle
Distribution: Slackware Debian, Fedora, others
Posts: 7,346

Rep: Reputation: 1589
Along with the video= on the linux line, add these lines to the grub menu entry if they don't already exist.
Something like this:

insmod all_video
load_video
insmod gzio
linux (hd3,gpt3)/boot/vmlinuz root=/dev/nvme0n1p3 ro video=1024x768
initrd (hd3,gpt3)/boot/initramfs

If that doesn't work, change all_video to either efi_uga or efi_gop.

Last edited by colorpurple21859; 10-06-2022 at 05:27 AM.
 
Old 11-14-2022, 12:42 PM   #29
kalaleq
LQ Newbie
 
Registered: Mar 2021
Posts: 27

Original Poster
Rep: Reputation: Disabled
Hi! Coming back to update this in case it helps anyone else searching for something similar.

I had the same thing happen to my server recently, which really focused me on finding a solution.

I finally found a clue in the BLFS (Beyond Linux From Scratch) chapter on setting up GRUB for UEFI. It lists a set of kernel configuration options needed for the handoff, and i discovered that a couple of years ago i had switched a critical one (CONFIG_DRM) over to a module rather than compiling it into the kernel. This meant that the kernel couldn't output anything immediately after the handoff from GRUB. I'd noticed this in the past as a delay before i saw any kernel output, but assumed it was something to do with the monitor catching up with a mode switch. The kernel would start outputting messages only after it got its own video handling set up.

Once i compiled this in i was able to see the early kernel output and easily find the problem with the server, which was something hung up early in my init scripts.

For the workstation it's still stopping at loading /sbin/init for some reason, but at least i have a hope of debugging that now that i'm getting some early kernel output! I'll come back and update again if that turns out to be anything significant, but i fully expect it'll turn out to be something stupid i won't want to note publicly.
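
In case it saves someone else the same hunt, here is a small Python sketch of the check that would have caught this: read a kernel .config and report which options from a list are not compiled in (set to `m`, or not set at all). Only CONFIG_DRM comes from the story above. The other option names in the sample, and the function names, are just placeholders for illustration, so substitute the list from whatever guide you are following.

```python
import re

def parse_kernel_config(text):
    """Map each CONFIG_* option in a kernel .config to its value ('y', 'm', ...) or 'n'."""
    states = {}
    for line in text.splitlines():
        line = line.strip()
        m = re.match(r"^(CONFIG_\w+)=(.*)$", line)
        if m:
            states[m.group(1)] = m.group(2)
            continue
        # Disabled options appear as comments of this exact shape.
        m = re.match(r"^# (CONFIG_\w+) is not set$", line)
        if m:
            states[m.group(1)] = "n"
    return states

def not_built_in(text, options):
    """Return {option: state} for every listed option that is not '=y'."""
    states = parse_kernel_config(text)
    # An option missing from the file entirely is treated as 'n'.
    return {opt: states.get(opt, "n") for opt in options if states.get(opt, "n") != "y"}

# Made-up sample config fragment.
sample = """\
CONFIG_DRM=m
CONFIG_FB=y
# CONFIG_FB_EFI is not set
"""
print(not_built_in(sample, ["CONFIG_DRM", "CONFIG_FB", "CONFIG_FB_EFI"]))
# prints {'CONFIG_DRM': 'm', 'CONFIG_FB_EFI': 'n'}
```

Anything reported as `m` only becomes available once the module is loaded, which is too late for the very first kernel messages.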
 
Old 11-15-2022, 09:42 PM   #30
kalaleq
LQ Newbie
 
Registered: Mar 2021
Posts: 27

Original Poster
Rep: Reputation: Disabled
So... maybe i do need a little help with this. Maybe i should start a new thread, but i'll try here first.

On the workstation, the kernel boot always stops at "Running /sbin/init". No further message or activity (that i can see) after that message. Note, that message is emitted by the initrd script once it calls init, but it's optional on this machine so i also tried running the kernel without an initrd so it directly calls /sbin/init. In that case it halts after the kernel message "Run /sbin/init as init process".

/sbin/init is a symlink into my package management directory for sysvinit. This works fine on all other machines. I've also tried replacing the symlink with a copy of the binary to no avail.

Passing init=/bin/bash (which is also a symlink) on the kernel command line works just fine.

Once i'm in the shell, trying either "exec /sbin/init -i" or just "/sbin/init -i" does *not* work. Same thing, it just hangs after starting the command.

I can manually run init scripts and agetty and "sort of" bring up the system, though definitely not fully... i haven't really tried but everything pretty much works except the initial call to /sbin/init.

Any ideas of what i might try next?

Last edited by kalaleq; 11-15-2022 at 11:12 PM.
 
  


All times are GMT -5. The time now is 03:47 AM.
