LinuxQuestions.org
Old 05-15-2019, 05:49 PM   #1
petejc
LQ Newbie
 
Registered: Apr 2019
Distribution: Slackware
Posts: 21

Rep: Reputation: Disabled
New kernel, 5.1 / 5.1.2 boot failure strangeness


I'm a bit baffled. Running and booting kernel 5.1 on my desktop is fine. However, I've failed to boot my old ThinkPad T400 to an ext4 filesystem. I've had no issues with 5.0.8, and I'm using essentially the same lilo and kernel config for 5.0.8 and for 5.1 / 5.1.2, yet the latter two fail.

With the 5.1.2 boot failures, the boot messages show that the kernel finds the drive, the partitions and the ext4 filesystem on /dev/sda1, but it does not appear to run the init scripts. I get a login prompt for 'Darkstar' immediately but am unable to log in. I know Darkstar is an old Slackware default. I presume I can't log in because /etc/shadow or /etc/passwd might not be the normal ones, but I can't check, as I can't log into 'Darkstar' and who knows what filesystem it thinks it is using.

My hostname is 'laptop-maint' and /etc/HOSTNAME contains 'laptop-main.fire' not 'Darkstar' so I really wonder where this login prompt is arising from? Looking through /etc and /etc/rc.d the only place 'Darkstar' comes up is in /etc/rc.d/rc.M, but given that /etc/HOSTNAME exists and is readable that default should be ignored.

I wondered whether it was picking up an old initrd and stalling there, but none is being used. I renamed an old one just in case, no effect.

Given that /dev/sda1 is the only disk partition with a filesystem, /dev/sda2 being swap and /dev/sda3 being a LUKS volume, and there are no other disks, I really can't see where it is trying to boot from. The only thing left is whether some last-ditch init / getty functionality has been included in newer kernels, with perhaps 'Darkstar' compiled into the kernel, rather than just a panic if it fails to find a root partition, but I am not aware of such functionality.

Any thoughts as to what might be happening?

Last edited by petejc; 05-15-2019 at 05:51 PM. Reason: typo
 
Old 05-16-2019, 01:47 AM   #2
Petri Kaukasoina
Member
 
Registered: Mar 2007
Posts: 405

Rep: Reputation: 257
Yes, it's in the kernel config:
Code:
CONFIG_DEFAULT_HOSTNAME="darkstar"
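To see which default hostname a given kernel was built with, something like the sketch below should do. It assumes the config file used for the build is still around; the function name default_hostname and the path in the usage comment are only examples, not anything stated in this thread.

```shell
# Print the default hostname set in a kernel config file.
# Usage: default_hostname /usr/src/linux/.config   (example path, an assumption)
default_hostname() {
    # The option looks like: CONFIG_DEFAULT_HOSTNAME="darkstar"
    sed -n 's/^CONFIG_DEFAULT_HOSTNAME="\(.*\)"$/\1/p' "$1"
}
```

If the name printed matches the one at the login prompt, the hostname was most likely never changed after the kernel started.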
 
Old 05-16-2019, 03:01 PM   #3
petejc
LQ Newbie
 
Registered: Apr 2019
Distribution: Slackware
Posts: 21

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by Petri Kaukasoina View Post
Yes, it's in the kernel config:
Code:
CONFIG_DEFAULT_HOSTNAME="darkstar"
Thank you. I'm building 5.1.3 (which has just come out) with the default hostname changed, so I can tell whether it comes from the kernel. That still does not explain why root mounts yet I don't get a normal boot. Still, I will find out if 5.1.3 fixes it for some inexplicable reason.
 
Old 05-16-2019, 04:48 PM   #4
petejc
LQ Newbie
 
Registered: Apr 2019
Distribution: Slackware
Posts: 21

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by petejc View Post
Thank you. I'm building 5.1.3 (which has just come out) with the default hostname changed, so I can tell whether it comes from the kernel. That still does not explain why root mounts yet I don't get a normal boot. Still, I will find out if 5.1.3 fixes it for some inexplicable reason.
OK, 5.1.3, with default hostname set to 'Pete_intel'. The kernel mounts the ext4 filesystem on /dev/sda1 but does not run the init scripts as it should, and immediately gives me a login prompt. The hostname is now 'Pete_intel', so it is picking this up from the kernel, not from /etc/HOSTNAME, which is set, nor from the default in /etc/rc.d/rc.M.
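The same comparison can be written down as a small check, for use once a shell is available (for example from a rescue boot). This is only a sketch: it assumes rc.M sets the short name, i.e. the part of /etc/HOSTNAME before the first dot, and the function name hostname_set_by_rc is made up for illustration.

```shell
# Succeed if the running hostname equals the first dot-separated label of
# the name stored in the given file (assumed to be what rc.M would set).
# Usage: hostname_set_by_rc "$(hostname)" /etc/HOSTNAME
hostname_set_by_rc() {
    running=$1
    file=$2
    [ -r "$file" ] || return 2          # file missing or unreadable
    wanted=$(cut -f1 -d. "$file")       # e.g. "host.example" -> "host"
    [ "$running" = "$wanted" ]
}
```

A failure here, together with a hostname equal to the kernel default, points at the rc scripts not having run at all.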
 
Old 05-24-2019, 12:58 PM   #5
petejc
LQ Newbie
 
Registered: Apr 2019
Distribution: Slackware
Posts: 21

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by petejc View Post
OK, 5.1.3, with default hostname set to 'Pete_intel'. The kernel mounts the ext4 filesystem on /dev/sda1 but does not run the init scripts as it should, and immediately gives me a login prompt. The hostname is now 'Pete_intel', so it is picking this up from the kernel, not from /etc/HOSTNAME, which is set, nor from the default in /etc/rc.d/rc.M.
Solved, sort of.

I've built kernel 5.1.4 on a different machine and that seems to boot OK via lilo. Not sure what the difference is apart from a minor bump in version number and that I used a rather out of date copy of slackware-current to build the kernel.
 
Old 06-04-2019, 09:26 AM   #6
Ilgar
Senior Member
 
Registered: Jan 2005
Location: Istanbul, Turkey
Distribution: Slackware64 14.2, Slackwarearm-current
Posts: 1,008

Rep: Reputation: 128
Quote:
Originally Posted by petejc View Post
Solved, sort of.

I've built kernel 5.1.4 on a different machine and that seems to boot OK via lilo.
I think I stumbled upon a seemingly similar problem. My old Dell Precision M4300 laptop running Slackware64 14.2 fails to boot with 5.1.x, although 5.0.6 was working fine. I tried 5.1.4 and 5.1.7, the first built on a relatively modern computer and the second on the Dell itself. In both cases I get the darkstar prompt. However, before it loads the modules from initrd and right after the initialization of eudev I see a "Bus error" line in the output. It appears again after the initrd module-loading messages. Do you remember if you had that error, too?

Btw, in both cases the builds were made on Slackware64 14.2.
 
Old 06-12-2019, 06:24 AM   #7
duncan_roe
Member
 
Registered: Nov 2008
Posts: 33

Rep: Reputation: 4
Me too

I have a similar-sounding problem. 5.1 installed fine on my laptop. But the identical kernel gave boot weirdness on the desktop. The system came up to the login prompt very quickly, without changing from the initial terminal font to a smaller one as it usually did. There was a message from INIT that "SV" was respawning too quickly. When I supplied user name and password, I just got the login prompt again.
5.0.9 works great.
I tried building with the .config from current (kernel-source-4.19.49-noarch-1.txz, file usr/src/linux-4.19.49/arch/x86/configs/x86_64_defconfig). PCI support was turned off initially, so rebuilt with it on (and SELINUX turned off). Worked slightly better, could log in but fonts were weird and X looked horrible.
Just noticed top-level usr/src/linux-4.19.49/.config - will try that.
Desktop mainboard is approx 10 YO while laptop is < 2 YO. Desktop has NVMe disks but they don't seem to be the problem.
Hope it's just some type of config issue but ... what?

Last edited by duncan_roe; 06-12-2019 at 06:26 AM.
 
Old 06-12-2019, 07:09 AM   #8
Petri Kaukasoina
Member
 
Registered: Mar 2007
Posts: 405

Rep: Reputation: 257
Quote:
Originally Posted by duncan_roe View Post
I tried building with the .config from current (kernel-source-4.19.49-noarch-1.txz, file usr/src/linux-4.19.49/arch/x86/configs/x86_64_defconfig).
That is not what current uses. From kernel-source-4.19.49-noarch-1.txz, try file usr/src/linux-4.19.49/.config
 
Old 06-13-2019, 04:40 AM   #9
Ilgar
Senior Member
 
Registered: Jan 2005
Location: Istanbul, Turkey
Distribution: Slackware64 14.2, Slackwarearm-current
Posts: 1,008

Rep: Reputation: 128
Quote:
Originally Posted by duncan_roe View Post
There was a message from INIT that "SV" was respawning too quickly. When I supplied user name and password, I just got the login prompt again.
In my case it said "x1" was spawning too fast. However, I believe this is not a manifestation of the main issue but only a side-effect of it. Did you see any "bus error" messages like I did? Given this message and the fact that I was unable to find similar bug reports in Google, I suspect that the culprit is some rare combination of software, such as eudev 3.1.5 (from 2015) and kernel 5.1. I would like to try updating eudev but I don't know how complicated it can get, so I am waiting for now.
 
Old 06-16-2019, 06:49 PM   #10
duncan_roe
Member
 
Registered: Nov 2008
Posts: 33

Rep: Reputation: 4
I have yet to notice a bus error. Where I am at now is: I do have a .config that will boot, but the resultant system is not very usable. This .config is usr/src/linux-4.19.49/arch/x86/configs/x86_64_defconfig, migrated to 5.1.8 by successive iterations of make xconfig plus some manual diffs to get device 259 recognised (NVMe disks) (attached).
The system mis-reports /dev/sda as having 1 partition when it actually has 4 (or it might have only seen what is normally /dev/sdb - one of them is SATA and the other is IDE but they both have multiple partitions).
There is no network: ifconfig -a shows sit0 instead of eth0.
In case it's any help, I've attached dmesg o/p in that system.
There's more, but I have to go now
Attached Files
File Type: txt x86_64_defconfig_manual_diffs.txt (2.3 KB, 2 views)
File Type: txt config_that_boots_5.1.8.txt (107.0 KB, 1 views)
File Type: txt dmesg-5.1.9-k8_64S3.txt (52.9 KB, 2 views)
 
Old 06-17-2019, 03:32 AM   #11
Ilgar
Senior Member
 
Registered: Jan 2005
Location: Istanbul, Turkey
Distribution: Slackware64 14.2, Slackwarearm-current
Posts: 1,008

Rep: Reputation: 128
Thanks duncan_roe. I would still suspect eudev, since it is eudev's responsibility to create the device nodes correctly. Perhaps if I have the time I will try upgrading eudev and recompiling other stuff if necessary. This is kind of like Linux from Scratch which I have not much experience with, though, and I may give up early on. Any advice on how to do an eudev upgrade is welcome.

By the way, is there a particular reason why you did not use 'make oldconfig' to migrate your old config? It should have made the switch in one go.
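A minimal sketch of that one-step migration, for reference. Both paths in the usage comment are hypothetical and migrate_config is just an illustrative name; make oldconfig itself is the standard kernel target, which keeps the existing answers and prompts only for options that are new in the target tree.

```shell
# Carry an existing kernel config over to a newer source tree.
# Usage: migrate_config /boot/config-old /usr/src/linux-new   (hypothetical paths)
migrate_config() {
    old_config=$1
    new_tree=$2
    cp "$old_config" "$new_tree/.config" || return 1
    # oldconfig keeps existing answers and asks only about new options
    make -C "$new_tree" oldconfig
}
```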
 
Old 06-17-2019, 09:06 PM   #12
duncan_roe
Member
 
Registered: Nov 2008
Posts: 33

Rep: Reputation: 4
I was under the impression that make xconfig did an implied make oldconfig first. It has worked for me for 25 years anyway. At least it has worked for x.y -> x.y+1, though not always for bigger jumps.
I was going to document my further experiences but think I have a better plan now: I'm going to git bisect between v5.0 and v5.1 until I find the culprit patch. When I do find it, I'll raise a bug report.
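For anyone who wants to follow along, a bisect of this kind can be sketched roughly as below. It assumes a git checkout of the kernel that contains both tags; the checkout path in the usage comment and the helper names bisect_begin and bisect_mark are made up for illustration.

```shell
# Begin a bisect in a kernel git checkout between a good and a bad tag.
# Usage: bisect_begin ~/linux v5.0 v5.1    (checkout path is hypothetical)
bisect_begin() {
    tree=$1
    good=$2
    bad=$3
    git -C "$tree" bisect start "$bad" "$good"
}

# After building and test-booting the commit git checked out, record the
# verdict ("good" or "bad"); git then checks out the next commit to try.
bisect_mark() {
    git -C "$1" bisect "$2"
}
```

Each round means a kernel build and a test boot, so the number of rounds grows roughly with the base-2 logarithm of the number of commits between the two tags.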
 
Old 06-19-2019, 02:01 AM   #13
duncan_roe
Member
 
Registered: Nov 2008
Posts: 33

Rep: Reputation: 4
git bisect is progressing - looks like the problem appeared around 5.1-rc7

Last edited by duncan_roe; 06-19-2019 at 02:03 AM.
 
Old 06-19-2019, 08:13 AM   #14
duncan_roe
Member
 
Registered: Nov 2008
Posts: 33

Rep: Reputation: 4
Please try this patch

git bisect identified commit 459e3a21535ae3c7a9a123650e54f5c882b8fcbf as the culprit. This is the log entry:
Quote:
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date: Wed May 1 11:20:53 2019 -0700

gcc-9: properly declare the {pv,hv}clock_page storage

The pvlock_page and hvclock_page variables are (as the name implies)
addresses to pages, created by the linker script.

But we declared them as just "extern u8" variables, which _works_, but
now that gcc does some more bounds checking, it causes warnings like

warning: array subscript 1 is outside array bounds of "u8[1]"

when we then access more than one byte from those variables.

Fix this by simply making the declaration of the variables match
reality, which makes the compiler happy too.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I made an antidote to that patch, attached as revert_459e3a21.txt. When I applied it to Linux 5.1.12, the kernel booted normally. 5.1.12 had just appeared; I guess the patch should work for any 5.1.x.
Please try this patch. In your top-level Linux source directory, do: cat revert_459e3a21.txt | patch -p1 and then [re-]build the kernel.
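A slightly more cautious way to apply it is to do a dry run first, so a patch that does not fit your tree leaves nothing half-applied. This is a sketch assuming GNU patch; the tree path in the usage comment is an assumption and apply_checked is just an illustrative name.

```shell
# Apply a patch to a source tree only if a dry run succeeds first.
# Usage: apply_checked /usr/src/linux-5.1.12 revert_459e3a21.txt   (tree path is an assumption)
apply_checked() {
    tree=$1
    patchfile=$2
    # --dry-run (GNU patch) reports problems without touching any file
    patch -d "$tree" -p1 --dry-run < "$patchfile" >/dev/null || return 1
    patch -d "$tree" -p1 < "$patchfile"
}
```

The patch file is fed on standard input, so a path relative to the current directory works even though patch itself changes into the tree via -d.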

2 QUESTIONS:

1. Are we all only seeing a problem on old hardware?

2. AMD, Intel or what? (both my old and new systems are AMD)

Any answers will be helpful for the bug report I now need to put together
Attached Files
File Type: txt revert_459e3a21.txt (629 Bytes, 2 views)
 
Old 06-19-2019, 03:45 PM   #15
Ilgar
Senior Member
 
Registered: Jan 2005
Location: Istanbul, Turkey
Distribution: Slackware64 14.2, Slackwarearm-current
Posts: 1,008

Rep: Reputation: 128
Aha, that was quick. Great work on your part, duncan_roe. Unfortunately I don't have that computer with me right now, but I will compile a kernel with the patch reverted and try it this weekend. I will report the results here (or on kernel.org, if you have opened the bug report by then; I already have an account there).
 
  

