LinuxQuestions.org
Linux - Kernel This forum is for all discussion relating to the Linux kernel.

Old 03-17-2014, 04:21 PM   #1
BensonBear
LQ Newbie
 
Registered: Feb 2005
Posts: 25

Rep: Reputation: 1
Kernel will not boot. How to find information about what is happening?


I am trying to boot a new kernel, but after I select it in grub it seemingly runs for a while and then just reboots automatically. Only twice out of dozens of attempts did it actually boot.

It does not print ANY MESSAGES AT ALL to the console, and it does not touch the /var/log/messages file or any other log files, except in those two cases where it booted.

I think it must be putting messages out to what it thinks is the console, because it runs for a few seconds before rebooting, but perhaps it is confused about where the console is. I uninstalled plymouth to minimize the number of programs that might be contending for control of video and perhaps creating a race condition or other kind of conflict.

This is a general question, not about specific kernels or hardware. With this problem, one needs to know how to get at the information that surely must be being generated. So I would like to know how this could be done in general. I guess I have to try to disable all graphics drivers except for a primitive generic one, but my understanding is that these will not be used until X is started, and the problem of no messages comes up well before that. Even if I boot into runlevel 1, the behavior is the same.

As for the specific setup: I am running Fedora 19 with kernel 3.12.9-201, and trying to boot any 3.13 kernel, such as the latest 3.13.6-100. I am using the nvidia proprietary drivers, kmod-nvidia-3.13.6-100 (which are in the same rpms as those for 3.12.9-201). I will soon try to force the use of generic vesa drivers through the entire process, but I am a little wary of doing this for fear of messing up my working kernel/video combo and being left with no way to boot, so I hope there is something else I can try first.

Edit: googling has led me to /proc/sys/kernel/printk, which I will try using to raise the kernel's debug level when I get home. However, I doubt this will produce any more info, since there are already some printk's that are not being seen.
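
A minimal sketch of how the four numbers in /proc/sys/kernel/printk can be read. The field names follow the kernel's sysctl documentation; the helper names `parse_printk` and `reaches_console` and the sample values are only illustrative. Since this file can only be written on a system that is already running, for a kernel that dies during boot the closer equivalent would presumably be a boot parameter such as `loglevel=7` or `ignore_loglevel`.

```python
from typing import NamedTuple


class PrintkLevels(NamedTuple):
    console_loglevel: int          # messages with a lower number than this reach the console
    default_message_loglevel: int  # level given to messages that do not specify one
    minimum_console_loglevel: int  # lowest value console_loglevel may be set to
    default_console_loglevel: int  # boot-time default for console_loglevel


def parse_printk(text: str) -> PrintkLevels:
    """Parse the whitespace-separated contents of /proc/sys/kernel/printk."""
    fields = [int(f) for f in text.split()]
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}")
    return PrintkLevels(*fields)


def reaches_console(levels: PrintkLevels, msg_level: int) -> bool:
    """A message is printed to the console only if its level is numerically lower."""
    return msg_level < levels.console_loglevel


if __name__ == "__main__":
    sample = "4\t4\t1\t7"  # illustrative values, not read from a real machine
    levels = parse_printk(sample)
    print(levels)
    print("KERN_INFO (6) shown:", reaches_console(levels, 6))
```

On a booted system the real values could be read with `open("/proc/sys/kernel/printk").read()` and passed to `parse_printk`.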

Closer examination of /var/log/messages (I had never done this particular sort of examination before) shows that, since rsyslog only starts as process 300 or so, the kernel messages that get into /var/log/messages are actually generated before rsyslog starts and must be saved in the kernel for a while first. In fact, it is hard to understand exactly what is going on here. 30 seconds of kernel messages are inserted in what rsyslog calls 1 second. I can't believe rsyslog is that far behind in adding messages.
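
The likely explanation is that the kernel holds early messages in its ring buffer (the one `dmesg` reads) and rsyslog drains the whole backlog when it starts, stamping every line with its own wall-clock time. A minimal sketch of checking this, assuming the kernel lines carry a bracketed uptime stamp; the log lines, the host name, and the helper name `kernel_span_per_stamp` are made up for illustration.

```python
import re

# Matches e.g. "Mar 17 16:05:01 myhost kernel: [   29.750000] some message"
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+\S+\s+kernel:\s+"
    r"\[\s*(?P<uptime>\d+\.\d+)\]"
)


def kernel_span_per_stamp(lines):
    """Map each syslog timestamp to the (min, max) kernel uptime logged under it."""
    spans = {}
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # not a kernel line with an uptime stamp
        uptime = float(m.group("uptime"))
        lo, hi = spans.get(m.group("stamp"), (uptime, uptime))
        spans[m.group("stamp")] = (min(lo, uptime), max(hi, uptime))
    return spans


# Invented sample lines, shaped like /var/log/messages entries
SAMPLE = [
    "Mar 17 16:05:01 myhost kernel: [    0.000000] Linux version (sample)",
    "Mar 17 16:05:01 myhost kernel: [   12.500000] sample driver message",
    "Mar 17 16:05:01 myhost kernel: [   29.750000] last buffered message",
    "Mar 17 16:05:02 myhost kernel: [   30.900000] first live message",
]

if __name__ == "__main__":
    for stamp, (lo, hi) in kernel_span_per_stamp(SAMPLE).items():
        print(f"{stamp}: kernel uptime {lo:.2f}s to {hi:.2f}s")
```

A single syslog second that covers a wide range of kernel uptime would be the buffered backlog being written out at once, not rsyslog running 30 seconds behind.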

Last edited by BensonBear; 03-17-2014 at 05:14 PM. Reason: /proc/sys/kernel/printk
 
Old 03-17-2014, 07:11 PM   #2
theNbomr
LQ 5k Club
 
Registered: Aug 2005
Distribution: OpenSuse, Fedora, Redhat, Debian
Posts: 5,399
Blog Entries: 2

Rep: Reputation: 908
The problem is, by nature, very difficult to solve. Even when you do get the kernel to start barfing out some messages, they will frequently roll off the screen too quickly to read them. For this reason, I find it useful to use the bootloader to, if possible, specify that the kernel use a serial console. Of course this implies that there is a serial port available, and that usually means one that is not on a USB dongle. If you are able to use a serial console, it will give you the advantage that any messages can be grabbed and stored by a suitably configured serial terminal emulator (I like C-Kermit, running on another Linux host). You can then analyze the messages, use Google to search for something contained in the messages, or even ask here about what you find.
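
A minimal sketch of what "specify that the kernel use a serial console" amounts to on the kernel command line. The `console=` parameter and its `ttyS0,115200n8` form come from the kernel's serial console documentation; the function name `add_serial_console`, the default port and speed, and the sample command line are assumptions for illustration.

```python
def add_serial_console(cmdline: str, port: str = "ttyS0", baud: int = 115200) -> str:
    """Return cmdline with console= arguments for both the screen and a serial port."""
    # Drop any console= arguments already present so they are not duplicated.
    args = [a for a in cmdline.split() if not a.startswith("console=")]
    # Messages go to every console listed; the last one becomes /dev/console,
    # so the serial port is placed last.
    args += ["console=tty0", f"console={port},{baud}n8"]
    return " ".join(args)


if __name__ == "__main__":
    # Hypothetical command line, as it might appear on the "linux" line in grub
    original = "ro root=/dev/sda1 rhgb quiet"
    print(add_serial_console(original))
```

The resulting string would be typed onto the kernel line from the grub edit screen, with the terminal emulator on the other machine set to the same speed.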
 
Old 03-17-2014, 11:14 PM   #3
BensonBear
LQ Newbie
 
Registered: Feb 2005
Posts: 25

Original Poster
Rep: Reputation: 1
Quote:
Originally Posted by theNbomr View Post
The problem is, by nature, very difficult to solve. Even when you do get the kernel to start barfing out some messages, they will frequently roll off the screen too quickly to read them.
Yes, but the first problem is to get it putting out messages at all. I am suspecting the solution to this problem may itself point to the main problem.

Quote:
For this reason, I find it useful to use the bootloader to, if possible, specify that the kernel use a serial console. Of course this implies that there is a serial port available, and that usually means one that is not on a USB dongle. If you are able to use a serial console, it will give you the advantage that any messages can be grabbed and stored by a suitably configured serial terminal emulator (I like C-Kermit, running on another Linux host). You can then analyze the messages, use Google to search for something contained in the messages, or even ask here about what you find.
Too bad I don't have a serial port on the case or a serial cable, and this sounds complicated, so I will probably just use the old kernel and hope someone else runs into this problem and solves it. (There is a serial port on the motherboard, but my other (notebook) computer has no serial port at all.)

Thanks for the suggestions. Any idea why there are no messages, though? Surely there are some, going somewhere?
 
Old 03-18-2014, 10:53 AM   #4
theNbomr
LQ 5k Club
 
Registered: Aug 2005
Distribution: OpenSuse, Fedora, Redhat, Debian
Posts: 5,399
Blog Entries: 2

Rep: Reputation: 908
It's possible the kernel crashes before there are any drivers for whatever it is using as a console loaded. Coercing the kernel to emit messages isn't likely to be possible without rebuilding the kernel, or at least having the bootloader change some kernel arguments. Did the hardware ever support some working kernel? Perhaps the kernel is incapable of running on the hardware due to too little memory, or is missing drivers for some vital resource.
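
A minimal sketch of the "have the bootloader change some kernel arguments" route, which needs no rebuild. It removes the two arguments Fedora normally uses to hide boot output (`quiet` and `rhgb`) and adds `ignore_loglevel` and `earlyprintk=vga`, both documented kernel parameters. Whether early printk actually shows anything here is an assumption: it depends on the kernel having been built with early printk support and on a VGA text console being available. The function name `make_verbose` and the sample command line are illustrative.

```python
QUIETING = {"quiet", "rhgb"}                     # arguments that hide boot output
VERBOSE = ["ignore_loglevel", "earlyprintk=vga"]  # arguments that force output


def make_verbose(cmdline: str) -> str:
    """Return cmdline with quieting arguments removed and verbose ones appended."""
    args = [a for a in cmdline.split() if a not in QUIETING]
    for extra in VERBOSE:
        if extra not in args:  # avoid adding the same argument twice
            args.append(extra)
    return " ".join(args)


if __name__ == "__main__":
    # Hypothetical command line for illustration
    print(make_verbose("ro quiet rhgb root=/dev/sda1"))
```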
 
Old 03-22-2014, 10:52 PM   #5
BensonBear
LQ Newbie
 
Registered: Feb 2005
Posts: 25

Original Poster
Rep: Reputation: 1
Quote:
Originally Posted by theNbomr View Post
It's possible the kernel crashes before there are any drivers for whatever it is using as a console loaded. Coercing the kernel to emit messages isn't likely to be possible without rebuilding the kernel, or at least having the bootloader change some kernel arguments. Did the hardware ever support some working kernel? Perhaps the kernel is incapable of running on the hardware due to too little memory, or is missing drivers for some vital resource.
Yes, the hardware has supported many other kernels (Intel i3 540, Asus P7H55-M PRO, 4x4G Corsair XMS3 Classic 1333, NVidia 9800GT, Kingston SV300, Seagate Barracuda 7200.12, Corsair CX430). In fact, every other kernel. As I mentioned but did not make adequately clear, I was running 3.12.9-201 at the time, and had run every other kernel I ever tried with no problem. Also, I have now tried the debug kernel and it seems to run fine. I have also installed a completely pristine f20, and it boots with the stock 3.11.10-301, and after a yum update it does the same thing as f19 on the 3.13 kernels. That is, it crashes with no messages on the normal kernel, but boots on the debug kernel. I have also tried building the debug kernel on f19, but that build crashes as well, although I think it is not really the debug kernel since it has the same size as the regular kernel. I will try to do that again and also take a closer look at the messages from the debug kernel, but offhand I don't see anything strange in there.
 
Old 03-23-2014, 07:21 AM   #6
ianbb01
LQ Newbie
 
Registered: Jan 2014
Posts: 7

Rep: Reputation: 3
You can modify grub settings of the problematic kernel to print all messages to serial console. Then connect from another PC using a serial cable to see what is going on (see https://wiki.archlinux.org/index.php...serial_console). You may need to edit /etc/rsyslog.conf and add the following line: *.* /dev/ttyS0 (or *.* |/dev/ttyS0).
 
Old 03-23-2014, 05:07 PM   #7
BensonBear
LQ Newbie
 
Registered: Feb 2005
Posts: 25

Original Poster
Rep: Reputation: 1
Quote:
Originally Posted by ianbb01 View Post
You can modify grub settings of the problematic kernel to print all messages to serial console. Then connect from another PC using a serial cable to see what is going on (see https://wiki.archlinux.org/index.php...serial_console). You may need to edit /etc/rsyslog.conf and add the following line: *.* /dev/ttyS0 (or *.* |/dev/ttyS0).
For this I will first have to purchase a serial port card to connect to this computer, then get a serial port cable, then get another computer that has a serial port, as I don't have any of those things. I guess for a while I will have to just use the old kernel and the old OS and hope someday something improves. Or I guess a serial port to usb adapter could work, but that is still some funds I would have to scrape up, which would be very disappointing if it yields no results (such as if there are no messages at all anywhere, as was suggested previously).

If anyone knows how to build the debug kernel, that might be helpful. I followed the instructions at https://fedoraproject.org/wiki/Building_a_custom_kernel to build the debug kernel (with the debug-only setting in kernel.spec), and it calls it debug in its name, but it is the same size as the regular kernel and it doesn't boot either. It booted only once (the first time), then never again, with no messages (as if the first boot messed something up -- perhaps that is possible).

This is VERY frustrating, since in order to make a system Fedora would care to examine I have had to boot with nouveau, but even on the old kernel it was hard to get nouveau working. It appears that it will not load with acpi off, which I do not understand. But acpi on has caused lots of problems for me in the past, which appear to be due to its conflicts with various sensor drivers. So I have now been getting random crashes even in the old 3.11 kernel. I blacklisted the two sensor drivers which Google has indicated may give problems, asus_atk0110 and w83627ehf, but after that I get even more ACPI warnings about conflicts, for which it is not clear to me who is responsible. But I can't turn ACPI off and still get nouveau to load.

It is now a HUGE CONFUSED MESS :-(
 
Old 03-24-2014, 10:46 AM   #8
theNbomr
LQ 5k Club
 
Registered: Aug 2005
Distribution: OpenSuse, Fedora, Redhat, Debian
Posts: 5,399
Blog Entries: 2

Rep: Reputation: 908
So, perhaps you could benefit by looking at the problem differently. What aspect of the kernel you're trying to run is vital to you? What do you lose by rolling back to a working kernel? Is it possible to backport the required feature(s) of the new kernel to an older, working version?
 
Old 03-25-2014, 02:54 AM   #9
BensonBear
LQ Newbie
 
Registered: Feb 2005
Posts: 25

Original Poster
Rep: Reputation: 1
Quote:
Originally Posted by theNbomr View Post
So, perhaps you could benefit by looking at the problem differently. What aspect of the kernel you're trying to run is vital to you? What do you lose by rolling back to a working kernel? Is it possible to backport the required feature(s) of the new kernel to an older, working version?
I don't believe any specific functional aspect of the kernel is vital to me at this point. But that will eventually change and I don't want to get left behind. Eventually the kernels used in installation programs will be 3.13 or greater (and no 3.14 in Fedora Rawhide runs for me either).

I am also worried that something like this is evidence my hardware is faulty. This is especially true when coupled with some of the other problems I have mentioned. Most of the time it is possible to attribute them to flaky device drivers and the like. But in this case, I don't know. So I need to know whether I have to be looking into getting new hardware.

(Maybe about two years ago I got a new usb card not just to get usb3, but also to replace what I thought was flaky hardware which caused usb to fail randomly once in a while requiring a reboot to fix it. But one day after an upgrade I stopped having this problem. Seems like that one was indeed software, but this is a far more crucial thing).

If you think it is likely I could get some kind of messages of use via redirecting early kernel debugging messages via serial port to another computer, I may eventually try that, but right now I think it is not worth the trouble. I suspect there won't be many such messages. It seems there should be some other idea worth testing, given that it does boot once in a while. If it is software there may be some crucial race condition that usually doesn't work out the right way but sometimes does. (It boots once in a while on Fedora 20, not on Fedora 19, that is).

I need to find out how to build the debug kernel that Fedora ships, since it boots in that, but still not in the kernel I built from their SRPMS. Then I could work backward taking out more debug stuff until it fails.
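
A minimal sketch of the first step of that "work backward" idea: listing which options differ between the config of the debug kernel that boots and the config of the kernel that does not. It assumes both configs are available as text (Fedora installs them as /boot/config-<version>); the function names and the tiny sample configs are illustrative, and the sample values do not come from any real Fedora build.

```python
import re

SET_RE = re.compile(r"^(CONFIG_\w+)=(.*)$")             # e.g. CONFIG_FOO=y
UNSET_RE = re.compile(r"^# (CONFIG_\w+) is not set$")   # e.g. # CONFIG_FOO is not set


def parse_config(text: str) -> dict:
    """Parse kernel .config text into {option: value}; unset options map to 'n'."""
    opts = {}
    for line in text.splitlines():
        line = line.strip()
        m = SET_RE.match(line)
        if m:
            opts[m.group(1)] = m.group(2)
            continue
        m = UNSET_RE.match(line)
        if m:
            opts[m.group(1)] = "n"
    return opts


def config_diff(working: str, failing: str) -> dict:
    """Return {option: (value_in_working, value_in_failing)} for options that differ."""
    a, b = parse_config(working), parse_config(failing)
    return {
        key: (a.get(key, "n"), b.get(key, "n"))
        for key in sorted(set(a) | set(b))
        if a.get(key, "n") != b.get(key, "n")
    }


if __name__ == "__main__":
    # Invented fragments standing in for the two config files
    debug_cfg = "CONFIG_DEBUG_KERNEL=y\nCONFIG_PROVE_LOCKING=y\nCONFIG_DEBUG_SPINLOCK=y\n"
    normal_cfg = "CONFIG_DEBUG_KERNEL=y\n# CONFIG_PROVE_LOCKING is not set\n"
    for option, (dbg, nrm) in config_diff(debug_cfg, normal_cfg).items():
        print(f"{option}: debug={dbg} normal={nrm}")
```

The differing options could then be switched off a few at a time in successive builds until the boot failure returns. If the locally built "debug" kernel shows no differences from the regular one, that would support the suspicion that the debug settings were never applied.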
 
1 members found this post helpful.
  

