LinuxQuestions.org
Old 10-16-2003, 03:20 PM   #1
chort
Senior Member
 
Registered: Jul 2003
Location: Silicon Valley, USA
Distribution: OpenBSD 4.6, OS X 10.6.2, CentOS 4 & 5
Posts: 3,660

Rep: Reputation: 69
Suddenly kernels won't finish booting


I have Mandrake 9.0. Recently my primary hard disk drive went bad (it started having read and write errors), so I bought an identical hard drive and used tar to copy over all the information from every partition to my new hard drive.

I changed /etc/fstab to mount everything on the new disk. The only operation happening on the old disk was booting (still booted from the MBR on that disk). Eventually I decided just to remove the old drive entirely and boot off the new disk. I ran lilo with the proper config file and specified the new hd as the boot device, then I set the BIOS to boot off the second disk. Everything seemed to go fine.

The problem I now have is that some of the kernels will boot all the way to "freed xxxK of memory" and some kernels will only get to "unknown bridge, assuming transparent". I thought it was a devfs issue at first, because the failsafe kernel has devfs=nomount and it was able to get nearly to the end of the boot sequence, but I tried setting devfs=nomount on the 2.4.19-16 kernel and rerunning lilo and that didn't change the behavior (it would still stop at the very beginning after loading the kernel into memory). I did a little experimenting and noticed that the nonfb option would also get nearly to the end of the boot process before dying.

What in the world could be causing this problem? I ran fsck.ext3 -f on every partition and there weren't any problems. I can mount all the disks fine and view all the data from Knoppix, so it doesn't seem like a disk issue. I tried commenting out the initrd line and running lilo again (so it wouldn't use the ramdisk for booting) but that didn't make any difference either.

I'm so frustrated that there isn't any debugging information available. I'm about ready to just back up all my data and install FreeBSD instead, because I never had any inexplicable problems like this with BSD.
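One thing worth ruling out after editing /etc/fstab and rerunning lilo, as described above, is that the two files disagree about the root device. The following is only a minimal sketch of that check, not a diagnosis of this thread's problem. The helper names and the device names in the usage note are hypothetical, and it assumes plain device paths (not LABEL= entries) in both files.

```python
import re
from typing import List, Optional


def lilo_root_devices(lilo_conf: str) -> List[str]:
    """Return every root= device named in lilo.conf text (global or per-image)."""
    devices = []
    for line in lilo_conf.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        match = re.match(r"root\s*=\s*(\S+)", line)
        if match:
            devices.append(match.group(1))
    return devices


def fstab_root_device(fstab: str) -> Optional[str]:
    """Return the device that fstab text mounts on /, or None if absent."""
    for line in fstab.splitlines():
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[1] == "/":
            return fields[0]
    return None


def mismatched_roots(lilo_conf: str, fstab: str) -> List[str]:
    """List lilo.conf root= devices that differ from the fstab root device."""
    expected = fstab_root_device(fstab)
    return [dev for dev in lilo_root_devices(lilo_conf) if dev != expected]
```

For example, with a hypothetical lilo.conf entry still saying root=/dev/hda1 while fstab mounts /dev/hdc1 on /, mismatched_roots() would return ['/dev/hda1'], which points at an image entry that was not updated for the new disk.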
 
Old 10-17-2003, 04:22 AM   #2
chort
Original Poster
OK, so I think I've narrowed it down a bit more. The part I think it's getting hung up on is starting init. Normally when I boot, the Init 2.xx banner is the next thing that comes up after the messages about the bridging devices... Also, init does not start when I boot with failsafe or nonfb kernels.

I tried booting my normal kernel with the options of single and debug and it does exactly what the failsafe config does.

Is my init somehow damaged? Is it possible to repair init?

edit:
Well the saga continues. I still cannot figure out what is causing this problem. The research I've done today seems to indicate that the part it's getting stuck on is mounting the root fs. I'm not sure how that can be the case, since Knoppix can mount the same partition just fine.

I'm getting pretty desperate here. I'm thinking about reinstalling Mandrake from CDs and mounting / on a different partition, then copying my old /etc to the new root partition and trying to boot that way (using old /usr, /var, /home, etc. but with new /). Of course I'll have to copy /bin, /sbin, and anything else that may have been changed by packages... hmm, I guess /lib too. Does this seem like a reasonable solution?

Last edited by chort; 10-17-2003 at 02:41 PM.
 
Old 10-17-2003, 04:36 PM   #3
chort
Original Poster
So a little bit more information. Passing init=/bin/sh at the boot prompt has no effect (it still halts at the same place), so I'm assuming that means the kernel is not even getting as far as invoking init. What else could happen in between loading the kernel into memory and starting init that would cause it to halt? If I don't specify emergency/single/failsafe at the boot prompt, it halts immediately after the "loading kernel ... unknown bridging resource ... " section. If I do pass single/emergency/failsafe it will get up to the section where it mounts devfs on /dev, then frees 136K of kernel memory, then... nothing.

Bueller, Bueller... anyone?
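For reference, once a 2.4 kernel has mounted the root file system and freed its init memory, it opens /dev/console and then tries the init= program if one was given, followed by /sbin/init, /etc/init, /bin/init and /bin/sh. A rough sketch for checking a tar-copied root tree from a rescue system such as Knoppix follows. The function name and the mount path in the usage note are hypothetical, and treating /dev and /proc as directories that must exist on the copy is an assumption about this setup, not something the thread confirms.

```python
import os
from typing import List

# Fallback programs a 2.4 kernel tries, in order, when init= is absent or fails.
INIT_CANDIDATES = ["sbin/init", "etc/init", "bin/init", "bin/sh"]

# Assumed mount points: devfs is mounted on /dev, and /proc is usually in fstab.
MOUNT_POINTS = ["dev", "proc"]


def missing_boot_essentials(root: str) -> List[str]:
    """Return names of things absent from a root tree mounted at `root`."""
    missing = []
    # lexists() is used so a symlink such as bin/sh counts as present
    # without being resolved against the rescue system's own tree.
    if not any(os.path.lexists(os.path.join(root, p)) for p in INIT_CANDIDATES):
        missing.append("init")
    for name in MOUNT_POINTS:
        if not os.path.isdir(os.path.join(root, name)):
            missing.append(name)
    return missing
```

With the copied root partition mounted at, say, /mnt/newroot, missing_boot_essentials("/mnt/newroot") returning an empty list would only show that these basics survived the copy; it says nothing about whether the kernel can find or mount that partition in the first place.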
 
Old 10-19-2003, 04:45 AM   #4
chort
Original Poster
Well in case anyone is interested, I finally gave up, bought a third hard drive, and installed OpenBSD. OBSD can read the data off all the Linux partitions just fine, so I copied over my important data and all the necessary configuration files and just started over.

Oh well, only one Linux box left on my network now and its days are numbered. I'm having much better luck with OpenBSD and FreeBSD.
 
Old 10-20-2003, 11:49 AM   #5
aus9
Guru
 
Registered: Oct 2003
Posts: 5,060

Rep: Reputation: Disabled
chort

1) Assuming you have still kept that nasty old drive, did you look at the jumpers?
2) Did you try dd if=/dev/hda of=/dev/hdc, or whatever /etc/fstab thought it was?

Just a thought, as a disk dump should also copy the MBR. Now, assuming I am right, each drive has an MBR, so the new drive will think it's now /dev/hda; you then pull out the new drive and put it, the second good one, into the primary jumper position.
 
Old 10-26-2003, 03:55 AM   #6
chort
Original Poster
aus9, thanks for the advice. I actually had the first and second HDD both set up as primaries, but on different IDE channels. As for the MBR, that's rewritten by LILO anyway (at least, that's my understanding) and I tried both the Mandrake LILO and the Knoppix (Debian) LILO to install the boot image. Neither one had errors (well, after I removed the message, since the Debian LILO couldn't handle the big /boot/message) but they both had the same results with booting.

I do not think it's a problem with the MBR since I can get all the way through the built-in kernel modules if I boot in single user mode. It's the spot right around where the root file system gets mounted that it dies. Since both Debian and OpenBSD can mount all the file systems, I find that rather curious.
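On the MBR question raised above: a classic PC master boot record is the first 512-byte sector of the disk and ends with the bytes 0x55 0xAA. A small sketch for confirming that the new disk at least carries that marker is below. The function names are illustrative, the device path in the usage note is hypothetical, and a valid signature only shows the sector is marked bootable, not that LILO's first stage in it is correct.

```python
SECTOR_SIZE = 512
BOOT_SIGNATURE = b"\x55\xaa"  # last two bytes of a classic MBR


def read_first_sector(path: str) -> bytes:
    """Read the first sector of a disk device or image file."""
    with open(path, "rb") as disk:
        return disk.read(SECTOR_SIZE)


def has_boot_signature(sector: bytes) -> bool:
    """True if the sector is full-sized and ends with the 0x55AA marker."""
    return len(sector) >= SECTOR_SIZE and sector[510:512] == BOOT_SIGNATURE
```

Run as root from a rescue system, has_boot_signature(read_first_sector("/dev/hdc")) would be one quick sanity check before looking further down the boot sequence.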
 
  


All times are GMT -5.
