LinuxQuestions.org


eduardr 02-18-2013 10:28 PM

Generic kernel boot failure - drives not being seen
 
Hello,

On a couple of new rack servers, I cannot get
the generic kernel to boot on Slackware 14 x64.
It can't find the disk(s) and panics. Huge
kernel works fine.

What I see when booting with the generic kernel
is "VFS: Cannot open root device 900" when attempting
to boot off the md0 device, and when I tested
booting off sda1 I got a similar error,
"unable to open root device 801". This suggests the
issue is with drive recognition under the generic
kernel.
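
The "900" and "801" in these messages are device numbers printed in hexadecimal. A minimal shell sketch of how they map to major:minor pairs, assuming the old 16-bit encoding (major in the high byte, minor in the low byte); decode_rootdev is a hypothetical helper name, not an existing tool:

```shell
#!/bin/sh
# decode_rootdev is a hypothetical helper, not a standard tool.
# It splits an old-style 16-bit device number, given in hex as the
# kernel prints it, into major:minor (major = high byte, minor = low byte).
decode_rootdev() {
    dev=$((0x$1))
    echo "$((dev >> 8)):$((dev & 0xff))"
}

decode_rootdev 900   # 9:0, the md major with minor 0 (md0)
decode_rootdev 801   # 8:1, the sd major with minor 1 (sda1)
```

So the kernel was asked for the right devices in both cases; it simply had no block device registered under those numbers at the time it tried to mount root.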

Hardware info:
Supermicro 1027R-72RFTP server
Intel E5-2600, C602 chipset
Onboard LSI 2208 SAS2 in JBOD mode, megaraid_sas

Have followed the usual instructions for getting
an initrd created and lilo configured. I've had
generic kernels running on other machines with
13.37 in the past no problem.

I'm using RAID1 across the 2 drives, metadata 0.9
on root partition (includes boot) and metadata
1.2 on all the rest.

lilo.conf
---------
append=" vt.default_utf8=1"
boot = /dev/md0
raid-extra-boot = mbr
lba32
large-memory

image = /boot/vmlinuz
initrd = /boot/initrd.gz
root = /dev/md0
label = Default
read-only
image = /boot/vmlinuz-generic-3.2.29
initrd = /boot/initrd.gz
root = /dev/md0
label = Generic
read-only
image = /boot/vmlinuz-huge-3.2.29
root = /dev/md0
label = Huge
read-only

mkinitrd.conf
-------------
# mkinitrd.conf.sample
# See "man mkinitrd.conf" for details on the syntax of this file
#
#SOURCE_TREE="/boot/initrd-tree"
#CLEAR_TREE="0"
#OUTPUT_IMAGE="/boot/initrd.gz"
#KERNEL_VERSION="$(uname -r)"
#KEYMAP="us"
MODULE_LIST="ext4:megaraid_sas"
#LUKSDEV="/dev/sda2"
#LUKSKEY="LABEL=TRAVELSTICK:/keys/alienbob.luks"
ROOTDEV="/dev/md0"
ROOTFS="ext4"
#RESUMEDEV="/dev/sda2"
RAID="1"
#LVM="0"
UDEV="1"
#MODCONF="0"
#WAIT="1"

The mkinitrd_command_generator.sh spits out
mkinitrd -c -k 3.2.29 -f ext4 -r /dev/md0 -m megaraid_sas:usb-storage:ehci-hcd:usbhid:mbcache:jbd2:ext4 -R -u -o /boot/initrd.gz

which matches what I put in my mkinitrd.conf (aside from the
unnecessary usb stuff).

Thanks in advance for any hints, not sure how to
even attempt to troubleshoot this any further.

--Ed

Didier Spaier 02-18-2013 11:36 PM

Blind guess: maybe your disks need a little more time to sync.

So I'd add to the relevant entry in lilo.conf:
Code:

append="rootdelay=30"
then run "lilo".

Adjust it to the time actually needed; the delay is in seconds.

To learn more, see /usr/src/linux/Documentation/kernel-parameters.txt
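
Applied to the Generic entry from the first post, that could look like the fragment below. This is only a sketch: it assumes a per-image append takes the place of the global one, so the global vt.default_utf8=1 is repeated here to be safe.

```
image = /boot/vmlinuz-generic-3.2.29
  initrd = /boot/initrd.gz
  root = /dev/md0
  label = Generic
  read-only
  append = " vt.default_utf8=1 rootdelay=30"
```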

eduardr 02-18-2013 11:55 PM

Didier,

Thanks for the suggestion! I tried it and am still getting the
boot error (OCR'd from a screenshot):

[    6.051608] Waiting 45sec before mounting root device ...
[   51.056125] md: Waiting for all devices to be available before autodetect
[   51.056251] md: If you don't use raid, use raid=noautodetect
[   51.057059] md: Autodetecting RAID arrays.
[   51.057175] md: Scanned 0 and added 0 devices.
[   51.057287] md: autorun ...
[   51.057399] md: ... autorun DONE.
[   51.057511] RAMDISK: Couldn't find valid RAM disk image starting at 0.
[   51.071603] VFS: Cannot open root device "900" or unknown-block(9,0)
[   51.071715] Please append a correct "root=" boot option; here are the available partitions:

Kernel panic - not syncing: VFS: Unable to mount root fs on unkno
Pid: 1, comm: swapper/0 Not tainted 3.2.29 #1
Call Trace:
 [<ffffffff811f9126>] panic+0x91/0x189
 [<ffffffff81896f58>] mount_block_root+0x1ce/0x27f
 [<ffffffff818971ec>] mount_root+0xa1/0xa5
 [<ffffffff8189735d>] prepare_namespace+0x16d/0x1a6
 [<ffffffff8107acd0>] ? release_tgcred.isra.5+0x30/0x30
 [<ffffffff81896cc6>] kernel_init+0x11b/0x150
 [<ffffffff81501e31>] kernel_thread_helper+0x1/0x10
 [<ffffffff81896b7b>] ? start_kernel+0x390/0x390
 [<ffffffff81501e30>] ? gs_change+0xb/0xb

Didier Spaier 02-19-2013 02:31 AM

Quote:

Originally Posted by eduardr (Post 4894818)
[   51.057511] RAMDISK: Couldn't find valid RAM disk image starting at 0.

This seems to indicate a problem with your initrd.
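
A quick sanity check on the initrd file itself can be sketched as follows. check_initrd is a hypothetical helper name, and the check is deliberately narrow: it only confirms that the file exists, is not empty, and is an intact gzip stream.

```shell
#!/bin/sh
# check_initrd is a hypothetical helper: it only verifies that the file
# exists, is not empty, and is an intact gzip stream. It does not prove
# that the boot loader actually loaded it.
check_initrd() {
    [ -s "$1" ] || { echo "missing or empty: $1"; return 1; }
    gzip -t "$1" 2>/dev/null || { echo "not a valid gzip file: $1"; return 1; }
    echo "ok: $1"
}

# Example (path taken from the lilo.conf in the first post):
# check_initrd /boot/initrd.gz
```

Since lilo records the on-disk location of the initrd when it is run, re-running "lilo" after regenerating the initrd is also worth doing before the next boot attempt.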

My lack of knowledge prevents me from helping you solve it, but you can still reconfigure your kernel with all the drivers needed at boot time built in (especially the file system used for /), then recompile and install it.

This could help you do that.

comet.berkeley 02-19-2013 06:03 PM

Since the huge kernel works and the smaller one doesn't, it definitely looks like an initrd problem.

Quote:

Originally Posted by Didier Spaier (Post 4894890)
This seems to indicate a problem with your initrd.

Look at the /boot/README.initrd file for documentation on initrd.

Essentially an initial ramdisk (initrd) is created to contain kernel modules that are not included in the kernel.
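
To see which drivers fall into that category for a given kernel, one can look the option up in that kernel's config file. A rough sketch, where driver_state is a hypothetical helper and the /boot/config-* file names in the example are an assumption about how the configs are installed:

```shell
#!/bin/sh
# driver_state is a hypothetical helper. Given a kernel config file and an
# option name, it reports whether that option is built in (=y), a module (=m),
# or not set.
driver_state() {
    case "$(grep -E "^$2=" "$1" | head -n 1)" in
        *=y) echo "built-in" ;;
        *=m) echo "module" ;;
        *)   echo "not set" ;;
    esac
}

# Example, assuming the config files sit in /boot under these names:
# driver_state /boot/config-generic-3.2.29 CONFIG_MEGARAID_SAS
# driver_state /boot/config-huge-3.2.29 CONFIG_MEGARAID_SAS
```

Anything reported as "module" for the generic kernel and needed to reach the root file system has to be in the initrd.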

I used to mess around with initrd files, but after many years I realized that it is not worth the trouble.

Now I always use "huge" kernels and keep it simple.

Didier Spaier 02-19-2013 06:19 PM

Yes, except in the few cases where a conflicting driver can neither be blacklisted (because it is built in) nor disabled using a specific parameter.

In that case you end up either using an initrd or recompiling the kernel without this driver.

eduardr 02-27-2013 05:29 PM

Thanks again for all the help. I compiled a new kernel, version 3.2.39, with megaraid_sas, mpt2sas (for another server I use), and ext2, ext3, and ext4 built in, and it boots fine now! I based this kernel config on the generic config shipped with Slackware 14. As for why the initrd didn't work, some mysteries must remain forever unsolved :).

I mainly followed the guide at http://alien.slackbook.org/dokuwiki/...kernelbuilding but instead of using a local version name in the kernel config I just used a newer kernel release source (3.2.39) so that it would not conflict with the default kernels and modules installed by Slack 14.
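
For anyone repeating this, the config change can be sketched roughly as below. make_builtin is a hypothetical helper, the option names in the example are my best mapping of the drivers mentioned above, and "make oldconfig" is left to sort out dependencies, so it is worth checking afterwards that the options really stayed at =y.

```shell
#!/bin/sh
# make_builtin is a hypothetical helper: it rewrites "OPTION=m" to "OPTION=y"
# in a kernel .config for each option named. Options that are not modules
# are left untouched.
make_builtin() {
    cfg=$1; shift
    for opt in "$@"; do
        sed -i "s/^${opt}=m\$/${opt}=y/" "$cfg"
    done
}

# Example, run in the kernel source tree on a copy of the generic config:
# make_builtin .config CONFIG_MEGARAID_SAS CONFIG_SCSI_MPT2SAS CONFIG_EXT4_FS
# make oldconfig   # let kconfig resolve any dependencies
```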

Cheers,
--Ed


All times are GMT -5.