Old 10-01-2011, 02:55 AM   #1
jrhorn424
LQ Newbie
 
Registered: Oct 2011
Location: Virginia
Posts: 20

Rep: Reputation: Disabled
Boot fails after setting up software RAID 1+0 (or RAID 10)


I'm getting a kernel panic due to bad partition tables on my RAID arrays. The error isn't helpful, since I know the partition tables on the underlying disks are fine.

I'm using `mdadm` to set up software RAID according to the procedure outlined in the `README_RAID.txt` file on the installation DVD.

Specifically, I have four physical SATA drives showing up as `/dev/sd[abcd]`. I created the partition table below on `/dev/sda` using `fdisk`, with all partitions set to type fd ("Linux software RAID autodetect").

Code:
  PARTITION     SIZE         PURPOSE
 -------------+------------+---------
  /dev/sda1     100M         boot
  /dev/sda2     about 300G   root
  /dev/sda3     2G           swap
I then used `sfdisk` to copy this partition table to the three other devices, for example:

Code:
sfdisk -d /dev/sda | sfdisk /dev/sdb
So far, so good. Then I set up my arrays using `mdadm`, for example:

Code:
mdadm --create /dev/md0 --level 1 --raid-devices 4 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 --metadata=0.90
That one is a weird array: it mirrors the 100M boot partition across all four disks, making it (somewhat) immaterial which disk the BIOS finds first. Here are all the arrays I created; apart from /dev/md0 in the example above, none used the --metadata flag.

Code:
| ARRAY      | RAID LEVEL | # DEV | DEVICES          | PURPOSE                  | MOUNT |
|------------+------------+-------+------------------+--------------------------+-------|
| /dev/md0   |          1 |     4 | /dev/sd[abcd]1   | Mirrored boot partition  | /boot |
| /dev/md1   |          1 |     2 | /dev/sd[ab]2     | Mirrored root partition  |       |
| /dev/md2   |          1 |     2 | /dev/sd[ab]3     | Mirrored swap partition  |       |
| /dev/md3   |          1 |     2 | /dev/sd[cd]2     | Stripe minor for root    |       |
| /dev/md4   |          1 |     2 | /dev/sd[cd]3     | Stripe minor for swap    |       |
| /dev/md5   |          0 |     2 | /dev/md[13]      | Stripe the mirrored root | /     |
| /dev/md6   |          0 |     2 | /dev/md[24]      | Stripe the mirrored swap |       |
So, I was shooting for RAID 1+0 for the root and swap partitions and RAID 1 for the boot partition. My intuition is that these are set up correctly, since the correct sizes show up in the Slackware setup menu for both mounts and the swap space: `md0` is 100M, `md5` is about 600G, and `md6` is 4G.

I installed LILO as specified in the README_RAID file, and since I got a kernel panic on reboot, I used the install DVD to go back and switch to the generic kernel, make the initrd, edit `/etc/lilo.conf`, and even edit the `/etc/mdadm.conf` file. The last edit had no appreciable effect on booting.

I don't know what to try next. Any advice?
 
Old 10-01-2011, 06:02 AM   #2
wildwizard
Member
 
Registered: Apr 2009
Location: Oz
Distribution: slackware64-14.0
Posts: 875

Rep: Reputation: 282
Well, first, the 'fd' RAID auto detect doesn't (auto detect, that is) and should be removed from the README ASAP; you should use the partition type 'da', as stated in the *current* Linux RAID documentation.

To enable the RAID array when booting with / on RAID, you need to pass '-R' to mkinitrd to enable RAID support.
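
Something along these lines should do it (the kernel version, module list, and root device below are only placeholders for whatever your system actually uses):
Code:
mkinitrd -c -k 2.6.37.6 -m ext4 -f ext4 -r /dev/md1 -R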

I strongly suggest you keep /boot off RAID until you have the system working right; then you can go ahead and set it up.

Also, according to the LILO docs, you will need to use metadata v0.90 for the /boot partition unless you are running -current, which has a newer version of LILO that can use metadata v1.0 (LILO should fail to install with an error message if you get it wrong).

I also highly recommend reading the lilo RAID doc, which can be found inside the lilo source package, as it explains how lilo works with RAID (and how it should be installed) so the system can boot from any disk.

EDIT:

When you read the lilo raid doc you will also see that it makes particular note that everything needed for lilo and the kernel to load must be in /boot

Last edited by wildwizard; 10-01-2011 at 06:04 AM.
 
1 member found this post helpful.
Old 10-01-2011, 05:50 PM   #3
Richard Cranium
Senior Member
 
Registered: Apr 2009
Location: McKinney, Texas
Distribution: Slackware64 15.0
Posts: 3,858

Rep: Reputation: 2225
Quote:
Originally Posted by wildwizard View Post
Well first the 'fd' raid auto detect doesn't
Since when? It auto-detects just fine on my 13.37 system.
 
Old 10-01-2011, 05:54 PM   #4
jrhorn424
LQ Newbie
 
Registered: Oct 2011
Location: Virginia
Posts: 20

Original Poster
Rep: Reputation: Disabled
More details

Quote:
I strongly suggest you keep /boot off RAID until you have the system working right then you can go ahead and set it up.
I should have been more specific. LILO behaves fine, and I suppose that part of the boot phase is successful. I'm getting the kernel panic because the kernel doesn't know what to make of the partition tables on the bare disks, and it somehow isn't automatically picking up the RAID arrays created by `mdadm`.

Specifically, the startup sequence fails when the kernel tries to mount the root partition, not /boot.

I can't make heads or tails of what I should be putting in the `/etc/mdadm.conf` file, if anything.
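
(For what it's worth, the usual way to populate it seems to be dumping the running arrays into it; I haven't confirmed yet whether that changes anything here.)
Code:
mdadm --detail --scan >> /etc/mdadm.conf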

Quote:
Well first the 'fd' raid auto detect doesn't and should be removed from the README ASAP, you should use the partition type 'da' as stated in the *current* Linux RAID documentation.
I'll try repartitioning and reinstalling with the `da` type after I try a few other solutions.

Quote:
To enable the RAID array when booting a / on RAID you need to pass '-R' to mkinitrd to enable raid support.
To be clear, I did edit the `mkinitrd` configuration by setting `RAID="1"` and enabling the appropriate kernel modules. I will try remaking the initrd while passing that flag.

Quote:
Also according to the LILO docs for LILO you will need to use metadata v0.90 for the /boot partition unless you are running -current which has a newer version of LILO that can use metadata v1.0 (LILO should fail to install with an error message if you get it wrong)
Thanks for the warning. As you can see in my OP, I did use v0.90 for the metadata on the /boot partition.

Quote:
I also highly recommend reading the lilo raid doc which can be found inside the source file for lilo as it explains how lilo works with RAID (and how it should be installed) to enable the system to be bootable from any disk.
That was helpful, but I think most of this advice has been folded into the README_RAID.txt file on the Slackware install DVD.

Thanks for all the suggestions. I was really stuck yesterday, so I'll try these out and let you know what happens.
 
Old 10-01-2011, 08:50 PM   #5
wildwizard
Member
 
Registered: Apr 2009
Location: Oz
Distribution: slackware64-14.0
Posts: 875

Rep: Reputation: 282
Another thing: have you created partitions on the RAID arrays, or not?

I currently use partitions on the RAID array, so my root is /dev/md0p1; I'm not sure if you can still put the filesystem directly on the RAID array.
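
In other words, roughly this, after which the kernel exposes the partitions as /dev/md0p1, /dev/md0p2, and so on (just a sketch of the idea, not the exact commands I ran):
Code:
fdisk /dev/md0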

Quote:
Originally Posted by Richard Cranium View Post
Since when? It auto-detects just fine on my 13.37 system.
Does it, or is it mdadm from an initrd doing the detecting?

I would suggest RTFM but the new documentation is stored on a kernel.org server and if you haven't already heard about that problem :- http://www.google.com.au/#q=kernel.org+hacked

If it comes back :- http://raid.wiki.kernel.org/
 
Old 10-02-2011, 12:32 AM   #6
jrhorn424
LQ Newbie
 
Registered: Oct 2011
Location: Virginia
Posts: 20

Original Poster
Rep: Reputation: Disabled

Quote:
Originally Posted by wildwizard View Post
I would suggest RTFM but the new documentation is stored on a kernel.org server and if you haven't already heard about that problem :- http://www.google.com.au/#q=kernel.org+hacked

If it comes back :- http://raid.wiki.kernel.org/
Yes, the thought had occurred to me, but I was dismayed to find the raid howto stored on kernel.org! :-D

I got it working in the end. I'll post a write-up tomorrow. The short of it is indeed an RTFM kind of story... I thought the README had me completely covered, but I should have read the `mdadm` manual. It turns out `mdadm` natively supports RAID 10. The gibberish I set up (creating an md device from two other md devices) was how I logically thought RAID 10 would go, but it apparently created some syncing loops between the partitions.
 
Old 10-02-2011, 12:30 PM   #7
jrhorn424
LQ Newbie
 
Registered: Oct 2011
Location: Virginia
Posts: 20

Original Poster
Rep: Reputation: Disabled
RAID 10 on Slackware 13.37 Mini How-to

I'm posting this to document the correct procedure for myself, as well as to help anyone else who runs into this issue.

The Problems
After trying everything mentioned in this thread, I stumbled upon a clue while finally reading the mdadm manual: mdadm supports RAID 10 natively. As best I can gather, creating a /dev/mdN device from two other md devices won't work; during boot, the system reported having to re-sync several of the arrays right before the kernel panic.

On boot, I received the following error messages:

Code:
MOUNT:   Mounting /dev/md5 on /mnt failed: No such file or directory.
ERROR:   No /sbin/init found on rootdev (or not mounted). Trouble ahead.
I checked my partition tables (of the constituent drives) using fdisk and was warned:

Code:
Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)
Note that even after I got RAID working, this warning still showed up on the bare drives, so I think it's normal when using mdadm. In any case, rewriting these partition tables from scratch didn't fix the problem.

After double-checking /etc/fstab and switching to the huge kernel, the problem persisted, so I suspected the problem lay with mdadm. That's when I started over, using mdadm --create with --level 10 directly instead.

One more note before I move on: I loathe fdisk now. It apparently subscribes to the philosophy that usability may be sacrificed for precision, but the existence of cfdisk shows that isn't a necessary tradeoff. My gripe is specific: fdisk defaults to using sectors instead of cylinders, and (after being thoroughly confused by the math) I found that none of my partitions ended at cylinder boundaries. Hell, the default starting sector was almost 2000 sectors ahead of the first cylinder. You can switch to cylinder mode with `u`, but that mode is apparently deprecated.

The easiest solution for me? Create the partitions with cfdisk. Note, however, that cfdisk likes to make the first partition swap even if you tell it to type it as `fd`. The fix was to delete that partition and recreate it in fdisk. Since it was already the right size and started at the right cylinder, I started fdisk in cylinder mode, printed the table, and used the same addresses to recreate the partition.
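
(An alternative I didn't try: older versions of sfdisk can change a partition's type in place, which might avoid the delete-and-recreate dance entirely; check `man sfdisk` for whether yours supports the --change-id syntax.)
Code:
sfdisk --change-id /dev/sda 1 fd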

The Solution
Step One: Plan Partitions

I had a few goals. First, I wanted to separate the /boot partition from the rest so I could mirror it across all drives, as suggested by the lilo RAID readme and by the README_RAID.txt on the Slackware install DVD. I also wanted separate /usr and /home partitions to make upgrading simpler. I ended up using LVM for those last two, and keeping /boot, root, and swap as primary partitions. We'll be referring back to this table when setting up the mdadm arrays; the size column lists the per-disk partition size, which is half of the target usable size for the RAID 10 arrays.

Code:
  MD #   Partition #   Mount        Size (Full/2 for RAID 10)   RAID Devices   RAID Level
 ------+-------------+------------+---------------------------+--------------+------------
     0             1   /boot        100M                                   4            1
     1             2   /            20G/2 = 10G                            4           10
     2             3   swap         4G/2 = 2G                              4           10
 ------+-------------+------------+---------------------------+--------------+------------
     -             4   Extended     -                                      -            -
     3             5   /usr         40G/2 = 20G                            4           10
     4             6   /home        Remainder (about ~250G)                4           10
     -             -   Free space   ~100MB                                 -            -
Step Two: Create Partitions
Create the partitions using both cfdisk and fdisk, as noted in the last two paragraphs of the previous section.

After you create the partitions on the first device, use sfdisk to copy that partition table to the other drives, and then check with fdisk that they are identical (all partitions start and end on the same sectors, which matters for lilo).

Code:
sfdisk -d /dev/sda | sfdisk [--force] /dev/sdb
Optionally add the --force flag to the right-hand side of the pipe if needed, but be sure to reboot before continuing if you do.
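
A quick way to eyeball that the copies really match (assuming the same four drives, /dev/sd[abcd]) is to list all the tables side by side:
Code:
fdisk -l /dev/sd[abcd]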

Step Three: Create RAID Arrays
For my partition layout, these are the commands I used to create all of my arrays.
Code:
mdadm --create /dev/md0 --level 1 --raid-devices 4 /dev/sd[abcd]1 --metadata=0.90
mdadm --create /dev/md1 --level 10 --raid-devices 4 /dev/sd[abcd]2
mdadm --create /dev/md2 --level 10 --raid-devices 4 /dev/sd[abcd]3
mdadm --create /dev/md3 --level 10 --raid-devices 4 /dev/sd[abcd]5
mdadm --create /dev/md4 --level 10 --raid-devices 4 /dev/sd[abcd]6
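Before moving on, it doesn't hurt to confirm the arrays came up as expected (they will likely still be syncing in the background, which is fine). For example:
Code:
cat /proc/mdstat
mdadm --detail /dev/md1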
Step Four: Setup
Go ahead and type setup to start the Slackware setup menu. Select your target partitions first, making sure to select the root partition (/dev/md1 in my case) before setting up the other partitions. Then double-check that the correct swap partition is selected, and begin the installation.

When it comes time to configure lilo, just choose the simple option, and add the following kernel parameter when prompted: root=/dev/md1 (or whatever your root device is). Exit setup when finished. Do not reboot.

Step Five: Edit Configuration
Switch to the new installation by typing:
Code:
chroot /mnt
Edit /etc/lilo.conf, adding the following line without indenting:
Code:
raid-extra-boot = mbr-only
This makes the boot setup more robust, since lilo writes its boot code to the MBR of each disk in the array and the machine can boot from any of them. Change the boot=something line to the correct RAID device; in my case:
Code:
boot = /dev/md0
Save the file. You don't have to run lilo yet.

Step Six: Switch to Generic Kernel
The /boot partition should be mounted, but if it isn't, type:
Code:
mount /dev/md0 /boot
Perform the following commands to remove the current kernel and activate the generic one:
Code:
cd /boot
rm vmlinuz System.map config
# [TAB] below means: press Tab to complete the versioned filename of the generic kernel
ln -s vmlinuz-generic-[TAB] vmlinuz
ln -s System.map-generic-[TAB] System.map
ln -s config-generic-[TAB] config
Prepare an initrd by first copying the example config file:
Code:
cp /etc/mkinitrd.conf.example /etc/mkinitrd.conf
Then edit the configuration file. I uncommented as many lines as seemed necessary and filled in the appropriate values, including ROOTDEV and ROOTFS. The most important ones are RAID="1", MODULE_LIST="ext4", and LVM="1".
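
For reference, a minimal set of values consistent with the layout above might look like this (adjust the root device and filesystem to your own setup; the LVM line only matters if you actually use LVM):
Code:
MODULE_LIST="ext4"
ROOTDEV="/dev/md1"
ROOTFS="ext4"
RAID="1"
LVM="1"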

Run mkinitrd -F -R (the capital -F tells mkinitrd to use the values from /etc/mkinitrd.conf). Note where the initrd image is output. Edit /etc/lilo.conf again, adding initrd = /boot/initrd.gz indented under the Linux image section at the end of the file.
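
Putting the pieces together, the RAID-related parts of /etc/lilo.conf should end up looking something like this (the label and image path are illustrative; keep whatever setup generated for you):
Code:
boot = /dev/md0
raid-extra-boot = mbr-only

image = /boot/vmlinuz
  initrd = /boot/initrd.gz
  root = /dev/md1
  label = Linux
  read-only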

Run lilo -t -v to test. Observe the warnings. If all looks good, go ahead and run lilo without flags.

Reboot and pray.
 
  

