Old 07-05-2013, 03:42 AM   #1
meetscott
Samhain Slackbuild Maintainer
 
Registered: Sep 2004
Location: Phoenix, AZ, USA
Distribution: Slackware
Posts: 411

Rep: Reputation: 43
RAID issues with the last kernel upgrade from 3.2.29 to 3.2.45 on Slackware 14.0


Been a long time since I've asked a question here.

The last kernel release for Slackware hosed my RAID installation. Basically, I had to roll back. 3.2.45 seems to be recognizing my arrays as md127 and md126 when the kernel is loading.

A little info on my system...
4 disks

RAID 1 on the boot partition: /dev/md0, holding logical volume /dev/vg1/boot
md version 0.90

RAID 10 on the rest: /dev/md1, holding /dev/vg2/root, /dev/vg2/swap, and /dev/vg2/home
md version 1.2

I'm using the generic kernel. After the upgrade, rebuilding the initrd with mkinitrd, and reinstalling lilo, I get this message:

Code:
mount: mounting /dev/vg2/root on /mnt failed: No such device
ERROR: No /sbin/init found on rootdev (or not mounted). Trouble ahead.
You can try to fix it. Type 'exit' when things are done.
At this point nothing brings it alive. I've tried booting both the huge and generic kernels. I have to boot from the Slackware install DVD, remove all the 3.2.45 kernel patch packages, and install the 3.2.29 packages again. I rerun mkinitrd, reinstall lilo, and I have a working system again.

Any thoughts, or have others run into the same problem? I've searched around quite a bit and tried quite a few things, but it looks like this kernel upgrade is a "no go" for software RAID devices being recognized and used in the same way.
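For reference, the rebuild after installing the new kernel packages goes roughly like this (a sketch only; the filesystem type, module list, and output path here are assumptions based on the layout above, so adjust for your own setup):

Code:
# rebuild the initrd for the new generic kernel, with RAID (-R) and LVM (-L) support
mkinitrd -c -k 3.2.45 -f ext4 -r /dev/vg2/root -m ext4 -R -L -o /boot/initrd.gz
# point the image= stanza in /etc/lilo.conf at the new kernel, then reinstall lilo
lilo -v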

Last edited by meetscott; 07-05-2013 at 12:15 PM. Reason: Added CODE blocks
 
Old 07-05-2013, 04:14 AM   #2
wildwizard
Member
 
Registered: Apr 2009
Location: Oz
Distribution: slackware64-14.0
Posts: 875

Rep: Reputation: 282
Hmm, I had a similar issue with one of my RAID partitions not showing up in -current, and I had assumed it was related to the mkinitrd changes that went in.

I did however resolve the problem by ensuring that all RAID partitions are listed in /etc/mdadm.conf before creating the initrd.
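The quickest way I know to get every array listed is something along these lines (just a sketch; review the output before trusting it, and prune any duplicate entries):

Code:
# append ARRAY lines for every currently running array
mdadm --detail --scan >> /etc/mdadm.conf
# then rebuild the initrd so the updated mdadm.conf is copied into it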

That may or may not help as I don't know if the 3.2 series has been getting the same RAID code updates as the 3.9 series.
 
Old 07-05-2013, 05:58 AM   #3
TracyTiger
Member
 
Registered: Apr 2011
Location: California, USA
Distribution: Slackware
Posts: 528

Rep: Reputation: 273
Just a point of information ....

I'm successfully running Slack64 14.0 with the 3.2.45 kernel with a fully encrypted (except /boot) RAID1/RAID10 setup very similar to yours. However I'm not using LVM.

I use UUIDs in /etc/fstab, and like wildwizard, /etc/mdadm.conf defines the arrays, again with (different) UUIDs. I've had RAID component identification problems in the past when I didn't use UUID so now I always build RAID systems using UUID for configuration information.

It boots up as expected without difficulty. The challenging part was getting the UUIDs correct. Every query-type command seems to produce different UUIDs. Through trial and error I figured out which ones to use.
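For what it's worth, this is roughly how I keep the two kinds of UUIDs straight (the device name here is just a placeholder):

Code:
# filesystem UUID -- the one that belongs in /etc/fstab
blkid /dev/md1
# array UUID -- the one that belongs in /etc/mdadm.conf
mdadm --detail /dev/md1 | grep UUID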

You may want to look carefully at mkinitrd, lilo, fstab, & mdadm.conf before giving up on the 3.2.45 kernel.

EDIT: ... and get rid of "root=" in the lilo configuration image section.

Last edited by TracyTiger; 07-05-2013 at 06:01 AM.
 
Old 07-05-2013, 12:21 PM   #4
meetscott
Samhain Slackbuild Maintainer
 
Registered: Sep 2004
Location: Phoenix, AZ, USA
Distribution: Slackware
Posts: 411

Original Poster
Rep: Reputation: 43
Quote:
Originally Posted by wildwizard View Post
Hmm, I had a similar issue with one of my RAID partitions not showing up in -current, and I had assumed it was related to the mkinitrd changes that went in.

I did however resolve the problem by ensuring that all RAID partitions are listed in /etc/mdadm.conf before creating the initrd.

That may or may not help as I don't know if the 3.2 series has been getting the same RAID code updates as the 3.9 series.
Yes, I have those listed in my mdadm.conf. I did check that. I forgot to say so.

Code:
ARRAY /dev/md0 UUID=994ea4ee:2e64f4d5:208cdb8d:9e23b04b
ARRAY /dev/md/1 UUID=d79b38ac:2b0c654d:a16d0a19:babaf044
I've tried a few settings in there but have gotten nowhere. The device is showing up as /dev/md/1. I've tried that and /dev/md1, which is what it originally was.
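In case it's useful, this is roughly how I've been comparing what the kernel actually assembles against what the conf file claims (sketch only):

Code:
# how the running kernel assembled and named the arrays
cat /proc/mdstat
# the ARRAY lines mdadm itself would generate for those arrays
mdadm --detail --scan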
 
Old 07-05-2013, 01:03 PM   #5
meetscott
Samhain Slackbuild Maintainer
 
Registered: Sep 2004
Location: Phoenix, AZ, USA
Distribution: Slackware
Posts: 411

Original Poster
Rep: Reputation: 43
Quote:
Originally Posted by Tracy Tiger View Post
Just a point of information ....

I'm successfully running Slack64 14.0 with the 3.2.45 kernel with a fully encrypted (except /boot) RAID1/RAID10 setup very similar to yours. However I'm not using LVM.

I use UUIDs in /etc/fstab, and like wildwizard, /etc/mdadm.conf defines the arrays, again with (different) UUIDs. I've had RAID component identification problems in the past when I didn't use UUID so now I always build RAID systems using UUID for configuration information.

It boots up as expected without difficulty. The challenging part was getting the UUIDs correct. Every query-type command seems to produce different UUIDs. Through trial and error I figured out which ones to use.

You may want to look carefully at mkinitrd, lilo, fstab, & mdadm.conf before giving up on the 3.2.45 kernel.

EDIT: ... and get rid of "root=" in the lilo configuration image section.
Interesting. You are not using LVM and you are encrypting. I encrypt my laptop drive and use LVM on it, and the upgrade went okay there. It's weird that the UUIDs need to be tweaked now. I saw them in use and didn't think anything of it. They match; what more could the system be looking for?

I'm a little confused on the last thing. "Get rid of 'root=' in the lilo configuration image section"??? How on earth will it know which partition to use for root? I have 4 partitions that could be used... I'm assuming it doesn't know that swap is swap.

Here's my lilo configuration without the commented-out parts. Keep in mind that this is my configuration for 3.2.29; the 3.2.45 configuration is the same with those particular values changed.
Code:
append=" vt.default_utf8=0"
boot = /dev/md0
raid-extra-boot = mbr-only

bitmap = /boot/slack.bmp
bmp-colors = 255,0,255,0,255,0
bmp-table = 60,6,1,16
bmp-timer = 65,27,0,255
prompt
timeout = 100
change-rules
reset

vga = 773

image = /boot/vmlinuz-generic-3.2.29
  initrd = /boot/initrd.gz
  root = /dev/vg2/root
  label = 3.2.29
  read-only
 
Old 07-05-2013, 02:31 PM   #6
TracyTiger
Member
 
Registered: Apr 2011
Location: California, USA
Distribution: Slackware
Posts: 528

Rep: Reputation: 273
Quote:
Originally Posted by meetscott View Post
Interesting. You are not using LVM and you are encrypting.
I've used RAID/Encryption both with and without LVM. Both worked. I don't have any current systems running LVM for me to check at the moment.

Quote:
I'm a little confused on the last thing. "Get rid of 'root=' in the lilo configuration image section"??? How on earth will it know which partition to use for root? I have 4 partitions that could be used... I'm assuming it doesn't know that swap is swap.
I believe "root=" isn't needed with initrd because initrd already has the information about which partition to use for root. See the thread here https://www.linuxquestions.org/quest...6/#post4801795 for information on how using "root=" in the lilo image section causes problems.

Quote:
Here's my lilo configuration without the commented-out parts. Keep in mind that this is my configuration for 3.2.29; the 3.2.45 configuration is the same with those particular values changed.
My particular problems in the linked post occurred when I upgraded a running system. I don't know why an upgrade causes issues.

Troubleshooting based on my ignorance follows ...
You may want to force a failure by changing a UUID in mdadm.conf just to see that the information there is actually being utilized and that the UUIDs there are correct when the new kernel is running.

Last edited by TracyTiger; 07-05-2013 at 02:36 PM. Reason: clarification - single word added
 
Old 07-05-2013, 04:21 PM   #7
TracyTiger
Member
 
Registered: Apr 2011
Location: California, USA
Distribution: Slackware
Posts: 528

Rep: Reputation: 273
Quote:
Originally Posted by meetscott View Post
Yes, I have those listed in my mdadm.conf. I did check that. I forgot to say so.

Code:
ARRAY /dev/md0 UUID=994ea4ee:2e64f4d5:208cdb8d:9e23b04b
ARRAY /dev/md/1 UUID=d79b38ac:2b0c654d:a16d0a19:babaf044
I've tried a few settings in there but have gotten nowhere. The device is showing up as /dev/md/1. I've tried that and /dev/md1, which is what it originally was.
Note that my mdadm.conf file looks more like this:

Code:
ARRAY /dev/md1 metadata=0.90 UUID=xxxxxxxx:xxxxxxxx:xxxxxxxx:xxxxxxxx
ARRAY /dev/md2 metadata=1.2 UUID=xxxxxxxx:xxxxxxxx:xxxxxxxx:xxxxxxxx
ARRAY /dev/md3 metadata=1.2 UUID=xxxxxxxx:xxxxxxxx:xxxxxxxx:xxxxxxxx
ARRAY /dev/md5 metadata=1.2 UUID=xxxxxxxx:xxxxxxxx:xxxxxxxx:xxxxxxxx
ARRAY /dev/md6 metadata=1.2 UUID=xxxxxxxx:xxxxxxxx:xxxxxxxx:xxxxxxxx
Maybe the missing metadata is important to the new kernel? Perhaps it defaults to version 1.2 so version 0.90 needs to be made explicit?
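If you want to regenerate the ARRAY lines with the metadata version spelled out, something like this should print them (a sketch; compare against your existing entries before replacing anything):

Code:
# ARRAY lines, including metadata= and UUID=, for the running arrays
mdadm --detail --scan
# or read them straight from the member superblocks
mdadm --examine --scan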

Last edited by TracyTiger; 07-05-2013 at 04:22 PM.
 
Old 07-07-2013, 03:41 AM   #8
meetscott
Samhain Slackbuild Maintainer
 
Registered: Sep 2004
Location: Phoenix, AZ, USA
Distribution: Slackware
Posts: 411

Original Poster
Rep: Reputation: 43
Quote:
Originally Posted by Tracy Tiger View Post
Note that my mdadm.conf file looks more like this:

Code:
ARRAY /dev/md1 metadata=0.90 UUID=xxxxxxxx:xxxxxxxx:xxxxxxxx:xxxxxxxx
ARRAY /dev/md2 metadata=1.2 UUID=xxxxxxxx:xxxxxxxx:xxxxxxxx:xxxxxxxx
ARRAY /dev/md3 metadata=1.2 UUID=xxxxxxxx:xxxxxxxx:xxxxxxxx:xxxxxxxx
ARRAY /dev/md5 metadata=1.2 UUID=xxxxxxxx:xxxxxxxx:xxxxxxxx:xxxxxxxx
ARRAY /dev/md6 metadata=1.2 UUID=xxxxxxxx:xxxxxxxx:xxxxxxxx:xxxxxxxx
Maybe the missing metadata is important to the new kernel? Perhaps it defaults to version 1.2 so version 0.90 needs to be made explicit?
Thanks for the reply. I've tried both, with and without the metadata.

Regarding your previous post...
I've never tried *not* specifying the root device in my lilo.conf. I've been running this way for years and never had a problem. It is also specified in the Slackware documentation Alien Bob wrote. That doesn't make it right and perhaps it is worth trying.

I don't know why this is suddenly becoming an issue. I don't reinstall from scratch unless I must for some new system; I always go through the upgrade process. These LVM RAID 10 configurations have been flawless through those upgrades. I even upgrade one machine remotely, as it is colocated. That has also gone well for the last 7 years and I don't know how many upgrades :-)

Incidentally, I have a laptop, which is *not* RAID but uses LVM and encryption. The upgrade was okay there. Given the variety of Slackware systems I have running (5 at the moment), I'm left with the impression that this is only an issue with RAID and the new 3.2.45 kernel.
 
Old 07-07-2013, 02:34 PM   #9
TracyTiger
Member
 
Registered: Apr 2011
Location: California, USA
Distribution: Slackware
Posts: 528

Rep: Reputation: 273
Quote:
Originally Posted by meetscott View Post
3.2.45 seems to be recognizing my arrays as md127 and md126 when the kernel is loading.
Whenever I don't use default values and I see default values appearing on the screen and in logs, I usually suspect that my configuration setup isn't working (/etc/xxxx.conf) or isn't being referenced as I intended.

Quote:
I've never tried *not* specifying the root device in my lilo.conf. I've been running this way for years and never had a problem. It is also specified in the Slackware documentation Alien Bob wrote. That doesn't make it right and perhaps it is worth trying.
As you probably read in the link to the previous LQ thread, it was Alien Bob who suggested I drop specifying root in the lilo.conf image section.

Quote:
I don't know why this is suddenly becoming an issue.
RAID using initrd and specifying root in lilo worked well for me for a long time also....until it didn't.

Perhaps other LQ members have better insight into your issue than I, and would like to respond.
 
Old 07-07-2013, 02:59 PM   #10
kikinovak
MLED Founder
 
Registered: Jun 2011
Location: Montpezat (South France)
Distribution: CentOS, OpenSUSE
Posts: 3,453

Rep: Reputation: 2154
Everything running fine here.

Code:
[root@nestor:~] # uname -r
3.2.45
[root@nestor:~] # cat /proc/mdstat 
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] 
md3 : active raid5 sda3[0] sdd3[3] sdc3[2] sdb3[1]
      729317376 blocks level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
      
md2 : active raid1 sda2[0] sdd2[3] sdc2[2] sdb2[1]
      995904 blocks [4/4] [UUUU]
      
md1 : active raid1 sda1[0] sdd1[3] sdc1[2] sdb1[1]
      96256 blocks [4/4] [UUUU]
 
Old 07-07-2013, 03:51 PM   #11
meetscott
Samhain Slackbuild Maintainer
 
Registered: Sep 2004
Location: Phoenix, AZ, USA
Distribution: Slackware
Posts: 411

Original Poster
Rep: Reputation: 43
Quote:
Originally Posted by Tracy Tiger View Post
Whenever I don't use default values and I see default values appearing on the screen and in logs, I usually suspect that my configuration setup isn't working (/etc/xxxx.conf) or isn't being referenced as I intended.



As you probably read in the link to the previous LQ thread, it was Alien Bob who suggested I drop specifying root in the lilo.conf image section.



RAID using initrd and specifying root in lilo worked well for me for a long time also....until it didn't.

Perhaps other LQ members have better insight into your issue than I, and would like to respond.
I didn't read the link before, but I have now. I'll have to give it a try. It seems that might be the key.
 
Old 07-07-2013, 07:33 PM   #12
Richard Cranium
Senior Member
 
Registered: Apr 2009
Location: McKinney, Texas
Distribution: Slackware64 15.0
Posts: 3,860

Rep: Reputation: 2229
I had no issues upgrading from 3.2.29 to 3.2.45.

My boot partition is on /dev/md0. I do use grub2 instead of lilo, and all of my RAID arrays auto-assemble instead of being explicitly defined in /etc/mdadm.conf.

Code:
root@darkstar:~# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] 
md1 : active raid1 sde3[0] sdf3[1]
      142716800 blocks super 1.2 [2/2] [UU]
      
md0 : active raid1 sde2[0] sdf2[1]
      523968 blocks super 1.2 [2/2] [UU]
      
md3 : active raid1 sdc2[0] sda2[1]
      624880192 blocks [2/2] [UU]
      
unused devices: <none>
# pvs
  PV         VG      Fmt  Attr PSize   PFree  
  /dev/md1   mdgroup lvm2 a--  136.09g 136.09g
  /dev/md3   mdgroup lvm2 a--  595.91g  86.62g
  /dev/sdd   testvg  lvm2 a--  111.79g  11.79g
root@darkstar:~#
 
Old 07-14-2013, 08:17 PM   #13
meetscott
Samhain Slackbuild Maintainer
 
Registered: Sep 2004
Location: Phoenix, AZ, USA
Distribution: Slackware
Posts: 411

Original Poster
Rep: Reputation: 43
Well, I figured I'd let everyone know I tried this, that is, not specifying the root in the lilo.conf image section. I get the exact same results. So, I'm completely at a loss as to why I appear to be the only person seeing this behavior.

I give up on this one. I'm just going to be happy running the older kernel. It takes too much time to experiment around with this sort of thing and I have several other projects I need to attend to.
 
Old 07-15-2013, 11:19 PM   #14
Richard Cranium
Senior Member
 
Registered: Apr 2009
Location: McKinney, Texas
Distribution: Slackware64 15.0
Posts: 3,860

Rep: Reputation: 2229
Quote:
Originally Posted by meetscott View Post
Been a long time since I've asked a question here.

The last kernel release for Slackware hosed my RAID installation. Basically, I had to roll back. 3.2.45 seems to be recognizing my arrays as md127 and md126 when the kernel is loading.
I bothered to look at my dmesg output; it appears that my system also starts using md125, md126 and md127 but figures out later that isn't correct (some messages removed for clarity)...

Code:
[    4.664805] udevd[1056]: starting version 182
[    4.876982] md: bind<sda2>
[    4.882388] md: bind<sdc2>
[    4.883399] bio: create slab <bio-1> at 1
[    4.883557] md/raid1:md127: active with 2 out of 2 mirrors
[    4.883674] md127: detected capacity change from 0 to 639877316608
[    4.890199]  md127: unknown partition table

[   23.896644]  sde: sde1 sde2 sde3
[   23.902268] sd 8:0:2:0: [sde] Attached SCSI disk
[   24.036540] md: bind<sde2>
[   24.039801] md: bind<sde3>
[   24.126990]  sdf: sdf1 sdf2 sdf3
[   24.132618] sd 8:0:3:0: [sdf] Attached SCSI disk

[   24.264754] md: bind<sdf3>

[   24.266127] md/raid1:md125: active with 2 out of 2 mirrors
[   24.266242] md125: detected capacity change from 0 to 146142003200

[   24.274335] md: bind<sdf2>
[   24.275479] md/raid1:md126: active with 2 out of 2 mirrors
[   24.275593] md126: detected capacity change from 0 to 536543232
[   24.288237]  md126: unknown partition table
[   24.293884]  md125: unknown partition table

[   25.575361] md125: detected capacity change from 146142003200 to 0
[   25.575466] md: md125 stopped.
[   25.575566] md: unbind<sdf3>
[   25.589043] md: export_rdev(sdf3)
[   25.589161] md: unbind<sde3>
[   25.605083] md: export_rdev(sde3)
[   25.605667] md126: detected capacity change from 536543232 to 0
[   25.605771] md: md126 stopped.
[   25.605871] md: unbind<sdf2>
[   25.610029] md: export_rdev(sdf2)
[   25.610136] md: unbind<sde2>
[   25.615016] md: export_rdev(sde2)
[   25.615537] md127: detected capacity change from 639877316608 to 0
[   25.615641] md: md127 stopped.
[   25.615741] md: unbind<sdc2>
[   25.620051] md: export_rdev(sdc2)
[   25.620156] md: unbind<sda2>
[   25.624083] md: export_rdev(sda2)
[   25.772347] md: md3 stopped.
[   25.772979] md: bind<sda2>
[   25.773190] md: bind<sdc2>
[   25.774071] md/raid1:md3: active with 2 out of 2 mirrors
[   25.774188] md3: detected capacity change from 0 to 639877316608
[   25.781286]  md3: unknown partition table
[   25.794571] md: md0 stopped.
[   25.795353] md: bind<sdf2>
[   25.795611] md: bind<sde2>
[   25.796365] md/raid1:md0: active with 2 out of 2 mirrors
[   25.796482] md0: detected capacity change from 0 to 536543232
[   25.808178]  md0: unknown partition table
[   26.014044] md: md1 stopped.
[   26.020403] md: bind<sdf3>
[   26.020649] md: bind<sde3>
[   26.021428] md/raid1:md1: active with 2 out of 2 mirrors
[   26.021544] md1: detected capacity change from 0 to 146142003200
[   26.071258]  md1: unknown partition table
I doubt any of that helps, but if you ever get around to looking at this again, you might want to wade through https://bugzilla.redhat.com/show_bug.cgi?id=606481, which contained more than I ever wanted to know about the subject. (Hell, now I'm not sure why my setup works!)
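If I'm reading that bug report right, the md125-md127 fallback names show up when mdadm's incremental assembly can't match an array against mdadm.conf or the local homehost, so for the 1.2-metadata arrays something along these lines might be worth checking (a sketch only, not something I've had to do on my own boxes; the device names are taken from the dmesg above):

Code:
# see what name/homehost the 1.2 superblock actually carries
mdadm --detail /dev/md125 | grep -i name
# one possible fix: stop the misnamed array and re-assemble it with the homehost rewritten
mdadm --stop /dev/md125
mdadm --assemble /dev/md1 --update=homehost --homehost=$(hostname) /dev/sde3 /dev/sdf3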
 
Old 07-16-2013, 11:34 AM   #15
meetscott
Samhain Slackbuild Maintainer
 
Registered: Sep 2004
Location: Phoenix, AZ, USA
Distribution: Slackware
Posts: 411

Original Poster
Rep: Reputation: 43
Richard Cranium, thanks for taking the time to put that output together so nicely. I saw the same things, only mine doesn't figure it out later. I guess they've put this auto-detection into the kernel now. I imagine I'm going to have to address it some day, but I have a few other projects I'm working on at the moment, so I don't have time to burn on figuring this out right now.

Just be grateful it is working :-)
 
  

