Old 10-06-2010, 07:31 PM   #1
lpallard
Member
 
Registered: Nov 2008
Location: Milky Way
Distribution: Slackware (various releases)
Posts: 970

Rep: Reputation: 44
Software RAID1 problems


Well, I guess I'm back with new questions! I "almost" successfully implemented a RAID1 setup on my server using a VERY good website:

https://raid.wiki.kernel.org/index.p...ID_Boot_Recipe

Basically, I had a working system on a SATA HDD but the drive died. I transferred (using dd) all data to a new HDD, and got an identical HDD for RAID1. So the procedure stated at the link above was intended for what I wanted to do, i.e. convert a running system to a RAID1 setup.

My plan was to have /boot on md0 and / on md1, and that is how I set things up.

So now I am almost done but I have a problem. When Slackware (13.1) starts up, it gets to a point where it should normally assemble the arrays but instead, I get an error message like:
Code:
Failed to open the device '/dev/md1': No such file or directory

*******************************************************
*** An error occurred during the file system check. ***
*** You will now be given a chance to log into the  ***
*** system in single-user mode to fix the problem.  ***
*** Running 'e2fsck -v -y <partition>' might help.  ***
*******************************************************

Once you exit the single user shell, the system will reboot.

Type Control-d to proceed with normal system startup
(or enter root password for system maintenance):
Logging into maintenance mode, I checked mdstat (cat /proc/mdstat) and there are no assembled/running arrays. I understand why the system goes into recovery or emergency mode... md1 (root) cannot be assembled properly, and Slackware ends up with a degraded array built from sda3 only (one of md1's components).

In this recovery shell, I performed some basic stuff and found some interesting things:

1- None of the arrays are being assembled at system startup. Even md0 (/boot) gets started as sda1 only.

2- I can manually assemble all arrays (mdadm -A /dev/mdX /dev/sdaX /dev/sdbX, roughly as sketched below), with the exception of md1 (root), which gives:

Code:
mdadm: Cannot open device /dev/sda3: device or resource busy
mdadm: /dev/sda3 has no superblock - assembly aborted
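For reference, this is roughly what I ran from the maintenance shell (array and member names are from my layout):
Code:
# /boot and the data arrays assemble fine by hand
mdadm -A /dev/md0 /dev/sda1 /dev/sdb1
mdadm -A /dev/md2 /dev/sda5 /dev/sdb5

# the root array is the one that fails with "device or resource busy"
mdadm -A /dev/md1 /dev/sda3 /dev/sdb3

# inspect the RAID superblocks on the two md1 members
mdadm -E /dev/sda3
mdadm -E /dev/sdb3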
LILO seems properly installed, but I can provide lilo.conf if people are interested. I created an initrd using the instructions in README.initrd and it completed without errors. Again, I can provide mkinitrd.conf if need be. To me it looks like, for some reason, Slackware is not able to assemble the arrays at boot and therefore mounts root from the plain partition, read-only.

Booting the machine with a live distro like SLAX or Parted Magic works 100%... In fact, SLAX even mounts each array in /mnt/mdX.

What could be the problem, or what direction could I take?

Thanks in advance

Last edited by lpallard; 10-06-2010 at 07:33 PM.
 
Old 10-06-2010, 08:12 PM   #2
HasC
Member
 
Registered: Oct 2009
Location: South America - Paraguay
Distribution: Debian 5 - Slackware 13.1 - Arch - Some others linuxes/*BSDs through KVM and Xen
Posts: 329

Rep: Reputation: 55
When booting in maintenance mode, check whether you have /etc/mdadm.conf.
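Something like this from the maintenance shell should tell you (standard paths assumed):
Code:
ls -l /etc/mdadm.conf
cat /etc/mdadm.conf
# and compare the ARRAY lines against what is actually on the disks
mdadm --examine --scan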
 
Old 10-06-2010, 08:23 PM   #3
lpallard
Member
 
Registered: Nov 2008
Location: Milky Way
Distribution: Slackware (various releases)
Posts: 970

Original Poster
Rep: Reputation: 44
Thanks for your reply!

Yes, /etc/mdadm.conf is there.
 
Old 10-06-2010, 09:29 PM   #4
lpallard
Member
 
Registered: Nov 2008
Location: Milky Way
Distribution: Slackware (various releases)
Posts: 970

Original Poster
Rep: Reputation: 44
Here's a little more info, hoping this will help you guys help me... All of it comes from the live distro (SLAX).

fdisk -l
Code:
Disk /dev/sda: 320.0 GB, 320072933376 bytes
255 heads, 63 sectors/track, 38913 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0xefb5efb5

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          17      131072   fd  Linux raid autodetect
Partition 1 does not end on cylinder boundary.
/dev/sda2              17         272     2048000   82  Linux swap
Partition 2 does not end on cylinder boundary.
/dev/sda3             272        2184    15360000   fd  Linux raid autodetect
/dev/sda4            2184       38914   295030784    5  Extended
/dev/sda5            2184        3459    10240000   fd  Linux raid autodetect
/dev/sda6            3459        4097     5120000   fd  Linux raid autodetect
/dev/sda7            4097        4734     5120000   fd  Linux raid autodetect
/dev/sda8            4734       38914   274546688   fd  Linux raid autodetect

Disk /dev/sdb: 320.0 GB, 320072933376 bytes
255 heads, 63 sectors/track, 38913 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0xb1c50413

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *           1          17      131072   fd  Linux raid autodetect
Partition 1 does not end on cylinder boundary.
/dev/sdb2              17         272     2048000   82  Linux swap
Partition 2 does not end on cylinder boundary.
/dev/sdb3             272        2184    15360000   fd  Linux raid autodetect
/dev/sdb4            2184       38914   295030784    5  Extended
/dev/sdb5            2184        3459    10240000   fd  Linux raid autodetect
/dev/sdb6            3459        4097     5120000   fd  Linux raid autodetect
/dev/sdb7            4097        4734     5120000   fd  Linux raid autodetect
/dev/sdb8            4734       38914   274546688   fd  Linux raid autodetect
cat /proc/mdstat
Code:
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath]
md1 : active raid1 sdb3[1]
      15359936 blocks [2/1] [_U]

md2 : active raid1 sdb5[1] sda5[0]
      10239936 blocks [2/2] [UU]

md3 : active raid1 sdb6[1] sda6[0]
      5119936 blocks [2/2] [UU]

md4 : active raid1 sdb7[1] sda7[0]
      5119936 blocks [2/2] [UU]

md5 : active raid1 sdb8[1] sda8[0]
      274546624 blocks [2/2] [UU]

md0 : active raid1 sdb1[1] sda1[0]
      131008 blocks [2/2] [UU]

unused devices: <none>
It seems that md1 only comes up degraded (sda3 missing), even in SLAX. Either this is new, or I just did not notice the _U instead of UU before...
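If nothing is holding sda3 busy from the live environment, I assume the degraded md1 could be repaired by re-adding the missing member (device names from my layout; I have not tried this yet):
Code:
# confirm the array is running degraded
mdadm --detail /dev/md1

# re-add the missing member and watch the resync
mdadm /dev/md1 --add /dev/sda3
cat /proc/mdstat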

cat /etc/mdadm.conf
Code:
# mdadm configuration file
#
# mdadm will function properly without the use of a configuration file,
# but this file is useful for keeping track of arrays and member disks.
# In general, a mdadm.conf file is created, and updated, after arrays
# are created. This is the opposite behavior of /etc/raidtab which is
# created prior to array construction.
#
#
# the config file takes two types of lines:
#
#       DEVICE lines specify a list of devices of where to look for
#         potential member disks
#
#       ARRAY lines specify information about how to identify arrays so
#         so that they can be activated
#
# You can have more than one device line and use wild cards. The first
# example includes SCSI the first partition of SCSI disks /dev/sdb,
# /dev/sdc, /dev/sdd, /dev/sdj, /dev/sdk, and /dev/sdl. The second
# line looks for array slices on IDE disks.
#
#DEVICE /dev/sd[bcdjkl]1
DEVICE /dev/sda1 /dev/sda3 /dev/sda5 /dev/sda6 /dev/sda7 /dev/sda8 /dev/sdb1 /dev/sdb3 /dev/sdb5 /dev/sdb6 /dev/sdb7 /dev/sdb8
#
# If you mount devfs on /dev, then a suitable way to list all devices is:
#DEVICE /dev/discs/*/*
#
#
#
# ARRAY lines specify an array to assemble and a method of identification.
# Arrays can currently be identified by using a UUID, superblock minor number,
# or a listing of devices.
#
#       super-minor is usually the minor number of the metadevice
#       UUID is the Universally Unique Identifier for the array
# Each can be obtained using
#
#       mdadm -D <md>
#
#ARRAY /dev/md0 UUID=3aaa0122:29827cfa:5331ad66:ca767371
#ARRAY /dev/md1 super-minor=1
#ARRAY /dev/md2 devices=/dev/hda1,/dev/hdb1
#
# ARRAY lines can also specify a "spare-group" for each array.  mdadm --monitor
# will then move a spare between arrays in a spare-group if one array has a failed
# drive but no spare
#ARRAY /dev/md4 uuid=b23f3c6d:aec43a9f:fd65db85:369432df spare-group=group1
#ARRAY /dev/md5 uuid=19464854:03f71b1b:e0df2edd:246cc977 spare-group=group1
#
# When used in --follow (aka --monitor) mode, mdadm needs a
# mail address and/or a program.  This can be given with "mailaddr"
# and "program" lines to that monitoring can be started using
#    mdadm --follow --scan & echo $! > /var/run/mdadm
# If the lines are not found, mdadm will exit quietly
#MAILADDR root@mydomain.tld
#PROGRAM /usr/sbin/handle-mdadm-events
ARRAY /dev/md0 metadata=0.90 UUID=75f29e39:6a4cdcf1:004291dc:e0ebbc5e auto=yes
ARRAY /dev/md1 metadata=0.90 UUID=06e7066d:f412d012:004291dc:e0ebbc5e auto=yes
ARRAY /dev/md2 metadata=0.90 UUID=03badc61:6937c408:004291dc:e0ebbc5e auto=yes
ARRAY /dev/md3 metadata=0.90 UUID=a212adef:00856c80:004291dc:e0ebbc5e auto=yes
ARRAY /dev/md4 metadata=0.90 UUID=d704a4bd:4aca87e4:004291dc:e0ebbc5e auto=yes
ARRAY /dev/md5 metadata=0.90 UUID=e223ae96:fe78952d:004291dc:e0ebbc5e auto=yes
cat /etc/lilo.conf
Code:
append=" vt.default_utf8=0"
boot = /dev/md0
raid-extra-boot = mbr-only
change-rules
  reset
vga = normal

image = /boot/vmlinuz-generic-2.6.33.4
  initrd = /boot/initrd.gz
  root = /dev/md1
  label = Gen-2.6.33.4
  read-only
cat /etc/mkinitrd.conf
Code:
MODULE_LIST="reiserfs"
RAID="1"
Once I got the RAID arrays working (using the SLAX live distro), this is the procedure I used to build the initrd (a fuller sketch of the whole sequence is further below):
1- boot with SLAX
2- chroot to md1 (/mnt/md1)
3- run
Code:
/sbin/mkinitrd -v -o /boot/initrd.gz -k 2.6.33.4
Then I installed LILO using this command:
Code:
lilo -v
The output of both commands had NO fatal or serious errors.
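For completeness, a rough sketch of the whole sequence from SLAX (device names and mount points are from my layout; mounting /proc and /dev inside the chroot is an assumption on my part, I did not originally do it):
Code:
# assemble the boot and root arrays from the live environment
mdadm -A /dev/md0 /dev/sda1 /dev/sdb1
mdadm -A /dev/md1 /dev/sda3 /dev/sdb3

# mount the root array and make the chroot usable
mount /dev/md1 /mnt/md1
mount /dev/md0 /mnt/md1/boot
mount -t proc proc /mnt/md1/proc   # assumption: mkinitrd may want /proc/partitions
mount -o bind /dev /mnt/md1/dev    # assumption: lilo needs the device nodes
chroot /mnt/md1 /bin/bash

# inside the chroot: rebuild the initrd and reinstall LILO
/sbin/mkinitrd -v -o /boot/initrd.gz -k 2.6.33.4
lilo -v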

Like I said in my initial post, I created the base drive (sda 320GB) of my current system from a crashed drive. It required a fair amount of work. Here's what happened and what I did. Maybe it will shed light on my problem.

I initially had a 750GB SATA HDD with Slack. The drive died.

I used dd to recover the partitions to image files, which I then cloned (using dd again) onto a temporary 80GB HDD. After some reinstalls, some updates (slackpkg update all) and some fooling around with lilo, I once again had a fully working system, just on a smaller HDD.

Then I started from there to create the RAID system. First of all, I plugged one of the new 320GB drives and the source 80GB drive into the machine and, using Parted Magic (Clonezilla), I cloned the 80GB HDD to the 320GB HDD. I double-checked that the operation was successful and yes, I could run the machine (boot and use it) from the new 320GB HDD. Then I unplugged the 80GB HDD.

At this point, I started the procedure at the link in the original post of this thread.

Last edited by lpallard; 10-06-2010 at 09:47 PM. Reason: Added info
 
Old 10-07-2010, 05:59 PM   #5
lpallard
Member
 
Registered: Nov 2008
Location: Milky Way
Distribution: Slackware (various releases)
Posts: 970

Original Poster
Rep: Reputation: 44
Anybody who reads this thread and has an idea, please share! I am in a "rush" to get this server back up and running, so any kind of info would be appreciated.

thanks!
 
Old 10-07-2010, 08:26 PM   #6
jefro
Guru
 
Registered: Mar 2008
Posts: 11,323

Rep: Reputation: 1386
If the live distro works, then I'd suspect some issue is not allowing the array to be built before the system tries to access data. You might have to make a small non-RAID partition, or put in some delay to settle some other issue, or such.

The real answer is that software raid on some of these cards is not exactly much good. I have played with a number of them and found them quirky at best. I agree that at least a mirror would be of some help. It may be more trouble than it is worth. Consider a real hardware card. It will perform much better and work generally on many OS's and can be recovered without an OS in most cases. I know they cost more.
 
Old 10-08-2010, 07:43 AM   #7
lpallard
Member
 
Registered: Nov 2008
Location: Milky Way
Distribution: Slackware (various releases)
Posts: 970

Original Poster
Rep: Reputation: 44
jefro, well, I really thought that the live distro managed to assemble md1, but it seems not... from my last long post, you can see:

md1 : active raid1 sdb3[1]
15359936 blocks [2/1] [_U]

"_U" means a degraded array no?

that was using SLAX live distro. Of course the real OS (Slackware) cant assemble any of the arrays on its own...
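To figure out why sda3 keeps dropping out of md1, I suppose the superblocks of the two members could be compared from SLAX (just my guess at the right check):
Code:
# compare the array UUID and the event counters on both members
mdadm --examine /dev/sda3
mdadm --examine /dev/sdb3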

Quote:
The real answer is that software raid on some of these cards is not exactly much good
I am not using any kind of RAID or controller cards... I am using 2 identical HDDs directly connected to the mobo's SATA ports. Nothing fancy here.

I am trying software RAID because I don't trust the fakeraid (nvidia chipset) on the mobo. I tried fakeraid in the past and lost a great deal of data when the chipset went bad... (that was on a Gigabyte mobo with an Intel Matrix Storage chipset).

Real hardware RAID controllers might be a better deal, but they cost $$$$... I can't afford anything over $150; it's just a home server after all.

I really wonder why this isn't working... I followed the instructions and adapted the commands exactly to my environment... I set up my laptop with RAID0 two years ago using essentially the same procedure and it never failed.

Last edited by lpallard; 10-08-2010 at 07:45 AM.
 
Old 10-08-2010, 03:30 PM   #8
lpallard
Member
 
Registered: Nov 2008
Location: Milky Way
Distribution: Slackware (various releases)
Posts: 970

Original Poster
Rep: Reputation: 44
Someone pointed out to me the error messages from fdisk:
Code:
Partition 1 does not end on cylinder boundary.
Partition 2 does not end on cylinder boundary.
In a different post on LQ, somebody was asking if this error is a big deal, and somebody else answered:

Quote:
That was not just a warning, this kind of thing is very dangerous:
/dev/hdc11 3892 4622 5859472+ 83 Linux
/dev/hdc12 4622 5351 5859472+ 83 Linux

The next partition starts at the same point where the previous ends, which can lead to data corruption if some bytes are written to the wrong partition.

If it was me, and especially concerning a server, I would doublecheck fdisk -l to see that the problem is gone.
So could this be the problem?
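If it helps, I guess a possible way to check for an actual overlap is to list the partitions in sectors instead of cylinders (run from the live distro; sda assumed, same for sdb):
Code:
# exact start/end in sectors, so cylinder rounding does not hide an overlap
fdisk -lu /dev/sda
parted /dev/sda unit s print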
 
Old 10-08-2010, 04:10 PM   #9
jefro
Guru
 
Registered: Mar 2008
Posts: 11,323

Rep: Reputation: 1386
Again, we can't be sure whether the array is correct or whether the partitions overlap. I guess you could try some partitioning program like Ranish to find out. I just don't like the software RAID cards.


Are you using the exact same software as you did 2 years ago?
 
Old 10-08-2010, 04:21 PM   #10
lpallard
Member
 
Registered: Nov 2008
Location: Milky Way
Distribution: Slackware (various releases)
Posts: 970

Original Poster
Rep: Reputation: 44
I managed to reduce the problem to a simpler one (I hope). I basically restarted from scratch, making sure the "Partition X does not end on cylinder boundary." error was not present. Now, when I boot, I get to the point where Slackware successfully mounts all arrays (woohoo!), but for some reason it still thinks that / is on sda3 when it is on md1... So I get dumped into a recovery console saying:

Code:
sda3: reiserfs: read super-block: bread failed (dev sda3)
Error: mounting sda3 on /mnt failed: Invalid argument.
ERROR: No /sbin/init found on rootdev. Trouble ahead.
Blablabla...

Of course, sda3 is NOW an extended partition hosting sda5 to sda9. So you see, Slackware tries to mount sda3 on / even though the real root is currently mounted as md1...

Where & how slack would still think / needs to be mounted on sda3???

There is no mention of sda3 in lilo.conf, nor in fstab, and nothing in mkinitrd.conf... I recreated initrd.gz from scratch and reinstalled LILO.

The reason I think it is an artifact of the past is that sda3 USED to be / (prior to the crash). Somehow, Slackware finds this old info somewhere and gets fooled by it.

EDIT: I think I found the problem. When I built the initrd, I got the error message "ERROR: cat /proc/partitions: No such file or directory". I think this is why sda3 ends up being mounted. I also understand why I get the mkinitrd error message: I am chrooted into the / folder when I execute the command, and /proc/partitions does not exist there because the real OS (the final one) is not running and /proc is not mounted inside the chroot. Basically, when I execute the mkinitrd command, it would somehow need to look outside the chroot environment to find /proc/partitions... Is that possible?
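I suppose the simpler route would be to make /proc visible inside the chroot before running mkinitrd, something along these lines (untested guess, paths from my layout):
Code:
# from the live environment, before chrooting:
mount -t proc proc /mnt/md1/proc

# or, once already inside the chroot:
mount -t proc proc /proc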

Last edited by lpallard; 10-08-2010 at 09:54 PM.
 
Old 10-09-2010, 06:58 PM   #11
lpallard
Member
 
Registered: Nov 2008
Location: Milky Way
Distribution: Slackware (various releases)
Posts: 970

Original Poster
Rep: Reputation: 44
Fixed!

Everything is working now!

The problem was that I needed to add the variable ROOTDEV=/dev/md1 in mkinitrd.conf and recreate the initrd.gz image.
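For anyone hitting the same problem later, the relevant part of my /etc/mkinitrd.conf now looks like this:
Code:
MODULE_LIST="reiserfs"
ROOTDEV="/dev/md1"
RAID="1"
Then recreate the image the same way as earlier in the thread (I believe mkinitrd's -F switch can be used instead to pull everything from /etc/mkinitrd.conf) and re-run lilo if in doubt:
Code:
/sbin/mkinitrd -v -o /boot/initrd.gz -k 2.6.33.4
lilo -v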

Any questions, please ask.

Thanks!
 
  

