LinuxQuestions.org
Old 05-22-2009, 04:45 PM   #1
ABL
Member
 
Registered: Mar 2005
Location: NYC
Distribution: CentOS 5
Posts: 54

Rep: Reputation: 16
Dual drive failure in RAID 5 (also, RAID 1, and LVM)


I *had* a server with 6 SATA2 drives with CentOS 5.3 on it (I've upgraded over time from 5.1). I had set up (software) RAID1 on /boot for sda1 and sdb1 with sdc1, sdd1, sde1, and sdf1 as hot backups. I created LVM (over RAID5) for /, /var, and /home. I had a drive fail last year (sda).

After a fashion, I was able to get it working again with sda removed. Since I had two hot spares on my RAID5/LVM deal, I never replaced sda. Of course, on reboot, what was sdb became sda, sdc became sdb, etc.

So, recently, the new sdc died. The hot spare took over, and I was humming along. A week later (before I had a chance to replace the spares), another died (sdb).

Now, I have 3 good drives, my array has degraded, but it's been running (until I just shut it down to try).

I now only have one replacement drive (it will take a week or two to get the others).

My questions/problems are:
I went to linux rescue from the CentOS 5.2 DVD and changed sda1 to a Linux (as opposed to Linux RAID) partition. I need to change my fstab to look for /dev/sda1 as boot, but I can't even mount sda1 as /boot. What do I need to do next? If I try to reboot without the disk, I get insmod: error inserting '/lib/raid456.ko': -1 File exists

Also, my md1 and md2 fail because there are not enough discs (it says 2/4 failed). I *believe* that this is because sda, sdb, sdc, sdd, and sde WERE the drives on the raid before, and I removed sdb and sdc, but now, I do not have sde (because I only have 4 drives) and sdd is the new drive. Do I need to label these drives and try again? Suggestions? (I suspect I should have done this BEFORE failure).

Do I need to rebuild the RAIDs somehow? What about LVM?

Any suggestions welcome.

Thank you!
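For anyone reading later: before changing partition types or fstab, it helps to see which arrays the kernel itself considers degraded. Below is a minimal sketch, assuming the arrays are at least partly assembled so that /proc/mdstat shows a status map such as [UU_]. The function name is made up for illustration, and an array that failed to start will not show a map at all.

```shell
#!/bin/sh
# Read /proc/mdstat-style text on stdin and print the name of every
# array whose status map (for example [U_U]) shows a missing member.
# "U" means the slot is up, "_" means the slot is missing or failed.
degraded_arrays() {
    awk '
        # Remember the array name from lines like "md1 : active raid5 ..."
        /^md[0-9]+ :/ { name = $1 }
        # A status map is a bracketed run of U and _ characters only.
        match($0, /\[[U_]+\]/) {
            if (index(substr($0, RSTART, RLENGTH), "_")) print name
        }
    '
}
```

Typical use would be `degraded_arrays < /proc/mdstat`. Device names such as sda or sdb do not matter here, since md identifies members by the superblock on each partition rather than by name.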
 
Old 05-26-2009, 12:52 PM   #2
ABL
Member
 
Registered: Mar 2005
Location: NYC
Distribution: CentOS 5
Posts: 54

Original Poster
Rep: Reputation: 16
Update

Update:
I still have this issue with a kernel panic after the insmod error (insmod: error inserting '/lib/raid456.ko': -1 File exists)--I get this line twice.

I've been able to find some instructions stating that I need to rebuild the initrd with mkinitrd, but when I boot through the rescue disk, I can't seem to mount any of my hard drives (I am sure I am doing something wrong).

Can anyone help walk me through this?

Thanks!
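A rough sketch of the mkinitrd step, in case it helps. It assumes the rescue environment has mounted the installed system at /mnt/sysimage (the usual CentOS rescue location), which in turn requires the root array to assemble. The helper name and the kernel version in the usage line are illustrative only; the real version is whatever is listed under /lib/modules on the installed system. The function only prints the commands so they can be reviewed before running anything.

```shell
#!/bin/sh
# Print (do not run) the commands to rebuild an initrd from rescue mode.
# Assumption: the installed system is mounted at /mnt/sysimage.
# $1 = kernel version of the INSTALLED system; "uname -r" inside the
#      rescue environment reports the rescue kernel, not the installed one.
initrd_rebuild_cmds() {
    kver="$1"
    if [ -z "$kver" ]; then
        echo "usage: initrd_rebuild_cmds KERNEL_VERSION" >&2
        return 1
    fi
    # mkinitrd lives on the installed system, so chroot into it first.
    echo "chroot /mnt/sysimage"
    # Keep a copy of the old image in case the new one does not boot.
    echo "cp /boot/initrd-$kver.img /boot/initrd-$kver.img.bak"
    # -f overwrites the existing image of the same name.
    echo "mkinitrd -f /boot/initrd-$kver.img $kver"
}
```

For example, `initrd_rebuild_cmds 2.6.18-128.el5` prints the three commands for that (example) kernel version. mkinitrd is part of the installed system rather than the rescue image, which is why the chroot comes first.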
 
Old 05-26-2009, 08:49 PM   #3
chrism01
Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Centos 6.5, Centos 5.10
Posts: 16,289

Rep: Reputation: 2034
You can make a new initrd without using the boot disk.
Log in as root and use xterm.

Anyway, you'll probably find /etc/modprobe.conf has multiple entries for that driver. You need to edit that file as root.
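One quick way to spot repeated entries is sketched below. The function name is made up, and it only catches lines that are exact duplicates; two different aliases pointing at the same driver would still need a manual look.

```shell
#!/bin/sh
# List lines that occur more than once in a config file
# (default: /etc/modprobe.conf). Blank lines and comment lines are
# skipped. Only exact duplicates are reported, one copy of each.
dup_lines() {
    file="${1:-/etc/modprobe.conf}"
    grep -v -e '^[[:space:]]*$' -e '^[[:space:]]*#' "$file" | sort | uniq -d
}
```

From a rescue shell the file would normally be under the mounted system, for example `dup_lines /mnt/sysimage/etc/modprobe.conf`, assuming the root filesystem could be mounted there.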
 
Old 05-26-2009, 09:24 PM   #4
ABL
Member
 
Registered: Mar 2005
Location: NYC
Distribution: CentOS 5
Posts: 54

Original Poster
Rep: Reputation: 16
Quote:
Originally Posted by chrism01 View Post
You can make a new initrd without using the boot disk.
Log in as root and use xterm.

Anyway, you'll probably find /etc/modprobe.conf has multiple entries for that driver. You need to edit that file as root.
I can't log in; my RAID crashed, and I can't bring the computer up. I *can* bring up md0 (/boot, RAID1) under linux rescue, but I can't seem to modify anything, as I don't have mkinitrd available to me.

The only things I can think of doing at this point are:
1) start over and install CentOS 5.3 (I will only lose some settings, as my /home is backed up)
2) copy the /boot directory to a usb drive, modify it on another machine and then copy it back, and see if that works (any chance? Can I even get the correct initrd on another machine if it's not also 64-bit?)

Also, I cannot seem to bring up md1 (/) and md2 (/home), both raid5. Whenever I try, it says that I have 2 drives with 1 spare, so it can't bring up the array! If I can get this working (hints, anyone?), can I just reinstall CentOS to /boot and get it all working again?

I think that if I can't get this working tomorrow, I'll have to go with starting from scratch, as I need this server back up (right now, we are all logging into an offsite server, which had been replicated using unison).

Thanks for any help.
 
Old 05-26-2009, 11:41 PM   #5
chrism01
Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Centos 6.5, Centos 5.10
Posts: 16,289

Rep: Reputation: 2034
Use the linux rescue mode to start the box, mount the root system (it will probably do that for you), then check that file (/etc/modprobe.conf).
 
1 members found this post helpful.
Old 05-27-2009, 09:21 AM   #6
ABL
Member
 
Registered: Mar 2005
Location: NYC
Distribution: CentOS 5
Posts: 54

Original Poster
Rep: Reputation: 16
Quote:
Originally Posted by chrism01 View Post
Use the linux rescue mode to start the box, mount the root system (it will probably do that for you), then check that file (/etc/modprobe.conf).
I don't think I can do that. I cannot mount the root filesystem, as it is in md1 (a raid5 array that failed). I cannot seem to get md1 back up. When I try, using:
mdadm --examine --scan /dev/sda >>/etc/mdadm.conf
then add DEVICES partitions to the top and devices=/dev/sda1,/dev/sdb1,/dev/sdc1, missing

and run mdadm -A -s
I get:
mdadm: /dev/md1 assembled from 2 drives and 1 spare - not enough to start the array.

Am I dead in the water, then?
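A sketch of what a forced assembly could look like, for reference. This is not a guaranteed fix: a device that md lists as a spare holds no data, so forcing only helps if enough members with real data are still readable. The helper name is made up, and the member partitions in the usage line are placeholders, since the thread does not say which partitions belong to md1. The function prints the commands rather than running them, because `--force` on the wrong set of members can make things worse.

```shell
#!/bin/sh
# Print (do not run) a forced-assembly sequence for a degraded array.
# $1 = md device; remaining arguments = member partitions believed to
# hold data. Assumption: a person has compared the "Events" counters in
# the --examine output and judged the members close enough to force.
force_assemble_cmds() {
    md="$1"
    if [ "$#" -lt 2 ]; then
        echo "usage: force_assemble_cmds MD_DEVICE MEMBER..." >&2
        return 1
    fi
    shift
    # Inspect each member's superblock first (state, role, event count).
    for dev in "$@"; do
        echo "mdadm --examine $dev"
    done
    # A half-assembled array must be stopped before retrying.
    echo "mdadm --stop $md"
    # --force accepts members with slightly stale event counts;
    # --run starts the array even though it is degraded.
    echo "mdadm --assemble --force --run $md $*"
}
```

Usage would be along the lines of `force_assemble_cmds /dev/md1 /dev/sdX2 /dev/sdY2 /dev/sdZ2`, with the real member partitions substituted. Taking an image of the drives before forcing anything is the safer order of operations.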
 
Old 05-27-2009, 08:01 PM   #7
chrism01
Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Centos 6.5, Centos 5.10
Posts: 16,289

Rep: Reputation: 2034
RAID 5 requires a minimum of 3 active disks, plus 0 or more spares.
So, you need to set all the disks as active, no spares. That should get you back up and running ... speaking of which, do a backup asap afterwards.
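The arithmetic behind this can be sketched as a small check. The function name is made up; the 3-disk minimum is the figure given above, and the rule that RAID 5 tolerates exactly one missing member is standard. Spares do not count toward the total, which is why "2 drives and 1 spare" is not enough if the array has four slots, as the "2/4 failed" message earlier in the thread suggests.

```shell
#!/bin/sh
# Succeeds (exit status 0) if a RAID 5 array with $1 member slots can
# start with $2 of them present and in sync. Spares are not counted:
# they hold no data until a rebuild onto them has finished.
raid5_can_start() {
    slots="$1"
    in_sync="$2"
    # At least 3 slots (the minimum stated above), and at most one
    # member missing, because RAID 5 stores a single parity block
    # per stripe.
    [ "$slots" -ge 3 ] && [ "$in_sync" -ge $((slots - 1)) ]
}
```

For example, `raid5_can_start 4 3` succeeds while `raid5_can_start 4 2` fails, which matches the "not enough to start the array" message.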
 
1 members found this post helpful.
  


All times are GMT -5. The time now is 07:04 AM.