LinuxQuestions.org
03-03-2011, 09:02 AM   #1
BTG308
LQ Newbie
 
Registered: Mar 2011
Posts: 1

Rep: Reputation: 0
RAID-6 issues


I have a 12-disk RAID-6 array set up on commodity hardware. It had been running fine for a few weeks until yesterday, when one of the disks failed. I suspected a faulty cable, so I replaced it. While doing that, I noticed that I had put the cables in the "wrong" order when installing, so I swapped them around, since I wanted to know which disk was connected to which interface. I assumed the RAID would use only the disks' UUIDs and not care which port they were on.

When I brought the array back up, it found 10 disks and one spare (the faulty one), with one disk out of the array. I tried adding the lone disk and let it run for a while. When I next looked at the reconstruction, the estimated time remaining was counting up. I retried, with the same result. Around here, my old Windows roots took over and made me reboot; I guess I thought the kernel was confused and needed to re-read the disks or something. When it came back up, it found 8 disks, two spares, none missing. I went to bed.

Today, I swapped the two disks whose cables had been swapped and tried again. Now it finds 10 of the disks, all as spares:

Code:
root@baloo:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md0 : inactive sdl1[12](S) sdg1[5](S) sdh1[8](S) sdb1[1](S) sdk1[9](S) sdd1[7](S) sdc1[2](S) sdm1[13](S) sdf1[4](S) sdi1[6](S)
      4883864320 blocks
       
unused devices: <none>
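To make the member list easier to read, the md0 line can be split into one device per row and sorted by the bracketed number. This is only a minimal sketch using standard tools; the function name list_members is made up for illustration, and the sample line is copied from the output above.

```shell
# Sample line copied verbatim from the /proc/mdstat output above.
mdstat_line='md0 : inactive sdl1[12](S) sdg1[5](S) sdh1[8](S) sdb1[1](S) sdk1[9](S) sdd1[7](S) sdc1[2](S) sdm1[13](S) sdf1[4](S) sdi1[6](S)'

# list_members (illustrative name): print "<number> <device> <flags>"
# for every "dev[n](flags)" token, sorted numerically by <number>.
list_members() {
    printf '%s\n' "$1" |
        tr ' ' '\n' |
        sed -n 's/^\([a-z]*[0-9]*\)\[\([0-9]*\)]\(.*\)$/\2 \1 \3/p' |
        sort -n
}

list_members "$mdstat_line"
```

On this output it prints ten rows, all flagged (S). The numbers 0, 3, 10 and 11 do not appear, and neither sde1 nor sdj1 is listed.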
I figured I'd try pushing a little harder to see what happened:

Code:
root@baloo:~# mdadm --assemble --force /dev/md0
mdadm: forcing event count in /dev/sdd1(7) from 68545 upto 68574
mdadm: Cannot open /dev/sdj1: Device or resource busy
dmsetup table was clean, so I thought maybe sdj1 needed an even harder nudge and zeroed its superblock:

Code:
root@baloo:~# mdadm --misc --zero-superblock /dev/sdj1
root@baloo:~# mdadm --assemble --force /dev/md0
mdadm: clearing FAULTY flag for device 7 in /dev/md0 for /dev/sdd1
mdadm: SET_ARRAY_INFO failed for /dev/md0: Device or resource busy
Oh dear. (sdd is the previously faulty disk, whose failure may or may not have been a cable error.) Right about now, I finally realize that I am trying very hard to dig myself out of a hole. So, let's see where we're at right now:

Code:
root@baloo:~# mdadm --assemble --force /dev/md0  --update=summaries --verbose
mdadm: looking for devices for /dev/md0
mdadm: no RAID superblock on /dev/sdj1
mdadm: /dev/sdj1 has wrong raid level.
mdadm: /dev/dm-3 is not one of /dev/sdd1,/dev/sdh1,/dev/sdi1,/dev/sdj1,/dev/sdl1,/dev/sdb1,/dev/sdc1,/dev/sdf1,/dev/sdg1,/dev/sdg1,/dev/sdm1,/dev/sdk1,/dev/sdn1
mdadm: /dev/dm-2 is not one of /dev/sdd1,/dev/sdh1,/dev/sdi1,/dev/sdj1,/dev/sdl1,/dev/sdb1,/dev/sdc1,/dev/sdf1,/dev/sdg1,/dev/sdg1,/dev/sdm1,/dev/sdk1,/dev/sdn1
mdadm: /dev/dm-1 is not one of /dev/sdd1,/dev/sdh1,/dev/sdi1,/dev/sdj1,/dev/sdl1,/dev/sdb1,/dev/sdc1,/dev/sdf1,/dev/sdg1,/dev/sdg1,/dev/sdm1,/dev/sdk1,/dev/sdn1
mdadm: /dev/dm-0 is not one of /dev/sdd1,/dev/sdh1,/dev/sdi1,/dev/sdj1,/dev/sdl1,/dev/sdb1,/dev/sdc1,/dev/sdf1,/dev/sdg1,/dev/sdg1,/dev/sdm1,/dev/sdk1,/dev/sdn1
mdadm: /dev/sdm is not one of /dev/sdd1,/dev/sdh1,/dev/sdi1,/dev/sdj1,/dev/sdl1,/dev/sdb1,/dev/sdc1,/dev/sdf1,/dev/sdg1,/dev/sdg1,/dev/sdm1,/dev/sdk1,/dev/sdn1
mdadm: /dev/sdl is not one of /dev/sdd1,/dev/sdh1,/dev/sdi1,/dev/sdj1,/dev/sdl1,/dev/sdb1,/dev/sdc1,/dev/sdf1,/dev/sdg1,/dev/sdg1,/dev/sdm1,/dev/sdk1,/dev/sdn1
mdadm: /dev/sdk is not one of /dev/sdd1,/dev/sdh1,/dev/sdi1,/dev/sdj1,/dev/sdl1,/dev/sdb1,/dev/sdc1,/dev/sdf1,/dev/sdg1,/dev/sdg1,/dev/sdm1,/dev/sdk1,/dev/sdn1
mdadm: no RAID superblock on /dev/sdj1
mdadm: /dev/sdj1 has wrong raid level.
mdadm: /dev/sdj is not one of /dev/sdd1,/dev/sdh1,/dev/sdi1,/dev/sdj1,/dev/sdl1,/dev/sdb1,/dev/sdc1,/dev/sdf1,/dev/sdg1,/dev/sdg1,/dev/sdm1,/dev/sdk1,/dev/sdn1
mdadm: /dev/sdi is not one of /dev/sdd1,/dev/sdh1,/dev/sdi1,/dev/sdj1,/dev/sdl1,/dev/sdb1,/dev/sdc1,/dev/sdf1,/dev/sdg1,/dev/sdg1,/dev/sdm1,/dev/sdk1,/dev/sdn1
mdadm: /dev/sdh is not one of /dev/sdd1,/dev/sdh1,/dev/sdi1,/dev/sdj1,/dev/sdl1,/dev/sdb1,/dev/sdc1,/dev/sdf1,/dev/sdg1,/dev/sdg1,/dev/sdm1,/dev/sdk1,/dev/sdn1
mdadm: /dev/sdg is not one of /dev/sdd1,/dev/sdh1,/dev/sdi1,/dev/sdj1,/dev/sdl1,/dev/sdb1,/dev/sdc1,/dev/sdf1,/dev/sdg1,/dev/sdg1,/dev/sdm1,/dev/sdk1,/dev/sdn1
mdadm: /dev/sdf is not one of /dev/sdd1,/dev/sdh1,/dev/sdi1,/dev/sdj1,/dev/sdl1,/dev/sdb1,/dev/sdc1,/dev/sdf1,/dev/sdg1,/dev/sdg1,/dev/sdm1,/dev/sdk1,/dev/sdn1
mdadm: /dev/sde1 is not one of /dev/sdd1,/dev/sdh1,/dev/sdi1,/dev/sdj1,/dev/sdl1,/dev/sdb1,/dev/sdc1,/dev/sdf1,/dev/sdg1,/dev/sdg1,/dev/sdm1,/dev/sdk1,/dev/sdn1
mdadm: /dev/sde is not one of /dev/sdd1,/dev/sdh1,/dev/sdi1,/dev/sdj1,/dev/sdl1,/dev/sdb1,/dev/sdc1,/dev/sdf1,/dev/sdg1,/dev/sdg1,/dev/sdm1,/dev/sdk1,/dev/sdn1
mdadm: /dev/sdd is not one of /dev/sdd1,/dev/sdh1,/dev/sdi1,/dev/sdj1,/dev/sdl1,/dev/sdb1,/dev/sdc1,/dev/sdf1,/dev/sdg1,/dev/sdg1,/dev/sdm1,/dev/sdk1,/dev/sdn1
mdadm: /dev/sdc is not one of /dev/sdd1,/dev/sdh1,/dev/sdi1,/dev/sdj1,/dev/sdl1,/dev/sdb1,/dev/sdc1,/dev/sdf1,/dev/sdg1,/dev/sdg1,/dev/sdm1,/dev/sdk1,/dev/sdn1
mdadm: /dev/sdb is not one of /dev/sdd1,/dev/sdh1,/dev/sdi1,/dev/sdj1,/dev/sdl1,/dev/sdb1,/dev/sdc1,/dev/sdf1,/dev/sdg1,/dev/sdg1,/dev/sdm1,/dev/sdk1,/dev/sdn1
mdadm: /dev/sda5 is not one of /dev/sdd1,/dev/sdh1,/dev/sdi1,/dev/sdj1,/dev/sdl1,/dev/sdb1,/dev/sdc1,/dev/sdf1,/dev/sdg1,/dev/sdg1,/dev/sdm1,/dev/sdk1,/dev/sdn1
mdadm: /dev/sda2 is not one of /dev/sdd1,/dev/sdh1,/dev/sdi1,/dev/sdj1,/dev/sdl1,/dev/sdb1,/dev/sdc1,/dev/sdf1,/dev/sdg1,/dev/sdg1,/dev/sdm1,/dev/sdk1,/dev/sdn1
mdadm: /dev/sda1 is not one of /dev/sdd1,/dev/sdh1,/dev/sdi1,/dev/sdj1,/dev/sdl1,/dev/sdb1,/dev/sdc1,/dev/sdf1,/dev/sdg1,/dev/sdg1,/dev/sdm1,/dev/sdk1,/dev/sdn1
mdadm: /dev/sda is not one of /dev/sdd1,/dev/sdh1,/dev/sdi1,/dev/sdj1,/dev/sdl1,/dev/sdb1,/dev/sdc1,/dev/sdf1,/dev/sdg1,/dev/sdg1,/dev/sdm1,/dev/sdk1,/dev/sdn1
mdadm: /dev/sdb1 is identified as a member of /dev/md0, slot 1.
mdadm: /dev/sdc1 is identified as a member of /dev/md0, slot 2.
mdadm: /dev/sdf1 is identified as a member of /dev/md0, slot 4.
mdadm: /dev/sdg1 is identified as a member of /dev/md0, slot 5.
mdadm: /dev/sdg1 is identified as a member of /dev/md0, slot 5.
mdadm: /dev/sdm1 is identified as a member of /dev/md0, slot 13.
mdadm: /dev/sdk1 is identified as a member of /dev/md0, slot 9.
mdadm: /dev/sdd1 is identified as a member of /dev/md0, slot 7.
mdadm: /dev/sdh1 is identified as a member of /dev/md0, slot 8.
mdadm: /dev/sdi1 is identified as a member of /dev/md0, slot 6.
mdadm: /dev/sdl1 is identified as a member of /dev/md0, slot 12.
Segmentation fault
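The device list repeated in every "is not one of" line can be checked mechanically for repeated entries. A small sketch, assuming the list is copied verbatim from the output above; find_duplicates is an illustrative name, not an mdadm feature.

```shell
# Device list copied verbatim from the "is not one of" lines above.
device_list='/dev/sdd1,/dev/sdh1,/dev/sdi1,/dev/sdj1,/dev/sdl1,/dev/sdb1,/dev/sdc1,/dev/sdf1,/dev/sdg1,/dev/sdg1,/dev/sdm1,/dev/sdk1,/dev/sdn1'

# find_duplicates (illustrative name): print each entry of a
# comma-separated list that occurs more than once.
find_duplicates() {
    printf '%s\n' "$1" | tr ',' '\n' | sort | uniq -d
}

find_duplicates "$device_list"
```

This prints /dev/sdg1, which appears twice in the list, matching the repeated "slot 5" line in the verbose output. The list also contains /dev/sdn1 but not /dev/sde1.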
Code:
root@baloo:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md0 : inactive sdl1[12](S) sdi1[6](S) sdh1[8](S) sdd1[7](S) sdc1[2](S) sdm1[13](S) sdb1[1](S) sdk1[9](S) sdg1[5](S) sdf1[4](S)
      4883864320 blocks
       
unused devices: <none>
root@baloo:~# uname -a
Linux baloo 2.6.35-27-server #48-Ubuntu SMP Tue Feb 22 21:53:16 UTC 2011 x86_64 GNU/Linux
root@baloo:~# mdadm --version
mdadm - v2.6.7.1 - 15th October 2008
Code:
[ 2897.030447] md: bind<sdf1>
[ 2897.055035] md: bind<sdg1>
[ 2897.101455] md: bind<sdk1>
[ 2897.118990] md: bind<sdb1>
[ 2897.148076] md: bind<sdm1>
[ 2897.333941] md: bind<sdc1>
[ 2897.525613] md: bind<sdd1>
[ 2897.573990] md: bind<sdh1>
[ 2898.036870] mdadm[3389]: segfault at 4 ip 000000000041823d sp 00007fff1f1c7ed0 error 4 in mdadm[400000+2a000]
[ 2898.044518] md: bind<sdi1>
[ 2898.246323] md: bind<sdl1>

What I would like to do is force an assembly of all disks, without risking a re-sync since I'm pretty sure at least 11 of the 12 disks have good data. Any ideas?
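Before forcing anything, one read-only way to check the "at least 11 of the 12 disks have good data" assumption is to compare the event counter stored in each member's superblock. The sketch below assumes that mdadm --examine prints a line containing "Events :" (the exact format may differ between mdadm and metadata versions); show_events and the MDADM variable are hypothetical names introduced here, not part of mdadm.

```shell
# MDADM is a hypothetical override hook so the loop can be tried
# without root; by default it runs the real mdadm binary.
MDADM=${MDADM:-mdadm}

# show_events (illustrative name): print "<device> <event count>" for
# each device given, or "unknown" when no Events line is found
# (for example, a device whose superblock was zeroed).
show_events() {
    for dev in "$@"; do
        # --examine only reads the superblock; it does not write to the device.
        events=$("$MDADM" --examine "$dev" 2>/dev/null |
            awk '/Events :/ { print $NF; exit }')
        printf '%s %s\n' "$dev" "${events:-unknown}"
    done
}

# Example (as root): show_events /dev/sd[b-n]1
```

Members whose counters agree with the highest value are the ones most likely to be in sync; sdj1 would presumably show up as unknown here, since its superblock was zeroed.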
 
  


All times are GMT -5. The time now is 12:17 PM.
