12-21-2010, 10:11 AM   #1
xonogenic
Member (Registered: Feb 2006; Posts: 30)
mdadm acting oddly with RAID 5 array


I have been having some odd issues over the last day or so while trying to get a software RAID 5 array running under Kubuntu.
I installed three 1 TB drives and booted up, and my sd* device order got shuffled (sda was now sdc, and so on). This wasn't entirely unexpected, so I fixed up fstab and rebooted. I found all three of the drives I installed, set their partitions to RAID auto-detect, and used mdadm to create /dev/md0. I then created mdadm.conf by piping the output of mdadm --detail --scan --verbose into /etc/mdadm.conf.
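Roughly the sequence I used, reconstructed from memory, so the exact device names and options are approximate:

Code:

# partition each new disk and set the partition type to Linux raid autodetect (fd)
fdisk /dev/sdc    # repeated for /dev/sdd and /dev/sde

# create the three-disk RAID 5 array
mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sdc1 /dev/sdd1 /dev/sde1

# save the array definition
mdadm --detail --scan --verbose > /etc/mdadm.conf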

At this point, everything was still going swimmingly. I copied over a few hundred GB of data from another failing drive and everything seemed ok.

I rebooted once the copy was done, and that's when things went weird. All of the sd* devices went back to their original order, which of course meant that mdadm.conf was now wrong. I tried just changing the device list in it, but that didn't work, so I deleted mdadm.conf and rebooted. The drive list stayed in the original order this time, so I tried manually starting the array.
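By "manually starting" I mean something along these lines (again from memory, not an exact command history; the UUID is the one mdadm --examine reports below):

Code:

# assemble by scanning for superblocks instead of relying on device names
mdadm --assemble --scan

# or explicitly, by array UUID
mdadm --assemble /dev/md0 --uuid=16b4df56:8f2d6c64:f26de854:b79c2b58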

By erasing the partition table of the third drive, I've been able to get it to show up as a spare, but mdadm says the device is busy when I try to add it to the array. A grep through dmesg makes me think that md has a lock on it. I'm not sure where to go from here. If anyone has any pointers, I would like to hear them.
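My only theory so far, going by the /proc/mdstat output below, is that sde1 is being held by a stale, auto-assembled md_d0 device, and that something like the following would release it and let the add go through (untested, and assuming the disk itself is actually healthy):

Code:

# stop the stale, inactive array that is holding sde1
mdadm --stop /dev/md_d0

# wipe the old superblock so it is not auto-assembled again
mdadm --zero-superblock /dev/sde1

# re-add the partition to the real array
mdadm /dev/md0 --add /dev/sde1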

Thanks in advance.

Device list (original):

/dev/sda => boot drive, /home, /
/dev/sdb => 1.5 TB media storage, failing
/dev/sdc => 1 TB RAID element
/dev/sdd => 1 TB RAID element
/dev/sde => 1 TB RAID element

Device list (changed):

/dev/sda => 1 TB RAID element
/dev/sdb => 1 TB RAID element
/dev/sdc => boot drive, /home, /
/dev/sdd => 1.5 TB media storage, failing
/dev/sde => 1 TB RAID element

Code:
mdadm:

root@butters:/home/colin# mdadm --examine /dev/sde1
/dev/sde1:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 16b4df56:8f2d6c64:f26de854:b79c2b58 (local to host butters)
  Creation Time : Sun Dec 19 14:02:12 2010
     Raid Level : raid5
  Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
     Array Size : 1953519872 (1863.02 GiB 2000.40 GB)
   Raid Devices : 3
  Total Devices : 3
Preferred Minor : 0

    Update Time : Sun Dec 19 14:02:12 2010
          State : clean
 Active Devices : 2
Working Devices : 3
 Failed Devices : 1
  Spare Devices : 1
       Checksum : cd6d899b - correct
         Events : 2

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State                                                                                                             
this     3       8       65        3      spare   /dev/sde1                                                                                                 

   0     0       8        1        0      active sync   /dev/sda1
   1     1       8       17        1      active sync   /dev/sdb1
   2     2       0        0        2      faulty removed
   3     3       8       65        3      spare   /dev/sde1
root@butters:/home/colin# mdadm /dev/md0 --add /dev/sde1
mdadm: Cannot open /dev/sde1: Device or resource busy


root@butters:/home/colin# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md0 : active raid5 sda1[0] sdb1[1]
      1953519872 blocks level 5, 64k chunk, algorithm 2 [3/2] [UU_]
      
md_d0 : inactive sde1[3](S)
      976759936 blocks
       
unused devices: <none>

dmesg:

root@butters:/home/colin# dmesg | grep sde1
[    2.369522]  sde: sde1
[   17.072829] md: bind<sde1>
 
12-21-2010, 10:16 AM   #2
xonogenic
Original Poster
Hmm, changing what I'm looking for in dmesg changes things... looks like my disk might be DOA

Code:

root@butters:/home/colin# dmesg | grep sde
[ 2.369406] sd 6:0:0:0: [sde] 1953525168 512-byte logical blocks: (1.00 TB/931 GiB)
[ 2.369429] sd 6:0:0:0: [sde] Write Protect is off
[ 2.369431] sd 6:0:0:0: [sde] Mode Sense: 00 3a 00 00
[ 2.369441] sd 6:0:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 2.369522] sde: sde1
[ 2.370863] sd 6:0:0:0: [sde] Attached SCSI disk
[ 17.072829] md: bind<sde1>
[23603.802516] sd 6:0:0:0: [sde] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[23603.802524] sd 6:0:0:0: [sde] Sense Key : Aborted Command [current] [descriptor]
[23603.802569] sd 6:0:0:0: [sde] Add. Sense: No additional sense information
[23603.802578] sd 6:0:0:0: [sde] CDB: Write(10): 2a 00 00 00 04 00 00 04 00 00
[23603.802597] end_request: I/O error, dev sde, sector 1024
[23603.802606] Buffer I/O error on device sde, logical block 128
[23603.802611] lost page write due to I/O error on sde
[23603.802621] Buffer I/O error on device sde, logical block 129
[23603.802626] lost page write due to I/O error on sde
[23603.802632] Buffer I/O error on device sde, logical block 130
[23603.802636] lost page write due to I/O error on sde
[23603.802642] Buffer I/O error on device sde, logical block 131
[23603.802647] lost page write due to I/O error on sde
[23603.802653] Buffer I/O error on device sde, logical block 132
[23603.802657] lost page write due to I/O error on sde
[23603.802663] Buffer I/O error on device sde, logical block 133
[23603.802667] lost page write due to I/O error on sde
[23603.802673] Buffer I/O error on device sde, logical block 134
[23603.802677] lost page write due to I/O error on sde
[23603.802683] Buffer I/O error on device sde, logical block 135
[23603.802687] lost page write due to I/O error on sde
[23603.802693] Buffer I/O error on device sde, logical block 136
[23603.802697] lost page write due to I/O error on sde
[23603.802703] Buffer I/O error on device sde, logical block 137
[23603.802707] lost page write due to I/O error on sde
[23603.802916] sd 6:0:0:0: [sde] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[23603.802923] sd 6:0:0:0: [sde] Sense Key : Aborted Command [current] [descriptor]
[23603.802964] sd 6:0:0:0: [sde] Add. Sense: No additional sense information
[23603.802972] sd 6:0:0:0: [sde] CDB: Write(10): 2a 00 00 00 00 00 00 04 00 00
[23603.802989] end_request: I/O error, dev sde, sector 0
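I guess the next step is to confirm it against the drive's SMART data, something like this (assuming smartmontools is installed; I haven't run it yet):

Code:

# quick overall health verdict
smartctl -H /dev/sde

# full attribute dump and error log
smartctl -a /dev/sde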
 
  

