LinuxQuestions.org


xonogenic 12-21-2010 10:11 AM

mdadm acting oddly with RAID 5 array
 
I have been having some odd issues over the last day or so while trying to get a software RAID 5 array running under Kubuntu.
I installed three 1 TB drives and booted, and my sd* device order got shuffled (sda was now sdc, and so on). This wasn't entirely unexpected, so I fixed up fstab and booted again. I found all three of the new drives, set their partitions to RAID autodetect, and used mdadm to create /dev/md0. I then created mdadm.conf by redirecting the output of mdadm --detail --scan --verbose into /etc/mdadm.conf.

At this point, everything was still going swimmingly. I copied over a few hundred GB of data from another failing drive and everything seemed ok.

I went to reboot once the copy was done and everything went weird. All of the sd* names went back to the original order. Of course, this meant that mdadm.conf was wrong. I tried just changing the device list, but that didn't work. I then deleted mdadm.conf and rebooted. The drive list stayed in the original order this time, so I tried starting the array manually.

By erasing the partition table of the third drive, I've been able to get it to show up as a spare, but mdadm says it is busy when I try to add it to the array. A grep through dmesg makes me think that md has a lock on it. I'm not sure where to go from here. If anyone has any pointers, I would like to hear them.

Thanks in advance.

Device List (original):

/dev/sda => boot drive, /home, /
/dev/sdb => 1.5TB media storage, failing
/dev/sdc => 1 TB raid element
/dev/sdd => 1 TB raid element
/dev/sde => 1 TB raid element

Device List (changed):

/dev/sda => 1 TB raid element
/dev/sdb => 1 TB raid element
/dev/sdc => boot drive, /home, /
/dev/sdd => 1.5TB media storage, failing
/dev/sde => 1 TB raid element

Code:

mdadm:

root@butters:/home/colin# mdadm --examine /dev/sde1
/dev/sde1:
          Magic : a92b4efc
        Version : 00.90.00
          UUID : 16b4df56:8f2d6c64:f26de854:b79c2b58 (local to host butters)
  Creation Time : Sun Dec 19 14:02:12 2010
    Raid Level : raid5
  Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
    Array Size : 1953519872 (1863.02 GiB 2000.40 GB)
  Raid Devices : 3
  Total Devices : 3
Preferred Minor : 0

    Update Time : Sun Dec 19 14:02:12 2010
          State : clean
 Active Devices : 2
Working Devices : 3
 Failed Devices : 1
  Spare Devices : 1
      Checksum : cd6d899b - correct
        Events : 2

        Layout : left-symmetric
    Chunk Size : 64K

      Number  Major  Minor  RaidDevice State
this    3      8      65        3      spare  /dev/sde1

  0    0      8        1        0      active sync  /dev/sda1
  1    1      8      17        1      active sync  /dev/sdb1
  2    2      0        0        2      faulty removed
  3    3      8      65        3      spare  /dev/sde1
root@butters:/home/colin# mdadm /dev/md0 --add /dev/sde1
mdadm: Cannot open /dev/sde1: Device or resource busy


root@butters:/home/colin# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid5 sda1[0] sdb1[1]
      1953519872 blocks level 5, 64k chunk, algorithm 2 [3/2] [UU_]
     
md_d0 : inactive sde1[3](S)
      976759936 blocks
     
unused devices: <none>

dmesg:

root@butters:/home/colin# dmesg | grep sde1
[    2.369522]  sde: sde1
[  17.072829] md: bind<sde1>
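
Looking at the /proc/mdstat output above, sde1 is listed under a second, inactive array (md_d0), which would explain the "Device or resource busy" error: the partition is already claimed by md. Here is a small sketch for spotting that situation. inactive_members is a hypothetical helper name, and it assumes the mdstat layout shown above.

```shell
#!/bin/sh
# Read mdstat-format text on stdin and print "array member" for every
# member of an inactive md array. Suffixes such as "[3](S)" are stripped.
# inactive_members is a hypothetical helper name.
inactive_members() {
    awk '$2 == ":" && $3 == "inactive" {
        for (i = 4; i <= NF; i++) {
            dev = $i
            sub(/\[.*/, "", dev)   # "sde1[3](S)" becomes "sde1"
            print $1, dev
        }
    }'
}

# Intended use on a live system:
#   inactive_members < /proc/mdstat
# If the stray array really is what holds the partition, the usual next
# step (as root, untested on this machine) would be:
#   mdadm --stop /dev/md_d0
#   mdadm /dev/md0 --add /dev/sde1
```

Fed the mdstat text above, this prints "md_d0 sde1".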


xonogenic 12-21-2010 10:16 AM

Hmm, grepping dmesg for sde instead of sde1 turns up a lot more. It looks like my disk might be DOA:

root@butters:/home/colin# dmesg | grep sde
[ 2.369406] sd 6:0:0:0: [sde] 1953525168 512-byte logical blocks: (1.00 TB/931 GiB)
[ 2.369429] sd 6:0:0:0: [sde] Write Protect is off
[ 2.369431] sd 6:0:0:0: [sde] Mode Sense: 00 3a 00 00
[ 2.369441] sd 6:0:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 2.369522] sde: sde1
[ 2.370863] sd 6:0:0:0: [sde] Attached SCSI disk
[ 17.072829] md: bind<sde1>
[23603.802516] sd 6:0:0:0: [sde] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[23603.802524] sd 6:0:0:0: [sde] Sense Key : Aborted Command [current] [descriptor]
[23603.802569] sd 6:0:0:0: [sde] Add. Sense: No additional sense information
[23603.802578] sd 6:0:0:0: [sde] CDB: Write(10): 2a 00 00 00 04 00 00 04 00 00
[23603.802597] end_request: I/O error, dev sde, sector 1024
[23603.802606] Buffer I/O error on device sde, logical block 128
[23603.802611] lost page write due to I/O error on sde
[23603.802621] Buffer I/O error on device sde, logical block 129
[23603.802626] lost page write due to I/O error on sde
[23603.802632] Buffer I/O error on device sde, logical block 130
[23603.802636] lost page write due to I/O error on sde
[23603.802642] Buffer I/O error on device sde, logical block 131
[23603.802647] lost page write due to I/O error on sde
[23603.802653] Buffer I/O error on device sde, logical block 132
[23603.802657] lost page write due to I/O error on sde
[23603.802663] Buffer I/O error on device sde, logical block 133
[23603.802667] lost page write due to I/O error on sde
[23603.802673] Buffer I/O error on device sde, logical block 134
[23603.802677] lost page write due to I/O error on sde
[23603.802683] Buffer I/O error on device sde, logical block 135
[23603.802687] lost page write due to I/O error on sde
[23603.802693] Buffer I/O error on device sde, logical block 136
[23603.802697] lost page write due to I/O error on sde
[23603.802703] Buffer I/O error on device sde, logical block 137
[23603.802707] lost page write due to I/O error on sde
[23603.802916] sd 6:0:0:0: [sde] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[23603.802923] sd 6:0:0:0: [sde] Sense Key : Aborted Command [current] [descriptor]
[23603.802964] sd 6:0:0:0: [sde] Add. Sense: No additional sense information
[23603.802972] sd 6:0:0:0: [sde] CDB: Write(10): 2a 00 00 00 00 00 00 04 00 00
[23603.802989] end_request: I/O error, dev sde, sector 0
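
For reading the CDB lines above: assuming the standard SCSI WRITE(10) layout (opcode 0x2a, a 4-byte big-endian LBA in bytes 2-5, a 2-byte transfer length in bytes 7-8), the failed writes can be decoded with a short sketch like this. decode_write10 and sector_to_block are illustrative names, and the 4096-byte buffer block size is an assumption that happens to match the numbers in the log.

```shell
#!/bin/sh
# Decode the LBA and transfer length from a SCSI WRITE(10) CDB given as
# ten space-separated hex bytes. Hypothetical helper for reading dmesg.
decode_write10() {
    set -- $1                       # split the string into ten bytes
    [ "$#" -eq 10 ] && [ "$1" = "2a" ] || {
        echo "not a WRITE(10) CDB" >&2
        return 1
    }
    lba=$(( 0x$3$4$5$6 ))           # bytes 2-5: logical block address
    len=$(( 0x$8$9 ))               # bytes 7-8: transfer length in blocks
    echo "lba=$lba blocks=$len"
}

# Convert a 512-byte sector number to a buffer block number.
# The block size defaults to 4096 bytes (an assumption, see above).
sector_to_block() {
    echo $(( $1 * 512 / ${2:-4096} ))
}

decode_write10 "2a 00 00 00 04 00 00 04 00 00"   # lba=1024 blocks=1024
decode_write10 "2a 00 00 00 00 00 00 04 00 00"   # lba=0 blocks=1024
sector_to_block 1024                             # 128
```

So both aborted commands are 1024-sector writes, one starting at sector 1024 and one at sector 0, which lines up with the "I/O error, dev sde, sector 1024" and "logical block 128" messages. An aborted write does not by itself say whether the drive, the cable, or the controller is at fault, so a SMART report (smartctl -a /dev/sde, from smartmontools) would be a reasonable next check.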

