LinuxQuestions.org
Old 06-11-2012, 05:24 PM   #1
krayian
LQ Newbie
 
Registered: Jun 2012
Posts: 1

mdadm Raid 5 messup, disk out of sync


I have an mdadm RAID 5 array consisting of four 2 TB disks. One of the disks (I'll call it sdd) suddenly dropped out of the array, and dmesg filled up with messages like this:
Code:
[4150866.564208] ata7.00: configured for UDMA/33
[4150866.564289] ata7: EH complete
[4150868.264686] ata7: exception Emask 0x10 SAct 0x0 SErr 0x41c0000 action 0xe frozen
[4150868.269575] ata7: irq_stat 0x00000040, connection status changed
[4150868.274485] ata7: SError: { CommWake 10B8B Dispar DevExch }
[4150868.279347] ata7: hard resetting link
[4150869.164974] ata7: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
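For readers trying to interpret the SErr value in the log above, here is a minimal sketch that decodes the diagnostic bits of a SATA SError register. The bit-to-label table is an assumption reproduced from memory of the Linux libata driver, and `decode_serror` is a hypothetical helper name; only the four labels that appear in the log line are confirmed by the post itself.

```python
# Diagnostic bits (16-26) of the SATA SError register, with the short
# labels libata prints. ASSUMPTION: this table is written from memory of
# the kernel source; only CommWake, 10B8B, Dispar and DevExch are
# confirmed by the dmesg excerpt in this post.
SERR_DIAG_BITS = {
    16: "PHYRdyChg",
    17: "PHYInt",
    18: "CommWake",
    19: "10B8B",
    20: "Dispar",
    21: "BadCRC",
    22: "Handshk",
    23: "LinkSeq",
    24: "TrStaTrns",
    25: "UnrecFIS",
    26: "DevExch",
}


def decode_serror(value):
    """Return the labels of all diagnostic bits set in an SError value."""
    return [name for bit, name in sorted(SERR_DIAG_BITS.items())
            if value & (1 << bit)]


if __name__ == "__main__":
    # SErr value taken from the dmesg line above.
    print(decode_serror(0x41C0000))
```

All four flags set here describe link-level events rather than media errors, which would be consistent with a controller, cable or connector problem, though it does not prove one.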
I took the disk out and tested it, and it worked correctly. I suspect something is broken in the disk controller, but I tried rebooting the system anyway. Everything seemed to work with the degraded array, and I was also able to access sdd. Then I did something I apparently shouldn't have: I added sdd back into the array. Not only did it fail immediately, it also took sda down with it. Here is the current output of mdadm --examine for each of the three remaining disks (sda, sdb and sdc):
Code:
/dev/sda:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 5f3a2cc7:6540fd9f:07bf9e84:f3abc916
           Name : saya:0
  Creation Time : Thu Jan  6 11:50:57 2011
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 3907027120 (1863.02 GiB 2000.40 GB)
     Array Size : 11721077760 (5589.05 GiB 6001.19 GB)
  Used Dev Size : 3907025920 (1863.02 GiB 2000.40 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : active
    Device UUID : 3d53fcaf:af4b5b7f:90016b85:82872480

    Update Time : Mon Jun 11 21:46:02 2012
       Checksum : b01b3b28 - correct
         Events : 12147

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 0
   Array State : AAAA ('A' == active, '.' == missing)
/dev/sdb:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 5f3a2cc7:6540fd9f:07bf9e84:f3abc916
           Name : saya:0
  Creation Time : Thu Jan  6 11:50:57 2011
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 3907027120 (1863.02 GiB 2000.40 GB)
     Array Size : 11721077760 (5589.05 GiB 6001.19 GB)
  Used Dev Size : 3907025920 (1863.02 GiB 2000.40 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 8e215baf:e3628767:01e270fd:88549fc7

    Update Time : Mon Jun 11 21:46:55 2012
       Checksum : 5728ce5d - correct
         Events : 12155

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 1
   Array State : .AA. ('A' == active, '.' == missing)
/dev/sdc:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 5f3a2cc7:6540fd9f:07bf9e84:f3abc916
           Name : saya:0
  Creation Time : Thu Jan  6 11:50:57 2011
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 3907027120 (1863.02 GiB 2000.40 GB)
     Array Size : 11721077760 (5589.05 GiB 6001.19 GB)
  Used Dev Size : 3907025920 (1863.02 GiB 2000.40 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 07cfe5ff:dcc51b2b:1b088f86:56e3e397

    Update Time : Mon Jun 11 21:46:55 2012
       Checksum : c4aa93b7 - correct
         Events : 12155

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 2
   Array State : .AA. ('A' == active, '.' == missing)
sda thinks the array is still intact (including sdd, which had just started rebuilding), while sdb and sdc know the array is missing two disks. sda's last update time is 53 seconds earlier than that of sdb and sdc, and its event count is 8 behind (12147 vs. 12155).
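To make the comparison between superblocks less error-prone, here is a minimal sketch that parses concatenated `mdadm --examine` output and reports how far each member lags behind the newest one. The function names `parse_examine` and `summarize` are hypothetical, the sample text is trimmed to the three fields used, and the parser assumes the `key : value` layout shown in the output above.

```python
import re
from datetime import datetime

# Trimmed copy of the --examine output from this post (three fields only).
SAMPLE = """\
/dev/sda:
    Update Time : Mon Jun 11 21:46:02 2012
         Events : 12147
   Array State : AAAA ('A' == active, '.' == missing)
/dev/sdb:
    Update Time : Mon Jun 11 21:46:55 2012
         Events : 12155
   Array State : .AA. ('A' == active, '.' == missing)
/dev/sdc:
    Update Time : Mon Jun 11 21:46:55 2012
         Events : 12155
   Array State : .AA. ('A' == active, '.' == missing)
"""


def parse_examine(text):
    """Parse concatenated `mdadm --examine` output into {device: {field: value}}."""
    devices, current = {}, None
    for line in text.splitlines():
        header = re.match(r"^(/dev/\S+):\s*$", line)
        if header:
            current = devices.setdefault(header.group(1), {})
        elif current is not None and " : " in line:
            key, value = line.split(" : ", 1)
            current[key.strip()] = value.strip()
    return devices


def summarize(devices):
    """Return (device, events, update_time, array_state) tuples, highest event count first."""
    rows = []
    for dev, fields in devices.items():
        updated = datetime.strptime(fields["Update Time"], "%a %b %d %H:%M:%S %Y")
        state = fields["Array State"].split()[0]
        rows.append((dev, int(fields["Events"]), updated, state))
    return sorted(rows, key=lambda r: r[1], reverse=True)


if __name__ == "__main__":
    rows = summarize(parse_examine(SAMPLE))
    newest_events = rows[0][1]
    newest_time = max(r[2] for r in rows)
    for dev, events, updated, state in rows:
        lag = int((newest_time - updated).total_seconds())
        print(f"{dev}: events={events} ({newest_events - events} behind), "
              f"updated {lag}s before newest, state={state}")
```

The script only reads text and never touches the disks, so it is safe to run against saved output while deciding what to do with the array.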

Now, I'm not entirely sure whether I should do anything with these disks on the current motherboard, or get new hardware first. Either way, is it possible to rescue this RAID? There shouldn't be any significant difference in state between the disks; they are just not completely in sync.
 
  


