LinuxQuestions.org
Linux - Server: This forum is for the discussion of Linux software used in a server-related context.

Old 07-13-2009, 06:57 AM   #1
icrf (LQ Newbie; Registered: Sep 2005; Posts: 3)
Drive intermittently dropping from RAID5 array


I have a 9x320G RAID5 array that I am migrating over to a 3x1.5T RAID5 array. Intermittently, a drive would drop out of the older array and it would automatically start rebuilding. I thought it was a bad cable or controller somewhere, so when I bought the three new drives, I bought a new controller for them all, too.

I'm running both arrays side by side until I'm happy the new hardware is stable (one drive was DOA). Then I noticed one morning that both arrays were rebuilding themselves. This was in /var/log/messages:
Code:
Jul  5 00:30:19 mnemosyne -- MARK --
Jul  5 00:50:19 mnemosyne -- MARK --
Jul  5 01:06:02 mnemosyne kernel: md: syncing RAID array md0
Jul  5 01:06:02 mnemosyne kernel: md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.
Jul  5 01:06:02 mnemosyne kernel: md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reconstruction.
Jul  5 01:06:02 mnemosyne kernel: md: using 128k window, over a total of 312568576 blocks.
Jul  5 01:06:02 mnemosyne kernel: md: syncing RAID array md1
Jul  5 01:06:02 mnemosyne kernel: md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.
Jul  5 01:06:02 mnemosyne kernel: md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reconstruction.
Jul  5 01:06:02 mnemosyne kernel: md: using 128k window, over a total of 1465135936 blocks.
Jul  5 01:30:20 mnemosyne -- MARK --
Jul  5 01:50:20 mnemosyne -- MARK --
Each array is on a separate controller, and the three new drives are on a separate PSU as well, not using any of the nice drive cages I have for the older ones. Any idea what caused both arrays to rebuild at the same time? There was nothing in the logs prior to the above.
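In case it helps anyone suggest something, here's roughly how I've been watching the arrays. The sample text below is made up just to show what a resync in progress looks like; on the server itself you'd read /proc/mdstat directly (and `mdadm --detail /dev/md0` for per-disk state):

```shell
#!/bin/sh
# Made-up sample of /proc/mdstat output (on a live box: cat /proc/mdstat).
# "[UUU]" means every member is up; an underscore such as "[UU_]" would mark
# a dropped slot, and a "resync" progress line means md is rebuilding.
sample='md1 : active raid5 sdd1[2] sdc1[1] sdb1[0]
      2930271872 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]
      [=>...................]  resync =  5.0% (73257984/1465135936) finish=300.0min'

# Count how many lines in the sample show a resync in progress
printf '%s\n' "$sample" | grep -c 'resync'
```

The device names and sizes above are placeholders, not my actual layout.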
 
Old 07-17-2009, 06:04 PM   #2
icrf (Original Poster)
Any ideas? Any other useful information I can provide?
 
Old 09-06-2009, 09:43 AM   #3
icrf (Original Poster)
Got emails this morning that the arrays were rebuilding, and found this in syslog:
Code:
Sep  6 01:06:01 mnemosyne /USR/SBIN/CRON[4548]: (root) CMD ([ -x /usr/share/mdadm/checkarray ] && [ $(date +%d) -le 7 ] && /usr/share/mdadm/checkarray --cron --all --quiet)
Sep  6 01:06:02 mnemosyne kernel: md: syncing RAID array md0
Sep  6 01:06:02 mnemosyne kernel: md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.
Sep  6 01:06:02 mnemosyne kernel: md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reconstruction.
Sep  6 01:06:02 mnemosyne kernel: md: using 128k window, over a total of 312568576 blocks.
Sep  6 01:06:02 mnemosyne kernel: md: syncing RAID array md1
Sep  6 01:06:02 mnemosyne kernel: md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.
Sep  6 01:06:02 mnemosyne kernel: md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reconstruction.
So it looks like this is Debian-specific, as checkarray is a script written just for Debian. It runs at 1:06 AM in the first week of every month, which matches up with the last time I saw this, noted above, on July 5.
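From what I can tell, checkarray's core action is just writing "check" to each array's sync_action file in sysfs, which starts a read-and-compare scrub that the kernel logs as "md: syncing RAID array" -- the same wording as a real rebuild. A rough sketch of the mechanism (the /tmp path stands in for the real /sys/block/md0/md so this runs anywhere):

```shell
#!/bin/sh
# /tmp/md0-md stands in for /sys/block/md0/md on a real box.
mkdir -p /tmp/md0-md
echo idle > /tmp/md0-md/sync_action

# What "checkarray --all" boils down to, per array:
# request a scrub by writing "check" to sync_action.
echo check > /tmp/md0-md/sync_action

# While the scrub runs this reads "check"; it returns to "idle" when done.
cat /tmp/md0-md/sync_action
```

So the monthly log entries are a scheduled scrub, not necessarily a drive dropping out.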

So, my best guess is that there's something wrong with at least one drive in each array, causing the check to fail and hence the rebuilding. I know something is up with one of the three drives on md1, and md0 has nine drives, all much older and used, so it wouldn't surprise me if there were an issue there, too.

I guess I was expecting it to say "problem with device X, rebuilding", but it doesn't really know which device has the problem; all it knows is that the array itself is out of sync.
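If I want to dig further, mismatch_cnt under the array's sysfs directory seems to be the place to look after a check finishes: it reports how many sectors disagreed across the stripes, though md still can't say which disk is wrong. A sketch with stand-in paths (the /tmp directory substitutes for /sys/block/md0/md, and the 128 is a made-up value):

```shell
#!/bin/sh
# /tmp/md0-md stands in for /sys/block/md0/md here.
mkdir -p /tmp/md0-md
echo 128 > /tmp/md0-md/mismatch_cnt   # pretend the last check found mismatches

# After a check, a nonzero mismatch_cnt means some stripes failed parity,
# but md cannot attribute the mismatch to a specific member disk.
count=$(cat /tmp/md0-md/mismatch_cnt)
if [ "$count" -gt 0 ]; then
    echo "md0: last check found $count mismatched sectors"
fi
```

On the real box I'd pair this with `smartctl -a` on each member disk to try to find the actual culprit.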
 
  


Tags
debian, mdadm, raid5, rebuild

