LinuxQuestions.org
Old 06-26-2009, 09:57 AM   #1
mikesjays
Member
 
Registered: Dec 2005
Distribution: FC8, FC9, FC10
Posts: 30

Rep: Reputation: 16
mdadm failed disk, why?


I have a RAID 1 array with two WDC WD800JD drives. mdadm failed one of them and I'm not sure why. What should I check first? I looked at SMART and did not see anything, though I may have missed it. I ran badblocks with a read-only test and it found no errors. I'm now running a read-write test with badblocks; it looks like it has made it to the fourth write/read phase with nothing so far.

Is there a way to clear the failure and re-add the drive to the RAID and see if it happens again?

I'm running Fedora Core 10, kernel 2.6.27.12-170.2.5.fc10.i686.
 
Old 06-26-2009, 06:16 PM   #2
mostlyharmless
Senior Member
 
Registered: Jan 2008
Distribution: Arch/Manjaro, might try Slackware again
Posts: 1,859
Blog Entries: 14

Rep: Reputation: 284
I believe that removing the disk from the array and adding it back again will clear the F flag.

Remove:

mdadm /dev/md{0,1..} -r /dev/hda{1,2...}

Add:

mdadm /dev/md{0,1..} -a /dev/hda{1,2...}

See what happens, assuming it gets sync'd:

mdadm --detail /dev/md{0,1..}
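For a concrete picture, here is a minimal sketch of that sequence written as a dry-run helper. The function name readd_plan and the device names /dev/md0 and /dev/sdb1 are placeholders made up for illustration, not the poster's actual devices. It only prints the commands, so nothing touches the array until you run them yourself as root.

```shell
#!/bin/sh
# Dry-run sketch: print the mdadm commands for removing a failed member
# and adding it back. Nothing is executed against the array.
# readd_plan is a hypothetical helper; the arguments are placeholders.
readd_plan() {
    array=$1    # e.g. /dev/md0
    member=$2   # e.g. /dev/sdb1
    if [ -z "$array" ] || [ -z "$member" ]; then
        echo "usage: readd_plan ARRAY MEMBER" >&2
        return 1
    fi
    echo "mdadm $array --remove $member"   # drop the faulty member (long form of -r)
    echo "mdadm $array --add $member"      # add it back (long form of -a)
    echo "mdadm --detail $array"           # check state and rebuild progress
    echo "cat /proc/mdstat"                # quick view of the resync
}

readd_plan /dev/md0 /dev/sdb1
```

Review the printed lines, substitute your real array and partition, and run them one at a time.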


Hope that helps and you haven't tried all that already...
 
Old 06-27-2009, 01:20 PM   #3
mikesjays
Member
 
Registered: Dec 2005
Distribution: FC8, FC9, FC10
Posts: 30

Original Poster
Rep: Reputation: 16
I think I've done that in the past. I did try it again and we will see how long it will last. I was able to find a SMART error from the past.

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
Please note the following marginal Attributes:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
200 Multi_Zone_Error_Rate   0x0009   200   001   051    Pre-fail  Offline  In_the_past 0
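
As far as I understand the WHEN_FAILED column, "In_the_past" means the current VALUE is above THRESH but the recorded WORST value is at or below it. The sketch below applies that rule to attribute rows. The helper name smart_marginal is made up, the column order is assumed to match the header quoted above, and treating a threshold of 0 as "never fails" is my assumption.

```shell
#!/bin/sh
# Sketch: classify SMART attribute rows the way the WHEN_FAILED column does.
# Reads attribute rows on stdin; columns assumed to be
# ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE.
# smart_marginal is a hypothetical helper name.
smart_marginal() {
    awk '$1 ~ /^[0-9]+$/ && NF >= 10 {
        value = $4 + 0; worst = $5 + 0; thresh = $6 + 0
        if (thresh == 0)          next    # assumption: no usable threshold
        if (value <= thresh)      print $2 ": failing now"
        else if (worst <= thresh) print $2 ": failed in the past"
    }'
}

# The row quoted above: VALUE 200 is healthy, WORST 001 once dipped below THRESH 051.
echo "200 Multi_Zone_Error_Rate 0x0009 200 001 051 Pre-fail Offline In_the_past 0" | smart_marginal
```

On a live system the same function could be fed with something like smartctl -A /dev/sda | smart_marginal, where /dev/sda is a placeholder for the real drive.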


Does anyone know of a way to clear the SMART error log? Or is this what mdadm is seeing, causing it to kick the drive out?

Thanks
 
Old 06-28-2009, 10:36 PM   #4
mostlyharmless
Senior Member
 
Registered: Jan 2008
Distribution: Arch/Manjaro, might try Slackware again
Posts: 1,859
Blog Entries: 14

Rep: Reputation: 284
I believe you can turn SMART off, actually, though you might not want to do that. However, I don't think mdadm monitors SMART or cares about that.
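
What the md driver does react to is I/O errors reported by the kernel, and the result shows up as an (F) marker next to the member in /proc/mdstat. Here is a rough sketch for spotting those markers. The helper name failed_members and the sample text are made up for illustration; the sample just follows the usual /proc/mdstat layout.

```shell
#!/bin/sh
# Sketch: list array members that md has flagged faulty, i.e. the "(F)"
# marker in /proc/mdstat. Reads mdstat-format text on stdin.
# failed_members is a hypothetical helper name.
failed_members() {
    awk '$2 == ":" && $1 ~ /^md/ {
        for (i = 3; i <= NF; i++)
            if ($i ~ /\(F\)$/) {
                dev = $i
                sub(/\[[0-9]+\]\(F\)$/, "", dev)   # strip the slot number and flag
                print $1 ": " dev
            }
    }'
}

# Made-up sample in /proc/mdstat format.
failed_members <<'EOF'
Personalities : [raid1]
md0 : active raid1 sdb1[1](F) sda1[0]
      78148096 blocks [2/1] [U_]
EOF
```

On a live system, failed_members < /proc/mdstat does the same check. It may also be worth searching the kernel log (dmesg or /var/log/messages) for lines mentioning the kicked drive around the time it was failed, since that is where the underlying I/O error would normally be recorded.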
 
  

