LinuxQuestions.org > Forums > Linux Forums > Linux - Newbie
Old 01-17-2010, 04:22 PM   #1
touser
Member
 
Registered: Apr 2005
Posts: 31

mdadm cannot remove failed drive, drive name changed.


Hello everyone, I am setting up a software RAID 6 for the first time. To test the array I removed a drive by popping it out of the enclosure. mdadm marked the drive as failed (F) and everything seemed fine. From what I gather, the next step is to remove the drive from the array (mdadm /dev/md0 -r /dev/sdf), but when I try this I get the error:
mdadm: cannot find /dev/sdf: No such file or directory

That makes sense: when I plugged the drive back in, the machine recognized it as /dev/sdk. My question is how do I remove this now-nonexistent failed drive from the array? I was able to re-add the disk itself just fine as /dev/sdk with mdadm /dev/md0 -a /dev/sdk.

Also, is there any way to refer to a drive by ID, or something similar, so it always gets the same name and this can be avoided? Thank you in advance for any help!

Code:
Personalities : [raid6] [raid5] [raid4]
md0 : active raid6 sdk[9] sdj[8] sdi[7] sdh[6] sdg[5] sdf[10](F) sde[3] sdd[2] sdc[1] sdb[0]
      13674601472 blocks level 6, 64k chunk, algorithm 2 [9/8] [UUUU_UUUU]
      [>....................]  recovery =  2.7% (54654548/1953514496) finish=399.9min speed=79132K/sec

Code:
mdadm --detail /dev/md0
/dev/md0:
Version : 00.90
Creation Time : Sat Jan 16 04:54:06 2010
Raid Level : raid6
Array Size : 13674601472 (13041.12 GiB 14002.79 GB)
Used Dev Size : 1953514496 (1863.02 GiB 2000.40 GB)
Raid Devices : 9
Total Devices : 10
Preferred Minor : 0
Persistence : Superblock is persistent

Update Time : Sat Jan 16 22:09:34 2010
State : clean, degraded, recovering
Active Devices : 8
Working Devices : 9
Failed Devices : 1
Spare Devices : 1

Chunk Size : 64K

Rebuild Status : 2% complete

UUID : d3d98db4:55167169:f455fbeb:21592b43 (local to host archive)
Events : 0.10

Number Major Minor RaidDevice State
0 8 16 0 active sync /dev/sdb
1 8 32 1 active sync /dev/sdc
2 8 48 2 active sync /dev/sdd
3 8 64 3 active sync /dev/sde
9 8 160 4 spare rebuilding /dev/sdk
5 8 96 5 active sync /dev/sdg
6 8 112 6 active sync /dev/sdh
7 8 128 7 active sync /dev/sdi
8 8 144 8 active sync /dev/sdj

10 8 80 - faulty spare
 
Old 01-24-2010, 08:18 AM   #2
mostlyharmless
Senior Member
 
Registered: Jan 2008
Distribution: Slackware 14.1 (multilib) with kernel 3.15.5
Posts: 1,534
Blog Entries: 12

I'd guess the problem was physically removing the drive to "fail" it. If it had failed but still been physically present, you probably could have removed it with mdadm -r first and then pulled it. Either rebooting the machine, or just stopping the array and re-assembling it, will probably clear the stale entry.
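The stop/re-assemble route would look roughly like this (a sketch, assuming /dev/md0 is unmounted and the superblocks on the remaining member disks are intact):

```shell
# Stop the array so md releases the stale member entry,
# then re-assemble it from the members' current device names.
mdadm --stop /dev/md0
mdadm --assemble --scan /dev/md0
```

Note that stopping the array means downtime for anything using it, which is why the in-place fix in the next post is usually preferable.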
 
Old 01-04-2011, 01:33 PM   #3
xaminmo
LQ Newbie
 
Registered: Feb 2010
Location: TX
Distribution: Debian
Posts: 10

For Posterity: mdadm /dev/md0 -r detached

This is a common issue, and leaving the solution as "well, don't let your drives disappear before removing them" is unfathomable.

To remove the failed and missing drives, don't specify them, use
mdadm /dev/md0 -r detached

Code:
/bin/bash# mdadm /dev/md2 -r detached
mdadm: hot removed 8:19 from /dev/md2
mdadm: hot removed 8:35 from /dev/md2
[root@ns1:/root]
/bin/bash# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [raid0] [raid1] [raid10]
md2 : active raid6 sdg3[5](S) sdf3[6] sdd3[0] sda3[1] sde3[3]
      5753928192 blocks level 6, 512k chunk, algorithm 2 [5/3] [UU_U_]
      [=========>...........]  recovery = 46.7% (897177416/1917976064) finish=343.0min speed=49593K/sec

md0 : active raid1 sdg1[1] sdf1[0] sde1[4] sdd1[3] sda1[2]
      264960 blocks [5/5] [UUUUU]

md1 : active raid6 sdg2[4] sdf2[3] sda2[0] sde2[2] sdd2[1]
      105810432 blocks level 6, 512k chunk, algorithm 2 [5/5] [UUUUU]

unused devices: <none>
This removes the detached devices that are no longer present on the system. Arguably, you might want to do this BEFORE re-adding the devices, since you might then be able to use --re-add and save a full rebuild.
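That --re-add idea would look something like this (a sketch; /dev/sdk is the renamed disk from the original post):

```shell
# Drop array members whose device nodes have disappeared ...
mdadm /dev/md0 -r detached
# ... then try to re-add the returned disk. If its superblock and event
# count still match the array, md can resync only the stale blocks
# (with a write-intent bitmap) instead of rebuilding the whole member.
mdadm /dev/md0 --re-add /dev/sdk
```

If --re-add is refused, a plain -a adds the disk back as a spare and triggers a full rebuild, which is what happened in the original post.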
 
2 members found this post helpful.
Old 08-25-2012, 09:25 PM   #4
schworak
LQ Newbie
 
Registered: Apr 2011
Location: Salem Oregon USA
Posts: 3


Quote:
Originally Posted by xaminmo View Post
This is a common issue, and leaving the solution as "well, don't let your drives disappear before removing them" is unfathomable.

To remove the failed and missing drives, don't specify them, use
mdadm /dev/md0 -r detached

YOU ROCK!

Thank you so much for having the CORRECT answer to the problem. I just had a drive die from a total power failure, and because it sits in a hot-swap enclosure, the enclosure thought I had pulled it, so it vanished. This saved me from stopping the RAID and re-assembling it.

THANK YOU SO MUCH!
 
Old 08-26-2012, 06:39 PM   #5
xaminmo
LQ Newbie
 
Registered: Feb 2010
Location: TX
Distribution: Debian
Posts: 10

Rep: Reputation: 4
mdadm: remove failed devices

I'm glad it helps; I know it did for me. It took a ton of digging to figure this out, so I posted it in several places to be sure I could find it again when it happened to me a second time.
Which it did: I searched, read the answer, was delighted, and then noticed that I was the one who had posted it!

Also, I found that you can't hot-add a replacement under the same /dev/sdX name until you free the old entry this way.
When the drives are gone but still held by mdadm, their device nodes stay allocated in kernel space even though the udev entries are gone.
Once you run the removal, you can hot-plug a replacement drive without leaving a gap in the drive lettering.
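The free-then-replace sequence described above, as a sketch (host0 and /dev/sdf are placeholders for the actual SCSI host and for whatever name the new disk comes up under):

```shell
# Release members whose device nodes are gone so md frees the slots.
mdadm /dev/md0 -r detached
# Trigger a SCSI bus rescan so the hot-plugged disk is detected
# (replace host0 with the controller the enclosure hangs off).
echo "- - -" > /sys/class/scsi_host/host0/scan
# Add the new disk; having freed the slot, it can come back under
# the old name instead of the next free letter.
mdadm /dev/md0 -a /dev/sdf
```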

Last edited by xaminmo; 08-26-2012 at 06:41 PM.
 
  



