LinuxQuestions.org
Old 02-02-2009, 06:04 AM   #1
carlmarshall
Member
 
Registered: Jan 2004
Location: North Yorkshire, UK
Distribution: Centos 5
Posts: 133

Rep: Reputation: 16
mdadm - removing faulty spare


Hi,

I've had a failure of one of my HDDs (/dev/sdc) which makes up a few RAID partitions. The hot spare has now cut in, so all is currently safe, but how do I now remove the faulty spare?

mdadm --detail /dev/md1 gives the following:

Version : 00.90.03
Creation Time : Fri May 23 15:37:20 2008
Raid Level : raid5
Array Size : 945312256 (901.52 GiB 968.00 GB)
Device Size : 472656128 (450.76 GiB 484.00 GB)
Raid Devices : 3
Total Devices : 4
Preferred Minor : 1
Persistence : Superblock is persistent

Update Time : Mon Feb 2 11:52:32 2009
State : clean
Active Devices : 3
Working Devices : 3
Failed Devices : 1
Spare Devices : 0

Layout : left-symmetric
Chunk Size : 256K

UUID : 10644348:012f6764:70879599:7631693d
Events : 0.4896

Number Major Minor RaidDevice State
0 8 5 0 active sync /dev/sda5
1 8 21 1 active sync /dev/sdb5
2 8 53 2 active sync /dev/sdd5

3 8 37 - faulty spare

How can I mark the now faulty spare for removal? man mdadm gives me the line:

mdadm /dev/md1 --fail /dev/sdc5

This fails as it can't see /dev/sdc5

Any ideas?

Carl.
 
Old 02-02-2009, 01:37 PM   #2
mostlyharmless
Senior Member
 
Registered: Jan 2008
Distribution: Arch/Manjaro, might try Slackware again
Posts: 1,851
Blog Entries: 14

Rep: Reputation: 284
Quote:
How can I mark the now faulty spare for removal
You don't have to mark it as failed; mdadm already did that.
 
Old 02-03-2009, 03:18 AM   #3
carlmarshall
Member
 
Registered: Jan 2004
Location: North Yorkshire, UK
Distribution: Centos 5
Posts: 133

Original Poster
Rep: Reputation: 16
Don't I have to mark it for removal? e.g.

mdadm /dev/md1 --remove /dev/sdc5

before I can physically remove it.

Carl.
 
Old 02-03-2009, 01:51 PM   #4
mostlyharmless
Senior Member
 
Registered: Jan 2008
Distribution: Arch/Manjaro, might try Slackware again
Posts: 1,851
Blog Entries: 14

Rep: Reputation: 284
Probably, but I thought your original question was about marking it as failed.

Quote:
For Manage mode:
-a, --add

hot-add listed devices.
--re-add
re-add a device that was recently removed from an array.
-r, --remove
remove listed devices. They must not be active. i.e. they should be failed or spare devices.
-f, --fail
mark listed devices as faulty.
--set-faulty
same as --fail.
Each of these options requires that the first device listed is the array to be acted upon and the remainder are component devices to be added, removed, or marked as faulty. Several different operations can be specified for different devices, e.g.
mdadm /dev/md0 --add /dev/sda1 --fail /dev/sdb1 --remove /dev/sdb1
Each operation applies to all devices listed until the next operation.
If an array is using a write-intent bitmap, then devices which have been removed can be re-added in a way that avoids a full reconstruction but instead just updates the blocks that have changed since the device was removed. For arrays with persistent metadata (superblocks) this is done automatically. For arrays created with --build mdadm needs to be told that this device was removed recently with --re-add.

Devices can only be removed from an array if they are not in active use, i.e. they must be spares or failed devices. To remove an active device, it must be marked as faulty first.


Though since it is the failed disk, I doubt that it would make a difference; after all it's not being used anymore.
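
For a member whose device node is still present, the fail-then-remove order described in the man page excerpt above can be sketched roughly as follows. This is only a dry-run helper: the function name retire_member_cmds is made up for illustration, and it prints the two manage-mode commands rather than running them. /dev/md1 and /dev/sdc5 are the names from the first post.

```shell
# Dry-run sketch: print (do not execute) the commands that retire a member.
# retire_member_cmds is a hypothetical helper name, not part of mdadm.
retire_member_cmds() {
    array=$1
    member=$2
    # An active device has to be marked faulty before it can be removed.
    printf 'mdadm %s --fail %s\n' "$array" "$member"
    printf 'mdadm %s --remove %s\n' "$array" "$member"
}

retire_member_cmds /dev/md1 /dev/sdc5
```

If the member is already listed as faulty, as in the first post, only the second command should be needed.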
 
Old 02-04-2009, 09:47 AM   #5
carlmarshall
Member
 
Registered: Jan 2004
Location: North Yorkshire, UK
Distribution: Centos 5
Posts: 133

Original Poster
Rep: Reputation: 16
Thanks for that. One question remains: the output of

mdadm --detail /dev/md0

gives:

/dev/md0:
Version : 00.90.03
Creation Time : Fri May 23 15:36:56 2008
Raid Level : raid5
Array Size : 8385536 (8.00 GiB 8.59 GB)
Device Size : 4192768 (4.00 GiB 4.29 GB)
Raid Devices : 3
Total Devices : 4
Preferred Minor : 0
Persistence : Superblock is persistent

Update Time : Mon Nov 3 14:02:47 2008
State : clean
Active Devices : 3
Working Devices : 4
Failed Devices : 0
Spare Devices : 1

Layout : left-symmetric
Chunk Size : 256K

UUID : 5df593b7:b205acc4:57fae03c:ec92ecae
Events : 0.20

Number Major Minor RaidDevice State
0 8 3 0 active sync /dev/sda3
1 8 19 1 active sync /dev/sdb3
2 8 35 2 active sync

3 8 51 - spare /dev/sdd3

How do I deal with the 3rd disk (Number 2), which used to be /dev/sdc3? I can't mark it as failed or remove it, since I have no device name to give mdadm for the element that has failed.

/dev/sdc no longer appears in /dev.
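
One way to pick such entries out of a longer report is to filter the member table of the --detail output for rows that carry no /dev path. A minimal sketch, assuming the row layout shown above (Number, Major, Minor, RaidDevice, State, then an optional device path); nameless_members is a hypothetical helper name:

```shell
# Sketch: read `mdadm --detail` output on stdin and print member-table rows
# that have no /dev/... path at the end. nameless_members is a made-up name.
nameless_members() {
    # Member rows begin with three numeric fields (Number, Major, Minor).
    awk '$1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ && $NF !~ /^\/dev\// { print }'
}

# Intended use (needs root and a real array, so it is left as a comment):
#   mdadm --detail /dev/md0 | nameless_members
```

Run against the output above, this should print only the "2 8 35 2 active sync" row.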

Any ideas?

Carl.
 
Old 02-04-2009, 10:02 AM   #6
mostlyharmless
Senior Member
 
Registered: Jan 2008
Distribution: Arch/Manjaro, might try Slackware again
Posts: 1,851
Blog Entries: 14

Rep: Reputation: 284
That's a good question. I would try stopping the array and reassembling it, possibly with the resync update option:
Quote:
mdadm -S /dev/md0
mdadm -A /dev/md0 --update=resync
 
Old 02-04-2009, 10:06 AM   #7
carlmarshall
Member
 
Registered: Jan 2004
Location: North Yorkshire, UK
Distribution: Centos 5
Posts: 133

Original Poster
Rep: Reputation: 16
Thanks for that, I'll give it a try.

Carl.
 
Old 04-28-2009, 04:31 PM   #8
spqrusa
LQ Newbie
 
Registered: Apr 2009
Posts: 1

Rep: Reputation: 1
Quote:
Originally Posted by carlmarshall
Hi,

I've had a failure of one of my HDDs (/dev/sdc) which makes up a few RAID partitions. The hot spare has now cut in, so all is currently safe, but how do I now remove the faulty spare?

Any ideas?

Carl.
Hi Carl,

You can remove any faulty or failed drives with:

sudo mdadm --manage /dev/md0 --remove faulty
-- or --
sudo mdadm --manage /dev/md0 --remove failed

This lets mdadm know to deallocate the device space. When you hot-add a new spare drive it should replace the /dev/sd<failed> node. After the hot-add you can:

sudo mdadm --manage /dev/md0 --re-add /dev/sd<failed>

-- a real world case --

sudo mdadm --manage /dev/md0 --re-add /dev/sdc1
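
A rough sketch of choosing between the two forms of --remove, assuming the only question is whether the member's device node still exists. remove_cmd is a hypothetical helper that prints the command instead of running it:

```shell
# Sketch: print the --remove command suited to the member's current state.
# remove_cmd is a made-up helper name; nothing here touches a real array.
remove_cmd() {
    array=$1
    member=$2
    if [ -e "$member" ]; then
        # Device node still exists, so it can be named directly.
        printf 'mdadm --manage %s --remove %s\n' "$array" "$member"
    else
        # Node is gone: fall back to the keyword form shown above.
        printf 'mdadm --manage %s --remove failed\n' "$array"
    fi
}
```

Newer mdadm man pages also describe a "detached" keyword for devices that are no longer connected to the system, so it is worth checking man mdadm on your version. For a brand-new replacement disk, note that the man page excerpt quoted earlier in this thread lists --add as the hot-add operation and describes --re-add as being for a device recently removed from the array.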

Hope that helps,

SPQR

Last edited by spqrusa; 04-28-2009 at 04:32 PM.
 
Old 03-09-2023, 02:31 PM   #9
voidstar
LQ Newbie
 
Registered: Mar 2023
Posts: 1

Rep: Reputation: 0
Quote:
Originally Posted by spqrusa
Hi Carl,

You can remove any faulty or failed drives with :

sudo mdadm --manage /dev/md0 --remove faulty
-- or --
sudo mdadm --manage /dev/md0 --remove failed

...

Hope that helps,

SPQR
Hello from the distant future! Thank you, --remove faulty was exactly what I needed to know.

I knew how to do --remove /dev/sdX, but not what to do when the device was already missing so there was no device id to refer to.

My situation was, I had replaced a failing drive by adding a hot spare and then physically yanking the bad one, without first doing --fail and --remove. The rebuild went fine, but I wanted to get the old entry off the list so the array would show as "clean", preferably without needing to restart it.
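
To confirm the stale entry is really gone without reading the whole report, the Failed Devices line of the --detail output can be pulled out. A small sketch, assuming the field layout shown earlier in the thread; failed_device_count is a hypothetical name:

```shell
# Sketch: read `mdadm --detail` output on stdin and print the number from
# the "Failed Devices" line. failed_device_count is a made-up helper name.
failed_device_count() {
    awk -F':' '/Failed Devices/ { gsub(/[ \t]/, "", $2); print $2 }'
}

# Intended use (needs root and a real array, so it is left as a comment):
#   mdadm --detail /dev/md0 | failed_device_count
```

A result of 0 after the removal would suggest the old entry is off the list.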

Last edited by voidstar; 03-09-2023 at 02:39 PM. Reason: formatting
 
  

