RHEL5.4 software RAID 5 - how do I replace "faulty" drive?
I have a 5-disk RAID 5 with one disk used as a hot spare. As a test, we pulled a drive during a 4GB file write to the RAID to simulate a drive failure during use. Everything went perfectly: the write wasn't interrupted, and the RAID took about 60 minutes to rebuild onto the spare. But when we plugged the original drive back in, it still shows up in /proc/mdstat as being removed:
Number   Major   Minor   RaidDevice   State
0        8       17      0            active sync   /dev/sdb1
1        8       49      1            active sync   /dev/sdd1
2        8       33      2            active sync   /dev/sdc1
3        0       0       3            removed
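For anyone who wants to script a check like this, here is a minimal sketch. It assumes the device table comes from `mdadm --detail /dev/md0` (the column layout above appears to match that command's output rather than /proc/mdstat) and that the state is the fifth whitespace-separated field. `removed_slots` is a hypothetical helper name, not part of mdadm.

```shell
#!/bin/sh
# removed_slots: hypothetical helper, not part of mdadm.
# Reads a device table (as in the listing above) on stdin and prints the
# RaidDevice number of every row whose state is "removed".
removed_slots() {
    awk '$5 == "removed" && $4 ~ /^[0-9]+$/ { print $4 }'
}

# Example against the table from this post:
removed_slots <<'EOF'
Number Major Minor RaidDevice State
0 8 17 0 active sync /dev/sdb1
1 8 49 1 active sync /dev/sdd1
2 8 33 2 active sync /dev/sdc1
3 0 0 3 removed
EOF
```

On a live system the same function could be fed directly, for example `mdadm --detail /dev/md0 | removed_slots`.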
The RAID was originally set up with /dev/sd[b-e]1, with /dev/sdf as the hot spare. Is there something I need to do when I replace the "faulty" drive in order to get the array to see the new drive? This is what I tried:
[root@name-removed ~]# mdadm /dev/md0 -a /dev/sde1
mdadm: add new device failed for /dev/sde1 as 4: Invalid argument
Every forum thread I found about this message claimed that it was caused by the replacement drive not having enough blocks. That is obviously not the case here, since I am re-using the same drive. Here is the partition info:
The really frustrating part is that my system won't boot while in degraded mode unless I modify the /sys/module/md_mod/parameters/start_dirty_degraded file, which I don't want to have to do.
Is there a step I am missing in replacing a RAID drive?
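For anyone hitting the same error, one sequence that is often suggested for returning a previously used member to an array is sketched below. It is not confirmed as the fix for this thread; it assumes that stale md metadata on the old partition is what gets rejected. `readd_member`, `run`, and `DRY_RUN` are hypothetical names made up for this sketch, while the mdadm options used (`--examine`, `--zero-superblock`, `--add`, `--detail`) are standard ones. By default the script only prints the commands, because `--zero-superblock` erases the RAID metadata on the partition.

```shell
#!/bin/sh
# readd_member: hypothetical helper that wipes old md metadata from a
# partition and adds it back to an array. With DRY_RUN=1 (the default)
# it only prints the commands it would run.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

readd_member() {
    array="$1"   # e.g. /dev/md0
    part="$2"    # e.g. /dev/sde1
    # Show any stale superblock; this fails harmlessly if none exists.
    run mdadm --examine "$part" || true
    # DESTRUCTIVE: erases the md metadata on the partition.
    run mdadm --zero-superblock "$part" || return 1
    # Adds the partition; it becomes a spare, or a rebuild starts if degraded.
    run mdadm "$array" --add "$part" || return 1
    # Print the resulting array layout.
    run mdadm --detail "$array"
}

readd_member /dev/md0 /dev/sde1
```

Running it with `DRY_RUN=0` as root would execute the commands for real, so the device names should be double-checked first.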
I'm not sure what the problem was, and the system needed to go into production 3 days ago, so I stopped and deleted the array and recreated it. Now it seems to be working fine. Good thing we hadn't used the array for storage yet. Disk Druid built the original RAID 5 with the drives in a different order than mdadm did the second time; that is the only difference I can see. It seems to be happy for now.