[SOLVED] reactivating raid after drive disconnect - all drives now listed as spares
CentOS 7 - I have a 32-drive array that I'm using to learn mdadm; once it's solid, I'll use it for storage. Before that, I wanted to make sure I knew how to recover from disaster. I've already removed and replaced a disk, and now, after a catastrophic test (unplugged 16 drives in the middle of use, then rebooted), I'm unable to get the array back online.
The drives in the raid are /dev/sdb1 through /dev/sdag1, the filesystem is ext4, raid 10. When mdadm assembled it, the pairing was a hodgepodge of which drive from where went with what, so when I unplugged 16 drives at the same time it was definitely not just the mirrors of the other 16 - I expected it to fail.
The output of
mdadm --examine /dev/sd{b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z,aa,ab,ac,ad,ae,af,ag}1
is very long, so I won't include it all here. The output for a single drive is:
Code:
/dev/sdag1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 0c8a51b5:c79e4eae:a2a30468:40a1e2d4
Name : hz16:0 (local to host hz16)
Creation Time : Sun Feb 21 21:58:16 2016
Raid Level : raid10
Raid Devices : 32
Avail Dev Size : 968622080 (461.88 GiB 495.93 GB)
Array Size : 7748976640 (7390.00 GiB 7934.95 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
Unused Space : before=262056 sectors, after=0 sectors
State : clean
Device UUID : 0d63a689:dc69402c:222a45a1:4aaca376
Internal Bitmap : 8 sectors from superblock
Update Time : Mon Feb 22 10:51:42 2016
Bad Block Log : 512 entries available at offset 72 sectors
Checksum : 6f9ddbd4 - correct
Events : 8811
Layout : near=2
Chunk Size : 512K
Device Role : Active device 31
Array State : AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA ('A' == active, '.' == missing, 'R' == replacing)
All drives are online and ready, but I can't get the array to assemble, and all drives are now tagged as spares. What do I have to do to get this to assemble once more?
I'll have to look through my notes and get back to you. FWIW, the #raid channel on irc.freenode.net has pretty knowledgeable users who can help - they've helped me. Just don't forget about IRC etiquette (simply ask rather than asking to ask, people in different time zones may take 24 hrs to respond, be respectful when asking for help, etc.).
Quote:
after a catastrophic test (unplugged 16 drives in the middle of use, rebooted) I'm unable to get it back online
I can't guess what sort of failure scenario you were trying to simulate.
I'd say it's toast. RAID 10 can only recover if a failed/pulled disk has its mirror intact.
Quote:
so when I unplugged the 16 drives at the same time, it was definitely not just mirrors of the other 16 - I expected it to fail.
Yup! Toast.
I've only been involved with replacing failed disks in arrays (HP, IBM, Sun), usually at three in the morning - why is it always then? So I'd say that 16 of your 32 disks in the same RAID 10 "failing" at once is highly improbable in practice. If you split them across two 16-disk JBODs, I'd expect one JBOD to mirror the other, so a PSU failure on one JBOD wouldn't kill everything. (Most arrays have redundant PSUs to mitigate this as well.)
In your scenario I'd say your quickest option to get back in business would be to re-initialise the RAID and do a restore.
Most raid structures allow recovery from a single disk failure; raid6 allows two disks to fail.
Unfortunately, if more than that fail, you are toast.
This is why most raid architectures group disks into small volumes of about five disks (raid5), then combine multiple raid5 volumes into mirror groups. That way it takes four disk failures (two in each raid5 of a mirror group) to lose data.
I figured that since all the drives are still as they were at the time of the failure (data and partitions intact, superblocks unchanged), the array could be assembled once again - I'd even expect it to assemble without intervention. At worst, I thought I'd have to run a filesystem repair after reassembly to correct minor errors in the last file written.

It seems like a major weakness if mdadm can't handle a temporary loss of drives (power failure, cable disconnect). I'd expect a total failure if half of a mirror were *permanently* lost, but in this case the drives just disappeared for a short time (and so stopped being written to), which seems like it should be highly recoverable. Think about a scenario where power to all the equipment goes out and different power supplies die at different times, maybe 0.5 seconds apart. Given that, and the assumption that my current situation is unrecoverable, *any* power failure would mean a total rebuild of the array and a restore from backup.

In my current setup I have 2 boxes of 16 drives. The failure was essentially a power failure on one of those boxes (or a loose data cable - my test was actually unplugging the cable for a while). I just thought mdadm would be more tolerant of a temporary disconnect than that.
As I was laying things out for this, one of my early questions was how to direct mdadm to use which drive in which part of the raid. Obviously, with raid 10 across 32 drives in 2 boxes, I'd want one half of each of the 16 mirrors to be in box 1 and the other half in box 2, then stripe across the mirrors. But mdadm puts the pairs all over the place. How do you tell it what to put where? Does it require setting up 16 raid1s first and then a raid0 on top (instead of just creating a raid10)?
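For what it's worth, with mdadm's raid10 "near" layout (the `near=2` shown in the --examine output), consecutive devices on the `--create` command line become mirror pairs, so you can control placement just by ordering the device list. A sketch (untested on real hardware; the device-to-enclosure split below is an assumption based on this thread's layout):

```shell
#!/bin/bash
# With raid10 layout near=2, devices given to --create in positions
# (0,1), (2,3), ... form the mirror pairs. Interleaving the two
# enclosures puts each half of every mirror in a different box.
box1=(/dev/sd{b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q}1)        # enclosure 1 (assumed)
box2=(/dev/sd{r,s,t,u,v,w,x,y,z,aa,ab,ac,ad,ae,af,ag}1) # enclosure 2 (assumed)

devices=()
for i in "${!box1[@]}"; do
    devices+=("${box1[$i]}" "${box2[$i]}")   # pair i spans both boxes
done

echo "Device order: ${devices[*]}"
# mdadm --create /dev/md0 --level=10 --layout=n2 --raid-devices=32 "${devices[@]}"
```

With that ordering, pulling all of box 2 leaves one intact copy of every chunk, which is the survivable failure mode the post is after.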
No matter how fast you unplug, some of the disks will be marked as failed, and the remaining disks informed of that failure.
Even a hard power-off won't prevent that, as disks are designed to keep operating for a second or so to flush the current DMA and any buffers to the platters.
The kernel raid software is designed to protect, not prevent.
It likely would have worked if the system had been powered down cleanly instead...
I'm not sure what everyone is freaking out about. What the OP did is a realistic scenario, think power interruption on the backplane, failed power splitter feeding half the array, etc.
Of course the array will go down, nothing but RAID 1 can protect against that, the point is he replaced the drives and the array is not rebuilding/verifying. The drives, when added back in, were detected as spares instead of the missing parts of the failed array.
I have had exactly this scenario happen on a 24 drive 80 TB RAID 60 system of mine. A power cable went bad and power to 8 drives was cut during operation. It was a hardware RAID, not software, and recovery simply consisted of deleting the array and re-creating it without initialization, followed by an fsck. No data loss, and only minor down time.
Unfortunately I do not know the proper steps to recover the array with mdadm. Frankly I've never heard of somebody using a 32 drive software array, it sounds dangerous to me given my limited experience and numerous hiccups with software raid.
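The mdadm analogue of the hardware-RAID trick described above ("delete the array and re-create it without initialization") is `--create --assume-clean`. This is a last resort and destructive if any parameter differs from the original creation, so the geometry (level, layout, chunk size, device order from the `Device Role` lines of `--examine`) must match exactly. A hedged, dry-run sketch - the device order shown is an assumption, since the OP described the original pairing as a hodgepodge:

```shell
#!/bin/bash
# DRY_RUN=1 only prints the commands; set to 0 to execute for real.
DRY_RUN=1
run() { if [ "$DRY_RUN" = 1 ]; then echo "$@"; else "$@"; fi; }

run mdadm --stop /dev/md0
# --assume-clean writes fresh superblocks but skips the initial sync,
# leaving existing data untouched. Level, layout, chunk size, and the
# device ORDER must match the original array exactly (verify each
# disk's "Device Role" with mdadm --examine first).
run mdadm --create /dev/md0 --level=10 --layout=n2 --chunk=512 \
    --raid-devices=32 --assume-clean /dev/sd{b..z}1 /dev/sda{a..g}1
run fsck -n /dev/md0   # read-only filesystem check before mounting
```

A forced assemble (`mdadm --assemble --force`) is the gentler option and worth trying before re-creating anything.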
@suicidaleggroll thanks - that was my point: it seems there *should* be an easy way to bring it back online after such an event. I'm not convinced there isn't one, but obviously I haven't figured it out yet. You should be able to flip the "spare" bit back to "up" and then reassemble.
The excessive 32-drive software array is just because I happened into free hardware - one man's garbage... They're only 500GB drives, but it'd be a shame not to play with them and get more familiar with software raid. I've scripted the setup so I can re-do it quickly when necessary. I also have a hardware array on the machine, but I haven't dug into it much yet.
Data was never really lost - it's all just test data for playing with the raid.
But if there's no way to recover, does anyone know about the other question - how to dictate which drive goes where in the raid? Is the "solution" I mentioned (establish 16 mirrors, then stripe them) the best way to achieve that? I was concerned that I might lose something to overhead by creating a raid of raids manually like that, but perhaps mdadm is smart enough to optimize it properly. Can mdadm handle swapping a bad drive within a mirror within a stripe like that? To do it, it would have to treat the raid1s in the raid0 as a raid10 of its own.
I'm looking forward to pulling the plug on 16 drives and having it still run
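On the "16 mirrors, then stripe" question: mdadm does support nesting, but the result stays a raid0-of-raid1s rather than becoming a native raid10. A failed disk is replaced in its raid1 member array (the raid0 layer never notices), so no translation to raid10 is needed or performed. A sketch that only echoes the commands (device names are assumptions):

```shell
#!/bin/bash
# Build 16 two-disk mirrors, one half from each enclosure, then stripe
# them. Echoed rather than executed; remove "echo" to run for real.
box1=(/dev/sd{b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q}1)        # enclosure 1 (assumed)
box2=(/dev/sd{r,s,t,u,v,w,x,y,z,aa,ab,ac,ad,ae,af,ag}1) # enclosure 2 (assumed)
for i in $(seq 0 15); do
    echo mdadm --create /dev/md$i --level=1 --raid-devices=2 \
        "${box1[$i]}" "${box2[$i]}"
done
echo mdadm --create /dev/md16 --level=0 --raid-devices=16 /dev/md{0..15}
```

A flat `--level=10` array with an interleaved device order usually achieves the same physical placement with less bookkeeping, but the nested form makes the box-to-box pairing explicit and survivable by construction.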
Well, I haven't tried this (not enough disks actually), but mdadm does have a "--re-add" option under the manage command. According to the manpage:
Quote:
If the device name given is faulty then mdadm will find all
devices in the array that are marked faulty, remove them and
attempt to immediately re-add them. This can be useful if you
are certain that the reason for failure has been resolved.
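Building on that: when every member shows up as a "spare," the usual first attempt is a forced assemble, which lets mdadm reconcile the mismatched event counts in the superblocks. A sketch that echoes the commands rather than running them (drop the `echo` prefixes on a real system, and expect to run fsck afterwards):

```shell
#!/bin/bash
# Stop any half-assembled array, then force-assemble from all members.
# --force tells mdadm to accept members whose event counts fell behind
# when the 16 drives vanished, instead of demoting them to spares.
echo mdadm --stop /dev/md0
echo mdadm --assemble --force /dev/md0 /dev/sd{b..z}1 /dev/sda{a..g}1
# If individual members were kicked out afterwards, the "faulty" keyword
# from the manpage excerpt above retries all failed members at once:
echo mdadm /dev/md0 --re-add faulty
```

`--re-add` only works while the superblocks still agree on the array, which is why it is worth trying before anything destructive like re-creating the array.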
Quote:
I'm wondering what/why /dev/sdb1 is busy. Something must be using it for it to be busy, and shouldn't be.
OK, here's a guess...
All the drives in a RAID carry a small metadata area (the md superblock) containing data which keeps track of the RAID: disk details including serial number, the disk's position within the array, and disk status (ready, failed, recovery/rebuild, etc.).
mdadm needs to read this information to recover from disk failures. 16 disks of a 32-disk array suddenly disappeared, so the metadata on those 16 no longer matches the remaining 16, which should have been updated to reflect the now-missing disks. (This may or may not have happened, as the RAID was effectively shot in the head!) I'd imagine the OP pulled them one at a time, so at least some of the remaining disks have had this data updated. Now nothing matches... AAarrgghh!
You'll notice in the OP's attempt to recover the RAID by re-adding the disks, the first disk it tries to access to read this config data is /dev/sdb1, which I reckon holds that RAID config metadata.
Which disks were pulled? Was this one of them? Maybe the data is now corrupt?
Anyway, that's my guess. If I'm wrong in my conceptual description, I think I should at least get a gold star for the attempt!
"lsof | grep sdb" reports nothing. I am not sure why mdadm reports it as busy (the array is not started, drive not mounted) but it's consistent across reboots.