11-02-2010, 09:46 AM | #1 | mvanhorn | LQ Newbie | Registered: Mar 2010 | Posts: 8
RAID6 rebuild weirdness
I am in a weird situation with a raid6 array which has just had a disk replaced. It is composed of 8 disks, sd[b-i], and sdh went bad. I took the system down, replaced the disk, and brought it back up, but now I can't mount the array (/dev/md0), and I'm not sure what is happening. Here's some output:
Code:
[root@hostname ~]# mdadm --query /dev/md0
/dev/md0: 0.00KiB raid6 8 devices, 1 spare. Use mdadm --detail for more detail.
How is this possible? Doesn't raid 6, by definition, have *2* spares?
Code:
[root@hostname ~]# mdadm --detail /dev/md0
/dev/md0:
Version : 00.90.03
Creation Time : Mon Jun 28 10:46:51 2010
Raid Level : raid6
Device Size : 1953511936 (1863.01 GiB 2000.40 GB)
Raid Devices : 8
Total Devices : 8
Preferred Minor : 0
Persistence : Superblock is persistent
Update Time : Tue Nov 2 08:19:52 2010
State : dirty, degraded
Active Devices : 7
Working Devices : 8
Failed Devices : 0
Spare Devices : 1
Chunk Size : 64K
UUID : 6b8b4567:327b23c6:643c9869:66334873
Events : 0.34591155
Number   Major   Minor   RaidDevice   State
   0       8       17        0        active sync        /dev/sdb1
   1       8       33        1        active sync        /dev/sdc1
   2       8       49        2        active sync        /dev/sdd1
   3       8       65        3        active sync        /dev/sde1
   4       8       81        4        active sync        /dev/sdf1
   5       8       97        5        active sync        /dev/sdg1
   6       8      113        6        spare rebuilding   /dev/sdh1
   7       8      129        7        active sync        /dev/sdi1
If there are 7 disks working and it's rebuilding the other, why can't I go ahead and mount the array? I've only lost one disk!
Code:
[root@hostname ~]# cat /proc/mdstat
Personalities : [raid6]
md0 : inactive sdh1[6] sdb1[0] sdi1[7] sdg1[5] sdf1[4] sde1[3] sdd1[2] sdc1[1]
15628095488 blocks
unused devices: <none>
I don't understand why I can't just mount the array. Instead, I get a superblock error:
Code:
[root@hostname ~]# mount /dev/md0
mount: /dev/md0: can't read superblock
Can someone please shed some light on what is going on?
Thanks!
Last edited by mvanhorn; 11-02-2010 at 10:08 AM.
11-02-2010, 10:02 AM | #2 | mesiol | Member | Registered: Nov 2008 | Location: Lower Saxony, Germany | Distribution: CentOS, RHEL, Solaris 10, AIX, HP-UX | Posts: 731
Hi,
the RAID cannot be mounted because it is inactive, as mdstat shows.
Have you tried the following?
Code:
mdadm --run /dev/md0
Be careful with all commands in case there is data on your RAID.
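If --run does bring it up, /proc/mdstat should change from inactive to active and show the rebuild progress. A quick way to check (using md0 as in your output, so just a sketch):
Code:
cat /proc/mdstat
mdadm --detail /dev/md0 | grep -i state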
11-02-2010, 10:07 AM | #3 | mvanhorn | LQ Newbie | Registered: Mar 2010 | Posts: 8 | Original Poster
I've now downloaded a newer mdadm (3.1.4), and it shows that the array is active:
Code:
[root@chanute04 mdadm-3.1.4]# mdadm --detail /dev/md0
/dev/md0:
Version : 0.90
Creation Time : Mon Jun 28 10:46:51 2010
Raid Level : raid6
Used Dev Size : 1953511936 (1863.01 GiB 2000.40 GB)
Raid Devices : 8
Total Devices : 8
Preferred Minor : 0
Persistence : Superblock is persistent
Update Time : Tue Nov 2 08:19:52 2010
State : active, degraded, Not Started
Active Devices : 7
Working Devices : 8
Failed Devices : 0
Spare Devices : 1
Layout : left-symmetric
Chunk Size : 64K
UUID : 6b8b4567:327b23c6:643c9869:66334873
Events : 0.34591155
Number   Major   Minor   RaidDevice   State
   0       8       17        0        active sync        /dev/sdb1
   1       8       33        1        active sync        /dev/sdc1
   2       8       49        2        active sync        /dev/sdd1
   3       8       65        3        active sync        /dev/sde1
   4       8       81        4        active sync        /dev/sdf1
   5       8       97        5        active sync        /dev/sdg1
   6       8      113        6        spare rebuilding   /dev/sdh1
   7       8      129        7        active sync        /dev/sdi1
So, why can't I mount it?
And, yes, I have tried --run:
Code:
[root@hostname]# mdadm --run /dev/md0
mdadm: failed to run array /dev/md0: Input/output error
And, yes, the RAID does have data on it; I was expecting to still be able to get at the data while the new drive was being added back in. I don't understand why I can't.
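When --run fails with an Input/output error like this, the kernel log usually records why md refused to start the array, so checking it may give a hint (a general diagnostic, assuming standard logging on a CentOS-style system):
Code:
dmesg | grep -i md
tail -n 100 /var/log/messages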
Last edited by mvanhorn; 11-02-2010 at 10:09 AM.
11-02-2010, 10:12 AM | #4 | mesiol | Member | Registered: Nov 2008 | Location: Lower Saxony, Germany | Distribution: CentOS, RHEL, Solaris 10, AIX, HP-UX | Posts: 731
Hi,
did the RAID autostart during boot, or did you assemble it manually? I'm also not sure what causes this behaviour.
Can you stop your RAID and start it manually? I cannot see any reason why mdstat reports the array as inactive.
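Something along these lines, for example (adjust the member list to match your devices; this is a sketch, not an exact recipe):
Code:
mdadm --stop /dev/md0
mdadm --assemble /dev/md0 /dev/sd[b-i]1
If the superblocks disagree you may need --assemble --force, but only as a last resort, because it can hide a real problem with a member disk.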
11-02-2010, 10:20 AM | #5 | mvanhorn | LQ Newbie | Registered: Mar 2010 | Posts: 8 | Original Poster
Quote:
Originally Posted by mesiol
did the RAID autostart during boot, or did you assemble it manually? I'm also not sure what causes this behaviour.
Can you stop your RAID and start it manually? I cannot see any reason why mdstat reports the array as inactive.
No, it does not start at boot. I'm trying to start it manually, and that isn't working. I also don't understand why it's showing up as inactive; with (now) 8 disks of an 8-disk raid6 array, it should be just fine.
11-02-2010, 10:33 AM | #6 | mvanhorn | LQ Newbie | Registered: Mar 2010 | Posts: 8 | Original Poster
This may help. I just tried to assemble it again, and now it says:
Code:
[root@hostname]# mdadm -A /dev/md0
mdadm: /dev/sdh1 has no superblock - assembly aborted
So, how do I put a superblock on /dev/sdh1 (this is the drive that was replaced)? I thought the system would do this automatically.
11-02-2010, 11:22 AM | #7 | mesiol | Member | Registered: Nov 2008 | Location: Lower Saxony, Germany | Distribution: CentOS, RHEL, Solaris 10, AIX, HP-UX | Posts: 731
Hi,
did you replace the disk by only swapping the physical hardware, or did you also run the appropriate fdisk/mdadm commands?
That is, did you add the disk by creating a software RAID partition with fdisk and then running
Code:
mdadm --manage /dev/md0 --add /dev/sdh1
?
Software RAID partitions contain private regions where information about the RAID is stored; these are created when the RAID is initialized. If you replace a disk, that information is lost, so the new disk cannot simply be taken into the RAID. You have to tell the RAID to use the disk as a replacement, which is what the --remove and --add commands are for.
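A rough outline of the whole replacement, assuming the new disk came up as /dev/sdh again (double-check the device names before running anything; this is only a sketch):
Code:
# give the new disk the same partition layout as a healthy member
sfdisk -d /dev/sdb | sfdisk /dev/sdh
# if the old member is still listed as failed, drop it first
# (recent mdadm versions accept the keyword 'failed' here)
mdadm --manage /dev/md0 --remove failed
# then hand the new partition to the array and let it rebuild
mdadm --manage /dev/md0 --add /dev/sdh1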
Last edited by mesiol; 11-02-2010 at 01:49 PM.
11-03-2010, 07:55 AM | #8 | mvanhorn | LQ Newbie | Registered: Mar 2010 | Posts: 8 | Original Poster
It turns out I had a second bad disk. Or, well, maybe I do. I started checking the output of 'mdadm --examine' on each individual disk in the array, and a couple of them showed that a different disk (sdd, rather than sdh) had also been removed from the array. So I did an 'mdadm --assemble' without either sdd or sdh, and it took right off. Then I was able to add sdh back in, and it's rebuilding now.
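In case it helps anyone who finds this later, the steps above look roughly like this (a sketch reconstructed from what I described, not a paste-ready script):
Code:
# compare the superblocks / event counts on each member
for d in /dev/sd[b-i]1; do echo "== $d"; mdadm --examine $d | grep Events; done
# assemble from the members that agree, leaving out the two suspect disks
mdadm --assemble /dev/md0 /dev/sd[bcefgi]1
# then add the replaced disk back and let it rebuild
mdadm --manage /dev/md0 --add /dev/sdh1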
What's odd is that I've now run what I thought was the bad sdh through the Western Digital diagnostics, and everything seems okay with this disk. So, I'm going to try replacing sdd with this (formerly sdh) disk, and see what happens.
Thanks for your help!
11-03-2010, 08:05 AM | #9 | mesiol | Member | Registered: Nov 2008 | Location: Lower Saxony, Germany | Distribution: CentOS, RHEL, Solaris 10, AIX, HP-UX | Posts: 731
Hi,
good to hear that your rebuild is working. I wish you success with the rest of it.