04-11-2006, 09:41 AM
|
#1
|
LQ Newbie
Registered: Jan 2005
Posts: 12
Rep:
|
help interpreting MDADM readouts
Hi all,
(Is this in the right place? Is it hardware or software related? Also, it's a repost because I posted it in the hardware section, but no replies.)
I have a Fedora Core 3 home server with three 320GB hard drives, which are fairly new -- only a month or so old. They are in a RAID-5 array, with partitions like this:
/dev/hda1 5GB root partition mounted as /
/dev/hdd1 and /dev/hdc1 are an 8GB Logical Volume Group thingumy mounted as /tmp
/dev/hda2, /dev/hdc3 and /dev/hdd3 are the 630 GB RAID-5 array mounted as /home
(There are some swap partitions there too, and some space is lost due to filesystem inefficiencies.)
This is all well and good, but late last night one of the hard drives (hdd, the secondary slave) started making loud clicking sounds at fairly regular intervals, about once a minute or so. Catting /proc/mdstat revealed that one of the drives was faulty, but there was nothing I could do about it until today. I turned off the server, reseated the cables etc., and turned it back on, and the drive wasn't recognised by the BIOS. I put the drive into my own workstation and it "click"ed on startup, but was recognised by both the BIOS and by Suse (though I didn't try to mount it -- obviously). So I put it back into the server, and the BIOS recognised it OK; it's been running for an hour or so with no clicking sounds. I don't think all is well, however. Perhaps you guys can help me make sense of these RAID readouts?
# cat /proc/mdstat
Personalities : [raid5]
md0 : active raid5 hdc3[1] hda2[0]
614903808 blocks level 5, 256k chunk, algorithm 2 [3/2] [UU_]
unused devices: <none>
There seem to be only hdc3 and hda2 in this array -- no sign of hdd3. And what does [3/2] mean? Shouldn't it be [2/3], because it is two out of three drives?
# mdadm --examine /dev/md0
mdadm: No super block found on /dev/md0 (Expected magic a92b4efc, got 00000000)
What exactly does this mean?
# mdadm --query /dev/md0
/dev/md0:
Version : 00.90.01
Creation Time : Wed Feb 15 14:35:22 2006
Raid Level : raid5
Array Size : 614903808 (586.42 GiB 629.66 GB)
Device Size : 307451904 (293.21 GiB 314.83 GB)
Raid Devices : 3
Total Devices : 2
Preferred Minor : 0
Persistence : Superblock is persistent
Update Time : Fri Mar 3 14:39:02 2006
State : clean, degraded
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 256K
Number Major Minor RaidDevice State
0 3 2 0 active sync /dev/hda2
1 22 3 1 active sync /dev/hdc3
2 0 0 -1 removed
UUID : 8c8f0f62:9e69e701:409df450:89adf2fb
Events : 0.136964
This suggests to me that there are two drives in the array, not three -- we're missing HDD, right?
# mdadm --examine /dev/hda2
/dev/hda2:
Magic : a92b4efc
Version : 00.90.00
UUID : 8c8f0f62:9e69e701:409df450:89adf2fb
Creation Time : Wed Feb 15 14:35:22 2006
Raid Level : raid5
Device Size : 307451904 (293.21 GiB 314.83 GB)
Raid Devices : 3
Total Devices : 2
Preferred Minor : 0
Update Time : Fri Mar 3 14:39:24 2006
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 1
Spare Devices : 0
Checksum : 38c744bb - correct
Events : 0.136972
Layout : left-symmetric
Chunk Size : 256K
Number Major Minor RaidDevice State
this 0 3 2 0 active sync /dev/hda2
0 0 3 2 0 active sync /dev/hda2
1 1 22 3 1 active sync /dev/hdc3
2 2 0 0 2 faulty removed
Why does /dev/hda2 appear in that list twice? Surely it should only appear once? And again, we're missing HDD, right?
# mdadm --examine /dev/hdc3
/dev/hdc3:
Magic : a92b4efc
Version : 00.90.00
UUID : 8c8f0f62:9e69e701:409df450:89adf2fb
Creation Time : Wed Feb 15 14:35:22 2006
Raid Level : raid5
Device Size : 307451904 (293.21 GiB 314.83 GB)
Raid Devices : 3
Total Devices : 2
Preferred Minor : 0
Update Time : Fri Mar 3 14:39:36 2006
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 1
Spare Devices : 0
Checksum : 38c744e9 - correct
Events : 0.136978
Layout : left-symmetric
Chunk Size : 256K
Number Major Minor RaidDevice State
this 1 22 3 1 active sync /dev/hdc3
0 0 3 2 0 active sync /dev/hda2
1 1 22 3 1 active sync /dev/hdc3
2 2 0 0 2 faulty removed
Again, hdc3 appears twice (why?) and there is no sign of HDD.
Let's have a look for HDD...
# mdadm --examine /dev/hdd3
/dev/hdd3:
Magic : a92b4efc
Version : 00.90.00
UUID : 8c8f0f62:9e69e701:409df450:89adf2fb
Creation Time : Wed Feb 15 14:35:22 2006
Raid Level : raid5
Device Size : 307451904 (293.21 GiB 314.83 GB)
Raid Devices : 3
Total Devices : 3
Preferred Minor : 0
Update Time : Thu Mar 2 18:30:44 2006
State : clean
Active Devices : 3
Working Devices : 3
Failed Devices : 0
Spare Devices : 0
Checksum : 38c60fdd - correct
Events : 0.133609
Layout : left-symmetric
Chunk Size : 256K
Number Major Minor RaidDevice State
this 2 22 67 2 active sync /dev/hdd3
0 0 3 2 0 active sync /dev/hda2
1 1 22 3 1 active sync /dev/hdc3
2 2 22 67 2 active sync /dev/hdd3
Now this is strange: we now have hdd listed twice, along with the others.
What exactly is going on here? Is it, as I think, that I'm running on only two drives? If so, how can I make the array reintegrate hdd3? Or if I'm wrong and everything is OK, where have I gone wrong in interpreting the readouts?
Cheers,
Daniel
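A note on reading the four --examine readouts above together: each one starts with a "this" row describing the device whose superblock was read, followed by the member table stored in that superblock, which is why the examined device shows up twice. One quick way to see which member has fallen behind is to compare the Events counters. The sketch below is only an illustration in Python. It assumes the 0.90 superblock prints Events as a high.low pair of integers, and the function names are made up for this example.

```python
def events_to_int(value):
    """Convert an Events string such as '0.136972' to one integer.

    Assumption: the two parts are the high and low 32-bit halves
    of a single 64-bit event counter.
    """
    high, low = value.split(".")
    return (int(high) << 32) + int(low)


def event_lag(events_by_device):
    """Return how many events each device is behind the newest member."""
    counts = {dev: events_to_int(ev) for dev, ev in events_by_device.items()}
    newest = max(counts.values())
    return {dev: newest - count for dev, count in counts.items()}


# Events values copied from the --examine readouts in this thread.
events = {
    "/dev/hda2": "0.136972",
    "/dev/hdc3": "0.136978",
    "/dev/hdd3": "0.133609",
}
lag = event_lag(events)
for dev in sorted(lag):
    print(dev, "is", lag[dev], "events behind")
```

With the values from this thread, hdd3 is 3369 events behind the newest member, while hda2 is only 6 behind, which is consistent with the array still being written to between the two --examine calls. It also matches hdd3's older Update Time (Thu Mar 2 18:30:44, versus Fri Mar 3 14:39 on the other two).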
|
|
|
04-20-2006, 05:50 AM
|
#2
|
Senior Member
Registered: Nov 2004
Distribution: Mandriva mostly, vector 5.1, tried many.Suse gone from HD because bad Novell/Zinblows agreement
Posts: 1,606
Rep:
|
UU_
I think it means one HD is no longer part of the RAID,
confirmed by
State : clean, degraded
AFAIK a three-disk RAID 5 keeps working with only 2 HDs; that is the whole point of it.
Time to do some backups, buy a new HD, and "rebuild" the array.
See man mdadm (I have never rebuilt an array myself).
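On the [3/2] question from the first post: in /proc/mdstat the first number is how many devices the array is supposed to have, and the second is how many are currently in use, so [3/2] is the expected order for a degraded three-disk array. The small Python sketch below pulls both numbers and the U/_ flags out of the status line. The function name and the returned dictionary are just for illustration, and it assumes the flags are listed in slot order.

```python
import re


def parse_md_status(line):
    """Parse a status fragment such as '[3/2] [UU_]' from /proc/mdstat."""
    match = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
    if match is None:
        raise ValueError("no md status found in: %r" % line)
    expected = int(match.group(1))  # devices the array should have
    active = int(match.group(2))    # devices currently in use
    flags = match.group(3)          # 'U' = up, '_' = missing
    missing_slots = [i for i, flag in enumerate(flags) if flag == "_"]
    return {
        "expected": expected,
        "active": active,
        "missing_slots": missing_slots,
        "degraded": active < expected,
    }


# Status line copied from the /proc/mdstat output in the first post.
line = "614903808 blocks level 5, 256k chunk, algorithm 2 [3/2] [UU_]"
status = parse_md_status(line)
print(status)
```

For the line in this thread, that gives 3 expected devices, 2 active, and slot 2 missing, which lines up with the "removed" entry for RaidDevice 2 in the other readouts.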
|
|
|
04-20-2006, 07:37 AM
|
#3
|
LQ Newbie
Registered: Jan 2005
Posts: 12
Original Poster
Rep:
|
Cheers for the help, looks like I have some work to do.
As I understand it, the third hard disk should be physically working fine, just not as part of the array. So I'll have to reintegrate it somehow. The persistent superblock (am I right?) will still be there, which will hamper my attempts to reintegrate it the "normal" way (tutorials, man pages, etc.)
Cheers for your help,
Daniel
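For reference, putting a partition back into a degraded array is normally done with mdadm's manage mode and its --add option. The Python sketch below only builds that command for the device names used in this thread and does not run it unless asked to. The helper names and the execute switch are hypothetical, and it says nothing about whether the drive itself can be trusted.

```python
import subprocess


def build_add_command(array, member):
    """Build the mdadm manage-mode command that adds a member to an array."""
    return ["mdadm", "--manage", array, "--add", member]


def add_member(array, member, execute=False):
    """Return the command; run it only when execute=True (needs root)."""
    command = build_add_command(array, member)
    if execute:
        subprocess.run(command, check=True)
    return command


# Dry run with the names from this thread: nothing is executed here.
command = add_member("/dev/md0", "/dev/hdd3")
print(" ".join(command))
```

If the add is accepted, the rebuild progress should appear in /proc/mdstat while the array resyncs onto the returned partition.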
|
|
|
04-20-2006, 07:49 AM
|
#4
|
Senior Member
Registered: Nov 2004
Distribution: Mandriva mostly, vector 5.1, tried many.Suse gone from HD because bad Novell/Zinblows agreement
Posts: 1,606
Rep:
|
If the drive makes noises and your server is critical,
then why would you put that faulty drive back?
If you try to add that faulty drive back to the array, I do not know what can happen.
I suppose a new drive is needed.
I have only played with RAID 0.
Clicking sound at 1 month old: send it back for a refund (3 yr warranty).
|
|
|
04-20-2006, 08:09 AM
|
#5
|
LQ Newbie
Registered: Jan 2005
Posts: 12
Original Poster
Rep:
|
I think it's a 5 year warranty actually, so very cool! (It's not even 5 months old yet.)
It was clicking, but after restarting the server and plugging the hard drive back in, the clicking stopped and it seems to be working fine now. (Both Suse and Fedora can see it in /dev, and I can query it using mdadm.) So I was planning to try and simply re-add it. Good idea, or do you think I should send it back? If I send it back, wouldn't they most likely plug it into a test computer, discover that it "works" (no clicking, recognised by the OS and the BIOS) and send it back to me?
Cheers,
Daniel
|
|
|
04-20-2006, 08:21 AM
|
#6
|
Senior Member
Registered: Nov 2004
Distribution: Mandriva mostly, vector 5.1, tried many.Suse gone from HD because bad Novell/Zinblows agreement
Posts: 1,606
Rep:
|
You can install smartmontools and look into
the life parameters of the HD
(bearing in mind that only very recent kernels may support SMART on SATA).
http://smartmontools.sourceforge.net/
The vendor will accept just a printout.
There is probably a Windows utility from the vendor to access the SMART data.
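As a rough idea of what to look for in that printout: smartctl -A prints one row per attribute with a normalized VALUE, a THRESH, and a RAW_VALUE. The Python sketch below parses that table and flags rows worth a closer look. The sample text is made up for illustration and is not from the drive in this thread, and the list of watched attributes is only a common rule of thumb, not an official criterion.

```python
# Attributes whose raw count is commonly watched (a heuristic, not a rule).
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector",
           "Offline_Uncorrectable")


def parse_smart_attributes(text):
    """Parse the attribute table printed by 'smartctl -A'."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 10 or not fields[0].isdigit():
            continue  # header or unrelated line
        if not all(field.isdigit() for field in fields[3:6]):
            continue  # VALUE/WORST/THRESH not numeric, skip the row
        rows.append({
            "id": int(fields[0]),
            "name": fields[1],
            "value": int(fields[3]),
            "worst": int(fields[4]),
            "thresh": int(fields[5]),
            "raw": " ".join(fields[9:]),
        })
    return rows


def suspicious(rows):
    """Names of attributes at or below threshold, or watched with raw > 0."""
    flagged = []
    for row in rows:
        at_threshold = row["thresh"] > 0 and row["value"] <= row["thresh"]
        raw_first = row["raw"].split()[0]
        raw_hit = (row["name"] in WATCHED and raw_first.isdigit()
                   and int(raw_first) > 0)
        if at_threshold or raw_hit:
            flagged.append(row["name"])
    return flagged


# Hypothetical sample output, invented for this example.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   100   100   006    Pre-fail  Always       -       0
  5 Reallocated_Sector_Ct   0x0033   098   098   036    Pre-fail  Always       -       12
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       720
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
"""
rows = parse_smart_attributes(SAMPLE)
flagged = suspicious(rows)
print(flagged)
```

On the drive in question, the real table would come from running smartctl -A /dev/hdd as root; the overall verdict is available separately from smartctl -H.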
Be frank with the vendor and tell them that the noise stopped after you put it back.
Using it again: how much is losing all your data worth to you?
You know it is partly faulty... why do you want to use it again?
Series (batch) effect:
check whether the serial numbers are consecutive;
if one HD failed, maybe the others will too.
RAID is no replacement for backups.
|
|
|
All times are GMT -5.