08-08-2005, 07:56 PM | #1
Member
Registered: Jun 2005
Distribution: Centos
Posts: 215
Mdadm Raid question
I had a server crash overnight, and I'm still struggling to find the cause as the logs don't tell me anything.
This machine has software RAID set up, and I started querying the RAID config.
I'm new to mdadm and I didn't set it up originally, but I get the following.
Maybe someone with mdadm skills will be able to help out.
Standard disk info stuff
--------------------------------
# fdisk -l
Disk /dev/hdc: 80.0 GB, 80026361856 bytes
16 heads, 63 sectors/track, 155061 cylinders
Units = cylinders of 1008 * 512 = 516096 bytes
Device Boot Start End Blocks Id System
/dev/hdc1 * 1 207 104296+ fd Linux raid autodetect
/dev/hdc2 208 2312 1060920 fd Linux raid autodetect
/dev/hdc3 2313 155061 76985496 fd Linux raid autodetect
Disk /dev/hda: 80.0 GB, 80026361856 bytes
255 heads, 63 sectors/track, 9729 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/hda1 * 1 13 104391 fd Linux raid autodetect
/dev/hda2 14 145 1060290 fd Linux raid autodetect
/dev/hda3 146 9729 76983480 fd Linux raid autodetect
# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/md2 73G 50G 19G 73% /
/dev/md0 99M 19M 75M 21% /boot
none 503M 0 503M 0% /dev/shm
------------------------------------------------------------------
But when querying the array using
# mdadm -E /dev/hdc1
/dev/hdc1:
Magic : a92b4efc
Version : 00.90.00
UUID : 7f5b6639:a36365fd:def88079:b731abb5
Creation Time : Mon Dec 15 01:27:33 2003
Raid Level : raid1
Device Size : 104192 (101.75 MiB 106.69 MB)
Raid Devices : 2
Total Devices : 1
Preferred Minor : 0
Update Time : Tue Aug 9 17:38:38 2005
State : dirty, no-errors
Active Devices : 1
Working Devices : 1
Failed Devices : 0
Spare Devices : 0
Checksum : e4ebb016 - correct
Events : 0.57
Number Major Minor RaidDevice State
this 1 22 1 1 active sync /dev/hdc1
0 0 0 0 0 faulty removed
1 1 22 1 1 active sync /dev/hdc1
---
One of the devices is listed as faulty removed!
I only get this when querying hdc; querying the hda partitions shows everything as active sync.
Do I have a problem?
Thx
08-09-2005, 03:36 PM | #2
Member
Registered: Aug 2003
Location: Edinburgh
Distribution: Server: Gentoo2004; Desktop: Ubuntu
Posts: 720
To examine an array, you should do:
mdadm --detail /dev/md0
This shows you the full details of what has been removed, etc.
If hda is damaged, you want to replace it ASAP.
1. go and buy a new hard drive of the same size (it simplifies everything).
2. make the partitions the same size as they used to be on the old hda
3. run this command to add the partition to the array:
mdadm /dev/md0 --add /dev/hda1
(change md0 and hda1 as required.)
The hard drives will now spend time resyncing.
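A rough sketch of the whole sequence, assuming hda is the disk being replaced and hdc is the surviving mirror (the device and md names here are only examples, adjust them to your own layout):
# sfdisk -d /dev/hdc | sfdisk /dev/hda
(copies the partition table from the good disk onto the new one)
# mdadm /dev/md0 --add /dev/hda1
# mdadm /dev/md1 --add /dev/hda2
# mdadm /dev/md2 --add /dev/hda3
(re-adds each new partition to its array)
# cat /proc/mdstat
(shows the resync progress)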
if you do :
cat /proc/mdstat
it might look like:
root@hamishnet:/# cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 hde3[0]
20015744 blocks [2/1] [U_]
md0 : active raid1 hde1[1] hda5[0]
19542976 blocks [2/2] [UU]
In mine, the md0 array is good (indicated by "[UU]"); however, my md1 array has one drive missing.
hamish
08-09-2005, 09:41 PM | #3
Member
Registered: Jun 2005
Distribution: Centos
Posts: 215
Original Poster
Thx,
When I do
# cat /proc/mdstat
Personalities : [raid1]
read_ahead 1024 sectors
md2 : active raid1 hdc3[1]
76983360 blocks [2/1] [_U]
md1 : active raid1 hdc2[1]
1060224 blocks [2/1] [_U]
md0 : active raid1 hdc1[1]
104192 blocks [2/1] [_U]
unused devices: <none>
So is this OK or not?
Whenever I run the following I always see that "faulty removed" entry, and I don't know whether that's normal or not.
# mdadm --detail /dev/md2
/dev/md2:
Version : 00.90.00
Creation Time : Mon Dec 15 01:26:32 2003
Raid Level : raid1
Array Size : 76983360 (73.42 GiB 78.83 GB)
Device Size : 76983360 (73.42 GiB 78.83 GB)
Raid Devices : 2
Total Devices : 1
Preferred Minor : 2
Persistence : Superblock is persistent
Update Time : Tue Aug 9 17:38:38 2005
State : dirty, no-errors
Active Devices : 1
Working Devices : 1
Failed Devices : 0
Spare Devices : 0
Number Major Minor RaidDevice State
0 0 0 0 faulty removed
1 22 3 1 active sync /dev/hdc3
UUID : 7b337624:a23aad65:c485d413:28f65fbb
Events : 0.72
08-10-2005, 01:21 AM | #4
Senior Member
Registered: Nov 2004
Distribution: Mandriva mostly, vector 5.1, tried many. Suse gone from HD because bad Novell/Zinblows agreement
Posts: 1,606
Hi,
Newbie here, following this with interest (I've got a RAID 0 at home working OK,
I know, bad idea). Anyway:
/dev/hdb is missing from your fdisk -l.
That hard drive might be dead, then!
Can you please post your mdadm.conf?
- How do you know the HD controller is not faulty?
- Have you got SMART-capable drives? If smartmontools is set up, the log might have recorded
signs of the failure, and you can also query the drive directly.
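For example (assuming the smartmontools package is installed; the device name is just an example):
# smartctl -H /dev/hda
(overall health self-assessment)
# smartctl -a /dev/hda
(full SMART attributes and the drive's error log)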
Hamish: how can stefaandk physically identify which of the two HDs has failed
(other than by trial and error)?
A stab in the dark: did you try to restart the array with mdadm (from the command line)
to see what mdadm says?
08-10-2005, 09:15 PM | #5
Member
Registered: Jun 2005
Distribution: Centos
Posts: 215
Original Poster
It doesn't seem like the mdadm.conf file is being used:
Quote:
# more /etc/mdadm.conf
# mdadm configuration file
#
# mdadm will function properly without the use of a configuration file,
# but this file is useful for keeping track of arrays and member disks.
# In general, a mdadm.conf file is created, and updated, after arrays
# are created. This is the opposite behavior of /etc/raidtab which is
# created prior to array construction.
#
#
# the config file takes two types of lines:
#
# DEVICE lines specify a list of devices of where to look for
# potential member disks
#
# ARRAY lines specify information about how to identify arrays so
# so that they can be activated
#
# You can have more than one device line and use wild cards. The first
# example includes SCSI the first partition of SCSI disks /dev/sdb,
# /dev/sdc, /dev/sdd, /dev/sdj, /dev/sdk, and /dev/sdl. The second
# line looks for array slices on IDE disks.
#
#DEVICE /dev/sd[bcdjkl]1
#DEVICE /dev/hda1 /dev/hdb1
#
# If you mount devfs on /dev, then a suitable way to list all devices is:
#DEVICE /dev/discs/*/*
#
#
#
# ARRAY lines specify an array to assemble and a method of identification.
# Arrays can currently be identified by using a UUID, superblock minor number,
# or a listing of devices.
#
# super-minor is usually the minor number of the metadevice
# UUID is the Universally Unique Identifier for the array
# Each can be obtained using
#
# mdadm -D <md>
#
#ARRAY /dev/md0 UUID=3aaa0122:29827cfa:5331ad66:ca767371
#ARRAY /dev/md1 super-minor=1
#ARRAY /dev/md2 devices=/dev/hda1,/dev/hda2
#
# ARRAY lines can also specify a "spare-group" for each array. mdadm --monitor
# will then move a spare between arrays in a spare-group if one array has a failed
# drive but no spare
#ARRAY /dev/md4 uuid=b23f3c6d:aec43a9f:fd65db85:369432df spare-group=group1
#ARRAY /dev/md5 uuid=19464854:03f71b1b:e0df2edd:246cc977 spare-group=group1
#
# When used in --follow (aka --monitor) mode, mdadm needs a
# mail address and/or a program. This can be given with "mailaddr"
# and "program" lines to that monitoring can be started using
# mdadm --follow --scan & echo $! > /var/run/mdadm
# If the lines are not found, mdadm will exit quietly
#PROGRAM /usr/sbin/handle-mdadm-events
I basically inherited this system, so I'm trying to make sense of its RAID config.
Since I have no prior experience with mdadm, I don't want to start putting in commands that could potentially blow up this RAID.
How would I manually try to start hdb?
But if there were an hdb in this config, would that mean the mirror was across 3 disks?
08-11-2005, 01:30 AM | #6
Senior Member
Registered: Nov 2004
Distribution: Mandriva mostly, vector 5.1, tried many. Suse gone from HD because bad Novell/Zinblows agreement
Posts: 1,606
What is the output of
mdadm --detail /dev/md2
Have you got the contents of
/etc/raidtab
Btw, which distro have you got?
I am pretty sure /dev/hda and hdc are both part of the RAID.
You can have RAID with 3 HDs (but I suppose it would say RAID 5 then).
Forget about me asking about hdb, it just sounded strange, but
it is possible to have a RAID across just hda and hdc.
You are not necessarily using mdadm at the minute; there is also
an older set of utilities called mdtools (I think?).
You have [_U]: one of your hard drives is malfunctioning or dead.
But because you have a RAID 1 (mirror) setup, the system still works.
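One non-destructive check (just a suggestion; the partition names are taken from your fdisk output): read the RAID superblock on the matching hda and hdc partitions and compare them:
# mdadm -E /dev/hda3
# mdadm -E /dev/hdc3
If hda is failing you may get read errors, or its superblock will show an older Update Time / Events count than hdc's.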
08-11-2005, 01:55 AM | #7
Member
Registered: Jun 2005
Distribution: Centos
Posts: 215
Original Poster
This is on a RedHat 9 box.
So does the _U tell me with certainty that one of my drives is dead?
Because with fdisk -l I get:
Disk /dev/hdc: 80.0 GB, 80026361856 bytes
16 heads, 63 sectors/track, 155061 cylinders
Units = cylinders of 1008 * 512 = 516096 bytes
Device Boot Start End Blocks Id System
/dev/hdc1 * 1 207 104296+ fd Linux raid autodetect
/dev/hdc2 208 2312 1060920 fd Linux raid autodetect
/dev/hdc3 2313 155061 76985496 fd Linux raid autodetect
Disk /dev/hda: 80.0 GB, 80026361856 bytes
255 heads, 63 sectors/track, 9729 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/hda1 * 1 13 104391 fd Linux raid autodetect
/dev/hda2 14 145 1060290 fd Linux raid autodetect
/dev/hda3 146 9729 76983480 fd Linux raid autodetect
It seems that both disks are there, or does fdisk still show a disk even when it's dead, because of the RAID?
Here is the other output you asked for:
# mdadm --detail /dev/md2
/dev/md2:
Version : 00.90.00
Creation Time : Mon Dec 15 01:26:32 2003
Raid Level : raid1
Array Size : 76983360 (73.42 GiB 78.83 GB)
Device Size : 76983360 (73.42 GiB 78.83 GB)
Raid Devices : 2
Total Devices : 1
Preferred Minor : 2
Persistence : Superblock is persistent
Update Time : Tue Aug 9 17:38:38 2005
State : dirty, no-errors
Active Devices : 1
Working Devices : 1
Failed Devices : 0
Spare Devices : 0
Number Major Minor RaidDevice State
0 0 0 0 faulty removed
1 22 3 1 active sync /dev/hdc3
UUID : 7b337624:a23aad65:c485d413:28f65fbb
Events : 0.72
----------------
# more /etc/raidtab
raiddev /dev/md2
raid-level 1
nr-raid-disks 2
chunk-size 64k
persistent-superblock 1
nr-spare-disks 0
device /dev/hda3
raid-disk 0
device /dev/hdc3
raid-disk 1
raiddev /dev/md0
raid-level 1
nr-raid-disks 2
chunk-size 64k
persistent-superblock 1
nr-spare-disks 0
device /dev/hda1
raid-disk 0
device /dev/hdc1
raid-disk 1
raiddev /dev/md1
raid-level 1
nr-raid-disks 2
chunk-size 64k
persistent-superblock 1
nr-spare-disks 0
device /dev/hda2
raid-disk 0
device /dev/hdc2
raid-disk 1
--------
08-11-2005, 02:22 AM | #8
Senior Member
Registered: Nov 2004
Distribution: Mandriva mostly, vector 5.1, tried many. Suse gone from HD because bad Novell/Zinblows agreement
Posts: 1,606
Re [_U]:
Looks like it, yes (but I am like you, is this 100% sure?).
You might want to do some backups first and then try
to restart the RAID with some of the mdtools
rather than mdadm. I heard mdadm is "better" and I use it,
but then you will need to edit mdadm.conf.
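If the box really is managed with the old raidtools package (what I called mdtools), the commands would be something like this sketch (assuming raidtools is installed and /etc/raidtab describes the arrays):
# raidstart /dev/md2
(starts an array defined in /etc/raidtab)
# raidhotadd /dev/md2 /dev/hda3
(re-adds a member partition once the disk is known good or replaced)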
I don't have enough knowledge to say why fdisk still sees both HDs.
Maybe one of them is not that badly damaged?
An example
http://aplawrence.com/Linux/rebuildraid.html
I have never rebuilt an array myself (and cannot, because I have RAID 0).
A generic piece of info
http://gentoo-wiki.com/HOWTO_Gentoo_..._Software_RAID
Maybe you could try to plug each HD on its own and reboot
(I have no idea of the possible consequences of that)
08-11-2005, 10:12 AM | #9
Member
Registered: Aug 2003
Location: Edinburgh
Distribution: Server: Gentoo2004; Desktop: Ubuntu
Posts: 720
[_U] means that one of the disks is broken.
The above means that the first drive in the array is unavailable. [U_] means that the second HDD is unavailable.
I believe that trial and error is the only way to find out. You are right in thinking that doing fdisk -l will give you an indication of which one is broken. If you do that and see that hdb is not listed in fdisk -l, then you can open up the PC and see if hdb is in fact a hard drive.
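Another hint (just a suggestion): the kernel log usually records IDE errors from a dying drive, so you can grep it for the suspect device, e.g.:
# dmesg | grep -i hda
# grep hda /var/log/messages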
raidtab has nothing to do with mdadm; it belongs to the older raidtools package. They are two different tools for doing the same thing; raidtools is older, and mdadm is becoming more popular.
You will find that /etc/mdadm.conf is probably unused. I have never used it; in fact, I didn't know it existed!
best of luck
08-12-2005, 12:50 AM | #10
Senior Member
Registered: Nov 2004
Distribution: Mandriva mostly, vector 5.1, tried many. Suse gone from HD because bad Novell/Zinblows agreement
Posts: 1,606
I suppose one can do without mdadm.conf while using mdadm with some scripts,
and this will depend on the distro.
On my distro the RAID arrays are started automatically, and I think mdadm
takes the info it needs from mdadm.conf (which I configured by hand).
My point was that possibly mdtools was used by stefaandk's legacy system
rather than mdadm.
It must be said that indeed stefaandk can use either mdtools or mdadm.
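As an aside, if you ever want to populate mdadm.conf from the running arrays, mdadm can print the ARRAY lines itself; a minimal sketch (check the output before appending it to the file):
# mdadm --detail --scan
You would then add those ARRAY lines, plus a DEVICE line, to /etc/mdadm.conf.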
08-15-2005, 07:47 PM | #11
Member
Registered: Jun 2005
Distribution: Centos
Posts: 215
Original Poster
Thanks for all the help with this, guys. It was indeed a faulty drive; I had it replaced and it's all good now!
08-16-2005, 02:11 AM | #13
Senior Member
Registered: Nov 2004
Distribution: Mandriva mostly, vector 5.1, tried many. Suse gone from HD because bad Novell/Zinblows agreement
Posts: 1,606
Glad to know you are sorted. Hope you have learned about RAID in the process :-)