LinuxQuestions.org
Old 08-02-2007, 12:12 PM   #1
unkie888
Member
 
Registered: Aug 2007
Posts: 67

Rep: Reputation: 24
Software RAID - Linux device naming


Hi,

I have just set up software RAID on Linux (FC5). It appears to work fine.

Quote:
[root@x] cat /proc/mdstat

md1 : active raid5 sdc1[2] sdb2[1] sda2[0]
4096000 blocks level 5, 256k chunk, algorithm 2 [3/3] [UUU]
But something is troubling me. I know that if a disk fails I will see this:

Quote:
[root@x] cat /proc/mdstat

md1 : active raid5 sdc1[2] sdb2[1] sda2[0]
4096000 blocks level 5, 256k chunk, algorithm 2 [3/3] [UU_]
The UU_ would indicate that sdc1 has failed.

But how do I match up "sdc" with a physical drive?

That is, when I open up the box, which one do I replace?

I'm thinking "sdc" would correspond to the "SATA3" socket on my motherboard, but does it really work that way? Or does Linux just assign device names on an unpredictable (first come, first served) basis?
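
As a side note, a degraded array can be spotted by a script rather than by reading the status string by eye. The following is a minimal shell sketch, assuming the /proc/mdstat layout shown in the quotes above (an "mdN :" line followed by a line whose last field looks like [UUU]). The function name degraded_arrays is made up for illustration.

```shell
#!/bin/sh
# Print the name and status string of every md array that has a missing
# member, reading /proc/mdstat-style text from standard input.
# (degraded_arrays is a hypothetical helper name, not an mdadm feature.)
degraded_arrays() {
    awk '
        # An array line looks like: "md1 : active raid5 sdc1[2] sdb2[1] sda2[0]"
        /^md[0-9]+ *:/ { name = $1 }
        # The status line ends with a field such as "[UUU]" or "[UU_]";
        # each "_" marks a slot with no working device.
        name != "" && $NF ~ /^\[[U_]+\]$/ {
            if ($NF ~ /_/) print name, $NF
            name = ""
        }
    '
}
```

Running `degraded_arrays < /proc/mdstat` prints a line such as `md1 [UU_]` for each degraded array and nothing when every member is up. It only reports which slot is empty, so it does not by itself answer which physical drive to pull.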
 
Old 08-02-2007, 07:45 PM   #2
cgjones
Member
 
Registered: Nov 2005
Location: Central New York
Distribution: Ubuntu
Posts: 405

Rep: Reputation: 30
I don't currently use SATA, but I am 99% sure that /dev/sdc would correspond to the third SATA port.
 
Old 08-04-2007, 05:07 AM   #3
unkie888
Member
 
Registered: Aug 2007
Posts: 67

Original Poster
Rep: Reputation: 24
Determining which physical disk is broken with software RAID

Actually, no, this will be useless, because if disk sdb is removed (dies) then on a reboot sdc becomes the new sdb. So you can't use the Linux drive designation - you'd be removing the wrong disk!


I misled you when I said:

Quote:
md1 : active raid5 sdc1[2] sdb2[1] sda2[0]
4096000 blocks level 5, 256k chunk, algorithm 2 [3/3] [UU_]
which is wrong. If a drive were gone it would be this:

Quote:
md1 : active raid5 sdb1[1] sda2[0]
4096000 blocks level 5, 256k chunk, algorithm 2 [3/3] [UU_]
So... the only way I can think of is not to use Linux at all. The BIOS identifies the disks by the SATA connector they are on. Mine are CH2S, CH2M, CH3M. By going into the BIOS with only one disk attached each time, I have been able to label each one with a marker pen.

When a disk goes, I will boot into the BIOS to see which one is missing, then I will replace that one.

I think this will work.
 
Old 02-17-2008, 04:15 AM   #4
horde
LQ Newbie
 
Registered: Jan 2005
Posts: 19

Rep: Reputation: 0
Finding the failed SATA drive

I'm sure there are better solutions and I will probably take them up if I can find them on the net.

On a regular basis (once a day) I run the following Perl code and have its output emailed to my client machine (if you are more organised than me you only need to do this once and store the results away):

#!/usr/bin/perl
# List out all HDD serial numbers of disks in RAID array
#
# The trailing pipe "|" directs command output
# into our program:

$process = "yes";
if (! open (ListDevPIPE,"mdadm --detail /dev/md0 |")) {
    die "Can't run mdadm! $!\n";
}

while (<ListDevPIPE>) {
    chomp $_ ;

    $lin = $_ ;
    $linein = ltrim($_);

    # Skip blank lines.
    # NOTE: trim() is not defined anywhere in this script, so this call
    # fails at run time; ltrim() was intended.
    if ( trim($linein) eq "") {
        next;
    }

    if ( $linein =~ /active sync/ ) {

        # Fields: Number Major Minor RaidDevice "active" "sync" device
        @devinfo = split(/ +/,$linein);

        $SerialLine = `hdparm -I $devinfo[6] | grep "Serial Number:"`;
        chomp $SerialLine;

        # Leading whitespace makes field 0 empty, so the serial is field 3.
        @serialinfo = split(/\s+/,$SerialLine);

        print "$lin Serial Number : $serialinfo[3]\n";
    }
    else {
        print "$lin\n";
    }

}

# Strip leading whitespace from a string.
sub ltrim {
    my $string = shift;
    $string =~ s/^\s+//;
    return $string;
}

Output will look like this for a failure:

/dev/md0:
Version : 00.90.03
Creation Time : Mon Jun 11 03:41:55 2007
Raid Level : raid5
Array Size : 1250242048 (1192.32 GiB 1280.25 GB)
Device Size : 312560512 (298.08 GiB 320.06 GB)
Raid Devices : 5
Total Devices : 4
Preferred Minor : 0
Persistence : Superblock is persistent
Update Time : Sun Feb 17 17:31:57 2008
State : clean, degraded
Active Devices : 4
Working Devices : 4
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 128K
UUID : d9f81e55:2fe5e5fb:f8139d5b:a6e55cd4
Events : 0.454424
    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1 Serial Number : 5QF0YT6C
       1       8       17        1      active sync   /dev/sdb1 Serial Number : 5QF03P11
       2       0        0        2      removed
       3       8       49        3      active sync   /dev/sdd1 Serial Number : 9QF49ERL
       4       3       65        4      active sync   /dev/hdb1 Serial Number : 5QF4S9J4

On my system those serial numbers match the serial numbers printed on the drive labels, so it is relatively easy to identify the failed drive.

Once the failed drive has been removed from the array (in openSUSE anyway), take out the dead drive, put in the new one, and partition it as Linux RAID (a bit more effort if the drives aren't the same size). Then run "mdadm /dev/md0 -a /dev/sdc1" and away goes the rebuild. It is very easy once you've figured out which drive failed.
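
The replacement steps above can be written down as a dry run. This is only a sketch under stated assumptions: the new disk is the same size as a healthy member and both use MBR partition tables, so the layout can be copied with sfdisk -d. The function name print_replace_steps and its parameter names are hypothetical. It prints the commands instead of running them, so nothing touches a disk.

```shell
#!/bin/sh
# Print (without running) the commands for adding a replacement disk to
# an md array. All four arguments are supplied by the caller.
# (print_replace_steps is a hypothetical helper name.)
print_replace_steps() {
    array=$1      # the array, e.g. /dev/md0
    good_disk=$2  # a healthy member disk to copy the partition layout from
    new_disk=$3   # the replacement disk
    new_part=$4   # the partition on the new disk to add to the array

    # Copy the partition table from the healthy disk to the new one
    # (assumes same-size disks with MBR partition tables).
    echo "sfdisk -d $good_disk | sfdisk $new_disk"
    # Add the new partition; md starts rebuilding onto it.
    echo "mdadm $array -a $new_part"
    # Watch the rebuild progress.
    echo "cat /proc/mdstat"
}
```

For example, `print_replace_steps /dev/md0 /dev/sda /dev/sdc /dev/sdc1` lists the three commands for review. Check the device names against the serial numbers before running anything by hand, since the sfdisk step overwrites the target disk's partition table.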
 
Old 02-17-2008, 05:21 AM   #5
gzunk
Member
 
Registered: Sep 2006
Posts: 89

Rep: Reputation: 20
The Linux sd driver will assign the disks in the order that the BIOS presents them, so it's a fixed order that may not necessarily correspond to the SATA ports.

For example, in one of my previous motherboards it went like this:

SATA 0 = /dev/sdc
SATA 1 = /dev/sdd
SATA 2 = /dev/sda
SATA 3 = /dev/sdb

And yes, if you remove one, the /dev/sdX will change. I've not used the RAID driver, but perhaps instead of using /dev/sdX you could use /dev/disk/by-id/X or /dev/disk/by-uuid/X. These entries don't change for a disk even if you remove disks.

The only problem is that I believe they are set up by udev, and they are symbolic links. So they won't be there if the RAID driver starts before udev, and they won't help if the RAID driver doesn't accept symbolic links.
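
To see which /dev/sdX each persistent name currently points at, the symbolic links can simply be resolved. The following is a minimal shell sketch, assuming udev has populated /dev/disk/by-id and that readlink -f (GNU coreutils) is available. The function name list_disk_ids is made up for illustration.

```shell
#!/bin/sh
# Print "link-name -> resolved-device" for every symbolic link in a
# directory such as /dev/disk/by-id (the default when no argument is given).
# (list_disk_ids is a hypothetical helper name.)
list_disk_ids() {
    dir=${1:-/dev/disk/by-id}
    for link in "$dir"/*; do
        # Skip anything that is not a symbolic link, including the
        # literal pattern left behind when the directory is empty.
        [ -L "$link" ] || continue
        printf '%s -> %s\n' "${link##*/}" "$(readlink -f "$link")"
    done
}
```

Calling `list_disk_ids` with no argument walks /dev/disk/by-id. On many systems the link names there include the drive model and serial number, which can then be compared with the label on the physical drive.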

I use LVM2 to manage the drives, and it ignores the /dev/sdX aspect of the drive and just scans them on startup to see where the various partitions that it knows about are. So if I lose a drive on a mirror, LVM will still start up correctly.

I also use (physical) sticky labels on the drive and on the cable to indicate what the drive should be (/dev/sdX) and what connector it's plugged into (SATA 0).
 
Old 11-09-2008, 03:45 PM   #6
cam34
Member
 
Registered: Aug 2003
Distribution: OpenSuse 11.1, SLES10, Fedora 11 & XP 4 Gaming *sniffs
Posts: 101

Rep: Reputation: 16
Sorry just a correction:
The line: if ( trim($linein) eq "") {
Should actually read: if ( ltrim($linein) eq "") {

So the whole corrected code should be:

Code:
#!/usr/bin/perl
# List out all HDD serial numbers of disks in RAID array
use strict;
use warnings;

# The trailing pipe "|" directs command output
# into our program:
open(my $list_dev_pipe, "mdadm --detail /dev/md0 |")
    or die "Can't run mdadm! $!\n";

while (my $lin = <$list_dev_pipe>) {
    chomp $lin;

    my $linein = ltrim($lin);

    # Skip blank lines.
    if ( ltrim($linein) eq "") {
        next;
    }

    if ( $linein =~ /active sync/ ) {

        # Fields: Number Major Minor RaidDevice "active" "sync" device
        my @devinfo = split(/ +/, $linein);

        my $serial_line = `hdparm -I $devinfo[6] | grep "Serial Number:"`;
        chomp $serial_line;

        # Leading whitespace makes field 0 empty, so the serial is field 3.
        my @serialinfo = split(/\s+/, $serial_line);
        my $serial = defined $serialinfo[3] ? $serialinfo[3] : "unknown";

        print "$lin Serial Number : $serial\n";
    }
    else {
        print "$lin\n";
    }

}
close($list_dev_pipe);

# Strip leading whitespace from a string.
sub ltrim {
    my $string = shift;
    $string =~ s/^\s+//;
    return $string;
}
Cheers cAm

Last edited by cam34; 11-09-2008 at 03:46 PM. Reason: Formatting
 
  

