LinuxQuestions.org


unkie888 08-02-2007 12:12 PM

software raid - linux device naming
 
Hi,

I have just set up software RAID on Linux (FC5). It appears to work fine.

Quote:

[root@x] cat /proc/mdstat

md1 : active raid5 sdc1[2] sdb2[1] sda2[0]
4096000 blocks level 5, 256k chunk, algorithm 2 [3/3] [UUU]

But something is troubling me. I know that if a disk fails I will see this:

Quote:

[root@x] cat /proc/mdstat

md1 : active raid5 sdc1[2] sdb2[1] sda2[0]
4096000 blocks level 5, 256k chunk, algorithm 2 [3/2] [UU_]

The UU_ would indicate that sdc1 has failed.
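
For readers who want to check this status string from a script, here is a minimal Python sketch. It assumes the usual /proc/mdstat layout, where an "mdN :" header line is followed by a line ending in "[total/working] [flags]", and where the position of each "_" in the flags is the array slot that is down. The function name down_slots is made up for this example.

```python
import re

def down_slots(mdstat_text):
    """Return {array_name: [slot indexes shown as '_']} for each md array."""
    result = {}
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # The status line ends with something like "[3/2] [UU_]".
        status = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
        if current and status:
            flags = status.group(3)
            result[current] = [i for i, c in enumerate(flags) if c == "_"]
            current = None
    return result

# Sample text typed in for illustration, not read from a live system.
sample = """\
md1 : active raid5 sdb2[1] sda2[0]
      4096000 blocks level 5, 256k chunk, algorithm 2 [3/2] [UU_]
"""
print(down_slots(sample))  # {'md1': [2]}
```

This only tells you which slot is down. It does not say which physical drive that is, which is the question below.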

But how do I match up "sdc" with a physical drive?

That is, when I open up the box, which one do I replace?

I'm thinking "sdc" would correspond to the "SATA3" socket on my motherboard, but does it really work that way? Or does Linux just assign device names on an unpredictable (first come, first served) basis?

cgjones 08-02-2007 07:45 PM

I don't currently use SATA, but I am 99% sure that /dev/sdc would correspond to the third SATA port.

unkie888 08-04-2007 05:07 AM

determining which physical disk is broken with software raid
 
Actually, no, this will be useless, because if disk sdb is removed (dies) then on a reboot sdc becomes the new sdb. So you can't use the Linux drive designation: you'd be removing the wrong disk!

I misled you when I said:

Quote:

md1 : active raid5 sdc1[2] sdb2[1] sda2[0]
4096000 blocks level 5, 256k chunk, algorithm 2 [3/2] [UU_]
which is wrong. If a drive were gone it would be this:

Quote:

md1 : active raid5 sdb1[1] sda2[0]
4096000 blocks level 5, 256k chunk, algorithm 2 [3/2] [UU_]
So... the only way I can think of is not to use Linux at all. The BIOS describes the disks by the SATA connector they are on; mine are CH2S, CH2M, CH3M. By going into the BIOS with only one disk attached at a time, I have been able to label each one with a marker pen.

When a disk goes, I will boot into the BIOS to see which one is missing, then I will replace that one.

I think this will work.

horde 02-17-2008 04:15 AM

Finding the failed SATA drive
 
I'm sure there are better solutions and I will probably take them up if I can find them on the net.

On a regular basis (once a day) I run the following Perl code, and its output gets emailed to my client machine (if you are more organised than me, you only need to do this once and store the results away):

#!/usr/bin/perl
# List out all HDD serial numbers of disks in RAID array
#
# The trailing pipe "|" directs command output
# into our program:

$process = "yes";
if (! open (ListDevPIPE,"mdadm --detail \/dev\/md0 |")) {
    die "Can't run ls! $!\n";
}

while (<ListDevPIPE>) {
    chomp $_ ;

    $lin = $_ ;
    $linein = ltrim($_);

    if ( trim($linein) eq "") {
        next;
    }

    if ( $linein =~ /active sync/ ) {

        @devinfo = split(/ +/,$linein);

        $SerialLine = `hdparm -I $devinfo[6] | grep "Serial Number:"`;
        chomp $SerialLine;

        @serialinfo = split(/\s+/,$SerialLine);

        print "$lin Serial Number : $serialinfo[3]\n";
    }
    else {
        print "$lin\n";
    }

}

sub ltrim() {
    my $string = shift;
    $string =~ s/^\s+//;
    return $string;
}

Output will look like this for a failure:

/dev/md0:
Version : 00.90.03
Creation Time : Mon Jun 11 03:41:55 2007
Raid Level : raid5
Array Size : 1250242048 (1192.32 GiB 1280.25 GB)
Device Size : 312560512 (298.08 GiB 320.06 GB)
Raid Devices : 5
Total Devices : 4
Preferred Minor : 0
Persistence : Superblock is persistent
Update Time : Sun Feb 17 17:31:57 2008
State : clean, degraded
Active Devices : 4
Working Devices : 4
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 128K
UUID : d9f81e55:2fe5e5fb:f8139d5b:a6e55cd4
Events : 0.454424
Number   Major   Minor   RaidDevice   State
   0       8       1         0        active sync   /dev/sda1   Serial Number : 5QF0YT6C
   1       8      17         1        active sync   /dev/sdb1   Serial Number : 5QF03P11
   2       0       0         2        removed
   3       8      49         3        active sync   /dev/sdd1   Serial Number : 9QF49ERL
   4       3      65         4        active sync   /dev/hdb1   Serial Number : 5QF4S9J4

On my system those serial numbers match the external serial numbers printed on the drives, so it is relatively easy to identify the failed drive: it is the one whose serial number is no longer listed.
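
The "store the results away" idea can be sketched in Python as a comparison between serials recorded while the array was healthy and the serials in the latest report. This is only an illustration: the function names are invented here, and the third recorded serial is a placeholder, since the failed drive's serial does not appear in the output above.

```python
import re

def serials_from_report(report_text):
    """Collect serial numbers from lines containing 'Serial Number : XYZ'."""
    return set(re.findall(r"Serial Number : (\S+)", report_text))

def failed_candidates(recorded_serials, report_text):
    """Serials recorded while healthy that no longer appear in the report."""
    return sorted(set(recorded_serials) - serials_from_report(report_text))

# Serials noted down while all five drives were healthy. "PLACEHOLDER-3" is
# a made-up stand-in for the drive that later dropped out of the array.
recorded = ["5QF0YT6C", "5QF03P11", "PLACEHOLDER-3", "9QF49ERL", "5QF4S9J4"]

# Device lines copied from the report of the degraded array.
report = """\
0 8 1 0 active sync /dev/sda1 Serial Number : 5QF0YT6C
1 8 17 1 active sync /dev/sdb1 Serial Number : 5QF03P11
2 0 0 2 removed
3 8 49 3 active sync /dev/sdd1 Serial Number : 9QF49ERL
4 3 65 4 active sync /dev/hdb1 Serial Number : 5QF4S9J4
"""
print(failed_candidates(recorded, report))  # ['PLACEHOLDER-3']
```

The drive to pull is then the one whose printed label carries the serial that went missing.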

Once the failed drive shows as removed (on openSUSE anyway), take out the dead drive, put in the new one, and partition it as Linux RAID (a bit more effort if the drives aren't the same size). Then run "mdadm /dev/md0 -a /dev/sdc1" and away goes the rebuild. It is very easy once you've figured out which drive failed.
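
As a rough checklist, the replacement steps can be written out by a small Python helper that only prints the commands and runs nothing. The mdadm line is the one quoted above. The sfdisk line is an assumption added for this sketch: it is one common way to copy a partition layout between same-size disks and is not something the post specifies. The device names are examples and rebuild_plan is a made-up name.

```python
import shlex

def rebuild_plan(array, healthy_disk, new_disk, new_partition):
    """Return the shell commands for the replacement steps, in order."""
    return [
        # Assumption: copy the partition layout from a surviving member
        # (only sensible when both disks are the same size).
        f"sfdisk -d {shlex.quote(healthy_disk)} | sfdisk {shlex.quote(new_disk)}",
        # From the post: add the new partition; md then starts the rebuild.
        f"mdadm {shlex.quote(array)} -a {shlex.quote(new_partition)}",
        # Watch the rebuild progress.
        "cat /proc/mdstat",
    ]

# Print the plan for review; nothing is executed here.
for command in rebuild_plan("/dev/md0", "/dev/sda", "/dev/sdc", "/dev/sdc1"):
    print(command)
```

Check the device names against the serial numbers before running anything by hand, since writing a partition table to the wrong disk destroys its contents.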

gzunk 02-17-2008 05:21 AM

The Linux sd driver will assign the disks in the order that the BIOS presents them, so it's a fixed order that may not necessarily correspond to the SATA ports.

For example, in one of my previous motherboards it went like this:

SATA 0 = /dev/sdc
SATA 1 = /dev/sdd
SATA 2 = /dev/sda
SATA 3 = /dev/sdb

And yes, if you remove one, the /dev/sdX will change. I've not used the RAID driver, but perhaps instead of using /dev/sdX you could use /dev/disk/by-id/X or /dev/disk/by-uuid/X. These entries don't change for a disk even if you remove disks.

The only problem is that I believe they are set up by udev and they are symbolic links. So they won't be there if the RAID driver starts before udev, and they won't help if the RAID driver doesn't accept symbolic links.
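
To see how the by-id names line up with the current /dev/sdX names, here is a minimal Python sketch that resolves each symbolic link to its target. It assumes udev has already populated /dev/disk/by-id, and the function name by_id_map is invented for this example. For ATA disks the link names typically embed the model and serial number.

```python
import os

def by_id_map(by_id_dir="/dev/disk/by-id"):
    """Map each kernel device path to the sorted by-id names pointing at it."""
    mapping = {}
    for name in sorted(os.listdir(by_id_dir)):
        link = os.path.join(by_id_dir, name)
        if not os.path.islink(link):
            continue  # only symbolic links are of interest here
        target = os.path.realpath(link)  # e.g. /dev/sda
        mapping.setdefault(target, []).append(name)
    return mapping

if __name__ == "__main__":
    # Skip quietly on systems where udev has not created the directory.
    if os.path.isdir("/dev/disk/by-id"):
        for device, names in sorted(by_id_map().items()):
            print(device, "<-", ", ".join(names))
```

Running it before and after pulling a disk shows the /dev/sdX side of the mapping shifting while the by-id names stay attached to the same physical drives.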

I use LVM2 to manage the drives. It ignores the /dev/sdX aspect of the drive and just scans the drives on startup to see where the various partitions that it knows about are, so if I lose a drive on a mirror then LVM will still start up correctly.

I also use (physical) sticky labels on the drive and on the cable to indicate what the drive should be (/dev/sdX) and what connector it's plugged into (sata 0).

cam34 11-09-2008 03:45 PM

Sorry just a correction:
The line: if ( trim($linein) eq "") {
Should actually read: if ( ltrim($linein) eq "") {

So the whole corrected code should be:

Code:

#!/usr/bin/perl
# List the serial numbers of all active disks in a RAID array.
use strict;
use warnings;

# The trailing pipe "|" directs the command's output into our program.
open(my $mdadm, "mdadm --detail /dev/md0 |")
    or die "Can't run mdadm: $!\n";

while (my $line = <$mdadm>) {
    chomp $line;

    # Strip leading whitespace so the split below gives fixed field positions.
    my $trimmed = ltrim($line);
    next if $trimmed eq "";

    if ($trimmed =~ /active sync/) {
        # Fields: Number Major Minor RaidDevice "active" "sync" device
        my @devinfo = split(/ +/, $trimmed);
        my $device  = $devinfo[6];

        # hdparm -I prints a line such as "Serial Number:      5QF0YT6C".
        my $serial_line = `hdparm -I $device | grep "Serial Number:"`;
        chomp $serial_line;
        my @serialinfo = split(/\s+/, $serial_line);
        my $serial = $serialinfo[3] // "unknown";

        print "$line Serial Number : $serial\n";
    }
    else {
        print "$line\n";
    }
}
close($mdadm);

# Remove leading whitespace from a string.
sub ltrim {
    my $string = shift;
    $string =~ s/^\s+//;
    return $string;
}

Cheers cAm


All times are GMT -5.