LinuxQuestions.org


rnturn 07-11-2013 04:06 PM

Effect of changing SCSI IDs on md devices
 
First off, sorry for the length of this post...

Background:

I have a system with three external SCSI enclosures. One currently contains the boot device (SCSI ID 0) and the rest of the operating system, some application directories, and home directories (/dev/sd[a-f]). The other two contain disks (/dev/sd[g-l] and /dev/sd[m-r]) that I've been building RAID1 devices on (through a two-channel SCSI adapter). The enclosures that are holding the mirrors have a SCSI ID selector switch that allows you to have each bay automatically assigned an ID from 0-6 or, after flipping the switch, 8-15. Currently the switch is set to "0-6". (Yeah, they're SCA interfaces.) All the disks and md devices are hosting ext3 filesystems.

What I'd like to do is flip those switches and reassign those devices to the upper IDs. That would let me use the internal connections on the SCSI adapter to host 2-4 disks internally in the system case for a future upgrade of the OS and major applications. (I'd prefer these to be using IDs 0 and 1.) Short term, the current external boot devices would still be used until the new OS and applications are available and I've moved users' home directories onto the RAID devices on those two enclosures. Once that upgrade task is done, the external enclosure with the OS would go away. (Less noise / less power.)

I suspect that merely flipping the switch may not cause any immediate problem since the number of disks isn't changing and the names of the disks will stay the same. But adding the internal disks to the SCSI adapter will almost certainly change things. (That's when I expect the real "fun" to begin.)

Q:

1.) Will changing the SCSI ID of the disks in those two cabinets cause the "md" devices to become unrecognized? I.e., will they be seen as completely new disks by Linux after switching the SCSI IDs? (My pessimistic side is leaning toward "yes" unless someone tells me otherwise.)

2.) And, if the md devices would need to be rebuilt, what commands would be needed to rebuild them? What's the best HOWTO available for managing md RAID devices?

3.) Can I use garden-variety disk names (/dev/sdX) in /etc/mdadm.conf? All I've seen previously are definitions that use something that appears to be disk UUIDs.

4.) Or, would it be easier to back up all the data on the disks in those two cabinets, build new RAID devices, switch everything to ext4 filesystems, and restore?

Any tips to help me pull this off in the shortest time are most welcome.


TIA...

--
Rick



Yeah, yeah, yeah... I realize that the removal of the SCSI enclosure that currently hosts the OS disk (and friends) will cause another Great Disk Renaming to take place and force me to rebuild the md devices yet again. (Boy, I wish I had gotten to know some kernel developers and lobbied hard for using disk names like c0t0p0 -- similar to what SYSV uses -- so manipulating disks wouldn't cause weird things to happen to RAID devices. Drive letters are a PITA.) And it'll happen again if/when I migrate to external SATA enclosures. :^/

Ser Olmy 07-11-2013 10:44 PM

Wow, someone is still using SCSI? :)
Quote:

Originally Posted by rnturn (Post 4988651)
1.) Will changing the SCSI ID of the disks in those two cabinets cause the "md" devices to become unrecognized? I.e., will they be seen as completely new disks by Linux after switching the SCSI IDs? (My pessimistic side is leaning toward "yes" unless someone tells me otherwise.)

In that case, I'm happy to tell you that md arrays are not normally recognized based on the ID or device name of the underlying device components, but rather on the UUID in the metadata on each device/partition.

It is possible to "bind" specific device names using the DEVICE parameter in /etc/mdadm.conf, but why anyone would want to do that is a mystery to me. Device names may change for any number of reasons (a new driver, a new kernel, drivers compiled as modules vs. built into the kernel, etc.), so the default is to let the md driver figure things out by itself.

As long as the partition type is set to "Linux raid autodetect" (fd), the md driver will find and identify all components and activate the /dev/mdN devices.
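If you want to check, the partition type is visible in the partition table and the assembled arrays show up in /proc/mdstat after boot (the device name below is just one of yours, as an example):

Code:

# md members should show partition type 'fd' (Linux raid autodetect)
fdisk -l /dev/sdg

# arrays that were autodetected and assembled, with their current state
cat /proc/mdstat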

Quote:

Originally Posted by rnturn (Post 4988651)
2.) And, if the md devices would need to be rebuilt, what commands would be needed to rebuild them? What's the best HOWTO available for managing md RAID devices?

I don't know about HowTo documents, but the mdadm --assemble command can be used to piece together a RAID set that for some reason isn't autodetected.
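For example, something along these lines (the array and member names are just placeholders for your layout):

Code:

# reassemble a specific array from known members
mdadm --assemble /dev/md0 /dev/sdg1 /dev/sdm1

# or scan every device for md superblocks and assemble whatever is found
mdadm --assemble --scan

The scan form is usually all you need after a renumbering, since it goes by the superblock UUIDs rather than the device names.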

Quote:

Originally Posted by rnturn (Post 4988651)
3.) Can I use garden-variety disk names (/dev/sdX) in /etc/mdadm.conf? All I've seen previously are definitions that use something that appears to be disk UUIDs.

Yes, you can! And you don't want to! :) Seriously, you should just let the md driver take care of this. My /etc/mdadm.conf contains nothing but comments.
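If you do put anything in there, identify the arrays by UUID rather than by /dev/sdX names; something like this (the UUIDs below are made up):

Code:

# /etc/mdadm.conf
DEVICE partitions
ARRAY /dev/md0 UUID=3aaa0122:29827cfa:5331ad66:ca767371
ARRAY /dev/md1 UUID=f0d1b236:9f4d9c1e:a2b3c4d5:ca767371

A file like that survives any amount of SCSI ID or device-name shuffling.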

rnturn 07-12-2013 10:32 AM

Quote:

Wow, someone is still using SCSI? :)
They're not real large but they're reliable as hell. I had one system that I rebuilt about a year ago and felt a little sad deciding against reusing one of the SCSI drives that I'd been using in it. Based on the dates on some of the controller board chips, it had been built back in the mid '90s and had been in nearly constant use all that time. If the bearings in that drive hadn't started making scary sounds it might still be in service.

Garden-variety SATA disks aren't suitable for RAID -- at least not software RAID -- so I use one of those USB drive bays with a couple of big SATA drives and back up incessantly. Once I can afford enterprise-class SATA drives, though...

Quote:

... I'm happy to tell you that md arrays are not normally recognized based on the ID or device name of the underlying device components, but rather on the UUID in the metadata on each device/partition.
I knew that was supposed to be the case (which is, I suppose, why the SYSV-type naming convention hasn't been needed) but what I've not seen is how the UUID-like string that gets used in the md device configuration can be tied back to the physical devices. I've looked at the information in the superblock, the /dev tree, the /sys tree, etc., and not found anything that helps link the mirror and its components together. Any pointers? I guess I'd like to know just how mdadm is providing that list of disk names when you specify "--verbose".
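Is it just a matter of running something like the following against each member and matching up the UUID fields? (The device and array names are just examples from my layout.)

Code:

# per-member superblock: includes the UUID of the array this partition belongs to
mdadm --examine /dev/sdg1 | grep -i uuid

# array view: the same UUID plus the current list of member devices
mdadm --detail /dev/md0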

...snip...

Quote:

Seriously, you should just let the md driver take care of this. My /etc/mdadm.conf contains nothing but comments.
Actually, I've saved the output of the "mdadm --detail --scan --verbose > /etc/mdadm.conf" command on one system that recently had a mirror member failure and commented out the "devices=" records. At least I have a record of what things looked like at a point in time.
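For the record, what got saved looks roughly like this (the UUID and device names here are just illustrative):

Code:

ARRAY /dev/md0 level=raid1 num-devices=2 metadata=0.90 UUID=8c0d6f5e:1a2b3c4d:5e6f7a8b:9c0d1e2f
#   devices=/dev/sdg1,/dev/sdm1

Since the "devices=" line is the only part tied to the current device naming, commenting it out leaves a file that still identifies the array by UUID.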

So, any tips on linking md UUIDs to the actual drives?

TIA...


--
Rick

jefro 07-12-2013 03:49 PM

Hey! :) I have a few hundred servers running scsi drives still.

I can't say for sure, but I know UUID and device-by-name tend to work OK. Ordering in the BIOS / SCSI BIOS could cause issues in generic terms. It's usually easy to test: it either works or it doesn't.

rnturn 07-16-2013 09:39 AM

Quote:

Originally Posted by jefro (Post 4989277)
I can't say for sure, but I know UUID and device-by-name tend to work OK. Ordering in the BIOS / SCSI BIOS could cause issues in generic terms. It's usually easy to test: it either works or it doesn't.

Righto on the BIOS problem. My main workstation was (seemingly) assigning the disk order at random a while back. The cause was that the motherboard battery was beginning to fail: a power cycle would sometimes change the disk order and make the system unbootable until I went into setup and reset the correct order. I've never seen a SCSI controller that randomly changed drive IDs. That has always (in my experience, anyway) been a function of hardware (shelf bay or jumpers). Of course, a dead drive would affect the drive letters assigned at boot time.

Why am I looking for the UUID<->device mapping? Imagine a disk enclosure/shelf full of identical SATA drives and then try to find the one that's failed. At least with SCSI, I have sufficient information under /proc to track down disks, jumpers (or shelf location) to tell me what ID each disk is using, and status LEDs on the drives themselves that might -- at least for some disks -- even show you that the drive has failed. SATA disks are truly black boxes with nothing to visually distinguish a good disk from a failed one.

Given my experience with their short lifetimes, I'm hoping to find a simple means of locating a disk in an "md" device when it fails. Having to pull cables, reboot, and see if the device even shows up in a reduced state is not a very good way to locate a dead disk. (It's surely not made easier by having to manually edit the Grub2 configuration to go into single-user mode each time. Remember how easy it was with the original Grub bootloader to tab over to where you'd simply enter "single" and press Enter to change the boot level? But I digress... :)) Imagine the hassle of finding the dead disk in a large RAID5 device: "Yeah, yeah... I'll have the system back up as soon as I can find the disk that failed." Ugh!

At least it'll be a little while before I can afford a wholesale replacement of the SCSI disks. Here's hoping that the HOWTOs that address some of these situations get written.
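Would something like this be a sane way to do it when a member drops out -- get the failed member's name from mdadm, then match its serial number against the label on the drive? (The device names are only placeholders.)

Code:

# which member has md marked as failed?
cat /proc/mdstat
mdadm --detail /dev/md0

# map the kernel name to the model/serial printed on the drive's label
ls -l /dev/disk/by-id/ | grep sdg

# or read the serial number directly from the drive
smartctl -i /dev/sdg

If the serial number is readable on the drive's label, that would at least narrow it down to one physical unit without any cable pulling.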

--
Rick

rnturn 07-16-2013 11:58 AM

Quote:

Originally Posted by thienlya (Post 4991342)
Category: Part-time evening work

Title: Office staff

Contact: (08) 3 7714774

[snip]

Phone: (08) 3 7714774

Huh? :)

It came through as a heinous unicode mess in my browser. What is this supposed to be?


--
Rick
