LinuxQuestions.org


jdavidow 01-09-2008 01:05 PM

RAID5 + spare assembles from different devices each boot
 
I have posted a few times about this, but it looks like I was confusing the RAID problems I was having with udev and naming rules.

I have a RAID5 array with a spare (5+1). The problem is that when I boot, the system incorporates the spare as a member device, detects a degraded array and immediately begins rebuilding the array.

Even more interesting:
My devices are actually partitioned so that I have two arrays (/home and /var) on the same physical drives. The last time I booted, the two arrays were assembled using different sets of partitions! (e.g. md0 = sd[adbce]1 and md1 = sd[abcdf]2) I thought it might have had to do with the order in which the drives were detected, but this seems to prove that's not the case.

I thought it might have to do with the spare having a confusing superblock. All the partitions have valid superblocks. I tried to delete the SB on the spare, but as soon as I add it back to the array the SB is written and the issue recurs.

Does anyone out there know what md_mod is doing at boot? I can't seem to find any forum, list or info on it, other than the man page.

/etc/mdadm/mdadm.conf
Code:

DEVICE partitions
ARRAY /dev/md0 level=raid5 num-devices=5 spares=1 UUID=e7356e2b:71e53a26:94b87bc7:e6a9e6b2
ARRAY /dev/md1 level=raid5 num-devices=5 spares=1 UUID=aa0264a3:5fb0396b:04071607:a713ba9d
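
For anyone comparing this config against what the kernel actually assembled, here is a minimal Python sketch that pulls the ARRAY lines into a dictionary. The function name `parse_arrays` is made up for illustration, and the parser is deliberately naive: it assumes one ARRAY per line, no comments and no continuation lines, which holds for the file above but not for every mdadm.conf.

```python
# The two-array config quoted in the post, pasted verbatim.
CONF = """\
DEVICE partitions
ARRAY /dev/md0 level=raid5 num-devices=5 spares=1 UUID=e7356e2b:71e53a26:94b87bc7:e6a9e6b2
ARRAY /dev/md1 level=raid5 num-devices=5 spares=1 UUID=aa0264a3:5fb0396b:04071607:a713ba9d
"""


def parse_arrays(text):
    """Return {device: {option: value}} for every ARRAY line (hypothetical helper)."""
    arrays = {}
    for line in text.splitlines():
        words = line.split()
        if len(words) < 2 or words[0].upper() != "ARRAY":
            continue  # skip DEVICE lines, blanks, anything else
        options = {}
        for word in words[2:]:
            if "=" in word:
                key, value = word.split("=", 1)
                options[key.lower()] = value
        arrays[words[1]] = options
    return arrays


if __name__ == "__main__":
    for device, options in parse_arrays(CONF).items():
        print(device, options["num-devices"], "members,", options["spares"], "spare")
```

The values come back as strings, so convert with `int()` before comparing them with the counts that `mdadm --detail` reports.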

/proc/mdstat
Code:

Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md1 : active raid5 sdg2[5](S) sde2[0] sda2[3] sdc2[4] sdb2[2] sdf2[1]
      927913984 blocks level 5, 64k chunk, algorithm 2 [5/5] [UUUUU]

md0 : active raid5 sdg1[5](S) sde1[0] sdc1[4] sda1[3] sdb1[2] sdf1[1]
      48869120 blocks level 5, 64k chunk, algorithm 2 [5/5] [UUUUU]

unused devices: <none>
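
To make the listing above easier to diff from one boot to the next, something like the following Python sketch could split each array's members into active devices and spares. It only understands the `dev[N]` and `dev[N](S)` tokens that appear in this output; `parse_mdstat` is a hypothetical helper, and any other flag is simply collected under "flagged" rather than interpreted.

```python
import re

# The array lines from /proc/mdstat as posted above.
MDSTAT = """\
md1 : active raid5 sdg2[5](S) sde2[0] sda2[3] sdc2[4] sdb2[2] sdf2[1]
      927913984 blocks level 5, 64k chunk, algorithm 2 [5/5] [UUUUU]

md0 : active raid5 sdg1[5](S) sde1[0] sdc1[4] sda1[3] sdb1[2] sdf1[1]
      48869120 blocks level 5, 64k chunk, algorithm 2 [5/5] [UUUUU]
"""

# Matches member tokens such as "sde2[0]" or "sdg2[5](S)".
MEMBER = re.compile(r"(\w+)\[(\d+)\](?:\((\w)\))?")


def parse_mdstat(text):
    """Return {array: {"active": {number: dev}, "spares": [...], "flagged": [...]}}."""
    arrays = {}
    for line in text.splitlines():
        head = re.match(r"^(md\d+) : (.*)$", line)
        if not head:
            continue  # size/status lines carry no member names
        info = {"active": {}, "spares": [], "flagged": []}
        for dev, number, flag in MEMBER.findall(head.group(2)):
            if flag == "S":
                info["spares"].append(dev)
            elif flag:
                info["flagged"].append(dev)
            else:
                info["active"][int(number)] = dev
        arrays[head.group(1)] = info
    return arrays


if __name__ == "__main__":
    for name, info in sorted(parse_mdstat(MDSTAT).items()):
        print(name, "active:", info["active"], "spares:", info["spares"])
```

Saving this result after a known-good boot gives a baseline to compare against when the spare ends up in a member slot.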

mdadm output
Code:

$ sudo mdadm --detail /dev/md0
/dev/md0:
        Version : 00.90.03
  Creation Time : Sat Apr  7 23:32:58 2007
    Raid Level : raid5
    Array Size : 48869120 (46.61 GiB 50.04 GB)
  Used Dev Size : 12217280 (11.65 GiB 12.51 GB)
  Raid Devices : 5
  Total Devices : 6
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Wed Jan  9 10:42:23 2008
          State : clean
 Active Devices : 5
Working Devices : 6
 Failed Devices : 0
  Spare Devices : 1

        Layout : left-symmetric
    Chunk Size : 64K

          UUID : e7356e2b:71e53a26:94b87bc7:e6a9e6b2
        Events : 0.2601918

    Number  Major  Minor  RaidDevice State
      0      8      65        0      active sync  /dev/sde1
      1      8      81        1      active sync  /dev/sdf1
      2      8      17        2      active sync  /dev/sdb1
      3      8        1        3      active sync  /dev/sda1
      4      8      33        4      active sync  /dev/sdc1

      5      8      97        -      spare  /dev/sdg1
$ sudo mdadm --examine /dev/sda1
(pretty much the same for all member devices)
/dev/sda1:
          Magic : a92b4efc
        Version : 00.90.00
          UUID : e7356e2b:71e53a26:94b87bc7:e6a9e6b2
  Creation Time : Sat Apr  7 23:32:58 2007
    Raid Level : raid5
  Used Dev Size : 12217280 (11.65 GiB 12.51 GB)
    Array Size : 48869120 (46.61 GiB 50.04 GB)
  Raid Devices : 5
  Total Devices : 6
Preferred Minor : 0

    Update Time : Wed Jan  9 10:46:29 2008
          State : clean
 Active Devices : 5
Working Devices : 6
 Failed Devices : 0
  Spare Devices : 1
      Checksum : c50d245 - correct
        Events : 0.2601918

        Layout : left-symmetric
    Chunk Size : 64K

      Number  Major  Minor  RaidDevice State
this    3      8        1        3      active sync  /dev/sda1

  0    0      8      65        0      active sync  /dev/sde1
  1    1      8      81        1      active sync  /dev/sdf1
  2    2      8      17        2      active sync  /dev/sdb1
  3    3      8        1        3      active sync  /dev/sda1
  4    4      8      33        4      active sync  /dev/sdc1
  5    5      8      97        5      spare  /dev/sdg1
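
As a sanity check on the numbers in that output: a RAID5 array gives up one member's worth of space to parity, so the usable size should be (Raid Devices - 1) x Used Dev Size. A short Python sketch of that arithmetic, using the sizes reported in this thread (in KiB); the function name is just for illustration.

```python
def raid5_capacity_kib(raid_devices, used_dev_size_kib):
    """Usable RAID5 size in KiB: one member's worth of space holds parity."""
    if raid_devices < 2:
        raise ValueError("RAID5 needs at least two member devices")
    return (raid_devices - 1) * used_dev_size_kib


if __name__ == "__main__":
    # md0: 5 raid devices, Used Dev Size 12217280 KiB (from mdadm --detail)
    md0 = raid5_capacity_kib(5, 12217280)
    print(md0)          # 48869120, the Array Size mdadm reports
    print(md0 * 1024)   # 50041978880 bytes

    # md1: 5 raid devices; 231978496 is the per-device block count
    # shown in the resync line of the dmesg output
    md1 = raid5_capacity_kib(5, 231978496)
    print(md1)          # 927913984, the block count in /proc/mdstat
```

Both results agree with the posted sizes, so the array geometry itself looks consistent; the spare does not count toward capacity.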



dmesg (edited)
Code:

[  36.112449] md: linear personality registered for level -1
[  36.117197] md: multipath personality registered for level -4
[  36.121795] md: raid0 personality registered for level 0
[  36.126950] md: raid1 personality registered for level 1
[  36.131424] raid5: automatically using best checksumming function: pIII_sse
[  36.149971]    pIII_sse  :  4564.000 MB/sec
[  36.150020] raid5: using function: pIII_sse (4564.000 MB/sec)
[  36.218015] raid6: int32x1    780 MB/s
[  36.285943] raid6: int32x2    902 MB/s
[  36.353961] raid6: int32x4    667 MB/s
[  36.421869] raid6: int32x8    528 MB/s
[  36.489811] raid6: mmxx1    1813 MB/s
[  36.557775] raid6: mmxx2    2123 MB/s
[  36.625763] raid6: sse1x1    1101 MB/s
[  36.693717] raid6: sse1x2    1898 MB/s
[  36.761688] raid6: sse2x1    2227 MB/s
[  36.829647] raid6: sse2x2    3178 MB/s
[  36.829695] raid6: using algorithm sse2x2 (3178 MB/s)
[  36.829744] md: raid6 personality registered for level 6
[  36.829793] md: raid5 personality registered for level 5
[  36.829842] md: raid4 personality registered for level 4
[  36.853475] md: raid10 personality registered for level 10

[  39.634924] sd 0:0:0:0: [sda] 488397168 512-byte hardware sectors (250059 MB)
[  39.634995] sd 0:0:0:0: [sda] Write Protect is off
[  39.635048] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[  39.635076] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[  39.635218] sd 0:0:0:0: [sda] 488397168 512-byte hardware sectors (250059 MB)
[  39.635292] sd 0:0:0:0: [sda] Write Protect is off
[  39.635350] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[  39.635380] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[  39.635462]  sda: sda1 sda2
[  39.650092] sd 0:0:0:0: [sda] Attached SCSI disk

[  39.650226] sd 1:0:0:0: [sdb] 490234752 512-byte hardware sectors (251000 MB)
[  39.650296] sd 1:0:0:0: [sdb] Write Protect is off
[  39.650348] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[  39.650379] sd 1:0:0:0: [sdb] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[  39.650505] sd 1:0:0:0: [sdb] 490234752 512-byte hardware sectors (251000 MB)
[  39.650573] sd 1:0:0:0: [sdb] Write Protect is off
[  39.650625] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[  39.650657] sd 1:0:0:0: [sdb] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[  39.650727]  sdb: sdb1 sdb2
[  39.667599] sd 1:0:0:0: [sdb] Attached SCSI disk

[  39.667719] sd 3:0:0:0: [sdc] 490234752 512-byte hardware sectors (251000 MB)
[  39.667788] sd 3:0:0:0: [sdc] Write Protect is off
[  39.667840] sd 3:0:0:0: [sdc] Mode Sense: 00 3a 00 00
[  39.667871] sd 3:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[  39.667997] sd 3:0:0:0: [sdc] 490234752 512-byte hardware sectors (251000 MB)
[  39.668064] sd 3:0:0:0: [sdc] Write Protect is off
[  39.668116] sd 3:0:0:0: [sdc] Mode Sense: 00 3a 00 00
[  39.668146] sd 3:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[  39.668213]  sdc: sdc1 sdc2
[  39.692703] sd 3:0:0:0: [sdc] Attached SCSI disk

[  39.699348] sd 0:0:0:0: Attached scsi generic sg0 type 0
[  39.699570] sd 1:0:0:0: Attached scsi generic sg1 type 0
[  39.699786] sd 3:0:0:0: Attached scsi generic sg2 type 0

[  39.834560] md: md0 stopped.
[  39.870361] md: bind<sdc1>
[  39.870527] md: md1 stopped.
[  39.910999] md: md0 stopped.
[  39.911064] md: unbind<sdc1>
[  39.911120] md: export_rdev(sdc1)
[  39.929760] md: bind<sda1>
[  39.929953] md: bind<sdc1>
[  39.930139] md: bind<sdb1>
[  39.930231] md: md1 stopped.
[  39.932468] md: bind<sdc2>
[  39.932674] md: bind<sda2>
[  39.932860] md: bind<sdb2>

[  40.880217] sd 6:0:0:0: [sdd] 234441648 512-byte hardware sectors (120034 MB)
[  40.880288] sd 6:0:0:0: [sdd] Write Protect is off
[  40.880340] sd 6:0:0:0: [sdd] Mode Sense: 00 3a 00 00
[  40.880367] sd 6:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[  40.880504] sd 6:0:0:0: [sdd] 234441648 512-byte hardware sectors (120034 MB)
[  40.880572] sd 6:0:0:0: [sdd] Write Protect is off
[  40.880623] sd 6:0:0:0: [sdd] Mode Sense: 00 3a 00 00
[  40.880651] sd 6:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[  40.880718]  sdd: sdd1 sdd2 <<7>ieee1394: Host added: ID:BUS[0-00:1023]  GUID[001485000012704c]
[  40.908264]  sdd5 >
[  40.908479] sd 6:0:0:0: [sdd] Attached SCSI disk
[  40.908579] sd 6:0:0:0: Attached scsi generic sg4 type 0

[  40.908747] scsi 6:0:1:0: Direct-Access    ATA      Maxtor 7L250S0  BACE PQ: 0 ANSI: 5
[  40.908899] sd 6:0:1:0: [sde] 490234752 512-byte hardware sectors (251000 MB)
[  40.908968] sd 6:0:1:0: [sde] Write Protect is off
[  40.909020] sd 6:0:1:0: [sde] Mode Sense: 00 3a 00 00
[  40.909050] sd 6:0:1:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[  40.909174] sd 6:0:1:0: [sde] 490234752 512-byte hardware sectors (251000 MB)
[  40.909243] sd 6:0:1:0: [sde] Write Protect is off
[  40.909294] sd 6:0:1:0: [sde] Mode Sense: 00 3a 00 00
[  40.909324] sd 6:0:1:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[  40.909391]  sde: sde1 sde2
[  40.930218] sd 6:0:1:0: [sde] Attached SCSI disk
[  40.930318] sd 6:0:1:0: Attached scsi generic sg5 type 0

[  40.930480] scsi 7:0:0:0: Direct-Access    ATA      WDC WD2500JD-00H 08.0 PQ: 0 ANSI: 5
[  40.930621] sd 7:0:0:0: [sdf] 488397168 512-byte hardware sectors (250059 MB)
[  40.930689] sd 7:0:0:0: [sdf] Write Protect is off
[  40.930742] sd 7:0:0:0: [sdf] Mode Sense: 00 3a 00 00
[  40.930769] sd 7:0:0:0: [sdf] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[  40.930894] sd 7:0:0:0: [sdf] 488397168 512-byte hardware sectors (250059 MB)
[  40.930963] sd 7:0:0:0: [sdf] Write Protect is off
[  40.931015] sd 7:0:0:0: [sdf] Mode Sense: 00 3a 00 00
[  40.931044] sd 7:0:0:0: [sdf] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[  40.931111]  sdf: sdf1 sdf2
[  40.948846] sd 7:0:0:0: [sdf] Attached SCSI disk
[  40.948946] sd 7:0:0:0: Attached scsi generic sg6 type 0

[  40.949106] scsi 7:0:1:0: Direct-Access    ATA      WDC WD2500JS-00M 02.0 PQ: 0 ANSI: 5
[  40.949248] sd 7:0:1:0: [sdg] 488397168 512-byte hardware sectors (250059 MB)
[  40.949317] sd 7:0:1:0: [sdg] Write Protect is off
[  40.949368] sd 7:0:1:0: [sdg] Mode Sense: 00 3a 00 00
[  40.949396] sd 7:0:1:0: [sdg] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[  40.949519] sd 7:0:1:0: [sdg] 488397168 512-byte hardware sectors (250059 MB)
[  40.949588] sd 7:0:1:0: [sdg] Write Protect is off
[  40.949640] sd 7:0:1:0: [sdg] Mode Sense: 00 3a 00 00
[  40.949668] sd 7:0:1:0: [sdg] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[  40.949734]  sdg: sdg1 sdg2
[  40.969827] sd 7:0:1:0: [sdg] Attached SCSI disk
[  40.969926] sd 7:0:1:0: Attached scsi generic sg7 type 0

[  41.206078] md: md0 stopped.
[  41.206137] md: unbind<sdb1>
[  41.206187] md: export_rdev(sdb1)
[  41.206253] md: unbind<sdc1>
[  41.206302] md: export_rdev(sdc1)
[  41.206360] md: unbind<sda1>
[  41.206408] md: export_rdev(sda1)
[  41.247389] md: bind<sdf1>
[  41.247584] md: bind<sdb1>
[  41.247787] md: bind<sda1>
[  41.247971] md: bind<sdc1>
[  41.248151] md: bind<sdg1>
[  41.248325] md: bind<sde1>
[  41.256718] raid5: device sde1 operational as raid disk 0
[  41.256771] raid5: device sdc1 operational as raid disk 4
[  41.256821] raid5: device sda1 operational as raid disk 3
[  41.256870] raid5: device sdb1 operational as raid disk 2
[  41.256919] raid5: device sdf1 operational as raid disk 1
[  41.257426] raid5: allocated 5245kB for md0
[  41.257476] raid5: raid level 5 set md0 active with 5 out of 5 devices, algorithm 2
[  41.257538] RAID5 conf printout:
[  41.257584]  --- rd:5 wd:5
[  41.257631]  disk 0, o:1, dev:sde1
[  41.257677]  disk 1, o:1, dev:sdf1
[  41.257724]  disk 2, o:1, dev:sdb1
[  41.257771]  disk 3, o:1, dev:sda1
[  41.257817]  disk 4, o:1, dev:sdc1

[  41.257952] md: md1 stopped.
[  41.258009] md: unbind<sdb2>
[  41.258060] md: export_rdev(sdb2)
[  41.258128] md: unbind<sda2>
[  41.258179] md: export_rdev(sda2)
[  41.258248] md: unbind<sdc2>
[  41.258306] md: export_rdev(sdc2)
[  41.283067] md: bind<sdc2>
[  41.283297] md: bind<sda2>
[  41.285235] md: bind<sdb2>
[  41.306753] md: md1 stopped.
[  41.306818] md: unbind<sdb2>
[  41.306878] md: export_rdev(sdb2)
[  41.306956] md: unbind<sda2>
[  41.307007] md: export_rdev(sda2)
[  41.307075] md: unbind<sdc2>
[  41.307130] md: export_rdev(sdc2)
[  41.312250] md: bind<sdf2>
[  41.312476] md: bind<sdb2>
[  41.312711] md: bind<sdg2>
[  41.312922] md: bind<sdc2>
[  41.313138] md: bind<sda2>
[  41.313343] md: bind<sde2>
[  41.313452] md: md1: raid array is not clean -- starting background reconstruction
[  41.322189] raid5: device sde2 operational as raid disk 0
[  41.322243] raid5: device sdc2 operational as raid disk 4
[  41.322292] raid5: device sdg2 operational as raid disk 3
[  41.322342] raid5: device sdb2 operational as raid disk 2
[  41.322391] raid5: device sdf2 operational as raid disk 1
[  41.322823] raid5: allocated 5245kB for md1
[  41.322872] raid5: raid level 5 set md1 active with 5 out of 5 devices, algorithm 2
[  41.322934] RAID5 conf printout:
[  41.322980]  --- rd:5 wd:5
[  41.323026]  disk 0, o:1, dev:sde2
[  41.323073]  disk 1, o:1, dev:sdf2
[  41.323119]  disk 2, o:1, dev:sdb2
[  41.323165]  disk 3, o:1, dev:sdg2
[  41.323212]  disk 4, o:1, dev:sdc2

[  41.323316] md: resync of RAID array md1
[  41.323364] md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
[  41.323415] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for resync.
[  41.323492] md: using 128k window, over a total of 231978496 blocks.
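
The interesting part of that log is which device the kernel reports in each raid slot. Below is a rough Python sketch that groups the "operational as raid disk" lines by array and compares them with an expected layout. The expected layout for md1 is an assumption taken from the /proc/mdstat listing earlier in this post (sda2 in slot 3, sdg2 as spare); the helper names are hypothetical.

```python
import re

# Relevant raid5 lines copied from the dmesg output above.
DMESG = """\
[  41.256718] raid5: device sde1 operational as raid disk 0
[  41.256771] raid5: device sdc1 operational as raid disk 4
[  41.256821] raid5: device sda1 operational as raid disk 3
[  41.256870] raid5: device sdb1 operational as raid disk 2
[  41.256919] raid5: device sdf1 operational as raid disk 1
[  41.257476] raid5: raid level 5 set md0 active with 5 out of 5 devices, algorithm 2
[  41.322189] raid5: device sde2 operational as raid disk 0
[  41.322243] raid5: device sdc2 operational as raid disk 4
[  41.322292] raid5: device sdg2 operational as raid disk 3
[  41.322342] raid5: device sdb2 operational as raid disk 2
[  41.322391] raid5: device sdf2 operational as raid disk 1
[  41.322872] raid5: raid level 5 set md1 active with 5 out of 5 devices, algorithm 2
"""

OPERATIONAL = re.compile(r"raid5: device (\w+) operational as raid disk (\d+)")
ACTIVATED = re.compile(r"raid5: raid level 5 set (md\d+) active")


def slots_from_dmesg(text):
    """Return {array: {slot: device}}; slot lines precede the 'set mdX active' line."""
    arrays, pending = {}, {}
    for line in text.splitlines():
        match = OPERATIONAL.search(line)
        if match:
            pending[int(match.group(2))] = match.group(1)
            continue
        match = ACTIVATED.search(line)
        if match:
            arrays[match.group(1)] = pending
            pending = {}
    return arrays


def unexpected_members(actual, expected):
    """Return {slot: (expected_dev, actual_dev)} for every slot that differs."""
    return {slot: (expected.get(slot), dev)
            for slot, dev in sorted(actual.items())
            if expected.get(slot) != dev}


# Assumed "good" layouts, taken from the mdstat output earlier in this post.
EXPECTED = {
    "md0": {0: "sde1", 1: "sdf1", 2: "sdb1", 3: "sda1", 4: "sdc1"},
    "md1": {0: "sde2", 1: "sdf2", 2: "sdb2", 3: "sda2", 4: "sdc2"},
}

if __name__ == "__main__":
    for name, slots in sorted(slots_from_dmesg(DMESG).items()):
        print(name, unexpected_members(slots, EXPECTED[name]))
```

On this log, md0 comes up as expected while md1 shows sdg2 in slot 3 where sda2 should be, which matches the resync that starts right afterwards.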


ramram29 01-09-2008 02:37 PM

Run the following command and post the output:

df -h

jdavidow 01-09-2008 02:38 PM

Filesystem      Size  Used Avail Use% Mounted on
/dev/sdd1       108G   41G   61G  41% /
varrun          872G  576G  252G  70% /var/run
varlock         872G  576G  252G  70% /var/lock
udev            506M  508K  506M   1% /dev
devshm          506M     0  506M   0% /dev/shm
lrm             506M   34M  472M   7% /lib/modules/2.6.22-14-generic/volatile
/dev/md0         46G   21G   24G  47% /home
/dev/md1        872G  576G  252G  70% /var

ramram29 01-09-2008 03:22 PM

Run:

fdisk -l

jdavidow 01-09-2008 03:30 PM

No problem. The boot drive (non-RAID) is /dev/sdd. The six RAID drives are made up of two different models. Also, drives a-c are attached to a PCI SATA controller card, while drives d-g are plugged into the motherboard. sdd1 is accessed by GRUB as hd0,0. It used to be the other way around, but it swapped when I upgraded Ubuntu from Dapper, back in the day.

(I have also included fstab in case you were headed that way...)

Code:

$ sudo fdisk -l

Disk /dev/sda: 250.0 GB, 250059350016 bytes
255 heads, 63 sectors/track, 30401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000

  Device Boot      Start        End      Blocks  Id  System
/dev/sda1              1        1521    12217401  fd  Linux raid autodetect
/dev/sda2            1522      30401  231978600  fd  Linux raid autodetect

Disk /dev/sdb: 251.0 GB, 251000193024 bytes
255 heads, 63 sectors/track, 30515 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000

  Device Boot      Start        End      Blocks  Id  System
/dev/sdb1              1        1521    12217401  fd  Linux raid autodetect
/dev/sdb2            1522      30401  231978600  fd  Linux raid autodetect

Disk /dev/sdc: 251.0 GB, 251000193024 bytes
255 heads, 63 sectors/track, 30515 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000

  Device Boot      Start        End      Blocks  Id  System
/dev/sdc1              1        1521    12217401  fd  Linux raid autodetect
/dev/sdc2            1522      30401  231978600  fd  Linux raid autodetect

Disk /dev/md0: 50.0 GB, 50041978880 bytes
2 heads, 4 sectors/track, 12217280 cylinders
Units = cylinders of 8 * 512 = 4096 bytes
Disk identifier: 0x00000000

Disk /dev/md0 doesn't contain a valid partition table

Disk /dev/md1: 950.1 GB, 950183919616 bytes
2 heads, 4 sectors/track, 231978496 cylinders
Units = cylinders of 8 * 512 = 4096 bytes
Disk identifier: 0x00000000

Disk /dev/md1 doesn't contain a valid partition table

Disk /dev/sdd: 120.0 GB, 120034123776 bytes
255 heads, 63 sectors/track, 14593 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x535bfd7a

  Device Boot      Start        End      Blocks  Id  System
/dev/sdd1  *          1      14219  114214086  83  Linux
/dev/sdd2          14220      14593    3004155    5  Extended
/dev/sdd5          14220      14593    3004123+  82  Linux swap / Solaris

Disk /dev/sde: 251.0 GB, 251000193024 bytes
255 heads, 63 sectors/track, 30515 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000

  Device Boot      Start        End      Blocks  Id  System
/dev/sde1              1        1521    12217401  fd  Linux raid autodetect
/dev/sde2            1522      30401  231978600  fd  Linux raid autodetect

Disk /dev/sdf: 250.0 GB, 250059350016 bytes
255 heads, 63 sectors/track, 30401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000

  Device Boot      Start        End      Blocks  Id  System
/dev/sdf1              1        1521    12217401  fd  Linux raid autodetect
/dev/sdf2            1522      30401  231978600  fd  Linux raid autodetect

Disk /dev/sdg: 250.0 GB, 250059350016 bytes
255 heads, 63 sectors/track, 30401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000

  Device Boot      Start        End      Blocks  Id  System
/dev/sdg1              1        1521    12217401  fd  Linux raid autodetect
/dev/sdg2            1522      30401  231978600  fd  Linux raid autodetect
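
The partition sizes above also line up with the "Used Dev Size" that mdadm reported earlier. With a version 0.90 superblock, md rounds the device size down to a 64 KiB boundary and reserves a further 64 KiB at the end for the superblock. A small Python sketch of that calculation follows; the function name is made up, and it assumes the 0.90 layout and that every member partition is the same size, as they are here.

```python
RESERVED_KIB = 64  # v0.90 superblock area: 64 KiB, aligned to a 64 KiB boundary


def used_dev_size_kib(partition_kib):
    """Data area of a v0.90 member: round down to 64 KiB, then drop 64 KiB."""
    aligned = (partition_kib // RESERVED_KIB) * RESERVED_KIB
    return aligned - RESERVED_KIB


if __name__ == "__main__":
    # Block counts from the fdisk listing (1 KiB blocks)
    print(used_dev_size_kib(12217401))    # 12217280, Used Dev Size of md0
    print(used_dev_size_kib(231978600))   # 231978496, per-device size of md1
```

Since all six drives carry identically sized partitions, the spare is the same size as the members in both arrays.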

/etc/fstab:
Code:

proc /proc proc defaults 0 0
# Entry for /dev/sdd1 :
UUID=e69f1ea9-78d0-4291-aa30-1068cb7a1953 / ext3 defaults,errors=remount-ro 0 1
# Entry for /dev/sdd5 :
UUID=c956c71e-dd6c-4430-9b8e-d98026c03760 none swap sw 0 0
/dev/hda /media/cdrom1 udf,iso9660 user,noauto 0 0
/dev/ /media/floppy0 auto rw,user,noauto 0 0
/dev/md0 /home ext2 defaults,errors=remount-ro 0 1
/dev/md1 /var ext2 defaults,errors=remount-ro 0 1


