This is a really weird problem...
I use SUSE 9.2 (x86-64), kernel 2.6.8-24.11, on an AOpen nForce3-based motherboard with six SATA drives and one PATA device (the DVD drive), 512MB RAM and an AMD Athlon 64 2800+.
The six SATA drives appeared as sda, sdb ... sdf.
I created one raid1 (mirror) as /dev/md0, one raid5 as /dev/md1 and one raid0 as /dev/md2:
# cat /etc/raidtab
# autogenerated /etc/raidtab by YaST2
raiddev /dev/md0
raid-level 1
nr-raid-disks 2
nr-spare-disks 0
persistent-superblock 1
chunk-size 4
device /dev/sda2
raid-disk 0
device /dev/sdb2
raid-disk 1
raiddev /dev/md2
raid-level 0
nr-raid-disks 4
persistent-superblock 1
chunk-size 32
device /dev/sdc1
raid-disk 0
device /dev/sdd1
raid-disk 1
device /dev/sde1
raid-disk 2
device /dev/sdf1
raid-disk 3
raiddev /dev/md1
raid-level 5
nr-raid-disks 6
nr-spare-disks 0
persistent-superblock 1
parity-algorithm left-symmetric
chunk-size 128
device /dev/sda4
raid-disk 0
device /dev/sdb4
raid-disk 1
device /dev/sdc3
raid-disk 2
device /dev/sdd3
raid-disk 3
device /dev/sde3
raid-disk 4
device /dev/sdf3
raid-disk 5
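To make it easier to see which arrays depend on which physical disk, here is a minimal sketch in Python. It is only an illustration, not something taken from the system above: the function names parse_raidtab and arrays_using are made up for this example, and the sample text is the raidtab above trimmed to the keys the sketch reads (plus one other key to show that unknown keys are skipped).

```python
def parse_raidtab(text):
    """Return {array: {"level": str, "devices": [str, ...]}} from raidtab text."""
    arrays = {}
    current = None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and blank lines
        if not line:
            continue
        parts = line.split()
        key = parts[0]
        value = parts[1] if len(parts) > 1 else ""
        if key == "raiddev":
            current = {"level": None, "devices": []}
            arrays[value] = current
        elif current is None:
            continue                          # ignore keys before the first raiddev
        elif key == "raid-level":
            current["level"] = value
        elif key == "device":
            current["devices"].append(value)
        # every other key (chunk-size, raid-disk, ...) is skipped in this sketch
    return arrays


def arrays_using(arrays, disk):
    """Names of arrays that have at least one member partition on `disk`."""
    return sorted(
        name
        for name, info in arrays.items()
        if any(dev.startswith(disk) and dev[len(disk):].isdigit()
               for dev in info["devices"])
    )


# Trimmed copy of the raidtab shown above (hypothetical variable name).
RAIDTAB = """
raiddev /dev/md0
raid-level 1
chunk-size 4
device /dev/sda2
device /dev/sdb2
raiddev /dev/md2
raid-level 0
device /dev/sdc1
device /dev/sdd1
device /dev/sde1
device /dev/sdf1
raiddev /dev/md1
raid-level 5
device /dev/sda4
device /dev/sdb4
device /dev/sdc3
device /dev/sdd3
device /dev/sde3
device /dev/sdf3
"""

arrays = parse_raidtab(RAIDTAB)
print(arrays_using(arrays, "/dev/sdd"))
print(arrays_using(arrays, "/dev/sdf"))
```

For /dev/sdd and for /dev/sdf this lists md1 and md2, while md0 only depends on sda and sdb.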
It worked for a few hours, but one of the disks (/dev/sdd) was faulty and was dropped from the raids after too many data errors; md0 was unaffected (it uses only sda & sdb), md1 entered critical mode (no data lost) and md2 died (fortunately all data there was preserved elsewhere). I kept working like that for a few days until I got a replacement drive. I installed it, partitioned it with YaST and then tried to recover the previous structure. I could reconstruct md1 with no problems, but md2 was impossible... After some tests, I discovered that /dev/sdf *DOES NOT EXIST NOW*, even though /dev/sdf3 is part of md1 (md1 was still in critical mode; in fact it was being rebuilt at the moment I discovered that sdf was not listed in /dev).
So /dev/sdf (and consequently /dev/sdf3) did not exist, yet /dev/md1 was using it flawlessly...
Right now, md1 works:
# cat /proc/mdstat
md1 : active raid5 sdf3[5] sde3[4] sdd3[3] sdc3[2] sdb4[1] sda4[0]
1065550720 blocks level 5, 128k chunk, algorithm 2 [6/6] [UUUUUU]
but one of the drives it is using does not exist:
# ls /dev/sdf
/bin/ls: /dev/sdf: No such file or directory
It cannot be accessed from parted or fdisk either:
# fdisk /dev/sdf
Unable to open /dev/sdf
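For anyone who wants to script the same comparison, here is a rough sketch in Python that pulls the member partitions out of the md1 entry quoted above and derives the whole-disk names behind them. The function name parse_mdstat_entry is hypothetical, and the parsing only covers the simple two-line layout shown here (it does not handle failed or spare markers).

```python
import re

# The md1 entry exactly as quoted from /proc/mdstat above.
MDSTAT = """md1 : active raid5 sdf3[5] sde3[4] sdd3[3] sdc3[2] sdb4[1] sda4[0]
1065550720 blocks level 5, 128k chunk, algorithm 2 [6/6] [UUUUUU]"""


def parse_mdstat_entry(text):
    """Parse one two-line mdstat entry into a small dict (simple cases only)."""
    lines = text.strip().splitlines()
    head = re.match(r"(\w+) : (\w+) (\w+) (.*)", lines[0])
    name, state, level, rest = head.groups()
    # "sdf3[5]" means partition sdf3 sits in raid slot 5
    members = {int(slot): part
               for part, slot in re.findall(r"(\w+)\[(\d+)\]", rest)}
    counts = re.search(r"\[(\d+)/(\d+)\]", lines[1])
    return {
        "name": name,
        "state": state,
        "level": level,
        "members": members,
        "expected": int(counts.group(1)),
        "active": int(counts.group(2)),
    }


info = parse_mdstat_entry(MDSTAT)
# Strip the trailing partition number to get the underlying disk name.
disks = sorted({re.sub(r"\d+$", "", part) for part in info["members"].values()})
print(info["name"], info["level"], info["active"], "of", info["expected"])
print(disks)
```

The resulting disk list includes sdf, which is exactly the name that ls and fdisk cannot find under /dev.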
I cannot reconstruct md2 because sdf1 does not exist either:
# mdadm --create /dev/md2 -l 0 -n 4 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1
mdadm: /dev/sdc1 appears to be part of a raid array:
level=0 devices=4 ctime=Sun Dec 5 16:45:40 2004
mdadm: /dev/sde1 appears to be part of a raid array:
level=0 devices=4 ctime=Sun Dec 5 16:45:40 2004
mdadm: Cannot open /dev/sdf1: No such file or directory
mdadm: create aborted
In fact, both /dev/sdf and /dev/sdg are missing in /dev:
# ls /dev/sd?
/dev/sda
/dev/sdb
/dev/sdc
/dev/sdd
/dev/sde
/dev/sdh
/dev/sdi
/dev/sdj
/dev/sdk
/dev/sdl
/dev/sdm
/dev/sdn
/dev/sdo
/dev/sdp
/dev/sdq
/dev/sdr
/dev/sds
/dev/sdt
/dev/sdu
/dev/sdv
/dev/sdw
/dev/sdx
/dev/sdy
/dev/sdz
dmesg and /var/log/messages do not seem to report anything wrong with any of the disks...
This is from dmesg (I tried to keep only the relevant data...):
...
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
NFORCE3-250: IDE controller at PCI slot 0000:00:08.0
NFORCE3-250: chipset revision 162
NFORCE3-250: not 100% native mode: will probe irqs later
NFORCE3-250: BIOS didn't set cable bits correctly. Enabling workaround.
NFORCE3-250: 0000:00:08.0 (rev a2) UDMA133 controller
ide0: BM-DMA at 0xf000-0xf007, BIOS settings: hda:DMA, hdb:DMA
ide1: BM-DMA at 0xf008-0xf00f, BIOS settings: hdc:DMA, hdd:DMA
Probing IDE interface ide0...
hda: DVD-ROM 16X, ATAPI CD/DVD-ROM drive
...
swsusp: Resume From Partition: /dev/sda3
pmdisk: Error -6 resuming
PM: Resume from disk failed.
...
md: Autodetecting RAID arrays.
md: autorun ...
md: ... autorun DONE.
...
SCSI subsystem initialized
libata version 1.02 loaded.
sata_nv version 0.03
...
ata1: SATA max UDMA/133 cmd 0x9F0 ctl 0xBF2 bmdma 0xEA00 irq 11
ata2: SATA max UDMA/133 cmd 0x970 ctl 0xB72 bmdma 0xEA08 irq 11
ata1: dev 0 cfg 49:2f00 82:7c6b 83:7f09 84:4003 85:7c69 86:3e01 87:4003 88:407f
ata1: dev 0 ATA, max UDMA/133, 490234752 sectors: lba48
ata1: dev 0 configured for UDMA/133
scsi0 : sata_nv
ata2: dev 0 cfg 49:2f00 82:7c6b 83:7f09 84:4003 85:7c69 86:3e01 87:4003 88:407f
ata2: dev 0 ATA, max UDMA/133, 490234752 sectors: lba48
ata2: dev 0 configured for UDMA/133
scsi1 : sata_nv
Vendor: ATA Model: Maxtor 7Y250M0 Rev: YAR5
Type: Direct-Access ANSI SCSI revision: 05
SCSI device sda: 490234752 512-byte hdwr sectors (251000 MB)
SCSI device sda: drive cache: write back
sda: sda1 sda2 sda3 sda4
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
Vendor: ATA Model: Maxtor 7Y250M0 Rev: YAR5
Type: Direct-Access ANSI SCSI revision: 05
SCSI device sdb: 490234752 512-byte hdwr sectors (251000 MB)
SCSI device sdb: drive cache: write back
sdb: sdb1 sdb2 sdb3 sdb4
Attached scsi disk sdb at scsi1, channel 0, id 0, lun 0
sata_sil version 0.54
ACPI: PCI interrupt 0000:02:04.0[A] -> GSI 10 (level, low) -> IRQ 10
ata3: SATA max UDMA/100 cmd 0xFFFFFF000001C080 ctl 0xFFFFFF000001C08A bmdma 0xFFFFFF000001C000 irq 10
ata4: SATA max UDMA/100 cmd 0xFFFFFF000001C0C0 ctl 0xFFFFFF000001C0CA bmdma 0xFFFFFF000001C008 irq 10
ata5: SATA max UDMA/100 cmd 0xFFFFFF000001C280 ctl 0xFFFFFF000001C28A bmdma 0xFFFFFF000001C200 irq 10
ata6: SATA max UDMA/100 cmd 0xFFFFFF000001C2C0 ctl 0xFFFFFF000001C2CA bmdma 0xFFFFFF000001C208 irq 10
ata3: dev 0 cfg 49:2f00 82:7c6b 83:7f09 84:4003 85:7c69 86:3e01 87:4003 88:207f
ata3: dev 0 ATA, max UDMA/133, 490234752 sectors: lba48
ata3: dev 0 configured for UDMA/100
scsi2 : sata_sil
ata4: dev 0 cfg 49:2f00 82:7c6b 83:7f09 84:4003 85:7c69 86:3e01 87:4003 88:207f
ata4: dev 0 ATA, max UDMA/133, 490234752 sectors: lba48
ata4: dev 0 configured for UDMA/100
scsi3 : sata_sil
ata5: dev 0 cfg 49:2f00 82:7c6b 83:7f09 84:4003 85:7c69 86:3e01 87:4003 88:207f
ata5: dev 0 ATA, max UDMA/133, 490234752 sectors: lba48
ata5: dev 0 configured for UDMA/100
scsi4 : sata_sil
ata6: dev 0 cfg 49:2f00 82:7c6b 83:7f09 84:4003 85:7c69 86:3e01 87:4003 88:207f
ata6: dev 0 ATA, max UDMA/133, 490234752 sectors: lba48
ata6: dev 0 configured for UDMA/100
scsi5 : sata_sil
Vendor: ATA Model: Maxtor 7Y250M0 Rev: YAR5
Type: Direct-Access ANSI SCSI revision: 05
SCSI device sdc: 490234752 512-byte hdwr sectors (251000 MB)
SCSI device sdc: drive cache: write back
sdc: sdc1 sdc2 sdc3
Attached scsi disk sdc at scsi2, channel 0, id 0, lun 0
Vendor: ATA Model: Maxtor 7Y250M0 Rev: YAR5
Type: Direct-Access ANSI SCSI revision: 05
SCSI device sdd: 490234752 512-byte hdwr sectors (251000 MB)
SCSI device sdd: drive cache: write back
sdd: sdd1 sdd2 sdd3
Attached scsi disk sdd at scsi3, channel 0, id 0, lun 0
Vendor: ATA Model: Maxtor 7Y250M0 Rev: YAR5
Type: Direct-Access ANSI SCSI revision: 05
SCSI device sde: 490234752 512-byte hdwr sectors (251000 MB)
SCSI device sde: drive cache: write back
sde: sde1 sde2 sde3
Attached scsi disk sde at scsi4, channel 0, id 0, lun 0
Vendor: ATA Model: Maxtor 7Y250M0 Rev: YAR5
Type: Direct-Access ANSI SCSI revision: 05
SCSI device sdf: 490234752 512-byte hdwr sectors (251000 MB)
SCSI device sdf: drive cache: write back
sdf: sdf1 sdf2 sdf3
Attached scsi disk sdf at scsi5, channel 0, id 0, lun 0
md: raid1 personality registered as nr 3
md: Autodetecting RAID arrays.
md: invalid raid superblock magic on sdd1
md: sdd1 has invalid sb, not importing!
md: autorun ...
md: considering sdf3 ...
md: adding sdf3 ...
md: sdf1 has different UUID to sdf3
md: adding sde3 ...
md: sde1 has different UUID to sdf3
md: adding sdd3 ...
md: adding sdc3 ...
md: sdc1 has different UUID to sdf3
md: adding sdb4 ...
md: sdb2 has different UUID to sdf3
md: adding sda4 ...
md: sda2 has different UUID to sdf3
md: created md1
md: bind<sda4>
md: bind<sdb4>
md: bind<sdc3>
md: bind<sdd3>
md: bind<sde3>
md: bind<sdf3>
md: running: <sdf3><sde3><sdd3><sdc3><sdb4><sda4>
md: personality 4 is not loaded!
md :do_md_run() returned -22
md: md1 stopped.
md: unbind<sdf3>
md: export_rdev(sdf3)
md: unbind<sde3>
md: export_rdev(sde3)
md: unbind<sdd3>
md: export_rdev(sdd3)
md: unbind<sdc3>
md: export_rdev(sdc3)
md: unbind<sdb4>
md: export_rdev(sdb4)
md: unbind<sda4>
md: export_rdev(sda4)
md: considering sdf1 ...
md: adding sdf1 ...
md: adding sde1 ...
md: adding sdc1 ...
md: sdb2 has different UUID to sdf1
md: sda2 has different UUID to sdf1
md: created md2
md: bind<sdc1>
md: bind<sde1>
md: bind<sdf1>
md: running: <sdf1><sde1><sdc1>
md: personality 2 is not loaded!
md :do_md_run() returned -22
md: md2 stopped.
md: unbind<sdf1>
md: export_rdev(sdf1)
md: unbind<sde1>
md: export_rdev(sde1)
md: unbind<sdc1>
md: export_rdev(sdc1)
md: considering sdb2 ...
md: adding sdb2 ...
md: adding sda2 ...
md: created md0
md: bind<sda2>
md: bind<sdb2>
md: running: <sdb2><sda2>
raid1: raid set md0 active with 2 out of 2 mirrors
md: ... autorun DONE.
...
Adding 514072k swap on /dev/sda3. Priority:42 extents:1
Adding 514072k swap on /dev/sdb3. Priority:42 extents:1
Adding 514072k swap on /dev/sdc2. Priority:42 extents:1
Adding 514072k swap on /dev/sdd2. Priority:42 extents:1
Adding 514072k swap on /dev/sde2. Priority:42 extents:1
Adding 514072k swap on /dev/sdf2. Priority:42 extents:1
...
(gives some errors and mounts md1 but not md2; this is correct, since /dev/sdd1 is not yet part of the raid and raid0 needs all of its partitions)
...
raid0: looking at sdc1
raid0: comparing sdc1(31471232) with sdc1(31471232)
raid0: END
raid0: ==> UNIQUE
raid0: 1 zones
raid0: looking at sde1
raid0: comparing sde1(31471232) with sdc1(31471232)
raid0: EQUAL
raid0: looking at sdf1
raid0: comparing sdf1(31471232) with sdc1(31471232)
raid0: EQUAL
raid0: FINAL 1 zones
raid0: too few disks (3 of 4) - aborting!
md: pers->run() failed ...
...
(at this point, the kernel has detected and used /dev/sdf)
...
Attached scsi generic sg0 at scsi0, channel 0, id 0, lun 0, type 0
Attached scsi generic sg1 at scsi1, channel 0, id 0, lun 0, type 0
Attached scsi generic sg2 at scsi2, channel 0, id 0, lun 0, type 0
Attached scsi generic sg3 at scsi3, channel 0, id 0, lun 0, type 0
Attached scsi generic sg4 at scsi4, channel 0, id 0, lun 0, type 0
Attached scsi generic sg5 at scsi5, channel 0, id 0, lun 0, type 0
...
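The mismatch between the kernel log and the /dev listing above can be summarised with a short sketch. It is written in Python purely as an illustration: the names attached_disks and missing_nodes are invented for this example, the log lines are the "Attached scsi disk" lines copied from the dmesg excerpt, and the /dev names are the ones printed by ls /dev/sd? earlier.

```python
import re

# "Attached scsi disk" lines copied from the dmesg excerpt above.
DMESG = """\
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
Attached scsi disk sdb at scsi1, channel 0, id 0, lun 0
Attached scsi disk sdc at scsi2, channel 0, id 0, lun 0
Attached scsi disk sdd at scsi3, channel 0, id 0, lun 0
Attached scsi disk sde at scsi4, channel 0, id 0, lun 0
Attached scsi disk sdf at scsi5, channel 0, id 0, lun 0
Attached scsi generic sg5 at scsi5, channel 0, id 0, lun 0, type 0
"""

# Names printed by "ls /dev/sd?" above: sda..sde, then sdh..sdz.
DEV_NAMES = ["sd" + c for c in "abcde" + "hijklmnopqrstuvwxyz"]


def attached_disks(dmesg_text):
    """Disk names the kernel reports as attached (sg devices are ignored)."""
    return sorted(set(re.findall(r"Attached scsi disk (\w+) at scsi\d+",
                                 dmesg_text)))


def missing_nodes(attached, dev_names):
    """Disks the kernel attached that have no matching name under /dev."""
    present = set(dev_names)
    return [disk for disk in attached if disk not in present]


attached = attached_disks(DMESG)
print("kernel attached:", attached)
print("no node in /dev:", missing_nodes(attached, DEV_NAMES))
```

With the data from this post, the only attached disk without a node in /dev is sdf. sdg is also absent from /dev, but the kernel never attached a disk under that name, so it does not show up in the result.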
Can anybody give some advice on what to do?
Thank you very much