I appear to have a degraded RAID array; specifically, hda2 is bad. What is the best way to recover it? This is a long post with details, but the root of it is in the last sentence...
Code:
Current RAID status:

Personalities : [raid1]
md1 : active raid1 hdb2[1]
      155918784 blocks [2/1] [_U]

md2 : active raid1 hdb3[1] hda3[0]
      264960 blocks [2/2] [UU]

md0 : active raid1 hdb1[1] hda1[0]
      104320 blocks [2/2] [UU]

unused devices: <none>
There should be two RAID devices, not 3.
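Before I try to re-add anything, I want to figure out why hda2 got kicked out in the first place. My plan (not yet run; smartctl assumes smartmontools is installed) is to check the kernel log and the drive's SMART data:
Code:
dmesg | grep -i hda                          # look for IDE errors or md kick-out messages
grep -i hda /var/log/messages | tail -50     # same thing, but with older history
smartctl -a /dev/hda                         # check reallocated/pending sector counts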
Here is my current filesystem setup:
Code:
[root@gluon]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/md1              147G  8.9G  131G   7% /
/dev/md0               99M   32M   63M  34% /boot
none                  315M     0  315M   0% /dev/shm
/dev/hdd1             230G   63G  156G  29% /mnt/bigdisk
And here are some details on the RAID settings for md0 and md1 (md2 looks just like md0):
Code:
[root@gluon]# mdadm -D /dev/md0
/dev/md0:
        Version : 00.90.01
  Creation Time : Thu Jan 12 19:26:31 2006
     Raid Level : raid1
     Array Size : 104320 (101.88 MiB 106.82 MB)
    Device Size : 104320 (101.88 MiB 106.82 MB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Mon Oct 16 18:38:10 2006
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

    Number   Major   Minor   RaidDevice State
       0       3        1        0      active sync   /dev/hda1
       1       3       65        1      active sync   /dev/hdb1
           UUID : 5139bc2e:39939d3e:5abd791c:3ce0a6ef
         Events : 0.3834
[root@gluon]# mdadm -D /dev/md1
/dev/md1:
        Version : 00.90.01
  Creation Time : Thu Jan 12 19:21:55 2006
     Raid Level : raid1
     Array Size : 155918784 (148.70 GiB 159.66 GB)
    Device Size : 155918784 (148.70 GiB 159.66 GB)
   Raid Devices : 2
  Total Devices : 1
Preferred Minor : 1
    Persistence : Superblock is persistent

    Update Time : Mon Oct 16 18:27:38 2006
          State : clean, degraded
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0

    Number   Major   Minor   RaidDevice State
       0       0        0       -1      removed
       1       3       66        1      active sync   /dev/hdb2
           UUID : 0a968a22:d1b0d2bd:ab248bae:ec482cc1
         Events : 0.12532934
As I interpret this, /dev/md1 is degraded, with /dev/hda2 no longer being mirrored.
However, if I try to add hda2 back to md1, I get an "Invalid argument" error:
Code:
[root@gluon]# mdadm -a /dev/md1 /dev/hda2
mdadm: hot add failed for /dev/hda2: Invalid argument
So... I tried removing the partition first:
Code:
[root@gluon]# mdadm /dev/md1 -r /dev/hda2 -a /dev/hda2
mdadm: hot remove failed for /dev/hda2: No such device or address
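From my reading of the mdadm man page, the next thing I'm tempted to try is zeroing the stale superblock on hda2 and then re-adding it as a fresh member, on the theory that the old metadata is what the kernel is choking on. Untried as yet, so a sanity check would be welcome:
Code:
mdadm --zero-superblock /dev/hda2   # wipe the stale 0.90 superblock on hda2
mdadm /dev/md1 -a /dev/hda2         # re-add hda2 so md1 rebuilds onto it
cat /proc/mdstat                    # then watch the resync progress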
So now what? Is the / partition on hda hosed? How do I rebuild it? I'm quickly getting out of my depth here...
FWIW, here are the superblock details for both halves of md1:
Code:
[root@gluon init.d]# mdadm -E /dev/hdb2
/dev/hdb2:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 0a968a22:d1b0d2bd:ab248bae:ec482cc1
  Creation Time : Thu Jan 12 19:21:55 2006
     Raid Level : raid1
    Device Size : 155918784 (148.70 GiB 159.66 GB)
   Raid Devices : 2
  Total Devices : 1
Preferred Minor : 1

    Update Time : Mon Oct 16 18:27:38 2006
          State : clean
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0
       Checksum : b0a4fa9a - correct
         Events : 0.12532934

      Number   Major   Minor   RaidDevice State
this     1       3       66        1      active sync   /dev/hdb2

   0     0       0        0        0      removed
   1     1       3       66        1      active sync   /dev/hdb2
[root@gluon init.d]# mdadm -E /dev/hda2
/dev/hda2:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 0a968a22:d1b0d2bd:ab248bae:ec482cc1
  Creation Time : Thu Jan 12 19:21:55 2006
     Raid Level : raid1
    Device Size : 155918784 (148.70 GiB 159.66 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 1

    Update Time : Sun Oct 15 21:07:07 2006
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0
       Checksum : b0a3ce33 - correct
         Events : 0.12532928

      Number   Major   Minor   RaidDevice State
this     0       3        2        0      active sync   /dev/hda2

   0     0       3        2        0      active sync   /dev/hda2
   1     1       3       66        1      active sync   /dev/hdb2
Comparing the two, hda2's superblock is clearly stale: its event counter (0.12532928) and update time (Sunday night) lag behind hdb2's (0.12532934, Monday evening). WhatsHisName in thread 429857 suggests running mdadm -C if all else fails... so I did:
Code:
[root@gluon init.d]# mdadm -C /dev/md1 -l1 -n2 /dev/hda2 /dev/hdb2
mdadm: /dev/hda2 appears to contain an ext2fs file system
    size=155918784K  mtime=Mon Oct 16 18:27:39 2006
mdadm: /dev/hda2 appears to be part of a raid array:
    level=1 devices=2 ctime=Thu Jan 12 19:21:55 2006
mdadm: /dev/hdb2 appears to contain an ext2fs file system
    size=155918784K  mtime=Sun Oct 15 20:33:12 2006
mdadm: /dev/hdb2 appears to be part of a raid array:
    level=1 devices=2 ctime=Thu Jan 12 19:21:55 2006
Continue creating array?
And I chickened out; I'm afraid of wiping the contents of the surviving partition. Does anyone know what would happen if I chose to continue?
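In the meantime, before I try anything destructive, I figure I should image the surviving half of the mirror to /mnt/bigdisk (the 156G free there should just barely hold the ~149 GiB partition). The image filename here is just a placeholder:
Code:
# back up the good half of md1 before experimenting any further
dd if=/dev/hdb2 of=/mnt/bigdisk/hdb2.img bs=1M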