LinuxQuestions.org


dfidler 12-20-2005 01:13 AM

RAID1, bad superblock and a bit of Oops?
 
Hi everyone,

This morning, my alarm clock woke me up with this really annoying flashing 12:04... * flash flash flash... and I thought to myself, "D'oh, power outage..."

Sure enough, there my system lay, slumbering peacefully. I gave it the boot and it greeted me with "/dev/md0: bad superblock" and a kernel panic. I spent the next couple of hours reading about RAID, etc., and trying to bring the computer back online. I tried unplugging one drive, no workie; the other drive, no workie; booting from a Debian netinst CD, *mumbles something about ABit motherboards*; booting to single-user mode, no workie.

Eventually I tried booting into a 2.6.12 kernel and the disk came up in degraded mode. A quick reboot into 2.6.14 and I had my system back. However, I still seemed to be down a disk. So like any numb nuts would do, I started "fiddling" [notice my username :) ] without really knowing what I was doing....

--

The array is software RAID1 across two devices: /dev/sda1 and /dev/sdb1

cat /proc/mdstat showed that only one disk was in use, and dmesg confirmed it. Seeing that I was down a disk, I decided to just "add it back in" using mdadm -a /dev/md0 /dev/sdb1

Now /proc/mdstat says that they are syncing. Great!

# :/home# cat /proc/mdstat
Code:

Personalities : [raid1]
md0 : active raid1 sdb1[2] sda1[0]
      293033536 blocks [2/1] [U_]
      [===>.................]  recovery = 49.8% (146075136/293033536) finish=54.2min speed=45157K/sec
unused devices: <none>
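
For anyone who wants to sanity-check the finish estimate, here is a minimal Python sketch that recomputes it from the block counts and speed on the recovery line. The function name `recovery_eta_minutes` and the regular expression are my own for illustration, not part of any md tool, and the sketch assumes the counts are in 1K blocks, matching the K/sec speed unit.

```python
import re

def recovery_eta_minutes(mdstat_text):
    """Estimate minutes remaining from a recovery/resync line, or None if absent."""
    m = re.search(
        r"(?:recovery|resync)\s*=\s*[\d.]+%\s*\((\d+)/(\d+)\).*?speed=(\d+)K/sec",
        mdstat_text,
    )
    if m is None:
        return None
    done, total, speed = (int(g) for g in m.groups())
    if speed == 0:
        return None
    # Assumption: block counts are 1K blocks, so blocks / (K/sec) gives seconds.
    return (total - done) / speed / 60.0

# Recovery line from the output above.
sample = ("      [===>.................]  recovery = 49.8% "
          "(146075136/293033536) finish=54.2min speed=45157K/sec")

print(round(recovery_eta_minutes(sample), 1))  # 54.2
```

The result agrees with the estimate md prints, so the rebuild rate looks plausible for a disk of this size.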

But doing an examine on the disks has me worried...

# :/home# mdadm -E /dev/sda1
Code:

/dev/sda1:
          Magic : a92b4efc
        Version : 00.90.00
          UUID : de5728b2:04f52c23:e2d46b15:93ac8bb3
  Creation Time : Sat Oct  8 03:43:08 2005
    Raid Level : raid1
  Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0

    Update Time : Mon Dec 19 22:39:27 2005
          State : clean
 Active Devices : 1
Working Devices : 2
 Failed Devices : 1
  Spare Devices : 1
      Checksum : 9b88d861 - correct
        Events : 0.1363584

      Number  Major  Minor  RaidDevice State
this    0      8        1        0      active sync  /dev/sda1
  0    0      8        1        0      active sync  /dev/sda1
  1    1      0        0        1      faulty removed
  2    2      8      17        1      spare  /dev/sdb1


# :/home# mdadm -E /dev/sdb1
Code:

/dev/sdb1:
          Magic : a92b4efc
        Version : 00.90.00
          UUID : de5728b2:04f52c23:e2d46b15:93ac8bb3
  Creation Time : Sat Oct  8 03:43:08 2005
    Raid Level : raid1
  Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0

    Update Time : Mon Dec 19 22:39:30 2005
          State : clean
 Active Devices : 1
Working Devices : 2
 Failed Devices : 1
  Spare Devices : 1
      Checksum : 9b88d877 - correct
        Events : 0.1363586

      Number  Major  Minor  RaidDevice State
this    2      8      17        2      spare  /dev/sdb1

  0    0      8        1        0      active sync  /dev/sda1
  1    1      0        0        1      faulty removed
  2    2      8      17        2      spare  /dev/sdb1

I am looking at the list of devices at the end and I am worried by the "faulty removed" and "spare" states.
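
Besides the device table, the header fields of the two reports can be compared side by side. The sketch below is only an illustration: `parse_examine` and `events` are hypothetical helper names, the sample text is trimmed to three fields copied from the reports above, and it assumes the 0.90 superblock prints the Events counter as two dot-separated integers. As far as I understand it, md uses this counter to judge which member is most up to date.

```python
def parse_examine(text):
    """Collect the 'Key : value' header lines of one `mdadm -E` report into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" : ")
        if sep and key.strip() and value.strip():
            fields[key.strip()] = value.strip()
    return fields

def events(fields):
    # Assumption: the counter is printed as "high.low", both decimal integers.
    high, low = fields["Events"].split(".")
    return int(high), int(low)

# Trimmed copies of the two reports above (only the fields used here).
sda1_report = """
           UUID : de5728b2:04f52c23:e2d46b15:93ac8bb3
    Update Time : Mon Dec 19 22:39:27 2005
         Events : 0.1363584
"""
sdb1_report = """
           UUID : de5728b2:04f52c23:e2d46b15:93ac8bb3
    Update Time : Mon Dec 19 22:39:30 2005
         Events : 0.1363586
"""

a = parse_examine(sda1_report)
b = parse_examine(sdb1_report)

print("same array:", a["UUID"] == b["UUID"])
print("event counters:", events(a), events(b))
```

Both members report the same UUID. The counters differ by two, and the update times are three seconds apart, so the gap may simply reflect the counter advancing between the two commands while the array was live.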

Questions:
* Does this look okay or do I need to fix it?
* RaidDevice 2 and 1 refer to the same physical device. Now that I am syncing to device 2, is there going to be any kind of conflict during my next bootup between RaidDevice 1 and 2 (as 1 should now be in sync)?
* Why would my superblock get corrupted after a power loss? I wouldn't think that the superblock sees a lot of writes, or was I just one of the lucky ones?
* Does this sound more like a HD failure?
* Why is sdb1 now listed as a spare, or is that normal also?


Mostly, I am terrified that another power outage will take this box out and I will lose my data. I don't *think* that my array config is unrecoverable, but my sphincter has had enough exercise for one night.

I am trying to find a place where I can back up all of my data, but that might take a while, so any help or soothing words would be greatly appreciated in the interim.

dfidler 12-20-2005 10:38 AM

Well, I was wondering if I should wait until the sync completed before posting that question, and it turns out that I should have. After the sync completed it got rid of the "faulty removed" and "spare" entries so that I now just have a RaidDevice 0 and 1.
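
A quick way to confirm the array is whole again is to look at the `[n/m] [UU]` status in /proc/mdstat, where an underscore marks a missing member. Below is a minimal Python sketch of that check. The helper names `array_health` and `degraded` are made up for illustration, and the "after sync" sample is an assumption about what the file would show, since that output was not posted.

```python
import re

def array_health(mdstat_text):
    """Map each md array name to (configured, active, status_flags)."""
    health = {}
    current = None
    for line in mdstat_text.splitlines():
        name = re.match(r"(md\d+)\s*:", line)
        if name:
            current = name.group(1)
            continue
        status = re.search(r"\[(\d+)/(\d+)\]\s*\[([U_]+)\]", line)
        if status and current:
            health[current] = (
                int(status.group(1)),
                int(status.group(2)),
                status.group(3),
            )
    return health

def degraded(mdstat_text):
    """Names of arrays with fewer active members than configured."""
    return [
        name
        for name, (want, have, flags) in array_health(mdstat_text).items()
        if have < want or "_" in flags
    ]

# Status lines as posted while the rebuild was running.
during_sync = """md0 : active raid1 sdb1[2] sda1[0]
      293033536 blocks [2/1] [U_]
"""
# Assumed status once the rebuild has finished (not shown in the thread).
after_sync = """md0 : active raid1 sdb1[1] sda1[0]
      293033536 blocks [2/2] [UU]
"""

print(degraded(during_sync))  # ['md0']
print(degraded(after_sync))   # []
```

On a live system the text would come from reading /proc/mdstat rather than from the sample strings.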


All times are GMT -5.