LinuxQuestions.org

phyros 06-08-2007 08:35 AM

Raid5 Corruption Recovery Help for embedded-linux NAS device
 
Hey guys, I'd like some help recovering from a failed software RAID-5 setup. The RAID-5 setup is on an embedded Linux NAS (the Buffalo TeraStation Pro, if anyone's familiar with it), so I can't really give many details as to the distro, version, setup, etc. All of that is hidden and proprietary. Tech support told me that all I can do is scrap my data, but this is stupid... they're manufacturing a redundant data server; they should know better.

Anyways, a hacked firmware does allow me to telnet into the device as root (and probably void my warranty, but whatever), so if any pertinent information is discoverable, I can attempt to reverse engineer this thing if you tell me what to do (my Linux experience is a few months' worth... enough to get by, but lacking a deeper understanding of things). Google has been surprisingly unhelpful in finding a comprehensive tutorial on troubleshooting a RAID configuration, so I'm hoping someone here can help me.

Anyways, here's what I do know about the setup: it uses four 500 GB hard drives in a RAID-5 configuration, and the RAID arrays show up as md devices. [edit]The file system is XFS.[/edit] There are two main arrays of interest: /dev/md0 is a system partition and /dev/md1 holds the data that I'm trying to recover. I suspect the problem is a bricked superblock, but I'm not quite sure how to recover from that.

Here's what I've discovered by poking around with mdadm. Looking at the system partition...
Code:

root@HAXD_HELPER:/etc# mdadm --examine /dev/md0
mdadm: No super block found on /dev/md0 (Expected magic a92b4efc, got 00000000)


root@HAXD_HELPER:/etc# mdadm --detail /dev/md0
/dev/md0:
        Version : 00.90.02
  Creation Time : Sat Jan 14 12:32:49 2006
    Raid Level : raid1
    Array Size : 385408 (376.38 MiB 394.66 MB)
    Device Size : 385408 (376.38 MiB 394.66 MB)
  Raid Devices : 4
  Total Devices : 4
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Wed Jun  6 21:26:53 2007
          State : active
Active Devices : 4
Working Devices : 4
Failed Devices : 0
  Spare Devices : 0

          UUID : e87531ac:9fe1f96a:121f55a1:1220867e
        Events : 0.110

    Number  Major  Minor  RaidDevice State
      0      8        1        0      active sync  /dev/sda1
      1      8      33        1      active sync  /dev/sdc1
      2      8      49        2      active sync  /dev/sdd1
      3      8      17        3      active sync  /dev/sdb1

I may not be understanding it correctly (or just don't know what a good working config looks like), but it seems that everything in the --detail output is fine while --examine says uh-oh. This is also weird since this is supposed to be the system partition (and the system works since, well, I'm in it and running commands), but it supposedly has a bad superblock.

Anyways, there's probably some implementation magic that makes things happen. That's not too important. I'm really just concerned about my data, which is on /dev/md1.

Code:

root@HAXD_HELPER:/etc# mdadm --examine /dev/md1   
mdadm: No super block found on /dev/md1 (Expected magic a92b4efc, got 7d7d7d7d)


root@HAXD_HELPER:/etc# mdadm --detail /dev/md1
/dev/md1:
        Version : 00.90.02
  Creation Time : Tue Dec 27 16:09:40 2005
    Raid Level : raid5
    Array Size : 1462862592 (1395.09 GiB 1497.97 GB)
    Device Size : 487620864 (465.03 GiB 499.32 GB)
  Raid Devices : 4
  Total Devices : 1
Preferred Minor : 1
    Persistence : Superblock is persistent

    Update Time : Wed Jun  6 22:22:04 2007
          State : active, degraded
Active Devices : 1
Working Devices : 1
Failed Devices : 0
  Spare Devices : 0

        Layout : left-symmetric
    Chunk Size : 64K

          UUID : 37d97fb5:083ede07:8d3e9c16:0f299b85
        Events : 0.300

    Number  Major  Minor  RaidDevice State
      0      8        3        0      active sync  /dev/sda3
      1      8      19        1      active sync  /dev/sdb3
      2      8      35        2      active sync  /dev/sdc3
      3      8      51        3      active sync  /dev/sdd3

What concerns me here are the lines that say there are 4 raid devices, but only 1 total device. The md device doesn't have a good superblock, but when I --examine the individual sd*3 partitions, they do appear to have good superblocks, so this makes me think that all hope is not yet lost...
Code:

root@HAXD_HELPER:/etc# mdadm -E /dev/sd[abcd]3
/dev/sda3:
          Magic : a92b4efc
        Version : 00.90.02
          UUID : 37d97fb5:083ede07:8d3e9c16:0f299b85
  Creation Time : Tue Dec 27 16:09:40 2005
    Raid Level : raid5
  Raid Devices : 4
  Total Devices : 1
Preferred Minor : 1

    Update Time : Wed Jun  6 22:22:04 2007
          State : active
Active Devices : 4
Working Devices : 1
Failed Devices : 0
  Spare Devices : 0
      Checksum : 2cd505c9 - correct
        Events : 0.300

        Layout : left-symmetric
    Chunk Size : 64K

      Number  Major  Minor  RaidDevice State
this    0      8        3        0      active sync  /dev/sda3

  0    0      8        3        0      active sync  /dev/sda3
  1    1      8      19        1      active sync  /dev/sdb3
  2    2      8      35        2      active sync  /dev/sdc3
  3    3      8      51        3      active sync  /dev/sdd3
/dev/sdb3:
          Magic : a92b4efc
        Version : 00.90.02
          UUID : 37d97fb5:083ede07:8d3e9c16:0f299b85
  Creation Time : Tue Dec 27 16:09:40 2005
    Raid Level : raid5
  Raid Devices : 4
  Total Devices : 1
Preferred Minor : 1

    Update Time : Wed Jun  6 22:22:04 2007
          State : active
Active Devices : 4
Working Devices : 1
Failed Devices : 0
  Spare Devices : 0
      Checksum : 2cd505db - correct
        Events : 0.300

        Layout : left-symmetric
    Chunk Size : 64K

      Number  Major  Minor  RaidDevice State
this    1      8      19        1      active sync  /dev/sdb3

  0    0      8        3        0      active sync  /dev/sda3
  1    1      8      19        1      active sync  /dev/sdb3
  2    2      8      35        2      active sync  /dev/sdc3
  3    3      8      51        3      active sync  /dev/sdd3
/dev/sdc3:
          Magic : a92b4efc
        Version : 00.90.02
          UUID : 37d97fb5:083ede07:8d3e9c16:0f299b85
  Creation Time : Tue Dec 27 16:09:40 2005
    Raid Level : raid5
  Raid Devices : 4
  Total Devices : 1
Preferred Minor : 1

    Update Time : Wed Jun  6 22:22:04 2007
          State : active
Active Devices : 4
Working Devices : 1
Failed Devices : 0
  Spare Devices : 0
      Checksum : 2cd505ed - correct
        Events : 0.300

        Layout : left-symmetric
    Chunk Size : 64K

      Number  Major  Minor  RaidDevice State
this    2      8      35        2      active sync  /dev/sdc3

  0    0      8        3        0      active sync  /dev/sda3
  1    1      8      19        1      active sync  /dev/sdb3
  2    2      8      35        2      active sync  /dev/sdc3
  3    3      8      51        3      active sync  /dev/sdd3
/dev/sdd3:
          Magic : a92b4efc
        Version : 00.90.02
          UUID : 37d97fb5:083ede07:8d3e9c16:0f299b85
  Creation Time : Tue Dec 27 16:09:40 2005
    Raid Level : raid5
  Raid Devices : 4
  Total Devices : 1
Preferred Minor : 1

    Update Time : Wed Jun  6 22:22:04 2007
          State : active
Active Devices : 4
Working Devices : 1
Failed Devices : 0
  Spare Devices : 0
      Checksum : 2cd505ff - correct
        Events : 0.300

        Layout : left-symmetric
    Chunk Size : 64K

      Number  Major  Minor  RaidDevice State
this    3      8      51        3      active sync  /dev/sdd3

  0    0      8        3        0      active sync  /dev/sda3
  1    1      8      19        1      active sync  /dev/sdb3
  2    2      8      35        2      active sync  /dev/sdc3
  3    3      8      51        3      active sync  /dev/sdd3

Soo... it seems to me like the individual sd*3 devices have the right superblock info, but the superblock info on the md1 device got bust. Is there any way I can tell the md1 device to look at the individual sd*3 devices for its superblock? I'm not sure how to phrase this in terms of proper raid/mdadm terminology (or if I even have the right idea).

Finally, it may help to figure out how these devices are scripted to be set up at boot time. Again, this is an embedded Linux NAS device, so all of this is hidden and would have to be reverse-engineered. I've been told that creating an /initrd directory un-hides all of the boot-time scripts/ramdisk (and indeed this is true for my device), but I have no idea what to look for in there.

Any help from a raid guru would be infinitely helpful.

rtspitz 06-09-2007 09:35 PM

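if you run mdadm --examine on an md device it will complain about a missing superblock - this is normal: --examine reads the superblock stored on a member partition (such as /dev/sda3), not on the assembled array. it happens on my raid systems as well - which all work fine.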

the 4 partitions sd[abcd]3 seem to contain all the info of the raid they should form.
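one way to double-check that the members really agree is to compare the UUID and Events fields from the mdadm -E output shown earlier. this is only a rough sketch: the function names summarize_members and members_agree are made up for illustration, and it assumes the firmware ships an awk (busybox or otherwise), which may not be the case on this box.

```shell
# Hypothetical helpers (not part of mdadm): read `mdadm -E` output on stdin.

# Print one line per member: device, array UUID, event count.
summarize_members() {
  awk '
    /^\/dev\// { dev = $1; sub(/:$/, "", dev) }        # e.g. "/dev/sda3:"
    $1 == "UUID"   && $2 == ":" { uuid[dev] = $3 }     # "UUID : 37d9..."
    $1 == "Events" && $2 == ":" { print dev, uuid[dev], $3 }
  '
}

# Succeed only if every member reports the same UUID and the same event count.
members_agree() {
  summarize_members | awk '
    { key = $2 " " $3; if (!(key in seen)) { seen[key] = 1; n++ } }
    END { exit (n == 1 ? 0 : 1) }
  '
}
```

on the device this would be used as something like "mdadm -E /dev/sd[abcd]3 | summarize_members". in the output posted above all four partitions show the same UUID and Events : 0.300, so this check would pass.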

so you could try this (no guarantee for data recovery though):

- unmount /dev/md1 and stop the md1 array with:

Code:

mdadm --stop /dev/md1
- reassemble the array:

Code:

mdadm --assemble /dev/md1 /dev/sda3 /dev/sdb3 /dev/sdc3 /dev/sdd3
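
since a typo here can matter, the stop/assemble steps can be wrapped in a small dry-run script first. this is a sketch under assumptions, not a tested recovery procedure: the run and reassemble_md1 helpers and the DRY_RUN variable are made up for illustration, and the member list is the four sd[abcd]3 partitions from the mdadm -E output.

```shell
# Sketch only: with DRY_RUN=1 (the default here) each command is printed, not run.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical wrapper: echo the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = 1 ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

reassemble_md1() {
  # umount may fail if the array is not mounted at all; that is not fatal.
  run umount /dev/md1 || echo "umount failed (maybe not mounted)" >&2
  run mdadm --stop /dev/md1 || return 1
  run mdadm --assemble /dev/md1 /dev/sda3 /dev/sdb3 /dev/sdc3 /dev/sdd3 || return 1
  # Show the resulting array state.
  run cat /proc/mdstat
}

reassemble_md1
```

run it once as-is to read the printed commands, then again with DRY_RUN=0 set in the environment to actually execute them.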

