LinuxQuestions.org - RAID1 - fixing a corrupted file system


hi, i've been happily running RAID1 for years on my system.

i have two IDE hard disks each with 4 partitions:

Code:

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          16      128488+  fd  Linux RAID autodetect
/dev/sda2              17        3663    29294527+  fd  Linux RAID autodetect
/dev/sda3            3664       14471    86815260   fd  Linux RAID autodetect
/dev/sda4           14472       14593      979965   82  Linux swap / Solaris

/dev/sdb is an identical disk and partitioned exactly the same.

Code:

root@ubuntu:~/Desktop# mount | grep md

/dev/md0 on /boot type ext3 (rw,relatime)

/dev/md1 on / type ext3 (rw,relatime,errors=remount-ro)

a bad past experience with RAIDed swap meant i don't RAID it anymore:

Code:

root@ubuntu:~/Desktop# swapon -s

Filename                                Type                Size        Used        Priority

/dev/sda4                              partition        979956        0        -1

/dev/sdb4                              partition        979956        0        -2

ok so far. so where's /dev/md2? well that's where i normally mount all my user data and /home and md2 is /dev/sd[ab]3. now the fun starts.

i'm running ubuntu 8.10 and noticed that trackerd was stuck indexing the last directory (approximately 5678/5678). i restarted indexing and it got stuck again. hmmm - now i'm interested. my email client is evolution and that was starting to complain about the database and i lost an email or two. now i'm worried.

a reboot won't harm, i thought. wrong:

Code:

fsck 1.41.3 (12-Oct-2008)

e2fsck 1.41.3 (12-Oct-2008)

fsck.ext3: Group descriptors look bad... trying backup blocks...

fsck.ext3: Bad magic number in super-block while trying to open /dev/sda3



The superblock could not be read or does not describe a correct ext2

filesystem.  If the device is valid and it really contains an ext2

filesystem (and not swap or ufs or something else), then the superblock

is corrupt, and you might try running e2fsck with an alternate superblock:

    e2fsck -b 8193 <device>

i stopped /dev/md2, booted and logged in as root (doesn't use /home at all) and have tried to fsck /dev/sda3 and /dev/sdb3

same fsck error as above on both. and the -b 8193 option has no effect.
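
a possible reason `-b 8193` changes nothing: according to the e2fsck man page, 8193 is the first backup superblock only on file systems with 1 KiB blocks; with 4 KiB blocks the first backup sits at block 32768. before running any repair, a read-only peek at the superblock magic number can show whether a given copy is even plausible. a minimal sketch, assuming the md metadata sits at the end of the partition (so the file system starts at offset 0 of /dev/sda3) and using the ext2/3 on-disk layout (primary superblock at byte 1024, magic 0xEF53 stored little-endian 56 bytes in). the function names `ext_magic` and `has_ext_magic` are made up for illustration:

```shell
#!/bin/sh
# ext_magic DEVICE [SUPERBLOCK_BYTE_OFFSET]
# Print the two magic bytes of an ext2/3 superblock as hex.
# The default offset 1024 is the primary superblock.
ext_magic() {
    dev=$1
    sb_offset=${2:-1024}
    # s_magic is 56 bytes into the superblock; 0xEF53 is stored as "53 ef"
    dd if="$dev" bs=1 skip=$((sb_offset + 56)) count=2 2>/dev/null |
        od -An -tx1 | tr -d ' \n'
}

# has_ext_magic DEVICE [SUPERBLOCK_BYTE_OFFSET]
# Exit 0 if the magic number is present at that offset, non-zero otherwise.
has_ext_magic() {
    [ "$(ext_magic "$@")" = "53ef" ]
}

# Usage (read-only, as root):
#   has_ext_magic /dev/sda3 && echo "primary superblock has the magic"
#   has_ext_magic /dev/sda3 $((32768 * 4096)) && echo "backup at block 32768 looks sane"
```

this only reads two bytes, so it says nothing about the group descriptors; it just separates "no superblock here at all" from "a superblock is here but something else is damaged".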

i tried mdadm --zero-superblock /dev/sda3 and rebuilt the array, but all that came back after a disk sync was the corrupted file system error.

my understanding isn't great, but i think the RAID is fine, and it's the file system that's broken. if i could fix /dev/sda3 i'd happily zero sdb3 (dd if=/dev/zero of=/dev/sdb3) and then re-assemble the RAID, but i just can't seem to fix sda3.

any suggestions, and am i even going about this the right way?

thanks in advance

oh - here's what dmesg had to say...

Code:

/* i zeroed the md2 superblock on /dev/sda3 and then rebuilt with sdb3 */

[ 2280.672577] md: bind<sda3>

[ 2280.680505] md: bind<sdb3>

[ 2280.681366] md: md2: raid array is not clean -- starting background reconstruction

[ 2280.787296] raid1: raid set md2 active with 2 out of 2 mirrors

[ 2280.815780] md: resync of RAID array md2

[ 2280.815790] md: minimum _guaranteed_  speed: 1000 KB/sec/disk.

[ 2280.815794] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for resync.

[ 2280.815800] md: using 128k window, over a total of 86815168 blocks.

[ 6952.036591] md: md2: resync done. 

/* so the RAID is OK? */

[ 6952.153591] RAID1 conf printout:

[ 6952.153599]  --- wd:2 rd:2

[ 6952.153604]  disk 0, wo:0, o:1, dev:sda3

[ 6952.153607]  disk 1, wo:0, o:1, dev:sdb3

[ 7732.393726] EXT3-fs error (device md2): ext3_check_descriptors: Block bitmap for group 384 not in group (block 573188386)! 

/* ah - i guess not */

[ 7732.446713] EXT3-fs: group descriptors corrupted!

[ 8063.483777] EXT3-fs error (device md2): ext3_check_descriptors: Block bitmap for group 384 not in group (block 573188386)!

[ 8063.483795] EXT3-fs: group descriptors corrupted!

[ 8077.915835] EXT3-fs error (device md2): ext3_check_descriptors: Block bitmap for group 384 not in group (block 573188386)!

[ 8077.915851] EXT3-fs: group descriptors corrupted!

[ 9486.222714] md: md2 stopped. 

/* so i stop /dev/md2 and have access to sd[ab]3 again */

[ 9486.223752] md: unbind<sdb3>

[ 9486.225485] md: export_rdev(sdb3)

[ 9486.225519] md: unbind<sda3>

[ 9486.227054] md: export_rdev(sda3)
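
the "not in group" complaint is plain range arithmetic: a group's block bitmap is expected to live inside the blocks that group covers. a small sketch of that check, using the 32768 blocks per group that mke2fs -n reports further down in this thread (the function names are hypothetical, and this assumes a file system without flex_bg, which is my understanding of a stock ext3):

```shell
#!/bin/sh
# group_block_range GROUP BLOCKS_PER_GROUP [FIRST_DATA_BLOCK]
# Print the first and last block number covered by a block group.
group_block_range() {
    group=$1; bpg=$2; first=${3:-0}
    start=$((group * bpg + first))
    echo "$start $((start + bpg - 1))"
}

# block_in_group BLOCK GROUP BLOCKS_PER_GROUP [FIRST_DATA_BLOCK]
# Exit 0 only if BLOCK lies inside that group's range.
block_in_group() {
    block=$1
    # Reuse the positional parameters for the two numbers printed above
    set -- $(group_block_range "$2" "$3" "${4:-0}")
    [ "$block" -ge "$1" ] && [ "$block" -le "$2" ]
}

# Usage:
#   group_block_range 384 32768
#   block_in_group 573188386 384 32768 || echo "descriptor points outside group 384"
```

with those numbers group 384 covers blocks 12582912 to 12615679, and 573188386 is not only outside the group but far past the 21703815 blocks the whole file system has, so the descriptor is garbage rather than slightly off.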

a solution...

first, recover the original borked disk image from another disk back onto the partition that i'd hosed with various attempts at doing stuff i didn't really understand:

$ ddrescue /mnt/home.img /dev/sda3
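
for anyone who wants the same safety net before experimenting, here is a rough sketch of taking such an image and checking the copy. it uses plain dd and cmp rather than ddrescue, so it is only an assumption-laden illustration: dd stops at the first read error, which is one reason ddrescue is the better tool on a disk that is physically failing. `image_partition` is a made-up name, and the target must live on a different disk with enough free space:

```shell
#!/bin/sh
# image_partition SRC IMG
# Copy SRC (a partition, or any file) to IMG, then re-read both and compare.
# Exit 0 only if the copy succeeded and the two are byte-for-byte identical.
image_partition() {
    src=$1; img=$2
    # 1 MiB blocks: much faster than dd's 512-byte default
    dd if="$src" of="$img" bs=1048576 2>/dev/null || return 1
    cmp -s "$src" "$img"
}

# Usage (as root, with the array stopped so nothing writes to the partition):
#   image_partition /dev/sda3 /mnt/home.img && echo "image verified"
```

restoring is the same copy with source and destination swapped, which is what the ddrescue line above does.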

i'm now back to the point where my original fsck barfed. backup copies of the corrupted superblock live in various places on the disk, so let's see where (mke2fs -n only prints what it would do, it doesn't write anything)...

root@ubuntu:~# mke2fs -n /dev/sda3
mke2fs 1.41.3 (12-Oct-2008)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
5431296 inodes, 21703815 blocks
1085190 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=0
663 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
4096000, 7962624, 11239424, 20480000
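
that list isn't magic. assuming the sparse_super feature (the usual default), backups are kept in group 1 and in groups that are powers of 3, 5 and 7, so the block numbers can be worked out from the group count and blocks per group alone. a sketch, with a made-up function name `backup_superblocks`:

```shell
#!/bin/sh
# backup_superblocks GROUPS BLOCKS_PER_GROUP [FIRST_DATA_BLOCK]
# Print the block number of every backup superblock, one per line, assuming
# sparse_super: backups live in group 1 and in groups 3^n, 5^n, 7^n.
# FIRST_DATA_BLOCK is 0 for 4 KiB blocks and 1 for 1 KiB blocks.
backup_superblocks() {
    groups=$1; bpg=$2; first=${3:-0}
    {
        if [ "$groups" -gt 1 ]; then echo 1; fi
        for base in 3 5 7; do
            g=$base
            while [ "$g" -lt "$groups" ]; do
                echo "$g"
                g=$((g * base))
            done
        done
    } | sort -n | while read -r g; do
        # a backup sits at the first block of its group
        echo $((g * bpg + first))
    done
}

# Usage, with the numbers mke2fs -n printed above:
#   backup_superblocks 663 32768
```

with 663 groups of 32768 blocks this gives the same 13 block numbers as the mke2fs output, and `backup_superblocks 2 8192 1` gives the 8193 that fsck suggests for 1 KiB block file systems.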

i then ran through each one in turn looking to see how broken they were (i hit ctrl-C rather than actually do anything):

e2fsck -b 32768 /dev/sda3
e2fsck -b 98304 /dev/sda3
e2fsck -b 163840 /dev/sda3
e2fsck -b 229376 /dev/sda3
e2fsck -b 294912 /dev/sda3
e2fsck -b 819200 /dev/sda3
e2fsck -b 884736 /dev/sda3
e2fsck -b 1605632 /dev/sda3
e2fsck -b 20480000 /dev/sda3
e2fsck -b 11239424 /dev/sda3
e2fsck -b 7962624 /dev/sda3
e2fsck -b 4096000 /dev/sda3
e2fsck -b 2654208 /dev/sda3
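
hitting ctrl-C works, but the same survey can be done without any risk of answering a prompt by accident. a sketch, not what i actually ran: it uses e2fsck's -n (open read-only, answer "no" to everything) together with -b and -B, and prints one summary line per candidate. `survey_superblocks` and the `E2FSCK` override variable are hypothetical names; the override is only there so the loop can be tried against a stub:

```shell
#!/bin/sh
# E2FSCK is a hypothetical override, handy for dry runs against a stub
E2FSCK=${E2FSCK:-e2fsck}

# survey_superblocks DEVICE BLOCKSIZE BLOCK...
# Run a read-only check against each candidate superblock and print
# "block exit-status output-lines" for each one.
survey_superblocks() {
    dev=$1; bs=$2; shift 2
    for blk in "$@"; do
        # -n never modifies the file system; capture the status either way
        out=$($E2FSCK -n -b "$blk" -B "$bs" "$dev" 2>&1) && status=0 || status=$?
        lines=$(printf '%s' "$out" | awk 'END { print NR }')
        echo "$blk $status $lines"
    done
}

# Usage (as root), lowest exit status and fewest complaints first:
#   survey_superblocks /dev/sda3 4096 32768 98304 163840 | sort -k2,2n -k3,3n
```

the exit status is e2fsck's usual bitmask (0 means no errors, 4 means errors left uncorrected), so the line count is only a crude tie-breaker between copies that are all reported as damaged.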

e2fsck complained more about the contents of some than others. in the end i chose the highest and let it run with -y:

$ e2fsck -b 20480000 -y /dev/sda3

when that finished, i rebuilt the raid (degraded with just the one disk for now), mounted it and i can now see all my files

mdadm --assemble /dev/md2 /dev/sda3 --run
mount /dev/md2 /home
ls -l /home
cd /home/pb
ls
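
at this point md2 is running on one disk only. the second half can be put back later with something like `mdadm /dev/md2 --add /dev/sdb3`, after which the kernel resyncs it from sda3. a small sketch for checking the array state from /proc/mdstat before and after; `md_status` is a made-up helper, and it assumes the usual layout where the line after the array name carries a `[total/active]` pair such as `[2/1]`:

```shell
#!/bin/sh
# md_status NAME [MDSTAT_FILE]
# Print "degraded ACTIVE/TOTAL" or "ok ACTIVE/TOTAL" for one md array,
# based on the [total/active] field in /proc/mdstat.
md_status() {
    name=$1
    file=${2:-/proc/mdstat}
    awk -v md="$name" '
        $1 == md && $2 == ":" { found = 1; next }
        found && match($0, /\[[0-9]+\/[0-9]+\]/) {
            # strip the brackets, then split "total/active"
            split(substr($0, RSTART + 1, RLENGTH - 2), n, "/")
            state = (n[2] + 0 < n[1] + 0) ? "degraded" : "ok"
            print state, n[2] "/" n[1]
            exit
        }
    ' "$file"
}

# Usage:
#   md_status md2
```

it prints nothing if the array isn't listed at all, which is what you'd see after `mdadm --stop`.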

w00t

Hello, are you still there?
I have almost the same problem but I am a complete noob on linux.
Please can you help me with detailed instructions?

> I have almost the same problem but I am a complete noob on linux.

are you certain you want to restore the superblock from the file system's own backups? it looks as though that's all i did.

all the details are there - i think you just have to work through them.

good luck

OK; I would like to leave the disks intact in case something goes wrong.
Can you suggest a way to make a full and complete image of a disk?
Working on the image should be easier.
thx