LinuxQuestions.org - Raid1 fails to rebuild (unrecoverable read error)

Hi alienDog,

Quote:

Originally Posted by alienDog (Post 3214957)

Is there a way to find out exactly what is the faulty block supposed to contain (i.e. which file does it belong to)?

You could try to read file by file, e.g.
find . -type f -print0 | tee filelist.null | \
xargs --null md5sum > filelist.md5
Then compare filelist.* to see if there is a problem: a file that md5sum could not read is listed in filelist.null but has no checksum line in filelist.md5 (md5sum also reports the read error on stderr). (To use a normal text editor, first convert the null bytes to newlines with tr '\0' '\n'.)

If every file reads cleanly, then the faulty block is somewhere in the free space of the filesystem. You can try to force the disk to re-allocate the block by writing lots of data to your filesystem.

However, the faulty block could be hidden somewhere: the device blocks are 512 bytes in size, while the filesystem blocks are 4096 bytes. If you have, let's say, a 3400-byte file and only the last 512 bytes of the 4096 bytes allocated for the file are faulty, then the faulty block will never be touched by the filesystem (unless you append data to this particular file).
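If you don't want to compare the two lists by eye, the comparison can be scripted. This is only a minimal sketch: it assumes the two files were produced by the pipeline above and that no file name contains a newline or a backslash (md5sum escapes such names, which this does not handle). The function name unhashed_files is made up for illustration.

```shell
# Print every name that is in the NUL-separated find list ($1)
# but has no checksum line in the md5sum output ($2).
unhashed_files() {
    tmp=$(mktemp) || return 1
    # md5sum lines look like "<32 hex digits> <space or *><name>"; keep only the name.
    sed 's/^[0-9a-f]\{32\} [ *]//' "$2" | LC_ALL=C sort > "$tmp"
    # comm -23 keeps the lines that occur only in the first input (the find list).
    tr '\0' '\n' < "$1" | LC_ALL=C sort | LC_ALL=C comm -23 - "$tmp"
    rm -f "$tmp"
}
```

Called as unhashed_files filelist.null filelist.md5, it prints the files md5sum could not hash. Empty output means every file was read without error.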

Quote:

Originally Posted by alienDog (Post 3214957)

Is there a way to force building the secondary disk even when there is an error on the primary, or is there some other workaround for this?

Workaround:
1. Remove the second disk from the RAID, leaving your current RAID1 in degraded mode.
2. Create a new (second) RAID1 with the second disk in degraded mode.
3. Create a new filesystem on the second RAID.
4. Copy all files to the new filesystem.
5. Switch to the new filesystem.
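6. Deconstruct the first RAID (mdadm --fail followed by --remove; if mdadm refuses to drop the last remaining member, unmount the filesystem and use mdadm --stop instead).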
7. Add the first disk to the new RAID.
(The resync reads from the second disk and writes to the first disk. Once the faulty block is overwritten, it will hopefully be readable again.)
8. Wait for resync to finish.
9. Check that the first disk is readable:
dd if=/dev/sda of=/dev/null bs=32k
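
Spelled out as commands, the steps might look roughly like the sketch below. Every device name, mount point and the filesystem type is an assumption on my side (/dev/md0 and /dev/md1, /dev/sda1 and /dev/sdb1, /mnt/old and /mnt/new, ext4), so adjust them to your setup. It also assumes the mount points already exist and that the array does not hold the running root filesystem. Step 6 is sketched with umount plus mdadm --stop and --zero-superblock, which is my guess at a workable way to dismantle the old array. By default the script only prints the commands.

```shell
# Hypothetical names, not taken from the original system.
OLD_MD=/dev/md0        # existing RAID1, running degraded
NEW_MD=/dev/md1        # new RAID1 to be created
DISK1=/dev/sda1        # member with the unreadable block
DISK2=/dev/sdb1        # member that failed to rebuild
WHOLE_DISK1=/dev/sda   # whole first disk, for the final read test
OLD_MNT=/mnt/old       # where the old filesystem is mounted
NEW_MNT=/mnt/new       # where the new filesystem gets mounted
FSTYPE=ext4            # assumed filesystem type

# Print the command unless DRY_RUN=0 is set, in which case execute it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then echo "+ $*"; else "$@"; fi
}

rebuild_plan() {
    run mdadm "$OLD_MD" --fail "$DISK2" --remove "$DISK2"   # step 1
    run mdadm --zero-superblock "$DISK2"                    # forget old array membership
    run mdadm --create "$NEW_MD" --level=1 --raid-devices=2 "$DISK2" missing  # step 2
    run mkfs -t "$FSTYPE" "$NEW_MD"                         # step 3
    run mount "$NEW_MD" "$NEW_MNT"
    run cp -ax "$OLD_MNT/." "$NEW_MNT/"                     # step 4
    # step 5 (switching over: fstab, boot loader, ...) depends on the system; not shown
    run umount "$OLD_MNT"                                   # step 6
    run mdadm --stop "$OLD_MD"
    run mdadm --zero-superblock "$DISK1"
    run mdadm "$NEW_MD" --add "$DISK1"                      # step 7
    run cat /proc/mdstat                                    # step 8: repeat until resync is done
    run dd if="$WHOLE_DISK1" of=/dev/null bs=32k            # step 9
}
```

Running rebuild_plan prints the plan for review; DRY_RUN=0 rebuild_plan would actually execute it, which I would only do after checking every line against your own device names.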

JJ