RAID1+LVM: data keeps getting corrupted
After a recent hard disk failure, I decided to start using software RAID-1 (md). I only had one disk at the time, but a second, smaller disk would become available once I had migrated my fileserver to its own software RAID-1 on two new disks.
So I created a partition on my first disk with the same size the second disk was going to have, and configured it as a degraded RAID-1. On that array I created an LVM2 volume group containing two LVM2 logical volumes (home and data). That worked perfectly for a few months. A few days ago, I added the other disk to the degraded software RAID-1. It started syncing and the array became clean. Everything seemed all right, until somewhat later I found that my home partition had suddenly been remounted read-only because of trouble with the ext3 journal. An fsck turned up a lot of errors on that partition, with lots of inodes I had to clear or fix.. :-(
The system came up again with no more errors.. until I deleted a few big files from the data partition: read-only filesystem. According to the logs, it was again the ext3 journaling that gave up and remounted the partition read-only. Again lots of data corruption and a lot of files lost..
After another reboot everything seemed OK again, but after a while the home partition was read-only once more.
Neither disk ever gave any problems before; the problems started after I hot-added one disk to the initially degraded software RAID-1. There are no messages about DriveReadySeek errors or anything like that. It's always the ext3 journaling that seems to find something wrong, causing the filesystem to be remounted read-only. There are no other errors in the logs that could point to a hardware failure.
I decided to remove that other disk from the RAID again, since the problems started with that disk.
But even after removing it, the corruption on the LVM2 volumes continues.
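For anyone who wants to confirm the array state at each of the steps above (degraded, resynced, degraded again), the member counts can be read from /proc/mdstat. What follows is only a minimal sketch, not something taken from my system: the device names, sizes and the SAMPLE_MDSTAT text are made up for illustration, and the helper names parse_mdstat and degraded_arrays are hypothetical.

```python
import re

# Illustrative /proc/mdstat content (hypothetical device names and sizes).
SAMPLE_MDSTAT = """\
Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0]
      78148096 blocks [2/2] [UU]

md1 : active raid1 sda2[0]
      19535040 blocks [2/1] [U_]

unused devices: <none>
"""


def parse_mdstat(text):
    """Return {array_name: (configured_members, active_members)}."""
    arrays = {}
    current = None
    for line in text.splitlines():
        # An array section starts with a line like "md0 : active raid1 ...".
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # The status line carries "[n/m]": n configured members, m active.
        counts = re.search(r"\[(\d+)/(\d+)\]", line)
        if current and counts:
            arrays[current] = (int(counts.group(1)), int(counts.group(2)))
            current = None
    return arrays


def degraded_arrays(text):
    """Names of arrays with fewer active members than configured."""
    return sorted(
        name
        for name, (configured, active) in parse_mdstat(text).items()
        if active < configured
    )


if __name__ == "__main__":
    print(degraded_arrays(SAMPLE_MDSTAT))
```

On a real machine the text would come from reading /proc/mdstat instead of the sample string. Note that a full member count only says the mirror halves are in sync; it says nothing about whether the filesystem on top is consistent.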
What could have gone wrong? And how should I fix this?
Last edited by Chojin; 10-16-2006 at 03:48 AM.