LinuxQuestions.org

map250r 05-03-2009 06:25 PM

raid0 ext3 corruption after I suspend-to-ram several times (2.6.29)
 
I have two 500 GB SATA drives, each with a partition that was set up with mdadm for RAID 0. The kernel recognizes the partitions fine at boot when I specify root=/dev/md0.

If I suspend to RAM several times in a row (4-5 times?), the last suspend takes about 20-30 seconds, and some extra text shows up on the console (I need to take a picture, but I haven't been fast enough yet). I think it said something about inodes.

The first time this happened, KDE came back up but things didn't work well. I switched to the console and eventually got a kernel panic. There were many pages of errors before the panic.

The second time it happened, I got a kernel panic before the GUI came up. This time, there was only one screenful of errors.

The errors (first 3 repeated many times, last one not as often):
Code:

sd 1:0:0:0: [sdb] Unhandled error code
sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
end_request: I/O error, dev sdb, sector 845256191
EXT3-fs error (device md0): ext3_get_inode_loc: unable to read inode block - inode=28402518, block=113606711
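
To put the failing sector in context, here is a rough Python sketch that converts the sector number from the end_request line into a byte offset on sdb. It assumes 512-byte logical sectors, which the log itself does not state, and the function name is made up for illustration.

```python
# Assumption: 512-byte logical sectors (the kernel log above does not say so).
SECTOR_SIZE = 512


def sector_to_offset_bytes(sector, sector_size=SECTOR_SIZE):
    """Return the byte offset at which the given sector starts."""
    return sector * sector_size


# Sector number copied from the "end_request: I/O error" line.
offset = sector_to_offset_bytes(845256191)
print(f"{offset} bytes, about {offset / 1e9:.1f} GB into the disk")
```

Under that assumption the sector sits roughly 433 GB into the disk, so it is a plausible on-disk address for a 500 GB drive rather than an obviously out-of-range one.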

Eventually I got several of
Code:

Read-error on swap-device (8:16:381061879)
Read-error on swap-device (8:16:381061...)
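
The numbers in parentheses look like major:minor:sector. As a sketch only, the following Python decodes the first two fields using the conventional Linux numbering for sd disks (major 8, 16 minors per disk). The helper names are hypothetical, and the field layout is my reading of the message, not something the log confirms.

```python
import re
import string

SD_MAJOR = 8          # conventional major number for the first 16 sd disks
MINORS_PER_DISK = 16  # minor 0 of each group is the whole disk, 1-15 are partitions


def sd_name(major, minor):
    """Map a major:minor pair to an sd device name such as 'sdb' or 'sdb1'."""
    if major != SD_MAJOR or not 0 <= minor <= 255:
        raise ValueError("only major 8 with minors 0-255 is handled here")
    disk_index, partition = divmod(minor, MINORS_PER_DISK)
    name = "sd" + string.ascii_lowercase[disk_index]
    return name if partition == 0 else f"{name}{partition}"


def parse_swap_error(line):
    """Extract (major, minor, sector) from a 'Read-error on swap-device' line."""
    match = re.search(r"\((\d+):(\d+):(\d+)\)", line)
    if match is None:
        return None  # truncated or unrelated line
    return tuple(int(field) for field in match.groups())


major, minor, sector = parse_swap_error("Read-error on swap-device (8:16:381061879)")
print(sd_name(major, minor), sector)
```

With that numbering, 8:16 maps to sdb, the same disk named in the I/O errors earlier. The truncated second line is deliberately not parsed, since its sector number is incomplete.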

and then
Code:

Kernel panic - not syncing: Attempted to kill init!

Both times, fsck reported errors on the next boot and said it would need to be run manually. When I ran it, I got a lot of errors, though the second time there were fewer types of errors and they all fit on one screen.

The errors:
Code:

corrupted orphan linked lists
inode with zero dtime
inode bitmap differences
free blocks count wrong for group ...
free inodes count wrong for group ...
free blocks count wrong
free inodes count wrong

I googled "corrupted orphan linked list", and apparently recovered files should turn up in /lost+found, but that directory is empty, even with ls -a.

I have been checking dmesg regularly, but haven't seen any warning signs again. I have not been letting my machine suspend many times, for fear of corrupting data. After a couple of suspends, I restart or shut down instead.
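
For checking dmesg after each resume, something like the Python sketch below could pick out the lines that matter. The pattern list is just the error strings quoted in this post, and the function names are hypothetical. It is not a complete list of what the kernel might print.

```python
import subprocess

# Substrings taken from the errors quoted earlier in this post.
WARNING_PATTERNS = (
    "Unhandled error code",
    "I/O error",
    "EXT3-fs error",
    "Read-error on swap-device",
)


def find_warning_lines(log_text, patterns=WARNING_PATTERNS):
    """Return the log lines that contain any of the given substrings."""
    return [
        line
        for line in log_text.splitlines()
        if any(pattern in line for pattern in patterns)
    ]


def read_dmesg():
    """Return the current kernel log as text (may require root on some systems)."""
    result = subprocess.run(["dmesg"], capture_output=True, text=True, check=True)
    return result.stdout
```

Running find_warning_lines(read_dmesg()) right after a resume would return an empty list when none of these strings appear, which is only a weak sign that nothing went wrong.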

EDIT: Does anyone know the cause, or know of a solution?
Is there any way to know whether this is harmless (or completely recoverable via fsck)?

I'm using Debian Squeeze with a few packages from Sid and experimental (mostly KDE 4.2 stuff). I compiled kernel 2.6.29 myself.

Hardware is a Phenom II X4 810 on Gigabyte MA770-UD3. North bridge is AMD770, south bridge is AMD SB700. Disks are both Seagate Barracuda 7200.12, model ST3500410AS.

I took pictures of the errors, if that would help.


All times are GMT -5.