LinuxQuestions.org


Kropotkin 11-08-2012 03:32 AM

disk failing? Best strategy
 
Hi all,

Yesterday evening I started seeing some disk errors on the home partition, /dev/sda2, of my Fedora 17 box. Here is a line from dmesg:
Code:

[2300766.067364] EXT4-fs error (device sda2): ext4_wait_block_bitmap:447: comm flush-8:0: Cannot read block bitmap - block_group = 121, block_bitmap = 3670025
[2300775.140555] EXT4-fs (sda2): delayed block allocation failed for inode 2230055 at logical offset 65 with max blocks 2 with error -5
[2300775.140560] EXT4-fs (sda2): This should not happen!! Data will be lost

I see this error for three different inodes.
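For anyone reading the log above: kernel messages usually report errors as negative errno values, so "error -5" should correspond to EIO, i.e. the read failed at the device level. A minimal sketch for decoding such a code; the helper name describe_kernel_error is made up for illustration, and the message text returned by os.strerror depends on the platform:

```python
import errno
import os


def describe_kernel_error(code):
    """Map a kernel-style negative error code (e.g. -5) to its errno name and message.

    Hypothetical helper, not part of any tool mentioned in this thread.
    """
    number = abs(code)
    name = errno.errorcode.get(number, "unknown")
    return name, os.strerror(number)


if __name__ == "__main__":
    # The code reported in the dmesg excerpt above.
    print(describe_kernel_error(-5))
```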

Right now I am copying a collection of MP3s to a USB drive, but for some reason it is going extremely slowly (approx. 1 GB/hour). What I would really like to do is boot Fedora in rescue mode from a USB stick, mount the problematic partition, and copy the remaining MP3s to the USB drive, but I am apprehensive that the partition won't be mountable if the disk is corrupted in any way.

Any thoughts on the best way to manage this?

TobiSGD 11-08-2012 05:23 AM

The best strategy would be to make an image from that partition using ddrescue (or the whole disk, if necessary) and try to retrieve your data from that image.

adol83 11-08-2012 05:34 AM

Hi,

fsck doesn't give any message?

unSpawn 11-08-2012 06:45 AM

Quote:

Originally Posted by adol83 (Post 4824868)
fsck doesn't give any message?

Depending on the cause, the type and severity of the errors seen, and whether the data is "just" hard to come by or irreplaceable, running fsck may (or may not) be a critical error of judgment. Running fsck in the face of hardware trouble may even compound the problem and thwart later recovery attempts.


Quote:

Originally Posted by TobiSGD (Post 4824862)
The best strategy would be to make an image from that partition using ddrescue (or the whole disk, if necessary) and try to retrieve your data from that image.

That doesn't really address the OP being apprehensive wrt disk corruption, does it? Or would you be able to put your hand on your heart and say the block access failures shown represent isolated incidents? Or, phrased differently, what diagnostics would make you decide to stop copying and start a dd_rescue / ddrescue? Just curious, because I have had copy ops stop and then started a dd, only to find the disk wouldn't power up again...

H_TeXMeX_H 11-08-2012 07:42 AM

I say just wait for it to complete. If it looks like it is failing fast, there probably isn't much time left to do anything.

TobiSGD 11-08-2012 08:33 AM

Quote:

Originally Posted by unSpawn (Post 4824920)
That doesn't really address the OP being apprehensive wrt disk corruption, does it? Or would you be able to put your hand on your heart and say the block access failures shown represent isolated incidents? Or, phrased differently, what diagnostics would make you decide to stop copying and start a dd_rescue / ddrescue? Just curious, because I have had copy ops stop and then started a dd, only to find the disk wouldn't power up again...

You are right, I didn't realize that the OP is still copying at this stage. It comes down to weighing the risks: is it riskier to stop the copy and make an image, or to let the drive keep copying the MP3s and maybe die in the process from the stress? When ddrescue hits a bad sector it moves on to the next one and leaves the bad sector for later rescue attempts, so that at least the known good sectors get copied. Ordinary copy commands will, I would think, try to read the same sector over and over again, whether or not the sector can be recovered, putting more stress on the disk. In this case I would go so far as to unmount the /home partition and make an image of it; no need for live CDs or power-downs. Of course, this is only viable if the OP has an external medium large enough to hold the image.
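To illustrate the "skip the bad sector and come back later" idea, here is a rough Python sketch of a single first pass. It is not how ddrescue itself is implemented; first_pass and read_block are hypothetical names, and the block size is an arbitrary assumption. Readable blocks are written at their original offsets, and unreadable ones are recorded for a later retry instead of being hammered repeatedly:

```python
import os


def first_pass(read_block, total_size, out, block_size=64 * 1024):
    """Copy every readable block to `out`; return the list of unreadable ranges.

    read_block(offset, length) is a hypothetical callable that returns bytes,
    or raises OSError (for example EIO) when the region cannot be read.
    `out` must be a real file opened in binary read/write mode.
    """
    bad_ranges = []  # (offset, length) pairs to retry in a later pass
    offset = 0
    while offset < total_size:
        length = min(block_size, total_size - offset)
        try:
            data = read_block(offset, length)
        except OSError:
            # Do not retry now: note the range and move on to the next block.
            bad_ranges.append((offset, length))
        else:
            out.seek(offset)
            out.write(data)
        offset += length
    # Make the image the full size even if the last blocks were unreadable.
    out.truncate(total_size)
    return bad_ranges


# Usage sketch against a real device (assumption: run as root, partition unmounted):
#   fd = os.open("/dev/sda2", os.O_RDONLY)
#   size = os.lseek(fd, 0, os.SEEK_END)
#   with open("/mnt/usb/sda2.img", "w+b") as img:
#       bad = first_pass(lambda off, n: os.pread(fd, n, off), size, img)
```

The real tool does considerably more (a map file, smaller retries, reverse passes), so for an actual rescue use ddrescue rather than a script like this.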

BoraxMan 11-09-2012 09:43 PM

I think the best option is to use dd_rescue and, if possible, make the target a sparse file to save space.
The most important thing is to avoid powering the disk down and back up, as any time it goes down might be the last time.
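As a rough illustration of what "sparse" means here: blocks that are entirely zero are skipped with a seek instead of being written, so the filesystem can leave holes for them. The sketch below assumes a hypothetical helper write_sparse and a 4096-byte block size; whether the holes actually save space depends on the target filesystem. As far as I know, GNU ddrescue offers this via its -S / --sparse option.

```python
import os


def write_sparse(chunks, out, block_size=4096):
    """Write `chunks` to `out`, seeking over all-zero chunks instead of writing them.

    Hypothetical helper for illustration. `out` must be a real file opened in
    binary write mode. Returns the logical size of the resulting image.
    """
    size = 0
    for chunk in chunks:
        if chunk == bytes(len(chunk)):
            # All zeros: skip ahead and leave a hole in the file.
            out.seek(len(chunk), os.SEEK_CUR)
        else:
            out.write(chunk)
        size += len(chunk)
    # A trailing hole is only counted once the file length is set explicitly.
    out.truncate(size)
    return size
```

Reading the image back returns zeros for the holes, so the contents are identical to a fully written file.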


All times are GMT -5.