"Failed to read block" during fsck

Willard · 05-30-2010, 09:03 AM

Greetings.

I woke up this morning to the horror that the ext3 partition on my 1TB drive (the only partition on this drive), in an external USB drive case, was not readable.

I did some poking around on Google for related issues, and found that what worked for many was to have fsck check, and possibly fix, some errors that have occurred on the partition. This I did with the following command:

Code:

root@naglfar:/# fsck -y -C -V /dev/sdc1
fsck 1.41.4 (27-Jan-2009)
[/sbin/fsck.ext3 (1) -- /dev/sdc1] fsck.ext3 -y -C0 /dev/sdc1 
e2fsck 1.41.4 (27-Jan-2009)
/dev/sdc1: recovering journal
/dev/sdc1: Attempt to read block from filesystem resulted in short read while reading block 122093217

JBD: Failed to read block at offset 31870
fsck.ext3: Attempt to read block from filesystem resulted in short read while trying to re-open /dev/sdc1
e2fsck: io manager magic bad!
root@naglfar:/#

(Yes, the device name is correct. This is the device that shows up when I connect the USB device to my computer.)

While I would surely miss 1TB of storage space, I would much more sorely miss the >500GB of data that is on the drive.

Could you folks help me:

Find out if the partition is fixable
If so, help me fix it,
If not, help me recover files from it?

Kind regards,
Willard.

P.S: Below are some details you may find useful:

Code:

root@naglfar:/$ uname -r
2.6.28-18-generic

(I know; I've been too lazy to update)

Code:

root@naglfar:/# dmesg | tail
[245827.566727] sd 12:0:0:0: [sdc] Sense Key : Medium Error [current] 
[245827.566737] sd 12:0:0:0: [sdc] Add. Sense: Unrecovered read error
[245827.566746] end_request: I/O error, dev sdc, sector 63
[245827.583129] sd 12:0:0:0: [sdc] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE,SUGGEST_OK
[245827.583141] sd 12:0:0:0: [sdc] Sense Key : Medium Error [current] 
[245827.583150] sd 12:0:0:0: [sdc] Add. Sense: Unrecovered read error
[245827.583160] end_request: I/O error, dev sdc, sector 65

(there is more stuff following the above snippet, but that is from my wireless adapter losing its beacon and trying to find it again; not relevant to this problem)

Code:

root@naglfar:/# lsusb
Bus 002 Device 011: ID 04b4:6830 Cypress Semiconductor Corp. CY7C68300A EZ-USB AT2 USB 2.0 to ATA/ATAPI

(I have other USB devices connected, but this is the device that disappeared from this list when I disconnected the drive).

tredegar · 05-30-2010, 10:28 AM

I hope you are not trying to run fdisk on a mounted filesystem. (Very bad things happen).

If fdisk cannot mend the filesystem, then you need to make sure you do not make any further writes to the disk, and then look at recovery software like testdisk and photorec. They both have good reputations (see the many posts here on LQ).

You will also need another (big) disk to recover the files to.

Willard · 05-30-2010, 01:53 PM

Here is what I did:

I realised that I could not read or write to the partition. I unmounted it.

When I tried mounting it again (I am using Ubuntu; I tried browsing the partition in a file browser), I got this error (in a pop-up, after a delay of ~30 seconds):

Code:

wrong fs type, bad option, bad superblock on /dev/sdc1,
missing codepage or helper program, or other error
In some cases useful info is found in syslog - try
dmesg | tail or so

I tried doing what the error message suggested. This yielded:

Code:

[263912.375866] sd 14:0:0:0: [sdc] Add. Sense: Unrecovered read error
[263912.375876] end_request: I/O error, dev sdc, sector 976745679
[263912.375929] JBD: Failed to read block at offset 31855
[263912.375946] JBD: recovery failed
[263912.375950] EXT3-fs: error loading journal.
[263912.410317] sd 14:0:0:0: [sdc] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE,SUGGEST_OK
[263912.410329] sd 14:0:0:0: [sdc] Sense Key : Medium Error [current] 
[263912.410338] sd 14:0:0:0: [sdc] Add. Sense: Unrecovered read error
[263912.410348] end_request: I/O error, dev sdc, sector 976745919

It is at this point that I started to attempt recovery by using fsck.
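As a cross-check, the failing sector in the dmesg output above can be tied to the filesystem block that fsck complained about. This is only a sketch: the 4096-byte block size and the partition start at sector 63 are assumptions (the start is hinted at by the I/O errors at sectors 63 and 65 in the first post, and should be confirmed with fdisk -l), and `fs_block_to_sector` is a made-up helper name.

```shell
#!/bin/sh
# Convert a filesystem block number to an absolute 512-byte disk sector.
# Assumptions (not confirmed in this thread): 4096-byte filesystem blocks
# and a partition that starts at sector 63.
fs_block_to_sector() {
    block=$1
    block_size=${2:-4096}
    part_start=${3:-63}
    echo $(( block * (block_size / 512) + part_start ))
}

# fsck reported block 122093217 at journal offset 31870.
# dmesg reported journal offset 31855, which is 15 blocks earlier.
fs_block_to_sector $(( 122093217 - 15 ))
```

Under those assumptions the result is 976745679, the same sector named in the dmesg I/O error, which suggests that the unreadable sectors sit inside the journal.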

I never ran fdisk myself. I don't know if fsck uses fdisk. In any case, I never ran fdisk on a mounted partition, as I cannot mount the partition.

What I would like to know, but do not have the know-how to find out:

Is the partition corrupted?
1. If so, is the disk okay?
  1. If so, I expect I can either
    1. repair the partition (preferred), or
    2. recover files from it, and re-partition the disk.
    In either case, I could use some directions on how to do so.
  2. If the disk is dead, is there still hope that I can get the files?
    1. If so, how do I proceed?
    2. If not, then such is life.

business_kid · 05-30-2010, 01:55 PM

Painful lesson: Don't use a 1TB partition. The partition seems in rag order. Try
e2fsck -cfv /dev/sdc1
It is a lot braver when you don't tell it to do things automagically. Check /lost+found/

du -sh lost+found

That will tell you how bad your disk was. Then use tune2fs -i n -c n to set recheck interval or count. It strikes me you never checked this disk.

Willard · 05-30-2010, 02:10 PM

Here is what happens when I run the e2fsck command:

Code:

root@naglfar:/# e2fsck -cfv /dev/sdc1
e2fsck 1.41.4 (27-Jan-2009)
/dev/sdc1: recovering journal
/dev/sdc1: Attempt to read block from filesystem resulted in short read while reading block 122093217

JBD: Failed to read block at offset 31870
e2fsck: Attempt to read block from filesystem resulted in short read while trying to re-open /dev/sdc1
e2fsck: io manager magic bad!
root@naglfar:/#

Between the "recovering journal" output and the next line (short read) there is a ~30 second delay. Perhaps some sort of time-out occurs?

Here's the size of /lost+found/:

Code:

root@naglfar:/# du -sh lost+found/
16K	lost+found/
root@naglfar:/#

However, my root partition is fine. My guess is that you were interested in knowing the size of the lost+found directory on the faulty partition. However, that partition won't mount.

By the way, if I run the e2fsck command again, then it promptly delivers this output:

Code:

root@naglfar:/# e2fsck -cfv /dev/sdc1
e2fsck 1.41.4 (27-Jan-2009)
e2fsck: Attempt to read block from filesystem resulted in short read while trying to open /dev/sdc1
Could this be a zero-length partition?
root@naglfar:/#

As for checking the drive: Yes. I should have. I expected Ubuntu to do the checking for me on boot when necessary, just like it does with my root and home partitions.

Is it more risky to have one large partition on a drive, instead of several smaller ones?

And how can you see from the output above that I have never checked the disk?

tredegar · 05-30-2010, 02:46 PM

Quote:

I never ran fdisk myself.

My typo. Sorry. Post #2 should refer to fsck, not fdisk.

Your other Qs:

Quote:

Is the partition corrupted?

Yes. But all is not yet lost.

Quote:

If so, is the disk okay?

The hardware of the HDD is probably OK, but we haven't tested, and should not until recovery of files has taken place.

Quote:

If so, I expect I can either

1. repair the partition (preferred), or
2. recover files from it, and re-partition the disk.

1 - You have tried repair, and it looks like it failed. But business_kid's suggestions may help (I don't know).

2 - For recovery, testdisk and photorec are not typos, search on them. There's your-favourite-engine or LQ's search. Try the latter first.

Quote:

If the disk is dead, is there still hope that I can get the files?

1. If so, how do I proceed?

The HDD (hardware) does not seem to be dead. The filesystem may be in an "unrecoverable" mess, but photorec and testdisk will probably still be able to rescue all your data files, even if they aren't named properly. I hope you have a few big files, not millions of little ones. Nevertheless, recovery is very probably possible.

I only used testdisk and photorec once, and it was some years ago. The results exceeded my expectations.

If the data on your 1TB disk is really valuable, you should not play with the original medium, in case you make a mistake.

The most secure / reliable / non-destructive way to go about this would be to take an image of the faulty partition to a file, and then mount this file with the loopback option. It will mount apparently as a real disk (even though it is just a file-image of the partition). Then run the recovery tools on the loopback "disk" (which is just a file). It's a virtual, not physical disk if you like. The physical disk isn't touched, and can even be unplugged, so is safe.

If the recovery tools foul-up, you still have your original disk partition to go back to and you can take another image, and try again.

If you are going to go down this route, you will need sufficient disk space not only to save the partition image to but also for the recovered files.

Maybe you need another 2TB of storage (You have 1TB that is faulty, then you need 1TB for the image file, and 1TB for the recovered files to be sent to) if you want to do this properly.
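The imaging step described above might look roughly like the sketch below. The function name `image_partition`, the image path and the mount point are placeholders, not anything from this thread. Plain dd with conv=noerror,sync is used here for simplicity; a dedicated rescue tool may cope better with a failing drive.

```shell
#!/bin/sh
# Copy a source (partition or file) into an image file, continuing past
# read errors. conv=noerror keeps going after a failed read; conv=sync
# pads each short or failed block with zeros so offsets stay aligned.
image_partition() {
    src=$1
    img=$2
    dd if="$src" of="$img" bs=4096 conv=noerror,sync 2>/dev/null
}

# Harmless demonstration on a scratch file instead of a real disk.
demo_dir=$(mktemp -d)
dd if=/dev/urandom of="$demo_dir/source.bin" bs=4096 count=4 2>/dev/null
image_partition "$demo_dir/source.bin" "$demo_dir/copy.img"
cmp "$demo_dir/source.bin" "$demo_dir/copy.img" && echo "image matches source"

# On the real system (as root, with hypothetical paths) this would be:
#   image_partition /dev/sdc1 /mnt/bigdisk/sdc1.img
#   mount -o loop,ro /mnt/bigdisk/sdc1.img /mnt/recover
```

A small block size limits how much data gets zero-filled around each unreadable sector, at the cost of a slower copy.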

Hope this helps.

Willard · 05-30-2010, 03:04 PM

Indeed it does. Thanks for the replies tredegar (and you business_kid). I'll get my hands on a beefy 2TB storage space, and then try the above.

The "bright" side is that after doing so, I won't have to worry about storage space for a while.

tredegar · 05-30-2010, 03:22 PM

Please let us know, in due course, how you got on (we like to know if offered advice was appropriate and useful, or not).

You also need to think about why this filesystem corruption might have occurred, because you will not want it to happen again.

Good luck with the recovery.

PTrenholme · 05-30-2010, 08:43 PM

The error messages you posted in #1:

Code:

/dev/sdc1: recovering journal
/dev/sdc1: Attempt to read block from filesystem resulted in short read while reading block 122093217

JBD: Failed to read block at offset 31870
fsck.ext3: Attempt to read block from filesystem resulted in short read while trying to re-open /dev/sdc1
e2fsck: io manager magic bad!

suggest to me (because it happened to me once, on a smaller USB drive) that the "bad block" may be in your journal file rather than your data. When that happened to me, I used tune2fs to remove the journal. Then fsck could at least run on the drive and, after it ran, I recreated the journal file. (In my case, the USB problem was a result of the cat pulling the USB plug out while he was sharpening his claws.)
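The journal workaround described above would roughly follow the sequence sketched here. The function `journal_rebuild_plan` is hypothetical and only prints the commands, so running it writes nothing to the disk; the three commands themselves are the usual e2fsprogs ones. Two caveats that are assumptions rather than anything confirmed in this thread: tune2fs may refuse to drop the journal while the filesystem is still flagged as needing recovery, and it is safer to try this on an image of the partition than on the original.

```shell
#!/bin/sh
# Print (do not run) the remove-journal / check / re-create-journal
# sequence for a given device or image file.
journal_rebuild_plan() {
    target=$1
    # 1. Turn the ext3 filesystem into plain ext2 by dropping the journal.
    printf 'tune2fs -O ^has_journal %s\n' "$target"
    # 2. Force a full check now that no journal replay is attempted.
    printf 'e2fsck -f %s\n' "$target"
    # 3. Add a fresh journal, making it ext3 again.
    printf 'tune2fs -j %s\n' "$target"
}

journal_rebuild_plan /dev/sdc1
```

Printing the plan first gives a chance to review each step, and to swap in the path of an image file, before anything is run as root.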

FYI, right now I'm trying to image a 60GB drive with several bad blocks. I'm using dd_rescue, but the process has been running for two days now, and it's less than halfway done. I shudder to think about how long it would take to image a TB drive.

syg00 · 05-30-2010, 09:28 PM

Be aware that fsck fixes things. Doesn't necessarily equate to repairing them - especially in auto mode.
Cross linked files will get truncated - at least one of those cross-linked, maybe all. May happen more than once within a f/s. Once done you can't tell it happened unless you get a data/logical error on a file at some later stage.
Makes subsequent recovery debatable as well. Still attempt the recovery, just something to keep in mind.

I am heavily tempted to migrate everything to btrfs.

Willard · 06-01-2010, 06:43 PM

Quote:

Originally Posted by tredegar

Please let us know, in due course, how you got on (we like to know if offered advice was appropriate and useful, or not).

Will do. My drive should ship tomorrow, but if the postal service won't deliver it before the weekend, then I might not be able to properly address this issue until the end of next week.

Quote:

Originally Posted by tredegar

You also need to think about why this filesystem corruption might have occurred, because you will not want it to happen again.

I am guessing that the cause is either

a busted drive (my brother gave it to me after using it for a long time to read and write many small files near constantly; throughout its entire life, the drive has lived in an external USB drive housing)
no checking, ever (I never checked the drive myself, and if Ubuntu never did, then it has never been checked since the drive was partitioned).

Quote:

Originally Posted by PTrenholme

The error messages you posted in #1 suggest to me (because it happened to me once, on a smaller USB drive) that the "bad block" may be in your journal file rather than your data. When that happened to me, I used tune2fs to remove the journal. Then fsck could at least run on the drive and, after it ran, I recreated the journal file. (In my case, the USB problem was a result of the cat pulling the USB plug out while he was sharpening his claws.)

I'll try that.

Quote:

Originally Posted by PTrenholme

FYI, right now I'm trying to image a 60GB drive with several bad blocks. I'm using dd_rescue, but the process has been running for two days now, and it's less than halfway done. I shudder to think about how long it would take to image a TB drive.

At least 2 months, if our setups have the same performance.
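The two-month figure can be reproduced with a little arithmetic, assuming 1TB is taken as 1000GB, that at most 30GB of the 60GB drive had been copied after two days, and that the copy rate stays constant. `estimate_days` is just an illustrative helper name.

```shell
#!/bin/sh
# Estimate total imaging time in whole days (rounded up), assuming the
# copy rate stays constant: total_gb * elapsed_days / done_gb.
estimate_days() {
    total_gb=$1
    done_gb=$2
    elapsed_days=$3
    echo $(( (total_gb * elapsed_days + done_gb - 1) / done_gb ))
}

# "Less than half" of 60 GB in two days: at most 30 GB done so far.
estimate_days 1000 30 2
```

This prints 67, and since less than half had actually been copied, the real figure would be higher still.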

We'll see when I get started.

Quote:

Originally Posted by syg00

Be aware that fsck fixes things. Doesn't necessarily equate to repairing them - especially in auto mode.
Cross linked files will get truncated - at least one of those cross-linked, maybe all. May happen more than once within a f/s. Once done you can't tell it happened unless you get a data/logical error on a file at some later stage.
Makes subsequent recovery debatable as well. Still attempt the recovery, just something to keep in mind.

I thought only Winblows file systems cross-linked files.

Thanks for the tip, I'll keep this in mind.

Quote:

Originally Posted by syg00

I am heavily tempted to migrate everything to btrfs.

I used EXT3 because EXT3 is backwards-compatible with EXT2, and there are EXT2 drivers for Winblows. I thought I would be hooking the drive to a Winblows machine more often than I in fact have (each time I have considered doing so, I have had a Linux laptop at my disposal, and they can easily function as a proxy).

As for future drives, I have been considering XFS. It journals, auto-recovers on mount (*), performs very well (particularly with large files), and has a similarly-ludicrous maximum file size.

(*): If a file cannot be recovered, its content is replaced by null-bytes. This is bad if you are using the drive for your OS, and your system crashes, as you do not necessarily know which files your OS was writing to. Also, this kills any configuration file you are editing (been there). However, in the way I am using the drive (only copying files to the drive from a "safe" source), I will never lose data this way.