LinuxQuestions.org


INCyr 01-06-2013 09:29 AM

Trying to save a failing HDD
 
Hey everyone-

I have a feeling I'm SOL, but I'm hoping someone here might be able to help me.

I have a 1TB WD Black Drive that's failing. I have no delusions about saving the drive, but I'm trying to get the data off of it before it goes.

Unlike most failed drives, this one does not seem to be clicking, so I'm not entirely sure what's wrong with it. However, it's incredibly slow to read from, and I'm not sure if I'm able to write to it - if it's failing, what's the point?

I picked up another 1TB WD Black last night, and this morning I started a ddrescue operation on the failing drive, hoping to clone the drive before it goes completely. However, I'm slowly losing hope that this is a feasible salvage attempt.

ddrescue actually seems to be working well - after an hour, I'm sitting at 28MB saved, with no errors reported. However, that's the problem - it's an hour for 28MB - my average speed is just over 5 kB/s. At this rate, it's going to take years to clone the drive.

Edit: for clarification, the exact command used in ddrescue was the following:
ddrescue -v -r3 -f /dev/sdb /dev/sdc clonelog.log

Part of the problem is that the drive really seems to be failing. While ddrescue is running, I'm getting the following messages:

ata19.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
ata19.00: failed command: READ DMA
ata19.00: cmd c8/00:00:00:df:00/00:00:00:00:00/e0 tag 0 dma 131072 in
res 40/00:00:00:00:00/00:00:00:00:00/40 Emask 0x4 (timeout)
ata19.00: status: { DRDY }
ata19: hard resetting link
ata19: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
ata19.00: configured for UDMA/33
ata19.00: device reported invalid CHS sector 0
ata19: EH complete

This then repeats intermittently between ddrescue updates. So my question is:

What I'm really after is a specific set of data on the drive - specifically some photos. There's about 150 GB of them, estimated, and that's all I care about retrieving. If I could copy only that data off, that'd reduce the amount of time required by about 85%, which is huge. Is there any way to do that? I think the drive itself is about 600 GB full, so possibly making an image file might be a better way to go, but I have no experience in dealing with image files, and would have no idea what to do with it after I finished.

Is there any way to help the drive at all? Doing Google searches, I've seen people suggest a variety of things, but I'm really not sure what I should be looking at doing, or how to do it. I'm only about an hour into ddrescue at this point, so I'd rather restart it now than wait for it to get further in.

Any help would be greatly appreciated.

buttugly 01-06-2013 10:07 AM

Hi,

You mentioned you had acquired another of the same drive? If so, look at the electronic parts: there are 5-6 small screws which hold the board to the drive. CAREFULLY swap the electronics. They make contact through the drive case, so nothing will explode when you take it apart. I use a LARGE WHITE towel to keep everything in one spot.

I've done this many times. I usually hunt down an identical drive on eBay and do the deed. I've always been "lucky", a dozen times and counting, and recovered everything.

Once recovered, swap the electronics back. Make sure you check the warranty before pitching the drive; WD is usually 3 years.

If you were using the new Black to copy to, you'll have to hunt down another one. You can, if you have space, copy a partition to an iso and then mount it with loop to get to your data.

I agree with your ddrescue choice, it's a personal favorite.

Also, heat makes stuff expand. I've used cold packs on drives occasionally with success as well.

Good Luck!
Kevin

INCyr 01-06-2013 10:10 AM

So you think the problem is the controller on the hard drive? If I stop ddrescue and swap it out, you think I'll be able to save the drive data? I'll consider it. I actually have a third Black sitting around I might use for the controller. *sigh* I just want this stupid data off this drive.

buttugly 01-06-2013 10:18 AM

That's usually been my experience. Give it a shot, you won't hurt anything.

-r3 means retry 3 times. If it were me:

ddrescue /stuff_to_save /savefile.iso logfile

Don't generate any more heat initially, and save as much as possible.

You can always use the log to narrow down the "search area". "info ddrescue" has lots of examples.

kevin

INCyr 01-06-2013 10:27 AM

Yeah, I stopped and restarted ddrescue with the following:

ddrescue -v -f -i #lastpos -d -n /dev/sdb /dev/sdc clonelog.log

Hoping that will help speed things up a bit. Still worried about how long this is going to take. A couple of days, that's fine. But weeks? Months? Even at 20 kB/s that's still 2 years, if I'm doing my math right.
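
A quick way to sanity-check estimates like these is a few lines of shell arithmetic. This is only a rough sketch: est_days is a made-up helper name, and it assumes that a 1 TB drive holds 10^12 bytes and that the read rate stays constant, which it clearly does not on a failing drive.

```shell
# Rough number of whole days needed to copy BYTES at RATE bytes per second.
# est_days is a hypothetical helper, not part of ddrescue.
est_days() {
    echo $(( $1 / $2 / 86400 ))
}

DISK=1000000000000          # assumed: 1 TB = 10^12 bytes

est_days "$DISK" 5000       # whole disk at about 5 kB/s
est_days "$DISK" 20000      # whole disk at about 20 kB/s
```

For the whole disk that works out to roughly 2314 days (over six years) at 5 kB/s and 578 days (about a year and a half) at 20 kB/s.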

How do I isolate the directory I want to save? I'd be happy to do that, although I still need to figure out how to deal with images. I tried mounting the drive, but mount can't find it, even though ddrescue has no problems finding it. For reference, the drive is actually off a Windows 7 system, and is formatted NTFS.

Sadly, I can't seem to find any screwdriver heads to fit the screws on the electronics to swap them. Worried about forcing the screws out. Still not entirely sure I like that option, but there is some appeal to it.

gradinaruvasile 01-06-2013 10:44 AM

Did you try to actually mount the partition under Linux? I had many cases where there were NTFS partitions created in Windows that wouldn't mount in Windows, but worked just fine on Linux. Yours will certainly exhibit errors and go slow, but copying from an actually mounted drive might be faster than with ddrescue. Additionally, you will copy actual files, not an image (which requires the extra hassle of mounting and extracting the actual files from it).

INCyr 01-06-2013 11:01 AM

Update:

So I followed Kevin's advice, and swapped out the electronics from my spare drive to the failing one. ddrescue was reporting >750KB/s last I looked with the new electronics. So once I have a clone made, I'm going to try to boot back into Windows and see if I can copy the files I want straight off the drive. Then I have to figure out what to do with these drives, and how I want to deal with them.

I'm still anxious and worried about this, but it looks like maybe I might be on the right path now, and on a path that won't take me 2 years to get my data. I'll probably post updates as I get them, but I hope that was the answer I needed.

Thanks again.

Ian

buttugly 01-06-2013 11:14 AM

Couple of things, if it were me:

Stop. DEEP BREATH.

Find/buy correct screwdriver

Swap electronics....nothing to lose......plus if it works it will be FASTER

Not sure what you're working with horsepower-wise. Writing from a partition to "force creating and writing to another partition" takes way more than writing to a "file" (iso) and adds complication, imho.

-d means direct....I like the kernel cache, don't do this until try 2

-n no split.....ok we can do this on try 2

-i I don't fully understand why this would be needed, the logfile knows all

It took 9 days to save a DVD by the time all was said and done. Saved a neighbor's phone's 4 gig SD card in 3 days.

With the slowness you are describing, I really believe this is "electronics" related. The mechanical stuff either works or doesn't.

Do not mount, you might actually write and that would be bad!!!!

I think there is a way to figure out what files are stored where but poking around we should avoid till we have a copy.

The log file is "the saving grace"! With it you can always add more, and ddrescue will not redo something it has already done unless specifically told to (-r1 etc.).
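
To make the log file point concrete, here is a sketch of a two-pass run that reuses one log file. The device names are the ones from the first post and the log name rescue.log is just a placeholder. Device letters can change between boots, so check them before running anything. The script only prints the commands for review, it does not run them.

```shell
# Assumed names: SRC is the failing drive, DST the new one, LOG a placeholder.
SRC=/dev/sdb
DST=/dev/sdc
LOG=rescue.log

# Pass 1: grab everything that reads easily and skip the slow work
# on bad areas (-n).
pass1="ddrescue -f -n $SRC $DST $LOG"

# Pass 2: same log file, so only the areas still missing are tried again,
# now with direct disc access (-d) and up to 3 retry passes (-r3).
pass2="ddrescue -f -d -r3 $SRC $DST $LOG"

# Printed for review only; nothing is executed here.
echo "$pass1"
echo "$pass2"
```

Because both passes name the same log file, stopping and restarting either one picks up where it left off, with no -i needed.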

I set block size for the dvd. If you are not absolutely sure what it is don't.

When you say images, I am guessing you mean iso, not jpg. Relax, it's all over the web how to mount an iso using a loop device, I promise, even if it's a terabyte or 4.
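
For an image of a whole disk (as opposed to a single partition), the loop mount needs the byte offset where the partition starts. A small sketch, assuming 512-byte sectors unless told otherwise and a start sector read from "fdisk -l" run on the image file. part_offset is a made-up helper, and the start sector 2048 and the paths are only example values.

```shell
# Byte offset of a partition inside a whole-disk image:
# start sector times sector size (second argument, default 512).
# part_offset is a hypothetical helper name.
part_offset() {
    echo $(( $1 * ${2:-512} ))
}

part_offset 2048        # example: partition starting at sector 2048

# Example use (needs root; image path and mount point are placeholders).
# Mounting read-only (ro) keeps the rescued copy from being modified:
#   mount -o ro,loop,offset="$(part_offset 2048)" /path/to/disk.img /mnt/rescue
```

With the example value, the offset comes out as 1048576 bytes (1 MiB).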

Important thing is the jpgs and "professional recovery" is very good but not cheap.

Kevin

INCyr 01-06-2013 12:05 PM

Heh, that deep breath is needed. As mentioned above, I did swap the electronics. Managed to get the screws loose and then was able to get them out - it wasn't elegant, but it was effective. That definitely sped things up quite a bit.

As far as power goes, this is an i7 with 6-8 cores, and at least 4 GB of memory. So I'm not too worried about that.

I was up at an average rate of 3-4 MB/s, but then I decided to get stupid. I stopped the copy, and booted into Windows to see if I could, with the new electronics, copy the pictures I was after onto my system drive. That didn't seem to fare any better than previous attempts, so I'm back to cloning the disk. However, now my speeds are back down to ~200-300 KB/s, so I'm not sure if I screwed something up with my failed attempt at doing it in Windows. Can't imagine I did, so I'm just hoping it's a bad sector, and it'll be back up to an average of 3-4 MB/s before too long.

I include the -i option because I'm not entirely sure how it interacts with the log file, and I don't, at this point, have a single log file I'm working with. When I stopped it the second time, I was worried about overwriting the first log file, so I started a second file, and I've just been using new log files ever since. Sounds like this was a poor choice.

As far as the -d flag goes, my concern was with the READ DMA errors. Most information I found on them seemed to point to a kernel issue, so I figured it'd be better to bypass the kernel. Hasn't seemed to cause any problems.

Anyway, I'm back to the -v -f -d -n -i options, but if you really think I'd be better with a different set of options, I'm happy to restart it. Not sure what to do about the log file either - if a single log file is needed, I'd be better starting over now, I guess.

Thanks for your help, I can't tell you how much I appreciate this. I wish I understood most of the ddrescue flags better.

Ian

INCyr 01-06-2013 03:54 PM

An update, for anyone who cares, and to bump the thread up again for some more views. Could use some more help, I think.

Process is still plugging away. I think I have about 22 GB of the clone created. However, my average speed is dropping, and is under 1 MB/s at this point, and doesn't show any indications of increasing any time soon - most of my current rates are 100-200 KB/s, if they're not 65 or 32 KB/s, which seem more common. I don't mind this taking a while, I just don't want it to take forever, and at 100 KB/s we're looking at 3 months or more, which just doesn't seem practical.

If I wanted to make an image instead of just cloning the drive, I'd have to format the drive into whatever format I wanted (NTFS is preferred, is that an okay format?) and then restart the process? I guess I'm just worried about doing that because I understand cloning, I don't understand imaging.

Anyone know how to isolate just the files I want? This is so frustrating - I mean, it's working, and that's great, but if we have to wait for 2 or 3 months? That's... I'm not sure it's really worth it.

*sigh*

INCyr 01-06-2013 05:38 PM

Alright, after playing around a bit, I have some more information.

hdparm -tT /dev/sdc (failed drive) results in the following:
/dev/sdc:
Timing cached reads: 18632 MB in 2.00 seconds = 9337.64 MB/sec
Timing buffered disk reads: 2 MB in 10.48 seconds = 195.47 kB/sec

Something tells me that buffered disk read is NOT where it should be. Anyone have any idea what affects that speed, and how I might be able to determine more about what's causing the problem?

As per the above conversation, I've already swapped out the electronics on this guy, and that made a difference (although I'm tempted to go back and see if they also affect the hdparm results) but I find it hard to believe that the electronics on two different drives are going bad at the same time - although they were both bought at the same time, and are the same model, so I guess it's not THAT crazy.

If this is truly a hardware issue internal to the drive, then I think I'm screwed. I just can't copy 100s of gigs of data with speeds that low - it's just not feasible. So I'm hoping that there's something else going on here that actually is fixable or at least addressable for a short period.

And I'm happy to try to get any other reports that are requested using standard tools - however, I might need instructions on how to mount and copy the files to a USB thumb drive, as I'll need to post them here using a different computer, and I don't use Linux often enough to know how to do that off the top of my head.

Thanks,
Ian


All times are GMT -5.