How to back up files off a failing hard drive?
Hello everyone. I have a 120GB drive (formatted with reiserfs) that is in the process of dying. I get strange errors (usually involving DriveNotReady and hda_dma) on the screen, and each new badblocks scan gives me more failed sectors, so I know the drive is on its way out.
I am trying to replace it with a new 250GB drive. I tried to use GNU parted to clone the drive, but the I/O errors prevent it from working. So I'm trying to just copy as much of the failing hard drive's data as I can to the new one. I'm currently booting the system from System Rescue CD. I have the old hard drive (/dev/hdb2) mounted on /mnt/temp2, and the new one (/dev/hda2) mounted on /mnt/temp1. I started by trying tar:
Code:
cd /mnt/temp2; tar cf - * | (cd /mnt/temp1; tar xvpf -)
Code:
rsync -avlH --progress /mnt/temp2 /mnt/temp1
Can anyone suggest a method of copying the files from the failing hard drive to the new hard drive that
- will preserve all time/permission/owner/etc. metadata,
- will not stall on I/O errors but will try to copy as many files as it can, and
- will give me a list of all files that had problems so I can concentrate on trying to recover those specific files?
I don't mind erasing those 30GB of files and starting over, but I would like to stress the drive as little as possible until I've gotten the data off of it. Thanks! |
Does the System Rescue CD come with ddrescue? If so, you can use it to copy the drive to a disk image. Unlike the regular dd command, ddrescue will do its best to rescue bad blocks. If you can get a clean disk image copied, you can recover your actual data from there.
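A two-pass run with GNU ddrescue might look roughly like the sketch below. The function name rescue_image and the paths in the usage comment are made up for illustration, and the -n and -r options are GNU ddrescue's (dd_rescue is a different program with different options), so check the manual for the version you have.

```shell
# rescue_image SOURCE IMAGE LOGFILE
# Hypothetical helper: copy SOURCE to IMAGE with GNU ddrescue in two passes.
rescue_image() {
    src=$1
    img=$2
    log=$3
    # Pass 1: copy everything that reads easily. -n skips the slow work on
    # damaged areas, so the failing drive is stressed as little as possible.
    ddrescue -n "$src" "$img" "$log" || return 1
    # Pass 2: return to the bad areas recorded in the logfile and retry
    # them up to 3 times. The logfile also lets an interrupted run resume.
    ddrescue -r 3 "$src" "$img" "$log"
}

# Example (device and paths are assumptions, adjust to your setup):
#   rescue_image /dev/hdb2 /mnt/temp1/hdb2.img /mnt/temp1/hdb2.log
```

Keeping the logfile on the new drive, next to the image, means later runs can pick up where the previous one stopped instead of re-reading good areas.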
|
ddrescue appears to be working well
Neither System Rescue CD nor Knoppix contains ddrescue (although both contain dd_rescue). There is a post on the System Rescue CD forums suggesting the inclusion of ddrescue, so it might be there eventually. Knoppix has enough build tools that I was able to download the source from the ddrescue homepage and build it easily.
I am following the instructions I found on the TestDisk wiki, which are as follows: Quote:
|
ddrescue appears to have done the trick
ddrescue appears to have pulled as much off the old, failing HD as it could -- I now have a 111GB image file on the new HD. It apparently encountered over 1600 errors doing the more in-depth scan, and the logfile is over 5kB, so it definitely did not go flawlessly.
Now, the difficult part -- trying to turn the logfile into a coherent list of files that are corrupted beyond recovery. If there are no irreplaceable files in that list, I can just restore the image, reinstall some applications, and be on my way. A search of the bug-ddrescue list shows I might need a Perl script called 'ddrsummarize.pl', but I'll have to request that from the list. |
Kaynos,
You can get ddrsummarize.pl, ddr2nfi.pl, nficruncher.pl, ddrlogor.pl, and a lot of other ddrescue-related Perl scripts by downloading them from my server, here: www dot burtonsys dot com slash download slash ddr2sr.zip (Sorry, I've been a "member" here for nearly a year, but this annoying web site still won't let me post URLs, so change " dot " to "." and change " slash " to "/" to turn the above into a usable URL.)
First, I recommend that you look closely at the ddrescue logfile. One sector is 0x200 bytes. So if the "-" status logfile entries are all multiples of 0x1000 then what you are seeing is the result of whole-cluster-at-a-time reads failing. You can probably use a "raw" device to get ddrescue to read individual sectors, which will reduce the number of bad sectors considerably. However, use of raw devices depends very much on what OS version you are running, which makes it difficult to tell you just what commands are needed.
If your hard disk drive is not in NTFS format, then I can't offer much help identifying the damaged files. But if your hard disk drive is in NTFS format, then here's what I would do if I were you:
1) Copy the entire rescued disk image to a scratch drive.
2) Save a copy of the partition table, like this:
fdisk -lu drive.ima >fdisk-lu_output.txt
or:
fdisk -lu /dev/hdd >fdisk-lu_output.txt
(or whatever)
2.1) Use ddr2nfi.pl (formerly called srddrnfi.pl) to generate a .bat script of 'nfi' commands from the ddrescue logfile. Call the .bat script "nficmds.bat":
perl -w ddr2nfi.pl nficmds.bat - fdisk-lu_output.txt drive.log
or if you used SpinRite (probably via ddr2sr.pl):
perl -w ddr2nfi.pl nficmds.bat SPIN_LOG.3 fdisk-lu_output.txt log_before_SR log_after_SR
where: SPIN_LOG.3 is the SpinRite log file (extension varies), log_before_SR is the ddrescue logfile saved before running SpinRite, and log_after_SR is the ddrescue logfile created after running SpinRite.
3) Copy the nficmds.bat file to a thumb drive or diskette.
3.1) If you don't already have Microsoft's nfi ("NTFS File Sector Information Utility") then get it, too: www dot google dot com slash search?q=%22NTFS+file+sector+information%22+site:microsoft.com
4) Attach the scratch drive to a Windows computer as a 2nd drive ("E:" for this example), and start the computer. Do NOT let Windows check the drive during startup, because if you do then you won't get to capture the list of file names that it mentions when checking the drive.
5) If the drive letter doesn't match the drive letter in nficmds.bat, then edit nficmds.bat and fix the drive letters.
6) (This step is optional; if you get tired of acknowledging the pop-up boxes then you can skip steps 6 and 7.) Run nficmds.bat on the Windows computer, redirecting the output into a text file:
nficmds.bat >nfioutput.txt
7) Process nfioutput.txt using nficruncher.pl, to produce a "damaged files report," and various other reports:
perl -w nficruncher.pl -f -d -r -i -u nfioutput.txt
8) In a Windows XP or Win2000 command-prompt window do:
chkdsk /f E: >c:\errors1.log.txt
(c:\errors1.log.txt is effectively another damaged files report.)
9a & 9b) Save the output files produced in steps 6 and 7. Then repeat steps 6 and 7. (Because of the chkdsk you did in step 8, nfi is less likely to produce pop-up messages which you must acknowledge.)
10) Your (not necessarily complete) list of damaged files is the combined list of files found in steps 7, 8, and 9b.
11) It is also possible to use the output of nficruncher.pl to "get smarter" with ddrescue. For example, it produces a free-space sector list, in ddrescue logfile format, called unimportant.log, which you can merge into the regular ddrescue logfile, to make a subsequent ddrescue run pretend that those unused sectors were already rescued, so that you don't waste time trying to recover them.
BTW, that's why you copied the recovered disk or image to a scratch drive in step 1 -- because Windows changes the drive when it examines it (and drastically changes it when doing chkdsk!), which would prevent you from later restarting the rescue process using ddrescue. But since you only let Windows touch a scratch copy, Windows can't mess up the original.
To merge the "unimportant.log" file (free-space sector list) produced by nficruncher.pl, you can use the ddrlogor.pl script ("DDRescue LOGfile logical .OR.").
Note: one trick that I've done is to edit nfioutput.txt before processing it with nficruncher.pl, to make it look like some files that I don't care about, e.g., hiberfil.sys and swapfile.sys, are part of the NTFS partition's free space. Then, after ddrlogor.pl merges unimportant.log, ddrescue's logfile will indicate that those other unimportant files are already recovered, so ddrescue won't waste time trying to recover them.
12 & on) Then you can resume the ddrescue recovery process, perhaps just targeting the most important disk areas, before going back to step 1.
For lots more instructions, see the comments in the various Perl scripts.
-Dave
dave340 at burtonsys dot com but please no spam
P.S. -- If you know of (or write) the equivalent of 'nfi' for FAT32 (or other file systems), or to run under Linux instead of under Windows, then please tell me!! |
You should be able to mount the disk image using mount's "-o loop" loopback option and manually check or rescue important files. Although with 100GB+ of data that might be a bit impractical.
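As a rough sketch of the loopback mount, assuming the image holds a single partition (as an image of /dev/hdb2 would) and that you run it as root: the helper name mount_image_ro and the DRY_RUN switch are hypothetical, added only so the command can be previewed before it is run.

```shell
# mount_image_ro IMAGE MOUNTPOINT
# Hypothetical helper: loop-mount a partition image read-only.
mount_image_ro() {
    image=$1
    mountpoint=$2
    # "loop" attaches the image file to a loop device; "ro" asks for a
    # read-only mount so browsing the files does not modify the image.
    # With DRY_RUN set to a non-empty value the command is printed, not run.
    ${DRY_RUN:+echo} mount -o loop,ro "$image" "$mountpoint"
}

# Example (paths are assumptions):
#   DRY_RUN=1 mount_image_ro /mnt/temp1/hdb2.img /mnt/old_disk
#   mount_image_ro /mnt/temp1/hdb2.img /mnt/old_disk
```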
ncdave, you need to have 5 posts before you're allowed to link to other pages. It's an anti-spam defense. |
Fixed up now
The ddrsummarize.pl script told me that the 1688 errors only totaled about 800kB of lost data, but didn't give me any way to convert the sector numbers to filenames.
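For anyone without the Perl scripts, a similar total can be approximated with a few lines of shell. This is only a sketch and not what ddrsummarize.pl actually does: it assumes the usual ddrescue logfile layout (comment lines starting with "#", one current-position line, then "pos size status" lines with hex values), and the function name summarize_bad is made up. It also checks whether every bad area starts and ends on a 0x1000 boundary, the sign of whole-cluster read failures mentioned earlier in the thread.

```shell
# summarize_bad LOGFILE
# Print the number of bad ("-") areas in a ddrescue logfile, their total
# size in bytes, and whether all of them are aligned to 0x1000 bytes.
summarize_bad() {
    count=0
    total=0
    aligned=yes
    while read -r pos size status rest; do
        # Only lines whose first field is a hex number are of interest;
        # this skips comments and blank lines.
        case $pos in
            0x*) ;;
            *) continue ;;
        esac
        # Data lines carry the status in the third field. The current-position
        # line has no "-" there, so it is skipped by this test as well.
        [ "$status" = "-" ] || continue
        count=$((count + 1))
        total=$((total + $size))
        if [ $(( $pos % 0x1000 )) -ne 0 ] || [ $(( $size % 0x1000 )) -ne 0 ]; then
            aligned=no
        fi
    done < "$1"
    echo "bad_areas=$count bad_bytes=$total all_4k_aligned=$aligned"
}

# Example (logfile path is an assumption):
#   summarize_bad /mnt/temp1/hdb2.log
```

If all_4k_aligned comes back "yes", the bad areas are probably larger than the real damage, and re-reading them sector by sector may shrink the list.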
I posted[1] on the ddrescue mailing list to ask how to do the conversion, and was referred to a HOWTO[2] with instructions. Unfortunately, the HOWTO is for ext2/ext3 filesystems and the debugfs command used is specific to ext2/ext3. I tried asking[3] the reiserfs mailing list, but found that there is no equivalent command for reiserfs. I ended up following the instructions[4] in a previous post to the reiserfs list, and ran find /mnt/old_disk -type f -exec cat {} > /dev/null \; . This found that only 3 files had problems being read and written to /dev/null (two of which were in my Firefox cache, so really no problems there). I copied the files out of the disk image using the original tar command, and deleted the 3 files. One quick re-LILO later, and the system boots and is pretty much running fine. I haven't done a thorough system check yet, but thus far it seems OK. Hopefully this thread will be of use to anyone else who has a disk failure on a reiserfs system.
[1] http://lists.gnu.org/archive/html/bu.../msg00004.html
[2] http://smartmontools.sourceforge.net/BadBlockHowTo.txt
[3] http://marc.theaimsgroup.com/?l=reis...8754104268&w=2
[4] http://marc.theaimsgroup.com/?l=reis...5109321290&w=2 |
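The find/cat check above reports read errors on the terminal as they happen. A variant that prints just the names of the files that fail, so the list can be saved, might look like the sketch below. The function name list_unreadable is made up, and it treats any read failure (not only I/O errors from damaged areas) as a problem file.

```shell
# list_unreadable DIR
# Print the path of every regular file under DIR that cannot be read
# all the way through.
list_unreadable() {
    find "$1" -type f -exec sh -c '
        for f in "$@"; do
            # cat reads the whole file; if the read fails part-way (for
            # example on an area that was never rescued), cat exits
            # non-zero and the file name is printed.
            cat -- "$f" > /dev/null 2>&1 || printf "%s\n" "$f"
        done' sh {} +
}

# Example (paths are assumptions):
#   list_unreadable /mnt/old_disk > /root/damaged_files.txt
```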