Old 02-13-2005, 06:02 PM   #1
LQ Newbie
Registered: Jan 2005
Distribution: Debian
Posts: 10

Rep: Reputation: 0
Help! Need to recover corrupt bzip2 files..


I recently backed up a large portion of my /home directory to migrate to another machine.
I used:
# tar -cvjf /someotherdrive/backup.tar.bz2 <homeuser>

After trying to decompress the bzip2 file, it reported errors similar to the following:

bzip2: Data integrity error when decompressing.
Input file = (stdin), output file = (stdout)

It is possible that the compressed file(s) have become corrupted.
You can use the -tvv option to test integrity of such files.

You can use the `bzip2recover' program to attempt to recover
data from undamaged sections of corrupted files.

I've used bzip2recover to generate chunks of bzip2 data:
# bzip2recover /someotherdrive/backup.tar.bz2

which generates chunks:

I can find the corrupt chunks by using:
# bzip2 -t <given chunk>.tar.bz2

**My Question**:
Is it possible to drop the given corrupt chunk and recombine the good data in some way? I'm not too concerned about the block that the decompression failed on, but I don't want to lose the data that follows it.

Many thanks.

Last edited by rhinomite; 02-13-2005 at 06:33 PM.
Old 02-13-2005, 06:30 PM   #2
Registered: Apr 2002
Location: in a fallen world
Distribution: slackware by choice, others too :} ... android.
Posts: 22,950
Blog Entries: 11

Rep: Reputation: 860
Re: Help! Need to recover corrupt bzip2 files..

Originally posted by rhinomite
# tar -cvjf <homeuser> /someotherdrive/backup.tar.bz2
You got those the wrong way round, surely?

If that was the actual command you typed you'd
have a bzip2'ed tarfile called <homeuser> with
a content of /someotherdrive/backup.tar.bz2

Old 02-13-2005, 06:32 PM   #3
LQ Newbie
Registered: Jan 2005
Distribution: Debian
Posts: 10

Original Poster
Rep: Reputation: 0
Whoops! Thanks for that...

I actually did use it the other way around!

*I'll edit the first post to reflect the correct usage*
Old 02-15-2005, 05:02 PM   #4
LQ Newbie
Registered: Jan 2005
Distribution: Debian
Posts: 10

Original Poster
Rep: Reputation: 0
*Resolved* - Need to recover corrupt bzip2 files..

I've solved the problem in a roundabout way, so here's how I did it:

1. Use bzip2recover to recover the individual bzip2 blocks (900k by default)
# bzip2recover corrupt.tar.bz2

which generates chunks:

2. Send bzip2 test results to a file for searching (could also grep right here if you want):
# for i in *.bz2; do bzip2 -tvf "$i" >> corruptblocks.out 2>&1; done

3. Search the test results for errors
# grep -i CRC corruptblocks.out
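Steps 2 and 3 can also be collapsed into a single loop that relies on the exit status of bzip2 -t instead of grepping a log. A minimal sketch, assuming the recovered chunks all sit in one directory; the function name list_bad_chunks is made up for illustration:

```shell
# list_bad_chunks [DIR]: print the name of every .bz2 file in DIR (default:
# the current directory) that fails the bzip2 integrity test.
# Hypothetical helper; it only looks at the exit status of `bzip2 -t`.
list_bad_chunks() {
    dir=${1:-.}
    for f in "$dir"/*.bz2; do
        [ -e "$f" ] || continue            # the glob matched nothing
        if ! bzip2 -t "$f" 2>/dev/null; then
            printf '%s\n' "$f"             # non-zero exit: chunk is damaged
        fi
    done
}
```

Running list_bad_chunks in the directory holding the chunks prints one damaged chunk per line, which is the list you need for step 4.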

4. Delete the corrupt block
# rm <corruptblock>.tar.bz2

5. Decompress all of the bzip2 blocks elsewhere
# for i in *.bz2; do bzip2 -dcvf "$i" > "/elsewhere/$i.tar"; done

6. Here you need to look for the first occurrence of a tar header prior to the corrupt block. A tar block basically consists of a tar header, which starts with the filename (e.g. the actual file archived: /pictures/easter/pic02.jpg) and continues with metadata (doesn't matter here), followed by the actual file data. A tar archive is just made up of sequential tar blocks. The aim is to remove the entire tar block in which the corrupt bzip2 block lived. Untarring will then continue as if the corrupt tar block never existed.

I did it using a hex editor as I wasn't too sure in which actual file (from the filesystem) the error had occurred. So if the corruption occurred in block 1000, I would check through block 999, then 998 etc... If you know that you were backing up the "/pictures" directory and it failed around the "/pictures/easter" directory, then grepping for "/pictures/easter" in the few blocks prior should find a match. You need to do the same to find the closest tar header AFTER the corrupt block. Remember that it needs to be just the tar block in which the corruption occurred.
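If you don't know which filename to grep for, another way to locate candidate headers is to search for the "ustar" magic string, which both GNU tar and POSIX ustar archives store 257 bytes into each header. This is only a sketch, not the hex editor approach described above: tar_header_offsets is a hypothetical helper, it assumes GNU grep (for -a, -b and -o), and it will report false positives if file contents happen to contain "ustar".

```shell
# tar_header_offsets FILE: print the 0-based byte offset at which each
# candidate tar header starts inside FILE (one decompressed chunk).
# Assumption: headers carry the "ustar" magic at offset 257, so
# header start = match offset - 257.
tar_header_offsets() {
    LC_ALL=C grep -abo 'ustar' "$1" | while IFS=: read -r off rest; do
        # A match closer than 257 bytes to the start of the chunk belongs
        # to a header that began in the previous chunk, so skip it here.
        if [ "$off" -ge 257 ]; then
            echo $((off - 257))
        fi
    done
}
```

Because bzip2 blocks are cut at arbitrary points in the tar stream, the offsets printed for a chunk will usually not be multiples of 512. That is expected; only the offsets in the fully reassembled archive are aligned.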

block997.tar - Closest preceding header (/pictures/easter/pic01.jpg)
block1000.tar - CORRUPT BLOCK
block1002.tar - Closest trailing header (/pictures/easter/pic02.jpg)

a. Here you would open block997.tar and remove ALL data from the start of the header (/pictures/easter/pic01.jpg......) to the end of the file (the first "/" char onwards).
b. Make sure you delete block998.tar, block999.tar, block1000.tar (should have already been deleted earlier) and block1001.tar.
c. Open block1002.tar and remove ALL data from the start of the file up to and including the byte prior to the start of the next header (/pictures/easter/pic02.jpg). The new file should now start with /pictures/easter/pic02.jpg as opposed to raw data.

(After this, the tar block with the corrupt data should have been removed)
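The trimming in (a) and (c) does not have to be done by hand in a hex editor. Assuming you have already worked out the 0-based byte offset of the header in each file, head and tail can do the cutting. The helper names keep_before and keep_from are invented for this sketch, and head -c is assumed to be available (it is in GNU and BSD userlands):

```shell
# keep_before OFFSET IN OUT: copy only the first OFFSET bytes of IN to OUT.
# Matches step (a): everything from the header onwards is dropped.
keep_before() {
    head -c "$1" "$2" > "$3"
}

# keep_from OFFSET IN OUT: copy IN from 0-based byte OFFSET to the end.
# Matches step (c): the output starts exactly at the header.
# tail -c +N counts from 1, hence the +1.
keep_from() {
    tail -c +"$(( $1 + 1 ))" "$2" > "$3"
}
```

For example, if the pic01.jpg header starts at byte 123456 of block997.tar, then keep_before 123456 block997.tar block997.trimmed writes the part to keep. Write to a new file and move it over the original afterwards, since redirecting onto the input file would truncate it first.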

7. Glue everything back together using:
# cat /elsewhere/*.tar > recovered.tar

8. Untar as usual to recover existing data
# tar -xvf recovered.tar

Hope this helps.

Last edited by rhinomite; 02-15-2005 at 05:04 PM.

