Old 11-21-2009, 12:13 PM   #1
Sum1
Member
 
Registered: Jul 2007
Distribution: Fedora, CentOS, and would like to get back to Gentoo
Posts: 332

Rep: Reputation: 30
Constant .tar.bz2 data corruption


Hi Group,

About a month ago I noticed several problems with my backups on my Samba server. There are two hard drives in the server: /dev/sda holds the live data accessed by the users, and /dev/sdb holds all the foo.tar.bz2 daily backups. /dev/sdb is mounted as "/archive" in /etc/fstab.

Both drives use the ext3 filesystem with the mount options: defaults,noatime,data=writeback

1. I've fixed the root filesystem on /dev/sda and it's running fine.

2. I've written zeros to /dev/sdb using dd and freshly created the ext3 filesystem.

3. I have a bash script that runs daily as a cron job and creates a .tar.bz2 of all the user data on the Samba server. It seems to execute fine and compresses 20 GB into a 12 GB file.

4. PROBLEM: Every test of every bzipped archive fails. When I try to decompress and unarchive the data, there are error messages about corrupted data and a suggestion that maybe it can be recovered using bzip2recover. Relying on bzip2recover for all my user data is not a good long-term plan.

So for now, I'm simply making nightly copies of the live user data to /dev/sdb, but I definitely miss the space savings afforded by bzip2 compression.

Is there anything I can try to further test why the compressed data gets corrupted?
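
For reference, the failure can be reproduced without a full unpack; a minimal sketch, assuming GNU bzip2 and tar and an archive named foo.tar.bz2:

bzip2 -tvv foo.tar.bz2            # test the compressed stream's integrity
echo $?                           # non-zero means corruption was found
tar -tjf foo.tar.bz2 > /dev/null  # walk every archive member without extracting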

Thanks for your help.
 
Old 11-21-2009, 12:46 PM   #2
MS3FGX
LQ Guru
 
Registered: Jan 2004
Location: NJ, USA
Distribution: Slackware, Debian
Posts: 5,852

Rep: Reputation: 361
Nothing in your post is enough to identify the problem, though you might want to look at a more robust filesystem than EXT3.

Do the kernel or system logs mention anything while the script runs? Can you post the script here in CODE tags so we can look it over for a possible glitch? Matter of fact, has the script ever worked as intended, or have you been able to run it on a different system?
 
Old 11-21-2009, 04:46 PM   #3
Sum1
Member
 
Registered: Jul 2007
Distribution: Fedora, CentOS, and would like to get back to Gentoo
Posts: 332

Original Poster
Rep: Reputation: 30
Quote:
Originally Posted by MS3FGX View Post
Though you might want to look at a more robust filesystem than EXT3.
What would you suggest?

Quote:
Do the kernel or system logs mention anything while the script runs?
No.

Quote:
Can you post the script here in CODE tags so we can look it over for a possible glitch?
Here it is, very simple stuff:

#!/bin/bash
# Nightly backup: tar and bzip2-compress /foo into a
# date-stamped directory under /archive.
echo "1. Make a date-stamped storage directory."
cd /archive || exit 1
dirname=$(date +"%Y%b%d")
mkdir -p "$dirname"
cd "$dirname" || exit 1
#
echo "2. Build archive and place into storage directory."
tar -cjf foo.tar.bz2 /foo
#
echo "3. Completed."

Quote:
Matter of fact, has the script ever worked as intended, or have you been able to run it on a different system?
The script starts, runs, and completes with no errors every night and produces the desired .tar.bz2 file. The only problem is that once I unpack the .tar.bz2 file, it fails halfway through with errors.

Thanks for your help, MS3.
I'm ready to follow up on any suggestions you may have.
 
Old 11-22-2009, 11:44 PM   #4
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Rocky 9.2
Posts: 18,359

Rep: Reputation: 2751
You could try gzip; it has an option to specify how much compression you want, on a scale of 1-9. Obviously deeper compression is slower. bzip2 has the same option.
http://linux.die.net/man/1/gzip
http://linux.die.net/man/1/bzip2
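
For example (a sketch only, reusing the /foo and /archive paths from the earlier posts; both tools accept -1 through -9):

tar -cf - /foo | gzip -6  > /archive/foo.tar.gz   # -1 = fastest, -9 = smallest
tar -cf - /foo | bzip2 -9 > /archive/foo.tar.bz2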
 
Old 01-09-2010, 04:21 AM   #5
Sum1
Member
 
Registered: Jul 2007
Distribution: Fedora, CentOS, and would like to get back to Gentoo
Posts: 332

Original Poster
Rep: Reputation: 30
I need to follow up on this thread and seek help from the group.
I continue to have problems with tar'ed backups.

Since last post, I have stopped using any form of file compression and create .tar files every night.

My bash backup script remains mostly the same, except now I simply do:

tar -c --atime-preserve -p -f foo.tar /foo

The result is a .tar file of about 22 GB.

But when I test extraction of data (tar -xf foo.tar), I receive the following error:

tar: Skipping to next header
tar: Exiting with failure status due to previous errors

and then when I look in the extracted /foo directory I only see between 5 and 10 GB of data.

I don't know where to look to uncover the problem.
Nothing in /var/log indicates a problem with creating the .tar file, as far as I can tell.
 
Old 01-09-2010, 05:49 AM   #6
unSpawn
Moderator
 
Registered: May 2001
Posts: 29,415
Blog Entries: 55

Rep: Reputation: 3600
Quote:
Originally Posted by MS3FGX View Post
you might want to look at a more robust filesystem than EXT3.
If you search LQ you'll find ten times more problems with, for example, Reiser than with Ext. That leaves me wondering what your definition of "robust" would be here?..


Quote:
Originally Posted by Sum1 View Post
About a month ago
What changed in the system at that time, software- and configuration-wise?


Quote:
Originally Posted by Sum1 View Post
I've fixed the root filesystem on /dev/sda and it's running fine.
What exactly happened to require fixing the filesystem?


Quote:
Originally Posted by Sum1 View Post
I have a bash scipt that runs daily as a cron job and it creates a .tar.bz2
Which user does it run as?


Quote:
Originally Posted by Sum1 View Post
But when I test extraction of data (tar -xf foo.tar), I receive the following error
Can you at least list contents with 'tar -vtf foo.tar'? And of a compressed tarball? If it fails to complete listing the contents, at what point (file or dir) does it fail, and could you verbosely list the contents of that directory here? (Please use BB code tags.) Does it happen with small tarballs as well? And does gzip, as chrism01 suggested, work for you? Did you ever find traces of memory corruption in other processes? Does the machine have enough RAM and swap? You list mount flags; what happens if you remount with only "defaults" and test again? Else, how about meanwhile using 'rsync' between the two disks? That's a lot of questions, and you may or may not be able to answer them all, but being as verbose as possible is good: the more information the better.

Last edited by unSpawn; 01-09-2010 at 06:05 AM. Reason: //More *is* more.
 
Old 01-09-2010, 10:58 AM   #7
Sum1
Member
 
Registered: Jul 2007
Distribution: Fedora, CentOS, and would like to get back to Gentoo
Posts: 332

Original Poster
Rep: Reputation: 30
UnSpawn,
First and foremost, thank you so much for your excellent response -- it's like a guide path.

Quote:
Originally Posted by unSpawn View Post
What changed in the system at that time software and configuration-wise?
What exactly happened to require fixing the filesystem?
Exactly this: http://www.linuxquestions.org/questi...gnose.-763829/

Quote:
Which user does it run as?
root

Quote:
Can you at least list contents with 'tar -vtf foo.tar'? And of a compressed tarball?
I'm logged in via ssh to the server as I write.
Creating a new foo.tar, and will try 'tar -vtf'.
Results: Two attempts and two failures.
"tar: Exiting with failure status due to previous errors"
An interesting note here: the failure occurred in the exact same place in the directory tree both times. I can look at that further -- move this sub-directory to a different partition, try the process again, etc.
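
To pin down exactly which entry it dies on, something like this should work (a sketch, assuming GNU tar):

tar -tvf foo.tar > listing.txt 2> errors.txt
tail listing.txt    # the last member read successfully, i.e. just before the bad spot
cat errors.txt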

Quote:
Does it happen with small tarballs as well?
No -- I've recently tried a few .tar files about 1 GB in size (using data from within /foo) and those were successfully tar'ed and extracted without problems.

Quote:
And does, like chrism01 suggested, gzip work for you?
Will test .tar first, and then move on to gzip and bzip.

Quote:
Did you ever find traces of memory corruption in other processes?
I'm honestly not sure how to look for or determine this.

Quote:
Does the machine have enough RAM and swap?
I believe so -- 4 GB of DDR2 800 RAM, though only 3 GB is recognized since this box runs 32-bit Slackware 12.2. I have 2 GB of swap and it seemingly never gets used -- no matter what process is running, 'top' always reports 0k used for swap. The server has no more than 30 users at any given time.

Quote:
You list mount flags. What happens if you remount with only "defaults" and test again?
I will plan an evening to give this a try. The server is relied upon seven days a week, from 7 am to 7 pm.

Quote:
Else how about meanwhile using 'rsync' between the two disks?
I've heard about rsync and maybe it's time to try it. If I can make separate daily "syncs" equivalent to these .tar files, then I'll gladly opt for it. I need to RTFM along these lines.
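
From skimming the man page, something along these lines looks like it would mirror the live data into a date-stamped directory; a sketch I haven't tested yet:

dest=/archive/$(date +%Y%b%d)
mkdir -p "$dest"
rsync -aH /foo/ "$dest/foo/"   # -a preserves permissions and times, -H preserves hard links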
 
Old 01-09-2010, 11:17 AM   #8
GooseYArd
Member
 
Registered: Jul 2009
Location: Reston, VA
Distribution: Slackware, Ubuntu, RHEL
Posts: 183

Rep: Reputation: 46
Hey Sum1,

There's definitely a problem with either the filesystem or the drive. If you're getting that kind of error with bzip2, you'll get it with any other compression program.

Is the kernel logging any filesystem or scsi/ide errors? I would expect an I/O error to derail the tar while it's writing, but it's worth checking.
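
A quick sweep, as a sketch (exact log names vary by distro):

dmesg | grep -i -E 'ata|scsi|i/o error|ext3'
grep -i -E 'error|fail' /var/log/messages | tail -n 50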

Next, I would rule out a bug in ext3. Use ext2 on the drive receiving the tarball and see if the problem persists. If it does, I'd junk the drive.
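
Before junking it, the drive's own diagnostics are worth a look. A sketch, assuming smartmontools is installed and the archive disk is /dev/sdb:

smartctl -a /dev/sdb        # reallocated or pending sectors are the red flags
smartctl -t long /dev/sdb   # start an offline self-test; re-run -a later for results
badblocks -sv /dev/sdb      # read-only surface scan (slow, but non-destructive)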

This is probably a dumb question, but I assume you're running a 2.6 kernel with large file support? If not, or if you have an old glibc that doesn't support LFS, then bzip2 will receive a SIGXFSZ once it writes 2 GB of output, which would cause it to fail similarly to what you described. LFS has been around for a long time, so I doubt that's the problem, but it can happen.
 
Old 01-09-2010, 12:33 PM   #9
Sum1
Member
 
Registered: Jul 2007
Distribution: Fedora, CentOS, and would like to get back to Gentoo
Posts: 332

Original Poster
Rep: Reputation: 30
Quote:
Is the kernel logging any filesystem or scsi/ide errors? I would expect an IO error to derail the tar as it was writing, but it's worth checking.
Mr. Goose :-)
Thanks too for your help.

I'm not sure if I'm looking in the right logs, or whether I have activated logging of the right stuff... but I've checked /var/log/messages, dmesg, and syslog, and I can't find any error messages relating to I/O activity.

Quote:
Next, I would rule out a bug in ext3. Use ext2fs on the drive receiving the tarball and see if the problem persists. If it does, I'd junk the drive.
I like the thinking, and I'm beginning to suspect the drive itself, since I wiped it with zeros and created the partition and ext3 filesystem only a month ago.

I'll blend your suggestion with UnSpawn's:
remount with ext3 defaults and test;
rebuild with ext2 and test;
install a different hard drive altogether and test.

Quote:
I assume you're running a 2.6 kernel with large file support? If not, or if you have an old glibc that doesn't support LFS, then bzip2 will receive a SIGXFSZ once it writes 2 GB of output, which would cause it to fail similarly to what you described.
Currently using kernel 2.6.30.4.
I checked my kernel config and it does show "Support for large block devices and files" built into the kernel.
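
For the record, a sketch of how that can be confirmed on a running system (the config location varies; Slackware typically ships one in /boot):

grep CONFIG_LBD /boot/config                   # "Support for large block devices and files"
zgrep CONFIG_LBD /proc/config.gz 2>/dev/null   # only if the kernel exposes its own config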

Thanks again for your help.
I've got quite a bit of testing to do.
 
Old 01-11-2010, 11:49 AM   #10
unSpawn
Moderator
 
Registered: May 2001
Posts: 29,415
Blog Entries: 55

Rep: Reputation: 3600
Quote:
Originally Posted by Sum1 View Post
Ouch. Unfortunately that thread doesn't show you determining and fixing what was wrong.


Quote:
Originally Posted by Sum1 View Post
"tar: Exiting with failure status due to previous errors"
Sometimes noting the error value ('tar --do-Something; echo $?') might help.


Quote:
Originally Posted by Sum1 View Post
Interesting note here, the failure occurred in the exact same place in the directory system both times. I can look at that further -- move this sub-directory to a different partition and try process again, etc.
Let us know.


Quote:
Originally Posted by Sum1 View Post
No, I've recently tried a few .tar files about 1 Gig. in size (using data from within /foo) and those were successfully tar'ed and extracted without problems.
You could try running tar through 'split' to come up with chunked archives.
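
A sketch of that, writing 1 GB chunks and then verifying by listing from the recombined stream:

tar -cf - /foo | split -b 1024m - /archive/foo.tar.part_
cat /archive/foo.tar.part_* | tar -tf - > /dev/null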


Quote:
Originally Posted by Sum1 View Post
Definitely not sure how to look for, or determine this.
Unexplainable crashes, applications failing?
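
A direct test would be booting into memtest86+; as a rough userspace sketch there's also the 'memtester' utility, if it's installed:

memtester 1024M 3   # lock and test 1 GB of RAM for three passes (needs root)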


Quote:
Originally Posted by Sum1 View Post
I've heard about this and maybe it's time to try it. If I can make separate daily "syncs" equivalent to these .tar files, then I'll gladly opt for it. I need to RTFM along these lines.
...and search LQ. We've definitely got some threads on rsync. It isn't hard to use.


Quote:
Originally Posted by GooseYArd View Post
This is probably a dumb question, but I assume you're running a 2.6 kernel with large file support? If not, or if you have an old glibc that doesn't support LFS, then bzip2 will receive a SIGXFSZ once it writes 2 GB of output, which would cause it to fail similarly to what you described. LFS has been around for a long time, so I doubt that's the problem, but it can happen.
I always thought LFS was a kernel 2.4 thing?.. BTW there is a 16 GB file-size limit if ext3 uses a 1 KB block-size, but the default is 4 KB anyway...
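
For the record, checking that takes one command; a sketch, assuming the archive partition is /dev/sdb1:

tune2fs -l /dev/sdb1 | grep 'Block size'   # 4096 means the 16 GB limit doesn't apply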


Quote:
Originally Posted by Sum1 View Post
I've got quite a bit of testing to do.
Let us know how it's going, OK?
 
Old 01-16-2010, 09:18 AM   #11
Sum1
Member
 
Registered: Jul 2007
Distribution: Fedora, CentOS, and would like to get back to Gentoo
Posts: 332

Original Poster
Rep: Reputation: 30
Quote:
Originally Posted by unSpawn View Post
Ouch. Unfortunately that thread doesn't show you determining and fixing what was wrong.
Quote:
Let us know how it's going, OK?
I believe I can mark this thread "Solved."

1. Test Results

After many repeated tests, using both of the server's hard drives, I can report the following: regardless of tar, tar + gzip compression, or tar + bzip2 compression, there are always 2 or 3 corrupted areas of data that produce fatal errors when trying to recover/unpack the contents of the .tar. Depending on which of the three archiving methods is employed, the errors are reproduced in different places in the data set.

2. Conclusion - (best efforts of deduction)

I must have committed an error while using tune2fs back in July 2009.
In another thread, I reported:
Quote:
I started with this ext3 setup in /etc/fstab:
/dev/sda2 / ext3 defaults 1 1
I changed /etc/fstab to:
/dev/sda2 / ext3 defaults,noatime,data=writeback 1 1
And then executed command on root partition:
tune2fs -o journal_data_writeback /dev/sda2

Works.
No data loss.
No problems.
In doing so, I may have made an error when issuing a tune2fs command. Or possibly I did not properly unmount the partition/filesystem prior to executing the tune2fs commands. I may have remounted the partition in another terminal and forgotten about it while executing commands in a different terminal. I'll never know for sure, but it seems like the only logical answer.
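
For anyone finding this thread later, the safe sequence, as best I understand it, looks like this; the filesystem must not be mounted while tune2fs changes journal options (for the root filesystem that means rescue media):

umount /dev/sda2                              # or boot from rescue media for /
tune2fs -o journal_data_writeback /dev/sda2
e2fsck -f /dev/sda2                           # force a full check before remounting
mount /dev/sda2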

It would seem highly unlikely that both my server hard drives are failing. I have 30 users reading/writing to them no less than 12 hours a day, and I have not received any comments or complaints about lost files, inaccessible files, or corrupted files... nothing at all. Believe me, they are not shy, and would gladly let me know of such occurrences. <grin>

I feel fortunate it's not a whole lot worse: 99.9% of the data is not corrupt. I've been backing up the data nightly by way of 'cp -p -r /foo /archive/date-stamped-directory/foo', and then I run a bash script I made to diff and compare the copied files and directories in multiple ways.
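
The comparison boils down to something like this; a simplified sketch of what my script does:

src=/foo
dst=/archive/$(date +%Y%b%d)/foo
diff -rq "$src" "$dst"   # report any file that differs or is missing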

Eventually, I'll have to delete all partitions and create new ones with cleanly configured ext3 or ext4 file systems.

UnSpawn, I truly appreciate the solid guidance and prompts to help me work through it in a logical way.

Last edited by Sum1; 01-16-2010 at 09:23 AM.
 
  

