LinuxQuestions.org
01-18-2011, 07:11 AM   #1
Babelduo
LQ Newbie
 
Registered: Mar 2009
Posts: 12

Rep: Reputation: 0
tar with parameter -T (or --files-from) does not compress?


Hi,

I have a root server (Debian Lenny, 64-bit) with a few domains running on it. Every Sunday I run a full backup of all domain files on the hard disk.

But why is the "compressed" tar.bz2 file bigger than its contents?

The compressed tar.bz2 file from my backup script:
Code:
3.8G Jan 16 04:12 backup-20110116030001-full.tar.bz2
The extracted directory:
Code:
# du -sh backup-20110116030001-full
2.8G    backup-20110116030001-full
Now I archive the extracted folder again into a new tar.bz2. It is the same folder that was extracted from the 3.8G archive, but now it's only 1.9G!
Code:
# tar -cjpf backup-20110116030001-full_NEW.tar.bz2 backup-20110116030001-full
1.9G Jan 18 10:30 backup-20110116030001-full_NEW.tar.bz2
The only difference is that my backup script uses the -T (or --files-from) parameter to read all files from a list. So first I collect all the files with "find -mtime", and then I create the archive.

Does the -T parameter not compress? Is there an alternative solution?

thx!
kim
 
01-18-2011, 07:26 AM   #2
jschiwal
LQ Guru
 
Registered: Aug 2001
Location: Fargo, ND
Distribution: SuSE AMD64
Posts: 15,733

Rep: Reputation: 678
The -T option won't affect compression. Are you certain that you don't have duplicates listed in the file list? If you do, they will be backed up twice.

Another possibility is that you are backing up the targets of links, instead of the links themselves.
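
A quick way to test the first possibility is to compare the number of lines in the list with the number of distinct lines. The following is only a minimal sketch: the file name filelist.txt and the sample paths are made up for illustration, and in practice the list would be whatever file the backup script passes to -T.

```shell
#!/bin/sh
# Hypothetical sample list; replace with the real list given to tar -T.
workdir=$(mktemp -d)
list="$workdir/filelist.txt"
printf '%s\n' /srv/a/index.html /srv/a/style.css /srv/a/index.html > "$list"

# Total entries versus distinct entries in the list.
total=$(( $(wc -l < "$list") ))
distinct=$(( $(sort -u "$list" | wc -l) ))
echo "total=$total distinct=$distinct"

# Print every path that occurs more than once.
dupes=$(sort "$list" | uniq -d)
echo "$dupes"
```

If the two counts differ, the list contains repeated paths and tar will store each repeat as a separate archive member. For the second possibility, GNU tar stores symbolic links as links unless -h (--dereference) is given, so it is worth checking whether the backup script passes that option or runs find with -L.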

Last edited by jschiwal; 01-18-2011 at 07:28 AM.
 
01-18-2011, 07:38 AM   #3
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1981
It looks to me like there are duplicates in the file list. The find command lists the directory names and the files inside them, one per line, e.g.
Code:
/path/to/dir
/path/to/dir/file1
/path/to/dir/file2
...
The tar command will archive the directory recursively, and then the files inside it a second time. Think of the tar archive as a stream of files: tar archives everything from the list without checking what has been archived previously (indeed, the tar command was developed for tape archives, where files were written sequentially and the tape was never rewound in the process). When an archive with duplicates is extracted, the later copies simply overwrite the earlier ones, so the extracted directory tree keeps its original size.

To avoid this behaviour, use the --no-recursion option of tar. That way, directories are simply created (referenced) in the archive and not descended into recursively.
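
The effect described above can be reproduced on a throwaway directory. This is a sketch only, assuming GNU tar; the directory and file names (site, file1, file2, list.txt) are invented for the demonstration.

```shell
#!/bin/sh
# Build a small temporary tree: one directory with two files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir -p site
echo one > site/file1
echo two > site/file2

# find without -type f lists the directory as well as the files inside it.
find site > list.txt

# Default: tar descends into "site", then adds file1 and file2 again.
tar -cf recursive.tar -T list.txt
recursive_count=$(( $(tar -tf recursive.tar | wc -l) ))

# --no-recursion is placed before -T because recent GNU tar versions
# apply it only to the names that follow it on the command line.
tar -cf flat.tar --no-recursion -T list.txt
flat_count=$(( $(tar -tf flat.tar | wc -l) ))

echo "entries with recursion: $recursive_count"
echo "entries with --no-recursion: $flat_count"
```

With recursion the archive should hold the directory, both files, and then both files again; with --no-recursion each line of the list is stored once.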
 
01-18-2011, 08:23 AM   #4
Babelduo
LQ Newbie
 
Registered: Mar 2009
Posts: 12

Original Poster
Rep: Reputation: 0
Thanks for your responses! I think this could be the problem. I now use the following command after creating my whitelist/blacklist:

Code:
sort -u whitelist.lst > whitelist.lst_uniq
The new "uniq" whitelist is about half the size of the original file (which matches exactly the sizes I wrote in my first post: 1.9GB and 3.8GB).

I will try it out; I think it should work. But I'm wondering a little, because I use "find -type f" to find only files and no directories. Maybe I should look deeper into my backup script to find the source of the redundancy!
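
One way to narrow down where the redundancy comes from is to check whether the list contains any directories at all, and whether it was assembled from overlapping find runs. The sketch below is an assumption about how such a list might end up with repeats, not a reconstruction of the actual backup script; the tree (www, www/img) is invented, and compression is left out to keep the example small (add -j for bzip2 as in the original command).

```shell
#!/bin/sh
# Throwaway tree standing in for the real domain directories.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir -p www/img
echo a > www/index.html
echo b > www/img/logo.txt

# Simulate a list appended from two overlapping find runs.
find www -type f > whitelist.lst
find www/img -type f >> whitelist.lst

# With -type f there should be no directories in the list.
dirs=$(while IFS= read -r p; do [ -d "$p" ] && printf '%s\n' "$p"; done < whitelist.lst)
echo "directories in list: ${dirs:-none}"

# Compare raw and deduplicated line counts.
sort -u whitelist.lst > whitelist.lst_uniq
raw=$(( $(wc -l < whitelist.lst) ))
uniq_count=$(( $(wc -l < whitelist.lst_uniq) ))
echo "raw=$raw unique=$uniq_count"

# Archive the deduplicated list; --no-recursion guards against stray directories.
tar -cf backup.tar --no-recursion -T whitelist.lst_uniq
entries=$(( $(tar -tf backup.tar | wc -l) ))
echo "archive entries: $entries"
```

If no directories show up but the raw count is still larger than the unique count, the repeats come from how the list is built (for example overlapping search roots or appending to an old list) rather than from tar's recursion.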

thx
babel
 
  

