LinuxQuestions.org
Old 01-29-2007, 09:35 AM   #1
carlosinfl
Senior Member
 
Registered: May 2004
Location: Orlando, FL
Distribution: Debian
Posts: 2,831

Rep: Reputation: 65
Tar Compression


OK - I have a directory called "pics" which, as you can see, takes up 1.4GB of disk space. I was told I can use tar for compression, so I wanted to see how much space archiving it with tar would save. I know compression is poor on video like .MPG or music like .MP3, since those are already compressed formats, but the "pics" directory is all .JPG files, 1.4GB in total.

Code:
carlos@lptp:~$ du -h pics/
1.4G    pics/
carlos@lptp:~$ du -h pics.tar.gz 
1.4G    pics.tar.gz
Why is the tar.gz file still 1.4GB, the same as before I compressed the directory? Should I try gzip or bzip2? Am I doing something wrong with tar?
 
Old 01-29-2007, 10:01 AM   #2
rufius
Member
 
Registered: Oct 2002
Location: Miami, FL
Distribution: Ubuntu
Posts: 184

Rep: Reputation: 30
Try:
Code:
tar jcvf pics.tar.bz2 pics/
IIRC, bzip2 compresses better than gzip.
 
Old 01-29-2007, 10:34 AM   #3
matthewg42
Senior Member
 
Registered: Oct 2003
Location: UK
Distribution: Kubuntu (x86), Debian (PPC)
Posts: 3,528

Rep: Reputation: 60
The tar format itself doesn't include compression. It's just a way to get a filesystem in a file (historically, a file suitable for efficient access by non-random access devices such as tape drives - "tar" is a contraction of tape archive).

The GNU tar implementation includes an option to compress the archive in the same invocation that creates the .tar file. The result is equivalent to making the tar file without compression and then passing it through gzip, bzip2, or old-school Unix compress.
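
That equivalence can be checked with a rough sketch like the one below. It assumes tar and gzip are installed; the directory and file names are made up for the demo. The two .gz files may differ in a few header bytes (gzip records a name and timestamp when it is given a file rather than a pipe), so the comparison is done on the decompressed archives.

```shell
# Build a small throwaway directory to archive (all names here are arbitrary).
work=$(mktemp -d)
mkdir "$work/demo"
printf 'hello tar\n' > "$work/demo/a.txt"
seq 1 5000 > "$work/demo/b.txt"

# One step: tar creates the archive and gzips it in the same invocation.
tar czf "$work/one-step.tar.gz" -C "$work" demo

# Two steps: plain tar first, then gzip the resulting file.
tar cf "$work/two-step.tar" -C "$work" demo
gzip "$work/two-step.tar"    # replaces two-step.tar with two-step.tar.gz

# Decompress both and compare the tar data byte for byte.
gzip -dc "$work/one-step.tar.gz" > "$work/one.tar"
gzip -dc "$work/two-step.tar.gz" > "$work/two.tar"
if cmp -s "$work/one.tar" "$work/two.tar"; then
    same=yes
else
    same=no
fi
echo "identical tar data: $same"
```
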

The degree to which you can compress a tar file with a lossless algorithm depends on how much redundant data is in the file. For example, if there are lots of repeated characters or patterns in the file, these can be replaced with descriptions of how to recreate those chunks of data, which may be shorter than the data itself. For a given data set, there's a minimum size to which you can compress it.
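
To get a feel for how much redundancy matters, here is a small sketch that gzips two extremes: a megabyte of one repeated byte and a megabyte of random bytes. It assumes /dev/zero and /dev/urandom are available, as on any Linux system; the variable names are just for illustration.

```shell
# 1 MB of a single repeated byte: about as redundant as data gets.
zeros=$(head -c 1000000 /dev/zero | gzip -c | wc -c | tr -d ' ')

# 1 MB of random bytes: no patterns for the compressor to exploit.
random=$(head -c 1000000 /dev/urandom | gzip -c | wc -c | tr -d ' ')

echo "zeros:  $zeros bytes after gzip"
echo "random: $random bytes after gzip"
```

The first number should be tiny compared with the input, while the second should stay at roughly the original size.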

As you can imagine, files which have already been compressed don't have as much redundant data as those which have not. It turns out that there is a trade-off between the amount of processing power and memory you throw at compressing some data and how closely you can approach the theoretical limit of compression for that data set. Thus different compression algorithms compress by different amounts - older ones make the trade-off of using less CPU and memory but accepting worse compression.

Most image files are already compressed. Some, like PNG or GIF, use lossless compression. Some, like JPG, use lossy compression, where some of the data is actually thrown away (bits we hope the human eye doesn't notice so much, like detail in low-contrast areas). In either case, image files are usually compressed already, so compressing them again doesn't yield much benefit. You might get a few percent compression by using a modern algorithm like bzip2 on PNGs or maybe even JPGs, but you're not likely to get more than that.
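
The "compressing twice" effect can be reproduced without any real images. The sketch below is only a stand-in for the JPG case: it generates some compressible text (base64 of random bytes, an arbitrary choice), gzips it once, then gzips the result again. File and variable names are made up for the demo.

```shell
work=$(mktemp -d)

# Compressible sample data: base64 text uses only 64 distinct characters
# plus newlines, so gzip can shrink it noticeably.
head -c 300000 /dev/urandom | base64 > "$work/data.txt"

# First pass: compress the raw data.
gzip -c "$work/data.txt" > "$work/once.gz"

# Second pass: compress the already-compressed file.
gzip -c "$work/once.gz" > "$work/twice.gz"

orig=$(wc -c < "$work/data.txt" | tr -d ' ')
once=$(wc -c < "$work/once.gz" | tr -d ' ')
twice=$(wc -c < "$work/twice.gz" | tr -d ' ')

echo "original:         $orig bytes"
echo "compressed once:  $once bytes"
echo "compressed twice: $twice bytes"
```

The first pass should save a noticeable amount; the second should save essentially nothing, which is what happened with pics.tar.gz.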

Basically, the bottom line is that if your tar file contains only pictures, it's probably not worth compressing it.

It's worth noting that if a tar file contains both compressed files (mp3, ogg, jpg, png, mpg, wmv etc.) AND uncompressed data (.txt and so on), compressing the whole tar file only really benefits the parts that don't contain pre-compressed data. However, you can't choose to compress just parts of the tar file; the only way around that is to compress the individual files before archiving them and leave the resulting tar file uncompressed. For this and other reasons, other archiving formats, like dar, can exclude individual files from compression based on filename extension.
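
One way to approximate the pre-compress approach with plain tar is sketched below. It is an illustration only: the directory, the file names, and the list of extensions to skip are all assumptions, the "photo" is just random bytes standing in for a JPG, and note that gzip replaces each file it compresses with a .gz version.

```shell
work=$(mktemp -d)
mkdir "$work/mixed"
seq 1 50000 > "$work/mixed/notes.txt"                   # compressible text
head -c 200000 /dev/urandom > "$work/mixed/photo.jpg"   # stand-in for a JPG, not a real image

# Compress only the files that are not already in a compressed format.
# The extension list is a guess; extend it to match your own data.
find "$work/mixed" -type f ! -name '*.jpg' ! -name '*.mp3' -exec gzip {} +

# Archive the directory without a second compression pass.
tar cf "$work/mixed.tar" -C "$work" mixed

# List the archive contents to see which files were compressed.
listing=$(tar tf "$work/mixed.tar")
echo "$listing"
```

The listing should show notes.txt.gz next to an untouched photo.jpg.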
 
  

