LinuxQuestions.org (/questions/)
-   Linux - Software (https://www.linuxquestions.org/questions/linux-software-2/)
-   -   listing the contents of large tar files (https://www.linuxquestions.org/questions/linux-software-2/listing-the-contents-of-large-tar-files-543368/)

szim90 04-04-2007 03:28 PM

listing the contents of large tar files
 
Hello.

I used tar to create a multi-part archive containing a group of large files (mostly uncompressed image files). Many of the files are larger than 100 MB, and the entire archive was over 6 GB (although I split it using tar -M so it would fit on two DVDs). Although I can create and restore from the archive, I noticed that simply listing the archive's contents with tar -t took a very long time. Is there some way to make tar faster when it is only listing the archive's contents?

Thank you for any help,
szim90

ramram29 04-05-2007 08:35 AM

What I like to do is run find on the files I want to back up and save the output to a text file; then I use tar -T file.txt to archive the files listed in that file. This makes it very easy to search for a file later with grep.
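
For context, the tar format has no central index; the member headers are interleaved with the data, so tar -t has to read the whole archive (all 6 GB of it, across both volumes) just to print the names. Keeping the listing in a text file sidesteps that. A minimal sketch of the workflow, with illustrative paths:

find /home/user/images -type f > file.txt
tar -cv -T file.txt -f backup.tar
grep -i vacation file.txt

The grep runs against the small text file instead of re-reading the multi-gigabyte archive.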

szim90 04-05-2007 08:58 AM

Thank you. Okay, so just to make sure: if I wanted to back up directory A and all its subdirectories, I would do `find -d A -print > file.txt` and then `tar -cv -T file.txt -f backup.tar`? Also, will this work with the -M option? I apologize if these are basic questions, but I am new to this and want to make sure I create these backups correctly.

Also, when extracting files, if I specify a specific file to extract, will tar read through everything before that entry in the archive, or will it jump straight to the file it needs?

ramram29 04-05-2007 02:41 PM

This is how I would do it:

find /dir ! -type d > /tmp/backup-$(date +%F).txt

less /tmp/backup-$(date +%F).txt

tar -zcvf /tmp/backup-$(date +%F).tgz -T /tmp/backup-$(date +%F).txt

This finds all the files that need to be backed up. Find only files, not directories, or tar will recurse into each listed directory and then archive its files a second time. At the end you'll have two files in /tmp: one is a compressed tarball and the other is a list of the files that were backed up. Both names include the date, so you can back up daily and still know when each backup was made. If you need to view the files in a backup, run:

less /tmp/backup*.txt
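
To pull a single file back out, take its path from that list. Note that GNU tar strips the leading / from member names when it creates the archive, so drop it when naming the member to extract (the image path here is hypothetical):

grep img_0042 /tmp/backup*.txt
tar -zxvf /tmp/backup-2007-04-05.tgz dir/img_0042.jpg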

To split the archive, substitute the last command with one that writes the tarball to stdout and pipes it into split:

tar -zcv -T /tmp/backup-$(date +%F).txt -f - | split -b 700M - /tmp/backup-$(date +%F).tgz

Then the archive will be split across many files of 700 MB each, for easy backup to CD.

The files will be called:

backup-2007-04-05.tgzaa
backup-2007-04-05.tgzab
backup-2007-04-05.tgzac

You will have to concatenate them before extracting:

cat backup-2007*.tgz* > backup-restore.tgz

Make sure you have enough hard disk space.
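
If disk space is tight, you can also skip the intermediate file and stream the pieces straight into tar:

cat backup-2007*.tgz* | tar -zxvf -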

szim90 04-05-2007 10:18 PM

Okay. Thank you for all the help. I will definitely try this next time I need to create backups.

I wanted to ask one more question. For the archives I have already created that list slowly, or for new archives with large files (including archives created using the method you described above), would increasing the block size (-b) make tar run faster?

Thank you for everything,
szim90

ramram29 04-06-2007 08:55 AM

I wouldn't mess with it. If you tar to the same hard disk you are reading from, it may take longer than usual; writing over the network or to an external USB drive is faster. You have a pretty big archive, so it will take a while. A faster processor and faster disks also help a lot.
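
If you do want to experiment anyway: -b is the blocking factor in 512-byte units, and it mainly matters when writing to tape; on regular files it rarely makes a big difference. For example (the paths are illustrative):

tar -b 2048 -cvf /mnt/usb/backup.tar /home/user/images

That writes the archive in 1 MB records (2048 x 512 bytes).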

szim90 04-06-2007 09:23 AM

I tried backing up to my external hard drive and I get about 10 MB/s when creating a tar archive, which seems better than what I was getting earlier when writing to the same disk I was reading from. Thank you for all of your help. I am backing up mostly uncompressed images, as well as some iMovie files, so these archives grow to 6-7 GB.

Thanks again,
szim90
