LinuxQuestions.org

worm5252 03-30-2010 09:56 AM

BASH: limit a .tar.gz to a maximum file size
 
Hey guys,
I am working on a backup script. This is a cloud computing setup: I have several web servers that rsync to a directory located in cloud storage. The cloud storage is mounted at /mnt/cloudstorage on all of my cloud servers. It is a samba share (which I didn't get to choose).

Anyways,
I need to work out a script to back up the directory on cloud storage that all the web servers rsync to. I figure this will be a manual process since changes to this directory will not be scheduled or automated. A few challenges: 1.) I can only access the cloud storage from inside the cloud provider's network, and 2.) I need to be able to download the backups and store them here locally.

So with that being said, I will run the backup script on one of the web servers and then use a tool like WinSCP to download the files. Since we do not have a box here that we can dedicate to storing these backups, I am going to need to put them on some external media. Everyone here uses laptops, so my first thought was either a USB drive or DVDs. So I want to limit the maximum file size to 4400MB to fit on a single-layer DVD. At home I have rar on all of my machines and I just make tar.rar files for my backups. However, rar is not standard on an installation of CentOS 5.3.

So, using only packages like tar, gzip, etc. that were installed with CentOS, how can I limit the size of the backup files to 4400MB?

ilikejam 03-30-2010 10:34 AM

Hi.

I would pipe tar into split, e.g.
tar cvfz - /mnt/cloudstorage | split -b 4400m - backup.tar.gz.

That would give you backup.tar.gz.aa backup.tar.gz.ab backup.tar.gz.ac etc., each at most 4400MB
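
A minimal sketch of the same pipeline, scaled down so it runs in a second or two. The scratch directory, the sample file, and the 64k piece size are assumptions for illustration only; substitute the real path and 4400m.

```shell
#!/bin/sh
# Sketch: pipe a compressed tar stream into split.
# 64k stands in for 4400m so the demo finishes quickly.
set -e
work=$(mktemp -d)                        # hypothetical scratch directory
mkdir "$work/data"
# Random data barely compresses, so the archive will span several pieces.
head -c 200000 /dev/urandom > "$work/data/sample.bin"

# Same shape as the command above; -C keeps the scratch path out of the archive.
tar -czf - -C "$work" data | split -b 64k - "$work/backup.tar.gz."

ls -l "$work"/backup.tar.gz.*

# Count the pieces and find the largest one (should not exceed 65536 bytes).
pieces=0
largest=0
for piece in "$work"/backup.tar.gz.*; do
    size=$(wc -c < "$piece" | tr -d ' ')
    pieces=$((pieces + 1))
    if [ "$size" -gt "$largest" ]; then
        largest=$size
    fi
done
echo "pieces: $pieces, largest: $largest bytes"
```

Note that split only cuts the byte stream, so the pieces are not archives on their own; they have to be joined again before tar can read them.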

Dave

worm5252 03-30-2010 11:51 AM

Yeah, the problem is if someone who doesn't know how to put them back together tries to work with the backup files. Not a big deal I guess.

What about tar with multi-volume support? Something like this:
Code:

tar czf --multi-volume -L 4400m --file=archive.tar.gz /path/to/be/archived
Anyone have any experience using tar this way?

worm5252 03-30-2010 12:03 PM

I just figured out using multi-volume won't work unless I am changing the destination media when it prompts. I guess split is the way to go.

jeffstrunk 04-07-2010 03:59 PM

Quote:

Originally Posted by worm5252 (Post 3918129)
I just figured out using multi-volume won't work unless I am changing the destination media when it prompts. I guess split is the way to go.

That is not true. You can use a script to manage which files tar writes to. I just wrote one to split an archive across 4 disks. The tar manual has a section describing this feature.

Unfortunately, you can't use compression with multi-volume.
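
As a rough illustration of multi-volume mode without prompts, here is a sketch that uses the simpler of the two documented routes: instead of a volume script, GNU tar is given several --file arguments and works through them in order, only prompting once it has used them all. The scratch directory, file names, and the tiny 100 KiB volume size are assumptions for the demo, and the volumes are uncompressed, as noted above.

```shell
#!/bin/sh
# Sketch: GNU tar multi-volume archive written to a fixed list of files.
# Assumes GNU tar; volumes are scaled down to 100 KiB for a quick run.
set -e
work=$(mktemp -d)                        # hypothetical scratch directory
mkdir "$work/data" "$work/restore"
head -c 250000 /dev/urandom > "$work/data/sample.bin"

# -M turns on multi-volume mode; -L is the volume length in units of 1024 bytes.
# List at least as many --file arguments as the data will need.
cd "$work"
tar -c -M -L 100 \
    --file="$work/vol1.tar" --file="$work/vol2.tar" \
    --file="$work/vol3.tar" --file="$work/vol4.tar" \
    data < /dev/null
ls -l "$work"/vol*.tar

# Extraction walks the same list of volumes in the same order.
cd "$work/restore"
tar -x -M \
    --file="$work/vol1.tar" --file="$work/vol2.tar" \
    --file="$work/vol3.tar" --file="$work/vol4.tar" \
    < /dev/null
cmp "$work/data/sample.bin" "$work/restore/data/sample.bin" && echo "restored file matches"
```

The catch with this route is that the number of volumes has to be guessed up front; a volume script avoids that by naming each new volume on demand.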

I would be wary of using split with compression. It is very difficult to recover files from a later volume if any of the previous volumes are missing or corrupted. http://www.gzip.org/recover.txt

dar may be a reasonable alternative if you require compression, multiple volumes, and the ability to restore files from one volume without the previous ones.

Star_Gazer 04-08-2010 07:36 PM

Quote:

Originally Posted by ilikejam (Post 3918058)
Hi.

I would pipe tar into split, e.g.
tar cvfz - /mnt/cloudstorage | split -b 4400m - backup.tar.gz.

That would give you backup.tar.gz.aa backup.tar.gz.ab backup.tar.gz.ac etc, all maximum 4400MB

Dave

I'll remember that command line for myself as well. :cool:

How would one unsplit those files, or do they become separate readable tar files, e.g.

tar --list backup.tar.gz.aa ?

BTW, I believe the 'f' option is supposed to be last, or followed by the filename (or "-"), and the option set should be 'cvzf':

tar -cvzf - /mnt/cloudstorage | split -b 4400m - backup.tar.gz.

;)

Clifton

Star_Gazer 04-08-2010 07:44 PM

Quote:

Originally Posted by worm5252 (Post 3918129)
I just figured out using multi-volume won't work unless I am changing the destination media when it prompts. I guess split is the way to go.

Actually, there is:
Code:

-L, --tape-length=NUMBER  change tape after writing NUMBER x 1024 bytes
When you use this option, you do not have to specify the multi-volume option because it is implied automatically. I have used it on my hard drive with no issues. However, I should point out that when I used it, the tar file never reached the limit specified by -L (or --tape-length=NUMBER), so I never saw it switch volumes, but it might be worth a try for some users. ;-)
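
Since -L counts in units of 1024 bytes, the 4400MB target has to be converted first. A small sketch of the arithmetic, assuming "4400MB" is meant the way split's 4400m reads it (4400 x 1024 x 1024 bytes); the variable names are made up for the example:

```shell
#!/bin/sh
# Convert the 4400 MB target into the 1024-byte units that -L expects.
limit_mib=4400
tape_length=$((limit_mib * 1024))         # value to pass to -L
limit_bytes=$((tape_length * 1024))       # the same limit expressed in bytes
echo "-L $tape_length  ($limit_bytes bytes per volume)"
```

So for this thread the option would be -L 4505600.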

Also, not sure, but I do not think you can use compression employing this method, so usage of the 'split' command, as suggested by another user in this thread, is a far-out and groovy thing as well, since you can use compression with that method, can you dig it? :)

Clifton

ilikejam 04-09-2010 05:14 AM

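The flag order doesn't matter in the old-style form with no leading dash (not for GNU tar, anyway), so there you can stick yer 'f' wherever you like. Once you add the dash, as in '-cvzf -', the 'f' does have to come last in the group, since whatever follows it is taken as the archive name.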

To restore, just do '# cat backup.tar.gz.* | tar xzvf -' in the appropriate directory. If you just want to un-split the archive, but not restore, just do '# cat backup.tar.gz.* > backup.tar.gz'
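
Both variants can be tried end to end on throwaway data with a sketch like the following. The scratch directory, sample file, and 64k piece size are assumptions for the demo, not part of the original setup.

```shell
#!/bin/sh
# Sketch: split a compressed archive, then restore it and un-split it.
set -e
work=$(mktemp -d)                        # hypothetical scratch directory
mkdir "$work/data" "$work/restore"
head -c 150000 /dev/urandom > "$work/data/sample.bin"
tar -czf - -C "$work" data | split -b 64k - "$work/backup.tar.gz."

# Restore: the shell expands the glob in sorted order (.aa, .ab, .ac, ...),
# which is the order split wrote the pieces in.
cat "$work"/backup.tar.gz.* | tar -xzf - -C "$work/restore"
cmp "$work/data/sample.bin" "$work/restore/data/sample.bin" && echo "restored file matches"

# Un-split only: rebuild the single archive, then list its contents.
cat "$work"/backup.tar.gz.* > "$work/backup.tar.gz"
tar -tzf "$work/backup.tar.gz"
```

Every piece has to be present for either command to succeed, which is the weakness of split with compression mentioned earlier in the thread.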

Dave


All times are GMT -5.