LinuxQuestions.org - Linux - Software
Help creating multivolume tar archive (https://www.linuxquestions.org/questions/linux-software-2/help-creating-multivolume-tar-archive-757466/)

Thymox 09-24-2009 07:26 AM

Help creating multivolume tar archive
 
OK, so I'm still struggling to create a nice, clean backup script. Previous LQ's here and here.

So far, I have this:
Code:

  tar \
    --files-from=$strINCLUDE \
    --exclude-from=$strEXCLUDE \
    --recursion \
    --create \
    --file $strBACKUPLOCATION/backup \
    --multi-volume \
    --tape-length $strSIZE \
    --format=gnu \
    --new-volume-script=$strBACKUPLOCATION/post_tar_script.sh

In one of the other threads it was discovered that with the GNU format, the multi-volume header (still) cannot store filenames longer than 100 characters - which is causing big problems - but the POSIX format (which apparently does support very long filenames nicely) seems to break everything, complaining about incompatible features.

So, what features from the above tar command are incompatible with the POSIX tar spec? I can't really afford to remove any of them unless there is a suitable replacement - I need to be able to specify which files to include and exclude, I need it to recurse, I need to be able to split the tar file, and I need to run some action on each split tar file (in this case, FTP them somewhere else).

Incidentally, creating a single tar file and subsequently splitting it is not an option. Splitting the tar file in-situ is not an option either, unless I can make it FTP those split files to another location whilst the backup is still occurring - we have ~60GB to back up and only ~40GB of space to back up into... hence the current script running "post_tar_script.sh" (which FTPs the just-completed part of the tar archive to our backup server) every $strSIZE bytes.
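For reference, GNU tar runs the --new-volume-script between volumes with TAR_ARCHIVE, TAR_VOLUME and TAR_FD in its environment, so the per-volume compress-and-FTP hangs off that hook. A rough sketch of the sort of thing post_tar_script.sh does (the host, credentials and curl-for-FTP upload below are placeholders, not the real script):
Code:

  #!/bin/bash
  # Sketch of a between-volumes script. GNU tar exports:
  #   TAR_ARCHIVE - the archive file name
  #   TAR_VOLUME  - number of the volume tar is about to start
  #   TAR_FD      - descriptor tar reads a replacement archive name from

  prev=$((TAR_VOLUME - 1))            # the volume that has just been completed
  vol="$TAR_ARCHIVE.$prev"

  mv "$TAR_ARCHIVE" "$vol"            # free the filename for the next volume
  bzip2 "$vol"                        # compress the finished volume

  # Ship it off and reclaim the local space (curl FTP upload is just one option).
  curl -T "$vol.bz2" "ftp://backup.example.com/backups/" \
       --user "ftpuser:ftppass" && rm -f "$vol.bz2"

  # Carry on writing the next volume to the original name.
  echo "$TAR_ARCHIVE" >&$TAR_FD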

chrism01 09-24-2009 07:33 PM

I wouldn't create a tar file that big in the first place. Anything goes wrong with the file and you lose the entire backup.
Can't you split it (ie multiple tar cmds/files) logically eg by dir?
Also, does the -z (gzip option) help?

Thymox 09-25-2009 04:10 AM

With the "multi-volume" thing, each individual file is a complete tar file in its own right - it's not necessary to concatenate all the files together to do an extract. Any files that are bigger than the span-size ($strSIZE) get put across multiple files, but they're quite rare in this case.

Unfortunately, it would seem that g/bzipping the archive as part of tar itself (ie -z or -j) doesn't work with multi-volume. As it happens, part of post_tar_script.sh bzips each individual file prior to FTPing it off anyway.

The directories being included are (complete with current sizes):
/home - 3.3GB
/var - 84 GB
/root - 129MB
/usr/local - 794MB
I don't want to have to split the backups into this-part-of-var and that-part-of-var, etc, etc. It will only make it more complex to retrieve data from should we need to.

chrism01 09-25-2009 07:52 AM

Given you have more data than space to create a local backup in (as I understand you), why not think laterally and just rsync it to the backup server instead?
Unless you are willing to break up the /var backup, you are going to run out of disk space according to
Quote:

and only ~40GB of space to backup into..
anyway.
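Roughly what I have in mind (host and paths are placeholders):
Code:

  # Push the directories straight to the backup server; rsync only sends
  # what has changed, so nothing large has to sit on the local disk first.
  rsync -az --delete /home /var /root /usr/local backup@backupserver:/backups/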

Thymox 09-25-2009 09:44 AM

Been there, done that. The destination computer in question is only accessible to us via FTP. It's a "1&1 Hosting thing" - and no, there's no option of changing provider! :D

zhjim 09-28-2009 03:25 AM

Also, this does not solve the tar issue, but it eases the space problem: mount the FTP directory and get things going there.
A nice tool for this is http://ftpfs.sourceforge.net/

I also found some howtos in German on setting this up. Do a quick search for "mount ftp" on Google.
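For example with curlftpfs (one of the FUSE FTP filesystems; host, credentials and mount point are placeholders):
Code:

  # Mount the FTP space as if it were a local directory, then point the
  # backup script's $strBACKUPLOCATION at the mount.
  mkdir -p /mnt/ftpbackup
  curlftpfs ftp://user:password@ftp.example.com /mnt/ftpbackup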

chrism01 09-29-2009 12:53 AM

Actually, I think dar with the -c, -s and -p options should do it.
You can create slices and tell dar to pause in between slices. It's primarily designed for spreading large backups over, eg, multiple CDs, but there's no reason in principle you can't FTP each slice while it pauses.
It can do compression as well.
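Something along these lines (paths, slice size and options are placeholders - check the dar man page):
Code:

  # -c: archive basename, -R: root of the tree to back up, -g: include these
  # subdirectories (relative to -R), -s: slice size, -z: compress,
  # -p: pause after each slice so it can be FTPed away and deleted.
  dar -c /backup/mybackup -R / -g home -g var -g root -g usr/local \
      -s 2G -z -p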

jschiwal 09-29-2009 02:07 AM

Look in Section 4.6 of the tar info manual.

Quote:

tar -C sourcedir -cf - . | tar -C targetdir -xf -
Code:

  tar \
    --files-from=$strINCLUDE \
    --exclude-from=$strEXCLUDE \
    --recursion \
    --create \
    --file - \
    --format=gnu | ssh user@host "cat > $str"

would stream the entire archive to the storage server instead of breaking it up to a multi-volume set. This won't create a temporary tar file locally to be moved to the remote server.

AFAIK, this won't make it easier to back up, but it might make it easier to restore. So maybe using the same backup script but using ssh to restore through a pipe could be what you are looking for. I have restored from a single backup this way, but not from a multi-volume backup. I don't know if you can cat the volumes of a multi-volume tar backup (at the storage server) and use tar to restore files on the local side, but I think it will work.
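For the single streamed archive, the restore direction would be something like (host and remote path are placeholders):
Code:

  # cat the remote archive over ssh and let the local tar extract it.
  ssh user@host "cat /backups/backup.tar" | tar -C / -xvf -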

Using pipes this way may be more flexible if you use cpio instead of tar. I haven't tried it using dar.

You would need to use pubkey authentication. Locally, the script needs to be run as a backup user or root, but on the storage server, a normal user account would do.

I would also recommend using the -g option with tar for incremental backups. This will reduce the size of backups in between weekly/bi-weekly/monthly backups.
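I.e. something like (snapshot file and archive names are placeholders):
Code:

  # Listed-incremental sketch: the first run with a fresh .snar file is
  # effectively a level-0 (full) backup; later runs with the same .snar
  # only archive what has changed since then.
  tar -g /var/backups/backup.snar -cf - /home /var /root /usr/local \
      | ssh user@host "cat > backup-$(date +%F).tar"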

My favorite that I have played with is something like:
tar -C / -g podcasts/.snar -cf - <directory list> | tee /mnt/ndas/backupfile.tar | ssh user@host "tar -C / -xvf -" >backuplog
I used this one-liner to replicate new podcasts from one computer to another, while simultaneously creating an incremental backup on a mounted NAS share. This is from memory; IIRC I also used the -v option, which is what the backuplog redirection captures.

