LinuxQuestions.org: Linux - General
Which rsync parameters to backup huge .tar.gz file? (https://www.linuxquestions.org/questions/linux-general-1/which-rsync-parameters-to-backup-huge-tar-gz-file-4175601951/)

postcd 03-16-2017 05:08 PM

Which rsync parameters to backup huge .tar.gz file?
 
Hello,

My virtualization host server produces a monthly "snapshot" of the OpenVZ VPS. The file is named something like vzdump1234.tar.gz and is around 250GB in size; the archive's contents change between snapshots, and I guess roughly 10% of the VPS files are new/modified.

I want to transfer this archive to a remote server every month. Which rsync parameters are sensible to use, given that:
a) the least possible data transfer is used
b) local and remote server space is very limited, and storing double the size of the archive might be an issue
c) the remote server is a low-CPU VPS
d) both servers' HDDs are quite a bottleneck for overall performance

Thank You

TB0ne 03-16-2017 05:31 PM

Quote:

Originally Posted by postcd (Post 5684427)
Hello,
My virtualization host server produces a monthly "snapshot" of the OpenVZ VPS. The file is named something like vzdump1234.tar.gz and is around 250GB in size; the archive's contents change between snapshots, and I guess roughly 10% of the VPS files are new/modified.

I want to transfer this archive to a remote server every month. Which rsync parameters are sensible to use, given that:
a) the least possible data transfer is used
b) local and remote server space is very limited, and storing double the size of the archive might be an issue
c) the remote server is a low-CPU VPS
d) both servers' HDDs are quite a bottleneck for overall performance

Again, as with many of your previous threads, you provide next to no useful details that would let anyone help you. You omit the bandwidth between the machines, why you want to do this rather than backing the file up locally on the machine where the snapshot was taken, etc. To go through your post:
  • a) the least possible data transfer is used - You say you have to move a 250GB file...that IS the 'least data transfer' you can use to move that amount of data. rsync can't make the file smaller.
  • b) local and remote server space is very limited, and storing double the size of the archive might be an issue - So if you can't store it, it's pointless to transfer it.
  • c) the remote server is a low-CPU VPS - Meaningless in this context. Either you have the system resources to do the job, or you live with it being slow. If you want to do it, you have those two choices.
  • d) both servers' HDDs are quite a bottleneck for overall performance - Again, meaningless in this context. How, exactly, do you think any piece of software can magically make your HDDs faster?

It's also odd your signature says "Linux Newbie"...when you've been registered and posting here for FOUR YEARS, and have asked about rsync many times:
http://www.linuxquestions.org/questi...er-4175525376/
http://www.linuxquestions.org/questi...ge-4175503702/
http://www.linuxquestions.org/questi...ng-4175528225/
http://www.linuxquestions.org/questi...ze-4175538411/
http://www.linuxquestions.org/questi...nc-4175585087/

Read the "Question Guidelines", which you have been pointed to many, MANY times now. Thread reported.

rknichols 03-16-2017 06:50 PM

Transferring a compressed file will eliminate the benefit of rsync's delta transfer. Upon reaching the first changed byte, the entire remainder of the file will be different due to the compression, and thus will be transmitted in its entirety. At least rsync is intelligent enough to see the ".gz" extension and not attempt re-compression of the already compressed file.
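
For illustration, a minimal sketch of the whole-file transfer under discussion; the paths and remote host are placeholders. Note that rsync's -z option buys nothing here, since .gz is on rsync's default skip-compress list, and (as described above) the delta algorithm gains little on a re-compressed archive.

Code:
# Whole-file transfer; --partial keeps a partially transferred file so a
# rerun does not have to start from zero.
rsync -a --partial --progress /vz/dump/vzdump1234.tar.gz backup@remote:/backups/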

jefro 03-16-2017 09:33 PM

Some thoughts only.

Maybe mount the final file, the way you would mount an ISO image, and then, using the directory structure, let some program move only the differences. Unison or rsync.
I think you could then use compression to transmit less after the first sync.

Jigdo is kind of what I'm thinking.
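
For illustration, one way to realize this idea is archivemount, which exposes a tar.gz as a read-only FUSE filesystem so rsync can see individual files; whether it performs acceptably on a 250GB archive is another question. Paths and the remote host are placeholders.

Code:
# Mount the archive as a directory tree via FUSE.
archivemount -o readonly /vz/dump/vzdump1234.tar.gz /mnt/vzdump
# Per-file delta sync against last month's copy on the remote side.
rsync -a --delete /mnt/vzdump/ backup@remote:/backups/ct1234/
fusermount -u /mnt/vzdump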

pan64 03-17-2017 03:24 AM

gzip has a --rsyncable option nowadays; it is recommended if you want to rsync compressed files. Otherwise, see post #3.
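
For illustration, assuming you can control the compression step yourself rather than letting vzdump produce the .gz (--rsyncable is a distribution patch carried by Debian/Ubuntu gzip and is also supported by pigz); filenames and host are placeholders.

Code:
# Re-compress with rsync-friendly block boundaries, so a small change in the
# tar stream does not shift the rest of the compressed output.
gzip --rsyncable vzdump1234.tar
rsync -a --partial vzdump1234.tar.gz backup@remote:/backups/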

TB0ne 03-17-2017 08:18 AM

Quote:

Originally Posted by jefro (Post 5684550)
Some thoughts only.
Maybe mount the final file, the way you would mount an ISO image, and then, using the directory structure, let some program move only the differences. Unison or rsync.
I think you could then use compression to transmit less after the first sync. Jigdo is kind of what I'm thinking.

Yeah, but the OP hasn't provided any details at all, aside from sending me a nasty PM. We don't know how that file is getting generated, or if there's even an option to get it in ISO format. For all we know, this is generated from the VPS system, and is ONLY available as a .gz file. If they're using vzdump, it creates a .pm file, which isn't mountable. So even if you use the rsync compression options, you still have to shovel over 250GB, and can't do a delta of it.

Not to mention the fact they imply they don't have the disk space to store it. Backing up an entire VM container file with vzdump creates an image, but having multiple images is pointless. Take ONE image, then a delta rsync each day. Container fail? Restore the image via the bare metal restore utilities or vzrestore, and restore the rsync delta after you've got a working system.

OP, you STILL need to read the "Question Guidelines" link, and start showing some effort of your own. Sorry, but after 4 years you're not a 'newbie' anymore, and having worked with rsync for at least 3, you should be able to figure out some options on your own.
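
For illustration, a minimal sketch of the "one full image, then deltas" workflow described above, assuming the default /vz layout; the container ID, paths and remote host are placeholders.

Code:
# One-time: ship the full vzdump image to the remote host.
rsync -a --partial /vz/dump/vzdump1234.tar.gz backup@remote:/backups/base/
# Daily: delta-sync the live container tree; only changed files move.
rsync -a --delete /vz/private/1234/ backup@remote:/backups/ct1234-delta/
# Recovery: vzrestore the base image, then copy the newest delta back over it.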

TB0ne 03-17-2017 08:20 AM

Quote:

Originally Posted by pan64 (Post 5684635)
gzip has a --rsyncable option nowadays; it is recommended if you want to rsync compressed files. Otherwise, see post #3.

Yeah, but in post #1, the OP said they don't have space to store the file, so even if you optimize bandwidth with rsync, the transfer will fail.

rknichols 03-17-2017 08:34 AM

Quote:

Originally Posted by TB0ne (Post 5684711)
Yeah, but in post #1, the OP said they don't have space to store the file, so even if you optimize bandwidth with rsync, the transfer will fail.

Using the "--inplace" option of rsync could overcome that, but at the expense of possibly losing everything if there is a glitch in the transfer.
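
For illustration, what the --inplace variant might look like; paths and the remote host are placeholders. As noted, an interrupted run leaves the remote copy corrupted, so this trades safety for disk space.

Code:
# Update the existing remote file in place, so the target never needs room
# for two full copies of the archive at once.
rsync -a --inplace --progress /vz/dump/vzdump1234.tar.gz backup@remote:/backups/vzdump1234.tar.gz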

TB0ne 03-17-2017 09:28 AM

Quote:

Originally Posted by rknichols (Post 5684718)
Using the "--inplace" option of rsync could overcome that, but at the expense of possibly losing everything if there is a glitch in the transfer.

Yes, but if the disk is totally full, that could cause other entertaining things to happen. :)

And apparently, this has been an issue for three years now, going back to 2014:
http://www.linuxquestions.org/questi...5/#post5284404

...where the OP is (essentially) asking this same question. They didn't come back there to answer the follow-ups asked by wpeckham, either, and apparently the OP cannot post to the OpenVZ forums, since they were banned from that forum. The openvz-diff-backups tool is in beta right now, written specifically to take deltas of existing PMs and shovel them over SSH tunnels to different locations. The OP seems not to have looked.

jefro 03-17-2017 03:26 PM

If the OP is out of space on the source, then I'll agree that compressing data to a file is useless. The only two ways to stop it in time are a snapshot by some means or live state. (I wish I could remember the name of that open-source Linux live-state program.)

Using the highest compression level of the selected compression program may help. Knowing what kind of data it is may lead to a better choice of compression program.
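
For illustration, trading CPU time for a smaller archive when you control the compression step yourself; the filename is a placeholder, and on a low-CPU VPS the cost of (de)compression matters too.

Code:
# Maximum gzip level.
gzip -9 vzdump1234.tar
# Or a stronger compressor, using all cores.
xz -T0 vzdump1234.tar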

TB0ne 03-17-2017 04:50 PM

Quote:

Originally Posted by jefro (Post 5684871)
If the OP is out of space on the source, then I'll agree that compressing data to a file is useless. The only two ways to stop it in time are a snapshot by some means or live state. (I wish I could remember the name of that open-source Linux live-state program.)

Using the highest compression level of the selected compression program may help. Knowing what kind of data it is may lead to a better choice of compression program.

True. But based on the sparse information given here, we can probably assume it's a vzdump file, containing a .pm snapshot, created with the --compress option, which uses gzip. https://openvz.org/Backup_of_a_runni...er_with_vzdump

If the file is already 250GB compressed/gzipped, it probably can't get much tighter...which takes us back to the OP saying there's not enough space on the target. The OP hasn't given any information, despite being asked, which seems to be a theme with the OP.

pan64 03-18-2017 03:05 AM

Quote:

Originally Posted by TB0ne (Post 5684711)
Yeah, but in post #1, the OP said they don't have space to store the file, so even if you optimize bandwidth with rsync, the transfer will fail.

If he was able to create a compressed file (the file is like vzdump1234.tar.gz), he could try --rsyncable...

