LinuxQuestions.org
12-02-2010, 04:27 PM   #1
Skaperen
rsync incremental backups to be restored


With the --backup and --backup-dir= options, I can tell rsync to put files that are deleted or replaced into another tree. I'm hoping it fills out that tree with a replica of the original directory paths (at least for the files put there); otherwise it's a show stopper. What I want to find out concerns restoring files.
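
As a rough illustration of the kind of invocation being described, here is a small Python sketch that only assembles the rsync argument list; it does not run anything. The flags `-a`, `--delete`, `--backup` and `--backup-dir=` are real rsync options, but the function name, the paths and the one-directory-per-date layout are assumptions made for illustration. The rsync man page summarizes `--backup-dir` as making backups "into hierarchy based in DIR", which suggests the relative paths are recreated under it, though that is worth confirming on a scratch tree.

```python
from datetime import date

def build_rsync_cmd(source, target, increments_root, day):
    """Assemble (but do not run) an rsync command for one daily pass.

    All names and paths here are hypothetical.
    """
    backup_dir = f"{increments_root}/{day.isoformat()}"
    return [
        "rsync",
        "-a",                          # archive mode: recurse, keep metadata
        "--delete",                    # remove target files gone from the source
        "--backup",                    # keep whatever would be replaced or deleted
        f"--backup-dir={backup_dir}",  # ...in a per-date tree (assumed layout)
        f"{source.rstrip('/')}/",      # trailing slash: copy the contents of source
        target,
    ]

cmd = build_rsync_cmd("/data", "/backup/main", "/backup/incr", date(2010, 12, 2))
print(" ".join(cmd))
```

Building the list separately from running it makes it easy to print and inspect the command before pointing it at real data.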

Assume that each time I run rsync (once a day) I make a new directory tree, named by the date, to use as the backup directory. For each file path in the tree, I would start with whatever is in the main tree (the rsync target) and work backwards through the incremental trees until I reach the date I want to restore to. Whenever I encounter a file in an incremental, I would replace the previous file at that path with it. By the time I get back to the given date, I should have the version of the file that was present on that date. Do this for each file in the tree and it should be a full restore.
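
The walk-back described above can be sketched in a few lines of Python. This is a toy model, not a tool: each tree is a dict mapping path to content, dates are ISO strings so they sort correctly, and the function name is made up. It also assumes that the incremental tree for day D holds the versions that existed just before D's run, so restoring to the state after day R means applying every incremental dated later than R, newest first.

```python
def restore_by_walking_back(current, increments, restore_date):
    """Rebuild the tree as it stood after the run on restore_date.

    current:    dict path -> content for the main tree (the rsync target)
    increments: dict ISO date -> dict path -> content saved by that day's run
    """
    restored = dict(current)
    for day in sorted(increments, reverse=True):  # newest increment first
        if day <= restore_date:
            break                                 # reached the date of interest
        restored.update(increments[day])          # older version replaces newer
    return restored

# Day 1 had a.txt=v1 and b.txt=b1. Day 2 changed a.txt and deleted b.txt.
# Day 3 changed a.txt again.
current = {"a.txt": "v3"}
increments = {
    "2010-12-02": {"a.txt": "v1", "b.txt": "b1"},
    "2010-12-03": {"a.txt": "v2"},
}
print(restore_by_walking_back(current, increments, "2010-12-01"))
```

Replaced and deleted files come back correctly in this model. Nothing in it can remove a file, which is exactly the gap raised next.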

But, and this is the hard part, what about files that did not exist on the intended restore date but were created after it? For a correct restore, such files should be absent from the restored tree (just as they were absent from the source tree on that date).

How can such a restore be done so that it correctly excludes these files? Wouldn't rsync have to store some kind of sentinel indicating that the file did not exist on prior dates?
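
One way to picture such a sentinel is a per-run list of newly created paths kept next to each dated tree. The Python sketch below is hypothetical: the `created` log is not something `--backup-dir` is known to produce, how it would be collected is left open, and trees are modeled as dicts of path to content with ISO date strings as keys. As before, it assumes the incremental for day D holds the versions from just before D's run.

```python
def restore_with_creation_log(current, increments, created, restore_date):
    """Walk back through the increments, also undoing file creations.

    created: dict ISO date -> set of paths first created by that day's run
             (the hypothetical sentinel data)
    """
    restored = dict(current)
    for day in sorted(set(increments) | set(created), reverse=True):
        if day <= restore_date:
            break
        for path in created.get(day, ()):
            restored.pop(path, None)              # file did not exist before this run
        restored.update(increments.get(day, {}))  # put back the pre-run versions
    return restored

current = {"a.txt": "v3", "new.txt": "n1"}
increments = {"2010-12-02": {"a.txt": "v1"}, "2010-12-03": {"a.txt": "v2"}}
created = {"2010-12-03": {"new.txt"}}
print(restore_with_creation_log(current, increments, created, "2010-12-01"))
```

Undoing creations before applying a day's saved versions keeps the order right for a path that was deleted on one day and recreated on a later one.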

I suspect someone will suggest that I just make a complete hard-linked replica tree for each date, so that absent files are clearly absent. I can assure you this is completely impractical, because I have actually done it before. I ended up with backup filesystems that had so many directories and nodes that it could take over a day, maybe even days, just to run something like "du -s" on them. I intend to keep daily changes for at least a couple of years, if not more. That means the 40 million plus files would be multiplied by over 700, so programs like "du -s" would have to check over 28 BILLION file names (and that assumes the number of files does not grow over the next two years). Let's not go that way.
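
The arithmetic behind that figure, as a quick Python check. The file count and the number of daily trees come from the post; the scan rate is a purely hypothetical number, included only to give a feel for the time scale.

```python
files = 40_000_000        # files in the source tree (from the post)
snapshots = 700           # daily trees kept over roughly two years (from the post)

entries = files * snapshots
print(f"{entries:,} file names to check")

assumed_rate = 100_000    # HYPOTHETICAL: names checked per second
seconds = entries / assumed_rate
print(f"about {seconds / 86_400:.1f} days at {assumed_rate:,} names per second")
```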
 
12-02-2010, 09:45 PM   #2
kbp
Do you really need every day's backup to be immediately available? I'd probably split it into 2 parts:

- rolling rsync backups with hard links to keep 30 days worth readily available
- generate a tarball every month and keep it for 2 years
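
A minimal Python sketch of the rolling part of that scheme, under stated assumptions: snapshots live in per-date directories, `--link-dest=` (a real rsync option that hard-links unchanged files against an earlier tree) is used for the daily pass, and anything older than 30 days is pruned. The function names and paths are made up, and the command is only assembled, not run.

```python
from datetime import date, timedelta

def link_dest_cmd(source, snapshot_root, today, previous):
    """Assemble (but do not run) a hard-linking rsync pass for one day."""
    return [
        "rsync", "-a", "--delete",
        f"--link-dest={snapshot_root}/{previous.isoformat()}",  # reuse unchanged files
        f"{source.rstrip('/')}/",
        f"{snapshot_root}/{today.isoformat()}/",
    ]

def snapshots_to_prune(snapshot_days, today, keep_days=30):
    """Return the snapshot dates that fall outside the rolling window."""
    cutoff = today - timedelta(days=keep_days)
    return sorted(d for d in snapshot_days if d < cutoff)

today = date(2010, 12, 2)
days = [date(2010, 10, 15), date(2010, 11, 2), date(2010, 11, 3), date(2010, 12, 1)]
print(" ".join(link_dest_cmd("/data", "/backup/snap", today, date(2010, 12, 1))))
print(snapshots_to_prune(days, today))
```

A snapshot exactly 30 days old is kept here; whether the boundary should be inclusive is a policy choice.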

The contents of each tarball can be listed and dumped into a text file when it's generated, so you can grep for a particular file.
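
The listing idea could look something like this in Python, using the standard `tarfile` module. It is only a sketch: the file names are invented, the demo builds a tiny throwaway tarball in a temporary directory, and a substring match stands in for grep.

```python
import tarfile
import tempfile
from pathlib import Path

def write_index(tar_path, index_path):
    """Write one member name per line, roughly like `tar -tf archive > index`."""
    with tarfile.open(tar_path, "r:*") as tar:
        names = tar.getnames()
    Path(index_path).write_text("".join(name + "\n" for name in names))
    return names

def search_index(index_path, needle):
    """Return the index lines containing needle (a crude stand-in for grep)."""
    lines = Path(index_path).read_text().splitlines()
    return [line for line in lines if needle in line]

def demo():
    """Build a small tarball, index it, and look a file up in the index."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp = Path(tmp)
        (tmp / "hosts").write_text("127.0.0.1 localhost\n")
        (tmp / "notes.txt").write_text("hello\n")
        tar_path = tmp / "2010-12.tar.gz"          # hypothetical monthly tarball
        with tarfile.open(tar_path, "w:gz") as tar:
            tar.add(tmp / "hosts", arcname="etc/hosts")
            tar.add(tmp / "notes.txt", arcname="home/user/notes.txt")
        index_path = tmp / "2010-12.index.txt"
        write_index(tar_path, index_path)
        return search_index(index_path, "hosts")

print(demo())
```

Keeping the index as plain text means it can be searched without opening the tarball at all.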

You can try Rsnapshot (http://rsnapshot.org/). <plug>I also wrote one in bash before I realised Rsnapshot existed: snap_create</plug>

Last edited by kbp; 12-02-2010 at 09:50 PM.
 
12-02-2010, 10:06 PM   #3
Skaperen
Original Poster
Yes, daily is needed for most of it. It might even be needed more often than daily. Weekends can probably be excluded.

I just read the article linked from the rsnapshot site. It looks like what I used to do, which is basically the hard-linked replica approach that I'm trying to avoid because it won't scale to this huge project.

I'm working on my own project to make something that will do the backup increments correctly (which --backup-dir= won't do) so a restore will be correct within the day resolution. But I need to verify that there isn't some other way to do it before I can justify using my project for this.
 
  


All times are GMT -5.
