Restoring from rsync incremental backups
With rsync's --backup and --backup-dir= options, I can name a separate tree where rsync puts files that are deleted or replaced. I'm hoping it fills out that tree with a replica of the original directory paths (at least for the files put there); otherwise it's a show stopper. My question is about restoring files from this arrangement.
Assume that each time I run rsync (once a day) I make a new directory tree, named by the date, to serve as the backup directory. To restore, for each file path in the tree I would start with whatever is in the main tree (the rsync target) and work backwards through the incremental trees until I reach the date I want to restore to. Whenever I encounter a file in an incremental along the way, I would replace the previous file at that path with it. By the time I get back to the given date, I should have the version of the file that was present on that date. Do this for every file in the tree and it should be a full restore.
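As a rough sketch of the procedure I have in mind (Python, regular files only), under assumptions of my own: the incremental directories are named YYYY-MM-DD so they sort chronologically, the directory for a given date holds what that day's run displaced, and "restore to date R" means the state right after the run on R. The function name and arguments are hypothetical.

```python
import shutil
from pathlib import Path


def restore(main_tree, incrementals_root, restore_date, dest):
    """Copy main_tree to dest, then overlay incrementals newest-first.

    Assumption: incrementals_root/<YYYY-MM-DD>/ holds the files that the
    rsync run on that date deleted or replaced, at their original paths.
    """
    main_tree = Path(main_tree)
    incrementals_root = Path(incrementals_root)
    dest = Path(dest)

    # Start from the current state of the rsync target (dest must not exist).
    shutil.copytree(main_tree, dest)

    # Only runs strictly after the restore date displaced files we need back.
    later = sorted(
        (d for d in incrementals_root.iterdir()
         if d.is_dir() and d.name > restore_date),
        reverse=True,  # newest first, so the oldest applicable copy wins
    )
    for inc in later:
        for src in inc.rglob("*"):
            if not src.is_file():
                continue  # sketch handles regular files only
            target = dest / src.relative_to(inc)
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, target)
```

Note that files created after the restore date are still present in dest afterwards, because nothing in the incremental trees records that they were once absent.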
But, and this seems to be the hard part: what about files that did not exist on the intended restore date but were created on a later date? For a correct restore, I'd want such files to be absent from the restored tree, just as they were absent from the source tree on that date.
How can such a restore be done so that it correctly excludes these files? Wouldn't rsync have to store some kind of sentinel indicating that the file did not exist on prior dates?
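To make concrete what I mean by a sentinel, here is a rough sketch of one possible form: a plain list of the paths that existed in the source on a given date, saved alongside that day's run. This is not something I know rsync to do; the helper names and the manifest format are my own assumptions, and the sketch ignores file names containing newlines.

```python
from pathlib import Path


def write_manifest(source_tree, manifest_path):
    """Record every relative file path present in source_tree, one per line."""
    source_tree = Path(source_tree)
    paths = sorted(
        p.relative_to(source_tree).as_posix()
        for p in source_tree.rglob("*")
        if p.is_file()
    )
    Path(manifest_path).write_text("".join(path + "\n" for path in paths))


def prune_to_manifest(restored_tree, manifest_path):
    """Delete files in restored_tree that the manifest does not list.

    Returns the relative paths that were removed. Empty directories are
    left in place.
    """
    restored_tree = Path(restored_tree)
    keep = set(Path(manifest_path).read_text().splitlines())
    removed = []
    # sorted() materializes the listing before anything is deleted.
    for p in sorted(restored_tree.rglob("*")):
        rel = p.relative_to(restored_tree).as_posix()
        if p.is_file() and rel not in keep:
            p.unlink()
            removed.append(rel)
    return removed
```

With something like this, a restore would overlay the incrementals first and then prune against the manifest for the restore date, so files created later disappear from the restored tree.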
I suspect someone might suggest I just make a complete hard-linked replica tree for each date, so that absent files will clearly be absent. I can assure you this is completely impractical, because I have actually done it before. I ended up with backup filesystems that had so many directories and nodes that it could take over a day, maybe even days, just to do something like "du -s" on them. I intend to keep daily changes for at least a couple of years, if not more. That means the 40 million plus files would be multiplied by over 700, so programs like "du -s" would have to check over 28 BILLION file names (and that's assuming the number of files does not grow over the next two years). Let's not go that way.