I wasn't sure whether this belongs in the server or the programming section, so I decided to put it here. Ideally, any solution would be in bash.
Now, I have this idea. It came from being highly annoyed with backup methods. I know there are some good ones, but I think they could be better. Symbolic links and other ways of pointing to an existing file are already available, and I think backups could make use of them.
One of my issues with incremental backups is that they back up files at new locations even if the files themselves didn't change, which wastes space. Sure, the file might not have changed, but its location did; thus the backup program, such as sbackup, thinks it's a new file that needs to be added to the incremental backup tar.
Example:
/home/blahblahblah/jackhandy.mpeg
was put into the full backup, and the file was 2GB.
The next week it was moved to:
/home/blahblahblah/jackhandyfiles/jackhandy.mpeg
And yet typical backup methods put the whole file into the incremental backup again.
That's annoying and wasteful.
Anyone see a problem with that? I think there could be an improvement.
It would be ideal if the program checked the file's checksum/properties against files of the same name and, when they match, simply linked to the copy already stored in a previous full/incremental backup.
So, I wrote up a rough outline of how backup methods could be improved and used. Tell me if you understand what I'm getting at and whether you think backup methods should work like this in the future:
Code:
1) Full backup
2) Incremental backup
3) Restoration from last incremental backup
4) Restoration from any point
2) Incremental backup
Incremental backup attributes:
1. Logs everything that has changed since the last full backup.
   a. Logs if a file is no longer there.
   b. Logs if a file has moved.
      ba. Checks whether the file that has moved is the same file.
         baa. If the moved file is not the same file (different checksum), it is copied into the incremental backup.
         bab. This new file's checksum is logged.
      bb. If the moved file is the same file (same checksum), only a symlink is created at its new location.
         bba. This symlink points to the same file's old location, which exists in either an incremental backup or the full backup.
         bba*. This prevents the file from being copied again, which would increase storage requirements.
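
The moved-file steps above (b, ba, bb) could be sketched in bash roughly like this. It is only a minimal sketch of the idea, not how sbackup or any existing tool works: the function names `build_manifest` and `incremental_backup`, the one-line-per-file manifest format, and the directory layout are all made up for illustration. It assumes GNU coreutils (`sha256sum`), absolute directory paths without a trailing slash, a fresh incremental directory, and file names without newlines.

```bash
# Hypothetical helpers; names and manifest format are assumptions for illustration.

# build_manifest BACKUP_DIR MANIFEST
# Writes one "checksum  path" line for every regular file in BACKUP_DIR.
build_manifest() {
    local backup_dir=$1 manifest=$2
    find "$backup_dir" -type f -exec sha256sum {} + > "$manifest"
}

# incremental_backup SOURCE_DIR FULL_DIR INCR_DIR MANIFEST
# Unchanged files are skipped, moved files become symlinks into the
# earlier backup, and new or changed files are copied and logged.
incremental_backup() {
    local src=$1 full=$2 incr=$3 manifest=$4
    local file rel sum old
    find "$src" -type f | while IFS= read -r file; do
        rel=${file#"$src"/}
        sum=$(sha256sum "$file" | cut -d' ' -f1)

        # Same checksum at the same relative path: nothing changed, skip it.
        if grep -Fxq -e "$sum  $full/$rel" "$manifest"; then
            continue
        fi

        mkdir -p "$incr/$(dirname "$rel")"

        # Find any earlier backed-up file with the same checksum.
        old=$(awk -v s="$sum" '$1 == s { print substr($0, length(s) + 3); exit }' "$manifest")

        if [ -n "$old" ]; then
            # Step bb: same content in a new place, so store only a symlink.
            ln -s "$old" "$incr/$rel"
        else
            # Steps baa/bab: new content, so copy it and log its checksum.
            cp -p "$file" "$incr/$rel"
            printf '%s  %s\n' "$sum" "$incr/$rel" >> "$manifest"
        fi
    done
}
```

With the example from earlier, a 2GB `jackhandy.mpeg` that only moved into `jackhandyfiles/` would end up in the incremental backup as a symlink to the copy in the full backup, instead of a second 2GB copy. Logging of deleted files (step a) is left out of this sketch.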
Ways to make the checksum process easier:
1. Log checksums only for files over a certain size, such as 1MB or 100KB.
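
As a rough illustration of the size cutoff, `find` can do the filtering so that only the big files ever get checksummed. The function name `checksum_large_files` is hypothetical, and the sketch assumes `sha256sum` from GNU coreutils is available. Smaller files would simply be copied the normal way, since re-copying them costs little.

```bash
# Hypothetical helper; the name and output format are assumptions for illustration.

# checksum_large_files DIR MIN_BYTES
# Prints "checksum  path" only for regular files strictly larger than MIN_BYTES.
checksum_large_files() {
    local dir=$1 min_bytes=$2
    # The "c" suffix makes find compare the size in bytes, without rounding.
    find "$dir" -type f -size +"${min_bytes}c" -exec sha256sum {} +
}
```

For example, `checksum_large_files /home/blahblahblah 1048576` would log checksums only for files over 1MB.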
3) Restoration from last incremental backup
1. The directory tree is created according to what the most recent version of the tree should look like.
2. Symlinks are turned into the actual files they link to.
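
Step 2 of the restoration could be sketched in bash like this, assuming the incremental backup is a plain directory tree whose symlinks point at files in earlier backups that still exist. The function name `restore_incremental` is made up for illustration, and the sketch covers only the symlink step: it does not bring back unchanged files that live only in the full backup, and it does not replay any log of deleted files.

```bash
# Hypothetical helper; the name and directory layout are assumptions for illustration.

# restore_incremental INCR_DIR DEST_DIR
# Recreates the tree from INCR_DIR in DEST_DIR, replacing each symlink
# with a real copy of the file it points to.
restore_incremental() {
    local incr=$1 dest=$2
    mkdir -p "$dest"
    # -R: recurse, -L: follow symlinks and copy their targets,
    # -p: keep modes and timestamps.
    cp -RLp "$incr"/. "$dest"/
}
```

After this runs, the restored tree contains only regular files and directories, so it no longer depends on the backup storage being mounted.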