LinuxQuestions.org
Old 10-22-2014, 06:40 AM   #1
alleyoopster
Member
 
Registered: Jul 2008
Location: Cape Town
Distribution: Debian testing / unstable
Posts: 38

Rep: Reputation: 0
Periodic backup of changes on another disk from full backup


I am looking for a backup solution that will let me back up only the changes to an internal disk after a full backup to an external disk. It should also take snapshot backups every day until the next full backup.

Day 1: Full Backup / > External
Day 2: Changes to /home after full > internal
Day 3: Changes to /home after full > internal

etc

Currently I am using a simple rsync script for the full backup.

I have looked at, and previously used, rsnapshot and more recently Back in Time, both of which keep transparent snapshots on the internal disk at timed intervals. Both work great but, as far as I know, rely on a full backup to the same location first.

VG1
    root
    backups here

VG2
    home

I am running Linux with LVM and two VGs (one holds home; the other holds the root LV and the backup LV). If LVM snapshots can be used for this, I am open to suggestions. Something similar to https://btrfs.wiki.kernel.org/index....emental_Backup may work if I understand it correctly, but I am not ready to change to btrfs yet.
 
Old 10-22-2014, 10:41 AM   #2
linosaurusroot
Member
 
Registered: Oct 2012
Distribution: OpenSuSE,RHEL,Fedora,OpenBSD
Posts: 982
Blog Entries: 2

Rep: Reputation: 244
dump/restore (for ext* type filesystems) has levels where 0 means full and 1 means everything since full.
 
Old 10-22-2014, 03:21 PM   #3
alleyoopster
Member
 
Registered: Jul 2008
Location: Cape Town
Distribution: Debian testing / unstable
Posts: 38

Original Poster
Rep: Reputation: 0
Thanks, I didn't know about dump. The only problem is that the files won't remain transparent: I can't restore an individual file, which is the advantage of an rsync or rsnapshot style backup. If I lose a file or need to get a file from last week, I don't want to restore the whole partition.
 
Old 10-23-2014, 05:23 AM   #4
linosaurusroot
Member
 
Registered: Oct 2012
Distribution: OpenSuSE,RHEL,Fedora,OpenBSD
Posts: 982
Blog Entries: 2

Rep: Reputation: 244
Quote:
Originally Posted by alleyoopster
I can't restore an individual files .... If I lose a file or need to get a file from last week, I don't want to restore the whole partition.
False - pick an empty directory and do
Code:
restore ivf /name/of/dumpfile
and select the file(s) to extract under your CWD. Then move them to where you want.
 
Old 10-23-2014, 07:03 AM   #5
alleyoopster
Member
 
Registered: Jul 2008
Location: Cape Town
Distribution: Debian testing / unstable
Posts: 38

Original Poster
Rep: Reputation: 0
Fantastic. Thanks I'll look into it. I noticed XFS has this also, which would be preferable for me - if it gets the same result.
 
Old 10-23-2014, 09:54 AM   #6
Beryllos
Member
 
Registered: Apr 2013
Location: Massachusetts
Distribution: Debian
Posts: 529

Rep: Reputation: 319
Since you are already using rsync, you should look into the --link-dest=DIR option.

It works by hardlinks, so one of the requirements is that the destination filesystem must have hardlink capability (as ext3 and ext4 do, and I think even ntfs does).

It creates what looks like a full backup; you see a directory with all the files. However, files which have not changed since the last backup are actually stored as hardlinks to the previously saved files, which saves a lot of disk space. Files which were changed or deleted since the last backup can be retrieved by looking in the previous backup directories.

Here's how it works:

First you create the full backup. If you back up no more than once a day, you could name the backup directory according to the date, as in this example:
Code:
#!/bin/bash
backup_dir=/backup_drive/home_backup_directory
today=$(date +%Y-%m-%d)    # or equivalently $(date +%F) if you have it

/usr/bin/rsync -a /home/ "$backup_dir/$today"
For incremental backups, add the --link-dest=DIR option:
Code:
last_backup=$(ls -1A "$backup_dir" | tail -1)

/usr/bin/rsync -a --link-dest="$backup_dir/$last_backup" /home/ "$backup_dir/$today"
If you want (as I certainly would), you could add some error checking to make sure $last_backup is valid. It might not be, for example, if the external drive is offline, or if $(ls -1A $backup_dir/ | tail -1) points to a regular file or a directory other than the most recent backup.
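
A minimal sketch of that error checking, assuming the backup directories are named YYYY-MM-DD as in the example above. The function name latest_backup_dir is hypothetical, not part of rsync. It ignores regular files and directories that do not look like dates, and it fails if the backup directory is missing (for example, when the external drive is not mounted).

```shell
#!/bin/bash
# latest_backup_dir BACKUP_DIR [SKIP_NAME]
# Hypothetical helper: print the name of the newest YYYY-MM-DD directory
# under BACKUP_DIR, optionally skipping one name (e.g. today's directory).
# Returns non-zero if BACKUP_DIR is missing or holds no dated directory.
latest_backup_dir() {
    local entry name latest
    latest=""
    [ -d "$1" ] || return 1
    # Globs expand in sorted order, so the last match is the newest date.
    for entry in "$1"/[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]; do
        [ -d "$entry" ] || continue          # skip files and an unmatched glob
        name=${entry##*/}
        if [ "$name" = "${2:-}" ]; then
            continue                         # skip the excluded name
        fi
        latest=$name
    done
    [ -n "$latest" ] || return 1
    printf '%s\n' "$latest"
}
```

In the incremental script it could replace the ls line with something like last_backup=$(latest_backup_dir "$backup_dir" "$today") || exit 1, so that rsync is never started with a bad --link-dest.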

Additional note added in editing: One of the beautiful things about hardlinking backups in this way is that you can delete old, unneeded backup directories in any order. If you no longer need the first full backup, go ahead and delete it; the incremental backups will still have all the files. Each hardlink is independently associated with the actual file, so as long as at least one hardlink remains, the file is still there. When you delete the last hardlink, the file is gone.
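
A small demonstration of that hardlink behaviour, using a throwaway directory from mktemp rather than a real backup. The directory and file names are made up for illustration, and stat -c %h assumes GNU coreutils.

```shell
#!/bin/bash
# Two "backup" directories sharing one file through a hardlink.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/2014-10-22" "$demo_dir/2014-10-23"
echo "report v1" > "$demo_dir/2014-10-22/report.txt"
ln "$demo_dir/2014-10-22/report.txt" "$demo_dir/2014-10-23/report.txt"

# Link count while both directory entries exist.
links_before=$(stat -c %h "$demo_dir/2014-10-23/report.txt")

# Delete the older "backup"; the data survives through the remaining link.
rm -r "$demo_dir/2014-10-22"
links_after=$(stat -c %h "$demo_dir/2014-10-23/report.txt")
surviving=$(cat "$demo_dir/2014-10-23/report.txt")

echo "links before: $links_before, after: $links_after, content: $surviving"
```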

Last edited by Beryllos; 10-23-2014 at 10:09 AM.
 
Old 10-23-2014, 12:11 PM   #7
alleyoopster
Member
 
Registered: Jul 2008
Location: Cape Town
Distribution: Debian testing / unstable
Posts: 38

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by Beryllos
Since you are already using rsync, you should look into the --link-dest=DIR option.
Thanks for the detailed answer. Hard links would be just the thing for this. Your suggestion is much like rsnapshot. There seems to be a problem though. The external drive is only connected for the full backup and after this stored in another location. With this method doesn't rsync need to see the full backup before it can run an incremental backup?
 
Old 10-23-2014, 01:08 PM   #8
Beryllos
Member
 
Registered: Apr 2013
Location: Massachusetts
Distribution: Debian
Posts: 529

Rep: Reputation: 319
Quote:
Originally Posted by alleyoopster
Thanks for the detailed answer. Hard links would be just the thing for this. Your suggestion is much like rsnapshot. There seems to be a problem though. The external drive is only connected for the full backup and after this stored in another location. With this method doesn't rsync need to see the full backup before it can run an incremental backup?
Yes. That is a problem. The hardlinks and the full backup must be stored within the same filesystem.

If the drive is attached to another computer, and accessible via the network, rsync can handle that. It would send the new files and changed files over the network, to be stored on the same filesystem as the full backup. This can be done efficiently by compressing the data stream, and securely with ssh as the transport protocol. I've heard of people doing something like that, creating the full backup locally because it's much faster for a large file system, then taking the drive to their remote office or facility, and then performing incremental backups remotely.
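
As a rough sketch of that remote variant: the function name remote_incremental and its arguments are hypothetical, and it has not been run against a real server. It uses -z to compress the data stream and -e ssh for the transport. The --link-dest path refers to a directory on the remote machine, because the hardlinks are created there.

```shell
#!/bin/bash
# remote_incremental HOST BACKUP_ROOT PREVIOUS TODAY
# Hypothetical wrapper: send /home to HOST over ssh, hardlinking unchanged
# files against BACKUP_ROOT/PREVIOUS on the remote filesystem.
remote_incremental() {
    rsync -az -e ssh \
        --link-dest="$2/$3" \
        /home/ "$1:$2/$4"
}
```

A call might look like remote_incremental backup@office.example.com /backup_drive/home_backup_directory 2014-10-22 2014-10-23, where the host name is a placeholder.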

If the full backup is off the network, or powered off in a storage closet or a safe, this rsync method won't do what you need.

Edit: Your idea to keep the backup at a remote location is an excellent one. It's tragic when the computer and the backup are lost together, as might happen in a fire or theft.

Last edited by Beryllos; 10-23-2014 at 01:11 PM.
 
Old 10-24-2014, 04:16 AM   #9
alleyoopster
Member
 
Registered: Jul 2008
Location: Cape Town
Distribution: Debian testing / unstable
Posts: 38

Original Poster
Rep: Reputation: 0
I have been looking at the xfsdump method and it seems each time it writes a full backup it has to write all the data again. It cannot update just the changes. This would be a fail.

As for the rsync method, I have an idea: if I create an LVM snapshot of the current /home and then specify this location as the --link-dest, it may fool rsync into thinking that the snapshot was the last backup, and it would then increment the backup from the current home. I am not sure what rsync uses to identify the last backup.

Haven't tested either of the above yet.
 
Old 10-24-2014, 05:22 AM   #10
linosaurusroot
Member
 
Registered: Oct 2012
Distribution: OpenSuSE,RHEL,Fedora,OpenBSD
Posts: 982
Blog Entries: 2

Rep: Reputation: 244
If you ask for a full backup it writes all the data because that's what you asked for. To do a "differential" dump that writes only the changes you use a different dump level number.
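
To make the levels concrete, here is a sketch assuming /home is its own ext* filesystem and the commands run as root. The function names and dump file paths are placeholders. The -u option records each dump in the dumpdates file, which is how a later level 1 run knows when the last level 0 was taken.

```shell
#!/bin/bash
# full_dump DUMPFILE
# Level 0: write everything on the /home filesystem to DUMPFILE.
full_dump() {
    dump -0 -u -f "$1" /home
}

# changes_since_full DUMPFILE
# Level 1: write only files changed since the most recent level 0 dump.
changes_since_full() {
    dump -1 -u -f "$1" /home
}
```

For example, full_dump /mnt/external/home.0.dump on day 1, then changes_since_full /backup/home.1.$(date +%F).dump on the following days. Each level 1 dump is relative to the full dump, so the external disk is not needed to create it.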
 
Old 10-24-2014, 08:19 AM   #11
alleyoopster
Member
 
Registered: Jul 2008
Location: Cape Town
Distribution: Debian testing / unstable
Posts: 38

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by linosaurusroot
If you ask for a full backup it writes all the data because that's what you asked for. To do a "differential" dump that writes only the changes you use a different dump level number.
I did ask for a full backup and that is what I want: a periodic full backup to the external disk. In practical terms, dump sends all the data to that disk every time, whereas rsync, which is my current method for the full backup, writes (and deletes) only what has changed, in effect mirroring the data. With a large volume (which this is), the time and system resources needed to run a full backup are significantly greater using dump. Rsync often takes only a few minutes where dump would take hours.

The dump method does have the advantage of keeping a record of the backup and using that for further backups. What I think you're suggesting is to write differential backups to the external disk and increments to the internal disk. What I understand from this is that the external would grow and grow. Also, restoring would be more difficult. Wouldn't I at some point need to run another huge full backup? I only know a little about dump, so I may be missing something.
 
Old 10-24-2014, 09:00 PM   #12
Beryllos
Member
 
Registered: Apr 2013
Location: Massachusetts
Distribution: Debian
Posts: 529

Rep: Reputation: 319
Sorry, I didn't read your original post closely enough. Now that I understand what you want, I have another suggestion. It might not be too hard to write a script to select and back up files that have been created or modified since the last backup, either full or incremental.

Immediately before the full backup, run rsync with the --list-only option and redirect the output to a file on your internal drive:
Code:
today=$(date +%F)
/usr/bin/rsync -a --list-only /home/ "$external_backup_dir/$today" > previous_file_list
or equivalently
Code:
/usr/bin/rsync -r /home/ > previous_file_list
That does not back up anything, but creates a list of all files and directories that would have been backed up, with each file's size and modification time. Those are the properties that rsync uses by default to determine whether a file needs to be backed up. At this point, we need only to save it for later use.

Then execute the rsync full backup to your external drive.
Code:
/usr/bin/rsync -a /home/ "$external_backup_dir/$today"
To begin the incremental backup at some later date, make an updated file list:
Code:
/usr/bin/rsync -r /home/ > current_file_list
Then compare the previous and current lists. At the moment, I couldn't tell you exactly how to do that, but I suspect it can be done without much difficulty in bash; perhaps identify the unchanged items and eliminate them, and also eliminate directories. For each file which has been created or modified, put its name with the full path into a backup to-do list. Then tell rsync to back up from that list to your internal backup directory:
Code:
# --files-from reads paths relative to the source argument, so full paths need / as the source
/usr/bin/rsync -a --files-from="$backup_todo_list" / "$internal_backup_dir/$today"
At this point, you could also identify and list the files that were deleted since the previous backup.
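
One possible way to do the comparison is with sort and comm instead of a hand-written loop. This is an untested-in-production sketch: the function names are made up, it assumes the listings have the usual "permissions size date time path" columns printed by rsync --list-only, and it will not cope with file names that contain newlines. Only lines starting with "-" (regular files) are considered, so directories and symlinks are ignored.

```shell
#!/bin/bash
# Strip the four leading columns of a listing line, leaving only the path.
strip_columns() {
    sed -E 's/^[^ ]+ +[^ ]+ +[^ ]+ +[^ ]+ //'
}

# changed_files PREVIOUS_LIST CURRENT_LIST [PREFIX]
# Print regular files that are new, or whose listing line (size, time,
# permissions) differs from the previous list. PREFIX is prepended to each path.
changed_files() {
    local prev cur path
    prev=$(mktemp) || return 1
    cur=$(mktemp) || return 1
    LC_ALL=C sort "$1" > "$prev"
    LC_ALL=C sort "$2" > "$cur"
    # comm -13 keeps the lines that appear only in the current list.
    LC_ALL=C comm -13 "$prev" "$cur" | grep '^-' | strip_columns |
        while IFS= read -r path; do printf '%s%s\n' "${3:-}" "$path"; done
    rm -f "$prev" "$cur"
}

# deleted_files PREVIOUS_LIST CURRENT_LIST
# Print regular files named in the previous list but absent from the current one.
deleted_files() {
    local prev cur
    prev=$(mktemp) || return 1
    cur=$(mktemp) || return 1
    grep '^-' "$1" | strip_columns | LC_ALL=C sort > "$prev"
    grep '^-' "$2" | strip_columns | LC_ALL=C sort > "$cur"
    # comm -23 keeps the names that appear only in the previous list.
    LC_ALL=C comm -23 "$prev" "$cur"
    rm -f "$prev" "$cur"
}
```

The listing paths are relative to /home/, so changed_files previous_file_list current_file_list /home/ would print full paths for the to-do list, and deleted_files previous_file_list current_file_list would print the relative names of files that have disappeared.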

The final step is to update the file list for the next incremental backup:
Code:
mv -f current_file_list previous_file_list
Please note: I haven't tested any of this. There may be syntax errors and/or logical errors that I don't know about.

Last edited by Beryllos; 10-25-2014 at 12:39 AM. Reason: minor correction
 
Old 10-25-2014, 03:11 AM   #13
alleyoopster
Member
 
Registered: Jul 2008
Location: Cape Town
Distribution: Debian testing / unstable
Posts: 38

Original Poster
Rep: Reputation: 0
That looks promising. I need to have a look at the compare as I also cannot see an obvious solution at the moment.


What I notice is that using --list-only gives a different, and most likely more usable, output compared with -r.

Last edited by alleyoopster; 10-25-2014 at 03:18 AM. Reason: Added observation
 
Old 10-25-2014, 09:41 AM   #14
Beryllos
Member
 
Registered: Apr 2013
Location: Massachusetts
Distribution: Debian
Posts: 529

Rep: Reputation: 319
Quote:
Originally Posted by alleyoopster
That looks promising. I need to have a look at the compare as I also cannot see an obvious solution at the moment.
I would start with the diff command:
Code:
diff previous_file_list current_file_list
That still leaves a lot of work to do. The output of diff will differ somewhat depending on whether the file was created, deleted, or modified. For any of those three cases, the containing directory also changes and this shows up in the diff output. If I were doing this, I would ignore all changes to directories and analyze only the changes of files.
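
A rough sketch of that filtering, assuming the lists have the usual "permissions size date time path" columns from rsync --list-only. The function name is hypothetical. In diff's default output, lines starting with ">" come from the current list, and a following "-" marks a regular file, so directory changes are dropped.

```shell
#!/bin/bash
# changed_via_diff PREVIOUS_LIST CURRENT_LIST
# Print the paths of regular files that are new or modified, based on the
# "> " lines of diff output. Directory and symlink lines are ignored.
changed_via_diff() {
    diff "$1" "$2" |
        grep '^> -' |
        sed -E 's/^> [^ ]+ +[^ ]+ +[^ ]+ +[^ ]+ //'
}
```

Deleted files would show up as "<" lines whose path has no matching ">" line, which needs a second pass and is not handled here.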


Quote:
Originally Posted by alleyoopster
What I notice is that using --list-only gives a different and most likely more usable output to -r
Funny, I see the exact same output, but it might be due to the content of the source directory. The difference is probably not due to --list-only but rather due to using -r instead of -a. The -a option processes symlinks but I think -r doesn't. It would be better to use -a, or even better to use your exact rsync command plus the --list-only option. Edit: No, I just tried it with symlinks, and I still see no difference between the outputs of those rsync commands. It must be something else.

Last edited by Beryllos; 10-25-2014 at 02:31 PM. Reason: follow-up on last paragraph
 
Old 10-26-2014, 06:20 AM   #15
alleyoopster
Member
 
Registered: Jul 2008
Location: Cape Town
Distribution: Debian testing / unstable
Posts: 38

Original Poster
Rep: Reputation: 0
There have been some issues this weekend with the system. I lost a drive last week, which prompted the search for a better backup solution. Replaced the disk with a relatively new disk I was using for backups, but now getting Buffer IO errors and system hangs intermittently. I checked the old drive and that is definitely dead. The problem actually looked like a SATA cable for a long time, but today it is looking more like a SATA port.

So, I am putting the backup solution on hold for a bit until I can sort out the system. Thank you both for your help so far with this.
 
  



Tags
backup, lvm, rsync, snapshot


All times are GMT -5.
