| Linux - Server This forum is for the discussion of Linux Software used in a server related context. |
06-07-2008, 02:36 AM
#1
LQ Newbie
Registered: Mar 2007
Posts: 22
Theory behind full, differential and incremental backups?
Hi,
I would like to understand the theory behind the different backup strategies. I know that a full backup copies all of the data every time it runs, a differential backup copies the data changed since the last full backup, and an incremental backup copies the data changed since the last backup of any kind (full or incremental). I also know how these strategies affect data restoration.
What I would like to know is:
1) The exact commands that can be used for each strategy. I guess scp or rsync can be used for full backups, and rsync with --delete can be used for differential backups.
2) How can I run differential/incremental backups on weekdays? Can each day's backup go into a separate folder named Mon, Tue, etc., or should they all go into a single location?
3) How do the commands or folders know whether the last backup run on a folder was full, differential or incremental? Is there a bit or identifier that records it?
Please let me know the details, with commands where possible.
Thanks in advance!
Regards,
Mohammed.

06-07-2008, 04:05 PM
#2
Senior Member
Registered: Aug 2007
Location: Massachusetts, USA
Distribution: Solaris 9 & 10, Mac OS X, Ubuntu Server
Posts: 1,189
Hmm, I'm not sure your title really describes what you are after.
It sounds like you want to use rsync (on a single system?) and you want incremental backups. You can find all the details on how to do that with a snapshot-like approach here: http://www.mikerubel.org/computers/rsync_snapshots/
If you are interested in ideas and approaches (I'm not sure I would go so far as to call it theory), there is a short piece I wrote here: http://wiki.zmanda.com/index.php/FAQ...da_use_them%3F (parts of it were contributed by others, of course). I like Amanda's planner strategy because, when you are backing up many machines and disk list entries, it distributes the fulls and the incrementals of different levels across the dump cycle so as to even out load and tape usage. Instead of a huge peak when everything gets a full backup, followed by a lull of several days when everything gets incrementals, you have an evenly distributed load (on your network, servers, backup system, tapes, etc.).
Look around and you will find many strategies and approaches. Browse through Curtis Preston's web site, http://www.backupcentral.com, or buy his book "Backup & Recovery", published by O'Reilly. The newest edition is from January 2007. You'll find just about anything you want there, although it focuses on open source software rather than on all the commercial products that are available (a plus from my perspective). I understand he's working on something that will cover the commercial options.

06-14-2008, 09:13 AM
#3
LQ Newbie
Registered: Mar 2007
Posts: 22
Original Poster
Thanks for your reply.
Yes, I would like to get basic rsync commands for the different backup strategies. Backups will be taken from one machine to another, not onto the same machine. I'm not looking for any products.
http://www.mikerubel.org/computers/rsync_snapshots/ has good information about rsync, but it doesn't answer all my questions.
Using the commands below, we can keep a separate differential backup for each day while transferring only the data that has changed:
mv backup.0 backup.1
rsync --progress -ab --delete --link-dest=../backup.1 source-dir/ backup.0/
Even though the command above transfers only the files changed since *backup.1*, the new backup *backup.0* contains all the files, just like source-dir, once rsync completes. I guess the files from *backup.1* are copied to *backup.0* locally (on the backup server, if the backup is on another server) before rsync transfers the modified files.
If *backup.0* held only the files that were transferred (i.e., the modified/new files), it would be better suited to differential backups. That would also give you a separate differential backup for each day you run it.
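A backup directory that holds only the changed files, as wished for above, is roughly what rsync's --compare-dest option produces: files that are identical to the copy in the comparison directory are skipped instead of being hard-linked. The following is a minimal local sketch, not a tested production setup; the directory names (full, Mon) and the file names are made up for illustration, and a remote destination would work the same way with the comparison directory living on the backup server.

```shell
# Scratch area so the sketch does not touch real data (all names are hypothetical).
work=$(mktemp -d)
mkdir -p "$work/source-dir"
echo 'one' > "$work/source-dir/unchanged.txt"
echo 'two' > "$work/source-dir/changed.txt"

# Full backup: a plain archive-mode copy.
rsync -a "$work/source-dir/" "$work/full/"

# Something changes in the source after the full backup.
echo 'two, edited' > "$work/source-dir/changed.txt"

# "Differential" run: anything identical to the copy in full/ is skipped,
# so Mon/ ends up holding only the files changed since the full backup.
rsync -a --compare-dest="$work/full" "$work/source-dir/" "$work/Mon/"

find "$work/Mon" -type f
```

Comparing against the full backup every day gives differential sets; comparing against the previous day's directory instead would behave more like incrementals. One caveat is that a sparse directory like this does not record deletions, so files removed from the source simply stop appearing.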
Then, about the title: I wanted to know how rsync finds the files that need to be copied. Is it just by comparing modification timestamps on both sides? That would be fine for differential backups but, of course, not for incremental backups. Now I really wonder whether the differential and incremental concepts are both suitable for Linux/Unix systems. Additionally, I have heard that Windows (ntbackup.exe) does full/differential/incremental backups based on a *folder/file is ready for archiving* flag set on the files/folders that need to be backed up.
Please let me know your thoughts. Thanks in advance.
Regards,
Mohammed.

06-14-2008, 10:18 AM
#4
Moderator
Registered: Aug 2001
Location: Fargo, ND
Distribution: SuSE AMD64
Posts: 15,733
If you use tar for the backups, look at section 5.2 of the tar info manual. You can use the --listed-incremental option to create a snapshot file. You probably want to save the backups in the same directory for each cycle. Most people append the date to the filename.
A differential backup is one where the files altered since the last full backup are saved, so each differential backup gets larger until the next full backup. The advantage is that you only need the full backup and the last differential backup to restore the system.
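As a rough illustration of the --listed-incremental option mentioned above, the sketch below takes a level-0 (full) archive and then one incremental archive against the same snapshot file. It assumes GNU tar; the directory and file names (data, backups, data.snar) are invented for the example, and the sleep is only there so the second file is clearly newer than the first run.

```shell
# Scratch area with a small directory to back up (names are hypothetical).
work=$(mktemp -d)
mkdir "$work/data" "$work/backups"
echo 'original' > "$work/data/old.txt"

# Level 0 (full): data.snar does not exist yet, so tar archives everything
# and records the state of the tree in the snapshot file.
tar --create --file="$work/backups/full.tar" \
    --listed-incremental="$work/backups/data.snar" \
    -C "$work" data

sleep 1
echo 'added later' > "$work/data/new.txt"

# Incremental: the same snapshot file is read and then updated, so only
# what changed since the previous run goes into this archive.
tar --create --file="$work/backups/incr1.tar" \
    --listed-incremental="$work/backups/data.snar" \
    -C "$work" data

# The incremental archive lists the directory entry and new.txt, not old.txt.
tar --list --file="$work/backups/incr1.tar"
```

To restore, the GNU tar manual describes extracting the full archive first and then each incremental in order, passing --listed-incremental=/dev/null on extraction so that tar also replays deletions recorded in the archives.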

06-14-2008, 02:36 PM
#5
Senior Member
Registered: Aug 2007
Location: Massachusetts, USA
Distribution: Solaris 9 & 10, Mac OS X, Ubuntu Server
Posts: 1,189
Quote:
Originally Posted by mohammednv
Then, about the title: I wanted to know how rsync finds the files that need to be copied. Is it just by comparing modification timestamps on both sides? That would be fine for differential backups but, of course, not for incremental backups. Now I really wonder whether the differential and incremental concepts are both suitable for Linux/Unix systems. Additionally, I have heard that Windows (ntbackup.exe) does full/differential/incremental backups based on a *folder/file is ready for archiving* flag set on the files/folders that need to be backed up.
Before going into any of your other questions, let me say (quoting W. Curtis Preston, author of O'Reilly's "Backup & Recovery"), "The Windows archive bit is evil." http://www.computerworld.com/hardwar...100697,00.html
In particular, no single backup program can count on having control of the archive bit. If you use two different backup mechanisms (say, one local to a USB drive and one over the network to a server), one may clear the archive bit, and then the other will never know that it should back up that file.
Most competent backup programs use their own tracking mechanisms and do not rely on the archive bit. Dump records the last time each partition was dumped in /etc/dumpdates. Tar can keep its own record as well. Many backup programs keep their own databases of what has been backed up and when.
Looking at timestamps is one of the most common approaches, and it works for both differential and incremental backups. There just needs to be a record of when the last full backup was run as well as when the last backup of any sort was run.
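The timestamp bookkeeping described above can be sketched with two marker files and find's -newer test. This is only an illustration of the idea, not how dump or any particular product does it; the marker names (last-full, last-backup), the dates and the file names are assumptions made up for the example, and touch -d assumes GNU touch.

```shell
# Scratch area; the marker files stand in for "when did the last full run?"
# and "when did the last backup of any kind run?" (names are hypothetical).
work=$(mktemp -d)
mkdir "$work/data"
touch -d '2008-06-01 00:00' "$work/last-full"
touch -d '2008-06-03 00:00' "$work/last-backup"

# Three files with different modification times.
touch -d '2008-05-30 12:00' "$work/data/before-full.txt"
touch -d '2008-06-02 12:00' "$work/data/changed-jun02.txt"
touch -d '2008-06-04 12:00' "$work/data/changed-jun04.txt"

# Differential: everything modified since the last FULL backup.
differential=$(find "$work/data" -type f -newer "$work/last-full" | sort)

# Incremental: everything modified since the last backup of ANY kind.
incremental=$(find "$work/data" -type f -newer "$work/last-backup" | sort)

echo "differential set:"; echo "$differential"
echo "incremental set:";  echo "$incremental"
```

In a real run, the marker would be refreshed with touch just before each backup starts (last-backup after every run, last-full only after a full), so that files modified while the backup is running are picked up next time.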

06-15-2008, 08:29 AM
#6
LQ Newbie
Registered: Mar 2007
Posts: 22
Original Poster
Hey guys, thanks a lot for your replies.
tar with --listed-incremental=snapshot-file looks like a good option. I think it can be used for both differential and incremental backups by handling "snapshot-file" carefully. I mean, keep a copy of the full backup's "snapshot-file" and compare later backups against that same copy if you want differential instead of incremental. Of course, "snapshot-file" is updated each time tar runs, so we should keep a copy of the full backup's snapshot-file that is not overwritten until the next full backup runs. Am I right?
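The idea of preserving the full backup's snapshot file matches what the GNU tar manual suggests for repeated level-1 dumps: work on a fresh copy of the level-0 snapshot each time. A minimal sketch under that assumption follows; it requires GNU tar, and the names (full.snar, work.snar, diff1.tar, diff2.tar) are invented for the example.

```shell
# Scratch area with a small tree to back up (names are hypothetical).
work=$(mktemp -d)
mkdir "$work/data" "$work/backups"
echo 'original' > "$work/data/base.txt"

# Full backup; full.snar is kept untouched until the next full backup.
tar --create --file="$work/backups/full.tar" \
    --listed-incremental="$work/backups/full.snar" \
    -C "$work" data

# Day 1: run against a throwaway COPY of the full backup's snapshot file.
sleep 1
echo 'day 1' > "$work/data/day1.txt"
cp "$work/backups/full.snar" "$work/backups/work.snar"
tar --create --file="$work/backups/diff1.tar" \
    --listed-incremental="$work/backups/work.snar" \
    -C "$work" data

# Day 2: start from the full backup's snapshot again, not from day 1's copy,
# so this archive holds everything changed since the full backup.
sleep 1
echo 'day 2' > "$work/data/day2.txt"
cp "$work/backups/full.snar" "$work/backups/work.snar"
tar --create --file="$work/backups/diff2.tar" \
    --listed-incremental="$work/backups/work.snar" \
    -C "$work" data

tar --list --file="$work/backups/diff2.tar"
```

Because each differential is taken against the same baseline, a restore needs only full.tar plus the most recent diff archive.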
Also, I got incremental/differential backups working with rsync. It's done with the --link-dest flag, but I still have a few unanswered questions about it.
I used the commands below to take a differential backup:
mv backup.0 backup.1
rsync --progress -ab --delete --link-dest=../backup.1 source-dir/ backup.0/
Of course, source-dir contains files that were updated after the first backup. As I said in my previous comment, the new backup *backup.0* contains all the files from source-dir (modified as well as unmodified). File properties show them all as actual files, not links (checked with the "file" and "ls -l" commands). Their space utilization is also a bit unusual. See this:
-------------------------------------------------------------------
localhost $ du -sh backup.*
31M backup.0
2.0K backup.1
2.0K backup.2
localhost $ du -sh backup.0
31M backup.0
localhost $ du -sh backup.1
21M backup.1
localhost $ du -sh backup.2
11M backup.2
localhost $ du -sh backup.1 backup.2
21M backup.1
2.0K backup.2
-------------------------------------------------------------------
If I check the size of one backup on its own, du shows its actual size. When I check two backups at the same time, the latest backup shows its full size and the older one shows only the difference (older one minus latest). Do you know what kind of files/links these are? How should restoration be done? It looks like a restore needs only the one backup directory we want to restore from.
~mohammed.
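The du behaviour shown above is what hard links look like: with --link-dest, an unchanged file in the new snapshot is a second directory entry for the same inode, so "file" and "ls -l" report an ordinary file, and GNU du counts the shared data only once per invocation. The sketch below reproduces this by hand with ln instead of rsync; the file name big.bin is made up, and stat -c assumes GNU coreutils.

```shell
# Scratch area with two "snapshot" directories (names follow the thread).
work=$(mktemp -d)
mkdir "$work/backup.1" "$work/backup.0"
head -c 1048576 /dev/zero > "$work/backup.1/big.bin"   # 1 MiB file in the older snapshot

# What --link-dest does for an unchanged file: add a second name for the same data.
ln "$work/backup.1/big.bin" "$work/backup.0/big.bin"

# Both names point at the same inode, and the link count is now 2.
inode_old=$(stat -c '%i' "$work/backup.1/big.bin")
inode_new=$(stat -c '%i' "$work/backup.0/big.bin")
links=$(stat -c '%h' "$work/backup.0/big.bin")
echo "inodes: $inode_old $inode_new, link count: $links"

# In one du call the shared file is charged to the first directory only;
# asked about backup.1 alone, du reports the file's full size again.
du -sh "$work/backup.0" "$work/backup.1"
du -sh "$work/backup.1"

# Deleting the older snapshot removes one name, not the data.
rm -r "$work/backup.1"
restored_size=$(wc -c < "$work/backup.0/big.bin")
echo "bytes still readable from backup.0: $restored_size"
```

This is also why a restore needs only one snapshot directory: each backup.N is a complete tree in its own right, so copying it back is enough, and removing old snapshots frees space only for data that no remaining snapshot still links to.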
All times are GMT -5.