LinuxQuestions.org
Linux - Newbie This Linux forum is for members that are new to Linux.
Just starting out and have a question? If it is not in the man pages or the how-to's this is the place!

Old 10-24-2014, 03:13 PM   #1
Bluehaze43
LQ Newbie
 
Registered: Oct 2014
Location: Land of the cheese, and home of the beer.
Posts: 7

Rep: Reputation: 0
Rsync and backup questions


For starters, I still know nearly nothing about Linux. However, I do know that I need to use Rsync to have the data from my on-site NAS copied to my off-site NAS.

So, that being said, here are my questions (and I apologize if they're dumb).

I currently run two different backup programs. I use Acronis for a monthly "clone" of each machine (25 in all), and I use Cobian for a daily backup of drawings and documents. I also have a third pile of data, which is my archive of every job we've ever done (all raw data - PDFs, DWGs, TIFFs, XLS, everything).

Will Rsync be able to block level sync these files from Acronis and Cobian? The Cobian output is a simple *.zip file, but the Acronis is a *.TIB file. I do not have near the bandwidth to send my daily back-ups over the internet and even have them complete before the next day begins; let alone running a monthly clone in full. I know Rsync will do exactly what is desired for my archived jobs, as the data is raw, and it rarely gets changed.

My second question is, if I copy all of the data from NAS1 to NAS2 over my LAN, and then take NAS2 off-site, will Rsync still attempt to run a 100% full replication? Meaning, will it be able to "look ahead" and see the files are already sync'd, thus setting a baseline for future (and significantly smaller) Rsync jobs?

Again, my apologies for what may seem like a stupid question. I just do not wish to move all of this data multiple times if it is not required.

If I have left out any pertinent information, please ask, and I will check back often.

Thank you,

Blue
 
Old 10-24-2014, 03:38 PM   #2
suicidaleggroll
LQ Guru
 
Registered: Nov 2010
Location: Colorado
Distribution: OpenSUSE, CentOS
Posts: 5,573

Rep: Reputation: 2142
Quote:
Originally Posted by Bluehaze43 View Post
My second question is, if I copy all of the data from NAS1 to NAS2 over my LAN, and then take NAS2 off-site, will Rsync still attempt to run a 100% full replication? Meaning, will it be able to "look ahead" and see the files are already sync'd, thus setting a baseline for future (and significantly smaller) Rsync jobs?
That won't be a problem. rsync does not keep any kind of database of what it's synced in the past or to where it synced it, everything is done on the fly. When you initiate the rsync command, the first thing it does is build up a file/directory list on both the source and the destination to determine what files need to be sent (if any), and then it sends them. rsync doesn't care how those files on the destination got there, all it cares about is whether or not they need to be updated.
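
One way to check this before committing to a real transfer is a dry run. The sketch below assumes rsync is installed; `preview_sync` is only an illustrative helper name and the paths in the comment are hypothetical. The options used are standard rsync flags: `-a` (archive mode), `-n` (dry run) and `-i` (itemize changes).

```shell
#!/bin/sh
# preview_sync SRC DEST
# Lists what rsync would update, without copying or changing anything.
#   -a  archive mode (recurse, preserve times, permissions, and so on)
#   -n  dry run: report only, transfer nothing
#   -i  itemize: print one line per item that would be updated
preview_sync() {
    rsync -a -n -i "$1" "$2"
}

# Example call with hypothetical paths (note the trailing slash on the source):
# preview_sync /volume1/backups/ /mnt/nas2/backups/
```

If the destination was seeded correctly, the call should print little or nothing; every line it does print names an item that rsync considers out of date.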

Last edited by suicidaleggroll; 10-24-2014 at 03:40 PM.
 
1 members found this post helpful.
Old 10-24-2014, 03:40 PM   #3
Bluehaze43
LQ Newbie
 
Registered: Oct 2014
Location: Land of the cheese, and home of the beer.
Posts: 7

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by suicidaleggroll View Post
That won't be a problem. rsync does not keep any kind of database of what it's synced in the past or to where it synced it, everything is done on the fly. When you initiate the rsync command, the first thing it does is build up a file/directory list on both the source and the destination to determine what files need to be sent (if any), and then it sends them. rsync doesn't care how those files on the destination got there, all it cares about is whether or not they need to be updated.
Awesome, thank you for that information suicidaleggroll.

One down, one to go!
 
Old 10-24-2014, 04:06 PM   #4
jailbait
LQ Guru
 
Registered: Feb 2003
Location: Virginia, USA
Distribution: Debian 12
Posts: 8,340

Rep: Reputation: 550
Quote:
Originally Posted by Bluehaze43 View Post


Will Rsync be able to block level sync these files from Acronis and Cobian? The Cobian output is a simple *.zip file, but the Acronis is a *.TIB file. I do not have near the bandwidth to send my daily back-ups over the internet and even have them complete before the next day begins; let alone running a monthly clone in full. I know Rsync will do exactly what is desired for my archived jobs, as the data is raw, and it rarely gets changed.
I do not know anything about Acronis and Cobian so I cannot give you an authoritative answer. I suspect that your first set of rsync backups will have to be full backups. i.e. I doubt that rsync can start by using your current backup files as a base. Perhaps somebody else can give a better answer.


Quote:
Originally Posted by Bluehaze43 View Post

My second question is, if I copy all of the data from NAS1 to NAS2 over my LAN, and then take NAS2 off-site, will Rsync still attempt to run a 100% full replication?
rsync does not keep a master list of what has been previously copied. rsync determines what has been copied by checking the destination to see what is already there. Therefore if you take NAS2 off-site and substitute a new NAS2 then rsync will copy anything that is not on the current NAS2.

I get around this problem by using three off-site backup media in rotation. When I copy to an off-site backup rsync is working against a three day old backup. Therefore rsync copies every file that has changed in the last three days.

One of the reasons that I use a three day rotation for both my on-site and off-site backup is that I have rsync delete any file from the backup which is no longer on the original. That is a two edged sword. It keeps the backup from growing forever larger but it also means that I could lose a file from the backup if I don't discover the loss before the next backup. With a three day rotation I have three days to discover an accidental deletion before I lose the file entirely.

You could also continue your method of both a daily backup and a monthly backup with rsync. You would simply use the same rsync script for both. You could get by with a single monthly backup media but you should have at least two NAS2 media.


--------------------------
Steve Stites
 
1 members found this post helpful.
Old 10-24-2014, 04:17 PM   #5
Bluehaze43
LQ Newbie
 
Registered: Oct 2014
Location: Land of the cheese, and home of the beer.
Posts: 7

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by jailbait View Post
I do not know anything about Acronis and Cobian so I cannot give you an authoritative answer. I suspect that your first set of rsync backups will have to be full backups. i.e. I doubt that rsync can start by using your current backup files as a base. Perhaps somebody else can give a better answer.


rsync does not keep a master list of what has been previously copied. rsync determines what has been copied by checking the destination to see what is already there. Therefore if you take NAS2 off-site and substitute a new NAS2 then rsync will copy anything that is not on the current NAS2. Steve Stites
Will Rsync look inside of the *.zip file and only change what needs changing, or is it going to re-copy the entire changed zip file? If it will replace the whole zip file, I am doomed, and need to re-create our entire back-up process.


Quote:
Originally Posted by jailbait View Post
You could also continue your method of both a daily backup and a monthly backup with rsync. You would simply use the same rsync script for both. You could get by with a single monthly backup media but you should have at least two NAS2 media.


--------------------------
Steve Stites
I run Acronis, so that I have a "ready to go" complete clone of a user (created once per month) that I can have up and running - even on totally different hardware if need be; within a couple of hours. Augmented by their daily backup, which is merely the zip file, they'll never lose a thing (so long as I don't).

I run my on-site backup to a RAID 6 NAS, so I feel about as safe as I can there. I am very concerned about how I am going to get all of the information to my off-site NAS efficiently.

Thank you.

Last edited by Bluehaze43; 10-24-2014 at 04:20 PM. Reason: typos
 
Old 10-24-2014, 04:52 PM   #6
jailbait
LQ Guru
 
Registered: Feb 2003
Location: Virginia, USA
Distribution: Debian 12
Posts: 8,340

Rep: Reputation: 550
Quote:
Originally Posted by Bluehaze43 View Post
Will Rsync look inside of the *.zip file and only change what needs changing, or is it going to re-copy the entire changed zip file? If it will replace the whole zip file, I am doomed, and need to re-create our entire back-up process.
rsync will do block level syncing. Whether it will do so with files produced by Acronis and/or Cobian, I have no idea. If nobody else can give you a definitive answer to this question I suggest that you run some small scale tests to find out.
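
A small-scale test along those lines could look like the sketch below. It assumes rsync is installed; `delta_report` is an illustrative helper name, not part of rsync. When both paths are local, rsync defaults to copying whole files, so the sketch passes `--no-whole-file` to force the delta algorithm that is normally used over a network connection. How much gets reused will depend on how much of the archive's content actually changes between runs, which is not something this sketch can predict for Acronis or Cobian files.

```shell
#!/bin/sh
# delta_report NEW_FILE OLD_COPY
# Updates OLD_COPY from NEW_FILE and prints two lines from rsync's --stats:
#   "Literal data": bytes that had to be sent because they were new
#   "Matched data": bytes reused from the old copy at the destination
delta_report() {
    rsync --no-whole-file --stats "$1" "$2" | grep -E '^(Literal|Matched) data'
}

# Example call with hypothetical file names:
# delta_report today/backup.zip offsite-copy/backup.zip
```

A large "Matched data" figure relative to "Literal data" suggests that only the changed portions would cross the wire.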

Quote:
Originally Posted by Bluehaze43 View Post

I run Acronis, so that I have a "ready to go" complete clone of a user (created once per month) that I can have up and running - even on totally different hardware if need be; within a couple of hours. Augmented by their daily backup, which is merely the zip file, they'll never lose a thing (so long as I don't).
That's good. I would like to comment on compression. Compression saves you bandwidth and storage space. Compression also makes restores more complex and time consuming. Whether you use compression or not is a trade-off. Since you are very concerned with minimizing restore time you might consider not compressing the backup.

-------------------------
Steve Stites
 
1 members found this post helpful.
Old 10-26-2014, 07:18 PM   #7
mlp68
Member
 
Registered: Jun 2002
Location: NY
Distribution: Gentoo,RH
Posts: 333

Rep: Reputation: 40
I think you got the answers to at least one of the questions - few people seem to be familiar with your apparently commercial backup programs.

Let me point out that long ago I had a similar problem with barely enough resources to complete daily backups in time. Then I adopted a new scheme using rsync that allows me to keep *full* daily backups of several entire areas, users, database, software install, etc. The trick, easily accomplished with rsync (because it has been designed with this in mind), is that each day's backup only copies modified files as true new files. In that sense, it behaves like an incremental backup, but it's really not - the area for a given day has the complete set of files, so there is no need to restore the most recent full backup and overwrite with the incrementals.

My users appreciate this since I allow them read-only access to the daily backup area, so they can get back, in self-service mode, yesterday's (or any other day's) version of a file or a directory if they inadvertently deleted it, or modified a version that hadn't yet been checked in. I can usually keep the most recent 80 days of backups online, although that of course depends on the specific usage patterns.

Given the constraints you outline, maybe you should check this method out. I believe I posted about this here before, but I can explain the method (and share the script) if you think it's worthwhile. Please let me know.
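
The script itself is not shown in this thread, so the following is only a guess at the general idea, assuming it relies on rsync's `--link-dest` option. The function name `snapshot`, the date-named directory layout and the example paths are all hypothetical. With `--link-dest`, files that are unchanged since the previous snapshot are hard-linked to it rather than stored again, so each day's directory looks like a full copy.

```shell
#!/bin/sh
# snapshot SRC BACKUP_ROOT
# Creates BACKUP_ROOT/YYYY-MM-DD as a complete-looking copy of SRC.
snapshot() {
    src=$1
    mkdir -p "$2" || return 1
    # Use an absolute path: a relative --link-dest is interpreted
    # relative to the destination directory.
    root=$(cd "$2" && pwd) || return 1
    today=$(date +%Y-%m-%d)

    # Pick the last date-named directory other than today's. The glob
    # expands in sorted order, and these names sort chronologically.
    prev=
    for d in "$root"/????-??-??; do
        if [ -d "$d" ] && [ "$d" != "$root/$today" ]; then
            prev=$d
        fi
    done

    if [ -n "$prev" ]; then
        # Unchanged files become hard links into the previous snapshot.
        rsync -a --link-dest="$prev" "$src"/ "$root/$today"/
    else
        # First run: nothing to link against, so make a plain full copy.
        rsync -a "$src"/ "$root/$today"/
    fi
}

# Example call with hypothetical paths:
# snapshot /home /backup/daily
```

Hard links require that all snapshots live on the same filesystem, and old snapshots can be removed simply by deleting their directories.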

- mlp
 
1 members found this post helpful.
Old 10-30-2014, 11:42 AM   #8
Bluehaze43
LQ Newbie
 
Registered: Oct 2014
Location: Land of the cheese, and home of the beer.
Posts: 7

Original Poster
Rep: Reputation: 0
First, to address mlp68's post above; I am running all Windows machines (24 Win7 pro and 4 XP pro). I have no experience with Linux, and am learning Rsync (which is why I am looking for help/guidance here - Rsync being Linux based). Here is how far I have gotten as of today.

OK, I copied all of the files from my 1st NAS to my 2nd NAS over my LAN, and took the 2nd NAS off-site last night. There is about 3.5TB of data on each NAS now.

Mistake #1 was not setting the directory structure exactly the same on both units. This is going to become a nightmare to fix, but I think that I have to do it now; before I get even deeper. After I get the directories looking identical, I'll report back with my progress (or total lack thereof).

Rsync did just fine with the Cobian *.zip files. It took 26 minutes to run the original Rsync backup (490.28MB zip file), and only two minutes to Rsync it again after sending and receiving a few emails, adding a folder, and editing an Excel file (493.09MB zip file). This is great news, at least for my daily back-ups.

I am still left with the task of syncing everything the first time, as I don't believe that the two NAS units can be synced locally, and then remotely using the same task, as they are set up now. *facepalm* I can upload about 1.27GB per hour, which is really not all that bad in the scheme of things. It is painfully slow in the face of a first-time off-site replication.

Last edited by Bluehaze43; 10-30-2014 at 03:29 PM.
 
Old 11-06-2014, 03:10 PM   #9
Bluehaze43
LQ Newbie
 
Registered: Oct 2014
Location: Land of the cheese, and home of the beer.
Posts: 7

Original Poster
Rep: Reputation: 0
Ok, update (and slight review) time...


The review

I am trying to sync my local NAS (a Synology DS1513+) to my off-site NAS (a QNAP TS 219P II) using Rsync. I have no problems making the connection and transferring data, but here is my problem...

Also, on a side note, why can I not "browse" the destination directories using Rsync on my QNAP? This seems like something that should be available or easily handled, no? Each destination directory had to be set up as its own Rsync share in order to maintain the same directory structure.

The Update

I copied my data using Robocopy, from the Synology to the QNAP over my LAN (in-house) before moving the QNAP off-site (thank God for gigabit and aggregation!). Why does Rsync still want to re-copy everything the first time? It will take almost 80 days (at my throttled 125KB/s connection) just to copy my daily back-ups (which is actually an 8 day running rotation) the first time. I still need to sync my monthly clones (even larger file size), and all of my archives (even larger still file size - think TBs).
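
For a sense of scale, the arithmetic behind estimates like this can be sketched as below. `transfer_days` is an illustrative helper, and the example plugs in the 3.5TB total mentioned earlier in the thread together with the 125KB/s throttled rate. Decimal units are assumed (1 GB = 1,000,000 KB) and the result is rounded down to whole days.

```shell
#!/bin/sh
# transfer_days SIZE_GB RATE_KBPS
# Whole days needed to send SIZE_GB gigabytes at RATE_KBPS kilobytes/second.
# 1 GB is taken as 1,000,000 KB and a day as 86,400 seconds.
transfer_days() {
    echo $(( $1 * 1000000 / $2 / 86400 ))
}

transfer_days 3500 125   # 3.5 TB at 125 KB/s, prints 324
```

Working backwards with the same formula, 80 days at 125KB/s corresponds to roughly 860GB; that figure is derived here and is not stated in the post.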

As I said, ALL of the information was copied over using Robocopy, and maintaining the same exact directory structure; so why isn't Rsync seeing the directories match, and not copying anything?

There has to be some way to "trick" Rsync into thinking it has already completed the first round of remote sync, as neither I nor anyone else can possibly wait 80 days to sync a daily backup. It would never catch up to anything current.

How in the world do you folks deal with your first time off-site replications? I am graying and balding faster than I should over this.

Again, thanks in advance for any help!
 
Old 11-07-2014, 09:31 PM   #10
mlp68
Member
 
Registered: Jun 2002
Location: NY
Distribution: Gentoo,RH
Posts: 333

Rep: Reputation: 40
The way I see it, there are two possibilities. The first is that you are not rsync'ing to the right place, which is actually quite easy to get wrong. Let's say you are syncing from /source/a and already have a copy from a few days ago in place in /backup/a at the backup location. Now you want to update, and do

rsync -a /source/a/ /backup/

and you'll see that you copy everything all over again, because the contents of a land directly in /backup/ rather than in /backup/a. The right syntax would be

rsync -a /source/a /backup/

without the trailing "/" on the source.

The easiest way to figure this out is to rsync just one file, or a very small directory, and see where it actually ends up. My bet is that you "drop" the files in the wrong location.

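The other possibility is that the modification times on the backup don't match those at the source, for example because the clock of the backup system is way off (it would need to be in the past) or because the initial copy did not preserve timestamps. By default rsync decides by comparing size and modification time, so all files with a different date would be flagged as modified.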

If you don't make any headway, please send us the ls -l output from two small directories at the source and the backup (with full paths, and the associated rsync command) that you believe should not get copied again.
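
The trailing-slash difference can be tried safely in a throwaway directory, as in the sketch below. It assumes rsync and mktemp are available; `trailing_slash_demo` and the file name `plan.txt` are made up for illustration. Both comparisons use `-n` (dry run) with `-i` (itemize changes), so nothing is copied during the comparison itself.

```shell
#!/bin/sh
# trailing_slash_demo
# Builds a tiny source/a tree, seeds backup/a from it, then shows what
# rsync would transfer with and without a trailing slash on the source.
trailing_slash_demo() {
    work=$(mktemp -d) || return 1
    mkdir -p "$work/source/a" "$work/backup"
    echo "drawing" > "$work/source/a/plan.txt"

    # Seed the backup so that backup/a mirrors source/a.
    rsync -a "$work/source/a" "$work/backup/"

    echo "With trailing slash (contents of a land directly in backup/):"
    rsync -a -n -i "$work/source/a/" "$work/backup/"

    echo "Without trailing slash (a is matched against backup/a):"
    rsync -a -n -i "$work/source/a" "$work/backup/"

    rm -rf "$work"
}

# Run the demonstration with:
# trailing_slash_demo
```

In the first case rsync lists plan.txt as a new file to send; in the second it should list nothing, because backup/a already matches.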

- mlp
 
  


All times are GMT -5. The time now is 11:22 AM.
