LinuxQuestions.org
Linux - General This Linux forum is for general Linux questions and discussion.
If it is Linux Related and doesn't seem to fit in any other forum then this is the place.

Old 09-10-2012, 11:34 AM   #1
MarcusWebb1966
Member
 
Registered: Mar 2011
Posts: 58

Rep: Reputation: 0
rsync problem - syncs all files


Ok, so this one has me flummoxed more than somewhat.

I'm supposed to sync a folder (including sub-folders and contents) between one machine and another. Simple, and straightforward, right? Well, here's where it gets fun. rsync is supposed to sync the difference between the two folders (and their sub-folders and contents).

What this rsync command seems to do is re-sync all folders and contents regardless of whether anything has changed.

Note: As this is used to keep images (which aren't small things at the best of times) in sync between web servers, it's set in a cron job to run every minute, but fails to complete in less than four minutes. The server load creeps up, the server falls over, etc.

Here's the command I'm running:
Code:
rsync -av0L --stats 'server1.co.uk:/home/dev/public/images/' '/home/dev/public/images'
And this is the result every time.
Code:
Number of files: 80643
Number of files transferred: 14779
Total file size: 5281629189 bytes
Total transferred file size: 916813386 bytes
Literal data: 9320 bytes
Matched data: 916804066 bytes
File list size: 2953774
File list generation time: 0.001 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 6265807
Total bytes received: 7531562

sent 6265807 bytes  received 7531562 bytes  72048.92 bytes/sec
total size is 5281629189  speedup is 382.80

real    3m10.737s
user    0m7.012s
sys     0m4.468s
I've checked file ownership, and indeed set file ownership to be identical across both servers. And now the time stamps have been set to within seconds of each other, and it's just made things worse! HELP!
 
Old 09-10-2012, 01:55 PM   #2
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: Rocky Linux
Posts: 4,777

Rep: Reputation: 2212
If the timestamps are not identical, then both the source and destination have to spend time generating and comparing checksums to determine how much changed data needs to be sent (just 9320 bytes in the case you presented). If the destination is on any variety of FAT filesystem, its timestamps have a resolution of 2 seconds, and any source timestamp that falls on an odd number of seconds can never match. You can use rsync's "--modify-window=NUM" option to compare modification times with reduced accuracy. Setting that to 2 seconds is generally sufficient, but you might have to go to 3602 seconds to get around problems with Daylight Saving Time changes.
 
2 members found this post helpful.
Old 09-10-2012, 01:55 PM   #3
AlucardZero
Senior Member
 
Registered: May 2006
Location: USA
Distribution: Debian
Posts: 4,824

Rep: Reputation: 615
Is either the source or destination filesystem FAT32?

Quote:
Originally Posted by man rsync
--modify-window
When comparing two timestamps, rsync treats the timestamps as being equal if they
differ by no more than the modify-window value. This is normally 0 (for an exact
match), but you may find it useful to set this to a larger value in some
situations. In particular, when transferring to or from an MS Windows FAT
filesystem (which represents times with a 2-second resolution),
--modify-window=1 is useful (allowing times to differ by up to 1 second).
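
As a rough illustration of the rule quoted above (not rsync's actual code), the comparison can be sketched in shell. The function name mtimes_equal is made up for this example; with rsync itself you would simply add --modify-window=1 (or a larger value) to the command line.

```bash
#!/usr/bin/env bash
# mtimes_equal: hypothetical helper mirroring the --modify-window rule.
#   $1, $2 = modification times in whole seconds since the epoch
#   $3     = window in seconds (defaults to 0, i.e. exact match)
# Returns success (0) when the two times differ by no more than the window.
mtimes_equal() {
    local a=$1 b=$2 window=${3:-0}
    local diff=$(( a > b ? a - b : b - a ))   # absolute difference
    (( diff <= window ))
}

# With GNU coreutils, a file's mtime in seconds can be read with:
#   stat -c %Y /path/to/file
```

A source time on an odd second and its FAT copy on the even second below it differ by 1, so they compare equal with a window of 1 but not with the default of 0.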
 
1 members found this post helpful.
Old 09-11-2012, 03:20 AM   #4
MarcusWebb1966
Member
 
Registered: Mar 2011
Posts: 58

Original Poster
Rep: Reputation: 0
Hi rknichols and AlucardZero.

Thanks to both of you for your contributions. Thankfully, or unfortunately depending on your viewpoint, neither of the servers runs a FAT filesystem; they are both ext3. I have tried with --modify-window set to 3602 just in case we had something stupid going on with the timezones, but that isn't the issue. I also tried (purely to see if there was an issue with the timezones) setting them both to UTC and then using NTP to get the clocks in sync. None of the above actions has helped.

Keep those suggestions rolling in, please! :-)
 
Old 09-11-2012, 03:52 AM   #5
chandhokshashank
LQ Newbie
 
Registered: Sep 2011
Posts: 20

Rep: Reputation: Disabled
How far apart are these machines? And how is the data being transferred? I mean, over a LAN or a WAN?
 
Old 09-11-2012, 10:02 PM   #6
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: Rocky Linux
Posts: 4,777

Rep: Reputation: 2212
Do the timestamps match after rsync has done its thing? That should happen regardless of whether the two clocks are in sync since rsync will use the utime() system call to copy the timestamp from the source to the destination.

You could try using the "--itemize-changes" option to see what differences rsync believes exist. (You'll need the manpage to interpret the result.)
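
For orientation before reaching for the manpage: in rsync 3.x each line of --itemize-changes output starts with a short code that the manpage describes as YXcstpoguax. The helper below is only a sketch that spells out the first eight positions; the name explain_itemize and the output words are invented here, and the manpage remains the authoritative reference.

```bash
#!/usr/bin/env bash
# explain_itemize: hypothetical decoder for one --itemize-changes code,
# e.g. ">f..t......". Prints the update type, the file type and the
# attributes flagged as different, separated by spaces.
explain_itemize() {
    local code=$1 out=() i flag
    local letters=(c s t p o g)
    local names=(checksum size mtime perms owner group)

    case ${code:0:1} in            # Y: what is happening to the item
        '<') out+=(sent) ;;
        '>') out+=(received) ;;
        c)   out+=(created) ;;
        h)   out+=(hardlink) ;;
        .)   out+=(not-transferred) ;;
    esac
    case ${code:1:1} in            # X: kind of item
        f) out+=(file) ;;
        d) out+=(directory) ;;
        L) out+=(symlink) ;;
        D) out+=(device) ;;
        S) out+=(special) ;;
    esac
    if [[ ${code:2:1} == "+" ]]; then
        out+=(new-item)            # "+++++++++" marks a newly created item
    else
        for i in "${!letters[@]}"; do
            flag=${code:i+2:1}
            # "T" in the time slot means mtime was set to the transfer time.
            if [[ $flag == T ]]; then flag=t; fi
            if [[ $flag == "${letters[i]}" ]]; then
                out+=("${names[i]}")
            fi
        done
    fi
    echo "${out[*]}"
}

explain_itemize '>f..t......'   # prints: received file mtime
```

A line such as ".f...p.g..." would decode as "not-transferred file perms group", meaning no data is sent but permissions and group are being adjusted.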
 
1 members found this post helpful.
Old 09-12-2012, 05:06 AM   #7
MarcusWebb1966
Member
 
Registered: Mar 2011
Posts: 58

Original Poster
Rep: Reputation: 0
Hi

chandhokshashank - LAN connection, although I'm not sure of the relevance
rknichols - the timestamps do appear to be the same, so no issues there. However, I'll try the --itemize-changes option with the man page as suggested. This week seems to be rsync week for me, so nothing to lose, etc.

Many thanks for suggestions so far, and not to appear ungrateful, but do keep them coming in, cos I read every one of them. I really do appreciate it, and I'll keep you all posted.
 
Old 09-12-2012, 06:52 AM   #8
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Rocky 9.2
Posts: 18,359

Rep: Reputation: 2751
1. Although you can specify the start time of a cron job down to the minute, creating a new process every minute hammers the system, as it has to create an entire new environment each time.
It's also quite common to end up having them trip over each other (as you've seen).
2. My rule of thumb is to write a daemon for anything more frequent than every 5 minutes and just have it sleep at the bottom of the loop for however long you'd like.
This dramatically lessens the load and also prevents the problem of multiple copies running.
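
A minimal sketch of the loop-and-sleep approach from point 2, written as a shell function rather than a full daemon. The name sync_loop and its run-count argument are assumptions made for illustration (the count exists only so the loop can be tried out safely). Because each run starts only after the previous one has finished, two copies of the command never overlap.

```bash
#!/usr/bin/env bash
# sync_loop: hypothetical helper that runs a command, sleeps, and repeats.
#   $1   = seconds to sleep between runs
#   $2   = number of runs (0 means keep going until killed)
#   $3.. = the command to run, with its arguments
sync_loop() {
    local interval=$1 max_runs=$2 runs=0
    shift 2
    while :; do
        # Report a failed run but keep looping.
        "$@" || echo "sync command exited with status $?" >&2
        runs=$((runs + 1))
        if (( max_runs > 0 && runs >= max_runs )); then
            break
        fi
        sleep "$interval"          # the "sleep at the bottom of the loop"
    done
}
```

Started once (for example from an init script instead of cron), something like `sync_loop 60 0 rsync -a server1.co.uk:/home/dev/public/images/ /home/dev/public/images` would sync, wait a minute, and sync again.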
 
2 members found this post helpful.
Old 09-12-2012, 08:19 AM   #9
SecretCode
Member
 
Registered: Apr 2011
Location: UK
Distribution: Kubuntu 11.10
Posts: 562

Rep: Reputation: 102
As rknichols observed, the literal data sent is just 9320 bytes, so rsync is definitely not "re-syncing all folders and contents irrespective" as you suggested. However, it's comparing about 20% of the 80,000 files so the question is why. --itemize-changes will give you a clue, I suppose.

If it's not the time stamp, my guess is that the owner or group for some of the files is not getting transferred - perhaps because of permissions on the target.

Maybe it takes 3 minutes just to scan through the 80,000 files.
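
The "about 20%" figure can be checked against the --stats output in the first post: 14779 of 80643 files is roughly 18%. A small sketch, assuming the --stats line labels shown in that post (other rsync versions word and format these lines differently); transferred_percent is a name invented here.

```bash
#!/usr/bin/env bash
# transferred_percent: hypothetical helper that reads rsync --stats output
# on stdin and prints the whole-number percentage of files transferred.
transferred_percent() {
    awk -F': ' '
        /^Number of files:/             { total = $2 + 0 }
        /^Number of files transferred:/ { sent  = $2 + 0 }
        END { if (total > 0) printf "%d\n", int(sent * 100 / total) }
    '
}

printf 'Number of files: 80643\nNumber of files transferred: 14779\n' |
    transferred_percent          # prints: 18
```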
 
1 members found this post helpful.
Old 09-13-2012, 03:23 AM   #10
MarcusWebb1966
Member
 
Registered: Mar 2011
Posts: 58

Original Poster
Rep: Reputation: 0

Quick reply, as yesterday was spent managing server performance due to rsync issues (told you it was one of those weeks!).

chrism01, your point sounds very interesting. I don't suppose you'd have a link to hand about creating daemons, would you? I'd surely appreciate it.

SecretCode and rknichols, now that these servers seem more stable, I'm going to do an in-depth investigation of --itemize-changes to understand this better and move myself a step closer to being an rsync guru!

Kudos and ratings to all of you wonderful contributors, and as before, keep them coming...
 
Old 09-13-2012, 05:40 AM   #11
MarcusWebb1966
Member
 
Registered: Mar 2011
Posts: 58

Original Poster
Rep: Reputation: 0
Further update now that I'm back and looking at it again:

Major props to rknichols and SecretCode for re-prompting - now I understand --itemize-changes. However having done the following:
Code:
chmod -R 770 /path/to/images
and
Code:
chown -R owner:group /path/to/images
on both servers, I'd have thought that this would have meant the timestamps were now the only issue, but on running the latest
Code:
rsync --itemize-changes
I find that there are still files within this folder that are flagged with permission differences, so they are being re-synced on the basis of both timestamps and permissions.

What the hell am I missing here?
 
Old 09-13-2012, 06:15 AM   #12
MarcusWebb1966
Member
 
Registered: Mar 2011
Posts: 58

Original Poster
Rep: Reputation: 0
Well, I've done the above and now run rsync twice. I'm most definitely confused by this - any help truly appreciated.

Last edited by MarcusWebb1966; 09-13-2012 at 07:08 AM. Reason: premature response - modded to show actual result
 
Old 09-13-2012, 08:42 AM   #13
MarcusWebb1966
Member
 
Registered: Mar 2011
Posts: 58

Original Poster
Rep: Reputation: 0
I'm not convinced that this is part of the cause of the problem, but I thought I'd throw it in there. Possible red herring alert

User is called owner, group is group.

On server1, owner is listed in /etc/passwd as owner:x:1000:1000::/home/owner
On the other server, owner is listed in /etc/passwd as owner:x:1000:441::/home/owner

Both these GIDs are for the group group
Is it possible that this discrepancy is the cause of the files with permission issues?
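
To make the comparison concrete: the fourth colon-separated field of an /etc/passwd entry is the user's primary GID. A small sketch that pulls it out of the two lines quoted above (gid_from_passwd_line is a hypothetical helper, not a standard tool):

```bash
#!/usr/bin/env bash
# gid_from_passwd_line: hypothetical helper that prints the primary GID
# (fourth field) of a single /etc/passwd-style line passed as $1.
gid_from_passwd_line() {
    local name pw uid gid rest
    IFS=: read -r name pw uid gid rest <<< "$1"
    echo "$gid"
}

gid_from_passwd_line 'owner:x:1000:1000::/home/owner'   # prints: 1000
gid_from_passwd_line 'owner:x:1000:441::/home/owner'    # prints: 441
```

On a live system, running `getent group group` on each server should show which numeric GID the group name maps to there.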
 
Old 09-13-2012, 01:18 PM   #14
SecretCode
Member
 
Registered: Apr 2011
Location: UK
Distribution: Kubuntu 11.10
Posts: 562

Rep: Reputation: 102
Could well be ... I'll bet that rsync compares numeric group ids not group names, and (unless you have an LDAP system in place) there's no guarantee these match between different hosts.

Try a test with (I think) --no-g
 
1 members found this post helpful.
Old 09-13-2012, 04:46 PM   #15
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: Rocky Linux
Posts: 4,777

Rep: Reputation: 2212
Quote:
Originally Posted by SecretCode
Could well be ... I'll bet that rsync compares numeric group ids not group names,
No, as long as the same name exists on both machines, rsync defaults to comparing the names, not the numbers, with the exception of UID 0 and GID 0. See the section for "--numeric-ids" in the manpage.
 
2 members found this post helpful.
  


All times are GMT -5. The time now is 08:48 AM.
