LinuxQuestions.org
Old 09-14-2017, 03:46 PM   #1
doetoe
LQ Newbie
 
Registered: Sep 2017
Location: Barcelona
Posts: 6

Rep: Reputation: Disabled
What can I do to make rsync identify two equal files as equal?


I have a directory on my disk drive with all my photos, and a partial backup with the same directory structure on an external FAT drive. I wanted to use rsync to copy all changes to the external drive, but it doesn't seem to detect that most of the files are equal:

Code:
$ rsync -va --list-only /home/pictures/fotoos /media/doetoe/EXTERNAL_FA/backup-pictures/fotoos > fotos.lst
$ wc -l fotos.lst
36485 fotos.lst
36485 is close to the number of photos + directories in the source directory:

Code:
$ find /home/pictures/fotoos/ | wc -l
36481
so it looks like all photos would get copied (if I correctly understand the --list-only option).

Randomly picking two files that should be equal:

Code:
$ md5sum /home/pictures/fotoos/20130109/P1140467.JPG
858c00056ff77c881a7904e1b2a564a5  /home/pictures/fotoos/20130109/P1140467.JPG

$ md5sum /media/doetoe/EXTERNAL_FA/backup-pictures/fotoos/20130109/P1140467.JPG
858c00056ff77c881a7904e1b2a564a5  /media/doetoe/EXTERNAL_FA/backup-pictures/fotoos/20130109/P1140467.JPG
Looks good. Also

Code:
$ stat /home/pictures/fotoos/20130109/P1140467.JPG
  File: '/home/pictures/fotoos/20130109/P1140467.JPG'
  Size: 940113          Blocks: 1840       IO Block: 4096   regular file
Device: 805h/2053d      Inode: 23462448    Links: 1
Access: (0644/-rw-r--r--)  Uid: ( 1001/  silvia)   Gid: ( 1001/  silvia)
Access: 2017-09-14 21:17:51.308387352 +0200
Modify: 2013-01-09 09:23:04.000000000 +0100
Change: 2013-01-09 21:11:44.541059182 +0100
 Birth: -

$ stat /media/doetoe/EXTERNAL_FA/backup-pictures/fotoos/20130109/P1140467.JPG
  File: '/media/doetoe/EXTERNAL_FA/backup-pictures/fotoos/20130109/P1140467.JPG'
  Size: 940113          Blocks: 1856       IO Block: 16384  regular file
Device: 852h/2130d      Inode: 59476       Links: 1
Access: (0644/-rw-r--r--)  Uid: ( 1000/  doetoe)   Gid: (  100/   users)
Access: 2017-09-14 21:17:57.000000000 +0200
Modify: 2013-01-09 08:23:04.000000000 +0100
Change: 2016-01-02 18:57:47.000000000 +0100
 Birth: -
Minor differences, mainly the user, group and the low-level file size.

I tried changing the flags so that rsync does not try to preserve users and groups, but that doesn't seem to make a difference:

Code:
$ rsync -vr --list-only /home/pictures/fotoos /media/doetoe/EXTERNAL_FA/backup-pictures/fotoos > fotos.lst
$ wc -l fotos.lst
36485 fotos.lst
Does anyone know what is going on and what I can do to only copy new/changed files?
 
Old 09-14-2017, 04:29 PM   #2
michaelk
Moderator
 
Registered: Aug 2002
Posts: 16,349

Rep: Reputation: 1908
The most likely answer is that fotos.lst contains extra lines with rsync statistics in addition to the file names.
Code:
sending incremental file list
Directory name
file names
.
.
.
blank line
sent x bytes  received y bytes  zzzz.00 bytes/sec
total size is xyz speedup is abc
Since the file contains many lines, you can inspect its beginning and end with the head and tail commands:
head fotos.lst
tail fotos.lst
 
Old 09-14-2017, 04:42 PM   #3
IsaacKuo
Senior Member
 
Registered: Apr 2004
Location: Baton Rouge, Louisiana, USA
Distribution: Debian 9 Stretch
Posts: 2,279
Blog Entries: 8

Rep: Reputation: 362
The fundamental issue is that FAT only stores modification times with a resolution of 2 seconds. Therefore, rsync will often not see the copies as having the same timestamps as the originals. (NTFS stores timestamps at a much finer resolution, so it should not show this particular rounding problem.)

To solve this, use the "modify-window" flag. For example:

Code:
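# Default quick check: size and modification time must match exactly.
rsync -tvr share/. bak/
# Treat modification times up to 5 seconds apart as equal.
rsync -tvr --modify-window=5 share/. bak/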
If some files are updated very often without changing size, this introduces a risk that an updated file is skipped. For example, consider the following sequence of events:

1) Jo opens up a text editor for a config file.

2) Jo makes a change.

3) Jo saves it.

4) An rsync backup is made.

5) Jo sees a typo and changes it and saves it again within 4 seconds; the resulting file is still the same size.

6) The next time an rsync backup is done, it mistakenly skips the changed file, so it has the typo instead of the correction.

If this is a serious concern for you, then maybe tighten up the modify-window to 2 seconds (I think that should be good enough). I use a modify-window of 5 seconds because this scenario just isn't one I'm worried about.
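
To make the window comparison concrete, here is a minimal shell sketch of the idea. It is an illustration only, not rsync's actual code: the helper names fat_round and within_window are made up for this example, and rounding down to an even second is a simplifying assumption about how the FAT copy ends up with its time stamp.

```shell
#!/bin/sh
# Illustration only: mimics the comparison idea behind --modify-window.
# fat_round and within_window are hypothetical helper names.

# Round an epoch timestamp down to an even second (assumed FAT behaviour).
fat_round() {
    echo $(( $1 - $1 % 2 ))
}

# Succeed if two timestamps differ by at most $3 seconds.
within_window() {
    diff=$(( $1 - $2 ))
    [ "$diff" -lt 0 ] && diff=$(( 0 - diff ))
    [ "$diff" -le "$3" ]
}

src_mtime=1357719785                 # an odd second on the source file system
dst_mtime=$(fat_round "$src_mtime")  # what the FAT copy is assumed to store

if within_window "$src_mtime" "$dst_mtime" 0; then
    echo "window 0: same"
else
    echo "window 0: different"
fi

if within_window "$src_mtime" "$dst_mtime" 2; then
    echo "window 2: same"
else
    echo "window 2: different"
fi
```

With an exact comparison the one-second rounding error makes the two files look different, while a window of 2 seconds absorbs it. The same logic shows why a wide window can hide a real edit made within that window.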
 
1 members found this post helpful.
Old 09-14-2017, 04:59 PM   #4
michaelk
Moderator
 
Registered: Aug 2002
Posts: 16,349

Rep: Reputation: 1908
In addition, you can try the --size-only option, but the --modify-window option should work.
 
1 members found this post helpful.
Old 09-14-2017, 08:00 PM   #5
DVOM
Member
 
Registered: Nov 2010
Posts: 172

Rep: Reputation: 37
Quote:
Originally Posted by doetoe
Code:
$ rsync -va --list-only /home/pictures/fotoos /media/doetoe/EXTERNAL_FA/backup-pictures/fotoos > fotos.lst
$ wc -l fotos.lst
36485 fotos.lst
Does anyone know what is going on and what I can do to only copy new/changed files?
I use grsync for this. I experimented with it when I first got it and discovered that you need a "/" after the two "fotoos" folders. So the code above would become this:

Code:
rsync -va --list-only /home/pictures/fotoos/ /media/doetoe/EXTERNAL_FA/backup-pictures/fotoos/ > fotos.lst
Without the trailing "/" on the source, rsync would create another complete fotoos folder inside the destination folder.
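
The trailing-slash behaviour can be tried out safely on throwaway directories. The following is a small sketch under that assumption: every path under $work is invented for the demo and has nothing to do with the real photo directories.

```shell
#!/bin/sh
# Sketch of the trailing-slash rule, using throwaway directories
# (the names under $work are invented for this demo).
command -v rsync >/dev/null 2>&1 || { echo "rsync not installed, skipping demo"; exit 0; }

work=$(mktemp -d)
mkdir -p "$work/src/fotoos" "$work/dst_slash/fotoos" "$work/dst_noslash/fotoos"
echo demo > "$work/src/fotoos/a.jpg"

# Source ends in "/": the contents of fotoos are copied into the destination.
rsync -r "$work/src/fotoos/" "$work/dst_slash/fotoos/"

# Source has no trailing "/": the directory itself is copied,
# so a second fotoos appears inside the destination.
rsync -r "$work/src/fotoos" "$work/dst_noslash/fotoos/"

# List the resulting files relative to $work.
(cd "$work" && find dst_slash dst_noslash -type f | sort)
```

The listing should show a.jpg directly under dst_slash/fotoos, but under dst_noslash/fotoos/fotoos in the second case. As far as rsync is concerned, only the slash on the source changes the result; the one on the destination is harmless.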

Last edited by DVOM; 09-14-2017 at 08:06 PM.
 
1 members found this post helpful.
Old 09-16-2017, 06:55 AM   #6
doetoe
LQ Newbie
 
Registered: Sep 2017
Location: Barcelona
Posts: 6

Original Poster
Rep: Reputation: Disabled
Thank you all! A combination of your inputs solved it for me. First, I think I misinterpreted the --list-only switch: it looks like it just lists all the source files, regardless of whether any data would actually be transferred. The problem of identical files being copied was still there, though; this was just not the right way to detect it.
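
For previewing what rsync would really transfer, a dry run is probably closer to what was intended than --list-only. The sketch below uses the standard --dry-run and --itemize-changes options on two throwaway directories; the directory and file names are invented here, and the one-hour gap between the time stamps imitates the stat output from the first post.

```shell
#!/bin/sh
# Sketch: preview transfers with --dry-run instead of --list-only.
# Directory and file names are invented for the demo.
command -v rsync >/dev/null 2>&1 || { echo "rsync not installed, skipping demo"; exit 0; }

demo=$(mktemp -d)
mkdir -p "$demo/src" "$demo/dst"
echo "same content" > "$demo/src/photo.jpg"
cp "$demo/src/photo.jpg" "$demo/dst/photo.jpg"

# Same size and content, but modification times one hour apart.
touch -t 201301090923.04 "$demo/src/photo.jpg"
touch -t 201301090823.04 "$demo/dst/photo.jpg"

# Default quick check (size + mtime): the file is flagged for transfer.
# In the itemized output, lines starting with ">f" mark files whose data would be sent.
default_run=$(rsync -rt --dry-run --itemize-changes "$demo/src/" "$demo/dst/")

# --size-only ignores the time stamps when deciding whether to send data.
size_only_run=$(rsync -rt --size-only --dry-run --itemize-changes "$demo/src/" "$demo/dst/")

echo "default quick check would send: $(printf '%s\n' "$default_run" | grep -c '^>f') file(s)"
echo "--size-only would send:         $(printf '%s\n' "$size_only_run" | grep -c '^>f') file(s)"
```

Because of --dry-run nothing is written, so the same pair of commands can be pointed at real directories to compare the two strategies before committing to one.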

Second, without the trailing slashes it always seems to make a copy, so I added those.

I did indeed observe discrepancies of up to 2 seconds in the time stamps between the two file systems. However, I also had files where the time was off by exactly an hour (due to DST?) and others where the time was just very different, so in the end I went with the other option, to only consider sizes. Since these are only photos and videos, I assume it will be very rare to have an edited photo of exactly the same size.
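
The exact one-hour offsets are consistent with FAT storing local time without time zone information, which is a likely (though here unconfirmed) explanation for the DST suspicion. As a rough sketch, the gaps could be sorted into the three cases described above like this; classify_gap is a hypothetical helper, and the tolerance of 2 seconds around the hour is an assumption to allow for FAT rounding.

```shell
#!/bin/sh
# Sketch: classify the gap between two modification times (in seconds).
# classify_gap is a hypothetical helper, not part of rsync.
classify_gap() {
    gap=$(( $1 - $2 ))
    [ "$gap" -lt 0 ] && gap=$(( 0 - gap ))
    if [ "$gap" -le 2 ]; then
        echo "fat-rounding"       # within FAT's 2-second resolution
    elif [ "$gap" -ge 3598 ] && [ "$gap" -le 3602 ]; then
        echo "one-hour-offset"    # a whole hour, give or take FAT rounding
    else
        echo "different"
    fi
}

# 09:23:04 vs 08:23:04 on the same day, as in the stat output of post #1,
# expressed as seconds since midnight.
src=$(( 9 * 3600 + 23 * 60 + 4 ))
dst=$(( 8 * 3600 + 23 * 60 + 4 ))
classify_gap "$src" "$dst"
```

On real files, the modification time in seconds since the epoch can be read with GNU stat, for example stat -c %Y FILE, and fed to the same function.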

Thanks again!
 
Old 09-17-2017, 10:00 AM   #7
sundialsvcs
LQ Guru
 
Registered: Feb 2004
Location: SE Tennessee, USA
Distribution: Gentoo, LFS
Posts: 8,624
Blog Entries: 4

Rep: Reputation: 2998
rsync also has a --checksum option, which forces it to calculate and to compare a hash of the file contents, instead of relying on modification-date and file-size as a quick check. (The algorithm is not actually a "checksum.")
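
The idea of comparing by content rather than by size and date can be sketched in plain shell. This is not how rsync implements --checksum: same_content is a hypothetical helper, and cksum (a simple POSIX CRC, so a real checksum) merely stands in for whatever hash rsync uses internally.

```shell
#!/bin/sh
# Sketch: compare two files by content hash rather than by size and mtime.
# same_content is a hypothetical helper; cksum stands in for rsync's hash.
same_content() {
    a=$(cksum < "$1")
    b=$(cksum < "$2")
    [ "$a" = "$b" ]
}

tmp=$(mktemp -d)
printf 'pixels' > "$tmp/original.jpg"
printf 'pixels' > "$tmp/backup.jpg"
printf 'pixelz' > "$tmp/edited.jpg"    # same size, different content

# An older mtime on the backup does not matter for a content comparison.
touch -t 201301090823.04 "$tmp/backup.jpg"

same_content "$tmp/original.jpg" "$tmp/backup.jpg" && echo "backup: identical"
same_content "$tmp/original.jpg" "$tmp/edited.jpg" || echo "edited: differs"
```

The edited file has the same size as the original, so a size-only comparison would miss it, while the content comparison catches it at the cost of reading every file in full.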
 
1 members found this post helpful.
Old 09-17-2017, 11:06 AM   #8
IsaacKuo
Senior Member
 
Registered: Apr 2004
Location: Baton Rouge, Louisiana, USA
Distribution: Debian 9 Stretch
Posts: 2,279
Blog Entries: 8

Rep: Reputation: 362
That option is good for using rsync over the internet or another slow connection. But for a local drive or LAN, you might as well just copy over all files, considering the time it takes to read every file on both sides and compare them.
 
  



Tags
back-up, fat, rsync

