LinuxQuestions.org


toaster.waffle 04-19-2008 07:05 PM

rsync include/exclude problems [solved]
 
Hey,

I'm trying to set up an rsync backup for my laptop and I've hit a small problem. I want to back up certain subfolders of Library, and only those subfolders.

This is my exclude-from file:

Code:

...
# Exclude all of Library except some...
+ /Library/Application Support/Adium 2.0/
+ /Library/Application Support/Firefox/
+ /Library/Application Support/Cyberduck/
+ /Library/Application Support/Camino/
+ /Library/Application Support/iTerm/
- /Library/
...

However, nothing in Library is backed up.
I've also tried

- /Library
- /Library/*
- /Library/**

I placed each of these both before and after the include lines, with no success.

If it helps I'm executing the following rsync command:

Code:

/usr/local/bin/rsync -e 'ssh -x -o Compression=no -p '$DEST_PORT''
  $@ --archive --verbose --progress --stats --exclude-from=$EXCLUDE_FILE
  $SRC $DEST_USER@$DEST:$DEST_FOLDER

Any help is appreciated.

bgoodr 04-20-2008 09:35 PM

Quote:

Originally Posted by toaster.waffle (Post 3126528)
Code:

/usr/local/bin/rsync -e 'ssh -x -o Compression=no -p '$DEST_PORT''
  $@ --archive --verbose --progress --stats --exclude-from=$EXCLUDE_FILE
  $SRC $DEST_USER@$DEST:$DEST_FOLDER


I feel your pain. rsync's syntax is quirky (and there are good reasons for that). I've loaded my shotgun, and will keep shooting in random directions until I hit the varmint, as follows:
  1. Watch out for the significance of the trailing forward slash on directory paths, both the source and destination paths on the command line and the paths in the include or exclude file you pass to rsync. Read the manual (not necessarily just the man page) to get clear on what it means. What can happen is that rsync runs without failure but doesn't really do anything. I've been burned by that many more times than I'd like to admit. What I think you want is a trailing slash on every directory you want to transfer, since usually you intend to transfer all the files underneath those directories.
  2. Be careful with the shell syntax (which is separate from the rsync syntax): the way you have written that command may not work as you expect if the value of $DEST_PORT contains spaces. I believe the two single-quote characters after $DEST_PORT in your command really don't do anything for you. What I would recommend is wrapping the variable in double quotes right after the single-quoted text, so that the shell does not treat embedded spaces in $DEST_PORT as separators between command-line arguments. Try something like this:
    Code:

    /usr/local/bin/rsync -e 'ssh -x -o Compression=no -p '"$DEST_PORT" $@ --archive --verbose --progress --stats --exclude-from=$EXCLUDE_FILE $SRC $DEST_USER@$DEST:$DEST_FOLDER
    Notice that the $DEST_PORT variable reference sits outside the single quotes and is wrapped in double quotes, so that the calling shell expands it while keeping any embedded spaces within its value.
  3. In your script in which you have that rsync command, turn on verbose command output with
    Code:

    set -x
    so that you can see what the shell (not rsync) is interpreting as it executes the command. If that isn't enough, try using my spitargs.pl utility shown below to diagnose it further.
  4. Look at the rsync manual again for the syntax of the include and exclude paths. I believe the ordering of the expressions matters greatly. I think you put the includes before the excludes, but that is just from memory so it may be incorrect.
  5. Try the rsync command locally on the machine from one set of directories to another set, where the directories are local. Get that working first, then add in the complexity of the -e option. This is to eliminate any confusion that ssh adds in.
  6. When you say "laptop", perhaps that laptop is running Linux, or perhaps it is running Cygwin, where /Library is really rooted in some directory underneath the C:/ drive rather than directly under C:/ as you and I would naturally expect. I think this is what puzzled me a long while back when I was trying to sync my Windows files to my Linux box in this fashion, until it dawned on me to play with the cygpath command (provided by Cygwin) to figure out where the paths lived as expressed in Windows parlance.
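
To make point 1 concrete, here is a minimal local sketch of what the trailing slash on the source path changes. It also follows point 5 by leaving ssh out of the picture. The scratch directory and the names src, dest1, dest2 and note.txt are made up for illustration, and the script assumes a stock rsync on the PATH.

```shell
#!/bin/sh
# Sketch only: everything lives in a throwaway directory created by mktemp.
command -v rsync >/dev/null 2>&1 || { echo "rsync not found, skipping"; exit 0; }

work=$(mktemp -d)
mkdir -p "$work/src"
echo hello > "$work/src/note.txt"

# No trailing slash on the source: the directory itself is copied,
# so the file lands in dest1/src/note.txt.
rsync --archive "$work/src" "$work/dest1/"

# Trailing slash on the source: only the contents are copied,
# so the file lands in dest2/note.txt.
rsync --archive "$work/src/" "$work/dest2/"

# List what arrived in each destination.
(cd "$work" && find dest1 dest2 -type f | sort)
```

The same distinction decides what a leading / in a filter pattern is anchored to: with src/ as the source, /Library means src/Library, whereas with src the pattern would have to read /src/Library.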

bgoodr

P.S.: Below is my handy-dandy command-line inspection utility that I call "spitargs.pl" because it simply "spits out" the args that are passed on the command line when you execute it, but does not actually execute the command. For example, I would use it with your command like this to see how the shell is interpreting things (the backslashed newlines are my own addition to make it look purty):
Code:

perl spitargs.pl /usr/local/bin/rsync \
  -e 'ssh -x -o Compression=no -p '$DEST_PORT'' \
  $@ --archive --verbose --progress --stats \
  --exclude-from=$EXCLUDE_FILE \
  $SRC $DEST_USER@$DEST:$DEST_FOLDER

Here is my spitargs.pl script for your reference. I wrote it in Perl so as to avoid any future confusion about the interpretation of shell parameters inside the utility script, since there are such variations among the various implementations of the Bourne and Korn shells on Solaris, Linux, Cygwin, and MKS (the latter two in use on Windows only AFAIK):

Code:

#!/usr/bin/env perl
#
#  Description:  Spits out command line arguments to show what the shell thinks are separate option words.
#

use strict;

foreach my $arg (@ARGV) {
    print "arg <$arg>\n";
}
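
If Perl is not at hand, roughly the same check can be done with a shell function. The sketch below is a stand-in under stated assumptions, not a replacement for spitargs.pl: show_args and the sample value of DEST_PORT are invented for illustration, and the splitting behavior described is that of bash or another POSIX-style shell.

```shell
#!/bin/sh
# show_args is a hypothetical helper: it prints each argument it receives
# on its own line, in the same format spitargs.pl uses.
show_args() {
    printf 'arg <%s>\n' "$@"
}

# A deliberately awkward value containing a space (illustration only).
DEST_PORT='2222 -v'

# Unquoted expansion: the shell splits the value at the space,
# so the ssh command arrives as two separate arguments.
show_args -e 'ssh -x -o Compression=no -p '$DEST_PORT

# Double-quoted expansion: the value stays attached to the single-quoted
# text, so the ssh command arrives as one argument.
show_args -e 'ssh -x -o Compression=no -p '"$DEST_PORT"
```

Comparing the two listings shows where the shell drew the argument boundaries before rsync ever saw the command.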

Oh and by the way, avoid C-shell like the Black Plague. It is so riddled with bugs on various platforms. The "C" stands for Crap. ;)

toaster.waffle 04-21-2008 02:03 AM

the solution...
 
Haha, thanks for the reply.

I am backing up a MacBook, which uses bash, so everything is good there. The /Library folder houses settings and data for programs. I want this backed up in case my laptop dies and I need a chat log or my bookmarks, for example, or if I want to restore application layouts/preferences. I'm using rsyncx (I think it's just rsync with HFS+ support).

The script was actually hard-coded, but I run SSH on a different port, so I wanted to hide my preferences behind shell variables with readable names. I actually changed this in my script after posting, and the '...''$DEST_PORT'' was the remedy for an error I was getting (the port was being taken as the literal text $DEST_PORT). I'll keep the different quotes in mind for the future.

On to my problem... I've read several rsync man pages, and none of them gave a wide enough range of examples to help with my problem.

To figure out what's wrong, here are some selections, courtesy of http://www.samba.org/ftp/rsync/rsync.html.

Quote:

If the pattern starts with a / then it is anchored to a particular spot in the hierarchy of files, otherwise it is matched against the end of the pathname. This is similar to a leading ^ in regular expressions.
/Library is Library in the transfer's root. This appears correct.

Quote:

If the pattern ends with a / then it will only match a directory, not a regular file, symlink, or device.
I hope to include the directory /Library/Application Support/Camino/ for example. I want to exclude everything in the directory /Library/ but the ones I included. This may be the problem, but let's continue.

Quote:

Note that, when using the --recursive (-r) option (which is implied by -a), every subcomponent of every path is visited from the top down, so include/exclude patterns get applied recursively to each subcomponent's full name (e.g. to include "/foo/bar/baz" the subcomponents "/foo" and "/foo/bar" must not be excluded).
This may be the problem. In fact, this sounds like the problem to me. However, I want to avoid excluding all the other directories that aren't included, as that list of directories may change.
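
The effect described in that last quote can be reproduced locally. The following is a rough sketch, assuming a stock rsync on the PATH; the scratch tree, the file names and filter.txt are invented for the demonstration, and only the two filter lines are taken from the original exclude-from file.

```shell
#!/bin/sh
# Sketch only: everything lives in a throwaway directory created by mktemp.
command -v rsync >/dev/null 2>&1 || { echo "rsync not found, skipping"; exit 0; }

work=$(mktemp -d)
mkdir -p "$work/src/Library/Application Support/Firefox" "$work/src/Documents"
echo bookmarks > "$work/src/Library/Application Support/Firefox/bookmarks.html"
echo notes > "$work/src/Documents/notes.txt"

# Include first, then exclude the whole directory, as in the first attempt.
cat > "$work/filter.txt" <<'EOF'
+ /Library/Application Support/Firefox/
- /Library/
EOF

rsync --archive --exclude-from="$work/filter.txt" "$work/src/" "$work/dest/"

# rsync tests /Library before anything inside it. The include rule does not
# match /Library, the exclude rule does, so the directory is skipped and the
# Firefox include is never consulted.
(cd "$work/dest" && find . | sort)
```

Documents is copied because no rule mentions it, while nothing under Library reaches the destination.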

Let's look at wildcards...

Quote:

a '*' matches any non-empty path component (it stops at slashes).

use '**' to match anything, including slashes.

a '?' matches any character except a slash (/).
Adding a wildcard to my exclude (i.e. /Library/** to exclude everything) must be excluding the folders I want to include along with the files/folders I don't, since it does not remedy my situation.

Quote:

The combination of "+ foo/", "+ foo/bar.c", and "- *" would include only the foo directory and foo/bar.c (the foo directory must be explicitly included or it would be excluded by the "*")
What this says to me is I should be able to go
Code:

+ /Library/Calendars/
- /Library/*

to have Calendars/ included... This works! However, this only works for Calendars/.

So by the same token, I should be able to include Application Support/ alone. However, I want to include, again, only select folders in Application Support/... Thus, I'll do the same thing I did for /Library and Calendars/ with the subdirectory.

Code:

+ /Library/Application Support/Adium 2.0/
+ /Library/Application Support/Firefox/
+ /Library/Application Support/Cyberduck/
+ /Library/Application Support/Camino/
+ /Library/Application Support/
+ /Library/Calendars/
- /Library/Application Support/*
- /Library/*

Success! So the trick is not to exclude the folder as a whole, but to exclude everything in the folder. I had to include the parent directory of the third-level directories I wanted, but also exclude everything else in that parent directory (the "+ /Library/Application Support/" and "- /Library/Application Support/*" lines)!
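
For anyone who wants to check this pattern before pointing it at a real backup, here is a small local sketch. It assumes a stock rsync on the PATH; the scratch tree, the Skype and Caches folders, the file names and filter.txt are all made up for the test, and the filter is a shortened version of the list above.

```shell
#!/bin/sh
# Sketch only: everything lives in a throwaway directory created by mktemp.
command -v rsync >/dev/null 2>&1 || { echo "rsync not found, skipping"; exit 0; }

work=$(mktemp -d)
lib="$work/src/Library"
mkdir -p "$lib/Application Support/Firefox" "$lib/Application Support/Skype" \
         "$lib/Calendars" "$lib/Caches"
echo bookmarks > "$lib/Application Support/Firefox/bookmarks.html"
echo chat      > "$lib/Application Support/Skype/chat.log"
echo events    > "$lib/Calendars/home.ics"
echo junk      > "$lib/Caches/cache.db"

# Wanted folders first, then their parent, then "everything else" at each level.
cat > "$work/filter.txt" <<'EOF'
+ /Library/Application Support/Firefox/
+ /Library/Application Support/
+ /Library/Calendars/
- /Library/Application Support/*
- /Library/*
EOF

rsync --archive --exclude-from="$work/filter.txt" "$work/src/" "$work/dest/"

# Only the Firefox and Calendars files should be listed.
(cd "$work/dest" && find . -type f | sort)
```

Because the source is given as src/ with a trailing slash, the leading / in each pattern refers to the top of src, which is what makes /Library line up with src/Library.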

RTFM really works! I won't tell you I didn't read it before I came here in desperation because that would be a lie.

Woo!

bgoodr 04-21-2008 12:12 PM

Quote:

Originally Posted by toaster.waffle (Post 3127610)
I won't tell you I didn't read it before I came here in desperation because that would be a lie. Woo!

Great that you found the solution! The way inclusion and exclusion work in rsync is quite baffling. There are (probably) good reasons for it. Yes, the manual does help, but you probably had to read those paragraphs about 10 times before they started to make sense (ask me how I know). :)

Good Luck!
bgoodr

jgte 11-19-2009 10:57 AM

More generally, if you want to include all subdirs A of a structure like this:

X/B/A
X/B/Y
X/B/Z
X/C/A
X/C/K
X/C/L
X/C/M
X/D/A

... while leaving out all non-A dirs (Y, Z, K, L, M), then the following exclude file should do the job:

+ /X/*/A/
+ /X/*/
- /X/*/*
- /X/*
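
A quick way to try these four rules is a local run against a scratch tree. This is only a sketch, assuming a stock rsync on the PATH and that X sits directly in the transfer root; the tree is a subset of the structure above, and file.txt and filter.txt are invented names.

```shell
#!/bin/sh
# Sketch only: everything lives in a throwaway directory created by mktemp.
command -v rsync >/dev/null 2>&1 || { echo "rsync not found, skipping"; exit 0; }

work=$(mktemp -d)
for dir in X/B/A X/B/Y X/C/A X/C/K X/D/A; do
    mkdir -p "$work/src/$dir"
    echo data > "$work/src/$dir/file.txt"
done

# Keep every A one level below X, keep the parents that lead to it,
# and drop everything else at those two levels.
cat > "$work/filter.txt" <<'EOF'
+ /X/*/A/
+ /X/*/
- /X/*/*
- /X/*
EOF

rsync --archive --exclude-from="$work/filter.txt" "$work/src/" "$work/dest/"

# Only files under the A directories should be listed.
(cd "$work/dest" && find . -type f | sort)
```

The contents of each A directory come through because a single * stops at a slash, so "- /X/*/*" does not reach paths one level deeper.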

Thank you for the clues.


All times are GMT -5.