I'm working on a robust backup technique for the company I work for, and one of the things I need to do is copy a bunch of files, but only if they have been updated. This by itself isn't a problem. The problem arises when I need to exclude certain file types from the cp list. I can explain why if need be, but I don't think that is terribly relevant right now.
Anybody know of any easy methods for doing this? I have a workaround using find, but it seems kludgy.
First I define a list of file extensions to ignore. Then I build a string that will regex-match those extensions. Next, I loop through a series of paths, each giving a backup from/to location, and use find with an inverted -iregex to skip those files, calling cp via -exec. This seems like such a backwards way to accomplish the task. It does get the job done, but I'd sure be happier if there was a cleaner way to do it.
Here's my code thus far:
Code:
#!/bin/bash
# define cad file extensions which should be archived/date-name changed
#
cad="dc3 dc5 dwg aec dxf"
# convert that list into the regex needed to match those files
#
reg=""
for i in $cad; do
    if [ -n "$reg" ]; then
        reg="$reg\|"
    fi
    reg="$reg$i"
done
# Full system backup.
# We start by doing update copies of all the files, except for
# cad files. pathinfo is "readpath writepath"
#
for pathinfo in "/support /backup1" \
                "/files /backup1" \
                "/home /backup1" ; do
    path=($pathinfo)    # split on whitespace: path[0]=readpath, path[1]=writepath
    # the \. makes the regex match the extension only, not any name ending in e.g. "aec"
    find "${path[0]}" -type f ! -iregex ".*\.\($reg\)" \
        -exec cp --parents -vpRu '{}' "${path[1]}" ';'
done
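If the per-file cp calls turn out to be the slow part, one possible variation is to let find batch the file names and hand many of them to a single cp. This is only a sketch: it assumes GNU find (for -iregex and "-exec ... +") and GNU cp (for --parents and -t), and backup_tree is a made-up helper name, not something from my script.

```shell
#!/bin/sh
# Sketch only: assumes GNU find and GNU cp.
# backup_tree is a hypothetical helper, not part of the original script.
#   $1 = directory to read from
#   $2 = directory to write to (must already exist)
#   $3 = extension alternation in find's default regex syntax, e.g. 'dwg\|dxf'
backup_tree() {
    src=$1
    dest=$2
    reg=$3
    # "{} +" passes as many file names as fit to one cp invocation,
    # instead of starting a new cp for every file as "{} ;" does.
    # -t names the target directory up front so the file list can come last.
    find "$src" -type f ! -iregex ".*\.\($reg\)" \
        -exec cp --parents -pu -t "$dest" '{}' +
}
```

Calling it as backup_tree /support /backup1 'dc3\|dc5\|dwg\|aec\|dxf' should behave like one iteration of the loop above, minus the -v output.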
Perhaps much, if not all, of the functionality you need
is already present in the command, rsync.
Rsync copies files from a source to a destination, either
one of which may or may not be on the current machine.
In other words, it will copy across a network or it will copy
on the same machine.
It has options for exclusions, compression, deleting files
on the target that have been deleted on the source, and
time stamp/size checking.
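As a rough sketch of how the exclusions might look for the extension list in the first post: the snippet below only prints the rsync command it would run, so it can be checked before anything is copied. The helper name rsync_cmd is made up for illustration. Note that rsync patterns are case-sensitive, unlike find's -iregex, so upper-case extensions would need their own patterns.

```shell
#!/bin/sh
# Sketch only: builds one --exclude option per extension and prints the
# resulting rsync command. rsync_cmd is a hypothetical helper name.
cad="dc3 dc5 dwg aec dxf"

rsync_cmd() {
    src=$1
    dest=$2
    set --                      # reuse the positional parameters as an option list
    for ext in $cad; do
        set -- "$@" "--exclude=*.$ext"
    done
    # -a preserves permissions and times, -u skips files that are newer
    # on the receiving side. Remove the "echo" to actually run rsync.
    echo rsync -au "$@" "$src" "$dest"
}

rsync_cmd /support /backup1/
```

With no trailing slash on the source, rsync recreates the source directory under the destination (here /backup1/support), which is similar to what cp --parents does.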
That may very well be the ticket! How does rsync compare to cp for speed? I'm sure that it will be faster than my current method of using find (since find calls cp for every individual file, regardless of whether or not it needs to be backed up).
Thanks a bunch! I'll investigate and report my conclusions.
Ok, I've had time to play around with rsync and examine how it fits my needs. For excluding files, it works great: I simply list the file-matching patterns, and I'm good to go. However, I have a slight problem using it in the opposite manner. On the first pass, I'm ignoring certain file types. On the second pass, I want to deal with only the file types I ignored in the first pass. I found that using --include doesn't mean "exclusively include" but "include also", which won't do for my purposes at all.
So, I ended up doing a little something like this:
Code:
# define the list of cad file extensions, and our from:to pathlist
#
cad="dc3 dc5 dwg aec dxf"
pathlist="/support:/backup1 /files:/backup1 /home:/backup1"
# convert the cad list into the regex needed to match those files
# with find
#
reg=""
for i in $cad; do
    if [ -n "$reg" ]; then reg="$reg\|"; fi
    reg="$reg$i"
done
# Next, we do only cad file backups, but check their info based on whether
# or not they have changed, and back them up, archive them, and add the date
# to the name (all done from within 'backup_check')
#
for pathinfo in $pathlist; do
    readpath=${pathinfo%:*}     # strip off all after (leave all before) the colon
    writepath=${pathinfo#*:}    # strip off all before (leave all after) the colon
    # the \. makes the regex match the extension only
    find "$readpath" -type f -iregex ".*\.\($reg\)" \
        -exec /usr/sbin/backup_check.sh '{}' "$readpath" "$writepath" ';'
done
It works well, too. It's also faster than I expected, even though I'm using find. Anyway, this has solved my scripting needs. Hopefully someone else will benefit from the code, too. (=
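For anyone who does want an rsync-only second pass: rsync checks its filter rules in order and the first match wins, so an "only these types" run is usually written as includes followed by a catch-all exclude. The sketch below just prints such a command. It would not replace my backup_check step (the archiving and date renaming), and rsync_only_cmd is a made-up name for illustration.

```shell
#!/bin/sh
# Sketch only: prints an rsync command that copies nothing but the
# listed extensions. rsync_only_cmd is a hypothetical helper name.
cad="dc3 dc5 dwg aec dxf"

rsync_only_cmd() {
    src=$1
    dest=$2
    set -- "--include=*/"       # keep descending into every directory
    for ext in $cad; do
        set -- "$@" "--include=*.$ext"
    done
    set -- "$@" "--exclude=*"   # anything not included above is skipped
    # --prune-empty-dirs avoids creating directories that end up empty.
    # Remove the "echo" to actually run rsync.
    echo rsync -au --prune-empty-dirs "$@" "$src" "$dest"
}

rsync_only_cmd /support /backup1/
```

As with excludes, these patterns are case-sensitive, so files named like FOO.DWG would need their own include patterns.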