Bash script to copy specified extension file from a directory?
I have a directory with sub-directories (4 or 5 levels deep). They contain several types of files (*.mp3, *.wma, *.jpg, etc.). I would like to copy the whole directory tree to another location recursively, but only the *.mp3 files.
Any ideas?
For the sake of the topic I'll post a copy of the code here.
It's not tested yet, but please give it a try.
Code:
#!/bin/bash
# fcopy.sh
#
# selectively copies files
#
# usage: [bash|.] fcopy.sh <source directory> <target directory> <ext1, [ext2, ...]>
#
LOG_VERBOSE=false
function log_message {
    echo "fcopy: $1"
}
function log_error {
    log_message "error: $1" >&2
}
function log_warning {
    log_message "warning: $1"
}
function log_verbose {
    # on the first call, redefine this function according to LOG_VERBOSE
    if [[ $LOG_VERBOSE = true ]]; then
        function log_verbose { echo "fcopy: $1"; }
    else
        function log_verbose { :; }
    fi
    log_verbose "$1" # doing this might not be safe but it is
}
function show_usage {
    log_message "usage: $0 -h|--help|-? | [-v|--verbose] <source directory> <target directory> <ext1, [ext2, ...]>"
}
function fail {
    log_error "$1"
    exit 1
}
function check_directory {
    [[ -d $1 ]] || fail "specified $2 $1 does not appear to be a directory."
    [[ -r $1 && -x $1 ]] || fail "specified $2 $1 is not readable or searchable."
}
function getabspath {
    # loader-lite-0L.WP20100217.tar.gz @ sf.net/projects/loader
    # normalizes $1 into an absolute path and returns it in $__
    local -a T1 T2=()
    local -i I=0
    local IFS=/ A
    if [[ $1 == /* ]]; then
        read -r -a T1 <<< "$1"
    else
        read -r -a T1 <<< "$PWD/$1"
    fi
    for A in "${T1[@]}"; do
        case "$A" in
        ..)
            # step back one component, unless already at the root
            [[ I -ne 0 ]] && unset T2\[--I\]
            continue
            ;;
        .|'')
            # skip "." and the empty components produced by "//"
            continue
            ;;
        esac
        T2[I++]=$A
    done
    if [[ $1 == */ ]]; then
        if [[ I -ne 0 ]]; then
            __="/${T2[*]}/"
        else
            __=/
        fi
    else
        if [[ I -ne 0 ]]; then
            __="/${T2[*]}"
        else
            __=/.
        fi
    fi
}
function main {
    local SOURCEDIR='?' DESTDIR='?' A __
    local -a NAMETESTS
    # parse / check arguments
    for (( ; $#; )); do
        case "$1" in
        -h|--help|-\?)
            show_usage
            exit 1
            ;;
        -v|--verbose)
            LOG_VERBOSE=true
            ;;
        *)
            if [[ $SOURCEDIR = '?' ]]; then
                SOURCEDIR=$1
            elif [[ $DESTDIR = '?' ]]; then
                DESTDIR=$1
            else
                # everything from here on is an extension
                break
            fi
            ;;
        esac
        shift
    done
    if [[ $# -eq 0 ]]; then
        log_error "not enough arguments."
        show_usage
        exit 1
    fi
    # get the absolute paths
    getabspath "$SOURCEDIR/."; SOURCEDIR=$__
    getabspath "$DESTDIR/."; DESTDIR=$__
    # check the extensions
    for A; do
        if [[ -z $A || $A == *[![:alnum:]_]* ]]; then # change this if some characters are also required
            log_warning "extension $A may not be valid."
        fi
    done
    # check the directories, also create destination directory if needed
    check_directory "$SOURCEDIR" "source directory"
    pushd "$SOURCEDIR" || fail "failed to change current working directory to $SOURCEDIR"
    if [[ $DESTDIR = /. || -e $DESTDIR ]]; then
        check_directory "$DESTDIR" "destination directory"
    else
        mkdir -p "$DESTDIR" || fail "failed to create destination directory $DESTDIR."
    fi
    # prepare the find tests: one -name per extension, joined with -or
    NAMETESTS=(-name "*.$1")
    shift
    for A; do
        NAMETESTS+=(-or -name "*.$A")
    done
    # copy; the parentheses make -type f apply to every -name test
    while IFS= read -r -u 3 A; do
        log_verbose "copying $A..."
        cp --parents -v "$A" "$DESTDIR" && continue
        log_warning "failed to copy $A. press any key to continue.."
        read -s -n 1
    done 3< <(find . -type f \( "${NAMETESTS[@]}" \))
    popd # output pushd and popd may also be redirected to /dev/null
    exit 0
}
( main "$@"; )
Last edited by konsolebox; 03-15-2010 at 03:00 AM.
Reason: (a) no longer require grep, (b) a minor fix
Code:
#!/bin/bash
# copies files with the given extensions from a source tree to a
# destination, keeping the directory structure
abort () { echo "ABORTED: $1 missing or doesn't exist!"; usage; }
usage () {
    echo -e "$0: [-s source dir] [-d dest dir] [-e extension,extension,extension,...]\nex: $0 -s $HOME -d /tmp -e jpg,png,mp3"
    exit
}
if [ ! "$1" ]; then usage; fi
# parse the options; anything unrecognized prints the usage text
until [ -z "$1" ]; do
    case "$1" in
        "-s") shift; SOURCEDIR="$1";;
        "-d") shift; DESTDIR="$1";;
        "-e") shift; EXTENTION="$1";;
        *) usage;;
    esac
    shift
done
if [ ! "$SOURCEDIR" ] || [ ! -e "$SOURCEDIR" ]; then abort "source dir"; fi
if [ ! "$DESTDIR" ] || [ ! -e "$DESTDIR" ]; then abort "destination dir"; fi
if [ ! "$EXTENTION" ]; then abort "extension(s)"; fi
echo -e "using source : $SOURCEDIR\nusing destination: $DESTDIR"
# turn the comma-separated list into an array
EXTENTIONS=( $(echo "$EXTENTION" | sed 's/,/ /g') )
for ITEM in "${EXTENTIONS[@]}"; do
    while IFS= read -r file; do
        echo "copying $SOURCEDIR/$file to $DESTDIR/$file"
        # create the parent directories (-p) only if they are missing
        if [ ! -e "$DESTDIR/$(dirname "$file")" ]; then mkdir -p "$DESTDIR/$(dirname "$file")"; fi
        cp "$SOURCEDIR/$file" "$DESTDIR/$file"
    # %P prints each path relative to $SOURCEDIR
    done < <(find "$SOURCEDIR" -type f -name "*.$ITEM" -printf "%P\n")
done
Since you consider your script simpler than mine, I hope you don't mind if I add some comments of my own as well.
First, I'm not sure it's simpler. I think it's more like squeezed syntax. The script also tends to make unnecessary calls to sed and multiple calls to mkdir. You also haven't included a test for an invalid argument being passed at runtime. Since you use a for loop over each extension, if the same extension is given twice, the files will probably be copied twice. At the very least, the files will also not be copied in order (based on their position in the filesystem, which I think would be better to follow).
Generally I think the script is more of a bypass compared to mine. Bypasses are good, but we should make sure that there are no consequences.
Quote:
Since you consider your script simpler than mine, I hope you don't mind if I add some comments of my own as well.
Sure, always.
Quote:
The script also tends to make unnecessary calls to sed and multiple calls to mkdir.
Wrong on both counts. Sed is called one time to convert the extension list into an array, which is necessary. Also, mkdir is called only when the full destination path doesn't exist.
Quote:
You also haven't included a test for an invalid argument being passed at runtime.
Yes I have. It checks if you entered a source dir, destination dir, and at least one extension -- and further, if the source and destination dirs actually exist. There is no check if the extension is valid because any file can have any extension, so doing so would be absurd.
Quote:
Since you use a for loop over each extension, if the same extension is given twice, the files will probably be copied twice.
This is true, as I assumed the user wouldn't put the same extension more than once. However, it's very easy to ensure any duplicates are removed. I've edited the script to deal with this.
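As a rough illustration of the duplicate-removal point (and of splitting the list without sed), here is a minimal sketch. It is not the edit made to the script above; `unique_extensions` is a hypothetical helper name, and it assumes bash 4 or later for associative arrays.

```shell
#!/bin/bash
# unique_extensions is a hypothetical helper, not part of either script above.
# It splits a comma-separated extension list and drops repeats, keeping the
# first occurrence of each. Assumes bash 4+ (associative arrays).
unique_extensions() {
    local list=$1 ext
    local -a parts=()
    local -A seen=()
    # Split on commas through IFS; no external sed call is needed.
    IFS=, read -r -a parts <<< "$list"
    for ext in "${parts[@]}"; do
        # Skip empty fields and extensions that were already printed.
        [[ -z $ext || -n ${seen[$ext]} ]] && continue
        seen[$ext]=1
        printf '%s\n' "$ext"
    done
}
```

Calling `unique_extensions "jpg,png,jpg,mp3"` prints each distinct extension once, one per line, which could then feed the for loop in place of the sed-built array.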
Quote:
At the very least, the files will also not be copied in order (based on their position in the filesystem, which I think would be better to follow).
I don't see how the order in which the files are copied matters, so long as they're all copied, which is why I didn't bother to address it.
Quote:
Generally I think the script is more of a bypass compared to mine. Bypasses are good, but we should make sure that there are no consequences.
My script gives the user the desired result without any consequences. I don't believe in making scripts harder or more complex than they need to be. Simple tasks only need simple scripts 99% of the time. When you address every possible scenario you can dream up, your script gets filled with tons of checks that will hardly, if ever, be triggered.
Btw, I also didn't bother checking if a file being copied is empty/0 bytes but you forgot to point that out.
The only thing I don't like about that method is that it requires the user to have rsync installed. I'm not a fan of making users install packages just to do things that can easily be done without that requirement.
However, it's really a matter of personal preference. If the user wants to install rsync to do the task in 1 line instead of 2, then he's more than welcome to do so.
Thank you all for the many ways of doing it. I chose Guttorm's idea of using rsync. It's simple and easy to understand. When I am free, I will try the other methods.
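The rsync command itself is not reproduced in this thread, so the following is only a sketch of how rsync's include/exclude filters are commonly used for this kind of task, not the command that was posted. `rsync_by_extension` is a made-up wrapper name, and it assumes rsync is installed.

```shell
#!/bin/bash
# rsync_by_extension is a hypothetical wrapper, not the command from the thread.
# Usage: rsync_by_extension <source dir> <dest dir> <extension>
rsync_by_extension() {
    local src=$1 dest=$2 ext=$3
    # --include='*/'     lets rsync descend into every directory
    # --include="*.$ext" keeps the files with the wanted extension
    # --exclude='*'      drops everything else
    # -m                 (--prune-empty-dirs) skips directories left empty
    # The trailing slash on the source copies its contents, not the directory itself.
    rsync -a -m --include='*/' --include="*.$ext" --exclude='*' "${src%/}/" "$dest"
}
```

For the original question this would be called as `rsync_by_extension /path/to/music /path/to/copy mp3`, with both paths being placeholders.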
Quote:
Wrong on both counts. Sed is called one time to convert the extension list into an array, which is necessary. Also, mkdir is called only when the full destination path doesn't exist.
What I meant is that if the arguments are not separated with commas, you won't need sed. Regarding mkdir, maybe you should just use 'cp --parents'?
Quote:
Yes I have. It checks if you entered a source dir, destination dir, and at least one extension -- and further, if the source and destination dirs actually exist. There is no check if the extension is valid because any file can have any extension, so doing so would be absurd.
Sorry I didn't read this:
Code:
*) usage;;
Quote:
This is true, as I assumed the user wouldn't put the same extension more than once. However, it's very easy to ensure any duplicates are removed. I've edited the script to deal with this.
ok.
Quote:
I don't see how the order in which the files are copied matters, so long as they're all copied, which is why I didn't bother to address it.
Maybe you're right. I just have a feeling that something will go wrong if I mess with the order. It's only an intuition, and maybe I'm wrong.
Quote:
My script gives the user the desired result without any consequences. I don't believe in making scripts harder or more complex than they need to be. Simple tasks only need simple scripts 99% of the time. When you address every possible scenario you can dream up, your script gets filled with tons of checks that will hardly, if ever, be triggered.
Btw, I also didn't bother checking if a file being copied is empty/0 bytes but you forgot to point that out.
First thing: scripts can in principle be made perfect, but our own perception is always limited, and most of the time something goes wrong.
What I meant is that I try to build things the way they should be.
I always tend to follow the idea that things can (sometimes should) be made as simple as possible, but not simpler. My script may look complicated to you, but to me it's the simplest form I can make based on the requirement and on the range of my perception.
If I make it any simpler, I get the feeling that things will just break.
Last edited by konsolebox; 03-13-2010 at 01:21 AM.
Quote:
What I meant is that if the arguments are not separated with commas, you won't need sed. Regarding mkdir, maybe you should just use 'cp --parents'?
You mean this?
--parents
use full source file name under DIRECTORY
I didn't think that does what you're suggesting based on the description. I know you can make parent dirs with mkdir but I didn't know it could be done with cp as well.
Quote:
Maybe you're right. I just have a feeling that something will go wrong if I mess with the order. It's only an intuition, and maybe I'm wrong.
The order of -s -d -e? I normally don't do arguments like that but rather just expect $1 to be source, $2 to be destination, $3* to be the extensions. That method could trim a few lines, but I wanted to use a case example as well so he might see how you could add default paths and use -s/-d to override them.
Quote:
I always tend to follow the idea that things can (sometimes should) be made as simple as possible, but not simpler. My script may look complicated to you, but to me it's the simplest form I can make based on the requirement and on the range of my perception.
If I make it any simpler, I get the feeling that things will just break.
Your script is very easy to understand; it just contains a lot of unnecessary lines in my opinion. I used to try to make every script airtight, so to speak, but as I said previously, it was just wasting time on conditions that (almost) never occur, so I decided I'll just keep things simple unless there's an actual need for complexity. I also assume the user pays at least some attention to what he's doing most of the time -- because he should.
As far as breakage, his request is quite simple. Not a lot of places a breakage can occur.
The cool thing is that this is a good example of very different ideas but both work and get the same end result so it really does just boil down to preference in the end. And it's nice to have options.
Quote:
--parents
use full source file name under DIRECTORY
I didn't think that does what you're suggesting based on the description. I know you can make parent dirs with mkdir but I didn't know it could be done with cp as well.
Yes, I guess the description is quite "different". Using --parents, we can copy a file together with its whole path into a target directory. For example:
Code:
cp --parents a/b.txt c
will create c/a/b.txt
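To show how this fits the original question, here is a minimal sketch that combines find with cp --parents. It assumes GNU cp (the --parents option is not in POSIX), and `copy_with_parents` is a hypothetical function name, not something from either script.

```shell
#!/bin/bash
# copy_with_parents is a hypothetical name; the sketch relies on GNU cp's
# --parents option.
# Usage: copy_with_parents <source dir> <dest dir> <name pattern>
copy_with_parents() {
    local src=$1 dest=$2 pattern=$3
    mkdir -p -- "$dest" || return 1
    # Resolve the destination first, since it may be a relative path and the
    # working directory changes below.
    dest=$(cd -- "$dest" && pwd) || return 1
    # Run find from inside the source so the paths it prints are relative;
    # cp --parents then recreates that relative path under $dest.
    (
        cd -- "$src" || exit 1
        find . -type f -name "$pattern" -exec cp --parents -- {} "$dest" \;
    )
}
```

With a source tree containing a/b/x.mp3, `copy_with_parents src out '*.mp3'` would create out/a/b/x.mp3 without any separate mkdir call for the sub-directories.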
Quote:
The order of -s -d -e? I normally don't do arguments like that but rather just expect $1 to be source, $2 to be destination, $3* to be the extensions. That method could trim a few lines, but I wanted to use a case example as well so he might see how you could add default paths and use -s/-d to override them.
No, not that order. What I meant is that if you run one loop per extension, all files with that extension will be copied before any file with the next extension. If we instead use find -name '*.EXT1' -or -name '*.EXT2' we can have a list like
Quote:
Your script is very easy to understand; it just contains a lot of unnecessary lines in my opinion. I used to try to make every script airtight, so to speak, but as I said previously, it was just wasting time on conditions that (almost) never occur, so I decided I'll just keep things simple unless there's an actual need for complexity. I also assume the user pays at least some attention to what he's doing most of the time -- because he should.
As far as breakage, his request is quite simple. Not a lot of places a breakage can occur.
The cool thing is that this is a good example of very different ideas but both work and get the same end result so it really does just boil down to preference in the end. And it's nice to have options.
Well, thanks, and OK, I'll respect your opinion. As for preference and options, yes, I guess I can agree to that.
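On the ordering point discussed above, a single find call with several -name tests visits the tree once and lists matching files in traversal order, rather than grouped per extension. A minimal sketch follows; `list_by_extension` is a hypothetical name, and the escaped parentheses are there because -type f would otherwise bind only to the first -name test.

```shell
#!/bin/bash
# list_by_extension is a hypothetical name: list regular files under a
# directory that match any of the given extensions, in one find pass.
# Usage: list_by_extension <dir> <ext1> [ext2 ...]
list_by_extension() {
    local dir=$1 ext
    shift
    (( $# )) || return 1
    local -a tests=()
    for ext; do
        # Join the -name tests with -o; the first one gets no leading -o.
        (( ${#tests[@]} )) && tests+=(-o)
        tests+=(-name "*.$ext")
    done
    # The escaped parentheses make -type f apply to every -name test,
    # not just the first one.
    find "$dir" -type f \( "${tests[@]}" \)
}
```

Running `list_by_extension . mp3 wma` prints the .mp3 and .wma files interleaved as find encounters them, and a directory that happens to be named something.wma is left out.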