LinuxQuestions.org

sokha 03-12-2010 01:43 AM

Bash script to copy specified extension file from a directory?
 
I have a directory with sub-directories (4 or 5 levels deep). They contain several types of files (*.mp3, *.wma, *.jpg, etc.). I would like to copy the whole directory tree to another location recursively, but only the *.mp3 files.

Any ideas?

troop 03-12-2010 01:49 AM

Code:

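find ./directory -name \*.mp3 -exec cp {} destination_directory \;

This puts every file directly into destination_directory. To put each file in destination_directory/<path-where-the-file-was-found> instead, the target needs to be `destination_directory/{}`, which only works if that path's parent directory already exists.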
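
A minimal sketch of one way to keep the sub-directory layout with find alone. It assumes GNU cp (for `--parents` and `-t`) and a find that supports `-exec ... {} +`. The function name `copy_ext` and its argument order are made up for illustration.

```shell
#!/bin/bash
# copy_ext SRC DST EXT: copy every *.EXT file under SRC into DST,
# recreating the sub-directory layout. Hypothetical helper; assumes GNU cp.
copy_ext() {
    local src=$1 dst ext=$3
    mkdir -p "$2" || return
    dst=$(cd "$2" && pwd) || return    # absolute path, because we cd below
    (
        cd "$src" || exit
        # Run from inside SRC so the paths handed to cp are relative to it.
        # --parents makes cp rebuild each relative path under $dst.
        find . -type f -name "*.$ext" -exec cp --parents -t "$dst" {} +
    )
}
```

Usage would look like `copy_ext ./directory destination_directory mp3`.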

konsolebox 03-12-2010 02:04 AM

Not long ago I wrote a post about selective recursive copying:

http://www.linuxquestions.org/questi...81#post3877881

For the sake of the topic I'll copy the code here.

It's not tested yet, but please give it a try.

Code:

#!/bin/bash

# fcopy.sh
#
# selectively copies files
#
# usage: [bash|.] fcopy.sh <source directory> <target directory> <ext1, [ext2, ...]>
#

LOG_VERBOSE=false

function log_message {
        echo "fcopy: $1"
}

function log_error {
        log_message "error: $1" >&2
}

function log_warning {
        log_message "warning: $1"
}

function log_verbose {
        if [[ $LOG_VERBOSE = true ]]; then
                function log_verbose { echo "fcopy: $1"; }
        else
                function log_verbose { :; }
        fi
        log_verbose "$1"  # doing this might not be safe but it is
}

function show_usage {
        log_message "usage: $0 -h|--help|-? | [-v|--verbose] <source directory> <target directory> <ext1, [ext2, ...]>"
}

function fail {
        log_error "$1"
        exit 1
}

function check_directory {
        [[ -d $1 ]] || fail "specified $2 $1 does not appear to be a directory."
        [[ -r $1 && -x $1 ]] || fail "specified $2 $1 is not readable or searchable."
}

function getabspath {

        # loader-lite-0L.WP20100217.tar.gz @ sf.net/projects/loader

        local -a T1 T2=()
        local -i I=0
        local IFS=/ A

        if [[ $1 == /* ]]; then
                read -a T1 <<< "$1"
        else
                read -a T1 <<< "$PWD/$1"
        fi

        for A in "${T1[@]}"; do
                case "$A" in
                ..)
                        [[ I -ne 0 ]] && unset T2\[--I\]
                        continue
                        ;;
                .|'')
                        continue
                        ;;
                esac

                T2[I++]=$A
        done

        if [[ $1 == */ ]]; then
                if [[ I -ne 0 ]]; then
                        __="/${T2[*]}/"
                else
                        __=/
                fi
        else
                if [[ I -ne 0 ]]; then
                        __="/${T2[*]}"
                else
                        __=/.
                fi
        fi
}

function main {
        local SOURCEDIR='?' DESTDIR='?' CMD A __

        # parse / check arguments

        for (( ; $#; )); do
                case "$1" in
                -h|--help|-\?)
                        show_usage
                        exit 1
                        ;;
                -v|--verbose)
                        LOG_VERBOSE=true
                        ;;
                *)
                        if [[ $SOURCEDIR = '?' ]]; then
                                SOURCEDIR=$1
                        elif [[ $DESTDIR = '?' ]]; then
                                DESTDIR=$1
                        else
                                break
                        fi
                        ;;
                esac

                shift
        done

        if [[ $# -eq 0 ]]; then
                log_error "not enough arguments."
                show_usage
                exit 1
        fi

        # get the absolute paths

        getabspath "$SOURCEDIR/."; SOURCEDIR=$__
        getabspath "$DESTDIR/."; DESTDIR=$__

        # check the extensions

        for A; do
                if [[ -z $A || $A == *[![:alnum:]_]* ]]; then  # change this if some characters are also required
                        log_warning "extension $A may not be valid."
                fi
        done

        # check the directories, also create destination directory if needed

        check_directory "$SOURCEDIR" "source directory"

        pushd "$SOURCEDIR" || fail "failed to change current working directory to $SOURCEDIR"

        if [[ -e $DESTDIR ]]; then
                check_directory "$DESTDIR" "destination directory"
        else
                mkdir -p "$DESTDIR" || fail "failed to create destination directory $DESTDIR."
        fi

        # prepare the command

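        # start with the first extension and group the -name tests so that -type f applies to all of them

        CMD="find . -type f '(' -name '*.$1'"
        shift

        for A; do
                CMD=$CMD" -or -name '*.$A'"
        done

        CMD=$CMD" ')'"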

        # copy

        while IFS= read -r -u 3 A; do
                log_verbose "copying $A..."
                cp --parents -v "$A" "$DESTDIR" && continue
                log_warning "failed to copy $A.  press any key to continue.."
                read -s -n 1
        done 3< <(eval exec "$CMD")

        popd  # output pushd and popd may also be redirected to /dev/null

        exit 0
}

( main "$@"; )


primerib 03-12-2010 02:45 AM

Here's a smaller and simpler script for you:
Code:

#!/bin/bash

abort () { echo "ABORTED: $1 missing or doesn't exist!"; usage; }
usage () { echo -e "$0: [-s source dir] [-d dest dir] [-e extention,extention,extention,...]\nex: $0 -s $HOME -d /tmp -e jpg,png,mp3"; exit; }

if [ ! "$1" ]; then usage; fi
until [ -z "$1" ]; do
        case "$1" in
                "-s")  shift; SOURCEDIR="$1";;
                "-d")  shift; DESTDIR="$1";;
                "-e")  shift; EXTENTIONS=( `echo "$1" |sed 's/,/ /g'` );;
                *)      usage;;
        esac
        shift
done

if [ ! "$SOURCEDIR" ] || [ ! -e "$SOURCEDIR" ]; then abort "source dir"; fi
if [ ! "$DESTDIR" ] || [ ! -e "$DESTDIR" ]; then abort "destination dir"; fi
if [ "${#EXTENTIONS[@]}" == "0" ]; then abort "extention(s)"; fi

echo -e "using source    : $SOURCEDIR\nusing destination: $DESTDIR"

for ITEM in "${EXTENTIONS[@]}"; do
        case " $CHECK " in *" $ITEM "*) continue;; esac  # skip an extension that was already handled
        CHECK="$CHECK $ITEM"
        while IFS= read -r file; do
                echo "copying $SOURCEDIR/$file to $DESTDIR/$file"
                if [ ! -e "$DESTDIR/$(dirname "$file")" ]; then mkdir -p "$DESTDIR/$(dirname "$file")"; fi
                cp "$SOURCEDIR/$file" "$DESTDIR/$file"
        done < <(find "$SOURCEDIR" -type f -name "*.$ITEM" -printf "%P\n")

done

Just follow the usage. An example:
copy.sh -s /path/to/source -d /path/to/destination -e jpg,mp3,wmv

Use the bare extension only; there's no need for *.whatever or .whatever.

Guttorm 03-12-2010 03:45 AM

Hi

If you want to keep the directory structure, it's simpler to use rsync.

rsync -r -v --include '*/' --include '*.mp3' --exclude '*' sourcedir targetdir
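
A small sketch wrapping the rsync call above, with two additions that are worth considering. With these filters rsync recreates every directory, including those that hold no matching file; `-m` (`--prune-empty-dirs`) skips them. The trailing slash on the source copies the contents of the source directory rather than the directory itself. The function name `sync_ext` is made up for illustration.

```shell
#!/bin/bash
# sync_ext SRC DST EXT: hypothetical wrapper around the rsync call above.
sync_ext() {
    # --include '*/'     : descend into every directory
    # --include "*.EXT"  : keep matching files
    # --exclude '*'      : drop everything else
    # -m                 : prune directories that end up empty
    # "$1/" (trailing slash) copies the contents of SRC, not SRC itself.
    rsync -r -m --include '*/' --include "*.$3" --exclude '*' "$1/" "$2/"
}
```

Usage would look like `sync_ext sourcedir targetdir mp3`.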

konsolebox 03-12-2010 06:31 AM

@primerib:



Quote:

Originally Posted by primerib (Post 3895444)
Here's a smaller and simpler script for you:
Code:

#!/bin/bash

abort () { echo "ABORTED: $1 missing or doesn't exist!"; usage; }
usage () { echo -e "$0: [-s source dir] [-d dest dir] [-e extention,extention,extention,...]\nex: $0 -s $HOME -d /tmp -e jpg,png,mp3"; exit; }

if [ ! "$1" ]; then usage; fi
until [ -z "$1" ]; do
        case "$1" in
                "-s")  shift; SOURCEDIR="$1";;
                "-d")  shift; DESTDIR="$1";;
                "-e")  shift; EXTENTION="$1";;
                *)      usage;;
        esac
        shift
done

if [ ! "$SOURCEDIR" ] || [ ! -e "$SOURCEDIR" ]; then abort "source dir"; fi
if [ ! "$DESTDIR" ] || [ ! -e "$DESTDIR" ]; then abort "destination dir"; fi
if [ ! "$EXTENTION" ]; then abort "extention(s)"; fi

echo -e "using source    : $SOURCEDIR\nusing destination: $DESTDIR"

EXTENTIONS=( `echo $EXTENTION |sed 's/,/ /g'` )
for ITEM in "${EXTENTIONS[@]}"; do
        while read file; do
                echo "copying $SOURCEDIR/$file to $DESTDIR/$file"
                if [ ! -e "$DESTDIR/`dirname $file`" ]; then mkdir "$DESTDIR/`dirname $file`"; fi
                cp "$SOURCEDIR/$file" "$DESTDIR/$file"
        done < <(find "$SOURCEDIR" -type f -name "*$ITEM" -printf "%P\n")

done


Since you consider your script simpler than mine, I hope you don't mind if I make a few comments of my own.

First, I'm not sure it's simpler; I think it's more a case of compressed syntax. The script also tends to make unnecessary calls to sed and multiple calls to mkdir. You also haven't included a test for an invalid argument being passed at runtime. Since you loop over every extension, if the same extension is given twice, the files will probably be copied twice. At the very least, the files will also not be copied in order (that is, in the order they are found in the filesystem, which I think would be better to follow).

Generally I think the script is more of a bypass compared to mine. Bypasses are good, but we should make sure that there are no consequences.
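
To illustrate the single-pass idea: a rough sketch that builds one find expression covering all extensions, so each file is visited once even if an extension is repeated, and files come out in whatever order find walks the tree. It uses a bash array instead of eval. The names `build_find_args`, `list_matches` and `FIND_ARGS` are made up for illustration.

```shell
#!/bin/bash
# build_find_args EXT...: fill the global array FIND_ARGS with one
# parenthesised group of -name tests. Hypothetical helper name.
build_find_args() {
    local ext
    FIND_ARGS=( -type f '(' -name "*.$1" )
    shift
    for ext; do
        FIND_ARGS+=( -o -name "*.$ext" )
    done
    FIND_ARGS+=( ')' )
}

# list_matches SRC EXT...: print matching paths relative to SRC, one per line.
list_matches() {
    local src=$1
    shift
    build_find_args "$@"
    ( cd "$src" && find . "${FIND_ARGS[@]}" )
}
```

The parentheses matter: without them, `-type f` binds only to the first `-name` test, because the implicit "and" has higher precedence than `-o`.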

primerib 03-12-2010 02:32 PM

Quote:

Originally Posted by konsolebox (Post 3895615)
Since you consider your script simpler than mine, I hope you don't mind if I make a few comments of my own.

Sure, always.

Quote:

The script also tends to make unnecessary calls to sed and multiple calls to mkdir.
Wrong on both counts. sed is called once to convert the extension list into an array, which is necessary. Also, mkdir is called only when the full destination path doesn't exist.

Quote:

You also haven't included a test for an invalid argument being passed at runtime.
Yes I have. It checks that you entered a source dir, a destination dir, and at least one extension, and further, that the source and destination dirs actually exist. There is no check for whether the extension is valid, because any file can have any extension, so doing so would be absurd.

Quote:

Since you loop over every extension, if the same extension is given twice, the files will probably be copied twice.
This is true, as I assumed the user wouldn't put the same extension more than once. However, it's very easy to ensure any duplicates are removed. I've edited the script to deal with this.
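
One possible way to drop duplicates before the loop even starts, shown as a sketch rather than the exact change made to the script. It splits the comma-separated list with `read` instead of sed and remembers what it has already seen. The function name `unique_exts` is made up for illustration.

```shell
#!/bin/bash
# unique_exts LIST: split a comma-separated LIST and print each extension
# once, in first-seen order. Hypothetical helper.
unique_exts() {
    local -a exts
    local seen=, ext
    IFS=, read -r -a exts <<< "$1"
    for ext in "${exts[@]}"; do
        # Compare against ",ext," so that "p" does not match inside "mp3".
        [[ -z $ext || $seen == *",$ext,"* ]] && continue
        seen+="$ext,"
        printf '%s\n' "$ext"
    done
}
```

For example, `unique_exts jpg,mp3,jpg,wmv,mp3` prints jpg, mp3 and wmv on separate lines.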


Quote:

At the very least, the files will also not be copied in order (that is, in the order they are found in the filesystem, which I think would be better to follow).
I don't see how the order in which the files are copied matters so long as they're all copied, which is why I didn't bother to address it.

Quote:

Generally I think the script is more of a bypass compared to mine. Bypasses are good, but we should make sure that there are no consequences.
My script gives the user the desired result without any consequences. I don't believe in making scripts harder or more complex than they need to be. Simple tasks only need simple scripts 99% of the time. When you address every possible scenario you can dream up, your script gets filled with tons of checks that will hardly, if ever, be triggered.

Btw, I also didn't bother checking if a file being copied is empty/0 bytes but you forgot to point that out. ;)

primerib 03-12-2010 02:40 PM

Quote:

Originally Posted by Guttorm (Post 3895486)
Hi

If you want to keep the directory structure, it's simpler to use rsync.

rsync -r -v --include '*/' --include '*.mp3' --exclude '*' sourcedir targetdir

The only thing I don't like about that method is that it requires the user to have rsync installed. I'm not a fan of making users install packages just to do things that can easily be done without that requirement.

However, it's really a matter of personal preference. If the user wants to install rsync to do the task in 1 line instead of 2 then he's more than welcome to do so.

sokha 03-13-2010 12:33 AM

Thank you all for the many ways of doing it. I chose Guttorm's idea of using rsync. It's simple and easy to understand. When I have free time, I will try the other methods.

Cheers,

konsolebox 03-13-2010 01:20 AM

Quote:

Originally Posted by primerib (Post 3896094)
Sure, always.

Thanks for the diplomacy.
Quote:

Wrong on both counts. sed is called once to convert the extension list into an array, which is necessary. Also, mkdir is called only when the full destination path doesn't exist.
What I meant is that if the arguments were not separated by commas, you wouldn't need sed. Regarding mkdir, maybe you could just use 'cp --parents'?
Quote:

Yes I have. It checks that you entered a source dir, a destination dir, and at least one extension, and further, that the source and destination dirs actually exist. There is no check for whether the extension is valid, because any file can have any extension, so doing so would be absurd.
Sorry, I missed this:
Code:

                *)      usage;;
Quote:

This is true, as I assumed the user wouldn't put the same extension more than once. However, it's very easy to ensure any duplicates are removed. I've edited the script to deal with this.
ok.
Quote:

I don't see how the order in which the files are copied matters so long as they're all copied, which is why I didn't bother to address it.
Maybe you're right. I just have a feeling that something will go wrong if I mess with the order. It's only an intuition, and maybe I'm wrong.
Quote:

My script gives the user the desired result without any consequences. I don't believe in making scripts harder or more complex than they need to be. Simple tasks only need simple scripts 99% of the time. When you address every possible scenario you can dream up, your script gets filled with tons of checks that will hardly, if ever, be triggered.

Btw, I also didn't bother checking if a file being copied is empty/0 bytes but you forgot to point that out. ;)
First of all, scripts can in principle be made perfect, but our own perception is always limited, and most of the time something goes wrong.

What I meant is that I try to build things the way they should be.

I always tend to follow the idea that things can (and sometimes should) be made as simple as possible, but not simpler. My script may look complicated to you, but to me it's the simplest form I can make based on the requirements and the range of my perception.

Make it any simpler and I get the feeling that things will just break.

primerib 03-13-2010 02:17 AM

Quote:

Originally Posted by konsolebox (Post 3896510)
What I meant is that if the arguments were not separated by commas, you wouldn't need sed. Regarding mkdir, maybe you could just use 'cp --parents'?

You mean this?

--parents
use full source file name under DIRECTORY

I didn't think that does what you're suggesting based on the description. I know you can make parent dirs with mkdir but I didn't know it could be done with cp as well.

Quote:

Maybe you're right. I just have a feeling that something will go wrong if I mess with the order. It's only an intuition, and maybe I'm wrong.
The order of -s -d -e? I normally don't do arguments like that but rather just expect $1 to be the source, $2 to be the destination, and $3 onward to be the extensions. That method could trim a few lines, but I wanted to use a case example as well so he might see how you could add default paths and use -s/-d to override them.
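
For comparison, a sketch of the positional layout described above ($1 source, $2 destination, the rest extensions). The function name `parse_args` and the variable `EXTENSIONS` are made up for illustration; `SOURCEDIR` and `DESTDIR` follow the script's naming.

```shell
#!/bin/bash
# parse_args SRC DST EXT...: positional layout, no option flags.
# Sets the globals SOURCEDIR, DESTDIR and EXTENSIONS (illustrative names).
parse_args() {
    if (( $# < 3 )); then
        echo "usage: copy.sh <source dir> <dest dir> <ext> [ext ...]" >&2
        return 1
    fi
    SOURCEDIR=$1
    DESTDIR=$2
    shift 2                  # what remains in "$@" is the extension list
    EXTENSIONS=( "$@" )
}
```

This drops the case block and the comma splitting, at the cost of not being able to fall back to default paths.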

Quote:

I always tend to follow the idea that things can (and sometimes should) be made as simple as possible, but not simpler. My script may look complicated to you, but to me it's the simplest form I can make based on the requirements and the range of my perception.

Make it any simpler and I get the feeling that things will just break.
Your script is very easy to understand; it just contains a lot of unnecessary lines in my opinion. I used to try to make every script airtight, so to speak, but as I said previously, it was just wasting time on conditions that (almost) never occur, so I decided I'll just keep things simple unless there's an actual need for complexity. I also assume the user pays at least some attention to what he's doing most of the time, because he should.

As for breakage, his request is quite simple. There aren't many places where breakage can occur.

The cool thing is that this is a good example of very different ideas but both work and get the same end result so it really does just boil down to preference in the end. And it's nice to have options. ;)

konsolebox 03-15-2010 02:51 AM

Quote:

Originally Posted by primerib (Post 3896544)
You mean this?

--parents
use full source file name under DIRECTORY

I didn't think that does what you're suggesting based on the description. I know you can make parent dirs with mkdir but I didn't know it could be done with cp as well.

Yes, I guess the description is a bit unclear. With --parents, cp recreates the source file's whole path under the target directory. For example:
Code:

cp --parents a/b.txt c
will create c/a/b.txt
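
A throwaway demo of that behaviour, run in a temporary directory and assuming GNU cp. Note that the target directory c has to exist already; `--parents` only creates the intermediate directories underneath it.

```shell
#!/bin/bash
# Demo in a temporary directory; the names a, b.txt and c mirror the
# example above. Assumes GNU cp.
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir a c
echo hello > a/b.txt

cp --parents a/b.txt c    # recreates a/ inside c, then copies the file

find c -type f            # prints c/a/b.txt
```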

Quote:

The order of -s -d -e? I normally don't do arguments like that but rather just expect $1 to be the source, $2 to be the destination, and $3 onward to be the extensions. That method could trim a few lines, but I wanted to use a case example as well so he might see how you could add default paths and use -s/-d to override them.
No, not that order. What I meant is that if you handle each extension in its own loop, all files with that extension will be copied before the next extension is started. If we instead use find -name '*.EXT1' -or -name '*.EXT2', we can get a list like
Code:

files/0.EXT1
files/a.EXT2
files/Z.EXT1
files/_.EXT2

instead of
Code:

files/0.EXT1
files/Z.EXT1
files/a.EXT2
files/_.EXT2

Not really critical so just let it go.

Quote:

Your script is very easy to understand; it just contains a lot of unnecessary lines in my opinion. I used to try to make every script airtight, so to speak, but as I said previously, it was just wasting time on conditions that (almost) never occur, so I decided I'll just keep things simple unless there's an actual need for complexity. I also assume the user pays at least some attention to what he's doing most of the time, because he should.

As for breakage, his request is quite simple. There aren't many places where breakage can occur.

The cool thing is that this is a good example of very different ideas but both work and get the same end result so it really does just boil down to preference in the end. And it's nice to have options. ;)
Well, thanks, and OK, I'll respect your opinion. As for preference and options, yes, I guess I can agree with that. :)


All times are GMT -5.