LinuxQuestions.org


shelfitz 02-07-2006 11:28 AM

Batch Rename
 
I have a bunch of .img files and .hdr files that I need to rename (they are all in the same directory). Specifically, I'd like to rename them to include a subject identifier. For example, I'd like to rename both types of files such that:

con_0040.img (or .hdr) > 4001_con0040.img (or .hdr)

I'm a complete novice when it comes to Linux, and I am in the unenviable position of having to rename hundreds of files! I'd be really grateful for any assistance.

ksgill 02-07-2006 11:35 AM

You should be able to do that by using a shell script - something similar to this:
http://aplawrence.com/Linux/rename.html

muha 02-08-2006 02:41 AM

First of all make a backup copy of the entire folder!
If you only have .img or .hdr files in the dir:

It looks like you want this:
Rename all files with con_ to 4001_con so that should be:
/edit: i typed a mistake here before, instead try this:
rename 'con_' '4001_con' *

or
for i in *; do mv $i `echo $i | sed 's/con_/4001_con/g'`; done
(mind the `'s)
For all files in this dir; do move from old name ($i) to new name where we replace con_ with 4001_con
-------------
If you have other files with other extensions in the same folder:
rename 'con_' '4001_con' *.img
Or both extensions together:
rename 'con_' '4001_con' *.img *.hdr

for i in *.img; do mv $i `echo $i | sed 's/con_/4001_con/g'`; done
Or both extensions together:
for i in *.img *.hdr; do mv $i `echo $i | sed 's/con_/4001_con/g'`; done
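
A variant of the loops above that also copes with spaces in file names, as a minimal sketch: it uses bash parameter expansion instead of sed and quotes every variable. The function name add_prefix is made up for this example, and it assumes the files are in the current directory.

```bash
#!/bin/bash
# add_prefix PREFIX FILE...   (hypothetical helper, not part of any package)
# Renames con_XXXX.ext to PREFIX_conXXXX.ext; other files are left alone.
add_prefix() {
    local prefix=$1 f new
    shift
    for f in "$@"; do
        [ -e "$f" ] || continue              # skip globs that matched nothing
        case $f in
            con_*) new="${prefix}_con${f#con_}" ;;   # ${f#con_} strips the leading con_
            *)     continue ;;
        esac
        mv -i -- "$f" "$new"                 # -i asks before overwriting
    done
}
```

It would be called from inside a subject directory as: add_prefix 4001 *.img *.hdr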

I'm not really sure on your renaming scheme. Do all files get the same prefix 4001_con ?

shelfitz 02-08-2006 06:01 PM

Thanks for this! I will try it tomorrow when I'm back in the lab. Yes, all the files in the said directory will get the same numerical prefix, though the prefix will change as a function of the particular subject directory that I'm working with at the time, if that makes sense (so for any given directory, I will want to add the same identifying prefix to the .img and .hdr files, but it will be 4001 ... 4002 ... 4003 and so on).

muha 02-09-2006 04:11 AM

Ah! If you want a counter you should look at the link posted by ksgill, and use the part with x=$((x+1))
If you provide a better example of the actual filenames you are dealing with i could give the proper command for it. Just put down three or four names of them and what the filenames should become. Anyways you can do it with a for-loop with a counter.

shelfitz 02-09-2006 03:11 PM

Here's what I want to do:

old name: con_0027.img
new name: 4001_con0027.img

old name: con_027.hdr
new name: 4001_con027.hdr

The directories consist of both .hdr and .img files and there will be 27-40 of each (i.e., con_027 ... con_028 ... etc.). All the files in a given directory will get the same numerical prefix. As I navigate to each new subject's directory (I'm dealing with fMRI data), I will have to do the same thing, only I will change the prefix to that particular subject's identification number. I need to do this because I have to create one directory for all the subject files, so there has to be something to distinguish between subjects since the files are named the same in each person's directory

I may not be making sense, but I hope I am. Thanks again for your help with this!

muha 02-10-2006 04:16 AM

ok! This is getting interesting :) Now my question is: where does the prefix come from?
Maybe it's like the directory is called 4001 for subject 4001
and 4002 for subject 4002?

Also: con does not change per subject right? Meaning: all files from all directories will get the _con part in it?

Do you want to be able to do multiple directories in one go? Is that an option? I'd try that if i were you since it saves a lot of browsing the directories.

/edit: i have a working command for you if you just answered yes to the questions above:
A couple things:
1. first make a backup of the directories yourself!
2. i called the directory where you put all the renamed files: outputdir
You can change it to any name.
3. To create the outputdir (called outputdir) type this command:
Code:

mkdir ./outputdir
4. when you execute this command i'm assuming you are one directory up of the subject-dirs.
So when you do a directory listing we should see this:
Code:

ls
4001  4002  4003  4004  outputdir

5. it will copy all *.img and *.hdr from all subdirectories, if you want to exclude directories: let me know.
6. you could also put this command in a script. If you are interested in that, let me know.
7. for this 'problem' there are many possible solutions, i think this one works reasonably well.
8. in my earlier post i was using the move (mv) command, i think it is safer to use copy (cp), so i used it here.
9. I am assuming you named all the subject directories like so: 4001
And another patient directory could be called 4002

Anyways here is the command:
Code:

for i in `find . -name "*.img" -o -name "*.hdr"|grep -v outputdir`; do cp -i $i ./outputdir/`dirname $i |sed 's/^.\///g'``basename $i | sed 's/con_/_con/g'`; done
(make sure to COPY this command to your terminal since it contains backticks `` that can be easily missed when typing by hand.
To paste it in your terminal do a middle-mouse button click or paste with right-click)

Explanation:
for i in `find . -name "*.img" -o -name "*.hdr"|grep -v outputdir`;
find all files with extensions .hdr and .img in all subdirectories. Exclude the output directory (since it may also contain files from previous runs) with grep.
-------------------------------------------------------------------

do cp -i $i
copy the file from its original location to
(the -i option is to specify that we don't want to overwrite any files in the outputdir so
if a file already exists it will prompt you with a question whether to overwrite or not)

./outputdir/
(copy to) ./outputdir specifies the output directory that will contain all copied&renamed files.

`dirname $i |sed 's/^.\///g'`
take the original directory-name, it will output the patient prefixes like 4001. Strip the leading ./ with sed to give a name like 4001

`basename $i | sed 's/con_/_con/g'`;
specify the name of the file to copy to. Replace con_ with _con

done :)
-------------------------------------------------------------------
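
The same steps can be sketched as a small function that keeps file names with spaces intact. This is only an illustration under the assumptions listed in the post (subject directories named like 4001, files directly inside them); the name collect_flat is made up, and it skips existing targets with a warning instead of prompting like cp -i.

```bash
#!/bin/bash
# collect_flat [OUTDIR]   (hypothetical helper name)
# Copies ./SUBJECT/con_NNNN.{img,hdr} to OUTDIR/SUBJECT_conNNNN.{img,hdr}.
# Assumes it is run one directory above the subject directories and that
# OUTDIR is a plain directory name such as outputdir.
collect_flat() {
    local outdir=${1:-outputdir} path subject base dest
    mkdir -p "$outdir" || return 1
    # -prune skips the output directory (instead of grep -v);
    # -print0 with read -d '' keeps unusual file names in one piece
    find . -path "./$outdir" -prune -o -type f \( -name '*.img' -o -name '*.hdr' \) -print0 |
    while IFS= read -r -d '' path; do
        subject=$(basename "$(dirname "$path")")    # ./4001/con_0040.img -> 4001
        base=$(basename "$path")                    # con_0040.img
        dest="$outdir/${subject}${base/con_/_con}"  # outputdir/4001_con0040.img
        if [ -e "$dest" ]; then
            echo "skipping $path: $dest already exists" >&2
            continue
        fi
        cp -- "$path" "$dest"
    done
}
```

Usage would be: collect_flat outputdir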

sources:
http://www.splike.com/howtos/bash_fa...sion%2C+etc%3F

Off-topic: mri rules! I was a subject for a study into gambling-addiction this year, i think it was mostly structural mri.
For the record: I was in the negative test group :P

I'm interested whether this solution works for you, if you have questions feel free to ask!

shelfitz 02-11-2006 07:37 AM

muha,

Wow, the time you're spending on this--your efforts are truly appreciated! Unfortunately, I haven't had any time to process data in the last few days (busy scanning), but once I do, I will certainly let you know how things worked out! Thanks, too for the explanations of each command--as a novice this is incredibly helpful!


Very best,
SF

muha 02-11-2006 10:03 AM

no problem! I'm in the process of learning about this myself so i have useful text lying around on how to do these things.
Also, i'm trying to score karma :D
cya,
Muha

Robin Tell 03-03-2006 04:47 PM

variations
 
Great, this is getting very close. I'm another newbie with a similar issue--actually a simpler version I think--but I need a couple more wrinkles ironed out.

I've migrated a bunch of files from my Mac to the new Linux box. I've been a Mac user all my life, and these files, going back to 1991 or so, are mostly named without filetypes and commonly with spaces in the filenames. I just want to go through them and turn them all into compliant filenames.

So two things I need to add to what's been explained already: first, how do I batch-append to filenames? In some cases I have a directory whose contents are all MSWord docs, so it's like the
rename .htm .html *.htm
referenced before, but without the .htm to hang my hat on. I'm sure there's some simple way to signify "at the end of the string" but I haven't found it. I thought it would be $ but no.

Secondly: to strip out the spaces, I'll probably need to use a loop, which I haven't tried yet; how can I make that loop recursive, and de-space the whole directory structure in one go?

Thanks!

RST

muha 03-04-2006 11:33 AM

First to remove all spaces, and for instance to turn them into underscores _
Code:

rename ' ' '_' *
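does roughly the same as the loop below (this form of rename replaces only the first space in each name, while the tr loop replaces all of them):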
Code:

for i in *; do mv "$i" `echo $i | tr ' ' '_'`; done
Replace all spaces with underscores when moving files. You're 'moving' them to the same directory so basically it's renaming spaces to underscores.

I'm not sure what you mean with the .htm part but ..
for renaming all files in a dir to *.htm:
Code:

for i in *; do mv "$i" "$i.htm"; done
as an example: finding the end of a string and append .htm to it:
Code:

for i in *; do echo $i|sed 's/$/&.htm/g'; done
$ does mean end of the string
s for substitute
g for global, so multiple times per line
& inserts the previously matched characters
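
For the batch-append question, a rough sketch that does not need sed at all: appending is plain string concatenation in the shell. The function name append_ext is invented for this example; it skips directories and files that already carry the extension.

```bash
#!/bin/bash
# append_ext EXT FILE...   (hypothetical helper name)
# Adds .EXT to every regular file that does not already end in it.
append_ext() {
    local ext=$1 f
    shift
    for f in "$@"; do
        [ -f "$f" ] || continue          # skip directories and unmatched globs
        case $f in
            *."$ext") continue ;;        # already has the extension
        esac
        mv -i -- "$f" "$f.$ext"
    done
}

# The sed form from the post, for comparison: $ anchors the end of the line
printf '%s\n' 'My Report' | sed 's/$/.doc/'    # prints: My Report.doc
```

In a directory full of Word documents it would be called as: append_ext doc *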

/edit: i haven't found a one-liner that replaces spaces with underscores recursively.
My guess is your best option would be a rename bash-script.
Maybe this one: Space Replace 1.0.1: http://www.novell.com/coolsolutions/tools/15601.html
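
A possible way to do the recursive part with find, offered as a sketch rather than a tested tool (the name despace_tree is made up). The key point is -depth, which makes find list the contents of a directory before the directory itself, so nothing is renamed out from under a path that still has to be processed. Try it on a copy of the data first.

```bash
#!/bin/bash
# despace_tree [DIR]   (hypothetical helper name)
# Replaces spaces with underscores in every file and directory name below DIR.
despace_tree() {
    local top=${1:-.} path dir base new
    find "$top" -depth -name '* *' -print0 |
    while IFS= read -r -d '' path; do
        dir=$(dirname "$path")
        base=$(basename "$path")
        new=${base// /_}                 # every space in the last path component
        if [ -e "$dir/$new" ]; then
            echo "skipping $path: $dir/$new already exists" >&2
        else
            mv -- "$path" "$dir/$new"
        fi
    done
}
```

Called as despace_tree . it works on the current directory and everything below it.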

shelfitz 10-08-2006 05:28 AM

muha,

I should have checked in sooner to let you know how helpful your suggestions have been! I've been using parts of your code, and it has saved me and my colleagues a bunch of time!

Mostly we've used the for loop when working with individual directories, but now I'm getting greedy and want to try what you've provided to do many directories all at once. I just tested the sequence you provided:

for i in `find . -name "*.img" -o -name "*.hdr" | grep -v outputdir`; do cp -i $i ./outputdir/`dirname $i |sed 's/^.\///g'``basename $i | sed 's/con_/_con/g'`; done

It works brilliantly, but there is something I would like to tweak. The .img and .hdr files I need to rename are not directly in 4001 for example, but are in a subdirectory (i.e., 4001/task/faces). I need to get the files from this subdirectory, prepend the 4001 to them and copy them to an output directory. Is there an easy modification for this?

muha 10-08-2006 06:41 AM

Hey Shelfitz, nice to hear things are working out over there :D
I wrote a small script which I think does what you want:
Code:

#!/bin/bash
# Copy all *.img and *.hdr files from subdirs and rename them to <outputdir><dirnumber>_<con number>.extension
# If the files were not present in the output directory, it gives feedback in the form of:
# 'old filename' -> 'new filename' (because of the -v verbose option in the copy)
# If the files were present in the outputdir it gives NO feedback.

# Rename script that copies files called old, into new:
# old ./4001/task/faces/con_0027.img
# new ./outputdir/4001_con0027.img
# old ./4001/task/faces/con_0027.hdr
# new ./outputdir/4001_con0027.hdr
# old ./4002/task/faces/con_028.img
# new ./outputdir/4002_con028.img
# etc ..

# usage:
# 1) save this script as rename_script.sh
#    Put it in the same dir as the ./4001 directories and the outputdir.
# 2) Make sure the outputdirectory is present. Set it as variable output_dir
# 3) chmod the script to make it executable: chmod +x rename_script.sh
# 4) When invoked like ./rename_script.sh it will output a test.
#    Look at the output and when that looks ok:
# 5) Invoke the script like so: ./rename_script.sh |sh
#    in order to execute the output of the script (the actual copy commands)

# assumptions:
# 1) all subfolders of ./4001 must start with a letter. In the example t for task
# 2) none of the main folder can start with a letter but must be called as a number: 4001
# 3) It should never overwrite files in the outputdir. See variable copy_overwrite.

# set some variables
# the name for the output directory (set it to what you want)
output_dir="outputdir"
# overwrite file in outputdir when it is already present when copying
copy_overwrite="no"

# The find and copy command. A for loop for each file present.
# First find all files in subdirs called *.hdr and *.img (excluding subdir $output_dir).
# Variable $i is now in the form of ./4001/task/faces/con_0027.img
for i in `find . -name "*.img" -o -name "*.hdr" | grep -v ${output_dir}`; \
# The copy bit; -i --reply=no is not to overwrite files in $output_dir
# -n to delete the trailing endline to keep it on one line
# -v is for verbose mode when copying (not essential)
# What comes out of the echo is: cp -v -i --reply=no ./4001/task/faces/con_0027.img
do echo -n "cp -v -i --reply=${copy_overwrite} ${i} " ; \
# followed by
# ${i%%/[a-zA-Z]*} leaves only the ./4001 part
# In other words, it deletes anything coming after ./4001 which has the form: slash letters; so /task/faces
#
# sed 's#^\.#\.\/'"${output_dir}"'#g'
# This sed converts the output of the echo: ./4001
# First it looks for 'the begin of the sentence' followed immediately by a .
# it then replaces . by ./outputdir and echos out the rest, which is /4001
# So in total this first echo outputs: ./outputdir/4001
echo -n ${i%%/[a-zA-Z]*}| sed 's#^\.#\.\/'"${output_dir}"'#g'; \
# followed by a replace of ./4001/task/faces/con_0027.img into _con0027.img
# What sed does exactly: replace anything (.*) (here: ./4001/task/faces/) followed by con_ and the trailing bit,
# with _con followed by the trailing bit (0027.img)
echo ${i}|sed 's#.*con_#_con#g'; done

All the parts with \ at the end get strung together in one big line, like the previous for-loop you used.
/note: It would be nice to include some more checks to see if the output dir exists and all went well. Maybe some options to feed into the script from the commandline. You could also do a checksum to see if all went well: count the number of files in the input dirs and compare that to the output after the copy (or something like a proper hash-checksum). Also, I have not thoroughly tested it yet. Do that yourself in a testdir and try some weird situations to see if it still works. I'd say always backup the input dirs before you start (if possible).
I'm sure it's not up to professional standards, but hey: it's free and it works for me! Anyways, have fun :D
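
The name mangling in the script can be tried out on its own. The sketch below is not the script's exact logic: it takes the first path component as the subject number instead of cutting at the first "/letter", and the function name target_name is made up. It only shows how the parameter expansions fit together.

```bash
#!/bin/bash
# target_name PATH [OUTDIR]   (hypothetical helper name)
# Maps ./4001/task/faces/con_0027.img to ./outputdir/4001_con0027.img
target_name() {
    local path=$1 outdir=${2:-outputdir} subject file
    subject=${path#./}          # 4001/task/faces/con_0027.img
    subject=${subject%%/*}      # 4001  (%% removes the longest match of /* from the end)
    file=${path##*/}            # con_0027.img  (## removes the longest match of */ from the start)
    file=${file/#con_/con}      # con0027.img   (/# replaces only at the start)
    printf './%s/%s_%s\n' "$outdir" "$subject" "$file"
}

target_name ./4001/task/faces/con_0027.img    # prints: ./outputdir/4001_con0027.img
```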

The output I get from when testing and running it:
Code:

$ ls -R *
rename_script.sh

4001:
task/

4001/task:
faces/

4001/task/faces:
con_0027.hdr  con_0027.img

4002:
task/

4002/task:
faces/

4002/task/faces:
con_0028.hdr

outputdir:

$ ./rename_script.sh
cp -v -i --reply=no ./4001/task/faces/con_0027.img ./outputdir/4001_con0027.img
cp -v -i --reply=no ./4001/task/faces/con_0027.hdr ./outputdir/4001_con0027.hdr
cp -v -i --reply=no ./4002/task/faces/con_0028.hdr ./outputdir/4002_con0028.hdr

$ ./rename_script.sh |sh
`./4001/task/faces/con_0027.img' -> `./outputdir/4001_con0027.img'
`./4001/task/faces/con_0027.hdr' -> `./outputdir/4001_con0027.hdr'
`./4002/task/faces/con_0028.hdr' -> `./outputdir/4002_con0028.hdr'

$ ls -R *
rename_script.sh*

4001:
task/

4001/task:
faces/

4001/task/faces:
con_0027.hdr  con_0027.img

4002:
task/

4002/task:
faces/

4002/task/faces:
con_0028.hdr

outputdir:
4001_con0027.hdr  4001_con0027.img  4002_con0028.hdr

In the last line you can see that the outputdir now contains the *.img and *.hdr
If you have questions, feel free to ask!

shelfitz 10-09-2006 03:09 PM

Muha,

Wow, you're incredible! Thanks for working on this. I did run into a problem though:

$ ./rename_script.sh
cp -v -i --reply=no ./4001/task/faces/con001.hdr ./outputdir/4001./4001/task/faces/con001.hdr
cp -v -i --reply=no ./4001/task/faces/con001.img ./outputdir/4001./4001/task/faces/con001.img
cp -v -i --reply=no ./4002/task/faces/con001.hdr ./outputdir/4002./4002/task/faces/con001.hdr
cp -v -i --reply=no ./4002/task/faces/con001.img ./outputdir/4002./4002/task/faces/con001.img
cp -v -i --reply=no ./4003/task/faces/con001.hdr ./outputdir/4003./4003/task/faces/con001.hdr
cp -v -i --reply=no ./4003/task/faces/con001.img ./outputdir/4003./4003/task/faces/con001.img

$ ./rename_script.sh | sh
`./4001/task/faces/con001.hdr' -> `./outputdir/4001./4001/task/faces/con001.hdr'
cp: cannot create regular file `./outputdir/4001./4001/task/faces/con001.hdr': No such file or directory
`./4001/task/faces/con001.img' -> `./outputdir/4001./4001/task/faces/con001.img'
cp: cannot create regular file `./outputdir/4001./4001/task/faces/con001.img': No such file or directory
`./4002/task/faces/con001.hdr' -> `./outputdir/4002./4002/task/faces/con001.hdr'
cp: cannot create regular file `./outputdir/4002./4002/task/faces/con001.hdr': No such file or directory
`./4002/task/faces/con001.img' -> `./outputdir/4002./4002/task/faces/con001.img'
cp: cannot create regular file `./outputdir/4002./4002/task/faces/con001.img': No such file or directory
`./4003/task/faces/con001.hdr' -> `./outputdir/4003./4003/task/faces/con001.hdr'
cp: cannot create regular file `./outputdir/4003./4003/task/faces/con001.hdr': No such file or directory
`./4003/task/faces/con001.img' -> `./outputdir/4003./4003/task/faces/con001.img'
cp: cannot create regular file `./outputdir/4003./4003/task/faces/con001.img': No such file or directory

I'm not quite sure what's happening here. Your assumptions were spot on (i.e., my directory contains individual subject directories (4001, 4002, etc.) and within each there are identically named subdirectories). My actual subdirectory pathway differs from the example I provided (it is 4001/stats/snod) but I tested things as if it were named 4001/task/faces and it still produced an error. By the way, I just created an output directory named outputdir so I did not change the variable name in the script.

Also, one other thing. Is it possible to have the program look in a specific subdirectory in each subject directory (i.e., 4001/stats/snod)? There are other subdirectories that have .img and .hdr files that I do not want to rename at this time.

Many thanks for your help and time, Muha! Your documentation has really helped me understand what you are doing!

Cheers,
SF

muha 10-09-2006 04:45 PM

/edit:
I see now. Your filenames are different than I thought.
./4001/task/faces/con001.hdr
instead of
./4001/task/faces/con_001.hdr
That's why my script does not work (yet).

Are these filenames consistent? Are they all named:
con<number>.<img or hdr>

- I'll look at /stats/snod
So you want to be able to specify per copy-action which subdirectory you want to use? /task/faces/ or /stats/snod ?

muha 10-10-2006 08:37 AM

I wrote the script underneath, which works better than the first one.
It's a little bit more robust and gives warnings if files already exist with similar names in the outputdir.
TEST IT on directories which are safe to destroy! I tried some things which I think might go wrong and it works over here but you never know what's happening at your side.
So make a copy of some directories and start playing with those first.
One thing I tried to build in is to prevent overwriting of
/task/faces/con0027.img
with
/stats/snod/con0027.img
to the same file in the outputdir, e.g. outputdir/4001_con0027.img
This script should prevent that situation.

You can now specify the source directories like /task/faces/ or /stats/snod/
If you need to know more about how I did things, feel free to ask.
Let me know how it works for you, m'kay? ;)


Code:

#!/bin/bash

#      -------------------------------------------------------------------
#
#      Shell program to copy all *.img and *.hdr files from subdirs and
#      rename them to ./<outputdir>/<dirnumber>_con<number>.extension
#
#      Copyright 2006, <scriptfreak at gmail.com>
#
#      This program is free software; you can redistribute it and/or
#      modify it under the terms of the GNU General Public License as
#      published by the Free Software Foundation; either version 2 of the
#      License, or (at your option) any later version.
#
#      This program is distributed in the hope that it will be useful, but
#      WITHOUT ANY WARRANTY; without even the implied warranty of
#      MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
#      General Public License for more details.


#      -------------------------------------------------------------------
#      Constants
#      -------------------------------------------------------------------

        PROGNAME=$(basename "$0")
        VERSION="0.0.2"

#      -------------------------------------------------------------------
#      Variables
#      -------------------------------------------------------------------

        # set some variables
        # make sure the source_dir is not specified yet
        source_dir=
        # the name for the output directory (set it to what you want)
        output_dir="outputdir"
        # overwrite file in outputdir when it is already present when copying
        copy_overwrite="no"

#      -------------------------------------------------------------------
#      Functions
#      -------------------------------------------------------------------

function test_if_file_does_not_exist
{

#      -----------------------------------------------------------------------------------------------
#      Function to test that a file does not exist
#              argument is $filename which is the inputfile
#              returns 0 (success) if the file does not exist
#              usage:
#                      test_if_file_does_not_exist <inputfile> || error_exit "<Filename> already exists! Exiting"
#                      OR
#                      if test_if_file_does_not_exist <inputfile>;then echo "Success!"; else echo "Failure"; fi
#      -----------------------------------------------------------------------------------------------

        if [ "$1" != "" ]; then
                if [ -e "$1" ]; then
                        return 1
                else
                        # filename is not empty and filename does not exist
                        return 0
                fi
        else
                return 1
        fi
}


function find_files
{

#      -------------------------------------------------------------------
#      Function to find files called *.img and *.hdr from ./<number>/$source_dir/
#              argument is $source_dir which is the subdirectory to read from
#              So now the find command might look like find ./4001/task/faces/ -name etc ...
#              Exits with an error message if the specified subdirs do not exist.
#      -------------------------------------------------------------------

        # First find all files in subdirs called *.hdr and *.img
        # | sed -n "/^\.\/[0-9]*\/[a-zA-Z]/p" excludes all directories that do start with ./<letter> since we expect these
        # to be output directories and we don't want to copy from them. It also filters out subdirectories that start with numbers.
        find ./*/$1 -name "*.img" -o -name "*.hdr"| sed -n "/^\.\/[0-9]*\/[a-zA-Z]/p" || error_exit "Cannot find files in the subdirectory: $1"

}


function find_and_copy
{

#      -------------------------------------------------------------------
#      Function to find and copy from specified dir
#              argument is $source_dir which is the subdirectory (of ./<number>) to read from
#      -------------------------------------------------------------------

        # The find and copy command. A for loop for each file present.
        # Variable $i is now in the form of ./4001/task/faces/con_0027.img
        for i in `find_files $1`
        do
        # Specify the output file
        # ${i%%/[a-zA-Z]*} leaves only the ./4001 part of ./4001/task/faces/con0027.img
        # In other words, it deletes anything coming after ./4001 which has the form: slash letters; so /task/faces/con0027.img
        #
        # sed 's#^\.#\.\/'"${output_dir}"'#g'
        # This sed converts the output of the echo: ./4001
        # First it looks for 'the begin of the sentence' followed immediately by a .
        # it then replaces . by ./outputdir and echos out the rest, which is /4001
        # So in total this first echo outputs: ./outputdir/4001
        #
        # followed by a replace of ./4001/task/faces/con0027.img into _con0027.img
        # What sed does exactly: replace anything (.*) (here: ./4001/task/faces/) followed by con and the trailing bit,
        # with _con followed by the trailing bit (0027.img)
        outfile="`echo -n ${i%%/[a-zA-Z]*}|sed 's#^\.#\.\/'"${output_dir}"'#g'; echo -n ${i}|sed 's#.*con#_con#g'`"

        # test if file exists in outputdir. If it exists program continues with a warning message.
        test_if_file_does_not_exist "$outfile" || echo "$outfile already exists! Not overwritten by $i" >&2

        # The copy bit; -i --reply=no is not to overwrite files in $output_dir
        # -n to delete the trailing endline to keep it on one line
        # -v is for verbose mode when copying (not essential)
        # What comes out of the echo is: cp -v -i --reply=no ./4001/task/faces/con_0027.img
        echo "cp -v -i --reply=${copy_overwrite} ${i} ${outfile}"
        done
}


function error_exit
{

#      -----------------------------------------------------------------------
#      Function for exit due to fatal program error
#              Accepts 1 argument:
#                      string containing descriptive error message
#      -----------------------------------------------------------------------

        echo "${PROGNAME}: ${1:-"Unknown Error"}" >&2
        exit 1
}


function usage
{

#      -----------------------------------------------------------------------
#      Function to display usage message (does not exit)
#              No arguments
#      -----------------------------------------------------------------------

        local tab=$(echo -en "\t\t")

        cat <<- -EOF-

        Usage: ${PROGNAME} [-h | --help]

        SYNOPSIS
        ./${PROGNAME} [-s | --source] [-o | --outputdir]
        Copies all files named con<number>.img and con<number>.hdr
        from the subdirectories (in ./4001 etc ..) specified by SOURCE
        to ./OUTPUTDIR/<dirnumber>_con<number>.<extension>

-EOF-

}


function helptext
{

#      -----------------------------------------------------------------------
#      Function to display help message for program
#              No arguments
#      -----------------------------------------------------------------------

        local tab=$(echo -en "\t\t")

        cat <<- -EOF-

        ${PROGNAME} ver. ${VERSION}

        Copy all *.img and *.hdr files from subdirs and rename them to ./<outputdir>/<dirnumber>_con<number>.extension
        If the files were not present in the output directory, it gives feedback in the form of:
        'old filename' -> 'new filename' (because of the -v verbose option in the copy)
        If the files were present in the outputdir it gives NO feedback.

        # Rename script that copies files called old, into new:
        # old ./4001/task/faces/con0027.img
        # new ./outputdir/4001_con0027.img
        # etc ..

        Usage:
        1) Save this script as rename_script.sh
          Put it in the same dir as the ./4001 directories and the outputdir.
        2) chmod the script to make it executable: chmod +x rename_script.sh
        3) When invoked like ./rename_script.sh -s task/faces -o outputdir
          it will output a test.
          Look at the output and when that looks ok:
        4) Invoke the script like so: ./rename_script.sh -s task/faces -o outputdir |sh
          in order to execute the output of the script (this runs the actual copy commands).

        assumptions:
        1) We only copy files from subfolders that start with a letter. In the example t for task
        2) None of the main folders can start with a letter but must be called as a number: 4001
          If they start with letters we don't copy from them.
        3) It should never overwrite files in the outputdir. See variable copy_overwrite.
        4) All files to be copied are named con<number>.<img or hdr>
        5) The directory-names do not contain the letters con.

        $(usage)

        Options:

        [NO OPTIONS]    Show the usage text.
        -s, --source    Dir specified by user is the subdirectory to copy from.
        -o, --outputdir Dir specified by user is the outputdirectory to copy to.
                        By default this is ./outputdir so this argument is optional.
        -h, --help      Display this help message and exit.

-EOF-
}


#      -------------------------------------------------------------------
#      Program starts here
#      -------------------------------------------------------------------

##### Command Line Processing #####

    while [ "$1" != "" ]; do
        case $1 in
            -s | --source )        shift
                                    if [ "$1" != "" ]; then
                                            # filter out ./ in front of directories
                                            source_dir=`echo $1 |sed 's#^[.]\{0,1\}\/\{0,1\}##g'`
                                    else
                                            helptext >&2
                                            exit
                                    fi
                                    ;;
            -o | --outputdir )      shift
                                    if [ "$1" != "" ]; then
                                            output_dir=$1
                                    else
                                            helptext >&2
                                            exit
                                    fi
                                    ;;
            -h | --help )          helptext >&2
                                    exit
                                    ;;
            * )                    usage >&2
                                    exit 1
        esac
        shift
    done

# test if outputdir exists
# Does $output_dir exist; otherwise create $output_dir
if ! [ -d "$output_dir" ]; then
    mkdir "$output_dir" || error_exit "Cannot create directory $output_dir. Exiting."
fi

# call the function to find and copy
find_and_copy "$source_dir"

# all done so exit
exit
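
For comparison, the core of the job (pick one subdirectory per subject, add the subject number, never overwrite) can be sketched much more compactly with shell globs. This is not a replacement for the script above, just an illustration: the name copy_from_subdir is made up, it copies directly instead of printing commands for |sh, and it assumes the subject directories start with a digit.

```bash
#!/bin/bash
# copy_from_subdir SUBDIR [OUTDIR]   (hypothetical helper name)
# Copies ./NNNN/SUBDIR/con*.{img,hdr} to OUTDIR/NNNN_con*.{img,hdr}.
copy_from_subdir() {
    local subdir=$1 outdir=${2:-outputdir} dir subject f base dest
    mkdir -p "$outdir" || return 1
    for dir in ./[0-9]*/"$subdir"; do
        [ -d "$dir" ] || continue                   # glob matched nothing
        subject=${dir#./}; subject=${subject%%/*}   # ./4001/stats/snod -> 4001
        for f in "$dir"/con*.img "$dir"/con*.hdr; do
            [ -f "$f" ] || continue
            base=${f##*/}                           # con_0027.img or con0027.img
            base=${base#con}; base=${base#_}        # 0027.img (underscore optional)
            dest="$outdir/${subject}_con${base}"
            if [ -e "$dest" ]; then
                echo "$dest already exists! Not overwritten by $f" >&2
                continue
            fi
            cp -v -- "$f" "$dest"
        done
    done
}
```

Usage would be: copy_from_subdir stats/snod all_stats_snod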


shelfitz 10-10-2006 07:04 PM

Hi Muha,

Ah, I see what happened. I just made some con*.img and con*.hdr files to test things and I used the wrong naming convention. The actual files are named as such: con_00*.img and con_00*.hdr and this is consistent (not con00* as the output above suggests). Sorry! I tested it with the right naming convention and it works!

But there is this other issue. The directory where the files live is 4001/stats/snod. I want to avoid getting other con.img and con.hdr files from other subdirectories within the subject directory (there is no way to uniquely specify these in the file name; there will be files with the same names in both directories, I'm afraid). Is it possible to specify the directory to search in as a variable? It will have the same path for every subject (i.e.: subnum (4001)/stats/snod). This would help when applying this to differently named directories as well. The task/faces directory, by the way, was just an example I provided.

Thanks for your awesome help!

Best,
SF

konsolebox 10-10-2006 08:40 PM

hello. please try this one.

Code:

SUBJECT=4001

OLDIFS=$IFS; IFS=$'\n'
for a in $(find . -type f -name '*.img' -print; find . -type f -name '*.hdr' -print); do
        DIRNAME="$(dirname "${a}")"
        BASENAME="$(basename "${a}")"
        # ${BASENAME/_} removes the first underscore: con_0027.img -> con0027.img
        NEWNAME="${DIRNAME}/${SUBJECT}_${BASENAME/_}"

        echo "renaming ${a} to ${NEWNAME}"
        # remove the comment marks below if everything's already correct
        #mv "${a}" "${NEWNAME}" || {
        #        echo "error renaming ${a} to ${NEWNAME}"
        #}
done
IFS=$OLDIFS


muha 10-12-2006 05:52 AM

@konsolebox: I just would like to comment that your script doesn't do anything with the outputdir
It looks like it keeps it all in the same dir?
Quote:

renaming ./4991/stats/snod/con991.img to ./4991/stats/snod/4001_con991.img
Quote:

Originally Posted by shelfitz
Here's what I want to do:
old name: con_0027.img
new name: 4001_con0027.img

old name: con_027.hdr
new name: 4001_con027.hdr

The directories consist of both .hdr and .img files and there will be 27-40 of each (i.e., con_027 ... con_028 ... etc.). All the files in a given directory will get the same numerical prefix. As I navigate to each new subject's directory (I'm dealing with fMRI data), I will have to do the same thing, only I will change the prefix to that particular subject's identification number. I need to do this because I have to create one directory for all the subject files

So what shelfitz wants, as I read it is:
old name: ./4991/stats/snod/con_0027.img
new name: ./outputdir/4001_con0027.img

@shelfitz:
You asked for the subdir part and I already had included that in my script:
Quote:

You can now specify the source directories like /task/faces/ or /stats/snod/
Use the script like so to get all *.img/hdr from ./4001/task/faces/ and ./4002/task/faces/
Code:

./rename_script.sh -s task/faces
Use the script like so to get all *.img/hdr from ./4001/stats/snod and ./4002/stats/snod
Code:

./rename_script.sh -s stats/snod
You can also specify the outputdir with -o so this will work even better:
Use the script like so to get all *.img/hdr from ./4001/task/faces/ and ./4002/task/faces/
to outputdir all_task_faces
Code:

./rename_script.sh -s task/faces -o all_task_faces
Use the script like so to get all *.img/hdr from ./4001/stats/snod and ./4002/stats/snod
to outputdir all_stats_snod
Code:

./rename_script.sh -s stats/snod -o all_stats_snod
Note: append | sh to the command to actually run the copy! Without it, the script only prints the cp commands.
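
The "print the commands first, then pipe them to a shell" approach can be sketched as follows. This is only an illustration under assumptions: emit_copy_command is a hypothetical name, bash's printf %q is used so that printed paths stay safe when fed back to bash, and cp -n (do not overwrite) stands in for the --reply=no option, which may not be accepted by newer versions of GNU cp.

```shell
#!/bin/bash

# Hypothetical helper: print (do not run) a copy command for one file.
# $1 = source file, $2 = destination file
emit_copy_command() {
    # %q quotes each argument so the printed line can be fed back to bash
    printf 'cp -v -n %q %q\n' "$1" "$2"
}
```

Running emit_copy_command old new on its own shows what would happen; appending | bash executes the printed line. Because %q can produce bash-specific quoting, piping to bash rather than plain sh is the safer choice here.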

I made a small change so now it picks up con_0027.img or con0027.img and copies that to
4001_con0027.img. So the underscore before the number is optional.
The updated script:
Code:

#!/bin/bash

#      -------------------------------------------------------------------
#
#      Shell program to copy all *.img and *.hdr files from subdirs and
#      rename them to ./<outputdir>/<dirnumber>_con<number>.extension
#
#      Copyright 2006, <scriptfreak at gmail.com>
#
#      This program is free software; you can redistribute it and/or
#      modify it under the terms of the GNU General Public License as
#      published by the Free Software Foundation; either version 2 of the
#      License, or (at your option) any later version.
#
#      This program is distributed in the hope that it will be useful, but
#      WITHOUT ANY WARRANTY; without even the implied warranty of
#      MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
#      General Public License for more details.


#      -------------------------------------------------------------------
#      Constants
#      -------------------------------------------------------------------

        PROGNAME=$(basename $0)
        VERSION="0.0.3"

#      -------------------------------------------------------------------
#      Variables
#      -------------------------------------------------------------------

        # set some variables
        # make sure the source_dir is not specified yet
        source_dir=
        # the name for the output directory (set it to what you want)
        output_dir="outputdir"
        # overwrite file in outputdir when it is already present when copying
        copy_overwrite="no"

#      -------------------------------------------------------------------
#      Functions
#      -------------------------------------------------------------------

function test_if_file_does_not_exist
{

#      -----------------------------------------------------------------------------------------------
#      Function to test that a file does not exist
#              argument is $filename which is the file to check
#              returns 0 (success) if the filename is not empty and the file does not exist
#              usage:
#                      test_if_file_does_not_exist <file> || error_exit "<file> already exists! Exiting"
#                      OR
#                      if test_if_file_does_not_exist <file>; then echo "Success!"; else echo "Failure"; fi
#      -----------------------------------------------------------------------------------------------

        if [ "$1" != "" ]; then
                if [ -e "$1" ]; then
                        return 1
                else
                        # filename is not empty and filename does not exist
                        return 0
                fi
        else
                return 1
        fi
}


function find_files
{

#      -------------------------------------------------------------------
#      Function to find files called *.img and *.hdr from ./<number>/$source_dir/
#              argument is $source_dir which is the subdirectory to read from
#              So now the find command might look like find ./4001/task/faces/ -name etc ...
#              Exits with an error message if the specified subdirs do not exist.
#      -------------------------------------------------------------------

        # First find all files in subdirs called *.hdr and *.img
        # | sed -n "/^\.\/[0-9]*\/[a-zA-Z]/p" excludes all directories that do start with ./<letter> since we expect these
        # to be output directories and we don't want to copy from them. It also filters out subdirectories that start with numbers.
        find ./*/$1 -name "*.img" -o -name "*.hdr"| sed -n "/^\.\/[0-9]*\/[a-zA-Z]/p" || error_exit "Cannot find files in the subdirectory: $1"

}


function find_and_copy
{

#      -------------------------------------------------------------------
#      Function to find and copy from specified dir
#              argument is $source_dir which is the subdirectory (of ./<number>) to read from
#      -------------------------------------------------------------------

        # The find and copy command. A for loop for each file present.
        # Variable $i is now in the form of ./4001/task/faces/con_0027.img
        for i in `find_files $1`
        do
        # Specify the output file
        # ${i%%/[a-zA-Z]*} leaves only the ./4001 part of ./4001/task/faces/con0027.img
        # In other words, it deletes anything coming after ./4001 which has the form: slash letters; so /task/faces/con0027.img
        #
        # sed 's#^\.#\.\/'"${output_dir}"'#g'
        # This sed converts the output of the echo: ./4001
        # First it looks for 'the begin of the sentence' followed immediately by a .
        # it then replaces . by ./outputdir and echos out the rest, which is /4001
        # So in total this first echo outputs: ./outputdir/4001
        #
        # followed by a replace of ./4001/task/faces/con0027.img into _con0027.img
        # What sed does exactly: replace anything (.*) (here: ./4001/task/faces/) followed by con, an optional _ and the trailing bit,
        # with _con followed by the trailing bit (0027.img)
        outfile="`echo -n ${i%%/[a-zA-Z]*}|sed 's#^\.#\.\/'"${output_dir}"'#g'; echo -n ${i}|sed 's#.*con[_]\{0,1\}#_con#g'`"

        # test if file exists in outputdir. If it exists program continues with a warning message.
        test_if_file_does_not_exist "$outfile" || echo "$outfile already exists! Not overwritten by $i" >&2

        # The copy bit; -i --reply=no is not to overwrite files in $output_dir
        # (the -n on the echo commands above suppresses the trailing newline to keep the name on one line)
        # -v is for verbose mode when copying (not essential)
        # What comes out of the echo is: cp -v -i --reply=no ./4001/task/faces/con_0027.img ./outputdir/4001_con0027.img
        echo "cp -v -i --reply=${copy_overwrite} ${i} ${outfile}"
        done
}


function error_exit
{

#      -----------------------------------------------------------------------
#      Function for exit due to fatal program error
#              Accepts 1 argument:
#                      string containing descriptive error message
#      -----------------------------------------------------------------------

        echo "${PROGNAME}: ${1:-"Unknown Error"}" >&2
        exit 1
}


function usage
{

#      -----------------------------------------------------------------------
#      Function to display usage message (does not exit)
#              No arguments
#      -----------------------------------------------------------------------

        local tab=$(echo -en "\t\t")

        cat <<- -EOF-

        Usage: ${PROGNAME} [-h | --help]

        SYNOPSIS
        ./${PROGNAME} [-s | --source] [-o | --outputdir]
        Copies all files named con<number>.img and con<number>.hdr
        from the subdirectories (in ./4001 etc ..) specified by SOURCE
        to ./OUTPUTDIR/<dirnumber>_con<number>.<extension>

-EOF-

}


function helptext
{

#      -----------------------------------------------------------------------
#      Function to display help message for program
#              No arguments
#      -----------------------------------------------------------------------

        local tab=$(echo -en "\t\t")

        cat <<- -EOF-

        ${PROGNAME} ver. ${VERSION}

        Copy all *.img and *.hdr files from subdirs and rename them to ./<outputdir>/<dirnumber>_con<number>.extension
        If the files were not present in the output directory, it gives feedback in the form of:
        'old filename' -> 'new filename' (because of the -v verbose option in the copy)
        If the files were present in the outputdir it gives NO feedback.

        # Rename script that copies files called old, into new:
        # old ./4001/task/faces/con0027.img
        # new ./outputdir/4001_con0027.img
        # etc ..

        Usage:
        1) Save this script as rename_script.sh
          Put it in the same dir as the ./4001 directories and the outputdir.
        2) chmod the script to make it executable: chmod +x rename_script.sh
        3) When invoked like ./rename_script.sh -s task/faces -o outputdir
          it only prints the cp commands as a test run.
          Look at the output and when that looks ok:
        4) Invoke the script like so: ./rename_script.sh -s task/faces -o outputdir |sh
          in order to execute the output of the script (this runs the actual copy commands).

        assumptions:
        1) We only copy files from subfolders that start with a letter. In the example t for task
        2) None of the main folders can start with a letter but must be called as a number: 4001
          If they start with letters we don't copy from them.
        3) It should never overwrite files in the outputdir. See variable copy_overwrite.
        4) All files to be copied are named con<number>.<img or hdr>
        5) The directory-names do not contain the letters con.

        $(usage)

        Options:

        [NO OPTIONS]    Show the usage text.
        -s, --source    Dir specified by user is the subdirectory to copy from.
        -o, --outputdir Dir specified by user is the outputdirectory to copy to.
                        By default this is ./outputdir so this argument is optional.
        -h, --help      Display this help message and exit.

-EOF-
}


#      -------------------------------------------------------------------
#      Program starts here
#      -------------------------------------------------------------------

##### Command Line Processing #####

    while [ "$1" != "" ]; do
        case $1 in
            -s | --source )        shift
                                    if [ "$1" != "" ]; then
                                            source_dir=`echo $1 |sed 's#^[.]\{0,1\}\/\{0,1\}##g'`
                                    else
                                            helptext >&2
                                            exit
                                    fi
                                    ;;
            -o | --outputdir )      shift
                                    if [ "$1" != "" ]; then
                                            output_dir=$1
                                    else
                                            helptext >&2
                                            exit
                                    fi
                                    ;;
            -h | --help )          helptext >&2
                                    exit
                                    ;;
            * )                    usage >&2
                                    exit 1
        esac
        shift
    done

# test if outputdir exists
# Does $output_dir exist; otherwise create $output_dir
if ! [ -d "$output_dir" ]; then
    mkdir "$output_dir" || error_exit "Cannot create directory $output_dir. Exiting."
fi

# call the function to find and copy
find_and_copy "$source_dir"

# all done so exit
exit
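
The naming rule described in the script comments (./4001/task/faces/con_0027.img becomes ./outputdir/4001_con0027.img) can also be written with bash parameter expansion instead of sed. The following is a minimal sketch, not part of rename_script.sh; outfile_for is a hypothetical helper name, and it assumes paths of the form ./<subjectdir>/<subdirs>/<file>.

```shell
#!/bin/bash

# Hypothetical helper: compute the output file name for one source path.
# $1 = output directory, $2 = source path such as ./4001/task/faces/con_0027.img
outfile_for() {
    local output_dir=$1 path=$2
    local rel subject base
    rel=${path#./}          # 4001/task/faces/con_0027.img
    subject=${rel%%/*}      # 4001 (first path component)
    base=${path##*/}        # con_0027.img (file name only)
    base=${base/#con_/con}  # con0027.img; names without the underscore are unchanged
    printf '%s/%s_%s\n' "$output_dir" "$subject" "$base"
}
```

For example, outfile_for ./outputdir ./4001/task/faces/con_0027.img prints ./outputdir/4001_con0027.img. Because the subject is taken as the first path component, this variant does not depend on the directory names containing no "con".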


shelfitz 10-12-2006 10:23 PM

Hi M,

Well, I tested the marvelous script in a pseudo environment, and it worked brilliantly! But when I tried it in the real world it didn't work. :confused: That is, I executed the command and got a prompt back instantaneously, with no result in the specified output directory (not even an error message, just nothing). I'm assuming it's because the directory in question (i.e., where I ran it from) violates one of your assumptions (which I should have realized sooner!). In addition to numerically named directories, it also has directories named with text. What's more, some of the numerically named directories have names like 4001_2. When I tested this in the pseudo environment (no text-named dirs, but dirs named with _2), it just ignored the _2 directories. I imagine the text-named dirs are also a problem. One other thing: my outputdir is not in the directory from which I launch the script. The program seems to require that the outputdir be in the pwd.

I realize you've given this a lot of time, so do know that I understand if you are busy with other things. I can probably work with this myself.

Thank you!!
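
The script's sed filter only accepts paths of the form ./<digits>/<letter>..., which is consistent with directories such as 4001_2 being skipped. One possible way to select subject directories that are purely numeric or numeric with a _<number> suffix, while ignoring text-named directories, is sketched below. list_subject_dirs is a hypothetical helper, and the pattern is an assumption about how the subject directories are named.

```shell
#!/bin/bash

# Hypothetical helper: print the names of subject directories under a root.
# Accepts names like 4001 or 4001_2; skips text-named directories.
# $1 = directory that holds the subject directories (defaults to .)
list_subject_dirs() {
    local root=${1:-.}
    local d name
    for d in "$root"/*/; do
        [ -d "$d" ] || continue   # the glob matched nothing
        name=${d%/}               # strip the trailing slash
        name=${name##*/}          # keep only the directory name
        if [[ $name =~ ^[0-9]+(_[0-9]+)?$ ]]; then
            printf '%s\n' "$name"
        fi
    done
}
```

Each printed name could then be combined with the fixed subpath, for example "$name/stats/snod", and the output directory could be passed as a full path so that it need not live in the directory the script is launched from.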

