Using tr to replace strings in a group of filenames
Hello,
I have hundreds of filenames of the form CEU_NA21322_NSP.CEL. I am trying to use tr to systematically remove the 'CEU_' prefix, then remove the '_NSP' string, and finally replace the suffix with ',CEU.CEL', so that in the end this filename becomes
NA21322,CEU.CEL
For the first step this is what I am trying:
mv *.CEL `echo *.CEL | tr -d [="CEU_"=]`
tr: CEU_: equivalence class operand must be a single character
mv: missing destination file operand after `CEU_NA3242_NSP.CEL'
Try `mv --help' for more information.
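For context, tr works on single characters rather than on strings, so it is probably the wrong tool for this job. The sketch below is only an illustration on the one example name from the question: it shows what tr -d does with the characters C, E, U and _, and what a sed substitution does with the same name. The variable names are made up for the example, and the sed line is an assumption about the kind of test command meant here, not necessarily the one originally posted.

```shell
#!/bin/sh
name=CEU_NA21322_NSP.CEL

# tr -d deletes every occurrence of each listed character,
# so the C and E of the .CEL suffix disappear as well.
tr_result=$(echo "$name" | tr -d 'CEU_')

# sed substitutes whole strings: strip the prefix, then swap _NSP for ,CEU
sed_result=$(echo "$name" | sed -e 's/^CEU_//' -e 's/_NSP/,CEU/')

echo "tr:  $tr_result"
echo "sed: $sed_result"
```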
You'd see it converted the file name to what you wanted.
For a list of files you could do this with a "for" loop:
Code:
for FILE in `ls CEU_*NSP.CEL`
do NEWFILE=`echo $FILE | sed -e s/^CEU_// -e s/_NSP/,CEU/`
mv $FILE $NEWFILE
done
The above assumes all your files are in the current directory. You'd need to modify it for a different location.
I'd recommend you copy a couple of files to a completely different directory and test the above in that new directory to be sure the results are what you expect.
The above is provided "as is" and, except for the original echo line, has NOT been tested by me. Testing is very important.
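If you would rather not start sed once per file, the same rename can probably be done with the shell's own parameter expansion. This is a minimal sketch, assuming every name really has the form CEU_<id>_NSP.CEL; new_name is a made-up helper, and the loop only prints the mv commands so you can check them before running anything for real.

```shell
#!/bin/sh
# Hypothetical helper: CEU_<id>_NSP.CEL -> <id>,CEU.CEL
new_name() {
    name=${1#CEU_}          # drop the leading CEU_
    name=${name%_NSP.CEL}   # drop the trailing _NSP.CEL
    printf '%s,CEU.CEL\n' "$name"
}

# Dry run: print the commands instead of executing them.
for f in CEU_*_NSP.CEL; do
    [ -e "$f" ] || continue   # nothing matched the pattern
    echo mv -- "$f" "$(new_name "$f")"
done
```

Once the printed commands look right, removing the echo in front of mv makes the loop do the actual renaming.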
In any case, for a large number of files, you're better off doing it with a single program (e.g. perl), rather than spawning mv/rename/whatever for each file. Something like:
Caveats:
My Perl knowledge is quite limited, so the above is probably twice as long and inefficient as necessary.
It seems to work, but test it first, as jlightner suggested.
Thanks for the help, guys. I am more comfortable with bash shell scripts, so I will stick to using those for now. If I write a script, though, how can I pass the output of 'ls' to that script as an argument?
You don't need to, and shouldn't, pass the output of ls. It's a bad idea to rely on ls's output since it's not portable, it varies according to environment variables, and it's changed over time.
If you just want to iterate over a list of files, you can pipe the output of find to the script, or just use shell globbing.
If you need to know the date/size/whatever, it's better to use stat than the output of ls (ls uses stat anyway).
You might want to write the script so you can use it both ways (piping filenames in or passing them on the command line).
E.g.
Code:
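#!/bin/bash
# bash rather than sh, because [[ ]] is not available in plain sh
handle_file()
{
    # do something with file
    :   # placeholder, a function body must not be empty
}
# do option handling here (getopt) if required
# getopt ...
# if no arguments, read from stdin
if [[ $# -eq 0 ]]; then
    # IFS= and -r keep leading spaces and backslashes in names intact
    while IFS= read -r file; do
        handle_file "$file"
    done
else
    for file in "$@"; do
        handle_file "$file"
    done
fi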
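To make the two ways of calling it concrete, here is a small self-contained sketch. It wraps the same logic in a function (process_inputs, a made-up name) so both calls fit in one file, uses a throwaway directory from mktemp, and has handle_file just print the name it was given. It is only meant to show that a glob on the command line and find piped to stdin reach the same files in a flat directory.

```shell
#!/bin/sh
# Stand-in for the real per-file work.
handle_file() {
    printf 'handling %s\n' "$1"
}

# Same logic as the script: arguments if there are any, otherwise stdin.
process_inputs() {
    if [ "$#" -eq 0 ]; then
        while IFS= read -r file; do
            handle_file "$file"
        done
    else
        for file in "$@"; do
            handle_file "$file"
        done
    fi
}

# Throwaway test directory with two sample names.
demo_dir=$(mktemp -d)
touch "$demo_dir/CEU_NA1_NSP.CEL" "$demo_dir/CEU_NA2_NSP.CEL"

# 1) shell globbing: names passed as arguments
from_glob=$(process_inputs "$demo_dir"/CEU_*NSP.CEL)

# 2) find: names piped in on stdin (sorted so the order is predictable)
from_find=$(find "$demo_dir" -name 'CEU_*NSP.CEL' | sort | process_inputs)

rm -r "$demo_dir"
echo "$from_glob"
```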
Note though that the behaviour in each case is different, since passing files on the command line won't be recursive. But you can add a check for directories.
Code:
....
handle_directory()
{
    # remember where we started: "cd .." goes to the wrong place
    # if $1 is a path such as a/b
    local start_dir=$PWD
    local f
    cd "$1" || return
    for f in *; do
        if [[ -d "$f" ]]; then
            handle_directory "$f"
        else
            handle_file "$f"
        fi
    done
    cd "$start_dir"
}
if [[ $# -eq 0 ]]; then
    while IFS= read -r file; do
        handle_file "$file"
    done
else
    for file in "$@"; do
        if [[ -d "$file" ]]; then
            handle_directory "$file"
        else
            handle_file "$file"
        fi
    done
fi
Alternatively, you might want to make recursion optional.
Code:
recursion=""
# use getopt to set recursion=1 if requested
...
if [[ -d "$file" && -n "$recursion" ]]; then
handle_directory "$file"
else
...
fi
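As a rough idea of the option handling, the sketch below uses the shell builtin getopts instead of the external getopt program, simply because it is easier to show in a few lines. The -r flag, the parse_options name and the usage text are all made up for this example.

```shell
#!/bin/sh
recursion=""

# Hypothetical option parser: -r turns recursion on.
parse_options() {
    OPTIND=1
    while getopts "r" opt; do
        case "$opt" in
            r) recursion=1 ;;
            *) echo "usage: script [-r] [file...]" >&2
               return 1 ;;
        esac
    done
}

parse_options "$@"
# Drop the options that were parsed, leaving only the file arguments.
shift $((OPTIND - 1))
```

After this, "$@" holds just the files and directories, and the check for recursion can be used as shown.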
This code chunk you suggested does not work:
for ff in 'CEU_*NSP.CEL'; do NEWFILE='echo $ff |sed -e s/^CEU_// -e s/_NSP/,CEU/';
mv $ff $NEWFILE; done
I get the following error message when running it as a script:
mv: invalid option -- e
Try `mv --help' for more information.
The code I suggested DOES work because I tested it before posting.
It appears you combined advice of two posters:
In mine I had you do `ls CEU_*NSP.CEL` (back ticks included) to get the list of files.
A later poster said to drop the ls. He also meant for you to drop the back ticks. The back ticks say "execute this command before the rest of the command line". He basically was saying you don't need to and shouldn't do the ls syntax I'd provided. I personally don't think his objections were that valid but also don't think he was wrong in saying it would work his way.
You can do it the way I had it OR the way he had it. I haven't tested it his way but don't see any reason it wouldn't work.
There is a question why it passed the "-e" to your mv command since it is in the sed syntax rather than the mv syntax. I wouldn't investigate that until you correct as noted above. If you still see the same error it may indicate you accidentally created a file named "-e". That occurs on occasion when you type something incorrectly. If you have such a file you can delete it by typing:
rm ./"-e"
in the directory where the file exists.
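One likely explanation for the "-e" reaching mv, offered as a sketch rather than a diagnosis: inside single quotes nothing is executed, so NEWFILE holds the literal text of the command, and an unquoted $NEWFILE is then split into separate words, one of which is -e. The variable names below are made up for the illustration.

```shell
#!/bin/sh
ff=CEU_NA21322_NSP.CEL

# Single quotes: no command runs, the variable just holds this text.
quoted='echo $ff |sed -e s/^CEU_// -e s/_NSP/,CEU/'

# Command substitution: the pipeline runs and its output is stored.
substituted=$(echo "$ff" | sed -e 's/^CEU_//' -e 's/_NSP/,CEU/')

# An unquoted expansion is split on whitespace, just as in "mv $ff $NEWFILE".
set -- $quoted
word_count=$#
fourth_word=$4

echo "words mv would have received from NEWFILE: $word_count"
echo "one of them: $fourth_word"
echo "with command substitution: $substituted"
```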
P.S. When asking for help it might be better to say "I couldn't get it working" than to say "This code chunk you suggested does not work". It allows for the idea that the mistake was yours rather than that of the person who was attempting to help you, and might encourage them to follow up. Not saying I can't make a mistake, but here the mistake was yours, and in a different mood I might just have ignored your post or replied with a simple flame.
Thanks a lot for your help. I've worked out the problems (I wasn't doing the command substitution properly before) and, as you suggested, the following works:
for ff in CEU_*NSP.CEL
do NEWFILE=$(echo $ff |sed -e s/^CEU_// -e s/_NSP/,CEU/)
mv $ff $NEWFILE
done