LinuxQuestions.org


OutsiderFilms 05-02-2012 12:05 AM

Script to compare filenames in a directory, sort and delete
 
Hello All,

I shoot RAW and JPEG images on my camera. These are then copied to a folder on the computer.
The photo folder, as a result, contains image pairs -- file1.cr2, file1.jpg... file2.cr2, file2.jpg and so on.

Since jpg files load faster, I view and then delete the unwanted jpgs from the folder... So now I'm left with some file pairs and some isolated *.cr2 files.

Some folders have 2000+ files in them.

Is there a script/programme to find *.cr2 files that do NOT have a corresponding *.jpg file and delete them?

Thank you, in advance, for your help.

Amit

lazardo 05-02-2012 12:13 AM

Note: Remove the 'echo' command in front of 'rm' to actually delete the files.

Cheers,

Code:

#!/bin/bash

find . -name \*.cr2 |\
while read CR2FILE; do
  JPGFILE=$( echo $CR2FILE | sed 's/cr2$/jpg/' )
  [ -e $JPGFILE ] && continue

  echo rm -v $CR2FILE
done


OutsiderFilms 05-02-2012 12:31 AM

Thanks for the quick response, Lazardo.

Here's what I did (Probably something I did was incorrect):

I copied your script
In a terminal, I went to the folder of photos
ran /bin/bash
then, pasted (shift-control-v) the script into the terminal
pressed enter.

Nothing happened. I was expecting the name (or names) of the cr2 files to be 'echoed' back.

What am I doing wrong?

Perhaps you could modify the script to rename all isolated files by prefixing a "1" to the filename -- that would cause them to relocate to the top of the folder if I sort by name. I could then easily select them and delete them...

Thanks for your help. I really appreciate this.

Cheers,

Amit

jschiwal 05-02-2012 12:45 AM

You can simplify the line with sed.
Code:

JPGFILE=${CR2FILE%cr2}jpg

Run the find command on its own and see if it finds the .cr2 files. You need to be in the directory with the .cr2 files or a parent.

You could use the base directory of your photos directory in the find command, e.g.:
find /media/photos/ -name "*.cr2"

Also, the case of letters matters in Linux. If the extensions are .CR2, then change it in the script.
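A minimal sketch of the two points above, for anyone who wants to try them in isolation. The filenames IMG_0001.cr2 and IMG_0002.CR2 are made up for illustration; the expansion only edits a string, so no files need to exist.

```shell
#!/bin/bash
# Hypothetical filenames, for illustration only.
CR2FILE="./IMG_0001.cr2"

# ${VAR%pattern} removes the shortest match of pattern from the end of the value,
# so this swaps the trailing "cr2" for "jpg" without calling sed.
JPGFILE="${CR2FILE%cr2}jpg"
echo "$JPGFILE"

# The pattern is case-sensitive: an upper-case extension does not match "cr2",
# so nothing is removed and "jpg" is simply appended.
UPPER="./IMG_0002.CR2"
echo "${UPPER%cr2}jpg"
```

If the second line of output still ends in .CR2 followed by jpg, the pattern did not match, which is the same case mismatch that makes find return nothing.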

OutsiderFilms 05-02-2012 01:00 AM

Hi jschiwal,

Thanks for the input -- I changed references in the script to CR2 and JPG
instead of cr2 and jpg. Now, it seems to be parsing the files...

Here's the output now:

------

amit@amit-desktop:/media/FA1000/5D/29 April/D1$ /bin/bash
amit@amit-desktop:/media/FA1000/5D/29 April/D1$
amit@amit-desktop:/media/FA1000/5D/29 April/D1$ find . -name \*.CR2 |\
> while read CR2FILE; do
> JPGFILE=$( echo $CR2FILE | sed 's/CR2$/JPG/' )
> [ -e $JPGFILE ] && continue
>
> echo rm -v $CR2FILE
> done
bash: [: too many arguments
rm -v ./WIEGO (GC - IMG) (2012.04.29) (10016).CR2
bash: [: too many arguments
rm -v ./WIEGO (GC - IMG) (2012.04.29) (10017).CR2
bash: [: too many arguments

-------- there's many more "echoes" with the same error.


Also, the file *(10017).CR2 does have a JPG pair... I'm testing the script on a backup drive that has ALL pairs except 10016.CR2 (I've deleted just one JPG file in this directory)


Cheers,

Amit

jschiwal 05-02-2012 01:32 AM

Enclose the variables in double quotes: "${JPGFILE}" and "${CR2FILE}". The spaces in the filenames cause the unquoted values to be split into separate words, which is why [ complains about too many arguments. Parentheses are also special to the shell: typed unquoted, they launch subshells or create arrays. Dumb filenames will make working in the terminal and writing scripts more difficult.

http://transnum.blogspot.com/2008/11...orts-0-as.html
This blog has an example using read -d $'\0' and -print0 in find to use null characters as separators.

Note: don't quote the variable name to the left of the equals sign; quote variables to the right, where they are being referenced.
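The null-separator approach mentioned above could look roughly like this. It is a sketch, not the code from the linked blog: the function name list_orphans is hypothetical, the upper-case .CR2/.JPG extensions are assumed from the earlier output, and it only prints names, it deletes nothing.

```shell
#!/bin/bash
# list_orphans DIR: print each .CR2 file under DIR that has no matching .JPG.
# (Hypothetical helper name; nothing is deleted.)
list_orphans() {
  # -print0 ends each name with a null byte instead of a newline.
  find "$1" -name '*.CR2' -print0 |
  # IFS= keeps leading/trailing blanks, -r keeps backslashes,
  # -d '' makes read stop at the null byte.
  while IFS= read -r -d '' CR2FILE; do
    JPGFILE="${CR2FILE%CR2}JPG"
    [ -e "$JPGFILE" ] && continue
    printf '%s\n' "$CR2FILE"
  done
}
```

Calling list_orphans . from inside the photo folder lists the isolated files. Because every expansion is quoted and names are separated by null bytes, spaces and parentheses in filenames pass through unchanged.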

chrism01 05-02-2012 01:56 AM

Also, you'll find it much easier if you put the code into a bash file and then run that file.
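One way to do that, sketched under the assumption that the script is saved in the photo directory itself. The name find_orphans.sh is only an example, and the 'echo' in front of rm is kept so that nothing is deleted.

```shell
# Write the loop into a file (find_orphans.sh is an example name).
cat > find_orphans.sh <<'EOF'
#!/bin/bash
find . -name '*.CR2' |
while IFS= read -r CR2FILE; do
  JPGFILE="${CR2FILE%CR2}JPG"
  [ -e "$JPGFILE" ] && continue
  echo rm -v "$CR2FILE"
done
EOF

# Make the file executable, then run it from inside the photo directory.
chmod +x find_orphans.sh
./find_orphans.sh
```

Keeping the code in a file avoids paste problems in the terminal and makes it easy to edit one line and run it again.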

OutsiderFilms 05-02-2012 02:25 AM

Hello jschiwal,

It works now! Thank you.

I did not know that parentheses caused these problems. Their usage as separators seemed natural to me.
I should have provided an example filename with my first post. Sorry about that.

For future reference, here's the code that worked:

--------------------------------------------

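#!/bin/bash

# List every .CR2 file that has no matching .JPG next to it.
find . -name \*.CR2 |\
while read CR2FILE; do
  # Swap the extension to get the name of the JPG partner.
  JPGFILE=$( echo "$CR2FILE" | sed 's/CR2$/JPG/' )
  # Pair is complete: skip this file.
  [ -e "$JPGFILE" ] && continue

  # Remove 'echo' to actually delete the file.
  echo rm -v "$CR2FILE"
done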


--------------------------------------------

Am marking the post "Solved". Thank you all for your help.

Cheers,

Amit

abhinav4 05-02-2012 06:46 AM

Quote:

Originally Posted by OutsiderFilms (Post 4668253)
[ -e "$JPGFILE" ] && continue

Can someone tell me what [ -e "$JPGFILE" ] does?

Mark1986 05-02-2012 07:38 AM

It checks if the file mentioned in the $JPGFILE variable exists.
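To spell that out: [ is the shell's test command, and -e is true when the given path exists (file or directory). In the script, && continue runs only when the test succeeds, so a CR2 file whose JPG exists is skipped. A small sketch, with a hypothetical helper name exists_msg and a path that is assumed not to exist:

```shell
#!/bin/bash
# exists_msg PATH: report whether PATH exists (hypothetical helper for illustration).
exists_msg() {
  if [ -e "$1" ]; then
    echo "exists: $1"
  else
    echo "missing: $1"
  fi
}

exists_msg /                      # the root directory exists on any Linux system
exists_msg "/no/such file.JPG"    # assumed not to exist
```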

lazardo 05-03-2012 12:20 AM

Quote:

Originally Posted by OutsiderFilms (Post 4668198)
Hi jschiwal,

Thanks for the input -- I changed references in the script to CR2 and JPG
instead of cr2 and jpg. Now, it seems to be parsing the files...
...

Lesson 1: Be accurate. 'cr2' is not equal to 'CR2'; file and directory names are case-sensitive.

Lesson 2: Filenames and directories can have characters that are interpreted with special meanings by the shell. The errors were because spaces and parens were being interpreted outside the context of a filename.

Sorry about that, I don't use special characters in filenames ;)

Cheers,
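A short sketch of why the unquoted version failed, using a filename from the output earlier in the thread. The helper count_args is hypothetical; it only prints how many arguments it was given.

```shell
#!/bin/bash
# count_args: print the number of arguments received (hypothetical helper).
count_args() { echo "$#"; }

# Filename pattern taken from the output posted earlier in the thread.
JPGFILE="./WIEGO (GC - IMG) (2012.04.29) (10016).JPG"

count_args $JPGFILE      # unquoted: the value is split at every space
count_args "$JPGFILE"    # quoted: the value stays one argument
```

The unquoted call reports several arguments, and [ -e $JPGFILE ] sees the same pile of words, hence "too many arguments". The quoted call reports one.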


All times are GMT -5.