Find images within specific dimensions and copy them to another directory
Hello, I found some results on the web that partially answer my question, but as I'm new to Linux I fail to integrate all of them.
So I have many directories with hundreds of images named like 0001.jpg 0002.jpg ... etc.
I would like to:
1- Find images within specific dimensions in all the directories.
2- Copy them to a new directory
3- Rename the images if there are duplicate names (which there will be, because of the naming scheme used in the original directories).
This post partially answers my question, but I don't know how to rename the images when two have the same name so that one does not overwrite the other.
Thanks in advance for the help.
Last edited by Ghost-Order; 01-02-2020 at 06:23 PM.
Step 1 is to identify the files of interest. Is it an image? If yes, is it of the proper size? If yes, then store its path in a file and add a column with a unique identifier. So at the end you will have a file with a list of filenames from all over your device, each with a unique identifier (you may not need that, because the full path or even the parent path might be enough of a differentiator).
(You can use the find command for the searching, because the size etc. is all metadata. Use it in combination with sed and awk, and you can narrow the output down to the dimensions only.)
Step 2: now that you have your list, you can script a loop that goes through it and copies the files wherever you want. While the script processes the list, it can change the filenames at the same time.
Does this help?
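The two steps above could be sketched roughly like this. It is only an outline under some assumptions: the helper names (unique_name, build_list, copy_from_list) are made up for illustration, the image-size test is left out, and file names are assumed not to contain tabs or newlines. The "unique identifier" here is simply the relative path with the slashes replaced by underscores.

```shell
# Hypothetical helper: turn a relative path into a name that is unique
# across directories, e.g. ./dir1/sub/0001.jpg -> dir1_sub_0001.jpg
unique_name() {
    printf '%s\n' "$1" | sed -e 's|^\./||' -e 's|/|_|g'
}

# Step 1: write "source path<TAB>new name" pairs to standard output,
# so the list can be saved to a file and inspected before copying.
build_list() {
    find "$1" -type f -name '*.jpg' | while IFS= read -r path; do
        printf '%s\t%s\n' "$path" "$(unique_name "$path")"
    done
}

# Step 2: copy every entry of the list file ($1) into the target directory ($2).
copy_from_list() {
    mkdir -p "$2"
    while IFS="$(printf '\t')" read -r path newname; do
        cp "$path" "$2/$newname"
    done < "$1"
}
```

Run it from the top of the image tree, for example `build_list . > /tmp/list.txt`, look at the list, then `copy_from_list /tmp/list.txt /some/target/dir` (both paths are placeholders).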
Hello agent82, thank you for taking the time to help me. However, I can't do very much with that. I tried searching for documentation that might serve my purpose, but I'm not a programmer and I can't understand very much. The code from the link I put in my first post is almost enough for the task I need to do; the problem is that I have 10+ directories with images named exactly the same (0001.jpg, 0002.jpg, etc.), and when I copy them to another directory they overwrite each other.
At the beginning it goes fine, but after some seconds I get this about 30 times:
Code:
cp: overwrite '/home/user/dir1/dir2/0010.jpg'? identify: unable to open image 'jpg': No such file or directory @ error/blob.c/OpenBlob/3496.
identify: no decode delegate for this image format `' @ error/constitute.c/ReadImage/562.
identify: unable to open image 'jpg': No such file or directory @ error/blob.c/OpenBlob/3496.
identify: no decode delegate for this image format `' @ error/constitute.c/ReadImage/562.
cp: cannot stat 'jpg': No such file or directory
Then in my target directory, the one where I want to copy all the images, I find just some images.
Last edited by Ghost-Order; 01-02-2020 at 09:02 PM.
Put a variable in the script that increments with each file processed. Since the old files are named numerically, just increment a number and then use it as the new file name, or append it to the old name, when copying the files to the new directory.
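A minimal sketch of that counter idea, assuming file names without newlines and a target directory that is not inside the directory being searched. The function name copy_numbered is made up, and the size test is left out so that only the renaming is shown.

```shell
# copy_numbered SOURCE_DIR TARGET_DIR
# Copies every .jpg under SOURCE_DIR into TARGET_DIR, prefixing each
# file name with a running counter so that equal names cannot collide.
copy_numbered() {
    src=$1
    dest=$2
    n=0
    mkdir -p "$dest"
    find "$src" -type f -name '*.jpg' | sort | while IFS= read -r f; do
        n=$((n + 1))
        # ${f##*/} strips the directory part and keeps only the file name.
        cp "$f" "$dest/${n}_${f##*/}"
    done
}
```

With dir1/0001.jpg and dir2/0001.jpg as input, the target ends up with 1_0001.jpg and 2_0001.jpg.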
It might be safer to use file instead of identify, since identify requires ImageMagick to be installed. Just looking at an strace of file on an image suggests that it can decode images without using an external program. So your dimension variables could be changed to this:
Thank you, it worked! Is there a way to specify a specific dimension or a range of dimensions? From what I read from the user that posted the original script, it compares width and height and returns the horizontal images. That works for me for this specific task, but in the future I may need to specify the dimensions.
Quote:
Originally Posted by individual
It might be safer to use file instead of identify, since identify requires ImageMagick to be installed. Just looking at an strace of file on an image suggests that it can decode images without using an external program. So your dimension variables could be changed to this:
That's a good addition; in fact, when I run the script with this change it runs like 3x faster lol. The problem is that it is not recognizing the image dimensions, because it copies all the images and not only the horizontal ones.
Last edited by Ghost-Order; 01-03-2020 at 11:47 AM.
You can also do a lot within find itself. The options are connected with an implied logical AND unless a logical OR is explicitly specified. That means you can call a shell, perl, awk, or any other kind of script with -exec, and if it exits with a non-zero exit code the test is considered to have failed, so find skips the remaining actions for that file.
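As a rough illustration of using -exec as a test, here is a small wrapper; the name find_matching is made up. It prints only the files for which the given command exits with status 0, because -print sits after -exec in the implied AND chain.

```shell
# find_matching DIR COMMAND [ARGS...]
# Runs COMMAND ARGS... FILE for every .jpg under DIR and prints the
# file only when the command succeeds (exit status 0).
find_matching() {
    dir=$1
    shift
    find "$dir" -type f -name '*.jpg' -exec "$@" {} \; -print
}

# Example: list only the non-empty .jpg files in the current directory tree.
find_matching . test -s
```

The command could just as well be a script of your own that checks the dimensions and exits 0 or 1, for example `find_matching ~/Pictures ./is_wide.sh`, where is_wide.sh is a hypothetical script that you would have to write.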
The X Dimension is the first field in the AAAAA x BBBBB syntax.
The Y Dimension is the third field in the AAAAA x BBBBB syntax. (See the cut options above?)
Now that you have the x and y dimensions of each picture, you need to make a decision based on them.
You could do this (it tests whether ydim is greater than xdim):
Code:
if [ "$ydim" -gt "$xdim" ]; then
    cp "$picture" "/new/pathof/$x$picture"
    echo "Picture count $x name $picture is copied."
else
    echo "$picture is the wrong size or orientation and was not copied."
fi
If you are looking for a specific dimension size, then you can compare $ydim, $xdim or both to that size.
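For a specific size or a range of sizes, the comparison could look something like the sketch below. The helper name in_range and the numbers are only examples; xdim and ydim are assumed to already hold the width and height as plain integers.

```shell
# in_range VALUE MIN MAX: succeeds when MIN <= VALUE <= MAX.
in_range() {
    [ "$1" -ge "$2" ] && [ "$1" -le "$3" ]
}

# Example values; in a real script these come from the image.
xdim=1920
ydim=1080

# Exact size: both dimensions must match.
if [ "$xdim" -eq 1920 ] && [ "$ydim" -eq 1080 ]; then
    echo "exactly 1920x1080"
fi

# Range: width between 1280 and 1920, height between 720 and 1080.
if in_range "$xdim" 1280 1920 && in_range "$ydim" 720 1080; then
    echo "within the wanted range"
fi
```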
You are welcome. Yes, as has been mentioned by agent82, use if/then logic on the size that you already have in a variable. That will allow you to make the choice based on the size of each individual picture as it is being processed.
identify from ImageMagick can help identify the dimensions. I tend to rename files with the original file timestamp (that tends to keep them "unique"). But I'm an old mainframe guy, so I go in phases: one script to generate the things I "want" to do, and another to "do them". With that audit trail I can inspect everything before going to phase II.
Thank you for the clarification, although, as I said previously, using 'file' instead of 'identify' gives me all the images and not only the (in this case) horizontal ones that I want. I tried changing the condition from -gt to -eq to see if the script is recognizing the condition, and it seems it doesn't, because it still outputs all the images (I'm using printf instead of cp just to test this), and I know I don't have square images.
I was reading about grep and I think I kind of get it. However, and maybe this is outside the scope of my initial question, I don't understand why cut is used; I tried without it and the output is the same.
PS: Sorry for the late reply.
Last edited by Ghost-Order; 01-05-2020 at 04:24 PM.
One thing to note is that it depends on the version of the file command whether it outputs the JPEG's dimensions at all, and in what format.
It could be nothing, vvvXhhh or vvv x hhh.
Post the complete output of the command file image.jpg.
cut basically extracts parts of each line of a text file. The -d ' ' means use the space as a delimiter, and -f1 extracts the 1st field, which runs from the beginning of the line to the first space.
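A tiny example of what cut does with a "width x height" string; the value of dims is invented for the demonstration. Without cut, a variable would hold the whole string instead of a single number, and a numeric comparison in the if test would not work on it.

```shell
dims="640 x 480"

# Field 1 is everything up to the first space, field 3 is the part
# after the second space (field 2 is the "x").
xdim=$(printf '%s\n' "$dims" | cut -d ' ' -f1)
ydim=$(printf '%s\n' "$dims" | cut -d ' ' -f3)

echo "width=$xdim height=$ydim"   # prints: width=640 height=480
```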
With the copy command cp -r --backup=t "$picture", the --backup=t means that if $picture already exists in the destination directory, the existing file is first renamed with a numbered suffix of the form $picture.~N~, where N is a number (e.g. file.jpg.~1~), and the new file then takes the original name.
In a nutshell, you can use any sort of conditional statement to compare the x and y dimensions to suit your needs.
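A small demonstration of the --backup=t behaviour mentioned above, assuming GNU cp (the option is not available in every cp implementation). The directory and file names are made up for the demo, and the "images" are just small text files.

```shell
# Build a throwaway test bed with two files that share a name.
demo=$(mktemp -d)
mkdir -p "$demo/dir1" "$demo/dir2" "$demo/target"
echo one > "$demo/dir1/0001.jpg"
echo two > "$demo/dir2/0001.jpg"

# The second copy does not overwrite the first: the existing file is
# renamed to 0001.jpg.~1~ before the new one is written.
cp --backup=t "$demo/dir1/0001.jpg" "$demo/target/"
cp --backup=t "$demo/dir2/0001.jpg" "$demo/target/"

# target now holds 0001.jpg (from dir2) and 0001.jpg.~1~ (from dir1).
ls -A "$demo/target"
```

The backup names no longer end in .jpg, so they may need one more renaming pass afterwards.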
Individual's script fails because the file output is different from what the regular expression expects. What compounds the problem is that there are two different aaxbb strings in the output.
There is not enough information to be certain why the original script fails to find some files. It may be caused by copying files to a directory inside the directory being searched.
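To illustrate the problem with the two aaxbb strings: in the sample line below (an assumption about what some versions of file print for a JPEG, not output taken from this thread), the first match is the density and only the second is the image size. One possible way around it is to anchor on the "precision N," part that precedes the dimensions. The function name jpeg_dims is made up, and the pattern will need adjusting if your file output looks different.

```shell
# jpeg_dims 'OUTPUT LINE OF file': prints "WIDTH HEIGHT", or nothing
# when the line does not contain "precision N, WxH".
jpeg_dims() {
    printf '%s\n' "$1" |
        sed -n 's/.*precision [0-9]*, \([0-9][0-9]*\) *x *\([0-9][0-9]*\).*/\1 \2/p'
}

# Illustrative sample line (assumed format).
sample='photo.jpg: JPEG image data, JFIF standard 1.01, resolution (DPI), density 72x72, segment length 16, baseline, precision 8, 1920x1080, components 3'

dims=$(jpeg_dims "$sample")
xdim=${dims% *}   # part before the space
ydim=${dims#* }   # part after the space
echo "width=$xdim height=$ydim"
```

In a real script the argument would be the output of `file "$picture"`, and an empty result means the dimensions could not be found.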
Here are some snippets of functions from a script I use to find images of a given size and do what I want with them. It is of course completely modifiable.
This one is just for the screen dimensions.
Code:
#Gets Monitor Dimensions for
#knowing what size to make images
getdimensions=$( xdpyinfo | grep dimensions | awk '{print $2}' )
#split the width and height into separate variables.
screenwidth=${getdimensions%x*}
screenheight=${getdimensions#*x}
#desired image width
width=$screenwidth
Using ImageMagick to find the wide images and the tall images. Basic logic is used: if it is not one, then it has to be the other.
Code:
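# $source, $wideImages and $tallImages must be set before this point.
mkdir -p "$wideImages" "$tallImages"
findWideImages()
{
    echo "Finding wide images..."
    while IFS= read -r f
    do
        # If the width is greater than the height it is a wide image,
        # otherwise it is a tall (or square) image.
        # Skip files that identify cannot read.
        wh=$(identify -format "%wx%h" "$f") || continue
        if [[ ${wh%x*} -gt ${wh#*x} ]]; then
            mv "$f" "$wideImages"
        else
            mv "$f" "$tallImages"
        fi
    done < <(find "$source" -type f)
    echo "Done finding wide images."
}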
Having got the dimensions, you can then use them to single out and separate your images by size as well, using a range in your conditional statements to group images of nearly the same size together, or however you like.
-------- For checking to not overwrite files ----
Code:
#!/usr/bin/env bash
source=/media/data
move2=$HOME/isitacopy
rename_file()
{
    # Placeholder: put the code that renames the file here.
    echo "is about to change file name $1"
}
while IFS= read -r f
do
    # Get the current filename from f.
    fn=${f##*/}
    # Check whether a file with the same name already exists
    # in the destination directory.
    if [[ -f "$move2/$fn" ]] ; then
        # If so, send the current one to the function to be
        # renamed first, then move it.
        rename_file "$f"
    fi
done < <(find "$source" -type f -name "variety_desktop_wp" )
Add your code to rename the current file in that function; it then returns to the main while loop.
You can test by putting a file in one directory and a file with the same name in another location that gets searched, then work out your rename code.
Output from my test, with a same-named file already in the directory to be moved to (move2):
Code:
userx@slacky.yo.org:~
$ isitacopy/checkforcopy
is about to change file name /media/data/fbsd-stuff/variety_desktop_wp
is about to change file name /media/data/variety_desktop_wp
That is just a quick and dirty script I wrote up on the fly, so it might still have a few kinks in it, and it lacks the part that changes the file name.
And yes, I see that it will take a little more code to put all of this together in working order. But hey, you need to have some fun too.
Remember to make yourself a test bed and test your scripts first before putting them into production.
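For the missing "change the file name" part, one possible approach is a helper that looks for a name not yet taken in the destination directory. This is only a sketch: the function name free_name and the _1, _2, ... suffix scheme are made up, and it does not guard against two scripts running at the same time.

```shell
# free_name DEST_DIR FILE_NAME
# Prints FILE_NAME if it is not taken in DEST_DIR, otherwise the first
# free variant such as 0001_1.jpg, 0001_2.jpg, ...
free_name() {
    dir=$1
    name=$2
    # Split off the extension, if there is one.
    case $name in
        *.*) base=${name%.*}; ext=.${name##*.} ;;
        *)   base=$name; ext= ;;
    esac
    candidate=$name
    n=1
    while [ -e "$dir/$candidate" ]; do
        candidate="${base}_$n$ext"
        n=$((n + 1))
    done
    printf '%s\n' "$candidate"
}
```

Inside the loop it could be used along the lines of `mv "$f" "$move2/$(free_name "$move2" "$fn")"`, after trying it out on a test bed first.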