LinuxQuestions.org


mainstream 03-31-2012 10:51 AM

Pipe GREP to copy filenames based on keywords
 
Hi everybody,

A while ago I posted that I lost everything due to a malicious script. However, I succeeded in recovering almost all of my data, most importantly my photos, thanks to photorec and testdisk.

Now comes the next problem. Sorting all my pictures.

I gave my photos keywords. So when I look at the image properties, I see:
Code:

Image Type: jpeg (The JPEG image format)
Camera Brand: Canon
Date Taken: 2010:08:15 14:57:51
Keywords: Gambia
Etc

1st problem
Now I'm trying to copy all my files with the keyword Gambia to another directory. I can find the images with the following command, but I can't append a copy command to it.
Code:

find . -type f | xargs grep -lri "gambia"
2nd problem - SOLVED
Another thing I want to know is how I can delete all JPGs smaller than 500 KB (junk).

3rd problem
I wrote some scripts, but I can't seem to find them (too much data was recovered). I remembered some of the code: "changing IFS to newline means spaces", and tried:
Code:

mainstream@dopamine-pc:~/Desktop/Recovery 500GB$ find . -type f | xargs grep -lri "changing IFS to newline means spaces"
But no luck.

I get the following output after a few hours.
Code:

xargs: unmatched single quote; by default quotes are special to xargs unless you use the -0 option
I tried appending the -0 option, but that gives me a different error message (xargs: argument line too long).

Thanks in advance

jschiwal 03-31-2012 11:11 AM

Try using the "file" command to identify file types.

For grep, you can use the -R option to enter directories and grep each file, so you don't need find for that. However, you could use the find command to select only small files. That way you won't be searching the contents of large opaque files.
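
As a rough sketch of that idea for problem #3, the function below searches only small files for a fixed phrase. It assumes GNU find, xargs, and grep. The function name find_text_in_small_files and the 100 KiB default limit are hypothetical choices, not something from this thread.

```shell
#!/bin/bash
# find_text_in_small_files PHRASE DIR [MAX_KIB]   (hypothetical helper name)
# Prints the names of files under DIR that are smaller than MAX_KIB
# (default 100, an arbitrary assumption) and contain PHRASE.
find_text_in_small_files() {
    local phrase=$1 dir=$2 max_kib=${3:-100}
    # -print0 / -0 pass names NUL-delimited, so quotes and spaces in
    # file names do not upset xargs.
    # grep: -l list names only, -I skip binary files,
    #       -i ignore case, -F treat PHRASE as a fixed string.
    find "$dir" -type f -size "-${max_kib}k" -print0 |
        xargs -0 -r grep -lIiF -- "$phrase"
}
```

For example: find_text_in_small_files "changing IFS to newline" . 50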


For problem #1, pipe the file name output to xargs: | xargs cp -t /test/dir
Use the -L option to limit the number of input lines (one file name each) passed to each cp invocation.
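
A minimal sketch of the whole pipeline for problem #1, assuming GNU grep, xargs, and cp. Passing the names NUL-delimited avoids the "unmatched single quote" error from xargs. The function name copy_matching is hypothetical, and this only helps if the keyword is stored as plain text inside the files.

```shell
#!/bin/bash
# copy_matching KEYWORD SRC_DIR DEST_DIR   (hypothetical helper name)
# Copies every file under SRC_DIR whose contents contain KEYWORD
# (case-insensitive, fixed string) into DEST_DIR.
copy_matching() {
    local keyword=$1 src=$2 dest=$3
    mkdir -p -- "$dest"
    # grep -Z ends each reported file name with a NUL byte and
    # xargs -0 reads them back, so spaces and quotes are harmless.
    # xargs -r does not run cp at all when nothing matched.
    grep -rliFZ -- "$keyword" "$src" | xargs -0 -r cp -t "$dest" --
}
```

For example: copy_matching gambia ./recovered ~/Gambia
Keep DEST_DIR outside SRC_DIR, otherwise grep will also find the copies.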


Problem 2: find /dir -type f -name "*.jpg" -size -500k -delete
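
Since -delete cannot be undone, it may be worth listing the candidates first. The sketch below is the same find expression with -print instead of -delete. The function name list_small_jpgs is hypothetical, and -iname (my substitution for -name) also matches upper-case .JPG extensions.

```shell
#!/bin/bash
# list_small_jpgs DIR   (hypothetical helper name)
# Dry run: print the JPEGs under DIR that -size -500k would select.
# find rounds each file size up to whole KiB before comparing, so
# this selects files of at most 499 KiB.
list_small_jpgs() {
    find "$1" -type f -iname '*.jpg' -size -500k -print
}
```

Once the list looks right, replacing -print with -delete removes exactly those files.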

mainstream 04-01-2012 05:01 AM

Thanks, I'll try ;-)

1st problem
Something like this:
Quote:

find . -type f | xargs cp -t Gambia/ | grep -lri "gambia"

David the H. 04-01-2012 11:13 AM

I personally would write up a separate script for processing the files, then execute the script on each file using find.

mainstream 04-02-2012 11:00 AM

You mean to first use the find command to list all files containing "Gambia" (with grep) and then copy these files? Isn't that a bit of a cumbersome approach?

Code:

#!/bin/bash

for file in `find . -type f | xargs grep -lri "gambia"` ; do
mv -v $file backup/


done

This is what I tried. But it just keeps hanging when executing - nothing happens.

mainstream 04-04-2012 06:53 AM

Is there no one who can help me with using grep and cp or mv?

David the H. 04-05-2012 11:51 AM

Looking more closely now, the commands you've been using so far try to run grep directly on the binary photo files. I don't think that's going to work very well, if at all. You really need to run the files through a program that extracts the metadata, and parse that. I personally recommend the Perl-based exiftool.

As I suggested before, what you need to do is wrap up your copy commands into a separate script, one that will parse the data of a single file at a time and move it appropriately. Just to throw out a quick proof-of-concept:

Code:

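#!/bin/bash
# Sort a single photo (passed as $1) by the keywords in its metadata.

# Capture everything exiftool reports for the file.
metatags=$( exiftool "$1" )

# Move the file according to the first pattern that matches.
# mv -n never overwrites an existing file; -v reports each move.
case $metatags in
        *Gambia*) mv -vn "$1" /gambiadir ;;
        *Kenya*)  mv -vn "$1" /kenyadir  ;;
        *)        echo "no matching keyword found in $1" ;;
esac

exit 0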

Then just make the script executable and run it in find.


Code:

find . -type f -exec /path/to/sortscript '{}' \;

You could expand the script to incorporate the find command directly, but this should do for a start.
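
One possible way to fold the find command into the script is sketched below. It assumes bash and an installed exiftool. The function name sort_photos and the one-directory-per-keyword layout are hypothetical choices. -s3 and -Keywords are exiftool options that print only the value of the Keywords tag.

```shell
#!/bin/bash
# sort_photos SRC_DIR DEST_ROOT KEYWORD...   (hypothetical helper name)
# Moves each file under SRC_DIR whose Keywords tag contains one of the
# given keywords into DEST_ROOT/KEYWORD/ (layout is an assumption).
sort_photos() {
    local src=$1 dest_root=$2
    shift 2
    local file tags keyword
    # Read NUL-delimited names so any character in a file name is safe.
    while IFS= read -r -d '' file; do
        # Ask exiftool for the bare value of the Keywords tag only.
        tags=$(exiftool -s3 -Keywords "$file" 2>/dev/null)
        for keyword in "$@"; do
            if [[ $tags == *"$keyword"* ]]; then
                mkdir -p -- "$dest_root/$keyword"
                # -n: never overwrite an existing file in the target.
                mv -vn -- "$file" "$dest_root/$keyword/"
                break   # first matching keyword wins
            fi
        done
    done < <(find "$src" -type f -print0)
}
```

For example: sort_photos . ~/sorted Gambia Kenya
DEST_ROOT should live outside SRC_DIR so that find does not visit files that were already moved.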

