LinuxQuestions.org

BASH: Write only unique strings to text file (cat or while read question) (https://www.linuxquestions.org/questions/programming-9/bash-write-only-unique-strings-to-text-file-cat-or-while-read-question-822031/)

grail 08-10-2010 01:00 AM

Well, looking at the input, you do not currently need any looping, as awk will process each line in turn:
Code:

awk -F":" '{if($3 == "WWW")colonel = "the Web";else colonel = "Usenet";print "All files starting with "$1" come from \033[1m"$2"\033[0m on "colonel"."}' handmade2.txt
Not a hundred percent sure the colorising will work, but suck it and see.
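The one-liner is easier to follow when spread over several lines. This is only a sketch, assuming handmade2.txt holds lines of the form prefix:credit:source, where the third field is WWW for web sites. The function name describe_sources and the variable name origin are made up for illustration.

```shell
#!/usr/bin/env bash
# describe_sources FILE: same logic as the one-liner above, spread out.
# (describe_sources is a hypothetical name, not part of the thread.)
describe_sources() {
    awk -F":" '
    {
        # Third field says where the files came from.
        if ($3 == "WWW")
            origin = "the Web"
        else
            origin = "Usenet"
        # \033[1m switches bold on, \033[0m resets it.
        print "All files starting with " $1 " come from \033[1m" $2 "\033[0m on " origin "."
    }' "$1"
}

# Usage: describe_sources handmade2.txt
```

Whether the bold actually shows depends on the terminal; the text itself is printed either way.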

SilversleevesX 08-11-2010 12:54 AM

Staying with what's familiar, got a script that works.
 
I found that unusual (to me, at any rate) grep syntax that helped, temporarily, with another script project and applied it to this one. Seems to work where case/esac didn't (description of that forthcoming).
Code:

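while read -r line
do
        sp=$(echo "$line")
        vc=$(echo "$sp" | cut -d":" -f1)    # file-name prefix
        cv=$(echo "$sp" | cut -d":" -f2)    # Credit for that prefix

        for striM in $(ls *jpg)
        do
        # -q: no output, just the exit status; <<< feeds grep the file name
        if grep -q "$vc" <<<"$striM"; then
            echo "There's a match. $striM with '$cv'."
        fi
        done
done<handmade2.txt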

"handmade2.txt" is an edited version of the hand-written file-name-starts-to-Credits-in-column-B text file I made myself. The other afternoon, I copied a set of 20 JPEGs from a few different sites to a single folder, and stripped their Credit annotations, after making a list of them with iView MP. I ran another script on them (code here)
Code:

while read -r line
do
        sp=$(echo "$line")
        vc=$(echo "$sp" | cut -d":" -f1)    # file-name prefix
        cv=$(echo "$sp" | cut -d":" -f2)    # Credit for that prefix

        for striM in $(ls *jpg)
        do
                k=$(echo "$striM" | cut -d'.' -f1)    # name up to the first dot
                l=$(echo "$k" | grep "$vc")           # empty when there is no match
                if [ -z "$l" ]
                then
                        echo "$cv is not a match for $striM. Sorry."
                else
                        echo "$striM:$cv">>matches.txt
                fi
        done
done<handmade2.txt

that did a pretty nice job of finding matches with its 'if-then-else-fi' conditional. I wanted to cut out the wordiness and make the exception (B and A don't match) the rule (B and A do match), so I Googled for "if variable contains" and the like. Not much made sense to me until the second round of Googling, during which I found that grep in a stackoverflow.com thread.

This was after I tried out a case/esac that was suggested to the OP in the same thread. For me, and in my script, arranging the variables one way gave me "There's a match" on every return; arranging them another way gave me nothing at all. So I went with the "special" grep -q and the triple left-handed carets (<<<, a here-string) and was quite gratified when I got output like this, running the first script (above) on the aforementioned "folder of 20":
Code:

There's a match. amanor-154-003.jpg with 'AmateurManor.com'.
There's a match. bv3199-027.jpg with 'badvoyeur-Planet Voyeur'.
There's a match. exmature-12-55811.jpg with 'greatmatures.com (exgranny)'.
There's a match. freshgf-0814-015-012.jpg with 'FreshGF.com'.
There's a match. gae52-5219-017-006.jpg with 'Adult Empire sites'.
There's a match. gae54-5458-008-004.jpg with 'Adult Empire sites'.
There's a match. gae69-6911-275-003.jpg with 'Adult Empire sites'.
There's a match. gae69-6911-275-004.jpg with 'Adult Empire sites'.

...etc etc.
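For comparison, the case/esac form of the same substring test might look like the sketch below. It assumes the file name goes after "case" and the prefix goes inside the pattern, wrapped in * wildcards. With the two swapped the pattern never matches, and with an empty prefix the pattern collapses to a bare wildcard that matches everything, which may be what happened in the attempt described above. The function names name_contains and name_contains_grep are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when $1 (a file name) contains $2 (a prefix).
name_contains() {
    case $1 in
        *"$2"*) return 0 ;;   # quoted, so the prefix is matched literally
        *)      return 1 ;;
    esac
}

# The same test with the here-string form used in the script above.
# -F treats the prefix as a fixed string rather than a regular expression.
name_contains_grep() {
    grep -qF -- "$2" <<<"$1"
}

if name_contains "gae52-5219-017-006.jpg" "gae52"; then
    echo "There's a match."
fi
```

The case version starts no external process, so it should be the quicker of the two inside a loop.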

Admittedly it's a bit slow, but it shows the potential of doing what it's supposed to -- either annotating directly (via an Exiv2 command or two) newly-downloaded files with their correct Credit tags or making a list with which that could be done via another script.

There is also -- and this is something I noticed a few days ago perusing the list of 20K+ files that iView generated for me earlier on -- the potential with this script to make corrections with already-annotated files where the Credits don't match their file name "heads". I found more than 100 on the mega-list that meet this description. But that's for another day and another pot of coffee. :)

I think that with this script and its successors (the "housekeeping" ones it's sure to inspire, if not generate), this issue is really, truly Solved this time. I'll be sure to mark it so before I log off.

Thanks for all the help to grail, ghostdog, and the others who gave their time, energy and experience to help me out with this.

BZT

grail 08-11-2010 02:24 AM

Well you can probably speed it up a bit by removing some of the superfluous steps and commands:
Code:

while read -r line
do
        vc=${line%%:*}                  # first field: file-name prefix
        cv=${line#*:}; cv=${cv%%:*}     # second field, same as cut -f2

        for striM in *jpg
        do
        # test the file name itself, not the contents of the file
        if grep -q "$vc" <<<"$striM"; then
                echo "There's a match. $striM with '$cv'."
        fi
        done
done<handmade2.txt

Code:

while read -r line
do
        vc=${line%%:*}                  # first field: file-name prefix
        cv=${line#*:}; cv=${cv%%:*}     # second field, same as cut -f2

        for striM in *jpg
        do
                k=${striM%%.*}
                l=$(echo "$k" | grep "$vc")
                if [ -z "$l" ]
                then
                        echo "$cv is not a match for $striM. Sorry."
                else
                        echo "$striM:$cv">>matches.txt
                fi
        done
done<handmade2.txt

Taking out all the cuts and most greps should help improve the speed.
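Going a step further, the remaining grep can be dropped as well, so the loops start no external processes at all. This is a minimal sketch, assuming each line of handmade2.txt has the form prefix:credit, optionally followed by further colon-separated fields, and that matching on the part of the file name before the first dot is what is wanted. The function name list_matches is made up for illustration.

```shell
#!/usr/bin/env bash
# list_matches LISTFILE: print "file:credit" for every *jpg file in the
# current directory whose name (up to the first dot) contains a prefix
# from LISTFILE. (list_matches is a hypothetical name.)
list_matches() {
    local vc cv _rest striM
    # IFS=: splits each line into prefix, credit and any remaining fields.
    while IFS=: read -r vc cv _rest
    do
        [[ -n $vc ]] || continue          # skip blank lines
        for striM in *jpg
        do
            [[ -e $striM ]] || continue   # the glob matched nothing
            # Pattern matching inside [[ ]] replaces both cut and grep.
            if [[ ${striM%%.*} == *"$vc"* ]]; then
                printf '%s:%s\n' "$striM" "$cv"
            fi
        done
    done <"$1"
}

# Usage: list_matches handmade2.txt >>matches.txt
```

Redirecting once, outside the function, also avoids reopening matches.txt for every match.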


All times are GMT -5.