LinuxQuestions.org

lithos 10-06-2011 06:44 AM

Search for two or more 'patterns' and move files according to results
 
1 Attachment(s)
Hi

I'm stuck trying to search for two or more 'patterns' within a file that lists a large number of image filenames and their corresponding model names. I then want to rename the image files and move them into the folder named in the matching result (the source file is attached).

In short, searching a block like this:
Code:

D:\temp\OfflineExplorerPortable\Download\unicommerce.si\produkti\produkti\produkti_stihl\motorne_zage\lahke_motorne_zage\30\ms_5F250_c_be\default.htm
    284                                                          <div class="picture"><a href="../../../../../../../media/cache/image/30_big-b4d46b171375e7f3.jpg" class="thumbnail hs"><img src="../../../../../../../media/cache/image/30_big-ee8de7c7fa41401a.jpg" width="500" height="250" alt="MS 250 C BE"/></a></div>
    320                                                          <div class="picture"><a href="../../../../../../../media/cache/image/30_big-b4d46b171375e7f3.jpg" class="thumbnail hs"><img src="../../../../../../../media/cache/image/30_big-2443fdbb986dfef6.jpg" width="120" height="60" alt="MS 250 C BE"/></a></div>
4 matches in D:\temp\OfflineExplorerPortable\Download\unicommerce.si\produkti\produkti\produkti_stihl\motorne_zage\lahke_motorne_zage\30\ms_5F250_c_be\default.htm

This should find the filename under cache/image/, "30_big-b4d46b171375e7f3.jpg", and the name from alt="MS 250 C BE" (MS 250 C BE).
I would then like to rename "30_big-b4d46b171375e7f3.jpg" -> "MS 250 C BE.jpg" and move it to the folder "ms_5F250_c_be" (which is found near the end of the first line: ...orne_zage\30\ms_5F250_c_be\default.htm). There is also the smaller picture "30_big-2443fdbb986dfef6.jpg" in the second such line, which can be renamed "MS 250 C BE(1).jpg" and moved to the same folder.
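As a rough illustration of the three pieces described above (image file name, alt text, target folder), here is a minimal sketch. The function names href_file, alt_text and page_folder are made up for this example, and it assumes every result line has the same attribute layout as the sample.

```shell
# Hypothetical helpers; the names are illustrative only.

# Print the file name from the href="...jpg" attribute on a result line.
href_file() {
  printf '%s\n' "$1" | sed -n 's|.*href="[^"]*/\([^"/]*\.jpg\)".*|\1|p'
}

# Print the value of the alt="..." attribute.
alt_text() {
  printf '%s\n' "$1" | sed -n 's|.*alt="\([^"]*\)".*|\1|p'
}

# Print the folder that holds default.htm in a Windows-style path.
page_folder() {
  dir=${1%\\*}        # drop the trailing "\default.htm"
  dir=${dir##*\\}     # keep only the last remaining path component
  printf '%s\n' "$dir"
}
```

Called on the first sample line, href_file would give 30_big-b4d46b171375e7f3.jpg and alt_text would give MS 250 C BE; page_folder on the path line would give ms_5F250_c_be.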

These files were uploaded through a CMS and named randomly, so the customer now wants them sorted (updated, with the unnecessary ones deleted).


I have been trying to do this with some kind of grep/sed search for 2 days, but I'm lost and have no script yet. So I'm kindly asking whether someone could provide (if it can be done) the complete search code that I could use.

I may not have described exactly what I need, but I'm so worn out from doing this manually that I would really appreciate a script that can do it for me.

Regards

thesnow 10-07-2011 10:09 AM

This may help get you started. Obviously the generated commands are not usable as-is, because the paths mix / and \.

Code:

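#!/bin/bash

# Strip DOS line endings and skip the "N matches in ..." summary lines
dos2unix < files.txt | grep -v " matches " | while IFS= read -r line
do
  if [[ $line == D* ]]
  then
    # Path line: its directory (everything before \default.htm) is the target
    b=$line
    n=0
  elif [[ $line == *'<img '* ]]
  then
    # Source file: value of the img src attribute (8th "-delimited field)
    c=$(printf '%s\n' "$line" | cut -d'"' -f8)
    # Target file name: value of the alt attribute, spaces replaced by _
    d=$(printf '%s\n' "$line" | cut -d'"' -f14 | tr ' ' '_')

    # Only print the cp command; the first image of a page gets no suffix
    if (( n == 0 ))
    then
        echo cp "${c##*/}" "${b%\\*}\\$d.jpg"
    else
        echo cp "${c##*/}" "${b%\\*}\\$d($n).jpg"
    fi
    n=$((n+1))
  fi
done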

Output:

Code:

cp 101_big-7903fadb4cddcfde.jpg D:\temp\OfflineExplorerPortable\Download\unicommerce.si\produkti\produkti\produkti_stihl\kombi_motorji\101\km_130_r\KM_130_R.jpg
cp 101_big-df4a4bcc44d03bbb.jpg D:\temp\OfflineExplorerPortable\Download\unicommerce.si\produkti\produkti\produkti_stihl\kombi_motorji\101\km_130_r\KM_130_R(1).jpg
cp 688_big-c4c5b0fd711cd0f4.jpg D:\temp\OfflineExplorerPortable\Download\unicommerce.si\produkti\produkti\produkti_stihl\kombi_motorji\688\km_90_r\KM_90_R.jpg
cp 688_big-44c80e6d7ca82720.jpg D:\temp\OfflineExplorerPortable\Download\unicommerce.si\produkti\produkti\produkti_stihl\kombi_motorji\688\km_90_r\KM_90_R(1).jpg
cp 153_big-3326289580109cb2.jpg D:\temp\OfflineExplorerPortable\Download\unicommerce.si\produkti\produkti\produkti_stihl\kompaktni_brusilno_rezalni_stroji\153\ts_5F400\TS_400.jpg
cp 153_big-5077ebb54fe81f59.jpg D:\temp\OfflineExplorerPortable\Download\unicommerce.si\produkti\produkti\produkti_stihl\kompaktni_brusilno_rezalni_stroji\153\ts_5F400\TS_400(1).jpg
cp 156_big-20fbb0eecaa27f95.jpg D:\temp\OfflineExplorerPortable\Download\unicommerce.si\produkti\produkti\produkti_stihl\kompaktni_brusilno_rezalni_stroji\156\ts_700\TS_700.jpg
cp 156_big-8f287afe43caadf2.jpg D:\temp\OfflineExplorerPortable\Download\unicommerce.si\produkti\produkti\produkti_stihl\kompaktni_brusilno_rezalni_stroji\156\ts_700\TS_700(1).jpg
cp 792_big-d1e7d51a6ccb5b2a.jpg D:\temp\OfflineExplorerPortable\Download\unicommerce.si\produkti\produkti\produkti_stihl\kompaktni_brusilno_rezalni_stroji\792\ts_800\TS_800.jpg
cp 792_big-aaa1cbc882bf62a6.jpg D:\temp\OfflineExplorerPortable\Download\unicommerce.si\produkti\produkti\produkti_stihl\kompaktni_brusilno_rezalni_stroji\792\ts_800\TS_800(1).jpg
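One possible way around the / versus \ mix is to convert the Windows-style target into a Unix-style path before copying. The sketch below is only an assumption about how that could look: win_to_unix is a made-up name, and the second argument is whatever directory the contents of the D: drive are reachable under on the Linux side (for example a mount point).

```shell
# Hypothetical helper: convert a Windows path to a Unix path.
# $1: Windows path, e.g. D:\temp\x\KM_130_R.jpg
# $2: directory under which the drive's contents are available (assumed)
win_to_unix() {
  printf '%s%s\n' "$2" \
    "$(printf '%s\n' "$1" | tr '\\' '/' | sed 's|^[A-Za-z]:||')"
}
```

The tr call turns every backslash into a forward slash, and sed drops the leading drive letter and colon before the root directory is prepended.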


JSkywalker 10-07-2011 10:36 AM

I created a file, test.awk:
Code:

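# Path lines: with "\" as field separator, the next-to-last field is the product folder
/default\.htm/ { h=$(NF-1); print "H="h }
# Image lines: print the first file name after "image/" (the href target)
/<div / { if (h!="") { x1=index($0,"image/"); x2=index($0,".jpg"); print substr($0,x1+6,x2-x1-2); }}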

When I run this:
Code:

gawk -F "\\" -f test.awk file.txt
I get:
Quote:

H=km_130_r
101_big-3dfdee7212009dda.jpg
101_big-3dfdee7212009dda.jpg
H=km_130_r
H=km_90_r
688_big-6b1563a266b616c9.jpg
688_big-6b1563a266b616c9.jpg
H=km_90_r
H=ts_5F400
153_big-4ea387342346e192.jpg
153_big-4ea387342346e192.jpg
H=ts_5F400
H=ts_700
156_big-58ed5ffae3205364.jpg
156_big-58ed5ffae3205364.jpg
H=ts_700.....
which should be enough to get you where you want to be (if you know AWK, or GAWK) :)
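For anyone who prefers to finish the job in the shell, the H=/file name stream above could be folded into one "folder, file" pair per image. This is just a sketch: pair_up is a made-up name, and it assumes the input looks exactly like the quoted output (an H= line followed by the file names that belong to it).

```shell
# Hypothetical post-processing of the "H=folder" / file name stream.
# Reads the stream on stdin and prints "folder<TAB>file" once per pair.
pair_up() {
  folder=
  while IFS= read -r line; do
    case $line in
      H=*) folder=${line#H=} ;;                       # remember the current folder
      ?*)  printf '%s\t%s\n' "$folder" "$line" ;;     # pair each file with it
    esac
  done | awk '!seen[$0]++'                            # drop repeats, keep order
}
```

It would be used as a filter on the end of the gawk command, e.g. gawk -F "\\" -f test.awk file.txt | pair_up.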

lithos 10-07-2011 11:04 AM

That is so great, thank you.

I've only gotten about 30% of the way through manually; with this it should be done in no time :cool:


All times are GMT -5.