Can I use grep inside awk?

Helene · 04-25-2004, 02:03 AM

I have the following file:
www/home/pictures/bird.gif
www/home/pictures/home.gif
www/home/documents/home.gif

I want to filter out the records that have a file name containing "home". I don't want the records with a directory named "home" to be included. To pick out only the filename, I do this:
fn=`echo $tmp | grep '[^\ /]*$' -o`

where $tmp is the first record in the file.

So, can I do this operation using awk? I have tried something like what is shown below, but I only get syntax errors (invalid char '`' in expression). Am I not allowed to use grep inside awk?
awk '{ fn=`echo $1 | grep '[^\/]*$' -o` if ( $fn ~/'home'/ ) print $0 }' $myFile > newfile

I hope someone can help me out!

rkef · 04-25-2004, 02:36 AM

It's really unclear what you're trying to do. You say you want to "filter out" filenames w/'home' in them, but that you don't want "records with a directory named 'home' to be included".

1) by filter out, do you mean to ignore files with 'home' in them or that you want to "work with" files with 'home' in them?
2) all of the paths contain a directory named 'home'
3) your regex "[^\ /]*$" matches any input line (see end of this post for why). Also, you needn't escape '/' in a character class - [^/] works fine.
4) what about something with awk -F'/' '{blah blah}' to get at each component of the full pathname (whatever it is you're doing)?

The regex "[^/]*$" matches any line which contains zero or more occurrences of any character but '/', followed by an end of line marker. Note that even "//////" matches, because [^/]* is in a sense "optional". Think about that one. Also, the -o option puts grep into an endless loop with that regex (at least for me it does). I've never used -o so I'm not familiar with what can go wrong with it.
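The two points above (the regex selecting every line, and splitting on '/' to reach the file name) can be checked with a small sketch. The temporary file and the variable names `paths`, `matched` and `by_name` are made up for illustration; the sample paths are the ones from the first post.

```shell
# Sample paths from the first post, written to a temporary file.
paths=$(mktemp)
printf '%s\n' \
    'www/home/pictures/bird.gif' \
    'www/home/pictures/home.gif' \
    'www/home/documents/home.gif' > "$paths"

# [^/]*$ can match the empty string at end of line, so grep selects every line.
matched=$(grep -c '[^/]*$' "$paths")
echo "lines selected by the regex: $matched"

# With '/' as the field separator, the last field ($NF) is the file name,
# so the test no longer sees the directory components.
by_name=$(awk -F/ '$NF ~ /home/' "$paths")
echo "$by_name"

rm -f "$paths"
```

The grep count equals the number of lines in the file, while the awk command keeps only the two paths whose last component contains "home".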

Helene · 04-25-2004, 02:50 AM

Ok, I'll try again.

1) I want to work with the records that actually contain a file with "home" in its name. I don't want to work with www/home/pictures/bird.gif, and to avoid this I use the regular expression. If I don't use a regular expression, this record will be selected because it has "home" in it.

2) Yes, all the records contain "home", and that's the problem, as mentioned above.

3) As far as I can see, my reg.expr is correct because it only returns the filename. I want to use this filename for further testing in the awk:
awk '{ fn=`echo $1 | grep '[^\/]*$' -o` if ( $fn ~/'home'/) print $0 }' $myFile > newfile

Any clearer?

Hko · 04-25-2004, 06:21 AM

You are trying to start the "grep" program from inside an awk script. I'm not very sure, but I think that's not (easily) possible. It also doesn't make a lot of sense, since awk is more powerful than grep: it can do the same, and a lot more, but with different syntax and semantics.

For example if this is "files.txt":

Code:

www/home/pictures/bird.gif
www/pictures/animal.gif
www/home/documents/home.gif
www/nothome/somebitmap.gif
www/somedir/pictures/next.gif
www/home/documents/home.gif

...then, is this what you are looking for?

Code:

awk -F/ '!/\/home\//{print $NF}' files.txt
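On the question of starting another program from inside awk: backticks are shell syntax, not awk syntax, but awk does have its own mechanism, `command | getline variable`. The following is only a sketch of that mechanism, not a recommendation. It uses `basename` rather than grep as the external command, it assumes the paths contain no single quotes, and the names `paths`, `q`, `cmd`, `fn` and `via_getline` are invented for this example.

```shell
# Sample paths from the first post, written to a temporary file.
paths=$(mktemp)
printf '%s\n' \
    'www/home/pictures/bird.gif' \
    'www/home/pictures/home.gif' \
    'www/home/documents/home.gif' > "$paths"

# q holds a single quote so the awk program can wrap each path in quotes
# without ending the shell's own single-quoted string.
# Assumption: the paths themselves contain no single quotes.
via_getline=$(awk -v q="'" '{
    cmd = "basename " q $0 q      # external command for this record
    fn = ""
    cmd | getline fn              # read the first line of its output into fn
    close(cmd)                    # close the pipe before the next record
    if (fn ~ /home/) print
}' "$paths")
echo "$via_getline"

rm -f "$paths"
```

This starts one extra process per input line, which is one reason doing the work in awk itself is usually the better choice.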

Helene · 04-26-2004, 07:54 AM

Thank you for the help! I realise that I should have told you the whole scenario instead of simplifying it too much. Sorry.

This is my actual file:
anna;www/home/pictures/bird.gif;23
arna;www/pictures/animal.gif;4
emma;www/home/documents/home.gif;333
kim;www/nothome/somebitmap.gif;123
sarah;www/somedir/pictures/next.gif;43
alf;www/home/documents/home.gif;1

It is semicolon-separated, so I can't use the field separator you suggested (-F/).
Is it possible to put your awk command inside mine? Like this:
awk -F';' '{ fn=`your awk` if ( $fn ~/'home'/ ) print $0 }' $myFile > newfile

Again: I want to print out the records that have a filename with "home" in it.

Hko · 04-26-2004, 01:08 PM

Now (I think) I understand the "home" part of the problem, but it's still unclear to me what part of an input line you want to see in the output.

So here are 3 options:

Code:

# Entire line in output. In this case you can also use grep.
awk '/\/[^/]*home[^/]*;/{print $0}' files.txt
grep '\/[^/]*home[^/]*;' files.txt

# Only file in output, including the path:
awk -F\; '/\/[^/]*home[^/]*;/{print $2}' files.txt

# Only file in output, with path stripped:
awk -F\; '/\/[^/]*home[^/]*;/{sub(".*/","",$2) ; print $2 }' files.txt

Hope I could help this time.
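A field-based variant is also possible, shown here only as a sketch: keep ';' as the field separator and use awk's `split()` to break the path field into components, then test the last one. The temporary file and the names `records`, `part`, `n` and `selected` are invented for this example; the data is the semicolon-separated sample from earlier in the thread.

```shell
# The semicolon-separated sample records, written to a temporary file.
records=$(mktemp)
printf '%s\n' \
    'anna;www/home/pictures/bird.gif;23' \
    'arna;www/pictures/animal.gif;4' \
    'emma;www/home/documents/home.gif;333' \
    'kim;www/nothome/somebitmap.gif;123' \
    'sarah;www/somedir/pictures/next.gif;43' \
    'alf;www/home/documents/home.gif;1' > "$records"

selected=$(awk -F';' '{
    n = split($2, part, "/")          # part[1..n] are the path components
    if (part[n] ~ /home/) print       # part[n] is the file name; print whole record
}' "$records")
echo "$selected"

rm -f "$records"
```

Because only the last path component is tested, directories such as "home" or "nothome" do not cause a match, and the other fields are never looked at.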

Helene · 04-27-2004, 12:58 AM

Thank you, Hko.
The first option seems to be the right one in my case. It hadn't occurred to me that I could do this without using the field separator!

jagsb · 09-28-2015, 01:38 AM

cat files.txt
www/home/pictures/bird.gif
www/home/pictures/home.gif
www/home/documents/home.gif

awk -F / '{tmp=match($NF,/home/);if(tmp) print}' files.txt

This gives you the following:
www/home/pictures/home.gif
www/home/documents/home.gif

chrism01 · 09-28-2015, 02:59 AM

This gives the same output as Hko's first option:

Code:

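# Read t.t one line at a time, so records are not split on whitespace.
while IFS= read -r rec
do
    # basename strips everything up to the last '/'.
    dn=$(basename "$rec")
    if [[ $dn =~ home ]]
    then
        echo "$rec"
    fi
done < t.t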

i.e. it prints the entire record, but only for records whose file name (the directory names are not checked) includes 'home'.

HMW · 09-29-2015, 05:29 AM

In case you missed it, I just want to point out that you are responding to a thread from 2004(!).

Best regards,
HMW

chrism01 · 09-29-2015, 08:48 PM

yeah - I missed it