LinuxQuestions.org


mjtruco 05-20-2009 05:19 PM

grep for multiple targets
 
I need to select multiple targets from a file.
I've used grep successfully before with a few targets. For example:

egrep 'target1|target2|target3|target4' test.txt > test.out

test.txt is a file that contains:
target1 A A A A B B B
target2 B B B B B B B
target3 D F F F F F F
target4 F F F F F F F
target5 G G G G G G G
etc..

How can I do the same thing but, instead of using egrep 'target1|target2|target3|target4|etc', use a file that contains a list of all the 1000 targets I'm interested in? Something like:
target1
target2
target3
target4
.
.
.
target1000

Thanks,

mjtruco

pattwo 05-20-2009 05:29 PM

man grep

Look in the "Matching Control" section for the '-f' switch

Does exactly what you're looking for.

colucix 05-20-2009 05:30 PM

Code:

grep -f pattern_file file
Look at man grep for details.
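
A minimal sketch of that approach, assuming a grep that supports -f (GNU grep does) and that mktemp is available. The file names targets.txt, test.txt and test.out and the scratch directory are only illustrative; the sample rows follow the first post.

```shell
# Work in a scratch directory so nothing existing is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Data file to search, following the sample in the first post.
cat > test.txt <<'EOF'
target1 A A A A B B B
target2 B B B B B B B
target3 D F F F F F F
target4 F F F F F F F
target5 G G G G G G G
EOF

# One pattern per line; the real list would hold all 1000 targets.
printf '%s\n' target1 target2 target3 target4 > targets.txt

# -f reads the patterns from targets.txt; every line of test.txt that
# matches at least one of them is written to test.out.
grep -f targets.txt test.txt > test.out
cat test.out
```

With GNU grep, a blank line in the pattern file acts as an empty pattern and matches every line, so it is worth checking that the list contains none.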

mjtruco 05-20-2009 07:05 PM

grep -f didn't work for file with targets
 
I've tried
grep -f file_targets
and it didn't work

I get the following error:

illegal option --f
Usage: grep -hblcnsviw pattern file ...

Do you know what is happening?

Thanks,

mjtruco

Tinkster 05-20-2009 07:11 PM

You're probably on an ancient version of Linux, or on a commercial
Unix like Solaris.
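
One way to check this from a script is to probe for a grep that accepts -f before relying on it. The sketch below is only an illustration: the candidate list is an assumption (/usr/xpg4/bin/grep is where Solaris normally keeps its POSIX-conforming grep, and ggrep is a common name for GNU grep on non-GNU systems), mktemp is assumed to exist, and the GREP variable name is made up.

```shell
# Build a tiny pattern file to probe with.
probe=$(mktemp)
echo x > "$probe"

# Keep the first candidate that accepts -f and finds the pattern.
GREP=
for candidate in /usr/xpg4/bin/grep ggrep grep; do
    if echo x | "$candidate" -f "$probe" >/dev/null 2>&1; then
        GREP=$candidate
        break
    fi
done
rm -f "$probe"

if [ -n "$GREP" ]; then
    echo "using $GREP"
else
    echo "no grep with -f support found" >&2
fi
```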



Cheers,
Tink

ghostdog74 05-20-2009 07:20 PM

Code:

awk 'FNR==NR{t[$1];next}($1 in t)' targets file

syg00 05-20-2009 08:49 PM

I'm not a big user of awk, but won't that only work if the pattern matches the full text, rather than a substring match, as seems to be the requirement?

ghostdog74 05-20-2009 09:00 PM

Quote:

Originally Posted by syg00 (Post 3547598)
I'm not a big user of awk, but won't that only work if the pattern matches the full text, rather than a substring match, as seems to be the requirement?

Code:

FNR==NR{t[$1];next}
gathers column 1 of the "targets" file into the array t. (Note: the sample given has only 1 column, so essentially we can also write FNR==NR{t[$0];next}.)
Code:

($1 in t)
checks $1 of test.txt against the array t. If it is found, the line is printed.
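
To make the difference from a substring match concrete, here is a small sketch with invented data (the target10 row is added purely for illustration, and mktemp is assumed to be available). Because ($1 in t) is an array lookup, the first field has to equal a target exactly.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two targets of interest, one per line.
printf '%s\n' target1 target3 > targets

cat > file <<'EOF'
target1 A A A A B B B
target10 X X X X X X X
target3 D F F F F F F
target5 G G G G G G G
EOF

# While reading the first file (FNR==NR), store each target as an array key.
# While reading the second file, print lines whose first field is a stored key.
awk 'FNR==NR{t[$1];next}($1 in t)' targets file > awk.out
cat awk.out
```

The target10 row is not printed, even though it contains the text target1, so this behaves as an exact match on the first column.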

syg00 05-20-2009 09:12 PM

Thanks - like I said ...

colucix 05-21-2009 02:34 AM

Keep it simple:
Code:

while read pattern; do grep "$pattern" file; done < patterns

ghostdog74 05-21-2009 02:59 AM

Quote:

Originally Posted by colucix (Post 3547825)
Keep it simple:
Code:

while read pattern; do grep "$pattern" file; done < patterns

Some issues with this (assuming the sample data as provided):
1) You should probably put a boundary on the grep pattern, e.g. grep "target1" will also match target10 or target100, which gives ambiguous results.
2) Assuming the "patterns" file has 1000 targets, as the OP mentioned, the code will call grep 1000 times on "file". That is slower than going through the file once (as in the awk example), and slower still if "file" is a big file.
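
Both points can be checked with a small sketch. The data is invented for illustration (target10 and target100 are not in the original sample), and the -w and -F options are assumed to be available, as they are in GNU grep.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > file <<'EOF'
target1 A A A A B B B
target10 X X X X X X X
target100 Y Y Y Y Y Y Y
EOF
printf '%s\n' target1 > patterns

# Plain substring match: target1 also hits target10 and target100.
loose=$(grep -c 'target1' file)

# -w requires the match to be a whole word, so only target1 itself matches.
strict=$(grep -c -w 'target1' file)

# One pass over the file for the whole list: -f reads the patterns,
# -F treats them as fixed strings, -w keeps the word boundary.
grep -w -F -f patterns file > single_pass.out

echo "loose=$loose strict=$strict"
cat single_pass.out
```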

colucix 05-21-2009 03:43 AM

ghostdog, you're right as always. Anyway, I suspect the OP does not have GNU awk. I tested your code on a Solaris SPARC 5.8 machine and it doesn't work, due to a syntax error:
Code:

$ awk 'FNR==NR{t[$1];next}($1 in t)' targets file
awk: syntax error near line 1
awk: bailing out near line 1

This is due to the last statement ($1 in t). Maybe we can slightly change it for portability. Anyway, let's wait for the OP's reply. Cheers! :)

ghostdog74 05-21-2009 04:04 AM

use nawk on Solaris

colucix 05-21-2009 06:14 AM

Quote:

Originally Posted by ghostdog74 (Post 3547879)
use nawk on Solaris

Correct! It works!
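
For a script that has to run on both kinds of system, one option is to pick the awk at run time. This is a sketch rather than a tested Solaris recipe: it assumes the shell provides command -v and that mktemp exists, and the AWK variable and the two-line sample files are made up for illustration.

```shell
# Prefer nawk where it exists (e.g. Solaris), otherwise fall back to awk.
if command -v nawk >/dev/null 2>&1; then
    AWK=nawk
else
    AWK=awk
fi

# Tiny sample input in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' target2 > "$workdir/targets"
printf '%s\n' 'target1 A' 'target2 B' > "$workdir/file"

# Same one-liner as above, run through whichever awk was selected.
"$AWK" 'FNR==NR{t[$1];next}($1 in t)' "$workdir/targets" "$workdir/file"
```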

mjtruco 05-24-2009 02:01 PM

You were right. I was using an old version of Solaris.

I've changed to a newer version of grep from the GNU project
(Free Software Foundation) and the command grep -f worked fine.

Thanks for the help

mjtruco


All times are GMT -5.