Grep an entire file but must contain multiple words
Hi,
Just wanted to get a bit better with my grep skills. Basically, I want to return a filename if the file contains the word 'element' and the word 'name' and the word 'foo' and the word 'bar'. If a file contains all of these words, I want to know which file it is. I have been trying the following, which seems to work, but it is ungodly slow. I will be searching a boatload of files, so it would be nice if it were faster. I'm sure grep has a better built-in way to handle this (some regex magic I don't know about).
Code:
find . -iname "*.txt" -print0 | xargs -0 -n 1 grep Tag | grep Read
This doesn't print out the filename, but it kind of works. Any suggestions for faster results, and perhaps for printing out the filename?
It's easy if the words always appear in that order:
Code:
<somecommands> | grep "element.*name.*foo.*bar"
or, order-independent (but still only matching when all words are on one line):
Code:
<somecommands> | grep element | grep name | grep foo | grep bar
I would use something like:
Code:
find . -type f | xargs grep -l <string1> | xargs grep -l <string2> | xargs grep -l <string3> ...
Code:
find -type f | awk '/foo/ && /bar/ && /baz/'
OP wants the filename - and won't you need to -exec that, Tink?
Code:
find -type f -exec awk '{if (/foo/ && /bar/ && /baz/) print FILENAME}' {} \;
Hmmm ... I must have misunderstood him ... I thought he was looking for files with foo, bar and baz in the name. :}

If he's looking for files that have foo, bar and baz inside them (anywhere) and then wants the filename printed, none of the approaches above will quite do ... they'll only work if foo, bar and baz are on one line, not anywhere in the file ... That'll be more like:
Code:
find -type f -exec awk 'BEGIN{foo=0;bar=0;baz=0}/foo/{foo++}/bar/{bar++}/baz/{baz++}END{if(foo>0 && bar>0 && baz>0){print FILENAME}}' {} \;
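For anyone who wants to sanity-check that one-liner before pointing it at real data, here's a throwaway test in a scratch directory (the filenames are just examples of mine, not from the thread): one file has all three words on separate lines, the other is missing "baz".

```shell
# Throwaway sanity check for the awk one-liner above, run in a
# scratch directory created just for this test.
dir=$(mktemp -d)
cd "$dir"
printf 'foo\nsomething\nbar\nbaz\n' > match.txt   # has all three words
printf 'foo\nbar\n' > nomatch.txt                 # missing "baz"
find . -type f -exec awk \
  'BEGIN{foo=0;bar=0;baz=0}/foo/{foo++}/bar/{bar++}/baz/{baz++}END{if(foo>0 && bar>0 && baz>0){print FILENAME}}' {} \;
# prints only ./match.txt
```

Note that the words can be on different lines, which is exactly what the earlier per-line grep pipelines couldn't do.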
good job! works great! Never used awk that way, so I'll be storing that little goodie away for future use!
That is precisely what I was looking for, thank you very much for your solution. I'll chalk awk up on my list of programs to learn (it looks quite large).
Any ideas for how to get this into a script, so it could be used like the following?
Code:
find . -iname "*.whatever" -type f | myScript any number of strings
That would be super useful. I can of course read up a bunch and do my best as well ;)
Now I won't sit down and write this for you ... ;} ... sorry!

But I'd devise a shell script that evaluates a switch -t (for type) for your "*.whatever", and accepts everything else on the command line as your "any number of strings". Then (in pseudo code) I'd:
- write "any number of strings" to a stringfile, one per line.
- invoke find with the "whatever" parameter.
- have an awk script that reads the stringfile and assigns the elements to an array in the BEGIN section.
- for each line that awk processes, iterate over the array and check whether any item is part of the line; if so, increment its counter.
- in the END section, loop over the array and skip the record as soon as a zero count occurs. If the loop doesn't exit early, print the file name.
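A minimal sketch of that design (my own, not from the thread, and with the -t switch left out: it just reads filenames on stdin, the way the OP's `find ... | myScript` usage suggests). The function name `findwords` is made up for illustration.

```shell
# Hypothetical sketch: reads filenames on stdin, takes any number of
# search strings as arguments, and prints each file that contains ALL
# of the strings anywhere in it (not necessarily on one line).
findwords() {
    stringfile=$(mktemp) || return 1
    printf '%s\n' "$@" > "$stringfile"   # one search string per line

    while IFS= read -r file; do
        awk -v nwanted="$#" '
            NR == FNR { want[$0]; next }         # first file: load strings
            {                                    # then: scan content lines
                for (s in want)
                    if (index($0, s) && !(s in found)) { found[s]; nfound++ }
            }
            END { if (nfound == nwanted) print FILENAME }
        ' "$stringfile" "$file"
    done

    rm -f "$stringfile"
}
```

Usage would then be, e.g.:
Code:
find . -iname "*.whatever" -type f | findwords foo bar baz
Using index() instead of a regex match means the strings are taken literally, which is usually what you want here.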
I was thinking at the time that 3 was about the limit for that approach, and that it was about time for a file of values and an array. @wakeboarder3780: do a google on "awk" and "associative array" to get an idea of what you can do.
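As a quick taste of what associative arrays buy you (a made-up example of mine, not from any linked article), here is the classic word-frequency one-liner: the array is keyed by the words themselves, no pre-declared size or numeric indices needed.

```shell
# Illustrative only: count how often each word occurs on stdin, using
# an awk associative array keyed by the word itself.  sort makes the
# output order deterministic (awk's "for (w in count)" order isn't).
printf 'foo bar foo baz foo bar\n' | awk '
    { for (i = 1; i <= NF; i++) count[$i]++ }
    END { for (w in count) print w, count[w] }
' | sort
# → bar 2
#   baz 1
#   foo 3
```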