LinuxQuestions.org - [SOLVED] search for 2 different strings in 2 different lines

threezerous

07-30-2012 06:56 AM

search for 2 different strings in 2 different lines

I would like to search for 2 different strings on 2 different lines in files throughout a directory tree (searching recursively). Both strings must appear at the beginning of a line. I was able to write a command that searches for one string, as follows.
The string I searched for was "</string1-abc"

egrep -ilr '^</string1-abc' ./* > /archive/files_with_broken_links.txt &

From the files listed in that output, I would like to find all files which also have the string "string2-xyz" at the beginning of another line.

Any suggestions or guidance would be much appreciated. Thanks in advance.

fatmac

07-30-2012 07:18 AM

Would not this give you what you want?

egrep -ilr '^</string1-abc' ./* >> /archive/files_with_broken_links.txt &
egrep -ilr '^</string2-xyz' ./* >> /archive/files_with_broken_links.txt &

threezerous

07-30-2012 09:24 AM

I guess that would give me two lists, each listing the files that contain the string grepped for. I would like an AND condition instead: files that have both string1-abc AND string2-xyz in the same file.

Thanks
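The AND condition described here can be sketched by chaining two greps: the first lists files containing string1, and the second keeps only those that also contain string2. This is a minimal sketch, assuming GNU grep and xargs (for -Z, -0 and -r); the temporary demo directory and its file names are made up for illustration.

```shell
# Hypothetical demo tree so the sketch can be run as-is.
demo=$(mktemp -d)
printf '%s\n' '</string1-abc' 'string2-xyz' > "$demo/both.txt"
printf '%s\n' '</string1-abc' 'other'       > "$demo/only1.txt"
printf '%s\n' 'other' 'string2-xyz'         > "$demo/only2.txt"

# Step 1: grep -rl lists files with string1 at the start of a line
#         (-Z separates the names with NUL bytes so odd names survive).
# Step 2: xargs -0 hands those names to a second grep -l for string2
#         (-r skips the second grep when step 1 found nothing).
result=$(grep -rilZ '^</string1-abc' "$demo" | xargs -0 -r grep -il '^string2-xyz')

echo "$result"   # only the file that contains both strings
```

Because the second grep only ever sees files that passed the first one, the result is the intersection of the two lists rather than their concatenation.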

fatmac

07-30-2012 12:40 PM

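As written, the output of both searches is appended to the same file, because the commands use the >> redirection operator instead of >.

The file would list all matches for your first search string, followed by all matches for your second, provided the first command finishes before the second starts writing. With both run in the background via &, that order is not guaranteed.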
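The difference between the two redirection operators can be seen in a small sketch. The temporary file and the echoed words are placeholders, not part of the original commands.

```shell
out=$(mktemp)              # hypothetical scratch file

echo "first"  >  "$out"    # > truncates the file, then writes
echo "second" >> "$out"    # >> appends to what is already there
appended=$(cat "$out")     # now holds two lines: first, second

echo "third"  >  "$out"    # > truncates again, so the earlier lines are lost
truncated=$(cat "$out")    # now holds one line: third

printf '%s\n' "$appended" "---" "$truncated"
```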

David the H.

07-30-2012 12:54 PM

Your requirements are rather unclear.

Please post an actual example of the input text, exactly what needs to be matched in it, and how the output needs to be presented. Also explain what kinds of variations we could expect. Does one line always come before another, or can they be reversed? Do they follow one another directly or can they be separated by other text? Are they found randomly in the file or only in certain relationships to other text? Etc.

The best solution to use usually depends on first defining the exact patterns in the input that need to be matched. Both the text to match and the text to exclude need to be taken into account.

Finally, this looks like html or xml. Regex-based tools are not well-suited for working with these freeform data types. It may be better for you to use a tool with a dedicated parser. Again, it will help if you explain your exact requirements.

threezerous

07-30-2012 01:20 PM

Actually I was able to resolve the issue by running the following script on the output file of the first command.

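#!/bin/bash
# Keep only the files listed in INPUT_FILE that also contain string2.
# To execute: ./grep.sh &

INPUT_FILE="broken_links.txt"
OUT_RESULT_FILE="broken_links_string2.txt"

# Start with an empty result file (echo "" would leave a blank first line).
: > "$OUT_RESULT_FILE"

# Read one file name per line; IFS= and -r keep spaces and backslashes intact.
while IFS= read -r line
do
    # -q suppresses the matching lines; only the exit status is needed here.
    if grep -qE '<string2-xyz>' "$line"
    then
        /bin/echo "$line" >> "$OUT_RESULT_FILE"
    else
        /bin/echo "NOT found in $line"
    fi
done < "$INPUT_FILE"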

Thanks all for your suggestions and comments.

David the H.

07-30-2012 02:51 PM

Well, congratulations on figuring out a solution on your own.

But it's a rather clunky one, having to run two separate filtering operations. Again, if you'd provide some details we'd almost certainly be able to work up something cleaner.

And please use [code][/code] tags around your code and data, to preserve formatting and to improve readability. Please do not use quote tags, bolding, colors, or other fancy formatting.
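As one possible single-pass alternative to the two filtering steps discussed above, awk can track both patterns while reading each file once. This is only a sketch: it assumes the two lines may appear in any order and anywhere in the file, which the thread does not confirm, and the demo directory and file names are invented for illustration.

```shell
# Hypothetical demo tree so the sketch can be run as-is.
demo=$(mktemp -d)
printf '%s\n' 'string2-xyz' 'text' '</string1-abc' > "$demo/both.txt"
printf '%s\n' '</string1-abc' 'text'               > "$demo/only1.txt"

# find runs awk once per file (because of \;), so the flags start unset each time.
found=$(find "$demo" -type f -exec awk '
    /^<\/string1-abc/ { a = 1 }                 # string1 seen at start of a line
    /^string2-xyz/    { b = 1 }                 # string2 seen at start of a line
    a && b            { print FILENAME; exit }  # both seen: report and stop early
' {} \;)

echo "$found"
```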

whizje

07-30-2012 02:56 PM

This doesn't search through subdirectories; for that you would have to use find with -exec grep, or something similar.

Code:

bash-4.2$ for f in *; do grep -q "^</string1-abc" "$f" && grep -l "^</string2-abc" "$f"; done > output.txt

whizje

07-30-2012 03:42 PM

How can I use this

Code:

bash-4.2$ for f in *; do grep -q "^</string1-abc" "$f" && grep -l "^</string2-abc" "$f"; done > output.txt

with find? I tried -exec and -exec sh -c, but it doesn't work.

Code:

find . -type 'f' -exec sh -c 'grep -l "^</string1-abc" {} && grep -l "^</string2-abc"' {} \;
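One likely reason the attempt above fails is that the second grep is given no file name, so it reads standard input instead of the file; the first grep also uses -l rather than -q, so it prints the name on its own. A sketch of one way to fix this is to pass the file name to sh as a positional parameter and use "$1" in both greps. The demo directory and file names below are hypothetical; the patterns are the ones from the post.

```shell
# Hypothetical demo tree so the sketch can be run as-is.
demo=$(mktemp -d)
mkdir -p "$demo/sub"
printf '%s\n' '</string1-abc' '</string2-abc' > "$demo/sub/both.txt"
printf '%s\n' '</string1-abc'                 > "$demo/only1.txt"
printf '%s\n' '</string2-abc'                 > "$demo/sub/only2.txt"

# The word after the script ("sh") becomes $0; the file name from {} becomes $1.
# grep -q tests silently for string1; grep -l prints the name if string2 is there too.
found=$(find "$demo" -type f -exec sh -c '
    grep -q "^</string1-abc" "$1" && grep -l "^</string2-abc" "$1"
' sh {} \;)

echo "$found"
```

Chaining two -exec actions should behave the same way without the extra shell, since find evaluates the second only when the first succeeds: find . -type f -exec grep -q '^</string1-abc' {} \; -exec grep -l '^</string2-abc' {} \;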

All times are GMT -5.