LinuxQuestions.org


Ikebukuro 12-13-2019 11:09 AM

Using grep to compare two files and make a filter on key words
 
Hello experts,

I have a problem with two files.
File 1: one word per line; each word is the name of an Oracle table.
File 2: many words per line; it contains SELECT statements taken from Oracle views.

I need to find all the words in file 1 that do not appear in file 2.
I tried with grep but I failed...

Can you tell me how to do it?

Here is an extract of File 1. There are many space characters at the end of each line; they have to be ignored in the search.
Quote:

WRH$_ACTIVE_SESSION_HISTORY
WRH$_ACTIVE_SESSION_HISTORY_BL
WRH$_ASM_BAD_DISK
WRH$_ASM_DISKGROUP
WRH$_ASM_DISKGROUP_STAT
WRH$_BG_EVENT_SUMMARY

Here is a simplified and modified extract of File 2.
Code:

SELECT from WRH$_ACTIVE_SESSION_HISTORY                                                                                                                                                                             
SELECT 
from WRH$_ASM_BAD_DISK                                                                                                                                                                                       
SELECT col12
col34 from WRH$_ASM_DISKGROUPWRH$_ASM_BAD_DISK etc etc                                                                                                                                                                                      
SELECT 
from WRH$_ASM_DISKGROUP_STAT ORDER BY 1 

The result I want to see:
Quote:

WRH$_ACTIVE_SESSION_HISTORY_BL
WRH$_BG_EVENT_SUMMARY

Sorry, I don't see where I can attach a whole file (is it possible?). I don't want to copy ALL the content of my files, they are too big...
And sorry if I made mistakes in my first post here...

Have a very nice day.

Turbocapitalist 12-13-2019 11:15 AM

You're probably going to need an awk or perl script with an associative array. Read the shorter word list into the array and then run the second, larger file through it.

Farcrada 12-13-2019 11:17 AM

You could write a small Python script/program that takes two files as arguments and spits out a difference file and a file with everything that matches. I feel like that would be the best short-term solution, assuming you need to do this for the mentioned big file(s).

Ikebukuro 12-13-2019 11:21 AM

Thank you for your answers, but I am an Oracle DBA, not a Linux expert... I don't know how to use Python, Perl, or Awk :-(

Turbocapitalist 12-13-2019 11:34 AM

No time like the present. It's pretty much not possible to do system administration without periodically needing some awk or perl or (maybe) python.

AWK is a full language and would take ages to master but getting an idea of the basics can take but a few minutes:

https://www.grymoire.com/Unix/Awk.html

AWK in its most elementary form is just a bunch of abbreviated if-then statements:

Code:

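awk '
# First file (NR==FNR): remember each table name as an array key.
NR==FNR {
        a[$1]++;
        next;
}

# Second file: if field 4 is a known table name, drop it from the array.
# "in" tests membership without creating an empty element as a[$4] would.
$4 in a {
        delete a[$4];
}

# After all input: whatever is left was never seen in the second file.
END {
        for (i in a) {
                print i;
        }
}
' file1.txt file2.txt | sort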

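$1 and $4 are the first and fourth fields of the current line; NR and FNR are built-in variables holding the overall line number and the line number within the current file. END {} is a clause which runs once, after there is no more input.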
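
To see why NR==FNR is true only while the first file is being read, a small sketch that prints both counters for every line may help. The two sample files and their contents are made up for the illustration.

```shell
# Hypothetical sample files: two lines in the first, three in the second.
dir=$(mktemp -d)
printf '%s\n' a b > "$dir/one.txt"
printf '%s\n' c d e > "$dir/two.txt"

# NR counts lines across all files; FNR restarts at 1 for each new file.
counters=$(awk '{ print NR, FNR }' "$dir/one.txt" "$dir/two.txt")
printf '%s\n' "$counters"

rm -rf "$dir"
```

The two numbers are equal on the lines of the first file and differ from the first line of the second file onward, which is what lets one script treat the two files differently.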

Ikebukuro 12-13-2019 02:04 PM

Thank you Turbocapitalist for the Awk example, I had forgotten that I have a book about it...
Tomorrow I will read it, I think it will be very useful for solving my problem.

Turbocapitalist 12-13-2019 10:28 PM

Having the pattern in an unpredictable place on each line, and possibly multiple times, means you'll probably have to work out something with the match() function or similar.
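
One possible "or similar", as a rough sketch rather than a tested solution for the real files: instead of match(), split every line of the second file into tokens and check each token against the array. The sample data below is invented, and the separator class (anything other than letters, digits, _, $ and #) is an assumption about which characters can appear in the table names.

```shell
# Hypothetical sample data standing in for file 1 and file 2.
dir=$(mktemp -d)
printf '%s\n' 'WRH$_ASM_BAD_DISK   ' 'WRH$_ASM_DISKGROUP   ' 'WRH$_BG_EVENT_SUMMARY   ' > "$dir/file1.txt"
printf '%s\n' 'SELECT' 'from WRH$_ASM_BAD_DISK' 'SELECT col12 from WRH$_ASM_DISKGROUP_STAT ORDER BY 1' > "$dir/file2.txt"

missing=$(awk '
# First file: remember every table name (field splitting drops the trailing spaces).
NR==FNR { if ($1 != "") want[$1] = 1; next }

# Second file: cut the line into identifier-like tokens and forget each one that is a known name.
{
        n = split($0, tok, "[^A-Za-z0-9_$#]+")
        for (k = 1; k <= n; k++)
                if (tok[k] in want) delete want[tok[k]]
}

# Names still in the array never appeared in the second file.
END { for (t in want) print t }
' "$dir/file1.txt" "$dir/file2.txt" | sort)

printf '%s\n' "$missing"
rm -rf "$dir"
```

With this sample data the names left over are WRH$_ASM_DISKGROUP and WRH$_BG_EVENT_SUMMARY, because WRH$_ASM_DISKGROUP_STAT is a different token from WRH$_ASM_DISKGROUP.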

MadeInGermany 12-14-2019 03:07 PM

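Unfortunately the following simple word grep returns the non-matching lines from file2, not the unmatched names from file1:
Code:

grep -vwf file1 file2

And the reverse, with file2 as the pattern list, does not help either: each whole line of file2 becomes one pattern, so nothing is matched word by word.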
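
A possible workaround, sketched here under assumptions rather than tested on the real files: first split file2 into one token per line, then use that token list as fixed, whole-line patterns against file1. The sample data and the character class used for tokens are made up for the illustration; -o, -E, -x, -F, -v and -f are standard grep options.

```shell
# Hypothetical sample data standing in for file1 and file2.
dir=$(mktemp -d)
printf '%s\n' 'WRH$_ASM_BAD_DISK   ' 'WRH$_ASM_DISKGROUP   ' > "$dir/file1"
printf '%s\n' 'SELECT col12' 'from WRH$_ASM_BAD_DISK, WRH$_ASM_DISKGROUP_STAT' > "$dir/file2"

# Split file2 into one identifier-like token per line.
grep -oE '[A-Za-z0-9_$#]+' "$dir/file2" | sort -u > "$dir/words"

# Strip trailing blanks from file1, then keep the names that equal none of the tokens.
not_found=$(sed 's/[[:space:]]*$//' "$dir/file1" | grep -vxFf "$dir/words")

printf '%s\n' "$not_found"
rm -rf "$dir"
```

Here only WRH$_ASM_DISKGROUP is printed, since WRH$_ASM_BAD_DISK appears as a token of its own in file2.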

boughtonp 12-14-2019 03:44 PM

Doing this with grep isn't difficult:
Code:

while read -r line; do
        grep -wFq "$line" queries.txt || echo "$line"
done < tables.txt

tables.txt is your file 1 - the list of tables (needles) to search for, with $line being each one.
queries.txt is your file 2 - the queries file (haystack) to search within.

The first and last lines are for looping through the file (for variations see BashFAQ/001).

Grep flags:
-w matches whole words only, so a name does not match inside a longer, suffixed name (e.g. WRH$_ASM_DISKGROUP inside WRH$_ASM_DISKGROUP_STAT)
-F is for Fixed strings - i.e. disables regex matching
-q is for quiet - i.e. don't output when matches found (we want the opposite)

The || is so that when grep doesn't match, the line searched for is then output.
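
For anyone who wants to try the loop before running it on real data, here is a self-contained sketch with invented sample files. It also shows two details that matter for this thread: read with the default IFS drops the trailing spaces from each table name, and -w keeps WRH$_BG_EVENT_SUMMARY from matching inside the longer (made-up) name WRH$_BG_EVENT_SUMMARY_OLD.

```shell
# Hypothetical sample data; the real tables.txt and queries.txt are much larger.
dir=$(mktemp -d)
printf '%s\n' 'WRH$_ACTIVE_SESSION_HISTORY   ' 'WRH$_ACTIVE_SESSION_HISTORY_BL   ' 'WRH$_BG_EVENT_SUMMARY' > "$dir/tables.txt"
printf '%s\n' 'SELECT' 'from WRH$_ACTIVE_SESSION_HISTORY ORDER BY 1' 'SELECT col1 from WRH$_BG_EVENT_SUMMARY_OLD' > "$dir/queries.txt"

# Same loop as above, with its output captured in a variable.
unused=$(while read -r line; do
        grep -wFq "$line" "$dir/queries.txt" || echo "$line"
done < "$dir/tables.txt")

printf '%s\n' "$unused"
rm -rf "$dir"
```

With this data the loop prints WRH$_ACTIVE_SESSION_HISTORY_BL and WRH$_BG_EVENT_SUMMARY. Note that grep rereads queries.txt once per table name, which is fine for a few thousand names but gets slow for very long lists.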

Ikebukuro 12-15-2019 06:36 AM

boughtonp, you are GREAT!
It works :)

Thank you everybody for your help. I was reading a book about Awk, but I see we can solve my problem with grep!

Have a nice day :-)

frankbell 12-15-2019 07:47 PM

See man diff.


All times are GMT -5.